169.254.169.254 timeout


#1

Version: [find the version by …]
Hello,

I have recently run into an issue where the login node does not assume the role attached to the instance for the alces user. Looking through the logs, the request to 169.254.169.254 for metadata times out. The root user is able to access it, but other users are not. Can you point me to the GitHub code section that implements the metadata request, or otherwise provide troubleshooting steps?

Jeff


#2

Hi Jeff,

By default, the AWS metadata service is only accessible by the root user – a firewall rule is added as part of the initialize event of the clusterable handler to prevent other users from accessing it.
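
If you want to confirm the rule is present on your login node, listing the OUTPUT chain as root is a quick check (the exact rule text may differ slightly between releases):

iptables -S OUTPUT | grep 169.254.169.254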

However, other users are able to access the instance role credentials stored in the /opt/clusterware/etc/config/cluster/instance-aws-iam.rc file in order to use the AWS service permissions that have been granted to the instance.
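
For reference, that file is just a shell-sourceable set of cw_INSTANCE_aws_iam_role_* variables, roughly along these lines (an illustrative sketch with placeholder values, not the exact generated contents):

cw_INSTANCE_aws_iam_role_access_key_id='<access key id>'
cw_INSTANCE_aws_iam_role_secret_access_key='<secret access key>'
cw_INSTANCE_aws_iam_role_security_token='<session token>'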


#3

Thank you for that info. That gives me a very helpful place to start troubleshooting further.


#4

Is there a reason why the alces user wouldn’t be able to access the /opt/clusterware/etc/config/cluster/instance-aws-iam.rc file, or otherwise not be able to use the role attached to the instance? I have made some architecture changes for compliance in our environment, but none to the code or scripts.


#5

I can’t think of any reason why the file wouldn’t be accessible… The first thing I would check is that the file permissions are correct, i.e.:

[root@login1(mysolocluster) ~]# ls -l /opt/clusterware/etc/config/cluster/instance-aws-iam.rc
-rw-r--r-- 1 root root 891 Apr 11 13:10 /opt/clusterware/etc/config/cluster/instance-aws-iam.rc

Secondly, I’d try manually running an aws CLI command to verify that the credentials are working:

. /opt/clusterware/etc/config/cluster/instance-aws-iam.rc
export AWS_SECRET_ACCESS_KEY="${cw_INSTANCE_aws_iam_role_secret_access_key}"
export AWS_ACCESS_KEY_ID="${cw_INSTANCE_aws_iam_role_access_key_id}"
export AWS_SECURITY_TOKEN="${cw_INSTANCE_aws_iam_role_security_token}"
aws s3 ls s3://alces-flight-<your account hash>

(Aside: you can easily find your account hash by executing alces about aws)

I’d be expecting a response something like this:

[alces@login1(mysolocluster) ~]$ aws s3 ls s3://alces-flight-nmi0ztdmyzm3ztm3
                           PRE customizer/

#6

It is the oddest thing. The file permissions are identical to yours shown above. Running the AWS command with our hash as a non-root user returns: “Unable to locate credentials. You can configure credentials by running ‘aws configure’.” When run as the root user it returns the same output as your command above.


#7

Yes, that is odd! It sounds like the aws CLI tool isn’t able to read the manually set AWS_* environment variables correctly for some reason.
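
A quick sanity check I’d try (just a sketch): confirm the variables really are present in the shell that runs the aws command, and also export AWS_SESSION_TOKEN, which is the variable name newer aws CLI releases look for (AWS_SECURITY_TOKEN is the older alias):

. /opt/clusterware/etc/config/cluster/instance-aws-iam.rc
export AWS_ACCESS_KEY_ID="${cw_INSTANCE_aws_iam_role_access_key_id}"
export AWS_SECRET_ACCESS_KEY="${cw_INSTANCE_aws_iam_role_secret_access_key}"
export AWS_SESSION_TOKEN="${cw_INSTANCE_aws_iam_role_security_token}"
env | grep '^AWS_'
aws s3 ls s3://alces-flight-<your account hash>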

Another option may be to remove the firewall rule (effectively opening access up to all users), or to add an explicit firewall rule to grant a particular user access:

i.e. Remove the rule manually (issue as root or using sudo):

iptables -D OUTPUT -d 169.254.169.254 -m owner '!' --uid-owner root '!' --gid-owner wheel -j DROP

or grant access to a specific user, e.g. alces in this case (again, issue as root or using sudo):

iptables -I OUTPUT -d 169.254.169.254 -m owner --uid-owner alces -j ACCEPT
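
Once the ACCEPT rule is in place, a simple check from the alces account is to query the metadata service directly, e.g.:

curl http://169.254.169.254/latest/meta-data/ami-id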

#8

Granting access to the alces user did indeed fix the issue. Thank you!


#9

Hi, I am also running on AWS (version 2017.2r1) and need my executable to make a call to the metadata service to get the ami-id for identification purposes:
$ curl http://169.254.169.254/latest/meta-data/ami-id
Granting iptables access to the “alces” user as shown by markt above works fine, but I need a solution that doesn’t involve SSHing into the cluster every time I start a new one up. Is there a way to fix this in the CloudFormation template (i.e. if I edit the Alces Flight template), or in some other way? I don’t want to use Gridware or S3 startup scripts if possible.
Thanks.


#10

Hi @mo10,

There isn’t a way to make this change without using customization scripts, as the rule is inserted at the top of the OUTPUT chain as part of the cluster initialization routine.

If you don’t want to use customization scripts, perhaps you could add a sudo iptables command to your job script to ensure that access is available before running your job? You can check for an existing rule with sudo iptables -C ... and insert a new rule at the top of the chain with sudo iptables -I ... if one isn’t already in place; see the sketch below.
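
As a sketch of what that could look like at the top of a job script (assuming you want to grant access to the alces user, as earlier in the thread, and that passwordless sudo is available for iptables):

sudo iptables -C OUTPUT -d 169.254.169.254 -m owner --uid-owner alces -j ACCEPT 2>/dev/null || \
  sudo iptables -I OUTPUT -d 169.254.169.254 -m owner --uid-owner alces -j ACCEPT

iptables -C exits non-zero if the rule isn’t present, so the second command only runs when the rule actually needs to be inserted.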


#11

Hi markt,

Thanks for the quick and helpful reply. While I may still need to resort to your suggestion, I’ve reconsidered my avoidance of customization scripts: I’ve created an S3 bucket with folders for the “default” profile and put a script into the “configure.d” subdirectory to make the iptables change, and that works perfectly.
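
In case it’s useful to anyone else, the script only needs to insert the ACCEPT rule mentioned earlier in the thread; mine is essentially just this (a sketch; my understanding is that configure.d scripts run as root during cluster configuration, so sudo isn’t needed):

#!/bin/bash
# allow the alces user to reach the instance metadata service
iptables -I OUTPUT -d 169.254.169.254 -m owner --uid-owner alces -j ACCEPT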

Now I am wondering how I can make this script and/or these folders in my S3 bucket (FlightProfileBucket) available to other people using our software. As a start, I’ve made the folders and the script file public (read-only), but how would one supply the S3 bucket URL to the template for a bucket one does not own, if that is even possible?

Perhaps another option would be to create a Feature Profile for our application, but I haven’t found documentation on how to do that either.