How to configure SLURM to use spank-private-tmp?


#1

Hi everyone,

I’m trying to get the spank-private-tmp plugin to work on an Alces Flight cluster I have running on AWS.

https://github.com/hpc2n/spank-private-tmp

SLURM is already set up on the cluster and is working correctly.

I’ve compiled the plugin and set up the plugstack.conf file.
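
For reference, a plugstack.conf entry is one line per SPANK plugin; mine looks roughly like the following (the .so path is just an example of where the compiled plugin might live, and any options would come from the plugin’s README):

```
required /opt/clusterware/opt/slurm/lib/slurm/private-tmp.so
```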

However, the same configuration needs to be on all the compute nodes.

On Alces Flight the SLURM configuration is in:

/opt/clusterware/opt/slurm/etc

But I noticed that if I make changes on the login node, they are not reflected on the compute nodes. This makes sense, but I was wondering how to fix it.

Do you have any suggestions as to how to get all the compute nodes to see the same SLURM configuration?

Regards,
Bernie


#2

Hi Bernie,

Thanks for your interest in Alces Flight!

As you found, changes to slurm.conf are not replicated automatically: the files are local to each node. We do this because nodes regularly come and go from the cluster, which means the usual method of NFS-exporting the files from your Slurm master node can be unsuitable, as nodes will regularly hang if the NFS export isn’t quite ready when they boot.

We’d suggest copying the configuration to a shared storage area and using the pdsh command to push the latest configuration into place on every node. A quick script placed in your shared home directory, called update_slurm_conf.sh, something like:

#!/bin/bash -l
# Stop the Slurm daemon before swapping the configuration
service slurmd stop
# Keep a backup of the node's current configuration
cp /opt/clusterware/opt/slurm/etc/slurm.conf /tmp/slurm.conf.bak
cp /home/alces/new_slurm.conf /opt/clusterware/opt/slurm/etc/slurm.conf
service slurmd start

You could then run this across the nodes in your cluster by running the following:

[root@login1(mycluster) ~]# module load services/pdsh
[root@login1(mycluster) ~]# pdsh -g nodes '/home/alces/update_slurm_conf.sh'

Just remember that if autoscaling is enabled, this file will need updating on demand, since the node list in slurm.conf changes as nodes come and go.
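
For reference, that node list is held in the NodeName and PartitionName lines of slurm.conf; a minimal illustration (host names and hardware figures here are placeholders, not your cluster’s actual values):

```
NodeName=node[01-08] CPUs=4 RealMemory=7000 State=UNKNOWN
PartitionName=all Nodes=node[01-08] Default=YES MaxTime=INFINITE State=UP
```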

Hope that helps!

Cheers,
Ruan


#3

Thanks Ruan.

I had trouble with the command in your script (on the compute nodes):

service slurmd stop

I get the message:

Redirecting to /bin/systemctl stop slurmd.service
Failed to stop slurmd.service: Unit slurmd.service not loaded.

Edit: I found the following seems to work:

systemctl stop clusterware-slurm-slurmd.service

and

systemctl start clusterware-slurm-slurmd.service

Cheers,
Bernie


#4

I’ve tested this thoroughly now and it seems to work well.

However, I am wondering if it is possible to make it work automatically with autoscaling?

That is, when new nodes start, it would be nice if there were a way to have their SLURM configuration updated automatically.

Cheers,
Bernie


#5

Hi Bernie,

You found the slightly different name for the Clusterware Slurm daemon for systemd - glad you got that working!

To have the updated slurm.conf file automatically applied to nodes when they start and join the cluster, we’d advise taking a look at the Alces Clusterware Customizer tool, which can perform the changes you’re looking for when nodes join.

Hopefully, armed with the documentation for this tool, you should be able to create a configure hook that performs the changes you want. Remember that this can be done before the start action, so you won’t even need to restart services: just change your slurm.conf and you should be set.
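
As a rough illustration of the idea (a sketch only: the function name, argument layout, and paths below are illustrative assumptions, not taken from the Customizer documentation), a configure hook could boil down to something like:

```shell
#!/bin/bash
# Hypothetical configure hook sketch. The function name update_slurm_conf
# and the staging path are assumptions made for this example.

update_slurm_conf() {
  local slurm_etc="$1" new_conf="$2"
  # Nothing staged for this node: leave the stock configuration alone.
  [ -f "$new_conf" ] || return 0
  # Back up whatever the node booted with, then drop in the new config.
  if [ -f "${slurm_etc}/slurm.conf" ]; then
    cp "${slurm_etc}/slurm.conf" "${slurm_etc}/slurm.conf.bak"
  fi
  cp "$new_conf" "${slurm_etc}/slurm.conf"
}

# On a real node this would run during the configure lifecycle event,
# before slurmd starts, so no service restart is needed:
update_slurm_conf /opt/clusterware/opt/slurm/etc /home/alces/new_slurm.conf
```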

Hope this helps!

Ruan


#6

Hi Ruan,

Thanks for the information.

I think I have got it to work.

One of the main problems I encountered was that the /home filesystem does not appear to be mounted by the time the “configure” or even the “node-started” events occur.

The other problem I encountered was that, if I set the initial number of nodes to 0 under autoscaling, it doesn’t seem to start any nodes until I set “desired” to 1. This was mentioned in another forum thread.

Cheers,
Bernie


#7

Hi @bjpop,

That’s right: for compute nodes, the /home NFS export is mounted once the master (login) node detects that the node has joined the cluster. That can only happen after the cluster events ring service starts, which happens during the start lifecycle event. We’d recommend placing files required for customization profiles within the profiles themselves in S3 and copying them to the correct locations from within the customization scripts. For example, within the script you could use this to copy a my-config-file.conf file out of a resources subdirectory within the customization profile:

profile_dir=$(cd "$(dirname "${BASH_SOURCE[-1]}")" && pwd)
cp "${profile_dir}/resources/my-config-file.conf" /etc/some-config-file.conf

As it stands, with zero nodes available the autoscaling logic in Flight Compute Solo isn’t able to determine the required details for the instance type configured within the autoscaling group. Note that you can kick off the first instance scale-out from the master (login) node without having to sign in to the EC2 console by making use of the aws autoscaling describe-auto-scaling-groups and aws autoscaling set-desired-capacity commands.
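
For example (the group name here is a placeholder; these are the standard AWS CLI autoscaling commands, run from wherever your AWS credentials are configured):

```
[root@login1(mycluster) ~]# aws autoscaling describe-auto-scaling-groups \
    --query 'AutoScalingGroups[].[AutoScalingGroupName,DesiredCapacity]' --output table
[root@login1(mycluster) ~]# aws autoscaling set-desired-capacity \
    --auto-scaling-group-name mycluster-compute --desired-capacity 1
```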

Thanks,

Mark.


#8

Thanks Mark,

That information is very helpful.

Cheers,
Bernie