How does AutoScaling work in detail?


#1

Hello, and thank you for this amazing tool.

I have read through all of the Docs pages, but have not found a lot of information about how I should expect AutoScaling to work. I understand that basically if there are jobs in the queue then the cluster is scaled up, and if nodes are idle they are shut down. I am wondering, is there any documentation about the specifics of the scaling algorithm?

For example here are some behaviors I have noticed that I would like to understand better, and some specific questions:
– My selected compute node type is c4.8xlarge (36 cores). I queue up 70 jobs (using SLURM), and one compute node is created, so I have 36 running and 34 pending jobs. I would expect that, since I have pending jobs, a second node would be created (my maximum node count is 32), but there is consistently only one node. Why does this happen?
– When I first queue up my jobs, it takes a relatively long time for the cluster to react and start allocating nodes: I have to wait ~5 minutes for any nodes to be created if I start with a cluster that only has a login node (not counting the time to actually boot the compute node). Is there any way to make this delay shorter?
– If I start my cluster with, e.g., 5 initial compute nodes, will they ever be autoscaled down (removed)? If yes, is there a way to force the original nodes to persist? If no, is there a way to force them to autoscale down if they’ve been idle for too long?
– Is there any way to mix and match compute nodes, or am I limited to only one type per cluster? E.g. can I have two c4.8xlarge nodes (36 CPUs each) and one c4.large node (2 CPUs), for a total of 74 CPUs, if I have 74 single-threaded jobs to run?
– Is there a way to scale across multiple regions to overcome the 20-instance limit for spot pricing?

Thank you for the help.


#2

Hi @dkv,

Thanks for using Alces Flight! Please could you let me know which product, and which version, you’re currently using, so I can be sure to answer appropriately?

The autoscaler checks for demand every five minutes, which is why you’re seeing a delay before compute nodes are launched. This interval is not currently configurable, but (as your next question anticipates) it is possible to start the cluster with a number of compute nodes ready to go.

A compute group started with one or more initial compute nodes will always keep at least one node running, but it will terminate any additional nodes if there is insufficient demand. There is not currently a way to maintain a minimum number of more than one node without disabling autoscaling (by running alces configure autoscaling disable).
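Roughly speaking, the scale-down floor behaves like this (a minimal illustrative sketch in Python, not the actual autoscaler code; the function name is made up):

    def target_group_size(demanded_nodes: int) -> int:
        # For a group launched with one or more initial compute nodes:
        # surplus nodes beyond current demand may be terminated, but the
        # group never drops below a single running node.
        return max(1, demanded_nodes)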

It’s not possible to launch nodes of differing types within a compute group.

AWS does not support scaling across multiple regions, I’m afraid. It may be possible to request an increased spot instance limit for your account from AWS.

Yours,

James


#3

Hi James,

Thank you for the reply! I’m using Alces Flight Solo Community Edition 2016.4r1 with the “Personal HPC Cluster” deployment option.

Your answers were very helpful. I do have a few follow-ups:
– Regarding my first point above, I’ve realized that the reason is that AWS also limits the number of running instances on a per-instance-type basis, so I can only launch one c4.8xlarge instance at a time (I’ve now requested a limit increase). Assuming that limit isn’t there, how does scaling up work? E.g. does it consider the number of CPUs per instance? (If I have 70 single-threaded jobs, I only want 70 CPUs, i.e. two c4.8xlarge instances, but I could also imagine a scaling algorithm that assumes one job per node rather than one job per CPU and gives me 70 instances.)
– I understand that if I start with one or more initial compute nodes, I will always have my login node and at least one compute node running until I manually shut down the cluster. I am wondering, if I start with zero initial compute nodes, will it scale back down to zero, or will it similarly scale down to one compute node?
– On the same note about down-scaling, how does the algorithm quantify “insufficient demand”?
– If a cluster gets scaled down due to a Spot price increase above my threshold, will it be scaled back up once the Spot price drops below the threshold again? Will the jobs that got interrupted as a result of the cluster shutdown be restarted, or will only the jobs that were “pending” prior to the shutdown be resumed?

Thanks again.


#4

Hi @dkv ,

The autoscaler will take available CPU cores into account. It considers demand in terms of number of CPUs required, and then calculates the number of machines required to fulfil that demand. So, in your example, it would calculate that demand for 70 CPU cores in a group of c4.8xlarge instances equates to two nodes (70 divided by 36, rounded up).
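As a rough sketch of that calculation (illustrative Python only, not the actual autoscaler code; the names are made up):

    import math

    def nodes_needed(queued_cpu_demand: int, cpus_per_node: int) -> int:
        # Nodes required to satisfy the queued CPU demand, rounding up
        # because a partially used node is still a whole node.
        return math.ceil(queued_cpu_demand / cpus_per_node)

    print(nodes_needed(70, 36))  # 70 single-threaded jobs on c4.8xlarge -> 2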

Launching a cluster with zero initial compute nodes means that it will be able to scale down to zero compute nodes.

The autoscaler will reduce the size of the compute group by terminating a node if there are no jobs currently running on it and it is approaching the end of its AWS billing hour. This check happens every five minutes.
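In pseudo-Python, the decision made on each five-minute check looks roughly like this (an illustrative sketch; the 55-minute threshold is my own assumption, not a documented value):

    def should_terminate(jobs_running_on_node: int, minutes_into_billing_hour: int) -> bool:
        # A node is a candidate for termination only when it is idle and
        # approaching the end of its AWS billing hour.
        idle = jobs_running_on_node == 0
        near_hour_end = minutes_into_billing_hour >= 55  # assumed threshold
        return idle and near_hour_end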

If a Spot price increase causes a cluster to be scaled down, then AWS will scale it back up when the price drops below the threshold. It may be possible to configure the job scheduler to automatically rerun jobs that were prematurely terminated, though this is not default behaviour (since a job may have modified its input dataset in some way). In general jobs will not be “paused” in such a way that they can be resumed from partway through.
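For example, since you’re using SLURM, one possibility is to submit jobs with sbatch’s --requeue flag so that a job killed by a node shutdown is put back in the queue rather than lost. The sketch below simply wraps that submission in Python; the script name is a placeholder, and a requeued job re-runs from the beginning, so make sure it can safely re-read its input.

    import subprocess

    # Submit a job that SLURM may requeue if its node is terminated.
    # "myjob.sh" is a placeholder for your own batch script.
    subprocess.run(["sbatch", "--requeue", "myjob.sh"], check=True)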

Hope this is helpful.

Yours,

James


#5

Yes, extremely helpful, thank you!! I’m now much more comfortable using the AutoScaling feature for a large job.


#6

2 posts were split to a new topic: Autoscaling from zero nodes

