Any GUI cluster monitoring tools?


#1

Are there any all-in-one GUI tools for getting a nice dashboard of a running cluster?
Something like Ganglia, but perhaps nicer…


#2

Hi jjv5,

There are many different monitoring tools out there. With the assistance of Alces Flight Customisations, it would be possible to write some scripts to install and configure these tools for your cluster.

Let us know how your experience goes getting it to work.

Thanks,

Stu


#3

There are many different monitoring tools out there

Such as? This was my question.

would be possible to write some scripts to install and configure these tools for your cluster.

Well, I’m asking here in the Community forum specifically so I don’t need to do custom stuff. No one already uses any kind of GUI tool?

I’d like something simple for a new user of a cluster.


#4

The answer very much depends on what you want to monitor and why.

For an overview dashboard, have you tried the AWS console? CloudWatch and Systems Manager are a good place to start setting up in your AWS account.

If it's OS-level system monitoring you are after, then most solutions will work on the base Flight Linux OS; just look for compatibility with CentOS. Ganglia is the obvious one, designed to be lightweight for HPC clusters, but I've also had others such as Munin, Nagios, and Cacti, as well as commercial ones such as Datadog, all working on Flight clusters. Just following the install instructions of the respective tool should pretty much result in a working base system on a Flight cluster.

If it's job monitoring you are after, I believe there are a number of GUI solutions for Slurm out there, but I don't have much experience of those, so maybe someone else can help. I generally find the CLI tools more than enough to get a good idea of what's going on…
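
As a rough illustration of the CLI route for job monitoring, here is a minimal sketch that counts jobs per state from `squeue` output. It assumes Slurm's `squeue` is on the PATH of the node where it runs; the function names `summarize_states` and `current_job_summary` are made up for this example and are not part of Slurm or Flight.

```python
import subprocess
from collections import Counter


def summarize_states(squeue_output: str) -> Counter:
    """Count jobs per state from text holding one job state per line."""
    states = (line.strip() for line in squeue_output.splitlines())
    # Skip blank lines so trailing newlines do not produce an empty state.
    return Counter(state for state in states if state)


def current_job_summary() -> Counter:
    """Run squeue and summarize job states (assumes Slurm is installed)."""
    # -h drops the header row; -o "%T" prints only the job state, e.g. RUNNING.
    result = subprocess.run(
        ["squeue", "-h", "-o", "%T"],
        capture_output=True,
        text=True,
        check=True,
    )
    return summarize_states(result.stdout)
```

On a login node, something like `print(current_job_summary())` would give a quick count of running and pending jobs, which may be enough for a new user before setting up a full dashboard.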