Yes exactly - around 4.6TB I think it is. We already have a research grant (though I'd rather spend it on something else), and I've been in touch with the open data team since December discussing a public dataset. But thanks for the pointers!
Hah, ok yes a year - that's great then. I like the idea of being able to leave a Cluster Flow setup with everything configured and ready to go, which then spawns expensive compute nodes when I kick off a pipeline and shuts them down automatically when the run finishes. That way there's no need to worry about forgetting to switch the cluster off when it's done. The 30GB limit will probably be a constraint though. I may add a section at the end of the docs mentioning this possibility (but absolving myself of any risk!).
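To sketch roughly what I'm imagining (purely hypothetical - the AMI ID, instance type and `run_pipeline()` helper are all made up, and a proper setup would autoscale off the job queue rather than a wrapper script), something along these lines using boto3:

```python
# Hypothetical sketch: spin up a compute node only for the duration of a
# pipeline run, then tear it down so nothing is left running afterwards.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")


def run_pipeline(instance_id):
    # Placeholder: in reality this would submit the Cluster Flow pipeline
    # jobs to the newly launched node.
    pass


def run_with_temporary_node():
    # Launch the expensive node only when a pipeline is kicked off
    resp = ec2.run_instances(
        ImageId="ami-xxxxxxxx",     # placeholder for a pre-configured Cluster Flow image
        InstanceType="r3.4xlarge",  # placeholder for whatever node size the pipeline needs
        MinCount=1,
        MaxCount=1,
    )
    instance_id = resp["Instances"][0]["InstanceId"]
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    try:
        run_pipeline(instance_id)
    finally:
        # Terminate whether the run succeeded or failed, so there's no
        # cluster left on to forget about.
        ec2.terminate_instances(InstanceIds=[instance_id])
```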