HPC clusters require high capital expenditure (CAPEX) to procure and build, and high operating expenditure (OPEX) to run, so a high return on investment (ROI) is expected. However, most HPC installations do not deliver it. Admittedly, HPC clusters are mostly used for research, so they cannot be compared directly with commercial investments. In this context, however, ROI can be measured in terms of research productivity.
HPC clusters are widely used at universities and research organisations. Because most of these organisations are multidisciplinary, their HPC clusters could reasonably be expected to run many kinds of computational programs to solve different kinds of problems. In most cases, however, they are used exclusively by one or two disciplines. The stated reasons are organisational and funding-related, but these are themselves quietly shaped by technical constraints, which in turn stem from poor technical design.
Computers are general-purpose machines that can run a wide variety of software. An HPC cluster, however, is often turned into something close to an embedded device: the software is tightly bound to the hardware, and the reusability of the machine is lost. Most HPC clusters adopt one specific operating system as their only software platform, so only applications compatible with that operating system can run on the cluster. This static model is acceptable if the cluster was procured and built for specific users and is actually utilised above 70% throughout its lifecycle. Otherwise, the cluster is not being used wisely.
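The 70% test can be made concrete with a little arithmetic. The sketch below assumes utilisation is measured as busy node-hours divided by available node-hours, which is one common definition and not necessarily the only one. The function name and the sample figures are hypothetical.

```python
THRESHOLD = 0.70  # the 70% figure mentioned in the text


def utilisation(busy_node_hours: float, nodes: int, hours: float) -> float:
    """Fraction of the available node-hours that were actually used."""
    available = nodes * hours
    if available <= 0:
        raise ValueError("nodes and hours must be positive")
    return busy_node_hours / available


# Hypothetical figures: 100 nodes over a 30-day month, 43,200 busy node-hours.
u = utilisation(busy_node_hours=43_200, nodes=100, hours=30 * 24)
print(f"utilisation = {u:.0%}, static model justified: {u >= THRESHOLD}")
# prints: utilisation = 60%, static model justified: False
```

In practice the busy node-hours would come from the scheduler's accounting records, measured over a period long enough to be representative.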
HPC clusters can be elastic. Their computing nodes should be viewed as a pool of reusable hardware resources that can dynamically host multiple software platforms and therefore accommodate various kinds of applications. The cluster can then serve a larger and more diverse group of users, which could increase the research productivity of the organisation at large.
HPC computing nodes can be dynamically segmented into multiple auto-scalable computing platforms based on demand and priority. For instance, one segment runs a Linux operating system and another runs a Windows operating system. If the Linux segment is under-utilised while demand on the Windows segment is growing, idle computing nodes are automatically removed from the Linux segment and allocated to the Windows segment, and vice versa.
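The Linux/Windows example can be sketched as a simple rebalancing rule. This is a minimal illustration, not a real scheduler: the `Segment` class, the `rebalance` function, and the `reserve` parameter are all hypothetical names. The sketch ignores priorities and the actual re-provisioning step (re-imaging or rebooting a node into the other operating system).

```python
from dataclasses import dataclass


@dataclass
class Segment:
    name: str
    idle_nodes: int     # nodes with no running job
    busy_nodes: int     # nodes currently running jobs
    queued_demand: int  # nodes requested by jobs waiting in the queue


def rebalance(donor: Segment, receiver: Segment, reserve: int = 1) -> int:
    """Move idle nodes from donor to receiver; return how many were moved.

    The donor keeps `reserve` idle nodes (an assumed safety margin) plus
    whatever it needs to serve its own queue.
    """
    shortfall = max(0, receiver.queued_demand - receiver.idle_nodes)
    spare = max(0, donor.idle_nodes - donor.queued_demand - reserve)
    moved = min(shortfall, spare)
    donor.idle_nodes -= moved
    receiver.idle_nodes += moved
    return moved


linux = Segment("linux", idle_nodes=6, busy_nodes=10, queued_demand=0)
windows = Segment("windows", idle_nodes=0, busy_nodes=8, queued_demand=4)

moved = rebalance(donor=linux, receiver=windows)
print(moved, linux.idle_nodes, windows.idle_nodes)  # prints: 4 2 4
```

Calling `rebalance(donor=windows, receiver=linux)` covers the "vice versa" case. Running both directions on a timer would approximate the automatic behaviour described above.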
Elastic HPC clusters are easy to build on public cloud platforms because the cloud is elastic by nature. However, the same model is technically achievable for on-premises HPC clusters in three ways: by building the cluster as a private cloud platform, by virtualising the computing nodes and adding an automation layer on top of them, or by adding an automation layer directly on top of the bare-metal computing nodes. The automation layer can be integrated with a billing system so that each discipline is charged separately. In this manner, the HPC cluster can be used efficiently and effectively by all the disciplines of the organisation as needed, which maximises the ROI on the HPC investment.
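As a rough illustration of the billing idea, the sketch below totals node-hours per discipline and prices them at a flat rate. The record layout, the discipline names, and the rate are assumptions made for the example. A real billing system would more likely read usage from the scheduler's accounting data and apply its own pricing rules.

```python
from collections import defaultdict


def chargeback(usage_records, rate_per_node_hour):
    """Sum node-hours per discipline and price them at a flat rate.

    Each record is a (discipline, nodes, hours) tuple, an assumed layout.
    """
    node_hours = defaultdict(float)
    for discipline, nodes, hours in usage_records:
        node_hours[discipline] += nodes * hours
    return {d: round(nh * rate_per_node_hour, 2) for d, nh in node_hours.items()}


# Hypothetical usage records and a hypothetical rate of 0.50 per node-hour.
records = [
    ("physics", 16, 10.0),
    ("biology", 4, 2.5),
    ("physics", 8, 1.0),
]
invoices = chargeback(records, rate_per_node_hour=0.5)
print(invoices)  # prints: {'physics': 84.0, 'biology': 5.0}
```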