Over the last few years, GPU development has delivered much stronger peak-performance growth than CPU development.
This strong growth forces the rapid adoption of new programming paradigms and toolsets.
Programming frameworks are often born, grow, and die within two or three GPU architecture generations.
Because of this constant churn, ML and AI libraries repeatedly have to be re-based on the new paradigms, at a high development cost.
Similarly, developing software that fully exploits such architectures is a major challenge even for the most skilled professionals.
This performance growth, driven mainly by shrinking transistor sizes, has pushed hardware vendors to introduce hardware and low-level software features that enable GPU partitioning "à la CPU".
Such solutions (e.g. AMD MxGPU, NVIDIA vGPU) are currently available in commercial environments such as VMware and Citrix, but not in Linux KVM. For KVM there are some proofs of concept, but nothing enterprise-ready or commercial-grade.
In addition, Single Root I/O Virtualization (SR-IOV), a specification from the PCI-SIG (the consortium behind PCI Express), allows similar functionality with a potentially cross-vendor approach, which could make GPU sharing ubiquitous.
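As a rough illustration of how SR-IOV surfaces on a Linux host: an SR-IOV capable PCI device exposes the `sriov_totalvfs` and `sriov_numvfs` attributes under its sysfs directory, giving the maximum and the currently enabled number of virtual functions. The sketch below is not part of the tool described here; the function name `sriov_capacity` is a hypothetical helper, and the default path assumes a standard Linux sysfs layout.

```python
from pathlib import Path


def sriov_capacity(sysfs_root="/sys/bus/pci/devices"):
    """Map PCI address -> (enabled VFs, maximum VFs) for SR-IOV capable devices.

    Hypothetical helper for illustration; `sysfs_root` is a parameter so the
    function can also be pointed at a test directory.
    """
    capacity = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        # Not a Linux host, or sysfs is not mounted at the expected path.
        return capacity
    for device in sorted(root.iterdir()):
        total_file = device / "sriov_totalvfs"
        num_file = device / "sriov_numvfs"
        # Devices without SR-IOV support do not expose these attributes.
        if not (total_file.is_file() and num_file.is_file()):
            continue
        try:
            total = int(total_file.read_text().strip())
            enabled = int(num_file.read_text().strip())
        except (OSError, ValueError):
            # Unreadable or malformed attribute: skip the device.
            continue
        capacity[device.name] = (enabled, total)
    return capacity


if __name__ == "__main__":
    for address, (enabled, total) in sriov_capacity().items():
        print(f"{address}: {enabled}/{total} virtual functions enabled")
```

Whether a given GPU actually appears in this listing depends on the hardware and driver; the sketch only reports what the kernel exposes.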
Although every available solution is vendor-specific to some degree, the market direction is clear.
In fact, this paradigm brings multiple advantages to both existing and new deployments:
Developing a layer that presents a common interface on top of these technologies (and possibly similar ones developed in the future) in the open-source Linux KVM environment is crucial: it paves the way for wide adoption of this approach in both research and industry.
Such a toolset will ease and democratize access to GPU partitioning technologies, enabling, for example, the following applications in Linux KVM:
The tool will be developed by INFN and used in many deployments, enabling support for these technologies in its in-house KVM-based infrastructures.
The tool will operate as a plugin for the KVM-libvirt environment.
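To give a feel for what integrating with libvirt involves, the sketch below builds the `<hostdev>` XML fragment that libvirt uses to pass a PCI device (for example, an SR-IOV virtual function) through to a KVM guest. It does not reflect the tool's actual interface, which is not described here; `hostdev_xml` is a hypothetical helper, and only the libvirt domain XML element and attribute names are taken from libvirt itself.

```python
import re
import xml.etree.ElementTree as ET

# PCI address in the usual domain:bus:slot.function form, e.g. 0000:03:00.1
_PCI_ADDRESS = re.compile(
    r"^([0-9a-fA-F]{4}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def hostdev_xml(pci_address):
    """Return a libvirt <hostdev> fragment for the given host PCI address.

    Hypothetical helper for illustration only.
    """
    match = _PCI_ADDRESS.match(pci_address)
    if match is None:
        raise ValueError(f"not a PCI address: {pci_address!r}")
    domain, bus, slot, function = match.groups()

    # managed="yes" asks libvirt to detach the device from the host driver
    # before starting the guest and to reattach it afterwards.
    hostdev = ET.Element(
        "hostdev", {"mode": "subsystem", "type": "pci", "managed": "yes"}
    )
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        {
            "domain": f"0x{domain}",
            "bus": f"0x{bus}",
            "slot": f"0x{slot}",
            "function": f"0x{function}",
        },
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    print(hostdev_xml("0000:03:00.1"))
```

A fragment like this could, for instance, be saved to a file and attached to a guest with `virsh attach-device`; a common-interface layer would presumably automate such steps across vendor technologies.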
INFN will mainly handle the testing of GPU computing with ForBC.
To enable adoption of the tool outside the scientific community as well, a start-up will offer commercial users support plans and expert training, and will lead the tool's development community.
The start-up will focus its efforts on VDI applications exploiting the same technology.
The following actors are taking part in the initiative:
We firmly believe that bringing together as many actors as possible is the key to creating a truly hardware-agnostic software platform.
The Istituto Nazionale di Fisica Nucleare (INFN, the Italian National Institute for Nuclear Physics) is a roughly 80-year-old institution that has driven some of the most important scientific efforts around the globe since its very beginning.
Today INFN is involved in many experiments in high-energy physics (mainly the LHC at CERN), astrophysics and cosmology (gravitational-wave observatories), biomedical research, and many other fields.
Last but not least, INFN is heavily involved in IT research and in the administration of production IT services, since IT is the fabric that enables all of these scientific efforts behind the scenes.
INFN has always believed in the power of technology transfer as a means of sustaining its self-developed, leading-edge tools, which can survive thanks to the much wider audience provided by users outside the scientific community.