Once a specialist in gaming graphics cards, NVIDIA has become, in just a few short years, a giant in Big Data and a major player in the data center industry. Take a look behind the scenes of this transformation, and at the different solutions the American company offers today…
When a company enters a new market, the transition can end in a bitter failure. However, it can also be successful.
By deciding to tackle the market for Data Centers, Big Data and AI, NVIDIA has pulled off a real masterstroke, one that allows it to prepare for the world of tomorrow. Through this dossier, discover how the graphics card manufacturer has become a giant in the Big Data industry…
NVIDIA and HPC
It all started in 2006. That’s when Stanford University researchers found that using GPUs (Graphics Processing Units) for the most intensive tasks offered better performance per watt than traditional CPUs.
Indeed, the elements used to process pixels can also be used for scientific calculations: this is what came to be called “GPU Compute”. Several NVIDIA executives, including Bill Dally and John Nickolls, decided to seize this opportunity to enter the High-Performance Computing (HPC) market.
To accomplish this, NVIDIA added new features for HPC workloads to its GPUs. It also developed the Tesla product range, derived from its Quadro line of professional workstation GPUs. The Tesla brand has since been dropped, and the current generation is based on the Ampere architecture.
This move into the HPC market proved a real success: NVIDIA GPUs now equip 127 supercomputers in the TOP500, including the two fastest supercomputers in the world, both owned by the US Department of Energy: Summit, located at Oak Ridge National Laboratory, and Sierra, at Lawrence Livermore National Laboratory.
Subsequently, researchers in artificial intelligence decided to use GPUs to accelerate a class of Machine Learning algorithms known as Deep Convolutional Neural Networks (DCNNs).
By combining DCNNs and GPUs, artificial intelligence training has greatly increased in speed and accuracy. This has led to an explosion of AI research and applications, and propelled NVIDIA to the forefront.
The American firm quickly adapted its GPUs to these new workloads, adding mathematical functions and dedicated processing elements called “Tensor Cores”. It has also developed cuDNN, a software library for neural networks built on its CUDA programming framework.
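To illustrate why convolutional networks map so well onto GPUs, here is a minimal pure-Python sketch of the 2D convolution at the heart of a DCNN layer. The function name `conv2d_valid` and the sample values are hypothetical, chosen only for illustration; real frameworks delegate this work to optimized GPU libraries rather than Python loops.

```python
def conv2d_valid(image, kernel):
    """Naive 2D convolution (cross-correlation, no padding) over nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    # Each (i, j) output value depends only on its own input window,
    # so all of them could be computed in parallel, one GPU thread each.
    for i in range(out_h):
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            output[i][j] = acc
    return output


# Hypothetical 3x4 "image" and 2x2 filter, for illustration only.
image = [
    [1, 2, 3, 0],
    [4, 5, 6, 0],
    [7, 8, 9, 0],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)
print(feature_map)
```

The two outer loops are independent of one another, which is the property a GPU exploits: thousands of output values can be computed at the same time, much like the pixels of a rendered frame.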
NVIDIA and Data Centers
In March 2019, NVIDIA announced the acquisition of Mellanox, the data center networking company, for $6.9 billion. The announcement surprised many experts, but it was above all an excellent strategic decision.
This is the largest acquisition in NVIDIA’s history, both in terms of price and consequences. By acquiring Mellanox, NVIDIA has officially become a data center company.
The company is convinced that Data Centers will be increasingly indispensable for processing the exabytes of data generated by new technologies such as autonomous vehicles.
To address these data science needs, NVIDIA has developed several server designs based on Mellanox rack networking. Indeed, Mellanox’s technology makes a Data Center flexible enough to cope with changing workloads, hyperscaling and containerization.
One of the key components of this technology is the use of accelerators rather than CPUs for networking tasks. In the future, the company also plans to add artificial intelligence to its switching products to transfer data more efficiently.
In the second quarter of 2020, revenues from NVIDIA’s data center business exceeded those generated by its gaming products for the first time. This was a real turning point for the company.
Over the years, NVIDIA has developed numerous products for Big Data and Data Centers. First of all, the company offers GPUs specifically dedicated to Data Centers. This range was formerly sold under the “Tesla” brand, which NVIDIA has since dropped, reportedly to avoid confusion with the eponymous Elon Musk car brand. The current generation is built on the “Ampere” architecture.
The newest addition to this catalogue is the A100 Tensor Core. This GPU delivers unmatched acceleration capabilities for data analysis, HPC and AI workflows. It makes it possible to interconnect thousands of GPUs or, conversely, to partition a single GPU into as many as seven instances with MIG (Multi-Instance GPU) technology.
At the GTC 2019 keynote, NVIDIA CEO and founder Jensen Huang stated that Data Science has become the fourth pillar of the scientific method. However, in view of the worldwide shortage of Data Scientists and AI researchers, he considers it essential to maximize the productivity of these experts.
To enable more developers to take advantage of its Data Science resources, NVIDIA designs DGX workstations and servers that come with all the CUDA-X tools and libraries for Machine Learning research. With the help of several manufacturers such as Dell, HP and Lenovo, it is bringing new Data Science platforms to market.
With its DGX systems, NVIDIA provides companies with solutions for setting up and scaling artificial intelligence and Deep Learning infrastructure.
This range of systems is fully designed for artificial intelligence workflows. It includes NVIDIA DGX workstations and servers, as well as the scalable DGX POD infrastructure solution.
NVIDIA HGX A100 Accelerated Computing Platform
With its HGX A100 platform, NVIDIA offers the world’s most powerful server platform for accelerating multi-precision calculations on Deep Learning, Machine Learning and HPC workflows.
This platform combines up to eight A100 Tensor Core GPUs with NVSwitch technology to form a unified accelerator, powerful enough to handle the most complex and advanced calculations.
NVIDIA EGX for Edge Computing
As businesses increasingly operate at the edge of networks, NVIDIA is responding to this new demand with its NVIDIA EGX platform. The platform brings the benefits of AI-accelerated computing power directly to the edge, with management through the Cloud.
This GPU-accelerated computing platform, dedicated to the deployment and management of AI applications, is fully scalable. It also offers low latency and is extremely easy to use.
The NGC Software Hub
The NGC hub brings together a wide range of GPU-accelerated software for Deep Learning, Machine Learning and High-Performance Computing. This simplifies professional workflows, allowing data scientists to focus on creating innovative solutions.
It offers Helm charts to automate deployments, purpose-built containers to deploy AI frameworks more quickly, pre-trained models and training scripts.
NVIDIA Accelerates Apache Spark 3.0
In May 2020, NVIDIA chose to collaborate with the open-source community to bring GPU acceleration to Apache Spark 3.0, the new version of the Big Data analysis engine used by more than 500,000 data scientists worldwide.
This GPU acceleration covers the entire Spark 3.0 pipeline, from the ETL process to AI training, which can now run on the same Spark cluster rather than on a separate infrastructure. Using the preview version of Spark 3.0 on Databricks, Adobe has increased performance by a factor of 7 and reduced costs by 90%.
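As a rough idea of what enabling this acceleration looks like in practice, the sketch below assembles a `spark-submit` command line. It assumes the RAPIDS Accelerator for Apache Spark plugin, which the article does not name explicitly; the helper `rapids_spark_submit_args`, its parameters and the script name `etl_job.py` are hypothetical. The configuration keys are the ones commonly documented for Spark 3.0 GPU scheduling and for the plugin, and should be checked against the versions actually deployed.

```python
def rapids_spark_submit_args(app_path, gpus_per_executor=1, tasks_per_gpu=4):
    """Build a spark-submit argument list with GPU acceleration enabled.

    Hypothetical helper for illustration only. The plugin jar itself must
    also be supplied (for example through --jars); that step is omitted here.
    """
    conf = {
        # Load the RAPIDS Accelerator SQL plugin (assumed, not named in the article).
        "spark.plugins": "com.nvidia.spark.SQLPlugin",
        "spark.rapids.sql.enabled": "true",
        # Spark 3.0 resource scheduling: GPUs per executor and the share per task.
        "spark.executor.resource.gpu.amount": str(gpus_per_executor),
        "spark.task.resource.gpu.amount": str(1 / tasks_per_gpu),
    }
    args = ["spark-submit"]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


args = rapids_spark_submit_args("etl_job.py")
print(" ".join(args))
```

Letting several tasks share one GPU (a fractional `spark.task.resource.gpu.amount`) is one common way to keep the device busy while CPU-bound stages of the same job run on the same cluster.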
NVIDIA Sets Record for Big Data Analysis Performance
In June 2020, NVIDIA broke the speed record for Big Data analysis on the standard TPCx-BB benchmark. Specifically, the firm achieved performance nearly 20 times higher than the previous record.
This feat was achieved using the RAPIDS suite of open-source Data Science software libraries, powered by 16 DGX A100 systems. In total, the systems combined 128 A100 GPUs and used Mellanox networking technology. The benchmark was executed in just 14.5 minutes thanks to this configuration, while the previous record was 4.7 hours on a CPU system.
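The figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative.

```python
# Figures taken from the text above.
previous_record_hours = 4.7   # previous record, on a CPU system
new_record_minutes = 14.5     # NVIDIA's run on DGX A100 systems
dgx_systems = 16
total_gpus = 128

# Convert the old record to minutes, then compare the two run times.
speedup = previous_record_hours * 60 / new_record_minutes
gpus_per_system = total_gpus // dgx_systems

print(f"Speedup: about {speedup:.1f}x")
print(f"GPUs per DGX A100 system: {gpus_per_system}")
```

The ratio comes out slightly below 20, which is consistent with the rounded “20 times” figure, and the GPU count works out to eight A100 GPUs per DGX A100 system.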