
      NVIDIA announces new class of supercomputer and other AI-focused data center services

The NVIDIA DGX supercomputer using GH200 Grace Hopper Superchips could be the highest of its class. Learn what this and the company’s other announcements mean for enterprise AI and high-performance computing.

      Image: Sundry Photography/Adobe Stock
On May 28 at the COMPUTEX conference in Taipei, NVIDIA announced several new hardware and networking tools, many focused on enabling artificial intelligence. The new lineup includes the 1-exaflop DGX GH200 class of supercomputers; over 100 system configuration options designed to help companies host AI and high-performance computing needs; a modular reference architecture for accelerated servers; and a cloud networking platform built around Ethernet-based AI clouds.
The announcements, along with the first public talk co-founder and CEO Jensen Huang has given since the start of the COVID-19 pandemic, helped propel NVIDIA within sight of the coveted $1 trillion market capitalization.

What makes the DGX GH200 class of AI supercomputers different?
NVIDIA’s new class of AI supercomputers takes advantage of the GH200 Grace Hopper Superchips and the NVIDIA NVLink Switch System interconnect to run generative AI language applications, recommender systems and data analytics workloads (Figure A). It’s the first product to use both the high-performance chips and the novel interconnect.
      Figure A
The Grace Hopper chip is the backbone of many of NVIDIA’s supercomputing and artificial intelligence products and services. Image: NVIDIA
NVIDIA will offer the DGX GH200 to Google Cloud, Meta and Microsoft first. Next, it plans to offer the DGX GH200 design as a blueprint to cloud service providers and other hyperscalers. It is expected to be available by the end of 2023.


The DGX GH200 is intended to let organizations run AI from their own data centers. 256 GH200 superchips in each unit provide 1 exaflop of performance and 144 terabytes of shared memory.
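A quick back-of-the-envelope calculation shows what those headline figures imply per superchip. The derived per-chip numbers below are illustrative arithmetic based only on the totals quoted above, not official NVIDIA specifications:

```python
# Per-superchip figures implied by the DGX GH200 totals:
# 256 superchips, 1 exaflop of performance, 144 TB of shared memory.
# (Illustrative arithmetic only; not official per-chip specs.)

superchips = 256
total_flops = 1e18        # 1 exaflop
total_memory_tb = 144     # terabytes of shared memory

flops_per_chip = total_flops / superchips
memory_per_chip_gb = total_memory_tb * 1000 / superchips

print(f"{flops_per_chip / 1e15:.1f} petaflops per superchip")
print(f"{memory_per_chip_gb:.1f} GB of memory per superchip")
```

That works out to roughly 3.9 petaflops and 562.5 GB of memory per GH200 superchip.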
Specifically, NVIDIA explained, the NVLink Switch System enables the GH200 chips to bypass a conventional CPU-to-GPU PCIe connection, increasing the bandwidth while reducing power consumption.
Mark Lohmeyer, vice president of compute at Google Cloud, noted in an NVIDIA press release that the new Hopper chips and NVLink Switch System can “address key bottlenecks in large-scale AI.”
“Training large AI models is traditionally a resource- and time-intensive task,” said Girish Bablani, corporate vice president of Azure infrastructure at Microsoft, in the NVIDIA press release. “The potential for DGX GH200 to work with terabyte-sized datasets would allow developers to conduct advanced research at a larger scale and accelerated speeds.”
NVIDIA will also keep some supercomputing capability for itself; the company plans to work on its own supercomputer called Helios, powered by four DGX GH200 systems.
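Scaling the published DGX GH200 figures up to the four systems planned for Helios gives a rough sense of its size. This simple multiplication is an assumption based on the per-system specs quoted earlier, not an official Helios specification:

```python
# Rough scale of the planned Helios supercomputer, assuming four
# DGX GH200 systems at the quoted per-system figures.
# (Assumption-based arithmetic, not official Helios specs.)

systems = 4
superchips_per_system = 256
flops_per_system = 1e18  # 1 exaflop each

total_superchips = systems * superchips_per_system
total_exaflops = systems * flops_per_system / 1e18

print(total_superchips, "GH200 superchips")
print(total_exaflops, "exaflops combined")
```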
NVIDIA’s new AI enterprise tools are powered by supercomputing
Another new service, the NVIDIA AI Enterprise library, is designed to help organizations access the software layer of the new AI offerings. It includes more than 100 frameworks, pretrained models and development tools. They are appropriate for the development and deployment of production AI, including generative AI, computer vision, speech AI and others.
On-demand support from NVIDIA AI experts will be available to help with deploying and scaling AI projects. The experts can help deploy AI on data center platforms from VMware and Red Hat or on NVIDIA-Certified Systems.
SEE: These are the top-performing supercomputers in the world.
Faster networking for AI in the cloud
NVIDIA wants to help speed up Ethernet-based AI clouds with the accelerated networking platform Spectrum-X (Figure B).
      Figure B
      Components of the Spectrum-X accelerated networking platform. Image: NVIDIA
“NVIDIA Spectrum-X is a new class of Ethernet networking that removes barriers for next-generation AI workloads that have the potential to transform entire industries,” said Gilad Shainer, senior vice president of networking at NVIDIA, in a press release.
Spectrum-X can support AI clouds with 256 200 Gbps ports connected by a single switch, or 16,000 ports in a two-tier spine-leaf topology.
Spectrum-X does so by using Spectrum-4, a 51 Tbps Ethernet switch built specifically for AI networks. Advanced RoCE extensions bringing together the Spectrum-4 switches, BlueField-3 DPUs and NVIDIA LinkX optics create an end-to-end 400GbE network optimized for AI clouds, NVIDIA said.
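The single-switch configuration and the switch capacity line up: a quick check (my arithmetic on the quoted figures, not an NVIDIA statement) shows that 256 ports at 200 Gbps roughly saturate the 51 Tbps quoted for Spectrum-4:

```python
# Sanity check: aggregate bandwidth of a single-switch Spectrum-X
# configuration (256 ports at 200 Gbps) versus the 51 Tbps capacity
# quoted for the Spectrum-4 switch. (Illustrative arithmetic only.)

ports = 256
port_speed_gbps = 200

aggregate_tbps = ports * port_speed_gbps / 1000
print(f"{aggregate_tbps} Tbps aggregate")  # 51.2 Tbps, in line with the 51 Tbps switch
```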
Spectrum-X and its related products (Spectrum-4 switches, BlueField-3 DPUs and 400G LinkX optics) are available now, including ecosystem integration with Dell Technologies, Lenovo and Supermicro.
MGX Server Specification coming soon
In more news regarding accelerated performance in data centers, NVIDIA has released the MGX server specification. It is a modular reference architecture for system manufacturers working on AI and high-performance computing.
“We created MGX to help organizations bootstrap enterprise AI,” said Kaustubh Sanghani, vice president of GPU products at NVIDIA.
Manufacturers will be able to specify their GPU, DPU and CPU preferences within the initial, basic system architecture. MGX is compatible with current and future NVIDIA server form factors, including 1U, 2U and 4U (air or liquid cooled).
SoftBank is now working on building a network of data centers in Japan that will use the GH200 Superchips and MGX systems for 5G services and generative AI applications.
QCT and Supermicro have adopted MGX and will have it on the market in August.
Other news from NVIDIA at COMPUTEX
NVIDIA announced a variety of other new products and services based around operating and using artificial intelligence:

WPP and NVIDIA Omniverse came together to announce a new engine for marketing. The content engine will be able to generate video and images for advertising.
A smart manufacturing platform, Metropolis for Factories, can create and manage custom quality-control systems.
The Avatar Cloud Engine (ACE) for Games is a foundry service for video game developers. It enables animated characters to call on AI for speech generation and animation.

      Alternatives to NVIDIA’s supercomputing chips
There aren’t many companies or products aiming for the AI and supercomputing speeds NVIDIA’s Grace Hopper chips enable. NVIDIA’s major rival is AMD, which produces the Instinct MI300. This chip includes both CPU and GPU cores and is expected to run the 2-exaflop El Capitan supercomputer.
Intel offered the Falcon Shores chip, but it recently announced that this would not be coming out with both a CPU and a GPU. Instead, it has changed the roadmap to focus on AI and high-performance computing, but not include CPU cores.
