Artificial Intelligence

NVIDIA unveils its newest high-end chip for AI

14th November 2023
Paige West

NVIDIA has announced the launch of the NVIDIA HGX H200, enhancing its AI computing platform.

The H200, based on the NVIDIA Hopper architecture, features the NVIDIA H200 Tensor Core GPU, equipped with advanced memory to manage significant data volumes for generative AI and high-performance computing (HPC) workloads.

The H200 introduces HBM3e, offering faster and larger memory to accelerate generative AI and large language models, as well as advancing scientific computing for HPC workloads. With HBM3e, the NVIDIA H200 delivers 141GB of memory at 4.8 terabytes per second, nearly double the capacity and 2.4x the bandwidth of its predecessor, the NVIDIA A100.
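To illustrate why memory bandwidth is the headline figure for LLM inference, here is a rough back-of-envelope estimate (not from NVIDIA; the model size and FP8 precision are illustrative assumptions) of the upper bound on single-GPU decode throughput for a memory-bound model:

# Back-of-envelope: upper bound on memory-bound LLM decode throughput.
# Illustrative assumptions: a 70-billion-parameter model quantised to FP8
# (1 byte per weight). Real throughput is lower once KV-cache traffic,
# activations, and kernel overheads are accounted for.
H200_BANDWIDTH_TBPS = 4.8   # HBM3e bandwidth, from the announcement
PARAMS_BILLION = 70         # e.g. a Llama 2 70B-class model (assumption)
BYTES_PER_PARAM = 1         # FP8 (assumption)

weight_bytes = PARAMS_BILLION * 1e9 * BYTES_PER_PARAM
bandwidth_bytes = H200_BANDWIDTH_TBPS * 1e12
tokens_per_second = bandwidth_bytes / weight_bytes  # one full weight pass per token
print(f"~{tokens_per_second:.0f} tokens/s upper bound per GPU")  # ~69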

Ian Buck, Vice President of hyperscale and HPC at NVIDIA, said: “To create intelligence with generative AI and HPC applications, vast amounts of data must be efficiently processed at high speed using large, fast GPU memory. With NVIDIA H200, the industry’s leading end-to-end AI supercomputing platform just got faster to solve some of the world’s most important challenges.”

The NVIDIA Hopper architecture delivers notable performance improvements over its predecessor and continues to advance through software enhancements such as the recent release of NVIDIA TensorRT-LLM. The H200 is expected to nearly double inference speed on Llama 2, a 70-billion-parameter LLM, compared with the H100, with further gains anticipated from future software updates.
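For readers curious what running such a model through TensorRT-LLM looks like in practice, here is a minimal sketch assuming TensorRT-LLM's high-level Python LLM API; the model name and sampling settings are illustrative, and the exact API surface may differ between releases:

# Minimal sketch of serving a Llama 2 model with TensorRT-LLM's
# high-level Python API. Model name and parameters are illustrative
# assumptions, not taken from this article.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-70b-hf")        # builds/loads a TensorRT engine
params = SamplingParams(max_tokens=128, temperature=0.8)

outputs = llm.generate(["Explain HBM3e in one sentence."], params)
print(outputs[0].outputs[0].text)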

The H200 will be available in NVIDIA HGX H200 server boards with four- and eight-way configurations, compatible with both the hardware and software of HGX H100 systems. It will also be offered in the NVIDIA GH200 Grace Hopper Superchip with HBM3e, announced in August.

With these configurations, the H200 can be deployed in various data centre types, including on-premises, Cloud, hybrid-Cloud, and Edge. Server makers such as ASRock Rack, ASUS, Dell Technologies, Eviden, GIGABYTE, Hewlett Packard Enterprise, Ingrasys, Lenovo, QCT, Supermicro, Wistron, and Wiwynn can update their existing systems with the H200.

Cloud service providers like Amazon Web Services, Google Cloud, Microsoft Azure, Oracle Cloud Infrastructure, CoreWeave, Lambda, and Vultr are among the first to deploy H200-based instances starting next year.

The HGX H200, with NVIDIA NVLink and NVSwitch high-speed interconnects, supports various application workloads, including LLM training and inference for models beyond 175 billion parameters. An eight-way HGX H200 offers over 32 petaflops of FP8 deep learning compute and 1.1TB of high-bandwidth memory for generative AI and HPC applications.
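The quoted aggregate figures follow directly from the per-GPU specification. A quick sanity check (the per-GPU FP8 figure is an assumption drawn from published Hopper specifications, not from this article):

# Sanity-check the eight-way HGX H200 aggregates from per-GPU figures.
# The ~4 PFLOPS FP8 per-GPU peak (with sparsity) is an assumption based
# on published Hopper specifications; the 141GB figure is from the article.
GPUS = 8
FP8_PFLOPS_PER_GPU = 3.96   # approx. Hopper FP8 peak with sparsity (assumption)
HBM_GB_PER_GPU = 141        # HBM3e per H200, from the announcement

print(f"FP8 compute: ~{GPUS * FP8_PFLOPS_PER_GPU:.0f} PFLOPS")  # ~32 PFLOPS
print(f"HBM capacity: {GPUS * HBM_GB_PER_GPU / 1000:.2f} TB")   # ~1.13 TB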

Paired with NVIDIA Grace CPUs and NVLink-C2C interconnect, the H200 forms the GH200 Grace Hopper Superchip with HBM3e, designed for large-scale HPC and AI applications.

NVIDIA's computing platform is complemented by software tools that aid developers and enterprises in building and accelerating applications from AI to HPC, including the NVIDIA AI Enterprise suite for various workloads.

The NVIDIA H200 will be available from global system manufacturers and Cloud service providers starting in the second quarter of 2024.
