NVIDIA H200 Tensor Core GPU - System Overview

Additional Information

NVIDIA H100 vs H200 GPU Chart

NVIDIA’s H200 Tensor Core GPU features an SXM form factor, while the NVIDIA H200 NVL is a PCIe-based card designed for space-constrained environments. Both offer comparable performance across the board. Crucial for high-performance computing, the memory bandwidth of the NVIDIA H200 Tensor Core GPU allows for faster data transfers than the H100. Greater memory bandwidth ensures faster data access and the ability to evaluate and manipulate that data for significantly faster results. For high-performance computing tasks, it is up to 110X faster than the Ampere-based A100. For inference on GPT-3 175B, a state-of-the-art language model featuring 175 billion parameters, it is 1.6X faster than the H100.
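To put the bandwidth figure in perspective, here is a back-of-the-envelope sketch (not a benchmark) of how long it takes to stream the card's full memory once at its peak rate. The 141GB and 4.8TB/s figures come from the specification table below; real sustained bandwidth will be lower than peak.

```python
# Back-of-the-envelope: time to read every byte of the H200's 141 GB
# of HBM3e once at the peak 4.8 TB/s memory bandwidth (peak, not
# sustained; actual workloads will see less).
def full_memory_sweep_seconds(memory_gb: float, bandwidth_tb_s: float) -> float:
    """Seconds to stream the entire GPU memory once at peak bandwidth."""
    return memory_gb / (bandwidth_tb_s * 1000)  # 1 TB = 1000 GB

t = full_memory_sweep_seconds(141, 4.8)
print(f"{t * 1000:.1f} ms")  # prints "29.4 ms"
```

Bandwidth-bound kernels touch all of memory many times per second, which is why this number, rather than raw FLOPS, often dominates HPC and inference performance.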

Performance and Memory

NVIDIA H200 Tensor Core GPU front view

It offers the most memory of any NVIDIA GPU to date at 141GB of HBM3e, and delivers up to 4 petaFLOPS of Floating Point 8 (FP8) performance. FP8 precision is far less memory-intensive than the FP32 and FP64 formats used by other high-performance GPUs from NVIDIA. Jointly specified by NVIDIA, Arm, and Intel, FP8 allows for up to 4X higher inference performance by improving memory efficiency, and is available only on NVIDIA's Ada Lovelace and Hopper GPU architectures. The NVIDIA H200 NVL also comes with a five-year NVIDIA AI Enterprise subscription to help develop and deploy production-ready generative AI solutions, such as speech AI, computer vision, retrieval-augmented generation (RAG), and others.
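The memory savings from lower precision are easy to quantify. The sketch below counts weight storage only for a 175-billion-parameter model like GPT-3 (activations, KV cache, and optimizer state are deliberately ignored), using the standard byte widths of each format.

```python
# Weight-only memory footprint of a 175B-parameter model at different
# precisions. FP8 uses 1 byte per parameter vs 4 bytes for FP32, which
# is where the memory-efficiency gain comes from.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}

def weight_footprint_gb(n_params: float, dtype: str) -> float:
    """Gigabytes needed to store the weights alone at the given precision."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

for dtype in ("fp32", "fp16", "fp8"):
    print(dtype, round(weight_footprint_gb(175e9, dtype)), "GB")
# fp32 700 GB, fp16 350 GB, fp8 175 GB
```

Note that even at FP8, the weights alone (175 GB) exceed a single H200's 141GB, which is why models of this size are sharded across multiple GPUs regardless of precision.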

Both the SXM form factor and the PCIe-based GPU support Multi-Instance GPU (MIG), allowing each GPU to be divided into up to seven instances, each with 18GB of memory. Performance indicators for the two form factors are nearly identical; they differ mainly in thermal output and interconnect. The PCIe card interconnects via a 2-way or 4-way NVIDIA NVLink bridge, whereas the SXM version sits on an NVLink board with 4 or 8 GPUs.
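A quick sanity check on the MIG arithmetic: seven instances at 18GB each account for 126GB of the card's 141GB. The sketch below assumes (this is an assumption, not an official breakdown) that the remainder is reserved for instance overhead and unpartitioned framebuffer.

```python
# MIG partitioning arithmetic for the H200 (both form factors):
# up to 7 instances at 18 GB each, out of 141 GB total.
TOTAL_GB = 141
MAX_INSTANCES = 7
PER_INSTANCE_GB = 18

allocated = MAX_INSTANCES * PER_INSTANCE_GB
# Remainder is assumed reserved/overhead -- not documented here.
print(allocated, "GB across instances;", TOTAL_GB - allocated, "GB remaining")
```

Each MIG instance gets its own isolated slice of memory, cache, and compute, so one physical GPU can serve seven independent workloads or tenants.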

Cooling and Power

NVIDIA H200 NVL GPU front view

Offering dramatically increased performance, the NVIDIA H200 delivers the same energy efficiency as the H100, reducing TCO by up to 50% on a performance-per-cost basis. The PCIe-based NVIDIA H200 NVL features passive cooling, with airflow provided by the host system. The SXM version is likewise cooled by the host system, which pushes air over the towering heatsinks on each GPU. Maximum Thermal Design Power (TDP) is listed at 700W (configurable) for the SXM form factor and up to 600W (configurable) for the PCIe-based version.
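For capacity planning, the TDP ceilings above translate directly into worst-case daily energy draw. The sketch below is an upper bound at the configurable TDP limits; real draw varies with workload and configured power cap.

```python
# Worst-case daily energy draw at the quoted TDP ceilings
# (700 W for SXM, 600 W for the PCIe-based NVL).
def kwh_per_day(tdp_watts: float) -> float:
    """kWh consumed per 24-hour day running flat-out at the given TDP."""
    return tdp_watts * 24 / 1000

print(kwh_per_day(700), kwh_per_day(600))  # 16.8 and 14.4 kWh/day
```

Multiply by rack density and local electricity rates to estimate the power component of TCO for a deployment.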

Check Availability

NVIDIA H200 Tensor Core GPU - Specifications

Specs | H200 SXM | H200 NVL
FP64 | 34 TFLOPS | 34 TFLOPS
FP64 Tensor Core | 67 TFLOPS | 67 TFLOPS
FP32 | 67 TFLOPS | 67 TFLOPS
TF32 Tensor Core | 989 TFLOPS | 989 TFLOPS
BFLOAT16 Tensor Core | 1,979 TFLOPS | 1,979 TFLOPS
FP16 Tensor Core | 1,979 TFLOPS | 1,979 TFLOPS
FP8 Tensor Core | 3,958 TFLOPS | 3,958 TFLOPS
INT8 Tensor Core | 3,958 TOPS | 3,958 TOPS
GPU Memory | 141GB | 141GB
GPU Memory Bandwidth | 4.8TB/s | 4.8TB/s
Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG
Confidential Computing | Supported | Supported
TDP | Up to 700W (configurable) | Up to 600W (configurable)
Multi-Instance GPUs | Up to 7 MIGs @ 18GB each | Up to 7 MIGs @ 18GB each
Form Factor | SXM | PCIe
Interconnect | NVIDIA NVLink: 900GB/s; PCIe Gen5: 128GB/s | 2- or 4-way NVIDIA NVLink bridge: 900GB/s; PCIe Gen5: 128GB/s
Server Options | NVIDIA HGX H200 partner and NVIDIA-Certified Systems with 4 or 8 GPUs | NVIDIA MGX H200 NVL partner and NVIDIA-Certified Systems with up to 8 GPUs
NVIDIA AI Enterprise | Add-on | Included

Get a quote for NVIDIA H200 Tensor Core GPU

If you know what you want but can't find the exact configuration you're looking for, have one of our knowledgeable sales staff contact you. Send us a list of the components you would like to incorporate into the system, along with quantities if you need more than one of any item. We will get back to you promptly with an official quote.

[email protected]

NVIDIA H200 Tensor Core GPU - Documentation