The NVIDIA H200 Tensor Core GPU is the successor to NVIDIA’s highly successful H100 GPU. It offers 141GB of HBM3e memory, nearly double the memory capacity of the NVIDIA H100, along with 1.4X more memory bandwidth. This GPU is designed to supercharge generative AI and large language models, and to dramatically advance scientific and high-performance computing (HPC) workloads.
NVIDIA’s H200 Tensor Core GPU comes in an SXM form factor, while the NVIDIA H200 NVL is a PCIe-based card designed for space-constrained, air-cooled data center environments. Both offer nearly identical performance across the board. Crucial for high-performance computing, the memory bandwidth of the H200 allows for faster data transfers than the H100. Greater memory bandwidth means faster data access, so data can be evaluated and manipulated for significantly faster results. NVIDIA rates the H200 at up to 110X faster time to results for high-performance computing tasks than CPU-based systems, and at 1.6X the inference throughput of the H100 on GPT-3 175B, a state-of-the-art language model featuring 175 billion parameters.
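To make the bandwidth claim concrete, here is a minimal sketch of how effective memory bandwidth is commonly measured: time a large device-to-device copy with CUDA events and divide the bytes moved by the elapsed time. The 1 GiB buffer and 20 iterations are arbitrary choices for illustration, not an official NVIDIA benchmark.

```cpp
#include <cuda_runtime.h>
#include <cstdio>

int main() {
    const size_t bytes = 1ull << 30;  // 1 GiB test buffer (arbitrary size)
    float *src, *dst;
    cudaMalloc(&src, bytes);
    cudaMalloc(&dst, bytes);

    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    const int iters = 20;
    cudaEventRecord(start);
    for (int i = 0; i < iters; ++i)
        cudaMemcpy(dst, src, bytes, cudaMemcpyDeviceToDevice);
    cudaEventRecord(stop);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);

    // Each copy reads the buffer once and writes it once: 2x bytes of traffic.
    double gbps = (2.0 * bytes * iters) / (ms / 1000.0) / 1e9;
    printf("Effective bandwidth: %.0f GB/s\n", gbps);

    cudaFree(src);
    cudaFree(dst);
    return 0;
}
```

On an H200, the reported figure should approach the 4.8 TB/s listed in the specifications below, with the exact number depending on clocks, driver, and buffer size.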
The H200 offers the most memory on any NVIDIA GPU to date, 141GB of HBM3e, and delivers nearly 4 petaFLOPS of 8-bit floating point (FP8) performance. FP8 is far less memory-intensive than the FP32 and FP64 precisions used in traditional HPC: each FP8 value occupies one byte, versus four bytes for FP32 and eight for FP64. Jointly developed by NVIDIA, Arm, and Intel, the FP8 format allows for up to 4X higher inference performance by improving memory efficiency. FP8 is available only on NVIDIA’s Ada Lovelace and Hopper GPU architectures. The NVIDIA H200 NVL also comes with a five-year NVIDIA AI Enterprise subscription to help develop and deploy production-ready generative AI solutions such as speech AI, computer vision, and retrieval-augmented generation (RAG).
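As a rough illustration of the memory math behind FP8, the sketch below round-trips a value through the `__nv_fp8_e4m3` type from CUDA’s `cuda_fp8.h` header (CUDA 11.8 or later, compiled with nvcc) and compares the weight footprint of a 175-billion-parameter model, the GPT-3 example above, at FP32 versus FP8.

```cpp
#include <cuda_fp8.h>  // __nv_fp8_e4m3 (requires CUDA 11.8+)
#include <cstdio>

int main() {
    // Round-trip a value through FP8 (e4m3) to see the precision trade-off.
    float original = 3.14159f;
    __nv_fp8_e4m3 q(original);   // stored in a single byte
    float restored = float(q);   // widen back to FP32 for printing
    printf("%.5f -> fp8 e4m3 -> %.5f (%zu byte)\n",
           original, restored, sizeof(q));

    // Rough weight-only memory footprint of a 175B-parameter model.
    double params = 175e9;
    printf("FP32 weights: %.0f GB, FP8 weights: %.0f GB\n",
           params * 4.0 / 1e9, params * 1.0 / 1e9);
    return 0;
}
```

The second printout makes the capacity argument plain: weights alone drop from roughly 700GB at FP32 to roughly 175GB at FP8, shrinking the number of GPUs needed to hold a given model by 4X.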
Both the SXM and PCIe form factors support Multi-Instance GPU (MIG), which lets each GPU be partitioned into as many as seven isolated instances, each with 18GB of memory. Performance indicators for the two form factors are nearly identical; the meaningful differences are thermal design power and interconnect. The PCIe card interconnects via a 2-way or 4-way NVIDIA NVLink bridge, whereas the SXM version mounts on an NVLink baseboard with 4 or 8 GPUs.
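Creating MIG instances is an administrative step typically performed with nvidia-smi, but whether MIG mode is enabled can be checked programmatically through NVML. A minimal sketch, assuming the H200 is device 0 and linking with -lnvidia-ml:

```cpp
#include <nvml.h>
#include <cstdio>

int main() {
    if (nvmlInit_v2() != NVML_SUCCESS) {
        fprintf(stderr, "NVML initialization failed\n");
        return 1;
    }
    nvmlDevice_t dev;
    // Assumes the H200 is GPU index 0; adjust for your system.
    if (nvmlDeviceGetHandleByIndex_v2(0, &dev) == NVML_SUCCESS) {
        unsigned int current = 0, pending = 0;
        if (nvmlDeviceGetMigMode(dev, &current, &pending) == NVML_SUCCESS)
            printf("MIG mode: current=%u, pending=%u (1 = enabled)\n",
                   current, pending);
        else
            printf("MIG is not supported on this device or driver\n");
    }
    nvmlShutdown();
    return 0;
}
```

The pending value differs from the current one when MIG mode has been toggled but the GPU has not yet been reset.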
Despite its dramatically increased performance, the NVIDIA H200 operates within the same power profile as the H100, reducing energy use and total cost of ownership (TCO) by up to 50%. The PCIe-based NVIDIA H200 NVL features passive cooling, with airflow provided by the host system. The SXM version is likewise cooled by the host system, which pushes air over the towering heat sinks on each GPU. Maximum Thermal Design Power (TDP) is up to 700W (configurable) for the SXM form factor and up to 600W (configurable) for the PCIe-based version.
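Because both TDP figures are configurable, the limit actually enforced on a given system is worth checking at runtime. A minimal NVML sketch under the same device-0 and -lnvidia-ml assumptions as above:

```cpp
#include <nvml.h>
#include <cstdio>

int main() {
    if (nvmlInit_v2() != NVML_SUCCESS) return 1;
    nvmlDevice_t dev;
    if (nvmlDeviceGetHandleByIndex_v2(0, &dev) == NVML_SUCCESS) {
        unsigned int limit_mw = 0, usage_mw = 0;
        // Enforced power cap and current draw, both reported in milliwatts.
        if (nvmlDeviceGetEnforcedPowerLimit(dev, &limit_mw) == NVML_SUCCESS &&
            nvmlDeviceGetPowerUsage(dev, &usage_mw) == NVML_SUCCESS)
            printf("Drawing %.0f W of a %.0f W limit\n",
                   usage_mw / 1000.0, limit_mw / 1000.0);
    }
    nvmlShutdown();
    return 0;
}
```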
Specs | H200 SXM | H200 NVL |
---|---|---|
FP64 | 34 TFLOPS | 34 TFLOPS |
FP64 Tensor Core | 67 TFLOPS | 67 TFLOPS |
FP32 | 67 TFLOPS | 67 TFLOPS |
TF32 Tensor Core | 989 TFLOPS | 989 TFLOPS |
BFLOAT16 Tensor Core | 1,979 TFLOPS | 1,979 TFLOPS |
FP16 Tensor Core | 1,979 TFLOPS | 1,979 TFLOPS |
FP8 Tensor Core | 3,958 TFLOPS | 3,958 TFLOPS |
INT8 Tensor Core | 3,958 TOPS | 3,958 TOPS |
GPU Memory | 141 GB | 141 GB |
GPU Memory Bandwidth | 4.8 TB/s | 4.8 TB/s |
Decoders | 7 NVDEC, 7 nvJPEG | 7 NVDEC, 7 nvJPEG |
Confidential Computing | Supported | Supported |
TDP | Up to 700W (configurable) | Up to 600W (configurable) |
Multi-Instance GPUs | Up to 7 MIGs @ 18GB each | Up to 7 MIGs @ 18GB each |
Form Factor | SXM | PCIe |
Interconnect | NVIDIA NVLink: 900GB/s; PCIe Gen5: 128GB/s | 2- or 4-way NVIDIA NVLink bridge: 900GB/s; PCIe Gen5: 128GB/s |
Server Options | NVIDIA HGX H200 partner and NVIDIA-Certified Systems with 4 or 8 GPUs | NVIDIA MGX H200 NVL partner and NVIDIA-Certified Systems with up to 8 GPUs |
NVIDIA AI Enterprise | Add-on | Included |

Tensor Core performance figures are shown with sparsity.
If you know what you want but can't find the exact configuration you're looking for, have one of our knowledgeable sales staff contact you. Send us a list of the components you would like to incorporate into the system, along with the quantity of each, and we will get back to you immediately with an official quote.