NVLink Explained

Source-backed explainer

NVLink, explained for multi-GPU systems.

NVLink is NVIDIA's scale-up interconnect for moving data between GPUs and CPUs at much higher bandwidth than ordinary server PCIe paths. If you are reading product briefs full of NVSwitch, NVL72, and GPU bandwidth claims, this page is the fast way to make sense of them.

Last reviewed: April 26, 2026. Specs come from official NVIDIA pages only.
Current top line: 3.6 TB/s per GPU
Rack example: 72 GPUs in NVL72
Why it matters: collectives and model parallelism

Fabric sketch

Direct links inside a bigger switch fabric

Think of NVLink as the lane, and NVSwitch as the exchange that keeps every lane reachable at scale.

Diagram: an NVSwitch fabric providing all-to-all scale-up paths among GPU 0 through GPU 5.

Overview

Three ideas cover most of the NVLink conversation.

Most confusion comes from mixing up direct GPU links, rack-scale switching, and chip-to-chip variants. The product names change by generation, but the logic stays consistent.

1. NVLink is the fast scale-up path

NVIDIA's March 6, 2023 explainer describes NVLink as a high-speed interconnect for GPUs and CPUs, with fourth-generation bandwidth reaching 900 GB/s. The point is not generic I/O. The point is moving model state and collective traffic fast enough that multiple accelerators can behave like one coherent system.
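
A quick way to see whether the GPUs in a box actually have direct peer-to-peer paths is a check like the sketch below. This is not an NVIDIA example; it assumes PyTorch on a multi-GPU host, and peer access can also run over PCIe, so a positive result does not by itself prove an NVLink path.

    # Peer-to-peer reachability check (assumes PyTorch on a multi-GPU host).
    # Peer access can also exist over PCIe, so "yes" here does not prove NVLink;
    # the command `nvidia-smi topo -m` shows the actual link types.
    import torch

    n = torch.cuda.device_count()
    for a in range(n):
        for b in range(n):
            if a != b:
                ok = torch.cuda.can_device_access_peer(a, b)
                print(f"GPU {a} -> GPU {b}: peer access {'yes' if ok else 'no'}")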

2. NVSwitch makes the fabric bigger

Direct neighbor links are not enough once you scale beyond a small local island. NVIDIA's current NVLink page frames NVSwitch as the element that keeps the fabric all-to-all, letting larger systems retain full-bandwidth GPU-to-GPU paths.
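
To look at the NVLink side specifically, one option is NVML through the nvidia-ml-py bindings. The sketch below is an illustration under assumptions, not an official tool: it assumes the pynvml package and NVLink-capable GPUs, and simply counts links that report as active.

    # Count active NVLink links per GPU via NVML (assumes the nvidia-ml-py
    # package, imported as pynvml, and NVLink-capable hardware).
    import pynvml

    pynvml.nvmlInit()
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        active = 0
        for link in range(pynvml.NVML_NVLINK_MAX_LINKS):
            try:
                if pynvml.nvmlDeviceGetNvLinkState(handle, link) == pynvml.NVML_FEATURE_ENABLED:
                    active += 1
            except pynvml.NVMLError:
                continue  # link index not present or not supported on this device
        print(f"GPU {i}: {active} active NVLink links")
    pynvml.nvmlShutdown()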

3. NVLink-C2C shrinks the idea down to package level

Blackwell-era systems also use NVLink chip-to-chip inside superchips. NVIDIA's March 18, 2024 Blackwell announcement says the GB200 Grace Blackwell Superchip connects two B200 GPUs to the Grace CPU with a 900 GB/s ultra-low-power NVLink-C2C interconnect.

Why not stop at PCIe?

Bandwidth is only part of the story.

PCIe is the general-purpose server interconnect. NVLink exists because tightly coupled AI and HPC workloads keep hitting communication walls before they hit raw compute walls.

Official delta

Fourth-generation NVLink already offered more than 7x the bandwidth of PCIe Gen 5.

That comparison comes directly from NVIDIA's March 6, 2023 blog post. Later generations push the gap wider as models become more communication-heavy.
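
As a rough sanity check on that ratio, the arithmetic below assumes the 900 GB/s NVLink figure is total bidirectional bandwidth and that the PCIe reference point is a single x16 Gen 5 slot at roughly 64 GB/s per direction; neither assumption comes from NVIDIA's post.

    # Rough sanity check of the "more than 7x PCIe Gen 5" comparison (assumptions above).
    nvlink_gen4 = 900          # GB/s per GPU, fourth-generation NVLink (NVIDIA figure)
    pcie_gen5_x16 = 2 * 64     # ~64 GB/s per direction for x16 at 32 GT/s -> ~128 GB/s bidirectional
    print(nvlink_gen4 / pcie_gen5_x16)   # ~7.0, consistent with "more than 7x"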

Timeline

The official generation curve is steep.

The important thing is not memorizing every product codename. It is seeing how quickly interconnect bandwidth has moved from hundreds of gigabytes per second to multiple terabytes per second per GPU.

  1. 2016 era

    Pascal / P100

    NVIDIA says NVLink was first introduced as a GPU interconnect with the NVIDIA P100 GPU.

  2. 2020

    Ampere / A100

    Third-generation NVLink doubled max bandwidth per GPU to 600 GB/s, according to NVIDIA's 2023 explainer.

  3. 2022-2023

    Hopper / H100

    Fourth-generation NVLink reached 900 GB/s per GPU and became the baseline reference for modern DGX and HGX systems.

  4. March 18, 2024

    Blackwell / fifth generation

    NVIDIA's Blackwell launch says fifth-generation NVLink delivers 1.8 TB/s per GPU and supports communication among up to 576 GPUs.

  5. April 2026 official page

    Rubin / sixth generation

    NVIDIA's current NVLink page describes sixth-generation NVLink at 3.6 TB/s per GPU, with a 72-GPU NVL72 system reaching 260 TB/s of aggregate bandwidth. A quick arithmetic check on that aggregate figure follows this timeline.
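
As a hedged sanity check, the arithmetic below assumes the quoted aggregate is simply the per-GPU figure summed over all 72 GPUs; NVIDIA's page does not spell out how the total is counted.

    # NVL72 aggregate check, assuming the total is per-GPU bandwidth x GPU count.
    gpus = 72
    per_gpu_tb_s = 3.6               # TB/s, sixth-generation NVLink (NVIDIA figure)
    print(gpus * per_gpu_tb_s)       # 259.2 TB/s, which rounds to the quoted ~260 TB/s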

Use cases

NVLink matters most when communication is on the critical path.

If GPUs work independently, PCIe can be fine. If every step needs fast exchanges of activations, gradients, KV cache, or model shards, interconnect quality starts deciding the result.

Training and model parallelism

Large training jobs need fast all-reduce and high-volume traffic between accelerators. That is why NVIDIA pairs NVLink with NVSwitch in DGX, HGX, and NVL systems.
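
The collective at the center of this traffic is all-reduce. The sketch below is a minimal illustration, not an NVIDIA recipe: it assumes PyTorch with the NCCL backend, launched with torchrun so there is one process per GPU; NCCL uses NVLink and NVSwitch paths automatically when the hardware provides them.

    # Minimal all-reduce sketch. Assumes PyTorch + NCCL, launched with:
    #   torchrun --nproc_per_node=<num_gpus> this_script.py
    import os
    import torch
    import torch.distributed as dist

    def main():
        dist.init_process_group(backend="nccl")        # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])     # set by torchrun
        torch.cuda.set_device(local_rank)

        # Stand-in for a gradient shard: every rank contributes its own values.
        grad = torch.full((1024,), float(dist.get_rank()), device="cuda")
        dist.all_reduce(grad, op=dist.ReduceOp.SUM)    # the bandwidth-critical step
        print(f"rank {dist.get_rank()}: reduced value = {grad[0].item()}")

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()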

Rack-scale inference

NVIDIA's Rubin page explicitly ties sixth-generation NVLink to reasoning workloads and mixture-of-experts patterns, where tokens and experts can force heavy all-to-all communication.
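
The corresponding collective for expert routing is all-to-all. The sketch below illustrates that pattern under the same assumptions as the training example: PyTorch with NCCL, launched via torchrun, with tensor shapes that are purely illustrative.

    # All-to-all sketch for expert-parallel style token exchange.
    # Assumes PyTorch + NCCL, launched with torchrun --nproc_per_node=<num_gpus>.
    import os
    import torch
    import torch.distributed as dist

    dist.init_process_group(backend="nccl")
    world = dist.get_world_size()
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

    # Each rank holds one chunk of "tokens" destined for every peer (illustrative sizes).
    tokens_per_peer, hidden = 4, 8
    send = torch.randn(world * tokens_per_peer, hidden, device="cuda")
    recv = torch.empty_like(send)

    dist.all_to_all_single(recv, send)   # every GPU sends a chunk to every other GPU
    print(f"rank {dist.get_rank()} exchanged {send.numel()} elements")

    dist.destroy_process_group()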

Superchips and memory-adjacent traffic

NVLink-C2C extends the same logic inside package-level systems such as Grace Hopper and Grace Blackwell, where CPU and GPU need very high local bandwidth.

When PCIe is enough

Single-GPU inference, embarrassingly parallel batches, small fine-tunes, and lightly coupled serving stacks often do not need the cost or system design complexity of NVLink.

Practical reading guide

How to interpret vendor claims

If the workload...                    | Then NVLink usually helps when...                  | And PCIe is often enough when...
moves state between GPUs every step   | collective communication becomes the bottleneck   | communication stays low relative to compute
uses many experts or model shards     | tokens trigger frequent all-to-all exchanges      | each accelerator mostly runs its own request
depends on a CPU-GPU superchip path   | the design uses NVLink-C2C or Grace-class systems | host transfers are not dominant in the pipeline

FAQ

The fast answers people usually need first.

What is NVLink in one sentence?

It is NVIDIA's high-bandwidth scale-up interconnect for GPUs and CPUs in accelerated systems.

Is NVLink the same thing as NVSwitch?

No. NVLink is the link technology. NVSwitch is the switch fabric that expands those links into a larger all-to-all domain.

How is NVLink different from PCIe?

PCIe is the default server interconnect. NVLink is specialized for multi-GPU communication and, according to NVIDIA's own explainer, fourth-generation NVLink exceeded PCIe Gen 5 bandwidth by more than 7x.

What is the current official top-end NVLink number?

As of April 26, 2026, NVIDIA's official NVLink page lists sixth-generation NVLink at 3.6 TB/s of bandwidth per GPU for Rubin.

Does NVLink automatically make two GPUs act like one?

Not by itself. It gives software a much better communication path. The workload and system software still need to know how to use multiple GPUs effectively.

Is NVLink mainly for gaming?

No. The current product conversation is centered on AI, HPC, DGX, HGX, rack-scale inference, and superchip designs rather than consumer gaming.

Primary sources

The site is intentionally narrow: official NVIDIA sources for specs and dates.

This page avoids rumor-roundup content. The goal is to make official information readable, not to invent another hype layer.

Background explainer

What Is NVLink? | NVIDIA Blog

Used for the broad definition, third-generation A100 figure, fourth-generation H100 figure, and the PCIe comparison.