AIDC Networking: Scale-Up vs Scale-Out vs Scale-Across

The Essentials, Simplified

In modern AI Data Centers (AIDC), “scaling” is the central theme that defines computing and networking evolution. Traditionally, two major expansion models have shaped data center design:

Scale-Up: vertically increasing the capability of a single node or system

Scale-Out: horizontally expanding by adding more nodes within a data center

However, with the explosive rise of Generative AI (AIGC) and trillion-parameter models, NVIDIA introduced a third dimension:

Scale-Across: expanding across geographically distributed data centers to build GW-level AI factories

Together, these three models form the foundation of today’s AI compute and networking architecture.

Get started

Outlook: AI Factories of the Coming Decade

Established shortly after ChatGPT’s launch, with the support of Wistron, Foxconn, and Pegatron, Zettabyte emerged to combine the world’s leading GPU and data center supply chain with a sovereign-grade, neutral software stack.

Get started

Established shortly after ChatGPT’s launch, with the support of Wistron, Foxconn, and Pegatron, Zettabyte emerged to combine the world’s leading GPU and data center supply chain with a sovereign-grade, neutral software stack.

_{Scale-Up and Scale-Out: The Dual Engines of AI Networking}

_{1. Definitions}

_{Scale-Up (Vertical Expansion)}

Scale-Up improves the performance of a single compute node by increasing GPU count, memory bandwidth, or chip-to-chip interconnect performance.

‍

Key Characteristics:

Tightly coupled hardware
Extremely high bandwidth
Ultra-low latency (nanoseconds)

Typical Technologies:

NVIDIA NVLink / NVSwitch
AMD UALink (evolved from Infinity Fabric)
Broadcom Scale-Up Ethernet (SUE)

_{Scale-Out (Horizontal Expansion)}

Scale-Out increases total compute capacity by adding more servers or racks connected via RDMA-capable networks.

‍

Key Characteristics:

Cost-effective
Elastic scalability
Higher latency (microseconds to milliseconds)

Typical Deployments:

RoCEv2 Ethernet clusters
InfiniBand HDR / NDR pods
Multi-server AI Pods (e.g. DGX SuperPOD)

_{2. Fundamental Differences in Application Scenarios}

Workload Type	Communication Pattern	Best Scaling Method
Tensor Parallelism	Heavily memory-dependent, frequent small exchanges	Scale-Up
Expert Parallelism (MoE)	High bandwidth + low latency	Scale-Up
Data Parallelism	Bulk gradient exchanges	Scale-Out
Pipeline Parallelism	Stage-level messaging	Scale-Out

Scale-Up networks use memory-semantic load–store communication, while Scale-Out relies on message-passing semantics. This fundamental difference makes the two architectures inherently non-interchangeable.

_{Technical Differences Between Scale-Up and Scale-Out}

_{Latency Composition: Static vs Dynamic}

‍

Static Latency

Defined by SerDes, FEC, and chip design

Dynamic Latency

Caused by congestion, queue buildup, and flow control

_Scale-Up

Bypasses traditional L2/L3 network stacks
Uses credit-based flow control and link-level retransmission
Avoids DSP/CDR processing found in optical modules
Typical latency: < 100 ns

_Scale-Out

Requires congestion control (ECN / PFC / DCQCN / UEC)
Multi-hop routing increases jitter and tail latency
Typical latency: hundreds of ns to milliseconds

_{Interconnect Media: Copper vs Optics}

Item	Scale-Up	Scale-Out
Medium	Copper cables	Optical modules
Bandwidth	TB/s-level	100–800 Gb/s, up to several Tb/s
Latency	Extremely low	Higher (due to DSP, routing, etc.)
Power	Very low	Higher (optical DSP power)

At scale: Scale-Up uses copper, Scale-Out uses optics.

‍

_{Case Study: DGX A100 vs DGX H100 (256-GPU Pod)}

DGX A100 relies on InfiniBand → Scale-Out cluster
DGX H100 adds a second-tier NVSwitch → cabinet-level Scale-Up system

Both provide 256 GPUs, but communication characteristics differ drastically.

_{Scale-Up at Its Peak: NVIDIA NVL72 SuperNode}

NVIDIA NVL72 integrates 36 Grace CPUs and 72 Blackwell GPUs into a liquid-cooled cabinet, forming the first true cabinet-scale SuperNode.

‍

_{Scale-Up Interconnect Inside NVL72}

‍

NVLink 5 + NVSwitch Characteristics:

Per GPU: 1.8 TB/s bidirectional bandwidth
Total (72 GPUs): 129.6 TB/s
5,184 differential-pair copper connections
Full-mesh connectivity across the cabinet

Performance Highlights:

18× higher bandwidth than 800G RDMA
Eliminates ~100 ns of optical DSP/CDR latency
Unified giant memory system:
- 13.5 TB HBM
- 17 TB CPU LPDDR5X via NVLink-C2C

NVL72 behaves as a multi-server, cabinet-level unified compute node.

‍

_{Scale-Out Expansion Beyond the Cabinet}

‍

A SuperPOD of 8× NVL72 units (576 GPUs) is built by:

Equipping each GPU tray with four 800G CX8 RDMA NICs
Connecting cabinets via InfiniBand or RoCEv2

Two-Tier Network Architecture:

Scale-Up inside each cabinet
Scale-Out across cabinets within the data center

_{Scale-Across: Multi-Data-Center AI}

Scale-Across extends AI networking beyond a single physical site.

‍

_{Why Scale-Across Is Necessary}

‍

Single data centers face hard limits:

Power availability
Cooling capacity
Physical land constraints

Example:

One NVL72 consumes ~120 kW
A 1-GW AI factory requires ~8,300 cabinets

This is infeasible within a single facility.

‍

_{NVIDIA Spectrum-XGS}

‍

Key Components:

ConnectX-8 and BlueField-3 SuperNICs
800G / 1.6T optical modules
ZR / ZR+ coherent optics (80+ km DCI)
Hollow-Core Fiber (HCF)

HCF Benefits:

~30% lower latency
~50% lower attenuation

Scale-Across introduces a new inter-DC communication layer on top of traditional L3 architectures.

‍

_{Enabling Optical Technologies}

Technology	Role
Hollow-Core Fiber	Ultra-low latency transport (light travels in air)
Coherent Transceivers	High-speed long-haul inter-DC communication
Optical Circuit Switches (OCS)	Massive bandwidth + port density, ideal for Spine/DCI tiers

OCS is projected to capture >50% of traditional switch market share over time.

_{Systematic Comparison}

Dimension	Scale-Up	Scale-Out	Scale-Across
Physical Scope	Server / chassis / cabinet	Within one data center	Across multiple data centers
Communication Semantics	Load–store (memory semantic)	Message passing	Cross-region unified fabric
Latency	ns	μs–ms	ms (optimized)
Bandwidth	TB/s	100–800 Gb/s	Depends on optical modules
Medium	Copper	Optical	Coherent optics / HCF
Key Technologies	NVLink, UALink, SUE	InfiniBand, RoCE, UEC	Spectrum-XGS, OCS
Bottlenecks	PCB area, cooling, power	Congestion control, topology	DCI cost, optical power
Main Use Cases	TP/EP	DP/PP	Giant-scale training, AI factories

‍

Final Thoughts

The three dimensions will define the architecture of next-generation AI factories:

Scale-Up to build cabinet-level SuperNodes
Scale-Across to unify multiple data centers into GW-scale compute fabrics
Scale-Out to assemble mega-clusters inside a data center.

Future AI facilities will:

Host millions of GPUs
Consume multiple gigawatts of power
Span multiple geographic regions
Support AGI, robotics, autonomous systems, medical AI, and nation-scale LLMs.

Scale-Across is the foundational step toward this future.

AIDC Networking