zFABRIC™ creates one high-performance fabric across all your GPUs, maximizing throughput and minimizing latency for distributed AI workloads.

AI is reshaping the digital economy, and AI infrastructure has become a strategic asset. Zettabyte is building next-generation GPU data centers and software that push this shift forward with efficiency, density, and scale.

Have more questions?
Here are quick answers to common inquiries about zFABRIC™.
zFABRIC is a high-performance RDMA networking solution purpose-built for AI and GPU clusters. Think of it as fitting your race car with high-quality aftermarket performance parts: AI clusters can scale efficiently across racks and data centers without relying on closed, vendor-specific networking. zFABRIC delivers the performance required for distributed AI training while giving operators flexibility in hardware sourcing. Avoiding vendor lock-in enables faster deployments and lower long-term operating costs.
zFABRIC lowers CAPEX and OPEX for our customers by enabling mixed hardware generations, supporting multiple network vendors, and reducing downtime through automated recovery. Customers who deploy zFABRIC avoid vendor lock-in, extend hardware lifespan, and reduce operational overhead, significantly improving total cost of ownership (TCO).
zFABRIC is designed to keep AI systems productive even when underlying components fail. Through automated failover, continuous link health monitoring, intelligent rerouting, and rapid recovery, zFABRIC minimizes disruption to training and inference workloads. This reduces GPU idle time, protects delivery timelines, and allows operators to meet SLA expectations with minimal manual intervention, resulting in more predictable operations, fewer costly interruptions, and a lower mean time to recovery (MTTR). Overall, zFABRIC and Zettabyte's full product offerings allow organizations to bring systems online quickly while maintaining control and operational continuity.
No. zFABRIC is vendor-agnostic and supports heterogeneous GPU and accelerator environments based on open RDMA standards such as RoCEv2. This allows organizations to deploy and operate AI infrastructure using NVIDIA, AMD, or other accelerators without being locked into a single vendor ecosystem. As a result, customers can source hardware more flexibly, extend the usable life of existing assets, adapt faster to supply or pricing changes, and reduce long-term infrastructure costs while maintaining consistent performance at scale.
zFABRIC primarily uses RoCEv2 (RDMA over Converged Ethernet, version 2) to deliver high-performance GPU networking on standard Ethernet infrastructure. This enables near-InfiniBand performance while using widely available switches, optics, and cabling. As a result, customers can deploy AI clusters more quickly, scale across vendors and sites with less friction, and achieve high performance without the cost and constraints of proprietary networking stacks.
zFABRIC is designed to scale from thousands to hundreds of thousands of GPUs within a single AI environment. Scaling limits are determined by physical factors such as optics speed, switch capacity, and data center power and cooling, not by the zFABRIC software itself. This allows organizations to start at practical cluster sizes and expand over time without redesigning the network, reducing deployment delays, protecting existing investments, and avoiding premature infrastructure replacement.
Yes. zFABRIC enables AI training and inference to run across geographically distributed data centers, allowing organizations to scale beyond a single site without redesigning their network. This makes it possible to bring capacity online faster, use existing facilities more effectively, and avoid costly overbuild in one location. By supporting long-distance interconnection with production-ready designs, zFABRIC allows teams to operate distributed AI systems reliably while improving utilization and lowering the total cost of scaling AI infrastructure.