What are the key specs of the Vera Rubin NVL72?

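The rack is liquid-cooled and holds 72 Rubin GPUs and 36 Vera CPUs. Each GPU gets 3.6 TB/s of NVLink 6 bandwidth, or about 260 TB/s across the rack. Total fast memory is about 75 TB (20.7 TB HBM4 + 54 TB LPDDR5X), with 1.6 PB/s of aggregate HBM bandwidth. For inference, the rack is rated at 3.6 EFLOPS in NVFP4, with a claimed 10x performance per watt and one-tenth the cost per token compared with Blackwell. For training, it is rated at 2.5 EFLOPS, and MoE models are said to need one-quarter as many GPUs as on Blackwell.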
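The rack totals quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: it uses just the figures from this answer, decimal units (1 TB = 1000 GB), and assumes resources are split evenly across the 72 GPUs. The function name `derive_per_gpu_figures` and the dictionary keys are made up for this example, and the per-GPU values are derived estimates, not official per-GPU specifications.

```python
# Rack-level figures as quoted in the answer above (decimal units).
GPUS = 72
NVLINK_PER_GPU_TBS = 3.6     # TB/s of NVLink 6 bandwidth per GPU
HBM4_TOTAL_TB = 20.7         # TB of HBM4 across the rack
LPDDR5X_TOTAL_TB = 54.0      # TB of LPDDR5X across the rack
HBM_BW_TOTAL_PBS = 1.6       # PB/s of aggregate HBM bandwidth
NVFP4_TOTAL_EFLOPS = 3.6     # EFLOPS of NVFP4 inference across the rack


def derive_per_gpu_figures():
    """Derive rack totals and per-GPU shares, assuming an even split across GPUs."""
    return {
        "nvlink_rack_tbs": GPUS * NVLINK_PER_GPU_TBS,
        "fast_memory_tb": HBM4_TOTAL_TB + LPDDR5X_TOTAL_TB,
        "hbm_per_gpu_gb": HBM4_TOTAL_TB * 1000 / GPUS,
        "hbm_bw_per_gpu_tbs": HBM_BW_TOTAL_PBS * 1000 / GPUS,
        "nvfp4_per_gpu_pflops": NVFP4_TOTAL_EFLOPS * 1000 / GPUS,
    }


if __name__ == "__main__":
    for name, value in derive_per_gpu_figures().items():
        print(f"{name}: {value:.1f}")
```

Under these assumptions the unrounded totals are 259.2 TB/s of NVLink bandwidth and 74.7 TB of fast memory, which match the rounded 260 TB/s and 75 TB headline numbers. The even split works out to roughly 287.5 GB of HBM4, 22.2 TB/s of HBM bandwidth, and 50 PFLOPS of NVFP4 per GPU.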

What is the Vera CPU and how does it differ from Intel/AMD server CPUs?

The Vera CPU is purpose-built for agentic AI and reinforcement learning; it is not a general-purpose server CPU. It combines 88 Olympus cores with 1.2 TB/s of memory bandwidth and a 1.8 TB/s NVLink-C2C link to the Rubin GPUs. For AI workloads, this makes it a claimed 50% faster and 2x more efficient than traditional rack-scale CPUs. It is not a drop-in replacement for Intel Xeon or AMD EPYC in general compute.

When is Vera Rubin available and what does it cost?

NVIDIA has confirmed that Vera Rubin is in full production and will ship to customers in H2 2026. The first recipients are major cloud providers (AWS, Azure, Google Cloud). Rack pricing is expected to be in the multi-million dollar range; for comparison, the Blackwell NVL72 was priced around $2.7M. Individual users will get access through cloud services.