Overview
System configuration — chili10-101d
Single-socket AMD Genoa + 4× L40S + 4× PCIe 5.0 Solidigm D7-PS1010 + 2× ConnectX-7. No PCIe switches, no NVLink — a clean topology purpose-built for Tier-2 GDS validation.
Hardware summary
| Hostname | chili10-101d |
| Chassis | Gigabyte G293-Z23-AAM1-000 (2U, 4-GPU front-load) |
| CPU | AMD EPYC 9554 — 64 C / 64 T (SMT off), Zen 4 Genoa |
| DRAM | 377 GiB DDR5-4800 · 12-channel · 1 NUMA node |
| GPUs | 4× NVIDIA L40S · 48 GB GDDR6 · BAR1 64 GiB · PCIe Gen4 x16 |
| Data NVMe | 4× Solidigm D7-PS1010 (7.68 TB · PCIe 5.0 x4 · 14.5 GB/s read) |
| Boot NVMe | KIOXIA KXD5YLN13T84 (3.5 TB · quadrant 0x80) |
| Data NICs | 2× ConnectX-7 (mlx5_0 @ c1:00.0, mlx5_1 @ 01:00.0) · 400 GbE |
| Mgmt NIC | Broadcom BCM57416 10 GbE (enp3s0f1np1 · 10.100.200.56) |
| OS / kernel | Ubuntu 24.04.3 LTS · kernel 6.17.0-22-generic (HWE) |
| GPU driver | NVIDIA 580.126.20-open · CUDA CC 8.9 |
PCIe topology — per IOD quadrant
The EPYC 9554 IOD exposes 4 Gen5 root complexes at 0x00, 0x40, 0x80, 0xc0. No external PCIe switches. Each GPU/NIC/NVMe hangs directly off a root port — the best possible case for P2P DMA.
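Because the four root complexes sit at buses 0x00, 0x40, 0x80 and 0xc0, the owning quadrant of any device can be derived from its bus number alone by masking the top two bits. A minimal sketch, using BDFs from the listings below (pure arithmetic, no hardware access):

```python
# Map a PCIe BDF (bus:device.function) to its EPYC 9554 IOD quadrant.
# On this host the four Gen5 root complexes sit at buses 0x00, 0x40,
# 0x80 and 0xc0, so the top two bits of the bus select the quadrant.

def iod_quadrant(bdf: str) -> int:
    """Return the root-complex base bus (0x00/0x40/0x80/0xc0) for a BDF."""
    bus = int(bdf.split(":")[0], 16)
    return bus & 0xC0

# Devices from the quadrant listings in this section
devices = {
    "GPU0": "02:00.0", "GPU1": "41:00.0", "GPU2": "81:00.0", "GPU3": "c2:00.0",
    "NIC0": "c1:00.0", "NIC1": "01:00.0", "nvme0": "82:00.0", "nvme3": "42:00.0",
}

for name, bdf in devices.items():
    print(f"{name}: quadrant 0x{iod_quadrant(bdf):02x}")
```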
0x00 · GPU0 + NIC1 · PHB
- GPU 0 (02:00.0) · L40S
- NIC 1 (01:00.0) · ConnectX-7 · mlx5_1 · 10.100.240.56
- Broadcom mgmt NIC + SATA + BMC
0x40 · GPU1 + 2× NVMe
- GPU 1 (41:00.0) · L40S
- nvme3 (42:00.0) · Solidigm D7-PS1010 7.68 TB
- nvme4 (43:00.0) · Solidigm D7-PS1010 7.68 TB
0x80 · GPU2 + 3× NVMe (data + boot)
- GPU 2 (81:00.0) · L40S
- nvme0 (82:00.0) · Solidigm D7-PS1010 7.68 TB
- nvme1 (83:00.0) · Solidigm D7-PS1010 7.68 TB
- nvme2 (84:00.0) · KIOXIA boot drive
0xc0 · GPU3 + NIC0 · PHB
- GPU 3 (c2:00.0) · L40S
- NIC 0 (c1:00.0) · ConnectX-7 · mlx5_0 · 10.100.241.56
- ASPEED VGA
nvidia-smi topo -m
NIC0↔GPU3 and NIC1↔GPU0 are PHB (same host bridge) — the intended pairing for minimum-hop GPUDirect RDMA.
       GPU0  GPU1  GPU2  GPU3  NIC0  NIC1
GPU0    X    NODE  NODE  NODE  NODE  PHB
GPU1   NODE   X    NODE  NODE  NODE  NODE
GPU2   NODE  NODE   X    NODE  NODE  NODE
GPU3   NODE  NODE  NODE   X    PHB   NODE

NIC0 = mlx5_0 (c1:00.0) · NIC1 = mlx5_1 (01:00.0)
PHB  = same PCIe host bridge (best case)
NODE = across IOD Infinity Fabric (next best)
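The minimum-hop pairing can be pulled out of the matrix mechanically. A small sketch with the rows hard-coded from the table above; a real script would parse live `nvidia-smi topo -m` output instead:

```python
# Find GPU/NIC pairs that share a host bridge (PHB) in an
# `nvidia-smi topo -m`-style matrix. Rows are hard-coded from the
# table in this section rather than read from the live system.

TOPO = {
    "GPU0": {"GPU1": "NODE", "GPU2": "NODE", "GPU3": "NODE", "NIC0": "NODE", "NIC1": "PHB"},
    "GPU1": {"GPU0": "NODE", "GPU2": "NODE", "GPU3": "NODE", "NIC0": "NODE", "NIC1": "NODE"},
    "GPU2": {"GPU0": "NODE", "GPU1": "NODE", "GPU3": "NODE", "NIC0": "NODE", "NIC1": "NODE"},
    "GPU3": {"GPU0": "NODE", "GPU1": "NODE", "GPU2": "NODE", "NIC0": "PHB", "NIC1": "NODE"},
}

def phb_pairs(topo):
    """Return (gpu, nic) pairs connected through the same host bridge."""
    return sorted(
        (gpu, peer)
        for gpu, row in topo.items()
        for peer, link in row.items()
        if peer.startswith("NIC") and link == "PHB"
    )

print(phb_pairs(TOPO))  # the minimum-hop choices for GPUDirect RDMA
```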
Storage layout
4× D7-PS1010 in a single md-raid0 stripe, XFS mounted at /gds. Boot drive lives on a separate KIOXIA — not benchmarked.
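The headline aggregate is straightforward arithmetic on the per-drive spec: an ideal raid0 stripe scales read bandwidth linearly with drive count. A quick sanity check using the figures quoted in this section:

```python
# Sanity-check the raid0 aggregate: an ideal stripe across N drives
# scales read bandwidth linearly with N. Figures are from this section.

PER_DRIVE_GBPS = 14.5   # D7-PS1010 sequential read, GB/s
N_DRIVES = 4
MEASURED_GBPS = 53.0    # Run 2 result

theoretical = PER_DRIVE_GBPS * N_DRIVES   # ideal aggregate
efficiency = MEASURED_GBPS / theoretical  # fraction of spec achieved

print(f"theoretical: {theoretical:.1f} GB/s")  # 58.0 GB/s
print(f"efficiency : {efficiency:.0%}")        # 91%
```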
/dev/md0 : raid0 across 4× D7-PS1010 (nvme3 + nvme4 + nvme0 + nvme1)
└─ XFS mounted at /gds
/dev/nvme2n1 : KIOXIA boot drive
├─ /boot/efi (vfat)
└─ / (ext4)
Per-drive spec : Gen5 x4 (32 GT/s) · 14.5 GB/s read
Array aggregate : ~58 GB/s theoretical · 53 GB/s measured (Run 2) — 91% of theoretical
VAST storage fabric
NFSoRDMA target — 2× Mellanox SN5600 leaf switches, 5-CNode VAST cluster, 5-VIP multipath.
| Topology | 2× Mellanox SN5600 (leaf) · 5 CNodes (VAST) |
| VLAN | 240/241 (local fabric) · 233 (storage) |
| RoCE | v2 · DSCP 26 · PFC priority 3 · ECN on |
| VIPs | 10.100.233.10-14 (5 VIPs) |
| Mount | /mnt/vast-rdma · nconnect=32/64 · proto=rdma · vers=4.1 |
| Driver | vastnfs-dkms 4.5.5 · MOFED 24.10 · nvidia-fs 2.28.4 |
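The 5-VIP multipath setup means the client spreads its nconnect streams across five mount targets. A small sketch that expands the VIP range and assembles an example mount line from the options in the table; the export path `/export` is a placeholder, since the real VAST export name is not given here:

```python
# Sketch: expand the five-VIP multipath range and assemble an example
# NFSoRDMA mount line from the options in the table above. The export
# path "/export" is a placeholder: the real VAST export name is not
# stated in this document.

VIPS = [f"10.100.233.{last}" for last in range(10, 15)]  # .10 through .14

def mount_cmd(vip: str, nconnect: int = 32) -> str:
    """Illustrative mount invocation matching the tabled options."""
    opts = f"proto=rdma,vers=4.1,nconnect={nconnect}"
    return f"mount -t nfs -o {opts} {vip}:/export /mnt/vast-rdma"

print(VIPS[0], "...", VIPS[-1])
print(mount_cmd(VIPS[0]))
```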