Overview

GDS setup — stack + reproducible steps

Every version below was validated end-to-end on chili10-101d. Steps are in dependency order — skipping ahead will leave you with a mount that's silently on TCP or a peermem module with an ABI mismatch.

Stack — pinned versions
Matching these versions exactly should reproduce Runs 1-6.
Component            Version
Ubuntu               24.04.3 LTS
Kernel               6.17.0-22-generic (HWE)
NVIDIA driver        580.126.20-open
CUDA toolkit         12.9
nvidia-fs (GDS)      2.28.4
MLNX OFED (DOCA)     24.10 (3.3.0)
nvidia_peermem       built against OFED 24.10
vastnfs-dkms         4.5.5
mlnx-nfsrdma-dkms    rebuilt on OFED 24.10
NIXL                 main @ 2026-04-14 (abseil-cpp 20250512)
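
Because the whole recipe depends on exact version matches, a small comparison helper can make drift visible before you start. The following is a minimal sketch, not part of the validated recipe: check_pin is a hypothetical helper name, and the commands in the usage comments (the /sys/module/nvidia/version path, modinfo -F version) are assumptions about where each version is exposed on your host.

```shell
#!/usr/bin/env bash
# check_pin LABEL EXPECTED ACTUAL   (hypothetical helper, not from the recipe)
# Prints one status line per component; returns non-zero on a mismatch.
check_pin() {
  local label=$1 expected=$2 actual=$3
  if [ "$actual" = "$expected" ]; then
    printf 'ok        %-14s %s\n' "$label" "$actual"
  else
    printf 'MISMATCH  %-14s expected %s, found %s\n' \
      "$label" "$expected" "${actual:-<none>}"
    return 1
  fi
}

# Example calls on the target host (sources of ACTUAL are assumptions):
#   check_pin kernel    6.17.0-22-generic "$(uname -r)"
#   check_pin nvidia    580.126.20        "$(cat /sys/module/nvidia/version)"
#   check_pin nvidia-fs 2.28.4            "$(modinfo -F version nvidia_fs)"
```

Passing the actual value as an argument keeps the helper independent of how each component reports its version.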

Install sequence

Step 1
Kernel + headers pinned
Pick the HWE kernel and install matching headers before anything else. Everything that comes later (nvidia-fs, MOFED, vastnfs) builds DKMS modules against this kernel version.
sudo apt install -y linux-generic-hwe-24.04
sudo reboot  # after reboot, confirm uname -r reports 6.17.x
sudo apt install -y linux-headers-$(uname -r)  # headers for the kernel that is now running
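
Since every later DKMS build depends on this step, it may be worth checking both conditions before moving on: that the running kernel is in the pinned series, and that its headers are actually installed. This is a rough sketch; kernel_in_series and headers_present are hypothetical helper names, and the check assumes the usual layout where installed headers appear as /lib/modules/<version>/build.

```shell
#!/usr/bin/env bash
# kernel_in_series KVER SERIES
# Succeeds if KVER (e.g. 6.17.0-22-generic) starts with "SERIES." (e.g. 6.17).
kernel_in_series() {
  case "$1" in
    "$2".*) return 0 ;;
    *)      return 1 ;;
  esac
}

# headers_present KVER [ROOT]
# Succeeds if the module build directory for KVER exists.
# ROOT defaults to the real filesystem root; it is a parameter only for testing.
headers_present() {
  local kver=$1 root=${2:-}
  [ -d "$root/lib/modules/$kver/build" ]
}

# Example on the target host:
#   kernel_in_series "$(uname -r)" 6.17 && headers_present "$(uname -r)" \
#     || echo "kernel or headers not as pinned; fix before Step 2"
```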
Step 2
NVIDIA open driver + CUDA
The open kernel module variant is required for modern GDS. Do not install nvidia-driver-XXX-server (closed) — it silently breaks nvidia_peermem.
sudo ubuntu-drivers install --driver-only nvidia-driver-580-open
sudo apt install -y cuda-toolkit-12-9
nvidia-smi   # confirm driver loaded, 4× L40S present
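
nvidia-smi looks the same for both driver flavors, so it does not tell you whether the open variant is the one that loaded. One way to check, sketched below, is the module's license string: the open kernel modules report "Dual MIT/GPL" while the proprietary module reports "NVIDIA". driver_flavor is a hypothetical helper name, and the mapping should be treated as an assumption to confirm on your driver version.

```shell
#!/usr/bin/env bash
# driver_flavor LICENSE_STRING
# Maps the license reported by modinfo to a driver flavor.
driver_flavor() {
  case "$1" in
    "Dual MIT/GPL") echo open ;;
    NVIDIA)         echo proprietary ;;
    *)              echo unknown ;;
  esac
}

# Example on the target host:
#   driver_flavor "$(modinfo -F license nvidia)"
```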
Step 3
Install nvidia-fs (GDS kernel module)
This is the kernel-side half of GDS. The userspace half (libcufile) comes from the CUDA toolkit.
sudo apt install -y nvidia-gds-12-9
sudo modprobe nvidia-fs
gdscheck -p   # expect at least one supported device
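
To make the "at least one supported device" check scriptable, the gdscheck -p output can be filtered for entries marked Supported. The sketch below assumes the report contains lines of the form "name : Supported" or "name : Unsupported"; that layout is an assumption about gdscheck's output, and supported_backends is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# supported_backends
# Reads gdscheck -p output on stdin and prints the name of every entry whose
# value is exactly "Supported" (so "Unsupported" never matches).
supported_backends() {
  awk -F: '
    {
      key = $1; val = $2
      gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim the entry name
      gsub(/^[ \t]+|[ \t]+$/, "", val)   # trim the status
      if (val == "Supported") print key
    }'
}

# Example on the target host:
#   gdscheck -p | supported_backends
```

An empty result means nothing was reported as supported, which is the case this step is guarding against.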
Step 4
DOCA / MOFED 24.10
DOCA Host installs MOFED + the nvidia_peermem shim in one shot. Do not mix with in-tree rpcrdma — there's a symbol mismatch.
curl -LO https://www.mellanox.com/downloads/DOCA/DOCA_v3.3.0/host/doca-host_3.3.0-xxxx.deb
sudo dpkg -i doca-host_3.3.0-xxxx.deb
sudo apt update && sudo apt install -y doca-ofed
sudo /etc/init.d/openibd restart
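
To see whether the in-tree rpcrdma would still be picked up after the OFED install, you can look at which file modprobe resolves for the module. This is a minimal sketch that assumes DKMS-built modules land under an updates/ directory and distribution modules under kernel/, which is the usual Ubuntu layout; module_origin is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# module_origin PATH
# Classifies a module file path as in-tree or out-of-tree by its directory.
module_origin() {
  case "$1" in
    */updates/*) echo out-of-tree ;;
    */kernel/*)  echo in-tree ;;
    *)           echo unknown ;;
  esac
}

# Example on the target host (modinfo -n prints the resolved file name):
#   module_origin "$(modinfo -n rpcrdma)"
```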
Step 5
Rebuild nvidia_peermem against this OFED
After any OFED upgrade you MUST rebuild nvidia_peermem, otherwise the BAR1 memory registration silently fails and GDS over RDMA falls back to TCP.
sudo dkms build -m nvidia -v 580.126.20 -k $(uname -r)
sudo dkms install -m nvidia -v 580.126.20 -k $(uname -r)
sudo modprobe nvidia_peermem
lsmod | grep peermem   # must be loaded
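
lsmod only shows that a module is loaded, not that it was built for the running kernel. A cheap extra check, sketched below, compares the first field of the module's vermagic with uname -r. This is a necessary condition rather than proof that the module was built against the current OFED, and vermagic_matches is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# vermagic_matches VERMAGIC KVER
# Succeeds if the kernel version at the start of VERMAGIC equals KVER.
vermagic_matches() {
  local vermagic=$1 kver=$2
  [ "${vermagic%% *}" = "$kver" ]   # keep only the text before the first space
}

# Example on the target host:
#   vermagic_matches "$(modinfo -F vermagic nvidia_peermem)" "$(uname -r)" \
#     || echo "nvidia_peermem was built for a different kernel; rebuild it"
```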
Step 6
vastnfs-dkms 4.5.5
This replaces the in-tree NFS client with a VAST-patched version that supports multipath xprts, proper RDMA, and the libcufile_rdma hook. Cold reboot after install — a soft reload leaves the old rpcrdma dangling.
sudo apt install -y vastnfs-dkms=4.5.5-*
sudo shutdown -r now   # full reboot; reloading the modules in place is not enough
Step 7
Mount the VAST namespace
5 VIPs for multipath, nconnect=32 or 64, RoCE v2 with PFC priority 3.
sudo mount -t nfs -o \
  vers=4.1,proto=rdma,nconnect=32,localports=0.0.0.0-,vip_hash=random \
  10.100.233.10:/vast-ns /mnt/vast-rdma
cat /proc/fs/nfsfs/cbstats   # expect RDMA xprts, one per VIP
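
Because a mount that quietly landed on TCP is the failure this recipe keeps warning about, it can help to read the negotiated transport back from the kernel's mount table. The sketch below parses /proc/mounts-style input for the proto= option of an NFS mount; mount_proto is a hypothetical helper name. It reports what was negotiated at mount time and would not detect a later fallback under load.

```shell
#!/usr/bin/env bash
# mount_proto MOUNTPOINT
# Reads /proc/mounts-format lines on stdin and prints the proto= value of the
# NFS mount at MOUNTPOINT (nothing if there is no such mount or option).
mount_proto() {
  local mnt=$1
  awk -v mnt="$mnt" '
    $2 == mnt && $3 ~ /^nfs/ {
      n = split($4, opts, ",")
      for (i = 1; i <= n; i++)
        if (opts[i] ~ /^proto=/) {
          sub(/^proto=/, "", opts[i])
          print opts[i]
          exit
        }
    }'
}

# Example on the target host:
#   [ "$(mount_proto /mnt/vast-rdma < /proc/mounts)" = rdma ] \
#     || echo "mount is not using RDMA"
```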
Step 8
Build NIXL + nixlbench
NIXL requires abseil-cpp 20250512 — the apt version is too old. Build it as an external dep first, then NIXL picks it up.
git clone https://github.com/abseil/abseil-cpp && cd abseil-cpp && git checkout 20250512.0   # LTS tags carry a patch suffix
cmake -S . -B build && cmake --build build -j && sudo cmake --install build
cd ..   # leave the abseil-cpp tree before cloning NIXL
git clone https://github.com/nvidia/nixl && cd nixl
cmake -S . -B build -DNIXL_BUILD_BENCH=ON && cmake --build build -j
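
Before configuring NIXL, it may be worth confirming that the build will see the new abseil rather than the apt one. The sketch below compares a version string against the 20250512 minimum; absl_new_enough is a hypothetical helper name, and the usage line assumes abseil's install step placed pkg-config files such as absl_base.pc on the pkg-config search path.

```shell
#!/usr/bin/env bash
# absl_new_enough FOUND [MIN]
# Succeeds if the date part of FOUND (e.g. 20250512 or 20250512.1) is >= MIN.
# Returns 2 if FOUND is empty or not numeric.
absl_new_enough() {
  local found=${1%%.*} min=${2:-20250512}
  case "$found" in
    ''|*[!0-9]*) return 2 ;;
  esac
  [ "$found" -ge "$min" ]
}

# Example on the target host (assumes absl_base.pc is visible to pkg-config):
#   absl_new_enough "$(pkg-config --modversion absl_base)" \
#     || echo "abseil is missing or too old for NIXL"
```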

Gotchas

The non-obvious traps encountered during this enablement. Most fail silently — i.e. your mount works, but performance is quietly wrong.

trap
gdscheck NVMe “Unsupported” is a false negative
On Gen5 NVMe with PCIe ACS off and IOMMU in passthrough, gdscheck sometimes flags Unsupported even when P2P works. Confirm with a real gdsio run showing XferType: GPUD — that is the authoritative check.
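
The gdsio confirmation can be scripted by pulling the transfer type out of the result line. This is a minimal sketch that assumes the summary contains a "XferType: <value>" field as described above; gdsio_xfer_type is a hypothetical helper name and the gdsio invocation itself is left to you.

```shell
#!/usr/bin/env bash
# gdsio_xfer_type
# Reads gdsio output on stdin and prints the first XferType value it finds.
gdsio_xfer_type() {
  sed -n 's/.*XferType: *\([A-Za-z0-9_]*\).*/\1/p' | head -n 1
}

# Example on the target host (gdsio arguments omitted on purpose):
#   [ "$(gdsio ... | gdsio_xfer_type)" = GPUD ] || echo "not a GPUDirect transfer"
```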
trap
nvidia_peermem rebuild after OFED upgrade
Any MOFED/DOCA upgrade invalidates the previously built peermem module. Without rebuild, all RDMA traffic silently falls back to TCP and you will see mysterious 4× slowdowns with no error in dmesg.
trap
rpcrdma symbol mismatch → mlnx-nfsrdma-dkms
The in-tree rpcrdma module won’t load against MOFED because of an ABI mismatch. Installing mlnx-nfsrdma-dkms replaces it; vastnfs depends on that replacement.
trap
PFC priority 3 silent mount fallback to TCP
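If the switch PFC config doesn’t match priority 3, the RoCE mount succeeds but falls back to TCP under load. Verify with ethtool -S <netdev> | grep prio3_pause, where <netdev> is the Ethernet interface behind mlx5_0 (ibdev2netdev prints the mapping); ethtool takes the interface name, not the RDMA device name.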
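
The grep above returns several counters per priority (pause frames plus duration and transition counters), which is easy to misread. The sketch below extracts just the rx and tx pause frame counts for one priority from ethtool -S output. prio_pause_counts is a hypothetical helper name, and the interpretation is an assumption: counters that stay at zero while the link is under load would be consistent with PFC not being active on that priority.

```shell
#!/usr/bin/env bash
# prio_pause_counts [PRIO]
# Reads ethtool -S output on stdin and prints the rx/tx pause frame counters
# for the given priority (default 3) as "rx=N tx=N".
prio_pause_counts() {
  local prio=${1:-3}
  awk -v p="$prio" '
    BEGIN { rx = 0; tx = 0 }
    $1 == ("rx_prio" p "_pause:") { rx = $2 }
    $1 == ("tx_prio" p "_pause:") { tx = $2 }
    END { printf "rx=%s tx=%s\n", rx, tx }'
}

# Example on the target host (replace <netdev> with the real interface name):
#   ethtool -S <netdev> | prio_pause_counts 3
```

Sampling the counters before and after a load run and comparing the two readings is more informative than a single snapshot.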
trap
Abseil too old for NIXL build
The apt abseil package on Ubuntu 24.04 (libabsl-dev) is an LTS snapshot several years behind. NIXL requires 20250512 or newer. Build abseil from source before running NIXL’s cmake.
trap
vastnfs cold-reboot requirement
rmmod of the old rpcrdma + modprobe of vastnfs leaves refcounts inconsistent. A cold reboot is needed to get a clean state.
trap
cufile.json queue-depth tuning breaks with -22
Increasing GDS queue depth past defaults on this libcufile throws -22 (EINVAL) on first I/O. Leave cufile.json at defaults for now.
Source-of-truth docs (GitHub): gds-setup-recipe.md · vast-nfsordma-gds-setup.md