Description
Important Note: NVIDIA AI Enterprise customers can get support from NVIDIA Enterprise support. Please open a case here.
Case number 01086911
Describe the bug
vgpu-manager lspci fails in an air-gapped environment (on rhel9 and previously rhel8; other environments not tested).
To Reproduce
cd gpu-driver-container\vgpu-manager\rhel9
set VERSION=580.126.08
set OS_TAG=rhcos4.18
docker build --build-arg DRIVER_VERSION=%VERSION% -t vgpu-manager:%VERSION%-%OS_TAG% .
docker save -o vgpu-manager-%VERSION%-%OS_TAG%.tar vgpu-manager:%VERSION%-%OS_TAG%
Upload vgpu-manager-580.126.08-rhcos4.18.tar to the OpenShift (OCP) hosts in an air-gapped environment.
In the OpenShift console, navigate to Operators->Installed Operators->NVIDIA GPU Operator->All instances->gpu-cluster-policy and update the version to 580.126.08.
The drivers fail to load because lspci cannot open libpci.so.3.
Expected behavior
Drivers should load without errors.
Environment (please provide the following information):
- gpu-driver-container source (Commit SHA or image digest): sha256:6f8a016ab415bf4cdd8e1868de5ae5cf44e9681c
- NVIDIA Driver Version: 580.126.08
- Host OS: Openshift ocp
- Kernel Version: rhcos4.18
- Container Runtime Version: ?
- CPU Architecture: x86_64
- GPU Model(s): T4/A2/A16
If applicable, also provide:
- Kubernetes Distro and Version: OpenShift
- NVIDIA GPU Operator version: 25.10.1
With the https://github.com/NVIDIA/gpu-driver-container code from 2026-02-18, lspci fails in an air-gapped environment: it cannot load libpci.so.3. setpci is also required, and it cannot be found. In December 2025, users akri3 and Shivkumar Ople made modifications that partially correct this problem, but they do not work without additional code changes to the file gpu-driver-container\vgpu-manager\rhel9\ocp_dtk_entrypoint.
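To confirm which shared libraries the copied lspci cannot resolve, the output of ldd can be filtered for "not found" entries. The following is only a minimal sketch: missing_libs is a hypothetical helper that is not part of the repository, and it assumes the usual glibc ldd output format (`name => not found`).

```shell
# Hypothetical diagnostic helper, not part of gpu-driver-container.
# Reads ldd output on stdin and prints the name of every library that
# the dynamic loader could not resolve.
# Returns 1 if at least one library is missing, 0 otherwise.
missing_libs() {
    awk '
        BEGIN { missing = 0 }
        /=> not found/ { print $1; missing = 1 }
        END { exit missing }
    '
}
```

It could be used inside the container as `ldd /usr/sbin/lspci | missing_libs`; with the failure described here, it would be expected to list libpci.so.3.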
This is the fix we implemented in the source code (gpu-driver-container\vgpu-manager\rhel9\ocp_dtk_entrypoint):
Below line 34 (/usr/sbin/lspci \), add these two lines:
/usr/sbin/setpci \
/lib64/libpci.so.* \
Below line 110 ("$DRIVER_TOOLKIT_SHARED_DIR/lspci" \), add this line:
"$DRIVER_TOOLKIT_SHARED_DIR/setpci" \
Below line 113 (export PATH="${DRIVER_TOOLKIT_SHARED_DIR}/bin:$PATH";), add this section of code:
mkdir -p "${DRIVER_TOOLKIT_SHARED_DIR}/lib"
cp -v \
"$DRIVER_TOOLKIT_SHARED_DIR"/libpci.so.* \
"${DRIVER_TOOLKIT_SHARED_DIR}/lib"
export LD_LIBRARY_PATH="${DRIVER_TOOLKIT_SHARED_DIR}/lib:$LD_LIBRARY_PATH";
A similar change would need to be made in the rhel8\ocp_dtk_entrypoint file.
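Because the same library staging would be needed in both the rhel8 and rhel9 entrypoints, it could be factored into one function. This is a sketch under assumptions, not the code we deployed: stage_libpci is a hypothetical name, and its argument stands in for DRIVER_TOOLKIT_SHARED_DIR, which is assumed to already contain the copied libpci.so.* files.

```shell
# Hypothetical helper, not part of ocp_dtk_entrypoint.
# $1: directory that already holds the copied libpci.so.* files
#     (DRIVER_TOOLKIT_SHARED_DIR in the fix above).
stage_libpci() {
    shared_dir="$1"
    lib_dir="${shared_dir}/lib"
    found=0

    mkdir -p "${lib_dir}"
    for lib in "${shared_dir}"/libpci.so.*; do
        # An unmatched glob stays literal, so check that the file exists.
        [ -e "${lib}" ] || continue
        cp "${lib}" "${lib_dir}/"
        found=1
    done

    if [ "${found}" -eq 0 ]; then
        echo "no libpci.so.* found in ${shared_dir}" >&2
        return 1
    fi

    # Prepend the staged directory. The ${VAR:+...} form avoids leaving a
    # trailing ':' (an empty search-path entry) when the variable was unset.
    export LD_LIBRARY_PATH="${lib_dir}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
}
```

A call such as `stage_libpci "${DRIVER_TOOLKIT_SHARED_DIR}"` after the PATH export would then replace the inline mkdir/cp/export lines. Unlike the inline version, it reports an error when no libpci file was copied instead of failing later inside lspci.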