NVIDIA-SMI has failed on Remote Server
https://forums.developer.nvidia.com/t/nvidia-smi-has-failed-on-remote-server/315075
oleg.s
1 Nov 2024 Hello,
I have a fresh server on-site which I can remotely connect to via ssh…
I have two A16 cards in the server. If I run lspci -v | grep -i nvidia I get:
2a:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
2b:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
2c:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
2d:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
b8:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
b9:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
ba:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
bb:00.0 3D controller: NVIDIA Corporation GA107GL [A2 / A16] (rev a1)
    Subsystem: NVIDIA Corporation Device 14a9
    Kernel modules: nvidiafb, nouveau
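A listing like this can be condensed into one line per GPU showing which kernel driver, if any, is actually bound to it. The following is only a sketch: the function name summarize_nvidia_devices is made up for illustration, and it assumes the input is unfiltered lspci -k style text (a device header line followed by indented detail lines), since the grep filter used above would hide any "Kernel driver in use:" line that does not contain "nvidia".

```shell
# Hypothetical helper: print "<pci-slot> <bound-driver>" for every NVIDIA device.
# Reads `lspci -k` style text on stdin so it can also be run on saved output.
summarize_nvidia_devices() {
    awk '
        /^[0-9a-f:]+\.[0-9a-f] / {              # unindented line = new device header
            if (slot != "") print slot, driver  # flush the previous NVIDIA device
            slot = ""; driver = "none"
            if ($0 ~ /NVIDIA/) slot = $1        # remember the slot only for NVIDIA devices
            next
        }
        slot != "" && /Kernel driver in use:/ { driver = $NF }
        END { if (slot != "") print slot, driver }
    '
}
```

Typical use would be lspci -k | summarize_nvidia_devices. A device reported with "none" has no driver bound, and one reported with "nouveau" is held by the open-source driver rather than the NVIDIA one.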
Running sudo mokutil --sb-state told me that Secure Boot is off…
Then, what I did was:
sudo ubuntu-drivers install --gpgpu nvidia:535-server
sudo apt install nvidia-utils-535-server
reboot…
nvidia-smi → Error: NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running
sudo apt install --reinstall linux-headers-$(uname -r)
reboot…
nvidia-smi → Error: NVIDIA-SMI has failed because it couldn’t communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running
Attached is the bug report… nvidia-bug-report.log.gz (234.9 KB)
I could not run the nvidia-settings command as it was not installed…
Could you please tell me what I need to do?
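The "couldn't communicate with the NVIDIA driver" error commonly means the nvidia kernel module is not loaded, for example because it failed to build or because nouveau claimed the GPUs first. As a rough first check, the loaded modules can be inspected. This is a sketch under assumptions: the function name gpu_driver_verdict and its three output labels are made up for illustration, and the input is expected in /proc/modules format (module name in the first column).

```shell
# Hypothetical helper: classify the loaded GPU kernel module.
# Reads /proc/modules style text on stdin; only exact module names count,
# so "nvidiafb" or "nvidia_uvm" alone are not treated as the nvidia driver.
gpu_driver_verdict() {
    awk '
        $1 == "nvidia"  { nvidia = 1 }
        $1 == "nouveau" { nouveau = 1 }
        END {
            if (nvidia)       print "nvidia-loaded"
            else if (nouveau) print "nouveau-loaded"
            else              print "no-gpu-module"
        }
    '
}
```

On the affected machine this would be run as gpu_driver_verdict < /proc/modules. Anything other than "nvidia-loaded" suggests looking at the driver installation itself rather than at nvidia-smi.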
post by oleg.s on Nov 29, 2024
Solved… do not use the basic drivers from Ubuntu… use the NVIDIA ones: datacenter-driver Downloads | NVIDIA Developer
post by MarkusHoHo on Dec 2, 2024
MarkusHoHo (Moderator): Just for reference, the data centre driver documentation:
NVIDIA Datacenter Drivers :: NVIDIA Data centre GPU Driver Documentation (docs.nvidia.com)
14 days later Closed on Dec 16, 2024