Unable to determine the device handle for GPU 0000:02:00.0: Unknown Error
| Author | Message |
|---|---|
|
Joined: 5 May 22 Posts: 24 Credit: 12,458,305 RAC: 0
|
After rebooting the system and restarting BOINC GPUGRID, it runs normally at first, but then this error appears: $ nvidia-smi Unable to determine the device handle for GPU 0000:02:00.0: Unknown Error |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
Which of your two machines are you talking about? The one running Linux or the one running Windows? |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
"After rebooting the system and restarting the boinc GPUGRID, ..."

This happens when the driver crashes or the GPU has some kind of problem and drops off. Only a reboot can bring it back. Check your power and PCIe connections to make sure they are good; I have mostly encountered this issue with dodgy power cables.

Edit: scratch that, I see now that these are laptops, so there is not much you can do to check power connections. It could be that the cards are overheating when running GPUGRID tasks. Make sure the laptops have adequate airflow and are maintaining reasonable temps, and reduce overclocks if there are any. That might be all you can do without getting into the weeds and taking the laptop apart to replace thermal paste, etc.
|
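The two suspects named above, a driver crash and overheating, can both be checked from a shell. This is a rough sketch, not a definitive diagnostic: the helper names `xid_lines` and `hot_gpus` and the 85 C limit are assumptions made up for illustration. The `NVRM: Xid` prefix is how the NVIDIA kernel driver reports errors in the kernel log, and `--query-gpu` is a standard `nvidia-smi` option.

```shell
#!/bin/sh
# Hypothetical helpers; the names and the 85 C limit are illustrative assumptions.

# Print kernel log lines (read from stdin) that report NVIDIA Xid errors.
xid_lines() {
    grep 'NVRM: Xid'
}

# Read "index, temperature" CSV rows from stdin, as printed by
#   nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader
# and print the rows whose temperature is at or above the limit given as $1.
hot_gpus() {
    awk -F', *' -v limit="$1" '$2 + 0 >= limit + 0 { print $1 ": " $2 " C" }'
}

# Typical use on the affected machine (dmesg may need root):
#   sudo dmesg | xid_lines
#   nvidia-smi --query-gpu=index,temperature.gpu --format=csv,noheader | hot_gpus 85
```

If `xid_lines` prints anything after the error appears, the kernel log should show what the driver reported just before the GPU became unreachable.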
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
"which of your two machines are you talking about?"

It's the Linux one.
|
|
Joined: 5 May 22 Posts: 24 Credit: 12,458,305 RAC: 0
|
It's the Linux machine, and I think it looks like a driver crash, as explained here. On the Windows machine, I think I don't have the right NVIDIA OpenCL library or driver, or there is some similar issue, because GPUGRID didn't start at all on the Windows 11 machine. Could you advise where to download the required driver for Windows 11? Or could there be other reasons for this? |
|
Joined: 5 May 22 Posts: 24 Credit: 12,458,305 RAC: 0
|
I will try changing the thermal paste as soon as the new one arrives by post. Currently I have Kryonaut Extreme (14.2 W/mK), and I will try another brand rated at 14.6 W/mK. |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
Always best to grab Nvidia drivers straight from Nvidia. Get the Studio drivers. https://www.nvidia.com/Download/index.aspx?lang=en-us# |
|
Joined: 5 May 22 Posts: 24 Credit: 12,458,305 RAC: 0
|
OK, I got those drivers, but installing them could be difficult. The driver version currently installed on MX-Linux is 510.47.03, and the downloaded version is newer: 510.73.05. |
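Whether the downloaded driver really is newer than the installed one can be confirmed with a version-aware comparison rather than by eye. A minimal sketch, assuming GNU `sort -V` is available; the function name `version_newer` is made up for this example.

```shell
#!/bin/sh
# version_newer A B: succeed (exit 0) if version A is strictly newer than B.
# Relies on GNU sort's -V (version sort); the function name is illustrative.
version_newer() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

installed=510.47.03    # version reported on the MX-Linux machine
downloaded=510.73.05   # version of the downloaded installer

if version_newer "$downloaded" "$installed"; then
    echo "downloaded driver $downloaded is newer than $installed"
fi

# On a live system the installed version could be read with:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
```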
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
Do a sudo apt purge *nvidia* to get rid of the existing drivers and reboot. That will put you on the stock Nouveau drivers. Then install the new Studio 510.73.05 drivers. You should get the OpenCL component necessary for other projects and the current CUDA 11.6 libraries bundled into the Desktop driver. |
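Before purging, it can help to see which NVIDIA packages are actually installed. The sketch below is only an illustration: the helper name `installed_nvidia_pkgs` is hypothetical, and it does nothing more than filter `dpkg -l` output. The purge itself is left commented out. Quoting the pattern keeps the shell from expanding `*nvidia*` against files in the current directory before apt sees it.

```shell
#!/bin/sh
# Hypothetical helper: read `dpkg -l` output on stdin and print the names of
# installed ("ii") packages whose name contains "nvidia".
installed_nvidia_pkgs() {
    awk '$1 == "ii" && $2 ~ /nvidia/ { print $2 }'
}

# Typical use on a Debian-based system:
#   dpkg -l | installed_nvidia_pkgs
#
# Then purge with the pattern quoted, and reboot:
#   sudo apt purge '*nvidia*'
#   sudo reboot
```

If the helper still prints package names after the purge, the distribution packages are not fully removed yet.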
|
Joined: 5 May 22 Posts: 24 Credit: 12,458,305 RAC: 0
|
I removed the drivers with: sudo ddm-mx -p nvidia
But the NVIDIA installer still says that the drivers are there and refuses to install the later driver version:
nvidia-installer log file '/var/log/nvidia-installer.log'
creation time: Fri May 27 18:38:09 2022
installer version: 510.73.05
PATH: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
nvidia-installer command line:
./nvidia-installer
Using: nvidia-installer ncurses v6 user interface
-> Detected 8 CPUs online; setting concurrency level to 8.
-> Installing NVIDIA driver version 510.73.05.
-> The NVIDIA driver appears to have been installed previously using a different installer. To prevent potential conflicts, it is recommended either to update the existing installation using the same mechanism by which it was originally installed, or to uninstall the existing installation before installing this driver.
Please review the message provided by the maintainer of this alternate installation method and decide how to proceed:
Please use the Debian packages instead of the .run file.
(Answer: Continue installation)
-> Running distribution scripts
executing: '/usr/lib/nvidia/pre-install'...
If you want to use the nvidia-installer please uninstall the Debian packages
first. The two methods of installation cannot be used at the same time.
Terminating nvidia-installer in 1 seconds.
Killing nvidia-installer
|
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
Can't help you here. I know nothing about MX-Linux; its commands are entirely foreign to me. I know Ubuntu and Debian, and I use the graphics-drivers PPA. I get rid of older drivers with a purge. It sounds like you are running the Nvidia .run installer, perchance. I believe it has its own uninstaller: run the Nvidia driver .run script again with the --uninstall option. |
|
Joined: 5 May 22 Posts: 24 Credit: 12,458,305 RAC: 0
|
The command sequence I found to remove the NVIDIA driver on MX Linux is possibly:
apt purge nvidia* -y
apt-get purge $FORCE $(apt-cache pkgnames | grep nvidia | grep -v detect | grep -v cleanup | cut -d':' -f1) bumblebee* primus* primus*:i386 2>&1
apt autoremove
After that, the new driver version 510.73.05 was installed and the system stopped crashing:
Sat May 28 12:40:23 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 510.73.05    Driver Version: 510.73.05    CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:02:00.0 Off |                  N/A |
| N/A   72C    P0    N/A /  N/A |   2291MiB /  4096MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1418      G   /usr/lib/xorg/Xorg                  4MiB |
|    0   N/A  N/A      3344      C   bin/python                       2285MiB |
+-----------------------------------------------------------------------------+
|
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
Congrats, well done! |
©2025 Universitat Pompeu Fabra