Message boards : Graphics cards (GPUs) : NVIDIA 410.104 Problem?

kksplace
Joined: 4 Mar 18
Posts: 53
Credit: 1,400,776,749
RAC: 3,597,733
Message 51616 - Posted: 10 Mar 2019 | 2:38:51 UTC

Is anyone having a problem installing or using NVIDIA driver 410.104 on their Linux machines? On one of my two hosts, this driver makes the computer freeze and automatically reboot about a minute after each boot, and the cycle repeats. The only solution I have found is to log in to tty2, purge all nvidia packages, then install nvidia-driver-390 (currently 390.116). Everything works fine with 390, and 410 works fine on my other host (Dell XPS 8930 with a 1070). (I would like to use 410 since it gives me better compute performance.)

Linux Mint 19.1, kernel 4.15.0-46-generic
EVGA 1080 FTW Hybrid
MSI Carbon Gaming AC mobo (board will show Ab when the computer freezes).
Intel 7820x with Corsair AIO.
Installing 410 from ppa:graphics-drivers/ppa

Any help/ideas greatly appreciated!

Aurum
Joined: 12 Jul 17
Posts: 399
Credit: 13,024,100,382
RAC: 766,238
Message 51617 - Posted: 10 Mar 2019 | 23:23:18 UTC

All my Linux computers are running 418.43 with no problems. Before that they ran 415.27 with no problems. Time to get current.

Keith Myers
Joined: 13 Dec 17
Posts: 1284
Credit: 4,926,706,959
RAC: 6,516,431
Message 51618 - Posted: 11 Mar 2019 | 1:43:08 UTC - in response to Message 51617.

The 410 branch is supposed to be the long-term branch. I had one hiccup on one machine I updated from 410.78 to 410.104: for some reason the update dropped the dkms installation left by 410.78 and the new driver didn't install. A 410.104 reinstall fixed that, and the other 3 machines updated from 410.78 with no issues. I also have two machines on 415.27 and they have been running very well too. I would choose either branch for stability now. I'm not sure I need to experiment with the 418 branch yet, and I think I will let it simmer for a while before I try it.

kksplace
Joined: 4 Mar 18
Posts: 53
Credit: 1,400,776,749
RAC: 3,597,733
Message 51621 - Posted: 12 Mar 2019 | 2:11:19 UTC - in response to Message 51618.

A little more experimentation shows I can't use the 418 driver either. It definitely seems to be a driver problem introduced recently since I had been using previous versions of 410.

Update: I don't have to purge Nvidia after all -- if I simply install nvidia-driver-390 from tty2 after the reboot-loop starts, then everything works fine. This is especially confusing since I have the problem only on one of my two hosts and I don't receive any errors when attempting to install either 410 or 418.

Well, 390 is working fine right now. Just wondering what is going on in order to prevent future problems.

Aurum
Joined: 12 Jul 17
Posts: 399
Credit: 13,024,100,382
RAC: 766,238
Message 51622 - Posted: 12 Mar 2019 | 2:35:17 UTC

I had a couple of computers that refused to upgrade from 410. They were Linux Mint 19, so I just erased the drive and did a fresh install of Linux Mint 19.1 and Nvidia 418.43, and everything worked great. They are dedicated DC clients, so no worries about files or anything. Almost all of my old motherboards also needed to have their BIOS updated.

Keith Myers
Joined: 13 Dec 17
Posts: 1284
Credit: 4,926,706,959
RAC: 6,516,431
Message 51628 - Posted: 13 Mar 2019 | 19:16:21 UTC

I believe the reason we are having issues with the Nvidia drivers is that the updates are also pulling in a huge number of changes to the Nouveau, AMD and Mesa drivers. There have been tons of new updates to the AMD and Nouveau drivers for the new cards and for gaming improvements.

Don't forget that every time you update the Nvidia drivers, the kernel modules get rebuilt (through dkms) for your kernel to incorporate them, and you won't be running the new drivers until you reboot.
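
One way to confirm after the reboot which driver version is actually loaded is to read /proc/driver/nvidia/version, which the proprietary Nvidia kernel module exposes while it is loaded. Here is a minimal Python sketch of that check. The function names and the regular expression are my own illustrative assumptions, and the expression assumes the usual "Kernel Module  <version>" layout of the first line in that file.

```python
import re
from pathlib import Path

# Assumed location of the version file published by the proprietary module.
VERSION_FILE = Path("/proc/driver/nvidia/version")


def parse_nvidia_version(text):
    """Extract a version such as '410.104' from the file's text, or None."""
    match = re.search(r"Kernel Module\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def running_nvidia_version(path=VERSION_FILE):
    """Return the loaded driver version, or None if it cannot be read."""
    try:
        return parse_nvidia_version(Path(path).read_text())
    except OSError:
        # The file is absent when the nvidia module is not loaded.
        return None
```

Calling `running_nvidia_version()` on a host that is still on the old driver, or that fell back to Nouveau, should return the old version or `None` respectively.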

Also, the intermediate step when installing new kernel mode drivers is to remove the old ones. That makes the system fall back to the built-in Nouveau drivers as the main driver to paint the desktop.

Any hiccup or interruption in the process can leave you running stock Nouveau drivers instead of your expected Nvidia drivers.

The Nouveau drivers don't have compute capability, so any BOINC project running a GPU app will probably fail.

I believe that is exactly what happened on the one host where I had issues with the 410.104 update. I rebooted to find only the Nouveau drivers running the desktop. Luckily I don't start BOINC at startup. I always start it manually so I can check that everything is good with the Nvidia drivers and run my overclock scripts. That way I will catch any configuration change that is not conducive to running BOINC.
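
The manual check described above could be sketched in Python roughly as follows. This is only an illustration, not the script actually used on these hosts: the function names are hypothetical, and the sketch assumes that the loaded kernel modules are listed one per line in /proc/modules with the module name as the first field.

```python
from pathlib import Path


def loaded_gpu_driver(modules_text):
    """Return 'nvidia', 'nouveau', or None from /proc/modules-style text."""
    # The first whitespace-separated field of each line is the module name.
    names = {line.split()[0] for line in modules_text.splitlines() if line.strip()}
    if "nvidia" in names:
        return "nvidia"
    if "nouveau" in names:
        return "nouveau"
    return None


def safe_to_start_boinc(modules_path="/proc/modules"):
    """True only when the proprietary nvidia module is loaded."""
    try:
        text = Path(modules_path).read_text()
    except OSError:
        # Cannot read the module list, so do not assume the GPU is usable.
        return False
    return loaded_gpu_driver(text) == "nvidia"
```

A startup wrapper could call `safe_to_start_boinc()` and launch the BOINC client only when it returns `True`, so a fallback to Nouveau does not produce failed GPU tasks.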
