Nvidia OpenCL problem for 364.* drivers
| Author | Message |
|---|---|
robertmiles Joined: 16 Apr 09 Posts: 503 Credit: 769,991,668 RAC: 0
|
The OpenCL section of the Nvidia 364.72 driver, and earlier 364.* drivers, has a problem which can cause an entire computer to lock up, or cause a few dozen OpenCL tasks (often not all from the same BOINC project) to fail quickly with a Compute Error. The problem has not been seen with the 362.00 driver. Tasks from POEM@home seem the most likely to trigger it. Threads on the problem: https://www.primegrid.com/forum_thread.php?id=6769#94223 http://boinc.fzk.de/poem/forum_thread.php?id=1205#10896 I currently do not have hardware that can check whether GPUGRID has this problem, but you may want to watch for it. |
|
Joined: 26 Nov 13 Posts: 17 Credit: 50,096,588 RAC: 0
|
Great. I just applied the 364.72 Nvidia update yesterday and now all of my GPUGrid tasks are crashing. One failed after considerable time had elapsed and the last two crashed just after starting. |
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
Could be something else entirely, because this board is not full of WU failures due to these drivers and I've run them myself since they came out. GPUGrid doesn't use OpenCL. |
|
Joined: 26 Nov 13 Posts: 17 Credit: 50,096,588 RAC: 0
|
I'll try a clean install of the drivers and see if that fixes the issue. |
robertmiles Joined: 16 Apr 09 Posts: 503 Credit: 769,991,668 RAC: 0
|
I have no information on whether this problem also affects CUDA tasks, but for OpenCL tasks, one task crashes after a few hours, then perhaps two dozen more (not necessarily from the same BOINC project) crash quickly. Restarting Windows appears to be required to make any more OpenCL tasks complete properly. |
|
Joined: 26 Nov 13 Posts: 17 Credit: 50,096,588 RAC: 0
|
I've reverted back to ver. 362.00 to see if this fixes my GPUGrid WU problems - when there's more WUs available I'll be able to tell. It looks like my ASUS GTX650-E-1GD5 GeForce GPU didn't run ver. 364.xx very well. Yeah, I know it's an old card. There were multiple errors in Windows Event Viewer and my ASUS GPUTweak was also blowing up. Ver. 362.00 fixed that. |
|
Joined: 26 Nov 13 Posts: 17 Credit: 50,096,588 RAC: 0
|
Yeah, that fixed it. My WUs are completing normally now. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
'Upgraded' to 364.72 WHQL (Clean install wouldn't work on W10x64) and found that it crashed all POEM tasks (OpenCL) [driver restarts]. Ran MW and Einstein tasks without problems and so far it's running a task here without difficulty. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 2
|
Quote: "'Upgraded' to 364.72 WHQL (Clean install wouldn't work on W10x64) and found that it crashed all POEM tasks (OpenCL) [driver restarts]." Jacob Klein has already reported that one to NVidia, and got David Anderson to add an option to disallow OpenCL tasks, wherever they might pop up from. I think that's a sledgehammer to crack a very small nut, and I've told him so, but you might like to test the new v7.6.32 (you'll have to find the download yourself - it hasn't even gone into alpha testing yet). |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
:) I see my name got mentioned. Yeah, it's nice to have an option to disable OpenCL at the client, in my opinion, for cases like this where you may want the latest drivers for gaming, but can't support running OpenCL tasks due to NVIDIA. My ticket with them is regarding the OpenCL SDK examples failing on Maxwell, but I also mentioned to them that R364 drivers are failing Poem@Home tasks and causing TDRs, BSODs, restarts, and even making other tasks fail. The BOINC 7.6.32+ cc_config option for <no_opencl>1</no_opencl> ... works nicely as a workaround, for the scenario. The R364 drivers are still trash, in my opinion. The main reason I run them is to help find problems to get them fixed. In addition to the horrible OpenCL woes, the R364 drivers also have a bug with brief full screen corruption any time a CUDA task starts on my eVGA GTX 980Ti FTW at 144 Hz. Junk. The 362.00 drivers are the latest that have my solid recommendation. Regards, Jacob |
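For anyone wanting to try the workaround mentioned above, here is a minimal sketch in Python that generates a cc_config.xml containing the option. It assumes `<no_opencl>` sits under `<options>` like other client options; the function name `build_cc_config` is made up for illustration. Check the BOINC client configuration documentation for your version before relying on it.

```python
import xml.etree.ElementTree as ET


def build_cc_config(no_opencl: bool = True) -> str:
    """Return cc_config.xml text with the <no_opencl> option set.

    Assumption: the option is placed inside <options>, as with other
    BOINC client options. Requires a client that supports it (7.6.32+
    according to the post above).
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "no_opencl").text = "1" if no_opencl else "0"
    # encoding="unicode" returns a str rather than bytes
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Print the XML so it can be reviewed before saving it by hand
    print(build_cc_config())
```

The output would be saved as cc_config.xml in the BOINC data directory, and the client then needs to re-read its config files (or be restarted) for the change to take effect.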
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
Thanks for posting the information about this problem and also for the recommendation concerning the latest relatively bug free drivers (362.00). |
|
Joined: 26 Feb 12 Posts: 184 Credit: 222,376,233 RAC: 0
|
I read somewhere that Primegrid will no longer send tasks to any computer that has these problematic drivers installed. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Quote: "I read somewhere that Primegrid will no longer send tasks to any computer that has these problematic drivers installed." That would be wise, as the tasks supposedly gracefully complete with miscalculated results! :) We're tracking the problem/solution here: http://www.primegrid.com/forum_thread.php?id=6775 ... where I have an NVIDIA dev looking into it. So, look there for updates. Edit: Made hyperlink clickable, sorry about that. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 2
|
http://www.primegrid.com/forum_thread.php?id=6775 (just making it clicky so I can follow without editing every time) Edit - looks like you've got some experienced debuggers active there. Excellent news. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
I have confirmed that today's 365.10 drivers do NOT fix the OpenCL problems -- PrimeGrid miscalculation and Poem@Home TDRs. I'd recommend that users stick with 362.00, and that projects take action to prevent issuing OpenCL tasks to R364 users. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
I have a small status update, regarding my NVIDIA bug (Bug ID 1754468) for these OpenCL issues: - Status changed from "Open - pending review" to "Open - in progress" |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Another small update -- basically, while NVIDIA fixes the problems, they're requesting additional info to potentially make "Poem@Home" and "PrimeGrid calculation" test cases that could be used in their checklist to release new drivers. That's a GREAT idea, in my opinion :) |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Lots of updates in these 2 threads: Basically, the problems have been solved, but only the POEM crashes will land in the upcoming (any day) driver release. The PrimeGrid miscalcs will have to wait until the (sometime later this month) driver release. http://www.primegrid.com/forum_thread.php?id=6775 http://boinc.fzk.de/poem/forum_thread.php?id=1205 |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
I have confirmed that the new Doom 365.19 drivers: - Do NOT fix the OpenCL/CUDA miscalculations (Internal NVIDIA Bug ID: 200197534) - DO fix the Poem@Home TDR/crashes (NVIDIA Bug ID: 1754468) So... If you do any distributed computing involving OpenCL/CUDA calculating, I recommend that you **stick with 362.00** for correct calculations, until the next driver release which should have the miscalculation fix. Thanks, Jacob |
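As a rough summary of the driver reports in this thread, the following Python sketch maps a driver version string to the problems posters have described for it. This is only a restatement of what has been posted here, not an official NVIDIA list; the names `reported_issues`, `CRASH` and `MISCALC` are made up for illustration, and versions not discussed in the thread return `None`.

```python
# Problems as described by posters in this thread (not an official list).
CRASH = "OpenCL crashes/TDRs (e.g. POEM@home)"
MISCALC = "miscalculated results (e.g. PrimeGrid)"

# Specific versions mentioned in the thread.
_KNOWN = {
    "362.00": set(),             # recommended in the thread
    "365.10": {CRASH, MISCALC},  # reported as fixing neither problem
    "365.19": {MISCALC},         # crashes fixed, miscalculations not
}


def reported_issues(version):
    """Return the set of problems reported here for a driver version.

    Returns an empty set if the thread reports no problems, or None if
    the version was not discussed at all.
    """
    version = version.strip()
    major = version.split(".")[0]
    if major == "364":
        # All 364.* drivers were reported as affected.
        return {CRASH, MISCALC}
    return _KNOWN.get(version)


if __name__ == "__main__":
    for v in ("362.00", "364.72", "365.10", "365.19"):
        print(v, sorted(reported_issues(v)))
```

A project or a user script could use a check like this to decide whether to hold back OpenCL work, but the table would need updating as new drivers are released and tested.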
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
Quote: "I have confirmed that the new Doom 365.19 drivers:" I have the 364.72 driver on 3 of my hosts, and my Einstein@home tasks are validating just fine. So I'm not sure to what extent this issue affects CUDA tasks. |
©2026 Universitat Pompeu Fabra