all WUs downloaded recently produce "computation error" right away
| Author | Message |
|---|---|
|
Joined: 23 Dec 09 Posts: 189 Credit: 4,798,881,008 RAC: 0
|
Which driver version is necessary, and which driver version is safe? |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"... updating drivers should do it" - which might be impossible, or at least risky, in the case of Windows XP. Zoltan, what's your opinion on this? |
|
Joined: 17 Feb 13 Posts: 181 Credit: 144,871,276 RAC: 0
|
My drivers are locked to the versions that came with the devices: changing drivers causes failures. I will return as soon as the current system issues have been resolved as I believe GPUGrid performs valuable work. Now I am off to Folding and WCG..... John |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
"For a more correct solution we will have to wait for Matt to update the old app next week. In the meanwhile as I said updating drivers should do it" What the crap, Stefan? :) I'm already using the latest drivers! My failures are on Windows 10, using 381.65 and 381.78. Please provide more details on what drivers you think should work, and also why failures still happen on 381.65 and 381.78. Edit: I'm not 100% sure that I've been able to attempt a task using 381.78 yet. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"... My failures are on Windows 10, using 381.65 and 381.78. Please provide more details on what drivers you think should work, and also why failures still happen on 381.65 and 381.78." I was just going to ask here whether someone has already tried the latest drivers - your posting answers my question, although in the negative. So Matt's assumption that the latest drivers should solve the current problem unfortunately seems to be wrong :-( |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
The problem should now be fixed for anyone with a CUDA 8-capable driver. Matt |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 318
|
I see you've deprecated v8.48 completely, but left v9.15 (superficially - as far as we can see) unchanged. I couldn't get it to work earlier, but I'll try again within the hour - test machine is busy with another project just at the moment. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"The problem should now be fixed for anyone with a CUDA 8-capable driver." Which means that for Windows XP users, the problem is NOT solved yet, right? When will this be the case? |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
I've changed the rules for issuing the 915 version. Any Windows machine that is 64 bit and reports CUDA 8.0 capability will get it now. Matt |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"I've changed the rules for issuing the 915 version. Any Windows machine that is 64 bit and reports CUDA 8.0 capability will get it now." So which steps will be taken next to enable older drivers for XP to work? My XP with driver 368.81 did download version 915, the task did start, but was broken off after a few minutes with "too many exit(0)s" |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
I was expecting it to work on 64 bit XP, actually. Given that it doesn't, there's not a tremendous amount I can do to fix it immediately. We haven't had an XP test platform for a long time: Microsoft ended support for it 3 years ago! You really should upgrade... Matt |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 318
|
OK, let's put XP to bed - I think it's a red herring in this case. I have two - well, three - identical Windows 7/64 machines, each with GTX 970 GPUs. Two tests - first, with an older cuda 7.0 driver: no tasks available, no tasks sent. That's the right answer after deprecating v8.48. Second, the one which I upgraded with a cuda 8.0 driver earlier today (specifically, 368.81). Task was sent, and along with it the v9.15 application - again, as intended. So far so good. BUT - as reported earlier in this thread (but I appreciate you wouldn't want to read through the entire thing on a holiday Saturday), v9.15 isn't running on my Maxwell cards with the current batch of tasks. (It runs fine on a Pascal card in another machine.) Symptoms are: under BOINC, repeated iterations of "Task e4s7_e2s3p0f357-ADRIA_FOLDGREED10_crystal_ss_contacts_100_ubiquitin_4-1-2-RND7142_0 exited with zero status but no 'finished' file" until BOINC kills the task with the 'Too many exits' after 100 tries - exactly the message Erich got under XP. No difference between the OS versions - the difference is in the hardware (different generations of GPU). It seems to have changed with this new batch of tasks, since the initial test release a week ago. Running standalone in a terminal window, I get: D:\BOINCdata\slots\0>acemd.915-80 # ACEMD Molecular Dynamics Version [3212] # CUDA Synchronisation mode: BLOCKING # CUDA Synchronisation mode: BLOCKING # SWAN: Created context 0 on GPU 0 SWAN : FATAL : Cuda driver error 35 in file 'swanlibnv2.cpp' in line 448. # SWAN swan_assert 0 - that's the only diagnostic I've been able to capture. Nothing is written to the output or stderr files. Test task is 16240262 - I'll let it run through its 100 exits and report it as soon as I've posted this, so you can compare my Windows 7 output with Erich's XP. |
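The restart behaviour described in the post above (a task that keeps exiting with status zero but without a 'finished' file, until it is killed after 100 tries) can be sketched roughly as follows. This is only an illustration of the logic as reported in this thread, not BOINC's actual code: the function name `run_with_restart_limit`, the `run_once` callback and the returned strings are all hypothetical, and the limit of 100 is simply the figure quoted in the post.

```python
from typing import Callable, Tuple

# Limit quoted in the post above; assumed here, not taken from BOINC source.
MAX_ZERO_EXITS = 100


def run_with_restart_limit(
    run_once: Callable[[], Tuple[int, bool]],
    max_zero_exits: int = MAX_ZERO_EXITS,
) -> str:
    """Restart a task until it finishes, fails, or exits cleanly too often.

    run_once is a hypothetical callback that runs the application once and
    returns (exit_status, finished_file_present).
    """
    zero_exits = 0
    while True:
        status, finished = run_once()
        if status != 0:
            # A non-zero exit is an ordinary failure.
            return f"failed: exit status {status}"
        if finished:
            # Zero exit plus a 'finished' marker means the task completed.
            return "finished"
        # Zero exit without the marker: count it and try again.
        zero_exits += 1
        if zero_exits >= max_zero_exits:
            return "aborted: too many exits"
```

An application that dies immediately with exit status zero, as acemd.915-80 appears to do here, would go round this loop until the limit is reached, which matches the "Too many exits" outcome reported on both Windows 7 and XP.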
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
Matt, it's a bit off-topic, but let me explain: these Windows XP x64 hosts are dedicated crunching boxes (therefore it does not matter if their OS is not supported anymore). A lot of effort has been put into them to make the GTX 980Ti work under Windows XP: selecting the right MB, "hacking" the NV driver to recognize the top-end cards, etc. The reason *not* to upgrade them from Windows XP is to maximize their throughput (avoiding WDDM). The other path to achieve this is to use Linux, but you haven't put the SWAN_SYNC option into the latest Linux client (as far as my test showed, but please correct me if I'm wrong), which hinders the performance of the top-end cards under Linux too. So you could motivate us to use Linux instead of the deprecated Windows XP if you put that option in the Linux client; it could also increase the performance of the top-end cards by 10~15% under Linux. But for now, if you could make a fresh CUDA 6.5 client, that would be great (and it would save us a lot of work). Thank you in advance! |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
Richard, That error means "insufficient driver version". According to the records that machine is running Windows 7 64b, not XP. Why are you running that driver version? Matt |
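To make the "insufficient driver version" reading concrete, here is a small sketch of the kind of comparison involved. It assumes the usual CUDA convention of encoding a version as 1000*major + 10*minor (so 8000 stands for CUDA 8.0); the function names are made up for illustration and this is not the check that acemd or SWAN actually performs. It also does not explain why a driver that reports CUDA 8.0 capability, such as 368.81, still triggered the error.

```python
from typing import Tuple


def decode_cuda_version(encoded: int) -> Tuple[int, int]:
    """Split an encoded CUDA version (1000*major + 10*minor) into (major, minor)."""
    return encoded // 1000, (encoded % 1000) // 10


def driver_is_sufficient(driver_supports: int, app_built_for: int) -> bool:
    """True if the driver's supported CUDA version is at least the app's.

    Both arguments use the encoded integer form, e.g. 7000 or 8000.
    """
    return decode_cuda_version(driver_supports) >= decode_cuda_version(app_built_for)


# A cuda 7.0 driver against an app built for CUDA 8.0 is not sufficient.
print(decode_cuda_version(8000))         # (8, 0)
print(driver_is_sufficient(7000, 8000))  # False
```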
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 318
|
I was running the same cuda 7.0 driver version on all machines until this morning - I upgraded this morning for testing only. My experience is that each successive driver release is slower for general purpose computing (generalisation - YMMV). Since I'm not a gamer, I don't need or want the latest game patches. I just picked one that was the last in its particular sequence, so more likely to be stable and bug-free. We have the benefit of Jacob Klein reporting into this thread as well (see message 46910) - he does test the latest drivers for the benefit of the wider BOINC community, and has persuaded NVidia to fix several bugs over the years. He reports the same as me. Edit - Your Pascal app release post (message 44869) says simply "NVIDIA Driver 360+" - I thought I'd aimed high enough above that? |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
Jacob was testing before I'd changed the issuing rules for 915 - he never even got the app to test, let alone see any failures. If you could try a later version I'd appreciate it. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 318
|
Sure, anything I can do to help. Supper has just beeped in the microwave, but I'll download while I eat, and install later. Edit - 381.65 on its way. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
MJH: Can you please explain: 1) What caused the problem? 2) What solved the problem? 3) Why were "updated drivers" previously recommended as a solution? |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 318
|
Specify *which* problem, please. Version not downloading - server configuration (plan class specification, I suspect), fixed. Current set of tasks failing on older cards - not fixed, under exploration. |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
"1) What caused the problem?" The executables that we deploy time-expire after a year or so due to licensing issues. "2) What solved the problem?" I've reconfigured the scheduler to send the 915 app (supporting Kepler+) to all 64 bit hosts that report CUDA 8 support. This seems not to work on Windows XP, despite the last 368 driver seemingly reporting CUDA 8 support. Matt |
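For readers trying to follow the issuing rule described in this thread (the 915 app goes to any 64 bit Windows host that reports CUDA 8.0 capability), a rough sketch of that eligibility test might look like the following. The `Host` fields and the function `gets_app_915` are hypothetical names chosen for illustration; the real rule lives in the project's scheduler configuration, which is not shown here. CUDA versions are assumed to be encoded as 1000*major + 10*minor.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical summary of what a host reports to the scheduler."""
    os_name: str       # e.g. "Windows 7" or "Windows XP"
    is_64bit: bool
    cuda_version: int  # encoded as 1000*major + 10*minor, e.g. 8000 for CUDA 8.0


def gets_app_915(host: Host) -> bool:
    """Apply the rule as stated in the thread: Windows, 64 bit, CUDA >= 8.0."""
    return (
        host.os_name.lower().startswith("windows")
        and host.is_64bit
        and host.cuda_version >= 8000
    )


# A Windows 7/64 host on a cuda 7.0 driver gets nothing; after upgrading to a
# driver that reports CUDA 8.0, the same host is sent the 915 app.
print(gets_app_915(Host("Windows 7", True, 7000)))  # False
print(gets_app_915(Host("Windows 7", True, 8000)))  # True
```

Note that a rule like this only looks at what the host reports. A 64 bit XP host whose driver reports CUDA 8.0 would pass it too, which is consistent with Erich receiving version 915 and then seeing it fail.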
©2025 Universitat Pompeu Fabra