all WUs downloaded recently produce "computation error" right away

Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 261
Message 47121 - Posted: 27 Apr 2017, 11:20:20 UTC - in response to Message 47120.  

back at it again

Anyone else getting a bunch of faulty WU?

No - the WUs seem to be fine at the moment, and your failures since 26 April come from a range of different WU types.

The output of your most recent successful task shows

Driver version	: r376_38 : 37653

but your computer now shows

NVIDIA GeForce GTX 970 (4095MB) driver: 381.89

Since you're running Windows 10, I suspect you've suffered from the common 'automatic driver update by Microsoft'. Try updating your driver again, this time directly from the NVidia site.
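For anyone who wants to make the same comparison on their own host, here is a minimal Python sketch. It assumes (my reading, not something the project documents in this thread) that the five digits at the end of the task output line are the driver version with the dot removed, so 37653 means 376.53. The function names are made up for illustration.

```python
import re


def parse_task_driver(line: str) -> str:
    """Return a dotted driver version from a task-output line.

    Assumption: the trailing five digits are the version without the dot,
    e.g. '37653' stands for '376.53'.
    """
    match = re.search(r":\s*(\d{5})\s*$", line)
    if match is None:
        raise ValueError(f"no driver number found in {line!r}")
    digits = match.group(1)
    return f"{digits[:3]}.{digits[3:]}"


def parse_host_driver(line: str) -> str:
    """Return the driver version from a host description line."""
    match = re.search(r"driver:\s*(\d+\.\d+)", line)
    if match is None:
        raise ValueError(f"no driver version found in {line!r}")
    return match.group(1)


# The two lines quoted in the post above.
task_line = "Driver version\t: r376_38 : 37653"
host_line = "NVIDIA GeForce GTX 970 (4095MB) driver: 381.89"

last_good = parse_task_driver(task_line)
current = parse_host_driver(host_line)

if last_good != current:
    print(f"driver changed since last success: {last_good} -> {current}")
```

A mismatch only shows that the driver changed between the last success and now; it does not prove the driver is the cause.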
ID: 47121 · Rating: 0
Loohi

Joined: 27 Aug 16
Posts: 16
Credit: 43,745,875
RAC: 0
Message 47123 - Posted: 27 Apr 2017, 12:56:08 UTC - in response to Message 47121.  

Thanks for the detailed answer. I triggered both updates myself: the Windows 10 Creators Update and the NVIDIA driver. I'll reinstall the drivers now and see whether it makes a difference tomorrow.
ID: 47123 · Rating: 0
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47125 - Posted: 27 Apr 2017, 18:58:21 UTC
Last modified: 27 Apr 2017, 18:58:55 UTC

Just to clarify ....
Work units are NOT FINE on CC3/SM3 GPUs like my GTX 660 Ti GPUs :(

Still waiting for MJH to give us more details on what went wrong, and who must fix it..
ID: 47125 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 261
Message 47126 - Posted: 27 Apr 2017, 19:28:15 UTC - in response to Message 47125.  

Is it the workunits (some types? all types?) which fail on your GTX 660 Ti, or the new application?
ID: 47126 · Rating: 0
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 47127 - Posted: 27 Apr 2017, 20:05:48 UTC - in response to Message 47125.  

Just to clarify ....
Work units are NOT FINE on CC3/SM3 GPUs like my GTX 660 Ti GPUs :(

Still waiting for MJH to give us more details on what went wrong, and who must fix it..


I've got my 660 Ti still running on the 359.06 driver with the CUDA 6.5 app, and it works fine.
ID: 47127 · Rating: 0
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47128 - Posted: 28 Apr 2017, 3:41:25 UTC - in response to Message 47127.  

Is it the workunits (some types? all types?) which fail on your GTX 660 Ti, or the new application?



Just to clarify ....
Work units are NOT FINE on CC3/SM3 GPUs like my GTX 660 Ti GPUs :(

Still waiting for MJH to give us more details on what went wrong, and who must fix it..


I've got my 660 Ti still running on the 359.06 driver with the CUDA 6.5 app, and it works fine.



The 9.18 (cuda80) app crashes on my GTX 660 Ti GPUs that are in the same PC as my GTX 970. To my knowledge, this machine is intentionally and correctly given 9.18 (cuda80) tasks, but there's a problem with the app.

MJH said:

15 Apr 2017 | 21:43:26 UTC
http://www.gpugrid.net/forum_thread.php?id=4545&nowrap=true#46932
For some reason the sm 3.0 support (and only that sm version) is broken.


17 Apr 2017 | 19:49:15 UTC
http://www.gpugrid.net/forum_thread.php?id=4551&nowrap=true#46981
The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80 that affects only that hardware version. When that's fixed, hosts with a non-XP Windows will get 918.


.....
But I don't know what that means!

Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?
I feel like nobody is trying to fix it.
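To make the "sm 3.0 only" point concrete, here is a small Python sketch that tags the GPUs in a mixed host by compute capability. The capability values are NVIDIA's published figures for these cards; the host layout (one GTX 970 plus two GTX 660 Ti) and the function name are assumptions for illustration, and this is not how the GPUGrid scheduler decides anything.

```python
# Compute capability (major, minor) for cards named in this thread.
COMPUTE_CAPABILITY = {
    "GTX 660 Ti": (3, 0),  # Kepler
    "GTX 750 Ti": (5, 0),  # Maxwell
    "GTX 970": (5, 2),     # Maxwell
}


def affected_by_sm30_issue(card: str) -> bool:
    """True if the card is an sm 3.0 device, the only version MJH reports as broken."""
    return COMPUTE_CAPABILITY[card] == (3, 0)


# Hypothetical mixed host: the exact number of 660 Ti cards is an assumption.
host_gpus = ["GTX 970", "GTX 660 Ti", "GTX 660 Ti"]

for index, card in enumerate(host_gpus):
    status = "affected (sm 3.0)" if affected_by_sm30_issue(card) else "not affected"
    print(f"GPU {index}: {card}: {status}")
```

On a host like this, the same 9.18 (cuda80) task could succeed or fail depending on which card it lands on.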
ID: 47128 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 261
Message 47130 - Posted: 28 Apr 2017, 10:12:13 UTC - in response to Message 47128.  

17 Apr 2017 | 19:49:15 UTC
http://www.gpugrid.net/forum_thread.php?id=4551&nowrap=true#46981
The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80 that affects only that hardware version. When that's fixed, hosts with a non-XP Windows will get 918.

.....
But I don't know what that means!

Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?
I feel like nobody is trying to fix it.

A compiler is an integral part of the development software used by computer programmers to create useful applications.

In this case, the CUDA 8.0 compiler is maintained and distributed by NVidia to facilitate sales of their hardware products (GPUs). It would be difficult-to-impossible to do anything with a GPU without NVidia's compiler.

The CUDA compiler comprises two parts: the first part, which resides in the 'CUDA toolkit' on Matt's machine, produces intermediate code. The second part, which resides in the drivers on all our machines, converts the universal intermediate code into machine code instructions tailored to the specific hardware found in the target computer.

Matt hasn't identified (in public, at least) which of the two components he believes to be at fault. Since it's hardware-specific, my personal opinion is that it's likely to be the driver-level component - but I've been wrong before.

Either way, both components are the responsibility of NVidia. Any change would have to be implemented and distributed by them.
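The two-stage arrangement can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not NVIDIA's actual loader: it assumes an application can ship precompiled machine code for some architectures plus intermediate code (PTX), that machine code runs only on a device with the same major version and an equal or higher minor version, and that PTX can be compiled by the driver for any device at or above the PTX's target version. The packaging in the example (machine code for 5.0, PTX for 3.0) is hypothetical; nobody in this thread has said how the 9.18 app is actually built.

```python
from typing import Optional, Sequence, Tuple

CC = Tuple[int, int]  # compute capability as (major, minor)


def pick_code_path(device: CC, machine_code: Sequence[CC], ptx: Optional[CC]) -> str:
    """Describe which code a device would run, under the simplified rules above."""
    # Stage 1 output: machine code built ahead of time by the toolkit.
    for built in machine_code:
        if built[0] == device[0] and device[1] >= built[1]:
            return f"runs machine code built for sm_{built[0]}{built[1]}"
    # Stage 2: the driver compiles the intermediate code at load time.
    if ptx is not None and device >= ptx:
        return f"driver JIT-compiles PTX built for compute_{ptx[0]}{ptx[1]}"
    return "no usable code"


# Hypothetical packaging, for illustration only.
shipped_machine_code = [(5, 0)]
shipped_ptx = (3, 0)

for name, cc in [("GTX 660 Ti", (3, 0)), ("GTX 970", (5, 2))]:
    print(name, "->", pick_code_path(cc, shipped_machine_code, shipped_ptx))
```

Under that hypothetical packaging, only the sm 3.0 card would go through the driver-level stage, which is one way a fault could show up on a single hardware version. It does not settle which component is really at fault.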

But you've encountered an age-old problem, previously described in terms of putting new wine into old bottles, or teaching old dogs new tricks. When a complex system relies on two symbiotic components (hardware and software, in this case), to what extent is it realistic to expect that every new pairing will work together ad infinitum?

Personally, I feel it's advantageous to keep computer systems 'balanced' - with hardware and software of a comparable vintage. My trusty and long-serving 9800 GTs have joined my Windows 3 computers in the museum - I haven't tried to convert them to run Cuda 8 or Windows 10. I suggest that, if you feel GTX 660 Ti cards are still energy-efficient enough to be useful, you put them into a chassis with a similar vintage of operating system and a Cuda 6.5 driver.
ID: 47130 · Rating: 0
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47131 - Posted: 28 Apr 2017, 11:28:46 UTC - in response to Message 47130.  
Last modified: 28 Apr 2017, 11:35:42 UTC

Thanks Richard, but ...

My GTX 660 Ti GPUs are supported by the driver version that I use, the OS that I use, the CUDA version the application was built for, and the application that I'm trying to run.

I expect this to work. It sounds like GPUGrid also expects this to work. It does not work.

My very simple question remains unanswered:
Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?

I feel like nobody is trying to fix it.
If it is something NVIDIA must fix, and if GPUGrid gave me enough info to identify the problem, then I could urge my NVIDIA contacts to look at it.

But MJH hasn't released details.

MJH?
ID: 47131 · Rating: 0
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 47132 - Posted: 28 Apr 2017, 12:05:12 UTC

The fix I would prefer is for MJH to limit the relevant GPU application to the newer cards; i.e., Maxwell and later. Whatever "fix" he might come up with may limit the performance of the newer cards, or at least require a lot of his time and effort that might be spent in better ways on new apps.

There will be the usual moaning and groaning, and people will leave. But there are plenty of volunteers anyway, and even more problems. So reduce both.
ID: 47132 · Rating: 0
3de64piB5uZAS6SUNt1GFDU9dRhY

Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Message 47133 - Posted: 28 Apr 2017, 13:58:36 UTC - in response to Message 47132.  
Last modified: 28 Apr 2017, 14:00:48 UTC

The fix I would prefer is for MJH to limit the relevant GPU application to the newer cards; i.e., Maxwell and later.


My two cents... I would agree for the long runs, as it doesn't make sense to run them on an old GTX 660 anyway. But not as a general measure for both long and short runs. Do we have any statistics on how many Kepler cards are still in use at GPUGRID? I reckon there are a great many... and therefore we shouldn't jump the gun by excluding them.

There will be the usual moaning and groaning, and people will leave. But there are plenty of volunteers anyway, and even more problems. So reduce both.


Well, if there are as many as I suspect (650 Ti, 660, 660 Ti, 670, 680), it would be very difficult to compensate for that loss of crunching power. I have my doubts.
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.
ID: 47133 · Rating: 0
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 47134 - Posted: 28 Apr 2017, 16:55:21 UTC - in response to Message 47133.  

OK, that makes sense. I forgot about the short runs, but the Keplers would be quite nice for that.
ID: 47134 · Rating: 0
PappaLitto

Joined: 21 Mar 16
Posts: 513
Credit: 4,673,458,277
RAC: 0
Message 47135 - Posted: 28 Apr 2017, 18:07:48 UTC

Kepler is not nearly old enough to drop support for, nor is it inefficient enough, as it's still on 28 nm like Maxwell. I'm glad they dropped Fermi because of its larger lithography node and inefficient architecture.
ID: 47135 · Rating: 0
Loohi

Joined: 27 Aug 16
Posts: 16
Credit: 43,745,875
RAC: 0
Message 47138 - Posted: 29 Apr 2017, 4:23:18 UTC

Despite reinstalling the NVIDIA drivers, I'm still getting immediate computation errors since the Windows 10 Creators Update. Since this is not going to change, do you have any recommendations for what I could try to start crunching again?
ID: 47138 · Rating: 0
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Message 47139 - Posted: 29 Apr 2017, 5:34:16 UTC - in response to Message 47134.  

OK, that makes sense. I forgot about the short runs, but the Keplers would be quite nice for that.

Except that the availability of short runs has dropped quite a bit lately :-(

Myself, I have already considered switching to short runs with my two GTX 750 Ti cards, since after the latest crunching software (acemd_918.80) was rolled out, the crunching times have increased considerably, up to almost 60 hours (as other members have also noticed).
ID: 47139 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 261
Message 47141 - Posted: 29 Apr 2017, 8:57:35 UTC - in response to Message 47138.  

Despite reinstalling the NVIDIA drivers, I'm still getting immediate computation errors since the Windows 10 Creators Update. Since this is not going to change, do you have any recommendations for what I could try to start crunching again?

It's beginning to look as if there might be a problem with that 381.89 driver, isn't it? It was only released on 25 April, and I haven't heard about anybody else trying to use it yet.

Maybe other users could post their observations, either way - and while we're waiting, you could try reverting to an older driver to see if that helps. Go to http://www.nvidia.com/Download/Find.aspx, fill in your card and operating system details, and choose from the search result list - anything between 372.54 and 381.65 should be fine. When you run the installer, choose 'custom' installation and check the 'clean install' box just to be on the safe side.
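If you want to check a candidate driver against that range before installing it, a minimal Python sketch follows. The bounds are the two versions named above; the function names are illustrative, and it assumes version strings of the plain "major.minor" form.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int]:
    """Split a 'major.minor' driver version into a comparable tuple of ints."""
    major, minor = text.split(".")
    return int(major), int(minor)


LOWEST = parse_version("372.54")
HIGHEST = parse_version("381.65")


def in_suggested_range(version: str) -> bool:
    """True if the version lies within the suggested range, bounds included."""
    return LOWEST <= parse_version(version) <= HIGHEST


for candidate in ["376.53", "381.65", "381.89"]:
    verdict = "within range" if in_suggested_range(candidate) else "outside range"
    print(candidate, verdict)
```

Comparing integer tuples rather than raw strings or floats avoids surprises such as "381.9" sorting after "381.65".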
ID: 47141 · Rating: 0
Loohi

Joined: 27 Aug 16
Posts: 16
Credit: 43,745,875
RAC: 0
Message 47142 - Posted: 29 Apr 2017, 12:05:45 UTC - in response to Message 47141.  
Last modified: 29 Apr 2017, 12:06:01 UTC

Yeah, maybe I'll try that, but it's hard to test: after 2 faulty WUs, I have to wait another 24 hours to get the next ones.
ID: 47142 · Rating: 0
Loohi

Joined: 27 Aug 16
Posts: 16
Credit: 43,745,875
RAC: 0
Message 47143 - Posted: 30 Apr 2017, 7:39:51 UTC - in response to Message 47142.  

I went back and saw that some successful WUs were completed with the latest NVIDIA driver (the one I'm running now), so it's fair to assume that the Windows 10 Creators Update is the culprit, since nothing else changed. Does that basically mean I won't be able to do any work until GPUGrid becomes compatible with the Windows 10 Creators Update? I fear this might take a long time...
ID: 47143 · Rating: 0
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47153 - Posted: 1 May 2017, 18:34:36 UTC - in response to Message 47131.  
Last modified: 1 May 2017, 18:41:13 UTC

Thanks Richard, but ...

My GTX 660 Ti GPUs are supported by the driver version that I use, the OS that I use, the CUDA version the application was built for, and the application that I'm trying to run.

I expect this to work. It sounds like GPUGrid also expects this to work. It does not work.

My very simple question remains unanswered:
Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix?

I feel like nobody is trying to fix it.
If it is something NVIDIA must fix, and if GPUGrid gave me enough info to identify the problem, then I could urge my NVIDIA contacts to look at it.

But MJH hasn't released details.

MJH?




Request for users affected by "9.18 (cuda80)" app instantly failing:

My NVIDIA contact has a request:

Please fill out the Driver Feedback survey below, if you are affected by the GPUGrid "9.18 (cuda80)" app immediately failing with "Computation Error" on your GPU. This helps them assign priority when fixing issues. Be thorough when filling it out, please.

http://surveys.nvidia.com/index.jsp?pi=6e7ea6bb4a02641fa8f07694a40f8ac6

Thanks,
Jacob
ID: 47153 · Rating: 0
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 47154 - Posted: 1 May 2017, 20:27:23 UTC - in response to Message 47153.  

I have read somewhere else that scientists have a major problem communicating with ordinary people (I mean thick), and all the problems with this project seem to bear this out.

That's why science and the majority will never meet and, more darkly, science will be rejected by the majority.
ID: 47154 · Rating: 0
Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 47164 - Posted: 3 May 2017, 20:03:57 UTC

Guess who's going to download the 1.2 GB CUDA 8.0 toolkit, and install the 8 GB Visual Studio 2015 Community Edition IDE, in an attempt to repro the SM3/CC3 compiler issues using the CUDA Toolkit samples?

Yeah. Me. I'm hardcore sometimes.
ID: 47164 · Rating: 0

©2025 Universitat Pompeu Fabra