all WUs downloaded recently produce "computation error" right away
| Author | Message |
|---|---|
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 261
|
"back at it again" No - the WUs seem to be fine at the moment, and your failures since 26 April come from a range of different WU types. The output of your most recent successful task shows 'Driver version : r376_38 : 37653', but your computer now shows 'NVIDIA GeForce GTX 970 (4095MB) driver: 381.89'. Since you're running Windows 10, I suspect you've suffered from the common 'automatic driver update by Microsoft'. Try updating your driver again, this time direct from the NVIDIA site. |
|
Joined: 27 Aug 16 Posts: 16 Credit: 43,745,875 RAC: 0
|
Thanks for the detailed answer. I triggered both updates myself, for the Win10 Creators Update and for the NVIDIA driver. I'll try to reinstall these drivers now and see if it makes a difference tomorrow. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Just to clarify .... Work units are NOT FINE on CC3/SM3 GPUs like my GTX 660 Ti GPUs :( Still waiting for MJH to give us more details on what went wrong, and who must fix it. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 261
|
Is it the workunits (some types? all types?) which fail on your GTX 660 Ti, or the new application? |
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
"Just to clarify ...." My 660 Ti is still running on the 359.06 driver with the CUDA 6.5 app, and it works fine. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
"Is it the workunits (some types? all types?) which fail on your GTX 660 Ti, or the new application?" Just to clarify .... The 9.18 (cuda80) app crashes on my GTX 660 Ti GPUs that are in the same PC as my GTX 970. To my knowledge, this machine is intentionally and correctly given 9.18 (cuda80) tasks, but there's a problem with the app. MJH said (15 Apr 2017, 21:43:26 UTC, http://www.gpugrid.net/forum_thread.php?id=4545&nowrap=true#46932): "For some reason the sm 3.0 support (and only that sm version) is broken." And (17 Apr 2017, 19:49:15 UTC, http://www.gpugrid.net/forum_thread.php?id=4551&nowrap=true#46981): "The peculiar exception for sm 3.0 devices is due to a compiler problem with CUDA 80 that affects only that hardware version. When that's fixed, hosts with a non-XP Windows will get 918." ..... But I don't know what that means! Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix? I feel like nobody is trying to fix it. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 261
|
Re MJH's post of 17 Apr 2017, 19:49:15 UTC: A compiler is an integral part of the development software used by computer programmers to create useful applications. In this case, the CUDA 8.0 compiler is maintained and distributed by NVIDIA to facilitate sales of their hardware products (GPUs). It would be difficult-to-impossible to do anything with a GPU without NVIDIA's compiler. The CUDA compiler comprises two parts: the first part, which resides in the 'CUDA toolkit' on Matt's machine, produces intermediate code. The second part, which resides in the drivers on all our machines, converts the universal intermediate code into machine code instructions tailored to the specific hardware found in the target computer. Matt hasn't identified (in public, at least) which of the two components he believes to be at fault. Since it's hardware-specific, my personal opinion is that it's likely to be the driver-level component - but I've been wrong before. Either way, both components are the responsibility of NVIDIA. Any change would have to be implemented and distributed by them. But you've encountered an age-old problem, previously described in terms of putting new wine into old bottles, or teaching old dogs new tricks. When a complex system relies on two symbiotic components (hardware and software, in this case), to what extent is it realistic to expect that every new pairing will work together ad infinitum? Personally, I feel it's advantageous to keep computer systems 'balanced' - with hardware and software of a comparable vintage. My trusty and long-serving 9800 GTs have joined my Windows 3 computers in the museum - I haven't tried to convert them to run CUDA 8 or Windows 10. I suggest that, if you feel GTX 660 Ti cards are still energy-efficient enough to be useful, you put them into a chassis with a similar vintage of operating system and a CUDA 6.5 driver. |
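For readers who want the two-stage picture in concrete terms, here is a rough sketch of how a CUDA application ends up running on a given card. It assumes the general compatibility rules from NVIDIA's CUDA documentation: prebuilt machine code for `sm_X.Y` runs on devices with the same major version and an equal or higher minor version, while the intermediate code (PTX) for `compute_X.Y` can be compiled by the driver for any device of that capability or newer. The function name `load_path` and the target lists are hypothetical; they are not the actual build settings of the GPUGrid app, which have not been published in this thread.

```python
def load_path(device_cc, sass_targets, ptx_targets):
    """Return how a CUDA binary would run on a device.

    device_cc    -- (major, minor) compute capability of the GPU
    sass_targets -- capabilities the toolkit emitted machine code for
    ptx_targets  -- capabilities the toolkit emitted intermediate code for
    All three are illustrative inputs, not values read from a real binary.
    """
    major, minor = device_cc
    # Stage 1 output: machine code is usable within the same major
    # revision when the device minor version is not older.
    for s_major, s_minor in sass_targets:
        if s_major == major and s_minor <= minor:
            return "native"
    # Stage 2: the driver compiles intermediate code at load time,
    # which works for any device at or above the PTX target.
    if any(ptx <= device_cc for ptx in ptx_targets):
        return "jit"
    return "unsupported"


# Hypothetical build: machine code for 3.0 and 5.2, intermediate code for 5.2.
print(load_path((3, 0), [(3, 0), (5, 2)], [(5, 2)]))
print(load_path((6, 1), [(3, 0), (5, 2)], [(5, 2)]))
```

In this model a GTX 660 Ti (capability 3.0) depends on whichever stage produced its code, so a fault in either the toolkit output or the driver's load-time compilation could explain a failure limited to that one hardware version.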
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Thanks Richard, but ... My GTX 660 Ti GPUs are supported by the driver version that I use, the OS that I use, the CUDA version the application was built for, and the application that I'm trying to run. I expect this to work. It sounds like GPUGrid also expects this to work. It does not work. My very simple question remains unanswered: Is it a problem that GPUGrid must fix, or is it a problem that NVIDIA must fix? I feel like nobody is trying to fix it. If it is something NVIDIA must fix, and if GPUGrid gave me enough info to identify the problem, then I could urge my NVIDIA contacts to look at it. But MJH hasn't released details. MJH? |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
The fix I would prefer is for MJH to limit the relevant GPU application to the newer cards; i.e., Maxwell and later. Whatever "fix" he might come up with may limit the performance of the newer cards, or at least require a lot of his time and effort that might be spent in better ways on new apps. There will be the usual moaning and groaning, and people will leave. But there are plenty of volunteers anyway, and even more problems. So reduce both. |
|
Joined: 20 Apr 15 Posts: 285 Credit: 1,102,216,607 RAC: 0
|
"The fix I would prefer is for MJH to limit the relevant GPU application to the newer cards; i.e., Maxwell and later." My two cents... I would agree for the long runs, as it doesn't make sense to run them on an old GTX 660 anyway. But not as a general measure for long and short runs. Do we have any statistics about how many Kepler cards are still in use at GPUGRID? I reckon that there are a great many... and therefore we shouldn't jump the gun by excluding them. "There will be the usual moaning and groaning, and people will leave. But there are plenty of volunteers anyway, and even more problems. So reduce both." Well, if there are as many as I suspect (650 Ti, 660, 660 Ti, 670, 680), it would be very difficult to compensate for that loss of crunching power. I have my doubts. I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday. |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
OK, that makes sense. I forgot about the short runs, but the Keplers would be quite nice for that. |
|
Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 0
|
Kepler is not nearly old enough to drop support for, nor is it inefficient enough, as it's still on 28nm like Maxwell. I'm glad they dropped Fermi because of its larger process node and inefficient architecture. |
|
Joined: 27 Aug 16 Posts: 16 Credit: 43,745,875 RAC: 0
|
Despite re-installing the NVIDIA drivers, I'm still facing immediate computation errors since the Win10 Creators Update. Since this is not going to change, do you have any recommendations for me to try so I can start crunching again? |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"OK, that makes sense. I forgot about the short runs, but the Keplers would be quite nice for that." Except that the availability of short runs has dropped quite a bit lately :-( Myself, I have already considered switching to short runs with my two GTX 750 Ti cards, since after the latest crunching software (acemd_918.80) was rolled out, the crunching times have increased considerably, up to almost 60 hours (as other members have also noticed). |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 261
|
"Despite re-installing nvidia drivers, I'm still facing immediate computation errors since win10 creator's update - since this is not going to change, do you have any recommendations for me to try to start crunching again?" It's beginning to look as if there might be a problem with that 381.89 driver, isn't it? It was only released on 25 April, and I haven't heard about anybody else trying to use it yet. Maybe other users could post their observations, either way - and while we're waiting, you could try reverting to an older driver to see if that helps. Go to http://www.nvidia.com/Download/Find.aspx, fill in your card and operating system details, and choose from the search result list - anything between 372.54 and 381.65 should be fine. When you run the installer, choose 'custom' installation and check the 'clean install' box just to be on the safe side. |
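If you want to check a batch of hosts against that suggested window, a small sketch like the following would do it. It assumes driver versions are written as 'major.minor' strings, as they appear in this thread; the function names are made up for illustration, and the default bounds are simply the 372.54 and 381.65 figures from the post above, not an official compatibility list.

```python
def parse_driver(version):
    """Split a driver version such as '381.89' into (381, 89)."""
    major, minor = version.split(".")
    return int(major), int(minor)


def in_suggested_range(version, low="372.54", high="381.65"):
    """True if version lies within the window suggested in the thread."""
    return parse_driver(low) <= parse_driver(version) <= parse_driver(high)


# Versions mentioned in this thread.
for v in ("359.06", "372.54", "381.65", "381.89"):
    print(v, in_suggested_range(v))
```

Comparing integer pairs rather than floating-point numbers avoids rounding surprises and keeps a leading zero, as in '359.06', from being misread.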
|
Joined: 27 Aug 16 Posts: 16 Credit: 43,745,875 RAC: 0
|
Yeah, maybe I'll try that, but it's hard: after 2 faulty WUs, I have to wait another 24 hours to get the next ones. |
|
Joined: 27 Aug 16 Posts: 16 Credit: 43,745,875 RAC: 0
|
I went back and saw that successful WUs were completed with the latest NVIDIA driver (the same one I have now), so it's fair to assume that the Win10 Creators Update is the culprit, since nothing else changed. Does that basically mean that I'm not going to be able to do any work until GPUGrid is made compatible with the Win10 Creators Update? I fear this might take a long time... |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
"Thanks Richard, but ..." Request for users affected by the "9.18 (cuda80)" app instantly failing: My NVIDIA contact has a request: Please fill out the Driver Feedback survey below, if you are affected by the GPUGrid "9.18 (cuda80)" app immediately failing with "Computation Error" on your GPU. This helps them assign priority when fixing issues. Be thorough when filling it out, please. http://surveys.nvidia.com/index.jsp?pi=6e7ea6bb4a02641fa8f07694a40f8ac6 Thanks, Jacob |
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
I have read somewhere else that scientists have a major problem communicating with ordinary people (I mean thick), and all the problems with this project seem to bear this out. That's why science and the majority will never meet and, more darkly, science will be rejected by the majority. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Guess who's going to download the 1.2 GB CUDA 8.0 toolkit, and install the 8 GB Visual Studio 2015 Community Edition IDE, in an attempt to repro the SM3/CC3 compiler issues using the CUDA Toolkit samples? Yeah. Me. I'm hardcore sometimes. |
©2025 Universitat Pompeu Fabra