Problems with Task ID: 4071833, 4081911 and 4082018
| Author | Message |
|---|---|
|
Joined: 2 Mar 09 Posts: 28 Credit: 4,975,808 RAC: 0
|
As the title says, these 3 WUs all ended in error: http://www.gpugrid.net/result.php?resultid=4071833 http://www.gpugrid.net/result.php?resultid=4081911 http://www.gpugrid.net/result.php?resultid=4082018 Any ideas? Thank you in advance.
|
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
All 3 WUs failed with the new 6.14 application, which runs at higher GPU usage. I guess your system can't handle the consequences of this higher GPU utilization. Maybe it's the power supply, maybe the cooling, maybe both. |
|
Joined: 2 Mar 09 Posts: 28 Credit: 4,975,808 RAC: 0
|
It shouldn't be a hardware issue: Corsair 750W CMPSU-750AXEU, CM 690 II LITE, and a GTX 560 Ti that has never exceeded 74°C with GPUGrid (and has no problems at more than 85°C with FurMark).
|
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
The first port of call would be to upgrade to the latest driver. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
All of showa's errors were for standard-length KASHIF_HIVPR tasks, and all occurred within 20 sec. showa does not have any tasks now. The errors were: 2 × "Energies have become nan" and 1 × "SWAN: FATAL : swanMemcpyDtoH failed - Assertion failed: 0, file swanlib_nv.c, line 390". All 3 tasks have been resent to other systems. No errors have been returned from the other systems so far, and 1 task has validated, so I expect the problem is with your system. I would be inclined to shut down, give the computer (especially the GPU) a clean, and increase the fan speeds before trying again. The latest driver might help too. I would also suggest you try to manually control what tasks you get (one at a time, and if they fail, stop and try the long tasks). Good luck, |
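As a rough illustration of why an "Energies have become nan" message ends a task immediately: in floating-point arithmetic, a single NaN ("not a number") value poisons every sum it takes part in, so one bad intermediate result is enough to invalidate the total. The sketch below is only an assumption-laden toy, not the project's application code; the function name `total_energy` and the per-atom list are hypothetical.

```python
import math


def total_energy(per_atom_energies):
    """Sum per-atom energies and refuse to continue if the total is NaN.

    Hypothetical helper for illustration only; it does not reproduce
    how the GPUGrid application actually computes or checks energies.
    """
    total = sum(per_atom_energies)
    if math.isnan(total):
        # One NaN term makes the whole sum NaN, so the run cannot go on.
        raise FloatingPointError("Energies have become nan")
    return total


# A healthy step sums normally.
print(total_energy([1.0, 2.5]))

# A single corrupted value is enough to trigger the error.
try:
    total_energy([1.0, float("nan"), 2.5])
except FloatingPointError as err:
    print("error:", err)
```

Whatever corrupts the value in the first place (heat, power, driver, or the application itself) cannot be told apart from the message alone, which is why the advice above covers several of them.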
|
Joined: 2 Mar 09 Posts: 28 Credit: 4,975,808 RAC: 0
|
I'm quite sure it's not a hardware issue: my PC crunched 10-15 WUs for Milkyway@Home, and I got no errors whatsoever. GPU load was up to 97-98%, temperatures up to 78°C. As soon as I can, I'll install new drivers and try to crunch other WUs for this project. By the way, which drivers are recommended? Bye.
|
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
"I'm quite sure it's not a hardware issue: my PC crunched 10-15 WUs for Milkyway@Home, and I got no errors whatsoever. GPU load was up to 97-98%, temperatures up to 78°C." That just proves that your card is not totally messed up; GPUGrid tasks are not the same and tax the system in a different way. Also, a 10 or 15 min MW task is not the same as crunching a 12 h task here. The chance of failure over 10 min is much lower than over 12 h. It also suggests that the drivers are not totally messed up, and yet you still have a problem. Did you try anything I suggested? Shut down, clean the fans, increase the fan speeds, use the latest driver (27533). |
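The point about task length can be made concrete with a small sketch. It assumes, purely for illustration, that errors strike at a constant average rate while the GPU is crunching, so the probability of at least one error in a run of length t is 1 - exp(-rate × t). The rate used here (one error per 24 hours) is a made-up number, not a measurement from any real host.

```python
import math


def failure_probability(rate_per_hour: float, hours: float) -> float:
    """Probability of at least one error during `hours` of crunching,
    assuming a constant error rate (a simplifying assumption)."""
    return 1.0 - math.exp(-rate_per_hour * hours)


# Hypothetical rate: on average one error per 24 hours of GPU work.
rate = 1.0 / 24.0

short_task = failure_probability(rate, 10.0 / 60.0)  # 10-minute task
long_task = failure_probability(rate, 12.0)          # 12-hour task

print(f"10 min task: {short_task:.1%}")
print(f"12 h task:   {long_task:.1%}")
```

With that assumed rate, the 10-minute task fails well under 1% of the time while the 12-hour task fails roughly 39% of the time, so a clean run of short tasks says little about how a marginal system will cope with long ones.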
|
Joined: 2 Mar 09 Posts: 28 Credit: 4,975,808 RAC: 0
|
I'm aware that Milkyway WUs aren't the same as GPUGrid ones. As soon as I'm able to restart the system, I'll:
- clean the fans
- install the Windoze updates
- install new drivers for my Nvidia card

not necessarily in this order ;)
|
|
Joined: 10 Oct 08 Posts: 18 Credit: 39,100,916 RAC: 0
|
Did you overclock the GPU? GPUGRID and overclocking don't seem to go well together. Anything over 10% seems to cause my WUs to error out. |
|
Joined: 2 Mar 09 Posts: 28 Credit: 4,975,808 RAC: 0
|
No, the GPU is at default clocks. Since my post, I haven't been able to crunch for GPUGrid anymore, while I still can for Milkyway. I don't know what to think.
|
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
We are now using a different Windows app (6.15), so you might want to try here again. |
|
Joined: 2 Mar 09 Posts: 28 Credit: 4,975,808 RAC: 0
|
I don't know why, but I can crunch (at least so far) with the 6.15 version.
|
©2025 Universitat Pompeu Fabra