Some wu-s erroring out.
| Author | Message |
|---|---|
sir sant Joined: 1 Jul 09 Posts: 5 Credit: 27,036,793 RAC: 0
|
Hi, I haven't run GPUGrid for a long time. A few days ago I decided to run it on my second box, and half the WUs errored out. I've been running PrimeGrid on that box for a long time w/o errors, and distrtgen also runs w/o errors. The box is: GTX 570 + 3x GTS 450 512 MB, 4 GB DDR3 memory, Athlon II quad, Antec HCG 900 W PSU, Win XP 32 Pro SP3, NVIDIA driver 266.58, BOINC 6.10.60 x86. While running, the GPU loads were fine, around 99%, and it didn't pull anything from the CPU, which was basically idling. So where do I look for the problem? |
Carlesa25 Joined: 13 Nov 10 Posts: 328 Credit: 72,619,453 RAC: 0
|
Hello: The first thing I'd recommend is updating BOINC; 6.12.34 is the latest stable version. Greetings. |
sir sant Joined: 1 Jul 09 Posts: 5 Credit: 27,036,793 RAC: 0
|
Hi, I didn't get around to trying it right away, so I did it now. I'm using the newest BOINC now, and some WUs are still erroring out. All else remains the same. So what's really wrong? |
|
Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 |
It will probably get confused by the different cards in the machine. The GTX 570 is CC 2.0 and the GTS 450s are CC 2.1 cards. Unfortunately BOINC tries to treat them all the same. The GTX 570 is ideal for GPUGrid, but the others are best used for some other project. Is it possible to put the GTS 450s into another machine or relocate the GTX 570? If not, you might have to wait for BOINC 7, as that allows the user to configure which GPUs can be used for a project. But don't try it yet, as it's still in alpha test. |
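For readers on BOINC 7 or later, the per-project GPU selection mentioned above is configured with `<exclude_gpu>` entries in `cc_config.xml`. Below is a minimal sketch that builds such a file. The project URL and the device numbers are assumptions for illustration only (the client's startup messages show the real device numbering), and `build_cc_config` is a hypothetical helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET


def build_cc_config(project_url, excluded_devices):
    """Return cc_config.xml text that excludes the given GPU device numbers
    from one project via <exclude_gpu> entries."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    for device_num in excluded_devices:
        entry = ET.SubElement(options, "exclude_gpu")
        ET.SubElement(entry, "url").text = project_url
        ET.SubElement(entry, "device_num").text = str(device_num)
    return ET.tostring(root, encoding="unicode")


# Assumed values: the project URL as attached in the client, and devices 1-3
# standing in for the three GTS 450 cards (the real numbering may differ).
config_text = build_cc_config("http://www.gpugrid.net/", [1, 2, 3])
print(config_text)
```

The generated text would go into `cc_config.xml` in the BOINC data directory, and the client needs to re-read its config files (or be restarted) before the exclusions take effect.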
sir sant Joined: 1 Jul 09 Posts: 5 Credit: 27,036,793 RAC: 0
|
Hi, BOINC does not get confused by different cards in the same box. I have had many different configurations with mixed ATI and NVIDIA cards, sometimes running the same project, with no issues. I'll try again with different drivers, and maybe a different OS, when I have spare time. The issue is probably minor, but it's not easy to find. |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
GPUGrid puts greater stress on your GPUs than other projects. Check your GPU temperatures (below 80°C is recommended; raise your fan speeds if necessary). Run your GPUs at factory preset clock frequencies and voltages (or below, if temps are still high). |
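The temperature check above could be scripted roughly as follows. This is only a sketch: it assumes a driver whose `nvidia-smi` supports `--query-gpu` (older drivers such as the one in this thread may not), and the sample readings are made up for illustration, not taken from the OP's machine.

```python
import subprocess

TEMP_LIMIT_C = 80  # threshold recommended in the post above


def parse_temperatures(csv_text):
    """Parse 'index, temperature' lines into a {gpu_index: temp_c} dict."""
    temps = {}
    for line in csv_text.strip().splitlines():
        index, temp = (field.strip() for field in line.split(","))
        temps[int(index)] = int(temp)
    return temps


def hot_gpus(temps, limit=TEMP_LIMIT_C):
    """Return the sorted indices of GPUs at or above the limit."""
    return sorted(i for i, t in temps.items() if t >= limit)


def read_temperatures():
    """Query nvidia-smi; assumes the installed driver supports --query-gpu."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=index,temperature.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_temperatures(out)


# Made-up sample readings for a four-GPU box.
sample = "0, 84\n1, 71\n2, 69\n3, 80\n"
print(hot_gpus(parse_temperatures(sample)))  # [0, 3]
```

On a live system, `hot_gpus(read_temperatures())` would replace the sample data; any index it returns is a card worth giving more fan speed or lower clocks.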
|
Joined: 16 Mar 11 Posts: 509 Credit: 179,005,236 RAC: 0
|
I've had lots of GPUgrid tasks crashing lately and I think I have found the cause. Maybe this affects the OP too. I've been running some other applications that run at high priority and preempt the BOINC client for many seconds. When that happens, science apps from other projects exit with exit code 0 and the "no heartbeat from BOINC" message, and when BOINC gets more CPU time it restarts those apps and the tasks continue. However, while the other science apps exit with code 0, the GPUgrid app exits with a non-zero error code, which causes BOINC to not restart the task. BOINC marks the task as "compute error" and gets a new task. Is that what is happening? Is that a known problem? Does the GPUgrid app really experience an error, or could it be changed to give an exit code of 0?
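The behaviour described in this post can be summed up in a toy model. It is not BOINC's actual client logic, only a restatement of the observation that exit code 0 leads to a restart while anything else ends in a compute error; the function name and the example exit code 1 are hypothetical.

```python
def handle_app_exit(exit_code):
    """Toy model: a clean exit (code 0) leaves the task restartable,
    any other exit code is treated as a compute error."""
    if exit_code == 0:
        return "restart task"
    return "compute error"


# Hypothetical exit codes matching the two cases described in the post.
for app, code in [("other project app", 0), ("GPUgrid app", 1)]:
    print(app, "->", handle_app_exit(code))
```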
©2025 Universitat Pompeu Fabra