GPU computation errors
| Author | Message |
|---|---|
|
Joined: 11 Nov 09 Posts: 27 Credit: 4,925,174 RAC: 0
|
I was a victim of the new nVidia driver that had a fan speed problem: my card ran at over 95 °C for a while and now produces compute errors. It only errors when I run 2 GPUGRID tasks simultaneously on my 9800 GX2, though. I can run 1 GPUGRID task and one of anything else without any errors. I cannot run 2 tasks from the other GPU project either, because those tasks error out as well. How can I force BOINC to always run 1 GPUGRID task and one of something else? I just can't figure it out... Any help is much appreciated! |
|
Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 |
Short answer: you can't. You can tell BOINC not to use a particular GPU, but that applies to all projects. In your cc_config file put (within the options tag) <ignore_cuda_dev>0</ignore_cuda_dev>, where 0 is the CUDA device number. |
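For reference, here is a minimal sketch of what the complete file might look like. It assumes the option name given in the reply above and a file named cc_config.xml in the BOINC data directory (the exact location depends on the installation). Device number 0 is only an example; the numbers BOINC assigns to each GPU should appear in its startup messages.

```xml
<!-- cc_config.xml: sketch only; place it in the BOINC data directory -->
<cc_config>
  <options>
    <!-- Hide CUDA device 0 from BOINC; this affects every project -->
    <ignore_cuda_dev>0</ignore_cuda_dev>
  </options>
</cc_config>
```

The client detects GPUs when it starts, so expect to restart it before the change takes effect. Note that this hides the device from all projects, so it does not by itself give a "one task per project" split.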
|
Joined: 11 Nov 09 Posts: 27 Credit: 4,925,174 RAC: 0
|
Thanks for the help... I would like to use both GPUs, though. Is there a way to make BOINC request, or GPUGRID send, only one task at a time? I know it sounds like the same question, but it's different: instead of controlling what is being worked on, I'm asking whether BOINC can be starved so that it only ever holds 1 GPUGRID GPU task at a time. I don't want to queue anything from GPUGRID... Its tasks are long, at 14 hours or so, whereas the other GPU tasks I want to run take 45 minutes. I just don't understand why I cannot run 2 tasks from any one project without getting an error. With GPUGRID, both tasks error immediately; with the other project, the task that is still running errors out when the other one completes. |
|
Joined: 11 Nov 09 Posts: 27 Credit: 4,925,174 RAC: 0
|
Does anyone have an answer to this? I just need to find a way to be sent only one CUDA task at a time... I cannot run GPUGRID on both cores, or both tasks error within the first 2 seconds. Thanks for the help!! |
|
Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 |
About all you can do is set your cache to zero, but even then I think it will pick up a new WU when it's close to finishing the one that's running. If it were me, I'd get the card fixed or replaced. Given that nVidia have admitted their driver was faulty, you'd have a pretty good chance of getting them to wear the cost or compensate you in some way. Is the thing still under warranty? A lot of them come with 2 or more years now. |
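The cache setting mentioned above is normally changed through the computing preferences (on the project website or in BOINC Manager). As a sketch, assuming a local override file named global_prefs_override.xml in the BOINC data directory, the same setting might look like the following. The element names are, as far as I know, the ones the client uses for the two work-buffer preferences; verify them against your client version.

```xml
<!-- global_prefs_override.xml: sketch only; overrides web preferences locally -->
<global_preferences>
  <!-- Minimum work buffer, in days ("Connect about every X days") -->
  <work_buf_min_days>0</work_buf_min_days>
  <!-- Additional work buffer, in days -->
  <work_buf_additional_days>0</work_buf_additional_days>
</global_preferences>
```

As the reply notes, even with a zero buffer the client may still fetch the next task shortly before the running one finishes, so this reduces queuing rather than guaranteeing one task at a time.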
|
Joined: 11 Nov 09 Posts: 27 Credit: 4,925,174 RAC: 0
|
I submitted a ticket to XFX, since the card is supposed to have a double lifetime warranty. We'll see what happens. This is what one has to agree to when downloading a driver from nVidia: 6.2 No Liability for Consequential Damages. TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, IN NO EVENT SHALL NVIDIA OR ITS SUPPLIERS BE LIABLE FOR ANY SPECIAL, INCIDENTAL, INDIRECT, OR CONSEQUENTIAL DAMAGES WHATSOEVER (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION, LOSS OF BUSINESS INFORMATION, OR ANY OTHER PECUNIARY LOSS) ARISING OUT OF THE USE OF OR INABILITY TO USE THE SOFTWARE, EVEN IF NVIDIA HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. I guess this means they can create a driver that destroys everything and not be liable.... I thought of something yesterday. I don't quite know how "switch between tasks" works exactly, but I set it to 9999 for GPUGRID and also for the other project that runs on my GPU. If I'm running one of each, it should not try to switch for that many minutes, right? Then, when it does, both will switch at around the same time, and my guess is that project A on core 0 and project B on core 1 will become project A on core 1 and project B on core 0. I don't know, it's just a theory. What do you think? If it works, it seems only one of each will run, unless the times overlap somehow. How can I start the 9999 timers at the same time so there's no overlap? |
|
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0
|
You could also try upgrading to the latest BOINC client, 6.10.43. |
|
Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 |
The "switch between tasks" setting is used for CPU tasks. If BOINC needs to share time between projects, it swaps tasks based upon this interval. The default is 60 minutes, which means that a task has to run for an hour before it can be swapped out. GPU tasks run from beginning to end; they don't normally get swapped out. |
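The "switch between tasks" interval discussed above corresponds, as far as I know, to the cpu_scheduling_period_minutes preference. Here is a sketch of setting it locally, assuming an override file named global_prefs_override.xml in the BOINC data directory and using the default value of 60 quoted in the reply:

```xml
<!-- global_prefs_override.xml: sketch only; 60 is the default quoted above -->
<global_preferences>
  <!-- "Switch between tasks every N minutes" -->
  <cpu_scheduling_period_minutes>60</cpu_scheduling_period_minutes>
</global_preferences>
```

This is a single client-wide value rather than a per-project timer, which is one reason setting it to 9999 for each project separately is unlikely to produce two synchronized timers.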
|
Joined: 11 Nov 09 Posts: 27 Credit: 4,925,174 RAC: 0
|
Sorry, but I have to say that on my system GPU tasks don't always run to completion: I have one Collatz task that ran for 16 of its 45 minutes, and then BOINC switched from running one of each to running 2 GPUGRID tasks. Maybe something isn't right, since it does this? |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
GPU tasks still have to use the CPU to some extent. Perhaps that explains it: the default switch interval is 60 min for the CPU, and a GPUGRID task could use more than an hour of CPU time. I think I saw the same thing in the past, with MW tasks starting mid-run through a GPUGRID task (I don't much care for MW or Aqua, and the others have never got a look in). If you have a GTX275 it is likely to be able to finish a task in less than 1 hour of CPU time, but lots of other cards are not so fast. |
©2025 Universitat Pompeu Fabra