Steps to diagnose failure to run tasks?
| Author | Message |
|---|---|
|
Joined: 5 Feb 14 Posts: 6 Credit: 25,848,270 RAC: 0
|
I have a Win7 / Core-i7 / 8GB / GTX-660 computer that's not much in use, and being winter I'd like to run jobs since the waste heat is also useful. Years ago, I was able to run both rosetta@home CPU jobs and GPUgrid jobs, but now only rosetta@home seems to be working. I have already updated BOINC, and checked my card compatibility ("Still works") and deleted and re-installed the NVidia driver, and searched the forum for ideas, but to no avail. It seems that periodically, the count of "tasks failed" increments by two. It was 31, then 33, and now 35. I can't spend too much time on this, but please give me a checklist of things that might be wrong, so I can go through it. It was a $200 card, and it's sitting there idle, and apparently mining bitcoin isn't viable any more.
|
|
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0
|
Your nVidia driver is very old (r327.23, released Sept 2013). Your setup should be able to run the latest nVidia driver, r436.8. The old driver is the most likely cause of your errors. nVidia drivers can be downloaded from nVidia here: https://www.geforce.com/drivers |
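A quick way to check the same thing yourself is to compare the installed driver version with a reference version. The sketch below is only an illustration: the helper names are made up for this example, and `installed_driver_version()` assumes `nvidia-smi` is on the PATH and supports the `--query-gpu=driver_version` query on your card and driver.

```python
import subprocess


def parse_version(text):
    """Turn a driver string such as '327.23' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_older(installed, reference):
    """True if the installed driver version sorts before the reference one."""
    return parse_version(installed) < parse_version(reference)


def installed_driver_version():
    """Ask nvidia-smi for the driver version (assumes nvidia-smi is on PATH)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    # One line per GPU; the first line is enough for a single-card machine.
    return result.stdout.splitlines()[0].strip()


# Versions quoted in this thread: installed r327.23 vs. suggested r436.8.
print(is_older("327.23", "436.8"))
```

On a machine with an NVIDIA card, `is_older(installed_driver_version(), "436.8")` would make the same comparison against the driver that is actually installed.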
|
Joined: 5 Feb 14 Posts: 6 Credit: 25,848,270 RAC: 0
|
Thank you, that's probably it. I googled for a driver, and the first page I found (which was https://www.geforce.com/drivers/results/66884 incidentally) didn't say anything like "why don't you use a newer driver instead?". I checked that it was a legitimate domain (nothing scammy) and valid for my GPU, and went ahead and installed it. I just assumed it was the latest. Normally when you download an old driver for something, it's on a page labelled "previous versions" and has a whole list of them. It hasn't started a job yet, but I'll give it a day. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"... It hasn't started a job yet, but I'll give it a day."
Unfortunately, tasks are NOT available all the time, so just be patient. |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
Additionally, as a rough estimate, your GTX660 graphics card should finish a long ACEMD WU in 12 to 18 hours of continuous processing time, depending on the kind of task. This estimate comes from comparing the computing power of your GTX660 (about 1882 GFLOPs) with that of my GTX750TI (about 1306 GFLOPs) and scaling my own times proportionally (rule of three).
Source: http://www.gpureview.com/show_cards.php?card1=680&card2=695
For this reason, I would recommend initially setting the BOINC Manager processing preferences to:
* Usage limits
- Use at most: 75 % of the CPUs - (To reserve a full free CPU core)
- Use at most: 100 % of CPU time
* When to suspend: Uncheck all options in this section
* Other
- Store at least 0.01 days of work - (To download only one WU at a time)
- Store up to an additional 0.01 days of work - (To download only one WU at a time)
- Switch between tasks every 1440 minutes - (To avoid alternating with other projects' tasks, if any. Check whether a GPUGrid task is running. If not, pause the other running GPU task until it is)
When you have processed some sample WUs, it will be time to set more conservative values if desired.
Also visit the GPUGrid preferences page for your account, and check the option to receive new ACEMD3 tasks. The old ACEMD short and long WUs are currently being phased out.
And: Unfortunately, tasks are NOT available all the time, so just be patient.
Good luck! |
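The proportional estimate can be written out as a short calculation. This is a minimal sketch: the two GFLOPs figures are the ones quoted in the post, but the 18-hour reference time on the GTX750TI is a hypothetical value chosen for illustration, not a measurement from the thread. A ratio of peak GFLOPs is only a rough proxy for real task throughput.

```python
# Peak figures quoted in the post above.
GTX660_GFLOPS = 1882.0
GTX750TI_GFLOPS = 1306.0


def scale_runtime(hours_on_reference, reference_gflops, target_gflops):
    """Rule of three: runtime is assumed inversely proportional to GFLOPs."""
    return hours_on_reference * reference_gflops / target_gflops


# Hypothetical reference time on the GTX750TI (an assumption, not thread data).
reference_hours = 18.0
estimate = scale_runtime(reference_hours, GTX750TI_GFLOPS, GTX660_GFLOPS)
print(f"Estimated GTX660 runtime: {estimate:.1f} h")
```

With these inputs the GTX660 estimate is roughly 0.69 times the reference time, since 1306 / 1882 is about 0.69.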
|
Joined: 5 Feb 14 Posts: 6 Credit: 25,848,270 RAC: 0
|
It's working now, thanks. I did change those preferences. |
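For anyone who prefers a file to clicking through BOINC Manager, the numeric preferences suggested earlier in the thread can also be expressed as a local override file. The sketch below only builds the XML text. The tag names are the ones I understand BOINC to use in `global_prefs_override.xml`, so treat them as an assumption and check them against the BOINC documentation before relying on them. The "When to suspend" checkboxes are not covered here.

```python
import xml.etree.ElementTree as ET

# Values suggested in the thread; tag names are assumed to match BOINC's
# global_prefs_override.xml format and should be verified before use.
SUGGESTED_PREFS = {
    "max_ncpus_pct": 75.0,                    # use at most 75 % of the CPUs
    "cpu_usage_limit": 100.0,                 # use at most 100 % of CPU time
    "work_buf_min_days": 0.01,                # store at least 0.01 days of work
    "work_buf_additional_days": 0.01,         # up to an additional 0.01 days
    "cpu_scheduling_period_minutes": 1440.0,  # switch between tasks every 1440 min
}


def build_override(prefs):
    """Return the override preferences as an XML string."""
    root = ET.Element("global_preferences")
    for tag, value in prefs.items():
        ET.SubElement(root, tag).text = f"{value:.6f}"
    return ET.tostring(root, encoding="unicode")


xml_text = build_override(SUGGESTED_PREFS)
print(xml_text)
```

The resulting text would be saved as `global_prefs_override.xml` in the BOINC data directory, after which the client has to be told to re-read its local preferences.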
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
Nice to see that you've been lucky enough to catch two ACEMD WUs, and that both of them have finished successfully :-) |
|
Joined: 5 Feb 14 Posts: 6 Credit: 25,848,270 RAC: 0
|
I did! They are taking about 25 hours to complete, but that's OK. |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
Currently there seems to be an issue affecting a specific batch of Long runs v9.22 (cuda65). My cuda 65 computers are also failing these tasks immediately, erroring with exit status -44, as yours are. The same WUs resent to cuda 80 computers seem to progress correctly. Therefore, I deduce that the problem is not due to our computers, but to this specific cuda 65 batch of WUs. I recommend not altering your current setup and waiting for the problem to be resolved on the GPUGrid side. |
|
Joined: 5 Feb 14 Posts: 6 Credit: 25,848,270 RAC: 0
|
Random information: I have another, newer computer, Rayquaza, but I didn't spend much on the graphics card. It's only a GT1030 (below entry level), but I thought I'd give it a shot, just to see what happened. Tasks on it fail immediately too (I've turned off fetching tasks). However, I notice that the driver on that one is old as well! I don't know how this happens, because I only built it recently. When I get a minute, I'll update the driver and try again. Side story: I know this is a GPUGrid forum, but I found an old broken laptop of my daughter's, and reset it to see if it would run BOINC. The screen is cracked, the touchscreen feature only works on the bit to the right of the fault line, and the keyboard doesn't connect properly, so it's basically useless, but it's running Rosetta@home tasks! And it's only using 5 watts of power... |
©2025 Universitat Pompeu Fabra