Server won't give me work
| Author | Message |
|---|---|
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
So, after the problems with the KASHIF work units, and coming in this morning to find that three work units had all failed with a computation error, I decided to go ahead and upgrade to 6.6.28. Now the server won't give me any work, saying I have no CUDA device. It's been running for a while now with no problem, so why this issue now? Any suggestions? |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
Did you change the opt-in / opt-out setting in the preferences? If you changed from the right version of BOINC it was opt-in, then they suddenly changed it to opt-out so that CUDA cards are disabled by default. Change the preference on the web site here, then update the machine... |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
Thanks Paul. I'm gonna need a bit more help here (which is sad as I'm a professional scientist), but where is this opt-in/opt-out setting? There are three preferences tabs (computing/gpugrid/computing) and in none of them do I see anything labeled opt-in or opt-out. Searching the pages for 'opt' isn't turning up anything, and when I look at my preferences it looks like the GPU should be available for computing. I'd like to get back to crunching since I've done nothing for almost the last 2 days due to a series of compute failures. Thanks. |
Stefan Ledwina Joined: 16 Jul 07 Posts: 464 Credit: 298,573,998 RAC: 0
|
It is in the computing preferences - http://www.gpugrid.net/prefs.php?subset=global. Paul is talking about the setting "Suspend GPU work while computer is in use?". pixelicious.at - my little photoblog |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
The error says you're in "device emulation" mode, meaning no CUDA device is found. Did you change anything else? Changed the GPU and installed drivers from the CD? MrS Scanning for our furry friends since Jan 2002 |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
Ok, Time Bandit helped me out as I was not clear ... sorry about that ... but that is only one possibility that leapt off the top of my pate... Can you give us the top lines of the message tab from a fresh start of BOINC? It will look something like:
Fri May 22 16:58:11 2009 Starting BOINC client version 6.6.29 for x86_64-apple-darwin
Fri May 22 16:58:11 2009 Configured to use all coprocessors
Fri May 22 16:58:12 2009 log flags: task, file_xfer, sched_ops, cpu_sched, cpu_sched_debug, sched_op_debug
Fri May 22 16:58:12 2009 log flags: coproc_debug
Fri May 22 16:58:12 2009 Libraries: libcurl/7.19.4 OpenSSL/0.9.7l zlib/1.2.3 c-ares/1.6.0
Fri May 22 16:58:12 2009 Data directory: /Library/Application Support/BOINC Data
Fri May 22 16:58:12 2009 Milkyway@home Found app_info.xml; using anonymous platform
Fri May 22 16:58:12 2009 Processor: 8 GenuineIntel Intel(R) Xeon(R) CPU X5482 @ 3.20GHz [x86 Family 6 Model 23 Stepping 6]
Fri May 22 16:58:12 2009 Processor features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM SSE3 MON DSCPL VMX EST TM2 SSSE3 CX16 TPR PDCM SSE4.1
Fri May 22 16:58:12 2009 OS: Darwin: 9.7.0
Fri May 22 16:58:12 2009 Memory: 16.00 GB physical, 235.23 GB virtual
Fri May 22 16:58:12 2009 Disk: 453.32 GB total, 234.98 GB free
Fri May 22 16:58:12 2009 Local time is UTC -7 hours
Fri May 22 16:58:12 2009 Can't load library libcudart
Fri May 22 16:58:12 2009 No coprocessors
Fri May 22 16:58:13 2009 Not using a proxy
.... bunch of project lines here
Fri May 22 16:58:14 2009 World Community Grid Host location: none
Fri May 22 16:58:14 2009 World Community Grid General prefs: using your defaults
Fri May 22 16:58:14 2009 Preferences limit memory usage when active to 13107.20MB
Fri May 22 16:58:14 2009 Preferences limit memory usage when idle to 16056.32MB
Fri May 22 16:58:14 2009 Preferences limit disk usage to 200.00GB |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
No, I haven't changed anything else, just upgraded the BOINC client. I did notice, however, that after 4 tasks had failed due to a 'computation error', new work hadn't been requested or given and it had been sitting idle. I have the same card and the same drivers. I even reinstalled the drivers from NVIDIA's website (no CD for this computer) but no change. Per the request, I'm pasting the error messages below.
5/23/2009 12:59:42 PM Starting BOINC client version 6.6.28 for windows_intelx86
5/23/2009 12:59:42 PM log flags: task, file_xfer, sched_ops
5/23/2009 12:59:42 PM Libraries: libcurl/7.19.4 OpenSSL/0.9.8j zlib/1.2.3
5/23/2009 12:59:42 PM Data directory: C:\Documents and Settings\All Users\Application Data\BOINC
5/23/2009 12:59:42 PM Running under account anthonmg
5/23/2009 12:59:42 PM Processor: 8 GenuineIntel Intel(R) Xeon(R) CPU E5440 @ 2.83GHz [x86 Family 6 Model 23 Stepping 6]
5/23/2009 12:59:42 PM Processor features: fpu tsc pae nx sse sse2 mmx
5/23/2009 12:59:42 PM OS: Microsoft Windows XP: Professional x86 Edition, Service Pack 3, (05.01.2600.00)
5/23/2009 12:59:42 PM Memory: 3.25 GB physical, 5.09 GB virtual
5/23/2009 12:59:42 PM Disk: 465.74 GB total, 394.19 GB free
5/23/2009 12:59:42 PM Local time is UTC -7 hours
5/23/2009 12:59:42 PM No CUDA devices found
5/23/2009 12:59:42 PM No coprocessors
5/23/2009 12:59:42 PM Not using a proxy
5/23/2009 12:59:42 PM Docking@Home URL: http://docking.cis.udel.edu/; Computer ID: 26200; location: (none); project prefs: default
5/23/2009 12:59:42 PM GPUGRID URL: http://www.gpugrid.net/; Computer ID: 32550; location: (none); project prefs: default
5/23/2009 12:59:42 PM GPUGRID General prefs: from GPUGRID (last modified 23-May-2009 00:54:35)
5/23/2009 12:59:42 PM GPUGRID Host location: none
5/23/2009 12:59:42 PM GPUGRID General prefs: using your defaults
5/23/2009 12:59:42 PM Preferences limit memory usage when active to 1663.63MB
5/23/2009 12:59:42 PM Preferences limit memory usage when idle to 2994.54MB
5/23/2009 12:59:42 PM Preferences limit disk usage to 100.00GB
5/23/2009 1:00:03 PM GPUGRID update requested by user
5/23/2009 1:00:07 PM GPUGRID Sending scheduler request: Requested by user.
5/23/2009 1:00:07 PM GPUGRID Requesting new tasks
5/23/2009 1:00:12 PM GPUGRID Scheduler request completed: got 0 new tasks
5/23/2009 1:00:12 PM GPUGRID Message from server: No work sent
5/23/2009 1:00:12 PM GPUGRID Message from server: Can't use CUDA app for Full-atom molecular dynamics: Your computer has no CUDA device
5/23/2009 1:00:12 PM GPUGRID Message from server: Full-atom molecular dynamics on Cell processor is not available for your type of computer.
5/23/2009 1:01:47 PM GPUGRID Sending scheduler request: To fetch work.
5/23/2009 1:01:48 PM GPUGRID Requesting new tasks
5/23/2009 1:01:53 PM GPUGRID Scheduler request completed: got 0 new tasks
5/23/2009 1:01:53 PM GPUGRID Message from server: No work sent
5/23/2009 1:01:53 PM GPUGRID Message from server: Can't use CUDA app for Full-atom molecular dynamics: Your computer has no CUDA device
5/23/2009 1:01:53 PM GPUGRID Message from server: Full-atom molecular dynamics on Cell processor is not available for your type of computer.
5/23/2009 1:02:28 PM GPUGRID Sending scheduler request: To fetch work.
5/23/2009 1:02:28 PM GPUGRID Requesting new tasks
5/23/2009 1:02:33 PM GPUGRID Scheduler request completed: got 0 new tasks
5/23/2009 1:02:33 PM GPUGRID Message from server: No work sent
5/23/2009 1:02:33 PM GPUGRID Message from server: Can't use CUDA app for Full-atom molecular dynamics: Your computer has no CUDA device
5/23/2009 1:02:33 PM GPUGRID Message from server: Full-atom molecular dynamics on Cell processor is not available for your type of computer. |
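For anyone comparing their own startup messages against the two logs above, the GPU-related lines are easy to miss among the rest. Here is a minimal sketch that scans pasted log text for them. The message strings are copied from the logs in this thread; the function name and labels are made up for illustration, and other BOINC versions may word these messages differently.

```python
import re

# Message fragments copied from the logs quoted in this thread.
# Assumption: other client versions may phrase these differently.
GPU_PROBLEM_PATTERNS = {
    "no_cuda_device": re.compile(r"No CUDA devices found"),
    "missing_cuda_runtime": re.compile(r"Can't load library libcudart"),
    "no_coprocessors": re.compile(r"No coprocessors"),
    "server_refused_cuda": re.compile(r"Your computer has no CUDA device"),
}


def find_gpu_problems(log_text):
    """Return a sorted list of problem labels found in a pasted message log."""
    found = set()
    for line in log_text.splitlines():
        for label, pattern in GPU_PROBLEM_PATTERNS.items():
            if pattern.search(line):
                found.add(label)
    return sorted(found)


sample = """5/23/2009 12:59:42 PM No CUDA devices found
5/23/2009 12:59:42 PM No coprocessors
5/23/2009 1:00:12 PM GPUGRID Message from server: Can't use CUDA app for Full-atom molecular dynamics: Your computer has no CUDA device"""

print(find_gpu_problems(sample))
```

If the client-side lines show up at startup, the server messages that follow are a consequence rather than a separate problem: the client never reported a GPU in the first place.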
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Which driver are you using? In case you don't know and have already deleted the installation file, GPU-Z should tell you. MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
Also, I should say that the driver version is 6.14.11.8265. Just a month old. |
Stefan Ledwina Joined: 16 Jul 07 Posts: 464 Credit: 298,573,998 RAC: 0
|
Have you already tried to reinstall the drivers? Somehow BOINC can't find a CUDA device... 5/23/2009 12:59:42 PM No CUDA devices found pixelicious.at - my little photoblog |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
Um, yes, I can tell from the messages that it can't find the device. When I first joined the project in early April, BOINC was unable to find a CUDA device. I upgraded the drivers from the ones that came with the machine when I got it last September. With that update, 182.46, everything worked fine and I've been crunching away. Then came lots of timed-out workunits and never-ending workunits. I was told to upgrade BOINC. Since then, no CUDA found. I just upgraded again to 182.65; it didn't help. Tried reinstalling the original drivers, 182.46, and 182.65, with no success. |
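The 6.14.11.8265 file version reported earlier in the thread and the 182.65 release named above look like the same driver written two ways. Here is a minimal sketch of that mapping, assuming the common convention that the NVIDIA release number is the last five digits of the Windows file version. That convention is not stated anywhere in this thread, and the function name is hypothetical.

```python
def nvidia_release_from_file_version(file_version):
    """Convert a Windows driver file version such as '6.14.11.8265' to '182.65'.

    Assumption: the release number is the last five digits of the
    four-part file version, with a dot before the final two digits.
    """
    parts = file_version.split(".")
    if len(parts) != 4 or not all(part.isdigit() for part in parts):
        raise ValueError("expected a four-part numeric version: %r" % file_version)
    digits = (parts[2] + parts[3])[-5:]
    if len(digits) < 5:
        raise ValueError("version too short to decode: %r" % file_version)
    return digits[:3] + "." + digits[3:]


print(nvidia_release_from_file_version("6.14.11.8265"))
```

This makes it easier to line up what a tool like GPU-Z or the Windows driver properties dialog shows with the release numbers people quote in forum posts.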
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
I don't know why, but it sure does not see the CUDA device. And I am stumped as to what to try next. Are you running any sort of remote login or remote desktop? {edit} Virtualization software, alternative OS hosting? |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
"I was told to upgrade BOINC." Who was that? Must have been someone who doesn't like you? ;) No, seriously.. if your answer to Paul's question does not include any "yes" I'd revert to 6.6.20 and see if that gets you going again. If not - we'll all scratch our heads. You could try 6.5.0. Also, the drivers you list, 182.46 and 182.65, are both beta, as far as I know. 182.50 is the last WHQL driver of the non-185 series. I'd try 182.50 or 185.6x or 185.8x, though the latter 2 seem not to be trouble free yet. MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
Hmm, today it started working again on its own. Argh. Ok, well, I will watch and see if I get errors or if it runs smoothly again. Thanks everyone for your input. |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
Well, it's getting work, but ALL the work units seem to be failing after only a few seconds (<10-20 sec). Are we having more workunit problems? |
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
You are running in device emulation. When you install BOINC, you should use the advanced tab and say to run in unprotected mode. # Device 0: "Device Emulation (CPU)" gdf |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
Interesting about the emulation mode and that it wasn't a problem before. Can you say where in the advanced tab this setting is? I don't see it under any of the subheadings in the advanced menus. Are you suggesting I find the config file and change it at the level of the text? Is unprotected mode going to affect the stability of the system? |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
It is the third or fourth screen in during the install. I cannot recall if the repair install allows you to change this setting or not. Uninstall only removes BOINC; it should not change your data or project settings unless you whack the directories yourself. You can also install an older version and then the newer one again ... but you have to catch the screen as it goes by and make sure that the setting is unchecked. |
|
Joined: 11 Apr 09 Posts: 17 Credit: 11,086,149 RAC: 0
|
The repair utility doesn't give you access there. I uninstalled and reinstalled. The option to run in protected mode (an unprivileged account) was already unchecked, so it seems like that's not the issue. Also, I'm still getting the "Project has no work" error when I try to update and fetch new work. |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
"The repair utility doesn't give you access there. I uninstalled and reinstalled. The option to run in protected mode (an unprivileged account) was already unchecked, so it seems like that's not the issue." Ok, there is still the issue of where the device emulation comes from. Are you running remote monitoring software, a remote desktop, VM emulation, anything like that? If the video device is "virtualized" it cannot be used for work. Many of these software systems make the video card a "virtual" device that can be shared. Sadly, that means that it cannot be used by BOINC. Once you have errored out a certain number of tasks you cannot get any more until 24 hours have passed. This is to prevent you from "trashing" all the available tasks with a bad system set-up. When you get another task and return it safely, then you will be able to get another and another ... until you are back in good graces and can get the maximum. But, if the device is "broken" we have to fix that first... |
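The lock-out and recovery described above can be pictured with a toy model. This is only a sketch under stated assumptions, not the scheduler's actual code: the starting allowance of 8 tasks per day, the "one less per error" and "double per success" rules, and every name in the class are hypothetical choices made to mirror the behaviour described in the post.

```python
from dataclasses import dataclass


@dataclass
class HostQuota:
    """Toy model of a per-host daily task allowance (all numbers hypothetical)."""
    max_per_day: int = 8   # assumed project maximum
    per_day: int = 8       # current daily allowance for this host
    sent_today: int = 0    # tasks already sent in the current 24-hour window

    def can_send(self) -> bool:
        return self.sent_today < self.per_day

    def record_sent(self) -> None:
        self.sent_today += 1

    def record_error(self) -> None:
        # Assumed rule: each errored task shrinks the allowance, never below 1.
        self.per_day = max(1, self.per_day - 1)

    def record_success(self) -> None:
        # Assumed rule: each good result doubles the allowance, up to the maximum.
        self.per_day = min(self.max_per_day, self.per_day * 2)

    def new_day(self) -> None:
        self.sent_today = 0


quota = HostQuota()
while quota.can_send():      # a broken host errors out every task it receives
    quota.record_sent()
    quota.record_error()
print(quota)                 # the host is now locked out until the next window

quota.new_day()              # 24 hours later the counter resets
quota.record_sent()
quota.record_success()       # one good result starts restoring the allowance
print(quota)
```

In this model a host that fails everything runs out of allowance well before it has consumed the full daily maximum, which is the point of the mechanism: a misconfigured machine stops receiving tasks quickly, and a few successful results are enough to restore it.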