ACEMD2 6.12 cuda and 6.13 cuda31 for windows and linux
| Author | Message |
|---|---|
|
Joined: 2 Jul 10 · Posts: 7 · Credit: 28,599,565 · RAC: 0
|
Well, it worked for a little while. Then the system froze up completely and I had to reboot. The next messages are as follows:
'Tue 16 Nov 2010 10:02:13 PM EST No usable GPUs found
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Application uses missing NVIDIA GPU
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Application uses missing NVIDIA GPU
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Application uses missing NVIDIA GPU
Tue 16 Nov 2010 10:02:14 PM EST GPUGRID Missing coprocessor for task r475s1f1_r130s2-TONI_MSM5-0-4-RND8696_0
Tue 16 Nov 2010 10:02:15 PM EST GPUGRID URL http://www.gpugrid.net/; Computer ID 84614; resource share 10000
Tue 16 Nov 2010 10:02:15 PM EST Reading preferences override file
Tue 16 Nov 2010 10:02:15 PM EST Preferences:
Tue 16 Nov 2010 10:02:15 PM EST max memory usage when active: 501.96MB
Tue 16 Nov 2010 10:02:15 PM EST max memory usage when idle: 2007.86MB
Tue 16 Nov 2010 10:02:33 PM EST max disk usage: 20.00GB
Tue 16 Nov 2010 10:02:33 PM EST (to change preferences, visit the web site of an attached project, or select Preferences in the Manager)
Tue 16 Nov 2010 10:02:33 PM EST Not using a proxy
Tue 16 Nov 2010 10:03:53 PM EST GPUGRID task r475s1f1_r130s2-TONI_MSM5-0-4-RND8696_0 suspended by user'
The 'Tasks' window shows 'GPU missing'. I'll keep running BOINC with other projects and hope something new comes along, unless you have other suggestions to try. |
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0 |
These failures could be caused by many things, but a few likely candidates are: a shortage of RAM (get more or don't run CPU apps); the card overheating and hanging the system (check this, then raise the fan speed to cool the card and system better); an inadequate power supply; an unstable operating system (Linux) that needs re-installing (caused by the driver itself, HDD issues, or update or app issues); a corrupt hard drive; or malware or bad apps on the system. |
|
Joined: 26 Aug 08 · Posts: 183 · Credit: 10,085,929,375 · RAC: 0
|
I've had no problems since I switched driver versions from 260.19.14 to 260.19.21. I'm also using SWAN_SYNC=0, since there are no other options for Linux right now, although I'd like to get 80% of that core back for other projects. :) I've finished 11 WUs since the driver upgrade: http://www.gpugrid.net/results.php?hostid=82590 System: 64-bit Ubuntu 10.04, BOINC 6.10.58, GTX460, NVIDIA driver version 260.19.21 |
|
Joined: 24 Dec 09 · Posts: 22 · Credit: 15,875,809 · RAC: 0
|
New update on the 65nm problem. I've done more testing recently. Specs: EVGA GTX 260 65nm (stock clock at 576/1242/999); EVGA GTX 260 55nm (stock clock at 626/1350/1053); driver version: 260.61; CUDA version: 3020 (which I assume to be 3.2); processor: Phenom II X4 940 at 3.5GHz, running WCG; BOINC version: 6.10.58 64-bit; OS: Windows 7 Ultimate x64. This is the computer I am testing with. On this computer, the 65nm GTX 260 is shown as device 1 while the 55nm is shown as device 0. As you might recall, the 65nm had been struggling to run GPUGRID: there had been WU crashes, where Windows prompts you to "close the program", as well as some driver crashes. However, when running 6.13 (with the new driver as well), the problem disappeared. While the WUs still error out (sadly), they never crash; not even a single WU hang has occurred on the 65nm. The 65nm card even managed to finish two work units [WU 1][WU 2]. I will test for a couple more days, and will do so for each new ACEMD version. |
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0 |
Firstly, I think a restart is in order, as you are getting what look like runaway failures, which could come from some other software on the system. I'm not saying you can't run a Phenom II 940 at 3.5GHz and crunch (I have one, so I know what it can do), but I think it is a bit on the high side, especially if you are having problems in a dual GTX260 system, and in particular if one card is the older 65nm version, which as you know is notoriously difficult to crunch with here. At 3.5GHz the additional power draw and heat production are substantial (unless you have a water-cooled system, and even then the capacitors don't get cooled as much). I found that 3.3GHz was plenty, but I generally run at stock and would suggest that you do the same for now; you are testing after all. Basically, try all the tricks: up the GPU fan speed (EVGA Precision or another app), use eFMer Priority to raise the application priority, and leave at least one CPU core free (preferably all 4). I think the Windows error message asking to close the CUDA app is the result of a timeout, so increasing the thread's priority should help, as should freeing up cores. What I'm saying is: give yourself as much chance as possible to run these tasks successfully, and add applications back later. Even turning off your firewalls and antivirus software is worthwhile, especially if you use several (it removes possible problem apps). If you are worried about security, disable BOINC networking, close all other programs, disable updates, and unplug the Ethernet cable for the duration of the runs. Good luck, |
Fred J. Verster · Joined: 1 Apr 09 · Posts: 58 · Credit: 35,833,978 · RAC: 0
|
Sorry if this is a bit off topic, but I noticed that if CPU speed increases, CPU use drops from 0.20 to 0.16 CPU + 1 GPU (GTX480 @ 1.43-1.5GHz; CPU @ 3.366GHz, now 3.4GHz; 3500FLOPS/sec; 10,000 Dhrystone Mops/sec), according to the BOINC Manager info. I didn't know this was a BOINC feature, or am I wrong? It doesn't have much effect though; with Einstein it hardly makes any difference. (And the 'GPU app' doesn't do anything most of the time. I skip GPU use on Einstein, and I'll untag it when they have developed a better GPU app.) I don't mean to be all negative, but this doesn't contribute at all to GPU processing, and I don't know why they use it! But I'm getting off topic... Knight Who Says Ni N! |
GDF · Joined: 14 Mar 07 · Posts: 1958 · Credit: 629,356 · RAC: 0 |
"Noticed, sorry if this is a bit off topic, if CPU speed increases, CPU use drops" Not at all, this is just how BOINC computes the amount of CPU used. It's really just a label of no practical use. gdf |
Saenger · Joined: 20 Jul 08 · Posts: 134 · Credit: 23,657,183 · RAC: 0
|
I'm still waiting for my second Toni; so far most of the WUs I get are at best half as fast as they were before the "upgrade", which was a very bad downgrade for me. Currently I get a Kashiv most of the time, and they take 2-3 times as long as with the older app. OK, they don't use that much CPU now, but that was a minor issue compared to the total crunch time that's needed now. And if the GPU is used at all, its temperatures are still not as high as they used to be. Is there a way to stop the wasteful non-Tonis from being delivered to my computer? Should I simply abort all others until I get a suitable one? Gruesse vom Saenger For questions about Boinc look in the BOINC-Wiki |
GDF · Joined: 14 Mar 07 · Posts: 1958 · Credit: 629,356 · RAC: 0 |
"I have to wait for my second Toni yet, so far most of the WUs I get are at least half as fast as they were before the "upgrade", which was a very bad downgrade for me." Have you tried to use export SWAN_SYNC=0 in your .bashrc? Soon, we will make this much easier. gdf |
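For readers unsure what this means in practice: .bashrc is a hidden text file in your home directory (full path ~/.bashrc) that bash reads each time it starts an interactive shell, for example when you open a terminal. Adding the line export SWAN_SYNC=0 there makes the variable available to programs started from that terminal. What follows is a minimal sketch, not an official GPUGRID procedure; the helper name add_line_once is made up for illustration, and the dry run writes only to a temporary file.

```shell
# Append a line to a startup file only if that exact line is not already
# present, so running this twice does not duplicate the setting.
add_line_once() {
    # $1 = file to edit, $2 = line to add
    touch "$1"                                   # create the file if it does not exist yet
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Dry run against a scratch file, so nothing in the real home directory changes.
scratch="$(mktemp)"
add_line_once "$scratch" 'export SWAN_SYNC=0'
add_line_once "$scratch" 'export SWAN_SYNC=0'    # second call is a no-op
cat "$scratch"

# For real use (per user, picked up by newly opened interactive bash sessions):
#   add_line_once "$HOME/.bashrc" 'export SWAN_SYNC=0'
```

One caveat, offered as an assumption about typical setups rather than a diagnosis: if the BOINC client is started automatically as a system service instead of from a terminal, it will likely never read ~/.bashrc, so the variable may need to be set wherever that service gets its environment.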
Saenger · Joined: 20 Jul 08 · Posts: 134 · Credit: 23,657,183 · RAC: 0
|
"Have you tried to use export SWAN_SYNC=0 in your .bashrc?" I've tried with one file whose name included bashrc, although I've told you and others many times that I'm a user, not a nerd, and that the file as you described it is nowhere on my machine. I won't alter any similarly named file without knowing what's going to happen and without a guarantee that nothing will go wrong with my machine in general. To tell you what I've done, I've added the lines
# for GPUgrid in BOINC
# configures an interactive environmental variable
export SWAN_SYNC=0
to the file /etc/bash.bashrc Gruesse vom Saenger For questions about Boinc look in the BOINC-Wiki |
liveonc · Joined: 1 Jan 10 · Posts: 292 · Credit: 41,567,650 · RAC: 0
|
What am I doing wrong now??? My Linux machines are as good as winter heating, but not more than that: 71711, 62706, 70922. I guess I was complaining all summer long about Linux being overly aggressive & overheating, but it's winter now & it was summer then. BTW, it would have been nice not to have had the windows open all summer long; now I'm freezing & just burning electricity.
|
GDF · Joined: 14 Mar 07 · Posts: 1958 · Credit: 629,356 · RAC: 0 |
There must be something wrong. The timing is too slow. Try adding export SWAN_SYNC=0 to your .bashrc file. gdf |
Saenger · Joined: 20 Jul 08 · Posts: 134 · Credit: 23,657,183 · RAC: 0
|
"There must be something wrong. The timing is too slow." There is no such file on my system, for f*** sake. If you want to keep ordinary users here, not just nerds (and using command line instructions is definitely nerdy behaviour), please communicate in a way that users can understand! And please explain where this dubious .bashrc is supposed to be, what exactly it should look like, and what it will do to the machine. If you just give such unintelligible answers full of professional terminology, don't expect any user to understand you. Gruesse vom Saenger For questions about Boinc look in the BOINC-Wiki |
liveonc · Joined: 1 Jan 10 · Posts: 292 · Credit: 41,567,650 · RAC: 0
|
.bashrc is hidden. That's what the "." in Linux does. I made this how to: BOInc 4 N00Bs BTW GDF, 71711 has always been using SWAN_SYNC=0, maybe it's something else? I did add export SWAN_SYNC=0 to the other two, but I don't think it'll help. :-(
|
Saenger · Joined: 20 Jul 08 · Posts: 134 · Credit: 23,657,183 · RAC: 0
|
".bashrc is hidden. That's what the '.' in Linux does. I made this how to: BOInc 4 N00Bs" There's no hidden file with that name either; I know how to make hidden files visible in Nautilus. Why are you writing about a .profile, while here it's always called .bashrc? Where exactly is this .profile or .bashrc supposed to be, so that I can alter it? Gruesse vom Saenger For questions about Boinc look in the BOINC-Wiki |
liveonc · Joined: 1 Jan 10 · Posts: 292 · Credit: 41,567,650 · RAC: 0
|
I can't explain beyond a truthful "I don't know", but you can google it. It was a long time ago that I did this. Just open the terminal & type sudo gedit .profile, add export SWAN_SYNC=0, save, then restart Linux. You can check if it's there by opening a terminal after you've restarted & typing env
|
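Running env in a terminal only shows the environment of that terminal session; it does not prove that an already running BOINC client or science application sees the variable. On Linux, the environment a process was started with can be read from /proc/<pid>/environ. A rough sketch follows; the function name proc_env_value is hypothetical, and a short-lived sleep process stands in for the real application, whose process name is not given in this thread.

```shell
# Print the value of an environment variable as a running process received it
# at start-up. Reads /proc/<pid>/environ (Linux only), where entries are
# separated by NUL bytes rather than newlines.
proc_env_value() {
    # $1 = process ID, $2 = variable name
    tr '\0' '\n' < "/proc/$1/environ" | sed -n "s/^$2=//p"
}

# Demonstration with a throwaway process standing in for the science application.
SWAN_SYNC=0 sleep 30 &
demo_pid=$!
sleep 1                                   # give the background process time to start
echo "SWAN_SYNC in process $demo_pid: $(proc_env_value "$demo_pid" SWAN_SYNC)"
kill "$demo_pid"
```

If the function prints nothing for the real process, the variable was not in its environment when it started, whatever env reports in your own terminal. You can normally only inspect processes you own unless you are root.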
GDF · Joined: 14 Mar 07 · Posts: 1958 · Credit: 629,356 · RAC: 0 |
".bashrc is hidden. That's what the '.' in Linux does. I made this how to: BOInc 4 N00Bs" It must not be set correctly; otherwise the use of the different synchronization would appear in the log of the results. Type echo $SHELL to check which shell you have. gdf |
liveonc · Joined: 1 Jan 10 · Posts: 292 · Credit: 41,567,650 · RAC: 0
|
/bin/bash. I just type env; doesn't that also tell you what you need to know & whether SWAN_SYNC=0 is present? But I must be doing something wrong, because nothing is happening.
|
Beyond · Joined: 23 Nov 08 · Posts: 1112 · Credit: 6,162,416,256 · RAC: 0
|
".bashrc is hidden. That's what the '.' in Linux does. I made this how to: BOInc 4 N00Bs" Some have reported good results by boosting the priority in Linux (this also works well in Windows). I'm not a Linux expert, but you should be able to do this in the process manager by lowering the "nice" value. There may be a way or a program to automatically set the nice value to what you want for any given process. Maybe someone knows how to do that in Linux and can post it here? |
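On Linux, the nice value of a running process can be changed from the command line with renice; values range from -20 (highest priority) to 19 (lowest), and lowering a value generally requires root. Below is a minimal sketch, assuming pgrep and renice are installed; renice_by_name is a made-up helper, and a background sleep stands in for the GPUGRID application because its exact process name is not stated here.

```shell
# Set the nice value of every process whose name matches exactly.
renice_by_name() {
    # $1 = new nice value, $2 = exact process name
    for pid in $(pgrep -x "$2"); do
        renice "$1" -p "$pid"
    done
}

# Demo on a harmless stand-in process. A positive value (lower priority) is
# used because it works without root; boosting priority would instead need
# a negative value and root privileges.
sleep 30 &
demo_pid=$!
sleep 1                      # give the background process time to start
renice_by_name 5 sleep
kill "$demo_pid"
```

Note that the helper changes every matching process owned by you, and that the setting is lost when the process exits, so it would have to be reapplied for each new task.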
skgiven · Joined: 23 Apr 09 · Posts: 3968 · Credit: 1,995,359,260 · RAC: 0 |
Run time: 174918.919998, CPU time: 270.99. It is clear that swan_sync was not working. Did you restart after setting it? Setting the nice value (akin to setting the priority, I think) might help, but again, I'm not a Linux expert either, and if I knew how I would have already tested it and posted a nice how-to :) |
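The inference from those two figures can be made explicit. Earlier posts in this thread suggest that SWAN_SYNC=0 keeps most of a CPU core busy, so CPU time would be expected to approach run time when it is active. A small sketch of the arithmetic, with cpu_share_percent as an illustrative helper name:

```shell
# Percentage of wall-clock run time during which the task was using the CPU.
cpu_share_percent() {
    # $1 = run time in seconds, $2 = CPU time in seconds
    awk -v run="$1" -v cpu="$2" 'BEGIN { printf "%.2f\n", 100 * cpu / run }'
}

cpu_share_percent 174918.919998 270.99   # figures quoted above; prints 0.15
```

A share of about 0.15% means the CPU was almost idle for this task, which is consistent with the variable not having taken effect.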
©2025 Universitat Pompeu Fabra