strange behaviour...
| Author | Message |
|---|---|
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
Hi there, one of my BOINC machines is a Win7 Pro 64-bit box with an ASUS GTX570 card. The NVidia driver is the latest, 320.49. This machine shows a strange behaviour: each of the WUs (http://www.gpugrid.net/results.php?hostid=158339) starts without any error and seems to run for hours, but nothing happens... no CPU usage, no GPU usage, no progress... What's wrong here? best regards, Rene |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Did you reboot the machine? Power off, remove the power cord, wait 10+ mins and power back on? Have you tried a driver reinstall, maybe going straight to the new 326.80? Is BOINC actually saying "running" in the manager? MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
Hi, yes I did. The BOINC manager says that it is running and the messages file shows it too. I've got another machine for GPUGRID with the same OS and drivers, but with a GTX480 and a GTX560Ti. That machine doesn't show any unusual behaviour. Hmmm... the 326.80 isn't a stable release but a beta. Since this is not a BOINC-only machine, I'd prefer to stay with the stable drivers. Rene |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
"Hmmm... the 326.80 isn't a stable release but a beta. Since this is not a BOINC-only machine, I'd prefer to stay with the stable drivers." Hi Rene, just for info, I have 8 machines running here on 326.80 with no noticeable problems. In fact, they all have both NVidia and AMD GPUs installed. |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
Hi, thanks for the info. Maybe I should give it a try... |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
No, it doesn't work even with the new drivers. The application still does nothing... I cancelled both WUs. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
pnitrox122-NOELIA_INS1P-1-12-RND5810_0, 2Mgx191-NOELIA_INS1P-6-12-RND2605_0, I99R1-NATHAN_KIDc22_glu-3-10-RND8774_1. Yesterday I reported similar behavior while running a NOELIA_INS1P WU (even on Linux): http://www.gpugrid.net/forum_thread.php?id=3466&nowrap=true#33057 FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
|
Joined: 25 Apr 13 Posts: 27 Credit: 240,283,511 RAC: 0
|
This run keeps increasing its remaining time with no end in sight. Should I abort it? |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
Was that on the GTX 650 Ti BOOST? I think you also have a GTX 660 as I recall. I want to give mine a try again on the just-released 327.23 drivers, but the 660s seem to have been somewhat problematic recently. |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
@skgiven the WU that is running (more or less) at the moment is a SANTI_RAP74. This one also does nothing... :( |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
The WU which did run for some hours has lots of "# BOINC suspending at user request (thread suspend)" lines in the log. If it's a new installation: did you already check "Use GPU while computer is in use" ("Nutze die GPU wenn der Computer benutzt wird" in the German UI) in the local BOINC settings, CPU tab? And "While processor usage is less than x%" ("Wenn CPU-Auslastung geringer als x%") with x set to 0? MrS Scanning for our furry friends since Jan 2002 |
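A quick way to gauge how often a task was suspended is to count those marker lines in its stderr output. The sketch below is only an illustration: the function name `count_suspends` and the sample text are hypothetical, and only the marker string itself is taken from the post above.

```python
# Marker string as quoted in the post above.
SUSPEND_MARKER = "# BOINC suspending at user request (thread suspend)"


def count_suspends(stderr_text: str) -> int:
    """Count the lines of a task's stderr output that contain the suspend marker."""
    return sum(1 for line in stderr_text.splitlines() if SUSPEND_MARKER in line)


# Hypothetical sample for illustration; real text would be copied from the
# task's result page.
sample = "\n".join([
    SUSPEND_MARKER,
    "(some unrelated log line, hypothetical)",
    SUSPEND_MARKER,
])

suspend_count = count_suspends(sample)
print(f"suspend lines found: {suspend_count}")
```

A high count relative to the task's runtime suggests that the client, rather than the science application, keeps pausing the work.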
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
Sure, see screenshot. Those message lines are more than interesting, but I can't explain what causes them. |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
The thing is that all other GPU tasks (SETI, Einstein, PrimeGrid, POEM) are running fine on this machine. |
|
Joined: 7 Jun 12 Posts: 112 Credit: 1,140,895,172 RAC: 0
|
I've had a similar problem for months. I've already tried all the tutorials on this forum and a clean reinstall of Win 8 64-bit, and I even looked for the problem on my hardware manufacturer's website. The problem seems to be in the communication between GPUGRID tasks and the NVIDIA drivers: CUDA and programming errors. GPUGRID works normally for just two weeks, then the tasks go wrong and all the work is **. I see a lot of people who don't have problems, but they probably use their computers only for GPUGRID, or run Linux. These problems discourage many people from crunching for GPUGRID, though. For example, on Collatz Conjecture I reached an average RAC of 650000 within about a week or two, and the same on GPUGRID for a few months, but then the problems started that this forum is full of. Two days ago one task took me about 8-9 hours; I run two at a time because I have two cards in SLI. After today's NVIDIA driver crash and the subsequent BSOD and forced restarts, the tasks and credit were obviously wasted. The manager now shows one task as having run for about 9-10 hours. Weeks ago, before the clean installation of Win 8 64-bit on my SSD, one task took 14-16 hours. Then I had an older BIOS on the board and voilà, one task took 8 hours, until this morning, when the old problem came back: crashes of the NVIDIA driver, the Chrome browser, and others. I've never installed the beta NVIDIA drivers, only WHQL, because it's worse with the beta drivers. I'm just going to install the NVIDIA 327.23 driver... (( |
|
Joined: 7 Jun 12 Posts: 112 Credit: 1,140,895,172 RAC: 0
|
When I installed the NVIDIA drivers, the driver crashed again at intervals of a few seconds, and the pop-up notification about the driver crash kept flashing up... it's crazy. After the next reboot it works well for now, but one task will take 9-10 hours, so something is really wrong again. There are probably never-ending problems in this project :-) |
|
Joined: 7 Jun 12 Posts: 112 Credit: 1,140,895,172 RAC: 0
|
Just so.. |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
Ok, did a couple of debug sessions and took a look into the app_control code. It seems that the task gets suspended due to CPU throttling. I'll have a deeper look now to find out why this is happening. Will keep you posted... |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
Ok, this was a quick solution ;) I think the CPU throttling in BOINC 7.0.64 is non-optimal. When you take a look at my screenshot of the BOINC options, you'll notice that I only allow BOINC to use 75% of my CPUs. I'm not running any CPU-only WUs on this machine, so there is always just one active WU, since I've only got one GPU. After analyzing the debug output and the source code, I simply changed the option from 75% to 100%... BINGO!!! That worked :) Now the WU is running fine. But I think the CPU throttle handling in BOINC needs a bit of tweaking, since the GPUGrid task never ever used 75% of one CPU... |
|
Joined: 17 Nov 12 Posts: 30 Credit: 111,887,025 RAC: 0
|
@Josef maybe you should lower the GPU and memory clock speeds a bit. If the GPUs are running at nearly 100% for a long period of time, the electronics might not be able to sustain the factory clock speeds any longer. In the past I've had the same problems (see http://www.gpugrid.net/forum_thread.php?id=3421#31554). After I lowered the clocks a bit, everything has been running smoothly. cheers Rene |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 318
|
"Ok, this was a quick solution ;) I think the CPU throttling in BOINC 7.0.64 is non-optimal." Correct. That was a brief (and fortunately now abandoned) aberration in BOINC. Later developmental versions (and BOINC v7.2 when it's released "real soon now") will go back to the old behaviour: CPU throttling not applied to GPU apps. I've written up details of the exact versions affected on some project's message board. I'll try to work out which project it was, and copy them back here later. |
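The throttling discussed in this thread can be pictured as a simple duty cycle: with a limit below 100%, a task is allowed to run for part of each period and is suspended for the rest. The sketch below models only that idea. It is not BOINC's actual code; the function names and the one-second period are assumptions made for illustration, and it assumes the 75% setting acts as a CPU-time limit.

```python
from typing import Tuple


def throttle_schedule(limit_percent: float, period_s: float = 1.0) -> Tuple[float, float]:
    """Return (run_seconds, suspend_seconds) per period for a CPU-time limit.

    The period length is an assumption of this sketch, not a BOINC constant.
    """
    if not 0 < limit_percent <= 100:
        raise ValueError("limit_percent must be in (0, 100]")
    run = period_s * limit_percent / 100.0
    return run, period_s - run


def suspends_over(limit_percent: float, duration_s: float, period_s: float = 1.0) -> int:
    """Number of suspend events over duration_s under this simple model."""
    _, pause = throttle_schedule(limit_percent, period_s)
    if pause == 0:
        return 0  # a 100% limit never suspends the task
    return int(duration_s // period_s)


for limit in (75, 100):
    run, pause = throttle_schedule(limit)
    print(f"{limit}%: run {run:.2f}s, suspend {pause:.2f}s per period, "
          f"{suspends_over(limit, 3600)} suspends per hour")
```

Under this model a 75% limit yields one suspend per period, which is consistent with a log full of suspend messages, while a 100% limit yields none. The model does not explain why the suspended GPU task made no progress at all; that part is specific to the BOINC version named in the thread.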