strange behaviour...

capeITLabs

Joined: 17 Nov 12
Posts: 30
Credit: 111,887,025
RAC: 0
Message 33072 - Posted: 18 Sep 2013, 19:36:05 UTC

Hi there,

One of my BOINC machines is a Win7 Pro 64-bit box with an ASUS GTX570 card. The NVIDIA driver is the latest, 320.49. This machine shows a strange behaviour: each WU (http://www.gpugrid.net/results.php?hostid=158339) starts without any error and appears to run for hours, but nothing happens: no CPU usage, no GPU usage, no progress...

What's wrong here?

best regards,
Rene
ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 33076 - Posted: 18 Sep 2013, 20:10:39 UTC - in response to Message 33072.  

Did you reboot the machine? Power off, remove the power cord, wait 10+ minutes and power back on? Have you tried reinstalling the driver, maybe going straight to the new 326.80? Is BOINC actually saying "running" in the manager?

MrS
Scanning for our furry friends since Jan 2002
capeITLabs

Message 33080 - Posted: 18 Sep 2013, 20:38:37 UTC - in response to Message 33076.  

Hi,

yes I did. The BOINC manager says that it is running and the messages file shows it too. I've got another machine for GPUGRID with the same OS and drivers, but with a GTX480 and a GTX560Ti. This machine doesn't show any unusual behaviour.

Hmmm...the 326.80 isn't stable but beta. Since this is not a boinc-only machine, I'd prefer to stay with the stable drivers.

Rene
Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 33081 - Posted: 18 Sep 2013, 21:12:52 UTC - in response to Message 33080.  

"Hmmm...the 326.80 isn't stable but beta. Since this is not a boinc-only machine, I'd prefer to stay with the stable drivers."

Hi Rene, just for info I have 8 machines running here on 326.80 with no noticeable problems. In fact they all have both NVidia and AMD GPUs installed.
capeITLabs

Message 33085 - Posted: 19 Sep 2013, 5:28:25 UTC - in response to Message 33081.  

Hi,

thanks for the info. Maybe I should give it a try...
capeITLabs

Message 33086 - Posted: 19 Sep 2013, 5:50:20 UTC - in response to Message 33085.  

No, it doesn't work even with the new drivers. The application still does nothing... I cancelled both WUs.
skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 33089 - Posted: 19 Sep 2013, 8:40:51 UTC - in response to Message 33086.  
Last modified: 19 Sep 2013, 8:42:23 UTC

pnitrox122-NOELIA_INS1P-1-12-RND5810_0
2Mgx191-NOELIA_INS1P-6-12-RND2605_0
I99R1-NATHAN_KIDc22_glu-3-10-RND8774_1

Yesterday I reported similar behavior while running a NOELIA_INS1P WU (even on Linux):
http://www.gpugrid.net/forum_thread.php?id=3466&nowrap=true#33057
Paul

Joined: 25 Apr 13
Posts: 27
Credit: 240,283,511
RAC: 0
Message 33091 - Posted: 19 Sep 2013, 9:10:39 UTC

This run keeps increasing its remaining time with no end in sight.

Should I abort it?
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 33094 - Posted: 19 Sep 2013, 14:34:19 UTC - in response to Message 33089.  

Was that on the GTX 650 Ti BOOST? I think you also have a GTX 660 as I recall. I want to give mine a try again on the just-released 327.23 drivers, but the 660s seem to have been somewhat problematic recently.
capeITLabs

Message 33095 - Posted: 19 Sep 2013, 16:37:30 UTC - in response to Message 33089.  

@skgiven

the WU that is running (more or less) at the moment is a SANTI_RAP74. This one also does nothing... :(
ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Message 33097 - Posted: 19 Sep 2013, 18:52:56 UTC

The WU which did run for some hours has lots of "# BOINC suspending at user request (thread suspend)" lines in the log. If it's a new installation: did you already check "Nutze die GPU wenn der Computer benutzt wird" ("Use the GPU while the computer is in use") in the local BOINC settings, CPU tab? And "Wenn CPU-Auslastung geringer als x%" ("While CPU usage is less than x%") with x set to 0?
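
For anyone who wants to check their own tasks for the same symptom, here is a minimal sketch that counts those suspend lines in a task's stderr output. It assumes the stderr text has been copied from the result page into a string; the function name and the sample text are hypothetical, and only the marker line itself is taken from the log quoted above.

```python
# Marker line exactly as quoted from the task log in this thread.
SUSPEND_MARKER = "# BOINC suspending at user request (thread suspend)"


def count_suspends(stderr_text: str) -> int:
    """Count how many lines of the stderr text contain the suspend marker."""
    return sum(1 for line in stderr_text.splitlines() if SUSPEND_MARKER in line)


# Hypothetical sample: three suspend lines mixed with placeholder lines.
sample = "\n".join([
    "(other output)",
    SUSPEND_MARKER,
    "(other output)",
    SUSPEND_MARKER,
    SUSPEND_MARKER,
])

print(count_suspends(sample))
```

A count that keeps growing while the machine is otherwise idle would suggest the client itself is suspending the task, rather than the driver or the card misbehaving.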

MrS
capeITLabs

Message 33098 - Posted: 19 Sep 2013, 19:19:45 UTC - in response to Message 33097.  

Sure, see screenshot. Those message lines are more than interesting, but I can't explain what causes them.

capeITLabs

Message 33099 - Posted: 19 Sep 2013, 19:21:36 UTC - in response to Message 33098.  

The thing is that all other GPU tasks (SETI, Einstein, PrimeGrid, POEM) are running fine on this machine.
Jozef J

Joined: 7 Jun 12
Posts: 112
Credit: 1,140,895,172
RAC: 0
Message 33100 - Posted: 19 Sep 2013, 20:00:18 UTC

I have had a similar problem for months. I have already tried all the tutorials on this forum and a clean reinstall of Win 8 64-bit, and I even looked into the problem on my hardware manufacturer's website.

The problem is in the communication between GPUGRID tasks and the NVIDIA drivers: CUDA and programming errors.
GPUGRID works normally for just two weeks, then tasks go wrong and all the work is **
I see a lot of people who have no problems, but they probably use their computers only for GPUGRID, or run Linux. For many people, though, these problems are discouraging when crunching for GPUGRID.
For example, I ran Collatz Conjecture for about a week or two with an average RAC of 650,000, and GPUGRID ran just as well for a few months, but then the problems started that this forum is now full of.
Two days ago one task took me about 8-9 hours. I run two at a time because I have two cards in SLI. After today's NVIDIA driver crash and the subsequent BSOD and forced restarts, the tasks and credit were obviously wasted. The manager now shows one task as having run for about 9-10 hours...
In the weeks before the clean installation of Win 8 64-bit on my SSD, one task took 14-16 hours. Then I had an older BIOS on the board and voilà, one task took 8 hours, until this morning, when the old problem came back: crashes of the NVIDIA driver, the Chrome browser and others.
I have never installed the beta NVIDIA drivers, only WHQL, because with the beta drivers it is worse.

I'm just about to install the NVIDIA 327.23 driver... :(
Jozef J

Message 33102 - Posted: 19 Sep 2013, 20:25:33 UTC - in response to Message 33100.  

When I installed the NVIDIA drivers, the driver crashed again at intervals of a few seconds, and a pop-up notification about the driver crash kept flashing... it's crazy. After the next reboot it works well for now, but one task will take 9-10 hours, so something is really wrong again.
There are probably never-ending problems in this project :-)
Jozef J

Message 33103 - Posted: 19 Sep 2013, 20:26:14 UTC - in response to Message 33099.  

Just so..
capeITLabs

Message 33104 - Posted: 19 Sep 2013, 20:40:06 UTC - in response to Message 33099.  

Ok, did a couple of debug sessions and took a look into the app_control code.

It seems that the task gets suspended due to CPU throttling. I'll have a deeper look now to find out why this is happening.

Will keep you posted...
capeITLabs

Message 33105 - Posted: 19 Sep 2013, 20:52:26 UTC - in response to Message 33104.  

Ok, this was a quick solution ;) I think the CPU throttling in BOINC 7.0.64 is non-optimal. When you take a look at my screenshot of the BOINC options, you'll notice that I only allow BOINC to use 75% of my CPUs. I'm not running any CPU-only WUs on this machine, so there is always just 1 active WU, since I've only got one GPU.

After analyzing the debug output and the source code, I've just changed the option from 75% to 100%...BINGO!!! That worked :)

Now the WU is running fine.

But I think the CPU throttle handling in BOINC needs a bit of tweaking, since the GPUGrid task never ever used 75% of one CPU...
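
For those who prefer to make the same change outside the BOINC Manager, here is a rough sketch of how the override could be edited with a script. It assumes the option in question is the "use at most X% CPU time" preference, which BOINC stores as `cpu_usage_limit` inside `global_prefs_override.xml`; the function name and the example XML are hypothetical, and this is not taken from the BOINC source.

```python
import xml.etree.ElementTree as ET


def set_cpu_usage_limit(override_xml: str, percent: float) -> str:
    """Return the override XML with <cpu_usage_limit> set to `percent`.

    Assumption: the root element is <global_preferences>, as in a
    global_prefs_override.xml written by the BOINC Manager.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    root = ET.fromstring(override_xml)
    if root.tag != "global_preferences":
        raise ValueError("unexpected root element: " + root.tag)
    node = root.find("cpu_usage_limit")
    if node is None:
        # The preference is not overridden yet, so add it.
        node = ET.SubElement(root, "cpu_usage_limit")
    node.text = f"{percent:.6f}"
    return ET.tostring(root, encoding="unicode")


# Hypothetical override content with the 75% limit described above.
example = (
    "<global_preferences>"
    "<cpu_usage_limit>75.000000</cpu_usage_limit>"
    "</global_preferences>"
)
updated = set_cpu_usage_limit(example, 100)
print(updated)
```

The client has to re-read the preferences (or be restarted) before a change like this takes effect; changing the value in the Manager's options dialog, as described above, achieves the same thing.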
capeITLabs

Message 33106 - Posted: 19 Sep 2013, 21:05:28 UTC - in response to Message 33100.  

@Jozef

Maybe you should lower the GPU and memory clock speeds a bit. If the GPUs are running at nearly 100% for a long period of time, the electronics might not be able to sustain the factory clock speeds any longer.

In the past I've had the same problems (see http://www.gpugrid.net/forum_thread.php?id=3421#31554). Since I lowered the clocks a bit, everything has been running smoothly.

cheers
Rene
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 318
Message 33107 - Posted: 19 Sep 2013, 21:42:29 UTC - in response to Message 33105.  

"Ok, this was a quick solution ;) I think the CPU throttling in BOINC 7.0.64 is non-optimal."

Correct. That was a brief (and fortunately now abandoned) aberration in BOINC. Later developmental versions (and BOINC v7.2 when it's released "real soon now") will go back to the old behaviour - CPU throttling not applied to GPU apps.

I've written up details of the exact versions affected on some project's message board - I'll try and work out which project it was, and copy them back here later.