BOINC 6.4.5 released for Windows, Windows x64, Linux and Linux x64

Blackbird74

Message 4235 - Posted: 10 Dec 2008, 13:04:04 UTC

Didn't see a post about this so thought I should put one up.

Change Log:

- client: tweak CPU scheduling policy. When there's a coproc job:
Windows: don't saturate CPUs
Unix: saturate CPUs

- client: in round-robin simulation, remove code that sets CPU shortfall for projects with no active results.

This is now wrong because coproc apps might have pending results. Also remove nidle_cpus > 0 conditional that increments CPU shortfall; I think this is vestigial code.

- client: include deviceOverlap and multiProcessorCount in XML for CUDA devices. They were mistakenly omitted.

- client: in round-robin simulation, don't count a project in total resource share if it has coproc jobs and no CPU jobs.

- MGR: fix the terms of use wizard page.

Original Post:
http://boinc.berkeley.edu/dev/forum_thread.php?id=2518&nowrap=true#21694

Download area:
http://boinc.berkeley.edu/download_all.php

Krunchin-Keith [USA]
Message 4236 - Posted: 10 Dec 2008, 13:29:05 UTC - in response to Message 4235.  

I didn't make one because this one still has problems.

It appears that a change in the client is causing the DCF (duration correction factor) to max out at 100; this started with 6.4.3. As a result, the client thinks every GPUGRID task is going to take far longer than it actually does. To the client, the 4-day deadline then looks too short, so it runs the task in high priority and does not fetch more work either.

Tracking back through my backups on this host as far as 6.4.2, the DCF across the upgraded versions reads 100, 100, 100 and then 1.317483 (under 6.4.2).
My other two hosts, still running 6.4.2, have DCFs of 1.107852 and 1.23629.
This pretty much eliminates the application and points to the client version.
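
To make the effect concrete, here is a minimal sketch of how a DCF of 100 can push a job of a few hours past a 4-day deadline. The formula (estimated floating-point operations divided by device speed, scaled by the DCF) is a simplified view of the client's runtime estimate, and `FPOPS_EST` and `FLOPS` are made-up values chosen only to give a raw estimate of 8 hours; they are not GPUGRID's real numbers.

```python
SECONDS_PER_DAY = 86400


def estimated_runtime(fpops_est: float, flops: float, dcf: float) -> float:
    """Estimated seconds to finish: the raw estimate scaled by the DCF."""
    return fpops_est / flops * dcf


def misses_deadline(runtime_s: float, deadline_days: float) -> bool:
    """True if the estimated runtime does not fit inside the deadline."""
    return runtime_s > deadline_days * SECONDS_PER_DAY


# Hypothetical work unit and device: 2.88e15 / 1e11 = 28800 s (8 h) raw estimate.
FPOPS_EST = 2.88e15
FLOPS = 1.0e11

for dcf in (1.317483, 100.0):
    runtime = estimated_runtime(FPOPS_EST, FLOPS, dcf)
    late = misses_deadline(runtime, deadline_days=4)
    print(f"DCF {dcf:>10.6f}: {runtime / 3600:7.1f} h estimated, misses 4-day deadline: {late}")
```

With the DCF near 1 the estimate fits the deadline easily; with the DCF at 100 the same job looks like hundreds of hours, which would be consistent with the high-priority behaviour described above.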

It did seem to be running max tasks again, and (on Windows) I had to set my processor percentage back to 50% so as to have one dedicated CPU for GPUGRID; otherwise the CPU usage drops and the GPU elapsed time goes up.

@GDF
You need to set the CPU usage per application for this: set Windows to CPU=1.0 and set Linux to some low number such as CPU=0.02, since that is what users say Linux uses. This way Linux users can run max CPUs + 1 GPUGRID task without penalty and without having to use ncpus+1, and Windows users can have a dedicated CPU for GPUGRID without having to set processors to one less; if GPUGRID runs out of work, they can use that processor for a CPU task instead of leaving it idle. I would think you can do separate templates for each OS version to account for this; I'm guessing that is where you adjust that factor. If not, contact David and ask how to do it. He is aware that this is how it should be: adjusted on the project and not in the client, so one client can run both ways.

GDF (Project administrator, Project developer)
Message 4237 - Posted: 10 Dec 2008, 15:06:34 UTC - in response to Message 4236.  

we are working on improving the Windows speed with Nvidia.
They have just sent some code to test.
g

Krunchin-Keith [USA]
Message 4238 - Posted: 10 Dec 2008, 15:39:04 UTC - in response to Message 4237.  

we are working on improving the Windows speed with Nvidia.
They have just sent some code to test.
g

That just happens to be my Christmas wish this year.

[AF>HFR>RR] Jim PROFIT
Message 4239 - Posted: 10 Dec 2008, 15:46:37 UTC - in response to Message 4237.  

we are working on improving the Windows speed with Nvidia.
They have just sent some code to test.
g


Maybe the DCF problem will be solved.

Jim PROFIT

Krunchin-Keith [USA]
Message 4241 - Posted: 10 Dec 2008, 17:57:04 UTC - in response to Message 4239.  

we are working on improving the Windows speed with Nvidia.
They have just sent some code to test.
g


Maybe the DCF problem will be solved.

Jim PROFIT

The DCF is part of BOINC, not NVIDIA.

Vid Vidmar*
Message 4246 - Posted: 11 Dec 2008, 10:17:27 UTC - in response to Message 4237.  

we are working on improving the Windows speed with Nvidia.
They have just sent some code to test.
g


What about those pesky x86_64bit app memory leaks? I have tried just about everything. The only solution so far is to monitor memory usage and reboot before it gets filled up.
BR,


Krunchin-Keith [USA]
Message 4249 - Posted: 11 Dec 2008, 13:10:44 UTC

I found out about the DCF: the client was changed to use FLOPS counting for GPU tasks, and the DCF is off because the FLOPS estimate in the work unit is too low. The older client works because it does not use that value. GDF will correct it in new work units.

I do believe I have had a steadier flow of work in 6.4.5, but only 1 task at a time so far, as the DCF is too high. Once the correct FLOPS value is used, it should correct back to near 1 (over several tasks) and resume normal operation.
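
As a rough illustration of that gradual correction, the sketch below blends the old DCF with the latest actual/estimated runtime ratio after each finished task. The blending rule and the `weight` of 0.1 are assumptions made for illustration only; the real client's update rule may differ, so the number of tasks it prints should not be read as a prediction.

```python
def update_dcf(dcf: float, ratio: float, weight: float = 0.1) -> float:
    """Blend the old DCF with the newest actual/estimated runtime ratio."""
    return (1.0 - weight) * dcf + weight * ratio


def tasks_until_below(start_dcf: float, threshold: float,
                      ratio: float = 1.0, weight: float = 0.1) -> int:
    """Count finished tasks needed before the DCF drops to the threshold."""
    if threshold <= ratio:
        raise ValueError("threshold must be above the ratio the DCF converges to")
    dcf, tasks = start_dcf, 0
    while dcf > threshold:
        dcf = update_dcf(dcf, ratio, weight)
        tasks += 1
    return tasks


# Start from the maxed-out value of 100 and assume every new task
# finishes exactly on its (now corrected) estimate, i.e. ratio = 1.0.
print(tasks_until_below(100.0, threshold=2.0))
```

Under this assumed rule the DCF falls quickly at first and then creeps toward 1, so a host stuck at 100 would need a fair number of completed tasks before estimates look normal again.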

I did get more work automatically after one finished (well, 49 minutes after), but at least that is better than having no work and two tasks just waiting to report and needing manual intervention, as in 6.4.4.

12/10/2008 10:00:30 PM|GPUGRID|Finished upload of JZa1465-GPUTEST5-15-20-acemd_0_2
...
12/10/2008 10:49:08 PM|GPUGRID|Sending scheduler request: To fetch work. Requesting 16354 seconds of work, reporting 1 completed tasks
12/10/2008 10:49:13 PM|GPUGRID|Scheduler request completed: got 1 new tasks
...
12/10/2008 10:50:11 PM|GPUGRID|Finished download of no10932-GPUTEST5-14-grama.ionized.psf
12/10/2008 10:50:12 PM|GPUGRID|Starting no10932-GPUTEST5-14-20-acemd_0
12/10/2008 10:50:12 PM|GPUGRID|Starting task no10932-GPUTEST5-14-20-acemd_0 using acemd version 653
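
The gap between the upload finishing and the next work request can be read straight off those log lines. This is a small sketch that parses the `timestamp|project|message` layout shown above; it assumes the dates are month/day/year and that every line has exactly that three-field shape.

```python
from datetime import datetime

LOG = """\
12/10/2008 10:00:30 PM|GPUGRID|Finished upload of JZa1465-GPUTEST5-15-20-acemd_0_2
12/10/2008 10:49:08 PM|GPUGRID|Sending scheduler request: To fetch work. Requesting 16354 seconds of work, reporting 1 completed tasks
12/10/2008 10:49:13 PM|GPUGRID|Scheduler request completed: got 1 new tasks
"""


def parse_line(line: str):
    """Split 'timestamp|project|message' into (datetime, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p"), project, message


def first_time(entries, prefix: str) -> datetime:
    """Timestamp of the first entry whose message starts with the prefix."""
    return next(t for t, _, msg in entries if msg.startswith(prefix))


entries = [parse_line(line) for line in LOG.splitlines() if line.strip()]
idle = first_time(entries, "Sending scheduler request") - first_time(entries, "Finished upload")
print(f"Gap between upload and work request: {idle}")
```

For the three lines quoted here the gap works out to 48 minutes 38 seconds, which matches the "49 minutes" mentioned above.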

GDF (Project administrator, Project developer)
Message 4253 - Posted: 11 Dec 2008, 16:44:31 UTC - in response to Message 4249.  

I should have fixed the estimated flops for new workunits.
This returns a correct timing only on 6.4.5 clients.


gdf

The Gas Giant
Message 4256 - Posted: 11 Dec 2008, 19:34:22 UTC

Just upgraded to 6.4.5. There is 1 GPU task running and none in the cache. I have the work buffer set at 1.0 days and connect every 0.1 days. BOINC used to cache up to 4 WUs, but now I get the following message on the work request.

12/12/2008 6:26:57 AM|GPUGRID|Sending scheduler request: To fetch work. Requesting 99540 seconds of work, reporting 0 completed tasks
12/12/2008 6:27:12 AM|GPUGRID|Scheduler request completed: got 0 new tasks
12/12/2008 6:27:12 AM|GPUGRID|Message from server: No work sent
12/12/2008 6:27:12 AM|GPUGRID|Message from server: (won't finish in time) BOINC runs 99.9% of time, computation enabled 100.0% of that

Nightlord
Message 4259 - Posted: 11 Dec 2008, 21:55:42 UTC

I haven't touched my installations since 6.3.21, but I get the same message too now.


Sherman H.
Message 4260 - Posted: 12 Dec 2008, 2:56:44 UTC

I got the same message on 2 machines running 6.3.19.

K1atOdessa
Message 4261 - Posted: 12 Dec 2008, 3:47:54 UTC - in response to Message 4236.  

It appears that a change in the client is causing the DCF (duration correction factor) to max out at 100; this started with 6.4.3. As a result, the client thinks every GPUGRID task is going to take far longer than it actually does. To the client, the 4-day deadline then looks too short, so it runs the task in high priority and does not fetch more work either.


I upgraded to 6.4.5 given the note that it was the preferred client. After noticing this issue, I downgraded back to my previous 6.3.19 (which has worked best for me in the past). However, it still has the GPU tasks running High Priority with estimated completion times ~ 10474:34:16. Previously, these were about 11 hours (though not accurate, low by about 4 hours), and thus did not run High Priority.

It sounds as though this will self-correct eventually, though I was hoping it would revert back to previous functionality after doing a fresh uninstall/install of the BOINC 6.3.19 client. :-(

STE\/E
Message 4282 - Posted: 13 Dec 2008, 10:06:46 UTC - in response to Message 4261.  

It sounds as though this will self-correct eventually, though I was hoping it would revert back to previous functionality after doing a fresh uninstall/install of the BOINC 6.3.19 client. :-(


I wouldn't count on that at the moment; it seems the more things change, the worse things get. I've had 2 boxes with just 1 WU on them for 2-3 days now and am still getting crazy to-completion times as high as 27,000+ hours on those boxes.

I've also noticed a few other boxes have dropped to only 2 or 3 WUs, so I suppose they will be down to just 1 WU too eventually ... 0_o


JAMC
Message 4283 - Posted: 13 Dec 2008, 10:20:01 UTC
Last modified: 13 Dec 2008, 10:21:47 UTC

I am getting WUs with 282- and 538-hour to-completion times; everything is running high priority :(
6.4.5, XP Home

Kokomiko
Message 4284 - Posted: 13 Dec 2008, 10:32:46 UTC

I've changed the DCF on my machines to a more realistic value, and now my boxes (all quads) are running with a nearly correct estimated time: 8 hours instead of 6:30 h. Now I have the next problem. I'm also running the additional projects CPDN, PrimeGrid, WCG and MilkyWay. To get a new WU I have to stop 3 projects, especially CPDN (work for over 800 hours) and WCG (work for 48 hours). My work cache is set to 2 days. When I have downloaded a second WU this way, the timer shows 24 hours until the next call. But my GTX280 needs only 13 hours for these 2 WUs. So if I'm absent and can't make a call manually, my PC will be without new work for 11 hours. This should be corrected.
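
The post does not say how the DCF was changed. One commonly described way is to edit the `<duration_correction_factor>` element for the project in `client_state.xml` while the client is stopped. The sketch below does that on a trimmed, hypothetical stand-in for the file: the sample XML, the `example.org` URL and the `set_dcf` helper are all assumptions for illustration, and a real state file is much larger and may not parse cleanly with a strict XML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a client_state.xml file.
SAMPLE = """\
<client_state>
  <project>
    <master_url>http://example.org/project/</master_url>
    <duration_correction_factor>100.000000</duration_correction_factor>
  </project>
</client_state>
"""


def set_dcf(xml_text: str, master_url: str, new_dcf: float) -> str:
    """Return the XML with the DCF of the matching project replaced."""
    root = ET.fromstring(xml_text)
    for project in root.iter("project"):
        if project.findtext("master_url", "").strip() == master_url:
            node = project.find("duration_correction_factor")
            if node is None:
                raise ValueError("project has no duration_correction_factor element")
            node.text = f"{new_dcf:.6f}"
            return ET.tostring(root, encoding="unicode")
    raise ValueError(f"no project with master_url {master_url!r}")


print(set_dcf(SAMPLE, "http://example.org/project/", 1.3))
```

Anyone trying this on a real installation would want to keep a backup of the original file first.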

JAMC
Message 4285 - Posted: 13 Dec 2008, 10:40:13 UTC - in response to Message 4284.  

I've changed the DCF on my machines to a more realistic value, and now my boxes (all quads) are running with a nearly correct estimated time: 8 hours instead of 6:30 h. Now I have the next problem. I'm also running the additional projects CPDN, PrimeGrid, WCG and MilkyWay. To get a new WU I have to stop 3 projects, especially CPDN (work for over 800 hours) and WCG (work for 48 hours). My work cache is set to 2 days. When I have downloaded a second WU this way, the timer shows 24 hours until the next call. But my GTX280 needs only 13 hours for these 2 WUs. So if I'm absent and can't make a call manually, my PC will be without new work for 11 hours. This should be corrected.


I saw this too when the cache was set to 1 day or more so I have just reduced it to .5 days and have not seen the 23 hour plus time for the next connect... I am often left with just the cuda WU being crunched and no others in line and have to manually suspend the other projects to prime the pump for more... 1/5 boxes not running GPU WU in high priority at the moment...

JStateson
Message 4314 - Posted: 14 Dec 2008, 1:31:59 UTC
Last modified: 14 Dec 2008, 1:37:05 UTC

Every GPU (9800GTX+) task is running at high priority and there is no need. Each task finishes in under 11 hours regardless of the priority, and the deadlines are at least 4 days away.

On a quad system, this causes one cpu to be dedicated to the gpugrid task. There is no need for this as I was getting 11 hours gpu completion with 5 tasks running and it is no different with 4 tasks running. This has dropped my overall credit production down.

I assume going back to 6.4.1 might fix this???


With 6.4.1, I was using about 800 seconds of CPU time to process an 11 hour ET job. Now it is taking 22,000 seconds to do the same job. There seems no way to disable the high priority for the gputask. They (BOINC) are not calculating the coprocessor efficiency and utilization correctly.
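
For scale, those two CPU-time figures can be turned into a fraction of one core over the 11-hour elapsed time. This is plain arithmetic on the numbers quoted above; the `cpu_fraction` helper is just an illustrative name.

```python
def cpu_fraction(cpu_seconds: float, elapsed_hours: float) -> float:
    """Share of one CPU core used over the elapsed wall-clock time."""
    return cpu_seconds / (elapsed_hours * 3600.0)


for label, cpu_s in (("with 6.4.1", 800), ("now", 22_000)):
    print(f"{label}: {cpu_fraction(cpu_s, 11):.1%} of one core")
```

That is roughly 2% of a core before versus about 56% now, so the earlier figure is in line with the CPU=0.02 value suggested for Linux earlier in the thread.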

Jack Shaftoe
Message 4320 - Posted: 14 Dec 2008, 15:13:09 UTC
Last modified: 14 Dec 2008, 15:24:38 UTC

Using 6.4.5 yesterday, I got 2 blue screens:

Error code 100000ea, parameter1 8855c2c8, parameter2 89966940, parameter3 bacfbcbc, parameter4 00000001.

For more information, see Help and Support Center at http://go.microsoft.com/fwlink/events.asp.


One of my teammates had the same problem on his box. If I roll back to 6.3.x - what was the last recommended version? 6.3.19?

JAMC
Message 4321 - Posted: 14 Dec 2008, 15:19:30 UTC

So is 6.4.5 still the suggested version? Should we hope for some quick fixes, or roll back to version 'x'?