Message boards : Graphics cards (GPUs) : acemd 6.53

Author Message
Profile Krunchin-Keith [USA]
Joined: 17 May 07
Posts: 512
Credit: 111,288,061
RAC: 0
Level
Cys
Message 4309 - Posted: 13 Dec 2008 | 20:02:15 UTC

What is up with this new application?

My CPU times have gone up, and the GPU elapsed time and ms/step have gone up with them.

I see similar results across all three hosts.

Example: the top three results are from 6.53, and you can see an increase in CPU time, GPU time and ms/step. All the previous results are very consistent. No other changes were made to the clients; the increase shows up as soon as 6.53 started. It does not matter which client version I was running on each host.


CPU time   Claimed   Granted   Version  Driver  CPUs  Client  GPU time    ms/step
51,765.08  3,232.06  3,232.06  6.53     178.24  1.90  6.4.02  62,292.532  73.285
51,160.42  3,232.06  3,232.06  6.53     178.24  1.90  6.4.02  62,612.755  73.662
52,259.07  3,232.06  3,232.06  6.53     178.24  1.90  6.4.02  63,227.636  74.385
49,119.95  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,451.948  69.943
49,250.64  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,452.824  69.944
48,724.11  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,492.902  69.992
48,458.51  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,268.219  69.727
48,364.13  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,162.575  69.603
49,230.23  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,228.002  69.680
49,079.31  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,123.869  69.557
49,090.84  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,325.965  69.795
48,103.39  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,306.325  69.772
49,156.69  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,354.071  69.828
49,170.42  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,486.919  69.985
49,379.95  3,232.06  3,232.06  6.52     178.24  1.90  6.4.02  59,273.031  69.733
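
To put a number on the slowdown, the ms/step column of the table above can be averaged per application version. The following is only a rough sketch: the values are copied from the table, while the dictionary layout and the helper name `percent_increase` are illustrative assumptions.

```python
from statistics import mean

# ms/step values copied from the table above, grouped by application version.
ms_per_step = {
    "6.53": [73.285, 73.662, 74.385],
    "6.52": [69.943, 69.944, 69.992, 69.727, 69.603, 69.680,
             69.557, 69.795, 69.772, 69.828, 69.985, 69.733],
}

def percent_increase(old, new):
    """Relative change from old to new, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0

old_mean = mean(ms_per_step["6.52"])
new_mean = mean(ms_per_step["6.53"])
slowdown = percent_increase(old_mean, new_mean)

print(f"6.52 mean: {old_mean:.3f} ms/step")
print(f"6.53 mean: {new_mean:.3f} ms/step")
print(f"increase:  {slowdown:.1f} %")
```

With only three 6.53 results the estimate is coarse, but on these numbers the increase comes out a little under 6%.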

____________
Alpha Tester ~~ BOINCin since 10-Apr-2004 (2.28) ~~~ Join team USA

Profile Kokomiko
Joined: 18 Jul 08
Posts: 190
Credit: 24,093,690
RAC: 0
Level
Pro
Message 4310 - Posted: 13 Dec 2008 | 20:27:25 UTC - in response to Message 4309.

My last 2 workunits (shorter WUs) are about 30% shorter and I get only 2435.94444444444 credits for each. Now I have to do a manual update more often to get new work (24-hour recall problem).


____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4312 - Posted: 13 Dec 2008 | 21:53:35 UTC - in response to Message 4309.

You have to look at the workunit name. Otherwise it is normal that the workunit time changes. Different workunits can have different time per step, different total times and therefore different credits.

Usually, we submit a small set of WUs to test and then more and more. We are preparing several molecular experiments to run concurrently. We are actually quite excited about it. Movies will also be published soon.

GDF

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4313 - Posted: 13 Dec 2008 | 21:55:27 UTC - in response to Message 4309.

We will check. There should be only minor differences from the previous application.

gdf

Profile [BOINC@Poland]AiDec
Joined: 2 Sep 08
Posts: 53
Credit: 9,213,937
RAC: 0
Level
Ser
Message 4316 - Posted: 14 Dec 2008 | 7:49:36 UTC - in response to Message 4310.
Last modified: 14 Dec 2008 | 7:53:42 UTC

My last 2 workunits (shorter WUs) are about 30% shorter and I get only 2435.94444444444 credits for each. Now I have to do a manual update more often to get new work (24-hour recall problem).



1. I had exactly the same (length of WU). But for me it's not a problem.

2. The problem is that app 6.53 says my WUs will take over 78h (surely not true on an overclocked GTX280). The consequence: on the machine with 2x280GTX I have 3 WUs (2 in use) and BM will not request more (a 4th WU), because to BM it looks as if I can't finish in time. So after 2 of the 3 WUs finish, one of my 280s will be idle.

For now I've raised 'Resource share' to 1000 on every machine to get work. I hope it will work until the problem is solved (6.54?).
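
The reasoning in point 2 can be illustrated with a deliberately simplified model. This is not BOINC's actual work-fetch algorithm: the sketch assumes tasks with equal estimates run in waves, one per GPU, and it assumes a 96-hour (4-day) report deadline. The function names and the 8-hour "realistic" estimate are hypothetical.

```python
import math

def last_finish_hours(n_tasks, n_gpus, est_hours):
    """Finish time of the last task when equal-length tasks run in waves,
    one task per GPU at a time (simplified model, not BOINC's scheduler)."""
    waves = math.ceil(n_tasks / n_gpus)
    return waves * est_hours

def fits_deadline(n_tasks, n_gpus, est_hours, deadline_hours):
    """True if every queued task is estimated to finish before the deadline."""
    return last_finish_hours(n_tasks, n_gpus, est_hours) <= deadline_hours

DEADLINE_HOURS = 96.0   # assumption: 4-day report deadline
INFLATED_EST = 78.0     # estimate reported in the post above
REALISTIC_EST = 8.0     # hypothetical value, for comparison only

# With the inflated estimate, 3 tasks on 2 GPUs already look late,
# so a client reasoning this way would not ask for a 4th task.
queue_ok_inflated = fits_deadline(3, 2, INFLATED_EST, DEADLINE_HOURS)

# With a realistic estimate, 4 tasks on 2 GPUs fit comfortably.
queue_ok_realistic = fits_deadline(4, 2, REALISTIC_EST, DEADLINE_HOURS)

print(queue_ok_inflated, queue_ok_realistic)
```

Under these assumptions the third task would be estimated to finish at 156 hours, well past the deadline, which matches the behaviour described: the client stops requesting work even though a GPU is about to go idle.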



As always - sorry for bad English ;)
____________

Profile Krunchin-Keith [USA]
Joined: 17 May 07
Posts: 512
Credit: 111,288,061
RAC: 0
Level
Cys
Message 4326 - Posted: 14 Dec 2008 | 16:39:12 UTC - in response to Message 4312.
Last modified: 14 Dec 2008 | 16:42:31 UTC

You have to look at the workunit name. Otherwise it is normal that the workunit time changes. Different workunits can have different time per step, different total times and therefore different credits.

Usually, we submit a small set of WUs to test and then more and more. We are preparing several molecular experiments to run concurrently. We are actually quite excited about it. Movies will also be published soon.

GDF

Looking at the names, they were the same group TEST5 except one TEST6. The INCREASE in ms/step was when the app version changed from 6.52 to 6.53, same task group names. Only the last one on the list is a TEST6 and that shows the same ms/step under 6.53 as the TEST5 under 6.53 (the top three on my list). With 6.53 these are showing 73-74 ms/step. Prior to that I had over 30 results with 6.52 all in the 69 ms/step range.

And the last one completed is the same way: increased ms/step.

161682
Name WLX3978-GPUTEST6-1-20-acemd_0
Workunit 120384
Created 13 Dec 2008 13:57:38 UTC
Sent 13 Dec 2008 19:02:36 UTC
Received 14 Dec 2008 14:07:56 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 6577
Report deadline 17 Dec 2008 19:02:36 UTC
CPU time 51465.75
stderr out

<core_client_version>6.4.2</core_client_version>
<![CDATA[
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112
MDIO ERROR: cannot open file "restart.coor"
# Time per step: 73.314 ms
# Approximate elapsed time for entire WU: 62317.270 s
called boinc_finish

</stderr_txt>
]]>

Validate state Valid
Claimed credit 3232.06365740741
Granted credit 3232.06365740741
application version 6.53
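
The two timing lines in the stderr output above are enough to recover the approximate number of simulation steps in the workunit. A minimal sketch, assuming the stderr text keeps exactly this wording; the function name `parse_timing` is illustrative and not part of any BOINC or acemd tooling.

```python
import re

# Timing lines copied from the stderr output above.
STDERR_SAMPLE = """\
# Time per step: 73.314 ms
# Approximate elapsed time for entire WU: 62317.270 s
"""

def parse_timing(stderr_text):
    """Return (ms_per_step, total_seconds) parsed from acemd-style stderr."""
    step = re.search(r"Time per step:\s*([\d.]+)\s*ms", stderr_text)
    total = re.search(r"elapsed time for entire WU:\s*([\d.]+)\s*s", stderr_text)
    if step is None or total is None:
        raise ValueError("timing lines not found in stderr text")
    return float(step.group(1)), float(total.group(1))

ms_step, total_s = parse_timing(STDERR_SAMPLE)

# total time = steps * time per step, so steps = total / time per step
steps = total_s / (ms_step / 1000.0)
print(f"{ms_step} ms/step over {total_s} s is about {steps:,.0f} steps")
```

For this task the result is roughly 850,000 steps, so comparing ms/step between results of the same workunit group is a like-for-like comparison.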

Profile Kokomiko
Joined: 18 Jul 08
Posts: 190
Credit: 24,093,690
RAC: 0
Level
Pro
Message 4333 - Posted: 14 Dec 2008 | 17:31:32 UTC - in response to Message 4316.
Last modified: 14 Dec 2008 | 17:31:51 UTC



As always - sorry for bad English ;)


Bad English is the language of science ... ;)

But your English is much better than mine ... :D
____________

Profile The Gas Giant
Joined: 20 Sep 08
Posts: 54
Credit: 607,157
RAC: 0
Level
Gly
Message 4339 - Posted: 14 Dec 2008 | 19:42:53 UTC

6.53 - time per step = 131 ms, overall 65550 sec per wu, credit 2435.
6.52 - time per step = 96 ms, overall 82022 sec, credit 3232.

All up, I'm getting slightly less credit per day with the new WUs.
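
The credit-per-day comparison follows directly from the two lines above. A small sketch of the arithmetic, assuming one workunit runs at a time on the card and using the rounded credit figures as posted; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400

def credit_per_day(credit_per_wu, seconds_per_wu):
    """Credits earned per day if workunits run back to back on one GPU."""
    return credit_per_wu / seconds_per_wu * SECONDS_PER_DAY

# Figures taken from the post above.
rate_653 = credit_per_day(2435, 65550)   # application 6.53
rate_652 = credit_per_day(3232, 82022)   # application 6.52

drop_percent = (rate_652 - rate_653) / rate_652 * 100.0

print(f"6.53: {rate_653:.0f} credits/day")
print(f"6.52: {rate_652:.0f} credits/day")
print(f"drop: {drop_percent:.1f} %")
```

On these figures the new workunits yield a few percent less credit per day, consistent with "slightly less".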

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4342 - Posted: 14 Dec 2008 | 20:13:04 UTC - in response to Message 4339.

We will soon upload new Linux and Windows applications.


gdf

Profile [BOINC@Poland]AiDec
Joined: 2 Sep 08
Posts: 53
Credit: 9,213,937
RAC: 0
Level
Ser
Message 4345 - Posted: 14 Dec 2008 | 22:02:13 UTC - in response to Message 4316.
Last modified: 14 Dec 2008 | 22:02:53 UTC

We will soon upload new Linux and Windows applications.

gdf


I hope it will help :) because:


2. The problem is that app 6.53 says my WUs will take over 78h (surely not true on an overclocked GTX280). The consequence: on the machine with 2x280GTX I have 3 WUs (2 in use) and BM will not request more (a 4th WU), because to BM it looks as if I can't finish in time. So after 2 of the 3 WUs finish, one of my 280s will be idle.

For now I've raised 'Resource share' to 1000 on every machine to get work. I hope it will work until the problem is solved (6.54?).


It doesn't help 100%. After raising 'Resource share' to 1000 I manually got 1 more task (the 4th one) and went to sleep. When I woke up I checked my box: the 2x280 had finished those 4 WUs and didn't request more... The box was idle. Nice 24C ;)
____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4347 - Posted: 14 Dec 2008 | 22:05:53 UTC - in response to Message 4345.

This is not an application problem. We are trying to normalize the estimated time. A project reset should solve the problem.

gdf

localizer
Joined: 17 Apr 08
Posts: 113
Credit: 1,656,514,857
RAC: 0
Level
His
Message 4352 - Posted: 15 Dec 2008 | 7:40:53 UTC - in response to Message 4347.

Hi GDF - ran a project reset on one of my boxes. Doesn't solve the (my) problem - in fact it makes it worse. Estimated run times had come down to the 100s, but after a reset WUs were showing 299 hours to completion. I'm still managing the work manually - and these shorter WUs in fact make it harder to stop my hosts sitting idle.
I'm reluctant to reset the others as they are making progress, albeit slowly, on reducing the time to completion at download.

P.

Profile [AF>HFR>RR] Jim PROFIT
Joined: 3 Jun 07
Posts: 107
Credit: 31,331,137
RAC: 0
Level
Val
Message 4356 - Posted: 15 Dec 2008 | 15:49:19 UTC

I tried resetting the project and correcting the DCF, but nothing fixes the problem.

On a computer with 2 GPU cards, when the 2 WUs are finished, I can't get more WUs unless I update manually. And I have only 2 WUs, not like before.

Until 6.4.3, I didn't have any issues with this project, except some memory problems like other users.

You said that it's not a problem with the application or the project, so it must be a problem with the new version of BOINC.
But do you know whether this problem will be fixed?

Jim PROFIT

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4357 - Posted: 15 Dec 2008 | 16:07:03 UTC - in response to Message 4356.

I reverted to 6.4.3 on the main page.

gdf.

Profile [BOINC@Poland]AiDec
Joined: 2 Sep 08
Posts: 53
Credit: 9,213,937
RAC: 0
Level
Ser
Message 4364 - Posted: 15 Dec 2008 | 18:49:33 UTC - in response to Message 4357.
Last modified: 15 Dec 2008 | 18:59:37 UTC

It doesn't work. Resetting the project makes the situation worse, as Burdett has written. Estimated time 201 hours. One of my computers (with 280GTX) downloaded just one WU and will not take more, because of the 'ETA'.

I've stopped all other tasks and changed DCF to 0.000001 and 'Resource share' to 1000. For 1 of my boxes it's not enough.


I reverted to 6.4.3 on the main page.

gdf.



Just so you know: I'm using 6.3.19 and 6.3.21 (I don't trust this new s***). One is x32 and the other x64. Going back to 6.4.3 cannot solve the problem.
____________

Profile Kokomiko
Joined: 18 Jul 08
Posts: 190
Credit: 24,093,690
RAC: 0
Level
Pro
Message 4369 - Posted: 15 Dec 2008 | 21:04:32 UTC

I continue to use 6.4.5. When I get a WU from the bad lot (those with a name other than GPUTEST5 or GPUTEST6), I know my DCF will be messed up. So I edit the DCF back to 1.00000 and then I get new work. With this method I have 3 or 4 WUs on every box.
____________

Profile [BOINC@Poland]AiDec
Joined: 2 Sep 08
Posts: 53
Credit: 9,213,937
RAC: 0
Level
Ser
Message 4461 - Posted: 18 Dec 2008 | 12:11:34 UTC
Last modified: 18 Dec 2008 | 12:22:28 UTC

@GDF (here we started, here I'll continue):

Still big problems. Here are some specs and the steps I took:

XP32
6.3.19 (x32)
Tried both 178.24 and newest 180.48
DCF normal
Resource share 1000
All other tasks stopped

Manual project update = request 0

Comp is with 2x280GTX

I had 3 WUs (can't get the 4th one...). I've finished 2 of them. Crunching the last one (using just one 280GTX) with ETA 60h!

1. Manual project update = request 0

2. Resetting project.

3. Project reset.

4. Manual request.

5. Correct request!

6. Receiving info:
`Message from server: No work sent.
Message from server: Full-atom molecular dynamics on Cell processor is not available for your type of computer.
Message from server: Full-atom molecular dynamics for Cell processor is not available for your type of computer.`

7. Status: No tasks and:
`Next request in 24h`



Tried 'Manual project update' a few times. Request correct. But always 'Message from server: Full-atom molecular dynamics for Cell processor is not available for your type of computer.'



You don't even have to answer :). It's just to inform you.
____________

localizer
Joined: 17 Apr 08
Posts: 113
Credit: 1,656,514,857
RAC: 0
Level
His
Message 4463 - Posted: 18 Dec 2008 | 12:24:27 UTC - in response to Message 4461.

Something's screwy - GPUGrid no longer functions as a project that works without a biological coprocessor (me).

I don't understand the logic behind the 'wrong type of processor message' - it appears on my PCs when the DCF issues stop download of a WU .... not meaningful or applicable, but then it causes the 24hour back off which is a bigger problem.

I'm still getting issues with extended times to completion - I am currently showing ranges from minutes to hundreds of hours. Sure I can manually reset the DCF - but it changes again quickly and over a very wide range. None of the recently available client versions fundamentally improve the situation for me.

Project resets are an irritation that change little for any length of time.

P.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4464 - Posted: 18 Dec 2008 | 12:37:29 UTC - in response to Message 4463.

Hi,
we are constantly looking at it. We are reporting these problems to BOINC.
No solution yet. The odd part is that we run without problems, as do most people, but there is still a subset of users like you who have problems.

Hope this gets fixed soon.

gdf

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4465 - Posted: 18 Dec 2008 | 12:38:54 UTC - in response to Message 4461.

Try to update the BOINC client to 6.4.3 please.


gdf

localizer
Joined: 17 Apr 08
Posts: 113
Credit: 1,656,514,857
RAC: 0
Level
His
Message 4468 - Posted: 18 Dec 2008 | 13:02:30 UTC - in response to Message 4465.
Last modified: 18 Dec 2008 | 13:10:01 UTC

............. 6.4.3 now on host 17415. I'm afraid to say it has made no difference; first attempt to gain work has resulted in the 24 hour backoff.

Profile K1atOdessa
Joined: 25 Feb 08
Posts: 249
Credit: 422,354,314
RAC: 938,897
Level
Gln
Message 4480 - Posted: 18 Dec 2008 | 14:54:14 UTC - in response to Message 4465.

Try to update the BOINC client to 6.4.3 please.


gdf


Doesn't appear to work. Gets the 24-hour backoff. If I try to suspend other projects, I get the no work available for your processor message.

localizer
Joined: 17 Apr 08
Posts: 113
Credit: 1,656,514,857
RAC: 0
Level
His
Message 4484 - Posted: 18 Dec 2008 | 15:34:42 UTC - in response to Message 4480.

......... GDF - I guess 6.4.3 was a typo as elsewhere you are advocating 6.4.2.
I've tried both - neither are a solution.

Thanks,
P.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 4495 - Posted: 18 Dec 2008 | 17:20:54 UTC - in response to Message 4484.

Yes, it was a typo.

gdf
