GA: information and issues

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 15368 - Posted: 22 Feb 2010, 18:42:11 UTC

Dear Crunchers,

we've submitted approximately 1000 WUs of the "GA" (gramicidin A) type. They are a re-issue of a system which we have already run for a while. The purpose of the runs is methodological: they use a model system to improve an algorithm that can be transferred to other molecules.

The video is here - though I'm making new ones.




skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 15374 - Posted: 22 Feb 2010, 22:48:39 UTC - in response to Message 15368.  

Keep challenging your methodologies and you will strengthen the research. A good decision for the long-term future. I hope you identify the subtle improvements you have made with the new applications and confirm existing results.

Thanks,

Toni
Message 15381 - Posted: 23 Feb 2010, 14:49:57 UTC - in response to Message 15374.  
Last modified: 23 Feb 2010, 14:51:48 UTC

Thanks. Btw, all of them are acemd2, so they have the higher bang-for-the-buck ratio (i.e. credits/hour) of the new app.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 15391 - Posted: 23 Feb 2010, 19:06:54 UTC - in response to Message 15381.  
Last modified: 23 Feb 2010, 20:04:04 UTC

Finished my first GA!

GTX295, Shaders at 1620, WinXP
i7-920 HT ON at 4.0 GHz, 8 CPU threads of WCG HCMD2 fully loaded.

GPU = 5 hours
CPU usage = 1230 seconds
Time per step = 23.927 ms
Points w/ bonus = 6945.175

compared to recent TONI series avg on the same machine
GPU = 4 hours 40 minutes
CPU usage = 555 seconds
Time per step = 25.651 ms
Points w/ bonus = 6123.06875

so the CPU time is up about 2.2x and the GPU time just a little ... looks good to me.

I'm looking forward to your new videos, I hope these results help you find a better answer :-)
Thanks - Steve
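
As a rough check of the figures quoted in this post, here is a minimal Python sketch. The numbers are copied from the post; the dictionary keys and function names are illustrative only, and the GPU times are the rounded values given above, so the results are approximate.

```python
# Figures as quoted in the post; GPU times are the poster's rounded values.
ga = {"gpu_hours": 5.0, "cpu_seconds": 1230, "ms_per_step": 23.927, "credits": 6945.175}
toni = {"gpu_hours": 4 + 40 / 60, "cpu_seconds": 555, "ms_per_step": 25.651, "credits": 6123.06875}


def ratio(new, old):
    """How many times larger `new` is than `old`."""
    return new / old


def credits_per_hour(run):
    """Credits (with bonus) earned per hour of GPU time."""
    return run["credits"] / run["gpu_hours"]


cpu_ratio = ratio(ga["cpu_seconds"], toni["cpu_seconds"])
gpu_ratio = ratio(ga["gpu_hours"], toni["gpu_hours"])

print(f"CPU time ratio (GA / TONI): {cpu_ratio:.2f}")
print(f"GPU time ratio (GA / TONI): {gpu_ratio:.2f}")
print(f"Credits/hour GA:   {credits_per_hour(ga):.0f}")
print(f"Credits/hour TONI: {credits_per_hour(toni):.0f}")
```

With these inputs the CPU-time ratio comes out near 2.2 and the GPU-time ratio near 1.07, while credits per GPU hour are somewhat higher for the GA unit.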

K1atOdessa
Joined: 25 Feb 08
Posts: 249
Credit: 444,646,963
RAC: 0
Level
Gln
Message 15394 - Posted: 23 Feb 2010, 23:15:43 UTC

I've been getting nothing but errors on the "TONI_GA" ACEMD - GPU molecular dynamics v6.03 (cuda) WUs over the past 36 hours.


"SWAN : FATAL : Failure executing kernel [mshake_position_kernel_1] [2] [66,1,1][64,1,1]
Assertion failed: 0, file ../swan/swanlib_nv.cpp, line 194

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information."


Running BOINC 6.10.17 with driver 196.34. No issues with other WUs, even other types of v6.03. Running 2x 8800GT + 1 GTS 240. A restart didn't help.

I've halted new WUs for now, but I may change the preferences so that I don't get these new types. What should I de-select to prevent only these TONI_GA v6.03 types from downloading?

Toni
Message 15404 - Posted: 24 Feb 2010, 17:16:08 UTC - in response to Message 15394.  
Last modified: 24 Feb 2010, 17:20:00 UTC

Hi K1atOdessa,

do you know whether they fail on the GTS, on the 8800s, or on both?
At present you can't filter one WU type, but you can filter out acemd2 altogether (Your account, gpugrid preferences). However, this batch of WUs should be over.

T

Toni
Message 15405 - Posted: 24 Feb 2010, 17:24:13 UTC - in response to Message 15391.  
Last modified: 24 Feb 2010, 17:24:30 UTC

Hi Steve, thanks for the report... timings look normal to me.

One caveat: I wouldn't swear that the CPU time is reproducible, even if you run two identical WUs (I may be wrong). What's important is that it is much less than the GPU time.

MJH
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 15406 - Posted: 24 Feb 2010, 17:34:38 UTC - in response to Message 15394.  


"SWAN : FATAL : Failure executing kernel [mshake_position_kernel_1] [2] [66,1,1][64,1,1]


That's an out-of-memory error, but it's coming from an improbable place, which makes me think there's some other problem. Does the problem persist after a hard reset of the machine? Are you running anything else that might be using the GPU's memory?

MJH
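
For anyone collecting these failures from their task logs, a minimal Python sketch for pulling the fields out of the quoted line follows. The field meanings are assumptions based only on this one line: the first bracket is taken as the kernel name, the second as a numeric status code, and the last two as launch dimensions. None of this is documented SWAN behaviour, and the names `SWAN_FATAL` and `parse_swan_fatal` are made up for illustration. If the bracketed 2 is a CUDA runtime status, it would correspond to `cudaErrorMemoryAllocation`, which fits the out-of-memory reading, but that mapping is also an assumption.

```python
import re

# Assumed layout: [kernel name] [status code] [dims][dims].
# Guessed from the single line quoted in this thread.
SWAN_FATAL = re.compile(
    r"SWAN : FATAL : Failure executing kernel "
    r"\[(?P<kernel>\w+)\] \[(?P<code>\d+)\] "
    r"\[(?P<dims_a>\d+,\d+,\d+)\]\[(?P<dims_b>\d+,\d+,\d+)\]"
)


def to_tuple(text):
    """Turn '66,1,1' into (66, 1, 1)."""
    return tuple(int(part) for part in text.split(","))


def parse_swan_fatal(line):
    """Return the fields of a SWAN fatal-error line, or None if it does not match."""
    match = SWAN_FATAL.search(line)
    if match is None:
        return None
    return {
        "kernel": match.group("kernel"),
        "code": int(match.group("code")),
        "dims_a": to_tuple(match.group("dims_a")),
        "dims_b": to_tuple(match.group("dims_b")),
    }


line = '"SWAN : FATAL : Failure executing kernel [mshake_position_kernel_1] [2] [66,1,1][64,1,1]'
info = parse_swan_fatal(line)
print(info)
```

Grouping failed tasks by the `kernel` and `code` fields would show whether every failure comes from the same place or from several.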

K1atOdessa
Message 15407 - Posted: 24 Feb 2010, 18:06:25 UTC - in response to Message 15406.  


That's an out-of-memory error, but it's coming from an improbable place, making me think there's some other problem. Is the problem persisting over a hard-reset of the machine? Are you running anything else that might be using the GPU's memory?

MJH


It did continue after a hard shutdown / reboot. I am not doing anything else (games, etc.) with this machine. I've changed my WU options to only get the "old" WU types, and completed a couple of "Full-atom molecular dynamics v6.71 (cuda23)" WUs with no issue.

ACEMD: yes
ACEMD ver 2.0: no
ACEMD beta: no


The interesting thing is that I did get one v6.03 WU to complete this morning, which was one I grabbed before changing the WU options. I am going to let it run with just the "ACEMD" type for now, but I may switch back to accepting the "ACEMD ver 2.0" type after a couple of days with no issues, to see what happens. I'd like to get the benefit of the performance increase, since my cards are not high-end and take a while to complete a WU.

Any other ideas of something I should try?

Snow Crash
Message 15408 - Posted: 24 Feb 2010, 18:58:33 UTC
Last modified: 24 Feb 2010, 18:59:58 UTC

Very consistent timings ... three more on the same machine.
I have unhidden my computers (56900) so you can verify.

------------------------------------------------
GPU (s) ----- CPU (s) --- Time per step (ms)
------------------------------------------------
17910 ------- 1238 ------ 23.892
17889 ------- 1233 ------ 23.864
17722 ------- 1206 ------ 23.64
17935 ------- 1230 ------ 23.93
------------------------------------------------

I am not complaining at all, they run very nicely for me.

The CPU time is still much lower than when I run them
on my Vista PC (GTX285, i7-920), which takes ~5000 CPU sec.

Keep up the good work!
Thanks - Steve
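
To put a number on how consistent these four runs are, here is a minimal Python sketch. It assumes the GPU and CPU columns are in seconds and the time step is in milliseconds, which matches the figures in the earlier post but is not stated in the table itself. The step count is derived by dividing GPU time by time per step, so it is only an estimate, and `spread_percent` is an illustrative helper.

```python
from statistics import mean

# (GPU seconds, CPU seconds, ms per step) for the four runs listed above.
# Units are assumed from the earlier post, not stated in the table.
runs = [
    (17910, 1238, 23.892),
    (17889, 1233, 23.864),
    (17722, 1206, 23.64),
    (17935, 1230, 23.93),
]


def spread_percent(values):
    """Range (max - min) as a percentage of the mean."""
    return 100 * (max(values) - min(values)) / mean(values)


gpu = [run[0] for run in runs]
cpu = [run[1] for run in runs]
step_ms = [run[2] for run in runs]

# Estimated number of steps per WU: GPU seconds / seconds per step.
steps = [round(g / (ms / 1000)) for g, _, ms in runs]

print(f"GPU mean {mean(gpu):.0f} s, spread {spread_percent(gpu):.2f}%")
print(f"CPU mean {mean(cpu):.0f} s, spread {spread_percent(cpu):.2f}%")
print(f"Step mean {mean(step_ms):.3f} ms, spread {spread_percent(step_ms):.2f}%")
print("Estimated steps per WU:", steps)
```

Under those unit assumptions the GPU times vary by a little over 1% across the four runs, and each run works out to just under 750,000 steps.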

skgiven
Message 15417 - Posted: 25 Feb 2010, 1:50:42 UTC - in response to Message 15408.  

K1atOdessa,

At some stage in the next few days it would probably be a good idea to make sure you have enabled the option to receive work from the other applications (ACEMD ver 2.0 and betas) if the one you have selected (ACEMD) has no work.

This is also in your project preferences.

Toni
Message 15425 - Posted: 25 Feb 2010, 10:48:37 UTC - in response to Message 15408.  

Hi Steve, noted. Thanks for the info.

Snow Crash
Message 15427 - Posted: 25 Feb 2010, 13:19:00 UTC

Is it just a coincidence that two machines got "incorrect function" errors, or is it something with the WU?
http://www.gpugrid.net/result.php?resultid=1902641

Based on the amount of time it processed on my machine, it should have finished. I just upgraded that machine to Win7 and it has been returning WUs OK since the upgrade, including a GA.
Thanks - Steve

Toni
Message 15437 - Posted: 25 Feb 2010, 17:22:47 UTC - in response to Message 15427.  
Last modified: 25 Feb 2010, 17:24:01 UTC

It's a coincidence. The "other" machine is not returning any results.

Did you abort it manually based on elapsed time? Something went wrong, but I don't think the WU is special in any way.

Snow Crash
Message 15438 - Posted: 25 Feb 2010, 17:52:04 UTC - in response to Message 15437.  
Last modified: 25 Feb 2010, 17:53:08 UTC

I did not abort the WU ... the machine is at home and I am at work :-)
It did return another WU of a different type since then so it looks like the machine is OK. While the driver is the same one I was using for Vista and the OS itself should not make a difference from a stability standpoint, I will lower my OC when I get home today. I will also check my error and system event logs and post anything *special*.

<ot>Are you seeing any general trend of Win7 machines producing more errors?</ot>
Thanks - Steve

K1atOdessa
Message 15455 - Posted: 26 Feb 2010, 5:13:04 UTC - in response to Message 15407.  


OK. So I restricted my machine to only the v6.71 WUs over the past 2 days. See tasks: 7 of 8 completed with no issues. I flipped the options back to allow v6.03 WUs and got instant failures again. Two ran longer than just a couple of seconds, but eventually failed.

So, any ideas why I am seeing these failures only on the newer v6.03 WUs? Are the v6.03 WUs doing something that the v6.71 ones didn't? I've had to go back to downloading only the v6.71 WUs, because otherwise I'd quickly hit the maximum number of errored WUs and have to sit out 24 hours before trying again.

Toni
Message 15664 - Posted: 10 Mar 2010, 11:54:40 UTC - in response to Message 15455.  
Last modified: 10 Mar 2010, 11:54:52 UTC

I've sent more GA runs. Let's see if the newer application improves things.

And, btw, a new movie here.

skgiven
Message 15667 - Posted: 10 Mar 2010, 12:50:28 UTC - in response to Message 15664.  

K1atOdessa, What cards does your system actually have?
I see GT8800 and GT240 ?!?

Have you swapped cards around and kept the same drivers?
If so, reinstall the driver to register the card, restart and then start crunching again.

skgiven
Message 15669 - Posted: 10 Mar 2010, 13:52:41 UTC - in response to Message 15667.  

K1atOdessa, Strike that last message.

I see you have two 8800GTs and one GT240 in the same system.

Restart your system, first!
Upgrade to the latest version of Boinc (6.10.36). Restart again.
See if that works.

If you installed any of these cards recently, you could try manually reinstalling the drivers from Device Manager, individually for each card!

K1atOdessa
Message 15675 - Posted: 10 Mar 2010, 18:41:13 UTC - in response to Message 15669.  

Restart your system, first!
Upgrade to the latest version of Boinc (6.10.36). Restart again.
See if that works.


Thanks, I just saw in another thread that 6.10.36 is the current recommended version. I will upgrade to that later tonight to see what happens.

I've had all three cards working fine on the older WUs for some time, but if the upgrade to the newer BOINC version doesn't help, I'll try the manual reinstall of drivers for each card.

Thanks for the tips.