GA: information and issues
| Author | Message |
|---|---|
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Dear Crunchers, we've submitted approximately 1000 WUs of the "GA" (gramicidin A) type. They are a re-issue of a system which we have already run for a while. The purpose of the runs is methodological: they use a model system to improve an algorithm that can be transferred to other molecules. The video is here - though I'm making new ones. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Keep challenging your methodologies and you will strengthen the research. Good decision for the long term future. I hope you identify subtle improvements you have made with the new applications and confirm existing results. Thanks, |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Thanks. Btw, all of them are acemd2, so they have the higher bang-for-the-buck ratio (i.e. credits/hour) of the new app. |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
Finished my first GA! GTX295, shaders at 1620, WinXP, i7-920 HT ON at 4.0 GHz, 8 CPU threads of WCG HCMD2 fully loaded. GA: GPU = 5 hours; CPU usage = 1230 seconds; time per step = 23.927 ms; points w/ bonus = 6945.175. Compared to the recent TONI series average on the same machine: GPU = 4 hours 40 minutes; CPU usage = 555 seconds; time per step = 25.651 ms; points w/ bonus = 6123.06875. So the CPU time is up roughly *2.2 and the GPU time just a little ... looks good to me. I'm looking forward to your new videos, I hope these results help you find a better answer :-) Thanks - Steve |
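As a rough sanity check on the figures in the post above, the comparison can be worked through in a few lines. This is only a sketch: the numbers are copied from the post, while the dictionary layout and the `credits_per_hour` helper are illustrative names, not anything the project provides.

```python
# Figures copied from the post above; names are illustrative only.

def credits_per_hour(points: float, gpu_hours: float) -> float:
    """Credit earned per hour of GPU time."""
    return points / gpu_hours

ga = {"gpu_hours": 5.0, "cpu_seconds": 1230, "points": 6945.175}
toni = {"gpu_hours": 4 + 40 / 60, "cpu_seconds": 555, "points": 6123.06875}

# Ratios of the GA run to the TONI series average.
cpu_ratio = ga["cpu_seconds"] / toni["cpu_seconds"]
gpu_ratio = ga["gpu_hours"] / toni["gpu_hours"]

print(f"CPU time ratio (GA / TONI): {cpu_ratio:.2f}")
print(f"GPU time ratio (GA / TONI): {gpu_ratio:.2f}")
print(f"GA credits/hour:   {credits_per_hour(ga['points'], ga['gpu_hours']):.0f}")
print(f"TONI credits/hour: {credits_per_hour(toni['points'], toni['gpu_hours']):.0f}")
```

On these figures the CPU ratio comes out near 2.2, the GPU time is up by about 7%, and the GA run earns slightly more credit per GPU hour, which fits the "bang for the buck" remark earlier in the thread.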
K1atOdessa Joined: 25 Feb 08 Posts: 249 Credit: 444,646,963 RAC: 0
|
I've been getting nothing but errors on the "TONI_GA" ACEMD - GPU molecular dynamics v6.03 (cuda) WU's over the past 36 hours. "SWAN : FATAL : Failure executing kernel [mshake_position_kernel_1] [2] [66,1,1][64,1,1] Assertion failed: 0, file ../swan/swanlib_nv.cpp, line 194 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information." Running 6.10.17, with drivers 196.34. No issues with other WU's, even other types of v6.03's. Running 2x 8800GT + 1 GTS 240. Restart didn't help. I've halted new WU's for now, but I think I may try to change the preferences not to get these new types. What should I de-select to prevent only these TONI_GA v6.03 types from downloading? |
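For anyone collecting these failures from task output, the SWAN line quoted above can be pulled apart mechanically. A minimal Python sketch, assuming the message always has this shape: the function name `parse_swan_failure` and the field names are hypothetical, and reading the `[2]` as an error code and the two bracketed triples as launch grid and block dimensions is a guess, not documented SWAN behaviour.

```python
import re

# Pattern for lines shaped like the SWAN failure quoted above.
# Field meanings (code, grid, block) are assumptions, not documented.
SWAN_FAILURE = re.compile(
    r"SWAN : FATAL : Failure executing kernel "
    r"\[(?P<kernel>\w+)\] \[(?P<code>\d+)\] "
    r"\[(?P<grid>\d+,\d+,\d+)\]\[(?P<block>\d+,\d+,\d+)\]"
)


def parse_swan_failure(line: str):
    """Return the kernel name, numeric code and two triples, or None."""
    match = SWAN_FAILURE.search(line)
    if match is None:
        return None

    def to_triple(text: str):
        return tuple(int(part) for part in text.split(","))

    return {
        "kernel": match["kernel"],
        "code": int(match["code"]),
        "grid": to_triple(match["grid"]),
        "block": to_triple(match["block"]),
    }


line = (
    "SWAN : FATAL : Failure executing kernel "
    "[mshake_position_kernel_1] [2] [66,1,1][64,1,1]"
)
print(parse_swan_failure(line))
```

Grouping failed tasks by the extracted kernel name and code would make it easier to see whether every failure on a host is the same one.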
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Hi K1atOdessa, do you know if they fail on the GTS or on the 8800 (or both)? At present you can't filter one WU type, but you can filter out acemd2 altogether (Your account, gpugrid preferences). However, this batch of WUs should be over. T |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Hi Steve, thanks for the report... timings look normal to me. The only thing is, I wouldn't swear that the CPU time is reproducible even if you run two identical WUs (I may be wrong). What's important is that it is much less than the GPU time. |
MJH Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0
|
That's an out-of-memory error, but it's coming from an improbable place, which makes me think there's some other problem. Does the problem persist after a hard reset of the machine? Are you running anything else that might be using the GPU's memory? MJH |
K1atOdessa Joined: 25 Feb 08 Posts: 249 Credit: 444,646,963 RAC: 0
|
It did continue after a hard shutdown / reboot. I am not doing anything else (games, etc.) with this machine. I've changed my WU options to only get the "old" WU types, and completed a couple of "Full-atom molecular dynamics v6.71 (cuda23)" WUs with no issue. My settings are now ACEMD: yes; ACEMD ver 2.0: no; ACEMD beta: no. The interesting thing is that I did get one v6.03 WU to complete this morning, which was one I grabbed before changing the WU options. I am going to let it run with just the "ACEMD" type right now, but maybe switch back to accepting the "ACEMD ver 2.0" type after a couple of days with no issues to see what happens. I'd like to get the benefit of the performance increase since my cards are not high-end and take a while to complete. Any other ideas of something I should try? |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
Very consistent timings ... three more on the same machine. I have unhidden my computers (56900) so you can verify. GPU (s) / CPU (s) / time step (ms): 17910 / 1238 / 23.892; 17889 / 1233 / 23.864; 17722 / 1206 / 23.64; 17935 / 1230 / 23.93. I am not complaining at all, they run very nicely for me. The CPU time is still much less than when I run them on my Vista PC, which is a GTX285 i7-920 and takes ~5000 CPU sec. Keep up the good work! Thanks - Steve |
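To put a number on how consistent these four runs are, the spread of each column can be computed directly. A small sketch using only the Python standard library: the values are copied from the post above, and `spread_percent` is an illustrative helper name.

```python
from statistics import mean, pstdev

# (GPU seconds, CPU seconds, ms per step) copied from the post above.
runs = [
    (17910, 1238, 23.892),
    (17889, 1233, 23.864),
    (17722, 1206, 23.64),
    (17935, 1230, 23.93),
]


def spread_percent(values):
    """Population standard deviation as a percentage of the mean."""
    return 100 * pstdev(values) / mean(values)


# zip(*runs) transposes the rows into one tuple per column.
for label, column in zip(("GPU s", "CPU s", "ms/step"), zip(*runs)):
    print(f"{label:8s} mean={mean(column):10.3f} spread={spread_percent(column):.2f}%")
```

All three columns vary by around 1% or less across the four runs, with the CPU time showing the largest relative spread.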
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
K1atOdessa, At some stage in the next few days it would probably be a good idea to make sure you have selected to receive work from other projects (ACEMD ver 2.0 and Betas) if the projects you have selected (ACEMD) have no work. This is also in your projects settings. |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Hi Steve, noted. Thanks for the info. |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
Is it just a coincidence that two machines got incorrect function errors, or is it something with the WU? http://www.gpugrid.net/result.php?resultid=1902641 Based on the amount of time it processed on my machine, it should have been finished. I just upgraded it to Win7 and it has been returning WUs OK after the upgrade, including a GA. Thanks - Steve |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
It's a coincidence. The "other" machine is not returning any results. Did you abort it manually based on elapsed time? Something went wrong, but I don't think the WU is special in any way. |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
I did not abort the WU ... the machine is at home and I am at work :-) It did return another WU of a different type since then, so it looks like the machine is OK. While the driver is the same one I was using for Vista and the OS itself should not make a difference from a stability standpoint, I will lower my OC when I get home today. I will also check my error and system event logs and post anything *special*. <ot>Are you seeing any general trend of Win7 machines producing more errors?</ot> Thanks - Steve |
K1atOdessa Joined: 25 Feb 08 Posts: 249 Credit: 444,646,963 RAC: 0
|
OK. So I restricted my machine to only the v6.71 WU's over the past 2 days. See tasks, 7/8 completed with no issues. I flipped the options back to allow v6.03 WU's and got instant failures again. Two ran longer than just a couple of seconds, but eventually failed. So, any ideas why I am seeing these failures only on the newer v6.03 WU's? Are these v6.03 WU's doing something different that the v6.71 didn't? I've had to go back to restricting downloads to only the v6.71 WU's because otherwise I'd quickly hit the max errored WU's and sit for 24 hours to do it again. |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
I've sent more GA runs.. let's see if the newer application improves things. And, btw, a new movie here. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
K1atOdessa, What cards does your system actually have? I see GT8800 and GT240 ?!? Have you swapped cards around and kept the same drivers? If so, reinstall the driver to register the card, restart and then start crunching again. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
K1atOdessa, Strike that last message. I see you have two 8800GT's and one GT240 in the same system. Restart your system, first! Upgrade to the latest version of BOINC (6.10.36). Restart again. See if that works. If you installed any of these cards recently, you could try to manually reinstall the drivers from Device Manager, individually for each card! |
K1atOdessa Joined: 25 Feb 08 Posts: 249 Credit: 444,646,963 RAC: 0
|
"Restart your system, first!" Thanks, I just saw in another thread that 6.10.36 is the current recommended version. I will upgrade to that later tonight to see what happens. I've had all three cards in and working fine on the older WU's for some time, but if the upgrade to the newer BOINC version doesn't help, I'll try the manual reinstall of drivers for each card. Thanks for the tips. |
©2026 Universitat Pompeu Fabra