Message boards : News : New NATHAN_KID WUs on long

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 30211 - Posted: 22 May 2013 | 17:30:20 UTC

Hey guys,

I have put some new work on the long queue. These workunits are small and don't require any special algorithms, so there should be no problems.

With these workunits, I'll be looking at the folding of a small peptide using a special force field. The biophysics community uses different force fields to represent the motions of proteins and other molecules, and we are testing a new one so that we can check some previous work and feel confident in the work we are planning to do.

Happy crunching.

Nate

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30213 - Posted: 22 May 2013 | 17:37:15 UTC - in response to Message 30211.

Thanks. How are the WUs identified (NATHAN_???)?

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1626
Credit: 9,295,466,723
RAC: 18,414,230
Level
Tyr
Message 30214 - Posted: 22 May 2013 | 17:52:56 UTC

I'm currently running I62R8-NATHAN_KIDc22_2-0-8-RND1007.

KIDc22 is new to me: been running (steady progress) for nine hours on a GTX 670, and only reached 78%. That's hardly 'small' - in fact, the slowest I've seen.

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Level
Arg
Message 30215 - Posted: 22 May 2013 | 18:00:36 UTC
Last modified: 22 May 2013 | 18:04:54 UTC

I've finished 4 so far: the GTX680 took 9 hours 40 minutes, the GTX670 about 10 hours; 167,550 points.

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 458
Credit: 832,036,842
RAC: 1,201,055
Level
Glu
Message 30217 - Posted: 22 May 2013 | 18:06:11 UTC

167k credits for 51k secs (small? ^^) on 570. Needs some cpu time too. Looks good to me.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30218 - Posted: 22 May 2013 | 18:13:31 UTC

I think Nathan's referring to some new ones, not the NATHAN_KID. That's why I asked the question above...

flashawk
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Level
Arg
Message 30219 - Posted: 22 May 2013 | 18:27:07 UTC

This is confusing; these are new to me, I never saw them until yesterday. He must be running 2 different kinds of WUs; I guess I'll find out soon enough.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30220 - Posted: 22 May 2013 | 18:29:19 UTC - in response to Message 30219.

This is confusing; these are new to me, I never saw them until yesterday. He must be running 2 different kinds of WUs; I guess I'll find out soon enough.

So will we all... :-)

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1626
Credit: 9,295,466,723
RAC: 18,414,230
Level
Tyr
Message 30221 - Posted: 22 May 2013 | 18:31:37 UTC - in response to Message 30220.

This is confusing; these are new to me, I never saw them until yesterday. He must be running 2 different kinds of WUs; I guess I'll find out soon enough.

So will we all... :-)

The last WU in the database (a few seconds ago) was http://www.gpugrid.net/workunit.php?wuid=4476102 - and it's a NATHAN_KIDc22

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 30224 - Posted: 22 May 2013 | 19:00:46 UTC

Thanks for the interesting background information, Nathan! It's appreciated.

MrS
____________
Scanning for our furry friends since Jan 2002

GPUGRID
Joined: 12 Dec 11
Posts: 91
Credit: 2,730,095,033
RAC: 0
Level
Phe
Message 30254 - Posted: 23 May 2013 | 13:26:33 UTC - in response to Message 30224.

Thanks for the interesting background information, Nathan! It's appreciated.

MrS

Indeed! I always feel good and like to know what I'm working on. It's just awesome. The Nathan_Kid flavor is running just fine all around here!

John C MacAlister
Joined: 17 Feb 13
Posts: 181
Credit: 144,871,276
RAC: 0
Level
Cys
Message 30257 - Posted: 23 May 2013 | 14:04:30 UTC

Thanks from me, too, Nate.

wiyosaya
Joined: 22 Nov 09
Posts: 114
Credit: 589,114,683
RAC: 0
Level
Lys
Message 30456 - Posted: 27 May 2013 | 20:19:32 UTC

Thanks, Nate.

My 580 has little trouble finishing them in 24-hours, but my 460 takes about 25.5 hours to finish one. It would be nice to have WUs that finished on 460's in under 24 hours.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30467 - Posted: 28 May 2013 | 0:44:32 UTC - in response to Message 30456.

Thanks, Nate.

My 580 has little trouble finishing them in 24-hours, but my 460 takes about 25.5 hours to finish one. It would be nice to have WUs that finished on 460's in under 24 hours.

+1

nanoprobe
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Message 30469 - Posted: 28 May 2013 | 1:54:26 UTC
Last modified: 28 May 2013 | 1:55:21 UTC

I received 3 while running Linux. All ran fine and completed in about 11hr20min on a 660TI. I then booted into XP and ran 2 more. They also ran and completed without issue, and actually finished about 10 minutes faster than on Linux. Love the credits.

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 30481 - Posted: 28 May 2013 | 12:02:23 UTC

Yes, the WUs are called NATHAN_KID... workunits. I should have mentioned that initially. There will be some NEW new stuff soon as well; I won't make a new post for it, but will update here. There will also be a new batch of WUs coming from a new scientist that will go on the short queue. That will be announced in a new thread.

Profile nate
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Message 30483 - Posted: 28 May 2013 | 12:27:50 UTC

Ok, there are two groups from me on the grid right now:
NATHAN_KIDc22_2
NATHAN_KIDc22_SODcharge

So you guys know. The second group might run a little faster. You'll have to tell me.
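
For anyone who wants to sort their task list by group, here is a minimal Python sketch that pulls the group (batch) name out of a task name. The field layout is only an assumption inferred from names quoted in this thread (for example I62R8-NATHAN_KIDc22_2-0-8-RND1007); the field labels `prefix`, `step`, `total`, `rnd` and `replica` are hypothetical and not confirmed by the project.

```python
import re

# Assumed layout, inferred only from task names quoted in this thread:
#   <prefix>-<batch>-<step>-<total>-RND<number>[_<replica>]
# The field names are illustrative labels, not official terminology.
TASK_NAME = re.compile(
    r"^(?P<prefix>[^-]+)-(?P<batch>[^-]+)-(?P<step>\d+)-(?P<total>\d+)"
    r"-RND(?P<rnd>\d+)(?:_(?P<replica>\d+))?$"
)


def batch_of(task_name):
    """Return the batch part of a task name, or None if it does not match."""
    match = TASK_NAME.match(task_name)
    return match.group("batch") if match else None


for name in (
    "I62R8-NATHAN_KIDc22_2-0-8-RND1007",
    "I84R7-NATHAN_KIDc22_SODcharge-0-10-RND9833",
):
    print(name, "->", batch_of(name))
```

Names that do not follow the assumed pattern simply return None, so the sketch should fail quietly if the naming scheme turns out to be different.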

Simba123
Joined: 5 Dec 11
Posts: 147
Credit: 69,970,684
RAC: 0
Level
Thr
Message 30484 - Posted: 28 May 2013 | 13:48:42 UTC

I'm currently running http://www.gpugrid.net/result.php?resultid=6905151

on a 660Ti @ 1137MHz / 3019MHz
and it's taking about 381 seconds per frame.
Which puts it at around 10.5 hours runtime.

kinda poor GPU usage at ~85%.
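
The runtime estimate above can be reproduced with a small Python sketch. It assumes that one "frame" corresponds to 1% of progress, i.e. 100 frames per WU; that count is an assumption, not something stated in the post.

```python
def estimated_runtime_hours(seconds_per_frame, frames=100):
    """Estimate the total runtime of a WU in hours.

    frames=100 is an assumption: one "frame" is taken to be 1% of progress.
    """
    return seconds_per_frame * frames / 3600.0


# 381 s per frame, as reported in the post above.
print(f"{estimated_runtime_hours(381):.2f} hours")
```

Under that assumption the result comes out at roughly 10.6 hours, in line with the "around 10.5 hours" quoted above.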

Profile dskagcommunity
Joined: 28 Apr 11
Posts: 458
Credit: 832,036,842
RAC: 1,201,055
Level
Glu
Message 30489 - Posted: 28 May 2013 | 16:52:27 UTC
Last modified: 28 May 2013 | 16:53:15 UTC

Great new units: 98% GPU load on a 570 with an estimated 11.5 hours runtime.



tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 30521 - Posted: 29 May 2013 | 8:41:52 UTC - in response to Message 30215.

I've finished 4 so far: the GTX680 took 9 hours 40 minutes, the GTX670 about 10 hours; 167,550 points.

I’ve done two NATHAN_KIDc22s on my stock-clocks GTX 660. Both kept one CPU core fully occupied (acemd.2865P.exe), and both used a modest amount of GPU resource; see snapshots below.

The first took 15h31m of run time for 167,550 credits, the second 12h27m for 134,000 credits.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30527 - Posted: 29 May 2013 | 13:42:47 UTC - in response to Message 30483.

Ok, there are two groups from me on the grid right now:
NATHAN_KIDc22_2
NATHAN_KIDc22_SODcharge

So you guys know. The second group might run a little faster. You'll have to tell me.

Thanks for the info. The SOD WUs do run faster and they just barely allow a GTX 460 to slip under the 24 hour mark (if micromanaged). So for me it's a large improvement, but still on the long side.

Micromanagement drill: set a backup project, set BOINC to report immediately: then DL a GPUGrid WU, then turn off GPUGrid work fetch, then wait until you notice the GPUGrid WU is done and the backup project is running, then turn on work fetch to DL a new WU, then pause the WU from the backup project so the GPUGrid WU starts immediately, then un-pause the backup project WU so it will run again when the GPUGrid WU finishes. Repeat ad infinitum... If you're lucky you've squeaked under 24 hours.
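
Part of the drill above could in principle be scripted. The Python sketch below only builds `boinccmd` command lines (it does not run them) for the work-fetch and pause/un-pause steps. It assumes the `--project URL nomorework|allowmorework` and `--task URL task_name suspend|resume` operations of `boinccmd`; please check them against your installed BOINC version. The backup project URL and task name are hypothetical placeholders.

```python
# Project URL as it appears in links in this thread; it must match the URL
# under which the project is attached in your BOINC client.
GPUGRID_URL = "http://www.gpugrid.net/"


def work_fetch_command(project_url, allow):
    """Build the boinccmd call that turns work fetch on or off for a project."""
    operation = "allowmorework" if allow else "nomorework"
    return ["boinccmd", "--project", project_url, operation]


def task_command(project_url, task_name, operation):
    """Build the boinccmd call that suspends or resumes a single task."""
    if operation not in ("suspend", "resume"):
        raise ValueError("operation must be 'suspend' or 'resume'")
    return ["boinccmd", "--task", project_url, task_name, operation]


# Hypothetical backup project and task, for illustration only.
BACKUP_URL = "http://backup.example/"
BACKUP_TASK = "backup_task_0"

drill = [
    work_fetch_command(GPUGRID_URL, allow=False),        # after the WU has downloaded
    work_fetch_command(GPUGRID_URL, allow=True),         # once the GPUGrid WU is done
    task_command(BACKUP_URL, BACKUP_TASK, "suspend"),    # let the new WU start at once
    task_command(BACKUP_URL, BACKUP_TASK, "resume"),     # backup runs again afterwards
]
for command in drill:
    print(" ".join(command))
```

Noticing that the GPUGrid WU has finished is still a manual step here; the sketch only covers the commands that are typed in between.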

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 30528 - Posted: 29 May 2013 | 14:08:44 UTC - in response to Message 30527.

The SOD WUs do run faster and they just barely allow a GTX 460 to slip under the 24 hour mark (if micromanaged).

Do what I just did: replaced my GTX 460 with a GTX 660. You'll sleep easy!

Vagelis Giannadakis
Joined: 5 May 13
Posts: 187
Credit: 349,254,454
RAC: 0
Level
Asp
Message 30529 - Posted: 29 May 2013 | 14:09:07 UTC

I acknowledge the new NATHAN WUs, I'm about to finish I84R7-NATHAN_KIDc22_SODcharge-0-10-RND9833.

Runtime, just above 18h on a GTX 650TI on Linux x86_64.

Nice-behaving, these NATHANs! Keep 'em coming, Nate!

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30531 - Posted: 29 May 2013 | 14:50:00 UTC - in response to Message 30528.

The SOD WUs do run faster and they just barely allow a GTX 460 to slip under the 24 hour mark (if micromanaged).

Do what I just did: replaced my GTX 460 with a GTX 660. You'll sleep easy!

If I had unlimited money I'd buy the fastest GPUs available. Some of us are retired and on fixed incomes. Buying less expensive cards and paying the electric bill is enough of a strain. Of course if you send $$$ I'll certainly purchase some GTX 660 GPUs ;-)
JK, BTW: congrats on your GTX 660, it's a nice card.

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 30533 - Posted: 29 May 2013 | 15:09:26 UTC - in response to Message 30531.

Some of us are retired and on fixed incomes.

Me too, but I had a strategy vs. She who holds the purse strings and must be obeyed. We replace our rigs every four years. My rig is almost four years old. She was persuaded that changing the PSU from 425W to 620W, and replacing the video card, was a better alternative to a new PC!

BTW: congrats on your GTX 660, it's a nice card.

Thank you! Love it!!

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 30554 - Posted: 30 May 2013 | 12:55:53 UTC - in response to Message 30533.

I'm getting system restarts while running NATHAN_KIDc22 WU's on XP-x86.
The only other app I have running is WUProp (NCI).
The tasks are recovering however.

I also had one of these WU's fail on Linux:


    Exit status 255 (0xff) Unknown error number

    Stderr output

    <core_client_version>7.0.27</core_client_version>
    <![CDATA[
    <message>
    process exited with code 255 (0xff, -1)
    </message>
    <stderr_txt>
    MDIO: cannot open file "output.restart.coor"

    </stderr_txt>
    ]]>



____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30556 - Posted: 30 May 2013 | 13:19:46 UTC - in response to Message 30554.

I'm getting system restarts while running NATHAN_KIDc22 WU's on XP-x86.
The only other app I have running is WUProp (NCI).
The tasks are recovering however.

I had several acemd crashes (more than are listed in the file below, think there were about 5) on one of the NATHAN_KID WUs yesterday on a non-OCed GTX 460:

<core_client_version>7.0.64</core_client_version>
<![CDATA[
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
# Time per step (avg over 4705000 steps): 6.844 ms
# Approximate elapsed time for entire WU: 82131.953 s
called boinc_finish

</stderr_txt>
]]>


Did the usual drill: shut down BOINC, THEN hit the X on the acemd error message, then reboot (as the GPU can sometimes become unstable when this happens). It eventually finished successfully (and I beat the 24hr deadline by 12 minutes!). This WU failed on 2 previous machines:

http://www.gpugrid.net/workunit.php?wuid=4482496
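
The two timing lines at the end of the stderr output above can be read out with a short Python sketch. It only parses the text as quoted in this post; the "implied total steps" value is derived by dividing the two numbers and is an inference, not something the application prints.

```python
import re

# Timing lines copied from the stderr output quoted above.
STDERR_TAIL = """\
# Time per step (avg over 4705000 steps): 6.844 ms
# Approximate elapsed time for entire WU: 82131.953 s
"""


def parse_timing(text):
    """Extract the timing summary from acemd stderr text, or return None."""
    step = re.search(
        r"# Time per step \(avg over (\d+) steps\):\s*([\d.]+) ms", text
    )
    total = re.search(
        r"# Approximate elapsed time for entire WU:\s*([\d.]+) s", text
    )
    if not step or not total:
        return None
    ms_per_step = float(step.group(2))
    elapsed_s = float(total.group(1))
    return {
        "averaged_steps": int(step.group(1)),
        "ms_per_step": ms_per_step,
        "elapsed_hours": elapsed_s / 3600.0,
        # Derived value (an inference): elapsed time divided by time per step.
        "implied_total_steps": round(elapsed_s / (ms_per_step / 1000.0)),
    }


print(parse_timing(STDERR_TAIL))
```

For the figures above this gives a pure compute time of a little under 23 hours, which fits with the WU only just making the 24-hour window once the crashes and restarts are added.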

terencewee*
Joined: 29 May 12
Posts: 8
Credit: 21,605,500
RAC: 0
Level
Pro
Message 30574 - Posted: 31 May 2013 | 8:31:31 UTC - in response to Message 30556.


It eventually finished successfully (and I beat the 24hr deadline by 12 minutes!).


12mins?

I should feel especially lucky then... to get this (I10R4-NATHAN_KIDc22_2-6-8-RND2039) in with 51secs to spare. :)

GPU-load was 90%.



____________
terencewee*
Sic itur ad astra.

Vagelis Giannadakis
Joined: 5 May 13
Posts: 187
Credit: 349,254,454
RAC: 0
Level
Asp
Message 30576 - Posted: 31 May 2013 | 9:47:18 UTC - in response to Message 30574.

Ahh, the thrill of sending your WU in within the 24h window!!

GPUGRID
Joined: 12 Dec 11
Posts: 91
Credit: 2,730,095,033
RAC: 0
Level
Phe
Message 30599 - Posted: 1 Jun 2013 | 0:44:55 UTC - in response to Message 30576.

Ahh, the thrill of sending your WU in within the 24h window!!

Hint: sell all the old hardware around and grab the best GPU you can. Save energy and produce more :)
Gamers would love our 1 year "old" series 5 and older cards :D

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 30602 - Posted: 1 Jun 2013 | 9:10:20 UTC - in response to Message 30554.
Last modified: 1 Jun 2013 | 9:31:53 UTC

I'm getting system restarts while running NATHAN_KIDc22 WU's on XP-x86.
The only other app I have running is WUProp (NCI).
The tasks are recovering however.

A different type of problem this time:
The WU ran for 29h but had only reached 30%. There was an acemd.2865P.exe pop-up error sitting on the screen when I checked. I exited BOINC, restarted and got the same error; however, the Elapsed time now told me that the WU had only run for ~5h30min. When I suspended the task the error message disappeared. I started to run a different WU from GPUGrid, then suspended it and ran an Einstein WU (which had been suspended), then suspended that and tried to run the Nathan WU. After about 10sec I got the same error message. I closed the message and the task went to 100% and Error after ~10sec.

I81R1-NATHAN_KIDc22_3-0-8-RND4827_0 4488285 31 May 2013 | 3:33:05 UTC 5 Jun 2013 | 3:33:05 UTC In progress --- --- --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)

Stderr output

<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified.
(0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>


I also had one of these WU's fail on Linux:


Exit status 255 (0xff) Unknown error number

Stderr output

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process exited with code 255 (0xff, -1)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"

</stderr_txt>
]]>


...and 2 more, same system, same error:

I69R6-NATHAN_KIDc22_3-0-8-RND5284_1 4488166 31 May 2013 | 3:14:50 UTC 31 May 2013 | 6:08:41 UTC Error while computing 9,273.96 9,196.11 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)
I97R8-NATHAN_KIDc22_3-0-8-RND1236_0 4488457 31 May 2013 | 6:08:41 UTC 31 May 2013 | 12:10:10 UTC Error while computing 21,353.54 21,177.70 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 30603 - Posted: 1 Jun 2013 | 11:47:34 UTC - in response to Message 30602.

A different type of problem this time:
The WU ran for 29h but had only reached 30%. There was an acemd.2865P.exe pop-up error sitting on the screen when I checked. I exited Boinc, restarted and got the same error, however the Elapsed time now told me that the WU had only run for ~5h30min.

Stderr output

<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified.
(0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
Kernel not foundAssertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>

Looks like pretty much exactly what happened here:

http://www.gpugrid.net/forum_thread.php?id=3378&nowrap=true#30556

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 30631 - Posted: 2 Jun 2013 | 9:27:52 UTC - in response to Message 30603.

I31R1-NATHAN_KIDc22_SODcharge-3-10-RND8395_0 4492253 1 Jun 2013 | 19:43:28 UTC 1 Jun 2013 | 23:27:30 UTC Error while computing 13,091.41 12,971.71 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)
I76R5-NATHAN_KIDc22_SODcharge-3-10-RND5030_0 4491243 1 Jun 2013 | 10:43:07 UTC 1 Jun 2013 | 19:43:28 UTC Error while computing 12,157.94 12,044.00 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)
I14R2-NATHAN_KIDc22_3-2-8-RND5964_0 4490995 1 Jun 2013 | 6:06:40 UTC 1 Jun 2013 | 10:43:07 UTC Error while computing 15,460.73 15,325.41 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)

Top two:
Stderr output

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process exited with code 255 (0xff, -1)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"

</stderr_txt>
]]>


Stderr output

<core_client_version>7.0.27</core_client_version>
<![CDATA[
<message>
process exited with code 255 (0xff, -1)
</message>
<stderr_txt>
MDIO: cannot open file "output.restart.coor"
SWAN : FATAL : Cuda driver error 700 in file 'swanlibnv2.cpp' in line 1841.

</stderr_txt>
]]>


Jim Daniels (JD)
Joined: 20 Jan 13
Posts: 9
Credit: 206,731,892
RAC: 0
Level
Leu
Message 30662 - Posted: 5 Jun 2013 | 1:01:58 UTC - in response to Message 30602.

I have had two WU with the same error:

I36R2-NATHAN_KIDc22_2-3-8-RND5161_8

Run time 11.7 seconds. CPU time 0.55 seconds.

There are seven other error reports on this WU all with very small CPU times.

and an older one

I55R4-NATHAN_KIDc22_SODcharge-1-10-RND5713_0

Run time 140.58 seconds. CPU time 99.79 seconds.
There are no other error reports and one completion report for this WU.

klepel
Joined: 23 Dec 09
Posts: 189
Credit: 4,735,009,416
RAC: 626,127
Level
Arg
Message 30699 - Posted: 7 Jun 2013 | 0:38:58 UTC

STRANGE: http://www.gpugrid.net/result.php?resultid=6930954, I19R7-NATHAN_KIDc22_3-1-8-RND5865_1:

Runtime: 14:51.51 and counting, advanced: 16.457 %, Remaining: 32:13:07.
EVGA OC Scanner X says: GPU load 66-82%, MEM load: 32%, 412 MB and MCU load 20%.

On a GTX570, AMD 6200 FX.

I have to go, so I'm leaving this open for comments. I will not be able to do anything until tomorrow morning.

Vagelis Giannadakis
Joined: 5 May 13
Posts: 187
Credit: 349,254,454
RAC: 0
Level
Asp
Message 30702 - Posted: 7 Jun 2013 | 9:47:01 UTC - in response to Message 30699.

Something must be wrong with this WU, it has failed on another host:

http://www.gpugrid.net/result.php?resultid=6930512

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Message 30704 - Posted: 7 Jun 2013 | 12:17:48 UTC - in response to Message 30702.

WUs can fail many times. I had one which failed 8 times and then succeeded on the last attempt. So I don't think it's really an indication of anything.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 30708 - Posted: 7 Jun 2013 | 13:12:36 UTC - in response to Message 30704.
Last modified: 7 Jun 2013 | 13:26:25 UTC

The computer it failed on was a titan (which cannot run these WU's):

    Coprocessors NVIDIA GeForce GTX TITAN (4095MB) driver: 314.22



If a WU fails on several known good systems then perhaps there is a problem, but you have to dig deep to find how reliable the previous systems were.

In the case of the GTX570, it's most likely that the GPU clocks dropped, but these were not reported, and neither was whether the system is set to prefer maximum performance, or how much of the CPU was being used...

The recommended settings are listed in the FAQs, and there is a suggested way to ask for help (because we cannot see your system's setup).

klepel
Joined: 23 Dec 09
Posts: 189
Credit: 4,735,009,416
RAC: 626,127
Level
Arg
Message 30710 - Posted: 7 Jun 2013 | 14:51:59 UTC - in response to Message 30699.

STRANGE: http://www.gpugrid.net/result.php?resultid=6930954, I19R7-NATHAN_KIDc22_3-1-8-RND5865_1:

UPDATE on this one: it finished after 28:11:19 hours, so it will miss the 24-hour deadline...
It is uploading at the moment.
In the case of the GTX570, it's most likely that the GPU clocks dropped, but these were not reported, and neither was whether the system is set to prefer maximum performance, or how much of the CPU was being used...

This system normally does these RNDXXXX tasks in around 58000 seconds, and has done so quite a few times, so this is really a strange WU. I never had issues with the system, although these RND tasks take quite long for one of the faster video cards of the last generation.

The video card is an EVGA GTX 570 SC (not pushed any further; I prefer that it runs more or less cool, at 68 °C to 73 °C with the fan speed at 85%), and on the AMD 6200 FX there is a core reserved for the video card, as recommended. The card is a recent replacement by EVGA for a faulty card, so it is new.

klepel
Joined: 23 Dec 09
Posts: 189
Credit: 4,735,009,416
RAC: 626,127
Level
Arg
Message 30711 - Posted: 7 Jun 2013 | 14:56:42 UTC - in response to Message 30710.

Oh, and I just noticed the upload file has a size of 107.88 MB (so more or less double what it was before), therefore it will take quite a while to upload, as my internet connection will break several times and I will end up with the famous 5-hour breaks between upload attempts.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 30714 - Posted: 7 Jun 2013 | 18:41:17 UTC

Klepel, as you say you've been running these Nathan KIDc22 WUs just fine before in 59ks. Check if the GPU clock is still up to where it should be. After driver resets it sometimes stays too low without throwing an error at GPU-Grid. In this case a reboot would help.

MrS

Vagelis Giannadakis
Joined: 5 May 13
Posts: 187
Credit: 349,254,454
RAC: 0
Level
Asp
Message 30721 - Posted: 7 Jun 2013 | 19:40:22 UTC - in response to Message 30708.

The computer it failed on was a titan (which cannot run these WU's):
    Coprocessors NVIDIA GeForce GTX TITAN (4095MB) driver: 314.22


Good catch! It seems I was too quick to blame the WU.


Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 30723 - Posted: 7 Jun 2013 | 20:44:13 UTC - in response to Message 30711.

Oh, and I just noticed the up-load file does have a size of 107.88 MB (so more or less the double as before)


Sounds too long. Maybe wrong compression.

klepel
Joined: 23 Dec 09
Posts: 189
Credit: 4,735,009,416
RAC: 626,127
Level
Arg
Message 30727 - Posted: 8 Jun 2013 | 0:02:01 UTC

The strange WU is up-loaded: http://www.gpugrid.net/result.php?resultid=6930954

All seems normal. Only that it took nearly twice as long: 101,479.54 s
Stderr output: 57186.919 s

The next WU crashed because of an electricity cut.

I will report on the next WU after it's finished.

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 30732 - Posted: 8 Jun 2013 | 14:39:36 UTC

I've completed 18 NATHAN_KIDc22s, nine of the SODcharge variety for 134000 credits each, and nine of the RND variety for 167500 credits each EXCEPT...

The RND that completed this morning, well within 24 hours, gave 111700 credits:

http://www.gpugrid.net/result.php?resultid=6933145

That was a surprise!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 30734 - Posted: 8 Jun 2013 | 15:23:17 UTC - in response to Message 30732.
Last modified: 8 Jun 2013 | 15:23:32 UTC

Your 2nd wingman handed the WU in after it got sent to you, but before you could send yours in. In this case the bonus credit doesn't trigger for either of you; sadly, an old problem.

MrS

tomba
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Message 30735 - Posted: 8 Jun 2013 | 15:42:25 UTC - in response to Message 30734.

Your 2nd wingman handed the WU in after it got sent to you, but before you could send yours in. In this case the bonus credit doesn't trigger for either of you; sadly, an old problem. MrS

Ah! I did not notice I had a wingman. I had thought GPUGrid's quorum was one. Why did it get sent to me before the wingman's try had run out of time?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 30736 - Posted: 8 Jun 2013 | 16:46:25 UTC - in response to Message 30735.

It's a known issue with the credit system at GPUGrid. Basically it happens only when a WU is resent and, after being resent, is returned by the first recipient and validates. After that validation, bonuses are not awarded. It's a fairly rare event but apparently too difficult to fix.

No Bonus for finishing within 24 hours
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30745 - Posted: 9 Jun 2013 | 12:33:06 UTC - in response to Message 30735.

The WU was sent to the wingman first, but he didn't return it within a few days. At some point GPU-Grid needs the results to generate the next WU in this string / time iteration. So the WU was sent to you (not sure if this was before the deadline for the wingman or not), but the other guy eventually handed the result in. This is why GPU-Grid is not suitable for low-end hardware and why the deadline is rather short.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile nate
Send message
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 30773 - Posted: 11 Jun 2013 | 14:40:51 UTC

FYI, I sent out two new batches of simulations yesterday and today that belong to the same research project. Names are NATHAN_KIDc22_full and NATHAN_KIDc22_noPhos.

flashawk
Send message
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 30775 - Posted: 11 Jun 2013 | 15:03:20 UTC

Boy, this place is turning into your own private Idaho. Nobody else wants to run work units? Not that there's anything wrong with yours.

Stefan
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 30777 - Posted: 11 Jun 2013 | 17:13:03 UTC - in response to Message 30775.

It's the circumstances. We are missing people to conferences and paternity leave, and summer is coming too. So these weeks might end up being a bit unstable, but there is definitely more stuff to simulate! (We just need to get the people back :D)

tomba
Send message
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30779 - Posted: 11 Jun 2013 | 17:57:30 UTC

After 23 successful Nathan_KIDc22s, I just got an error on my first noPhos, after 8h29m:

http://www.gpugrid.net/result.php?resultid=6944425

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 30784 - Posted: 11 Jun 2013 | 23:06:06 UTC

Just finished my first noPhos. Looks like no problemo.

http://www.gpugrid.net/result.php?resultid=6944390

Profile dskagcommunity
Avatar
Send message
Joined: 28 Apr 11
Posts: 458
Credit: 832,036,842
RAC: 1,201,055
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30786 - Posted: 12 Jun 2013 | 10:20:03 UTC

Yes 5 phos done, no problems.
____________
DSKAG Austria Research Team: http://www.research.dskag.at



FoldingNator
Send message
Joined: 1 Dec 12
Posts: 24
Credit: 60,122,950
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwat
Message 30794 - Posted: 12 Jun 2013 | 16:29:25 UTC
Last modified: 12 Jun 2013 | 16:31:26 UTC

I was working on two Nathan Phos WUs. After 16 hours, at 99.617% done, my computer froze.

@#$%^&*@#$%^&@#$%^&#$%^& :'( not funny!!! :@

Why not? Because it isn't the first time; this is the third week I've had these problems. Only with GPUGRID as far as I know... and it is very annoying.
Out of anger, after the restart I aborted all GPUGRID tasks in BOINC; that is what you see in the task/client status now... -_-'


Last WU: http://www.gpugrid.net/workunit.php?wuid=4514859
Second last WU: http://www.gpugrid.net/workunit.php?wuid=4514763

flashawk
Send message
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 30795 - Posted: 12 Jun 2013 | 17:16:41 UTC - in response to Message 30794.

Why do you guys hide your computers? I've never been able to figure that out. I haven't had any lockups in a while on any of my 4 rigs, so I couldn't say what the problem is with yours. I disabled my onboard USB 3.0 and all my issues went away.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30797 - Posted: 12 Jun 2013 | 18:36:57 UTC - in response to Message 30795.

I disabled my onboard USB 3.0 and all my issues went away.

Do you have an Etron chip? They had serious driver issues last year, leading to unstable transfer rates, devices dropping and unstable computers. Drivers from this spring are fine, though (using it myself... guess why I know about the problems).

MrS
____________
Scanning for our furry friends since Jan 2002

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30798 - Posted: 12 Jun 2013 | 18:38:26 UTC - in response to Message 30794.

I was working on two Nathan Phos WUs. After 16 hours, at 99.617% done, my computer froze.

What do you mean by that, did you run 2 WUs concurrently? If so be aware that you're alpha testing that functionality. It's not suggested or approved by the project in any way, so do it at your own risk.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30802 - Posted: 12 Jun 2013 | 19:15:04 UTC - in response to Message 30798.
Last modified: 12 Jun 2013 | 20:07:03 UTC

Going by this post he has two GPU's.

Also,
Coprocessors [2] NVIDIA GeForce GTX 570 (2559MB) driver: 314.22

Outcome Computation error
Client state Aborted by user
Exit status 203 (0xcb) EXIT_ABORTED_VIA_GUI

(a process exit code used by the core client/app). More commonly found at CPU projects such as ABC, Asteroids, Physics@home, Constellation, Quake-Catcher...

I92R9-NATHAN_KIDc22_noPhos-1-10-RND9632_0 4514859 11 Jun 2013 | 23:11:29 UTC 12 Jun 2013 | 16:20:04 UTC Aborted by user 56,038.62 10,735.35 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)
I93R9-NATHAN_KIDc22_noPhos-1-10-RND7759_0 4514763 11 Jun 2013 | 23:11:29 UTC 12 Jun 2013 | 16:20:04 UTC Aborted by user 55,998.47 10,639.65 --- Long runs (8-12 hours on fastest card) v6.18 (cuda42)

http://www.gpugrid.net/show_host_detail.php?hostid=152224

The NATHAN_KIDc22_noPhos WU's take ~12h on a GTX660. Some finish under 12h on GTX570's on Linux but others are ~13.6h on W7.
Either 2 GPU's running slow (most likely) or one GPU running 2 WU's fast until they crash (less likely). Quite a few failures on that system though,
http://www.gpugrid.net/results.php?hostid=152224&offset=0&show_names=1&state=0&appid=
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

FoldingNator
Send message
Joined: 1 Dec 12
Posts: 24
Credit: 60,122,950
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwat
Message 30806 - Posted: 12 Jun 2013 | 21:53:11 UTC - in response to Message 30798.
Last modified: 12 Jun 2013 | 21:55:21 UTC

I was working on two Nathan Phos WUs. After 16 hours, at 99.617% done, my computer froze.

What do you mean by that, did you run 2 WUs concurrently? If so be aware that you're alpha testing that functionality. It's not suggested or approved by the project in any way, so do it at your own risk.

MrS

I have 2 GPUs. They run as two separate cards, and each runs one work unit. (thnx skgiven...)


The NATHAN_KIDc22_noPhos WU's take ~12h on a GTX660. Some finish under 12h on GTX570's on Linux but others are ~13.6h on W7.
Either 2 GPU's running slow (most likely) or one GPU running 2 WU's fast until they crash (less likely). Quite a few failures on that system though,
http://www.gpugrid.net/results.php?hostid=152224&offset=0&show_names=1&state=0&appid=

Yes a lot! I know...
With the first ones, from the end of May and the beginning of June, I had some difficulty finding a stable core clock and shader speed. My GTX570s are slightly overclocked by EVGA and I was starting on a new system, so I had to figure out the best settings.
After that a few WUs went well, and since last week I've had only errors. It's very frustrating... and I don't know whether it has to do with the WUs or my computer... hmm. Disappointing.

I'm running at 680MHz core (1360 shader) and 1700MHz VRAM. Default on my cards is 732/1464/1900MHz. Very strange: on my former system I used them at 720/1440/1700 for stable clocks. Back in February I had mostly good, validated WUs with those settings. Strange...

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30808 - Posted: 12 Jun 2013 | 23:37:31 UTC - in response to Message 30806.
Last modified: 12 Jun 2013 | 23:41:28 UTC

You might need to tweak your Voltage up slightly on your GPU.
Is it clean and is the PSU up to the task?
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

FoldingNator
Send message
Joined: 1 Dec 12
Posts: 24
Credit: 60,122,950
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwat
Message 30811 - Posted: 13 Jun 2013 | 8:16:14 UTC - in response to Message 30808.

Higher voltage, even now that I've lowered the speeds?
My PSU is good (Corsair AX850) and stable as far as I know.

All the hardware runs on a bench table, in open air... I cleaned all components before I started folding again at the end of May. Removal of dust, new cooling paste, etc...

With the same hardware I'm running PrimeGrid: PPS Sieve (CUDA) and Genefer GFN (CUDA). That's a higher load than GPUGRID (99% vs 80-85%). Only 1 of 6 GFN tasks had an error last week (my own fault)... But I'm going to monitor whether it occurs more frequently in GFN than I thought, so we can narrow down where the problem is (my computer or any folding program).

Profile nate
Send message
Joined: 6 Jun 11
Posts: 124
Credit: 2,928,865
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 30812 - Posted: 13 Jun 2013 | 8:43:58 UTC - in response to Message 30775.

Boy, this place is turning into your own private Idaho. Nobody else wants to run work units? Not that there's anything wrong with yours.


:) Well not intentionally. The current WUs are part of a rushed project we will hopefully be able to publish quickly. Also, I often run projects for other people/collaborators, since I am one of the most experienced here. Santiago (the new guy) will have the short queue mostly dominated for a while with a very interesting project, so it's not only my work being done.

FoldingNator
Send message
Joined: 1 Dec 12
Posts: 24
Credit: 60,122,950
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwat
Message 30903 - Posted: 20 Jun 2013 | 23:08:08 UTC - in response to Message 30811.
Last modified: 20 Jun 2013 | 23:09:16 UTC

I'm back! :D
PrimeGrid challenge has finished and lightning storms are gone for now... so we can maybe fold again. ;-)

Before all of the above, I ran eight GFN CUDA tasks last week (12 hours per WU) and a few (20 or so) PPS Sieve tasks. All tasks finished successfully, so I don't think my computer is the cause of all the WU errors at GPUGRID. But it still seems strange to me that PG WUs can finish completely and tasks from GPUGRID don't.

Stefan
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 30904 - Posted: 21 Jun 2013 | 8:34:04 UTC - in response to Message 30903.
Last modified: 21 Jun 2013 | 8:38:38 UTC

That's not really surprising: I don't expect projects like PG to try anything new. They just continue the same type of WUs they have tested forever. Every time, we have a new biological system, often including new functions. So I guess, shit happens :D (quite often)

flashawk
Send message
Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 30906 - Posted: 21 Jun 2013 | 12:51:52 UTC - in response to Message 30904.

Every time, we have a new biological system, often including new functions.


Wow, I didn't realize that, it clears up a lot of things. I imagined for some reason that you guys had made up some sort of templates in the beginning that could be used over, now I understand a little more of the difficulties. What about Cuda 5.5? Now that it was just released, could it be of any use to the project?

Stefan
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 30907 - Posted: 21 Jun 2013 | 13:36:44 UTC - in response to Message 30906.

No idea on CUDA. Gianni knows more, but I think he keeps a pretty good eye on CUDA releases, so when they actually provide a significant boost we use them like in our last update.

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30909 - Posted: 21 Jun 2013 | 17:49:49 UTC

I haven't had any issues with nathan tasks. My 680s clear them in around 8-10 hours depending on the WU.

tomba
Send message
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30910 - Posted: 21 Jun 2013 | 18:25:09 UTC - in response to Message 30909.

I haven't had any issues with nathan tasks. My 680s clear them in around 8-10 hours depending on the WU.

FWIW, my 660 (singular) has processed 46 of the new batch of Nathans that first appeared May 27. They take 12h30m to 15h30m, depending...

Four WUs died with errors, losing 18 hours of crunching; a dfhr36 and three noPhos.

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30911 - Posted: 21 Jun 2013 | 18:51:28 UTC

Ouch.

matlock
Send message
Joined: 12 Dec 11
Posts: 34
Credit: 86,423,547
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 30916 - Posted: 22 Jun 2013 | 5:44:24 UTC

My 660 has been knocking them off in about 39,500 seconds (just under 11 hours).

tomba
Send message
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30917 - Posted: 22 Jun 2013 | 8:19:23 UTC - in response to Message 30916.

My 660 has been knocking them off in about 39,500 seconds (just under 11 hours).

I wonder why my 660 takes so much longer than yours?

Is it all down to my Win 7 vs. your Ubuntu, or is something else in play here?

matlock
Send message
Joined: 12 Dec 11
Posts: 34
Credit: 86,423,547
RAC: 0
Level
Thr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 30926 - Posted: 22 Jun 2013 | 17:11:45 UTC - in response to Message 30917.

Is it all down to my Win 7 vs. your Ubuntu, or is something else in play here?


It looks like mine is about 12% faster on average. That could be down to Win7 vs Linux. I'm not running regular Ubuntu either, but Lubuntu with MATE. That avoids the bloat of Gnome3.

It might be worth trying to boot off a USB flash drive with Linux and run a WU or two to compare the difference on your setup. I've heard CPU and RAM can also play into running time. Additionally, I have an Asus 660 OC that I believe runs at 1020 MHz and 1085 MHz boost.

tomba
Send message
Joined: 21 Feb 09
Posts: 497
Credit: 700,690,702
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30927 - Posted: 22 Jun 2013 | 17:27:33 UTC - in response to Message 30926.

Is it all down to my Win 7 vs. your Ubuntu, or is something else in play here?

It might be worth trying to boot off a USB flash drive with Linux and run a WU or two to compare the difference on your setup.

I have no idea how to do that!!!

Additionally, I have an Asus 660 OC that I believe runs at 1020 MHz and 1085 MHz boost.

I have an Asus 660 non-OC but I pushed it up to 1097 MHz, with no ill effects...

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 30945 - Posted: 23 Jun 2013 | 17:55:08 UTC - in response to Message 30926.

Additionally, I have an Asus 660 OC that I believe runs at 1020 MHz and 1085 MHz boost.

My 660 is an EVGA, not OC, and runs at 1110MHz. It needed around 46000 seconds to complete one of Nate's long runs. Win7 Ultimate with 12GB RAM, 5.5GB in use and 88% of the CPU, i7 960 Bloomfield.
So your GPU is a lot faster.
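To put the two run times quoted here side by side (about 39,500 seconds on matlock's 660 under Linux and about 46,000 seconds on this card under Win7), a small sketch; the function names are made up for the example, and only the two run times come from the thread:

```python
def seconds_to_hours(seconds):
    """Convert a run time in seconds to hours."""
    return seconds / 3600


def percent_longer(slow_seconds, fast_seconds):
    """How much longer the slower run takes, as a percentage of the faster one."""
    return (slow_seconds / fast_seconds - 1) * 100


matlock_run = 39_500  # seconds, GTX 660 on Linux (from the thread)
tj_run = 46_000       # seconds, GTX 660 on Win7 (from the thread)

print(f"{seconds_to_hours(matlock_run):.2f} h")          # 10.97 h
print(f"{seconds_to_hours(tj_run):.2f} h")               # 12.78 h
print(f"{percent_longer(tj_run, matlock_run):.1f}% longer")  # 16.5% longer
```

The comparison only covers elapsed time; it does not separate the effect of the operating system from clock speed, CPU load or RAM.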

____________
Greetings from TJ

GPUGRID
Send message
Joined: 12 Dec 11
Posts: 91
Credit: 2,730,095,033
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 31887 - Posted: 7 Aug 2013 | 21:22:21 UTC

I don't see a thread about the current NOELIAs, but they are erroring on all my rigs. Please?

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 31888 - Posted: 7 Aug 2013 | 22:39:32 UTC - in response to Message 31887.

I don't see a thread about the current NOELIAs, but they are erroring on all my rigs. Please?

It's your driver, that's the first thing mentioned. Just like when you call IT with a problem, the first thing they yell is: did you reboot?

By the way my 770 in the AMD finished a Santi that failed on your 770, same driver though.

The last 5 Noelia's I got, finished all without error. I like to take them all...
____________
Greetings from TJ

GPUGRID
Send message
Joined: 12 Dec 11
Posts: 91
Credit: 2,730,095,033
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwat
Message 31890 - Posted: 8 Aug 2013 | 1:21:32 UTC - in response to Message 31888.

I don't see a thread about the current NOELIAs, but they are erroring on all my rigs. Please?

It's your driver, that's the first thing mentioned. Just like when you call IT with a problem, the first thing they yell is: did you reboot?

By the way my 770 in the AMD finished a Santi that failed on your 770, same driver though.

The last 5 Noelia's I got, finished all without error. I like to take them all...

Yes, I reboot all of them at least once a day. Well, then there is a problem with 320.49 and NOELIAs, because it works with the other units. The problem is: no other driver will work with the 3x690 rig, and the one with 2x690+1x770 has issues as well, but at least other drivers will recognize all the GPUs.
Other drivers lead to all sorts of problems, including heat control and stability, things that I can't leave behind.
So NOELIAs are no option for me, sad.

TJ
Send message
Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 31892 - Posted: 8 Aug 2013 | 8:34:06 UTC - in response to Message 31890.

I don't see a thread about the current NOELIAs, but they are erroring on all my rigs. Please?

It's your driver, that's the first thing mentioned. Just like when you call IT with a problem, the first thing they yell is: did you reboot?

By the way my 770 in the AMD finished a Santi that failed on your 770, same driver though.

The last 5 Noelia's I got, finished all without error. I like to take them all...

Yes, I reboot all of them at least once a day. Well, then there is a problem with 320.49 and NOELIAs, because it works with the other units. The problem is: no other driver will work with the 3x690 rig, and the one with 2x690+1x770 has issues as well, but at least other drivers will recognize all the GPUs.
Other drivers lead to all sorts of problems, including heat control and stability, things that I can't leave behind.
So NOELIAs are no option for me, sad.

I don't think the 320.49 driver is the problem. My renewed rig with FX8350, Sabertooth and Asus GTX770 has done 24 WUs, LR and SR, from Noelia, Nathan and Santiago with zero errors since it started crunching. Five cores are doing Rosetta and the temperature is still low with stock coolers.
____________
Greetings from TJ
