acemdlong application 8.14 - discussion


AuthorMessage
5pot

Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 32373 - Posted: 28 Aug 2013, 12:28:52 UTC

Odd error on a long run for my 780:

The simulation has become unstable. Terminating to avoid lock-up

All WUs are currently dry again as well.
GPUGRID

Joined: 12 Dec 11
Posts: 91
Credit: 2,730,095,033
RAC: 0
Level
Phe
Message 32384 - Posted: 28 Aug 2013, 14:52:46 UTC

Running dry of WUs on all machines. No long or short WUs have been split for some time. As others say, it's not a driver issue, despite the new message from the server.
GPUGRID

Joined: 12 Dec 11
Posts: 91
Credit: 2,730,095,033
RAC: 0
Level
Phe
Message 32396 - Posted: 28 Aug 2013, 16:06:10 UTC

They are flowing now, thank you.
Rick A. Sponholz
Joined: 20 Jan 09
Posts: 52
Credit: 2,518,707,115
RAC: 0
Level
Phe
Message 32406 - Posted: 28 Aug 2013, 19:16:44 UTC

OK, I'm getting WUs again, but I give up on which new app names to use in the app_config.xml file. Can anyone help me? Thanks in advance, Rick
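
For anyone with the same question, here is a minimal sketch of generating an app_config.xml with Python's standard library. The app names acemdlong and acemdshort are an assumption taken from the application names used elsewhere in this thread, and the usage values of 1.0 are only illustrative; check the names your own client reports before relying on them.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_names, gpu_usage=1.0, cpu_usage=1.0):
    """Return a BOINC app_config.xml document as a string.

    Each app gets one <gpu_versions> block with the given GPU and CPU
    usage fractions. The values are illustrative, not recommendations.
    """
    root = ET.Element("app_config")
    for name in app_names:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = name
        gpu = ET.SubElement(app, "gpu_versions")
        ET.SubElement(gpu, "gpu_usage").text = str(gpu_usage)
        ET.SubElement(gpu, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# Assumed app names, inferred from this thread rather than confirmed.
APP_NAMES = ["acemdlong", "acemdshort"]

xml_text = build_app_config(APP_NAMES)
print(xml_text)
```

The printed XML would be saved as app_config.xml in the project's folder inside the BOINC data directory; the client has to re-read its config files or be restarted before it takes effect.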
Operator

Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Level
Asn
Message 32414 - Posted: 28 Aug 2013, 19:53:56 UTC - in response to Message 32345.  

Operator, please post your completed time on the long run for your Titan when finished.

Cheers


5pot;

The first 'non-beta' long WU for me was I79R10-NATHAN_KIDKIXc22_6-3-50-RND1517_0 at 19,232.48 seconds (5.34 hrs). http://www.gpugrid.net/workunit.php?wuid=4727415

The second long WU for me was I46R3-NATHAN_KIDKIXc22_6-2-50-RND7378_1 at 19,145.93 seconds (5.318314 hrs). http://www.gpugrid.net/workunit.php?wuid=4723649

W7x64
Dell T3500 12GB
Titan x2 (both EVGA factory OC'd, and on air - limited to 80C)
1100 MHz (ballpark frequency, not fixed), running the 326.84 drivers from the developer site.

No crashes or funny business so far.

Operator
5pot

Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 32415 - Posted: 28 Aug 2013, 20:28:46 UTC

So about another 1k seconds shaved off.

I'm really, really impressed with these cards.
TJ

Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Level
Met
Message 32418 - Posted: 28 Aug 2013, 20:46:37 UTC - in response to Message 32414.  


W7x64
Dell T3500 12GB
Titan x2 (both EVGA factory OC'd, and on air - limited to 80C)
1100 MHz (ballpark frequency, not fixed), running the 326.84 drivers from the developer site.


Hello Operator, you are the one I need to ask something.
What is the PSU in your Dell, and how many GPU power plugs does it have? I have a T7400 with a 1000 W PSU but only two 6-pin plugs and one non-standard 8-pin GPU power plug, so I am very limited with this big box.
Thanks for your answer; it is highly appreciated.

Greetings from TJ
Operator

Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Level
Asn
Message 32428 - Posted: 28 Aug 2013, 23:17:17 UTC - in response to Message 32418.  
Last modified: 28 Aug 2013, 23:17:46 UTC


Hello Operator, you are the one I need to ask something.
What is the PSU in your Dell, and how many GPU power plugs does it have? I have a T7400 with a 1000 W PSU but only two 6-pin plugs and one non-standard 8-pin GPU power plug, so I am very limited with this big box.
Thanks for your answer; it is highly appreciated.


TJ;

I've sent you a PM so as not to crosspost.

Operator
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 32441 - Posted: 29 Aug 2013, 8:41:49 UTC - in response to Message 32370.  

There seems to be a small improvement in the new application but not in the Cuda version.


Well if you have a Titan or a 780 the apps now work! I think that was the point, at least it was for me (since March).

Now... I wish the server would give me more than just one WU at a time since I have TWO GPUs.

Operator


Well, that might have been the point of the new app, but the question and answer that you quoted were about a performance increase for Fermi cards, so they have nothing at all to do with the Titan or 780.
5pot

Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 32512 - Posted: 29 Aug 2013, 23:10:57 UTC

Getting the unknown error number crashes on kid WUs
5pot

Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 32515 - Posted: 30 Aug 2013, 0:30:00 UTC

More have crashed, with the exact same unknown error number. I don't know if this is from the 8.02 (or whatever) app that has been pushed out, but it only recently began doing this.

/sigh
Operator

Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Level
Asn
Message 32519 - Posted: 30 Aug 2013, 4:28:37 UTC

Please can we go back to 8.00 (or maybe 8.01)?

8.02 makes all the Nathans error out now.

It got so messed up I just reset the project.

Was doing fine with the 8.00 on Titans.

Operator
Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 14 Mar 07
Posts: 1958
Credit: 629,356
RAC: 0
Level
Gly
Message 32523 - Posted: 30 Aug 2013, 7:45:40 UTC - in response to Message 32519.  

Operator, 5pot,
looking at the stats, you seem to be two of the three users who have problems with it and did not have them with 8.00. Would it be possible to restart the machine just to make sure?

gdf
Profile MJH

Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 32524 - Posted: 30 Aug 2013, 7:58:56 UTC

8.02 is, in general, doing better than 8.00 but it looks like there's some regression that's affecting a few machines. For now I've reverted acemdlong to 8.00 (which will appear as 8.03 because of a bug in the server). 8.02 will stay on acemdshort for continued testing.

MJH
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 326,008
Level
Trp
Message 32527 - Posted: 30 Aug 2013, 8:20:37 UTC - in response to Message 32512.  

Getting the unknown error number crashes on kid WUs

It isn't the error number which is unknown, it's the plain-English description for it.

Yours got a 0xffffffffc0000005: mine has just died with a 0xffffffffffffff9f, description equally unknown. Task 7221543.
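
As a rough illustration of reading these codes: both values look like negative exit statuses printed as unsigned 64-bit hex. The Python sketch below (helper names are hypothetical) converts them back to signed form. The low 32 bits of the first value, 0xC0000005, match the Windows status code for an access violation; what -97 means to the application is not stated in this thread, so the sketch leaves it unlabeled.

```python
def to_signed64(value: int) -> int:
    """Interpret an unsigned 64-bit value as a signed two's-complement integer."""
    return value - (1 << 64) if value & (1 << 63) else value


def low32(value: int) -> int:
    """Keep only the low 32 bits, dropping the sign extension."""
    return value & 0xFFFFFFFF


# One well-known Windows status code; anything else stays unlabeled here.
KNOWN_STATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def describe(value: int) -> str:
    """Format a raw exit code with its signed value and any known label."""
    signed = to_signed64(value)
    label = KNOWN_STATUS.get(low32(value), "description unknown")
    return (
        f"{value:#018x} -> signed {signed}, "
        f"low 32 bits {low32(value):#010x} ({label})"
    )


# The two codes quoted in the post above.
for code in (0xFFFFFFFFC0000005, 0xFFFFFFFFFFFFFF9F):
    print(describe(code))
```

Seen this way, the second code is simply -97 with sign extension, which is easier to search for than the raw hex string.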
Profile Stoneageman
Joined: 25 May 09
Posts: 224
Credit: 34,057,374,498
RAC: 0
Level
Trp
Message 32528 - Posted: 30 Aug 2013, 8:41:59 UTC

I've not had any issues with 8.00, 8.01 or 8.02 on my 500 & 600 cards, on either Linux or Windows. The only problem now is getting fresh WUs from the server.
HA-SOFT, s.r.o.

Joined: 3 Oct 11
Posts: 100
Credit: 5,879,292,399
RAC: 0
Level
Tyr
Message 32529 - Posted: 30 Aug 2013, 8:47:55 UTC - in response to Message 32528.  

I've not had any issues with 8.00, 8.01 or 8.02 on my 500 & 600 cards, on either Linux or Windows. The only problem now is getting fresh WUs from the server.


Same here on Linux with a 680 and a Titan.
Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 32530 - Posted: 30 Aug 2013, 8:52:05 UTC - in response to Message 32528.  

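Application | Unsent | In progress | Runtime in hours: average (min - max) | Users in last 24 hours
Short runs (2-3 hours on fastest card) | 0 | 801 | 1.95 (0.10 - 9.83) | 501
ACEMD beta version | 4 | 319 | 1.44 (0.01 - 5.58) | 78
Long runs (8-12 hours on fastest card) | 0 | 1,230 | 7.87 (0.27 - 47.95) | 396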

So, there are 4 Betas and that's it?
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help
Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 14 Mar 07
Posts: 1958
Credit: 629,356
RAC: 0
Level
Gly
Message 32532 - Posted: 30 Aug 2013, 8:58:28 UTC - in response to Message 32530.  

Noelia has over 1000 WUs to submit, but we cannot submit them while these problems persist.

For now they are running on beta for testing.

gdf
5pot

Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 32536 - Posted: 30 Aug 2013, 12:42:36 UTC - in response to Message 32527.  

Getting the unknown error number crashes on kid WUs

It isn't the error number which is unknown, it's the plain-English description for it.

Yours got a 0xffffffffc0000005: mine has just died with a 0xffffffffffffff9f, description equally unknown. Task 7221543.


Thanks for the clarification.

©2025 Universitat Pompeu Fabra