New app is out for testing

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Message 28498 - Posted: 13 Feb 2013, 21:44:34 UTC

We have finished beta testing and are now submitting workunits to a new queue for short runs.
If all goes well, we will also update the long queue.

The new app is cuda4.2 only, of course. We will soon disable cuda3.1, as that application is far too old.

gdf

Richard Haselgrove
Message 28499 - Posted: 13 Feb 2013, 22:09:30 UTC

I've got one of these waiting to run, and I noticed it's up to replication _4 already:

http://www.gpugrid.net/workunit.php?wuid=4173049

3 of the previous runs ended with error -9

Anything special you'd like me to watch out for when it runs?

Dagorath
Message 28503 - Posted: 13 Feb 2013, 23:31:16 UTC - in response to Message 28499.  

Put your safety glasses on and watch for smoke?
BOINC <<--- credit whores, pedants, alien hunters

Serious Stuff
Message 28504 - Posted: 14 Feb 2013, 1:26:27 UTC - in response to Message 28498.  

Does this mean that those of us who have only been able to run the cuda 3.1 code are no longer wanted?

dskagcommunity
Message 28505 - Posted: 14 Feb 2013, 6:57:06 UTC

Hm, I'm surprised that cuda31 will finally be disabled after it was specially moved to the short units queue. My GTX 285 can normally do 6 WUs per day :(
DSKAG Austria Research Team: http://www.research.dskag.at



GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Message 28506 - Posted: 14 Feb 2013, 8:26:09 UTC - in response to Message 28505.  

It will always be possible to run on 280s, but with new drivers.
The new application simply cannot be compiled with cuda3.1.

gdf

dskagcommunity
Message 28508 - Posted: 14 Feb 2013, 9:27:24 UTC
Last modified: 14 Feb 2013, 9:30:45 UTC

Possible, but at half the current performance I won't invest >200w/h in 3 short WUs per day ;) But perhaps the new app runs better, so I will test some WUs once the 31 queue is empty and report back ;)

PS: Is it a typo that the site now shows cuda32? Or is this cuda31, or something else?

Richard Haselgrove
Message 28509 - Posted: 14 Feb 2013, 10:45:09 UTC - in response to Message 28503.  

Put your safety glasses on and watch for smoke?

Well, I went to bed and pulled the duvet over my head, which amounts to much the same thing.

Results for host 43404

As you can see, the _4 task completed successfully, as did the subsequent _7. That was the last opportunity to get any science done, according to the "max # of error/total/success tasks 7, 10, 6" policy. And now I've got another _4.
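
As a rough illustration of how such a policy plays out, here is a minimal sketch. It assumes the three numbers are per-workunit limits on error results, total results and successful results, and that a workunit is abandoned once a count exceeds its limit. The names `Limits` and `workunit_state` are hypothetical, and this is not the project's actual server code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    # Values taken from the "7, 10, 6" policy quoted above.
    max_error: int = 7
    max_total: int = 10
    max_success: int = 6


def workunit_state(outcomes, limits=Limits()):
    """Classify a workunit from a list of task outcomes.

    Each outcome is the string "error" or "success".
    Assumption: a limit is breached only when the count exceeds it.
    """
    errors = sum(1 for o in outcomes if o == "error")
    successes = sum(1 for o in outcomes if o == "success")
    if errors > limits.max_error:
        return "too many errors"
    if successes > limits.max_success:
        return "too many successes"
    if len(outcomes) > limits.max_total:
        return "too many results"
    return "open"


# Three failed replications followed by one success: still open.
print(workunit_state(["error"] * 3 + ["success"]))
# Eight failures in a row: the workunit is given up on.
print(workunit_state(["error"] * 8))
```

Under these assumptions, a run of early failures uses up most of a workunit's chances before any host returns a valid result.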

That's a horribly high error rate - are you sure this app was ready for prime time?

While we're here, could we have some thoughts about the naming of the various application types, please? It's very misleading to have two separate (but identically named) filters for short runs, especially when the second one (appid=18) seems to be described as "CUDA 3.2" on the task selection preference page, but jobs from that queue were allocated as cuda42 to my host.

skgiven
Volunteer moderator
Volunteer tester
Message 28510 - Posted: 14 Feb 2013, 11:08:43 UTC - in response to Message 28509.  

Ps: is it a typeerror to see now cuda32 on the site? Or is this cuda31 or something other?
Yes, it should be 3.1, but seeing as it's being deprecated, I wouldn't worry about it now.

skgiven
Volunteer moderator
Volunteer tester
Message 28511 - Posted: 14 Feb 2013, 11:14:15 UTC - in response to Message 28510.  
Last modified: 14 Feb 2013, 11:14:48 UTC

Just watched a task complete and two subsequently fail after 2 seconds.

trypsin_lig_375_run1-NOELIA_RL3_equ-0-1-RND1921_1 4141973 13 Feb 2013 | 9:40:31 UTC 13 Feb 2013 | 10:58:54 UTC Completed and validated 2,033.93 1,484.83 1,500.00 ACEMD beta version v6.48 (cuda42)

trypsin_lig_905_run3-NOELIA_RL3_equ-0-1-RND5342_2 4144209 14 Feb 2013 | 11:03:01 UTC 14 Feb 2013 | 11:03:51 UTC Error while computing 2.07 0.06 --- ACEMD beta version v6.48 (cuda42)

trypsin_lig_905_run2-NOELIA_RL3_equ-0-1-RND6964_2 4144208 14 Feb 2013 | 11:03:01 UTC 14 Feb 2013 | 11:03:51 UTC Error while computing 2.11 0.05 --- ACEMD beta version v6.48 (cuda42)

Stderr output

<core_client_version>7.0.44</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
ERROR: file mdioload.cpp line 207: Error reading parmtop file
called boinc_finish

</stderr_txt>
]]>
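
For anyone collecting many of these reports, the following is a minimal sketch of pulling the exit code and the error line out of a stderr dump like the one above. It uses plain regular expressions because the pasted fragment is not a complete XML document. The function name `summarize_stderr` and the dictionary keys are hypothetical, and the patterns are only assumed to fit output shaped like this sample.

```python
import re

# The stderr fragment exactly as pasted above.
SAMPLE = """<core_client_version>7.0.44</core_client_version>
<![CDATA[
<message>
- exit code 98 (0x62)
</message>
<stderr_txt>
ERROR: file mdioload.cpp line 207: Error reading parmtop file
called boinc_finish

</stderr_txt>
]]>"""


def summarize_stderr(text):
    """Return the exit code and the first ERROR line found in text."""
    summary = {}
    code = re.search(r"exit code (-?\d+) \((0x[0-9a-fA-F]+)\)", text)
    if code:
        summary["exit_code"] = int(code.group(1))
        summary["exit_hex"] = code.group(2)
    err = re.search(r"^ERROR: file (\S+) line (\d+): (.+)$", text, re.MULTILINE)
    if err:
        summary["source_file"] = err.group(1)
        summary["source_line"] = int(err.group(2))
        summary["message"] = err.group(3).strip()
    return summary


print(summarize_stderr(SAMPLE))
```

Grouping failures by `message` makes it easier to tell a bad input file (as here, a parmtop file that could not be read) apart from a crash in the application itself.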

Both tasks that failed had already failed twice before and have not yet been resent:

6459826 30790 14 Feb 2013 | 8:51:55 UTC 14 Feb 2013 | 9:18:53 UTC Error while computing 3.05 0.14 --- ACEMD beta version v6.48 (cuda42)
6503647 126506 14 Feb 2013 | 10:24:54 UTC 14 Feb 2013 | 10:30:32 UTC Error while computing 2.06 0.08 --- ACEMD beta version v6.48 (cuda42)
6503815 139265 14 Feb 2013 | 11:03:01 UTC 14 Feb 2013 | 11:03:51 UTC Error while computing 2.11 0.05 --- ACEMD beta version v6.48 (cuda42)
6503960 --- --- --- Unsent --- --- ---
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Bedrich Hajek
Message 28514 - Posted: 14 Feb 2013, 12:16:55 UTC - in response to Message 28511.  

I had a bunch of failures as well:

http://www.gpugrid.net/workunit.php?wuid=4144270

http://www.gpugrid.net/workunit.php?wuid=4144240

http://www.gpugrid.net/workunit.php?wuid=4144211

http://www.gpugrid.net/workunit.php?wuid=4144208

http://www.gpugrid.net/workunit.php?wuid=4144196

skgiven
Volunteer moderator
Volunteer tester
Message 28515 - Posted: 14 Feb 2013, 14:10:06 UTC - in response to Message 28514.  

These Betas are all failing on my systems, so I've had to suspend any more Beta testing for a while (otherwise I'll stop getting tasks):

trypsin_lig_941_run4-NOELIA_RL3_equ-0-1-RND4515_3 4144364 139265 14 Feb 2013 | 13:17:12 UTC 14 Feb 2013 | 13:19:09 UTC Error while computing 2.07 0.05 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_940_run3-NOELIA_RL3_equ-0-1-RND0852_1 4144359 139265 14 Feb 2013 | 13:17:12 UTC 14 Feb 2013 | 13:19:09 UTC Error while computing 2.07 0.05 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_941_run3-NOELIA_RL3_equ-0-1-RND2477_2 4144363 139859 14 Feb 2013 | 12:10:32 UTC 14 Feb 2013 | 12:16:50 UTC Error while computing 2.35 0.08 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_911_run2-NOELIA_RL3_equ-0-1-RND2760_2 4144232 139265 14 Feb 2013 | 11:45:48 UTC 14 Feb 2013 | 11:47:38 UTC Error while computing 2.11 0.05 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_929_run2-NOELIA_RL3_equ-0-1-RND8942_1 4144310 139859 14 Feb 2013 | 12:22:28 UTC 14 Feb 2013 | 12:28:57 UTC Error while computing 2.26 0.08 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_933_run4-NOELIA_RL3_equ-0-1-RND6668_1 4144329 139859 14 Feb 2013 | 11:59:09 UTC 14 Feb 2013 | 12:04:48 UTC Error while computing 2.29 0.06 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_912_run3-NOELIA_RL3_equ-0-1-RND2352_2 4144238 139859 14 Feb 2013 | 12:16:50 UTC 14 Feb 2013 | 12:22:28 UTC Error while computing 2.24 0.08 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_900_run3-NOELIA_RL3_equ-0-1-RND4793_2 4144189 139265 14 Feb 2013 | 11:45:48 UTC 14 Feb 2013 | 11:47:38 UTC Error while computing 2.06 0.05 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_916_run3-NOELIA_RL3_equ-0-1-RND4035_2 4144255 139859 14 Feb 2013 | 11:46:58 UTC 14 Feb 2013 | 11:52:44 UTC Error while computing 2.21 0.09 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_900_run2-NOELIA_RL3_equ-0-1-RND3255_2 4144188 139859 14 Feb 2013 | 11:41:13 UTC 14 Feb 2013 | 11:46:58 UTC Error while computing 2.20 0.05 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_905_run3-NOELIA_RL3_equ-0-1-RND5342_2 4144209 139265 14 Feb 2013 | 11:03:01 UTC 14 Feb 2013 | 11:03:51 UTC Error while computing 2.07 0.06 --- ACEMD beta version v6.48 (cuda42)
trypsin_lig_905_run2-NOELIA_RL3_equ-0-1-RND6964_2 4144208 139265 14 Feb 2013 | 11:03:01 UTC 14 Feb 2013 | 11:03:51 UTC Error while computing 2.11 0.05 --- ACEMD beta version v6.48 (cuda42)
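
A listing like this can be tallied mechanically. The sketch below assumes each row has the layout seen above (task name, workunit ID, host ID, sent time, reported time, status, run time, CPU time, credit, application); the regular expression and the names `ROW`, `parse_row` and `errors_by_host` are hypothetical and only fit rows shaped like these. Two rows from the list are used as sample input.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed column layout, inferred from the pasted rows.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<workunit>\d+)\s+(?P<host>\d+)\s+"
    r"(?P<sent>\d+ \w+ \d{4} \| [\d:]+ UTC)\s+"
    r"(?P<reported>\d+ \w+ \d{4} \| [\d:]+ UTC)\s+"
    r"(?P<status>.+?)\s+(?P<run>[\d.,]+)\s+(?P<cpu>[\d.,]+)\s+"
    r"(?P<credit>\S+)\s+(?P<app>.+)$"
)
STAMP = "%d %b %Y | %H:%M:%S UTC"


def parse_row(line):
    """Parse one task row into a dict, or return None if it does not fit."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    row = m.groupdict()
    row["run"] = float(row["run"].replace(",", ""))
    row["cpu"] = float(row["cpu"].replace(",", ""))
    sent = datetime.strptime(row["sent"], STAMP)
    reported = datetime.strptime(row["reported"], STAMP)
    # Seconds between the task being sent and its result being reported.
    row["turnaround_s"] = (reported - sent).total_seconds()
    return row


ROWS = [
    "trypsin_lig_941_run4-NOELIA_RL3_equ-0-1-RND4515_3 4144364 139265 "
    "14 Feb 2013 | 13:17:12 UTC 14 Feb 2013 | 13:19:09 UTC "
    "Error while computing 2.07 0.05 --- ACEMD beta version v6.48 (cuda42)",
    "trypsin_lig_941_run3-NOELIA_RL3_equ-0-1-RND2477_2 4144363 139859 "
    "14 Feb 2013 | 12:10:32 UTC 14 Feb 2013 | 12:16:50 UTC "
    "Error while computing 2.35 0.08 --- ACEMD beta version v6.48 (cuda42)",
]

parsed = [r for r in map(parse_row, ROWS) if r is not None]
errors_by_host = Counter(
    r["host"] for r in parsed if r["status"] == "Error while computing"
)
print(errors_by_host)
```

Counting errors per host this way helps separate a problem with one machine from a problem with the tasks themselves, which is the more likely reading when two different hosts fail in the same way.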

I would suggest that anyone else seeing numerous errors stop running the Betas for a while. Stick to the Long and/or Short tasks, and after you complete a few, try the odd Beta again.

Richard Haselgrove
Message 28516 - Posted: 14 Feb 2013, 15:13:00 UTC - in response to Message 28515.  

Tried a few as confirmation, with the same result - 12 errors in a row.

Beta tasks for host 132158

But it must be a data error - you can see the host has over 100 valid tasks, all done last weekend after the call went out to clear the queue so that proper application testing could resume.

At least these tasks weren't of the crashing/BSODing kind.

Stoneageman
Message 28519 - Posted: 14 Feb 2013, 17:30:02 UTC

Thought I'd dip a toe back into the Beta testing pool, but I'm getting 'No beta tasks available'. Is it Windows only?

skgiven
Volunteer moderator
Volunteer tester
Message 28520 - Posted: 14 Feb 2013, 17:42:30 UTC - in response to Message 28519.  

I think it is Windows only.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Message 28522 - Posted: 14 Feb 2013, 20:08:24 UTC - in response to Message 28520.  
Last modified: 14 Feb 2013, 20:09:31 UTC

Hi, a subset of the betas did indeed have a problem that made them fail immediately. We devised a way to selectively remove single unsent tasks and cancelled them, so many should have disappeared from the queue; those already downloaded will disappear gradually.

skgiven
Volunteer moderator
Volunteer tester
Message 28524 - Posted: 15 Feb 2013, 1:11:08 UTC - in response to Message 28522.  

By 'disappear gradually' I presume you mean they will fail, get resent, fail, get resent, fail and then be cancelled. If it weren't for the stubborn scheduler, the 2 sec runtime wouldn't be such an issue.

Anyway, I've been running a few again and they are not failing. However the other issues persist. Of note is the dependence on high CPU Kernel time. At 85% CPU usage I was seeing 10% GPU usage, and on another system with only 50% CPU usage (but high Kernel usage) I only saw 2% GPU utilization. Another app was hogging the Kernel and memory, and GPU Utilization went up to 50% when I suspended it.

skgiven
Volunteer moderator
Volunteer tester
Message 28526 - Posted: 15 Feb 2013, 12:16:40 UTC - in response to Message 28524.  

trypsin_lig_901_run1-NOELIA_RL3_equ-0-1-RND1273_7

errors: Too many errors (may have bug)

All the same 2-second errors:
http://www.gpugrid.net/workunit.php?wuid=4144191

dskagcommunity
Message 28541 - Posted: 16 Feb 2013, 10:32:30 UTC

As of today there is only 1 user left who has connected to the short cuda31 queue in the last 24 hours (server stats). I'm proud to say that I'm this lonely guy ;) So I need at least 3 more days (crunching 24 h a day) to clear this queue (~4 hours per WU). This is just a rough estimate of when the admin staff can deactivate it; the problems with queue selection on some computers should go away then ;)

dskagcommunity
Message 28543 - Posted: 16 Feb 2013, 18:12:56 UTC
Last modified: 16 Feb 2013, 18:21:17 UTC

Hmm, OK, GPUGRID doesn't send me any more tasks from the cuda31 queue. Strange. Who is supposed to compute them now? O.o

©2025 Universitat Pompeu Fabra