New systems in Long queue

noelia

Message 27733 - Posted: 19 Dec 2012, 16:41:25 UTC

Hi all!

A good number of new WUs will be available in the long queue over the next few weeks. The systems are called hfXA_long and will provide around 90,000 credits each.

Thanks and Merry Christmas to all!!
Noelia

TheFiend
Message 27737 - Posted: 19 Dec 2012, 19:14:34 UTC

And a Merry Xmas to you and all the team at GPUGRID.

Bedrich Hajek
Message 27742 - Posted: 19 Dec 2012, 22:43:06 UTC - in response to Message 27733.  
Last modified: 19 Dec 2012, 22:50:45 UTC

My computers just downloaded a couple of them. They seem to be very long (it looks like about 14 to 16 hours before they finish on my computers). GPU usage is 90%+ on both Windows 7 and XP. I hope the upload files aren't too big, so I don't get an error for that.

What exactly are we crunching with these WUs, in layman's terms please?

Merry Christmas!

dskagcommunity
Message 27743 - Posted: 19 Dec 2012, 23:13:03 UTC - in response to Message 27742.  
Last modified: 19 Dec 2012, 23:42:43 UTC

I hope the upload files aren't too big, so I don't get an error for that.

Merry Christmas!


I hope so too. Shortly before Christmas I have a speed issue with the 3G connection of my main BOINC cluster: 2-6 KB/s of shared upload overall isn't much :/

But yes, 99% GPU load is great :)

Merry Christmas to you too!
DSKAG Austria Research Team: http://www.research.dskag.at




Bedrich Hajek
Message 27744 - Posted: 20 Dec 2012, 2:03:29 UTC

I hope I am wrong on this, but it looks like the output file for these work units is going to be larger than 128 MB. After one of the units was 25% done, the 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_4 file was about 35 MB in size; at 33% done it was about 47 MB. If this projection holds, the file will be about 140 MB when the unit is 100% done, which is too large to upload unless you raise the size limit for upload files. I hate to see an otherwise successfully completed unit error out like this.
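The projection above is a plain linear extrapolation, which can be sketched as follows. This is only an illustration: it assumes the output file grows in proportion to progress, and the function name `project_final_size_mb` is made up for this example. The 128 MB figure is the limit from the event log, taken in decimal megabytes.

```python
def project_final_size_mb(progress_fraction, current_size_mb):
    """Linearly extrapolate the final file size from one partial measurement."""
    if not 0 < progress_fraction <= 1:
        raise ValueError("progress_fraction must be in (0, 1]")
    return current_size_mb / progress_fraction


# Upload limit from the event log (128000000 bytes), in decimal MB.
LIMIT_MB = 128.0

# (fraction done, observed size in MB) as reported in the post.
samples = [(0.25, 35.0), (0.33, 47.0)]
projections = [project_final_size_mb(p, s) for p, s in samples]

for (p, s), estimate in zip(samples, projections):
    verdict = "over" if estimate > LIMIT_MB else "within"
    print(f"{p:.0%} done, {s:.0f} MB so far -> about {estimate:.0f} MB final ({verdict} the limit)")
```

Both measurements point to a final size above the limit, so the estimate does not hinge on a single reading.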

This error occurred before:

5/8/2012 3:26:52 AM | GPUGRID | Computation for task 1H46_11_8-PAOLA_RNP-0-5-RND3163_0 finished
5/8/2012 3:26:52 AM | GPUGRID | Output file 1H46_11_8-PAOLA_RNP-0-5-RND3163_0_4 for task 1H46_11_8-PAOLA_RNP-0-5-RND3163_0 exceeds size limit.
5/8/2012 3:26:52 AM | GPUGRID | File size: 131283476.000000 bytes. Limit: 128000000.000000 byte

http://www.gpugrid.net/forum_thread.php?id=2970#24795

But you still have an opportunity to correct this.


flashawk
Message 27745 - Posted: 20 Dec 2012, 3:32:51 UTC - in response to Message 27744.  

I'm uploading one right now; it's 109.95 MB and it took 8 hours 15 minutes to complete.

werdwerdus
Message 27747 - Posted: 20 Dec 2012, 4:46:51 UTC
Last modified: 20 Dec 2012, 4:48:03 UTC

Yep, they get compressed before being uploaded, right? With gzip or something. I have 2 uploading right now. They took 9:22 and 9:09 on two GTX 660 Ti GPUs. 109.95 MB each. 97-98% GPU utilization on Windows XP.
XtremeSystems.org - #1 Team in GPUGrid

TheFiend
Message 27748 - Posted: 20 Dec 2012, 8:12:47 UTC

I've had one error out on one of my GTX 670s :(

11 hours.. :(



Name: 1x22_18-NOELIA_hfXA_long-0-2-RND2038_0
Workunit: 3964872
Created: 19 Dec 2012, 15:06:36 UTC
Sent: 19 Dec 2012, 18:43:06 UTC
Received: 20 Dec 2012, 7:46:04 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 0 (0x0)
Computer ID: 109019
Report deadline: 24 Dec 2012, 18:43:06 UTC
Run time: 39,198.89
CPU time: 38,715.33
Validate state: Invalid
Credit: 0.00
Application version: Long runs (8-12 hours on fastest card) v6.16 (cuda42)
Stderr output:

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<stderr_txt>
MDIO: cannot open file "restart.coor"
# Time per step (avg over 6250000 steps): 6.274 ms
# Approximate elapsed time for entire WU: 39213.953 s
called boinc_finish

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>1x22_18-NOELIA_hfXA_long-0-2-RND2038_0_4</file_name>
<error_code>-131</error_code>
</file_xfer_error>

</message>
]]>
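For anyone comparing several failed tasks, the file name and error code can be pulled out of the stderr text with a short script. This is a rough sketch, not a BOINC tool: the helper `extract_upload_errors` is hypothetical, and the mapping of -131 to `ERR_FILE_TOO_BIG` is an assumption from memory of BOINC's error number list that should be checked against the client source.

```python
import re

# Excerpt of the stderr output shown above.
STDERR_SNIPPET = """
<message>
upload failure: <file_xfer_error>
<file_name>1x22_18-NOELIA_hfXA_long-0-2-RND2038_0_4</file_name>
<error_code>-131</error_code>
</file_xfer_error>
</message>
"""

# Assumption: -131 is BOINC's "file too big" code; verify before relying on it.
KNOWN_CODES = {-131: "ERR_FILE_TOO_BIG"}

ERROR_RE = re.compile(
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>"
)


def extract_upload_errors(text):
    """Return (file name, numeric error code) pairs found in the stderr text."""
    return [(m.group("name"), int(m.group("code"))) for m in ERROR_RE.finditer(text)]


errors = extract_upload_errors(STDERR_SNIPPET)
for name, code in errors:
    print(name, code, KNOWN_CODES.get(code, "unknown code"))
```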


Had another complete in 9 hours.

Gattorantolo [Ticino]
Message 27749 - Posted: 20 Dec 2012, 9:07:23 UTC - in response to Message 27748.  
Last modified: 20 Dec 2012, 9:08:27 UTC

Me too, after 10 hours... the task had already finished :-((( GTX 680
What's the problem?
Member of Boinc Italy.

Gattorantolo [Ticino]
Message 27751 - Posted: 20 Dec 2012, 10:48:18 UTC - in response to Message 27749.  
Last modified: 20 Dec 2012, 10:49:10 UTC

A second error after 11 hours... I'm stopping the "long run" tasks... I'm now crunching standard ACEMD!!!!!
Member of Boinc Italy.

GDF (Project administrator)
Message 27752 - Posted: 20 Dec 2012, 11:08:57 UTC - in response to Message 27751.  

I have increased the upload size limit to 200 MB now, but this will take effect only for new results. I guess the files are right at the borderline of 128 MB (the previous limit), so it depends on the compression.

It's our mistake, as the internal test did not complain, but now it's a bit of a problem to cancel them.

gdf

Bedrich Hajek
Message 27753 - Posted: 20 Dec 2012, 11:22:12 UTC - in response to Message 27744.  

Here is the reason why they failed, from my event log:

12/20/2012 6:17:55 AM | GPUGRID | Computation for task 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1 finished
12/20/2012 6:17:55 AM | GPUGRID | Output file 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_4 for task 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1 exceeds size limit.
12/20/2012 6:17:55 AM | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes
12/20/2012 6:18:08 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_0
12/20/2012 6:18:08 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_1
12/20/2012 6:18:16 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_0
12/20/2012 6:18:16 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_2
12/20/2012 6:18:45 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_1
12/20/2012 6:18:45 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_3
12/20/2012 6:19:03 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_2
12/20/2012 6:19:03 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_7
12/20/2012 6:19:04 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_3
12/20/2012 6:19:05 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_7
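To check how far over the limit a file is, the sizes can be read straight out of the event log line. A minimal sketch, assuming the log format shown above; the helper name `parse_size_limit` is made up for this example, and treating the announced 200 MB limit as 200,000,000 bytes is an assumption about how the server counts megabytes.

```python
import re

LOG_LINE = (
    "12/20/2012 6:17:55 AM | GPUGRID | File size: 144107276.000000 bytes. "
    "Limit: 128000000.000000 bytes"
)

# "bytes?" also matches the older log line that ends in "byte".
SIZE_RE = re.compile(
    r"File size: (?P<size>\d+(?:\.\d+)?) bytes?\. "
    r"Limit: (?P<limit>\d+(?:\.\d+)?) bytes?"
)


def parse_size_limit(line):
    """Return (file size, limit) in whole bytes, or None if the line does not match."""
    m = SIZE_RE.search(line)
    if m is None:
        return None
    return int(float(m.group("size"))), int(float(m.group("limit")))


size, limit = parse_size_limit(LOG_LINE)
overshoot = size - limit

# Assumption: the raised limit of 200 MB is counted as 200,000,000 bytes.
RAISED_LIMIT = 200_000_000

print(f"file is {overshoot} bytes ({size / limit - 1:.1%}) over the old limit")
print("fits under the raised limit:", size <= RAISED_LIMIT)
```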

Guys, you're supposed to learn from your mistakes, not repeat them.


GDF (Project administrator)
Message 27754 - Posted: 20 Dec 2012, 11:29:36 UTC - in response to Message 27753.  
Last modified: 20 Dec 2012, 11:30:44 UTC

Ok.
We have checked in the DB. There are 87 failures like this and 1800 successes for this batch.
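As a quick sanity check on those numbers, the failure rate for the batch so far works out as below. This only restates the two counts quoted above and assumes they cover every returned result.

```python
failures = 87
successes = 1800

total = failures + successes
failure_rate = failures / total

print(f"{failures} of {total} returned results failed ({failure_rate:.1%})")
```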

As I said, the problem is that the submission script did not pick it up.

All new results will have a limit of 256MB.

gdf

Starting with the new application in January, we expect to upload much smaller files, about 1/3 of the current size.

Bedrich Hajek
Message 27755 - Posted: 20 Dec 2012, 12:18:22 UTC - in response to Message 27754.  

Well, it happened again.

12/20/2012 6:58:25 AM | GPUGRID | Output file 1x17_3-NOELIA_hfXA_long-0-2-RND7641_0_4 for task 1x17_3-NOELIA_hfXA_long-0-2-RND7641_0 exceeds size limit.
12/20/2012 6:58:25 AM | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes
12/20/2012 6:58:36 AM | GPUGRID | Starting task 10x12_4-NOELIA_hfXA_long-0-2-RND0606_1 using acemdlong version 616 (cuda42) in slot 2


http://www.gpugrid.net/result.php?resultid=6222298

I have three more of these units crunching right now. I hope this doesn't happen again.


Gattorantolo [Ticino]
Message 27756 - Posted: 20 Dec 2012, 13:00:18 UTC - in response to Message 27755.  

What do we have to do? 23 hours of GPU work for nothing... :-(
Member of Boinc Italy.

GDF (Project administrator)
Message 27757 - Posted: 20 Dec 2012, 13:53:06 UTC - in response to Message 27756.  

Can you manually increase the limit or at least see how much it is?

Is your version of the BOINC client compressing the files? I don't understand why it works for some and not for a few others.

gdf

TheFiend
Message 27761 - Posted: 20 Dec 2012, 15:31:28 UTC

I've just had another one fail at upload!!! :(

[AF>Belgique] bill1170
Message 27762 - Posted: 20 Dec 2012, 15:34:55 UTC - in response to Message 27757.  

Same problem
"20/12/2012 12:40:09 | GPUGRID | Output file 1x18_5-NOELIA_hfXA_long-0-2-RND1979_0_4 for task 1x18_5-NOELIA_hfXA_long-0-2-RND1979_0 exceeds size limit.
20/12/2012 12:40:09 | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes"

with this one :
http://www.gpugrid.net/workunit.php?wuid=3964747

on a GTX 660 Ti, 32-bit XP, BOINC 7.0.28

Retvari Zoltan
Message 27763 - Posted: 20 Dec 2012, 15:36:58 UTC

This is very very bad.
I have only one successful NOELIA_hfXA_long task (the first one) and 9 failures; 8 of the failures are because the upload limit was exceeded.
Maybe the BOINC manager (on Windows) itself has this 128 MB upload limit, so it can't be fixed on the server side: since the increase I've got this error again.

dskagcommunity
Message 27764 - Posted: 20 Dec 2012, 16:29:22 UTC

Oh man... it's hard knowing that all the fresh WUs will error out too in 14-18 hours and I can't abort them all... "50M by the end of the year", bye-bye -_-
DSKAG Austria Research Team: http://www.research.dskag.at


