New workunits

rod4x4

Message 53197 - Posted: 29 Nov 2019, 0:15:26 UTC - in response to Message 53195.  
Last modified: 29 Nov 2019, 0:18:28 UTC

Ok, on point 1, it was set for 360 already because that's a good time for LHC ATLAS to run to completion. I moved it up to 480 now to try and deal with this stuff in GPUGRID.

As your GPU is taking 728 minutes to complete the current batch of tasks, this setting needs to be MORE than 728 to have a positive effect. Times that suit other projects don't suit GPUGRID's requirements, as tasks here can be longer.
Greg _BE

Message 53205 - Posted: 29 Nov 2019, 14:31:21 UTC - in response to Message 53197.  
Last modified: 29 Nov 2019, 14:34:45 UTC

Ok, on point 1, it was set for 360 already because that's a good time for LHC ATLAS to run to completion. I moved it up to 480 now to try and deal with this stuff in GPUGRID.

As your GPU is taking 728 minutes to complete the current batch of tasks, this setting needs to be MORE than 728 to have a positive effect. Times that suit other projects don't suit GPUGRID's requirements, as tasks here can be longer.


Oh? That's interesting. Changed to 750 minutes.
Greg _BE

Message 53224 - Posted: 30 Nov 2019, 17:11:11 UTC

Just suffered a DPC_WATCHDOG_VIOLATION on my system. Will be offline for a few days.
Retvari Zoltan

Message 53225 - Posted: 30 Nov 2019, 20:04:53 UTC - in response to Message 53193.  
Last modified: 30 Nov 2019, 20:14:23 UTC

# Engine failed: Particle coordinate is nan

this usually indicates mathematical errors in the operations performed, memory corruption, or similar (or a faulty wu, unlikely in this case). Maybe a reboot will solve it.
These workunits have failed on all 8 hosts with this error condition:
initial_1923-ELISA_GSN4V1-12-100-RND5980
initial_1086-ELISA_GSN0V1-2-100-RND9613
Perhaps these workunits inherited a NaN (=Not a Number) from their previous stage.
I don't think this could be solved by a reboot.
I'm eagerly waiting to see how many batches will survive through all the 100 stages.
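
To illustrate why a reboot would not help if a NaN was inherited from the previous stage, here is a minimal sketch. It is not the project's code: `advance` is a made-up toy "stage" and the coordinates are hypothetical. It only shows the general floating-point behaviour that any arithmetic on a NaN yields NaN again, so a bad value in the input stays bad on every host.

```python
import math


def advance(coords, step=0.5):
    """Toy stand-in for one simulation stage: shift every coordinate."""
    return [x + step for x in coords]


# Hypothetical restart coordinates; the second value is already NaN,
# as if it had been written out by the previous stage.
stage_input = [1.0, float("nan"), 3.0]

coords = stage_input
for stage in range(3):
    coords = advance(coords)

# The NaN survives every stage, regardless of which machine runs it.
has_nan = any(math.isnan(x) for x in coords)
print(has_nan)
```

If that is what happened here, only regenerating the input from an earlier good stage would fix the workunit.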
Bedrich Hajek

Message 53290 - Posted: 6 Dec 2019, 23:38:21 UTC

I ran the following unit:

1_7-GERARD_pocket_discovery_d89241c4_7afa_4928_b469_bad3dc186521-0-2-RND1542_1, which ran well and would have finished as valid if the following error had not occurred:

</stderr_txt>
<message>
upload failure: <file_xfer_error>
<file_name>1_7-GERARD_pocket_discovery_d89241c4_7afa_4928_b469_bad3dc186521-0-2-RND1542_1_9</file_name>
<error_code>-131 (file size too big)</error_code>
</file_xfer_error>
</message>
]]>

Scroll to the bottom of this page:

http://www.gpugrid.net/result.php?resultid=21553962


It looks like you need to increase the size limits of the output files for it to upload. This should be done for all subsequent WUs.





Keith Myers

Message 53291 - Posted: 7 Dec 2019, 6:12:34 UTC - in response to Message 53290.  

I must have squeaked in under the wire by just this much with this GERARD_pocket_discovery task.
https://www.gpugrid.net/result.php?resultid=21551650
Bedrich Hajek

Send message
Joined: 28 Mar 09
Posts: 490
Credit: 11,731,645,728
RAC: 69
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 53293 - Posted: 7 Dec 2019, 12:45:41 UTC - in response to Message 53291.  

I must have squeaked in under the wire by just this much with this GERARD_pocket_discovery task.
https://www.gpugrid.net/result.php?resultid=21551650



Apparently, these units vary in length. Here is another one with the same problem:

http://www.gpugrid.net/workunit.php?wuid=16894092



Richard Haselgrove

Send message
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 428
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 53294 - Posted: 7 Dec 2019, 17:11:49 UTC
Last modified: 7 Dec 2019, 17:20:36 UTC

I've got one running from 1_5-GERARD_pocket_discovery_d89241c4_7afa_4928_b469_bad3dc186521-0-2-RND2573 - I'll try to catch some figures to see how bad the problem is.

Edit - the _9 upload file (the one named in previous error messages) is set to allow

<max_nbytes>256000000.000000</max_nbytes>

or 256,000,000 bytes. You'd have thought that was enough.
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 53295 - Posted: 7 Dec 2019, 17:47:56 UTC - in response to Message 53294.  

The 256 MB is the new limit - I raised it today. There are only a handful of WUs like that.
Richard Haselgrove

Send message
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 428
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 53301 - Posted: 7 Dec 2019, 23:16:16 UTC

I put precautions in place, but you beat me to it - final file size was 155,265,144 bytes. Plenty of room. Uploading now.
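
For anyone who wants to check the arithmetic, a small sketch using only the two numbers quoted in this thread: the `<max_nbytes>` line and the final file size. The comparison `final_size <= max_nbytes` is an assumption about how the limit is applied, not something confirmed here.

```python
import xml.etree.ElementTree as ET

# Limit line as quoted earlier in the thread.
limit_tag = "<max_nbytes>256000000.000000</max_nbytes>"
# Final size of the upload file as reported above, in bytes.
final_size = 155_265_144

# The value is written as a float string, so convert via float first.
max_nbytes = int(float(ET.fromstring(limit_tag).text))

headroom = max_nbytes - final_size
fits = final_size <= max_nbytes  # assumed form of the size check

print(f"limit={max_nbytes} size={final_size} headroom={headroom} fits={fits}")
```

So this particular file left roughly 100 MB to spare under the new limit.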
Erich56

Send message
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwat
Message 53303 - Posted: 8 Dec 2019, 5:54:33 UTC

What I also noticed with the GERARD tasks (currently running 0_2-GERARD_pocket_discovery ...):

The GPU utilization oscillates between 76% and 95% (in contrast to the ELISA tasks, where it was permanently close to or even at 100%).
God is Love, JC proves it. I t...

Message 53316 - Posted: 9 Dec 2019, 20:12:48 UTC - in response to Message 53295.  
Last modified: 9 Dec 2019, 20:16:22 UTC

I am getting upload errors too, on most but not all (4 of 6) WUs...
but, only on my 950M, not on my 1660 Ti, ... or EVEN my GeForce 640 !!

need to increase the size limits of the output files

So, how is this done?
Under Options, Computing preferences, Network, the default values are not shown (that I can see). I WOULD have assumed that BOINC Manager leaves these limited only by system constraints unless tighter limits are desired.
AND, only download rate, upload rate, and usage limits can be set.
Again, how should output file size limits be increased?

It would have been VERY polite of GpuGrid to post some notice about this with the new WU releases.

I am very miffed, and justifiably so, at having wasted so much of my GPU time and energy, and effort on my part to hunt down the problem. Indeed, there was NO feedback from GpuGrid on this at all; I only noticed that my RAC kept falling even though I was running WUs pretty much nonstop.

I realize that getting research done is the primary goal, but if GpuGrid is asking people to donate their PC time and GPU time, then please be more polite to your donors.

LLP, PhD
Keith Myers

Message 53317 - Posted: 9 Dec 2019, 21:38:27 UTC

You can't control the result output file. That is set by the science application under control of the project administrators. The quote you referenced was from Toni acknowledging that he needed to increase the size of the upload server input buffer to handle the larger result files that a few tasks were producing. That is not the norm for the work we have processed so far; it should be rare for result files to exceed 250MB.
Richard Haselgrove

Send message
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 428
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 53320 - Posted: 9 Dec 2019, 23:39:40 UTC

Neither of those two. The maximum file size is specified in the job specification associated with the task in question. You can (as I did) increase the maximum size by careful editing of the file 'client_state.xml', but it needs a steady hand, some knowledge, and is not for the faint of heart. It shouldn't be needed now, after Toni's correction at source.
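
For the curious, a read-only sketch of how one might locate such a limit before editing anything by hand. The layout of 'client_state.xml' is not shown in this thread, so the fragment below is a hypothetical stand-in: the `<file>` wrapper, the `<name>` element and the file names are assumptions for illustration, and only the `<max_nbytes>` tag and its value come from the posts above. The helper `find_limits` is likewise made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for part of client_state.xml; the real file's
# element names and nesting may differ.
SAMPLE = """
<client_state>
  <file>
    <name>1_5-GERARD_example_9</name>
    <max_nbytes>256000000.000000</max_nbytes>
  </file>
  <file>
    <name>1_5-GERARD_example_0</name>
    <max_nbytes>1000000.000000</max_nbytes>
  </file>
</client_state>
"""


def find_limits(xml_text, suffix):
    """Return {file name: max_nbytes} for entries whose name ends with suffix."""
    root = ET.fromstring(xml_text)
    limits = {}
    for elem in root.iter():
        # Only elements with both a <name> and a <max_nbytes> child qualify.
        name = elem.findtext("name")
        limit = elem.findtext("max_nbytes")
        if name and limit and name.endswith(suffix):
            limits[name] = int(float(limit))
    return limits


limits = find_limits(SAMPLE, "_9")
print(limits)
```

This only reads an in-memory string; it does not touch the real file.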
God is Love, JC proves it. I t...

Message 53321 - Posted: 9 Dec 2019, 23:59:42 UTC - in response to Message 53317.  
Last modified: 10 Dec 2019, 0:03:52 UTC

Hm,
Toni's message (53295) was posted on the 7th. Toni used the past tense on the 7th ("I raised");
yet, https://gpugrid.net/result.php?resultid=21553648
ended on the 8th and still had the same frustrating error.
After running for hours, the results were nonetheless lost:
upload failure: <file_xfer_error>
<file_name>initial_1497-ELISA_GSN4V1-20-100-RND8978_0_0</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>

Also, I must be just extremely unlucky. Toni says this came up on 'only a handful' of WUs, yet this happened to at least five of the WUs my GPUs ran.

I am holding off on running any GpuGrid WUs for a while, until this problem is more fully corrected.

Just for full disclosure... Industrial Engineers hate waste.

LLP
MS and PhD in Industrial & Systems Engineering.
Registered Prof. Engr. (Industrial Engineering)
God is Love, JC proves it. I t...

Message 53322 - Posted: 10 Dec 2019, 0:15:47 UTC
Last modified: 10 Dec 2019, 0:18:53 UTC

Besides the upload errors,
a couple, resultid=21544426 and resultid=21532174, had said:
"Detected memory leaks!"
So I ran extensive memory diagnostics, but no errors were reported by windoze (extensive as in some eight hours of diagnostics).
Boinc did not indicate if this was RAM or GPU 'memory leaks'

In fact, now I am wondering whether these 'memory leaks' were on my end at all, or on the GpuGrid servers...

LLP
I think ∴ I THINK I am
My thinking neither is the source of my being
NOR proves it to you
God Is Love, Jesus proves it! ∴ we are
Richard Haselgrove

Send message
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 428
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 53325 - Posted: 10 Dec 2019, 8:26:48 UTC - in response to Message 53321.  

Hm,
Toni's message (53295) was posted on the 7th. Toni used the past tense on the 7th ("I raised");
yet, https://gpugrid.net/result.php?resultid=21553648
ended on the 8th and still had the same frustrating error.
After running for hours, the results were nonetheless lost:
upload failure: <file_xfer_error>
<file_name>initial_1497-ELISA_GSN4V1-20-100-RND8978_0_0</file_name>
<error_code>-240 (stat() failed)</error_code>
</file_xfer_error>

That's a different error. Toni's post was about a file size error.
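
To make the difference concrete, a small sketch that pulls the numeric code and description out of the two `<error_code>` lines quoted in this thread. The regular expression and the helper `parse_error` are made up for this example; the two input strings are copied from the posts above.

```python
import re

# Matches e.g. "<error_code>-131 (file size too big)</error_code>".
ERROR_RE = re.compile(r"<error_code>(-?\d+) \((.*?)\)</error_code>")

size_error = "<error_code>-131 (file size too big)</error_code>"
stat_error = "<error_code>-240 (stat() failed)</error_code>"


def parse_error(fragment):
    """Return (code, description) from an <error_code> line, or None."""
    match = ERROR_RE.search(fragment)
    if match is None:
        return None
    return int(match.group(1)), match.group(2)


for fragment in (size_error, stat_error):
    print(parse_error(fragment))
```

The first is the size-limit failure that the raised limit addresses; the second is a separate failure with its own code.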
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 53326 - Posted: 10 Dec 2019, 9:24:58 UTC - in response to Message 53322.  

Besides the upload errors,
a couple, resultid=21544426 and resultid=21532174, had said:
"Detected memory leaks!"
So I ran extensive memory diagnostics, but no errors were reported by windoze (extensive as in some eight hours of diagnostics).
Boinc did not indicate if this was RAM or GPU 'memory leaks'

In fact, now I am wondering whether these 'memory leaks' were on my end at all, or on the GpuGrid servers...

LLP



Such messages are always present on Windows. They are not related to whether the task terminated successfully. If an error message is present, it's elsewhere in the output.
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 53327 - Posted: 10 Dec 2019, 9:26:15 UTC - in response to Message 53326.  
Last modified: 10 Dec 2019, 9:27:26 UTC

Also, slow and mobile cards should not be used for crunching for the reasons you mention.
Gustav

Message 53328 - Posted: 10 Dec 2019, 9:46:06 UTC

Hi,

I have not received any new WUs in about 30-40 days.

Why? Are there no WUs available for anyone, or could it be bad settings on my side?

My PCs are starving...


Br Thomas

©2025 Universitat Pompeu Fabra