new ADRIA_KIXcMyb_HIP_bandit workunits
| Author | Message |
|---|---|
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
I saw a bunch of these went out today. Even more intense than the D3RBandit units. Runtime/computation of these new units looks to be about 20% longer.

- ~12 hrs for an RTX 2080 Ti (225W, 1780MHz)
- ~15-16 hrs for an RTX 2080 (185W, 1850MHz)
- ~20 hrs for an RTX 2070 (150W, 1780MHz)
- ~31 hrs for a GTX 1660 Super (125W, 1900MHz)
|
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
Houston, we have a problem. My first round of tasks ran to completion, but all hit a computation error right after hitting 100%:

<error_code>-131 (file size too big)</error_code>

I expect everyone's will fail the same way. This is a project-side issue. That's a lot of wasted time for no results and no credit.

Edit: I've notified Toni, but it probably won't get fixed before hundreds of tasks have failed and been resent :(.
|
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
Well that sucks. I got a bunch of them also. I have about an hour to go on one of them to see if it too bombs out with too big an upload. |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
Yes, the same problem: a "file size too big" upload error. |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
> My first round of tasks ran to completion, but all hit a computation error right after hitting 100%

Thank you very much for sharing this issue. In brief: I have several of these tasks currently running on my hosts. The first to finish is running on a GTX 1660 Ti GPU under Linux: https://www.gpugrid.net/result.php?resultid=32622893

- I've stopped BOINC activity in BOINC Manager.
- I've killed all BOINC processes.
- As administrator, I've edited the file /var/lib/boinc-client/client_state.xml
- I've edited all the instances (21) of the <max_nbytes> parameter for the mentioned task, adding a leading 10 to the existing value.
- I've restarted the boinc-client service, and activity in BOINC Manager.

This task is estimated to finish in about 10 hours. Then I'll report whether this bypass has worked. |
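The manual edit described in the post above could be scripted. What follows is only a minimal Python sketch, not a tested tool. It assumes that each file block in client_state.xml lists its `<name>` element before its `<max_nbytes>` element; the function name `bump_max_nbytes` and its parameters are hypothetical. It mirrors the poster's approach of adding a leading "10" to the existing value, for every file whose name contains the given task name.

```python
import re

# Assumption: each file block has <name>...</name> before <max_nbytes>...</max_nbytes>.
NAME_RE = re.compile(r"<name>([^<]*)</name>")
MAX_RE = re.compile(r"(<max_nbytes>)([^<]*)(</max_nbytes>)")


def bump_max_nbytes(state_text, task_name, prefix="10"):
    """Return (new_text, count): prefix every <max_nbytes> value that follows
    a <name> containing task_name. Hypothetical helper, for illustration only."""
    out = []
    last_name = ""
    count = 0
    for line in state_text.splitlines(keepends=True):
        name_match = NAME_RE.search(line)
        if name_match:
            # Remember the most recently seen file/task name.
            last_name = name_match.group(1)
        if task_name in last_name:
            # Prepend the prefix to the existing value, as in the manual edit.
            line, n = MAX_RE.subn(
                lambda m: m.group(1) + prefix + m.group(2) + m.group(3), line
            )
            count += n
        out.append(line)
    return "".join(out), count
```

As with the manual procedure, the BOINC client would need to be stopped first. The caller would read client_state.xml into a string, keep a backup copy, call `bump_max_nbytes(text, "<task name>")`, and write the result back before restarting the client.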
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 428
|
That process sounds right. I've been trying to catch a resend myself, but with no success so far (most of the fun seems to have happened overnight, UK time, when I had work fetch suspended). Two additional points:

1) Usually, GPUGrid tasks produce multiple upload files, though I can't speak for this particular batch until I catch one. I was hoping to report which particular file was oversize, and by how much the allowed and actual sizes differed. That would help Toni and/or Adria.

2) Depending on how you have BOINC installed, stopping the BOINC client service should be an easy way of ceasing all activity in an orderly fashion, in preparation for the edit. |
|
Joined: 14 Feb 16 Posts: 5 Credit: 17,756,170 RAC: 0
|
Am I late to the party? I had my computer running with the GPUGrid project attached but didn't get anything. I checked the logs as well; it just keeps saying the project has no tasks available. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
> My first round of tasks ran to completion, but all hit a computation error right after hitting 100%

I wasn't able to save a lot of my tasks, but I applied this change to my existing tasks and it looks like it worked. The trouble file seems to be the one labelled "_9". That file is like 475MB and takes several minutes to upload. Not sure if any others are over their limits, since each upload file has its own size limit and they are not all the same.

This is quite labor intensive though. You'll have to make this change for each new task that shows up. It needs to be fixed on the project side.
|
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
I was able to edit one task on the host with homogeneous cards. But won't bother with the tasks running on non-homogeneous hosts as they will likely error out from restarting on a different card anyway. |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 428
|
> The trouble file seems to be the one labelled "_9". That file is like 475MB and takes several minutes to upload. Not sure if any others are over their limits, since each upload file has its own size limit and they are not all the same.

I've just picked up a couple of these tasks. The _9 file seems to have a <max_nbytes> of 512,000,000. That should be just enough, even though the binary equivalent is 488.28 MB. I'll bump it for safety, and try to catch the exact size when it uploads.

The tasks I've got are freshly created, at 15:42 UTC this afternoon - not resends. They may have made a fresh batch with the problem fixed. |
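The unit arithmetic in the post above can be checked with a short Python sketch. The 512,000,000-byte limit and the roughly 475MB file size come from this thread, as does the 256,000,000-byte original limit reported in a later post. Whether "475MB" means decimal or binary megabytes is not stated, so both readings are tried; the helper names are illustrative only.

```python
MIB = 2**20  # bytes in one binary megabyte (MiB)
MB = 10**6   # bytes in one decimal megabyte


def to_mib(n_bytes):
    """Convert a byte count to binary megabytes."""
    return n_bytes / MIB


old_limit = 256_000_000  # original limit, as reported later in the thread
new_limit = 512_000_000  # limit seen on the freshly created tasks

print(f"old limit: {to_mib(old_limit):.2f} MiB")
print(f"new limit: {to_mib(new_limit):.2f} MiB")

# Two readings of the "475MB" upload file, since the unit is ambiguous.
for label, size in [("475 decimal MB", 475 * MB), ("475 binary MiB", 475 * MIB)]:
    print(f"{label}: {size:,} bytes, "
          f"fits old limit: {size <= old_limit}, fits new limit: {size <= new_limit}")
```

Under either reading, a 475MB file exceeds the old limit and fits under the new one, which is consistent with "just enough".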
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
> The trouble file seems to be the one labelled "_9". That file is like 475MB and takes several minutes to upload. Not sure if any others are over their limits, since each upload file has its own size limit and they are not all the same.

Yup, I saw that too. They are labelled with "new" and "Adaptive" in the filename, so Toni obviously saw my message to change these units.

All of the existing ones should be cancelled and re-sent IMO. If left to their own devices, the tasks will land in the hands of someone who doesn't know to manually fix them. It'll take a long time for these to naturally hit 8 errors since they are so long-running. Weeks.
|
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
My _9 file currently uploading is 478MB. I forgot to note what the default limit was before I edited it. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
> My _9 file currently uploading is 478MB.

The default is 256,000,000 bytes. They have since changed it on the new files to 512,000,000 bytes.
|
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
I suspended BOINC network activity a while before my previously mentioned task finished.

1) The workaround succeeded for me too. The task finished with a valid result after a total processing time of 102996 seconds.

2) Regarding the generated output files, I took a screenshot that looks as follows: |
True:I think, therefor I THINK... Joined: 23 Mar 15 Posts: 4 Credit: 32,420,906 RAC: 0
|
WHAT a WASTE! Upload failure:

<file_xfer_error> <file_name>e1s8_I3-ADRIA_KIXcMyb_HIP_bandit-0-2-RND3309_0_9</file_name> <error_code>-131 (file size too big)</error_code>

after running for 118,437 sec = 32.9 HOURS on my GTX 1660 Ti. Come on, people, get with it, please!

I generally prefer donating my PC power to GPUGrid: you guys publish papers, you do real science to benefit people (so I assume). BUT I am not willing to waste my PC and GPU, or my power bill. I am signing OFF this project for a while.

Besides, as these are simulation runs, GPUGRID can set the time to run pretty closely by specifying the number of simulation replications. For weeks now, my 1660 Ti has been consistently just missing the 24-hour completion mark. Other GPU projects give double or even ten times the credit per time unit.

LLP, 6-sigma healthcare LLP, PhD, Prof. Engr. I think => I THINK I am. My thinking is not the source of my being, nor does it prove my existence to you. The Living Word of God World Youth Day |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
New tasks have already been reconfigured with a larger file size limit and are being distributed. They compute and report fine. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
A 1660 Ti is not powerful enough to complete in under 24 hrs. If you want the 24hr bonus, you'll need a faster GPU for these tasks.

But what project gives 10x the credit? Collatz? LOL. Do you care about credits, or do you care about doing real work? Collatz is doing useless research IMO, and at least one volunteer has pointed out that the project isn't even providing valid results: https://boinc.berkeley.edu/forum_thread.php?id=14159
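To put numbers on the 24-hour point, here is a small Python sketch that converts the two GTX 1660 Ti runtimes reported in this thread (118,437 s for the failed task and 102,996 s for the edited one) into hours and compares them with a 24-hour window. The exact bonus rules are not given in the thread, so the simple "under 24 hours" comparison is an assumption, and the function names are illustrative.

```python
BONUS_WINDOW_HOURS = 24  # the "24hr" mark mentioned in the thread


def seconds_to_hours(seconds):
    """Convert a runtime in seconds to hours."""
    return seconds / 3600


def within_bonus_window(seconds, window_hours=BONUS_WINDOW_HOURS):
    """Assumed rule: a task qualifies if it ran for less than the window."""
    return seconds_to_hours(seconds) < window_hours


# Runtimes reported by two GTX 1660 Ti owners in this thread.
reported_runtimes = {
    "failed 1660 Ti task": 118_437,
    "edited 1660 Ti task": 102_996,
}

for label, secs in reported_runtimes.items():
    print(f"{label}: {seconds_to_hours(secs):.1f} h, "
          f"under 24 h: {within_bonus_window(secs)}")
```

Both runtimes come out well above 24 hours, which matches the claim that this card cannot reach the bonus on these work units.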
|
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
> New tasks have already been reconfigured with a larger file size limit and are being distributed.

Fixed. Thanks to everybody for reporting. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
New tasks are failing again, even with the higher limit: https://www.gpugrid.net/result.php?resultid=32625435

The limit still needs to be higher than 512000000.
|
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
My last *_0_9 file size was 509240934 bytes. Possibly 512000000 is set too tight for certain tasks. |
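How tight the new limit is can be seen with a quick calculation. This Python sketch uses only the two figures from the post above (a 509,240,934-byte file against a 512,000,000-byte limit); the `headroom` helper is a hypothetical name introduced for illustration.

```python
def headroom(size_bytes, limit_bytes):
    """Return (spare_bytes, fraction_of_limit_used) for one upload file."""
    return limit_bytes - size_bytes, size_bytes / limit_bytes


spare, used = headroom(509_240_934, 512_000_000)
print(f"spare: {spare:,} bytes ({spare / 2**20:.2f} MiB), limit used: {used:.2%}")
```

With under 3 MB to spare on a file of roughly half a gigabyte, it is plausible that a slightly larger output from another task would exceed the limit, as the failing result linked earlier suggests.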
©2025 Universitat Pompeu Fabra