Message boards :
News :
ATM
Author | Message |
---|---|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 869 |
"The work unit generator has an incorrect value for estimated time to complete in the task profile." Same thing here, a short time ago: https://www.gpugrid.net/result.php?resultid=35071561 Is this a new type of failure? What a waste :-( Could someone back at GPUGRID please take care of this? |
Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 614,515 |
One of the GPUGrid devs, Adria (from the acemd3/Insilico-binding-assay team), said on their Discord server that they would pass the "time limit exceeded" error messages on to the other devs, so that the task generator templates can be updated with the proper values for the new tasks. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
Note that these tasks are ACEMD 3, rather than ATM - and they are indeed from a new version of that application, v2.27 deployed 19 Apr 2024 for Windows. So Erich is right to identify this as a new problem. BOINC projects don't set a time limit explicitly. It's calculated by the client, on the host machine running the task. The calculation is done from: (1) rsc_fpops_bound - by default 10x the rsc_fpops_est, set by the project; (2) Host average processing rate (avp) - shown as 38141 Gflops for Erich's machine, after just one successful task. Keith - you'll remember that we had major problems with AVP at SETI, after it was first introduced in 2010. It isn't used by the server until 11 tasks have been completed and validated by that particular host. I'll try and snag one of the new tasks on one of my Windows machines, so I can investigate further, but it'll be tricky. |
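As a rough sketch of the calculation described above: the limit works out to the task's rsc_fpops_bound divided by the speed estimate for the application on that host. The function names and the rsc_fpops_est value below are hypothetical, chosen only for illustration; the 38141 Gflops figure is the one quoted for Erich's machine, and the 10x default is taken from the post, not checked against the project's templates.

```python
def default_fpops_bound(rsc_fpops_est: float, factor: float = 10.0) -> float:
    """Bound when the project keeps the default of 10x the estimate (as stated in the post)."""
    return factor * rsc_fpops_est


def max_elapsed_seconds(rsc_fpops_bound: float, flops: float) -> float:
    """Elapsed-time limit in seconds: floating-point operation bound / estimated speed."""
    if flops <= 0:
        raise ValueError("flops must be positive")
    return rsc_fpops_bound / flops


if __name__ == "__main__":
    reported_speed = 38141e9      # 38141 Gflops, the figure quoted in the thread
    hypothetical_est = 5e18       # ASSUMPTION: made-up rsc_fpops_est, not a real task value
    bound = default_fpops_bound(hypothetical_est)
    limit = max_elapsed_seconds(bound, reported_speed)
    print(f"time limit: {limit:.0f} s ({limit / 3600:.1f} h)")
```

The limit is inversely proportional to the speed estimate, so an estimate that is too high after a single fast task would shrink the allowed runtime by the same factor. That may be what is happening here, but it is only a guess until someone can inspect one of the new tasks.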
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 47,738 |
"Note that these tasks are ACEMD 3, rather than ATM - and they are indeed from a new version of that application, v2.27 deployed 19 Apr 2024 for Windows. So Erich is right to identify this as a new problem." This is indeed a pertinent conversation, but if I may point out, it is posted in the wrong thread, for the reason mentioned above. |
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 869 |
"Note that these tasks are ACEMD 3, rather than ATM - and they are indeed from a new version of that application, v2.27 deployed 19 Apr 2024 for Windows. So Erich is right to identify this as a new problem." Sorry folks, I hadn't even caught that the tasks in question are ACEMD 3. So my complaint ended up in the wrong thread :-( Anyway, I have now deselected ACEMD 3 for the time being. |
Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 614,515 |
I wasn't lucky in snagging any of the new acemd3 tasks and app this last pass, I keep gorging on the QC tasks. Has anyone running Linux run into the same time exceeded issue? |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
Likewise. Linux has a continuous supply of QC, only interrupted by the occasional ATM. And no joy yet on Windows. |
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 47,738 |
"I wasn't lucky in snagging any of the new acemd3 tasks and app this last pass, I keep gorging on the QC tasks." I have: https://www.gpugrid.net/results.php?hostid=610674&offset=0&show_names=0&state=0&appid=32 Twice. |
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 869 |
I restarted downloading ATMs this morning on three of my hosts. About a third of the tasks errored out after about 2,700 - 2,900 secs. No stderr is shown for any of the erroneous tasks, so this is probably part of the problem. |
Joined: 15 Jul 20 Posts: 95 Credit: 2,550,803,412 RAC: 170,875 |
Nothing to report on my Linux Mint PC with RTX 4060 and RTX A2000. https://www.gpugrid.net/results.php?userid=563937 |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,909,595 RAC: 4,232,576 |
"I restarted downloading ATMs this morning on three of my hosts." Not including tasks still in progress, you have 20 tasks processed and only 4 errors (that's about 1/5th). Of those four, 2 were aborted, not computation errors. |
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 869 |
"I restarted downloading ATMs this morning on three of my hosts." Why did I abort 2 tasks? You can see it: they were running, running, running, for many hours, but with no CPU use at all. Hence, they were also erroneous. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,909,595 RAC: 4,232,576 |
The two you aborted show less than an hour of runtime. I was talking about these two: http://www.gpugrid.net/result.php?resultid=35340996 http://www.gpugrid.net/result.php?resultid=35340990 I made my post before you aborted the other two, so now you have 4 that you aborted from that system. "No CPU use at all" is a strange comment, since these are GPU tasks. ATM also frequently stops GPU computation to write results to file. You stopped the other two tasks at around 19,000-20,000 seconds, which is in the same range as the time to completion for the tasks that completed successfully. Perhaps you got confused and the tasks were nearly complete? |
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 869 |
... as can easily be seen from successfully completed tasks, CPU time is close to total runtime. In the case of the tasks which I aborted, I saw in the Windows Task Manager that there was no CPU usage at all, at any time, so I aborted them. A look at the task list also shows that CPU usage was "0". So, in some way, these tasks must have been faulty. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,909,595 RAC: 4,232,576 |
Might be an intermittent problem with your computer, like a driver crash/recovery, since you have some tasks that are running fine. |
Joined: 28 Dec 20 Posts: 7 Credit: 26,500,257,436 RAC: 1,503 |
6/20/2024 6:34:33 PM | GPUGRID | [error] Error reported by file upload server: Server is out of disk space |
Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 614,515 |
Been seeing this issue for several hours now. Sent a PM through the GPUGrid Discord channel to Gianni. Probably won't see any relief till tomorrow European time. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
Some files have uploaded, and some tasks reported, but they've now stopped again with a slightly different set of messages. Compare: (1) 21/06/2024 11:46:33 | GPUGRID | [error] Error reported by file upload server: can't write file /home/ps3grid/projects/PS3GRID/upload/2a/BACE_m26_m17_5-QUICO_ATM_GAFF2_RESP-4-7-RND2126_0_1: No space left on server; (2) 21/06/2024 11:46:34 | GPUGRID | [error] Error reported by file upload server: Server is out of disk space. I interpret the long version as meaning there's no space left on the backing store either, but that's a guess. |
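For anyone who wants to tell the two messages apart in a saved event log, here is a minimal sketch. It assumes only the layout visible in the lines quoted above (timestamp, project, message, separated by " | "); the names `LogEntry`, `parse_event_line` and `upload_error_detail` are hypothetical helpers, not part of BOINC.

```python
from typing import NamedTuple, Optional

# Prefix copied from the messages quoted in the thread.
UPLOAD_ERROR_PREFIX = "[error] Error reported by file upload server: "


class LogEntry(NamedTuple):
    timestamp: str
    project: str
    message: str


def parse_event_line(line: str) -> Optional[LogEntry]:
    """Split 'timestamp | project | message'; return None if the layout differs."""
    parts = line.split(" | ", 2)
    if len(parts) != 3:
        return None
    return LogEntry(*(part.strip() for part in parts))


def upload_error_detail(entry: LogEntry) -> Optional[str]:
    """Return the server's explanation for an upload error, or None for other messages."""
    if entry.message.startswith(UPLOAD_ERROR_PREFIX):
        return entry.message[len(UPLOAD_ERROR_PREFIX):]
    return None


if __name__ == "__main__":
    line = ("21/06/2024 11:46:34 | GPUGRID | [error] Error reported by "
            "file upload server: Server is out of disk space")
    entry = parse_event_line(line)
    if entry is not None:
        print(upload_error_detail(entry))
```

Grouping the returned details would show whether a host is seeing the short "out of disk space" form, the longer "can't write file" form, or both.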
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
And now they've all gone. The quota system is even allowing me to download new tasks again. |
Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 614,515 |
Uploads are stalled out again. |
©2025 Universitat Pompeu Fabra