failing tasks lately
| Author | Message |
|---|---|
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
the faulty tasks seem to be back (erroring out after a few seconds): http://www.gpugrid.net/result.php?resultid=21331546 :-( |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
I had a task fail after a few seconds. Stderr says: ERROR: file pme.cpp line 91: PME NX too small. Here is the URL: http://www.gpugrid.net/result.php?resultid=21429528 Does anyone have any idea what went wrong? |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 351
|
|
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
Here is another one, from this morning, with the error message: ERROR: file mdioload.cpp line 81: Unable to read bincoordfile http://www.gpugrid.net/result.php?resultid=21431713 |
|
Joined: 18 Oct 13 Posts: 53 Credit: 406,647,419 RAC: 0
|
Same here: http://www.gpugrid.net/result.php?resultid=21432948 http://www.gpugrid.net/result.php?resultid=21432946 http://www.gpugrid.net/result.php?resultid=21431340 http://www.gpugrid.net/result.php?resultid=21431266 http://www.gpugrid.net/result.php?resultid=21430771 ...and several more, all CUDA 80 |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
"Same here" Until the new app (ACEMD3) is released, you should assign this host to a venue which receives work only from the ACEMD3 queue, as the other two queues have the old client, which is incompatible with the Turing cards. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
Obviously the faulty tasks are back; here is the next one, from a minute ago: http://www.gpugrid.net/result.php?resultid=21433016 This is even worse at a time when new tasks are very rare anyway :-( |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
The next ones: http://www.gpugrid.net/result.php?resultid=21462742 http://www.gpugrid.net/result.php?resultid=21462460 http://www.gpugrid.net/result.php?resultid=21462682 http://www.gpugrid.net/result.php?resultid=21462715 None of them ran for even one second :-( |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
and here some more: http://www.gpugrid.net/result.php?resultid=21463119 http://www.gpugrid.net/result.php?resultid=21463047 http://www.gpugrid.net/result.php?resultid=21463010 http://www.gpugrid.net/result.php?resultid=21462974 http://www.gpugrid.net/result.php?resultid=21463183 http://www.gpugrid.net/result.php?resultid=21463207 |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
I think the license of the v9.22 app has expired this time. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"I think the license of the v9.22 app has expired this time." That's what I now suspect, too :-( |
|
Joined: 7 Apr 15 Posts: 33 Credit: 1,201,157,375 RAC: 0
|
Any prediction of when a continuous supply of new WUs will become available again? It has been nearly a full month of very intermittent and small numbers of WUs. Einstein is a happy project in the meantime :-) Are all efforts being put into support of the new 20XX cards to the detriment of the current 10XX cards? (Limited staff available, maybe, or lack of funding?) |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
This is an increasingly annoying situation: while there are no tasks available most of the time, some of the few that do get downloaded fail after 5 seconds: http://www.gpugrid.net/result.php?resultid=21481323 ERROR: file mdioload.cpp line 81: Unable to read bincoordfile :-( :-( :-( |
|
Joined: 2 Jul 19 Posts: 21 Credit: 90,744,164 RAC: 0
|
Hi: I see this is a well-used section of the forum. I would like to contribute some useful results here with my Alienware laptop, but I have a high failure rate which I would like to resolve. The GPU in my laptop is a GeForce 660M. The OS I am using is an up-to-date Windows 10. I would appreciate it if a tech person could narrow down the reason or reasons why I am experiencing such a high failure rate. Clive Hunt Canada |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"I would like to contribute some useful results here with my Alienware laptop" I am afraid that laptop GPUs are not made for this kind of load :-( |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"I would like to contribute some useful results here with my Alienware laptop" |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"I would like to contribute some useful results here with my Alienware laptop" I am afraid that laptop GPUs are not made for this kind of heavy load :-( |
|
Joined: 27 Jul 11 Posts: 138 Credit: 539,953,398 RAC: 0
|
My Dell G7 15 laptop is happily crunching. It is another matter that I have to give it a blast of air every day to get the dust out. |
|
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0
|
Hi: The issue is with the Scheduler on the GPUgrid servers. The Scheduler is sending CUDA65 tasks to your laptop, all of which will fail due to an expired license (server end). Your laptop can process CUDA80 tasks, but you are at the mercy of the Scheduler. For most hosts it sends the correct tasks, but for a handful of hosts it sends the wrong ones. This issue tends to affect Kepler GPUs (600-series GPUs), even though they are still supported. Some relevant posts discussing this issue are here: http://www.gpugrid.net/forum_thread.php?id=5000&nowrap=true#52924 http://www.gpugrid.net/forum_thread.php?id=5000&nowrap=true#52920 The Project is in the middle of changing the Application to a newer version. Hopefully, when the new Application is released (ACEMD3), these issues will be smoothed out. |
|
Joined: 1 Jan 15 Posts: 1166 Credit: 12,260,898,501 RAC: 1
|
"... when the new Application is released (ACEMD3)..." I am curious WHEN this will be the case |
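
For anyone comparing failures across hosts: the two stderr messages quoted in this thread share one shape, `ERROR: file <source> line <number>: <message>`. The following is only a rough sketch, assuming that shape holds for other failures too (it may not). The names `parse_error` and `tally` are made up for illustration and are not part of BOINC or ACEMD; the stderr text would have to be copied by hand from the result pages.

```python
import re
from collections import Counter

# Pattern inferred from the two stderr lines quoted in this thread;
# other error formats may exist and simply would not match.
ERROR_RE = re.compile(
    r"ERROR: file (?P<source>\S+) line (?P<line>\d+): (?P<message>.+)"
)


def parse_error(stderr_text):
    """Return (source_file, line_number, message) for the first ERROR line, or None."""
    match = ERROR_RE.search(stderr_text)
    if match is None:
        return None
    return match["source"], int(match["line"]), match["message"].strip()


def tally(stderr_texts):
    """Count how often each distinct error appears in a list of stderr dumps."""
    counts = Counter()
    for text in stderr_texts:
        parsed = parse_error(text)
        counts[parsed if parsed is not None else "no ERROR line"] += 1
    return counts


if __name__ == "__main__":
    samples = [
        "ERROR: file pme.cpp line 91: PME NX too small",
        "ERROR: file mdioload.cpp line 81: Unable to read bincoordfile",
        "ERROR: file mdioload.cpp line 81: Unable to read bincoordfile",
    ]
    for error, count in tally(samples).most_common():
        print(count, error)
```

Grouping by source file and line makes it easier to see whether a host is hitting one recurring failure or several different ones.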
©2025 Universitat Pompeu Fabra