Cancelled by server
| Author | Message |
|---|---|
|
Joined: 18 Sep 08 Posts: 65 Credit: 3,037,414 RAC: 0
|
Is this on purpose, or what? Initial replication of 10, and killing running jobs???
wuid=415152
590172 32612 27 Apr 2009 9:28:08 UTC 2 May 2009 9:28:08 UTC In progress --- New --- --- ---
590173 29707 27 Apr 2009 9:27:30 UTC 27 Apr 2009 15:58:46 UTC Over Redundant result Cancelled by server 3,358.81 3,946.78 ---
590174 22935 27 Apr 2009 9:27:59 UTC 2 May 2009 9:27:59 UTC In progress --- New --- --- ---
590175 18304 27 Apr 2009 9:28:07 UTC 27 Apr 2009 16:03:36 UTC Over Redundant result Cancelled by server 0.00 --- ---
590176 23183 27 Apr 2009 9:28:20 UTC 2 May 2009 9:28:20 UTC In progress --- New --- --- ---
590177 33634 27 Apr 2009 9:29:38 UTC 2 May 2009 9:29:38 UTC In progress --- New --- --- ---
590178 30738 27 Apr 2009 9:29:00 UTC 27 Apr 2009 16:15:14 UTC Over Redundant result Cancelled by server 0.00 --- ---
590179 28591 27 Apr 2009 9:31:41 UTC 2 May 2009 9:31:41 UTC In progress --- New --- --- ---
590180 19103 27 Apr 2009 9:27:25 UTC 27 Apr 2009 15:55:04 UTC Over Redundant result Cancelled by server 0.00 --- ---
590181 16930 27 Apr 2009 9:29:06 UTC 27 Apr 2009 16:13:09 UTC Over Redundant result Cancelled by server 11.40 3,946.78 --- |
|
Joined: 28 Mar 09 Posts: 6 Credit: 6,972,294 RAC: 0
|
I just had a similar problem, with a partially completed WU cancelled after 10 hours (approx 80% complete).
27/04/2009 19:19:54 GPUGRID Message from server: Result p1380000-GIANNI_pYIpYVk12204-6-10-RND8950_0 is no longer usable
27/04/2009 19:19:55 GPUGRID Computation for task p1380000-GIANNI_pYIpYVk12204-6-10-RND8950_0 finished
Is this by design or is it an error??? Here is the WU in question: http://www.gpugrid.net/workunit.php?wuid=413724 |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
We cancelled a set (hopefully small) of running WUs. That happened in relation to a fix reported in another thread (download error). In handling the huge number of GPUs that you generously donate we struggle, from time to time, with the unpredictable... Replication is 1 for most jobs, or 2 for a few. Some were created with 10 in a prototype we were testing to "push" late WUs. |
|
Joined: 18 Sep 08 Posts: 65 Credit: 3,037,414 RAC: 0
|
Hmm - if things like this happen, you should at least grant some credit for the killed WUs. It's not funny to lose several hours of crunching time due to a faulty scheduler setup. |
DoctorNow Joined: 18 Aug 07 Posts: 83 Credit: 135,208,752 RAC: 3
|
"Hmm - if things like this happen, you should at least grant some credit for the killed WUs. It's not funny to lose several hours of crunching time due to a faulty scheduler setup."
I saw this with a WU of one of my teammates. The problem is that when a WU is marked as redundant, the reported result doesn't record in the log file how long it has run, and she said it was already at 50%! So it isn't even possible to grant partial credit.
Member of BOINC@Heidelberg and ATA!
|
|
Joined: 20 Aug 07 Posts: 18 Credit: 1,319,274 RAC: 0
|
I've had that done to me also: 90% plus (about 18 hours of work) done and chopped off at the knees. A big fat "GOOSE EGG" for the credit. I've started to abort work units where the initial replication is more than "1". |
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
This is not a problem of replication. Replication does not cancel the workunit if you are already running it. The problem is that WUs were manually cancelled to eliminate the ones with download result problems; some of these had the files and were actually running. gdf |
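To make the distinction in the last post concrete, here is a toy sketch of the two cancellation paths as described in this thread. It is not BOINC or GPUGRID server code: the `Task` class, both function names, and the task IDs are hypothetical and exist only for illustration. The one rule taken from the thread is that redundancy-based cancellation leaves running tasks alone, while the manual cancellation hit running tasks too.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: int            # hypothetical ID, not a real GPUGRID task
    started: bool           # True if a host is already crunching it
    cancelled: bool = False


def cancel_redundant(tasks):
    """Redundancy path as described in the thread: skip running tasks."""
    for t in tasks:
        if not t.started:
            t.cancelled = True
    return [t.task_id for t in tasks if t.cancelled]


def cancel_manual(tasks):
    """Manual path: every task in the batch is cancelled, running or not."""
    for t in tasks:
        t.cancelled = True
    return [t.task_id for t in tasks if t.cancelled]


def make_batch():
    # Two running tasks and one that has not started yet (made-up data).
    return [Task(1, started=True), Task(2, started=False), Task(3, started=True)]


redundant_hits = cancel_redundant(make_batch())
manual_hits = cancel_manual(make_batch())
print("redundancy path cancels:", redundant_hits)
print("manual path cancels:", manual_hits)
```

In this model only the manual path can throw away hours of partially completed work, which matches what the posters above experienced.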