HTTP error with HIV workunits
| Author | Message |
|---|---|
Hydropower Joined: 3 Apr 09 Posts: 70 Credit: 6,003,024 RAC: 0
|
I have had two issues today where an HIV workunit (635688 and 582818) stopped downloading with an HTTP error, on all files in the package afaik. Even after letting it run its course it did not download. Eventually I had to cancel the workunits to keep going. Is this a workunit-related issue? The connection with the server was fine; other packets right before and after it downloaded fine. What is the best course of action in cases like these? |
Hydropower Joined: 3 Apr 09 Posts: 70 Credit: 6,003,024 RAC: 0
|
Two more cases today. I have noticed that the issue seems to be caused by THREE download threads being started simultaneously, whereas normally only TWO threads are allowed. Hope this provides some insight into the issue. |
|
Joined: 21 Dec 08 Posts: 51 Credit: 26,320,167 RAC: 0
|
I just had to abort transfer on 2 HIV workunits. Stalled in download with HTTP error.
7/26/2009 2:16:57 AM GPUGRID Temporarily failed download of 9-KASHIF_HIVPR_dim_ba5-22-9-KASHIF_HIVPR_dim_ba5-21-100-RND0604_2: HTTP error
7/26/2009 2:17:58 AM GPUGRID Temporarily failed download of 9-KASHIF_HIVPR_dim_ba5-22-9-KASHIF_HIVPR_dim_ba5-21-100-RND0604_1: HTTP error |
[AF>HFR>RR] Jim PROFIT Joined: 3 Jun 07 Posts: 107 Credit: 31,331,137 RAC: 0
|
Same for me today. |
|
Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 |
And you can add me to the list. Like the other guys, it's been doing this for the last couple of days.
26/07/2009 6:55:30 PM GPUGRID [error] File 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_1 has wrong size: expected 1210492, got 0
26/07/2009 6:55:30 PM GPUGRID Started download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_1
26/07/2009 6:55:30 PM GPUGRID [error] File 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_2 has wrong size: expected 1210492, got 0
26/07/2009 6:55:30 PM GPUGRID Started download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_2
26/07/2009 6:55:31 PM GPUGRID Temporarily failed download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_1: HTTP error
26/07/2009 6:55:31 PM GPUGRID Backing off 1 min 0 sec on download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_1
26/07/2009 6:55:31 PM GPUGRID Temporarily failed download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_2: HTTP error
26/07/2009 6:55:31 PM GPUGRID Backing off 1 min 0 sec on download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_2
26/07/2009 6:55:31 PM GPUGRID [error] File 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_3 has wrong size: expected 410624, got 0
26/07/2009 6:55:31 PM GPUGRID Started download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_3
26/07/2009 6:55:31 PM GPUGRID [error] File 76-KASHIF_HIVPR_dim_ba5-24-pdb_file has wrong size: expected 3442503, got 0
26/07/2009 6:55:31 PM GPUGRID Started download of 76-KASHIF_HIVPR_dim_ba5-24-pdb_file
26/07/2009 6:55:32 PM GPUGRID Temporarily failed download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_3: HTTP error
26/07/2009 6:55:32 PM GPUGRID Backing off 1 min 0 sec on download of 76-KASHIF_HIVPR_dim_ba5-24-76-KASHIF_HIVPR_dim_ba5-23-100-RND8018_3
26/07/2009 6:55:32 PM GPUGRID Temporarily failed download of 76-KASHIF_HIVPR_dim_ba5-24-pdb_file: HTTP error
26/07/2009 6:55:32 PM GPUGRID Backing off 1 min 0 sec on download of 76-KASHIF_HIVPR_dim_ba5-24-pdb_file
26/07/2009 6:55:32 PM GPUGRID [error] File 76-KASHIF_HIVPR_dim_ba5-24-par_file has wrong size: expected 8402771, got 0
26/07/2009 6:55:32 PM GPUGRID Started download of 76-KASHIF_HIVPR_dim_ba5-24-par_file
26/07/2009 6:55:33 PM GPUGRID Temporarily failed download of 76-KASHIF_HIVPR_dim_ba5-24-par_file: HTTP error
26/07/2009 6:55:33 PM GPUGRID Backing off 1 min 0 sec on download of 76-KASHIF_HIVPR_dim_ba5-24-par_file
26/07/2009 6:55:33 PM GPUGRID [error] File 76-KASHIF_HIVPR_dim_ba5-24-myfile.enc has wrong size: expected 872, got 0
26/07/2009 6:55:33 PM GPUGRID Started download of 76-KASHIF_HIVPR_dim_ba5-24-myfile.enc
26/07/2009 6:55:34 PM GPUGRID Temporarily failed download of 76-KASHIF_HIVPR_dim_ba5-24-myfile.enc: HTTP error
26/07/2009 6:55:34 PM GPUGRID Backing off 1 min 0 sec on download of 76-KASHIF_HIVPR_dim_ba5-24-myfile.enc
BOINC blog |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
We stopped some HIV WUs two days ago, but they left behind remnants. Please abort them at will. |
Hydropower Joined: 3 Apr 09 Posts: 70 Credit: 6,003,024 RAC: 0
|
Thanks. I'll have to because I'm being bombarded with them now.. :( and they block processing. |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
We'll try to cancel them server-side asap, thanks for your patience. |
Hydropower Joined: 3 Apr 09 Posts: 70 Credit: 6,003,024 RAC: 0
|
Thanks for that, much appreciated. |
|
Joined: 6 May 09 Posts: 34 Credit: 443,507,669 RAC: 0
|
"We'll try to cancel them server-side asap, thanks for your patience."
I have also had 20 WUs with errors in download, stuck mid-download, errors in computation, etc. A typical example is below:
27/07/2009 10:44:29 a.m. GPUGRID Finished download of 91-KASHIF_HIVPR_sub_so_ba1-5-91-KASHIF_HIVPR_sub_so_ba1-4-100-RND7343_2
27/07/2009 10:44:29 a.m. GPUGRID Started download of 91-KASHIF_HIVPR_sub_so_ba1-5-91-KASHIF_HIVPR_sub_so_ba1-4-100-RND7343_3
27/07/2009 10:45:17 a.m. GPUGRID Finished download of 91-KASHIF_HIVPR_sub_so_ba1-5-91-KASHIF_HIVPR_sub_so_ba1-4-100-RND7343_3
27/07/2009 10:45:17 a.m. GPUGRID Started download of 91-KASHIF_HIVPR_sub_so_ba1-5-pdb_file
27/07/2009 10:46:44 a.m. GPUGRID Finished download of 77-GIANNI_BINDX119-29-par_file
27/07/2009 10:46:44 a.m. GPUGRID Started download of 91-KASHIF_HIVPR_sub_so_ba1-5-psf_file
27/07/2009 10:46:44 a.m. GPUGRID [error] MD5 check failed for 77-GIANNI_BINDX119-29-par_file
27/07/2009 10:46:44 a.m. GPUGRID [error] expected c2605a4451ad8240f29215f84cb6de7e, got d8298542b27b3e9c7a3396c23444223c
27/07/2009 10:46:44 a.m. GPUGRID [error] Checksum or signature error for 77-GIANNI_BINDX119-29-par_file
plus other strange behaviour. Is it all sorted out now? Ross |
|
Joined: 2 Mar 09 Posts: 159 Credit: 13,639,818 RAC: 0
|
7/27/2009 5:21:29 AM GPUGRID [error] File 35-KASHIF_HIVPR_dim_ba3-26-35-KASHIF_HIVPR_dim_ba3-25-100-RND7138_1 has wrong size: expected 1210492, got 0
7/27/2009 5:21:29 AM GPUGRID Started download of 35-KASHIF_HIVPR_dim_ba3-26-35-KASHIF_HIVPR_dim_ba3-25-100-RND7138_1
7/27/2009 5:21:29 AM GPUGRID [error] File 35-KASHIF_HIVPR_dim_ba3-26-35-KASHIF_HIVPR_dim_ba3-25-100-RND7138_2 has wrong size: expected 1210492, got 0
I know you guys are working on it... and I'm about to try to get new work. I knew something was wrong with my BOINC as my stomach got me up 5 hrs early. As of 5:30 am EST, it's still not downloading, and my stomach is being fed: Pizza Hut lasagna. Hey, 10 minutes later, 1 task got through. I guess GPUGRID got jealous of SETI@home. |
[AF>HFR>RR] Jim PROFIT Joined: 3 Jun 07 Posts: 107 Credit: 31,331,137 RAC: 0
|
And again this morning, but just on one host! Always the same. Can't monitor all the time. Please do something. |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
We cancelled the faulty WUs. Hopefully the change propagates fast to your clients. BTW, if anyone still has faulty downloads, or notices that the faulty DLs were removed without intervention, can you please report here, so that we can figure out how quickly clients get informed of such things? Thanks |
|
Joined: 2 Mar 09 Posts: 159 Credit: 13,639,818 RAC: 0
|
yep, change seems to have fixed the servers... |
|
Joined: 6 May 09 Posts: 34 Credit: 443,507,669 RAC: 0
|
I have 2 WUs almost completed; the test will be when they are replaced. Hopefully the GeForce will conquer. Cheers, Ross |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
This would have to happen while I'm on vacation. Just got home to find 2 GPUs stuck on these bad WUs :-( |
|
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
Uhm.. that means that the clients do not really obey cancellation requests... |
Beyond Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0
|
I had to cancel them both manually. They probably didn't cancel because they were stuck with download errors. It's way worse than a normally bad WU though because they took the GPUs out of action until I got home to intervene. |
|
Joined: 16 Aug 08 Posts: 87 Credit: 1,248,879,715 RAC: 0
|
Ok, from my log files:
7/26/2009 12:03:03 AM|GPUGRID|Sending scheduler request: To report completed tasks. Requesting 82299 seconds of work, reporting 3 completed tasks
7/26/2009 12:03:08 AM|GPUGRID|Scheduler request completed: got 1 new tasks
7/26/2009 12:03:10 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-LICENSE
7/26/2009 12:03:10 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-COPYRIGHT
7/26/2009 12:03:11 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-LICENSE: HTTP error
7/26/2009 12:03:11 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-LICENSE
7/26/2009 12:03:11 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-COPYRIGHT: HTTP error
7/26/2009 12:03:11 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-COPYRIGHT
7/26/2009 12:03:11 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_1
7/26/2009 12:03:11 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_2
7/26/2009 12:03:13 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_1: HTTP error
7/26/2009 12:03:13 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_1
7/26/2009 12:03:13 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_2: HTTP error
7/26/2009 12:03:13 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_2
7/26/2009 12:03:13 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_3
7/26/2009 12:03:13 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-pdb_file
7/26/2009 12:03:14 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_3: HTTP error
7/26/2009 12:03:14 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-3-KASHIF_HIVPR_dim_ba5-24-100-RND4733_3
7/26/2009 12:03:14 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-pdb_file: HTTP error
7/26/2009 12:03:14 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-pdb_file
7/26/2009 12:03:14 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-psf_file
7/26/2009 12:03:14 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-par_file
7/26/2009 12:03:14 AM|Docking@Home|Sending scheduler request: To fetch work. Requesting 120956 seconds of work, reporting 0 completed tasks
7/26/2009 12:03:15 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-psf_file: HTTP error
7/26/2009 12:03:15 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-psf_file
7/26/2009 12:03:15 AM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-myfile.enc
7/26/2009 12:03:17 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-myfile.enc: HTTP error
7/26/2009 12:03:17 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-myfile.enc
7/26/2009 12:03:18 AM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-par_file: HTTP error
7/26/2009 12:03:18 AM|GPUGRID|Backing off 1 min 0 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-par_file
My logfile is full of messages like
7/28/2009 12:06:37 PM|GPUGRID|Started download of 3-KASHIF_HIVPR_dim_ba5-25-LICENSE
7/28/2009 12:06:38 PM|GPUGRID|Temporarily failed download of 3-KASHIF_HIVPR_dim_ba5-25-LICENSE: HTTP error
7/28/2009 12:06:38 PM|GPUGRID|Backing off 1 hr 22 min 54 sec on download of 3-KASHIF_HIVPR_dim_ba5-25-LICENSE
taken just now. I went in to abort the work units, but they were not on the Tasks page, so I manually aborted the transfers. That seemed to clean things up. Interestingly, my quad core linux box got
07/28/09 12:21:48|GPUGRID|Sending scheduler request: To fetch work. Requesting 222626 seconds of work, reporting 0 completed tasks
07/28/09 12:21:58|GPUGRID|Scheduler request completed: got 5 new tasks
07/28/09 12:22:00|GPUGRID|Started download of acemd_6.66_x86_64-pc-linux-gnu__cuda
07/28/09 12:22:00|GPUGRID|Started download of libcufft.so.2.1
07/28/09 12:22:48|GPUGRID|Finished download of libcufft.so.2.1
07/28/09 12:22:48|GPUGRID|Started download of libcudart.so.2.1
5 tasks for one GPU? Come on, it was bad enough when it gave me 4, but 5?? |
Stoneageman Joined: 25 May 09 Posts: 224 Credit: 34,057,374,498 RAC: 0
|
Yet more failed downloads, starting from 29th 22:22hrs. Really is a pita! That's using client 6.6.36 |
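
For anyone who cannot watch their hosts all the time, here is a minimal sketch of how a saved copy of the client message log could be scanned for transfers that keep failing, like the ones quoted in this thread. It is not part of BOINC: the function name `stalled_downloads` is made up for illustration, and the two message patterns are simply copied from the log lines posted above, so they are an assumption that may not hold for other client versions.

```python
import re
from collections import Counter

# Message texts as they appear in the logs quoted in this thread (assumed stable).
FAILED = re.compile(r"Temporarily failed download of (\S+): HTTP error")
FINISHED = re.compile(r"Finished download of (\S+)")


def stalled_downloads(log_text):
    """Return {file name: failure count} for files that failed and never finished.

    Hypothetical helper: it only looks at message text, so it works for both
    the space-separated and the '|'-separated log layouts seen above.
    """
    failures = Counter()
    finished = set()
    for line in log_text.splitlines():
        match = FAILED.search(line)
        if match:
            failures[match.group(1)] += 1
            continue
        match = FINISHED.search(line)
        if match:
            finished.add(match.group(1))
    # Keep only files with at least one failure and no later or earlier success.
    return {name: count for name, count in failures.items() if name not in finished}
```

Feeding it the text of a saved log gives the list of file names to look for on the Transfers tab; files that eventually downloaded are left out, so an occasional retry that succeeded is not reported.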
©2026 Universitat Pompeu Fabra