Problems uploading completed work units

wiyosaya
Joined: 22 Nov 09
Posts: 114
Credit: 589,114,683
RAC: 0
Message 22235 - Posted: 8 Oct 2011, 18:04:23 UTC

It seems every work unit that my machine finishes takes multiple tries to upload. I am using BOINC 6.12.34. Anyone else having this same problem?

It is extremely frustrating because some of the retry times go to 8 or more hours. I literally have to sit at my computer and press "retry now" multiple times over a period of perhaps 10 or 20 minutes to "force" a finished work unit to complete its upload.

Are there any known issues with uploading?

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 22242 - Posted: 9 Oct 2011, 23:26:37 UTC - in response to Message 22235.  

I'm not aware of this issue. Are you using wireless?
You might want to report this to Berkeley.

Fortunately, uploads here resume from where they left off: if you had uploaded 3 MB, you would continue from 3 MB (many BOINC projects instead require you to restart the upload).

I did notice that when trying to upload some CPU tasks (elsewhere) BOINC keeps adding the bandwidth used, so tasks go past the 100% mark. In such cases you definitely have to select "try again" to get anywhere. That is a BOINC issue and not related to this project.

Using report tasks immediately might help.

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 22245 - Posted: 10 Oct 2011, 10:55:14 UTC
Last modified: 10 Oct 2011, 11:05:07 UTC

I had uploads going into backoff tonight as well. After hitting the retry button a few times they managed to get through. Is there a comms issue on the server end (or anywhere in between)?

Downloads seem fine.

1043 GPUGRID 10-10-2011 09:33 PM Temporarily failed upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_0: HTTP error
1044 GPUGRID 10-10-2011 09:33 PM Backing off 13 min 56 sec on upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_0
1045 GPUGRID 10-10-2011 09:33 PM Temporarily failed upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_1: HTTP error
1046 GPUGRID 10-10-2011 09:33 PM Backing off 13 min 27 sec on upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_1
1047 GPUGRID 10-10-2011 09:33 PM Started upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_2
1048 GPUGRID 10-10-2011 09:33 PM Started upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_3
1049 GPUGRID 10-10-2011 09:34 PM Temporarily failed upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_2: HTTP error
1050 GPUGRID 10-10-2011 09:34 PM Backing off 16 min 3 sec on upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_2
1051 GPUGRID 10-10-2011 09:34 PM Temporarily failed upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_3: HTTP error
1052 GPUGRID 10-10-2011 09:34 PM Backing off 19 min 45 sec on upload of s0r162-TONI_SH2MS3-46-100-RND4787_0_3

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 22246 - Posted: 10 Oct 2011, 11:12:00 UTC - in response to Message 22245.  

We are not aware of any connectivity issue, but we'll keep an eye on it.

wiyosaya
Message 22248 - Posted: 10 Oct 2011, 17:21:33 UTC - in response to Message 22242.  

skgiven wrote:
    I'm not aware of this issue. Are you using wireless?
    You might want to report this to Berkeley.

    Fortunately, uploads here resume from where they left off: if you had uploaded 3 MB, you would continue from 3 MB (many BOINC projects instead require you to restart the upload).

    I did notice that when trying to upload some CPU tasks (elsewhere) BOINC keeps adding the bandwidth used, so tasks go past the 100% mark. In such cases you definitely have to select "try again" to get anywhere. That is a BOINC issue and not related to this project.

    Using report tasks immediately might help.

I am not on wireless. I am also only running GPU WUs. I'll report to Berkeley; however, I have only observed this behavior with GPUGrid.

Even with uploads picking up where they left off, if the upload fails enough times, the WU could be returned well after the 24-hour deadline for 1.5 credits even though it completed processing long before that. Gianni's WUs take my 460 about 18 hours. I could easily see failed retries returning the finished WUs over 24 hours after the initial download.

MarkJ wrote:
    I had uploads going into backoff tonight as well. After hitting the retry button a few times they managed to get through. Is there a comms issue on the server end (or anywhere in between)?

    Downloads seem fine.

This matches my experience, and I was thinking that there might be server problems, too.

Download, for me, typically happens at something like 500 kbps. However, upload seems to happen at less than 100 kbps. Also, what I have observed when uploading finished work units is this: typically, among other files, there is one on the order of 20 MB or more, and while it uploads, simultaneous uploads of other files fail. In particular, the file that fails always seems to be one that is 855 kB in size. It seems as if the server is not accepting more than one simultaneous connection from any one user, or perhaps the nature of the data in that file is somehow affecting the upload?

The next time I run a few work units, I'll post the upload logs. It happened again yesterday. In fact, it is a regular occurrence for me.



Toni
Message 22249 - Posted: 10 Oct 2011, 21:17:21 UTC - in response to Message 22248.  
Last modified: 10 Oct 2011, 21:32:07 UTC

Well, for ADSL connections uploads are normally much slower than downloads (sometimes as low as 128 kbit/s). What *might* be happening is that large uploads hog the upload capacity and cause smaller uploads to time out.

I'm not sure how timeouts are handled, but I doubt it's a parameter in the server.

Edit: these posts may be helpful:
http://climateprediction.net/board/viewtopic.php?p=94924#p94924
http://climateprediction.net/board/viewtopic.php?p=92119#p92119
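
For a rough sense of scale on the ADSL point above, here is a back-of-the-envelope Python sketch. It assumes decimal units (1 kB = 1000 bytes), the 128 kbit/s uplink figure, the roughly 20 MB and 855 kB file sizes reported earlier in the thread and, purely for illustration, an even split of bandwidth between two simultaneous uploads. Protocol overhead is ignored, and the function name is hypothetical.

```python
def transfer_seconds(size_bytes, link_kbit_per_s, share=1.0):
    """Ideal time to push size_bytes over a link, ignoring protocol overhead."""
    bits = size_bytes * 8
    return bits / (link_kbit_per_s * 1000 * share)

LINK_KBIT = 128          # uplink speed quoted above (assumed to apply here)
BIG_FILE = 20_000_000    # ~20 MB result file (size as reported in the thread)
SMALL_FILE = 855_000     # ~855 kB file that keeps failing (size as reported)

big_alone = transfer_seconds(BIG_FILE, LINK_KBIT)
small_alone = transfer_seconds(SMALL_FILE, LINK_KBIT)
small_shared = transfer_seconds(SMALL_FILE, LINK_KBIT, share=0.5)

print(f"20 MB alone:   {big_alone / 60:.1f} min")
print(f"855 kB alone:  {small_alone:.0f} s")
print(f"855 kB at 50%: {small_shared:.0f} s")
```

Under these assumptions the large file needs about 21 minutes on its own, while the small one needs under a minute alone and under two minutes at half the bandwidth.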

skgiven
Message 22250 - Posted: 10 Oct 2011, 21:56:58 UTC - in response to Message 22249.  

It's possible this is a routing problem. Turn your router off for a couple of minutes and then back on. Then restart your system or, if you're up to it, just flush your DNS cache and renew your IP information (Start, Run, CMD, then ipconfig -release, ipconfig -flushdns, ipconfig -renew); this fixes many issues. ISPs tend to update their DNS routers and servers over weekends. The route from the USA to Europe could change by the minute, so if you are using old ARP addresses this is likely to happen. Another thing is that some ISPs sneakily reduce your bandwidth, number of connections and contention; restarting resets this in many cases.

wiyosaya
Message 22289 - Posted: 18 Oct 2011, 1:46:18 UTC - in response to Message 22250.  
Last modified: 18 Oct 2011, 1:47:26 UTC

I will most often turn off my "router" (a machine running SuSE Linux) overnight, so first failures are sometimes attributable to the network not being available. However, when I turn the router back on, I also select all pending uploads and hit "retry now." I have never had a similar issue with other projects unless they are off-line for some reason. The problem seems unique, for me, to GPUGrid. I do run a local caching name server; however, from the logs below, I doubt that is the problem.

As I previously stated, it always seems to be, relative to each individual WU, the same file. Is there something special about this file that would cause the upload to fail?

Anyway, I ran three WUs over the weekend. After filtering out the error messages for not having the router running, here are the pertinent entries for the same problem:

10/15/2011 11:05:22 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_0
10/15/2011 11:05:22 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 11:05:41 | GPUGRID | Finished upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_0
10/15/2011 11:05:41 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_2
10/15/2011 11:06:00 | GPUGRID | Finished upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_2
10/15/2011 11:06:00 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_3
10/15/2011 11:06:10 | GPUGRID | Temporarily failed upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1: HTTP error
10/15/2011 11:06:10 | GPUGRID | Backing off 18 min 38 sec on upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 11:06:10 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_7
10/15/2011 11:06:11 | GPUGRID | Finished upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_3
10/15/2011 11:06:11 | GPUGRID | Finished upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_7
10/15/2011 11:06:13 | | Project communication failed: attempting access to reference site
10/15/2011 11:06:15 | | Internet access OK - project servers may be temporarily down.
10/15/2011 11:24:48 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 11:25:17 | GPUGRID | Temporarily failed upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1: HTTP error
10/15/2011 11:25:17 | GPUGRID | Backing off 21 min 14 sec on upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 11:25:21 | | Project communication failed: attempting access to reference site
10/15/2011 11:25:23 | | Internet access OK - project servers may be temporarily down.
10/15/2011 11:46:32 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 11:47:08 | GPUGRID | Temporarily failed upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1: HTTP error
10/15/2011 11:47:08 | GPUGRID | Backing off 53 min 24 sec on upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 11:47:10 | | Project communication failed: attempting access to reference site
10/15/2011 11:47:11 | | Internet access OK - project servers may be temporarily down.
10/15/2011 12:40:32 | GPUGRID | Started upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 12:40:39 | GPUGRID | Finished upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1
10/15/2011 12:40:42 | GPUGRID | Sending scheduler request: To report completed tasks.
10/15/2011 12:40:42 | GPUGRID | Reporting 1 completed tasks, not requesting new tasks
10/16/2011 8:13:42 | GPUGRID | Temporarily failed upload of I172R1-GIANNI_KKFREE5-38-100-RND6106_1_4: HTTP error
10/16/2011 8:13:42 | GPUGRID | Backing off 17 min 54 sec on upload of I172R1-GIANNI_KKFREE5-38-100-RND6106_1_4
10/16/2011 8:13:46 | | Project communication failed: attempting access to reference site
10/16/2011 8:13:48 | | Internet access OK - project servers may be temporarily down.
10/16/2011 8:31:37 | GPUGRID | Started upload of I172R1-GIANNI_KKFREE5-38-100-RND6106_1_4
10/16/2011 8:33:24 | GPUGRID | Finished upload of I172R1-GIANNI_KKFREE5-38-100-RND6106_1_4
10/16/2011 19:53:35 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_0
10/16/2011 19:53:35 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 19:53:43 | GPUGRID | Finished upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_0
10/16/2011 19:53:43 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_2
10/16/2011 19:53:57 | GPUGRID | Finished upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_2
10/16/2011 19:53:57 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_3
10/16/2011 19:54:12 | GPUGRID | Temporarily failed upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1: HTTP error
10/16/2011 19:54:12 | GPUGRID | Backing off 15 min 44 sec on upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 19:54:12 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_4
10/16/2011 19:54:16 | | Project communication failed: attempting access to reference site
10/16/2011 19:54:17 | | Internet access OK - project servers may be temporarily down.
10/16/2011 19:54:23 | GPUGRID | Temporarily failed upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_3: HTTP error
10/16/2011 19:54:23 | GPUGRID | Backing off 13 min 46 sec on upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_3
10/16/2011 19:54:23 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_7
10/16/2011 19:54:24 | GPUGRID | Finished upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_7
10/16/2011 19:54:26 | | Project communication failed: attempting access to reference site
10/16/2011 19:54:27 | | Internet access OK - project servers may be temporarily down.
10/16/2011 20:00:45 | GPUGRID | Finished upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_4
10/16/2011 20:08:10 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_3
10/16/2011 20:08:20 | GPUGRID | Finished upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_3
10/16/2011 20:09:58 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:10:26 | GPUGRID | Temporarily failed upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1: HTTP error
10/16/2011 20:10:26 | GPUGRID | Backing off 32 min 50 sec on upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:10:30 | | Project communication failed: attempting access to reference site
10/16/2011 20:10:32 | | Internet access OK - project servers may be temporarily down.
10/16/2011 20:10:52 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:11:17 | GPUGRID | Temporarily failed upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1: HTTP error
10/16/2011 20:11:17 | GPUGRID | Backing off 1 hr 12 min 11 sec on upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:11:21 | | Project communication failed: attempting access to reference site
10/16/2011 20:11:22 | | Internet access OK - project servers may be temporarily down.
10/16/2011 20:11:24 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:11:50 | GPUGRID | Temporarily failed upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1: HTTP error
10/16/2011 20:11:50 | GPUGRID | Backing off 1 hr 21 min 49 sec on upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:11:53 | | Project communication failed: attempting access to reference site
10/16/2011 20:11:54 | | Internet access OK - project servers may be temporarily down.
10/16/2011 20:12:02 | GPUGRID | Started upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:12:07 | GPUGRID | Finished upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_1
10/16/2011 20:12:11 | GPUGRID | Sending scheduler request: To report completed tasks.
10/16/2011 20:12:11 | GPUGRID | Reporting 1 completed tasks, not requesting new tasks
10/16/2011 20:12:13 | GPUGRID | Scheduler request completed
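
To check the "same file every time" impression against a log like the one above, a short Python sketch can count the "Temporarily failed upload" lines per file. The regular expression is an assumption based only on the line format shown here, and the function names are hypothetical.

```python
import re
from collections import Counter

# Matches client log lines such as:
# "10/15/2011 11:06:10 | GPUGRID | Temporarily failed upload of <name>: HTTP error"
FAILED = re.compile(r"Temporarily failed upload of (\S+): HTTP error")

def failures_by_file(log_text):
    """Count 'Temporarily failed upload' events per result file name."""
    return Counter(FAILED.findall(log_text))

def failures_by_suffix(log_text):
    """Group the same counts by the trailing file index (text after the last '_')."""
    counts = Counter()
    for name, n in failures_by_file(log_text).items():
        counts[name.rsplit("_", 1)[-1]] += n
    return counts

# A few lines copied from the log above, one failure event per line.
sample = """\
10/15/2011 11:06:10 | GPUGRID | Temporarily failed upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1: HTTP error
10/15/2011 11:25:17 | GPUGRID | Temporarily failed upload of p5-IBUCH_5_nwEGFR_110919-14-20-RND1733_0_1: HTTP error
10/16/2011 8:13:42 | GPUGRID | Temporarily failed upload of I172R1-GIANNI_KKFREE5-38-100-RND6106_1_4: HTTP error
10/16/2011 19:54:23 | GPUGRID | Temporarily failed upload of s0r334-TONI_SH2MS3-58-100-RND1445_0_3: HTTP error
"""

print(failures_by_suffix(sample))
```

Applied to the full log above, the same count should come to seven failures for files ending in _1 and one each for files ending in _3 and _4.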

wiyosaya
Message 22290 - Posted: 18 Oct 2011, 2:02:40 UTC - in response to Message 22249.  

Toni wrote:
    Well, for ADSL connections uploads are normally much slower than downloads (sometimes as low as 128 kbit/s). What *might* be happening is that large uploads hog the upload capacity and cause smaller uploads to time out.

    I'm not sure how timeouts are handled, but I doubt it's a parameter in the server.

    Edit: these posts may be helpful:
    http://climateprediction.net/board/viewtopic.php?p=94924#p94924
    http://climateprediction.net/board/viewtopic.php?p=92119#p92119

In the first of those posts, the admin removed two 0-byte files of the same name from the server. I'm not saying this is the problem, just pointing out that the admin found "something" on the server.

Also, although the first "time-outs" appear when simultaneous uploads are in progress (as in both threads), that does not explain why I have to hit retry on the "problem file" multiple times even when it is the only file that remains to upload.

Though there is an _02 file that had a problem when one of my WUs was uploading, it seems to most often be the _01 file that has the problem, as in MarkJ's post above.

Toni
Message 22294 - Posted: 18 Oct 2011, 12:52:59 UTC - in response to Message 22290.  

Have you tried adding these to cc_config.xml?

<cc_config>
<options>
<http_transfer_timeout>900</http_transfer_timeout>
</options>
</cc_config>

wiyosaya
Message 22343 - Posted: 23 Oct 2011, 1:40:58 UTC - in response to Message 22294.  

Toni wrote:
    Have you tried adding these to cc_config.xml?

    <cc_config>
    <options>
    <http_transfer_timeout>900</http_transfer_timeout>
    </options>
    </cc_config>

I've searched the machine and cannot find this file. I am running Windows 7. Where should I put the file?

Thanks.

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 22345 - Posted: 23 Oct 2011, 2:31:44 UTC - in response to Message 22343.  

You have to create the cc_config.xml file yourself, as it's not one of the files BOINC creates for you. It is kept in the BOINC data directory. In BOINC Manager, open the Event Log (Messages if running an older BOINC) and scroll to the top; 5 or 10 lines from the top it will tell you the path to the data directory. If it doesn't, then that line has "expired". In that case stop the BOINC client, restart it, open the Event Log and you'll see the location of the data directory.

To create the cc_config.xml file, start Notepad (not Wordpad, Word or an XML editor, just Notepad), copy and paste the XML code Toni gave you into Notepad, save the file in UTF-8 or ASCII format, and exit Notepad. Double-check the name of the file to make sure it is saved as cc_config.xml; sometimes Notepad tries to add the .txt extension. Then in BOINC Manager click Advanced -> Read Config File. Then look in the Event Log, at the bottom of the messages, where it should say something like "HTTP transfer timeout: 900". Such a message indicates you created and saved the file correctly.
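
As an alternative to the Notepad steps above, the file can also be written by a short script. This is a minimal Python sketch, not an official BOINC tool: the function name is hypothetical, the data directory must be whatever your own Event Log reports, and the only setting written is the http_transfer_timeout value Toni suggested.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Same layout as the snippet Toni posted, one tag per line.
CC_CONFIG = """\
<cc_config>
<options>
<http_transfer_timeout>{timeout}</http_transfer_timeout>
</options>
</cc_config>
"""

def write_cc_config(data_dir, timeout=900):
    """Write a minimal cc_config.xml into data_dir and return its path."""
    text = CC_CONFIG.format(timeout=int(timeout))
    ET.fromstring(text)  # sanity check: raises ParseError if the XML is malformed
    path = Path(data_dir) / "cc_config.xml"
    # Encoding to plain ASCII bytes means no byte-order mark is written.
    path.write_bytes(text.encode("ascii"))
    return path

# Demo in a throwaway directory; pass your real BOINC data directory instead.
with tempfile.TemporaryDirectory() as tmp:
    print(write_cc_config(tmp).read_text(encoding="ascii"))
```

Writing the name explicitly also sidesteps the stray .txt extension that Notepad sometimes adds.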

skgiven
Message 22351 - Posted: 23 Oct 2011, 11:46:23 UTC - in response to Message 22345.  
Last modified: 23 Oct 2011, 11:46:54 UTC

From FAQ - Best configurations for GPUGRID:
    For Vista and Win7 create the file in this folder, C:\ProgramData\BOINC

    Add the following lines:
      <cc_config>
      <options>
      <report_results_immediately>1</report_results_immediately>
      <http_transfer_timeout>900</http_transfer_timeout>
      </options>
      </cc_config>


BOINC has to be closed and then opened again for the changes to take effect; "Read Config File" does just that: it reads the settings but does not implement the changes.

PS. The cc_config.xml file is not there by default in the Windows versions of BOINC; however, it is there by default in Linux versions, and comes with a list of options and log flags.


wiyosaya
Message 22379 - Posted: 27 Oct 2011, 0:42:04 UTC - in response to Message 22351.  

I created the file in the c:\programdata\boinc directory with this content:

<cc_config>
<options>
<report_results_immediately>1</report_results_immediately>
<http_transfer_timeout>900</http_transfer_timeout>
</options>
</cc_config>

I then saved the file as UTF-8 and restarted BOINC. The event log had a line saying that there was a missing "start" tag in the file. Apparently, someone has traced this to saving the file as UTF-8, so I saved it again as ANSI.
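
One plausible explanation for the "missing start tag" message, offered as an assumption rather than a confirmed diagnosis: Notepad on Windows 7 prepends a three-byte byte-order mark (EF BB BF) when saving as UTF-8, and a simple XML reader may trip over it. A small Python sketch that detects and strips such a mark (the function name is hypothetical):

```python
import codecs
import tempfile
from pathlib import Path

def strip_utf8_bom(path):
    """Remove a leading UTF-8 byte-order mark from the file, if present.

    Returns True if a BOM was found and removed, False otherwise.
    """
    path = Path(path)
    data = path.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        path.write_bytes(data[len(codecs.BOM_UTF8):])
        return True
    return False

# Demo on a throwaway file rather than the real cc_config.xml.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "cc_config.xml"
    demo.write_bytes(codecs.BOM_UTF8 + b"<cc_config>\n</cc_config>\n")
    print(strip_utf8_bom(demo))  # a BOM is present on the first pass
    print(strip_utf8_bom(demo))  # nothing left to strip on the second
```

Saving as ANSI, as described above, avoids the mark in the first place.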

On restarting BOINC, I see nothing indicating that the HTTP transfer timeout has been set to 900 seconds; however, there is a line indicating that report results immediately has been set. So, should the line about the transfer timeout appear?

Thanks for the help.

Matthew

skgiven
Message 22384 - Posted: 27 Oct 2011, 7:49:23 UTC - in response to Message 22379.  
Last modified: 27 Oct 2011, 8:18:36 UTC

The cc_config http_transfer_timeout option was introduced with BOINC version 6.12.27, so it should work with your 6.12.34. Your cc_config.xml contents look fine. When I tested http_transfer_timeout, the Event Log did not report anything either.
By default the timeout for file transfers is 300 seconds, so after 5 minutes of connection inactivity a transfer attempt would abort.

WirelessDude
Joined: 3 Aug 11
Posts: 21
Credit: 189,614,059
RAC: 0
Message 22405 - Posted: 28 Oct 2011, 18:10:42 UTC - in response to Message 22235.  

Just so that you don't feel all alone on the issue, I have a WU now and then not immediately uploading. But, it does eventually upload on its own...
---
WirelessDude

wiyosaya
Message 22414 - Posted: 29 Oct 2011, 18:29:25 UTC

Yes, the WUs do eventually upload. I tend to run my computers just for the weekend, and periodically during the week. What I find annoying is that if there are failed uploads when I am trying to shut down for the weekend, I will sometimes have to hit retry numerous times and sometimes it takes 20 minutes to shut my machines down because the WU fails to complete its upload.

I am now running with the latest settings as suggested, and the problem is still evident. In addition, it looks like setting the HTTP timeout to 900 seconds has not helped, as the _1 file timed out in approximately 33 seconds.

Here's the latest log of the upload process for a WU that completed this morning:

10/29/2011 11:19:31 | GPUGRID | Computation for task s0r694-TONI_SH2MS3-66-100-RND2873_0 finished
10/29/2011 11:19:42 | GPUGRID | Starting task I63R0-GIANNI_KKFREE5-56-100-RND9456_0 using acemdlong version 615
10/29/2011 11:19:43 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_0
10/29/2011 11:19:43 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_1
10/29/2011 11:19:47 | GPUGRID | Finished upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_0
10/29/2011 11:19:47 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_2
10/29/2011 11:20:00 | GPUGRID | Finished upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_2
10/29/2011 11:20:00 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_3
10/29/2011 11:20:08 | GPUGRID | Finished upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_3
10/29/2011 11:20:08 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_4
10/29/2011 11:20:16 | GPUGRID | Temporarily failed upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_1: HTTP error
10/29/2011 11:20:16 | GPUGRID | Backing off 15 min 58 sec on upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_1

10/29/2011 11:20:16 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_7
10/29/2011 11:20:17 | GPUGRID | Finished upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_7
10/29/2011 11:20:19 | | Project communication failed: attempting access to reference site
10/29/2011 11:20:20 | | Internet access OK - project servers may be temporarily down.
10/29/2011 11:25:57 | GPUGRID | Finished upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_4
10/29/2011 11:36:14 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_1
10/29/2011 11:36:24 | GPUGRID | Finished upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_1
10/29/2011 11:36:27 | GPUGRID | Sending scheduler request: To report completed tasks.
10/29/2011 11:36:27 | GPUGRID | Reporting 1 completed tasks, not requesting new tasks
10/29/2011 11:36:29 | GPUGRID | Scheduler request completed
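
The 33-second figure can be read straight off the timestamps in the log above. A minimal Python sketch of that arithmetic, assuming the "MM/DD/YYYY H:MM:SS | project | message" layout shown here (the helper name is hypothetical):

```python
from datetime import datetime

def parse_stamp(line):
    """Parse the 'MM/DD/YYYY H:MM:SS' timestamp at the start of a client log line."""
    stamp = line.split("|", 1)[0].strip()
    return datetime.strptime(stamp, "%m/%d/%Y %H:%M:%S")

# The first attempt on the _1 file, copied from the log above.
started = "10/29/2011 11:19:43 | GPUGRID | Started upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_1"
failed = "10/29/2011 11:20:16 | GPUGRID | Temporarily failed upload of s0r694-TONI_SH2MS3-66-100-RND2873_0_1: HTTP error"

elapsed = (parse_stamp(failed) - parse_stamp(started)).total_seconds()
print(f"attempt lasted {elapsed:.0f} s")  # prints "attempt lasted 33 s"
```

An attempt that ends after 33 seconds stopped well before a 900-second timeout could have come into play.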


Note that as previously reported, the _1 file failed its initial upload attempt. AFAIK, I've done everything on my end to resolve the issue.

Personally, I have 20 years' experience programming, and if it were me, I would be looking at the server logs to see if there is an indication of a problem on the server. I suspect the fact that it happens 99% of the time with the _1 file is not coincidental, and it is a good clue to finding the issue.

If you have not done so already, please humor me and check the server and/or set up some monitoring to debug this. I'm not the only one who is experiencing this issue; perhaps as in the case of the Climate Prediction project, there is something unexpected happening on the server.

Thanks.

MarkJ
Message 22421 - Posted: 30 Oct 2011, 3:30:19 UTC
Last modified: 30 Oct 2011, 3:32:59 UTC

In your cc_config file, between the <options> tags try adding the following:

<http_1_0>1</http_1_0>

Set this flag to use HTTP 1.0 instead of 1.1 (this may be needed with some proxies).

You'll need to re-read the config file or restart BOINC to pick up the change.

If that doesn't work we'll probably have to turn on the debug flags to see what sort of error response it's coming back with. Let us know how this goes.

I don't have access to the server logs, but Toni and GDF would be able to see them.

wiyosaya
Message 22589 - Posted: 26 Nov 2011, 18:45:45 UTC - in response to Message 22421.  

MarkJ wrote:
    In your cc_config file, between the <options> tags try adding the following:

    <http_1_0>1</http_1_0>

    Set this flag to use HTTP 1.0 instead of 1.1 (this may be needed with some proxies).

    You'll need to re-read the config file or restart BOINC to pick up the change.

    If that doesn't work we'll probably have to turn on the debug flags to see what sort of error response it's coming back with. Let us know how this goes.

    I don't have access to the server logs, but Toni and GDF would be able to see them.

I am not using a proxy, so it sounds like this will not help. Or should I do this anyway, just to see if it helps?

Being someone in the software industry, I would take the action that you are suggesting, i.e., set the debug flags on the server. Why? Because multiple people are experiencing this as noted by others who have posted to the thread. To me, as a software industry professional, it makes no sense that it is on the user's side when multiple people are experiencing the problem.

So, please let me know whether it is worth it for me to adjust my config file again even though I am not using a proxy and I am not the only one experiencing the problem. If you think it will help, I'll do it. Everything that was suggested, to this point, I have implemented and it has not helped.

I have a WU on my machine right now that is showing the same problem.

MarkJ
Message 22594 - Posted: 27 Nov 2011, 1:31:23 UTC - in response to Message 22589.  

wiyosaya wrote:
    Being someone in the software industry, I would take the action that you are suggesting, i.e., set the debug flags on the server. Why? Because multiple people are experiencing this as noted by others who have posted to the thread. To me, as a software industry professional, it makes no sense that it is on the user's side when multiple people are experiencing the problem.

    <snipped>

    I've a wu on my machine right now that is showing the same problem.


The debug flags I referred to are on your client, set in the same cc_config file. As I mentioned before, I don't have access to the server logs; that's something the project admins (GDF, Toni and Ignassi) would have to look at.
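
The post does not name the flags. As an assumption, the client log flags most relevant to failed transfers are file_xfer_debug, http_debug and http_xfer_debug, which go in a <log_flags> section of cc_config.xml alongside <options>. A minimal Python sketch that adds them to an existing configuration (the function name is hypothetical, and ET.indent needs Python 3.9 or later):

```python
import xml.etree.ElementTree as ET

# Assumed choice of flags; the thread itself does not say which ones to enable.
DEBUG_FLAGS = ("file_xfer_debug", "http_debug", "http_xfer_debug")

def enable_debug_flags(config_xml, flags=DEBUG_FLAGS):
    """Return config_xml with each flag set to 1 inside <log_flags>."""
    root = ET.fromstring(config_xml)
    log_flags = root.find("log_flags")
    if log_flags is None:
        log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        node = log_flags.find(name)
        if node is None:
            node = ET.SubElement(log_flags, name)
        node.text = "1"
    ET.indent(root)  # put each tag on its own line
    return ET.tostring(root, encoding="unicode") + "\n"

# The configuration posted earlier in the thread.
current = """<cc_config>
<options>
<report_results_immediately>1</report_results_immediately>
<http_transfer_timeout>900</http_transfer_timeout>
</options>
</cc_config>
"""

print(enable_debug_flags(current))
```

The existing <options> entries are left untouched; only the <log_flags> section is added or updated.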

The GPUGRID servers are usually pretty stable. We've had problems in the past when they ran out of disk space, but the error message in your BOINC client logs would tell you that. As it is, it's reporting the fairly generic "HTTP error" one.

One other thing to try (apart from shutting down BOINC and restarting it) is to flush your DNS cache. At a command prompt type "ipconfig /flushdns". This tells Windows to flush its DNS cache so that it has to look up the DNS again instead of using its local cache.

©2026 Universitat Pompeu Fabra