Server only allows one connection at a time from an IP? 30s cooldown is too short.
| Author | Message |
|---|---|
|
Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,876,970,595 · RAC: 2
|
I have a gigabit up/down fiber connection at both locations that run my computers (separate external IPs) and experience the same problem at both locations if one system is running the default 30s cooldown. Changing the default cooldown on the project server side to something longer like 5-10mins will largely solve this problem for everyone without the need for each user to run a custom client to work around the problem. Toni, please implement this on the project servers.
|
|
Joined: 11 Jul 09 · Posts: 1639 · Credit: 10,159,968,649 · RAC: 0
|
Ian, what are your current statistics? My two fastest machines are the two Linux boxes, each with 2x GTX 1660 Super or GTX 1660 Ti. I've got 321 valid tasks showing at the moment - since the start of the current run, probably. The runtimes are: Max 8,994.80 sec (149 minutes), Min 1,180.58 sec (19 minutes), Avg 3,114.27 sec (51 minutes). I'm guessing your fastest will be better than 19 minutes - maybe we ought to ask Toni to start with a 5 minute delay, and see how we go, before upping it to 10 minutes if we have to? I'm also worried about what happens if we get more bad batches - these machines spit out the error tasks in just 3 seconds. Blow two of those in succession, and I'm left waiting for the next scheduler contact. |
|
Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,876,970,595 · RAC: 2
|
The shortest I've seen on my 2080ti (PL 225W) is about 800s (13.3mins); the longest is about 3200s (53.3mins). The shortest I've seen on my 2070 (PL 150W) is about 1200s (20mins); the longest is about 6000s (1.7hrs). They could also allow more than 2 WU per GPU, and increase the max in-progress to reflect that. But really, things like bad batches shouldn't be considered for figuring the cooldown IMO. Treat that as an edge case. Plan for things to work normally most of the time.
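
As a rough sanity check on the numbers in this thread, the sketch below compares the proposed cooldowns with the shortest runtimes reported here. It assumes a host contacts the scheduler at most once per cooldown period, which is a simplification rather than a statement about how the project server or client actually behaves. The function names are made up for illustration.

```python
# Rough arithmetic only. Runtimes are the shortest figures quoted in this
# thread; cooldowns are the current 30 s and the proposed 5 and 10 minutes.
SHORTEST_RUNTIME_S = {"2080 Ti": 800, "2070": 1200, "GTX 1660 class": 1180.58}
COOLDOWNS_S = [30, 300, 600]


def max_requests_per_hour(cooldown_s: float) -> float:
    """Upper bound on scheduler requests per host per hour.

    Assumption: at most one request per cooldown period.
    """
    return 3600 / cooldown_s


def cooldowns_per_task(runtime_s: float, cooldown_s: float) -> float:
    """How many cooldown periods fit inside one task runtime."""
    return runtime_s / cooldown_s


if __name__ == "__main__":
    for c in COOLDOWNS_S:
        print(f"cooldown {c:>3}s: at most {max_requests_per_hour(c):.0f} requests/hour")
        for gpu, t in SHORTEST_RUNTIME_S.items():
            print(f"  {gpu}: {cooldowns_per_task(t, c):.1f} cooldowns per shortest task")
```

Under that assumption, even a 10 minute cooldown is shorter than the fastest task reported here (800s), so with 2 WU per GPU a card should still have work queued when the next request is allowed. Bad batches that fail in seconds are outside this estimate.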
|
|
Joined: 13 Feb 14 · Posts: 6 · Credit: 1,068,161,100 · RAC: 0
|
Please fix this issue, as it is clearly causing problems with receiving and sending work for many users. I am setting "No New Work" on this project until the issue is corrected. |
|
Joined: 1 Jan 15 · Posts: 1171 · Credit: 12,662,148,501 · RAC: 1,014,572
|
Until now I had no connection problems. However, since this afternoon there have been many of them. This should be fixed ASAP. |
|
Joined: 22 May 20 · Posts: 2 · Credit: 22,042,067 · RAC: 0
|
Hi! I just experienced the same problem: I have two old HP Z220 with GTX-960 and GTX-750Ti, and both were standing still with files that wouldn't download, and no new tasks, as the downloads were pending on the current task. It didn't help to abort the stalled downloads, or aborting the whole task - it was STILL complaining about those downloads!! :-( In the end there was nothing to do but hit the "reset project" button on both machines, but that resulted in several hundred MB of downloading for each one! :-O Now both machines are up and running again - let's see how long it'll last. Hope the admins will sort this problem out as soon as possible, before the server lines get all bogged down. Happy crunching!!! //Gunnar |
Retvari Zoltan · Joined: 20 Jan 09 · Posts: 2380 · Credit: 16,897,957,044 · RAC: 0
|
"It didn't help to abort the stalled downloads, or aborting the whole task - it was STILL complaining about those downloads!!" That's a different problem. These tasks were created before the http->https transition, so they still want to download through http, but that won't succeed. You have to abort the downloads, then restart the BOINC manager, or manually edit the client_state.xml file (see the "Warning: bad tasks re-appearing in the download queue" thread for details). |
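
Before editing client_state.xml by hand, it may help to see which entries still point at plain http. The following is a minimal read-only sketch, not an official BOINC tool: it makes no assumption about tag names in the file and simply lists every `http://` URL with its line number. The function names and the path passed to `report` are illustrative, and the file's location depends on your BOINC data directory.

```python
import re
from pathlib import Path

# Matches plain-http URLs only; "https://" does not match this pattern.
# Deliberately does not assume any particular XML tag names.
HTTP_URL = re.compile(r"http://[^\s<>\"']+")


def find_plain_http_urls(text: str) -> list[tuple[int, str]]:
    """Return (line_number, url) for every http:// URL found in text."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HTTP_URL.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


def report(path: str) -> None:
    """Print the plain-http URLs in the file at `path` without modifying it."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    for lineno, url in find_plain_http_urls(text):
        print(f"{path}:{lineno}: {url}")
```

Calling `report("client_state.xml")` from the BOINC data directory would list the candidates. Not every hit is necessarily a stalled download, so treat the output as a starting point, and stop the BOINC client and keep a backup copy before changing anything.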
|
Joined: 22 May 20 · Posts: 2 · Credit: 22,042,067 · RAC: 0
|
Thanks for pointing that out! Didn't know about that problem as I'm pretty new on this project, and when the same thing happened on both my computers simultaneously, I thought it was related to this problem. :-) Hope those faulty tasks will be cleaned out from the database asap!! They are effectively locking up my machines and forcing me to reset the project manually. Happy crunching!!! |
©2026 Universitat Pompeu Fabra