Unable to load units

jiipee

Message 53374 - Posted: 19 Dec 2019, 11:04:33 UTC

Some WUs have downloaded very slowly. This seems to affect only my single Windows box, not my Linux boxes (three Linux machines on the same network as the Win10 box).

My Win10 usually works like this:

Task / WU / Sent / Reported / Status / Run time / CPU time / Credit / App

21568109 / 16907517 / 16 Dec 2019 | 17:48:59 UTC / 16 Dec 2019 | 21:19:01 UTC / Completed and validated / 7,958.60 / 7,862.38 / 91,500.00 / New version of ACEMD v2.10 (cuda101)

but when downloading starts to take time, tasks start to look like this:

21568706 / 16908048 / 17 Dec 2019 | 5:42:39 UTC / 19 Dec 2019 | 7:42:10 UTC / Completed and validated / 7,959.27 / 7,860.33 / 61,000.00 / New version of ACEMD v2.10 (cuda101)
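For reference, the turnaround of the two tasks above can be worked out from their Sent and Reported timestamps. This is only a small sketch: the helper name `turnaround_hours` is made up for illustration, and the month abbreviation parsing assumes an English locale.

```python
from datetime import datetime

# Timestamp layout used in the task list above (with the "|" separator removed).
FMT = "%d %b %Y %H:%M:%S"

def turnaround_hours(sent: str, reported: str) -> float:
    """Hours between the Sent and Reported timestamps (both UTC)."""
    delta = datetime.strptime(reported, FMT) - datetime.strptime(sent, FMT)
    return delta.total_seconds() / 3600

# Values copied from the two task rows above.
fast = turnaround_hours("16 Dec 2019 17:48:59", "16 Dec 2019 21:19:01")
slow = turnaround_hours("17 Dec 2019 5:42:39", "19 Dec 2019 7:42:10")
print(f"{fast:.1f} h vs {slow:.1f} h")
```

The run time and CPU time of the two tasks are nearly identical, so the difference is almost entirely time spent waiting rather than computing.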

If I recall correctly, the retry time for a suspended download is something like several hours. If I manually tell it to continue, it usually proceeds normally. Is there a way to edit download properties somewhere to get the retry interval shorter?

Br, Jukka
Keith Myers
Message 53375 - Posted: 19 Dec 2019, 15:01:41 UTC - in response to Message 53374.  

Is there a way to edit download properties somewhere to get the retry interval shorter?


No. The server sets the backoff interval. Not the client.
Richard Haselgrove

Message 53376 - Posted: 19 Dec 2019, 16:58:00 UTC - in response to Message 53375.  

Is there a way to edit download properties somewhere to get the retry interval shorter?

No. The server sets the backoff interval. Not the client.

That's a different backoff (scheduler for update / request work, not download).

The client sets the download backoff, and it starts small with the first failure, then gets longer after each attempt: the idea is to reduce congestion if it's a general problem and everyone starts hammering on the server at once.

To reach an elapsed time of 50 hours instead of 4 hours, there must have been multiple failures (they'll be in your Event Log).
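A growing retry interval of this kind is usually called exponential backoff. The sketch below shows the general idea only: the base delay, the cap, the doubling rule and the random jitter are illustrative assumptions, not the values or the exact algorithm used by the BOINC client.

```python
import random

def backoff_ceiling(failures: int, base: float = 60.0, cap: float = 4 * 3600.0) -> float:
    """Upper bound (seconds) on the wait after `failures` consecutive failures.

    Doubles with each failure, starting at `base`, and never exceeds `cap`.
    All constants here are hypothetical.
    """
    exponent = min(max(failures, 1) - 1, 32)  # clamp to keep the number small
    return min(cap, base * 2 ** exponent)

def next_delay(failures: int, rng: random.Random) -> float:
    """Pick a random wait between half the ceiling and the ceiling.

    The jitter spreads out retries so that many clients do not all
    hit the server again at the same moment.
    """
    ceiling = backoff_ceiling(failures)
    return rng.uniform(ceiling / 2, ceiling)

rng = random.Random(0)
for n in range(1, 11):
    print(n, round(next_delay(n, rng)))
```

With numbers like these, the first few retries come within minutes, but after several failures in a row each wait is measured in hours, which is how a handful of failed attempts can add up to a very long elapsed time.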

One observation I have with multiple Windows machines is that the GPUGrid server doesn't like it when different computers try to connect from the same LAN / IP address in quick succession. It only takes a couple of minutes to clear, but it's very annoying: just had it at the beginning of this forum session. I can only suggest that you check the Win 10 download status any time you happen to be passing: click update on any stalled download, and keep clicking until all the stalled files have transferred.
Keith Myers
Message 53377 - Posted: 20 Dec 2019, 1:36:45 UTC - in response to Message 53376.  

One observation I have with multiple Windows machines is that the GPUGrid server doesn't like it when different computers try to connect from the same LAN / IP address in quick succession. It only takes a couple of minutes to clear, but it's very annoying: just had it at the beginning of this forum session.

Thanks for that observation. And the client correction. I wondered if it was just me or does everyone have that problem of multiple hosts trying to connect from the same LAN IP address range and the schedulers just ignoring the host request.

I've learned to wait a couple of minutes after a host has had the double scheduler connect and gone quiescent again before I attempt an update on another host.
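The "wait a couple of minutes between hosts" routine could be scripted with `boinccmd`, which can ask a client to update a project. This is a rough sketch under several assumptions: the project URL shown must match the one the clients are attached to, the three-minute gap is a guess based on the delay described above, and contacting other hosts requires remote GUI RPC access and the RPC password to be set up on each of them.

```python
import subprocess
import time
from typing import Callable, List, Sequence

# Assumed project URL; use the URL your clients are actually attached to.
PROJECT_URL = "https://www.gpugrid.net/"

def update_command(host: str, password: str, url: str = PROJECT_URL) -> List[str]:
    """Build the boinccmd call that asks one host to contact the project."""
    return ["boinccmd", "--host", host, "--passwd", password,
            "--project", url, "update"]

def staggered_update(hosts: Sequence[str], password: str,
                     gap_seconds: float = 180.0,
                     run: Callable = subprocess.run,
                     sleep: Callable = time.sleep) -> None:
    """Trigger a project update on each host in turn, pausing between hosts
    so that requests from the same public IP do not arrive back to back."""
    for index, host in enumerate(hosts):
        if index:
            sleep(gap_seconds)  # let the previous host finish and go quiet
        run(update_command(host, password), check=False)
```

A call such as `staggered_update(["linux1", "win10"], "secret")` (host names and password are placeholders) would update the two machines three minutes apart. The `run` and `sleep` parameters exist only so the function can be tried out without actually invoking `boinccmd`.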
Retvari Zoltan
Message 53378 - Posted: 20 Dec 2019, 17:47:32 UTC - in response to Message 53377.  

One observation I have with multiple Windows machines is that the GPUGrid server doesn't like it when different computers try to connect from the same LAN / IP address in quick succession. It only takes a couple of minutes to clear, but it's very annoying: just had it at the beginning of this forum session.

Thanks for that observation. And the client correction. I wondered if it was just me or does everyone have that problem of multiple hosts trying to connect from the same LAN IP address range and the schedulers just ignoring the host request.

I've learned to wait a couple of minutes after a host has had the double scheduler connect and gone quiescent again before I attempt an update on another host.
+1
Erich56

Message 53379 - Posted: 20 Dec 2019, 17:54:01 UTC - in response to Message 53378.  

One observation I have with multiple Windows machines is that the GPUGrid server doesn't like it when different computers try to connect from the same LAN / IP address in quick succession. It only takes a couple of minutes to clear, but it's very annoying: just had it at the beginning of this forum session.

Thanks for that observation. And the client correction. I wondered if it was just me or does everyone have that problem of multiple hosts trying to connect from the same LAN IP address range and the schedulers just ignoring the host request.

I've learned to wait a couple of minutes after a host has had the double scheduler connect and gone quiescent again before I attempt an update on another host.
+1

+1
Killersocke

Message 53395 - Posted: 27 Dec 2019, 7:47:05 UTC - in response to Message 51498.  

Same here...

27.12.2019 08:39:38 | GPUGRID | Requesting new tasks for CPU and NVIDIA GPU
27.12.2019 08:39:41 | | Project communication failed: attempting access to reference site
27.12.2019 08:39:42 | | Internet access OK - project servers may be temporarily down.
27.12.2019 08:40:00 | GPUGRID | Scheduler request failed: Couldn't connect to server
27.12.2019 08:40:01 | | Project communication failed: attempting access to reference site
27.12.2019 08:40:02 | | Internet access OK - project servers may be temporarily down.
Erich56

Message 53396 - Posted: 27 Dec 2019, 12:58:38 UTC - in response to Message 53395.  

Same here...

The GPUGRID server and the website were both down for a short time this morning.
But everything has been okay since then.
jjch

Message 53400 - Posted: 27 Dec 2019, 21:48:53 UTC - in response to Message 52624.  

Only a few of the Xeon E3 v2, v5 and v6 models include the integrated Intel P4000, HD P530 or Iris Pro P580 graphics.

https://www.intel.com/content/dam/www/public/us/en/documents/guides/hd-graphics-p530-p580-performance-guide.pdf
ServicEnginIC
Message 53629 - Posted: 9 Feb 2020, 12:10:05 UTC - in response to Message 53379.  

One observation I have with multiple Windows machines is that the GPUGrid server doesn't like it when different computers try to connect from the same LAN / IP address in quick succession. It only takes a couple of minutes to clear, but it's very annoying: just had it at the beginning of this forum session.

Thanks for that observation. And the client correction. I wondered if it was just me or does everyone have that problem of multiple hosts trying to connect from the same LAN IP address range and the schedulers just ignoring the host request.

I've learned to wait a couple of minutes after a host has had the double scheduler connect and gone quiescent again before I attempt an update on another host.

+1

+1

+1
It affects not only scheduler requests, but also GPUGrid's other web services.

Example: if one of my hosts has recently asked for new tasks and I try to access GPUGrid's web page from another host on the same network, access is not possible until a certain delay has passed.
This is possibly due to some DoS (denial-of-service) protection running on GPUGrid's server or firewall.

This has been mentioned in other posts, such as this one:
It's not just the speed.
There's some DDOS prevention algorithm in operation, because my hosts gets blocked if they try to contact the server one by one in rapid succession (from the same public IP address).

What can we do to mitigate this effect???

There's no easy way to fix this in our end.


Edit:
Second example: I had to submit this post twice, with a pause in between. The first attempt was not answered by the server...