New D3RBanditTest workunits

Message boards : News : New D3RBanditTest workunits

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,722,595
RAC: 4,266,994
Level
Trp
Message 56885 - Posted: 21 May 2021, 15:32:48 UTC

Glad to see a large batch of these new units coming out. should keep us well fed for another few weeks.

my fast GPUs really like these long units.

~10hrs on a 2080ti (225W)
~13hrs on a 2080 (185W)
~17hrs on a 2070 (150W)
~27hrs on a 1660S (100W)
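
A rough sketch of what those figures imply for energy per task. It assumes each card draws the listed wattage for the whole run, which the post does not actually state, and the runtimes are only approximate:

```python
# Approximate runtimes (hours) and power figures (watts) quoted in the post above.
# Assumption: each card draws the listed wattage for the full run.
cards = {
    "2080ti": (10, 225),
    "2080": (13, 185),
    "2070": (17, 150),
    "1660S": (27, 100),
}


def energy_kwh(hours, watts):
    """Energy used by one task, in kilowatt-hours."""
    return hours * watts / 1000


energy = {name: energy_kwh(hours, watts) for name, (hours, watts) in cards.items()}

for name, kwh in energy.items():
    print(f"{name}: ~{kwh:.2f} kWh per task")
```

Under that assumption the spread is small: roughly 2.25 kWh per task on the 2080ti versus roughly 2.70 kWh on the 1660S.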

Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 869
Level
Trp
Message 56886 - Posted: 21 May 2021, 18:11:39 UTC - in response to Message 56885.  

Glad to see a large batch of these new units coming out.

I suspect they still won't run on Ampere cards ?

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,722,595
RAC: 4,266,994
Level
Trp
Message 56887 - Posted: 21 May 2021, 18:29:38 UTC - in response to Message 56886.  

nope. still CUDA 10.0/10.1 = no Ampere support. still waiting for that CUDA 11.1+ app.

note, the compatibility issue is with the application, not the tasks. there are many types of tasks here (MDAD, Pocket Discovery, D3RBandit, etc) but they all use the same acemd3 app.

keep tabs on the applications the project has available here: https://www.gpugrid.net/apps.php

unless you see "cuda111" or "cuda112" listed, don't count on Ampere support
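
A minimal sketch of that rule of thumb. The function name and the sample plan-class lists are hypothetical, and it assumes plan class names embed the CUDA version as digits (e.g. "cuda101" for 10.1); it does not fetch or parse the apps page itself:

```python
import re

# CUDA 11.1, the minimum version the post above names for Ampere support.
MIN_AMPERE_CUDA = 111


def supports_ampere(plan_classes):
    """Return True if any plan class name looks like CUDA 11.1 or newer.

    Assumption: names carry the version as digits right after "cuda".
    """
    for name in plan_classes:
        match = re.search(r"cuda(\d+)", name)
        if match and int(match.group(1)) >= MIN_AMPERE_CUDA:
            return True
    return False


# Hypothetical plan-class lists, for illustration only.
print(supports_ampere(["cuda100", "cuda101"]))
print(supports_ampere(["cuda101", "cuda111"]))
```

The first call prints False and the second prints True, matching the "no cuda111/cuda112, no Ampere" rule.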

Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 869
Level
Trp
Message 56888 - Posted: 21 May 2021, 19:18:56 UTC - in response to Message 56887.  

unless you see "cuda111" or "cuda112" listed, don't count on Ampere support

thanks for the information; what a pity :-(

bozz4science
Joined: 22 May 20
Posts: 110
Credit: 115,525,136
RAC: 345
Level
Cys
Message 56892 - Posted: 26 May 2021, 15:17:09 UTC

Does anyone know, by any chance, what the current batch of tasks (D3RBandit) is all about? What do we compute?

And any pointer as to what the nmax parameter indicates? I have seen WUs with max 1000, 2000 and 5000, but all are taking pretty much the same time to compute.

Thx

Pop Piasa
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Message 56893 - Posted: 26 May 2021, 15:43:29 UTC - in response to Message 56887.  

Lately I've caught lots of WUs that have bounced off one or more hosts running Ampere GPUs. There is much untapped resource, even with the new anti-mining feature.
"Together we crunch
To check out a hunch
And wish all our credit
Could just buy us lunch"


Piasa Tribe - Illini Nation

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,722,595
RAC: 4,266,994
Level
Trp
Message 56894 - Posted: 26 May 2021, 15:51:09 UTC - in response to Message 56893.  

Lately I've caught lots of WUs that have bounced off one or more hosts running Ampere GPUs. There is much untapped resource, even with the new anti-mining feature.


I gave up waiting for Ampere support here. It was clear the project devs have it at lowest priority (every time I asked about it, I was ignored, even when they were responsive about any other topic).

I traded my 3070 for a 2080ti and moved on.

Pop Piasa
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Message 56899 - Posted: 26 May 2021, 22:33:04 UTC - in response to Message 56894.  

Ian (and Bozz4science), something else we're apparently not allowed to ask is what it is we are crunching.

I was so bold as to put it on the wish list, no reply. I don't mean that anyone should reveal proprietary info, just a general categorization of the type of research it is, as Tony did on the previous methods project.

bozz4science
Joined: 22 May 20
Posts: 110
Credit: 115,525,136
RAC: 345
Level
Cys
Message 56903 - Posted: 27 May 2021, 14:42:37 UTC
Last modified: 27 May 2021, 14:43:32 UTC

Yeah, sadly that is very disappointing to say the least. Information policy here is annoying sometimes due to its non-existence. I am just a small fish with my little machine, but it would certainly drive me nuts not getting a single statement from the project team with respect to planned Ampere support.

Otherwise, IMO GPUGrid is certainly doing many things right in terms of website curation and the research publications list, but I hate not knowing what I am computing for atm, ahead of any prospective paper months or years down the line. What is so hard about telling us the top-level category of research our GPUs are computing? Is it about cancer? cov2? methods? brain? Is that too much to ask for? No one expects much more than that. In that regard I think F@H is much ahead: they offer more comprehensive and easy-to-access information about any WU/project a volunteer is computing for.

Erich56
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 869
Level
Trp
Message 56905 - Posted: 27 May 2021, 17:59:59 UTC - in response to Message 56903.  

Yeah, sadly that is very disappointing to say the least. Information policy here is annoying sometimes due to its non-existence. I am just a small fish with my little machine, but it would certainly drive me nuts not getting a single statement from the project team with respect to planned Ampere support.

+ 1

ServicEnginIC
Joined: 24 Sep 10
Posts: 592
Credit: 11,972,186,510
RAC: 998,578
Level
Trp
Message 56907 - Posted: 28 May 2021, 10:59:35 UTC

Yeah, sadly that is very disappointing to say the least. Information policy here is annoying sometimes due to its non-existence. I am just a small fish with my little machine, but it would certainly drive me nuts not getting a single statement from the project team with respect to planned Ampere support.

Otherwise, IMO GPUGrid is certainly doing many things right in terms of website curation and the research publications list, but I hate not knowing what I am computing for atm, ahead of any prospective paper months or years down the line. What is so hard about telling us the top-level category of research our GPUs are computing? Is it about cancer? cov2? methods? brain? Is that too much to ask for? No one expects much more than that. In that regard I think F@H is much ahead: they offer more comprehensive and easy-to-access information about any WU/project a volunteer is computing for.

+1 (adding "Please") <--|
->-----------------------------|

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,722,595
RAC: 4,266,994
Level
Trp
Message 56908 - Posted: 28 May 2021, 14:27:18 UTC

over 4000 tasks ready to send.

looks like we'll have work available for some time to come.

ServicEnginIC
Joined: 24 Sep 10
Posts: 592
Credit: 11,972,186,510
RAC: 998,578
Level
Trp
Message 56909 - Posted: 28 May 2021, 16:12:01 UTC

The error on the D3RBandit task below wasn't due to the known "restart on a different device" problem.
The failed task always ran on the same device, 0.
The cause was actually a reboot after an Nvidia driver update from version 460.80 to version 465.27.
I had suspended BOINC activity during the transition, but that wasn't enough to keep the task from failing...
I've taken note of this.
Next time, I'll schedule such a driver upgrade for a moment when no Gpugrid tasks are running.


Ian&Steve C.
Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,722,595
RAC: 4,266,994
Level
Trp
Message 56910 - Posted: 28 May 2021, 17:29:31 UTC - in response to Message 56909.  

I've seen this happen on occasion too. Definitely have to be more careful with these long running D3RBandit tasks, or risk throwing away a lot of computation time.

Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 2
Level
Trp
Message 56914 - Posted: 29 May 2021, 11:32:15 UTC

These current WUs perform worse than anything I've ever seen from GG. Far more failures than even WUs that run.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,722,595
RAC: 4,266,994
Level
Trp
Message 56915 - Posted: 29 May 2021, 12:23:00 UTC - in response to Message 56914.  
Last modified: 29 May 2021, 12:53:10 UTC

These current WUs perform worse than anything I've ever seen from GG. Far more failures than even WUs that run.

sounds like something wrong on your end. I've had very few failures.

I have only 2 legitimate computation errors with the latest D3RBandit series, from any of my systems, of the hundreds of tasks that I've processed in the past few weeks. that's not including things like me aborting them for whatever reason, or the server cancelling a resend, or a download error.

one of the failures was a bad WU (all hosts failed)
the other looks like it was some random problem on my host, as it was processed by another host eventually

if you un-hide your hosts, I might be able to see what the problem is.

ServicEnginIC
Joined: 24 Sep 10
Posts: 592
Credit: 11,972,186,510
RAC: 998,578
Level
Trp
Message 56918 - Posted: 29 May 2021, 12:52:18 UTC - in response to Message 56914.  

These current WUs perform worse than anything I've ever seen from GG. Far more failures than even WUs that run.

That would have an explanation if you had upgraded your hosts to Ampere GPUs.
That series of graphics cards is not supported by current Gpugrid applications, and every task will fail immediately.

bozz4science
Joined: 22 May 20
Posts: 110
Credit: 115,525,136
RAC: 345
Level
Cys
Message 56919 - Posted: 29 May 2021, 12:56:20 UTC

Those Anaconda Python 3 Environment tasks admittedly had a very high failure rate, but they were part of a test batch with no intent to compute on actual data. D3RBandit tasks are finishing just fine except for the known suspend/resume issue.

Hope you can quickly figure out what is causing this issue for you

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1114
Credit: 40,838,722,595
RAC: 4,266,994
Level
Trp
Message 56920 - Posted: 29 May 2021, 13:01:02 UTC - in response to Message 56919.  

Those Anaconda Python 3 Environment tasks admittedly had a very high failure rate, but they were part of a test batch with no intent to compute on actual data. D3RBandit tasks are finishing just fine except for the known suspend/resume issue.

Hope you can quickly figure out what is causing this issue for you


agreed that the Python tasks had a lot of failures, but this is the D3RBandit thread so I have to assume he's referring to those.


Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 1
Level
Trp
Message 56928 - Posted: 3 Jun 2021, 8:15:04 UTC

There are 200 workunits left.
It will last for 8 hours.
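
The arithmetic behind that estimate, as a small sketch. The constant-rate assumption is not stated in the post, and the 4000 figure is simply the "ready to send" count quoted earlier in the thread:

```python
# Figures quoted in the thread; the constant-rate assumption is illustrative only.
workunits_left = 200
hours_remaining = 8

# Implied rate at which the server hands out workunits.
drain_rate = workunits_left / hours_remaining


def hours_to_drain(queue_size, rate_per_hour):
    """Hours until a queue of queue_size workunits empties at a constant rate."""
    return queue_size / rate_per_hour


print(f"Implied rate: {drain_rate:.0f} workunits/hour")
print(f"4000 workunits would last ~{hours_to_drain(4000, drain_rate) / 24:.1f} days")
```

At that rate 200 workunits last 8 hours, and a 4000-workunit queue would last a little under a week.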
©2025 Universitat Pompeu Fabra