GPUGRID active users are falling dramatically!

Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 47358 - Posted: 3 Jun 2017, 9:24:46 UTC
Last modified: 3 Jun 2017, 10:13:46 UTC

Why are active users falling on this project?

Lost over 33% of active users since April. That's bad and doesn't bode well for the future.

Only faster cards are holding flops/s up.

Time I think for this project to get proactive on how to attract and retain high end users, communicate more and stop taking users for granted.



Source: https://boincstats.com/en/stats/45/project/detail/user

I posted this over a year ago https://gpugrid.net/forum_thread.php?id=4304#43468
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 47359 - Posted: 3 Jun 2017, 15:55:19 UTC - in response to Message 47358.  

Maybe the occasional very long work units are discouraging some people? I don't think the bonuses are that important, but it is a psychological thing. Also, the reduced output of the app under Windows may be a problem, though I am mainly on Linux now and am glad to have the work.
kain

Joined: 3 Sep 14
Posts: 152
Credit: 918,557,369
RAC: 28
Level
Glu
Message 47361 - Posted: 3 Jun 2017, 20:58:02 UTC

It is called summer in the northern hemisphere. High temps = air conditioning, and air conditioning doesn't work well with high-end computers crunching for a better future...
Betting Slip

Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 47362 - Posted: 3 Jun 2017, 21:45:22 UTC - in response to Message 47361.  

It is called summer in the northern hemisphere. High temps = air conditioning, and air conditioning doesn't work well with high-end computers crunching for a better future...


I have had ambient temperatures of up to 32 °C in summer and still run computers without air conditioning. Are you trying to tell me that's responsible for a 33% drop in users that started in April?

I would like to believe you but I can't just yet.
Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Level
Trp
Message 47363 - Posted: 4 Jun 2017, 0:49:26 UTC - in response to Message 47362.  
Last modified: 4 Jun 2017, 0:50:59 UTC

What happened between 21st & 22nd of April? (actually a month before this date)
21. April: 3133 users
22. April: 2840 users
That's 293 fewer (about 9.4%) in a day.
There's a "thing" called charityengine.
It actually installs BOINC Manager and connects it to different projects (at least it did the last time I encountered it; I did not have the guts to install it on my computers, as they are selling the computing power their users contribute). I think there are a lot of users on every project thanks to charityengine, but I think these users could quit after a short period if they don't win a fortune.
Maybe it's not them, but another similar "thing".
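
For anyone who wants to check the day-over-day figure quoted above, here is a minimal sketch of the arithmetic. The two user counts are the ones from this post; the function and variable names are only illustrative.

```python
def percent_drop(before: int, after: int) -> float:
    """Return the relative drop from `before` to `after`, in percent."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Active-user counts quoted in the post (21 and 22 April).
users_21_april = 3133
users_22_april = 2840

lost = users_21_april - users_22_april
drop = percent_drop(users_21_april, users_22_april)
print(f"{lost} fewer users, a drop of {drop:.1f}%")
```

The same function could be applied to any two counts from the statistics page to compare a single-day step with the overall decline since April.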
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Message 47364 - Posted: 4 Jun 2017, 6:20:05 UTC - in response to Message 47358.  

Why are active users falling on this project?

Lost over 33% of active users since April. That's bad and doesn't bode well for the future.

Only faster cards are holding flops/s up.

Time I think for this project to get proactive on how to attract and retain high end users, communicate more and stop taking users for granted.


Really surprised? I am not!

It all started out around mid-April when all of a sudden the crunching software became invalid and stopped working (which should never ever happen that way).

So new software needed to be put together, and since this had to be done in a hurry, it was rather buggy. Here are just a few examples of what one could read in various threads of the forum, and what I myself have been experiencing since:

- the new software is around 30% slower :-(

- GPU overclocking is much less possible than before (at least for Maxwell cards; no idea how it is with Pascals).

- tasks stop for unknown reasons, and only continue if they are switched off (suspended) and switched on again manually (so, if a cruncher does not notice such a stop for say 10 hours, the system runs idle for 10 hours).

- the new software does not work well with BOINC: when pushing the "Suspend" button in the BOINC Manager (either in "Tasks" or in "Projects"), it takes several minutes until the task reacts and stops.

In the recent past, GPUGRID tasks have become even more GPU-straining and long-lasting; for example, "ADRIA_FOLDGREED10_crystal_ss_contacts_100_ubiquitin" (also the _50_ubiquitin): on a GTX750Ti (with some involuntary stops in between, as mentioned above), it can take 3 days or more until this task is finished.
That's why I had suggested that on the Project Preference page, besides "short runs" (which virtually don't exist any more) and "long runs", a third category like "extra long runs" (or whatever wording suits) be implemented, so that the many GTX750Ti crunchers can exclude such long tasks from download.

And here we are at the next problem:
back at GPUGRID, no one really seems to care which problems the crunchers have and which suggestions they are presenting.
Reading a lot in the forum, I can think of so many other people writing about all kinds of problems, making useful suggestions and also asking questions now and then. However: NO REACTION AT ALL!

So, coming back to the beginning of this posting: I am NOT surprised that people are turning away from this project. Sorry to say this :-(

By accident, in the forum of another BOINC project I participate in, I read a statement yesterday from the project people there:
"Of course having happy volunteers is very important for the health of a project; so it is something that should be addressed ..."
Why is this different with GPUGRID?
randi

Joined: 9 Nov 16
Posts: 2
Credit: 3,321,875
RAC: 0
Level
Ala
Message 47367 - Posted: 5 Jun 2017, 16:10:18 UTC
Last modified: 5 Jun 2017, 16:11:07 UTC

I have been waiting a long time to get a task.

Normally I say no to long tasks because they take a VERY long time on my computer.
Recently I changed that to yes, but I am still not getting any tasks.

6/5/2017 12:07:49 | GPUGRID | update requested by user
6/5/2017 12:07:54 | GPUGRID | Sending scheduler request: Requested by user.
6/5/2017 12:07:54 | GPUGRID | Requesting new tasks for NVIDIA GPU
6/5/2017 12:07:56 | GPUGRID | Scheduler request completed: got 0 new tasks
6/5/2017 12:07:56 | GPUGRID | No tasks sent
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Message 47368 - Posted: 5 Jun 2017, 16:30:02 UTC - in response to Message 47367.  

I have been waiting a long time to get a task.

I guess this is not quite the right thread to post your problem in.

A lot of statements and opinions about the problem of not getting tasks are contained in this thread here:

http://gpugrid.net/forum_thread.php?id=4574

You may want to look this up; perhaps you will get an idea of what's wrong.

Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Message 47404 - Posted: 12 Jun 2017, 9:45:50 UTC - in response to Message 47364.  

...
It all started out around mid-April when all of a sudden the crunching software became invalid and stopped working (which should never ever happen that way).

So new software needed to be put together, and since this had to be done in a hurry, it was rather buggy. Here are just a few examples of what one could read in various threads of the forum, and what I myself have been experiencing since:

- the new software is around 30% slower :-(

- GPU overclocking is much less possible than before (at least for Maxwell cards; no idea how it is with Pascals).

- tasks stop for unknown reasons, and only continue if they are switched off (suspended) and switched on again manually (so, if a cruncher does not notice such a stop for say 10 hours, the system runs idle for 10 hours).

- the new software does not work well with BOINC: when pushing the "Suspend" button in the BOINC Manager (either in "Tasks" or in "Projects"), it takes several minutes until the task reacts and stops.

I am curious how much longer it will take the GPUGRID people to acknowledge that the current software is buggy and needs to be repaired!
More or less every day, I get annoyed by these bugs cited above :-(
Stefan
Project administrator
Project developer
Project tester
Project scientist

Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Message 47407 - Posted: 12 Jun 2017, 10:07:17 UTC - in response to Message 47404.  

Fixing bugs with BOINC is relatively pointless from our perspective (and time-intensive). We are rather considering other options, like moving away from it, but don't ask when or how, as it's more an idea than a scheduled plan.

I am sorry for the inconvenience this causes.

The reason we cannot address technical issues with BOINC is that we don't have anyone in the lab anymore who knows his way around it and that priorities are higher on getting scientific work done. Of course you have a point that this will eventually bite us in the ass since we won't be able to do scientific work without crunchers but it's a tricky thing to manage.

Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 47411 - Posted: 12 Jun 2017, 13:39:13 UTC - in response to Message 47407.  

Thanks for replying.

It is unclear to me which bugs are actual BOINC bugs (if any?), which bugs are NVIDIA bugs (if any?), and which bugs are GPUGrid app bugs.

I hope you guys can get the resources you need to triage and solve the bugs. If you think any are BOINC or NVIDIA bugs, please let the community know, so we can (continue to) offer help in solving those. I have been asking for a while, with no response.

Jacob
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Message 47433 - Posted: 14 Jun 2017, 13:50:53 UTC - in response to Message 47407.  

Fixing bugs with BOINC is relatively pointless from our perspective (and time-intensive). We are rather considering other options, like moving away from it, but don't ask when or how, as it's more an idea than a scheduled plan.

I am sorry for the inconvenience this causes.

The reason we cannot address technical issues with BOINC is that we don't have anyone in the lab anymore who knows his way around it and that priorities are higher on getting scientific work done. Of course you have a point that this will eventually bite us in the ass since we won't be able to do scientific work without crunchers but it's a tricky thing to manage.

Sorry, Stefan, for contradicting you.
I don't think that any of the deficits in the crunching software 9.18 have to do with BOINC. So blaming BOINC, at least the way I see it, is simply wrong.

As said before, this software was obviously compiled in a hurry, overnight so to speak, without much (thorough) testing.
None of these bugs existed with the previous software.

The content of the second paragraph of your posting makes me worry even more.
Again, as I said in another posting, a project of the magnitude of GPUGRID definitely needs a certain amount of infrastructure expertise. Just having the scientists there is not enough.

If, for example, no one at GPUGRID is able to reply to my posting
http://gpugrid.net/forum_thread.php?id=4561&nowrap=true#47204
from a month ago, then something needs to be improved. Definitely so.
Otherwise, GPUGRID really risks losing more and more crunchers. Which would be too bad - I personally feel that GPUGRID is a fantastic project! And that's why I am participating :-)
So, please put your heads together to come up with a solution!
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 47434 - Posted: 14 Jun 2017, 14:20:21 UTC - in response to Message 47433.  

I don't think that any of the deficits in the crunching software 9.18 have to do with BOINC. So blaming BOINC, at least the way I see it, is simply wrong.

FWIW, I have always advocated optimizing the apps for the latest hardware, since I think you get more bang for the crunching buck that way. If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here.

I usually have fairly new cards, and you will get a lot of complaints from people with older cards that they are being abandoned, or that they are being "forced" to buy new cards (I love that one).

So you have a choice. Make the one that is best for the science.
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Message 47435 - Posted: 14 Jun 2017, 15:13:16 UTC - in response to Message 47434.  
Last modified: 14 Jun 2017, 15:50:18 UTC

If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here.

One thing that's interesting:

the GTX750Ti in the host with Windows10 now shows problems with the new software.
the GTX750Ti in the host with WindowsXP does NOT show any problems - although this software is also new, but not the same as for Windows10.

I guess there won't be many crunchers using WindowsXP; so, many of the crunchers using their GTX750Ti with Windows10 might have problems now. And I also guess that there are many crunchers with a GTX750Ti. What can be done: throw the GTX750Ti's away? :-(

Last year, I bought two GTX780Ti just for GPUGRID crunching, Euro 700 each. So far, they work perfectly with WindowsXP. When GPUGRID support ends in April of next year, I'll need to change to Windows10. And then all the problems will begin.
However, I don't think that I will exchange them for two new Pascals. Paying some 1400 Euros every two years just to have the latest generation of cards in order to have GPUGRID running smoothly?
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 47436 - Posted: 14 Jun 2017, 16:02:42 UTC - in response to Message 47435.  
Last modified: 14 Jun 2017, 16:22:56 UTC

the GTX750Ti in the host with Windows10 now shows problems with the new software.
the GTX750Ti in the host with WindowsXP does NOT show any problems - although this software is also new, but not the same as for Windows10.

I guess that says something about WDDM, but I don't know what. It would be fun to trace it down, but GPUGrid just does not have the staff it seems. That is why they have to avoid unnecessary risks if they can. It is not a perfect solution, but seems to be the best under the circumstances.

I was planning to wait for Volta, but that will be a long time, so I migrated out of the lower-end cards into a few Pascals for higher efficiency in the warmer months, though it is still a mix. The prices are much more reasonable in the U.S., especially on sales. But everything has gone through the roof now, apparently with high demand for AMD cards even spilling over into Nvidia.
mikey

Joined: 2 Jan 09
Posts: 303
Credit: 7,321,800,090
RAC: 330
Level
Tyr
Message 47470 - Posted: 18 Jun 2017, 11:17:29 UTC - in response to Message 47435.  

If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here.

One thing that's interesting:

the GTX750Ti in the host with Windows10 now shows problems with the new software.
the GTX750Ti in the host with WindowsXP does NOT show any problems - although this software is also new, but not the same as for Windows10.

I guess there won't be many crunchers using WindowsXP; so, many of the crunchers using their GTX750Ti with Windows10 might have problems now. And I also guess that there are many crunchers with a GTX750Ti. What can be done: throw the GTX750Ti's away? :-(

Last year, I bought two GTX780Ti just for GPUGRID crunching, Euro 700 each. So far, they work perfectly with WindowsXP. When GPUGRID support ends in April of next year, I'll need to change to Windows10. And then all the problems will begin.
However, I don't think that I will exchange them for two new Pascals. Paying some 1400 Euros every two years just to have the latest generation of cards in order to have GPUGRID running smoothly?


On other projects some people have gone back to older drivers for their older GPUs, and that got the GPUs working under Win10 again. In short, try older drivers and see if your Win10 machine can crunch again; it may just work for you too.
ChristianVirtual

Joined: 16 Aug 14
Posts: 17
Credit: 378,346,925
RAC: 0
Level
Asp
Message 47471 - Posted: 18 Jun 2017, 11:33:31 UTC
Last modified: 18 Jun 2017, 11:55:59 UTC

For what it is worth: no issues with Linux / CentOS on my 980Ti,1080 and 1080Ti ... come over to the bright side of life ;-)

Update: oops, sorry, just saw the version difference ... kind of still learning the technical details here ...
Erich56

Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 1
Level
Trp
Message 47473 - Posted: 18 Jun 2017, 16:13:12 UTC - in response to Message 47470.  

On other projects some people have gone back to older drivers for their older GPUs, and that got the GPUs working under Win10 again. In short, try older drivers and see if your Win10 machine can crunch again; it may just work for you too.

The new crunching software acemd 918.80 only works with the latest drivers.
My two Windows 10 machines had run with 376.53 before, and with the new crunching software I had to update to 381.65 to get GPUGRID to run.
Furthermore, Matt clearly pointed out that the new software requires the newest drivers.

In other words: no way to install older drivers to get the problems solved :-(
Zarck
Joined: 16 Aug 08
Posts: 145
Credit: 328,473,995
RAC: 0
Level
Asp
Message 47505 - Posted: 29 Jun 2017, 11:59:28 UTC
Last modified: 29 Jun 2017, 11:59:46 UTC

I have stopped running your project. For my part, far too many work units end in error, especially after 12 hours of calculation (Titan 2013). Is it not possible to still get points for the time computed? Moreover, I have no problems with Asteroids, Folding, milkyway etc.

@+
*_*
Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Level
Trp
Message 47506 - Posted: 29 Jun 2017, 13:13:50 UTC - in response to Message 47505.  

I stopped your project, for my part far too many units end up in error ...
Your GPU is too hot (88°C); that's the reason for so many errors (see your stderr.txt output, which I've attached at the end of this post).
You should improve the cooling of your card: increase the airflow by raising the RPM of the GPU's fan, install extra fans in your PC case, or remove its side panel.
Alternatively, you can reduce the clock speed (or the power target) of your card to decrease its power consumption (= its heat output).

especially after 12 hours of calculations (Titan 2013) is it not possible to still have points for calculated time?
No. A partial result is useless for GPUGrid, as it can't be used for generating the next step of the simulation, so it has to be calculated again (on another host).

Moreover, I have no problems with Asteroids, Folding, milkyway etc.
Those projects do not stress the GPU as much as GPUGrid does.

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -55 (0xffffffc9)
</message>
<stderr_txt>
# GPU [GeForce GTX TITAN] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0	:
#	Name		: GeForce GTX TITAN
#	ECC		: Disabled
#	Global mem	: 6144MB
#	Capability	: 3.5
#	PCI ID		: 0000:02:00.0
#	Device clock	: 875MHz
#	Memory clock	: 3004MHz
#	Memory width	: 384bit
#	Driver version	: r382_48 : 38253
# GPU 0 : 70C
# GPU 0 : 77C
# GPU 0 : 80C
# GPU 0 : 81C
# GPU 0 : 82C
# GPU 0 : 83C
# GPU 0 : 86C
# GPU 0 : 87C
# GPU [GeForce GTX TITAN] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0	:
#	Name		: GeForce GTX TITAN
#	ECC		: Disabled
#	Global mem	: 6144MB
#	Capability	: 3.5
#	PCI ID		: 0000:02:00.0
#	Device clock	: 875MHz
#	Memory clock	: 3004MHz
#	Memory width	: 384bit
#	Driver version	: r382_48 : 38253
# GPU 0 : 77C
# GPU 0 : 80C
# GPU 0 : 82C
# GPU 0 : 83C
# GPU 0 : 84C
# GPU 0 : 87C
# GPU 0 : 88C
SWAN : FATAL : Cuda driver error 702 in file 'swanlibnv2.cpp' in line 1965.
# SWAN swan_assert 0

</stderr_txt>
]]>
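
If you want to check your own failed tasks for the same symptom, the temperature lines in a stderr output like the one above can be pulled out with a few lines of Python. This is only a rough sketch: the line format is taken from the output shown here, the function name is made up for illustration, and the 85 °C warning threshold is my own assumption, not an official limit from GPUGrid or NVIDIA.

```python
import re

# Matches temperature lines such as "# GPU 0 : 88C" in the stderr output.
# The device header line "# GPU [GeForce GTX TITAN] ..." does not match.
TEMP_LINE = re.compile(r"^#\s*GPU\s+(\d+)\s*:\s*(\d+)C\s*$")


def max_gpu_temps(stderr_text: str) -> dict[int, int]:
    """Return the highest temperature (in degrees C) logged per GPU index."""
    peaks: dict[int, int] = {}
    for line in stderr_text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            gpu, temp = int(match.group(1)), int(match.group(2))
            peaks[gpu] = max(temp, peaks.get(gpu, temp))
    return peaks


# A shortened excerpt of the stderr output quoted in this post.
sample = """\
# GPU [GeForce GTX TITAN] Platform [Windows] Rev [3212] VERSION [80]
# GPU 0 : 77C
# GPU 0 : 84C
# GPU 0 : 88C
SWAN : FATAL : Cuda driver error 702 in file 'swanlibnv2.cpp' in line 1965.
"""

WARN_AT_C = 85  # assumed threshold for illustration only

peaks = max_gpu_temps(sample)
for gpu, temp in sorted(peaks.items()):
    status = "too hot" if temp >= WARN_AT_C else "ok"
    print(f"GPU {gpu}: peak {temp}C ({status})")
```

Paste the contents of a task's stderr output into `sample` (or read it from a file) to see the peak temperature reached before the task failed.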

©2025 Universitat Pompeu Fabra