Message boards : Graphics cards (GPUs) : I'm running short of WUs!!

Profile Edboard
Joined: 24 Sep 08
Posts: 72
Credit: 12,410,275
RAC: 0
Message 2871 - Posted: 8 Oct 2008 | 5:46:10 UTC

I have a PC with two PCIe slots and an Nvidia GTX 280 in each one. I have a dual-core CPU, so I can crunch two WUs at a time in GPUGRID. The problem is that I'm not getting enough WUs to crunch continuously. I can do 7 WUs a day in total, which is less than my limit of 8 WUs a day. But I NEVER have any WU waiting to be crunched (in addition to those being crunched) and, WORSE STILL, now I have no units at all, with two GPUs doing NOTHING.

What can I do?

Is the project running short of units?

The other BOINC projects that I'm running have many WUs waiting to be processed.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 2874 - Posted: 8 Oct 2008 | 6:36:54 UTC

Once we have a stable new BOINC release and all the other client changes that are already in the pipeline, changing this policy to something like "at least 2 concurrent WUs per CUDA device" is hopefully on the developers' list.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 2878 - Posted: 8 Oct 2008 | 8:42:15 UTC - in response to Message 2874.

For the time being I will increase the limit from 8 to 10.

gdf

ExtraTerrestrial Apes
Message 2883 - Posted: 8 Oct 2008 | 17:13:45 UTC - in response to Message 2878.

I think he's running into the limit of 1 concurrent WU per CPU, not the overall quota (but increasing it to 10 is not a bad idea anyway). So with 2 cores and 2 GPUs he can't have a cache and gets idle time as soon as the WU-transfer and scheduler contact take some time.

MrS

Profile Edboard
Message 2905 - Posted: 9 Oct 2008 | 6:21:50 UTC - in response to Message 2883.

> I think he's running into the limit of 1 concurrent WU per CPU, not the overall quota (but increasing it to 10 is not a bad idea anyway). So with 2 cores and 2 GPUs he can't have a cache and gets idle time as soon as the WU-transfer and scheduler contact take some time.
>
> MrS

Yes, but the problem is that my PC sits idle for more than two hours waiting for more WUs. Then I switch it to FOLDING and try GPUGRID again some hours later.

Profile Edboard
Message 2915 - Posted: 9 Oct 2008 | 17:17:49 UTC - in response to Message 2905.
Last modified: 9 Oct 2008 | 17:18:59 UTC

Until "at least 2 concurrent WUs per CUDA device" arrives, I have one GPU crunching for GPUGRID while the other one crunches for FOLDING.

ExtraTerrestrial Apes
Message 2917 - Posted: 9 Oct 2008 | 17:24:16 UTC

Seems like the best solution for you for now. Do you know how to teach BOINC to use only 1 of 2 GPUs?

MrS

Profile GDF
Message 2918 - Posted: 9 Oct 2008 | 17:35:41 UTC - in response to Message 2917.

We are looking for solutions to this problem.

gdf

Profile UL1
Joined: 16 Sep 07
Posts: 56
Credit: 35,013,195
RAC: 0
Message 2920 - Posted: 9 Oct 2008 | 18:08:24 UTC - in response to Message 2918.

Similar problem over here: running two 9800 GX2 = 4 GPUs (and today was the first day they ran without errors)... and there are no WUs waiting...

Profile Edboard
Message 2923 - Posted: 9 Oct 2008 | 18:54:24 UTC - in response to Message 2917.
Last modified: 9 Oct 2008 | 18:59:01 UTC

> Do you know how to teach BOINC to use only 1 of 2 GPUs?
>
> MrS

No. I have set a 1-core limit in the client, which GPUGRID automatically uses to run one GPU. The drawback is that the other core does only FOLDING, which leaves 80% of it unused. I have tried to limit the cores to 1 using the GPUGRID web configuration, so that the other one can be used by other BOINC projects, but it doesn't work (in fact, that is the default configuration).

Profile Edboard
Message 2972 - Posted: 11 Oct 2008 | 6:08:59 UTC

I upgraded to version .14 and the scheduling problem is worse now. For more than 24 hours I was crunching only one WU at a time (one GPU crunching and one GPU doing nothing). Now both my GPUs are idle. More than 5 hours have passed and there is no WU to crunch. Theoretically this PC could do more than 7 WUs a day if it were fed constantly.

ExtraTerrestrial Apes
Message 2975 - Posted: 11 Oct 2008 | 10:01:33 UTC

Why are you not getting new WUs? Your host seems to be this one, which has a daily quota of 10 (the maximum). Is BOINC not requesting new work or is the scheduler denying them?

MrS

Profile Edboard
Message 2979 - Posted: 11 Oct 2008 | 12:04:54 UTC - in response to Message 2975.

> Why are you not getting new WUs? Your host seems to be this one, which has a daily quota of 10 (the maximum). Is BOINC not requesting new work or is the scheduler denying them?
>
> MrS

Yes, that is the PC. I'm getting WUs, but at a "slow pace". In the meantime (since my last post) I have received two more units, after the computer spent almost 9 hours with both GPUs idle. Now it is crunching both of these WUs, but due to the "1 WU per CPU" limit, and since I have an Intel 8500 (two cores), I have not received any more units and have none "in reserve", so the problem will surely repeat when these WUs are completed (it takes me 7 hours to process a WU). I don't understand that "9 hours" lag in downloading units.

Profile koschi
Joined: 14 Aug 08
Posts: 124
Credit: 792,979,198
RAC: 799
Message 2980 - Posted: 11 Oct 2008 | 12:22:55 UTC
Last modified: 11 Oct 2008 | 12:25:04 UTC

When this happens to me (hitting the 2 WU limit, which normally only happens when the project is down) and my GPU goes idle, I update the project once in the project tab. Then it immediately reports the finished units and gets new ones...

What you could do is set up a task in the Windows task planner (no idea what it's really called) so that the following command is executed once an hour...

c:/path/to/your/boinc/boinccmd.exe --project http://www.ps3grid.net update

It's just a temporary solution until the project hands out units based on the number of GPUs, not CPUs...
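
The hourly update described above could also be driven from a small script instead of the task planner. The following is only a minimal sketch in Python under stated assumptions: the executable path is the placeholder from the post (not a real install location), the one-hour interval comes from the post, and the helper names `build_update_command` and `update_every` are made up for illustration.

```python
import subprocess
import time

# Assumed values: the path is the placeholder used in the post above.
BOINCCMD = "c:/path/to/your/boinc/boinccmd.exe"
PROJECT_URL = "http://www.ps3grid.net"


def build_update_command(boinccmd=BOINCCMD, project_url=PROJECT_URL):
    """Return the argument list for the project update command quoted in the post."""
    return [boinccmd, "--project", project_url, "update"]


def update_every(interval_seconds=3600, rounds=None):
    """Run the update command repeatedly, sleeping between runs.

    rounds=None loops forever; a number limits how many updates are sent.
    """
    done = 0
    while rounds is None or done < rounds:
        # check=False: a failed update should not stop the loop.
        subprocess.run(build_update_command(), check=False)
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_seconds)


# Example (not executed here): update_every() would send one update per hour.
```

Note that this only automates the manual update; it does not get around the per-CPU task limit discussed in this thread.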

Profile Edboard
Message 2981 - Posted: 11 Oct 2008 | 12:35:24 UTC - in response to Message 2980.
Last modified: 11 Oct 2008 | 12:36:06 UTC

No, I tried it manually many times and the answer is "got 0 tasks". I only receive tasks when the scheduler wants...

Profile Stefan Ledwina
Joined: 16 Jul 07
Posts: 464
Credit: 288,617,006
RAC: 1,969,263
Message 2982 - Posted: 11 Oct 2008 | 12:48:43 UTC

Does BOINC ask for work if you click on the update button and you get the "got 0 tasks" answer from the server?

What does your resource share look like for PS3GRID and the other projects?
How have you set the options "connect to the internet every n days" and "maintain enough work for an additional n days"?
____________

pixelicious.at - my little photoblog

Temujin
Joined: 12 Jul 07
Posts: 100
Credit: 21,848,502
RAC: 0
Message 2983 - Posted: 11 Oct 2008 | 12:51:07 UTC - in response to Message 2981.

> No, I tried it manually many times and the answer is "got 0 tasks". I only receive tasks when the scheduler wants...

I've just had the same happen on my quad core.
I got around it by suspending the other active project (seti) and then PS3Grid immediately requested more work.

Profile Edboard
Message 2985 - Posted: 11 Oct 2008 | 13:39:52 UTC - in response to Message 2982.
Last modified: 11 Oct 2008 | 13:41:36 UTC

> Does BOINC ask for work if you click on the update button and you get the "got 0 tasks" answer from the server?
>
> What does your resource share look like for PS3GRID and the other projects?
> How have you set the options "connect to the internet every n days" and "maintain enough work for an additional n days"?

a- Yes, as I said before, I get the "got 0 tasks" message.
b- 100% for each project. I also run 3x+1, but when the GPUs begin to crunch, it stops (only two cores).
c- I left them at the defaults. Some time ago I set the "maintain enough..." field to 10 days and the computer tried many times to get WUs, but it always got this message from the server (in two lines): "No work sent (reached per-cpu limit of 1 tasks)" and only got 2 WUs, none apart from the ones being crunched.

Profile Edboard
Message 2986 - Posted: 11 Oct 2008 | 13:43:34 UTC - in response to Message 2983.
Last modified: 11 Oct 2008 | 13:44:13 UTC

> > No, I tried it manually many times and the answer is "got 0 tasks". I only receive tasks when the scheduler wants...
> I've just had the same happen on my quad core.
> I got around it by suspending the other active project (seti) and then PS3Grid immediately requested more work.

I'll try. I run the 3x+1 project because the GPUs are idle so much of the time. If they are working, no other project can run, so it would be unnecessary to have any project besides GPUGRID.

Profile Edboard
Message 2987 - Posted: 11 Oct 2008 | 13:49:51 UTC - in response to Message 2986.

I have just tried it and YES: as soon as I stopped the 3x+1 project, the client immediately tried to get WUs, but it got the "reached per-cpu limit..." message (because I still have two units being processed). But at least it tried to get WUs. So I'll leave the other project stopped and wait until the running WUs complete... I'll report later.

ExtraTerrestrial Apes
Message 2988 - Posted: 11 Oct 2008 | 14:01:18 UTC - in response to Message 2985.

> Does BOINC ask for work if you click on the update button and you get the "got 0 tasks" answer from the server?
>
> a- Yes, as I said before, I get the "got 0 tasks"


Before a line like:
"11/10/2008 15:21:54|PS3GRID|Scheduler request completed: got 0 new tasks"

there should be a line like:
"11/10/2008 15:21:49|PS3GRID|Sending scheduler request: To fetch work. Requesting 6725 seconds of work, reporting 0 completed tasks"

If your BOINC wrote "requesting 0 seconds of work" we'd have found the reason for your problem (but not necessarily the solution).

MrS
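
The check described in the post above (did the client really ask for work?) can be done by reading the request line in the message log. The following Python sketch assumes the log line format quoted in the post; other BOINC versions may word the line differently, and the helper names `requested_seconds` and `asked_for_work` are made up for illustration.

```python
import re

# Matches the "Requesting N seconds of work" part of the request line quoted above.
REQUEST_RE = re.compile(r"Requesting (\d+) seconds of work", re.IGNORECASE)


def requested_seconds(log_line):
    """Return the number of seconds requested, or None if the line is not a work request."""
    match = REQUEST_RE.search(log_line)
    return int(match.group(1)) if match else None


def asked_for_work(log_line):
    """True only if the client requested a positive amount of work."""
    seconds = requested_seconds(log_line)
    return seconds is not None and seconds > 0


line = ("11/10/2008 15:21:49|PS3GRID|Sending scheduler request: To fetch work. "
        "Requesting 6725 seconds of work, reporting 0 completed tasks")
print(requested_seconds(line))  # 6725
print(asked_for_work(line))     # True
```

A request for 0 seconds makes `asked_for_work` return False, which is the case MrS describes as the likely reason for "got 0 new tasks".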

Profile Edboard
Message 2989 - Posted: 11 Oct 2008 | 14:14:57 UTC - in response to Message 2988.

> ··· Before a line like:
> "11/10/2008 15:21:54|PS3GRID|Scheduler request completed: got 0 new tasks"
>
> there should be a line like:
> "11/10/2008 15:21:49|PS3GRID|Sending scheduler request: To fetch work. Requesting 6725 seconds of work, reporting 0 completed tasks"
>
> If your BOINC wrote "requesting 0 seconds of work" we'd have found the reason for your problem (but not necessarily the solution).
>
> MrS

Yes, what I have marked in red (the "requesting 0 seconds of work" part) is exactly what I get when I do a manual update.

Profile Stefan Ledwina
Message 2990 - Posted: 11 Oct 2008 | 14:24:25 UTC - in response to Message 2989.

Well then it is clear why you don't get GPUGRID tasks - BOINC doesn't ask for them... ;)

To make sure I always have enough GPUGRID tasks in the queue I have set the resource share for GPUGRID/PS3GRID to 1000.
All other CPU-only projects have a resource share of 200 to at most 600.

Then I have set a short "connect to internet" time of 0.1 and "maintain enough work for an additional n days" is set to 0.7.

That way I always have enough work from GPUGRID and also from other projects on my Quadcores.
It may still be a problem with a Dualcore and two GPUs, but BOINC should at least ask for more work from GPUGRID more frequently with such settings.
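
Resource shares are relative weights, so what matters is each project's share divided by the sum of all shares. The following Python sketch illustrates this with the numbers mentioned above; the two CPU-only project names are hypothetical, and the real BOINC work-fetch logic takes more than this simple proportion into account.

```python
def share_fractions(shares):
    """Map each project to its fraction of the total resource share."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Hypothetical host: GPUGRID at 1000 plus two CPU-only projects at 200 and 600.
shares = {"GPUGRID": 1000, "cpu_project_a": 200, "cpu_project_b": 600}
for name, fraction in share_fractions(shares).items():
    print(f"{name}: {fraction:.1%}")
# GPUGRID: 55.6%
# cpu_project_a: 11.1%
# cpu_project_b: 33.3%
```

With equal shares for two projects, each would get 50%, which is the situation in which the client in this thread rarely asked GPUGRID for new work.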

Profile Edboard
Message 2991 - Posted: 11 Oct 2008 | 14:41:42 UTC - in response to Message 2990.
Last modified: 11 Oct 2008 | 14:42:33 UTC

> ··· but BOINC should at least ask for more work from GPUGRID more frequently with such settings.

I understand the idea, but when I make a request manually I get no WU, so I don't think it matters whether the requests are made automatically: I'll get no WU again. Or am I wrong?

Profile Stefan Ledwina
Message 2992 - Posted: 11 Oct 2008 | 15:06:55 UTC - in response to Message 2991.

Well, to be honest, a manual update won't do any good. And suspending project A to get more work from project B won't help much either, because that way the BOINC scheduler will hardly ever "learn" how much work it has to fetch from which project... If you would like to crunch more for one project than for another, set the resource share for that project higher, and just wait...

But the main problem at GPUGRID as it is now is that you can only have as many tasks in the queue as you have CPUs.
This means that with a dual-core you can never have more than two GPUGRID tasks. That wouldn't be a problem if you only had one GPU, but with two GPUs it's bad...

If you set a higher resource share and short connect-to-Internet intervals, you should at least (I hope) not have to wait 9 hours until you get a new PS3GRID WU.

A better solution would be for GDF to change the max. WU per CPU setting to 2, but this might cause problems with quad-cores with only one GPU... So we will all have to wait until there is a server setting that allows x WUs per GPU instead of x WUs per CPU as it is now.
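
The trade-off described above can be put into numbers. The following Python sketch assumes the rule as stated in this thread (tasks allowed = number of CPUs times the per-CPU limit) and that each busy GPU holds one task; the function names are made up for illustration.

```python
def max_tasks(cpus, per_cpu_limit=1):
    """Tasks a host may hold at once under a per-CPU limit."""
    return cpus * per_cpu_limit


def spare_tasks(cpus, gpus, per_cpu_limit=1):
    """Tasks left waiting in the queue once every GPU is busy (never negative)."""
    return max(0, max_tasks(cpus, per_cpu_limit) - gpus)


# (cpus, gpus): a dual-core with two GPUs and a quad-core with one GPU.
for cpus, gpus in [(2, 2), (4, 1)]:
    for limit in (1, 2):
        print(f"{cpus} CPUs, {gpus} GPUs, limit {limit}: "
              f"{spare_tasks(cpus, gpus, limit)} spare")
# 2 CPUs, 2 GPUs, limit 1: 0 spare
# 2 CPUs, 2 GPUs, limit 2: 2 spare
# 4 CPUs, 1 GPUs, limit 1: 3 spare
# 4 CPUs, 1 GPUs, limit 2: 7 spare
```

With a limit of 1 the dual-core host has no cache at all, while raising the limit to 2 would let a quad-core with a single GPU queue seven tasks beyond the one it is running.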

Profile KyleFL
Joined: 28 Aug 08
Posts: 33
Credit: 786,046
RAC: 0
Message 2993 - Posted: 11 Oct 2008 | 16:03:20 UTC - in response to Message 2985.
Last modified: 11 Oct 2008 | 16:07:50 UTC

> What does your resource share look like for PS3GRID and the other projects?
>
> b- 100% each project. I actually have 3x+1, but when the GPUs begin to crunch, they stop (only two cores).

That's the reason why your client won't ask for new work for GPU-Grid.
It thinks that GPU-Grid has already had its share of project time, so it doesn't request new work for it. After several hours of idling, the CPU projects seem to catch up and the GPU project gets new work. I don't know exactly how that is calculated, but it could be based on the credits you get for a processed unit. On GPU-Grid they are much higher because of the massive computing power of the GPU, so the scheduler starts to give the other projects more time to bring them to the same level.

Just set the resource share here on GPU-Grid (under your account) to 1000 and everything should work. Of course there is still the problem that you only get 2 WUs because of the limit of 1 WU per CPU core (I hope that gets fixed some time, as it's really annoying for dual-core PCs).


Cu KyleFL

Profile Edboard
Message 2994 - Posted: 11 Oct 2008 | 17:12:14 UTC

The two WUs were completed a short while ago. The client loaded one more task and stopped. There was no other project running. I waited some time and tried what I had been told: I activated the 3x+1 project that was suspended, waited a minute or so, and suspended it again. Immediately a new WU began to download. But this has a serious drawback: it has to be done manually...

ExtraTerrestrial Apes
Message 2995 - Posted: 11 Oct 2008 | 17:48:34 UTC

What's your current resource share?

MrS

Profile Edboard
Message 2996 - Posted: 11 Oct 2008 | 18:03:18 UTC - in response to Message 2995.

Resource share = 1000 for gpugrid. 100 for all other projects.

Profile Edboard
Message 2998 - Posted: 12 Oct 2008 | 0:11:43 UTC
Last modified: 12 Oct 2008 | 0:13:25 UTC

It seems to work now. I checked just now, expecting to have to force the new WUs manually, and was surprised to find that both finished WUs had been sent and the new ones downloaded automatically. The share=1000 seems to work. The other projects were suspended too (which in this case is irrelevant because they have no work either).

I hope this repeats the next time.

Thank you, thank you very much.

Profile Edboard
Message 3002 - Posted: 12 Oct 2008 | 13:02:24 UTC

Well, with the "share = 1000" trick I can get WUs. But the complete solution will only arrive when I can have at least two WUs waiting to be processed (going beyond the "1 per-cpu limit"), to cover the sometimes long time that elapses until a WU download is complete, during which the corresponding GPU sits idle.

Profile Edboard
Message 3036 - Posted: 13 Oct 2008 | 20:32:47 UTC
Last modified: 13 Oct 2008 | 20:33:07 UTC

It seems that no CPU-based BOINC project may be active if the upload/download process is to run without a lag of hours.

Profile Krunchin-Keith [USA]
Joined: 17 May 07
Posts: 512
Credit: 111,288,061
RAC: 0
Message 3068 - Posted: 15 Oct 2008 | 20:23:32 UTC

I have suggested that we need a GPU limit, like the CPU limit, but they need to be two separate items; currently there is only a CPU limit, which is applied to both. How to do this will be considered, and it may appear in a future version. This will avoid two problems: not getting enough work for the GPUs when you have fewer CPUs than GPUs, and getting too much extra work when you have more CPUs than GPUs. In essence you should always end up with 1 task running and 1 on standby for each GPU, provided you have enough CPUs to support them. The current approach was just an extension of the existing CPU limits, which has proved to be insufficient.
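
One possible reading of the suggestion above, sketched in Python. This is not an existing server setting: the function name, the `per_gpu_limit` parameter and the interpretation of "enough CPUs to support them" as one CPU core per GPU are all assumptions made for illustration.

```python
def tasks_allowed(cpus, gpus, per_gpu_limit=2):
    """Hypothetical per-GPU cap: per_gpu_limit tasks (1 running plus 1 on standby
    by default) for each GPU that has a CPU core available to feed it."""
    usable_gpus = min(cpus, gpus)  # assumption: each GPU task needs one CPU core
    return usable_gpus * per_gpu_limit


print(tasks_allowed(cpus=2, gpus=2))  # 4: both GPUs busy, one task in reserve for each
print(tasks_allowed(cpus=4, gpus=1))  # 2: no extra work piling up on a quad-core
print(tasks_allowed(cpus=2, gpus=4))  # 4: only two of the four GPUs can be fed
```

Under this reading the dual-core host with two GPUs from the start of the thread would hold four tasks instead of two, while a quad-core with one GPU would hold two instead of four.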
