Run times

Paul D. Buck
Message 12554 - Posted: 19 Sep 2009, 4:13:03 UTC

Has the run time been creeping up while I was not looking? I recall run times of about 6:30 on most of my cards (with some plus/minus slop), but now it seems I am seeing 7:30 to 8:00 run times.

I can even see some that have projected run times of over 9 hours...
GDF (project administrator)
Message 12562 - Posted: 19 Sep 2009, 17:40:35 UTC - in response to Message 12554.  

There is a new bug in BOINC (6.6.36) which assigns all WUs to the same GPU if you have multiple GPUs. Since they then time-share that one GPU, they take much longer.
I am not sure whether this is your case.

Can anyone suggest a good BOINC version which is new, but without major flaws?

gdf
Paul D. Buck
Message 12563 - Posted: 19 Sep 2009, 18:07:47 UTC - in response to Message 12562.  

There is a new bug in BOINC (6.6.36) which assigns all WUs to the same GPU if you have multiple GPUs. Since they then time-share that one GPU, they take much longer.
I am not sure whether this is your case.

Can anyone suggest a good BOINC version which is new, but without major flaws?

gdf

GDF,

As far as I know, the "assign all tasks to GPU 0" bug is Linux-only ... and of course, that is what you run ... :)

At the moment I am running mostly 6.6.3x versions, but I have been pretty happy with 6.10.3, which does not have the two major issues of 6.10.4 and .5 ... 6.10.6 fixed a couple of issues but still leaves uncorrected some problems with the order in which it processes GPU tasks for some people (introduced in 6.10.4).

Also note that I think I just uncovered a new bug / situation with task ordering on the GPU with multiple projects that, in essence, will cause Resource Share to be ignored. I do not know how far back in versions this bug extends. For me it is new only in the sense that until now there was no pressure to run multiple projects, for the simple reason that effectively there were no other GPU projects to run ...

Now that we are ramping up more and more projects with GPU capabilities ... well ...

Anyway, my suggestions are still 6.5.0, 6.6.36, or 6.10.3; and as I said, these are versions I have run extensively or am running now ...
Paul D. Buck
Message 12565 - Posted: 19 Sep 2009, 19:35:35 UTC

I just realized that you neatly sidestepped my original question... :)

Are the tasks longer now, or is it my imagination? I am not talking about longer run times caused by bugs, just normal run times...
Temujin
Message 12566 - Posted: 19 Sep 2009, 20:04:00 UTC - in response to Message 12562.  

There is a new bug in BOINC (6.6.36) which assigns all WUs to the same GPU if you have multiple GPUs. Since they then time-share that one GPU, they take much longer.

Aha, that'll be why my GTX 295 has started working, albeit slowly.
JackOfAll
Message 12569 - Posted: 20 Sep 2009, 0:55:57 UTC - in response to Message 12562.  
Last modified: 20 Sep 2009, 1:00:09 UTC

There is a new bug in BOINC (6.6.36) which assigns all WUs to the same GPU if you have multiple GPUs. Since they then time-share that one GPU, they take much longer.
I am not sure whether this is your case.

Can anyone suggest a good BOINC version which is new, but without major flaws?


GPU scheduling seems to be FUBARed for Linux in one way or another with pretty much all releases. There is the "everything gets assigned to --device 0" bug in the 6.6.3x series (because coproc_cmdline() is called after the fork()), and the preempt problems with 6.10.x.

I'm running the 6_6a branch (equivalent to an unreleased 6.6.39) plus the following patch (r18836 from trunk), which resolves the "--device 0" issue. It seems pretty solid.

--- boinc_core_release_6_6_39/client/app_start.cpp.orig	2009-09-15 11:18:45.000000000 +0100
+++ boinc_core_release_6_6_39/client/app_start.cpp	2009-09-15 11:52:34.000000000 +0100
@@ -104,8 +104,10 @@
 }
 #endif
 
-// for apps that use coprocessors, reserve the instances,
-// and append "--device x" to the command line
+// For apps that use coprocessors, reserve the instances,
+// and append "--device x" to the command line.
+// NOTE: on Linux, you must call this before the fork(), not after.
+// Otherwise the reservation is a no-op.
 //
 static void coproc_cmdline(
     COPROC* coproc, ACTIVE_TASK* atp, int ninstances, char* cmdline
@@ -793,6 +795,13 @@
 
     getcwd(current_dir, sizeof(current_dir));
 
+    sprintf(cmdline, "%s %s",
+        wup->command_line.c_str(), app_version->cmdline
+    );
+    if (coproc_cuda && app_version->ncudas) {
+        coproc_cmdline(coproc_cuda, this, app_version->ncudas, cmdline);
+    }
+
     // Set up core/app shared memory seg if needed
     //
     if (!app_client_shm.shm) {
@@ -924,10 +933,6 @@
             }
         }
 #endif
-        sprintf(cmdline, "%s %s", wup->command_line.c_str(), app_version->cmdline);
-        if (coproc_cuda && app_version->ncudas) {
-            coproc_cmdline(coproc_cuda, this, app_version->ncudas, cmdline);
-        }
         sprintf(buf, "../../%s", exec_path );
         if (g_use_sandbox) {
             char switcher_path[100];


Send me a PM if you want a link to the RPMs and SRPM for a Fedora 11 build.
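
For anyone wondering why the call site matters, here is a minimal standalone sketch (not BOINC code; the app name and device number are just placeholders) of the underlying issue: anything the parent appends to the command line after fork() never reaches the child, because the child already holds its own copy of the buffer.

#include <cstdio>
#include <cstring>
#include <sys/wait.h>
#include <unistd.h>

int main() {
    char cmdline[256] = "science_app";       // command line as built so far (placeholder app name)
    pid_t pid = fork();
    if (pid == 0) {
        // Child: runs with whatever it copied at fork time.
        std::printf("child runs: %s\n", cmdline);   // prints "science_app" - no --device flag
        _exit(0);
    }
    // Parent: appending here only edits the parent's copy, i.e. it is a no-op for the child.
    std::strcat(cmdline, " --device 1");
    waitpid(pid, nullptr, 0);
    std::printf("parent built: %s\n", cmdline);
    return 0;
}

The r18836 patch above simply moves the coproc_cmdline() call so that the "--device x" argument is already in the command line before the fork.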
Jet
Message 12570 - Posted: 20 Sep 2009, 7:05:39 UTC - in response to Message 12554.  

I have to agree with you, Paul.
Unfortunately I couldn't find records from before 31 July (when the software was updated to CUDA 2.2 capability) to confirm your impression, but I am sure you are right: WUs have become longer.
New WUs take longer to complete (from 6-6:30 hours up to 7:30+ hours), by my estimate. Previously my station could complete at least 3.5-4 WUs/day per GPU; right now I'm happy with 3 WUs/day. Granted, to keep the station more stable I downclocked the GPUs a bit (from 1.63 GHz to 1.57 GHz), but that wasn't the main reason.
Running a rock-stable GTX 260 x3 under BOINC 6.10.0, 190.38 driver, on Windows Server 2008.

BTW, do you run other projects on your farms' CPUs? If yes, it could be that a neighbouring project running on the CPUs consumes a bit of the CPU power the GPUs need in order to be serviced (data feed and output, etc.). That could be one of the reasons as well, I think. Right now I'm checking this on my system.


GDF
Message 12571 - Posted: 20 Sep 2009, 7:21:10 UTC - in response to Message 12570.  

No, WUs are not longer. They are designed to last 1/4 of a day on a fast card.

gdf
GDF
Message 12572 - Posted: 20 Sep 2009, 7:23:33 UTC - in response to Message 12571.  

I have updated the recommended client to 6.10.3.

thanks, gdf
MarkJ
Message 12576 - Posted: 20 Sep 2009, 11:29:03 UTC - in response to Message 12571.  

No, WUs are not longer. They are designed to last 1/4 of a day on a fast card.

gdf


Looking through my last batch of results, the shortest seems to be 7 hours 30 minutes and the majority seem to be around 8 hours 30 minutes. That was taken from the "approximate elapsed time" shown in the WU results.

These were run on a GTX 295 and a GTX 275, so by no means slow cards, although they do run at stock speeds. One machine (the GTX 275) has BOINC 6.6.37 and the other is currently running 6.10.3, under Windows.
Paul D. Buck
Message 12577 - Posted: 20 Sep 2009, 15:43:14 UTC - in response to Message 12571.  

No, WUs are not longer. They are designed to last 1/4 of a day on a fast card.

Well, your design is bent ...

I used to get timings in the range of 6 hours and change on my GTX 295 cards... sadly, I cannot prove this, as the task list is truncated at about the first of September and I am thinking back to much earlier.

If your intent is to run for about 1/4 of a day, or 6 hours, well, you are overshooting that on GTX 260, GTX 285, and GTX 295 cards ... the more common time seems to be up around 28,000 seconds rather than down at 21,000 seconds.
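
(For scale, doing the arithmetic: 1/4 of a day is 21,600 seconds, i.e. 6 hours; 28,000 seconds is roughly 7 hours 47 minutes; and a 9-hour run is 32,400 seconds, more than 1/3 of a day.)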

This does seem task dependent.

I am only pointing this out because it seems strange that before, most tasks came in under 7 hours, and now more and more are running up to 9 hours.

And you don't seem to be aware of the increase in run times ...

A minor point, then, is that you are shading the credit grant ... :)

But most important to me is that you do not seem to be aware that you are overrunning your execution-time targets ...

Off to see the football ...
GDF
Message 12578 - Posted: 20 Sep 2009, 16:22:36 UTC - in response to Message 12577.  
Last modified: 20 Sep 2009, 16:23:17 UTC

OK, let's say that it is between 1/4 and 1/3 of a day.
The calculation is approximate; it is not designed to be exact.

gdf
Paul D. Buck
Message 12581 - Posted: 21 Sep 2009, 6:08:00 UTC - in response to Message 12578.  

OK, let's say that it is between 1/4 and 1/3 of a day.
The calculation is approximate; it is not designed to be exact.

Ok ...

But I am not sure that you are seeing the point of my question...

Are you aware that the time is growing? The only reason I really noticed it is that for the last couple of months I was not able to pay attention to GPUGRID (notice the lack of posting), and it was a bit of a shock to see that my run times are now almost always over 7 hours, running as high as 9, where before they were really consistently clustered around 6.5 hours ...

Not to put too fine a point on it, but if this is the case, the low-end hardware recommendation needs revision ...
Tom Philippart
Message 12583 - Posted: 21 Sep 2009, 8:41:35 UTC - in response to Message 12581.  

I had a similar increase in runtime just after I upgraded to the 190 drivers. Completely removing them and reinstalling them fixed it for me!
SuperViruS
Message 12606 - Posted: 22 Sep 2009, 6:00:54 UTC

The increased time may be due to a bug in the 190.xx drivers which puts the GPU into 2D mode; more information in this post.
Paul D. Buck
Message 12607 - Posted: 22 Sep 2009, 6:57:49 UTC - in response to Message 12606.  

The increased time may be due to a bug in the 190.xx drivers which puts the GPU into 2D mode; more information in this post.

I could have sworn we were told we needed to update to the 190-series drivers. Did I misunderstand? I think I have all my systems running the 190.62 drivers now ... no, two are on 190.62 and one is on 190.38 ...

The thing is, since I don't turn my systems off and they run 24/7, I don't see how they would get back into 3D mode if the issue is down-shifting to 2D mode ... I would think that once it dropped down it could not, or at least would not, re-adjust upward for the next task.

That is why I have trouble thinking this is that kind of problem. I have not done a survey, though my quick look seemed to hint that it is more likely task-type dependent ... that is, some of the tasks (by task-name class) are now running longer than the norms ...
RalphEllis
Message 12649 - Posted: 23 Sep 2009, 2:55:04 UTC - in response to Message 12607.  

With the new Linux CUDA 2.2 application, my work units are running faster and producing more credit per day. I am using the 190.32 NVIDIA drivers.
When I was running with the 190 drivers in Windows, the work units were not running faster, but they were more stable and used less CPU time.
Jet
Message 12691 - Posted: 23 Sep 2009, 19:03:29 UTC - in response to Message 12606.  

I don't think a sudden switch to 2D mode could be the reason. I run GPU-Z almost all the time to monitor the core / memory frequencies, as well as core temps. All three GTX 260s run at full load. Additionally, the main details, including the core frequency, are shown in the <stderr_txt> file. Here is a sample:

# Using CUDA device 0
# Device 0: "GeForce GTX 260"
# Clock rate: 1.59 GHz
# Total amount of global memory: 939524096 bytes
# Number of multiprocessors: 27
# Number of cores: 216
# Driver version 2030
# Runtime version 2020
# Device 1: "GeForce GTX 260"
# Clock rate: 1.59 GHz
# Total amount of global memory: 939524096 bytes
# Number of multiprocessors: 27
# Number of cores: 216
# Driver version 2030
# Runtime version 2020
# Device 2: "GeForce GTX 260"
# Clock rate: 1.59 GHz
# Total amount of global memory: 939524096 bytes
# Number of multiprocessors: 27
# Number of cores: 216
# Driver version 2030
# Runtime version 2020
MDIO ERROR: cannot open file "restart.coor"
# Time per step: 51.394 ms
# Approximate elapsed time for entire WU: 32121.105 s
called boinc_finish

</stderr_txt>

No sign of a drop to 2D mode.
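
(A rough cross-check from those last two numbers, assuming the elapsed time is approximately steps × time per step: 32,121 s ÷ 51.394 ms works out to roughly 625,000 steps for this WU, so a longer run time means either more steps per WU or a slower time per step.)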

Betting Slip
Message 12714 - Posted: 24 Sep 2009, 11:47:34 UTC - in response to Message 12691.  

Another consideration is that the amount of CPU time has risen sharply, which slows down other projects.
GPUGRID was my project of choice for the GPU; however, it appears to consume a hefty amount of CPU, more than I would expect, even though I am aware it needs to use some CPU time.
JackOfAll
Message 12715 - Posted: 24 Sep 2009, 12:41:47 UTC - in response to Message 12714.  

Another consideration is that the amount of CPU time has risen sharply, which slows down other projects. GPUGRID was my project of choice for the GPU; however, it appears to consume a hefty amount of CPU, more than I would expect, even though I am aware it needs to use some CPU time.


I have noticed that v670 of the Linux app uses approx 10% more CPU than it used to with v666. I wonder whether that is by design or an unwelcome side effect?