Message boards : Graphics cards (GPUs) : New applications ACEMD2 6.07/6.08 for Win and Lin

GDF
Message 17386 - Posted: 28 May 2010 | 12:32:14 UTC

Changes are:
1) it better supports cards with little memory
2) setting the environment variable SWAN_SYNC=0 will force the use of an entire CPU core to run faster. This now works on both Windows and Linux.

gdf

skgiven
Message 17389 - Posted: 28 May 2010 | 13:26:01 UTC - in response to Message 17386.

What about some SWAN_SYNC=0 details?

Does it pair a single GPU with a single CPU (Thread or Core)?

Does it limit the CPU use to one core on a quad core system with 4 GPUs?

If so this might of course make it slower (for example if each GPU WU would otherwise use 33% CPU).

So, is there a way to use more than one CPU Core when running multiple GPU tasks?

Thanks,

Richard Haselgrove
Message 17392 - Posted: 28 May 2010 | 14:16:33 UTC
Last modified: 28 May 2010 | 14:33:41 UTC

My first v6.07 (on 9800 GTX+) doesn't seem happy.

Repeatedly restarting from zero, several cycles of 'every 4 seconds', then maybe it'll run for 40 seconds. Lots of 'exit 0' in the message log. It'll hit the 100-exit limit soon, so you'll be able to look at std_err.

Edit: Ooooooh, isn't it rude! You got shares in NVidia, or something? (only joking).

# Insufficient memory for fast kernels. Get a better card!
Can't fall-back to non-pairlist kernel: Type=2

Beyond
Message 17396 - Posted: 28 May 2010 | 15:17:34 UTC - in response to Message 17392.
Last modified: 28 May 2010 | 15:20:28 UTC

My first v6.07 (on 9800 GTX+) doesn't seem happy.

The TONI_GA WUs have been and still are a big problem on most GPUs with the v6.05 app too. Some cards seem to run them, others won't at all. They do seem to run better on Fermi for some reason.

Looks like it's a memory issue with v6.07 though. Nice error message, not:

>> # Insufficient memory for fast kernels. Get a better card!
>> Can't fall-back to non-pairlist kernel: Type=2

Now we have our computers being rude to us :-)


Changes are:
1) it better supports cards with little memory

Seems like the change in v6.07 made the memory issue worse.

GDF
Message 17397 - Posted: 28 May 2010 | 16:08:27 UTC - in response to Message 17396.

Removed for now. We will look at it again on Monday.

gdf

jjwhalen
Message 17398 - Posted: 28 May 2010 | 16:12:06 UTC
Last modified: 28 May 2010 | 16:22:59 UTC

ACEMD2 6.07 tasks are continuously restarting on my EVGA GTS250/512MB, running on WinVista/BOINC 6.10.56. Resetting the project (as per the client's message string) had no effect. I've been running 6.05 without problem. Thanks, guys, for the "upgrade."
____________

CTAPbIi
Message 17409 - Posted: 28 May 2010 | 21:01:07 UTC - in response to Message 17386.
Last modified: 28 May 2010 | 21:01:59 UTC

Changes are:
1) it better supports cards with little memory
2) setting the environment variable SWAN_SYNC=0 will force the use of an entire CPU core to run faster. This now works on both Windows and Linux.

gdf

6.08 on Linux shows 0% CPU usage and low GPU temps. What should I do with SWAN_SYNC=0?
____________

ExtraTerrestrial Apes
Message 17410 - Posted: 28 May 2010 | 21:36:54 UTC - in response to Message 17409.

what should I do with SWAN_SYNC=0?


Create an environment variable called "SWAN_SYNC" and set it to 0 to get maximum GPU utilization. I have no idea how to do this in Linux, though.

MrS
____________
Scanning for our furry friends since Jan 2002

Snow Crash
Message 17413 - Posted: 28 May 2010 | 22:04:52 UTC - in response to Message 17410.

A quick search found this page ... read the bottom of the page to see how to make it "global"

http://lowfatlinux.com/linux-environment-variables.html
____________
Thanks - Steve

CTAPbIi
Message 17415 - Posted: 28 May 2010 | 23:11:06 UTC
Last modified: 28 May 2010 | 23:12:41 UTC

Thx, Steve ;-)

I've just got:

Fri 28 May 2010 07:08:36 PM EDT GPUGRID Sending scheduler request: To fetch work.
Fri 28 May 2010 07:08:36 PM EDT GPUGRID Requesting new tasks for CPU
Fri 28 May 2010 07:08:41 PM EDT GPUGRID Scheduler request completed: got 0 new tasks
Fri 28 May 2010 07:08:41 PM EDT GPUGRID Message from server: No work sent
Fri 28 May 2010 07:08:41 PM EDT GPUGRID Message from server: ACEMD beta version is not available for your type of computer.


What does this mean? I just checked the preferences, both in my account and in BOINC Manager itself, and they look OK to me.
____________

Snow Crash
Message 17416 - Posted: 28 May 2010 | 23:27:14 UTC - in response to Message 17415.

Looking at the messages, I think BOINC Manager is only asking for CPU tasks (of which there are none) and not for GPU tasks. As a first step, try resetting the project: open BOINC Manager, click on GPUGrid in the Projects tab, and then click the Reset Project button. This should clear out your local settings for the project, and hopefully BOINC Manager will start asking for the correct type of work again. After doing this, please check your messages closely to see if it is asking for GPU tasks. It will still ask for CPU work, but that's OK. If that does not work, I would suggest doing a Detach, closing BOINC Manager completely, opening it again, and attaching to this project again from the Tools menu.
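
To see at a glance what the client has been requesting, the message log can be filtered for the request lines. This is only a minimal sketch: it assumes the log has been saved to a text file and that request lines contain the phrase "Requesting new tasks for", as in the excerpt quoted in this thread. The function name requested_resources is made up for illustration.

```shell
# Print what each scheduler request asked for, one entry per request.
# Assumption: request lines look like
#   "... GPUGRID Requesting new tasks for <resources>"
requested_resources() {
    # $1: path to a saved copy of the BOINC message log
    sed -n 's/.*Requesting new tasks for \(.*\)$/\1/p' "$1"
}
```

Running requested_resources on the saved log and piping the result through sort | uniq -c gives a count per resource type. If only "CPU" ever shows up, the client never asked for GPU work.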
____________
Thanks - Steve

liveonc
Message 17417 - Posted: 28 May 2010 | 23:36:40 UTC - in response to Message 17413.

Snow Crash, I followed your link & posted a question, but was impatient. I have no idea what I just did, but something happened with BOINC. Could you help to point out if I did something bad?

I opened the terminal & wrote:

sudo /etc/init.d/boinc-client stop

echo $PATH

PATH=$PATH:$HOME/bin

env

export SWAN_SYNC=0

sudo /etc/init.d/boinc-client restart
____________

Snow Crash
Message 17419 - Posted: 29 May 2010 | 0:09:54 UTC - in response to Message 17417.

I know nothing about linux, I just read the page ...

To list the current values of all environment variables, issue the command

env


does it show SWAN_SYNC ?

do you know if you are using csh or bash as your shell?
Some versions of Linux use a different command shell. Here are the commands for csh and bash shells:

CSH: setenv SWAN_SYNC 0
BASH: export SWAN_SYNC=0
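
In a Bourne-style shell such as bash, the variable needs both a value and the export attribute before programs started from that shell can see it. The following is a small sketch of the difference, not anything specific to BOINC; the function names are invented for the demonstration.

```shell
# Report whether a freshly started child shell has SWAN_SYNC in its environment.
child_sees_swan_sync() {
    sh -c 'if [ -n "${SWAN_SYNC+set}" ]; then
               echo "child sees SWAN_SYNC=$SWAN_SYNC"
           else
               echo "child does not see SWAN_SYNC"
           fi'
}

# Compare a plain assignment with an exported variable.
demo_export() {
    unset SWAN_SYNC
    SWAN_SYNC=0            # plain assignment: visible in this shell only
    child_sees_swan_sync   # prints: child does not see SWAN_SYNC
    export SWAN_SYNC       # mark the existing variable for export
    child_sees_swan_sync   # prints: child sees SWAN_SYNC=0
}
```

Calling it as (demo_export), in a subshell, keeps the demonstration from changing the calling shell's own environment.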

____________
Thanks - Steve

liveonc
Message 17420 - Posted: 29 May 2010 | 0:17:36 UTC - in response to Message 17419.
Last modified: 29 May 2010 | 0:21:06 UTC

It uses BASH, since export worked. SWAN_SYNC=0 is listed after typing env, but only as long as the terminal isn't closed. If the terminal is closed, SWAN_SYNC=0 disappears. I don't know how to make it stick and I didn't understand the guide, so I just keep the terminal open if I want SWAN_SYNC=0.
____________

CTAPbIi
Message 17421 - Posted: 29 May 2010 | 0:36:51 UTC

guys,
DO NOT INSTALL the latest free driver, which comes as an update in Ubuntu. This is the reason why there were no jobs for me. None of the ways I know to fix it has worked so far.
____________

Snow Crash
Message 17422 - Posted: 29 May 2010 | 0:43:21 UTC - in response to Message 17420.

I don't know how to make it stick


I found another guide that explains how to make them stick ...

Important : To make the above changes permanent (that is it should work every time you login) make the changes to the .profile file that exists in your HOME directory. Simply type the required commands one line for each. And there you go. It will be available every time you login. You could check the variables using 'env' command.

I am now going to assume (ahem) that you should log off and then log on to make sure you did it correctly and that when BOINC fires up it will read it in properly.
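
Here is a rough sketch of adding the line to a profile file without ending up with duplicates if it is run twice. The function name persist_env_var is hypothetical, and the target file is passed in explicitly so it can be tried on a scratch copy first. One caveat to keep in mind: ~/.profile is read for login sessions of that user, so a BOINC client started as a system service may not pick it up.

```shell
# Append "export NAME=VALUE" to a profile file unless an export line for
# NAME is already present.
persist_env_var() {
    name=$1 value=$2 file=$3
    if grep -q "^export ${name}=" "$file" 2>/dev/null; then
        echo "already set in $file"
    else
        printf 'export %s=%s\n' "$name" "$value" >> "$file"
        echo "added to $file"
    fi
}
```

For the case discussed here the call would be persist_env_var SWAN_SYNC 0 "$HOME/.profile", followed by logging out and back in.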

____________
Thanks - Steve

liveonc
Message 17423 - Posted: 29 May 2010 | 1:29:43 UTC - in response to Message 17422.

I added export SWAN_SYNC=0 to the .profile file (which is hidden), restarted, opened a terminal & typed env. It's there, it stuck, but does it do anything??? BOINC is still mighty slow with 6.08, so does it work???
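
Note that env only shows the terminal's own environment, not that of the running science application. On Linux the environment a process was started with can be read from /proc. A minimal sketch under that assumption (the function name proc_env_value is made up, and reading another user's process usually needs root):

```shell
# Print NAME=VALUE if the process with the given PID was started with NAME
# in its environment, otherwise a short notice.
# Linux-only: /proc/<pid>/environ holds NUL-separated NAME=VALUE entries.
proc_env_value() {
    pid=$1 name=$2
    tr '\0' '\n' < "/proc/$pid/environ" | grep "^${name}=" \
        || echo "${name} not set for PID $pid"
}
```

Given the PID of the science application (from top or ps), proc_env_value PID SWAN_SYNC prints SWAN_SYNC=0 only if the variable actually reached that process.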
____________

CTAPbIi
Message 17425 - Posted: 29 May 2010 | 1:55:19 UTC

I do believe that the problem is with the 6.08 app, because 6.04 runs just fine with high GPU load and temps (some kind of indicator for me :-)

The crash I faced with new free drivers was terrible. I tried all ways I know to install driver - no luck.
____________

liveonc
Message 17426 - Posted: 29 May 2010 | 2:16:30 UTC - in response to Message 17425.

You're using Ubuntu 10.04, aren't you? It's different from 9.10. I went back to 9.10 on 2 of my PCs so I could upgrade/downgrade myself. I haven't figured out how to do that on Ubuntu 10.04.
____________

CTAPbIi
Message 17428 - Posted: 29 May 2010 | 6:27:58 UTC - in response to Message 17426.

That's pretty easy:

1. Get the new driver from http://www.nvidia.com/Download/index5.aspx?lang=en-us
2. Run sudo gedit /etc/modprobe.d/blacklist.conf and add:

blacklist vga16fb
blacklist nouveau
blacklist rivafb
blacklist nvidiafb
blacklist rivatv

3. Delete any installed nvidia driver:

sudo apt-get --purge remove nvidia-*

4. Restart
5. Log in (login/pass)
6. Run the installer:

sudo sh /path/to/NVIDIA-Linux-x86_64-195.36.24-pkg2.run

7. sudo service gdm start
or
sudo reboot
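
Before rebooting, it can be worth checking that the blacklist step really took. The sketch below lists which of the five modules named in this post are not yet blacklisted in a given file; the function name missing_blacklist_entries is invented for illustration.

```shell
# Print the modules from the list above that have no "blacklist <module>"
# line in the given modprobe configuration file.
missing_blacklist_entries() {
    file=$1
    for mod in vga16fb nouveau rivafb nvidiafb rivatv; do
        grep -q "^[[:space:]]*blacklist[[:space:]][[:space:]]*${mod}[[:space:]]*\$" "$file" \
            || echo "$mod"
    done
}
```

missing_blacklist_entries /etc/modprobe.d/blacklist.conf prints nothing when all five entries are in place.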
____________

liveonc
Message 17429 - Posted: 29 May 2010 | 12:24:55 UTC - in response to Message 17410.

what should I do with SWAN_SYNC=0?


Create an environment variable called "SWAN_SYNC" and set it to 0 to get maximum GPU utilization. I have no idea how to do this in Linux, though.

MrS


I got SWAN_SYNC=0 set up on my Linux box with a 6.08 WU, and it didn't help much: after running for 14 hours, 10 of them with SWAN_SYNC=0, it was only at 24%. It was an improvement, since before SWAN_SYNC=0 only 4% was achieved after 4 hours, so roughly it got a 100% boost. But 50 hours for a 6.08 WU on a GTX260(216) was unacceptable, so I canceled after 14 hours. http://www.gpugrid.net/result.php?resultid=2409494
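
For anyone who wants to check the arithmetic, here it is spelled out with the figures from this post only (4% in the first 4 hours, 24% after 14 hours). The function name is just a label for the sketch.

```shell
# Progress rates from the figures quoted in the post (integer percent/hour).
progress_summary() {
    rate_before=$(( 4 / 4 ))          # 4% during the first 4 hours
    rate_after=$(( (24 - 4) / 10 ))   # 20% more during the next 10 hours
    echo "before: ${rate_before}%/h, after: ${rate_after}%/h"
    echo "speed-up: $(( (rate_after - rate_before) * 100 / rate_before ))%"
    echo "full WU at the new rate: $(( 100 / rate_after )) hours"
}

progress_summary
```

The rate goes from 1%/h to 2%/h, which is the 100% boost, and 100% at 2%/h is the 50 hours mentioned.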



____________

CTAPbIi
Message 17432 - Posted: 29 May 2010 | 14:01:25 UTC

So the problem is 6.08, not the variable.
____________

Alain Maes
Message 17499 - Posted: 2 Jun 2010 | 15:21:50 UTC

Installed my brand new GTX 470 yesterday on my Linux machine.
Downloaded a new WU and started crunching a 6.06 after a couple of minutes.
However, the speed was excruciatingly slow - a percentage tick every 28-30 seconds. So I decided to try something - I changed the nice value from 10 to 4 and as if by miracle the turtle turned into a hare. Now a percentage tick takes less than 2 seconds on average.
Unfortunately I then had to leave for some duty travel. Coming back today I found out that
- the first WU was turned in successfully in well under 3 hours
- the following second WU though had only reached 24% after some 22 hrs runtime, so I changed the nice value again to 4. An hour or so later it had already reached 66%

Does anybody know how to permanently (i.e. so it sticks after every restart) change the nice value of a programme?

Kind regards

Alain

ExtraTerrestrial Apes
Message 17504 - Posted: 2 Jun 2010 | 20:52:40 UTC - in response to Message 17499.

You can probably write a script / program to poll the active tasks and react accordingly. It could be easier to leave one CPU core free, though. That is assuming your CPU is currently running something else on all other cores?
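
A rough sketch of what such a polling script could look like, assuming pgrep and renice are available. The function name renice_by_name and the DRY_RUN switch are invented here, and the process name to pass in is whatever the science application shows up as on the machine. Lowering a nice value (for example from 10 to 4) normally requires root.

```shell
# Set the nice value of every process whose command name matches exactly.
# With DRY_RUN=1 (the default in this sketch) the commands are only printed.
# "renice VALUE -p PID" without -n sets an absolute nice value.
renice_by_name() {
    name=$1 niceness=$2
    for pid in $(pgrep -x "$name"); do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "renice $niceness -p $pid"
        else
            renice "$niceness" -p "$pid"
        fi
    done
}
```

Wrapped in a loop with a sleep of a minute or so between calls, and with DRY_RUN=0, this would catch each new task shortly after it starts.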

And you may want to take a look at creating the environment variable "SWAN_SYNC" and setting it to "0". It's been posted somewhere (also how to do it under linux) and is usually required to get maximum performance out of Fermis.

MrS
____________
Scanning for our furry friends since Jan 2002

Alain Maes
Message 17505 - Posted: 2 Jun 2010 | 21:50:40 UTC - in response to Message 17504.

Thanks for your advice Mr S.
I freed up one CPU core and set the SWAN_SYNC to 0.
Now the speed is acceptable.
However, changing the nice value to 4 still gives a very good speed rise.

But that will be for later...

Good night

Alain

Beyond
Message 17520 - Posted: 3 Jun 2010 | 14:57:00 UTC - in response to Message 17505.

Thanks for your advice Mr S.
I freed up one CPU core and set the SWAN_SYNC to 0.
Now the speed is acceptable.
However, changing the nice value to 4 still gives a very good speed rise.

I see a speed increase by setting the priority to normal in Windows. CPU time rises very little and responsiveness of the system is unaffected. I wish there was a switch to allow this, as the SWAN_SYNC setting seems like complete overkill, wasting a CPU core when what is really needed is a priority bump or, better yet, settings like the ones Collatz and MilkyWay use:

One can configure the app through some parameters in the supplied app_info.xml.
This is done by editing the line

<cmdline></cmdline>

There are several options one can put in there, the order does not matter. If the same argument
is given twice, the last one counts. Remember that for editing the app_info.xml one should take
the same precautions as for the installation of the application, i.e. one should stop BOINC
completely and restart it afterwards.

Options:

kernel frequency: f (default 30)
The application determines the size of the work packages sent to the GPU in a dynamic manner. It
tries to get the number of executed packages per second above the value specified here by
splitting them to smaller ones. The OS is completely blocked from accessing the graphics card for
the duration of one package leading to a somewhat sluggish behaviour of the user interface.
Limiting the size and therefore reducing the execution time per package creates more opportunities
for the OS to slip in between two work packages and to react to user input. The default value of
30 Hz is tuned for usability of the system. Reducing it could increase the throughput of the app
slightly.

Example:
<cmdline>f10</cmdline>
The app will try to run 10 work packages per second (limits the execution time to 100ms or shorter).


Wait factor: w (default 1.0)
The app tries to release the CPU during the GPU computations. It does so by predicting
the runtime for the GPU calls and sending the CPU thread to sleep for that time. One can manually
correct the prediction of the application with this factor. When raising it, the CPU thread
sleeps longer, decreasing the value will lead to a faster wakeup of the thread. After that time,
the polling of the GPU starts. Setting this value to 0.0 turns off the release of the CPU.
Default is of course a value of 1.0. If you see a low GPU load, you can try to decrease it (if
you have a load of 80%, set it to 0.8). If you see a high CPU load, a slight increase of this
value may help. Increase it too much and it will affect the crunching speed. You could use this
for throttling of the GPU in case the high GPU load leads to a very sluggish behaviour of the
user interface or even VPU recover events. Setting w1.1 could improve the situation (see also
the f, b and p options).

Example:
<cmdline>w1.3</cmdline>
This would increase the sleep time by 30% in relation to the prediction.


Maximum GPU RAM use: r (default 33.4)
The application allocates memory on the graphics card to store the calculated steps. Increasing
the size can offer more parallel data to process and the f option more room to get the real
setting closer to its specified value. But you should not increase the RAM usage, if more than
one WU are processed concurrently. The value is given in percent relative to the installed amount.
Too high or too low values can crash the application, so be careful! See also the explanation
about the graphics RAM usage at the end of the readme.

Example:
<cmdline>r25</cmdline>
This would decrease the maximum RAM usage to 25%.


Priority: p (default 2)
All BOINC applications normally run with the lowest possible priority to not disturb other
applications. This can lead to a low GPU load, as it may be not possible to fire up the next tasks
if all cores of the CPU are under load. Raising the priority may help here. BOINC recommends a
slightly increased priority for GPU applications. This setting is the default. Possible values:
p0: idle priority, used by CPU BOINC applications
p1: normal priority in idle priority class (below normal), this is recommended for BOINC GPU
applications, but appears to be not enough to enable millisecond polling of the GPU with Vista
p2: normal priority in normal priority class, the default
p3: normal priority in high priority class, use with care!
The priority will affect how much time it takes for the app to get back control if it releases
its time slice (see also option b).

Example:
<cmdline>p3</cmdline>
This raises the priority to high (not recommended).


Polling behavior for the GPU within the Brook runtime: b (default 1)
See the option w for starters. If that time has elapsed, the GPU polling starts. This can be done
by continuously checking if the task has finished (b-1), enabling the fastest runtimes, but potentially
creating a high CPU load (a bit dependent on driver version). Second possibility is to release the time
slice allotted by the OS, so other apps can run (b0). The catch is that there is some interaction with
the priority. The time slice is only released to other tasks of the same priority. So raising the priority
effectively disables the release and the behavior is virtually identical to setting this parameter
to -1. If a raised priority and a low CPU time is wanted, one should leave it at the default of 1. This
suspends the task for at least 1 millisecond, enabling also tasks of lower priority to use the CPU in the
meantime. One can use also b2 or b3 if one wants a smoother system behaviour.
Possible values:
b-1: busy waiting
b0: release time slice to other tasks of same priority
b1, b2 or b3: release time slice for at least 1, 2, or 3 milliseconds respectively
See also options p and w.

Example:
<cmdline>b-1</cmdline>
Enable busy waiting (no release of the time slice during polling).


For my systems all I have to do is add b-1 to the commandline to get GPU usage to 99%. Similar options to those above would be welcomed in GPUGRID.
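
Two of the rules of thumb in the quoted readme are simple arithmetic, sketched below. These helpers are hypothetical and only restate the readme's own examples (f10 gives 100 ms per package, 80% GPU load suggests w0.8); they apply to the Collatz/MilkyWay options quoted above, not to anything GPUGRID offers.

```shell
# Upper bound on the run time of one work package for a kernel frequency f
# (packages per second), in whole milliseconds.
max_package_ms() {
    echo $(( 1000 / $1 ))
}

# Starting value for the wait factor w from an observed GPU load in percent,
# following the readme's rule of thumb (80% load -> 0.8).
suggested_wait_factor() {
    LC_ALL=C awk -v load="$1" 'BEGIN { printf "%.1f\n", load / 100 }'
}
```

max_package_ms 10 gives 100, matching the f10 example, and the default of f30 works out to 33 ms per package.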

skgiven
Message 17527 - Posted: 3 Jun 2010 | 18:47:25 UTC - in response to Message 17520.
Last modified: 3 Jun 2010 | 18:49:03 UTC

"The nice value", is exactly what?

Boinc could really do with a GPU Tab!

PS. Messing with process priorities and affinities can cause crashes!

ExtraTerrestrial Apes
Message 18231 - Posted: 2 Aug 2010 | 22:05:15 UTC - in response to Message 17527.
Last modified: 2 Aug 2010 | 22:05:35 UTC

The amount of "nice"ness is how happily a *nix program hands CPU control over to another one (actually the scheduler decides based on nice values). It's a different kind of process priority. Whoever manages to change it is probably aware that changing this could cause problems ;)
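
To make that concrete, here is a tiny sketch using the standard nice command. Called without arguments it prints the current nice value, and nice -n adds an increment for the command it starts. The function name child_niceness is invented for the example.

```shell
# Print the nice value a child process ends up with when it is started
# through "nice -n <increment>".
child_niceness() {
    nice -n "$1" nice
}
```

From a shell running at nice 0, child_niceness 5 prints 5. On Linux the values run from -20 (least nice, most CPU) to 19 (most nice), and only root may use negative increments.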

MrS
____________
Scanning for our furry friends since Jan 2002
