New workunits

Message boards : News : New workunits
rod4x4

Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Message 53090 - Posted: 24 Nov 2019, 13:32:19 UTC - in response to Message 53087.  

Are tasks being sent out for CUDA80 plan_class? I have only received new tasks on my 1080Ti with driver 418 and none on another system with 10/1070Ti with driver 396, which doesn't support CUDA100

Yes, CUDA80 is supported; see the apps page here: https://www.gpugrid.net/apps.php
Also see FAQ for ACEMD3 here: https://www.gpugrid.net/forum_thread.php?id=5002
Bedrich Hajek

Joined: 28 Mar 09
Posts: 490
Credit: 11,731,645,728
RAC: 47,738
Message 53092 - Posted: 24 Nov 2019, 15:43:53 UTC - in response to Message 53088.  

there was a task which ended after 41 seconds with:
195 (0xc3) EXIT_CHILD_FAILED

stderr here: http://www.gpugrid.net/result.php?resultid=21514460

Unfortunately, ACEMD3 no longer tells you the real error. The wrapper provides a meaningless generic message (error 195).
The task error in your stderr output is:
# Engine failed: Particle coordinate is nan

I had this twice on one host. Not sure if I am completely correct as ACEMD3 is a new beast we have to learn and tame, but in my case I reduced the Overclocking and it seemed to fix the issue, though that could just be a coincidence.
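Since the wrapper only reports the generic exit code 195, the useful line has to be picked out of the stderr text by hand. A minimal sketch of that lookup, assuming the engine always prefixes its message with "# Engine failed:" as in the one line quoted above (the helper name `engine_errors` is made up for illustration, and the sample text contains only the two lines quoted in this thread):

```python
import re
from typing import List

# Assumption: the real error always follows the "# Engine failed:" prefix,
# as in the single stderr line quoted in this thread.
ENGINE_ERROR = re.compile(r"^#\s*Engine failed:\s*(.+)$", re.MULTILINE)


def engine_errors(stderr_text: str) -> List[str]:
    """Return every message that follows '# Engine failed:' in the stderr text."""
    return [message.strip() for message in ENGINE_ERROR.findall(stderr_text)]


# Illustrative input built only from the two lines quoted in the thread.
sample = (
    "195 (0xc3) EXIT_CHILD_FAILED\n"
    "# Engine failed: Particle coordinate is nan\n"
)
print(engine_errors(sample))
```

An empty list means no engine message was found, in which case the exit code is all there is to go on.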



I had a couple of errors on my Windows 7 computer, and none on my Windows 10 computer, so far. In my case, it's not overclocking, since I don't overclock.

http://www.gpugrid.net/results.php?hostid=494023&offset=0&show_names=0&state=0&appid=32

Yes, I do believe we need some more testing.





mmonnin

Joined: 2 Jul 16
Posts: 338
Credit: 7,987,341,558
RAC: 178,897
Message 53093 - Posted: 24 Nov 2019, 15:50:53 UTC - in response to Message 53090.  

Are tasks being sent out for CUDA80 plan_class? I have only received new tasks on my 1080Ti with driver 418 and none on another system with 10/1070Ti with driver 396, which doesn't support CUDA100

Yes, CUDA80 is supported; see the apps page here: https://www.gpugrid.net/apps.php
Also see FAQ for ACEMD3 here: https://www.gpugrid.net/forum_thread.php?id=5002


Then the app is in an odd situation on Linux: it supposedly supports CUDA80, but using it requires a newer driver than CUDA80 itself needs.

What driver/card/OS combinations are supported?

Windows, CUDA80 Minimum Driver r367.48 or higher
Linux, CUDA92 Minimum Driver r396.26 or higher
Linux, CUDA100 Minimum Driver r410.48 or higher
Windows, CUDA101 Minimum Driver r418.39 or higher


There's not even a Linux CUDA92 plan_class, so I'm not sure what that entry is for in the FAQ.
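To make the FAQ table above easier to check against a given host, here is a rough sketch that encodes the four rows exactly as quoted and lists which of them a driver version satisfies. It only mirrors the FAQ text; as noted above, the plan classes the server actually sends out may not match it. The function names are hypothetical.

```python
# Minimum driver versions, copied from the FAQ rows quoted above.
# Assumption: a driver qualifies when (major, minor) >= the listed minimum.
MIN_DRIVER = {
    ("windows", "cuda80"): (367, 48),
    ("linux", "cuda92"): (396, 26),
    ("linux", "cuda100"): (410, 48),
    ("windows", "cuda101"): (418, 39),
}


def parse_driver(version):
    """Turn a driver string such as '396.54' into a comparable (major, minor) tuple."""
    parts = version.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def eligible_plan_classes(os_name, driver_version):
    """List the FAQ entries for this OS whose minimum driver is met."""
    driver = parse_driver(driver_version)
    return sorted(
        plan_class
        for (faq_os, plan_class), minimum in MIN_DRIVER.items()
        if faq_os == os_name.lower() and driver >= minimum
    )


print(eligible_plan_classes("linux", "396.54"))
print(eligible_plan_classes("linux", "418.56"))
```

By this table a Linux host on driver 396 would only qualify for the CUDA92 entry, which is exactly the row that has no matching plan_class.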
klepel

Joined: 23 Dec 09
Posts: 189
Credit: 4,798,881,008
RAC: 311
Message 53096 - Posted: 24 Nov 2019, 18:56:12 UTC

I just wanted to confirm, you need a driver supporting CUDA100 or CUDA101, then even a GTX670 can crunch the "acemd3" app.

See computer: http://www.gpugrid.net/show_host_detail.php?hostid=486229

It will not make the 24-hour deadline, though, and I can tell the GPU is extremely stressed. I will run some more WUs on it to confirm that it can handle the new app. Afterwards it will go on its summer pause, or might be retired from BOINC altogether.
mmonnin

Joined: 2 Jul 16
Posts: 338
Credit: 7,987,341,558
RAC: 178,897
Message 53098 - Posted: 24 Nov 2019, 20:01:08 UTC - in response to Message 53093.  

Are tasks being sent out for CUDA80 plan_class? I have only received new tasks on my 1080Ti with driver 418 and none on another system with 10/1070Ti with driver 396, which doesn't support CUDA100

Yes, CUDA80 is supported; see the apps page here: https://www.gpugrid.net/apps.php
Also see FAQ for ACEMD3 here: https://www.gpugrid.net/forum_thread.php?id=5002


Then the app is in an odd situation on Linux: it supposedly supports CUDA80, but using it requires a newer driver than CUDA80 itself needs.

What driver/card/OS combinations are supported?

Windows, CUDA80 Minimum Driver r367.48 or higher
Linux, CUDA92 Minimum Driver r396.26 or higher
Linux, CUDA100 Minimum Driver r410.48 or higher
Windows, CUDA101 Minimum Driver r418.39 or higher


There's not even a Linux CUDA92 plan_class, so I'm not sure what that entry is for in the FAQ.


And now I got the 1st CUDA80 task on that system w/o any driver changes.
rod4x4

Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Message 53100 - Posted: 24 Nov 2019, 21:24:31 UTC - in response to Message 53085.  
Last modified: 24 Nov 2019, 21:25:46 UTC

there was a task which ended after 41 seconds with:
195 (0xc3) EXIT_CHILD_FAILED

stderr here: http://www.gpugrid.net/result.php?resultid=21514460

Checking this task, it has failed on 8 computers, so it is just a faulty work unit.
Overclocking would not be the cause, contrary to what I stated previously.
rod4x4

Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Message 53101 - Posted: 24 Nov 2019, 21:44:32 UTC - in response to Message 53092.  

there was a task which ended after 41 seconds with:
195 (0xc3) EXIT_CHILD_FAILED

stderr here: http://www.gpugrid.net/result.php?resultid=21514460

Unfortunately, ACEMD3 no longer tells you the real error. The wrapper provides a meaningless generic message (error 195).
The task error in your stderr output is:
# Engine failed: Particle coordinate is nan

I had this twice on one host. Not sure if I am completely correct as ACEMD3 is a new beast we have to learn and tame, but in my case I reduced the Overclocking and it seemed to fix the issue, though that could just be a coincidence.



I had a couple of errors on my Windows 7 computer, and none on my Windows 10 computer, so far. In my case, it's not overclocking, since I don't overclock.

http://www.gpugrid.net/results.php?hostid=494023&offset=0&show_names=0&state=0&appid=32

Yes, I do believe we need some more testing


Agreed, testing will be an ongoing process...some errors cannot be fixed.

this task had an error code 194...
finish file present too long</message>

This error has been seen in ACEMD2 and listed as "Unknown"

Matt Harvey did a FAQ on error codes for ACEMD2 here
http://gpugrid.net/forum_thread.php?id=3468
icg studio

Joined: 24 Nov 11
Posts: 3
Credit: 954,677
RAC: 0
Message 53102 - Posted: 24 Nov 2019, 23:47:10 UTC

Finally CUDA 10.1! In other words, support for Turing CUDA cores.
My RTX 2060 has started crunching.
Will post the run time later.
Keith Myers
Joined: 13 Dec 17
Posts: 1416
Credit: 9,119,446,190
RAC: 614,515
Message 53103 - Posted: 25 Nov 2019, 0:20:18 UTC - in response to Message 53101.  
Last modified: 25 Nov 2019, 0:24:00 UTC

this task had an error code 194...
finish file present too long</message>

This is a bug in the BOINC 7.14.2 client and earlier versions. You need to update to the 7.16 branch to fix it.
Identified/quantified in https://github.com/BOINC/boinc/issues/3017
And resolved for the client in:
https://github.com/BOINC/boinc/pull/3019
And in the server code in:
https://github.com/BOINC/boinc/pull/3300
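A quick way to tell whether a given client is on the affected side is to compare version numbers as tuples rather than strings. This is only a sketch: it assumes the fix is present in every 7.16.x client, as stated above, and the names `parse_version` and `has_finish_file_fix` are made up for illustration.

```python
# Assumption: every client from the 7.16 branch onward carries the fix,
# so 7.16.0 is used as the cut-off.
FIXED_IN = (7, 16, 0)


def parse_version(text):
    """Convert '7.14.2' into (7, 14, 2), padding short versions with zeros."""
    parts = [int(part) for part in text.strip().split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def has_finish_file_fix(client_version):
    """True if the client is new enough to avoid 'finish file present too long'."""
    return parse_version(client_version) >= FIXED_IN


for version in ("7.14.2", "7.16.3"):
    print(version, has_finish_file_fix(version))
```

Comparing tuples avoids the string-comparison trap where "7.9" would sort after "7.16".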
rod4x4

Joined: 4 Aug 14
Posts: 266
Credit: 2,219,935,054
RAC: 0
Message 53104 - Posted: 25 Nov 2019, 0:42:18 UTC - in response to Message 53103.  
Last modified: 25 Nov 2019, 0:57:09 UTC

this task had an error code 194...
finish file present too long</message>

This is a bug in the BOINC 7.14.2 client and earlier versions. You need to update to the 7.16 branch to fix it.
Identified/quantified in https://github.com/BOINC/boinc/issues/3017
And resolved for the client in:
https://github.com/BOINC/boinc/pull/3019
And in the server code in:
https://github.com/BOINC/boinc/pull/3300

Thanks for the info and links. Sometimes we overlook the role of the BOINC client.

From the Berkeley download page (https://boinc.berkeley.edu/download_all.php):

7.16.3 Development version
(MAY BE UNSTABLE - USE ONLY FOR TESTING)

and
7.14.2 Recommended version

Volunteers need to weigh this up: install the latest version if you are feeling adventurous (any issues you find will help the Berkeley team develop the new client).

Alternatively,
- reducing the CPU load on your PC and/or
- ensuring the PC is not rebooted as the finish file is written,
may avert this error.
Keith Myers
Joined: 13 Dec 17
Posts: 1416
Credit: 9,119,446,190
RAC: 614,515
Message 53105 - Posted: 25 Nov 2019, 5:45:50 UTC

I haven't had a single instance of "finish file present" errors since moving to the 7.16 branch. I used to get a couple or more a day before on 7.14.2 or earlier.

It may be labelled an unstable development revision, but it is as close to general release stable as you can get. The only issue is that it is still in flux as more commits get added to it and the version number gets increased.
Greg _BE

Joined: 30 Jun 14
Posts: 153
Credit: 129,654,684
RAC: 0
Message 53109 - Posted: 25 Nov 2019, 18:39:45 UTC - in response to Message 53084.  

For me, 100% GPU load is not ideal ;-)
I have just one card in the PC, and I can't watch videos while GPUGrid is running, even if I tell SMPlayer or VLC to use the CPU. So I have to pause this project when I use my PC.
Maybe one day we will be able to set priorities for GPU use (on Linux).
I think I will buy a cheap card to drive the TV and play movies. But in general I am at work or somewhere else...

Nice to have some work. Folding@Home will wait. I was thinking of switching; the other BOINC projects that run on GPUs don't interest me.



I see you have an RTX and a GTX.
You could keep your GTX for video and general PC usage and put the RTX on GPU tasks full time.

I find it odd that you are having issues watching videos. I noticed the same on my system, and it was not the GPU that was having trouble; it was the CPU that was overloaded. After I changed the CPU time setting to about 95%, I had no trouble watching videos.

After much tweaking of how BOINC and all the projects I run use my system, I finally have it to where I can watch videos without any problems. I use a GTX 1050 Ti as my primary card along with a Ryzen 2700, which has no integrated graphics.

There must be something overloading your system if you can't watch videos on an RTX GPU while running GPUGrid.
wiyosaya

Joined: 22 Nov 09
Posts: 114
Credit: 589,114,683
RAC: 0
Message 53110 - Posted: 25 Nov 2019, 20:19:01 UTC
Last modified: 25 Nov 2019, 20:20:26 UTC

I am getting high CPU/South bridge temps on one of my PCs with these latest work units.

The PC is http://www.gpugrid.net/show_host_detail.php?hostid=160668
and the current work unit is http://www.gpugrid.net/workunit.php?wuid=16866756

Every WU since November 22, 2019 has been exhibiting high temperatures on this PC. The previous apps never exhibited this. In addition, I found the PC unresponsive this afternoon. I was able to reboot; however, this does not give me a warm fuzzy feeling about continuing to run GPUGrid on this PC.

Anyone else seeing something similar or is there a solution for this?

Thanks.
Retvari Zoltan

Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 1
Message 53111 - Posted: 25 Nov 2019, 23:03:47 UTC - in response to Message 53110.  

I am getting high CPU/South bridge temps on one of my PCs with these latest work units.
That's because of two reasons:
1. The new app uses a whole CPU thread (or core, if there's no HT or SMT) to feed the GPU
2. The new app is not hindered by WDDM.

Every WU since November 22, 2019 had been exhibiting high temperatures on this PC. The previous apps never exhibited this.
That's because of two reasons:
1. The old app didn't feed the GPU with a full CPU thread unless the user configured it with the SWAN_SYNC environment variable.
2. The performance of the old app was hindered by WDDM (under Windows Vista...10)

In addition, I found the PC unresponsive this afternoon. I was able to reboot, however, this does not give me a warm fuzzy feeling about continuing to run GPUGrid on this PC.

Anyone else seeing something similar or is there a solution for this?
There are a few options:
1. reduce the GPU's clock frequency (and the GPU voltage accordingly) or its power target.
2. increase cooling (cleaning fins, increasing air ventilation/fan speed).
If the card is overclocked (by you, or the factory) you should re-calibrate the overclock settings for the new app.
A small reduction in GPU voltage and frequency results in a perceptible decrease in power consumption (= heat output), because power consumption is directly proportional to the clock frequency multiplied by the square of the GPU voltage.
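To put a number on that relationship, here is a small sketch of the scaling rule (power proportional to frequency times voltage squared). The 5% figures are purely illustrative, not measurements from any card, and the rule covers only dynamic power, so real savings will differ somewhat.

```python
def relative_power(freq_scale, volt_scale):
    """Power relative to the baseline, using P proportional to f * V**2."""
    return freq_scale * volt_scale ** 2


# Illustrative only: clock and voltage both lowered by 5%.
print(f"{relative_power(0.95, 0.95):.3f}")  # prints 0.857
```

So giving up 5% of the clock, together with a matching 5% voltage drop, would cut power by roughly 14% under this rule.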
RFGuy_KCCO

Joined: 13 Feb 14
Posts: 6
Credit: 1,068,161,100
RAC: 0
Message 53114 - Posted: 26 Nov 2019, 4:32:11 UTC
Last modified: 26 Nov 2019, 4:36:14 UTC

I have found that running GPUs at 60-70% of their stock power level is the sweet spot in the compromise between PPD and power consumption/temperatures. I usually run all of my GPUs at a 60% power level.
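For anyone wanting to try this on an NVIDIA card, a rough sketch of the arithmetic follows. It assumes "power level" means the driver's board power limit, and the 250 W default is a hypothetical value. The script only builds the `nvidia-smi` command and does not run it, since changing the limit normally needs administrator rights and the driver rejects values outside the card's allowed range.

```python
def power_limit_watts(default_limit_w, fraction):
    """Return the target power limit in watts for a fraction of the stock limit."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in the range (0, 1]")
    return round(default_limit_w * fraction, 1)


def nvidia_smi_command(gpu_index, watts):
    """Build (but do not execute) the nvidia-smi call that sets the power limit."""
    return ["nvidia-smi", "-i", str(gpu_index), "-pl", str(watts)]


# Hypothetical card with a 250 W stock limit, run at 60%.
target = power_limit_watts(250, 0.60)
print(" ".join(nvidia_smi_command(0, target)))
```

The stock limit for a real card can be read from `nvidia-smi -q -d POWER` before choosing a target.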
icg studio

Joined: 24 Nov 11
Posts: 3
Credit: 954,677
RAC: 0
Message 53119 - Posted: 26 Nov 2019, 10:27:17 UTC - in response to Message 53102.  
Last modified: 26 Nov 2019, 10:28:37 UTC

Finally CUDA 10.1! In other words, support for Turing CUDA cores.
My RTX 2060 has started crunching.
Will post the run time later.


13134.75 seconds run time on an RTX 2060, Ryzen 2600, Windows 10 1909.
Average GPU CUDA utilisation: 99%.
No issues at all with these workunits.
KAMasud

Joined: 27 Jul 11
Posts: 138
Credit: 539,953,398
RAC: 0
Message 53126 - Posted: 26 Nov 2019, 17:36:55 UTC - in response to Message 53111.  

1. The old app didn't feed the GPU with a full CPU thread unless the user configured it with the SWAN_SYNC environment variable.




Something was making my Climate models unstable and crashing them. That was the reason I lassoed in the GPU through SWAN_SYNC. Now my Climate models are stable. Plus I am getting better clock speeds.
ServicEnginIC

Joined: 24 Sep 10
Posts: 592
Credit: 11,972,186,510
RAC: 998,578
Message 53129 - Posted: 26 Nov 2019, 19:16:01 UTC - in response to Message 53110.  
Last modified: 26 Nov 2019, 19:25:35 UTC

I am getting high CPU/South bridge temps on one of my PCs with these latest work units.

As noted in several threads across the GPUGrid forum, the new ACEMD3 tasks are pushing our computers to their limits.
They can be taken as a true hardware quality control!
CPUs, GPUs, PSUs and motherboards all seem to be squeezed simultaneously while processing these tasks.
I'm thinking of printing stickers for my computers: "I processed ACEMD3 and survived" ;-)

Regarding your processor:
Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
It has a rated TDP of 130 W. A lot of heat to dissipate...
It was launched in Q3/2013.
If it has been running for more than three years, I would recommend renewing the CPU cooler's thermal paste.
A clean CPU cooler and fresh thermal paste usually help to reduce CPU temperature by several degrees.

Regarding chipset temperature:
I can't remember any motherboard whose chipset heatsink I could touch with confidence.
On most standard motherboards, chipset cooling relies on passive heatsinks and air convection.
If there is room at the upper back of your computer case, I would recommend installing an extra fan to extract heated air and improve air circulation.
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 53132 - Posted: 26 Nov 2019, 22:56:01 UTC

Wow. My GTX 980 on Ubuntu 18.04.3 is running at 80C. It is a three-fan version, not overclocked, with a large heatsink. I don't recall seeing it above 65C before.

I can't tell about the CPU yet. It is a Ryzen 3700x, and apparently the Linux kernel does not support temperature measurements yet. But "Tdie" and "Tctl", whatever they are, report 76C on Psensor.

That is good. I want to use my hardware, and it is getting colder around here.
Keith Myers
Joined: 13 Dec 17
Posts: 1416
Credit: 9,119,446,190
RAC: 614,515
Message 53133 - Posted: 26 Nov 2019, 23:56:56 UTC - in response to Message 53132.  

Tdie is the CPU die temperature of the 3700X. Tctl is the control temperature used for CPU fan control: Tdie plus a model-dependent offset. The offset is 0 on Ryzen 3000, 20 °C on Ryzen 1000 and 10 °C on Ryzen 2000.

Tdie and Tctl are provided by the k10temp driver. You still have access to the sensors command if you install lm-sensors.

Ryzen only provides a monolithic single core temp for all cores. It doesn't support individual core temps like Intel cpus.

If you have an ASUS motherboard with a WMI BIOS, there is a driver that can report all the sensors on the motherboard, the same as you would get in Windows.
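If you would rather read the k10temp values directly than go through Psensor or the sensors command, the kernel exposes them as small text files under the hwmon sysfs tree. A minimal sketch, assuming the standard hwmon layout (`name`, `tempN_label`, `tempN_input` in millidegrees Celsius) and a kernel whose k10temp driver provides labels; the function name `read_k10temp` is made up for illustration:

```python
from pathlib import Path


def read_k10temp(hwmon_root="/sys/class/hwmon"):
    """Return {label: degrees Celsius} for the first hwmon device named k10temp."""
    root = Path(hwmon_root)
    if not root.is_dir():
        return {}
    for device in sorted(root.iterdir()):
        name_file = device / "name"
        if not name_file.is_file() or name_file.read_text().strip() != "k10temp":
            continue
        readings = {}
        for label_file in sorted(device.glob("temp*_label")):
            input_file = device / label_file.name.replace("_label", "_input")
            if input_file.is_file():
                # hwmon reports temperatures in millidegrees Celsius.
                label = label_file.read_text().strip()
                readings[label] = int(input_file.read_text()) / 1000.0
        return readings
    return {}


# Prints an empty dict on systems without a k10temp device.
print(read_k10temp())
```

On a Ryzen 3000 system the Tdie and Tctl entries should show the same value, since the offset there is 0.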

©2025 Universitat Pompeu Fabra