Project restarted
Author | Message |
---|---|
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
"you missed that he was trying to run the executable directly (outside of BOINC), which is likely why he received that error message. All of my machines have devices on dev 1+, and even one in the same situation with an unusable card at dev0 (which has been excluded) and only runs on the card on dev1" You're right, I really missed that, but then that is the reason for that strange licensing error. It is not a good idea to run the GPUGrid app directly, as it needs a wrapper. Perhaps the wrapper contains the appropriate license, or it tells the app where to look for it. We don't know how, so we can't use this method to debug this error. Perhaps he installed the BOINC manager as a service? |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
"Perhaps he installed the BOINC manager as a service?" The Manager always runs in user space, but the client can run as a service. My Linux machines do run as a service, without GPU problems. Windows machines can't run GPU apps on a service install, because of Microsoft driver security protocols. Edit: programagor's Debian install on host 576641 looks OK from the outside. I'd suspect a driver problem - something like using a nouveau driver without the extra CUDA (computation) libraries provided through a manufacturer driver install. |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
"Edit: programagor's Debian install on host 576641 looks OK from the outside. I'd suspect a driver problem - something like using a nouveau driver without the extra CUDA (computation) libraries provided through a manufacturer driver install." I installed a fresh Ubuntu 18.04 two days ago, and it downloaded the 460.32 driver on its own, which works with both FAH and GPUGrid. I upgraded it to 20.04 today. EDIT: the 460.39 driver on his host should be from ppa:graphics-drivers/ppa. (It works on my other host.) |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 |
The new Linux Mint (v20.1) offers me a driver manager. It defaulted to the open-source driver, but for computation, I think the proprietary driver is better. |
Joined: 3 Feb 21 Posts: 5 Credit: 1,046,250 RAC: 0 |
When I ran the binary directly, I tried supplying the `--device 1` parameter, but to no avail; due to the basic license, the binary always uses device id 0. Also, I don't see the `license.dat[.*]` anywhere on my system. And my drivers are straight from nvidia, no nouveau. I can compile and run CUDA programs/kernels without any issue. For completeness' sake, I reinstalled my drivers, but the issue persists. EDIT: I also looked inside the wrapper, and there is no string `license.dat`, which leads me to believe that the license file is missing, preventing me from running on GPU id 1. EDIT 2: I just noticed that the wrapper is launching acemd with device id 0: `wrapper: running acemd3 (--boinc input --device 0)` |
Joined: 13 Nov 19 Posts: 5 Credit: 8,496,529 RAC: 0 |
7.333% after 24 hours. It is a pity; GPUGRID is the only project that does not cause my computer to lag. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"When I ran the binary directly, I tried supplying the `--device 1` parameter, but to no avail; due to the basic license the binary always uses device id 0. Also, I don't see the `license.dat[.*]` anywhere on my system. And my drivers are straight from nvidia, no nouveau. I can compile and run CUDA programs/kernels without any issue. For completeness sake, I reinstalled my drivers, but the issue persists." look in your BOINC event log at startup. Is your nvidia GPU device id 0? when you see "device [id]" in boinc, it's always the BOINC order, not the system order, which can vary due to the way BOINC decides what is the best device. |
Joined: 8 Aug 19 Posts: 252 Credit: 458,054,251 RAC: 0 |
"7.333% after 24 hours." These WUs are too large for many GPUs to complete in the 5-day (120-hour) window before they expire. It would be best to abort them on hosts which cannot meet the deadline, as running them would be time and electricity wasted. The same goes for having a spare WU waiting in your queue if your GPU takes 60 or more hours to complete one. The spare will expire before completion, yielding no credit. I recommend a longer period of time before these "extra-long runs" expire. I think it will get them back quicker in the long run. |
Joined: 3 Feb 21 Posts: 5 Credit: 1,046,250 RAC: 0 |
"look in your BOINC event log at startup. Is your nvidia GPU device id 0? when you see "device [id]" in boinc, it's always the BOINC order, not the system order, which can vary due to the way BOINC decides what is the best device." Right, my apologies, boinc has my GPU at id 0: `CUDA: NVIDIA GPU 0: GeForce GTX 1060 (driver version 460.39, CUDA version 11.2, compute capability 6.1, 4096MB, 3974MB available, 4276 GFLOPS peak)` `OpenCL: NVIDIA GPU 0: GeForce GTX 1060 (driver version 460.39, device version OpenCL 1.2 CUDA, 6078MB, 3974MB available, 4276 GFLOPS peak)` So licensing is likely not the culprit in my case. |
Joined: 2 Jul 16 Posts: 338 Credit: 7,987,341,558 RAC: 178,897 |
The app has not changed. https://www.gpugrid.net/apps.php Two GPUs are running on one of my PCs without issue. |
Joined: 13 Nov 19 Posts: 5 Credit: 8,496,529 RAC: 0 |
I have aborted this WU; the new one is much faster. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
"I have aborted this WU; the new one is much faster." wait until it runs for a few hours. you will see that the initial percentage increase is not accurate; it is only an estimate from BOINC until it hits a real checkpoint. you'll see the % increase fast until it hits the checkpoint, then it will reset to 0.333 or 0.666% and will go very slowly from that point. |
Joined: 8 Aug 19 Posts: 252 Credit: 458,054,251 RAC: 0 |
"I have aborted this WU; the new one is much faster." And... (sorry to butt in) Once you get to a checkpoint, highlight the task in the task window and click on the properties button. Check the progress rate near the bottom of the list. If it is less than 0.9% per hour, the GPU is too slow to make the 120-hour window, even crunching 24/7. Best to send it on to someone else before it expires. |
Joined: 2 Jul 19 Posts: 21 Credit: 90,744,164 RAC: 0 |
Toni: I have removed my laptop from crunching for GPUGRID. The laptop has a GTX 660M GPU which is inadequate for these large files. In my desktop there is a GTX 1060 which seems to have enough muscle to crunch these large files. I hope all this crunching will benefit humanity in some way. Clive |
Joined: 5 May 19 Posts: 36 Credit: 711,308,218 RAC: 41,661 |
Toni: I have a discrete GTX 1060 in my laptop, but after 2 days of crunching a single task BOINC is still showing estimates of 8 more days... I'm crunching 24x7, and the GPU's temperature is >90C, so the card has to be throttled. Are all new tasks that big? If that's the case, not only will I not be able to finish tasks in 24 hours to get some bonus, but I also won't be able to complete them in the allocated timeframe. |
Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 614,515 |
These are the largest (longest) tasks in the history of the project, I believe. The previous longest was around 12 hours, back in the acemd2 (long-runs) application days. If these are to become the nominal type of task in the future, they really need to increase the deadlines, or restrict them to adequate hardware like a discrete GTX 1060 or better. The estimated task GFLOPs value of 5,000,000 seems to be roughly correct. |
Joined: 5 May 19 Posts: 36 Credit: 711,308,218 RAC: 41,661 |
I agree that such long-running tasks should have their deadlines increased. Otherwise, we gradually go back to supercomputers that no one can afford, and the purpose of crowd computing is that many can participate. As for limiting tasks to certain GPUs, that's not quite adequate. As I said, my GTX 1060 isn't capable of handling those tasks, so it's not only the card that is important, but also where it's installed and what type of cooling is used. Unfortunately, my laptop isn't great at cooling, so both CPU and GPU heat up to 90-93 C. Putting the laptop in a fridge is not an option... And taking all parameters into account, such as cooling, power supply, throttling, even manufacture! - isn't feasible. |
Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,535,595 RAC: 4,302,611 |
a normal full 1060 should be capable of processing these tasks in 5 days. must be because it's a laptop version. |
Joined: 14 Mar 20 Posts: 7 Credit: 11,283,596 RAC: 0 |
The 1050 Ti in my laptop can finish a task in 66 hours. Something must be wrong. |
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 |
My GTX 1060 finished one work unit in 38 hours. http://www.gpugrid.net/results.php?hostid=512821 The GTX 1070 took 26 hours. http://www.gpugrid.net/results.php?hostid=524425 But another GTX 1070 failed twice. http://www.gpugrid.net/results.php?hostid=528983 The first time was due to a reboot, and then the next one failed immediately thereafter. They are all on Ubuntu 18.04/20.04. I think they are better used on Folding. |
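
The deadline arithmetic discussed in this thread (a 120-hour window, and the rule of thumb that a progress rate below about 0.9% per hour is too slow) can be checked with a rough sketch like the one below. The function names are hypothetical helpers, not part of BOINC or GPUGrid, and the sketch assumes progress is linear, the host crunches continuously, and the percentage is read after a real checkpoint rather than from BOINC's initial estimate.

```python
"""Back-of-the-envelope deadline check for a long-running task.

All names here are hypothetical helpers, not part of BOINC or GPUGrid.
Assumption: progress is linear and the host crunches without interruption.
"""

DEADLINE_HOURS = 120.0  # the 5-day window mentioned in the thread


def min_rate_pct_per_hour(deadline_hours: float = DEADLINE_HOURS) -> float:
    """Slowest average progress rate (% per hour) that still reaches 100% in time."""
    return 100.0 / deadline_hours


def projected_total_hours(done_pct: float, elapsed_hours: float) -> float:
    """Linear projection of the total run time from the progress made so far."""
    if done_pct <= 0 or elapsed_hours <= 0:
        raise ValueError("need positive progress and positive elapsed time")
    return elapsed_hours * 100.0 / done_pct


def meets_deadline(done_pct: float, elapsed_hours: float,
                   deadline_hours: float = DEADLINE_HOURS) -> bool:
    """True if the projected total run time fits inside the deadline window."""
    return projected_total_hours(done_pct, elapsed_hours) <= deadline_hours


if __name__ == "__main__":
    print(f"minimum rate: {min_rate_pct_per_hour():.3f} %/hour")
    # The "7.333% after 24 hours" case reported earlier in the thread.
    print(f"projected total: {projected_total_hours(7.333, 24):.0f} hours")
    print(f"meets deadline: {meets_deadline(7.333, 24)}")
```

The strict break-even rate is 100/120, roughly 0.83% per hour, so the 0.9% figure quoted above leaves a little margin for time the task spends waiting before it starts.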
©2025 Universitat Pompeu Fabra