Python Runtime (GPU, beta)

ServicEnginIC
Message 57821 - Posted: 12 Nov 2021, 20:15:16 UTC - in response to Message 57818.  

Thank you very much for your continuous support.

ServicEnginIC
Message 57824 - Posted: 13 Nov 2021, 10:44:54 UTC
Last modified: 13 Nov 2021, 10:46:38 UTC

Overnight, each of my 6 currently active, varied Linux hosts received at least one task of the kind ...-ABOU_ppod_gym_test9-0-1-...
All the tasks gave a valid result, none of them errored. This is promising!

My triple-GPU host happened to receive several tasks in a short time, and three of them were executed concurrently.
It caught my attention that overall system temperatures changed drastically when transitioning from the highly GPU/CPU-intensive PrimeGrid tasks to the Gpugrid tasks.

On the other hand, every GPU was indeed executing its own task, as shown in the following nvidia-smi screenshot:

[Image: nvidia-smi screenshot]

This confirms Keith Myers' observation that the previous task-to-GPU assignment problem in multi-GPU systems is solved. Well done!
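
For anyone who wants to repeat this check without a screenshot, here is a minimal Python sketch that asks nvidia-smi which compute processes are running on which GPU. It assumes nvidia-smi is on the PATH and relies on its --query-compute-apps CSV output. The helper names are made up for illustration, and none of this is part of the Gpugrid application.

```python
import csv
import io
import shutil
import subprocess
from collections import defaultdict


def parse_compute_apps(csv_text):
    """Group (pid, process name) pairs by GPU UUID from nvidia-smi CSV output."""
    by_gpu = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        gpu_uuid, pid, name = (field.strip() for field in row[:3])
        by_gpu[gpu_uuid].append((int(pid), name))
    return dict(by_gpu)


def query_compute_apps():
    """Run nvidia-smi and return {gpu_uuid: [(pid, process_name), ...]}."""
    if shutil.which("nvidia-smi") is None:
        return {}  # no NVIDIA driver tools found on this host
    proc = subprocess.run(
        ["nvidia-smi",
         "--query-compute-apps=gpu_uuid,pid,process_name",
         "--format=csv,noheader"],
        capture_output=True, text=True,
    )
    if proc.returncode != 0:
        return {}  # driver not loaded or query failed
    return parse_compute_apps(proc.stdout)


if __name__ == "__main__":
    for gpu, procs in query_compute_apps().items():
        print(gpu, procs)
```

On a triple-GPU host with one task per card, you would expect three different GPU UUIDs, each listing one Python process.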

mmonnin
Message 57827 - Posted: 13 Nov 2021, 13:27:09 UTC
Last modified: 13 Nov 2021, 13:27:24 UTC

I enabled Python tasks on a 2nd PC with a 1070 and a 1080, and they all error out.

https://www.gpugrid.net/result.php?resultid=32662330
Output in format: Requested package -> Available versions

It then lists tons of packages and versions.

When I check the Python version on this PC I get 'Python 2.7.17'. On the PC that works, Python is not installed at all.

I'm guessing there is some incompatibility between packages I have installed?

Keith Myers
Message 57828 - Posted: 13 Nov 2021, 17:52:56 UTC

You needn't install any packages. The tasks are entirely packaged with everything they need in the work unit bundle.

mmonnin
Message 57830 - Posted: 13 Nov 2021, 23:15:03 UTC

Supposedly, but then they should work. Another PC of mine, also with Ubuntu 18.04, driver 470 and a Pascal-architecture GPU, works OK. These tasks were all completed by others.

Keith Myers
Message 57831 - Posted: 14 Nov 2021, 1:34:01 UTC

I can only guess that the tasks are getting confused between the locally installed old Python 2.7 and the Python 3.8 contained in the bundle.

Python 2.7 is deprecated in current Linux distributions, which now ship Python 3.6 at minimum.

You might want to either uninstall Python or upgrade it to the 3 series. I don't think uninstalling is desirable, though, as I believe a lot of stock applications are Python-based and you would lose those.

Jim1348
Message 57832 - Posted: 14 Nov 2021, 2:27:31 UTC - in response to Message 57831.  

I think you can uninstall python2 without damage. At least I could on Ubuntu 20.04.3, though I had only BOINC and Folding installed on it.
But I then made the mistake of trying to purge all python versions. It made the system unbootable, and I had to re-install it.

ServicEnginIC
Message 57833 - Posted: 14 Nov 2021, 9:12:32 UTC - in response to Message 57827.  

> When I check python version on this PC I get 'Python 2.7.17'. On the PC that works, Python is not installed at all.

To rule out that something is getting confused by the old, deprecated Python version, you can upgrade to Python 3 with the following Terminal commands:

sudo apt install python-is-python3
sudo apt install python3-pip

And after that, you can uninstall unnecessary old packages with the command:

sudo apt autoremove

Richard Haselgrove
Message 57834 - Posted: 14 Nov 2021, 10:24:54 UTC

I also have two closely similar Linux machines:

132158
508381

Don't be fooled by the host IDs: 132158 is an inherited ID from an earlier generation of hardware, and is actually slightly younger than 508381. Both run the same version of Linux Mint 20.2, installed from the same ISO download, and the same basic software environment - but I do make tweaks to the installed packages separately, as I encounter different testing needs.

Yesterday, I was away from home, but both machines downloaded tasks from the ppod_gym_test9 batch. 132158 failed to run them, 508381 succeeded.

The problem occurs during the learner.step in Python, with a ValueError raised at line 55 during initialisation:

  File "/var/lib/boinc-client/slots/4/gpugridpy/lib/python3.8/site-packages/torch/distributions/distribution.py", line 55, in __init__
    raise ValueError(
ValueError: Expected parameter loc (Tensor of shape (146, 8)) of distribution Normal(loc: torch.Size([146, 8]), scale: torch.Size([146, 8])) to satisfy the constraint Real(), but found invalid values:
tensor([[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0',
       grad_fn=<AddmmBackward0>)

The two 'file extraction' logs for the GPUGrid Python download seem to be different. I'll try to compare the software environment of the two machines and work out where the difference is coming from.
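
To compare failed tasks across hosts, the ValueError signature quoted above can be pulled out of a task's stderr text automatically. The following is only a sketch: the function name and the regular expression are illustrative, written against the single message format quoted in this post, and other PyTorch versions may word the message differently.

```python
import re

# Matches the header line of the error quoted in this post, for example:
# "ValueError: Expected parameter loc (Tensor of shape (146, 8)) of distribution Normal(..."
INVALID_PARAM = re.compile(
    r"ValueError: Expected parameter (?P<param>\w+) "
    r"\(Tensor of shape \((?P<shape>[^)]*)\)\) "
    r"of distribution (?P<dist>\w+)\("
)


def find_invalid_param_errors(stderr_text):
    """Return one dict per 'Expected parameter ...' ValueError found in the text."""
    hits = []
    for match in INVALID_PARAM.finditer(stderr_text):
        shape = tuple(
            int(dim) for dim in match.group("shape").split(",") if dim.strip()
        )
        hits.append({
            "param": match.group("param"),
            "distribution": match.group("dist"),
            "shape": shape,
        })
    return hits
```

Running it over the stderr of several results from the same work unit would show quickly whether they all stop on the same parameter and tensor shape.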

Richard Haselgrove
Message 57838 - Posted: 14 Nov 2021, 17:49:02 UTC - in response to Message 57834.  

Well, I've looked through the software installations for both machines, but I can't see any significant differences. Both have Python 3.8 installed (probably with the operating system), and no sign of any Python 2.x; I've installed a few sundries from the terminal (libboost, git, some 32-bit libs for CPDN), but the list is the same on both machines.

The 'file extraction' logs are different for every task, and sometimes the same filename appears more than once in the list for a single task.

For the tasks I ran successfully on host 508381, that was the only host that attempted them. The tasks that failed on host 132158 were issued to the full limit of 8 hosts, and failed on all of them.

I can only assume that the difference between success and failure resulted from differences in the task data make-up, and not from differences in the installed software on my hosts.

mmonnin
Message 57840 - Posted: 15 Nov 2021, 2:59:24 UTC - in response to Message 57833.  
Last modified: 15 Nov 2021, 3:00:56 UTC

> > When I check python version on this PC I get 'Python 2.7.17'. On the PC that works, Python is not installed at all.
>
> To rule out that something is getting confused by the old, deprecated Python version, you can upgrade to Python 3 with the following Terminal commands:
>
> sudo apt install python-is-python3
> sudo apt install python3-pip
>
> And after that, you can uninstall unnecessary old packages with the command:
>
> sudo apt autoremove


The python-is command didn't work.

So I followed the instructions here, starting with Option 1:
https://phoenixnap.com/kb/how-to-install-python-3-ubuntu

At the end I ran python --version to check. Still 2.7.17, even though the install seemed to complete.

So I tried Option 2, from source. That worked OK too, with 3.7.5.
I got to the end and saw the note about checking for specific versions. Uh, oh.

python --version = 2.7.17
python3 --version = 3.6.9
python3.7 --version = 3.7.5

So now I have 3 versions installed haha. Maybe one will work, dunno. But we'll need some more tasks to find out.
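
A quick way to see which interpreter each of those names resolves to is a short script like the one below. It is a sketch, not a fix: the function name is made up, and it only reports what is on the PATH. The Gpugrid tasks are said earlier in this thread to ship their own Python inside the work unit bundle, so this tells you nothing about what the task itself uses.

```python
import shutil
import subprocess


def interpreter_versions(names=("python", "python3", "python3.7")):
    """Map each command name to (path, version string), or None if not on PATH."""
    report = {}
    for name in names:
        path = shutil.which(name)
        if path is None:
            report[name] = None
            continue
        try:
            proc = subprocess.run(
                [path, "--version"], capture_output=True, text=True
            )
        except OSError:
            report[name] = (path, "could not be executed")
            continue
        # Python 2 prints its version to stderr, Python 3 prints it to stdout.
        version = (proc.stdout or proc.stderr).strip()
        report[name] = (path, version)
    return report


if __name__ == "__main__":
    for name, info in interpreter_versions().items():
        print(name, "->", info)
```

Having several versions side by side is normal on Ubuntu. The plain python name simply points at whichever one the system links it to.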

Aurum
Message 58077 - Posted: 12 Dec 2021, 14:10:45 UTC - in response to Message 57833.  

> sudo apt install python-is-python3
> sudo apt install python3-pip

Thanks, that worked and I now have Python 3.8.10 installed on my two GG computers with CUDA 11.4.

I just noticed that one computer had previously attempted to run a python WU but it failed. https://www.gpugrid.net/result.php?resultid=32727968
The stderr said this among many other things:
==> WARNING: A newer version of conda exists. <==
  current version: 4.8.3
  latest version: 4.11.0

Please update conda by running

    $ conda update -n base -c defaults conda

I tried running that command, but it said "conda: command not found."

The rig that didn't run a Python WU installed many more lines of files. The rig that did run the failed Python WU installed fewer than half as many.

What are all of the prerequisites I need to run these python WUs?

ServicEnginIC
Message 58079 - Posted: 12 Dec 2021, 14:50:51 UTC - in response to Message 58077.  

> What are all of the prerequisites I need to run these python WUs?

I read Keith Myers' Message #58061.
Then, I executed:

sudo apt install cmake

Coincidence or not, the following Python task then worked for me: e1a1-ABOU_rnd_ppod3-0-1-RND4818_5
The same WU had previously failed on five other hosts.

Aurum
Message 58081 - Posted: 12 Dec 2021, 15:28:52 UTC - in response to Message 58079.  

> sudo apt install cmake


Done. Fingers crossed. Thx

Keith Myers
Message 58083 - Posted: 12 Dec 2021, 16:47:26 UTC - in response to Message 58079.  

> > What are all of the prerequisites I need to run these python WUs?
>
> I read Keith Myers' Message #58061.
> Then, I executed:
>
> sudo apt install cmake
>
> Coincidence or not, the following Python task then worked for me: e1a1-ABOU_rnd_ppod3-0-1-RND4818_5
> The same WU had previously failed on five other hosts.

I was hoping to get a response from the researcher before interfering with the process. Happy someone beat me to it.

So once again we crunchers need to help along the process by installing missing software on our hosts to properly crunch the work the researchers are sending out.

It would be nice if the researchers ran some of their work on test systems of their own before releasing it to the public, or, as we are also known . . . "beta-testers."

abouh
Message 58105 - Posted: 14 Dec 2021, 16:59:44 UTC - in response to Message 58083.  

Hello everyone, sorry for the late reply.

We detected the "cmake" error and found a way around it that does not require installing anything. Some jobs already finished successfully last Friday without reporting this error.

The error was related to atari_py, as some users reported. More specifically, it came from installing this Python package from GitHub (https://github.com/openai/atari-py), which allows some Atari 2600 games to be used as a test bench for reinforcement learning (RL) agents.

Sorry for the inconvenience. Although the AI-agent part of the code has been tested and works, every time we test our agents in a new environment we need to replace the environment initialisation part of the code with one containing the new environment, in this case atari_py.

I just sent another batch of 5 test jobs. Three have already finished; the other two seem to be running without problems but have not yet finished.

http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730763
http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730759
http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730761

http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730760
http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730762

Aurum
Message 58106 - Posted: 14 Dec 2021, 19:08:48 UTC - in response to Message 58105.  
Last modified: 14 Dec 2021, 19:12:24 UTC

> http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730760
> http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730762

I cannot open these links. Please use the [url][/url] tags to make them linkable.

I have 2 running now and am surprised how much memory they report using. They finished and reported as I wrote this, so I can't say exactly how much, but I think it said 22 GB each. My System Monitor reported much less, on the order of 17 GB, which has since been released. How much RAM should we have to run pythonGPU?

https://www.gpugrid.net/result.php?resultid=32730780
https://www.gpugrid.net/result.php?resultid=32730783

BTW, I installed cmake and the latest Python 3.8. Should I uninstall cmake as a better test?

I recommend setting its CPU requirement to 1 rather than 0.963.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Message 58107 - Posted: 14 Dec 2021, 19:24:18 UTC - in response to Message 58106.  

Those are private links, but you can see the result ID.

Ian&Steve C.
Message 58108 - Posted: 14 Dec 2021, 20:21:24 UTC - in response to Message 58106.  
Last modified: 14 Dec 2021, 20:21:53 UTC

> > http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730760
> > http://gpugrid.net/PS3GRID_ops/db_action.php?table=result&id=32730762
>
> I cannot open these links. Please use the [url][/url] tags to make them linkable.
>
> I have 2 running now and am surprised how much memory they report using. They finished and reported as I wrote this, so I can't say exactly how much, but I think it said 22 GB each. My System Monitor reported much less, on the order of 17 GB, which has since been released. How much RAM should we have to run pythonGPU?
>
> https://www.gpugrid.net/result.php?resultid=32730780
> https://www.gpugrid.net/result.php?resultid=32730783
>
> BTW, I installed cmake and the latest Python 3.8. Should I uninstall cmake as a better test?
>
> I recommend setting its CPU requirement to 1 rather than 0.963.


Real memory, or virtual memory allocation? High virtual memory is normal, on the order of tens of GB, even for acemd3 tasks.
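
On Linux, the two figures can be told apart by reading /proc/<pid>/status, where VmSize is the virtual allocation and VmRSS is the resident (real) memory. A minimal sketch follows, with illustrative function names that are not part of BOINC or Gpugrid.

```python
import os
import re


def parse_vm_fields(status_text):
    """Extract VmSize (virtual) and VmRSS (resident) in kB from /proc status text."""
    fields = {}
    for key in ("VmSize", "VmRSS"):
        match = re.search(rf"^{key}:\s+(\d+)\s+kB", status_text, re.MULTILINE)
        fields[key] = int(match.group(1)) if match else None
    return fields


def memory_of(pid):
    """Read /proc/<pid>/status for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_fields(handle.read())


if __name__ == "__main__":
    # Report on this script's own process as a harmless demonstration.
    if os.path.exists("/proc/self/status"):
        print(memory_of("self"))
```

Calling memory_of with the PID of a running task should show whether a figure in the tens of GB is only the virtual size while the resident size is much smaller.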

Re: CPU use for the task: this is easily configured client-side with an app_config file, which will force 1:1 no matter what the project defines. I'd recommend that.
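
As a rough illustration of that client-side override, the sketch below generates an app_config.xml that reserves a full CPU core per GPU task. The app name used here, PythonGPU, is an assumption and has to match the short app name your client reports for these tasks (check client_state.xml). The helper function is made up for this example.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, cpu_usage=1.0, gpu_usage=1.0):
    """Return app_config.xml text that sets CPU and GPU usage for one app."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    gpu_versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(gpu_versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(gpu_versions, "cpu_usage").text = str(cpu_usage)
    if hasattr(ET, "indent"):  # pretty-printing needs Python 3.9 or newer
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "PythonGPU" is an assumed app name; replace it with the one your client shows.
    print(build_app_config("PythonGPU"))
```

The output would be saved as app_config.xml in the Gpugrid project directory of the BOINC data folder, after which the client needs to re-read its config files or be restarted.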

Aurum
Message 58109 - Posted: 15 Dec 2021, 12:25:37 UTC - in response to Message 58108.  
Last modified: 15 Dec 2021, 12:30:57 UTC

> Re: CPU use for the task: this is easily configured client-side with an app_config file, which will force 1:1 no matter what the project defines. I'd recommend that.

I wasn't asking you for a trivial response. I'm asking the people that create these work units why they don't specify 1 instead of 0.963.

©2025 Universitat Pompeu Fabra