Experimental Python tasks (beta) - task description
Author | Message |
---|---|
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
It's from http://ppa.launchpad.net/costamagnagianfranco/boinc/ubuntu My Ubuntu 20.04.3 machines updated themselves this morning to v7.18.1 for the 3rd time. Sorry for the confusion. |
Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,102,898 |
"I think they use a different PPA, not the standard Ubuntu version." You're right. I've checked, and this is my complete repository listing. There are new pending updates for the BOINC package, but I've recently caught a new ACEMD3 ADRIA task, and I'm not updating until it is finished and reported. My experience is that these tasks are highly prone to fail if something is changed while they are processing. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
"Which distro/repository are you using?" Ah. Your reply here gave me a different impression. Slight egg on face, but both our Linux update manager screenshots fail to give source information in their consolidated update lists. Maybe we should put in a feature request? |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
ACEMD3 task finished on my original machine, so I updated BOINC from PPA 2022-01-30 to 2022-02-04. I can confirm that if you used systemctl edit to create a separate override file, it remains in place - no need to re-edit every time. If you used a text editor to edit the raw systemd file in place, of course, it'll get overwritten and will need editing again. (Final proof-of-the-pudding of that last statement awaits the release of the next test batch.) |
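For anyone wanting to check which kind of edit they made: `systemctl edit <unit>` normally writes a drop-in file under /etc/systemd/system/<unit>.d/ (by default override.conf), which is separate from the packaged unit file and so is not touched by package updates. A minimal Python sketch that lists such drop-ins, assuming the unit is called boinc-client.service as on the Ubuntu machines discussed here (the function name `find_dropins` is made up for illustration):

```python
from pathlib import Path
from typing import List

# Default location where `systemctl edit` places drop-in overrides.
# This is an assumption about a standard systemd layout.
ETC_SYSTEMD = Path("/etc/systemd/system")


def find_dropins(unit: str, etc_root: Path = ETC_SYSTEMD) -> List[Path]:
    """Return the drop-in .conf files for `unit`, sorted by name.

    An empty list means no override directory exists, i.e. any changes
    were made some other way (for example by editing the unit file itself).
    """
    dropin_dir = etc_root / f"{unit}.d"
    if not dropin_dir.is_dir():
        return []
    return sorted(dropin_dir.glob("*.conf"))


if __name__ == "__main__":
    found = find_dropins("boinc-client.service")
    if found:
        for path in found:
            print("override:", path)
    else:
        print("no drop-in overrides found")
```

If this prints an override path, the change lives in its own file and should be unaffected by the next BOINC package update.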
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
Got a new task (task 32738148). Running normally, which confirms the override to systemd is preserved. Getting entries in stderr as before: "wandb: WARNING Path /var/lib/boinc-client/slots/7/.config/wandb/wandb/ wasn't writable, using system temp directory" (we're back in slot 7 as usual). There are six folders created in slot 7: agent_demos, gpugridpy, int_demos, monitor_logs, python_dependencies, ROMS. There are no hidden folders, and certainly no .config. The wandb data is in: /tmp/systemd-private-f670b90d460b4095a25c37b7348c6b93-boinc-client.service-7Jvpgh/tmp There are 138 folders in there, including one called simply wandb. The wandb folder contains: debug-internal.log, debug.log, latest-run, run-20220206_163543-1wmmcgi5. The first two are files, the last two are folders. There is no subfolder called wandb - so no recursion, such as the warning message suggests. Hope that helps. |
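The warning describes a common pattern: try the preferred directory, and fall back to the system temp directory if it cannot be written. The sketch below illustrates that pattern only; it is not wandb's actual code, and `choose_run_dir` is a hypothetical name. Under a systemd service with a private temp directory, the fallback location would show up under a /tmp/systemd-private-... path like the one quoted above.

```python
import os
import tempfile
from pathlib import Path


def choose_run_dir(preferred: Path) -> Path:
    """Return `preferred` if it can be created and written to,
    otherwise fall back to the system temp directory."""
    try:
        # Create the directory (and parents) if it does not exist yet.
        preferred.mkdir(parents=True, exist_ok=True)
    except OSError:
        # Covers permission errors, read-only filesystems, bad paths, etc.
        return Path(tempfile.gettempdir())
    # The directory exists; check that this process may write into it.
    if os.access(preferred, os.W_OK | os.X_OK):
        return preferred
    return Path(tempfile.gettempdir())


if __name__ == "__main__":
    # Path taken from the warning in the post above.
    wanted = Path("/var/lib/boinc-client/slots/7/.config/wandb")
    print("run data would go to:", choose_run_dir(wanted))
```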
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0 |
Thanks! The content of the slot directory is correct. The wandb directory will also be placed in the slot directory soon, in the next experiment. During the current experiment, which consists of multiple batches of tasks, the wandb directory will still be in /tmp, as a result of the warning. That is not a problem per se, but I agree that it will be cleaner to place it in the slot directory, so all BOINC files are there. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
wandb: Run data is saved locally in /var/lib/boinc-client/slots/7/wandb/run-20220209_082943-1pdoxrzo |
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0 |
Great, thanks a lot for the confirmation. So now it seems the directory is the appropriate one. |
Joined: 3 Sep 21 Posts: 3 Credit: 146,609,125 RAC: 0 |
Pretty happy to see that my little Quadro K620s could actually handle one of the ABOU work units. Successfully ran one in under 31 hours. It didn't hit the memory too hard, which helps. The K620 has a DDR3 memory bus so the bandwidth is pretty limited. http://www.gpugrid.net/result.php?resultid=32741283 Though, it did fail one of the Anaconda work units that went out. The error message doesn't mean much to me. http://www.gpugrid.net/result.php?resultid=32741757 Traceback (most recent call last): File "run.py", line 40, in <module> assert os.path.exists('output.coor') AssertionError 11:22:33 (1966061): ./gpugridpy/bin/python exited; CPU time 0.295254 11:22:33 (1966061): app exit status: 0x1 11:22:33 (1966061): called boinc_finish(195) |
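On reading that traceback: a bare `assert os.path.exists('output.coor')` raises AssertionError with no message, so all it tells us is that run.py did not find a file called output.coor in its working directory at line 40. Why the file was missing is not visible from the log. A hedged sketch of how such a check could report more; only the asserted file name comes from the traceback, the rest (function names, the error wording) is invented for illustration and is not the project's run.py:

```python
import os
from typing import Iterable, List


def missing_outputs(expected: Iterable[str], workdir: str = ".") -> List[str]:
    """Return the expected file names that are not present in `workdir`."""
    return [
        name for name in expected
        if not os.path.exists(os.path.join(workdir, name))
    ]


def check_outputs(expected: Iterable[str], workdir: str = ".") -> None:
    """Raise a descriptive error instead of a bare AssertionError."""
    missing = missing_outputs(expected, workdir)
    if missing:
        raise FileNotFoundError(
            f"expected output file(s) not found in {workdir!r}: "
            + ", ".join(missing)
        )


if __name__ == "__main__":
    try:
        check_outputs(["output.coor"])
    except FileNotFoundError as err:
        print("task would fail with:", err)
```

With a message like this in stderr, a volunteer could at least see which file the script was waiting for.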
Joined: 26 Sep 13 Posts: 20 Credit: 1,714,356,441 RAC: 0 |
All tasks end in errors on this machine: https://www.gpugrid.net/results.php?hostid=591484 Note that the machine does not have a GPU usable by BOINC. Thanks for your help. |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
I got two of those yesterday as well. They are described as "Anaconda Python 3 Environment v4.01 (mt)" - declared to run as multi-threaded CPU tasks. I do have working GPUs (on host 508381), but I don't think these tasks actually need a GPU. The task names refer to a different experimenter (RAIMIS) from the ones we've been discussing recently in this thread. |
Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 678,713 |
We were running those kinds of tasks a year ago. Looks like the researcher has made an appearance again. |
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 |
I just downloaded one, but it errored out before I could even catch it starting. It ran for 3 seconds, required four cores of a Ryzen 3950X on Ubuntu 20.04.3, and had an estimated time of 2 days. I think they have some work to do. http://www.gpugrid.net/result.php?resultid=32742752 PS - It probably does not help that that machine is running BOINC 7.18.1. I have had problems with it before. I will try 7.16.6 later. |
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 |
PPS - It ran for two minutes on an equivalent Ryzen 3950X running BOINC 7.16.6, and then errored out. |
Joined: 3 May 20 Posts: 19 Credit: 1,043,759,208 RAC: 29,826 |
I just ran 4 of the Python CPU work units on my Ryzen 7 5800H, Ubuntu 20.04.3 LTS, 16 GB RAM. Each was run on 4 CPU threads at the same time. The first 0.6% took over 10 minutes, then they jumped to 10%, continued a while longer until 17 minutes had passed, and then all errored out at more or less the same point in the task. Here is one example: 32743954 |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
A RAIMIS MT task - which accounts for the 4 threads. And yet - Run NVIDIA GeForce RTX 3060 Laptop GPU (4095MB) Traceback (most recent call last): |
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 |
I am running two of the Anacondas now. They each reserve four threads, but are apparently only using one of them, since BoincTasks shows 25% CPU usage. They have been running for two hours, and should complete in 14 hours total, though the estimates are way off and show 12 days. Therefore, they are running high priority even though they should complete with no problem. |
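The arithmetic behind that reading can be written out as a rough sketch. It assumes the 25% figure is measured against all four reserved threads (the post does not say how BoincTasks computes it), and the function names are made up for illustration:

```python
def busy_threads(cpu_percent: float, reserved_threads: int) -> float:
    """Threads' worth of CPU actually in use, assuming `cpu_percent`
    is relative to all reserved threads (an assumption, see above)."""
    return cpu_percent / 100.0 * reserved_threads


def overestimate_factor(estimated_hours: float, expected_hours: float) -> float:
    """How many times larger the client's estimate is than the run
    time extrapolated from actual progress."""
    return estimated_hours / expected_hours


if __name__ == "__main__":
    # Figures from the post: 25% usage on 4 reserved threads,
    # a 12-day estimate versus roughly 14 hours expected in total.
    print("threads in use:", busy_threads(25, 4))
    print("estimate is off by a factor of about",
          round(overestimate_factor(12 * 24, 14), 1))
```

An estimate that far above the real run time is a plausible reason for the client to switch the tasks to high priority, though that link is an inference rather than something the logs show.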
Joined: 3 May 20 Posts: 19 Credit: 1,043,759,208 RAC: 29,826 |
Hey Richard. To what extent is my GPU's memory involved in a CPU task? |
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 326,008 |
"Hey Richard. To what extent is my GPU's memory involved in a CPU task?" It shouldn't be - that's why I drew attention to it. I think both AbouH and RAIMIS are experimenting with different applications, which exploit both GPUs and multiple CPUs. It isn't at all obvious how best to manage a combination like that under BOINC - the BOINC developers only got as far as thinking about either/or, not both together. So far, Abou seems to have got further down the road, but I'm not sure how much further development is required. We watch and wait, and help where we can. |
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 |
My first two Anacondas ended OK after 31 hours. But they were _2 and _3. I am not sure what the error messages mean. Some ended after a couple of minutes, while others went longer. http://www.gpugrid.net/results.php?hostid=593715 I am running a _4 now. After 18 minutes it is OK, but the CPU usage is still trending down to a single core after starting out high. |
©2025 Universitat Pompeu Fabra