Experimental Python tasks (beta) - task description

Message boards : News : Experimental Python tasks (beta) - task description
Message board moderation

To post messages, you must log in.

Previous · 1 . . . 26 · 27 · 28 · 29 · 30 · 31 · 32 . . . 50 · Next

AuthorMessage
Richard Haselgrove

Send message
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 318
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 59336 - Posted: 26 Sep 2022, 9:26:17 UTC - in response to Message 59335.  
Last modified: 26 Sep 2022, 9:26:45 UTC

Looks good to me. Just one question - are there any 'minimum Windows version' constraints on the later versions of 7za? I think it's unlikely to affect us, but it would be good to check, just in case.

I mention it, because the original trial runs used native Windows tar decompression (the same as the Linux implementation): but that was only introduced in later versions of Windows 10 and 11. Some of us (myself included) still use Windows 7, which supports 7z but not tar. A reasonable degree of backwards compatibility is desirable!
ID: 59336 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[CSF] Aleksey Belkov

Send message
Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 59337 - Posted: 26 Sep 2022, 12:00:48 UTC - in response to Message 59335.  
Last modified: 26 Sep 2022, 12:05:18 UTC

Hi, abouh!

Change A:
You are correct.

Change B
You are correct.
2) If this can't be changed or too hard / long to implement - no big deal.
In any case, pipelining still save some time and space : )

in this case, would the command line would be simply this one? without the -o flag?

Of course, if you launch 7za from working directory(/slots/X), than output flag not necessary.

Change C
You are correct.
Using 7z format(LZMA2 compression) significantly reduce archive size, save your bandwidth and some time for unpacking/unzipping process ; )
As I wrote above, the 7za command will be simplified, since the pipelining process will no longer be required.
NB! It is important to update the supplied 7za to current version, since version 9.20, a lot of optimizations have been made for compression/decompression of 7z archives(LZMA).


Just one question - are there any 'minimum Windows version' constraints on the later versions of 7za?

As mentioned on 7-Zip homepage, app support all versions since Windows 2000:

7-Zip works in Windows 10 / 8 / 7 / Vista / XP / 2019 / 2016 / 2012 / 2008 / 2003 / 2000.
ID: 59337 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
abouh

Send message
Joined: 31 May 21
Posts: 200
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 59340 - Posted: 26 Sep 2022, 13:39:36 UTC
Last modified: 26 Sep 2022, 13:42:42 UTC

As a very first step I am trying to remove the .tar.gz file. I am encountering a first issue. The steps of the jobs are specified in the job.xml file in the following way:

<job_desc>

<task>
<application>.\7za.exe</application>
<command_line>x .\pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17 -y</command_line>
</task>

<task>
<application>.\7za.exe</application>
<command_line>x .\pythongpu_windows_x86_64__cuda1131.tar -y</command_line>
</task>

....

<job_desc>


Essentially I need to execute a task that removes the pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17 file after the very first task.

When I try in the Windows command prompt:

cmd.exe /C "del pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17"


it works. However when I add to the job.xml file

<task>
<application>cmd.exe</application>
<command_line>/C "del .\pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17"</command_line>
</task>


The wrapper seems to ignore it. Doesn't the wrapper have cmd.exe? I need to run more tests to figure out the exact command to delete files
ID: 59340 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[CSF] Aleksey Belkov

Send message
Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 59341 - Posted: 26 Sep 2022, 14:09:41 UTC - in response to Message 59340.  

<task>
<application>cmd.exe</application>
<command_line>/C "del .\pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17"</command_line>
</task>

Try to use %COMSPEC% variable as alias to %SystemRoot%\system32\cmd.exe
If this doesn't work, then I'm sure specifying the full path(C:\Windows\system32\cmd.exe) should work.
ID: 59341 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Ian&Steve C.

Send message
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 4,772
Level
Trp
Scientific publications
wat
Message 59343 - Posted: 26 Sep 2022, 14:50:29 UTC

in other news. looks like we've finally crunched through all the tasks ready to send. all that remains are the ones in progress and the resends that will come from those.

any more coming soon?
ID: 59343 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
abouh

Send message
Joined: 31 May 21
Posts: 200
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 59347 - Posted: 27 Sep 2022, 13:53:43 UTC - in response to Message 59341.  
Last modified: 27 Sep 2022, 14:06:09 UTC

True! Specifying the whole path works:

<job_desc>

<task>
<application>C:\Windows\system32\cmd.exe</application>
<command_line>/C "del \pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17</command_line>
</task>

</job_desc>


I have deployed this Change A into the PythonGPUbeta app, just to test if it works in all Windows machines. Just sent a few (32) jobs. If it works fine on, will move on to introduce the other changes.
ID: 59347 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
abouh

Send message
Joined: 31 May 21
Posts: 200
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 59348 - Posted: 27 Sep 2022, 14:02:54 UTC - in response to Message 59343.  
Last modified: 27 Sep 2022, 14:05:21 UTC

I will be running new experiments shortly. My idea is to use the whole capacity of the grid. I have already noticed that a few months ago it could absorb around 800 tasks and now it goes up to 1000! Thank you for all the support :)
ID: 59348 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
abouh

Send message
Joined: 31 May 21
Posts: 200
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 59354 - Posted: 28 Sep 2022, 7:10:17 UTC

The first batch I sent to PythonGPUbeta yesterday failed, but I figured out the problem this morning. I just sent another batch an hour ago to the PythonGPUbeta app. This time seems to be working. It has Change A implemented, so memory usage is more optimised.
ID: 59354 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
abouh

Send message
Joined: 31 May 21
Posts: 200
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 59356 - Posted: 28 Sep 2022, 7:49:54 UTC - in response to Message 59337.  
Last modified: 28 Sep 2022, 8:50:11 UTC

Hello Aleksey!

I was looking at how to implement Chance C, namely if we can encode and decode the task conda-environment files using 7zip format and recent versions of 7za.exe.

We use conda-pack to compress the conda environment that we later unpack in the gpugrid windows machines using 7za.exe.

However, looking at the documentation seems like 7zip is not a format conda-pack can deal with. https://conda.github.io/conda-pack/cli.html

Apparently the possible formats include: zip, tar.gz, tgz, tar.bz2, tbz2, tar.xz, txz, tar, parcel (?), squashfs (?)

So in case of switching from the current tar.gz, we could only go to one of these. Maybe tbz2 or txz? seems like this ones we can unpacked in a single step as well, if recent versions 7za.exe allow to handle this format.

Any recommendation? :)

For tbz2 the file size is similar, slightly smaller. The txz file is substantially smaller but took forever (30 mins) to compress.
2.0G pythongpu_windows_x86_64__cuda102.tar.gz
1.9G pythongpu_windows_x86_64__cuda102.tbz2
1.2G pythongpu_windows_x86_64__cuda102.txz
ID: 59356 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Ian&Steve C.

Send message
Joined: 21 Feb 20
Posts: 1116
Credit: 40,839,470,595
RAC: 4,772
Level
Trp
Scientific publications
wat
Message 59357 - Posted: 28 Sep 2022, 12:25:25 UTC

more tasks? I'm running dry ;)
ID: 59357 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Keith Myers
Avatar

Send message
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 662
Level
Tyr
Scientific publications
watwatwatwatwat
Message 59358 - Posted: 28 Sep 2022, 15:39:18 UTC

More tasks please, also.
ID: 59358 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
bibi

Send message
Joined: 4 May 17
Posts: 15
Credit: 17,444,875,743
RAC: 217
Level
Trp
Scientific publications
watwatwatwatwat
Message 59359 - Posted: 28 Sep 2022, 16:13:06 UTC - in response to Message 59356.  

Hi,

why not producing a zip file, because the boinc client can unzip such file direct from the project folder to the slot like with acemd3.
When it works, 7za.exe and this extra tasks are not necessary.

pythongpu_windows_x86_64__cuda1131.zip has 2,58 GB
pythongpu_windows_x86_64__cuda1131.tar.gz has 2,66 GB
ID: 59359 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[CSF] Aleksey Belkov

Send message
Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 59360 - Posted: 28 Sep 2022, 18:26:19 UTC - in response to Message 59356.  
Last modified: 28 Sep 2022, 18:42:56 UTC

Good day, abouh

This time seems to be working. It has Change A implemented,

It's nice to hear that!

Maybe tbz2 or txz?

As I understand, tbz2/txz are alias of file extension for tar.bz2/tar.xz.
So in fact these formats are tar containers which compressed by bz2 or xz.
Therefore, this will require pipelining process, which, however, practically does not affect the unpacking speed, and only lengthens command string.
In my test, unpacking of tar.xz done in ~40 seconds.

seems like this ones we can unpacked in a single step as well, if recent versions 7za.exe allow to handle this format.

xz format supported since version 9.04 beta, but more recent version support multi-threaded (de)compression, witch crucial for fast unpacking.


The txz file is substantially smaller but took forever (30 mins) to compress.

This format use LZMA2 algorithm, similar as 7z use by default. So space saving must be the same with the same settings(--compress-level).
It's highly likely you forgot to use this flag
--n-threads <n>, -j <n>

to set number of threads to use for compression. By default conda-pack use only 1 thread!
And also check --compress-level. Levels higher then 5 not so effective for compression_time/archive_size.
Considering how I think that PythonGPU's app file rarely changes, it's not big deal.
As far as I remember, this (practically) does not affect unpacking speed.
On my test(32 threads / Threadripper 2950X), it took ~2,5 minutes with compress-level 5(archive size 1,55 GiB).
ID: 59360 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[CSF] Aleksey Belkov

Send message
Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 59361 - Posted: 28 Sep 2022, 18:56:09 UTC - in response to Message 59359.  
Last modified: 28 Sep 2022, 19:47:06 UTC

why not producing a zip file, because the boinc client can unzip such file direct from the project folder to the slot like with acemd3.

You're probably right.
I somehow didn't pay attention to acemd3 archives in project directory.
Is there some info, how BOINC's work with archives?
I suppose boinc-client uses its built-in library to work with archives (zlib ?), rather than some OS functions/tools.

There's still a dilemma:
1) On the one hand, using zip format will simplify process of application launching and reduce the amount of disk space required by application (no need to copy archive to the working directory). Amount of written data on disk reduced accordingly.
2) On other hand, xz format reduce archive size by whole GiB, that helps to save project's network bandwidth and time to download necessary files at first users access to project.
ID: 59361 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[CSF] Aleksey Belkov

Send message
Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 59362 - Posted: 28 Sep 2022, 19:56:28 UTC - in response to Message 59360.  

On my test(32 threads / Threadripper 2950X), it took ~2,5 minutes with compress-level 5(archive size 1,55 GiB).

It's about compression*
ID: 59362 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
abouh

Send message
Joined: 31 May 21
Posts: 200
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 59364 - Posted: 29 Sep 2022, 14:16:34 UTC - in response to Message 59359.  
Last modified: 29 Sep 2022, 14:49:18 UTC

We tried to pack files with zip at first but encountered problems in windows. Not sure if it was some kind of strange quirk in the wrapper or in conda-pack (the tool for creating, packing and unpacking conda environments, https://conda.github.io/conda-pack/), but the process failed for compressed environment files above a certain memory size.

We then tried to used another format that could compress the files to a smaller size than .zip. We tried .tar but not all windows version have tar.exe (old ones do not).

We finally found this solution of sending 7za.exe along with the conda packed environment to be able to unpack it as part of the job.

I am not 100% sure, but I suspect acemd3 does not use PyTorch machine learning python framework, which increases substantially the size of the packed environment. And I believe acemd4 does use pytorch, and faces the same issue as the PythonGPU tasks.
ID: 59364 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
abouh

Send message
Joined: 31 May 21
Posts: 200
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 59365 - Posted: 29 Sep 2022, 14:54:43 UTC - in response to Message 59362.  

You were absolutely right, I forgot the number of threads! I could now reproduce a a much faster compression as well.

I will proceed to test if I can use the BOINC wrapper and a newer version of 7za.exe to unpack it locally in a reasonable amount of time and then will deploy it to PythonGPUbeta for testing.

Thank you very much!
ID: 59365 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
bibi

Send message
Joined: 4 May 17
Posts: 15
Credit: 17,444,875,743
RAC: 217
Level
Trp
Scientific publications
watwatwatwatwat
Message 59366 - Posted: 29 Sep 2022, 18:21:05 UTC - in response to Message 59365.  

Hi abouh,

the provided 7za.exe has version 9.20 from 2010. The last version on 7-zip.org is 22.01 (now 7z.exe).
If you want to unpack in a pipe or delete the tar file, you need cmd. But the used starter wrapper_6.1_windows_x86_64.exe (see project folder) don't know about environment and the windows folder isn't necessarily c:\windows, so you also should provide cmd.exe.
Unpacking in a pipe:
<task>
<application>.\cmd.exe</application>
<command_line>/c .\7za.exe -so x pythongpu_windows_x86_64__cuda1131.tar.xz | .\7za.exe -y -sifile.txt.tar x & exit</command_line>
<weight>1</weight>
</task>

Why conda-pack with format zip is not working I don't know.
ID: 59366 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
bibi

Send message
Joined: 4 May 17
Posts: 15
Credit: 17,444,875,743
RAC: 217
Level
Trp
Scientific publications
watwatwatwatwat
Message 59367 - Posted: 29 Sep 2022, 18:56:26 UTC - in response to Message 59366.  

7z.exe calls the dll, 7za.exe stands alone. You find it in 7-Zip Extra on https://7-zip.org/download.html
But your version works too.
ID: 59367 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
[CSF] Aleksey Belkov

Send message
Joined: 26 Dec 13
Posts: 86
Credit: 1,292,358,731
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 59368 - Posted: 29 Sep 2022, 21:01:20 UTC - in response to Message 59366.  


the provided 7za.exe has version 9.20 from 2010. The last version on 7-zip.org is 22.01


7z.exe calls the dll, 7za.exe stands alone. You find it in 7-Zip Extra on https://7-zip.org/download.html

All this has already been discussed by several posts above.
If you had read before writing...

so you also should provide cmd.exe.

I think this is not a good idea.
Some antiviruses may perceive an attempt to launch cmd.exe not from the system directory as suspicious/malicious activity.
ID: 59368 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Previous · 1 . . . 26 · 27 · 28 · 29 · 30 · 31 · 32 . . . 50 · Next

Message boards : News : Experimental Python tasks (beta) - task description

©2025 Universitat Pompeu Fabra