Experimental Python tasks (beta) - task description
| Author | Message |
|---|---|
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 318
|
Looks good to me. Just one question - are there any 'minimum Windows version' constraints on the later versions of 7za? I think it's unlikely to affect us, but it would be good to check, just in case. I mention it, because the original trial runs used native Windows tar decompression (the same as the Linux implementation): but that was only introduced in later versions of Windows 10 and 11. Some of us (myself included) still use Windows 7, which supports 7z but not tar. A reasonable degree of backwards compatibility is desirable! |
|
Joined: 26 Dec 13 Posts: 86 Credit: 1,292,358,731 RAC: 0
|
Hi, abouh! Change A: You are correct. Change B: You are correct. 2) If this can't be changed, or is too hard or too slow to implement, it's no big deal. In any case, pipelining still saves some time and space : )
Of course, if you launch 7za from the working directory (/slots/X), then the output flag is not necessary. Change C: You are correct. Using the 7z format (LZMA2 compression) significantly reduces the archive size, which saves your bandwidth and some time in the unpacking process ; ) As I wrote above, the 7za command will be simplified, since the pipelining process will no longer be required. NB! It is important to update the supplied 7za to the current version: since version 9.20, a lot of optimizations have been made for compression/decompression of 7z archives (LZMA).
As mentioned on the 7-Zip homepage, the app supports all versions since Windows 2000:
|
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
As a very first step I am trying to remove the .tar.gz file. I am encountering a first issue. The steps of the jobs are specified in the job.xml file in the following way: <job_desc> Essentially I need to execute a task that removes the pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17 file after the very first task. When I try in the Windows command prompt: cmd.exe /C "del pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17" it works. However when I add to the job.xml file <task> The wrapper seems to ignore it. Doesn't the wrapper have cmd.exe? I need to run more tests to figure out the exact command to delete files |
|
Joined: 26 Dec 13 Posts: 86 Credit: 1,292,358,731 RAC: 0
|
<task> Try using the %COMSPEC% variable as an alias for %SystemRoot%\system32\cmd.exe. If this doesn't work, then I'm sure specifying the full path (C:\Windows\system32\cmd.exe) should work. |
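The lookup order suggested above (%COMSPEC% first, then %SystemRoot%\system32\cmd.exe, then the hard-coded path) can be sketched in Python. This is only an illustration under an assumption: that the deletion is done from a Python step running on the volunteer's machine rather than from job.xml. The function names `resolve_cmd_exe` and `delete_command` are hypothetical and not part of BOINC or the wrapper.

```python
import ntpath


def resolve_cmd_exe(env):
    """Return a full path to cmd.exe from a mapping of environment variables.

    Order of preference: COMSPEC, then SystemRoot\\system32\\cmd.exe,
    then the conventional C:\\Windows location as a last resort.
    """
    comspec = env.get("COMSPEC")
    if comspec:
        return comspec
    system_root = env.get("SystemRoot", r"C:\Windows")
    # ntpath always uses Windows path rules, whatever OS runs this sketch.
    return ntpath.join(system_root, "system32", "cmd.exe")


def delete_command(env, filename):
    """Build an argument list equivalent to: cmd.exe /C "del <filename>"."""
    return [resolve_cmd_exe(env), "/C", "del " + filename]


if __name__ == "__main__":
    import os

    # On a real Windows host, os.environ would normally contain COMSPEC.
    print(delete_command(os.environ, "example_archive.tar.gz"))
```

Passing the environment in as a plain mapping keeps the lookup easy to check on any machine; on Windows the resulting list could be handed to `subprocess.run`.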
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 4,772
|
in other news. looks like we've finally crunched through all the tasks ready to send. all that remains are the ones in progress and the resends that will come from those. any more coming soon?
|
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
True! Specifying the whole path works: <job_desc> I have deployed this Change A into the PythonGPUbeta app, just to test whether it works on all Windows machines. I just sent a few (32) jobs. If it works fine, I will move on to introduce the other changes. |
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
I will be running new experiments shortly. My idea is to use the whole capacity of the grid. I have already noticed that a few months ago it could absorb around 800 tasks and now it goes up to 1000! Thank you for all the support :) |
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
The first batch I sent to PythonGPUbeta yesterday failed, but I figured out the problem this morning. I sent another batch to the PythonGPUbeta app an hour ago. This time it seems to be working. It has Change A implemented, so memory usage is more optimised. |
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
Hello Aleksey! I was looking at how to implement Change C, namely whether we can encode and decode the task conda-environment files using the 7zip format and recent versions of 7za.exe. We use conda-pack to compress the conda environment, which we later unpack on the GPUGRID Windows machines using 7za.exe. However, according to the documentation, 7zip is not a format conda-pack can deal with. https://conda.github.io/conda-pack/cli.html Apparently the possible formats include: zip, tar.gz, tgz, tar.bz2, tbz2, tar.xz, txz, tar, parcel (?), squashfs (?) So if we switch from the current tar.gz, we could only go to one of these. Maybe tbz2 or txz? It seems these can be unpacked in a single step as well, if recent versions of 7za.exe can handle these formats. Any recommendation? :) For tbz2 the file size is similar, slightly smaller. The txz file is substantially smaller but took forever (30 mins) to compress.
2.0G pythongpu_windows_x86_64__cuda102.tar.gz
1.9G pythongpu_windows_x86_64__cuda102.tbz2
1.2G pythongpu_windows_x86_64__cuda102.txz |
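The size comparison above can be tried on a small scale with Python's standard `tarfile` module, which writes the same three container/compressor pairs (tar.gz, tar.bz2, tar.xz). This is a rough sketch on a synthetic, highly repetitive payload, so the ratios it prints say nothing about a real conda environment; `packed_size` and the sample payload are made up for the illustration.

```python
import io
import tarfile


def packed_size(payload: bytes, mode: str) -> int:
    """Pack one in-memory file into a tar archive and return its size in bytes.

    mode is a tarfile write mode such as "w:gz", "w:bz2" or "w:xz".
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode=mode) as archive:
        info = tarfile.TarInfo(name="sample.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return len(buffer.getvalue())


# Synthetic stand-in for environment files (an assumption, not real data).
payload = b"import torch\nprint(torch.cuda.is_available())\n" * 20000

sizes = {mode: packed_size(payload, mode) for mode in ("w:gz", "w:bz2", "w:xz")}
for mode, size in sizes.items():
    print(f"{mode}: {size} bytes (uncompressed payload: {len(payload)} bytes)")
```

Running the same function over a sample of the actual environment files would give a quicker estimate than repacking the full 2 GB archive for each format.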
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 4,772
|
more tasks? I'm running dry ;)
|
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 662
|
More tasks please, also. |
|
Joined: 4 May 17 Posts: 15 Credit: 17,444,875,743 RAC: 217
|
Hi, why not produce a zip file? The BOINC client can unzip such a file directly from the project folder to the slot, as with acemd3. If that works, 7za.exe and these extra tasks are not necessary. pythongpu_windows_x86_64__cuda1131.zip is 2,58 GB; pythongpu_windows_x86_64__cuda1131.tar.gz is 2,66 GB. |
|
Joined: 26 Dec 13 Posts: 86 Credit: 1,292,358,731 RAC: 0
|
Good day, abouh. "This time seems to be working. It has Change A implemented," It's nice to hear that! "Maybe tbz2 or txz?" As I understand it, tbz2/txz are file-extension aliases for tar.bz2/tar.xz. So in fact these formats are tar containers compressed with bz2 or xz. Therefore, this will require the pipelining process, which, however, has practically no effect on unpacking speed and only lengthens the command string. In my test, unpacking the tar.xz took ~40 seconds. "seems like this ones we can unpacked in a single step as well, if recent versions 7za.exe allow to handle this format." The xz format has been supported since version 9.04 beta, but more recent versions support multi-threaded (de)compression, which is crucial for fast unpacking. "The txz file is substantially smaller but took forever (30 mins) to compress." This format uses the LZMA2 algorithm, the same one 7z uses by default. So the space saving should be the same with the same settings (--compress-level). It's highly likely you forgot to use the flag --n-threads <n>, -j <n> to set the number of threads to use for compression. By default conda-pack uses only 1 thread! Also check --compress-level. Levels higher than 5 are not very effective in terms of compression time versus archive size. Considering that, as I think, PythonGPU's app file rarely changes, it's no big deal. As far as I remember, this (practically) does not affect unpacking speed. In my test (32 threads / Threadripper 2950X), it took ~2,5 minutes with compress-level 5 (archive size 1,55 GiB). |
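The point that a tar.xz is a tar container wrapped in xz, and that the two stages can be chained without writing an intermediate .tar, can be illustrated with Python's `tarfile` stream mode (`"r|xz"`). This is an analogue of the 7za pipe, not the project's actual unpacking step; `make_tar_xz` and `extract_streaming` are hypothetical helpers, and the sketch assumes a trusted archive (it does no path sanitising).

```python
import io
import tarfile
import tempfile
from pathlib import Path


def make_tar_xz(files: dict) -> bytes:
    """Build a small tar.xz archive in memory from a {name: bytes} mapping."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def extract_streaming(archive_bytes: bytes, destination: Path) -> list:
    """Decompress and untar in a single pass over the stream.

    Mode "r|xz" reads the archive strictly sequentially, so the decompressed
    tar data is consumed as it is produced and never stored as a .tar file.
    Assumes a trusted archive: member names are used as-is.
    """
    names = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r|xz") as archive:
        for member in archive:
            if not member.isfile():
                continue
            target = destination / member.name
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(archive.extractfile(member).read())
            names.append(member.name)
    return names


with tempfile.TemporaryDirectory() as tmp:
    data = make_tar_xz({"env/python.txt": b"hello"})
    extracted = extract_streaming(data, Path(tmp))
    restored = (Path(tmp) / "env" / "python.txt").read_bytes()
    print(extracted, restored)
```

The disk-space benefit is the same as with the shell pipe: only the compressed archive and the final files exist on disk at any time.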
|
Joined: 26 Dec 13 Posts: 86 Credit: 1,292,358,731 RAC: 0
|
"why not producing a zip file, because the boinc client can unzip such file direct from the project folder to the slot like with acemd3." You're probably right. I somehow didn't pay attention to the acemd3 archives in the project directory. Is there any information on how BOINC works with archives? I suppose boinc-client uses a built-in library to work with archives (zlib?), rather than OS functions/tools. There's still a dilemma: 1) On the one hand, using the zip format would simplify launching the application and reduce the disk space it requires (no need to copy the archive to the working directory). The amount of data written to disk is reduced accordingly. 2) On the other hand, the xz format reduces the archive size by a whole GiB, which saves the project's network bandwidth and the time users need to download the necessary files when they first join the project. |
|
Joined: 26 Dec 13 Posts: 86 Credit: 1,292,358,731 RAC: 0
|
"On my test(32 threads / Threadripper 2950X), it took ~2,5 minutes with compress-level 5(archive size 1,55 GiB)." This refers to compression* |
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
We tried to pack files with zip at first but encountered problems on Windows. Not sure if it was some kind of strange quirk in the wrapper or in conda-pack (the tool for creating, packing and unpacking conda environments, https://conda.github.io/conda-pack/), but the process failed for compressed environment files above a certain memory size. We then tried to use another format that could compress the files to a smaller size than .zip. We tried .tar, but not all Windows versions have tar.exe (old ones do not). We finally found this solution of sending 7za.exe along with the conda-packed environment so that it can be unpacked as part of the job. I am not 100% sure, but I suspect acemd3 does not use the PyTorch machine learning framework, which substantially increases the size of the packed environment. And I believe acemd4 does use PyTorch, and faces the same issue as the PythonGPU tasks. |
|
Joined: 31 May 21 Posts: 200 Credit: 0 RAC: 0
|
You were absolutely right, I forgot the number of threads! I can now reproduce a much faster compression as well. I will proceed to test whether I can use the BOINC wrapper and a newer version of 7za.exe to unpack it locally in a reasonable amount of time, and will then deploy it to PythonGPUbeta for testing. Thank you very much! |
|
Joined: 4 May 17 Posts: 15 Credit: 17,444,875,743 RAC: 217
|
Hi abouh, the provided 7za.exe is version 9.20 from 2010. The latest version on 7-zip.org is 22.01 (now 7z.exe). If you want to unpack in a pipe or delete the tar file, you need cmd. But the starter in use, wrapper_6.1_windows_x86_64.exe (see project folder), doesn't know about the environment, and the Windows folder isn't necessarily c:\windows, so you should also provide cmd.exe. Unpacking in a pipe: <task> <application>.\cmd.exe</application> <command_line>/c .\7za.exe -so x pythongpu_windows_x86_64__cuda1131.tar.xz | .\7za.exe -y -sifile.txt.tar x & exit</command_line> <weight>1</weight> </task> I don't know why conda-pack with the zip format is not working. |
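For anyone generating job.xml from a script rather than by hand, the task structure quoted in this thread (job_desc, task, application, command_line, weight) can be produced with Python's standard `xml.etree.ElementTree`. This is a minimal sketch using only the element names that appear in the posts above; `build_job_xml` is a hypothetical helper, and whether the wrapper's parser expects `&` escaped as `&amp;` (which ElementTree does automatically) or raw is an open question that would need testing.

```python
import xml.etree.ElementTree as ET


def build_job_xml(tasks):
    """Serialise (application, command_line) pairs into a <job_desc> document.

    Only the element names seen in this thread are used; every task gets
    weight 1, as in the example posted above.
    """
    root = ET.Element("job_desc")
    for application, command_line in tasks:
        task = ET.SubElement(root, "task")
        ET.SubElement(task, "application").text = application
        ET.SubElement(task, "command_line").text = command_line
        ET.SubElement(task, "weight").text = "1"
    return ET.tostring(root, encoding="unicode")


job_xml = build_job_xml([
    # Delete step, using the full path that was reported to work.
    (r"C:\Windows\system32\cmd.exe",
     r'/C "del pythongpu_windows_x86_64__cuda1131.tar.gz.1a152f102cdad20f16638f0f269a5a17"'),
    # Pipe step, copied from the post above.
    (r".\cmd.exe",
     r"/c .\7za.exe -so x pythongpu_windows_x86_64__cuda1131.tar.xz | .\7za.exe -y -sifile.txt.tar x & exit"),
])
print(job_xml)
```

Generating the file this way keeps the tasks in one ordered list, so reordering or dropping a step does not involve editing nested XML by hand.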
|
Joined: 4 May 17 Posts: 15 Credit: 17,444,875,743 RAC: 217
|
7z.exe calls the DLL, while 7za.exe stands alone. You can find it in 7-Zip Extra at https://7-zip.org/download.html But your version works too. |
|
Joined: 26 Dec 13 Posts: 86 Credit: 1,292,358,731 RAC: 0
|
All this has already been discussed by several posts above. If you had read before writing...
I think this is not a good idea. Some antiviruses may perceive an attempt to launch cmd.exe not from the system directory as suspicious/malicious activity. |
©2025 Universitat Pompeu Fabra