
Message boards : Graphics cards (GPUs) : Exit Code 195

Arkhenia
Joined: 30 Aug 20
Posts: 3
Credit: 122,586,304
RAC: 853
Message 60859 - Posted: 8 Nov 2023 | 14:27:06 UTC
Last modified: 8 Nov 2023 | 14:29:44 UTC

Hello everyone,

This is my first post on this forum, so I hope I'm not in the wrong section.

I'm having a problem with all the ATMbeta units I receive. They all fail in less than 2 minutes.

Looking at the logs, every unit fails with Exit Code 195: 195 (0xc3) EXIT_CHILD_FAILED.

Example: https://www.gpugrid.net/result.php?resultid=33689424

Here is my configuration (https://www.gpugrid.net/show_host_detail.php?hostid=583132)
- OS : Windows 11
- CPU : Ryzen 7 3700X
- GPU : RTX 3060 12 GB

I've tried the project on my wife's PC (Windows 11 / Ryzen 7 3800X / GTX 1050 Ti) and it works fine.

I've also noticed a difference with a python process that runs on my wife's PC but not on mine.

I've reset the project but nothing changes. Do you have any ideas?

I can run other GPU projects without any problem.

Thanks for your help

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1079
Credit: 40,231,533,983
RAC: 11
Message 60860 - Posted: 8 Nov 2023 | 14:46:32 UTC - in response to Message 60859.

the error code itself isn't very useful (it's pretty generic for a lot of BOINC app failures). but your specific error seems to be something related to attempting to run cmd.exe, which the app uses to call the python processes.

try running BOINC as administrator. shut down BOINC, then when you re-launch it, right click -> Run as administrator. I'm not sure if this works, but give it a shot. I asked another user to try this, but haven't gotten any feedback on whether it helps the situation or not. It could also be a Windows 10 vs Windows 11 thing, as it looks like Windows 10 systems are more successful than 11.

Some other notes that are important for these tasks that you should be aware of and keep in mind:

1. they cannot be interrupted at all. if you pause the running task and resume, they will fail. if you reboot your computer, they will fail. if BOINC decides to pause them to run another app, they will fail. so be prepared and allow them to finish without interruption.

2. they require internet access. at least for the initial setup phase of the task. if you lose internet access, the task will fail because it cannot download the necessary additional packages that it needs.

3. most tasks will jump to 100% after a few minutes and remain there for a while. these tasks are NOT stuck and you do not need to abort them. just let them work. the first segment of tasks (tasks with "0-5" or "0-10" in the WU name) will track progress normally. but all subsequent tasks from 1+ will have this jump to 100% behavior. it's all fine, just let them run, which can take several hours.

having said that, there do seem to be more issues with the Windows version of this application, and if possible, I would recommend Linux instead, as it runs these tasks more reliably.
____________

Arkhenia
Message 60861 - Posted: 8 Nov 2023 | 17:37:24 UTC

Good evening,

Thank you for your reply.

I've just done a test running BOINC as administrator and it doesn't change anything (https://www.gpugrid.net/result.php?resultid=33692094).

My wife's PC is also running Windows 11.

Regarding the various points raised about the project, they're not a problem, as my PC runs 24/7.

As for Linux, it's more complicated because I only use Windows-compatible software. I can set up a Virtual Machine but I don't know whether it will support my video card.

Thanks for your help

Ian&Steve C.
Message 60862 - Posted: 8 Nov 2023 | 18:00:13 UTC - in response to Message 60861.

thanks for the feedback. now i can stop recommending that :)

I don't have any Windows systems, but I can at least see the run.bat file that these tasks attempt to run.

@echo Setup environment
set HOMEPATH=%CD%
set PATH=%CD%;%CD%\Library\usr\bin;%CD%\Library\bin;C:\Windows\system32;C:\Windows
set PYTHONPATH=%CD%\Lib\python3.9\site-packages
set SYSTEMROOT=C:\Windows

@echo Create a temporary directory
set TEMP=%CD%\tmp
mkdir %TEMP%

@echo Install AToM
set REPO_URL=git+https://github.com/raimis/AToM-OpenMM.git@d7931b9a6217232d481731f7589d64b100a514ac
python.exe -m pip install %REPO_URL% || exit 14
python.exe -m pip list

@echo Configure AToM
echo localhost,0:%CUDA_DEVICE%,1,CUDA,,%TEMP% > nodefile

@echo Extract restart
tar.exe xjvf restart.tar.bz2 || true

@echo Run AToM
set CONFIG_FILE=tnks2_m07_m3b_asyncre.cntl
python.exe Scripts\rbfe_explicit_sync.py %CONFIG_FILE% || exit 22

@echo Save output
tar.exe cjvf output.tar.bz2 run.log r*/*.out r*/*.dcd || true

@echo Save restart
tar.exe cjvf restart.tar.bz2 r*/*.xml
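
For reference, the two `|| exit N` lines in the script above are the only places where run.bat sets its own exit code: 14 if the pip install of AToM fails and 22 if the AToM run fails. A minimal Python sketch of that mapping follows. The helper name and the wording of the messages are made up for illustration and are not part of the app; it also assumes the child's exit code is visible somewhere in the task's stderr output, which is not confirmed here.

```python
# Hypothetical helper (not part of the GPUGRID app): maps the custom exit
# codes set in run.bat to the stage that produced them.
# Only 14 and 22 appear in the script; any other code comes from elsewhere.
RUN_BAT_EXIT_CODES = {
    14: "pip install of AToM failed (python.exe -m pip install %REPO_URL%)",
    22: "AToM run failed (python.exe Scripts\\rbfe_explicit_sync.py %CONFIG_FILE%)",
}


def describe_run_bat_exit(code: int) -> str:
    """Return a readable description for an exit code seen in the task log."""
    return RUN_BAT_EXIT_CODES.get(
        code, f"exit code {code} is not set by run.bat itself"
    )


if __name__ == "__main__":
    for code in (14, 22, 195):
        print(code, "->", describe_run_bat_exit(code))
```

A code that is not in the table, such as the 195 reported for the whole task, does not come from these two lines, which fits with the failure happening before the script gets to the pip step.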


the process seems to be failing sometime before downloading the required packages.

working backwards,
it's either failing to download due to internet connectivity issues (do you have any special network filtering or blocks?)
or maybe failing to download based on AV software blocks? (you could temporarily disable your AV, and/or whitelist your BOINC data folder so that nothing gets blocked on those directories)

or maybe a permissions issue trying to create the temporary folder

or maybe an issue setting the required environment variables.
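
The guesses above can be checked by hand with a short script. This is only a rough sketch, not something the project provides: the function names are hypothetical, github.com on port 443 is assumed as the download target because that is where REPO_URL points, and the PATH value is whatever you pass in. On Windows it could be run from a regular Python install while BOINC is stopped.

```python
# Rough diagnostic sketch for the failure causes guessed above.
# All function names are hypothetical; nothing here is part of the app.
import shutil
import socket
import tempfile
from pathlib import Path
from typing import Optional


def can_create_temp_dir(base: Path) -> bool:
    """Check that a directory can be created and written to under `base`."""
    try:
        probe = tempfile.mkdtemp(prefix="probe_", dir=base)
    except OSError:
        return False
    try:
        (Path(probe) / "write_test.txt").write_text("ok")
        return True
    except OSError:
        return False
    finally:
        # Always clean up the probe directory again.
        shutil.rmtree(probe, ignore_errors=True)


def find_on_path(exe: str, path_value: str) -> Optional[str]:
    """Return the full path of `exe` if it is found in `path_value`, else None."""
    return shutil.which(exe, path=path_value)


def can_reach(host: str = "github.com", port: int = 443, timeout: float = 5.0) -> bool:
    """Check that a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("temp dir writable:", can_create_temp_dir(Path.cwd()))
    print("cmd.exe found:", find_on_path("cmd.exe", r"C:\Windows\system32;C:\Windows"))
    print("github.com reachable:", can_reach())
```

If all three checks pass from a normal user session but the tasks still fail, that would point toward something specific to how BOINC launches the task (for example AV software acting on the BOINC data folder) rather than the machine in general.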
____________

Arkhenia
Message 60863 - Posted: 8 Nov 2023 | 18:34:47 UTC - in response to Message 60862.

Thank you for your reply.

I'll be doing some tests tomorrow, as I've reached my quota of 6 tasks a day.

As far as running as administrator is concerned, it doesn't change anything in my case, but perhaps it could work for others.

Thank you for your help
