Name | wu_bf8c763e-GIANNI_GLLM-0-1-RND4247_0 |
Workunit | 31482131 |
Created | 23 Apr 2025, 12:23:38 UTC |
Sent | 23 Apr 2025, 12:43:18 UTC |
Report deadline | 28 Apr 2025, 12:43:18 UTC |
Received | 23 Apr 2025, 12:56:04 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 195 (0x000000C3) EXIT_CHILD_FAILED |
Computer ID | 637819 |
Run time | 3 min 50 sec |
CPU time | 3 min 39 sec |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 15,485.45 GFLOPS |
Application version | LLM: LLMs for chemistry v1.00 (cuda124L) x86_64-pc-linux-gnu |
Peak working set size | 3.91 GB |
Peak swap size | 38.25 GB |
Peak disk usage | 8.21 GB |
<core_client_version>7.19.0</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
2025-04-23 08:51:33 (4188403): wrapper (8.1.26018): starting
2025-04-23 08:52:27 (4188403): wrapper: running bin/python (bin/conda-unpack)
2025-04-23 08:52:27 (4188403): wrapper: created child process 4188627
2025-04-23 08:52:29 (4188403): bin/python exited; CPU time 0.949069
2025-04-23 08:52:29 (4188403): wrapper: running bin/tar (xjvf input.tar.bz2)
2025-04-23 08:52:29 (4188403): wrapper: created child process 4188643
2025-04-23 08:52:30 (4188403): bin/tar exited; CPU time 0.019291
2025-04-23 08:52:30 (4188403): wrapper: running bin/bash (run.sh)
2025-04-23 08:52:30 (4188403): wrapper: created child process 4188652
+ echo 'Setup environment'
+ source bin/activate
++ _conda_pack_activate
++ local _CONDA_SHELL_FLAVOR
++ '[' -n x ']'
++ _CONDA_SHELL_FLAVOR=bash
++ local script_dir
++ case "$_CONDA_SHELL_FLAVOR" in
+++ dirname bin/activate
++ script_dir=bin
+++ cd bin
+++ pwd
++ local full_path_script_dir=/home/ian/BOINC/slots/4/bin
+++ dirname /home/ian/BOINC/slots/4/bin
++ local full_path_env=/home/ian/BOINC/slots/4
+++ basename /home/ian/BOINC/slots/4
++ local env_name=4
++ '[' -n '' ']'
++ export CONDA_PREFIX=/home/ian/BOINC/slots/4
++ CONDA_PREFIX=/home/ian/BOINC/slots/4
++ export _CONDA_PACK_OLD_PS1=
++ _CONDA_PACK_OLD_PS1=
++ PATH=/home/ian/BOINC/slots/4/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
++ PS1='(4) '
++ case "$_CONDA_SHELL_FLAVOR" in
++ hash -r
++ local _script_dir=/home/ian/BOINC/slots/4/etc/conda/activate.d
++ '[' -d /home/ian/BOINC/slots/4/etc/conda/activate.d ']'
+ export PATH=/home/ian/BOINC/slots/4:/home/ian/BOINC/slots/4/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ PATH=/home/ian/BOINC/slots/4:/home/ian/BOINC/slots/4/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ echo 'Create a temporary directory'
+ export TMP=/home/ian/BOINC/slots/4/tmp
+ TMP=/home/ian/BOINC/slots/4/tmp
+ mkdir -p /home/ian/BOINC/slots/4/tmp
+ which python
+ pip install main_generation-0.1.0-py3-none-any.whl -v --no-deps
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ export HF_HOME=../.cache
+ HF_HOME=../.cache
+ export VLLM_ASSETS_CACHE=../.cache
+ VLLM_ASSETS_CACHE=../.cache
+ export VLLM_CACHE_ROOT=../.cache
+ VLLM_CACHE_ROOT=../.cache
+ echo RUNNING
+ pythonbinary=/home/ian/BOINC/slots/4/lib/python3.12/site-packages/aiengine/main_generation.pyc
+ python /home/ian/BOINC/slots/4/lib/python3.12/site-packages/aiengine/main_generation.pyc --conf conf.yaml
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 1000 examples [00:00, 214696.15 examples/s]
Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 50% Completed | 1/2 [00:01<00:01, 1.01s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.02s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.02s/it]
Loading safetensors checkpoint shards: 0% Completed | 0/2 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 50% Completed | 1/2 [00:01<00:01, 1.09s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.11s/it]
Loading safetensors checkpoint shards: 100% Completed | 2/2 [00:02<00:00, 1.10s/it]
Map: 0%| | 0/1000 [00:00<?, ? examples/s]
Map: 100%|██████████| 1000/1000 [00:00<00:00, 15043.21 examples/s]
run.sh: line 26: 4188685 Killed python ${pythonbinary} --conf conf.yaml
2025-04-23 08:55:23 (4188403): bin/bash exited; CPU time 35.388880
2025-04-23 08:55:23 (4188403): app exit status: 0x89
2025-04-23 08:55:23 (4188403): called boinc_finish(195)
</stderr_txt>
]]>
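The wrapper's "app exit status: 0x89" can be read with the usual shell convention that a child terminated by signal N is reported as 128 + N. The sketch below is not part of the task output; the helper name `decode_shell_status` is hypothetical, and it assumes a Linux host where signal 9 is SIGKILL. It applies only to the status bash reported to the wrapper, not to BOINC's own exit code 195 (EXIT_CHILD_FAILED).

```python
import signal


def decode_shell_status(status: int) -> str:
    """Describe a shell-style exit status (hypothetical helper, not a BOINC API)."""
    # Shells report a child killed by signal N as 128 + N.
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with code {status}"


# 0x89 is the value from the "app exit status" line in the log.
print(decode_shell_status(0x89))
```

Here 0x89 is 137, which is 128 + 9, matching the "Killed" message that bash printed for the python process. The log does not say what sent the signal; the 38.25 GB peak swap size is consistent with an out-of-memory kill, but that is an inference rather than something the output confirms.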
©2025 Universitat Pompeu Fabra