Name | test_2-SFARR_TEST_LLM_WINDOWS_101_6-0-1-RND3891_3
Workunit | 31482397 |
Created | 24 Apr 2025, 14:29:21 UTC |
Sent | 24 Apr 2025, 14:30:20 UTC |
Report deadline | 29 Apr 2025, 14:30:20 UTC |
Received | 24 Apr 2025, 15:07:03 UTC |
Server state | Over |
Outcome | Computation error |
Client state | Compute error |
Exit status | 195 (0x000000C3) EXIT_CHILD_FAILED |
Computer ID | 628374 |
Run time | 50 sec |
CPU time | 10 sec |
Validate state | Invalid |
Credit | 0.00 |
Device peak FLOPS | 84,054.00 GFLOPS |
Application version | LLM: LLMs for chemistry v1.00 (cuda124L) x86_64-pc-linux-gnu |
Peak working set size | 758.27 MB |
Peak swap size | 7.92 GB |
Peak disk usage | 8.13 GB |
<core_client_version>8.2.0</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
2025-04-24 17:04:38 (575506): wrapper (8.1.26018): starting
2025-04-24 17:05:09 (575506): wrapper: running bin/python (bin/conda-unpack)
2025-04-24 17:05:09 (575506): wrapper: created child process 575554
2025-04-24 17:05:10 (575506): bin/python exited; CPU time 0.408976
2025-04-24 17:05:10 (575506): wrapper: running bin/tar (xjvf input.tar.bz2)
2025-04-24 17:05:10 (575506): wrapper: created child process 575559
2025-04-24 17:05:11 (575506): bin/tar exited; CPU time 0.009698
2025-04-24 17:05:11 (575506): wrapper: running bin/bash (run.sh)
2025-04-24 17:05:11 (575506): wrapper: created child process 575561
+ echo 'Setup environment'
+ source bin/activate
++ _conda_pack_activate
++ local _CONDA_SHELL_FLAVOR
++ '[' -n x ']'
++ _CONDA_SHELL_FLAVOR=bash
++ local script_dir
++ case "$_CONDA_SHELL_FLAVOR" in
+++ dirname bin/activate
++ script_dir=bin
+++ cd bin
+++ pwd
++ local full_path_script_dir=/home/maciek/BOINC/40000/slots/1/bin
+++ dirname /home/maciek/BOINC/40000/slots/1/bin
++ local full_path_env=/home/maciek/BOINC/40000/slots/1
+++ basename /home/maciek/BOINC/40000/slots/1
++ local env_name=1
++ '[' -n '' ']'
++ export CONDA_PREFIX=/home/maciek/BOINC/40000/slots/1
++ CONDA_PREFIX=/home/maciek/BOINC/40000/slots/1
++ export _CONDA_PACK_OLD_PS1=
++ _CONDA_PACK_OLD_PS1=
++ PATH=/home/maciek/BOINC/40000/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
++ PS1='(1) '
++ case "$_CONDA_SHELL_FLAVOR" in
++ hash -r
++ local _script_dir=/home/maciek/BOINC/40000/slots/1/etc/conda/activate.d
++ '[' -d /home/maciek/BOINC/40000/slots/1/etc/conda/activate.d ']'
+ export PATH=/home/maciek/BOINC/40000/slots/1:/home/maciek/BOINC/40000/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ PATH=/home/maciek/BOINC/40000/slots/1:/home/maciek/BOINC/40000/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ echo 'Create a temporary directory'
+ export TMP=/home/maciek/BOINC/40000/slots/1/tmp
+ TMP=/home/maciek/BOINC/40000/slots/1/tmp
+ mkdir -p /home/maciek/BOINC/40000/slots/1/tmp
+ which python
+ pip install main_generation-0.1.0-py3-none-any.whl -v --no-deps
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ export HF_HOME=../.cache
+ HF_HOME=../.cache
+ export VLLM_ASSETS_CACHE=../.cache
+ VLLM_ASSETS_CACHE=../.cache
+ export VLLM_CACHE_ROOT=../.cache
+ VLLM_CACHE_ROOT=../.cache
+ echo RUNNING
+ pythonbinary=/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/aiengine/main_generation.pyc
+ python /home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/aiengine/main_generation.pyc --conf conf.yaml
Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 1000 examples [00:00, 324410.55 examples/s]
Traceback (most recent call last):
  File "wheel_contents/aiengine/main_generation.py", line 86, in <module>
  File "wheel_contents/aiengine/model.py", line 36, in __init__
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/utils.py", line 1096, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 243, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 521, in from_engine_args
    return engine_cls.from_vllm_config(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 497, in from_vllm_config
    return cls(
           ^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 281, in __init__
    self.model_executor = executor_class(vllm_config=vllm_config, )
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 52, in __init__
    self._init_executor()
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 46, in _init_executor
    self.collective_rpc("init_device")
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
    answer = run_method(self.driver_worker, method, args, kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/utils.py", line 2347, in run_method
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/worker/worker_base.py", line 604, in init_device
    self.worker.init_device()  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/worker/worker.py", line 157, in init_device
    _check_if_gpu_supports_dtype(self.model_config.dtype)
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/worker/worker.py", line 526, in _check_if_gpu_supports_dtype
    raise ValueError(
ValueError: Bfloat16 is only supported on GPUs with compute capability of at least 8.0. Your NVIDIA TITAN V GPU has compute capability 7.0. You can use float16 instead by explicitly setting the `dtype` flag in CLI, for example: --dtype=half.
2025-04-24 17:05:26 (575506): bin/bash exited; CPU time 14.822231
2025-04-24 17:05:26 (575506): app exit status: 0x1
2025-04-24 17:05:26 (575506): called boinc_finish(195)
</stderr_txt>
]]>
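Note on the failure: the task dies in vLLM's device check, not in the science code. The NVIDIA TITAN V is a Volta card (compute capability 7.0), while bfloat16 kernels require compute capability 8.0 (Ampere) or newer, so LLM engine construction aborts before any generation starts. The sketch below shows the workaround the error message points at, applied on the Python side: pick float16 on pre-Ampere GPUs before building the engine. This is only an illustration under assumptions; the real fix for this app would live in the project's aiengine code or conf.yaml (not visible here), and the model name used below is a placeholder.

    # Minimal sketch, assuming torch and vllm are available in the task's conda-pack env.
    import torch
    from vllm import LLM

    def pick_dtype(device: int = 0) -> str:
        # bfloat16 needs compute capability >= 8.0 (Ampere+);
        # older cards such as Volta (7.0) fall back to float16 ("half").
        major, _minor = torch.cuda.get_device_capability(device)
        return "bfloat16" if major >= 8 else "half"

    # "some-org/placeholder-model" is hypothetical; the job's actual model comes from conf.yaml.
    llm = LLM(model="some-org/placeholder-model", dtype=pick_dtype())

Equivalently, passing --dtype=half (or dtype: half in the engine configuration, if the app exposes one) as the error message suggests should let init_device succeed on this card, at the cost of running the model in float16 rather than bfloat16.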
©2025 Universitat Pompeu Fabra