Task 38487480

Name test_2-SFARR_TEST_LLM_WINDOWS_101_6-0-1-RND3891_3
Workunit 31482397
Created 24 Apr 2025, 14:29:21 UTC
Sent 24 Apr 2025, 14:30:20 UTC
Report deadline 29 Apr 2025, 14:30:20 UTC
Received 24 Apr 2025, 15:07:03 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 195 (0x000000C3) EXIT_CHILD_FAILED
Computer ID 628374
Run time 50 sec
CPU time 10 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 84,054.00 GFLOPS
Application version LLM: LLMs for chemistry v1.00 (cuda124L) x86_64-pc-linux-gnu
Peak working set size 758.27 MB
Peak swap size 7.92 GB
Peak disk usage 8.13 GB
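
Note on the exit status: the same code appears as 195, 0xC3 and (in the stderr message) -61. These are one 8-bit value read as unsigned decimal, hexadecimal, and signed decimal. A minimal Python sketch of that arithmetic follows; the helper name `as_signed_byte` is hypothetical and is not part of BOINC.

```python
def as_signed_byte(code: int) -> int:
    """Reinterpret the low 8 bits of an exit code as a signed byte."""
    low = code & 0xFF                       # keep only the low 8 bits
    return low - 256 if low >= 128 else low  # two's-complement interpretation


if __name__ == "__main__":
    # 195 is the exit status reported for this task (EXIT_CHILD_FAILED).
    print(hex(195), as_signed_byte(195))  # prints: 0xc3 -61
```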

Stderr output

<core_client_version>8.2.0</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
2025-04-24 17:04:38 (575506): wrapper (8.1.26018): starting
2025-04-24 17:05:09 (575506): wrapper: running bin/python (bin/conda-unpack)
2025-04-24 17:05:09 (575506): wrapper: created child process 575554
2025-04-24 17:05:10 (575506): bin/python exited; CPU time 0.408976
2025-04-24 17:05:10 (575506): wrapper: running bin/tar (xjvf input.tar.bz2)
2025-04-24 17:05:10 (575506): wrapper: created child process 575559
2025-04-24 17:05:11 (575506): bin/tar exited; CPU time 0.009698
2025-04-24 17:05:11 (575506): wrapper: running bin/bash (run.sh)
2025-04-24 17:05:11 (575506): wrapper: created child process 575561
+ echo 'Setup environment'
+ source bin/activate
++ _conda_pack_activate
++ local _CONDA_SHELL_FLAVOR
++ '[' -n x ']'
++ _CONDA_SHELL_FLAVOR=bash
++ local script_dir
++ case "$_CONDA_SHELL_FLAVOR" in
+++ dirname bin/activate
++ script_dir=bin
+++ cd bin
+++ pwd
++ local full_path_script_dir=/home/maciek/BOINC/40000/slots/1/bin
+++ dirname /home/maciek/BOINC/40000/slots/1/bin
++ local full_path_env=/home/maciek/BOINC/40000/slots/1
+++ basename /home/maciek/BOINC/40000/slots/1
++ local env_name=1
++ '[' -n '' ']'
++ export CONDA_PREFIX=/home/maciek/BOINC/40000/slots/1
++ CONDA_PREFIX=/home/maciek/BOINC/40000/slots/1
++ export _CONDA_PACK_OLD_PS1=
++ _CONDA_PACK_OLD_PS1=
++ PATH=/home/maciek/BOINC/40000/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
++ PS1='(1) '
++ case "$_CONDA_SHELL_FLAVOR" in
++ hash -r
++ local _script_dir=/home/maciek/BOINC/40000/slots/1/etc/conda/activate.d
++ '[' -d /home/maciek/BOINC/40000/slots/1/etc/conda/activate.d ']'
+ export PATH=/home/maciek/BOINC/40000/slots/1:/home/maciek/BOINC/40000/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ PATH=/home/maciek/BOINC/40000/slots/1:/home/maciek/BOINC/40000/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ echo 'Create a temporary directory'
+ export TMP=/home/maciek/BOINC/40000/slots/1/tmp
+ TMP=/home/maciek/BOINC/40000/slots/1/tmp
+ mkdir -p /home/maciek/BOINC/40000/slots/1/tmp
+ which python
+ pip install main_generation-0.1.0-py3-none-any.whl -v --no-deps
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager, possibly rendering your system unusable. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv. Use the --root-user-action option if you know what you are doing and want to suppress this warning.
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ export HF_HOME=../.cache
+ HF_HOME=../.cache
+ export VLLM_ASSETS_CACHE=../.cache
+ VLLM_ASSETS_CACHE=../.cache
+ export VLLM_CACHE_ROOT=../.cache
+ VLLM_CACHE_ROOT=../.cache
+ echo RUNNING
+ pythonbinary=/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/aiengine/main_generation.pyc
+ python /home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/aiengine/main_generation.pyc --conf conf.yaml

Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 1000 examples [00:00, 324410.55 examples/s]
Traceback (most recent call last):
  File "wheel_contents/aiengine/main_generation.py", line 86, in <module>
  File "wheel_contents/aiengine/model.py", line 36, in __init__
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/utils.py", line 1096, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 243, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 521, in from_engine_args
    return engine_cls.from_vllm_config(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 497, in from_vllm_config
    return cls(
           ^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 281, in __init__
    self.model_executor = executor_class(vllm_config=vllm_config, )
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/executor/executor_base.py", line 52, in __init__
    self._init_executor()
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 46, in _init_executor
    self.collective_rpc("init_device")
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/executor/uniproc_executor.py", line 56, in collective_rpc
    answer = run_method(self.driver_worker, method, args, kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/utils.py", line 2347, in run_method
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/worker/worker_base.py", line 604, in init_device
    self.worker.init_device()  # type: ignore
    ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/worker/worker.py", line 157, in init_device
    _check_if_gpu_supports_dtype(self.model_config.dtype)
  File "/home/maciek/BOINC/40000/slots/1/lib/python3.12/site-packages/vllm/worker/worker.py", line 526, in _check_if_gpu_supports_dtype
    raise ValueError(
ValueError: Bfloat16 is only supported on GPUs with compute capability of at least 8.0. Your NVIDIA TITAN V GPU has compute capability 7.0. You can use float16 instead by explicitly setting the `dtype` flag in CLI, for example: --dtype=half.
2025-04-24 17:05:26 (575506): bin/bash exited; CPU time 14.822231
2025-04-24 17:05:26 (575506): app exit status: 0x1
2025-04-24 17:05:26 (575506): called boinc_finish(195)

</stderr_txt>
]]>
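
Note on the failure: the traceback ends in a ValueError because bfloat16 was requested on an NVIDIA TITAN V (compute capability 7.0), while vLLM requires at least 8.0 for that dtype and suggests float16 instead. The sketch below shows one way a launcher could choose the dtype from the compute capability. The function `pick_dtype` is hypothetical, and whether the application's conf.yaml exposes a dtype setting is not known from this log. With PyTorch installed, the capability tuple could be obtained from `torch.cuda.get_device_capability()`; that call is left out here so the example runs without a GPU.

```python
def pick_dtype(compute_capability: tuple[int, int]) -> str:
    """Choose a dtype string from a CUDA compute capability (major, minor).

    Assumption taken from the error message in the log: bfloat16 needs
    compute capability 8.0 or higher; older GPUs fall back to float16.
    """
    return "bfloat16" if compute_capability >= (8, 0) else "float16"


if __name__ == "__main__":
    print(pick_dtype((7, 0)))  # TITAN V from this log; prints: float16
    print(pick_dtype((8, 0)))  # prints: bfloat16
```

The returned string is intended to be passed as the `dtype` setting that the error message refers to (for example `--dtype=half` on the vLLM command line, where `half` is an alias for float16).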


©2025 Universitat Pompeu Fabra