Task 38577923

Name wu_f48458d4-GIANNI_GPROTO7-0-1-RND8020_0
Workunit 31543414
Created 25 Sep 2025, 10:01:42 UTC
Sent 25 Sep 2025, 10:01:55 UTC
Report deadline 30 Sep 2025, 10:01:55 UTC
Received 25 Sep 2025, 14:20:46 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status 195 (0x000000C3) EXIT_CHILD_FAILED
Computer ID 549890
Run time 4 min 1 sec
CPU time 49 sec
Validate state Invalid
Credit 0.00
Device peak FLOPS 24,731.87 GFLOPS
Application version LLM: LLMs for chemistry v1.00 (cuda124L) x86_64-pc-linux-gnu
Peak working set size 773.51 MB
Peak swap size 10.27 GB
Peak disk usage 8.21 GB

Stderr output

<core_client_version>8.3.0</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
2025-09-25 10:16:12 (516593): wrapper (8.1.26018): starting
2025-09-25 10:17:24 (516593): wrapper: running bin/python (bin/conda-unpack)
2025-09-25 10:17:24 (516593): wrapper: created child process 516652
2025-09-25 10:17:27 (516593): bin/python exited; CPU time 1.288054
2025-09-25 10:17:27 (516593): wrapper: running bin/tar (xjvf input.tar.bz2)
2025-09-25 10:17:27 (516593): wrapper: created child process 516654
2025-09-25 10:17:28 (516593): bin/tar exited; CPU time 0.032979
2025-09-25 10:17:28 (516593): wrapper: running bin/bash (run.sh)
2025-09-25 10:17:28 (516593): wrapper: created child process 516656
+ echo 'Setup environment'
+ source bin/activate
++ _conda_pack_activate
++ local _CONDA_SHELL_FLAVOR
++ '[' -n x ']'
++ _CONDA_SHELL_FLAVOR=bash
++ local script_dir
++ case "$_CONDA_SHELL_FLAVOR" in
+++ dirname bin/activate
++ script_dir=bin
+++ cd bin
+++ pwd
++ local full_path_script_dir=/home/whitlerd/Desktop/BOINC/slots/1/bin
+++ dirname /home/whitlerd/Desktop/BOINC/slots/1/bin
++ local full_path_env=/home/whitlerd/Desktop/BOINC/slots/1
+++ basename /home/whitlerd/Desktop/BOINC/slots/1
++ local env_name=1
++ '[' -n '' ']'
++ export CONDA_PREFIX=/home/whitlerd/Desktop/BOINC/slots/1
++ CONDA_PREFIX=/home/whitlerd/Desktop/BOINC/slots/1
++ export _CONDA_PACK_OLD_PS1=
++ _CONDA_PACK_OLD_PS1=
++ PATH=/home/whitlerd/Desktop/BOINC/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
++ PS1='(1) '
++ case "$_CONDA_SHELL_FLAVOR" in
++ hash -r
++ local _script_dir=/home/whitlerd/Desktop/BOINC/slots/1/etc/conda/activate.d
++ '[' -d /home/whitlerd/Desktop/BOINC/slots/1/etc/conda/activate.d ']'
+ export PATH=/home/whitlerd/Desktop/BOINC/slots/1:/home/whitlerd/Desktop/BOINC/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ PATH=/home/whitlerd/Desktop/BOINC/slots/1:/home/whitlerd/Desktop/BOINC/slots/1/bin:/usr/local/bin:/usr/local/sbin:/usr/bin:/usr/sbin:/bin:/sbin:.
+ echo 'Create a temporary directory'
+ export TMP=/home/whitlerd/Desktop/BOINC/slots/1/tmp
+ TMP=/home/whitlerd/Desktop/BOINC/slots/1/tmp
+ mkdir -p /home/whitlerd/Desktop/BOINC/slots/1/tmp
+ which python
+ pip install main_generation-0.1.0-py3-none-any.whl -v --no-deps
+ export CUDA_VISIBLE_DEVICES=0
+ CUDA_VISIBLE_DEVICES=0
+ export HF_HOME=../.cache
+ HF_HOME=../.cache
+ export VLLM_ASSETS_CACHE=../.cache
+ VLLM_ASSETS_CACHE=../.cache
+ export VLLM_CACHE_ROOT=../.cache
+ VLLM_CACHE_ROOT=../.cache
+ echo RUNNING
+ pythonbinary=/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/aiengine/main_generation.pyc
+ python /home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/aiengine/main_generation.pyc --conf conf.yaml

Generating train split: 0 examples [00:00, ? examples/s]
Generating train split: 2500 examples [00:00, 206844.20 examples/s]
Traceback (most recent call last):
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connection.py", line 198, in _new_conn
    sock = connection.create_connection(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/util/connection.py", line 60, in create_connection
    for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/socket.py", line 978, in getaddrinfo
    for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
socket.gaierror: [Errno -3] Temporary failure in name resolution

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connectionpool.py", line 488, in _make_request
    raise new_e
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connectionpool.py", line 464, in _make_request
    self._validate_conn(conn)
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connectionpool.py", line 1093, in _validate_conn
    conn.connect()
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connection.py", line 704, in connect
    self.sock = sock = self._new_conn()
                       ^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connection.py", line 205, in _new_conn
    raise NameResolutionError(self.host, self, e) from e
urllib3.exceptions.NameResolutionError: <urllib3.connection.HTTPSConnection object at 0x71e736964320>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/requests/adapters.py", line 667, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/connectionpool.py", line 841, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/urllib3/util/retry.py", line 519, in increment
    raise MaxRetryError(_pool, url, reason) from reason  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/unsloth/Qwen2.5-14B-Instruct-bnb-4bit/tree/main?recursive=True&expand=False (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x71e736964320>: Failed to resolve 'huggingface.co' ([Errno -3] Temporary failure in name resolution)"))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "wheel_contents/aiengine/main_generation.py", line 87, in <module>
  File "wheel_contents/aiengine/model.py", line 36, in __init__
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/utils.py", line 1096, in inner
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/entrypoints/llm.py", line 243, in __init__
    self.llm_engine = LLMEngine.from_engine_args(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/engine/llm_engine.py", line 514, in from_engine_args
    vllm_config = engine_args.create_engine_config(usage_context)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 1137, in create_engine_config
    model_config = self.create_model_config()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/engine/arg_utils.py", line 1026, in create_model_config
    return ModelConfig(
           ^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/config.py", line 424, in __init__
    supported_tasks, task = self._resolve_task(task)
                            ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/config.py", line 595, in _resolve_task
    preferred_task = self._get_preferred_task(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/config.py", line 537, in _get_preferred_task
    if get_pooling_config(model_id, self.revision):
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 441, in get_pooling_config
    if file_or_path_exists(model=model,
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 180, in file_or_path_exists
    return file_exists(str(model),
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 155, in file_exists
    file_list = list_repo_files(repo_id,
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 144, in list_repo_files
    return with_retry(lookup_files, "Error retrieving file list")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 98, in with_retry
    return func()
           ^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/vllm/transformers_utils/config.py", line 134, in lookup_files
    return hf_list_repo_files(repo_id,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 2996, in list_repo_files
    for f in self.list_repo_tree(
             ^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/huggingface_hub/hf_api.py", line 3131, in list_repo_tree
    for path_info in paginate(path=tree_url, headers=headers, params={"recursive": recursive, "expand": expand}):
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/huggingface_hub/utils/_pagination.py", line 36, in paginate
    r = session.get(path, params=params, headers=headers)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/requests/sessions.py", line 602, in get
    return self.request("GET", url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 96, in send
    return super().send(request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/whitlerd/Desktop/BOINC/slots/1/lib/python3.12/site-packages/requests/adapters.py", line 700, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError('HTTPSConnectionPool(host=\'huggingface.co\', port=443): Max retries exceeded with url: /api/models/unsloth/Qwen2.5-14B-Instruct-bnb-4bit/tree/main?recursive=True&expand=False (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x71e736964320>: Failed to resolve \'huggingface.co\' ([Errno -3] Temporary failure in name resolution)"))'), '(Request ID: 174aba12-ee13-4136-8ed7-35f147ca265a)')
2025-09-25 10:20:12 (516593): bin/bash exited; CPU time 48.088419
2025-09-25 10:20:12 (516593): app exit status: 0x1
2025-09-25 10:20:12 (516593): called boinc_finish(195)

</stderr_txt>
]]>
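
The traceback above ends in a name-resolution failure for huggingface.co, raised while vLLM was listing the files of the model repository. The following is a minimal sketch, not the project's actual code, of a pre-flight check a launcher script could run before constructing the model, so that a DNS outage is reported in one line instead of a long traceback. The names `can_resolve`, `failing_resolver`, `working_resolver` and the `resolver` parameter are hypothetical and introduced only for illustration; the only real library call used is `socket.getaddrinfo`.

```python
import socket
from typing import Callable


def can_resolve(host: str, port: int = 443,
                resolver: Callable[..., list] = socket.getaddrinfo) -> bool:
    """Return True if `host` resolves to at least one address.

    `resolver` is injectable (an assumption made for this sketch) so the
    check can be exercised without touching the network.
    """
    try:
        return len(resolver(host, port, type=socket.SOCK_STREAM)) > 0
    except socket.gaierror:
        # Same exception class as in the log:
        # socket.gaierror: [Errno -3] Temporary failure in name resolution
        return False


def failing_resolver(host, port, **kwargs):
    """Stub that simulates the DNS failure seen in the stderr output."""
    raise socket.gaierror(-3, "Temporary failure in name resolution")


def working_resolver(host, port, **kwargs):
    """Stub that returns one fake address (192.0.2.1 is a documentation IP)."""
    return [(socket.AF_INET, socket.SOCK_STREAM, 6, "", ("192.0.2.1", port))]


if __name__ == "__main__":
    # Uses the stub, so this runs offline; drop `resolver=` for a real lookup.
    if not can_resolve("huggingface.co", resolver=failing_resolver):
        print("huggingface.co does not resolve; check DNS before retrying")
```

Such a check only tells the volunteer that the host could not reach its DNS resolver at that moment; it does not change the outcome of a task that needs to contact huggingface.co.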


©2025 Universitat Pompeu Fabra