|
1)
Message boards :
Number crunching :
ACEMD3 High error rates
(Message 62189)
Posted 2 Feb 2025 by TofPete Post: I don't think swapping memory sticks would change anything. If there were a hardware memory problem, other applications would show problems as well. |
|
2)
Message boards :
Number crunching :
ACEMD3 High error rates
(Message 62175)
Posted 28 Jan 2025 by TofPete Post: These crashes are not from my machine. My machine sent another one this morning: https://www.gpugrid.net/result.php?resultid=38275319 As far as I can see, it crashed 5 seconds after the task started, and the logs give no reason for the crash (no memory leak entry, unlike earlier). The task was not interrupted, CPU and GPU temperatures are normal, the memory is OK, and the pagefile size was increased as suggested. The only thing it has in common with the other crashes you mentioned is that my computer also runs Windows 10. Any other ideas? Two crashes from today |
|
3)
Message boards :
Number crunching :
ACEMD3 High error rates
(Message 62173)
Posted 25 Jan 2025 by TofPete Post: Thanks for the replies. Based on this report, I don't think the problem is with my computer, because many hosts have a similar problem. Still, I am checking my computer to make sure everything is fine with it:
* GPU temperature is fine; the maximum is 76 Celsius
* CPU temperature is also fine; the maximum is 55 Celsius
* pagefile size has just been increased; we will see if it helps...
* memory seems fine; I don't see any problems with other apps that could be caused by a memory issue, but I will run a memtest
* I don't usually interrupt the calculation (BOINC is set to always run) and my computer is on all day, but I will keep an eye on this too
|
|
4)
Message boards :
Number crunching :
ACEMD3 High error rates
(Message 62166)
Posted 23 Jan 2025 by TofPete Post: I understand this, but my problem is that I don't know which settings need to be changed to fix these failures. Sometimes I only lose several minutes, but some tasks ran for about 9800 seconds before failing. I think that 32 GB RAM, a 3 GHz CPU clock and an Nvidia GTX 1050 Ti with 4 GB VRAM should be enough for this kind of task. I run my CPU and video card at their normal settings; I don't use overclocking, etc. The strange thing is that the problem occurs only with ACEMD3 tasks. And the logs say that there were memory leaks. Why? What settings should I change to prevent the leaking? My host has been crunching these units non-stop without issue. |
|
5)
Message boards :
Number crunching :
ACEMD3 High error rates
(Message 62165)
Posted 23 Jan 2025 by TofPete Post: I checked the setting mentioned and it's already enabled. How can I "unhide" my host to see more details about this problem? |
|
6)
Message boards :
Number crunching :
ACEMD3 High error rates
(Message 62160)
Posted 22 Jan 2025 by TofPete Post: I have the same problem: the error rate is about 50% (16 errors / 33 total tasks), which is annoying! I use this computer for other computing projects as well, but the errors occur only with GPUGrid's ACEMD 3 tasks. I can see "unknown error" and "memory leak" in the logs: https://www.gpugrid.net/result.php?resultid=37934054 All operating system and graphics card driver updates are installed on my machine, so I don't know what else I can do to solve these memory leak errors. How can I "unhide" my host to see more details about this problem?
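The "about 50%" figure in the post above can be checked with a one-line calculation. This is only a minimal sketch; the helper name `error_rate` is made up for illustration and is not part of BOINC or GPUGrid.

```python
def error_rate(errors: int, total: int) -> float:
    """Return the share of failed tasks as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * errors / total


# Figures quoted in the post: 16 errors out of 33 tasks.
print(f"{error_rate(16, 33):.1f}%")  # prints "48.5%"
```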
|
|
7)
Message boards :
Number crunching :
No new WU's due to limitation on tasks
(Message 62044)
Posted 16 Dec 2024 by TofPete Post: Same deadlock for me! Waiting for someone to clean out the server's disk... With the server disk space problem, is there any way to get around the limitation on tasks that can be performed? I get the "This computer has reached a limit on tasks in progress" error since my other WU's are waiting to upload. I did a search but didn't find anything. |
|
8)
Message boards :
Number crunching :
ACEMD3 High error rates
(Message 62012)
Posted 10 Dec 2024 by TofPete Post: I have the same problem. Most of the acemd3 tasks failed due to memory leak or unknown error:
* memory leak: https://www.gpugrid.net/result.php?resultid=37054183
* unknown error: https://www.gpugrid.net/result.php?resultid=37053274
|
|
9)
Message boards :
Number crunching :
ACEMD 3 error: Unsupported PRMTOP version
(Message 61824)
Posted 23 Sep 2024 by TofPete Post: Hi, I received errors for some of the ACEMD 3 tasks recently with the following error message:
Stderr output
<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)</message>
<stderr_txt>
16:01:42 (35604): wrapper (7.9.26016): starting
16:01:42 (35604): wrapper: running bin/acemd.exe (--boinc --device 0)
</stderr_txt>
]]>
Any idea? |
|
10)
Message boards :
Number crunching :
ATMML
(Message 61816)
Posted 18 Sep 2024 by TofPete Post: Hi, Why have I been receiving error messages like this in ATMML tasks recently?
Stderr output
<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
(unknown error) (0) - exit code 195 (0xc3)</message>
<stderr_txt>
09:59:48 (19024): wrapper (7.9.26016): starting
09:59:48 (19024): wrapper: running Library/usr/bin/tar.exe (xjvf input.tar.bz2)
aceforce_dft_v0.4.ckpt |
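When comparing several failed results, it can help to pull the exit code out of the `<message>` line of the stderr output. The sketch below assumes the message always has the shape seen in the two posts above ("text - exit code N (0xHH)"); the function name `parse_exit_message` is hypothetical and is not a BOINC API.

```python
import re

# Shape assumed from the two <message> lines quoted in the posts above.
MESSAGE_RE = re.compile(
    r"^(?P<text>.*?)\s*-\s*exit code (?P<code>\d+) \((?P<hex>0x[0-9a-fA-F]+)\)\s*$"
)


def parse_exit_message(message: str):
    """Return (description, exit_code), or None if the line has another shape."""
    match = MESSAGE_RE.match(message.strip())
    if match is None:
        return None
    code = int(match.group("code"))
    # The decimal and hexadecimal forms should agree; flag a mismatch loudly.
    if code != int(match.group("hex"), 16):
        raise ValueError(f"inconsistent exit code in: {message!r}")
    return match.group("text"), code


for line in (
    "Incorrect function. (0x1) - exit code 1 (0x1)",
    "(unknown error) (0) - exit code 195 (0xc3)",
):
    print(parse_exit_message(line))
```

Grouping results by the returned exit code makes it easier to see whether different tasks fail in the same way.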
|
11)
Message boards :
News :
Discord channel for GPUGRID
(Message 61802)
Posted 13 Sep 2024 by TofPete Post: Unable to accept invite. :( Hi, |
|
12)
Message boards :
News :
Experimental Python tasks (beta) - task description
(Message 61749)
Posted 28 Aug 2024 by TofPete Post: Thank you Those describe themselves as ATMML tasks - the clue is in the name. |
|
13)
Message boards :
News :
Experimental Python tasks (beta) - task description
(Message 61746)
Posted 28 Aug 2024 by TofPete Post: I think it's a Python task because the error message is about a Python problem:
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'
I got these tasks today and in recent days:
Task received (UTC) | Computing status | Runtime (sec) | Application name
28 Aug 2024 8:31:51 UTC | Error while computing | 672.11 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 8:03:54 UTC | Error while computing | 703.63 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 7:34:52 UTC | Error while computing | 708.96 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 7:20:30 UTC | Error while computing | 714.93 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 8:17:39 UTC | Error while computing | 709.18 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 7:49:20 UTC | Error while computing | 724.49 | ATMML: Free energy with neural networks v1.01 (cuda1121)
27 Aug 2024 9:35:49 UTC | Error while computing | 776.90 | ATMML: Free energy with neural networks v1.01 (cuda1121)
27 Aug 2024 1:24:00 UTC | Error while computing | 60.60 | ATMML: Free energy with neural networks v1.01 (cuda1121)
26 Aug 2024 9:41:56 UTC | Error while computing | 20.18 | ATMML: Free energy with neural networks v1.01 (cuda1121)
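The task list above can be summarised with a short script to see how the failures cluster by runtime. This is a rough sketch: the rows are copied from the post, while the 600-second split between "early" and "late" failures is an arbitrary threshold chosen here for illustration.

```python
from statistics import mean

# Rows copied from the post: received time | status | runtime (s) | application
ROWS = """\
28 Aug 2024 8:31:51 UTC | Error while computing | 672.11 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 8:03:54 UTC | Error while computing | 703.63 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 7:34:52 UTC | Error while computing | 708.96 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 7:20:30 UTC | Error while computing | 714.93 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 8:17:39 UTC | Error while computing | 709.18 | ATMML: Free energy with neural networks v1.01 (cuda1121)
28 Aug 2024 7:49:20 UTC | Error while computing | 724.49 | ATMML: Free energy with neural networks v1.01 (cuda1121)
27 Aug 2024 9:35:49 UTC | Error while computing | 776.90 | ATMML: Free energy with neural networks v1.01 (cuda1121)
27 Aug 2024 1:24:00 UTC | Error while computing | 60.60 | ATMML: Free energy with neural networks v1.01 (cuda1121)
26 Aug 2024 9:41:56 UTC | Error while computing | 20.18 | ATMML: Free energy with neural networks v1.01 (cuda1121)
"""

LATE_THRESHOLD_S = 600.0  # assumption: illustrative cut-off, not a project value


def parse_rows(text: str) -> list[dict]:
    """Split each pipe-separated row into a small record."""
    tasks = []
    for line in text.strip().splitlines():
        received, status, runtime, app = (field.strip() for field in line.split("|"))
        tasks.append(
            {"received": received, "status": status, "runtime": float(runtime), "app": app}
        )
    return tasks


tasks = parse_rows(ROWS)
late = [t for t in tasks if t["runtime"] >= LATE_THRESHOLD_S]
early = [t for t in tasks if t["runtime"] < LATE_THRESHOLD_S]
print(f"{len(tasks)} tasks: {len(late)} failed late, {len(early)} failed early")
print(f"mean runtime of late failures: {mean(t['runtime'] for t in late):.1f} s")
```

Most of the failures land in a narrow band of runtimes, which suggests they fail at the same stage of the task rather than at random points.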
|
14)
Message boards :
News :
Experimental Python tasks (beta) - task description
(Message 61744)
Posted 28 Aug 2024 by TofPete Post: Hi, I'm receiving the following error message after about 700-800 sec running time:
09:33:09 (32292): Library/usr/bin/tar.exe exited; CPU time 0.000000
09:33:09 (32292): wrapper: running C:/Windows/system32/cmd.exe (/c call Scripts\activate.bat && Scripts\conda-unpack.exe && run.bat)
Could not find platform independent libraries <prefix>
Python path configuration:
PYTHONHOME = (not set)
PYTHONPATH = (not set)
program name = '\\?\D:\ProgramData\BOINC\slots\4\python.exe'
isolated = 0
environment = 1
user site = 1
safe_path = 0
import site = 1
is in build tree = 0
stdlib dir = 'D:\ProgramData\BOINC\slots\4\Lib'
sys._base_executable = '\\\\?\\D:\\ProgramData\\BOINC\\slots\\4\\python.exe'
sys.base_prefix = 'D:\\ProgramData\\BOINC\\slots\\4'
sys.base_exec_prefix = 'D:\\ProgramData\\BOINC\\slots\\4'
sys.platlibdir = 'DLLs'
sys.executable = '\\\\?\\D:\\ProgramData\\BOINC\\slots\\4\\python.exe'
sys.prefix = 'D:\\ProgramData\\BOINC\\slots\\4'
sys.exec_prefix = 'D:\\ProgramData\\BOINC\\slots\\4'
sys.path = [
'D:\\ProgramData\\BOINC\\slots\\4\\python311.zip',
'D:\\ProgramData\\BOINC\\slots\\4\\DLLs',
'D:\\ProgramData\\BOINC\\slots\\4\\Lib',
'\\\\?\\D:\\ProgramData\\BOINC\\slots\\4',
]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'
Current thread 0x000058b0 (most recent call first):
<no Python frame>
09:33:10 (32292): C:/Windows/system32/cmd.exe exited; CPU time 0.000000
09:33:10 (32292): app exit status: 0x1
09:33:10 (32292): called boinc_finish(195)
Any idea why this error has been happening recently? Thanks, Peter |
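The `ModuleNotFoundError: No module named 'encodings'` line means the interpreter could not find its standard library in any of the `sys.path` entries listed in the log. One way to narrow this down is to check whether the `encodings` package exists under the slot directory at all. The sketch below assumes the unpacked environment was incomplete, which the log does not confirm; `find_encodings` is a hypothetical helper, not part of BOINC or conda.

```python
import zipfile
from pathlib import Path


def find_encodings(prefix) -> list[Path]:
    """List the places under `prefix` that provide the `encodings` package.

    Checks the two stdlib locations named in the log's sys.path:
    <prefix>/Lib and <prefix>/python*.zip.
    """
    prefix = Path(prefix)
    hits = []

    # Unpacked standard library: <prefix>/Lib/encodings/__init__.py
    package_init = prefix / "Lib" / "encodings" / "__init__.py"
    if package_init.is_file():
        hits.append(package_init)

    # Zipped standard library, e.g. python311.zip (may hold .py or .pyc files)
    for archive in sorted(prefix.glob("python*.zip")):
        with zipfile.ZipFile(archive) as zf:
            names = set(zf.namelist())
        if "encodings/__init__.py" in names or "encodings/__init__.pyc" in names:
            hits.append(archive)

    return hits
```

Calling `find_encodings(r"D:\ProgramData\BOINC\slots\4")` while the task is still in that slot and getting an empty list would be consistent with the error above. It would point at the environment unpacking step rather than at Python itself.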
|
15)
Message boards :
Number crunching :
Computation error
(Message 61652)
Posted 7 Aug 2024 by TofPete Post: Hi, Recently I have been receiving a computation error for all of my tasks: https://www.gpugrid.net/result.php?resultid=35634634 I can see only the following error in the BOINC logs:
07/08/2024 17:07:19 | GPUGRID | [error] Can't rename output file slots/5/progress.log to projects/www.gpugrid.net/e16s9_e12s1p0f183-ADRIA_Explor_srcpp1_e2t_25ns_pp1coor_v2_10us_b0-0-1-RND8727_0_0: Error 32
Earlier there was no problem with GPUGrid tasks, but now I receive an error every time. What could cause this? Regards, TofPeter |
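If "Error 32" in that log line is the Windows system error code (an assumption; the log does not say so), it is ERROR_SHARING_VIOLATION: another process had the file open when the rename was attempted. The sketch below shows how a program can tolerate a briefly locked file by retrying. It illustrates the failure mode only; it is not how the BOINC client handles the rename, and `rename_with_retry` is a hypothetical helper.

```python
import os
import time


def rename_with_retry(src, dst, attempts: int = 5, delay: float = 0.5) -> int:
    """Rename src to dst, retrying while another process holds the file.

    On Windows a sharing violation surfaces in Python as PermissionError.
    Returns the number of attempts used; re-raises after the last failure.
    """
    for attempt in range(1, attempts + 1):
        try:
            os.replace(src, dst)
            return attempt
        except PermissionError:
            if attempt == attempts:
                raise
            time.sleep(delay)  # give the other process time to release the file
    raise ValueError("attempts must be at least 1")
```

Since the BOINC client performs this rename itself, the practical step for a user is to find out what else is holding files open in the BOINC data directory.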
©2026 Universitat Pompeu Fabra