Message boards : Number crunching : After finishing and uploading Error "Fehler beim Berechnen" and no credits

Dr.Mabuse
Joined: 22 Nov 17
Posts: 1
Credit: 301,570
RAC: 0
Message 55158 - Posted: 7 Aug 2020 | 15:33:09 UTC

Hello experts, my last 3 GPUGrid tasks ended with an error and earned no credit.
See here:
Task      Workunit  Computer  Sent                       Time reported              Status                 Run time (sec)  CPU time (sec)  Credit  Application
27841184  22711992  453826    7 Aug 2020 | 10:08:23 UTC  7 Aug 2020 | 14:20:43 UTC  Error while computing  12,387.83       9,622.06        ---     New version of ACEMD v2.10 (cuda101)
27817675  22690369  453826    6 Aug 2020 | 11:45:26 UTC  6 Aug 2020 | 14:03:13 UTC  Error while computing  7,178.86        5,452.50        ---     New version of ACEMD v2.10 (cuda101)
27817520  22690227  453826    6 Aug 2020 | 11:46:00 UTC  6 Aug 2020 | 17:44:13 UTC  Error while computing  13,472.83       10,827.86       ---     New version of ACEMD v2.10 (cuda101)

What can I do to get valid results?
Thanks,
Dr.Mabuse

Keith Myers
Joined: 13 Dec 17
Posts: 689
Credit: 905,002,798
RAC: 288,886
Message 55159 - Posted: 7 Aug 2020 | 16:41:22 UTC

Fix your computer. The tasks failed because of memory leaks.
Don't overclock.
Clean your computer of dust bunnies.
Don't scan the BOINC folder with an AV program.

ServicEnginIC
Joined: 24 Sep 10
Posts: 345
Credit: 1,927,950,084
RAC: 322,361
Message 55160 - Posted: 7 Aug 2020 | 17:23:48 UTC - in response to Message 55158.

I fully agree with Keith Myers.
Some of his sharp advice should work.
All of your three failed tasks have been successfully processed and reported by other users.
We can rule them out from being defective.

Pop Piasa
Joined: 8 Aug 19
Posts: 200
Credit: 337,048,583
RAC: 68,501
Message 55161 - Posted: 9 Aug 2020 | 0:35:05 UTC

Dr Mabuse, Wikipedia states that software ageing is a principal cause of memory leaks: https://en.wikipedia.org/wiki/Memory_leak

Were I you, I would upgrade to Windows 10 (it's still free) using a fresh installation. If you have not reinstalled your OS in a long time, it will be prone to this problem.
This assumes you are not overclocking; that would be a more obvious cause.

I hope I've helped.

rod4x4
Joined: 4 Aug 14
Posts: 266
Credit: 2,200,441,910
RAC: 199,757
Message 55162 - Posted: 9 Aug 2020 | 1:43:38 UTC - in response to Message 55158.

The last few lines of the stderr output on all of the failed tasks point to your issue.

<error_code>-240 (stat() failed)</error_code>


This error is most likely caused by your antivirus software quarantining the task output file.
Your tasks seem to complete successfully but fail to upload because the output file is missing.
Make sure you exclude the C:\ProgramData\BOINC folder (and its subfolders) from virus scanning.
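
To check other results for the same failure, the error tag can be pulled out of a saved copy of the stderr text. What follows is only a rough Python sketch: the name find_error_codes is made up for illustration, and the pattern assumes the tag looks exactly like the one quoted above.

```python
import re

# Assumed tag shape, taken from the quoted stderr line:
#   <error_code>-240 (stat() failed)</error_code>
ERROR_CODE_RE = re.compile(
    r"<error_code>\s*(-?\d+)\s*\((.*?)\)\s*</error_code>"
)


def find_error_codes(stderr_text):
    """Return a list of (code, description) pairs found in the text.

    Hypothetical helper for illustration; not part of BOINC.
    """
    return [
        (int(code), description)
        for code, description in ERROR_CODE_RE.findall(stderr_text)
    ]


if __name__ == "__main__":
    sample = "<error_code>-240 (stat() failed)</error_code>"
    for code, description in find_error_codes(sample):
        print(code, description)
```

If every failed task reports the same code and description, that points to one common cause rather than three unrelated faults.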

Toni has stated in the past that some memory leaks are present in the GPUGrid tasks and don't adversely affect task completion.

Pop Piasa
Joined: 8 Aug 19
Posts: 200
Credit: 337,048,583
RAC: 68,501
Message 55163 - Posted: 9 Aug 2020 | 2:26:04 UTC - in response to Message 55162.

Rod, that makes me curious as to why one of his tasks finished and uploaded successfully. Why did his AV quarantine some files and not all?
The computation times do show that the tasks were probably complete and failed to upload per the output file messages. Did one slip by the AV client?

rod4x4
Joined: 4 Aug 14
Posts: 266
Credit: 2,200,441,910
RAC: 199,757
Message 55164 - Posted: 9 Aug 2020 | 2:31:27 UTC - in response to Message 55163.
Last modified: 9 Aug 2020 | 2:33:02 UTC

Yes, I thought the same. It may be that the AV did not consider that task as 'suspicious' as the ones that failed.

Pop Piasa
Joined: 8 Aug 19
Posts: 200
Credit: 337,048,583
RAC: 68,501
Message 55165 - Posted: 9 Aug 2020 | 16:47:23 UTC

I believe Rod4x4 has it nailed, spot-on.
It would be interesting to know which AV client Dr Mabuse is using.

I uninstalled Norton 360 from my crunch machines because it wouldn't let me run the install file for Quarantine@home.
The built-in protection in Windows 10 has been quite adequate for the purpose these machines serve.

rod4x4
Joined: 4 Aug 14
Posts: 266
Credit: 2,200,441,910
RAC: 199,757
Message 55171 - Posted: 13 Aug 2020 | 4:28:59 UTC

Well this same error just occurred on one of my Win7 (soon to be retired) computers.

From Stderr output:

<message>
upload failure: <file_xfer_error>
  <file_name>e8s45_e3s43p0f60-PABLO_UCB_NMR_KIX_CMYB_8-0-5-RND7723_0_0</file_name>
  <error_code>-240 (stat() failed)</error_code>
</file_xfer_error>
</message>


I noted this information from the BOINC Event Log:

13/08/2020 11:10:38 AM | GPUGRID | [error] Can't rename output file slots/3/progress.log to projects/www.gpugrid.net/e8s45_e3s43p0f60-PABLO_UCB_NMR_KIX_CMYB_8-0-5-RND7723_0_0: Error 32

and

13/08/2020 11:10:40 AM | GPUGRID | Output file e8s45_e3s43p0f60-PABLO_UCB_NMR_KIX_CMYB_8-0-5-RND7723_0_0 for task e8s45_e3s43p0f60-PABLO_UCB_NMR_KIX_CMYB_8-0-5-RND7723_0 absent
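
For anyone who wants to search a saved Event Log for the same pair of messages, here is a small Python sketch. The patterns are guessed from the two lines quoted above and nothing else, and summarize_event_log is an illustrative name, not a BOINC tool.

```python
import re

# Patterns inferred from the two quoted Event Log lines (an assumption).
RENAME_RE = re.compile(
    r"Can't rename output file (?P<src>\S+) to (?P<dst>\S+): Error (?P<code>\d+)"
)
ABSENT_RE = re.compile(
    r"Output file (?P<file>\S+) for task (?P<task>\S+) absent"
)


def summarize_event_log(lines):
    """Collect failed renames and absent output files from log lines.

    Returns (renames, absent) where renames holds (src, dst, code)
    tuples and absent holds (file, task) tuples.
    """
    renames, absent = [], []
    for line in lines:
        match = RENAME_RE.search(line)
        if match:
            renames.append((match["src"], match["dst"], int(match["code"])))
            continue
        match = ABSENT_RE.search(line)
        if match:
            absent.append((match["file"], match["task"]))
    return renames, absent
```

An absent output file whose name matches the destination of a failed rename is a strong hint that the two messages describe one failure.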

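The errors in the stderr output and the BOINC Event Log are related. From what I can gather, the client first renames the output file from the BOINC slot folder into the GPUGrid project folder, and that rename failed with Windows Error 32, a sharing violation, which means another process had the file open. stat() itself does not move anything; it only reads a file's metadata, so it fails when the renamed file is not where it is expected, and the Event Log reports the output file as absent two seconds later. The rename step is susceptible to AV intervention and to hardware problems.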
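
The failure sequence can be reproduced in miniature with a short Python sketch. It is not BOINC's code: move_then_stat is a made-up helper, and it only illustrates why a failed rename shows up afterwards as a failed stat().

```python
import os


def move_then_stat(src: str, dst: str) -> str:
    """Hypothetical helper, not BOINC code: move src to dst, then stat dst."""
    rename_error = None
    try:
        os.replace(src, dst)  # step 1: move the output file into place
    except OSError as exc:
        # On Windows a sharing violation surfaces here as an OSError
        # whose winerror attribute is 32.
        rename_error = exc
    try:
        size = os.stat(dst).st_size  # step 2: only reads file metadata
    except FileNotFoundError:
        if rename_error is not None:
            name = type(rename_error).__name__
            return f"rename failed ({name}), so stat() failed: output file absent"
        return "stat() failed: output file absent"
    return f"ok: {size} bytes"
```

Calling it with a source file that cannot be moved returns the "rename failed" message, which mirrors the order of the two Event Log entries.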

As the AV is fine on this host, the likely cause in this case is the 10-year-old hardware with its unreliable 10-year-old (mechanical) hard drive. The hard drive problem is highlighted by the PABLO_UCB_NMR_KIX_CMYB tasks, which put more load on the drive because of the large output files they generate.

