Message boards : Number crunching : ATTML tasks failing on Windows hosts

jjch
Joined: 10 Nov 13
Posts: 101
Credit: 15,483,200,388
RAC: 2,100,028
Level
Trp
Message 61851 - Posted: 28 Sep 2024 | 19:34:55 UTC

The ATTML tasks on Windows hosts are failing with "Access is denied" errors when the wrapper runs activate.bat and conda-unpack.exe, as in this example:

10:03:11 (3608): wrapper: running C:/Windows/system32/cmd.exe (/c call Scripts\activate.bat && Scripts\conda-unpack.exe && run.bat)
Access is denied.
Access is denied.
Access is denied.
The system cannot find the path specified.


https://www.gpugrid.net/result.php?resultid=36024969

Additional examples

https://www.gpugrid.net/result.php?resultid=36019001
https://www.gpugrid.net/result.php?resultid=36005142

The only ones that complete successfully are the Benchmark tasks:
https://www.gpugrid.net/result.php?resultid=35985408

Is anyone else noticing the same problem?

ACEMD tasks are running OK when they are available but the ATTML tasks are quite different.

Are the ATTML tasks broken for Windows hosts?

Is anyone getting them to run successfully?

If so, do they know what may be causing this problem and the trick to fix it?

KeithBriggs
Joined: 29 Aug 24
Posts: 5
Credit: 473,475,000
RAC: 9,798,202
Level
Gln
Message 61852 - Posted: 29 Sep 2024 | 1:22:44 UTC - in response to Message 61851.

I've only had one with an exit code of 195. All my others have run to completion. I'm not getting too many as the well seems to be drying up today.

https://www.gpugrid.net/result.php?resultid=36024007

It was a NEW_CDK2 task.

I'm not running nearly as fast a machine or GPU as your heavyweight.

Erich56
Joined: 1 Jan 15
Posts: 1126
Credit: 9,249,007,676
RAC: 26,819,724
Level
Tyr
Message 61853 - Posted: 29 Sep 2024 | 6:39:18 UTC

Except for the tasks whose names start with "CDK2", all are successful here on Windows.
However, new tasks come in only sporadically, since the queue has not been refilled since yesterday.

Erich56
Joined: 1 Jan 15
Posts: 1126
Credit: 9,249,007,676
RAC: 26,819,724
Level
Tyr
Message 61854 - Posted: 29 Sep 2024 | 10:36:29 UTC - in response to Message 61853.

Except for the tasks whose names start with "CDK2", all are successful here on Windows.
However, new tasks come in only sporadically, since the queue has not been refilled since yesterday.

Also, all "NEW_CDK2" tasks are failing after about 40 minutes.

Billy Ewell 1931
Joined: 22 Oct 10
Posts: 40
Credit: 1,441,523,674
RAC: 3,835,529
Level
Met
Message 61855 - Posted: 29 Sep 2024 | 18:53:42 UTC - in response to Message 61851.
Last modified: 29 Sep 2024 | 18:55:20 UTC

Are the ATTML tasks broken for Windows hosts?(Quote)

Is anyone getting them to run successfully?(Quote)

I have processed some 40 ATTML tasks over the last 10+ days, and 36 completed without difficulty. Two tasks were aborted by the server and two errored out: one without any run time and the other after about 0.5 hour.
My equipment is an HP i7 computer and an RTX 3080 GPU. Frankly, I am pleased with this success, given my previous history using two computers with virtually the same equipment.

jjch
Joined: 10 Nov 13
Posts: 101
Credit: 15,483,200,388
RAC: 2,100,028
Level
Trp
Message 61856 - Posted: 30 Sep 2024 | 16:06:45 UTC

It seems that I will have to do some digging to figure out why my Windows hosts are not working correctly.

It could have something to do with my setup being different from the others. I thought about BOINC being installed on another drive or partition, but some of the working hosts I found also have it in other locations.

Is there a particular version of Windows or another software package needed? The NVIDIA drivers are reasonably recent, and I wouldn't think they are related to this problem.

I should also look at the AV and anything else that might be blocking something the tasks need to run. Not all the hosts have the same setup, but it's probably worth checking into.

Thanks,
JJCH

Erich56
Joined: 1 Jan 15
Posts: 1126
Credit: 9,249,007,676
RAC: 26,819,724
Level
Tyr
Message 61859 - Posted: 1 Oct 2024 | 15:54:10 UTC - in response to Message 61853.

Except for the tasks whose names start with "CDK2", all are successful here on Windows.
However, new tasks come in only sporadically, since the queue has not been refilled since yesterday.

Quite a few tasks were downloaded here within the past 6 hours, and all of them errored out after various time spans, not only the CDK2 ones.
Checking the workunits shows that all these tasks had failed on several other hosts as well.
So it seems that the tasks being sent out right now are all faulty :-(

Greg _BE
Joined: 30 Jun 14
Posts: 135
Credit: 120,119,439
RAC: 231,197
Level
Cys
Message 61864 - Posted: 4 Oct 2024 | 15:35:05 UTC
Last modified: 4 Oct 2024 | 16:01:55 UTC

never mind..delete this post

Greg _BE
Joined: 30 Jun 14
Posts: 135
Credit: 120,119,439
RAC: 231,197
Level
Cys
Message 61865 - Posted: 4 Oct 2024 | 15:39:07 UTC
Last modified: 4 Oct 2024 | 16:03:37 UTC

Error with an extremely short run time: https://www.gpugrid.net/result.php?resultid=36117543

Error after stopping all other GPU tasks and letting it finish:
https://www.gpugrid.net/result.php?resultid=36049493

The last task was stalled, but I added report immediately and restarted BOINC; it uploaded and is OK.

One task is waiting to start.

That's what I have done so far.

jjch
Joined: 10 Nov 13
Posts: 101
Credit: 15,483,200,388
RAC: 2,100,028
Level
Trp
Message 61877 - Posted: 9 Oct 2024 | 0:11:50 UTC

Update regarding the "Access is denied" issue.

When I built several of these hosts, I had placed a space in the folder name for the ProgramData directory:

Program Data


Apparently the path is hard-coded somewhere in the scripts, so they couldn't find what they needed to start the application.

Best practice for future installs: don't change the default name of the ProgramData folder.

At least the tasks are starting now, but so far they are failing with "RuntimeError: Simulation failed 5 times!".
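
For anyone who wants to check their own hosts for the folder-name issue described above, here is a minimal Python sketch. It only reports which folder names in a data directory path contain a space. The two example paths and the function name components_with_spaces are illustrative assumptions, not taken from the project's scripts, and the drive letters are made up; substitute the data directory your own BOINC install uses.

```python
from pathlib import PureWindowsPath


def components_with_spaces(path: str) -> list[str]:
    """Return the components of a Windows path that contain a space."""
    # PureWindowsPath parses Windows-style paths on any operating system.
    parts = PureWindowsPath(path).parts
    return [part for part in parts if " " in part]


# Hypothetical example paths (assumptions, not from the task logs).
default_dir = r"C:\ProgramData\BOINC"
renamed_dir = r"D:\Program Data\BOINC"

for data_dir in (default_dir, renamed_dir):
    risky = components_with_spaces(data_dir)
    if risky:
        print(f"{data_dir}: folder names with spaces: {risky}")
    else:
        print(f"{data_dir}: no spaces found")
```

This does not prove that a space is what breaks the scripts on a given host; it only flags the condition that caused the failures reported in this thread.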
