acemd simulation vs. python simulation GPU

Message boards : Number crunching : acemd simulation vs. python simulation GPU

Jari Kosonen

Joined: 5 May 22
Posts: 24
Credit: 12,458,305
RAC: 0
Message 59093 - Posted: 11 Aug 2022, 3:19:07 UTC

With ACEMD, the GPU driver dropped out,
and then the remaining acemd process could not be killed with the 'kill' command.
In that case I guess the /home partition (where the BOINC files are)
does not always get unmounted, which can crash the /home partition and cause a lot of fsck runs.
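
One possible explanation, not confirmed anywhere in this thread: on Linux a process in uninterruptible sleep (state `D`, typically waiting on a driver or on I/O) does not act on signals until that wait ends, so `kill` appears to do nothing. The sketch below only reads the state letter from `/proc/<pid>/status`. The helper names are hypothetical, and the PID has to be looked up separately (for example with `ps`).

```python
from pathlib import Path


def parse_state(status_text: str) -> str:
    """Extract the one-letter state code from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # The line looks like "State:\tD (disk sleep)"
            return line.split(":", 1)[1].split()[0]
    raise ValueError("no State: line found")


def process_state(pid: int) -> str:
    """Read the state code of a live process (Linux only, needs /proc)."""
    return parse_state(Path(f"/proc/{pid}/status").read_text())


def in_uninterruptible_sleep(state: str) -> bool:
    """'D' means uninterruptible sleep: signals wait until the process wakes."""
    return state == "D"
```

If `in_uninterruptible_sleep(process_state(pid))` stays true for the stuck acemd process, repeating `kill` is unlikely to help until whatever it is waiting on returns.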

ID: 59093 · Rating: 0
Keith Myers
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 891
Message 59096 - Posted: 11 Aug 2022, 17:29:37 UTC - in response to Message 59093.  

Not sure what you are describing. Sounds like you lost your video drivers and the task crashed?

I've never had an issue with a driver or card dropping off the bus causing disk corruption. Normally it just throws away all your onboard tasks with errors and puts you into the penalty box awaiting new work.

I have BOINC installed in /home and have never corrupted /home, even though I have lost work many times when an unattended driver update pulled the driver out from underneath running work.

I've never found it necessary to kill a BOINC process. Some of them take a while to end though, like the python processes. You just need the patience to wait out the couple of minutes and they end themselves normally.
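
For anyone who wants to script the "wait it out" approach, here is a minimal sketch, assuming a Linux host and a PID you have already looked up. It sends SIGTERM once and then polls for a few minutes instead of escalating. The function names and the 180-second default are illustrative assumptions, not anything BOINC provides.

```python
import os
import signal
import time


def pid_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence only, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def terminate_and_wait(pid: int, timeout: float = 180.0, poll: float = 1.0,
                       alive=pid_alive) -> bool:
    """Send SIGTERM, then wait up to `timeout` seconds for the process to exit.

    Returns True once the process is gone, False if it is still there.
    """
    if alive(pid):
        try:
            os.kill(pid, signal.SIGTERM)
        except ProcessLookupError:
            return True  # it exited between the check and the signal
    deadline = time.monotonic() + timeout
    while alive(pid):
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
    return True
```

Note that `pid_alive` keeps reporting a child of the calling script as alive until that child is reaped, so for processes the script started itself, pass a different `alive` check (for example one based on `Popen.poll()`).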
ID: 59096 · Rating: 0
Jari Kosonen

Joined: 5 May 22
Posts: 24
Credit: 12,458,305
RAC: 0
Message 59126 - Posted: 18 Aug 2022, 11:04:04 UTC - in response to Message 59096.  

I think that if the process does not get killed during shutdown,
the shutdown does not complete normally, and if I then press the "reset" button,
the HDD does not get unmounted, and this causes the segment failures.
fsck possibly cannot fix 100% of them correctly during the next boot.

ID: 59126 · Rating: 0


©2025 Universitat Pompeu Fabra