Message boards : Multicore CPUs : QC tests for windows

Toni
Volunteer moderator
Project administrator
Project developer
Project scientist
Joined: 9 Dec 08
Posts: 736
Credit: 4,285,282
RAC: 0
Level
Ala
Message 49817 - Posted: 8 Jul 2018 | 15:11:46 UTC
Last modified: 8 Jul 2018 | 16:29:29 UTC

I'm preparing some infrastructure for the 64-bit Windows CPU app. There are some beta WUs out. They don't do actual calculations (for that we have to wait for the nontrivial psi4 port), but they do test the distribution system, dependencies, and so on.

Some are already coming back successful. The setup, if it works, will be completely native, requiring neither VMs nor the "Linux subsystem". It will expand both the type of tasks we can distribute and the available machines, of course.

In particular, I'd like to know
- whether it behaves well with "normal" unprivileged installations
- if there is an annoying "Anaconda" icon installed in the start menu

Thanks!

Zalster
Joined: 26 Feb 14
Posts: 64
Credit: 2,035,654,991
RAC: 5,072,581
Level
Phe
Message 49818 - Posted: 9 Jul 2018 | 0:40:04 UTC - in response to Message 49817.

Can't seem to DL anything for my windows machine.

Must have been a quick short run.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 1942
Credit: 12,290,996,669
RAC: 3,169,293
Level
Trp
Message 49819 - Posted: 9 Jul 2018 | 6:34:41 UTC - in response to Message 49818.

Zalster wrote:
Must have been a quick short run.

Toni wrote:
They don't do actual calculations (for that we have to wait for the nontrivial psi4 port), but they do test the distribution system, dependencies, and so on.

Toni
Message 49821 - Posted: 9 Jul 2018 | 8:43:48 UTC - in response to Message 49819.

There weren't many (approx 300), all of them in the "beta" queue. I need to fix simultaneous starts there too.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 98
Credit: 252,619
RAC: 448
Message 49844 - Posted: 11 Jul 2018 | 9:32:17 UTC - in response to Message 49821.

Waiting for a more consistent Windows queue...

Dzordzik
Joined: 5 Feb 17
Posts: 2
Credit: 269,971,794
RAC: 488,049
Level
Asn
Message 49935 - Posted: 17 Jul 2018 | 19:14:55 UTC

Hi, anything new? My 88-thread server is hungry for some native Windows work. If you want help with testing, let me know.

[AF] fansyl
Joined: 26 Sep 13
Posts: 8
Credit: 755,362,922
RAC: 868,506
Level
Glu
Message 49952 - Posted: 19 Jul 2018 | 8:37:12 UTC

I think the team is busy with Windows GPU application issues, see here for more details: https://www.gpugrid.net/forum_thread.php?id=4802

Richard Haselgrove
Joined: 11 Jul 09
Posts: 883
Credit: 1,729,358,120
RAC: 1,146,177
Level
His
Message 49953 - Posted: 19 Jul 2018 | 10:16:54 UTC - in response to Message 49952.

https://www.gpugrid.net/forum_thread.php?id=4802

Also, several members of staff are on vacation at the moment, so progress will be slow until they return.

[VENETO] boboviz
Message 50146 - Posted: 30 Jul 2018 | 7:59:42 UTC - in response to Message 49953.

https://www.gpugrid.net/forum_thread.php?id=4802

Also, several members of staff are on vacation at the moment, so progress will be slow until they return.


The Windows GPU app issue now seems to be resolved.
Waiting until September for the Windows CPU app...

Toby Broom
Joined: 11 Dec 08
Posts: 15
Credit: 137,079,143
RAC: 763,347
Level
Cys
Message 50436 - Posted: 8 Sep 2018 | 11:17:25 UTC

Are there any Windows CPU WUs? I set everything to enabled, but I get no tasks.

Toni
Message 50437 - Posted: 8 Sep 2018 | 11:36:12 UTC - in response to Message 50436.

Coming soon (on the QC_beta app initially).

Toby Broom
Message 50445 - Posted: 8 Sep 2018 | 14:37:34 UTC

I got a few beta WUs, but they failed. It looks like one was installing Python and miniconda while another WU was trying to run at the same time.

Hopefully the next round is good.

Toni
Message 50446 - Posted: 8 Sep 2018 | 14:45:46 UTC - in response to Message 50445.

Installing Python (miniconda, actually) is normal. It should stay contained in its own BOINC directories. Also, multiple installations should wait for each other. There is something else I'm investigating.

Did you get any "visible" changes in the system?

Thanks

Toby Broom
Message 50450 - Posted: 8 Sep 2018 | 22:19:24 UTC

I see the Anaconda option in the start menu, but none of the tasks actually worked.

If I click it, it does not work because the path is wrong. It has:

C:\%windir%\System32\cmd.exe "/K" C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda\Scripts\activate.bat C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda

But %windir% = C:\windows, so it actually resolves to C:\C:\windows\System32\cmd.exe.
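
The broken shortcut target can be reproduced with a minimal Python sketch. It assumes %windir% is C:\windows, as stated in the post; `expand_percent_vars` is a hypothetical helper that imitates %VAR% substitution with an explicit mapping, so it runs on any OS and is not how Windows itself is implemented.

```python
import re

def expand_percent_vars(text, variables):
    """Replace each %NAME% with its value from `variables` (case-insensitive).
    Unknown names are left untouched."""
    lookup = {name.lower(): value for name, value in variables.items()}

    def substitute(match):
        return lookup.get(match.group(1).lower(), match.group(0))

    return re.sub(r"%([^%]+)%", substitute, text)

env = {"windir": r"C:\windows"}  # assumed value, taken from the post

# Shortcut target as created by the installer: a literal "C:\" precedes the variable.
broken = expand_percent_vars(r"C:\%windir%\System32\cmd.exe", env)
# Same target without the extra drive prefix.
fixed = expand_percent_vars(r"%windir%\System32\cmd.exe", env)

print(broken)  # C:\C:\windows\System32\cmd.exe
print(fixed)   # C:\windows\System32\cmd.exe
```

Dropping the leading C:\ from the shortcut target would presumably be enough to make it resolve to a valid path.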

I see that it tries to update the conda version, but it didn't work, as the reported version is still 4.5.4.

I see these 2 errors:

Traceback (most recent call last):
  File "C:\ProgramData\BOINC\slots\3\qmml3\lib\site-packages\psi4\__init__.py", line 55, in <module>
    from . import core
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run.py", line 13, in <module>
    import psi4
  File "C:\ProgramData\BOINC\slots\3\qmml3\lib\site-packages\psi4\__init__.py", line 60, in <module>
    raise ImportError("{0}".format(err))

I can see the deepdiff, jsonpickle, psi4 in the pkgs folder.

This was the most common cause of error. Another one was:

C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda\Scripts\conda install -m -y -p C:\ProgramData\BOINC\slots\1\qmml3 --override-channels -c gpugridbeta -c defaults --file requirements.txt
) 9>C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda.lock
The process cannot access the file because it is being used by another process.
17:08:33 (8180): install_miniconda.bat exited; CPU time 0.046875
17:08:33 (8180): app exit status: 0x1
17:08:33 (8180): called boinc_finish(195)
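
The log above shows a second installer failing outright because miniconda.lock was held by another process, instead of waiting for it. A minimal sketch of the "wait for each other" behaviour, assuming a plain lock file created exclusively; `acquire_lock` and `release_lock` are hypothetical names, and this is not the project's actual install_miniconda.bat logic.

```python
import os
import tempfile
import time

def acquire_lock(path, timeout=5.0, poll=0.1):
    """Create `path` exclusively, retrying until `timeout` seconds elapse.
    Returns True if the lock was obtained, False otherwise."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the file already exists.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)

def release_lock(path):
    """Remove the lock file so the next waiting installer can proceed."""
    os.remove(path)

# Usage: two "installers" competing for one lock in a scratch directory.
lock_path = os.path.join(tempfile.mkdtemp(), "miniconda.lock")
first = acquire_lock(lock_path)                # first installer gets the lock
second = acquire_lock(lock_path, timeout=0.3)  # second one gives up after waiting
release_lock(lock_path)
third = acquire_lock(lock_path)                # free again after release
release_lock(lock_path)
print(first, second, third)  # True False True
```

With a generous timeout, a task that starts while another is still installing would pause rather than exit with an error.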

Toni
Message 50489 - Posted: 13 Sep 2018 | 8:36:38 UTC - in response to Message 50450.
Last modified: 13 Sep 2018 | 8:42:35 UTC

The QC Beta app now seems to be working (group name TST4). To do: work around the creation of the annoying shortcut.

Toby Broom
Message 50495 - Posted: 13 Sep 2018 | 17:45:30 UTC

Seems like they are working now. I see 100% CPU load, running 4x4 cores.


Toni
Message 50496 - Posted: 13 Sep 2018 | 18:05:44 UTC - in response to Message 50495.
Last modified: 13 Sep 2018 | 18:11:38 UTC

Great. Did you see "black windows" like command prompts?

Toby Broom
Message 50497 - Posted: 13 Sep 2018 | 18:40:29 UTC

Didn't notice any :)

[AF] fansyl
Message 50506 - Posted: 14 Sep 2018 | 11:22:18 UTC

3 WUs validated on an i7-3770 Windows 10 host! :-)

Toni
Message 50507 - Posted: 14 Sep 2018 | 12:24:27 UTC - in response to Message 50506.

Can someone keep an eye on the CPU% of Windows tasks? In particular, if you run only one task, does it limit itself to 4 threads?

[AF] fansyl
Message 50509 - Posted: 14 Sep 2018 | 17:18:13 UTC
Last modified: 14 Sep 2018 | 17:19:27 UTC

After a few minutes, CPU load is OK: see the picture!

captainjack
Joined: 9 May 13
Posts: 140
Credit: 951,624,368
RAC: 201,304
Level
Glu
Message 50512 - Posted: 14 Sep 2018 | 20:12:11 UTC

Just out of curiosity, my Linux PC has been receiving Quantum Chemistry, beta test v3.32 (mt). PC number 258077. Is that supposed to happen? I was under the impression that the beta test tasks were for windows only.

Jim1348
Joined: 28 Jul 12
Posts: 615
Credit: 1,199,526,929
RAC: 110,687
Level
Met
Message 50513 - Posted: 14 Sep 2018 | 20:36:14 UTC - in response to Message 50512.

Just out of curiosity, my Linux PC has been receiving Quantum Chemistry, beta test v3.32 (mt). PC number 258077. Is that supposed to happen? I was under the impression that the beta test tasks were for windows only.

Same for me, but they ran OK, so they must be for Linux.

Toby Broom
Message 50535 - Posted: 17 Sep 2018 | 19:45:10 UTC

Are there any drivers in the package? I had some stability issues with my computer, which I never saw before. I had one bug check that indicated Python as the culprit. I doubt it, but I'm just checking.

captainjack
Message 50540 - Posted: 18 Sep 2018 | 14:26:28 UTC
Last modified: 18 Sep 2018 | 14:35:31 UTC

I just picked up some of the beta test 3.33 tasks on two different Windows machines. After 10+ minutes, it looks like the tasks are only using one thread.

After 15+ minutes, one of the machines got this error message:

Debug Error!
Program: C:\ProgramData\BOINC\slots\0\qmml3\python.exe
abort() has been called
(Press Retry to debug the application)

Let me know if you need more information.

[Edit] After 22 minutes, the second machine received the same error and the task aborted.

Toni
Message 50541 - Posted: 18 Sep 2018 | 14:55:26 UTC - in response to Message 50535.
Last modified: 18 Sep 2018 | 14:55:48 UTC

Are there any drivers in the package? I had some stability issues with my computer, which I never saw before. I had one bug check that indicated Python as the culprit. I doubt it, but I'm just checking.


There are no "device drivers" needed. The app is CPU only: it might crash, but shouldn't affect your PC.

Toni
Message 50542 - Posted: 18 Sep 2018 | 15:00:58 UTC - in response to Message 50540.

I just picked up some of the beta test 3.33 tasks on two different Windows machines. After 10+ minutes, it looks like the tasks are only using one thread.

After 15+ minutes, one of the machines got this error message:

Debug Error!
Program: C:\ProgramData\BOINC\slots\0\qmml3\python.exe
abort() has been called
(Press Retry to debug the application)

Let me know if you need more information.

[Edit] After 22 minutes, the second machine received the same error and the task aborted.


The root error is here


PSIO_ERROR: unit = 97, errval = 10
PSIO_ERROR: 10 (lseek failed)


Your machine appears to be the only one with this specific failure. Do you have enough free space on your HDD, or any other uncommon setup (e.g. FAT instead of NTFS)?
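
One quick way to answer the free-space question is a minimal Python sketch like the following. The path "." is only a placeholder (an assumption); on a volunteer machine it would be replaced by the BOINC data directory, and `free_gib` is a hypothetical helper name.

```python
import shutil

def free_gib(path="."):
    """Return free space, in GiB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / 2**30

# Replace "." with the BOINC data directory to check the relevant volume.
print(f"{free_gib('.'):.1f} GiB free")
```

This reports free space on the volume only; it does not show the limits configured in the BOINC client or set per task.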

[AF] fansyl
Message 50543 - Posted: 18 Sep 2018 | 15:05:45 UTC

I think I have the same problem on my host (ID 193594).

I'll try increasing the allowed disk usage.

captainjack
Message 50545 - Posted: 18 Sep 2018 | 16:11:14 UTC

Toni asked:

The root error is here

PSIO_ERROR: unit = 97, errval = 10
PSIO_ERROR: 10 (lseek failed)

Your machine appears to be the only one with this specific failure. Do you have enough free space on your HDD, or any other uncommon setup (e.g. FAT instead of NTFS)?


The error happened on two different computers. Machine 476647 has 100 GB disk space available to BOINC. HDD is formatted as NTFS. I can't think of anything that is non-standard.

Both machines have VirtualBox installed, but that is something that is encountered often in the volunteer computing world.

Let me know if you have more questions.

Toby Broom
Message 50546 - Posted: 18 Sep 2018 | 16:31:54 UTC

I see the same on 2 computers. One has 238 GB free, with BOINC allowed to use up to 218 GB. The other has 366 GB free with the same settings.

I clicked ignore and it seemed to do something, but not full CPU load.

Toni
Message 50547 - Posted: 18 Sep 2018 | 20:29:36 UTC - in response to Message 50546.

These WUs are failing indeed. I have no obvious explanation but am investigating. Thanks.

[VENETO] boboviz
Message 50555 - Posted: 19 Sep 2018 | 8:57:33 UTC

I continue to download WUs for Linux (QC and Beta). Nothing for Windows.
Do I have to change something in my profile?

Toni
Message 50559 - Posted: 19 Sep 2018 | 10:52:26 UTC - in response to Message 50555.

If you receive beta WUs for Linux, then you will get them for Windows too. I had to cancel all outstanding ones because of a nasty bug (which raised a debug dialog).

T

Jim1348
Message 50563 - Posted: 19 Sep 2018 | 22:04:31 UTC
Last modified: 19 Sep 2018 | 22:10:52 UTC

NOTE: This should be in the Linux section, not Windows, so you can move it, though I expect it would apply to either.

This one failed with "196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED".
http://www.gpugrid.net/result.php?resultid=18977559

The BoincTasks message is:

Aborting task 1718_36_33_32_50_14f550fc_n00001-SDOERR_SELE6-0-1-RND8308_2: exceeded disk limit: 61716.49MB > 57220.46MB


However, I retain 96 GB free in my root partition, and the BOINC startup message says: "max disk usage: 184.49 GB".

So apparently that limit is coming from somewhere else(?).
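
The numbers in the abort message hint at where the limit comes from. A minimal sketch of the arithmetic, assuming the "MB" in the message means 2**20 bytes: the limit of 57220.46 MB works out to almost exactly 60 decimal GB, which would be consistent with a round per-task disk bound set on the workunit rather than the client-wide "max disk usage" setting. That reading is an assumption; the thread does not confirm it.

```python
MIB = 1024 * 1024  # assumption: "MB" in the BOINC message means 2**20 bytes

def mib_to_decimal_gb(mib):
    """Convert a size in MiB to decimal gigabytes (10**9 bytes)."""
    return mib * MIB / 1e9

limit_gb = mib_to_decimal_gb(57220.46)  # the limit quoted in the abort message
used_gb = mib_to_decimal_gb(61716.49)   # what the task had written

print(round(limit_gb, 2), round(used_gb, 2))  # 60.0 64.71
```

Under that assumption the task exceeded a 60 GB bound by roughly 4.7 GB, regardless of how much space the partition or the client settings still allowed.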
