Message boards : Multicore CPUs : QC tests for windows

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 49817 - Posted: 8 Jul 2018 | 15:11:46 UTC
Last modified: 8 Jul 2018 | 16:29:29 UTC

I'm preparing some infrastructure for the Windows 64-bit CPU app. There are some beta WUs out. They don't do actual calculations (for that we have to wait for the nontrivial psi4 port), but they do test the distribution system, dependencies, and so on.

Some are already coming back successful. The setup, if it works, will be completely native and will require neither VMs nor the "Linux subsystem". It will expand both the types of tasks we can distribute and, of course, the pool of available machines.

In particular, I'd like to know:
- whether it behaves well with "normal" unprivileged installations
- whether an annoying "Anaconda" icon gets installed in the Start menu

Thanks!

Zalster
Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Message 49818 - Posted: 9 Jul 2018 | 0:40:04 UTC - in response to Message 49817.

Can't seem to DL anything for my windows machine.

Must have been a quick short run.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,201,255,749
RAC: 7,520
Message 49819 - Posted: 9 Jul 2018 | 6:34:41 UTC - in response to Message 49818.

Zalster wrote:
Must have been a quick short run.

Toni wrote:
They don't do actual calculations (for that we have to wait for the nontrivial psi4 port), but they do test the distribution system, dependencies, and so on.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 49821 - Posted: 9 Jul 2018 | 8:43:48 UTC - in response to Message 49819.

There weren't many (approx 300), all of them in the "beta" queue. I need to fix simultaneous starts there too.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 49844 - Posted: 11 Jul 2018 | 9:32:17 UTC - in response to Message 49821.

Waiting for a more consistent Windows queue...

Dzordzik
Joined: 5 Feb 17
Posts: 2
Credit: 318,693,957
RAC: 0
Message 49935 - Posted: 17 Jul 2018 | 19:14:55 UTC

Hi, anything new? My 88-thread server is hungry for some native Windows work. If you want help with testing, let me know.

[AF] fansyl
Joined: 26 Sep 13
Posts: 20
Credit: 1,713,956,441
RAC: 466,197
Message 49952 - Posted: 19 Jul 2018 | 8:37:12 UTC

I think the team is busy with Windows GPU application issues, see here for more details: https://www.gpugrid.net/forum_thread.php?id=4802

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1576
Credit: 5,598,486,851
RAC: 8,769,820
Message 49953 - Posted: 19 Jul 2018 | 10:16:54 UTC - in response to Message 49952.

https://www.gpugrid.net/forum_thread.php?id=4802

Also, several members of staff are on vacation at the moment, so progress will be slow until they return.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50146 - Posted: 30 Jul 2018 | 7:59:42 UTC - in response to Message 49953.

https://www.gpugrid.net/forum_thread.php?id=4802

Also, several members of staff are on vacation at the moment, so progress will be slow until they return.


The Windows GPU app issue now seems to be resolved.
Waiting until September for the Windows CPU app...

Toby Broom
Joined: 11 Dec 08
Posts: 25
Credit: 360,187,443
RAC: 1,622,706
Message 50436 - Posted: 8 Sep 2018 | 11:17:25 UTC

Are there any Windows CPU WUs? I set everything to enabled but I get no tasks.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50437 - Posted: 8 Sep 2018 | 11:36:12 UTC - in response to Message 50436.

Coming soon (on the QC_beta app initially).

Toby Broom
Joined: 11 Dec 08
Posts: 25
Credit: 360,187,443
RAC: 1,622,706
Message 50445 - Posted: 8 Sep 2018 | 14:37:34 UTC

I got a few beta tasks but they failed. It looks like one was installing Python and Miniconda while another WU was trying to run at the same time.

Hopefully the next round is good.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50446 - Posted: 8 Sep 2018 | 14:45:46 UTC - in response to Message 50445.

Installing Python (Miniconda, actually) is normal. It should be contained in its own BOINC directory. Multiple installations should also wait for each other. There is something else I'm investigating.

Did you get any "visible" changes in the system?

Thanks

Toby Broom
Joined: 11 Dec 08
Posts: 25
Credit: 360,187,443
RAC: 1,622,706
Message 50450 - Posted: 8 Sep 2018 | 22:19:24 UTC

I see the Anaconda option in the Start menu, but none of the tasks actually worked.

If I click it, it does not work because the path is wrong. It has:

C:\%windir%\System32\cmd.exe "/K" C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda\Scripts\activate.bat C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda

but %windir% = C:\windows so it actually resolves to c:\c\windows.
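
The broken shortcut target can be reproduced with a plain textual substitution. The sketch below is only an illustration, assuming %windir% is C:\windows as stated above; `expand_percent_vars` and the `env` dictionary are hypothetical helpers, not anything the installer or Windows itself uses. A literal substitution leaves a doubled drive prefix, so the target cannot exist, whereas dropping the hard-coded `C:\` gives a valid path.

```python
import re

def expand_percent_vars(path, env):
    """Expand %NAME% placeholders by literal text substitution."""
    def lookup(match):
        # Windows variable names are case-insensitive; unknown names stay as-is.
        return env.get(match.group(1).lower(), match.group(0))
    return re.sub(r"%([^%]+)%", lookup, path)

env = {"windir": r"C:\windows"}  # assumed value, taken from the post

# Shortcut target as installed: a drive prefix in front of %windir%.
broken = expand_percent_vars(r"C:\%windir%\System32\cmd.exe", env)
# Same target without the hard-coded drive prefix.
fixed = expand_percent_vars(r"%windir%\System32\cmd.exe", env)

print(broken)
print(fixed)
```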

I see that it tries to update the conda version, but that didn't work: the reported version is still 4.5.4.

I see these 2 errors:

Traceback (most recent call last):
  File "C:\ProgramData\BOINC\slots\3\qmml3\lib\site-packages\psi4\__init__.py", line 55, in <module>
    from . import core
ImportError: DLL load failed: The specified module could not be found.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "run.py", line 13, in <module>
    import psi4
  File "C:\ProgramData\BOINC\slots\3\qmml3\lib\site-packages\psi4\__init__.py", line 60, in <module>
    raise ImportError("{0}".format(err))

I can see deepdiff, jsonpickle and psi4 in the pkgs folder.

This was the most common cause of error. Another one was:

C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda\Scripts\conda install -m -y -p C:\ProgramData\BOINC\slots\1\qmml3 --override-channels -c gpugridbeta -c defaults --file requirements.txt
) 9>C:\ProgramData\BOINC\projects\www.gpugrid.net\miniconda.lock
The process cannot access the file because it is being used by another process.
17:08:33 (8180): install_miniconda.bat exited; CPU time 0.046875
17:08:33 (8180): app exit status: 0x1
17:08:33 (8180): called boinc_finish(195)
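
The "being used by another process" failure above points at two tasks touching miniconda.lock at the same time. One common way to make installers wait for each other is an exclusively created lock file with a retry loop. The following is only a sketch under that assumption, not the project's actual install_miniconda.bat logic; `acquire_lock`, `release_lock`, the timeout and the poll interval are all hypothetical.

```python
import os
import tempfile
import time

def acquire_lock(lock_path, timeout=600.0, poll=1.0):
    """Create lock_path exclusively, retrying until timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)

def release_lock(lock_path):
    """Remove the lock file so the next waiting installer can proceed."""
    os.remove(lock_path)

# Demo in a temporary directory (the real lock lives in the project folder).
demo_lock = os.path.join(tempfile.mkdtemp(), "miniconda.lock")
first = acquire_lock(demo_lock, timeout=1.0, poll=0.1)
second = acquire_lock(demo_lock, timeout=0.3, poll=0.1)  # lock held: times out
release_lock(demo_lock)
third = acquire_lock(demo_lock, timeout=1.0, poll=0.1)
release_lock(demo_lock)
print(first, second, third)
```

This sketch does not handle stale locks left behind by a crashed installer; a real implementation would need some form of cleanup for that case.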

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50489 - Posted: 13 Sep 2018 | 8:36:38 UTC - in response to Message 50450.
Last modified: 13 Sep 2018 | 8:42:35 UTC

The QC Beta app now seems to be working (group name TST4). To do: work around the creation of the annoying shortcut.

Toby Broom
Joined: 11 Dec 08
Posts: 25
Credit: 360,187,443
RAC: 1,622,706
Message 50495 - Posted: 13 Sep 2018 | 17:45:30 UTC

Seems like they are working now. I see 100% CPU load, running 4 x 4 cores.


Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50496 - Posted: 13 Sep 2018 | 18:05:44 UTC - in response to Message 50495.
Last modified: 13 Sep 2018 | 18:11:38 UTC

Great. Did you see "black windows" like command prompts?

Toby Broom
Joined: 11 Dec 08
Posts: 25
Credit: 360,187,443
RAC: 1,622,706
Message 50497 - Posted: 13 Sep 2018 | 18:40:29 UTC

Didn't notice any :)

[AF] fansyl
Joined: 26 Sep 13
Posts: 20
Credit: 1,713,956,441
RAC: 466,197
Message 50506 - Posted: 14 Sep 2018 | 11:22:18 UTC

3 WUs validated on an i7-3770 Windows 10 host! :-)

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50507 - Posted: 14 Sep 2018 | 12:24:27 UTC - in response to Message 50506.

Can someone keep an eye on the CPU% of Windows tasks? In particular, if you run only one task, does it limit itself to 4 threads?

[AF] fansyl
Joined: 26 Sep 13
Posts: 20
Credit: 1,713,956,441
RAC: 466,197
Message 50509 - Posted: 14 Sep 2018 | 17:18:13 UTC
Last modified: 14 Sep 2018 | 17:19:27 UTC

After a few minutes, the CPU load is OK: see the picture!

captainjack
Joined: 9 May 13
Posts: 171
Credit: 2,321,104,288
RAC: 2,339,714
Message 50512 - Posted: 14 Sep 2018 | 20:12:11 UTC

Just out of curiosity, my Linux PC has been receiving Quantum Chemistry, beta test v3.32 (mt). PC number 258077. Is that supposed to happen? I was under the impression that the beta test tasks were for windows only.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 50513 - Posted: 14 Sep 2018 | 20:36:14 UTC - in response to Message 50512.

Just out of curiosity, my Linux PC has been receiving Quantum Chemistry, beta test v3.32 (mt). PC number 258077. Is that supposed to happen? I was under the impression that the beta test tasks were for windows only.

Same for me, but they ran OK, so they must be for Linux.

Toby Broom
Joined: 11 Dec 08
Posts: 25
Credit: 360,187,443
RAC: 1,622,706
Message 50535 - Posted: 17 Sep 2018 | 19:45:10 UTC

Are there any drivers in the package? I had some stability issues with my computer, which I never saw before. I had one bug check that indicated Python as the culprit. I doubt it, but just checking.

captainjack
Joined: 9 May 13
Posts: 171
Credit: 2,321,104,288
RAC: 2,339,714
Message 50540 - Posted: 18 Sep 2018 | 14:26:28 UTC
Last modified: 18 Sep 2018 | 14:35:31 UTC

I just picked up some of the beta test 3.33 tasks on two different Windows machines. After 10+ minutes, it looks like the tasks are only using one thread.

After 15+ minutes, one of the machines got this error message:

Debug Error!
Program: C:\ProgramData\BOINC\slots\0\qmml3\python.exe
abort() has been called
(Press Retry to debug the application)

Let me know if you need more information.

[Edit] After 22 minutes, the second machine received the same error and the task aborted.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50541 - Posted: 18 Sep 2018 | 14:55:26 UTC - in response to Message 50535.
Last modified: 18 Sep 2018 | 14:55:48 UTC

Are there any drivers in the package? I had some stability issues with my computer, which I never saw before. I had one bug check that indicated Python as the culprit. I doubt it, but just checking.


There are no "device drivers" needed. The app is CPU only: it might crash, but shouldn't affect your PC.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50542 - Posted: 18 Sep 2018 | 15:00:58 UTC - in response to Message 50540.

I just picked up some of the beta test 3.33 tasks on two different Windows machines. After 10+ minutes, it looks like the tasks are only using one thread.

After 15+ minutes, one of the machines got this error message:

Debug Error!
Program: C:\ProgramData\BOINC\slots\0\qmml3\python.exe
abort() has been called
(Press Retry to debug the application)

Let me know if you need more information.

[Edit] After 22 minutes, the second machine received the same error and the task aborted.


The root error is here


PSIO_ERROR: unit = 97, errval = 10
PSIO_ERROR: 10 (lseek failed)


Your machine appears to be the only one with this specific failure. Do you have free space on your HDD, or any uncommon setup (e.g. FAT instead of NTFS)?
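
A quick way to answer the free-space question is to query the volume that holds the BOINC data directory. This sketch uses Python's standard `shutil.disk_usage`; the helper name `free_gib` is hypothetical, and the Windows path mentioned in the comment is simply the one that appears in the logs in this thread. It does not check the file system type.

```python
import shutil

def free_gib(path):
    """Return free space on the volume containing path, in GiB."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 2**30

# "." is used so the sketch runs anywhere; on an affected host one would
# pass the BOINC data directory instead, e.g. C:\ProgramData\BOINC.
print(f"free: {free_gib('.'):.1f} GiB")
```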

[AF] fansyl
Joined: 26 Sep 13
Posts: 20
Credit: 1,713,956,441
RAC: 466,197
Message 50543 - Posted: 18 Sep 2018 | 15:05:45 UTC

I think I have the same problem on my host ID 193594.

I'll try increasing the disk usage limit.

captainjack
Joined: 9 May 13
Posts: 171
Credit: 2,321,104,288
RAC: 2,339,714
Message 50545 - Posted: 18 Sep 2018 | 16:11:14 UTC

Toni asked:

The root error is here

PSIO_ERROR: unit = 97, errval = 10
PSIO_ERROR: 10 (lseek failed)

Your machine appears to be the only one with this specific failure. Do you have free space on your HDD, or any uncommon setup (e.g. FAT instead of NTFS)?


The error happened on two different computers. Machine 476647 has 100 GB disk space available to BOINC. HDD is formatted as NTFS. I can't think of anything that is non-standard.

Both machines have VirtualBox installed, but that is something that is encountered often in the volunteer computing world.

Let me know if you have more questions.

Toby Broom
Joined: 11 Dec 08
Posts: 25
Credit: 360,187,443
RAC: 1,622,706
Message 50546 - Posted: 18 Sep 2018 | 16:31:54 UTC

I see the same on 2 computers. One has 238 GB free, with BOINC allowed to use up to 218 GB.

The other has 366 GB free with the same settings.

I clicked Ignore and it seemed to do something, but not at full CPU load.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50547 - Posted: 18 Sep 2018 | 20:29:36 UTC - in response to Message 50546.

These WUs are failing indeed. I have no obvious explanation but am investigating. Thanks.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50555 - Posted: 19 Sep 2018 | 8:57:33 UTC

I continue to download WUs for Linux (QC and Beta). Nothing for Windows.
Do I have to change something in my profile?

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50559 - Posted: 19 Sep 2018 | 10:52:26 UTC - in response to Message 50555.

If you receive beta WUs for Linux, then you will get them for Windows too. I had to cancel all outstanding ones because of a nasty bug (which raised a debug dialog).

T

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 50563 - Posted: 19 Sep 2018 | 22:04:31 UTC
Last modified: 19 Sep 2018 | 22:10:52 UTC

NOTE: This should be in the Linux section, not Windows, so you can move it, though I expect it would apply to either.

This one failed with "196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED".
http://www.gpugrid.net/result.php?resultid=18977559

The BoincTasks message is:

Aborting task 1718_36_33_32_50_14f550fc_n00001-SDOERR_SELE6-0-1-RND8308_2: exceeded disk limit: 61716.49MB > 57220.46MB


However, I retain 96 GB free in my root partition, and the BOINC startup message says: "max disk usage: 184.49 GB".

So apparently that limit is coming from somewhere else(?).
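
The numbers in the abort message hint at where the limit comes from. Assuming the message reports MB as 2^20 bytes, 57220.46 MB is almost exactly 60,000,000,000 bytes, which looks like a round per-task bound set by the project rather than the client's disk preference. In BOINC the per-workunit bound is `rsc_disk_bound`; that this task used a bound of exactly 60e9 bytes is an inference from the arithmetic, not something the log states.

```python
MIB = 2**20  # bytes per "MB" as printed in the message (assumption)

limit_mb = 57220.46   # from the abort message
used_mb = 61716.49    # from the abort message

limit_bytes = limit_mb * MIB
used_bytes = used_mb * MIB

print(f"limit ~ {limit_bytes / 1e9:.2f}e9 bytes")
print(f"used  ~ {used_bytes / 1e9:.2f}e9 bytes")
print(f"over by {used_mb - limit_mb:.2f} MB")

# Hypothetical round bound: check whether it reproduces the printed limit.
assumed_bound = 60_000_000_000
print(round(assumed_bound / MIB, 2) == limit_mb)
```

If the bound is indeed per task, raising the client's "max disk usage" would not help, which matches what was observed here.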

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Message 50638 - Posted: 3 Oct 2018 | 5:36:51 UTC

any progress with the planned QC app for Windows?

If I look at the Project Status page, it currently shows the following:

Long runs (= GPU tasks) unsent: 1,550; in progress: 3,634; users: 1,002
Quantum Chemistry unsent: 179,513; in progress: 550; users: 71

this shows a vast imbalance, the reason for which is that QC CPU tasks are still available for Linux users only.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50639 - Posted: 3 Oct 2018 | 7:07:33 UTC - in response to Message 50638.

this shows a vast imbalance, the reason for which is that QC CPU tasks are still available for Linux users only.


And not all distros are welcome.
I tried some major distros and I had some problems (for example with Fedora).
No problems with "old" Mint.

tullio
Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Message 50640 - Posted: 3 Oct 2018 | 7:59:10 UTC - in response to Message 50639.
Last modified: 3 Oct 2018 | 8:03:41 UTC

I am running SuSE Leap 42.3 on my main Linux box, and Leap 15.0 on my HP laptop. The kernel of Leap 15.0 is more advanced and is equal to that of the Enterprise version of SuSE Linux, SLES. It is also being updated regularly, while updates of 42.3 seem to have finished. But I have most of my tools on 42.3, including Kaffeine, and it runs GPUGRID tasks, both CPU and GPU, since the box has a GTX 750 Ti GPU board.
Tullio

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50641 - Posted: 3 Oct 2018 | 9:37:13 UTC - in response to Message 50639.


And not all distros are welcome.
I tried some major distros and I had some problems (for example with Fedora).
No problems with "old" Mint.


Do you have a task number to check?

Thx

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50644 - Posted: 4 Oct 2018 | 10:13:29 UTC - in response to Message 50641.
Last modified: 4 Oct 2018 | 10:23:21 UTC

Do you have a task number to check?


18961112
I don't know if it is helpful.
I killed this WU after 2 h of crunching: the WU stopped at 10% and the elapsed time kept restarting from 0.
If you need, I can try other WUs.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50645 - Posted: 4 Oct 2018 | 13:32:05 UTC - in response to Message 50644.

Do you have a task number to check?


18961112
I don't know if it is helpful.
I killed this WU after 2 h of crunching: the WU stopped at 10% and the elapsed time kept restarting from 0.
If you need, I can try other WUs.


Thanks. What distro is it? Is it up to date (i.e. security patches)? The other problem seems to be that the failure was not detected as such, and so the task restarted.

Zalster
Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Message 50646 - Posted: 4 Oct 2018 | 17:32:38 UTC - in response to Message 50639.

this shows a vast imbalance, the reason for which is that QC CPU tasks are still available for Linux users only.


And not all distros are welcome.
I tried some major distros and I had some problems (for example with Fedora).
No problems with "old" Mint.

I run Ubuntu and have not had a problem.

tullio
Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Message 50647 - Posted: 4 Oct 2018 | 17:43:49 UTC

After installing nVidia driver 411.70 on my Windows 10 PC and a Windows 1809 upgrade, my first GPU task completed and validated on it. So far GPU tasks were validated only on my Linux box with SuSE Leap 42.3 and a GTX 750 Ti.
Tullio

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Message 50648 - Posted: 4 Oct 2018 | 18:15:26 UTC - in response to Message 50647.

After installing nVidia driver 411.70 on my Windows 10 PC and a Windows 1809 upgrade ...

your Win10 1809 Upgrade installed the NVIDIA 411.70 driver? Mine did NOT. I made the upgrade on one of my machines today, but there is still the 388... driver (on the other hand, the CPU usage values of the various apps in the Task Manager are no longer shown correctly; from what I was told: a known bug).

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50649 - Posted: 5 Oct 2018 | 8:01:16 UTC - in response to Message 50645.
Last modified: 5 Oct 2018 | 8:01:31 UTC

Thanks. What distro is it? Is it up to date (i.e. security patches)? The other problem seems to be that the failure was not detected as such, and so the task restarted.


Fedora 28 with security updates.
Now I'm returning to Mint...

tullio
Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Message 50650 - Posted: 5 Oct 2018 | 13:10:55 UTC - in response to Message 50648.

No, I installed it via Geforce. Now GPUGRID tasks run successfully and also Einstein@home GPU tasks.
Tullio

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Message 50654 - Posted: 8 Oct 2018 | 10:47:07 UTC

any progress in developing and testing QC tasks for Windows ?

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50655 - Posted: 8 Oct 2018 | 15:22:40 UTC - in response to Message 50649.

Fedora 28 with security updates.
Now I'm returning to Mint...


I have 58 GB free for GPUGRID and, again,
196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED

Oh, my...

tullio
Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Message 50656 - Posted: 8 Oct 2018 | 19:08:41 UTC - in response to Message 50655.

The same thing happens to me on SELE6 tasks.
Tullio

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50659 - Posted: 9 Oct 2018 | 6:59:27 UTC - in response to Message 50654.

any progress in developing and testing QC tasks for Windows ?


+1

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 50666 - Posted: 10 Oct 2018 | 8:22:25 UTC

No sorry, for this week and maybe the next we have other priorities. But we will definitely come back to it soon.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50740 - Posted: 26 Oct 2018 | 12:15:32 UTC - in response to Message 50666.

No sorry, for this week and maybe the next we have other priorities. But we will definitely come back to it soon.


And this week has gone too...

PappaLitto
Joined: 21 Mar 16
Posts: 511
Credit: 4,617,042,755
RAC: 0
Message 50741 - Posted: 26 Oct 2018 | 13:03:44 UTC - in response to Message 50740.

And this week has gone too...

Their priorities are most likely scientific. There is the side we don't see, the biology/chemistry side. The computation is only half of the puzzle.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 50755 - Posted: 29 Oct 2018 | 10:30:19 UTC - in response to Message 50741.

Their priorities are most likely scientific. There is the side we don't see, the biology/chemistry side. The computation is only half of the puzzle.


I know, I know, and I like that they work on the science side.
But I don't know how much science they can do with 74 PCs on their CPU project...

tullio
Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Message 50758 - Posted: 29 Oct 2018 | 15:59:58 UTC - in response to Message 50755.

Maybe now that IBM has bought RedHat there will be more Linux users.
Tullio

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Message 50967 - Posted: 27 Nov 2018 | 19:02:28 UTC - in response to Message 50666.

Stefan wrote on October 10:

No sorry, for this week and maybe the next we have other priorities. But we will definitely come back to it soon.


any progress?

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 50972 - Posted: 28 Nov 2018 | 9:22:50 UTC

No, sorry. Still working on the paper. I would say a good estimate is from January considering we will need to make a sprint to finish everything before holidays.

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 50973 - Posted: 28 Nov 2018 | 9:23:36 UTC

And yes I realize we are losing incredible computational power by not having the Windows App. Priorities are as they are though :/

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Message 50974 - Posted: 28 Nov 2018 | 11:36:15 UTC

Stefan, thanks for the quick information :-)

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 51201 - Posted: 7 Jan 2019 | 8:06:29 UTC - in response to Message 50972.

I would say a good estimate is from January considering we will need to make a sprint to finish everything before holidays.


We are ready! :-P

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Message 51202 - Posted: 7 Jan 2019 | 8:22:46 UTC - in response to Message 51201.

I would say a good estimate is from January considering we will need to make a sprint to finish everything before holidays.


We are ready! :-P

+1

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Message 51259 - Posted: 10 Jan 2019 | 23:36:28 UTC

I received the second batch of Quantum Chemistry, beta test v3.33 (mt) for Windows today. Except for 2 errors (so far), they are finishing successfully.

http://www.gpugrid.net/results.php?hostid=263612&offset=0&show_names=0&state=0&appid=35


I also get this message in the Stderr output:

==> WARNING: A newer version of conda exists. <==
current version: 4.5.4
latest version: 4.5.12

http://www.gpugrid.net/result.php?resultid=20150516
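
The warning above contrasts 4.5.4 with 4.5.12, a reminder that version strings have to be compared numerically, component by component; as plain strings, "4.5.4" sorts after "4.5.12". A minimal sketch follows, with the hypothetical helper `parse_version` handling only simple dotted numeric versions. conda's own comparison logic is more elaborate and is not reproduced here.

```python
def parse_version(text):
    """Split a dotted numeric version such as '4.5.12' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))

current = "4.5.4"    # from the Stderr output
latest = "4.5.12"    # from the Stderr output

# Tuple comparison works component by component: (4, 5, 4) < (4, 5, 12).
needs_update = parse_version(current) < parse_version(latest)
# Lexicographic string comparison gets this pair wrong.
naive_compare = current < latest

print(needs_update, naive_compare)
```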


captainjack
Joined: 9 May 13
Posts: 171
Credit: 2,321,104,288
RAC: 2,339,714
Message 51260 - Posted: 10 Jan 2019 | 23:41:41 UTC

Got the following error message for a Windows QC_beta task:

16:34:31 (7484): wrapper: running .\qmml3\python.exe (run.py)
Detected memory leaks!
Dumping objects ->
..\api\boinc_api.cpp(309) : {1788} normal block at 0x0000022AE7E17A40, 8 bytes long.
Data: < * > 00 00 DC E7 2A 02 00 00
..\lib\diagnostics_win.cpp(417) : {555} normal block at 0x0000022AE7E40000, 1080 bytes long.
Data: < > 98 14 00 00 CD CD CD CD E4 01 00 00 00 00 00 00
..\zip\boinc_zip.cpp(122) : {294} normal block at 0x0000022AE7DFF750, 260 bytes long.
Data: < > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
{280} normal block at 0x0000022AE7DF99D0, 55 bytes long.
Data: < o > 01 00 00 00 6F 00 CD CD 00 00 00 00 00 00 00 00
{275} normal block at 0x0000022AE7E1BCE0, 44 bytes long.
Data: < * > 01 00 00 00 00 00 CD CD 01 BD E1 E7 2A 02 00 00
{270} normal block at 0x0000022AE7E1BE30, 43 bytes long.
Data: < Q * > 01 00 00 00 00 00 CD CD 51 BE E1 E7 2A 02 00 00
{265} normal block at 0x0000022AE7E1BC00, 44 bytes long.
Data: < ! * > 01 00 00 00 00 00 CD CD 21 BC E1 E7 2A 02 00 00
{260} normal block at 0x0000022AE7E1C1B0, 44 bytes long.
Data: < * > 01 00 00 00 00 00 CD CD D1 C1 E1 E7 2A 02 00 00
Object dump complete.

</stderr_txt>
]]>

Looks like it ran for ~61 minutes before it croaked.
Please let me know if you need more information.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Message 51271 - Posted: 11 Jan 2019 | 7:03:19 UTC - in response to Message 51259.

I received the second batch of Quantum Chemistry, beta test v3.33 (mt) for Windows today. Except for 2 errors (so far), they are finishing successfully.

http://www.gpugrid.net/results.php?hostid=263612&offset=0&show_names=0&state=0&appid=35


I also get this message in the Stderr output:

==> WARNING: A newer version of conda exists. <==
current version: 4.5.4
latest version: 4.5.12

http://www.gpugrid.net/result.php?resultid=20150516




I am still getting the 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED errors even with the disk usage space for BOINC set at 500 GB.





[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Message 51275 - Posted: 11 Jan 2019 | 10:55:37 UTC

My first Windows wu:

195 (0xc3) EXIT_CHILD_FAILED

11:18:52 (12032): install_miniconda.bat exited; CPU time 99.796875
11:18:52 (12032): wrapper: running .\qmml3\python.exe (run.py)
PSIO_ERROR: unit = 97, errval = 12
PSIO_ERROR: 12 (error writing to file)
11:44:45 (12032): .\qmml3\python.exe exited; CPU time 4998.312500
11:44:45 (12032): app exit status: 0xc0000409
11:44:45 (12032): called boinc_finish(195)
0 bytes in 0 Free Blocks.
396 bytes in 4 Normal Blocks.
1144 bytes in 1 CRT Blocks.
0 bytes in 0 Ignore Blocks.
0 bytes in 0 Client Blocks.
Largest number used: 0 bytes.
Total allocations: 22092877 bytes.
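
The status values in these logs can be decoded with a small lookup. The two BOINC codes below are taken from the messages in this thread; 0xC0000409 is the Windows NTSTATUS value STATUS_STACK_BUFFER_OVERRUN, which Windows also reports for fail-fast terminations, so it would fit an abort following the PSIO write error. The table, its short descriptions and the helper `describe_exit` are an illustrative sketch, not a complete list.

```python
KNOWN_CODES = {
    195: "EXIT_CHILD_FAILED (BOINC: the wrapped child process failed)",
    196: "EXIT_DISK_LIMIT_EXCEEDED (BOINC: task exceeded its disk bound)",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN (Windows NTSTATUS, fail-fast)",
}

def describe_exit(code):
    """Return a hex-tagged description for an exit status, if known."""
    name = KNOWN_CODES.get(code, "unknown")
    return f"{code} (0x{code:x}): {name}"

for status in (195, 196, 0xC0000409):
    print(describe_exit(status))
```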

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Message 51276 - Posted: 11 Jan 2019 | 12:27:16 UTC - in response to Message 51275.

My first Windows wu:
195 (0xc3) EXIT_CHILD_FAILED

11:18:52 (12032): install_miniconda.bat exited; CPU time 99.796875
11:18:52 (12032): wrapper: running .\qmml3\python.exe (run.py)
PSIO_ERROR: unit = 97, errval = 12
PSIO_ERROR: 12 (error writing to file)
11:44:45 (12032): .\qmml3\python.exe exited; CPU time 4998.312500
11:44:45 (12032): app exit status: 0xc0000409
11:44:45 (12032): called boinc_finish(195)
0 bytes in 0 Free Blocks.
396 bytes in 4 Normal Blocks.
1144 bytes in 1 CRT Blocks.
0 bytes in 0 Ignore Blocks.
0 bytes in 0 Client Blocks.
Largest number used: 0 bytes.
Total allocations: 22092877 bytes.


Same here, I had a few of those errors.



Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 51277 - Posted: 11 Jan 2019 | 14:22:18 UTC - in response to Message 51271.
Last modified: 11 Jan 2019 | 14:23:51 UTC

I am still getting the 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED errors, even with the disk space for BOINC set to 500 GB.

It is not your setting. The project needs to change it.
But now they are fixing the problem by going back to the smaller ones; see Stefan's post on the subject.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51284 - Posted: 12 Jan 2019 | 6:50:05 UTC

Just now I tried to download QC tasks and/or QC beta tasks for windows.

However, none are available (according to a notice in the BOINC Manager).

So all the "unsent" tasks shown on the Server Status Page are still for Linux only?
I thought that there are some available now for Windows.

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 85
Credit: 1,215,531,270
RAC: 182,549
Level
Met
Message 51285 - Posted: 12 Jan 2019 | 7:05:22 UTC
Last modified: 12 Jan 2019 | 7:07:36 UTC

Took 3 for testing.
If I understand correctly, calculation progress for these WUs is not displayed correctly. How much time do these WUs take on average?

P.S. These WUs put quite a heavy load on the storage subsystem. When 3 of them write data to disk simultaneously, this generates a stream of almost 1 gigabyte per second! o_O (~900 MiB/s).
Without a high-performance SSD, it is better not to run more than one such WU : )


Upd.

One of them already died -_-
https://www.gpugrid.net/result.php?resultid=20269815

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 85
Credit: 1,215,531,270
RAC: 182,549
Level
Met
Message 51286 - Posted: 12 Jan 2019 | 8:00:49 UTC - in response to Message 51284.
Last modified: 12 Jan 2019 | 8:02:07 UTC

Erich56 wrote:
So all the "unsent" tasks shown on the Server Status Page are still for Linux only?

Only the Quantum Chemistry, beta test 3.33 (mt) WUs are currently available for Windows. I think on the server status page they are listed as Quantum Chemistry, beta test.
You probably need to check your preferences:

Use CPU - yes
Run test applications? - yes
Quantum Chemistry (CPU, beta): yes

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51288 - Posted: 12 Jan 2019 | 9:43:00 UTC - in response to Message 51286.

Use CPU - yes
Run test applications? - yes
Quantum Chemistry (CPU, beta): yes

Thanks for the hint; I had forgotten to set "Run test applications" to "yes".
I did that now, and one task was downloaded and got started.

What makes me wonder, though, is that after a few minutes the progress percentage in the BOINC Manager got stuck at 1.098%, although the time is progressing and the Windows Task Manager shows that Python is running (and using about 214MB RAM).
What I just noticed is that whereas "remaining time" showed some 6 hours before, now, about 13 minutes later, the value has gone up to about 18 hours.
Is this okay, or is the task faulty?

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51289 - Posted: 12 Jan 2019 | 9:50:38 UTC - in response to Message 51288.

What makes me wonder, though, is that after a few minutes the progress percentage in the BOINC Manager got stuck at 1.098%, although the time is progressing and the Windows Task Manager shows that Python is running (and using about 214MB RAM).
What I just noticed is that whereas "remaining time" showed some 6 hours before, now, about 13 minutes later, the value has gone up to about 18 hours.
Is this okay, or is the task faulty?

Now, 20 minutes after the start, the progress bar is still at 1.098%, the remaining time is shown as 1 day + 6 hours, and Python is using 3,194MB RAM (which is not a problem yet at this point).

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51291 - Posted: 12 Jan 2019 | 11:00:08 UTC
Last modified: 12 Jan 2019 | 11:01:01 UTC

the task failed with the error
196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED
after 4,461 secs.

http://gpugrid.net/result.php?resultid=20291478

Another task is running, so let's see.

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 85
Credit: 1,215,531,270
RAC: 182,549
Level
Met
Message 51295 - Posted: 12 Jan 2019 | 15:22:41 UTC - in response to Message 51288.

Erich56 wrote:
Is this okay, or is the task faulty?

As far as I've noticed, all these WU behave this way.

I have finished a bunch of them, and their computation time varies from 0.5 to 3 hours, about 1.5 hours on average.

Erich56 wrote:
the task failed with the error
196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED

I think it's "ok".
11 of 14 of these WU completed successfully. Failed WUs have the same error.
https://www.gpugrid.net/results.php?hostid=164968

Profile [AF>Amis des Lapins] Phil...
Joined: 16 Jul 13
Posts: 56
Credit: 1,626,354,890
RAC: 0
Level
His
Message 51299 - Posted: 12 Jan 2019 | 17:57:03 UTC

105 errors/failures out of 188 calculated WUs.
I understand this is a BETA application.
Hope this helps.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51300 - Posted: 12 Jan 2019 | 19:03:32 UTC

so far, all 3 tasks which got downloaded failed with this exit_disk_limit_exceeded error after about 3900 - 4800 seconds.

I do understand that these tasks are beta, so one cannot expect that they succeed.
Maybe I am lucky though and one of these tasks will be successful eventually.

[CSF] Aleksey Belkov
Joined: 26 Dec 13
Posts: 85
Credit: 1,215,531,270
RAC: 182,549
Level
Met
Message 51304 - Posted: 12 Jan 2019 | 23:19:57 UTC
Last modified: 12 Jan 2019 | 23:21:12 UTC

In the end, only 14 of the 30 tasks I took completed successfully.
There are no more test tasks, so I continue crunching WCG.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51305 - Posted: 13 Jan 2019 | 6:34:20 UTC
Last modified: 13 Jan 2019 | 6:49:08 UTC

Since I started with these tasks yesterday, 20 have been processed so far, of which only 2 succeeded.
Hence, I have now stopped downloading them; it doesn't make any sense.

What I don't understand is:

13.01.2019 07:46:22 | GPUGRID | Aborting task 194r20-TONI_TST10-0-1-RND3322_3: exceeded disk limit: 41204.15MB > 28610.23MB

so why is there not sufficient disk limit set by the project to begin with ???

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Level
Tyr
Message 51310 - Posted: 13 Jan 2019 | 15:10:42 UTC - in response to Message 51305.

Since I started with these tasks yesterday, 20 have been processed so far, of which only 2 succeeded.
Hence, I have now stopped downloading them; it doesn't make any sense.

What I don't understand is:

13.01.2019 07:46:22 | GPUGRID | Aborting task 194r20-TONI_TST10-0-1-RND3322_3: exceeded disk limit: 41204.15MB > 28610.23MB

so why is there not sufficient disk limit set by the project to begin with ???


This means your file size was 41204.15MB and the limit is 28610.23MB, not the other way around.

The trick is to find which file contains the limit and change it to a larger number; then this task would have finished successfully.
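
To make the comparison concrete, here is a minimal Python sketch that parses an abort line like the one quoted above and reports how far the task overshot the limit. The regular expression and the helper name parse_disk_abort are my own, written just for this log format, and I am assuming that "MB" in the message means 2**20 bytes. Under that assumption, 28610.23MB works out to almost exactly 30,000,000,000 bytes, which suggests the limit was set as a round 30e9.

```python
import re

# Hypothetical pattern, written for the log line quoted in this thread.
ABORT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)

MIB = 1024 * 1024  # assumption: "MB" in the message means 2**20 bytes


def parse_disk_abort(line):
    """Return (task, used_mb, limit_mb), or None if the line does not match."""
    m = ABORT_RE.search(line)
    if m is None:
        return None
    return m.group("task"), float(m.group("used")), float(m.group("limit"))


line = ("13.01.2019 07:46:22 | GPUGRID | Aborting task "
        "194r20-TONI_TST10-0-1-RND3322_3: exceeded disk limit: "
        "41204.15MB > 28610.23MB")

task, used_mb, limit_mb = parse_disk_abort(line)
over_mb = used_mb - limit_mb
limit_bytes = limit_mb * MIB

print(f"{task}: used {used_mb:.2f} MB, limit {limit_mb:.2f} MB")
print(f"over the limit by {over_mb:.2f} MB ({used_mb / limit_mb:.2f}x)")
print(f"limit in bytes: about {limit_bytes:,.0f}")
```

The first number in the message is what the task used and the second is what the project allowed, so a task like this one needed roughly 1.4 times the allowed space.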






Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51311 - Posted: 13 Jan 2019 | 17:49:12 UTC - in response to Message 51310.

The trick is to find which file contains the limit and change it to a larger number; then this task would have finished successfully.

any advice how to get this accomplished ?

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Level
Tyr
Message 51313 - Posted: 13 Jan 2019 | 22:26:19 UTC - in response to Message 51311.

The trick is to find which file contains the limit and change it to a larger number; then this task would have finished successfully.

any advice how to get this accomplished ?


Honestly, I have no idea.

I attempted this: go to the …/BOINC/slots directory, where there are several directories named 0, 1, 2, 3, etc. Find the one that has the miniconda / wrapper files, and edit the boinc_task_state.xml file with Notepad. It should open and look something like this:

<active_task>
<project_master_url>http://www.gpugrid.net/</project_master_url>
<result_name>4534r1-TONI_TST10-0-1-RND0591_4</result_name>
<checkpoint_cpu_time>30.765630</checkpoint_cpu_time>
<checkpoint_elapsed_time>128.215471</checkpoint_elapsed_time>
<fraction_done>0.010989</fraction_done>
<peak_working_set_size>149794816</peak_working_set_size>
<peak_swap_size>139321344</peak_swap_size>
<peak_disk_usage>575110930</peak_disk_usage>
</active_task>

Edit the peak disk usage number to something larger and save the file. Both of my attempts failed; the WUs errored out. Maybe I should have bumped the numbers higher. I added a zero at the end. The last disk space usage before the crash was over 44 gigs.

Give it a try, if you like.

Something similar to this was done in the past for oversized output files. It worked.
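
For anyone who wants to inspect the file before editing it by hand, here is a minimal Python sketch that reads the fields from the fragment above. It only reads values and does not change any limit. One caveat, which is my understanding and not something confirmed in this thread: peak_disk_usage looks like a recorded measurement, while the limit itself is normally the workunit's rsc_disk_bound in client_state.xml. If that is right, it might explain why editing this file did not help.

```python
import xml.etree.ElementTree as ET

# The fragment posted above, embedded as a string so the sketch is
# self-contained. On a real host you would read the file from the slot
# directory instead.
TASK_STATE = """<active_task>
<project_master_url>http://www.gpugrid.net/</project_master_url>
<result_name>4534r1-TONI_TST10-0-1-RND0591_4</result_name>
<checkpoint_cpu_time>30.765630</checkpoint_cpu_time>
<checkpoint_elapsed_time>128.215471</checkpoint_elapsed_time>
<fraction_done>0.010989</fraction_done>
<peak_working_set_size>149794816</peak_working_set_size>
<peak_swap_size>139321344</peak_swap_size>
<peak_disk_usage>575110930</peak_disk_usage>
</active_task>"""

MIB = 1024 * 1024  # bytes per MiB

root = ET.fromstring(TASK_STATE)
result_name = root.findtext("result_name")
peak_bytes = int(root.findtext("peak_disk_usage"))
fraction_done = float(root.findtext("fraction_done"))

print(f"task: {result_name}")
print(f"peak disk usage so far: {peak_bytes / MIB:.2f} MiB")
print(f"progress reported: {fraction_done * 100:.3f}%")
```

The fraction_done value of 0.010989 is about 1.1%, which is the same progress figure that was reported as stuck earlier in this thread.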







Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51315 - Posted: 14 Jan 2019 | 5:56:07 UTC - in response to Message 51313.

Give it a try, if you like.

Something similar to this was done in the past for oversized output files. It worked.

Thanks for the hints.
I remember a similar procedure for LHC/ATLAS tasks sometime last year, when some of them had an "exit_disk_limit_exceeded" problem.
However, getting this done was somewhat tricky; from what I remember, one had to do something like stopping the task, even shutting down BOINC, and then increasing the figure (by quite an amount).

However, at this point I cannot try anything, since no betas are available.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Message 51317 - Posted: 14 Jan 2019 | 13:06:15 UTC - in response to Message 51313.

Give it a try, if you like.

Something similar to this was done in the past for oversized output files. It worked.


I don't like to use tricks. I prefer a bugfixed version of the app.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51318 - Posted: 14 Jan 2019 | 14:24:04 UTC - in response to Message 51317.
Last modified: 14 Jan 2019 | 14:24:18 UTC

I prefer a bugfixed version of the app.

me, too !

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 51319 - Posted: 14 Jan 2019 | 15:07:37 UTC - in response to Message 51318.

Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail.
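
For comparison with the 30% figure, here is a small Python sketch that turns the counts volunteers reported earlier in this thread into failure rates. The labels are just the posters' names, and the counts cover all failures on a handful of hosts, not only disk-limit errors, so treat it as a rough cross-check and not as project-wide statistics.

```python
# (failed, total) as reported by volunteers earlier in this thread.
reports = {
    "Phil": (105, 188),               # 105 errors out of 188 WUs
    "Aleksey Belkov": (30 - 14, 30),  # 14 of 30 completed
    "Erich56": (20 - 2, 20),          # 2 of 20 succeeded
}

rates = {name: failed / total for name, (failed, total) in reports.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} of tasks failed")
```

Individual hosts can sit well above or below the overall average, which may be part of why the experience differs so much between posters.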

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51321 - Posted: 14 Jan 2019 | 19:35:06 UTC - in response to Message 51319.

I will raise the limit for the next batch ...

Toni, any rough idea when the next batch will be sent out?

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51322 - Posted: 15 Jan 2019 | 12:57:18 UTC

half an hour ago, I got a few tasks downloaded. So let's see whether they will succeed this time, or fail again.

However, what I notice is that although my settings in the app_config.xml are for 2 cores (which is shown in brackets in the BOINC Manager), the Windows Task Manager shows that only 1 core is being used by Python. Any explanation for this discrepancy?

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51323 - Posted: 15 Jan 2019 | 14:52:28 UTC - in response to Message 51322.

The first task from today finished successfully after some 4,250 secs, which is very nice :-)
http://gpugrid.net/result.php?resultid=20386975

Right thereafter, the next one got started, and again, although it should be 2-core, only one core is used.

Has anyone else had the same experience?

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51327 - Posted: 15 Jan 2019 | 17:28:13 UTC - in response to Message 51319.
Last modified: 15 Jan 2019 | 17:28:49 UTC

Toni wrote

Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail.

1 of 4 tasks which I downloaded and crunched this afternoon showed the
"196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" error
again. How come?

http://gpugrid.net/result.php?resultid=20388212

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Level
Tyr
Message 51333 - Posted: 15 Jan 2019 | 23:51:53 UTC - in response to Message 51327.

Toni wrote
Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail.

1 of 4 tasks which I downloaded and crunched this afternoon showed the
"196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" error
again. How come?

http://gpugrid.net/result.php?resultid=20388212


That's because the WUs are from the current batch, not the next batch. You lucked out and got 3 good ones.


Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51342 - Posted: 17 Jan 2019 | 5:50:35 UTC - in response to Message 51333.

Toni wrote
Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail.

1 of 4 tasks which I downloaded and crunched this afternoon showed the
"196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" error
again. How come?

http://gpugrid.net/result.php?resultid=20388212


That's because the WUs are from the current batch, not the next batch. You lucked out and got 3 good ones.


Yesterday, another 3 tasks were downloaded; all of them failed. Total wasted CPU time: about 20,000 secs.
Obviously, all these tasks are still from the current faulty batch.

Question for Toni: Why are these tasks not withdrawn? And when will the next batch with the increased disk limit be sent out?
I hate to waste that much of my CPU work.

Bedrich Hajek
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Level
Tyr
Message 51344 - Posted: 17 Jan 2019 | 11:45:28 UTC - in response to Message 51342.
Last modified: 17 Jan 2019 | 11:46:58 UTC

Toni wrote
Approx. 30% of the WUs are outside the disk limits as it was noted. I will raise the limit for the next batch. For the time being, just let them fail.

1 of 4 tasks which I downloaded and crunched this afternoon showed the
"196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" error
again. How come?

http://gpugrid.net/result.php?resultid=20388212


That's because the WUs are from the current batch, not the next batch. You lucked out and got 3 good ones.


Yesterday, another 3 tasks were downloaded; all of them failed. Total wasted CPU time: about 20,000 secs.
Obviously, all these tasks are still from the current faulty batch.

Question for Toni: Why are these tasks not withdrawn? And when will the next batch with the increased disk limit be sent out?
I hate to waste that much of my CPU work.



Actually, it is better not to cancel this batch and to let it run its course until the WUs all become "Too many errors (may have bug)" WUs, because in time they will stop being posted. If you cancel them before that, the WUs will stay posted forever. It is a fault in the system. You may have noticed errors from years ago that are still posted.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51345 - Posted: 17 Jan 2019 | 12:26:11 UTC - in response to Message 51344.

okay, I understand.

So, the question (for me) now is: how many more WUs are in the current batch?
Will they be sent out during one more week, one more month ... ?

And when will the new batch be launched?

As said before: I do understand that this is beta, and WUs can fail.
But knowing that, due to the too-low disk limit setting in the current batch, most of the WUs are bound to fail, it doesn't make a whole lot of sense to me to download them and waste quite some CPU time.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51359 - Posted: 19 Jan 2019 | 12:36:15 UTC

Although no "unsent" tasks were shown in the Project Status Page this morning, my PC downloaded 3 WUs.
Again, all 3 failed with the disk limit error, after a total runtime of about 19,000 secs.
Failure rate: 100% :-(((

Hence, it would really be great to know for how much longer these faulty WUs from the "old" batch will be sent out, and when the new batch will be published.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51360 - Posted: 19 Jan 2019 | 13:59:53 UTC

A few minutes ago, the fourth WU within a short time failed.
I am giving up now and switched them off in the settings. It's a waste of CPU time.

Perhaps Toni or someone else from the GPUGRID team could post a short message when the new batch which is supposed not to contain this disk limit error will be available.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Message 51372 - Posted: 25 Jan 2019 | 11:00:01 UTC - in response to Message 51360.

Perhaps Toni or someone else from the GPUGRID team could post a short message when the new batch which is supposed not to contain this disk limit error will be available.


It seems there is no hurry...

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51377 - Posted: 26 Jan 2019 | 14:29:10 UTC - in response to Message 51372.

Perhaps Toni or someone else from the GPUGRID team could post a short message when the new batch which is supposed not to contain this disk limit error will be available.


It seems there is no hurry...

I find it a pity that the GPUGRID people obviously don't invest more effort in developing a well-functioning Windows app for QC.
By this, they forego huge computational power.

A look at the project status page shows well what I am talking about:
presently, there are some 175,000 unsent tasks, which only 100 Linux users are taking care of.
From what can be seen when enough GPU tasks are available, there are about 1000 users crunching them, mostly on Windows of course.
I imagine that crunching CPU tasks on Windows would attract even more people, perhaps double that number or even more (even if the number were only 1000, this would mean a tenfold increase over what it is now under Linux).

So, instead of having 100 users crunching QC on Linux, there would maybe be 2000 people crunching QC on Windows.

Hence, I repeat my statement from above: they forego huge computational power. Why so?

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Message 51381 - Posted: 26 Jan 2019 | 20:12:34 UTC

We have stated this in the past but I'll explain it here again. GPUGRID is related to the lab. In the lab we don't have a C++ developer, they are all chemists, biologists and statisticians. Best bet would be me as a computer scientist but I've been busy with many other things recently. So the development of the Windows app has fallen on Raimondas who is very nice to do this as he works in a company (Acellera) which obviously has clients to take care of and does not have an immediate gain from making this work on GPUGRID and Toni who tests it out of personal interest. The current paper we are working on takes priority so once it is finished which is supposed to be this week Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off.
Patience please, we will make it.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51386 - Posted: 27 Jan 2019 | 15:56:59 UTC - in response to Message 51381.

... Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off.
Patience please, we will make it.

thanks for the thorough information, this helps us to understand what's going on.
So we'll keep our fingers crossed that a QC app for Windows will be available soon :-)
My guess is that many crunchers will use it.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 51388 - Posted: 27 Jan 2019 | 16:44:30 UTC - in response to Message 51386.

My guess is that many crunchers will use it.

Please, see if you can run their servers dry. Then I can put my Linux machine on something else. I like to use them where they are needed most.

biodoc
Joined: 26 Aug 08
Posts: 183
Credit: 6,493,864,375
RAC: 2,796,812
Level
Tyr
Message 51390 - Posted: 27 Jan 2019 | 21:11:26 UTC

I think the Quantum Chem app should be a subproject and then there should be badges. If you offer badges, they will come.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Message 51395 - Posted: 28 Jan 2019 | 17:22:11 UTC - in response to Message 51386.

thanks for the thorough information, this helps us to understand what's going on.


+1. I didn't know the QC development situation.
Another solution is to open the QC code (if it is possible) to community.
TN-Grid, for example, had a lot of help from two volunteers that helped the team to introduce the sse/avx extensions

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 51396 - Posted: 28 Jan 2019 | 18:13:28 UTC - in response to Message 51395.

TN-Grid, for example, had a lot of help from two volunteers that helped the team to introduce the sse/avx extensions

I think on every project where I have seen sse/avx, etc., it has been the volunteers that do it. It takes special expertise. Also, there has been at least one project (POEM, if you remember it) where a volunteer did an outstanding job with an OpenCL app for GPUs, though I don't have his name at the moment. And a very capable developer has done both CUDA and OpenCL apps for XANSONS for COD.

If you ever visit Rosetta, you will quickly find out that rjs5 knows a whole lot about extensions and parallelism; I have seen him at Einstein too.
If you put out a call, they will come.

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Message 51401 - Posted: 29 Jan 2019 | 7:40:24 UTC - in response to Message 51396.

If you ever visit Rosetta, you will quickly find out that rjs5 knows a whole lot about extensions and parallelism;

I know rjs5 (I chatted privately with him some time ago).
He has deep knowledge of C, C++ and optimizations.

If you put out a call, they will come.

I don't know. For example, in Rosetta, the developers seem not to be interested in optimizations (and they also have restrictions on opening the code).

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 51402 - Posted: 29 Jan 2019 | 11:38:21 UTC - in response to Message 51401.

Actually the current beta QC version should not be too different from the final one. One common reason for failure is that the computations need huge temporary files. We can't make them need less space, but the disk limit will be raised (this occurred for the linux app too).

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51404 - Posted: 29 Jan 2019 | 14:12:46 UTC - in response to Message 51402.

I just got another test WU - and again, it failed with the
"196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" error:
http://gpugrid.net/result.php?resultid=20424952

So, obviously the disk limit was not raised enough :-(

Zalster
Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Level
Arg
Message 51405 - Posted: 29 Jan 2019 | 14:25:28 UTC - in response to Message 51402.

Actually the current beta QC version should not be too different from the final one. One common reason for failure is that the computations need huge temporary files. We can't make them need less space, but the disk limit will be raised (this occurred for the linux app too).



Will these be released again for Linux? What was the limit raised to?

I installed 4TB HDD into 3 machines in anticipation of these tasks. Would like to see how they run.

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 51406 - Posted: 29 Jan 2019 | 16:58:59 UTC - in response to Message 51405.

I installed 4TB HDD into 3 machines in anticipation of these tasks. Would like to see how they run.

Me too. I am putting 500 GB on all my new machines (and some old ones too). If that isn't enough, I can do more.

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51407 - Posted: 29 Jan 2019 | 17:44:11 UTC

How much disk space do the QC tasks actually need, on average?

Stefan
Project administrator
Project developer
Project tester
Project scientist
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Message 51409 - Posted: 29 Jan 2019 | 20:42:56 UTC - in response to Message 51395.
Last modified: 29 Jan 2019 | 20:43:09 UTC

@[VENETO] boboviz: The code is open source https://github.com/psi4/psi4/

biodoc
Joined: 26 Aug 08
Posts: 183
Credit: 6,493,864,375
RAC: 2,796,812
Level
Tyr
Message 51410 - Posted: 29 Jan 2019 | 21:15:07 UTC - in response to Message 51407.
Last modified: 29 Jan 2019 | 21:18:43 UTC

I've got 2 linux machines running QC tasks.

My 40 thread machine is running 10 tasks at once with 4 threads per task. According to boinc, GPUGrid is using 18.8 GB of disk space.

My other machine is running 4 tasks simultaneously at 4 threads per task and that one is taking 9.19 GB of disk space.

The linux app (3.31) is running very smoothly these days. Kudos to Toni et al for fixing the bugs of the earlier versions.
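
As a rough answer to the question about average disk needs, the two figures above can be divided by the number of running tasks. This is only a sketch: it assumes the usage BOINC reports is spread evenly over the running tasks, and it is a snapshot at one moment, not the peak that would trigger a disk limit. The host labels are mine.

```python
# (disk space BOINC reports for GPUGrid in GB, tasks running at once),
# taken from the two Linux hosts described above.
hosts = {
    "40-thread host": (18.8, 10),
    "second host": (9.19, 4),
}

per_task = {name: gb / tasks for name, (gb, tasks) in hosts.items()}

for name, gb in per_task.items():
    print(f"{name}: about {gb:.2f} GB per running task")
```

Both hosts come out at roughly 2 GB per running task at the time of the snapshot.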

[VENETO] boboviz
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Message 51411 - Posted: 30 Jan 2019 | 8:01:11 UTC - in response to Message 51409.

@[VENETO] boboviz: The code is open source https://github.com/psi4/psi4/



Great!!!

Erich56
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Message 51572 - Posted: 25 Feb 2019 | 11:51:20 UTC - in response to Message 51381.

About a month ago, Stefan wrote:

We have stated this in the past but I'll explain it here again. GPUGRID is related to the lab. In the lab we don't have a C++ developer, they are all chemists, biologists and statisticians. Best bet would be me as a computer scientist but I've been busy with many other things recently. So the development of the Windows app has fallen on Raimondas who is very nice to do this as he works in a company (Acellera) which obviously has clients to take care of and does not have an immediate gain from making this work on GPUGRID and Toni who tests it out of personal interest. The current paper we are working on takes priority so once it is finished which is supposed to be this week Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off.
Patience please, we will make it.


My posting now is not to push or pester, just asking for information on the current status of the project.
Many thanks in advance -

PappaLitto
Joined: 21 Mar 16
Posts: 511
Credit: 4,617,042,755
RAC: 0
Level
Arg
Message 51573 - Posted: 25 Feb 2019 | 11:56:09 UTC - in response to Message 51572.

I highly doubt anything has changed as they still don't have a C++ developer. I would just lay low and let it happen when it happens.

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Scientific publications
wat
Message 51578 - Posted: 27 Feb 2019 | 15:27:14 UTC - in response to Message 51573.

I highly doubt anything has changed as they still don't have a C++ developer.


Is there no student, graduate student, PhD, etc. at the university who can develop in C++?
I'll wait for the Windows app; in the meantime I'll crunch with my Linux VM.

captainjack
Send message
Joined: 9 May 13
Posts: 171
Credit: 2,321,104,288
RAC: 2,339,714
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 51653 - Posted: 28 Mar 2019 | 13:04:30 UTC

Got this error message on the current batch of QC test for Windows:

3/28/2019 7:47:29 AM | GPUGRID | Aborting task 1445r30-TONI_TST11-0-1-RND5643_0: exceeded disk limit: 44800.51MB > 28610.23MB

captainjack
Send message
Joined: 9 May 13
Posts: 171
Credit: 2,321,104,288
RAC: 2,339,714
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 51658 - Posted: 28 Mar 2019 | 14:26:44 UTC

And another:

3/28/2019 9:07:38 AM | GPUGRID | Aborting task 1445r7-TONI_TST11-0-1-RND1466_0: exceeded disk limit: 44800.51MB > 28610.23MB

Please let us know when this is fixed and I will start up some more test tasks.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Scientific publications
watwatwatwat
Message 51659 - Posted: 28 Mar 2019 | 14:33:09 UTC - in response to Message 51658.

For now it does not seem to be a widespread problem. Can you reset the project?

captainjack
Send message
Joined: 9 May 13
Posts: 171
Credit: 2,321,104,288
RAC: 2,339,714
Level
Phe
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 51661 - Posted: 28 Mar 2019 | 16:02:46 UTC

Did a project reset and started three more tasks. Two tasks finished successfully and the other aborted with:

3/28/2019 10:37:48 AM | GPUGRID | Aborting task 170r5-TONI_TST11-0-1-RND6393_2: exceeded disk limit: 34955.30MB > 28610.23MB


As a separate test, I tried the tasks on a different Windows PC number 489422. All four tasks aborted with the following message:

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
10:28:17 (15156): wrapper (7.9.26016): starting
10:28:17 (15156): wrapper: running install_miniconda.bat ()
10:28:18 (15156): install_miniconda.bat exited; CPU time 0.000000
10:28:18 (15156): app exit status: 0x1
10:28:18 (15156): called boinc_finish(195)


I did a project reset and got the same result. Am I missing something?

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Scientific publications
wat
Message 51662 - Posted: 28 Mar 2019 | 16:54:51 UTC - in response to Message 51659.

Can you reset the project?


I tried on a new machine (Intel Xeon and Windows 7).
All errors:
17:51:36 (6372): install_miniconda.bat exited; CPU time 40.357459
17:51:36 (6372): wrapper: running .\qmml3\python.exe (run.py)
Traceback (most recent call last):
File "C:\ProgramData\BOINC\slots\0\qmml3\lib\site-packages\psi4\__init__.py", line 55, in <module>
from . import core
ImportError: DLL load failed with error code -1073741795

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "run.py", line 14, in <module>
import psi4
File "C:\ProgramData\BOINC\slots\0\qmml3\lib\site-packages\psi4\__init__.py", line 60, in <module>
raise ImportError("{0}".format(err))
ImportError: DLL load failed with error code -1073741795
17:51:37 (6372): .\qmml3\python.exe exited; CPU time 0.000000
17:51:37 (6372): app exit status: 0x1
17:51:37 (6372): called boinc_finish(195)

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Scientific publications
wat
Message 51665 - Posted: 28 Mar 2019 | 20:12:12 UTC - in response to Message 51662.

I tried on a new machine (Intel Xeon and Windows 7).
All errors:



That's strange. No errors on my other machine (other Xeon) with Win10 (1809 version) 64 bit.

Bedrich Hajek
Send message
Joined: 28 Mar 09
Posts: 467
Credit: 8,185,871,966
RAC: 10,562,702
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 51666 - Posted: 28 Mar 2019 | 23:46:33 UTC

On my Windows 7 computer, I am getting "195 (0xc3) EXIT_CHILD_FAILED" on all the beta WUs.

http://www.gpugrid.net/result.php?resultid=20741510


On my Windows 10 computer, some WUs succeed, while others have the "196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED" error.

http://www.gpugrid.net/results.php?hostid=263612&offset=0&show_names=0&state=0&appid=35
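
For reference, the two exit codes quoted in this post can be kept in a small lookup so that log output is easier to read. This is only a sketch: the table holds just the two code/name pairs reported above, and the helper name `describe` is made up for illustration.

```python
# Exit codes quoted in this thread, with the names shown on the result pages.
# This is deliberately not a complete list of BOINC exit codes.
EXIT_CODES = {
    195: "EXIT_CHILD_FAILED",
    196: "EXIT_DISK_LIMIT_EXCEEDED",
}


def describe(code):
    """Format an exit code as decimal, hex, and name (if known)."""
    name = EXIT_CODES.get(code, "unknown")
    return f"{code} ({code:#x}) {name}"


print(describe(195))
print(describe(196))
```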


Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Scientific publications
watwatwatwat
Message 51670 - Posted: 29 Mar 2019 | 8:25:45 UTC - in response to Message 51666.

The "ImportError" messages (Windows) likely mean that your processor does not support AVX2. This is something we'll try to sort out.
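
A minimal sketch of how the number in the traceback can be read, assuming it is a signed 32-bit Windows status code: reinterpreted as unsigned, -1073741795 is 0xC000001D, which Windows documents as STATUS_ILLEGAL_INSTRUCTION. That would be consistent with the AVX2 explanation. The helper name `to_ntstatus` and the one-entry table are illustrative only.

```python
def to_ntstatus(signed_code):
    """Reinterpret a signed 32-bit error code as an unsigned hex string."""
    return f"0x{signed_code & 0xFFFFFFFF:08X}"


# Only the single status relevant to this thread; not a full NTSTATUS table.
KNOWN_STATUS = {"0xC000001D": "STATUS_ILLEGAL_INSTRUCTION"}

code = to_ntstatus(-1073741795)
print(code, KNOWN_STATUS.get(code, "unknown"))
```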

Another common error is disk space. It's because some molecules we process are indeed too large, and we don't have prior indication on when this occurs.

"Conda timeouts" are also seen. These should be transient, due to the failed connection between your machine and the conda software repository (where we host the packages).

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 51671 - Posted: 29 Mar 2019 | 15:00:27 UTC - in response to Message 51670.
Last modified: 29 Mar 2019 | 15:34:39 UTC

Another common error is disk space. It's because some molecules we process are indeed too large, and we don't have prior indication on when this occurs.

which means that this kind of error cannot be precluded to begin with?

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Send message
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Scientific publications
watwatwatwat
Message 51672 - Posted: 29 Mar 2019 | 16:52:18 UTC - in response to Message 51671.
Last modified: 29 Mar 2019 | 16:52:42 UTC

which means that this kind of error cannot be precluded to begin with?


We'll refine the task sizes with time.

Zalster
Avatar
Send message
Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwat
Message 51673 - Posted: 29 Mar 2019 | 17:01:47 UTC - in response to Message 51672.

"Conda timeouts" are also seen. These should be transient, due to the failed connection between your machine and the conda software repository (where we host the packages).


These seem to be the errors on all my QC tasks. No problems with HD space since I replaced the drives with 4 TB HDDs...
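
Since the Conda timeouts quoted above are described as transient, one common way to cope with that kind of failure is to retry with an increasing delay. The sketch below is a generic illustration, not how the GPUGRID wrapper or `install_miniconda.bat` actually behaves; `retry` and `flaky_install` are hypothetical names, and the failing step is simulated.

```python
import time


def retry(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds, doubling the delay after each timeout."""
    for attempt in range(attempts):
        try:
            return action()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller see the timeout
            sleep(base_delay * 2 ** attempt)


# Simulated install step that times out twice before succeeding.
calls = []


def flaky_install():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("conda repository did not respond")
    return "installed"


delays = []
result = retry(flaky_install, sleep=delays.append)  # record delays instead of waiting
print(result, "after", len(calls), "attempts; delays:", delays)
```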

Moises Cardona
Send message
Joined: 7 Jun 10
Posts: 3
Credit: 208,405,467
RAC: 0
Level
Leu
Scientific publications
watwatwatwat
Message 51676 - Posted: 30 Mar 2019 | 15:17:38 UTC

Having the Disk Limit Exceeded error too. I'm using a shucked WD 8TB Helium-filled drive with plenty of space available, so disk space is not an issue here either.

KAMasud
Send message
Joined: 27 Jul 11
Posts: 137
Credit: 523,901,354
RAC: 20
Level
Lys
Scientific publications
watwat
Message 51679 - Posted: 31 Mar 2019 | 13:17:12 UTC

Why are all but one of my Quantum WUs erroring out? When I read the file, it points towards Miniconda. I updated Anaconda but still get errors.

KAMasud
Send message
Joined: 27 Jul 11
Posts: 137
Credit: 523,901,354
RAC: 20
Level
Lys
Scientific publications
watwat
Message 51680 - Posted: 31 Mar 2019 | 13:18:06 UTC

Why are all but one of my Quantum WUs erroring out? When I read the file, it points towards Miniconda. I updated Anaconda but still get errors.

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 51681 - Posted: 31 Mar 2019 | 14:08:27 UTC - in response to Message 51676.

Having the Disk Limit Exceeded error too. I'm using a shucked WD 8TB Helium-filled drive with plenty of space available, so disk space is not an issue here either.

The failure of the task does not necessarily have to do with the size of the drive. Definitely not in your case (unless you did not allocate enough disk space for BOINC in the BOINC settings).

The task itself includes a disk limit in its parameters, which is set by the project people. And obviously there are (many) cases where that's not enough.
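
To see how far a task overshot its per-task limit, the abort lines posted earlier in this thread can be parsed directly. This is a rough sketch: the regular expression is written against the one log line copied from above, and `parse_disk_abort` is a made-up helper name.

```python
import re

# Log line copied from an earlier post in this thread.
LINE = (
    "3/28/2019 7:47:29 AM | GPUGRID | Aborting task "
    "1445r30-TONI_TST11-0-1-RND5643_0: exceeded disk limit: "
    "44800.51MB > 28610.23MB"
)

PATTERN = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def parse_disk_abort(line):
    """Return (task, used_mb, limit_mb), or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    return m["task"], float(m["used"]), float(m["limit"])


task, used, limit = parse_disk_abort(LINE)
print(f"{task}: used {used:.2f} MB, limit {limit:.2f} MB, over by {used - limit:.2f} MB")
```

The limit of 28610.23 MB is the same in every abort message posted here, which fits a fixed per-task bound rather than anything tied to the size of the drive. If BOINC's "MB" means 2^20 bytes (an assumption), that figure is very close to 30,000,000,000 bytes.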

Ola
Send message
Joined: 8 Apr 18
Posts: 21
Credit: 1,309,700
RAC: 0
Level
Ala
Scientific publications
wat
Message 51686 - Posted: 1 Apr 2019 | 21:13:37 UTC

Both of my tasks reached 1.098% progress in about 3 minutes and then stopped advancing. After about 25 minutes of running idle I aborted the tasks because I realized it made no sense to continue.
My previous beta tasks used to freeze at the same progress. Could it mean that my computer is badly configured?

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 51902 - Posted: 22 May 2019 | 5:42:58 UTC - in response to Message 51381.

On January 26, Stefan wrote:

We have stated this in the past but I'll explain it here again. GPUGRID is related to the lab. In the lab we don't have a C++ developer, they are all chemists, biologists and statisticians. Best bet would be me as a computer scientist but I've been busy with many other things recently. So the development of the Windows app has fallen on Raimondas who is very nice to do this as he works in a company (Acellera) which obviously has clients to take care of and does not have an immediate gain from making this work on GPUGRID and Toni who tests it out of personal interest. The current paper we are working on takes priority so once it is finished which is supposed to be this week Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off.
Patience please, we will make it.

Any new timeline at this point?

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 51950 - Posted: 31 May 2019 | 5:41:58 UTC - in response to Message 51902.

On January 26, Stefan wrote:

We have stated this in the past but I'll explain it here again. GPUGRID is related to the lab. In the lab we don't have a C++ developer, they are all chemists, biologists and statisticians. Best bet would be me as a computer scientist but I've been busy with many other things recently. So the development of the Windows app has fallen on Raimondas who is very nice to do this as he works in a company (Acellera) which obviously has clients to take care of and does not have an immediate gain from making this work on GPUGRID and Toni who tests it out of personal interest. The current paper we are working on takes priority so once it is finished which is supposed to be this week Raimondas and Toni might finish up the app. According to their latest news the app works mostly fine now except lacking some optimization so it should not be too far off.
Patience please, we will make it.

Any new timeline at this point?

Still no answer after one week.
To me it seems that the project has been abandoned, even more so as for quite a while there have not been any QC tasks available at all, not for Linux either.

Would just be great if the GPUGRID people would keep us posted about what's going on.

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Scientific publications
wat
Message 52109 - Posted: 19 Jun 2019 | 14:02:45 UTC - in response to Message 51950.

To me it seems that the project has been abandoned... Would just be great if the GPUGRID people would keep us posted about what's going on.


+1

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 52171 - Posted: 3 Jul 2019 | 5:13:38 UTC
Last modified: 3 Jul 2019 | 5:13:59 UTC

I assume the project is dead, isn't it?

Stefan or Toni, could you please confirm - many thanks.

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 52320 - Posted: 21 Jul 2019 | 5:34:50 UTC - in response to Message 52171.

I assume the project is dead, isn't it?

Stefan or Toni, could you please confirm - many thanks.

It would be nice to get at least some kind of reply, whatever it says.

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 52382 - Posted: 4 Aug 2019 | 5:25:46 UTC - in response to Message 52320.

I assume the project is dead, isn't it?

Stefan or Toni, could you please confirm - many thanks.

It would be nice to get at least some kind of reply, whatever it says.

It's amazing that the GPUGRID team is not willing to give us any information regarding this project.
Why is that so? What's the problem with telling us what's going on?

At any rate, from what it looks: the project is dead. Would just be nice if you people let us volunteers know.

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Scientific publications
wat
Message 52558 - Posted: 3 Sep 2019 | 15:56:10 UTC - in response to Message 52382.

At any rate, from what it looks: the project is dead. Would just be nice if you people let us volunteers know.


And would be nice to know if we volunteers have produced scientific results with our crunching on this app...

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 52559 - Posted: 3 Sep 2019 | 16:27:28 UTC - in response to Message 52558.

At any rate, from what it looks: the project is dead. Would just be nice if you people let us volunteers know.

And would be nice to know if we volunteers have produced scientific results with our crunching on this app...

+ 1

Erich56
Send message
Joined: 1 Jan 15
Posts: 1090
Credit: 6,603,906,926
RAC: 21,893,126
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwat
Message 52776 - Posted: 4 Oct 2019 | 15:16:51 UTC - in response to Message 52559.

At any rate, from what it looks: the project is dead. Would just be nice if you people let us volunteers know.

And would be nice to know if we volunteers have produced scientific results with our crunching on this app...

+ 1


+ 1

[VENETO] boboviz
Send message
Joined: 10 Sep 10
Posts: 158
Credit: 388,132
RAC: 0
Level

Scientific publications
wat
Message 53380 - Posted: 21 Dec 2019 | 11:33:41 UTC

I created a dedicated VM for this sub-project and I crunched.
The development of Psi4 is constant.
But here there are no answers and no news.
It's frustrating.
I'll delete the VM.

Zalster
Avatar
Send message
Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwat
Message 53386 - Posted: 21 Dec 2019 | 19:17:12 UTC - in response to Message 53380.

If you are looking for QC for CPU only, there is a new project on BOINC called

https://quchempedia.univ-angers.fr/athome/

I figure I'll give them my CPUs until this subproject restarts.
