More CPU jobs

Message boards : News : More CPU jobs

tullio

Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Message 50394 - Posted: 4 Sep 2018, 16:20:32 UTC - in response to Message 50393.  

Stefan, I see 4 plus one which says psi.30091.clean
Tullio
ID: 50394 · Rating: 0
Stefan
Project administrator
Project developer
Project tester
Project scientist

Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Message 50397 - Posted: 5 Sep 2018, 11:34:11 UTC

I'm fixing an issue with SELE2, so I cancelled them and will send out SELE3 in a bit.
ID: 50397 · Rating: 0
Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 295,172
Message 50398 - Posted: 5 Sep 2018, 12:17:35 UTC - in response to Message 50397.  

I'm fixing an issue with SELE2, so I cancelled them and will send out SELE3 in a bit.

Is this related to the 'upload failure - file size too big' problem reported for SELE2 last week? Either way, please double-check the <max_nbytes> value for the new batch.
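For anyone wondering what that check amounts to, here is a minimal sketch in Python. It assumes the limit lives in a <max_nbytes> element of the job's output template; the template fragment, the file name and the 5,000,000-byte value are all hypothetical, and treating "too big" as strictly greater than the limit is my assumption, not something confirmed in this thread.

```python
import re

# Hypothetical, simplified fragment of an output template (not the real one).
TEMPLATE = """
<file_info>
    <name>output_0</name>
    <max_nbytes>5000000</max_nbytes>
</file_info>
"""

def max_nbytes_limits(template_text):
    """Return every <max_nbytes> value found in the text, as integers."""
    pattern = r"<max_nbytes>\s*(\d+)\s*</max_nbytes>"
    return [int(value) for value in re.findall(pattern, template_text)]

def exceeds_limit(file_size_bytes, limit_bytes):
    """True if a file of this size is over the limit (assumed: strictly greater)."""
    return file_size_bytes > limit_bytes

limits = max_nbytes_limits(TEMPLATE)
print(limits)
print(exceeds_limit(6_000_000, limits[0]))
```

Comparing the largest output file seen in a test run against the configured limit before releasing a batch would catch this kind of failure early.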
ID: 50398 · Rating: 0
Chilean

Joined: 8 Oct 12
Posts: 98
Credit: 385,652,461
RAC: 0
Message 50399 - Posted: 5 Sep 2018, 12:47:46 UTC

I had to add WCG to this 48-thread beast because it isn't using all of the threads at 100% when running GPUGRID only. I'd wager it's because of a scratch-space bottleneck (it's running on an SSD though, 200 MB/s according to hdparm)...?
ID: 50399 · Rating: 0
Stefan
Project administrator
Project developer
Project tester
Project scientist

Message 50400 - Posted: 5 Sep 2018, 12:52:31 UTC

No, the issue was with an old version of psi4 giving wrong results on large molecules when using the scratch space. This is fixed in the latest version now.
ID: 50400 · Rating: 0
Chilean

Message 50405 - Posted: 5 Sep 2018, 14:55:45 UTC - in response to Message 50400.  

No, the issue was with an old version of psi4 giving wrong results on large molecules when using the scratch space. This is fixed in the latest version now.


I'll set WCG to not allow new work and I'll report back!
ID: 50405 · Rating: 0
tullio

Message 50406 - Posted: 5 Sep 2018, 16:58:23 UTC

I am running 3.31 SELE6.
Tullio
ID: 50406 · Rating: 0
tullio

Message 50408 - Posted: 5 Sep 2018, 17:55:26 UTC

Something is wrong. The BOINC Manager says it is running but python does not appear in the "top" console.
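One way to double-check this outside of "top" is to scan the process list directly. The sketch below is only an illustration for Linux: it reads the per-process cmdline files under /proc and looks for a keyword. The function name and the keywords are my own choices, and the science app may well show up under a name other than python (psi4, for example), so that is an assumption too.

```python
import os

def find_processes(keyword, proc_root="/proc"):
    """Return (pid, command line) pairs whose command line contains keyword."""
    matches = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        path = os.path.join(proc_root, entry, "cmdline")
        try:
            with open(path, "rb") as handle:
                raw = handle.read()
        except OSError:
            continue  # process exited or is not readable
        # Arguments in cmdline are separated by NUL bytes.
        cmdline = raw.replace(b"\x00", b" ").decode(errors="replace").strip()
        if keyword in cmdline:
            matches.append((int(entry), cmdline))
    return sorted(matches)

if os.path.isdir("/proc"):
    for pid, cmdline in find_processes("python"):
        print(pid, cmdline)
```

If nothing matches while BOINC Manager still shows the task as running, the task may be stuck in the wrapper's setup step rather than in the calculation itself.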

ID: 50408 · Rating: 0
Zalster

Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Message 50409 - Posted: 5 Sep 2018, 18:23:11 UTC - in response to Message 50408.  

I just had about 20 of these fly through before the problem was corrected and they started to run correctly.

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
10:09:04 (14352): wrapper (7.7.26016): starting
10:09:04 (14352): wrapper (7.7.26016): starting
10:09:04 (14352): wrapper: running /usr/bin/flock (/home/zalster/Desktop/BOINC/projects/www.gpugrid.net/miniconda.lock -c "/bin/bash ./miniconda-installer.sh -b -u -p /home/zalster/Desktop/BOINC/projects/www.gpugrid.net/miniconda &&
/home/zalster/Desktop/BOINC/projects/www.gpugrid.net/miniconda/bin/conda install -m -y -n qmml2 --override-channels -c defaults -c gpugrid --file requirements.txt ")
Python 3.6.5 :: Anaconda, Inc.

PackagesNotFoundError: The following packages are not available from current channels:

- psi4==1.2.1

Current channels:

- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/free/linux-64
- https://repo.anaconda.com/pkgs/free/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
- https://repo.anaconda.com/pkgs/pro/linux-64
- https://repo.anaconda.com/pkgs/pro/noarch
- https://conda.anaconda.org/gpugrid/linux-64
- https://conda.anaconda.org/gpugrid/noarch

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.


10:09:21 (14352): /usr/bin/flock exited; CPU time 11.828434
10:09:21 (14352): app exit status: 0x1
10:09:21 (14352): called boinc_finish(195)

</stderr_txt>
]]>
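A side note on the three numbers in "process exited with code 195 (0xc3, -61)": they are the same value written as decimal, as hex, and as a signed 8-bit integer. The small Python sketch below shows the arithmetic; the function name is my own.

```python
def decode_exit_code(code):
    """Return the exit code as (unsigned byte, hex string, signed 8-bit value)."""
    unsigned = code & 0xFF  # keep the low 8 bits
    # Values of 128 and above wrap around to negative numbers in 8-bit two's complement.
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return unsigned, hex(unsigned), signed

print(decode_exit_code(195))  # (195, '0xc3', -61)
```

So 195 - 256 = -61, which matches the log line.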
ID: 50409 · Rating: 0
Stefan
Project administrator
Project developer
Project tester
Project scientist

Message 50413 - Posted: 6 Sep 2018, 7:28:28 UTC - in response to Message 50409.  

Yes, we had to do some testing with SELE3-5. SELE6 ought to work fine though: 1741/88 success/fail ratio.
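As a percentage, assuming the two numbers are counts of successful and failed results, that works out as in this short Python sketch (the function name is my own):

```python
def success_rate(successes, failures):
    """Fraction of results that succeeded."""
    return successes / (successes + failures)

rate = success_rate(1741, 88)
print(f"{rate:.1%}")  # 95.2%
```

That is 1741 out of 1829 results, so a failure rate of just under 5%.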
ID: 50413 · Rating: 0
Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 50416 - Posted: 6 Sep 2018, 8:45:49 UTC - in response to Message 50413.  

Things seem rather stable for SELE6. For further discussion let's please go to the multicore forum.
ID: 50416 · Rating: 0
PappaLitto

Joined: 21 Mar 16
Posts: 513
Credit: 4,673,458,277
RAC: 0
Message 50428 - Posted: 7 Sep 2018, 12:38:28 UTC - in response to Message 49842.  
Last modified: 7 Sep 2018, 13:13:59 UTC

I'm trying out the AMD EPYC trial from Packet, it runs 48 QM WUs at a time... all valid.
To everybody using hyper-threaded CPUs for crunching:
You should test how well the given app scales with HT on or off on your system. The other approach is to leave HT on but lower the percentage of usable CPUs in BOINC Manager (down to 50%). Too many simultaneous memory-intensive apps cause too many cache misses, resulting in degraded combined performance. With HT off (or with the usable CPUs set to 50%), calculation time should be halved (because two threads share one FPU). If it's more than half, then the number of usable CPUs can be increased for as long as the RAC rises accordingly (= in direct ratio).
I can't test it myself until the Windows app has been released, but I'm interested.
A simultaneous GPU task also could degrade the performance of the CPU tasks and vice versa.


Zoltan, I think you have a great point here. I am noticing much higher CPU utilization and half the RAM usage since I switched to 50% CPU in BOINC on these new QC WUs. I think it's mostly due to the much lower hard drive bandwidth required, and perhaps the CPU cache is also allocated more efficiently.
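The rule of thumb in the quoted post can be written down as a throughput comparison. This is only a sketch of that comparison in Python; the function names and the run times in the example are made up for illustration, not measurements from this thread.

```python
def throughput(concurrent_tasks, seconds_per_task):
    """Tasks completed per hour when this many tasks run side by side."""
    return concurrent_tasks * 3600.0 / seconds_per_task

def hyperthreading_helps(tasks_full, secs_full, tasks_half, secs_half):
    """True if running on all logical CPUs finishes more tasks per hour
    than running on half of them."""
    return throughput(tasks_full, secs_full) > throughput(tasks_half, secs_half)

# Hypothetical numbers: 16 tasks at 7200 s each versus 8 tasks at 4000 s each.
# 4000 s is more than half of 7200 s, so using all logical CPUs still wins.
print(hyperthreading_helps(16, 7200, 8, 4000))  # True
# If the half-load run time drops to 3000 s, the 50% setting wins instead.
print(hyperthreading_helps(16, 7200, 8, 3000))  # False
```

In other words, the 50% setting is better whenever the per-task time drops to less than half of what it was with all logical CPUs in use.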
ID: 50428 · Rating: 0
Chilean

Message 50429 - Posted: 7 Sep 2018, 13:31:31 UTC - in response to Message 50428.  

Yup, I added Rosetta and WCG to the mix, and the few GPUGRID WUs run constantly at 400%.
ID: 50429 · Rating: 0
PappaLitto

Message 50435 - Posted: 8 Sep 2018, 3:16:17 UTC

Do you have any tips for getting higher utilization out of these new large-molecule QC WUs? I am already running 4 WUs on a 16-core system, which is 50% usage in BOINC, but the utilization is all over the place. It's using up to 23 GB of RAM (I have 32 GB) with only 4 WUs, and I have plenty of space on the SSD.
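For reference, the memory headroom behind those numbers can be estimated like this. It is a rough Python sketch that assumes the 23 GB peak is split evenly across the 4 WUs, which may not hold for molecules of different sizes; the function name and the optional reserve parameter are my own.

```python
def max_concurrent_wus(total_ram_gb, peak_ram_gb, running_wus, reserve_gb=0.0):
    """Estimate how many WUs fit in RAM, assuming equal peak usage per WU."""
    per_wu_gb = peak_ram_gb / running_wus  # 23 / 4 = 5.75 GB per WU
    return int((total_ram_gb - reserve_gb) // per_wu_gb)

print(max_concurrent_wus(32, 23, 4))                # 5
print(max_concurrent_wus(32, 23, 4, reserve_gb=4))  # 4
```

So by this estimate RAM leaves room for at most one more WU, and none once a few GB are set aside for the rest of the system.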
ID: 50435 · Rating: 0
Zalster

Message 50488 - Posted: 13 Sep 2018, 5:41:11 UTC
Last modified: 13 Sep 2018, 5:42:07 UTC

CPU tasks - unsent: 44,723; in progress: 848; users in last 24hrs: 76


Quantum Chemistry unsent: 13,191 in progress: 866

Looks like we are cutting that number down to size quickly...
ID: 50488 · Rating: 0
PappaLitto

Message 50518 - Posted: 15 Sep 2018, 13:21:38 UTC

QC WUs are almost out; fewer than 600 left to send out.
ID: 50518 · Rating: 0
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 50519 - Posted: 15 Sep 2018, 18:36:11 UTC - in response to Message 50518.  

They may be waiting until the 3.31 jobs finish before introducing the new 3.32 version. I expect they have plenty more.
ID: 50519 · Rating: 0
Zalster

Message 50521 - Posted: 15 Sep 2018, 20:03:22 UTC
Last modified: 15 Sep 2018, 20:03:51 UTC

Hopefully. We are officially out of CPU work.
ID: 50521 · Rating: 0
tullio

Message 50528 - Posted: 17 Sep 2018, 14:13:58 UTC

I am running two resends. One of them failed with a "file too big" error. The other is running.
Tullio
ID: 50528 · Rating: 0
Stefan
Project administrator
Project developer
Project tester
Project scientist

Message 50529 - Posted: 17 Sep 2018, 15:43:45 UTC

I submitted some WUs, but I am warning you :P This batch will use lots of scratch space.
ID: 50529 · Rating: 0

©2025 Universitat Pompeu Fabra