PYSCFbeta: Quantum chemistry calculations on GPU
| Author | Message |
|---|---|
|
Joined: 21 Dec 23 Posts: 51 Credit: 0 RAC: 0 |
Sending out work for this app today. The work units take an hour (very approximately). They should be using different GPUs on multigpu systems. Please let me know if you see anything not working as you would normally expect |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
Everything working as expected at my hosts. Well done! 👍️ |
|
Joined: 18 Mar 10 Posts: 28 Credit: 41,810,583,419 RAC: 13,276
|
Steve, so far the first few tasks are completing and being validated for me on single and multi-GPU systems. |
|
Joined: 3 May 20 Posts: 19 Credit: 1,043,759,208 RAC: 39
|
My host is an R9-3900X, RTX 3070-Ti running ubuntu 20.04.06 LTS but it doesn't receive Quantum chemistry work units. I selected it in the preferences, test work and "ok to send work of other subprojects". Did I miss anything? |
|
Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 428
|
> My host is an R9-3900X, RTX 3070-Ti running ubuntu 20.04.06 LTS but it doesn't receive Quantum chemistry work units. I selected it in the preferences, test work and "ok to send work of other subprojects". Did I miss anything?

I had the same problem until I ticked every available application for the venue, resulting in "(all applications)" showing on the confirmation page. Having cleared that hurdle, I note that the tasks are estimated to run for 1 minute 36 seconds (slower device) and 20 seconds (fastest device). The machines have most recently been running ATMbeta (Python) tasks, and have been left with "Duration Correction Factors" of 0.0148 and 0.0100 as a result. The target value should be 1.0000 in all cases. Please could you keep an eye on the <rsc_fpops_est> value for each workunit type, to try to minimise these large fluctuations when new applications are deployed? |
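To make the arithmetic behind the post above concrete, here is a minimal sketch in Python. It assumes a simplified model in which the client's estimate is `rsc_fpops_est` divided by the device speed and then scaled by the host's Duration Correction Factor (DCF). The function names are hypothetical, and the real BOINC estimator may use additional per-application statistics, so treat this as an illustration only.

```python
def estimated_runtime_s(rsc_fpops_est, device_flops, dcf=1.0):
    """Simplified model (an assumption): work / speed, scaled by the DCF."""
    return rsc_fpops_est / device_flops * dcf


def implied_uncorrected_runtime_s(shown_estimate_s, dcf):
    """Invert the model: what the estimate would be if the DCF were 1.0."""
    return shown_estimate_s / dcf


# Figures quoted in the post: 1 min 36 s at DCF 0.0148, 20 s at DCF 0.0100.
slow = implied_uncorrected_runtime_s(96.0, 0.0148)
fast = implied_uncorrected_runtime_s(20.0, 0.0100)

print(f"slower device, DCF reset to 1.0: {slow / 3600:.2f} h")
print(f"fastest device, DCF reset to 1.0: {fast / 60:.1f} min")
```

Under this model the quoted estimates correspond to roughly 1.8 hours and 33 minutes once the DCF returns to 1.0, which is the same order of magnitude as the "about an hour" given for these work units.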
|
Joined: 18 Mar 10 Posts: 28 Credit: 41,810,583,419 RAC: 13,276
|
Drago, you also need to check the "Run test applications?" box. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
> Sending out work for this app today. The work units take an hour (very approximately). They should be using different GPUs on multigpu systems. Please let me know if you see anything not working as you would normally expect

at least one of my computers is unable to get any tasks. the scheduler just reports that there are no tasks sent. it's inexplicable since it is the exact same configuration as a system that is receiving tasks just fine. they are both on the same venue. and that venue has ALL projects selected, and has both test/beta apps allowed, and both have allow other apps selected. not sure what's going on here. the only difference is one has 4 GPUs and the other has 7.

will get work: https://gpugrid.net/show_host_detail.php?hostid=582493
will not get work: https://gpugrid.net/show_host_detail.php?hostid=605892
|
|
Joined: 3 May 20 Posts: 19 Credit: 1,043,759,208 RAC: 39
|
Yeah! I got all boxes checked but I still don't get work. Maybe it is a problem with the driver? I have version 470 installed which worked fine for me so far... |
|
Joined: 21 Dec 23 Posts: 51 Credit: 0 RAC: 0 |
OK, thanks for this information. There must be something unexpected going on with the scheduler. |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
I ran a couple of tests with these new PYSCFbeta tasks. I stopped two of them, and they restarted without erroring. This is good... but both had their execution times reset and restarted from the beginning. This is not so good... The tests were made on a mixed dual-GPU system (GTX 1660 Ti + GTX 1650). Unlike ACEMD tasks, both tasks were restarted on a different GPU model than the one they started on, and they did not crash. This is good! Also, I've noticed a considerable reduction in power draw compared to ACEMD tasks: on the GTX 1660 Ti, power draw with PYSCFbeta tasks is about half of what I'm used to seeing with ACEMD tasks, and the same happens on the GTX 1650. Consequently, although 100% GPU usage is shown, working temperatures are much lower... |
|
Joined: 21 Dec 23 Posts: 51 Credit: 0 RAC: 0 |
Stopping and resuming is not currently implemented. It will just restart from the beginning. |
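For readers wondering what resuming would involve in general, here is a minimal, generic sketch of step-wise checkpointing in Python. It is not the project's application and does not use the BOINC or PySCF APIs; the file name, the state layout, and the helper names are all hypothetical, and the "work" is a placeholder sum.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temporary file, then atomically swap it into place."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic rename on the same filesystem


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_step": 0, "partial": 0.0}


def run(path, total_steps, stop_after=None):
    """Resume from the checkpoint; optionally stop early to mimic a suspend."""
    state = load_checkpoint(path)
    done_this_run = 0
    for step in range(state["next_step"], total_steps):
        state["partial"] += step  # placeholder for one unit of real work
        state["next_step"] = step + 1
        save_checkpoint(path, state)
        done_this_run += 1
        if stop_after is not None and done_this_run >= stop_after:
            break
    return state


with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "checkpoint.json")
    first = run(ckpt, total_steps=10, stop_after=4)  # interrupted after 4 steps
    second = run(ckpt, total_steps=10)               # picks up at step 4
    print(first["next_step"], second["next_step"], second["partial"])
```

Writing to a temporary file and renaming it means an interruption during the save leaves the previous checkpoint intact. A task that has no such saved state can only restart from the beginning, which matches the behaviour described above.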
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
are you able to inspect the scheduler log from this host? can you see more detail about the specific reason it was not sent any work? the only thing i see on my end is "no tasks sent" with no reason.
|
|
Joined: 22 Oct 09 Posts: 5 Credit: 761,886,074 RAC: 9
|
I have the same problem too: no tasks sent! |
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 69
|
Here is a tale of 2 computers, one that was getting units, and the other was not. https://www.gpugrid.net/hosts_user.php?userid=19626 They both have the same GPUGRID preferences. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
another observation: keep an eye on your CPU use. these look to be another mt+cuda setup that BOINC is not prepared to handle, much like the PythonGPU work. i saw upwards of 30 threads utilized per task, but it wasn't sustained; it would come in bursts. on average, reported cpu_time and runtime were about 4x actual (15min actual would be reported as about an hour of runtime)
|
|
Joined: 21 Dec 23 Posts: 51 Credit: 0 RAC: 0 |
Thanks for listing the host IDs that are not receiving work. I can see them in the scheduler logs, so hopefully I can pinpoint why they are not getting work. And yes, I missed a setting to limit the multi-threading, thanks for catching that! (All the modern libraries try very hard to multi-thread without telling you they are going to, haha.) |
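The post does not say which setting was missed, so the following is only a sketch of one common way to cap implicit multi-threading in numerical Python code: setting the thread-count environment variables read by OpenMP, OpenBLAS, MKL, and numexpr before the libraries load. The helper name and the choice of four threads are assumptions for illustration.

```python
import os

# Environment variables commonly honoured by threaded numerical libraries.
THREAD_ENV_VARS = (
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
    "NUMEXPR_NUM_THREADS",
)


def limited_thread_env(n_threads, base_env=None):
    """Return a copy of the environment with thread counts capped at n_threads."""
    env = dict(os.environ if base_env is None else base_env)
    for name in THREAD_ENV_VARS:
        env[name] = str(n_threads)
    return env


# Hypothetical usage: pass this environment to the worker process, e.g.
#   subprocess.run([...], env=env)
env = limited_thread_env(4)
print({name: env[name] for name in THREAD_ENV_VARS})
```

These variables generally need to be in place before the numerical libraries are imported, which is why the sketch builds an environment for a child process instead of changing settings after start-up.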
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
i think if you get a discrete check box selection in the project preferences for QChem on GPU, that will solve the issues of requesting work for this project.
|
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
Thank you to whoever got the discrete checkbox implemented in the settings :). this should make getting work much easier.
|
|
Joined: 21 Dec 23 Posts: 51 Credit: 0 RAC: 0 |
The app will now appear in the GPUGRID preferences as "Quantum chemistry on GPU (beta)". Previous scheduler problems should be fixed. (I can see that https://gpugrid.net/results.php?hostid=605892 is now getting jobs when before it was not.) |
|
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 69
|
> Here is a tale of 2 computers, one that was getting units, and the other was not.

I am getting tasks on both computers now. So far, all tasks are completing successfully. |
©2025 Universitat Pompeu Fabra