*CXCL12_chalcone_umbrella* batch
Author | Message |
---|---|
Joined: 26 Mar 14 Posts: 101 Credit: 0 RAC: 0 |
Hi everyone, yesterday we launched a bit more than 13,000 short WUs called *CXCL12_chalcone_umbrella*. They are pretty small WUs of 8 ns (compared to a normal long WU of ~40 ns), and we hope they are easy and fun for you to crunch. Please post any problems you encounter. Scientists have been using a technique called Umbrella Sampling for some time now, with relative success in determining what we call the binding free energy (which indicates how strongly a drug can bind its protein target). It is a pretty straightforward and much less expensive technique than the one we regularly use in our lab (adaptive sampling). However, it is particularly error-prone if the assumptions we make are wrong. We are particularly excited about these WUs because, while most scientific effort has been focused on reproducing free energies for single particular models or (most of the time) very simple toy models, this is the first time to our knowledge that, thanks to the fantastic community we have built together in GPUGRID, we can use this technique in a real case to screen potential drugs binding to CXCL12, a chemokine related to cancer metastasis. Thanks to everyone for your contribution, and I will be happy to assist you if you run into any problem on the way. :) |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
Hi Gerard, these workunits, just like the "GERARD_A2AR_NUL1D" (long) workunits, are running with very low GPU load (32~53%), especially when the CPU is crunching too. Can you comment on this? |
Joined: 20 Jul 14 Posts: 732 Credit: 130,089,082 RAC: 0 |
Thanks for the heads-up Gerard! :) [CSF] Thomas H.V. Dupont Founder of the team CRUNCHERS SANS FRONTIERES 2.0 www.crunchersansfrontieres |
Joined: 26 Mar 14 Posts: 101 Credit: 0 RAC: 0 |
Hi Retvari, I've seen this same question a lot in the forums, so I think you deserve a concrete answer. The workunits you refer to include "extra" forces that, because of how the simulator is implemented, cannot be calculated on the GPU and must be calculated on the CPU. My guess is that the communication between CPU and GPU at each simulation step is the bottleneck causing the drop in GPU performance. I've made some figures for the curious ones, explaining these "extra forces". *A2AR* batch (figure): These are membrane systems. You can see the protein in yellow, embedded in a membrane (sticks) and solvated in a water box (blue cage). In these models, we place a drug (marked as "ligand") in the extracellular space, where it has to bind to its receptor (the protein). Because of the way we simulate, these systems have a property we call "periodic boundary conditions", meaning that molecules at the sides of the box interact with, and can actually cross over to, the opposite side of the box. This has the nice effect that the system behaves as if it were solvated in a "non-finite" box. Because of this effect, the ligand can jump from one side of the box to the other. However, this doesn't make any biological sense, because drugs placed in the extracellular space can't freely access the intracellular space (they can't jump the membrane!). To ensure that the ligand stays in the extracellular phase, for each simulation frame we calculate the position of the ligand, and if it is higher than the red line we apply a downward force to make it stay. This arbitrary "force" is considered extra and must be calculated on the CPU. *umbrella* batch (figure): In this case, we use a technique called "Umbrella Sampling" to calculate the binding free energy of the ligand (in yellow) to the protein (in blue). This technique assumes that the unbinding (or the binding) occurs along a linear path (green line), and what we do is simulate the ligand at different positions along this pathway (ideally every 0.5 angstroms). After the simulation, we calculate how stable the ligand was at each of the positions, and using a mathematical framework called WHAM (the Weighted Histogram Analysis Method) we can calculate the probabilities that the ligand goes from unbound to bound. Now comes the "extra force": in order to force the ligand to sample the desired position along the pathway, we apply a force (in red) to make it stay there. This force increases as the ligand moves further away from the initial position, giving a potential profile shaped like an inverted umbrella (in black). This is why it is called "Umbrella Sampling". Again, this "extra force" must be calculated on the CPU... I hope you found this useful or interesting! If you have any questions, please do not hesitate to ask. :) EDIT: I have revised the A2AR_1D batch because it was indeed running much slower than the other A2AR batches. I've made a change that should speed up any new 1D WUs you get. I'll be sending the *1Dx* batch soon; tell me if it is faster for you guys. |
Joined: 5 Jan 09 Posts: 670 Credit: 2,498,095,550 RAC: 0 |
Thanks for the explanation and your time, Gerard. It is what makes me want to crunch GPUGrid, unlike other projects that explain nothing. |
Joined: 3 Sep 14 Posts: 152 Credit: 918,557,369 RAC: 21,054 |
Great news! Thank you! |
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 |
Very good explanation Gerard, now we know what we crunch. Thank you. Greetings from TJ |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
Thank you, Gerard |
Joined: 20 Jul 14 Posts: 732 Credit: 130,089,082 RAC: 0 |
Thanks for your time Gerard! Really appreciated! :) [CSF] Thomas H.V. Dupont Founder of the team CRUNCHERS SANS FRONTIERES 2.0 www.crunchersansfrontieres |
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 |
My statistics are still rather thin, but with one GERARD_CXCL12_chalcone_umbrella completed on each of four Maxwell cards: GTX 960: GPU load 34% (supported by 1 core of i7-4790); MCL: 16%; power: 33.3% TDP = 40 watts; time: 4 hours 11 minutes (average of the two cards, 2 work units). GTX 750 Ti: GPU load 49% (supported by 1 core of i7-4770); MCL: 10%; power: 31.3% TDP = 19 watts; time: 5 hours 6 minutes (average of the two cards, 2 work units). None of the cards were overclocked by me, and they are only minimally factory-overclocked. They were so lightly loaded that they usually ran at less than their maximum GPU clock settings. So it seems that the GTX 750 Ti is more efficient, running at about 80% of the speed of the GTX 960 but using only about 50% of the power. |
Joined: 25 Mar 12 Posts: 103 Credit: 14,948,929,771 RAC: 12,866 |
Thanks for the explanation, Gerard! This one is failing on all hosts at upload, after completing correctly, with: <error_code>-131 (file size too big)</error_code> https://www.gpugrid.net/workunit.php?wuid=11490316 |
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 52,725 |
It looks like your fix is working. Workunit 1: name: e1s18_1-GERARD_A2AR_NUL1Dx2-0-2-RND6828; application: Long runs (8-12 hours on fastest card); created: 29 Feb 2016, 11:06:27 UTC; canonical result: 14973218; granted credit: 227,850.00; minimum quorum: 1; initial replication: 1; max # of error/total/success tasks: 7, 10, 6. Task 14973218: computer 263612; sent 29 Feb 2016, 22:01:58 UTC; reported 1 Mar 2016, 7:13:49 UTC; status: Completed and validated; run time: 26,058.87 s; CPU time: 25,948.73 s; credit: 227,850.00; application: Long runs (8-12 hours on fastest card) v8.48 (cuda65). https://www.gpugrid.net/workunit.php?wuid=11503823 Workunit 2: name: e1s2_1-GERARD_A2AR_luf6632_b_1Dx2-1-2-RND8928; application: Long runs (8-12 hours on fastest card); created: 1 Mar 2016, 8:34:30 UTC; canonical result: 14975190; granted credit: 227,850.00; minimum quorum: 1; initial replication: 1; max # of error/total/success tasks: 7, 10, 6. Task 14975190: computer 263612; sent 1 Mar 2016, 11:12:20 UTC; reported 2 Mar 2016, 0:13:09 UTC; status: Completed and validated; run time: 26,523.80 s; CPU time: 26,430.66 s; credit: 227,850.00; application: Long runs (8-12 hours on fastest card) v8.48 (cuda65). https://www.gpugrid.net/workunit.php?wuid=11504861 |
Joined: 26 Mar 14 Posts: 101 Credit: 0 RAC: 0 |
We've changed the size limit for these WUs. I hope this fixes the problem in the new WUs. Sorry for the inconvenience! |
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 |
"We've changed the size limit for these WUs. I hope this fixes the problem in the new WUs. Sorry for the inconvenience!" Which problem is this intended to fix? This workunit: https://www.gpugrid.net/workunit.php?wuid=11490978 ... had 3 task failures with: <message> upload failure: <file_xfer_error> <file_name>chalcone537x1x47-GERARD_CXCL12_chalcone_umbrella-0-1-RND4302_2_9</file_name> <error_code>-131 (file size too big)</error_code> </file_xfer_error> </message> https://www.gpugrid.net/result.php?resultid=14972428 https://www.gpugrid.net/result.php?resultid=14973031 https://www.gpugrid.net/result.php?resultid=14975154 Why are these failing? |
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 52,725 |
Correction: it should be like this. For these low GPU usage, high CPU usage WUs, if you use this: <app_config> <app> <name>acemdshort</name> <gpu_versions> <gpu_usage>.5</gpu_usage> <cpu_usage>1</cpu_usage> </gpu_versions> </app> </app_config> in your app_config.xml file (a gpu_usage of .5 lets BOINC run two tasks per GPU), you can increase GPU usage from the 30-40% range to the 60-70% range, depending on your hardware and OS. That's what is happening on my computers. |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 |
Also note that this app_config.xml should be placed in the project's folder under the BOINC data folder. For example, on Windows Vista, 7, 8, 8.1 and 10: c:\ProgramData\BOINC\projects\www.gpugrid.net\ and on Windows XP: c:\Documents and Settings\All Users\Application Data\BOINC\projects\www.gpugrid.net\ These short workunits generate very large output files (60~140 MB), sometimes larger than a long run's output, so it's no wonder that the server runs out of space and that some contributors' ADSL connections (like mine) get congested by continuous uploads. So if these workunits are going to be the "standard", then the 2 workunits/GPU limit should be raised to 3 per GPU. |
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 52,725 |
The website access is also slow at times. I hope these WUs do not become the "standard". That would be a total waste of high end GPU video cards. |
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
2 umbrella models running on GTX 970s (1 task per GPU): GPU power @ 83% & 84%; GPU0 @ 1316MHz, GPU1 @ 1050MHz (it downclocked itself); temps @ 59°C and 50°C (with power limited to 83%, now at 90%); GPU usage was 43% and 33%; ~600MB GDDR each. A few suggestions: If trying to run 2 Umbrella tasks at a time, free up more CPU headroom and stop running long tasks (they might hog the GPU). Using NVIDIA Inspector (NVI) could help to set/fix the clocks (GPU and memory), and raising the memory clock from 3005MHz to 3505MHz might help a bit too (it partially reduces the bottlenecks; CPU usage/communication is supposedly higher with these tasks). When these WUs dry up you might need to revert/change the settings again. If you're running a mix of long and short tasks and the GPU frequency drops when running a short task, try changing/fixing the GPU settings using NVI, and suspend any short tasks on slow GPUs so that a long WU starts. In theory, when going back to the short WU it might run at the new settings (say, 1316MHz rather than 1050MHz). If not, try again with a restart (or preferably a cold restart) after suspending the short task. Existing tasks might want to keep their GPU settings, but new short tasks should use your NVI-defined settings. The website is slow for me too. 43% utilization is low, but running 2 tasks would be about the same as one normal task in terms of GPU usage. It's a challenge, but it's doable, and these WUs won't be around forever. There are also long WUs with 'normal' GPU utilization. Upload file size limits are set to prevent the upload of massive erroneous data. Good luck, FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help |
Joined: 25 Mar 12 Posts: 103 Credit: 14,948,929,771 RAC: 12,866 |
I'm now getting the "transient upload error" and "server is out of disk space" messages on three units that are trying unsuccessfully to upload. |
Joined: 28 Mar 09 Posts: 490 Credit: 11,731,645,728 RAC: 52,725 |
"I'm now getting the "transient upload error" and "server is out of disk space" messages on three units that are trying unsuccessfully to upload." Same here. I will soon finish crunching all my GPUGRID tasks and won't be able to download any more. Good thing I have a backup project. |
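
For readers who want to see the umbrella restraint from Gerard's explanation in concrete terms, here is a minimal sketch. It is not the project's actual setup: the harmonic form of the bias, the force constant `k`, the 0 to 10 angstrom pathway and all function names are assumptions made for illustration. Only the 0.5 angstrom window spacing comes from the post.

```python
import numpy as np

def window_centers(start, stop, spacing=0.5):
    """Restraint centers along the (assumed linear) unbinding pathway, in angstroms."""
    n = int(round((stop - start) / spacing)) + 1
    return start + spacing * np.arange(n)

def bias_energy(x, center, k=10.0):
    """Assumed harmonic bias: grows quadratically with distance from the window center."""
    return 0.5 * k * (x - center) ** 2

def bias_force(x, center, k=10.0):
    """Restoring force (negative derivative of the bias); pulls the ligand back to the center."""
    return -k * (x - center)

# One short simulation would be run per window center.
centers = window_centers(0.0, 10.0)
print(len(centers), "windows, first few centers:", centers[:3])
print("force 0.5 A past the center:", bias_force(1.5, 1.0))
```

The bias energy plotted against position is the parabola that gives the method its name, and the force grows linearly with the displacement, which matches the description of a force that increases as the ligand moves away from its assigned position.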
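
The boundary restraint described for the A2AR batch (push the ligand down only when it rises above the red line) can be sketched the same way. The post does not say what functional form the force takes, so the one-sided harmonic wall below, the force constant `k`, and the names `wall_force` and `z_max` are all hypothetical.

```python
def wall_force(z, z_max, k=5.0):
    """Force along z for an assumed one-sided harmonic wall.

    Below the boundary the ligand feels no extra force; above it,
    the force points downward and grows with the overshoot.
    """
    if z <= z_max:
        return 0.0
    return -k * (z - z_max)

# Ligand below the boundary: untouched. Ligand 2 units above: pushed back down.
print(wall_force(10.0, z_max=20.0))
print(wall_force(22.0, z_max=20.0))
```

Because the force depends on the ligand's position at every step, a check like this has to run each frame, which is consistent with the explanation of why these batches keep the CPU busy.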
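
The GTX 960 versus GTX 750 Ti comparison posted earlier in the thread can be checked with a few lines of arithmetic. This sketch uses only the power and run-time figures from that post; the helper name `energy_per_wu_wh` is made up for the example, and the numbers cover GPU power only, not the CPU core feeding each task.

```python
def energy_per_wu_wh(power_watts, hours, minutes):
    """Energy used by the GPU for one work unit, in watt-hours."""
    return power_watts * (hours + minutes / 60.0)

gtx960 = energy_per_wu_wh(40, 4, 11)     # 40 W for 4 h 11 min
gtx750ti = energy_per_wu_wh(19, 5, 6)    # 19 W for 5 h 6 min

speed_ratio = (4 * 60 + 11) / (5 * 60 + 6)  # 750 Ti speed relative to the 960
power_ratio = 19 / 40                        # 750 Ti power relative to the 960

print(f"GTX 960:    {gtx960:.1f} Wh per WU")
print(f"GTX 750 Ti: {gtx750ti:.1f} Wh per WU")
print(f"speed ratio {speed_ratio:.2f}, power ratio {power_ratio:.2f}")
```

On these figures the 750 Ti runs at roughly 82% of the 960's speed on about 48% of the power, so it spends roughly 97 Wh per work unit against roughly 167 Wh, in line with the "80% of the speed, 50% of the power" summary.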
©2025 Universitat Pompeu Fabra