Message boards : GPU Users Group message board : Whatever

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 2,634

FYI, you can go beyond 16, barring other issues like the absurdly long run times preventing you from downloading more; under normal circumstances you can download as many as you like. I had my faster hosts running a cache of 80 (with the same Pandora settings format Keith listed there, just swap "16" for your target cache size). But beware: this project gives a +50% bonus for tasks returned within 24 hours, so it's detrimental to cache more work than you can return in that window. Play it safe. If you're not running the Python tasks, targeting a 0.75-day return time is pretty safe and gives you a nice buffer for when the project goes down occasionally. I'm still running both task types, but it takes time for things to even out.
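For anyone tuning their cache, here's a quick back-of-the-envelope sketch of the sizing logic I'm describing. The runtimes and GPU counts are hypothetical examples, not measurements from my hosts:

```python
# Back-of-the-envelope sizing for a task cache that still clears the
# 24-hour +50% bonus window. All inputs below are hypothetical examples.

def max_safe_cache(task_runtime_hours: float, gpus: int,
                   bonus_window_hours: float = 24.0,
                   safety_margin: float = 0.75) -> int:
    """Largest cache that should still be returned inside the bonus window.

    safety_margin < 1.0 leaves headroom for project outages and slow tasks,
    mirroring the 0.75-day target mentioned above.
    """
    tasks_per_gpu = (bonus_window_hours * safety_margin) / task_runtime_hours
    return int(tasks_per_gpu * gpus)

# Example: 1.5-hour tasks on a 3-GPU host (hypothetical numbers).
print(max_safe_cache(task_runtime_hours=1.5, gpus=3))  # -> 36
```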

---

Joined: 26 Feb 14 · Posts: 211 · Credit: 4,496,324,562 · RAC: 0

I would avoid the experimental Python tasks for now. Thanks, Ian and Keith; for now I'm leaving it as is. I swapped out the intake fan in the back for a be quiet! 3 140mm and ordered a be quiet! 120mm for the front intake fan. Hopefully that will be enough to move some air and keep the top GPU's temps down. The bottom GPU is only at 38C.

---

Joined: 13 Dec 17 · Posts: 1421 · Credit: 9,147,196,190 · RAC: 2,053,242

Yes, my slowest host has a turnaround time of 0.43 days, well within the 24-hour bonus return window. That's with a 16-task cache while also running Einstein and Milkyway GPU tasks concurrently. The project basically runs on a 1-for-1 return/replacement mechanism. I don't have 10 GPUs running on a single host like Ian, though; my max is 4 GPUs, and most hosts have 3. A 16-task cache is fine for me.

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 2,634

The Python tasks are reaching insane levels of credit reward. With the last round of Python tasks, I was able to determine that credit reward was ONLY a function of runtime and the peak FLOPS reported by the device. I also determined that faking your peak FLOPS value to be higher would result in more credit. And third, there seems to be a fail-safe credit limit: if your combined credit reward comes out greater than about 1,750,000, you get hit with a penalty value instead, regardless of runtime or FLOPS. At first this penalty was 20.83 credits; then it increased to 34,722.22 credits with the change from 3,000 to 5,000,000 GFLOPs in the task estimate.

On the last round of Python tasks last week, a 2080 Ti would earn about 100 credits per second of runtime, corresponding to a GPU peak_flops value of about 14 TFLOPS. On this current round, a 2080 Ti is earning about 1,000 credits per second of runtime (with a bit more variance than before), easily 100x the credit reward of an MDAD task per unit time.

Another observation: for some unknown reason, some hosts are earning disproportionately less credit per unit time, even after excluding tasks that hit the 34,722 barrier. It seems peak FLOPS and runtime aren't the only factors anymore, but I can't determine exactly why some hosts are getting HUGE rewards and some are not. For example:

1. Both of my 2080 Ti hosts are earning about 1,000 cred/sec on the tasks that stay below the penalty threshold.
2. My 2070 system, which has about half the peak_flops value, earns about 350 cred/sec (less than the half you'd expect based on peak FLOPS).
3. My GTX 1660 Super (in the RTX 3070 host as device 1) is earning about 71 cred/sec, which is about the same as it earned before, when it was using the peak FLOPS value from the RTX 3070. But that's still way low even if the new tasks were using the per-device FLOPS value (which they weren't before).

So I'm enjoying the credits, but obviously something isn't being calculated fairly here, since credit earned isn't consistent with work performed across all hosts.
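A rough way to picture the behaviour I'm describing: the sketch below is just a toy model fitted to the numbers above, not GPUGRID's actual server-side formula. The rate constant and threshold are my guesses from the observed awards, and it can't explain the 2070 or GTX 1660 Super results either, which is exactly the inconsistency I'm pointing at.

```python
# Toy model of the observed credit behaviour, based only on the numbers in
# this thread. The constants are assumptions, not project-published values.

PENALTY_THRESHOLD = 1_750_000  # approximate point where the fail-safe kicks in
PENALTY_CREDIT = 34_722.22     # fallback award after the 5,000,000-GFLOP estimate change

def estimated_credit(runtime_s: float, peak_tflops: float,
                     cred_per_s_per_tflop: float = 71.0) -> float:
    """Credit as a pure function of runtime and reported peak FLOPS.

    ~1000 cred/s at ~14 TFLOPS (2080 Ti) implies roughly 71 cred/s per TFLOPS.
    """
    credit = runtime_s * peak_tflops * cred_per_s_per_tflop
    # Fail-safe: going over the threshold collapses to a small fixed award.
    return credit if credit <= PENALTY_THRESHOLD else PENALTY_CREDIT

# A 2080 Ti (~14 TFLOPS) at ~1000 cred/s would cross the threshold after
# roughly 1,760 seconds of runtime in this model:
print(estimated_credit(runtime_s=1500, peak_tflops=14.0))  # ~1,491,000 -> paid in full
print(estimated_credit(runtime_s=2000, peak_tflops=14.0))  # over the cap -> 34722.22
```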

---

Joined: 13 Dec 17 · Posts: 1421 · Credit: 9,147,196,190 · RAC: 2,053,242

I sure hope they can properly debug this Python app. It would be nice to have an alternate application and task source other than acemd3. I hope they can sort out the credit situation as well. I have no complaints with the acemd3 credit awards.

---

Joined: 21 Feb 20 · Posts: 1116 · Credit: 40,839,470,595 · RAC: 2,634

They've already made an improvement to the app, at least in getting the efficiency back up. With the last round of Python tasks, it was similar to the Einstein GW app, where the overall speed (GPU utilization) seemed to depend on CPU speed; it also used a lot more GPU memory and very little PCIe. With this latest round of Python tasks, though, they are back to the same basic setup as the MDAD tasks: low GPU memory use, a good 95+% GPU utilization even on slow CPUs, and PCIe use back up to the same levels as MDAD. So at least that's better. The MDAD credit scheme seems to work well, IMO, and I too hope they figure it out. Maybe they aren't super concerned with credit on these since they are beta tasks. Also still waiting on the app for Ampere support :(
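If you want to check this kind of behaviour on your own host, a simple poll of nvidia-smi is enough to watch GPU utilization and memory use while a task runs. A minimal sketch: the query fields are standard nvidia-smi ones, but check `nvidia-smi --help-query-gpu` on your driver version, and PCIe traffic is easier to watch with `nvidia-smi dmon`.

```python
# Poll nvidia-smi for per-GPU utilization and memory use while a task runs.
# Requires an NVIDIA driver with nvidia-smi on the PATH.

import subprocess
import time

QUERY = "index,utilization.gpu,memory.used"

def sample_gpus() -> None:
    # Query mode returns one CSV line per GPU.
    out = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True).stdout
    for line in out.strip().splitlines():
        idx, util, mem = [field.strip() for field in line.split(",")]
        print(f"GPU {idx}: {util}% util, {mem} MiB used")

if __name__ == "__main__":
    # Sample every 5 seconds until interrupted.
    while True:
        sample_gpus()
        time.sleep(5)
```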