Strange host
| Author | Message |
|---|---|
tito Joined: 21 May 09 Posts: 22 Credit: 2,002,780,169 RAC: 0
|
Can somebody explain how host no. 2 on the top list can compute ATMML WUs in under 500 sec? Additionally, its GPU is listed as a 1660S. https://www.gpugrid.net/show_host_detail.php?hostid=624047 |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
"Can somebody explain how host no. 2 on the top list can compute ATMML WUs in under 500 sec?" Yes, something strange is going on here, I agree. Somebody has found a way to 'game' the system. |
|
Joined: 11 May 10 Posts: 68 Credit: 12,293,491,875 RAC: 3,176
|
"Somebody has found a way to 'game' the system." Who is that? ononoki? |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
"Can somebody explain how host no. 2 on the top list can compute ATMML WUs in under 500 sec?" Maybe some kind of misconfiguration on this host is causing its ATMML tasks to jump directly from the environment-extraction phase to the end, skipping the long machine-learning phase. Perhaps the main question is: are these tasks helping the science involved or not? If not, their only use would be an impressive RAC boost for that "strange" host... |
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
Talking about strange hosts, I've also noticed some at current Hosts ranking positions 11, 12, 13, 15, 17, 20 and 27. They all indicate "[40] NVIDIA NVIDIA TITAN V (4095MB)". This would add up to a total of 7x40=280 NVIDIA TITAN V graphics cards for some anonymous owner(s). |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
More likely they belong to one owner using cloud instance rentals. They could also be spoofing the coproc_info.xml file to report 40 cards on each host. I don't see how an 8-core CPU could support that many real cards. |
|
Joined: 21 Feb 20 Posts: 1116 Credit: 40,839,470,595 RAC: 6,423
|
"Talking about strange hosts, I've also noticed some at current Hosts ranking positions 11, 12, 13, 15, 17, 20 and 27." These are hosts from members of TSBT, either PecosRiver or Megacruncher or both. They are spoofing the GPU count, which is very easy to do. Each host probably only has 1 or 2 Titan Vs. All of the platforms are fairly old/low-end and wouldn't support more than a couple of Titan Vs anyway. Their production doesn't seem weird or overly impressive for what 1 or 2 Titan Vs could do, since QChem is very good for strong FP64 cards. They're getting a lot of errors from low VRAM, though. Spoofing the GPUs to such a high number is a holdover from SETI, where you could spoof up to 64 GPUs and get proportionally more tasks. I don't think any other project these days will react the same way to that extent. Both Einstein and GPUGRID will cap the effective GPU count (what's used for scheduling decisions) at just 8 GPUs, and any number above that does not count for getting more work. GPUGRID used to give you only 2 tasks per GPU, but I think they changed that a month or two ago to 4 tasks per GPU.
|
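The scheduling cap described in the post above comes down to a little arithmetic. This is a minimal sketch only: the cap of 8 GPUs and the figure of 4 tasks per GPU are taken from the post (the latter was itself stated as "I think"), and the function and constant names are hypothetical, not taken from the BOINC server code.

```python
# Hypothetical names; the values come from the forum post, not from BOINC source.
MAX_EFFECTIVE_GPUS = 8   # cap on the GPU count used for scheduling decisions
TASKS_PER_GPU = 4        # per-GPU task limit mentioned in the post


def max_tasks_in_progress(reported_gpus: int) -> int:
    """Return the task limit for a host reporting `reported_gpus` GPUs."""
    effective_gpus = min(reported_gpus, MAX_EFFECTIVE_GPUS)
    return effective_gpus * TASKS_PER_GPU


for reported in (1, 2, 8, 40):
    print(reported, max_tasks_in_progress(reported))
```

Under these assumptions, a host reporting 40 GPUs gets the same limit as one reporting 8, so spoofing beyond the cap would gain nothing.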
|
Joined: 8 Oct 16 Posts: 27 Credit: 4,153,801,869 RAC: 0
|
Talking about strange hosts, this is more of a compliment to user wscr http://www.gpugrid.net/show_user.php?userid=5728 with two GTX 1660 Super hosts with 6GB VRAM running QChem with few OOM failures. http://www.gpugrid.net/results.php?hostid=598633&offset=0&show_names=0&state=0&appid=47 ~2% error rate in QChem. http://www.gpugrid.net/results.php?hostid=227353&offset=0&show_names=0&state=0&appid=47 ~1% error rate in QChem. wscr's other RTX 2070 below also has ~1% error in QChem http://www.gpugrid.net/results.php?hostid=618605&offset=40&show_names=0&state=0&appid=47 but his/her other 2070 has a high error rate. Kudos! |
|
Joined: 12 Sep 10 Posts: 8 Credit: 193,027,934 RAC: 0
|
I crunched one very short ATMML task when they first appeared, and it ended in a similarly short time. Before that, I had crunched only ACEMD3 tasks 24/7 for weeks, because my old GPU is not supported by the Quantum Chemistry tasks. |
|
Joined: 18 Mar 10 Posts: 28 Credit: 41,810,583,419 RAC: 13,276
|
The host that started this thread is active again after taking a few days off. Same output in the stderr file. The line "tar: run.log: file changed as we read it" may be what triggers skipping over the science? Seems like someone (I'm guessing ononoki owns this PC) has an issue or a great hack for points on their system, but that's beyond my skillset. https://www.gpugrid.net/results.php?hostid=624047 https://www.gpugrid.net/result.php?resultid=35656131
+ tar cjvf output.tar.bz2 run.log r0/QB_A12_A01.out r1/QB_A12_A01.out r10/QB_A12_A01.out r11/QB_A12_A01.out r12/QB_A12_A01.out r13/QB_A12_A01.out r14/QB_A12_A01.out r15/QB_A12_A01.out r16/QB_A12_A01.out r17/QB_A12_A01.out r18/QB_A12_A01.out r19/QB_A12_A01.out r2/QB_A12_A01.out r20/QB_A12_A01.out r21/QB_A12_A01.out r3/QB_A12_A01.out r4/QB_A12_A01.out r5/QB_A12_A01.out r6/QB_A12_A01.out r7/QB_A12_A01.out r8/QB_A12_A01.out r9/QB_A12_A01.out r0/QB_A12_A01.dcd r1/QB_A12_A01.dcd r10/QB_A12_A01.dcd r11/QB_A12_A01.dcd r12/QB_A12_A01.dcd r13/QB_A12_A01.dcd r14/QB_A12_A01.dcd r15/QB_A12_A01.dcd r16/QB_A12_A01.dcd r17/QB_A12_A01.dcd r18/QB_A12_A01.dcd r19/QB_A12_A01.dcd r2/QB_A12_A01.dcd r20/QB_A12_A01.dcd r21/QB_A12_A01.dcd r3/QB_A12_A01.dcd r4/QB_A12_A01.dcd r5/QB_A12_A01.dcd r6/QB_A12_A01.dcd r7/QB_A12_A01.dcd r8/QB_A12_A01.dcd r9/QB_A12_A01.dcd
tar: run.log: file changed as we read it
+ true
+ echo 'Save restart'
+ tar cjvf restart.tar.bz2 r0/QB_A12_A01_ckpt.xml r1/QB_A12_A01_ckpt.xml r10/QB_A12_A01_ckpt.xml r11/QB_A12_A01_ckpt.xml r12/QB_A12_A01_ckpt.xml r13/QB_A12_A01_ckpt.xml r14/QB_A12_A01_ckpt.xml r15/QB_A12_A01_ckpt.xml r16/QB_A12_A01_ckpt.xml r17/QB_A12_A01_ckpt.xml r18/QB_A12_A01_ckpt.xml r19/QB_A12_A01_ckpt.xml r2/QB_A12_A01_ckpt.xml r20/QB_A12_A01_ckpt.xml r21/QB_A12_A01_ckpt.xml r3/QB_A12_A01_ckpt.xml r4/QB_A12_A01_ckpt.xml r5/QB_A12_A01_ckpt.xml r6/QB_A12_A01_ckpt.xml r7/QB_A12_A01_ckpt.xml r8/QB_A12_A01_ckpt.xml r9/QB_A12_A01_ckpt.xml
2024-08-16 00:29:43 (36298): bin/bash exited; CPU time 198.063084
2024-08-16 00:29:43 (36298): called boinc_finish(0) |
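The `+ true` line right after the tar warning suggests that the wrapper script runs something like `tar ... || true`. That is an inference from the trace, not something the project has confirmed, and it does not show that the tar warning caused the short runs. A minimal sketch of what that shell idiom does to the exit status, using `false` as a stand-in for any failing command and assuming a POSIX `sh` is available:

```python
import subprocess


def exit_status(shell_snippet: str) -> int:
    """Run a snippet with sh and return its exit status."""
    return subprocess.run(["sh", "-c", shell_snippet]).returncode


# `false` always fails, so its exit status is non-zero.
failing = exit_status("false")

# Appending `|| true` replaces the failure with the success of `true`.
masked = exit_status("false || true")

print("plain failure:", failing)
print("masked failure:", masked)
```

A caller that only looks at the exit status cannot tell the masked failure from a real success.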
ServicEnginIC Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 1,447
|
Maybe the project scientists have something to say about this matter. If the tasks processed by this host are not useful for their intended purpose, it would be disturbing: the host would be "burning" tasks instead of "crunching" them... |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
So far the project scientist hasn't done anything about this host other than confirming that the host is only producing 'garbage' results. Not sure why inaction is the only current decision. Maybe they are not concerned because those bad results don't ever corrupt the science. |
tito Joined: 21 May 09 Posts: 22 Credit: 2,002,780,169 RAC: 0
|
... But it corrupts our society. BTW, is there any thread regarding the credits given here on GPUGrid? They look insanely high compared to other projects (like Collatz years ago). |
|
Joined: 13 Dec 17 Posts: 1419 Credit: 9,119,446,190 RAC: 891
|
Unless a project sticks to the box-stock, broken CreditNew BOINC credit algorithm, credit awarding is entirely arbitrary, depending on what the project admins decide. An admin can award high task credit to increase the 'attractiveness' of their project in the hope that it will gain more volunteer participation and increase their science production. |
|
Joined: 15 Jul 20 Posts: 95 Credit: 2,550,803,412 RAC: 248
|
Just look at the BOINC world ranking with Bitcoin Utopia: people like me who run a PC all day can't compete. |
|
Joined: 21 Dec 23 Posts: 51 Credit: 0 RAC: 0 |
Hello, we are looking into this! thanks |
|
Joined: 8 Oct 16 Posts: 27 Credit: 4,153,801,869 RAC: 0
|
Here is another host, with a GTX 960 with compute capability (cc) 5.x. Can ATMML tasks run with cc 5.x (unless coproc_info.xml was modified)? Over the past 24-48 hours, the run time varies from 122khrs and seems to stabilize at a shorter run time. https://www.gpugrid.net/results.php?hostid=550055&offset=0&show_names=0&state=3&appid= |
|
Joined: 18 Mar 10 Posts: 28 Credit: 41,810,583,419 RAC: 13,276
|
"Here is another host, with a GTX 960 with compute capability (cc) 5.x. Can ATMML tasks run with cc 5.x (unless coproc_info.xml was modified)? Over the past 24-48 hours, the run time varies from 122khrs and seems to stabilize at a shorter run time." Wow, I was focused on buying newer GPUs, but it seems very old ones are the way to go... perhaps with some kind of hack. Just kidding. I hope Steve can find and plug this problem. |
|
Joined: 21 Dec 23 Posts: 51 Credit: 0 RAC: 0 |
Hello. We have identified the problem and it has been fixed in our code. The next round of WUs should not have this problem. This is not any sort of hack. It is just a case of some specific error types (that occur with old GPUs we do not have available locally to test on) not raising a proper error code and slipping through the validation. |
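To illustrate the kind of gap described above, where a task that failed internally still exits with a success code, here is a minimal sketch of a validation check that does not rely on the exit code alone. Everything in it is hypothetical: the `TaskResult` fields, the `is_plausible` function, and the thresholds are illustrative assumptions, not the project's actual validator.

```python
from dataclasses import dataclass


@dataclass
class TaskResult:
    # All fields are hypothetical, chosen only for illustration.
    exit_code: int        # status reported by the task wrapper
    runtime_s: float      # wall-clock run time in seconds
    output_frames: int    # number of trajectory frames found in the output


def is_plausible(result: TaskResult, min_runtime_s: float, expected_frames: int) -> bool:
    """Accept a result only if it succeeded AND looks like real work was done."""
    if result.exit_code != 0:
        return False
    if result.runtime_s < min_runtime_s:
        return False  # finished far too quickly to have run the science
    return result.output_frames == expected_frames


normal = TaskResult(exit_code=0, runtime_s=20000.0, output_frames=100)
too_fast = TaskResult(exit_code=0, runtime_s=450.0, output_frames=0)

print(is_plausible(normal, min_runtime_s=1000.0, expected_frames=100))
print(is_plausible(too_fast, min_runtime_s=1000.0, expected_frames=100))
```

The second result has a zero exit code but fails both sanity checks, which is the situation an exit-code-only check would let through.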
|
Joined: 8 Oct 16 Posts: 27 Credit: 4,153,801,869 RAC: 0
|
"Hello. We have identified the problem and it has been fixed in our code. The next round of WUs should not have this problem." Thanks. Isn't the GTX 1660 Super technically a newer card (Turing) than the GTX 1080 that you were using for testing, unless the host has modified coproc_info.xml? |
©2025 Universitat Pompeu Fabra