Top hosts exceed 30,000+ RAC
| Author | Message |
|---|---|
|
Joined: 2 Jan 09 Posts: 40 Credit: 16,762,688 RAC: 0
|
I just noticed that now two machines exceed the 30,000 mark of recent average credit. (Top Hosts) Anyone care to speculate when the first machine will exceed 40K and 50K of RAC? :-) |
|
Joined: 4 Dec 08 Posts: 7 Credit: 2,718,779 RAC: 0
|
Care to share the specs of your host? |
|
Joined: 2 Jan 09 Posts: 40 Credit: 16,762,688 RAC: 0
|
It's a 64-bit Linux system with a total of 4 GT200 class CUDA devices, made possible due to the 2-in-1 GeForce GTX 295. The Phenom 9550 CPU cores are not so impressive as those of Core i7, but they always seem able to keep the GPUs satisfied. Actively running more than two CUDA devices required an upgrade from a 750 Watt to a 1000 Watt power supply, now a Zalman ZM1000-HP. Meanwhile... anyone yet running an eight GPU quad GTX 295 rig? ;-) |
|
Joined: 4 Sep 08 Posts: 16 Credit: 9,366,617 RAC: 0
|
Considering the price of this AMD Phenom, the delivered power is impressive as well. Now you can upgrade to a Phenom II; show me an Intel system with that kind of platform compatibility. Question: do you have 2 CPU sockets? |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Sorry, but what are you talking about? This is about the GPUs.. you just need a board with 2 PCIe slots, 2 GTX 295 and preferably 4 CPU cores, though on Linux fewer may do for 4 GPUs. MrS Scanning for our furry friends since Jan 2002 |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
Sorry, but what are you talking about? This is about the GPUs.. you just need a board with 2 PCIe slots, 2 GTX 295 and preferably 4 CPU cores, though on Linux fewer may do for 4 GPUs. With 6.62 you might not even need that much on Windows ... I am seeing less than 1% average CPU with that application. At that CPU rate you could even run all 4 GPU cores on a single-CPU system (if we ignore bus bandwidth and CPU I/O bandwidth issues) ... |
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
J.D. wrote: Anyone care to speculate when the first machine will exceed 40K and 50K of RAC? :-) " If " the rig will crunch without producing computation errors and doesn't freeze I'd expect the 40K to be reached around the weekend... ;) (Knock on wood) |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Good luck mate! That would be almost half the output of my entire team.. :D @Paul: yes, with 6.62 fewer cores may be perfectly fine. If they're not, I wouldn't look for a problem with bandwidth (because that's in the realm of nanoseconds) but rather at the 1 ms scheduler interval. If a single CPU core is busy serving GPU 1 and right now GPUs 2, 3 and 4 also need a *whatever*, then they'll have to wait until serving GPU 1 is done and the scheduler grants them a time slice. Thus I chose the careful term "preferably" ;) MrS Scanning for our furry friends since Jan 2002 |
Paul D. Buck Joined: 9 Jun 08 Posts: 1050 Credit: 37,321,185 RAC: 0
|
Good luck mate! That would be almost half the output of my entire team.. :D Um, well that is what I would class as CPU I/O bandwidth, because the CPU has only the one channel to service the interrupts ... a rose by any other name ... But even a multi-core system still has potential issues with bandwidth for the same reason unless the MB has distinct and separate channels for each GPU to be serviced. Then we can get into the same issue with the dual, and soon to come quad, core systems where there is one I/O channel for each card and the two/four GPU cores are contending for service at the same time. This is an issue that has dogged PCs for, like, forever ... though the CPUs we use have more power than the CPUs of mainframes of yore, the I/O is simply not really there ... they are getting there, slowly ... but some of those old systems were masters at I/O ... In any case, in my opinion we are in violent agreement ... |
|
Joined: 2 Jan 09 Posts: 40 Credit: 16,762,688 RAC: 0
|
" If " the rig will crunch without producing computation errors and doesn't freeze I'd expect the 40K to be reached around the weekend... ;) (Knock on wood) 40K! Even sooner than the weekend. :-) |
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
I was pleasantly surprised too...especially because the rig had freezes and produced computation errors... My next estimate would be 50K next Wednesday... ;) |
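Estimates like this can be sanity-checked with a little arithmetic. RAC is an exponentially decaying average of granted credit; the sketch below assumes a half-life of one week (the usual BOINC setting, not stated in this thread) and a host that earns a constant amount of credit per day. The function names and the 50,000 credits/day figure are hypothetical and only there for illustration.

```python
import math

# Assumption: RAC decays with a one-week half-life (not stated in the thread).
HALF_LIFE_DAYS = 7.0


def rac_after(days, rac_start, daily_credit):
    """RAC after `days` of steady output, starting from `rac_start`."""
    decay = 0.5 ** (days / HALF_LIFE_DAYS)
    # RAC drifts from its starting value toward the steady daily output.
    return daily_credit + (rac_start - daily_credit) * decay


def days_to_reach(target, rac_start, daily_credit):
    """Days of steady output needed for RAC to climb to `target`."""
    if rac_start >= target:
        return 0.0
    if daily_credit <= target:
        # RAC only approaches daily_credit from below, so it never gets there.
        return math.inf
    return HALF_LIFE_DAYS * math.log2(
        (daily_credit - rac_start) / (daily_credit - target)
    )


if __name__ == "__main__":
    # Hypothetical host: RAC 30,000 today, steady 50,000 credits per day.
    print(days_to_reach(40_000, 30_000, 50_000))
```

Under these assumptions a host has to sustain noticeably more than 40K credits per day to show a RAC of 40K within a week or so, which is why freezes and computation errors push the date back.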
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Um, well that is what I would class as CPU I/O bandwidth, because the CPU has only the one channel to service the interrupts ... a rose by any other name ... I'm still not convinced. Is there an interrupt at all? I don't know about the new method, but as far as I understand the polling is not an interrupt, it's just a normal task switch, which the scheduler would have done anyway. The way I see it: a single core executes only one thread at a time. Thus when multiple GPUs need work, all except one are blocked.. no matter how much I/O bandwidth you give that CPU, it couldn't execute the other threads at the same time. If you have multiple CPUs (be it physical ones, more cores or logical ones via multithreading) then each of them can process one thread at the same time and, with otherwise perfect software, lags / breaks could be avoided. What I need is the ability to execute several threads at once, not I/O bandwidth. So.. I'm not sure if we're talking about the same thing ;) MrS Scanning for our furry friends since Jan 2002 |
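The one-thread-per-core argument can be put into a toy model. The sketch below is a deliberate simplification, not a description of how BOINC or the OS scheduler actually behaves: it assumes all GPUs ask for CPU service at the same instant, that each request occupies one core for one scheduler slice (the 1 ms figure mentioned earlier in the thread), and that nothing is preempted. The function name is made up for illustration.

```python
def wait_times_ms(num_gpus, cores, slice_ms=1.0):
    """Time each GPU waits before a core starts serving it.

    Toy model: all requests arrive together, each one holds a core for
    exactly one slice, and requests are served in order.
    """
    if num_gpus < 0 or cores < 1:
        raise ValueError("need num_gpus >= 0 and cores >= 1")
    # Request k is served in round k // cores; each round lasts one slice.
    return [(k // cores) * slice_ms for k in range(num_gpus)]


if __name__ == "__main__":
    for cores in (1, 2, 4):
        print(cores, "core(s):", wait_times_ms(4, cores))
```

In this model the last of four GPUs waits three slices on a single core and not at all on four cores, regardless of how much I/O bandwidth is available.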
Edboard Joined: 24 Sep 08 Posts: 72 Credit: 12,410,275 RAC: 0
|
Four GPUs and 4 CPU cores means one GPUGRID WU per core and so no WU in cache. I have a 2-core CPU and a GTX 295 (2 GPUs), and I cannot get the BOINC scheduler to feed them without my personal intervention (which, e.g., is impossible if I'm sleeping) (BOINC 6.6.3) |
|
Joined: 25 Nov 08 Posts: 51 Credit: 980,186 RAC: 0
|
Four GPUs and 4 CPU cores means one GPUGRID WU per core and so no WU in cache. I have a 2-core CPU and a GTX 295 (2 GPUs), and I cannot get the BOINC scheduler to feed them without my personal intervention (which, e.g., is impossible if I'm sleeping) (BOINC 6.6.3) As mentioned in another thread recently, 6.6.3 has a problem with uninitialized variables. Sooner or later, it won't get GPU work reliably. BOINC version 6.5.0 seems to cause the least trouble; get it from here. Phoneman1 |
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
Couldn't keep my promise to reach 50K just in time (yesterday)... ...and am wondering if anyone else had some "ghost WUs"...? Explanation: during the last two days I had WUs that could only be seen in the BOINC manager, but not in the website's task list. So I had eight WUs to crunch whilst the task list showed only five or six as "in progress". That could have been acceptable...if these WUs had been listed after they had finished and were submitted...but they seem to have vanished in the Lost-WU-Nirvana...an unnecessary loss of time and credits...kind of annoying... |
|
Joined: 8 Sep 08 Posts: 63 Credit: 1,696,957,181 RAC: 0
|
Yes, I also thought something like that was happening. Further investigation showed me that these WUs were in the web page task list, but on page two or even three. So just try "next" at the top of the web page to see your next 20 WUs and so on. That is where you will find them. Kind regards Alain |
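To work out which page to click through to, the arithmetic is simple. This sketch assumes the 20-tasks-per-page layout described above and counts positions from the top of the list; the function names are made up for illustration.

```python
PAGE_SIZE = 20  # tasks shown per page, as described in the post


def page_of(task_index, page_size=PAGE_SIZE):
    """1-based page that holds the task at 0-based position `task_index`."""
    if task_index < 0 or page_size < 1:
        raise ValueError("need task_index >= 0 and page_size >= 1")
    return task_index // page_size + 1


def pages_needed(total_tasks, page_size=PAGE_SIZE):
    """Number of pages the list spans (at least one, even when empty)."""
    if total_tasks < 0 or page_size < 1:
        raise ValueError("need total_tasks >= 0 and page_size >= 1")
    # -(-a // b) is integer ceiling division.
    return max(1, -(-total_tasks // page_size))


if __name__ == "__main__":
    # A host with 45 tasks on record spans three pages.
    print(pages_needed(45), page_of(44))
```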
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
When I saw that there were fewer than the usual eight tasks "in progress" I checked the previous task pages, but with no success: I couldn't find any new ones. Also, after submitting these "ghosts" neither the 'avg. cr' nor the 'tot. cr' for this host changed... |
|
Joined: 25 Nov 08 Posts: 51 Credit: 980,186 RAC: 0
|
UL1, your list of computers shows 4 x i7s, but 3 have not contacted the server this month. Those 3 also have a number of work units marked no reply. I wonder if the missing work units are to be found on these i7s? Did you change your email details on this project or make some other change? If so, it might be worth merging those computers with the same name within this project. Phoneman1 |
UL1 Joined: 16 Sep 07 Posts: 56 Credit: 35,013,195 RAC: 0
|
As you mentioned: these rigs haven't done anything for the project this month...but the days I was dealing with the 'ghosts' were late Monday and the whole of Tuesday... And no: I didn't change anything...and these rigs will be back here as soon as they have cleared their cache over at SETI... |
|
Joined: 2 Jan 09 Posts: 40 Credit: 16,762,688 RAC: 0
|
Couldn't keep my promise to reach 50K just in time (yesterday)... Woo! 50K! Here too! Okay, so my machine took 12 days longer, but still. :-) Meanwhile, the stats haven't yet shown a machine over 60K... |
©2025 Universitat Pompeu Fabra