Simulation has become unstable
| Author | Message |
|---|---|
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
TThrottle is a great program; however, using it with GPU crunching will result in errors. The program does not slow the CPU and GPU down to a steady rate; instead it lets them run at 100%, then at only a few percent, and so on, so that the CPU and GPU stay cool(er). At least with GPUGRID tasks, this stopping and starting of the WU will make it fail after a while. That is the reason I don't use TThrottle anymore. With MSI Afterburner you can set the GPU fan speed nicely, and it works on any card of any brand, nVidia as well as AMD. If your cards run nice and smooth at GPUGRID, then they will on other projects as well, so you don't have to worry about that. Lastly, I will not defend anyone, but you can trust skgiven's advice. He is not always lengthy in his explanations, and you have to search and try a bit for yourself. But he knows what he is talking about, and I have used a lot of his knowledge myself. Greetings from TJ |
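The on/off behaviour described in the post above can be illustrated with a rough sketch. This is not TThrottle's actual algorithm, which the thread does not describe; the function names, the tick counts, and the 100% / 5% load figures are assumptions chosen only to show why the average load drops while the task itself keeps being stopped and restarted.

```python
from itertools import cycle, islice


def duty_cycle_trace(on_ticks, off_ticks, busy_load=100.0, idle_load=5.0, total_ticks=20):
    """Return the per-tick load (%) of a simple on/off throttle.

    Hypothetical model: the device runs at busy_load for on_ticks,
    then at idle_load for off_ticks, and repeats.
    """
    if on_ticks <= 0 or off_ticks < 0:
        raise ValueError("on_ticks must be positive and off_ticks non-negative")
    pattern = [busy_load] * on_ticks + [idle_load] * off_ticks
    return list(islice(cycle(pattern), total_ticks))


def average_load(trace):
    """Mean load over the whole trace, in percent."""
    return sum(trace) / len(trace)


def count_pauses(trace, idle_load=5.0):
    """Count how often the load drops from a busy tick to an idle tick."""
    return sum(
        1
        for previous, current in zip(trace, trace[1:])
        if previous != idle_load and current == idle_load
    )


# Three busy ticks followed by one idle tick, observed for eight ticks.
trace = duty_cycle_trace(on_ticks=3, off_ticks=1, total_ticks=8)
print(trace)
print(average_load(trace))
print(count_pauses(trace))
```

In this toy model the average load is lower than 100%, which is why the hardware runs cooler, but the work is still interrupted on every cycle, which is the stop/start pattern reported to make GPUGRID tasks fail.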
|
Joined: 16 Mar 11 Posts: 509 Credit: 179,005,236 RAC: 0
|
skgiven. WTF!!!! You've OCd your CPU to 4.33GHz and now you're throttling it back with TThrottle?????!!!!! That just doesn't make any sense. It's like building a super fast engine then slipping the clutch so the vehicle doesn't go too fast.
"Of all the programs and BOINC projects I run only GPUGRID fails in any way so as far as I am concerned my system runs nicely." Those other GPU projects are wussy projects. Any old FUBAR'd system, even yours, can crunch them. GPUgrid is the cream of the crop, the top dog amongst all GPU using BOINC projects. The admins and the dev here are what other GPU using projects only wish they could be. The only reason your system can't run GPUgrid is because you've FUBAR'd your system.
"From what I have read it seems that the only option I have to get GPUGRID to run is to downclock the GPU. If I did that then it will impact the other BOINC projects adversely. So I will not downclock the GPU." You have plenty of other options but it seems you've made up your mind that your system is the model of perfection and should run GPUgrid exactly the way it is.
"It is, as you say, simple to change the various GPU parameters. However, it takes experience to do it properly." Well then get the experience. Plenty of other people have done exactly that.
"I suspect most people running GPUGRID are not experts nor wish to become one." You don't need to become an expert but if that's the word you prefer to use then fine, continue to delude yourself. But that isn't going to get you crunching GPUgrid.
"Also, what you left unspoken, is that it takes many hours of checking, over several GPUGRID tasks, to ensure that the changes have the desired effect and also that other projects and programs still run as well as before." That's debatable but let's say it's true. If you don't want to do the work then you don't get to run with the big dogs; you just sit on the porch with the pups and watch.
"One final point. It is quite possible to write programs, either deliberately or by accident, that drive a component beyond its safe limits or seriously affect other programs. It looks to me that GPUGRID, in a perfectly reasonable desire for efficiency, is reaching or has reached that point. It is now up to the programmer to seriously consider what and how they are coding." You obviously don't know spit about coding and the proof is that you're OCing your CPU then throttling it back. Your "advice" to the programmer here is a sad joke at best and the meanderings of a noob at worst.
"In summary, getting GPUGRID tasks to run without error is not quite as simple as you make out." Yes it is, and if you would spend more time reading about it and thinking about it and stop wasting time arguing about it you would be half way there by now. I've seen several other non-experts do it, why can't you?
"I have no doubt others will, perhaps vehemently, disagree with me. But this is my opinion just as you have yours." You and skgiven both have opinions, that much is true. To say your opinion on this topic is as informed and valid as skgiven's opinion is, well, the thought just makes me ROFLMAO.
BOINC <<--- credit whores, pedants, alien hunters |
|
Joined: 27 Nov 13 Posts: 4 Credit: 10,253,081 RAC: 0
|
<sigh> 1 - You know nothing about my skills, ability, or achievements any more than I know about yours. 2 - I will NOT turn this thread into a flame-fest. You can, of course, do whatever makes you feel good. 3 - I will let the readers make up their own minds. I'm out of here. Bye. |
|
Joined: 16 Mar 11 Posts: 509 Credit: 179,005,236 RAC: 0
|
<sigh> Wrong. From what you have told us about yourself and what you're doing, it's easy to see you're advising on topics you know little about. Your monkey-see-monkey-do (it works at project A, therefore it has to work at project B too) solution is no solution at all. Call that a flame if you want; I call it the truth, and I believe the more experienced crunchers here will agree with that opinion. "2 - I will NOT turn this thread into a flame-fest. You can, of course, do whatever makes you feel good." If you think I am posting in this thread to make me feel good, you're wrong again. I post to try to help you and others feel good by telling you what isn't likely to happen so that you can pursue a more realistic strategy for achieving success crunching here. "3 - I will let the readers make up their own minds." How magnanimous of you. "I'm out of here." <yawn> BOINC <<--- credit whores, pedants, alien hunters |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
"In summary, getting GPUGRID tasks to run without error is not quite as simple as you make out." I think you are right; it is not simple. Even after reducing the temperature you may have problems; you will very likely have to reduce the clock. But since your card is over-clocked anyway, you are really just reducing the clock to the value that Nvidia specified, which they did for a reason. If you (or the factory) overclock the card, you take your chances. GPUGrid is not a gaming community, and maybe they should make more allowances for that when they design their programs. But I am not a gamer anyway, and only use these cards for GPUGrid, so how they perform on other projects is of no concern to me. Each person has his own tolerance for tweaking the cards, so you must act accordingly. But it is quite possible to get them stable; here are my results for a recently completed run on two GTX 660s on a PC running Windows 7 64-bit: http://www.gpugrid.net/results.php?hostid=165674&offset=0&show_names=1&state=0&appid= I have since moved the cards to a WinXP machine, where they run faster. But as a consequence, they are now a little unstable and I am having to play the tweaking game again. They will be stable again shortly, but whether you consider that fun or not is up to you. |
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
Impressive, Jim, to have those two 660s so stable. I can get mine stable for only a few days, and then an error occurs. I tweak the cards one by one almost every week. The clocks are way down, so the cards run slower than they should. Mine are EVGAs, and the fan can run at a maximum of 75%; that is the reason the master GPU is always at almost 75°C. Greetings from TJ |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
TJ, Have you tried to reduce the RAM clock of your card as well? |
|
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
Also, I have had to increase the power limit (to 110%) of the cards using Nvidia Inspector. In fact, my most recent problem required that I increase the limit more than that. That required downloading the original BIOS using GPU-Z, modifying it using Kepler Bios Tweaker (I set the power limit to 137.5 watts), and then flashing the BIOS of my Zotac GTX 660 using nvflash. It is not for the faint of heart, and I don't think everyone should attempt it. But that is an extreme case because that card didn't have the greatest heatsink to begin with, and most cards don't require going into the BIOS. You can just use Nvidia Inspector (or MSI Afterburner or whatever else you want) for most cases. Other cards require down-clocking the GPU, and maybe the memory too as RZ points out. So that is why it can get complicated; each card is an individual investigation if you want to get down to zero errors, and I don't think that is really necessary but mention it only to show that it is possible. A lot of the errors blamed on the work units are really due to instabilities of the card because it is being pushed too hard. |
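The relation between a percentage power limit (as set in a tool such as Nvidia Inspector) and an absolute limit in watts (as written into the BIOS) is plain arithmetic, sketched below. The 100 W default is a hypothetical figure used only for illustration; the post does not state the card's actual default limit, and the helper names are not part of any of the tools mentioned.

```python
def watts_from_percent(base_watts, percent):
    """Convert a percentage power limit into watts, relative to a base limit."""
    if base_watts <= 0:
        raise ValueError("base_watts must be positive")
    return base_watts * percent / 100


def percent_from_watts(base_watts, target_watts):
    """Express an absolute power limit as a percentage of the base limit."""
    if base_watts <= 0:
        raise ValueError("base_watts must be positive")
    return target_watts / base_watts * 100


# Hypothetical default limit; NOT taken from the thread or from any real card.
HYPOTHETICAL_BASE_WATTS = 100.0

limit_at_110_percent = watts_from_percent(HYPOTHETICAL_BASE_WATTS, 110)
percent_for_bios_value = percent_from_watts(HYPOTHETICAL_BASE_WATTS, 137.5)
print(limit_at_110_percent)
print(percent_for_bios_value)
```

Substituting a card's real default limit for the hypothetical base shows how far a percentage slider would have to go to match a given wattage, and whether a slider capped at 110% could reach it at all.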
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
No, I haven't yet. Good advice, Zoltan; I will fiddle with that a bit too. Greetings from TJ |
|
Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0
|
Thanks, Jim. I know the procedure for updating the GPU BIOS and have all the tools, but I still have not done it. If I get a second 780Ti I will put the 660 in an older system and then experiment with the BIOS too. Greetings from TJ |
©2026 Universitat Pompeu Fabra