Message boards :
Number crunching :
Lots of errors
| Author | Message |
|---|---|
|
Joined: 26 Feb 12 Posts: 184 Credit: 222,376,233 RAC: 0
|
"There were some faulty WUs, but they have nothing to do with tasks erroring out with 'Simulation has become unstable.' messages and no other error messages." There is no way you could know this for sure. "Errors like yours are usually a result of overclocking too much." As I stated before, these cards are not overclocked. The problem came and left without me changing anything on my setup. Therefore my conclusion is that the WUs were the problem. Moving on. |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
The problem is still ongoing, for you. See: https://www.gpugrid.net/result.php?resultid=14213909 which reports "# The simulation has become unstable. Terminating to avoid lock-up (1)". I know you said there is no factory overclock, but I still would love to know what GPU-Z says for your "GPU Clock" and "Default Clock". If you refuse to share, so be it. And if it's anything above 1020 MHz, then it is in fact overclocked. |
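For anyone following along, the check described above can be sketched in a few lines of Python, assuming the two values are read off GPU-Z by hand. The function name and the sample readings are hypothetical; only the 1020 threshold comes from the post.

```python
# Threshold quoted in the post above, in MHz. Treat it as the poster's
# figure for this particular card, not a universal constant.
REFERENCE_CLOCK_MHZ = 1020


def is_overclocked(clock_mhz, reference_mhz=REFERENCE_CLOCK_MHZ):
    """Return True if a clock reading is above the reference clock."""
    return clock_mhz > reference_mhz


# Hypothetical GPU-Z readings, typed in by hand (not real data).
readings = {"GPU Clock": 1085, "Default Clock": 1020}

for label, mhz in readings.items():
    status = "overclocked" if is_overclocked(mhz) else "not overclocked"
    print(f"{label}: {mhz} MHz -> {status}")
```

A reading exactly at the reference value counts as not overclocked here, which matches the "anything above" wording.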
|
Joined: 18 Oct 13 Posts: 53 Credit: 406,647,419 RAC: 0
|
Jacob: We are all very well aware of the idea of overclocking. But, honestly: do you really think I will, or need to, shut down my system to get the GPU stuff running? Really? Let's put it that way: If GPU stuff is not running properly on a majority of systems and several users have the same experience and we (the users) do that for free - we shut down an overall properly running system for that? I really don't think so. In my opinion, if the GPU tasks do not work, make them work instead of harping on the overclocking stuff. If GPU will not work properly, I simply switch to something else - you know, MY PC, MY time, MY decision. Or to use a proverb: Not my circus, not my monkeys. Kind regards, |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
I have definitely had certain GPU things, such as games and GPUGrid tasks, crash or error ("Simulation has become unstable"), as a direct result of a factory-overclock that was too aggressive. If the GPU is overclocked at all, and you are trying to resolve any GPU problem, you should see if lowering the clocks resolves the problem. Yes, honestly. I have 3 factory-overclocked GPUs. GPU 1 was factory-overclocked way too aggressively, and I've had to dial it back quite a bit to be completely stable in my games and with GPUGrid. GPU 2 was factory-overclocked too little, and I could push it even farther before noticing problems. And GPU 3 was factory-overclocked just right. Forgive me for trying to help. |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Nanoprobe, it may be the case that these Noelia WUs tax the card more than other WUs, and it does appear (from what I've seen) to affect the smaller/older cards more; the same WUs that fail on your GTX750Ti complete on other systems, but some also fail on the older/smaller cards. While GPU temps look OK, the GDDR5 memory temps might be quite high. I would suggest reducing the GDDR5 clocks and the GPU clocks by 10% to see if that prevents the errors recurring, or just crunch the long WUs, which are a different type (and are now fixed). I recently used XP with a couple of GPUs and found the drivers to be not great. I would also suggest a regular cold start, just in case of runaway errors, which appear to have been the case back on the 20th. Good luck, |
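The 10% reduction suggested above is simple arithmetic; a minimal Python sketch follows. The helper name is hypothetical, the 1020 MHz core value is the figure quoted earlier in the thread, and the memory value is only a placeholder to be replaced with whatever your own tool reports.

```python
def reduce_clock(clock_mhz, percent=10):
    """Return the clock lowered by `percent`, rounded down to whole MHz."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return clock_mhz * (100 - percent) // 100


core_mhz = 1020    # figure quoted earlier in the thread
memory_mhz = 1350  # hypothetical placeholder, not a measured value

print(reduce_clock(core_mhz))    # 918
print(reduce_clock(memory_mhz))  # 1215
```

The result is only a starting point to enter in your clocking utility; if the errors persist, the same helper can be called with a larger `percent`.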
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
Jacob: Sometimes this is the only (and the fastest) way to fix a malfunctioning system. I had some GPUGrid app crashes in the past on one of my dual-GPU systems which caused the other GPU to fail tasks too. In my opinion it's good practice to restart a Windows-based system once a week (by a scheduled task), regardless of whether it's running error-free, to maintain its stability (especially when running GPU and CPU tasks simultaneously). "Let's put it that way: If GPU stuff is not running properly on a majority of systems and several users have the same experience and we (the users) do that for free - we shut down an overall properly running system for that?" This is more like a rhetorical question, but, as you probably know, there's no warranty that any software (free or commercial) will work on every existing piece of hardware. Besides, your question takes a set of other software as the reference that qualifies a system as properly running, but it follows from the "no warranty" point that no such set of software exists. To put it another way: I wouldn't call a system properly running if GPUGrid tasks produce "The simulation has become unstable. Terminating to avoid lock-up" messages on that particular system only, while these tasks run fine on the next host they were assigned to. "I really don't think so." If someone asks for help, it is because they can't figure out the reason for the error, so it might be useful to try things which don't make sense at first sight. I have a GTX780Ti on which I had to reduce the GDDR5 clock to 2900MHz (from 3500MHz) to make it work with GPUGrid (it was brand new). GPUs (and other components) age, so they might not perform as well as before, and different tasks tax the GPU differently. You can't step in the same river twice. "If GPU will not work properly, I simply switch to something else - you know, MY PC, MY time, MY decision." If all else fails, or you've tired of trying different workarounds, you can do that.
Still, fixing the errors on a given system is not the project's responsibility. |
|
Joined: 26 Feb 12 Posts: 184 Credit: 222,376,233 RAC: 0
|
"The problem is still ongoing, for you." I think 1 error out of 20 tasks is about the same as I was experiencing before the problem WUs arrived. And just for the record, the task you linked to completed and validated. The last failed one was more than 2 days ago. https://www.gpugrid.net/result.php?resultid=14213349 |
|
Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0
|
Please keep my advice (use lower clocks) in mind, the next time you try to troubleshoot any GPU errors. |
|
Joined: 20 Jul 14 Posts: 732 Credit: 130,089,082 RAC: 0
|
"Please keep my advice (use lower clocks) in mind, the next time you try to troubleshoot any GPU errors." Jacob, please keep this point in mind: your tips are always appreciated, and we are fortunate to have you! :) I will also make adjustments on my GTX 760 via Precision X, because I also get errors with LONG RUNS (Gerard). I will publish the settings and the results in this thread. Of course, I am also thinking of Retvari Zoltan and skgiven, whose advice is also very valuable :) Thanks guys! [CSF] Thomas H.V. Dupont Founder of the team CRUNCHERS SANS FRONTIERES 2.0 www.crunchersansfrontieres |
©2026 Universitat Pompeu Fabra