Cuda Error
| Author | Message |
|---|---|
The Gas Giant (Joined: 20 Sep 08; Posts: 54; Credit: 607,157; RAC: 0)
|
My last returned WU was invalid. The stderr output file reports "Cuda error: Kernel [kick_drift_kernel] failed in file 'step.cu' in line 46". The messages in BOINC were:
16/10/2008 12:09:10 AM|PS3GRID|Computation for task Tq15782-GPUTEST3-5-10-acemd_1 finished
16/10/2008 12:09:10 AM|PS3GRID|Output file Tq15782-GPUTEST3-5-10-acemd_1_1 for task Tq15782-GPUTEST3-5-10-acemd_1 absent
16/10/2008 12:09:10 AM|PS3GRID|Output file Tq15782-GPUTEST3-5-10-acemd_1_2 for task Tq15782-GPUTEST3-5-10-acemd_1 absent
16/10/2008 12:09:10 AM|PS3GRID|Output file Tq15782-GPUTEST3-5-10-acemd_1_3 for task Tq15782-GPUTEST3-5-10-acemd_1 absent
Any ideas? Paul. |
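For anyone wanting to spot this pattern in a longer message log, here is a minimal Python sketch that collects the output files reported as absent for each task. It assumes the `date time|project|text` layout exactly as quoted in the post; the function name `absent_outputs` and the regular expressions are illustrative, not part of BOINC.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the lines quoted in the post:
#   "<date> <time> <AM/PM>|<project>|<message text>"
LINE_RE = re.compile(r"^(?P<when>[^|]+)\|(?P<project>[^|]+)\|(?P<text>.*)$")
ABSENT_RE = re.compile(
    r"^Output file (?P<file>\S+) for task (?P<task>\S+) absent$"
)


def absent_outputs(lines):
    """Map each task name to the output files the client reported as absent."""
    missing = defaultdict(list)
    for line in lines:
        parsed = LINE_RE.match(line.strip())
        if parsed is None:
            continue  # not a message line in the assumed layout
        hit = ABSENT_RE.match(parsed.group("text").strip())
        if hit is not None:
            missing[hit.group("task")].append(hit.group("file"))
    return dict(missing)


# The four messages quoted in the post above.
log = [
    "16/10/2008 12:09:10 AM|PS3GRID|Computation for task Tq15782-GPUTEST3-5-10-acemd_1 finished",
    "16/10/2008 12:09:10 AM|PS3GRID|Output file Tq15782-GPUTEST3-5-10-acemd_1_1 for task Tq15782-GPUTEST3-5-10-acemd_1 absent",
    "16/10/2008 12:09:10 AM|PS3GRID|Output file Tq15782-GPUTEST3-5-10-acemd_1_2 for task Tq15782-GPUTEST3-5-10-acemd_1 absent",
    "16/10/2008 12:09:10 AM|PS3GRID|Output file Tq15782-GPUTEST3-5-10-acemd_1_3 for task Tq15782-GPUTEST3-5-10-acemd_1 absent",
]

result = absent_outputs(log)
for task, files in result.items():
    print(f"{task}: {len(files)} output file(s) absent")
```

A task that shows "finished" followed by all of its output files absent, as here, is one that ended without writing results, which fits the kernel failure reported in stderr.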
GDF (Joined: 14 Mar 07; Posts: 1958; Credit: 629,356; RAC: 0) |
This is usually given by an overclocked unstable system. gdf |
The Gas Giant (Joined: 20 Sep 08; Posts: 54; Credit: 607,157; RAC: 0)
|
Thanks. I had increased the OC a little more last night....damn. Now set back to where it was. |
The Gas Giant (Joined: 20 Sep 08; Posts: 54; Credit: 607,157; RAC: 0)
|
Another error on this WU. Temperatures were low, as it was the cool part of the day. Why does this happen only at the end of a WU? The last 7 days ran fine. |
The Gas Giant (Joined: 20 Sep 08; Posts: 54; Credit: 607,157; RAC: 0)
|
Another error. Again at the end of the WU.
30/10/2008 2:43:00 AM|PS3GRID|Computation for task NG16509-GPUTEST4-3-10-acemd_0 finished
30/10/2008 2:43:00 AM|PS3GRID|Output file NG16509-GPUTEST4-3-10-acemd_0_1 for task NG16509-GPUTEST4-3-10-acemd_0 absent
30/10/2008 2:43:00 AM|PS3GRID|Output file NG16509-GPUTEST4-3-10-acemd_0_2 for task NG16509-GPUTEST4-3-10-acemd_0 absent
30/10/2008 2:43:00 AM|PS3GRID|Output file NG16509-GPUTEST4-3-10-acemd_0_3 for task NG16509-GPUTEST4-3-10-acemd_0 absent
Hmmm..... |
|
Joined: 17 Aug 08; Posts: 2705; Credit: 1,311,122,549; RAC: 0 |
Seems like you're getting them rather frequently now. Lower your OC? MrS Scanning for our furry friends since Jan 2002 |
The Gas Giant (Joined: 20 Sep 08; Posts: 54; Credit: 607,157; RAC: 0)
|
If it was excessive OC, then I have a couple of questions:
1. Why does the WU run to the end before it has a problem?
2. Why did it run fine for 7 days, and the 5 days before that, at the same OC settings?
3. My WUs are currently ending in the early hours of the morning, so temps are quite cool. (OK, a statement, not a question.)
Dust filters are also clean. Yesterday was very warm and that WU completed without error. Hmmm. Live long and BOINC! |
|
Joined: 17 Aug 08; Posts: 2705; Credit: 1,311,122,549; RAC: 0 |
Errors due to OC can be highly random, especially when you are sitting right at the edge of stability. You'd expect it to be a bit more systematic than what you're seeing, but GDF's "This is usually given by an overclocked unstable system." should really set your alarm bells ringing. I'd say switch the machine off, physically disconnect the power cord for >15 min and try again. If you still get errors, lower the OC by at least 54 MHz shader and 27 MHz core (both correspond to one clock speed step) and see what you get. MrS Scanning for our furry friends since Jan 2002 |
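The arithmetic behind "one clock speed step" can be sketched in a few lines of Python. The step sizes (54 MHz shader, 27 MHz core) come from the post above; the starting clocks of 1620 MHz and 702 MHz are hypothetical values chosen only for illustration, and `step_down` is an illustrative helper, not a tool from any overclocking utility.

```python
SHADER_STEP_MHZ = 54  # one shader clock step, as stated in the post
CORE_STEP_MHZ = 27    # one core clock step, as stated in the post


def step_down(clock_mhz, step_mhz, steps=1):
    """Return the clock lowered by a whole number of steps, never below zero."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return max(clock_mhz - steps * step_mhz, 0)


# Hypothetical current overclock; substitute the values from your own card.
shader_mhz = 1620
core_mhz = 702

new_shader = step_down(shader_mhz, SHADER_STEP_MHZ)
new_core = step_down(core_mhz, CORE_STEP_MHZ)

print(f"shader: {shader_mhz} -> {new_shader} MHz")
print(f"core:   {core_mhz} -> {new_core} MHz")
```

Passing `steps=2` or more gives the values to try next if errors continue after the first reduction.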
©2025 Universitat Pompeu Fabra