Message boards :
Graphics cards (GPUs) :
Client error - Compute error
| Author | Message |
|---|---|
|
Joined: 8 Oct 08 Posts: 15 Credit: 29,603,934 RAC: 0
|
Has anyone ever had this problem? I am using the 6.3.21 client and have had it for the second time in a few days. Before that I was using 6.3.19 and never had this problem! My system is not running 24/7; it only crunches when I am at home. Normally it takes me 2-3 days to finish. Does anyone have an idea? <core_client_version>6.3.21</core_client_version> <![CDATA[ <message> Unzulässige Funktion. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # Using CUDA device 1 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz MDIO ERROR: cannot open file "restart.coor" # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz Cuda error: Kernel [reduce4_kernel] failed in file 'reduction.cu' in line 143 : unspecified launch failure. <core_client_version>6.3.21</core_client_version> <![CDATA[ <message> Unzulässige Funktion. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz MDIO ERROR: cannot open file "restart.coor" Cuda error: Kernel [angle_kernel] failed in file 'bonded.cu' in line 547 : unspecified launch failure. |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
I'd reboot the machine and take the OC back, even if it's a factory-set one, and see if the error still appears. You've got an interesting config with 2 quite different GPUs. MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 8 Oct 08 Posts: 15 Credit: 29,603,934 RAC: 0
|
As I said, I don't overclock the cards. I normally use this special configuration for games. I use an Intel board with no SLI. The video card is the GTX280; the 9800GT is only for PhysX. I hope that in the near future it will be possible to assign which card is used for crunching. This configuration worked for crunching with client 6.3.19. Maybe the new client has a problem with it. Maybe I should reinstall the older one. |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
The normal shader clock of a GTX 280 is 1.29 GHz and for the 9800GT it's 1.51 GHz, so both of your cards are factory overclocked. Which of your cards is actually crunching now? 2-3 days looks like the 9800GT. The OC on the GT doesn't look so bad, but the one on the GTX might lead to problems. There have already been cases of factory-overclocked cards which failed when a new game came out that stressed the cards in a way no software had done before. I think it was Doom 3, back in the day. I seriously doubt it's got anything to do with 6.3.21. BOINC only launches the GPU-Grid client; the software which does the actual calculations hasn't changed. Your computers are hidden so I can't take a look. Do the WUs fail somewhere in the middle or upon initialization? MrS Scanning for our furry friends since Jan 2002 |
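To put numbers on the factory overclock, here is a minimal Python sketch. It assumes the reference shader clocks quoted in the post above (1.29 GHz and 1.51 GHz) and the clock rates reported in the stderr output earlier in the thread; the function name `overclock_percent` is only illustrative.

```python
# Reference shader clocks as quoted in the post above, converted to kHz.
REFERENCE_KHZ = {"GeForce GTX 280": 1_290_000, "GeForce 9800 GT": 1_510_000}

# Clock rates reported in the stderr output posted in this thread.
REPORTED_KHZ = {"GeForce GTX 280": 1_404_000, "GeForce 9800 GT": 1_620_000}


def overclock_percent(reported_khz, reference_khz):
    """Return how far the reported clock is above the reference, in percent."""
    return (reported_khz / reference_khz - 1.0) * 100.0


for name, reference in REFERENCE_KHZ.items():
    percent = overclock_percent(REPORTED_KHZ[name], reference)
    print(f"{name}: {percent:.1f}% above reference")
```

With these inputs the GTX 280 comes out at roughly 8.8% above reference and the 9800 GT at roughly 7.3%.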
|
Joined: 8 Oct 08 Posts: 15 Credit: 29,603,934 RAC: 0
|
You're right, they are factory OCed. But there was no problem with the first, let's say, 10 WUs. I saw that when a WU is interrupted by a reboot, it does not always resume on the same GPU. How this is decided I don't know. So sometimes, when the 9800GT ends up doing the work, it is hard to meet the deadline. If you want to check the details, I have two computers crunching with GPUs: 12834 (GTX280/9800GT) and 16306 (GTX280). It seems only 12834, with two cards, has this problem. Maybe the application has a problem when a WU is started on one card and, after a restart of the PC, resumed on the other card. But I have already had such WUs, and there was no problem with them working on both cards. <core_client_version>6.3.14</core_client_version> <![CDATA[ <stderr_txt> # Using CUDA device 1 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz MDIO ERROR: cannot open file "restart.coor" # Using CUDA device 1 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 1 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Time per step: 32.226 ms # Approximate elapsed time for entire WU: 27391.880 s called boinc_finish </stderr_txt> ]]> Hmm, I checked some of the WUs on this machine, and those that ended with no error were all crunched with client 6.3.14. To me it looked like they were crunched normally, then I saw in my account that they got an error. I only had the same problems everyone had with the "download bug" a few days ago. |
|
Joined: 8 Oct 08 Posts: 15 Credit: 29,603,934 RAC: 0
|
The WU a few minutes ago had no problem with BOINC client 6.3.21. Very strange (and interesting this "MDIO ERROR")! Looks like the application (this WU was 6.48) is always using CUDA device 0, good to know because this is the faster one. <core_client_version>6.3.21</core_client_version> <![CDATA[ <stderr_txt> # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 799200 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz MDIO ERROR: cannot open file "restart.coor" # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz # Time per step: 24.948 ms # Approximate elapsed time for entire WU: 21205.906 s called boinc_finish </stderr_txt> ]]> |
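Whether a task stayed on one GPU or moved between cards across restarts can be read off the "# Using CUDA device N" lines in the stderr dumps. The following is a rough Python sketch of that check, run against one of the failing logs posted at the top of this thread. The function names are made up for illustration, and the regular expressions only assume the line formats visible in these posts.

```python
import re

# Patterns based only on the line formats visible in the posted stderr output.
DEVICE_RE = re.compile(r"# Using CUDA device (\d+)")
FAILURE_RE = re.compile(
    r"Cuda error: Kernel \[(\w+)\] failed in file '([^']+)' in line (\d+)"
)


def devices_used(stderr_text):
    """Return the CUDA device index chosen at each (re)start, in order."""
    return [int(index) for index in DEVICE_RE.findall(stderr_text)]


def kernel_failures(stderr_text):
    """Return (kernel, source file, line) for each reported kernel failure."""
    return [(k, f, int(n)) for k, f, n in FAILURE_RE.findall(stderr_text)]


def switched_device(stderr_text):
    """True if the task ran on more than one device across restarts."""
    return len(set(devices_used(stderr_text))) > 1


# First failing log from the opening post, flattened as it appears on the forum.
SAMPLE = """# Using CUDA device 1 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz MDIO ERROR: cannot open file "restart.coor" # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 1404000 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz Cuda error: Kernel [reduce4_kernel] failed in file 'reduction.cu' in line 143 : unspecified launch failure."""

print(devices_used(SAMPLE))
print(kernel_failures(SAMPLE))
print(switched_device(SAMPLE))
```

For this sample the task started on device 1 and resumed on device 0 before the kernel failure. The successful 6.3.14 log in the thread also switched devices, so a switch alone does not explain the error.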
Stefan Ledwina Joined: 16 Jul 07 Posts: 464 Credit: 298,573,998 RAC: 0
|
Hey Fireman! Have you enabled SLI? If SLI is enabled, only GPU 0 will be used... pixelicious.at - my little photoblog |
|
Joined: 8 Oct 08 Posts: 15 Credit: 29,603,934 RAC: 0
|
Hey Stefan! How are you? Hope fine?! No, I use an Intel board with no SLI. The video card is the GTX280; the 9800GT is for games which use PhysX. But for me it is better when only the GTX280 is crunching, otherwise it would sometimes be a problem to stay within the deadline. My system does not run 24/7! |
Krunchin-Keith [USA] Joined: 17 May 07 Posts: 512 Credit: 111,288,061 RAC: 0
|
[quote]Last modified: 7 Nov 2008 16:35:39 UTC The WU a few minutes ago had no problem with Boinc client 6.3.21. Very strange (and interesting this "MDIO ERROR")! Looks like the application (this WU was 6.48) is always using CUDA device 0, good to know because this is the faster one. <core_client_version>6.3.21</core_client_version> <![CDATA[ <stderr_txt> # Using CUDA device 0 # Device 0: "GeForce GTX 280" # Clock rate: 799200 kilohertz # Device 1: "GeForce 9800 GT" # Clock rate: 1620000 kilohertz MDIO ERROR: cannot open file "restart.coor" [/quote] That MDIO error is not a real error. What happens is that at the initial start of a task, the application looks for a checkpoint file; since it has never written one yet, it does not find one. Hence the error only shows up on the initial start. Once the task has checkpointed, a restart finds the file and you don't get the error again. Look at any task on your or anyone else's computer and it is always there. |
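The behaviour described above is the usual resume-from-checkpoint pattern. Here is a minimal Python sketch of it, not the GPU-Grid application's actual code: the helper names are hypothetical, and a plain step counter stands in for whatever the real restart.coor file contains.

```python
import os
import tempfile


def load_start_step(checkpoint_path):
    """Return the step to resume from, or 0 if no checkpoint exists yet."""
    try:
        with open(checkpoint_path) as handle:
            return int(handle.read().strip())
    except FileNotFoundError:
        # First start of the task: nothing has been checkpointed yet, so the
        # failed open is reported once and the run simply starts from scratch.
        name = os.path.basename(checkpoint_path)
        print(f'MDIO ERROR: cannot open file "{name}"')
        return 0


def write_checkpoint(checkpoint_path, step):
    """Record the current step so that a later restart can resume from it."""
    with open(checkpoint_path, "w") as handle:
        handle.write(str(step))


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "restart.coor")
    first_start = load_start_step(path)   # no file yet: message printed
    write_checkpoint(path, 500)
    after_restart = load_start_step(path)  # file found: no message
    print(first_start, after_restart)
```

The message appears only on the very first start, which matches the logs in this thread, where it shows up once per task no matter how many restarts follow.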
Stefan Ledwina Joined: 16 Jul 07 Posts: 464 Credit: 298,573,998 RAC: 0
|
Yeah Keith, but I think Fireman is talking about these errors: Cuda error: Kernel [angle_kernel] failed in file 'bonded.cu' in line 547 : unspecified launch failure. and Cuda error: Kernel [reduce4_kernel] failed in file 'reduction.cu' in line 143 : unspecified launch failure. "Hey Stefan! How are you? Hope fine?!" Yes, thank you. Hope you're fine too!? Sorry, I forgot about you using an Intel board without SLI when I read your last post about only one GPU being used... I also get computation errors sometimes, mostly on my 9800 GTX SC, which is also factory overclocked. Sometimes it runs fine for weeks, then it has two or three different errors in a row, and then it runs fine again... Could you try setting the shader/GPU/memory speeds to the factory settings of a normal card to see if the problem vanishes? pixelicious.at - my little photoblog |
Krunchin-Keith [USA] Joined: 17 May 07 Posts: 512 Credit: 111,288,061 RAC: 0
|
@stefan Can't quote properly since the forum buttons are missing, but I thought 'Very strange (and interesting this "MDIO ERROR")!' in his post referred to the MDIO error shown in his post: ... MDIO ERROR: cannot open file "restart.coor" ... My comments refer to that. |
|
Joined: 8 Oct 08 Posts: 15 Credit: 29,603,934 RAC: 0
|
Sorry. I didn't know that MDIO ERROR is not an error. Not much time for reading all the descriptions and discussions. I will see if it is running now, because I had an idea today. Maybe it is not a good idea to let the GPU crunch when I play LOTROL. Now I always stop the BOINC client and restart it after playing. I hope I don't forget this in the future. A possible conflict in video memory??? We will see. |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
"Maybe it is not a good idea to let the GPU crunch when I play LOTROL." The usual error caused by gaming would be the "out of memory" message, which you're not getting. It won't hurt to switch off BOINC while gaming (or just suspend the active GPU-Grid task), but I doubt it will remove the cause of the errors. On a side note: how long would you have to run like that to be sure the error is gone? MrS Scanning for our furry friends since Jan 2002 |
Krunchin-Keith [USA] Joined: 17 May 07 Posts: 512 Credit: 111,288,061 RAC: 0
|
"Sorry. I didn't know that MDIO ERROR is not an error. Not much time for reading all the descriptions and discussions." You can set the new option (6.3.13 or above) in cc_config.xml to suspend BOINC while a given application is running. You need to fill in the application name, in this case your game. When you start the game, BOINC will suspend all tasks for you and automatically resume them when you exit the game. Of course, you also need to set the 'Leave applications in memory while suspended?' option in preferences to no, so that they are removed from memory. cc_config.xml <cc_config> <options> <exclusive_app>appname.exe</exclusive_app> </options> </cc_config> appname must appear exactly as the OS shows it; it is case sensitive, and no drive\paths are allowed. You can have multiple exclusive_app options in the file. Note: at this time, BOINC suspends all tasks, both CPU and GPU, when using this option. |
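Since the forum flattened the XML above onto one line, here is a small Python sketch that prints the same cc_config.xml layout with one exclusive_app entry per executable. The helper name `build_cc_config` and the second executable name `othergame.exe` are placeholders for illustration; only the element names come from the post above.

```python
from xml.sax.saxutils import escape


def build_cc_config(exclusive_apps):
    """Return cc_config.xml text with one <exclusive_app> per executable name."""
    lines = ["<cc_config>", "  <options>"]
    for app in exclusive_apps:
        # Per the post above, only the bare executable name is allowed,
        # so anything that looks like a drive or path is rejected.
        if "\\" in app or "/" in app or ":" in app:
            raise ValueError(f"use the bare executable name, not a path: {app!r}")
        lines.append(f"    <exclusive_app>{escape(app)}</exclusive_app>")
    lines += ["  </options>", "</cc_config>"]
    return "\n".join(lines) + "\n"


# appname.exe is the placeholder from the post; othergame.exe is hypothetical.
print(build_cc_config(["appname.exe", "othergame.exe"]))
```

The printed text would be saved as cc_config.xml, with the names replaced by the game's executable exactly as the OS shows it, including case.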
[AF>HFR>RR] alipse Joined: 28 Oct 08 Posts: 4 Credit: 20,372,566 RAC: 0
|
Hi, I just had two WUs end in error. 0937.44 stderr out <core_client_version>6.3.19</core_client_version> <![CDATA[ <message> Fonction incorrecte. (0x1) - exit code 1 (0x1) </message> <stderr_txt> # Using CUDA device 0 # Device 0: "GeForce 9800 GTX/9800 GTX+" # Clock rate: 1836000 kilohertz MDIO ERROR: cannot open file "restart.coor" # Using CUDA device 0 # Device 0: "GeForce 9800 GTX/9800 GTX+" # Clock rate: 1836000 kilohertz Cuda error: Kernel [frc_sum_kernel_angle] failed in file 'force.cu' in line 223 : unspecified launch failure. </stderr_txt> ]]> Validate state Invalid Claimed credit 3232.06365740741 It's a bit annoying, because they failed at the end of the units, meaning I lost about 24 hours of crunching (and got no credits :( ). I'm using Windows XP32, 9800 GTX+, BOINC 6.3.19, drivers 180.43. Do you want me to post the other invalid WU? |