acemdlong application 8.14 - discussion

Message boards : News : acemdlong application 8.14 - discussion

Richard Haselgrove

Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 326,008
Message 32538 - Posted: 30 Aug 2013, 14:58:44 UTC - in response to Message 32536.  

Getting the unknown error number crashes on kid WUs

It isn't the error number which is unknown, it's the plain-English description for it.

Yours got a 0xffffffffc0000005: mine has just died with a 0xffffffffffffff9f, description equally unknown. Task 7221543.

Ty for the clarification.

No problem.

Looking a little further down, my 0xffffffffffffff9f (more legibly described as 'exit code -97') also says

# Simulation has crashed.

Since then, I've had another with the same failure: Task 7224143.
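As a small illustration of the point above, the long hexadecimal values are just signed exit codes printed as unsigned 64-bit numbers. The sketch below assumes two's-complement interpretation; the helper name `to_signed` is made up for this example. The low 32 bits of the first code, 0xC0000005, are the standard Windows status value for an access violation.

```python
def to_signed(value: int, bits: int = 64) -> int:
    """Interpret an unsigned integer as a two's-complement signed value."""
    value &= (1 << bits) - 1          # keep only the lowest `bits` bits
    if value >= 1 << (bits - 1):      # sign bit set, so the value is negative
        value -= 1 << bits
    return value


# The code from the post above, read as a signed 64-bit number.
exit_code = to_signed(0xffffffffffffff9f)

# The other code from the post: its low 32 bits are the Windows status value.
low32 = 0xffffffffc0000005 & 0xFFFFFFFF

print(exit_code)      # -97
print(hex(low32))     # 0xc0000005
```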

TJ

Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 32542 - Posted: 30 Aug 2013, 16:01:02 UTC - in response to Message 32532.  

Noelia has over 1000 WU to submit but we cannot as there are still these problems.

Now they are running on beta to test.

gdf

Well, please put 500 of them in the beta queue then, so our rigs can run overnight while we sleep.
Greetings from TJ

Operator

Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Message 32554 - Posted: 30 Aug 2013, 22:11:52 UTC

Using the 803-55 application now (on Titans), I received two NATHAN KIDKIX WUs.

Both took almost 9,000 seconds longer to complete than with the previous 800-55 version.

Still using the 326.84 drivers.

I don't think this 803 is actually comparable to the 800. I think something changed, and not in a good way.

http://www.gpugrid.net/result.php?resultid=7224469

http://www.gpugrid.net/result.php?resultid=7224304

I've also received at least one Nathan baxbimx and a NOELIA KLEBE that I had to abort because they were going nowhere. Just stuck.

Operator

MJH

Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 32555 - Posted: 30 Aug 2013, 22:13:26 UTC - in response to Message 32554.  

803 and 800 are the exact same binary.

MJH

Operator

Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Message 32556 - Posted: 30 Aug 2013, 22:19:31 UTC - in response to Message 32555.  

803 and 800 are the exact same binary.

MJH


Okay. I think I found the issue. My Precision X was off.

Thanks,

Back in the fast lane now.

Operator.


skgiven
Volunteer moderator
Volunteer tester

Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 32563 - Posted: 31 Aug 2013, 12:12:17 UTC - in response to Message 32556.  


MJH

Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 32565 - Posted: 31 Aug 2013, 12:18:26 UTC - in response to Message 32563.  

8.04 is a new beta app. Includes a bit more debugging to help me find the cause of the remaining common failure modes.

MJH

TJ

Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 32569 - Posted: 31 Aug 2013, 15:14:01 UTC - in response to Message 32565.  

8.04 is a new beta app. Includes a bit more debugging to help me find the cause of the remaining common failure modes.

MJH

Hello MJH,
Do we need to post the error messages or links to them, or can you find them yourself on the server side? I have 6 errors with 8.04 for the Harvey test and Noelia_Klebebeata.
8.00 and 8.02 did okay on my 660 and 770.
I have the most errors on the 660; both cards have 2GB.
Greetings from TJ

MJH

Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 32570 - Posted: 31 Aug 2013, 15:24:36 UTC - in response to Message 32569.  
Last modified: 31 Aug 2013, 15:36:20 UTC

Hi,

No need to post errors here, they all end up in a database that I can inspect.

MJH

klepel

Joined: 23 Dec 09
Posts: 189
Credit: 4,798,881,008
RAC: 343
Message 32613 - Posted: 2 Sep 2013, 3:31:52 UTC

As Stderr output reports:
<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
aborted by user
</message>
]]>
I wanted to report that the WU "I74R6-NATHAN_KIDKIXc22_6-6-50-RND6702_0" (version 8.03) caused a driver crash and several blue screens until I was able to get rid of it.

The following task, "I83R9-NATHAN_baxbimx-4-6-RND5261_0" (short queue), seems to be going nowhere. I am pretty sure I will find another blue screen tomorrow morning when I get to the office.

This video card does crash at times, but this was the first time that I was not able to simply restart the computer and download a new task.

klepel

Joined: 23 Dec 09
Posts: 189
Credit: 4,798,881,008
RAC: 343
Message 32635 - Posted: 2 Sep 2013, 15:44:00 UTC

EDIT to my previous post: Just so I don't make false accusations, the task "I83R9-NATHAN_baxbimx-4-6-RND5261_0" did not run until today because I forgot to lift the suspension in BOINC Manager. It is running fine now.

MJH

Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 32864 - Posted: 9 Sep 2013, 21:56:38 UTC

The version 8.14 application that's been live on short is now out on acemdlong.
For those of you that haven't been paying attention to recent developments, this version has improved error recovery and fixes the suspend/resume problems.

MJH

Lazydude

Joined: 25 Sep 08
Posts: 12
Credit: 161,238,437
RAC: 0
Message 32883 - Posted: 11 Sep 2013, 9:12:26 UTC

Hi!
I get a lot of these shutdowns/restarts with 8.14.
This did not happen with 800/803.
http://www.gpugrid.net/result.php?resultid=7266298

# GPU [GeForce GTX 780] Platform [Windows] Rev [3203] VERSION [55]
# SWAN Device 0 :
# Name : GeForce GTX 780
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.5
# PCI ID : 0000:01:00.0
# Device clock : 954MHz
# Memory clock : 3024MHz
# Memory width : 384bit
# Driver version : r325_00 : 32680
# GPU 0 : 62C
# GPU 1 : 58C
# GPU 0 : 63C
# GPU 0 : 64C
# GPU 0 : 65C
# GPU 0 : 66C
# GPU 0 : 67C
# Access violation : progress made, try to restart]
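For anyone wanting to pull the temperature readings out of a log like the one above, here is a minimal sketch. It assumes the temperature lines always look like `# GPU <index> : <temp>C`, as they do in this excerpt; the function name `max_gpu_temps` and the `SAMPLE` string are made up for illustration.

```python
import re
from typing import Dict

# Matches lines such as "# GPU 0 : 62C" but not the "# GPU [GeForce ...]" banner.
TEMP_LINE = re.compile(r"^#\s*GPU\s+(\d+)\s*:\s*(\d+)C$")

SAMPLE = """\
# GPU [GeForce GTX 780] Platform [Windows] Rev [3203] VERSION [55]
# GPU 0 : 62C
# GPU 1 : 58C
# GPU 0 : 63C
# GPU 0 : 67C
# Access violation : progress made, try to restart]
"""


def max_gpu_temps(stderr_text: str) -> Dict[int, int]:
    """Return the highest temperature (in C) logged for each GPU index."""
    peaks: Dict[int, int] = {}
    for line in stderr_text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            gpu, temp = int(match.group(1)), int(match.group(2))
            peaks[gpu] = max(temp, peaks.get(gpu, temp))
    return peaks


print(max_gpu_temps(SAMPLE))  # {0: 67, 1: 58}
```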

5pot

Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Message 32885 - Posted: 11 Sep 2013, 12:23:30 UTC

While I wasn't getting them with 803, I haven't had a crash yet on my 780s.

So while it does slow the task down a little bit, that's better than crashing the task, to me.

John C MacAlister

Joined: 17 Feb 13
Posts: 181
Credit: 144,871,276
RAC: 0
Message 32888 - Posted: 11 Sep 2013, 14:20:58 UTC

Hi, MJH:

I am getting short and long tasks for CUDA 4.2. My driver on the 650 Ti GPUs is 314.22 as I had too many failures with driver versions 320.57 and 320.18. What is the difference between CUDA 4.2 and 5.5 and do I need to do anything more to successfully process GPUGRID tasks in an optimum way? Despite wanting to, I cannot buy the higher performing video cards.

Thanks,

John

TJ

Joined: 26 Jun 09
Posts: 815
Credit: 1,470,385,294
RAC: 0
Message 32889 - Posted: 11 Sep 2013, 14:48:15 UTC

Hi Matt,

I had 50% errors with your previous CRASH tests, the Santi's on my 660, with the -97 error you wanted to see lots of. Well, have a look if you have time.

For the last 3 days I sometimes see this in the output file:
# BOINC suspending at user request (exit)

However, I have done nothing; I was asleep, so the rig was unattended.
It is there when a WU failed, but also with a success, as in the last Nathan LR. On the 770 everything works perfectly.
Perhaps the latest drivers are optimized for the 7xx series by nVidia and not so good for the 6xx series of cards.
Greetings from TJ

Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 32890 - Posted: 11 Sep 2013, 14:54:02 UTC - in response to Message 32889.  
Last modified: 11 Sep 2013, 14:54:56 UTC

For the last 3 days I sometimes see this in the output file:
# BOINC suspending at user request (exit)

However, I have done nothing; I was asleep, so the rig was unattended.


Sometimes benchmarks automatically run, and GPU work gets temporarily suspended. Also, sometimes high-priority CPU jobs step in, and GPU work gets temporarily suspended.

You can always look in Event Viewer, or the stdoutdae.txt file, for more information.

nanoprobe

Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Message 32891 - Posted: 11 Sep 2013, 15:19:07 UTC - in response to Message 32889.  

Hi Matt,

I had 50% errors with your previous CRASH tests, the Santi's on my 660, with the -97 error you wanted to see lots of. Well, have a look if you have time.

For the last 3 days I sometimes see this in the output file:
# BOINC suspending at user request (exit)

However, I have done nothing; I was asleep, so the rig was unattended.
It is there when a WU failed, but also with a success, as in the last Nathan LR. On the 770 everything works perfectly.
Perhaps the latest drivers are optimized for the 7xx series by nVidia and not so good for the 6xx series of cards.

Add this line of code to your cc_config file to stop the CPU benchmarks from running.
<skip_cpu_benchmarks>1</skip_cpu_benchmarks>

If you're using a version of BOINC above 7.0.55, you can go to Advanced > Read config files and the changes will take effect. If you're using a version lower than 7.0.55, you'll have to shut down and restart BOINC for the change to take effect.
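If you would rather script the change than edit the file by hand, the following is a rough sketch. It assumes the usual layout in which the flag sits inside the `<options>` section of `cc_config.xml`; the function name `set_skip_cpu_benchmarks` is made up here. It works on the XML text only, so reading and writing the file in your BOINC data directory is left to you. Note that this parser drops any XML comments in the file.

```python
import xml.etree.ElementTree as ET


def set_skip_cpu_benchmarks(xml_text: str) -> str:
    """Return cc_config XML text with skip_cpu_benchmarks set to 1."""
    # Start from an empty <cc_config> root if no config exists yet.
    if xml_text.strip():
        root = ET.fromstring(xml_text)
    else:
        root = ET.Element("cc_config")

    # Assumption: the flag belongs under <options>; create it if missing.
    options = root.find("options")
    if options is None:
        options = ET.SubElement(root, "options")

    flag = options.find("skip_cpu_benchmarks")
    if flag is None:
        flag = ET.SubElement(options, "skip_cpu_benchmarks")
    flag.text = "1"

    return ET.tostring(root, encoding="unicode")


print(set_skip_cpu_benchmarks("<cc_config><options></options></cc_config>"))
```

After writing the result back, use Advanced > Read config files or restart BOINC as described above.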

Jacob Klein

Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 32892 - Posted: 11 Sep 2013, 15:24:10 UTC - in response to Message 32891.  
Last modified: 11 Sep 2013, 15:25:01 UTC

For the last 3 days I sometimes see this in the output file:
# BOINC suspending at user request (exit)

However, I have done nothing; I was asleep, so the rig was unattended.


Sometimes benchmarks automatically run, and GPU work gets temporarily suspended. Also, sometimes high-priority CPU jobs step in, and GPU work gets temporarily suspended.

You can always look in Event Viewer, or the stdoutdae.txt file, for more information.


I just did a brief test, where I ran CPU benchmarks manually using: Advanced -> Run CPU benchmarks.
The result in the slots stderr.txt file was:
# BOINC suspending at user request (thread suspend)
# BOINC resuming at user request (thread suspend)

I then ran a test where I selected Activity -> Suspend GPU, then Activity -> Use GPU based on preferences.
The result in the slots stderr.txt file was:
# BOINC suspending at user request (exit)

So...
I no longer believe your issue was caused by benchmarks.
But it still could have been caused by high-priority CPU jobs.
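To check a task's stderr for the two kinds of suspend message seen in the test above, something like the sketch below could be used. It only relies on the message wording shown in this post; the function name `count_suspend_reasons` and the `SAMPLE` string are made up, and the meaning of each reason is based on nothing more than this one test.

```python
import re
from collections import Counter

# Captures the text in parentheses after "BOINC suspending at user request".
SUSPEND_LINE = re.compile(r"^#\s*BOINC suspending at user request \((.+)\)$")

SAMPLE = """\
# BOINC suspending at user request (thread suspend)
# BOINC resuming at user request (thread suspend)
# BOINC suspending at user request (exit)
"""


def count_suspend_reasons(stderr_text: str) -> Counter:
    """Count how often each suspend reason appears in the stderr text."""
    reasons = Counter()
    for line in stderr_text.splitlines():
        match = SUSPEND_LINE.match(line.strip())
        if match:
            reasons[match.group(1)] += 1
    return reasons


print(dict(count_suspend_reasons(SAMPLE)))
```

In the test above, "thread suspend" appeared when benchmarks ran and "exit" appeared when GPU activity was suspended and resumed, so a count of "exit" entries points away from benchmarks as the cause.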

Operator

Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Message 32893 - Posted: 11 Sep 2013, 16:30:39 UTC - in response to Message 32888.  
Last modified: 11 Sep 2013, 16:38:49 UTC

Hi, MJH:

I am getting short and long tasks for CUDA 4.2. My driver on the 650 Ti GPUs is 314.22 as I had too many failures with driver versions 320.57 and 320.18. What is the difference between CUDA 4.2 and 5.5 and do I need to do anything more to successfully process GPUGRID tasks in an optimum way? Despite wanting to, I cannot buy the higher performing video cards.

Thanks,

John


John,

I know you were asking MJH, but I have a GTX 650 Ti (2GB) running the 326.84 drivers without any issues at all.

http://www.gpugrid.net/show_host_detail.php?hostid=155526

Running the older drivers will result in your system getting tasks for the older CUDA version which is less efficient. The project is leaning forward in trying to eventually bring the codebase to the latest CUDA (5.5) version to keep up with the technology and newer GPUs.

You would be wise to try running newer drivers to allow your system to run as efficiently as possible. Again, I can verify that my GTX 650 Ti runs fine on 326.84.

Operator

©2025 Universitat Pompeu Fabra