New acemd beta

Message boards : Graphics cards (GPUs) : New acemd beta

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 14 Mar 07
Posts: 1958
Credit: 629,356
RAC: 0
Message 21333 - Posted: 7 Jun 2011, 15:35:23 UTC
Last modified: 7 Jun 2011, 15:43:57 UTC

I have uploaded a new acemdbeta application for Linux and some workunits to test.

Mainly bug fixes.

gdf

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 261
Message 21334 - Posted: 7 Jun 2011, 15:43:55 UTC - in response to Message 21333.  

Does this new app resolve the Cuda4/downclocking bug discussed in http://www.gpugrid.net/forum_thread.php?id=2534?

If not, may I refer you to http://boinc.berkeley.edu/trac/changeset/23649/, and the new paragraph in http://boinc.berkeley.edu/trac/wiki/AppCoprocessor:

Cleanup on premature exit
The BOINC client may kill your application in the middle. This may leave the GPU in a bad state. To prevent this, call

boinc_begin_critical_section();

before using the GPU, and between GPU kernels do

if (boinc_status.quit_request || boinc_status.abort_request) {
    // cudaThreadSynchronize(); or whatever is needed
    boinc_end_critical_section();
    while (1) boinc_sleep(1);
}
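
A minimal sketch of how the check quoted above might sit inside a kernel loop. The BOINC calls are replaced by hypothetical stand-ins (`g_status`, `begin_critical_section_stub`, `end_critical_section_stub`, `launch_kernel_stub`) so the example compiles on its own; a real application would use `boinc_status`, `boinc_begin_critical_section()` and `boinc_end_critical_section()` from the BOINC API instead. Unlike the quoted snippet, which sleeps until the client kills the process, this sketch simply returns so that it can be run to completion.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the BOINC API, so the sketch builds alone.
   A real application would take these from the BOINC API headers. */
struct status_stub { bool quit_request; bool abort_request; };
static struct status_stub g_status = { false, false };
static int g_critical_depth = 0;

static void begin_critical_section_stub(void) { g_critical_depth++; }
static void end_critical_section_stub(void)   { g_critical_depth--; }

/* Placeholder for one GPU kernel launch. It only simulates the client
   asking the app to quit once kernel number quit_at has been launched. */
static void launch_kernel_stub(int step, int quit_at) {
    if (step == quit_at) g_status.quit_request = true;
}

/* Launches up to n_kernels kernels, checking for a quit or abort request
   between kernels. Returns how many kernels were launched. */
int run_kernels(int n_kernels, int quit_at) {
    int launched = 0;
    g_status.quit_request = false;
    g_status.abort_request = false;

    begin_critical_section_stub();          /* before using the GPU */
    for (int step = 0; step < n_kernels; step++) {
        if (g_status.quit_request || g_status.abort_request) {
            /* A real app would synchronize the GPU here, end the critical
               section, and then sleep until the client terminates it. */
            end_critical_section_stub();
            return launched;
        }
        launch_kernel_stub(step, quit_at);
        launched++;
    }
    end_critical_section_stub();
    return launched;
}

/* Exposes the stub's nesting counter; 0 means no section is open. */
int critical_section_depth(void) { return g_critical_depth; }
```

With these stubs, `run_kernels(5, 1)` stops after the second kernel because the simulated quit request is seen before the third launch, and the critical section is closed on both exit paths.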

GDF
Message 21335 - Posted: 7 Jun 2011, 15:49:28 UTC - in response to Message 21254.  

No. This is a cuda3.1 app.

Yet I don't understand what that means. In the middle of what?

gdf

Richard Haselgrove
Message 21336 - Posted: 7 Jun 2011, 15:59:00 UTC - in response to Message 21335.  

No. This is a cuda3.1 app.

Yet I don't understand what that means. In the middle of what?

gdf

The BOINC API library code is not thread-safe. If BOINC asks the application to quit or suspend during computation, it may terminate threads in an unsafe way. The new nVidia drivers that can handle Cuda4 apps are much more sensitive to this behaviour, even if the running app only uses a lower CUDA level. In self-protection, nVidia has written the new drivers (everything strictly *later than* 266.58, from memory) to down-clock the card into a protective state when the abnormal thread termination is detected.

That's my layman's interpretation - I'll try and get you the full report quickly.

Richard Haselgrove
Message 21337 - Posted: 7 Jun 2011, 16:06:15 UTC

Oops, it's just been pointed out to me that this is a Linux beta app, and my remarks concerned the Windows API, so they are probably not important in this case.

But, since the new API code was only posted last night, it's still worth you knowing about it in preparation for the next Windows application test.

GDF
Message 21347 - Posted: 8 Jun 2011, 7:27:11 UTC - in response to Message 21337.  

The first results are fine. We are going to produce a beta for Windows today.
This application will replace all production apps this week, if all goes well.
gdf

GDF
Message 21351 - Posted: 9 Jun 2011, 10:05:54 UTC - in response to Message 21347.  

Windows application is out.

gdf

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 21354 - Posted: 9 Jun 2011, 13:24:40 UTC - in response to Message 21351.  

Windows application is out.

gdf


I got two of these beta WUs. Both of them failed immediately with exit code -1073741819 (0xc0000005). Maybe they can't stand overclocking?
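
The two numbers in that exit code are the same 32-bit value read as signed and as unsigned, and 0xC0000005 is the Windows status code for an access violation. A small sketch of the conversion follows; the function names `exit_code_to_status` and `describe_status` are made up for illustration and are not part of BOINC or the Windows API.

```c
#include <stdint.h>

/* Reinterpret a signed 32-bit exit code as the unsigned status value
   that Windows usually prints in hexadecimal. */
uint32_t exit_code_to_status(int32_t exit_code) {
    return (uint32_t)exit_code;
}

/* Name the one status code discussed in this thread; anything else is
   reported as unknown rather than guessed. */
const char *describe_status(uint32_t status) {
    switch (status) {
    case 0xC0000005u:
        return "STATUS_ACCESS_VIOLATION";
    default:
        return "unknown";
    }
}
```

Here `exit_code_to_status(-1073741819)` gives 0xC0000005, because 4294967296 - 1073741819 = 3221225477. An access violation on every host points at the application rather than at any one overclocked card, though that is an inference and not something the code proves.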

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 21358 - Posted: 9 Jun 2011, 14:49:15 UTC - in response to Message 21354.  
Last modified: 9 Jun 2011, 17:55:13 UTC

acemdbeta_6.38_windows_intelx86__cuda31 - Application Error

The exception unknown software exception (0xc0000005) occurred in the application at location 0x0040258c.

Click OK to terminate the program
Click on CANCEL to debug the program

System: 2003 Server x64 i7-2600K, 8GB DDR3, 2TB, GTX470 (native clocks, increased fan speeds)

The task ran for 1h45min but stayed at 0% complete, apparently stuck in a loop.

skgiven
Message 21359 - Posted: 9 Jun 2011, 15:50:16 UTC - in response to Message 21358.  

My other two betas also failed, after 7 and 15 sec.

Worth noting that if there is a pop-up error message and you don't choose to end the task, it will continue running indefinitely.
So if anyone gets such a message, do something about it, or you will just keep running the same erroneous beta task.

Retvari Zoltan
Message 21360 - Posted: 9 Jun 2011, 16:00:53 UTC - in response to Message 21359.  

I got two more of these 6.38 betas, and both failed immediately, just like the previous ones. There was a pop-up application error message.

Application Failure acemdbeta_6.38_windows_intelx86__cuda31 0.0.0.0 in acemdbeta_6.38_windows_intelx86__cuda31 0.0.0.0 at offset 00002c58

These 6.38 beta WUs seem to fail on every computer, so I guess I shouldn't blame the overclocking. :)

GDF
Message 21361 - Posted: 9 Jun 2011, 16:47:55 UTC - in response to Message 21360.  

No. I will try a quick change in a few minutes to see if it works. Otherwise, it will take more time to debug.

gdf

nenym
Joined: 31 Mar 09
Posts: 137
Credit: 1,429,587,071
RAC: 0
Message 21362 - Posted: 9 Jun 2011, 17:02:31 UTC

The same here
Run time 66.171875
CPU time 0
Interestingly, Swan_sync is set and works with standard/long-run tasks.
<core_client_version>6.10.60</core_client_version>
<![CDATA[
<message>
 - exit code -1073741819 (0xc0000005)
</message>
]]>
Win XP 64bit, GTX 560.

GDF
Message 21363 - Posted: 9 Jun 2011, 17:04:11 UTC - in response to Message 21362.  

acemdbeta_6.39 replaces acemdbeta_6.38, hopefully with better results...

gdf

skgiven
Message 21364 - Posted: 9 Jun 2011, 17:18:48 UTC - in response to Message 21363.  
Last modified: 9 Jun 2011, 17:49:43 UTC

This 6.39 task reached 10% complete in 17min 43sec on a stock GTX470, so the estimated run time is about 3h.
98% GPU Utilization, 315MB video memory usage.

Looks good so far... 20%
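
The 3h figure comes from a simple linear extrapolation, which can be sketched as below. It assumes progress is roughly proportional to elapsed time; the function name `estimate_total_seconds` is made up for illustration.

```c
/* Linear extrapolation of total run time from progress so far.
   fraction_done is in (0, 1]; returns -1.0 for values outside that range. */
double estimate_total_seconds(double elapsed_seconds, double fraction_done) {
    if (fraction_done <= 0.0 || fraction_done > 1.0) {
        return -1.0;
    }
    return elapsed_seconds / fraction_done;
}
```

For the numbers above, 17min 43sec is 1063 seconds, so `estimate_total_seconds(1063.0, 0.10)` gives about 10630 seconds, or roughly 2h 57min.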

nenym
Message 21365 - Posted: 9 Jun 2011, 19:04:40 UTC
Last modified: 9 Jun 2011, 19:08:56 UTC

That 6.39 beta is a strange application.
The 6.39 CPU process started with high priority. The system GUI was sluggish, with response times of 1 to 2 minutes. The BOINC GUI froze and the BOINC core restarted (CPU tasks without a checkpoint started again from zero progress). After I set the priority of the 6.39 CPU process to low with Process Tamer (the PT GUI took about 2 minutes to respond, even with high priority set for the PT process!), the BOINC core restarted again, and now things seem to be OK.
After 12 min, 8% progress.
Win XP 64bit, GTX 560Ti, 925MHz.

skgiven
Message 21366 - Posted: 9 Jun 2011, 20:20:36 UTC - in response to Message 21364.  
Last modified: 9 Jun 2011, 22:38:37 UTC

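Task: 4069069
Work unit: 2520078
Sent: 9 Jun 2011 17:12:15 UTC
Reported: 9 Jun 2011 20:10:53 UTC
Status: Completed and validated
Run time (sec): 10,510.59
CPU time (sec): 10,444.27
Claimed credit: 7,491.18
Granted credit: 11,236.77
Application: ACEMD beta version v6.39 (cuda31)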

Well, this task ran and finished (2h 55min) without error on the 6.39 app. Hopefully other results are just as positive.

GDF
Message 21367 - Posted: 9 Jun 2011, 21:07:19 UTC - in response to Message 21366.  

Any problem?

gdf

Retvari Zoltan
Message 21368 - Posted: 9 Jun 2011, 21:21:47 UTC - in response to Message 21367.  
Last modified: 9 Jun 2011, 21:38:29 UTC

I got one 6.39 beta WU. It has been running for 16 minutes now: 15% completed, 98% GPU usage (i7-950 @ 3.56GHz, GTX 580 @ 890MHz, WinXP, Above average priority).

Edit:

I moved this WU to my GTX 590 @ 700MHz (my monitor is connected to this card) and tried it on both GPUs of the 590, and I don't experience a sluggish Windows GUI.

After 32 minutes 28.2% completed.

Retvari Zoltan
Message 21369 - Posted: 9 Jun 2011, 23:14:03 UTC - in response to Message 21368.  

This 6.39 beta WU completed fine in 6438s (1h47m), 3.219 ms/step.
I got another one. Its processing will begin in 3 hours, because I'm going to sleep now and can't micromanage the processing order of the WUs. :)
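
The run time and the per-step figure can be cross-checked against each other. The sketch below assumes both numbers describe the same run and that ms/step is an average over the whole task; the function names `steps_from_runtime` and `split_hms` are made up for illustration.

```c
/* Number of simulation steps implied by a total run time and an average
   time per step. Returns -1.0 if ms_per_step is not positive. */
double steps_from_runtime(double run_seconds, double ms_per_step) {
    if (ms_per_step <= 0.0) {
        return -1.0;
    }
    return run_seconds * 1000.0 / ms_per_step;
}

/* Split a non-negative duration in seconds into hours, minutes, seconds. */
void split_hms(long total_seconds, long *hours, long *minutes, long *seconds) {
    *hours = total_seconds / 3600;
    *minutes = (total_seconds % 3600) / 60;
    *seconds = total_seconds % 60;
}
```

Under those assumptions, 6438 s at 3.219 ms/step works out to roughly 2,000,000 steps, and 6438 s splits into 1h 47m 18s, which matches the 1h47m quoted.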