WU: NOELIA_KLEBEs

skgiven
Volunteer moderator
Volunteer tester
Message 32470 - Posted: 29 Aug 2013, 15:22:38 UTC - in response to Message 32464.  
Last modified: 29 Aug 2013, 15:46:47 UTC

Workunit 063px6-NOELIA_KLEBEbeta-0-3-RND7897_0 is stuck at 0.021% (on the 8.00 app, though).

While the WU was 'running', the stderr.txt file had already reported swanMemset failed, but the WU didn't terminate:

# GPU [GeForce GTX 660] Platform [Windows] Rev [3170M] VERSION [55]
# SWAN Device 1 :
# Name : GeForce GTX 660
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:02:00.0
# Device clock : 1032MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r325_00
swanMemset failed
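
For anyone who wants to catch this kind of silent failure automatically, here is a minimal Python sketch that splits a stderr.txt like the one above into its "# Key : value" header fields and flags any non-header line that mentions a failure. The function name parse_stderr and the "failed" keyword check are my own assumptions for illustration, not part of the GPUGrid app or BOINC.

```python
import re

# The stderr text quoted in the post above.
STDERR = """\
# GPU [GeForce GTX 660] Platform [Windows] Rev [3170M] VERSION [55]
# SWAN Device 1 :
# Name : GeForce GTX 660
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:02:00.0
# Device clock : 1032MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r325_00
swanMemset failed
"""


def parse_stderr(text):
    """Return (header fields, other lines) from a stderr dump.

    Hypothetical helper: header lines look like '# Key : value';
    everything that does not start with '#' is kept as a message line.
    """
    fields, messages = {}, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Split on the first colon only, so values such as
            # '0000:02:00.0' keep their own colons.
            match = re.match(r"#\s*(.+?)\s*:\s*(.*)$", line)
            if match:
                fields[match.group(1)] = match.group(2).strip()
            continue
        messages.append(line)
    return fields, messages


fields, messages = parse_stderr(STDERR)
# Assumed heuristic: treat any message containing 'failed' as an error.
errors = [m for m in messages if "failed" in m.lower()]
print(fields["Name"], fields["Driver version"], errors)
```

A watchdog script could run something like this against the slot directory's stderr.txt and alert when errors is non-empty while the task still shows as running.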

I suspended the WU. When I resumed it 5 min later, I got the error message
"Display driver nvlddmkm stopped responding and has successfully recovered".

When I checked my Windows logs I saw
"A request to disable the Desktop Window Manager was made by process (4)", listed 2 sec before the driver crash/restart entry. The driver log entry was made after the driver restarted rather than when the failure was triggered.

The WU again continued 'running' without progressing. I aborted it, but now the stderr has nothing of any use:
    Stderr output

    <core_client_version>7.0.64</core_client_version>
    <![CDATA[
    <message>
    aborted by user
    </message>
    ]]>


FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Jacob Klein
Message 32471 - Posted: 29 Aug 2013, 15:25:21 UTC
Last modified: 29 Aug 2013, 15:25:35 UTC

I have noticed that, using the 8.01 app on a NOELIA_KLEBEbeta task, on my GTX 660 Ti, the process does not utilize a full CPU core (like other GPUGrid tasks normally do for that GPU). It's like SWAN_SYNC is not set correctly. Though I'm still getting good (85-91%) GPU utilization for the task.

Is this behavior new? Also, is it expected?

MJH
Message 32472 - Posted: 29 Aug 2013, 15:32:51 UTC - in response to Message 32468.  


Oh and when somebody started one too on 200series, plz tell me, got my energybill today, so i would love to stop it the next hours when not needed :p


If it is still running then that's plenty long enough to demonstrate that all is well, thanks. You can kill it off.

Matt

MJH
Message 32473 - Posted: 29 Aug 2013, 15:33:51 UTC - in response to Message 32471.  


I have noticed that, using the 8.01 app on a NOELIA_KLEBEbeta task, on my GTX 660 Ti, the process does not utilize a full CPU core (like other GPUGrid tasks normally do for that GPU). It's like SWAN_SYNC is not set correctly. Though I'm still getting good (85-91%) GPU utilization for the task.


It should have exactly the same load profile as 8.00 did.

MJH

dskagcommunity
Message 32475 - Posted: 29 Aug 2013, 15:50:04 UTC - in response to Message 32472.  


Oh and when somebody started one too on 200series, plz tell me, got my energybill today, so i would love to stop it the next hours when not needed :p


If it is still running then that's plenty long enough to demonstrate that all is well, thanks. You can kill it off.

Matt


OK, it ran for one hour and reached 3.3%, with 95% GPU load and 515 MB of VRAM used. The CPU was busy working on LHC and the task still computed normally. Thanks, aborted it. ^^
DSKAG Austria Research Team: http://www.research.dskag.at




MJH
Message 32476 - Posted: 29 Aug 2013, 16:00:08 UTC

Have revved the beta app to 8.02. This might also fix the driver-hang-on-suspend problem.

MJH

flashawk
Message 32481 - Posted: 29 Aug 2013, 17:30:49 UTC - in response to Message 32471.  

I have noticed that, using the 8.01 app on a NOELIA_KLEBEbeta task, on my GTX 660 Ti, the process does not utilize a full CPU core (like other GPUGrid tasks normally do for that GPU). It's like SWAN_SYNC is not set correctly. Though I'm still getting good (85-91%) GPU utilization for the task.

Is this behavior new? Also, is it expected?


I thought NOELIA's never used a full CPU core, that's the way it's always been. We've talked about it before in different threads.

Jacob Klein
Message 32482 - Posted: 29 Aug 2013, 17:36:48 UTC - in response to Message 32481.  

That's fine if that's the case. I don't know, which is why I asked. I'm "used" to seeing tasks on my "Kepler" (GTX 660 Ti) taking a full CPU core (via SWAN_SYNC) automatically. Maybe NOELIA tasks work differently.

nenym
Message 32484 - Posted: 29 Aug 2013, 18:19:04 UTC
Last modified: 29 Aug 2013, 19:11:45 UTC

ID 156955: i7-3770K (no HT, 4 CPU cores) GTX560Ti, W7 64bit driver 328.80.
NOELIA_KLEBEbeta 8.02 CUDA55 application
GPU [GeForce GTX 560 Ti] Platform [Windows] Rev [3182M] VERSION [55]
# SWAN Device 0 :
# Name : GeForce GTX 560 Ti
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:01:00.0
# Device clock : 1720MHz
# Memory clock : 2100MHz
# Memory width : 256bit
# Driver version : r325_00
GPU load 91 - 94%, process priority tamed to realtime, GPU load is 87 - 91% if not tamed
CPU load 20% of one core (5% in all)
<active_task>
    <project_master_url>http://www.gpugrid.net/</project_master_url>
    <result_name>063px30-NOELIA_KLEBEbeta2-0-3-RND2325_0</result_name>
    <checkpoint_cpu_time>564.224400</checkpoint_cpu_time>
    <checkpoint_elapsed_time>2994.810513</checkpoint_elapsed_time>
    <fraction_done>0.048235</fraction_done>
</active_task>
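
As a rough cross-check of the numbers in that block, the sketch below (Python, standard library only) parses the same active_task element and derives the CPU share and a naive runtime estimate. It assumes progress is linear in elapsed time, which may not hold for these tasks, and the variable names are mine rather than anything BOINC defines.

```python
import xml.etree.ElementTree as ET

# The active_task block quoted above, with its opening tag restored.
ACTIVE_TASK = """\
<active_task>
    <project_master_url>http://www.gpugrid.net/</project_master_url>
    <result_name>063px30-NOELIA_KLEBEbeta2-0-3-RND2325_0</result_name>
    <checkpoint_cpu_time>564.224400</checkpoint_cpu_time>
    <checkpoint_elapsed_time>2994.810513</checkpoint_elapsed_time>
    <fraction_done>0.048235</fraction_done>
</active_task>
"""

task = ET.fromstring(ACTIVE_TASK)
cpu_time = float(task.findtext("checkpoint_cpu_time"))
elapsed = float(task.findtext("checkpoint_elapsed_time"))
fraction_done = float(task.findtext("fraction_done"))

# Share of one CPU core used so far (CPU seconds per wall-clock second).
cpu_share = cpu_time / elapsed

# Naive projection: assumes the remaining work proceeds at the same rate.
estimated_total_hours = elapsed / fraction_done / 3600.0
remaining_hours = estimated_total_hours * (1.0 - fraction_done)

print(f"CPU share of one core: {cpu_share:.1%}")
print(f"Estimated total runtime: {estimated_total_hours:.1f} h")
print(f"Estimated remaining: {remaining_hours:.1f} h")
```

The CPU share computed this way comes out a little under the "20% of one core" figure read off the task manager, so the two observations agree.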

Seems to run OK so far.

Concurrently running: 4x CPU Asteroids SSE3, 1x GPU Einstein BPRS on Intel HD4000
Note: previous 6.18 CUDA 4.2 application could run on 875 MHz core clock. Factory OC of the GPU is 900 MHz.

flashawk
Message 32487 - Posted: 29 Aug 2013, 18:29:41 UTC - in response to Message 32482.  

That's fine if that's the case. I don't know, which is why I asked. I'm "used" to seeing tasks on my "Kepler" (GTX 660 Ti) taking a full CPU core (via SWAN_SYNC) automatically. Maybe NOELIA tasks work differently.


I understand. I know you're a very busy man; I thought I saw you debugging apps for other projects in some other forums. I don't know how you manage to keep track of them all.

Jacob Klein
Message 32488 - Posted: 29 Aug 2013, 18:35:58 UTC - in response to Message 32487.  

:) Yeah, thanks. I've helped Einstein fix a bug, MindModeling fix a bug, GPUGrid fix a couple things, Test4Theory fix a bug, Rosetta fix their app, SETI fix a GPU estimate problem, got nVidia to fix a monitor-sleep issue, and more. And I also do alpha/beta testing of the actual BOINC software, and have worked directly with the BOINC devs.

Regarding this particular case, I believe I was aware of some tasks "not using a full CPU core on my Kepler card", but I did not know it was NOELIA ones. I'll try to keep that in mind.

Thanks again.

Retvari Zoltan
Message 32503 - Posted: 29 Aug 2013, 20:05:10 UTC - in response to Message 32482.  

That's fine if that's the case. I don't know, which is why I asked. I'm "used" to seeing tasks on my "Kepler" (GTX 660 Ti) taking a full CPU core (via SWAN_SYNC) automatically. Maybe NOELIA tasks work differently.

I still consider this different CPU load as a malfunction.
However, with this low CPU load the GPU load is still above 95%, so we can turn this question the way around: is it sure that the other tasks need a full CPU thread to feed a Kepler GPU?

Retvari Zoltan
Message 32504 - Posted: 29 Aug 2013, 20:09:14 UTC - in response to Message 32476.  

Have revved the beta app to 8.02. This might also fix the driver-hang-on-suspend problem.

MJH

I think you could promote 8.02 to the production queue at once, as it has proved to be better than 8.00.

MJH
Message 32505 - Posted: 29 Aug 2013, 20:14:33 UTC - in response to Message 32504.  

It's there now.

MJH

Jim1348
Message 32507 - Posted: 29 Aug 2013, 21:08:18 UTC - in response to Message 32503.  
Last modified: 29 Aug 2013, 21:32:37 UTC

That's fine if that's the case. I don't know, which is why I asked. I'm "used" to seeing tasks on my "Kepler" (GTX 660 Ti) taking a full CPU core (via SWAN_SYNC) automatically. Maybe NOELIA tasks work differently.

I still consider this different CPU load as a malfunction.
However, with this low CPU load the GPU load is still above 95%, so we can turn this question the way around: is it sure that the other tasks need a full CPU thread to feed a Kepler GPU?

On the Folding forum, there have been extended discussions of Nvidia CPU core usage under CUDA. It contrasts to the case of AMD cards running OpenCL, which typically require only a few percent of a CPU core.

As I recall, Nvidia gives developers the option to reserve a full CPU core when running under CUDA by using spin states, which I don't understand anyway. If the application developers want to ensure that they have enough CPU support, they can reserve it, even though typically not all of it is actually in use.

So maybe the other tasks don't really require a full core, except that it may be useful to reserve it for stability or performance or whatever.

EDIT: To further complicate matters, Nvidia cards running OpenCL always require a full CPU core; there is no option not to.
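
To make the "spin states" idea a bit more concrete, here is a small host-side analogy in Python. It is not CUDA code and not how the GPUGrid app is written; it only contrasts the two generic ways a thread can wait for an event (busy-polling versus blocking) and measures the CPU time each one burns. The timer stands in for a GPU kernel finishing, and the function name cpu_cost_of_wait is made up for this sketch.

```python
import threading
import time


def cpu_cost_of_wait(spin, delay=0.2):
    """Return the CPU seconds this process uses while waiting `delay` seconds.

    spin=True  -> busy-poll a flag (reacts fastest, keeps a core busy)
    spin=False -> block until signalled (core stays free while waiting)
    """
    done = threading.Event()
    # The timer plays the role of the device signalling completion.
    timer = threading.Timer(delay, done.set)
    start = time.process_time()
    timer.start()
    if spin:
        while not done.is_set():
            pass  # spin: check the flag again immediately
    else:
        done.wait()  # block: sleep until the event is set
    timer.join()
    return time.process_time() - start


spin_cpu = cpu_cost_of_wait(spin=True)
block_cpu = cpu_cost_of_wait(spin=False)
print(f"spin wait CPU time:     {spin_cpu:.3f} s")
print(f"blocking wait CPU time: {block_cpu:.3f} s")
```

Both calls wait the same wall-clock time, but the spinning version charges roughly the whole wait to the CPU. That would be consistent with a task showing a full core of load even when little of it is useful work, though whether that is what the other GPUGrid tasks do is only a guess on my part.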

5pot
Message 32520 - Posted: 30 Aug 2013, 4:35:07 UTC

8.02 beta tasks seem to work OK on 780s, but now all other tasks fail.

Richard Haselgrove
Message 32525 - Posted: 30 Aug 2013, 8:13:54 UTC - in response to Message 32507.  

I still consider this different CPU load as a malfunction.
However, with this low CPU load the GPU load is still above 95%, so we can turn this question the way around: is it sure that the other tasks need a full CPU thread to feed a Kepler GPU?

On the Folding forum, there have been extended discussions of Nvidia CPU core usage under CUDA. It contrasts to the case of AMD cards running OpenCL, which typically require only a few percent of a CPU core.

As I recall, Nvidia gives developers the option to reserve a full CPU core when running under CUDA by using spin states, which I don't understand anyway. If the application developers want to ensure that they have enough CPU support, they can reserve it, even though typically not all of it is actually in use.

So maybe the other tasks don't really require a full core, except that it may be useful to reserve it for stability or performance or whatever.

EDIT: To further complicate matters, Nvidia cards running OpenCL always require a full CPU core; there is no option not to.

Watching two different third-party developers working on SETI (one specialising in CUDA, the other in OpenCL), we get the opposite outcome: OpenCL on ATI is inefficient unless a spare CPU core is available, but CUDA on Nvidia requires very little CPU.

I'm not a developer myself (at least, not at the level these guys program), but from the peanut gallery it looks as if CPU usage is very much down to the skill of the developer, and how well they know their platform and tools.

But I'm interested in the OpenCL on Nvidia point. That does seem to be a common observation. I wonder whether it necessarily has to be so? Or maybe Nvidia hasn't yet ported some of their sync technology from CUDA to the OpenCL toolchain?

skgiven
Volunteer moderator
Volunteer tester
Message 32526 - Posted: 30 Aug 2013, 8:19:17 UTC - in response to Message 32520.  
Last modified: 30 Aug 2013, 8:27:09 UTC

8.02 app running the Noelia Beta WU's (one on the CUDA4.2 and the other on the 5.5 app).

When I use snooze the driver still restarts. I have the driver timeout set to 20sec, and it takes 20seconds for the driver to crash/restart.

When I suspended the WU's individually they didn't cause a driver restart.

However, when I suspended both at the same time the driver restarted, again after 20 sec. (In each situation, the driver restart or the lack of one is repeatable.)

I noted that the 5.5 WU kept running (progressing) for about 4seconds after I suspended it.

But I'm interested in the OpenCL on Nvidia point. That does seem to be a common observation. I wonder whether it necessarily has to be so? Or maybe Nvidia hasn't yet ported some of their sync technology from CUDA to the OpenCL toolchain?

The GK104 cards are supposed to be OpenCL 1.2 capable, but the drivers only support OpenCL 1.1, which means the toolkit can't be 1.2.
AMD/ATI supports OpenCL 1.2 and Intel supports OpenCL 1.2. NVidia says its GPUs are OpenCL 1.2 capable, but their drivers prevent the cards from being used for OpenCL 1.2.

skgiven
Volunteer moderator
Volunteer tester
Message 32531 - Posted: 30 Aug 2013, 8:55:38 UTC - in response to Message 32526.  

On my Linux systems I have the STABLE Repository drivers (304.88), supposedly only CUDA 5.0.
However, I'm presently running a CUDA 5.5 NOELIA Beta WU (12 h in, 3 to go).
I thought CUDA 5.5 would only be used if the system had the correct drivers?

MJH
Message 32535 - Posted: 30 Aug 2013, 10:33:23 UTC - in response to Message 32531.  
Last modified: 30 Aug 2013, 10:33:34 UTC


I thought CUDA 5.5 would only be used if the system had the correct drivers?


The intent is that you'll get the 55 (CUDA 5.5) app only if the driver revision is >= 315.15.
Alas, the scheduler has a will of its own.

MJH
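
The intended rule can be sketched in a few lines of Python. This is only an illustration of the stated threshold, not the actual BOINC scheduler or plan-class logic; the labels "cuda55" and "cuda42" and the function names are placeholders I picked for the two app builds mentioned in this thread.

```python
# Minimum driver revision for the CUDA 5.5 build, as stated in the post above.
CUDA55_MIN_DRIVER = (315, 15)


def parse_driver_version(text):
    """Turn a version string such as '304.88' into a tuple like (304, 88)."""
    return tuple(int(part) for part in text.split("."))


def pick_cuda_build(driver_version):
    """Return a placeholder label for the build the intended rule would choose."""
    if parse_driver_version(driver_version) >= CUDA55_MIN_DRIVER:
        return "cuda55"
    return "cuda42"


for version in ("304.88", "315.15", "328.80"):
    print(version, "->", pick_cuda_build(version))
```

Under this rule the Linux 304.88 driver mentioned earlier would fall on the CUDA 4.2 side, which matches the surprise at seeing a 5.5 task on that host. Comparing integer tuples rather than floats avoids mis-ordering versions whose minor part has a different number of digits.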