WU: OPM simulations

Message boards : News : WU: OPM simulations

Bedrich Hajek

Send message
Joined: 28 Mar 09
Posts: 490
Credit: 11,731,645,728
RAC: 47,738
Level
Trp
Message 43580 - Posted: 26 May 2016, 0:09:26 UTC - in response to Message 43545.  

Wow ok, this thread derailed. We are supposed to keep discussions related just to the specific WUs here, even though I am sure it's a very productive discussion in general :)


That's what happens when you allow the lunatics to run the asylum.


I am a bit out of time right now so I won't split threads and will just open a new one because I will resend OPM simulations soon.


Ok, bring them on. I'm ready.



ID: 43580 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Stefan
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Message 43581 - Posted: 26 May 2016, 9:31:57 UTC - in response to Message 43580.  
Last modified: 26 May 2016, 9:33:57 UTC

How I imagine your GPUs after OPM:
ID: 43581 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Retvari Zoltan
Avatar

Send message
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 1
Level
Trp
Message 43583 - Posted: 26 May 2016, 11:32:00 UTC - in response to Message 43581.  



A simulation containing only 35632 atoms is a piece of cake.
ID: 43583 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Erich56

Send message
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 869
Level
Trp
Message 43584 - Posted: 26 May 2016, 16:49:45 UTC - in response to Message 43567.  

... The <rsc_disk_bound> is set to 8*10^9 (7.45GB) which is at least one order of magnitude higher than necessary.

When I temporarily ran BOINC on a RAMDisk some weeks ago, I was harshly confronted with this problem.
There was only limited disk space available for BOINC, and each time the free RAMDisk space went below 7,629 MB (7.45 GB), the BOINC manager did not download new GPUGRID tasks (the event log complained about too little free disk space).

I contacted the GPUGRID people, and they told me that they will look into this at some time; it can't be done right now, though, as Matt is not available for some reason (and seems to be the only one who could change/fix this).
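
The figures quoted above can be cross-checked with a few lines of Python. This is only a sketch: it assumes the 8*10^9 value is a byte count and that "MB" and "GB" in the posts are binary units (2^20 and 2^30 bytes). The function name is made up for illustration.

```python
def bytes_to_binary_units(n_bytes):
    """Return (MB, GB) for a byte count, using 2**20 and 2**30 bytes per unit."""
    return n_bytes / 2**20, n_bytes / 2**30

# Value quoted for <rsc_disk_bound>, assumed here to be in bytes.
RSC_DISK_BOUND = 8 * 10**9

mb, gb = bytes_to_binary_units(RSC_DISK_BOUND)
print(f"{mb:,.0f} MB, {gb:.2f} GB")
```

Under those assumptions the two numbers in the post describe the same threshold, once in megabytes and once in gigabytes.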
ID: 43584 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile skgiven
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 43587 - Posted: 26 May 2016, 20:19:57 UTC - in response to Message 43584.  

Are the GERARD_CXCL12VOLK_ Work Units step 2 of the OPM simulations or extensions of the GERARD_FCCXCL work - or something else?

PS Nice to see plenty of tasks over the long weekend:
Tasks ready to send 2,413
Tasks in progress 2,089

Will these auto-generate new work?
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help
ID: 43587 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Richard Haselgrove

Send message
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 295,172
Level
Trp
Message 43589 - Posted: 26 May 2016, 20:41:42 UTC - in response to Message 43588.  

I haven't tried this, but theoretically it should work.

What theory is that? It isn't a defined field, according to the Application configuration documentation.
ID: 43589 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Retvari Zoltan
Avatar

Send message
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 1
Level
Trp
Message 43590 - Posted: 26 May 2016, 21:31:25 UTC - in response to Message 43589.  
Last modified: 26 May 2016, 21:33:32 UTC

I haven't tried this, but theoretically it should work.

What theory is that? It isn't a defined field, according to the Application configuration documentation.

Oh, my bad!
That won't work...
I read a couple of posts about this somewhere, but I've clearly messed it up.
Sorry!
Sk, Could you hide that post please?
ID: 43590 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Bedrich Hajek

Send message
Joined: 28 Mar 09
Posts: 490
Credit: 11,731,645,728
RAC: 47,738
Level
Trp
Message 43592 - Posted: 26 May 2016, 22:15:01 UTC - in response to Message 43581.  

How I imagine your GPUs after OPM:


For the past few days, while there was little work here, I was crunching at a tough backup project (Einstein), where my computers were able to crunch 2 GPU WUs per card simultaneously with GPU usage of 99% max for my XP computer and 91% max for my Windows 10 computer. So anything you have should be a walk in the park, even if you come with a 200,000+ atom simulation with 90%+ GPU usage.


Good luck!!


ID: 43592 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Beyond
Avatar

Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 43593 - Posted: 27 May 2016, 0:46:51 UTC - in response to Message 43584.  

When I temporarily ran BOINC on a RAMDisk some weeks ago, I was harshly confronted with this problem.
There was only limited disk space available for BOINC, and each time the free RAMDisk space went below 7,629 MB (7.45 GB), the BOINC manager did not download new GPUGRID tasks (the event log complained about too little free disk space).

I contacted the GPUGRID people, and they told me that they will look into this at some time; it can't be done right now, though, as Matt is not available for some reason (and seems to be the only one who could change/fix this).

I had this happen recently when the disk partitions on which BOINC was installed went below that level. Thought it was strange, wasn't sure if it was a GPUGrid or BOINC thing. Anyway resized the partitions with a disk manager and started getting work on those machines again.
ID: 43593 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
nanoprobe

Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Message 43594 - Posted: 27 May 2016, 2:19:39 UTC - in response to Message 43466.  
Last modified: 27 May 2016, 2:21:03 UTC

The Error Rate for the latest GERARD_FX tasks is high and the OPM simulations were higher. Perhaps this should be looked into.
Application                 Unsent  In progress  Success  Error rate

Short runs (2-3 hours on fastest card)
SDOERR_opm99                     0           60     2412      48.26%

Long runs (8-12 hours on fastest card)
GERARD_FXCXCL12R_1406742_        0           33      573      38.12%
GERARD_FXCXCL12R_1480490_        0           31      624      35.34%
GERARD_FXCXCL12R_1507586_        0           25      581      33.14%
GERARD_FXCXCL12R_2189739_        0           42      560      31.79%
GERARD_FXCXCL12R_50141_          0           35      565      35.06%
GERARD_FXCXCL12R_611559_         0           31      565      32.09%
GERARD_FXCXCL12R_630477_         0           34      561      34.31%
GERARD_FXCXCL12R_630478_         0           44      599      34.75%
GERARD_FXCXCL12R_678501_         0           30      564      40.57%
GERARD_FXCXCL12R_747791_         0           32      568      36.89%
GERARD_FXCXCL12R_780273_         0           42      538      39.28%
GERARD_FXCXCL12R_791302_         0           37      497      34.78%

2 or 3 weeks ago the error rate was ~25% to 35%; it's now ~35% to 40%. Maybe this varies due to release stage: early in the runs tasks go to everyone, so error rates are higher; later more go to the most successful cards, so the error rate drops?
...

FWIW the ever increasing error rate is why I no longer crunch here. Hours of wasted time and electricity could be better put to use elsewhere like POEM. My 970s are pretty much useless here nowadays and the 750TIs are completely useless. JMHO
ID: 43594 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Stefan
Project administrator
Project developer
Project tester
Project scientist

Send message
Joined: 5 Mar 13
Posts: 348
Credit: 0
RAC: 0
Level

Message 43598 - Posted: 27 May 2016, 8:39:32 UTC - in response to Message 43594.  

These error rates are a bit exaggerated since AFAIK they include instantaneous errors which don't really bother much.
ID: 43598 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
nanoprobe

Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Message 43607 - Posted: 27 May 2016, 14:31:45 UTC - in response to Message 43598.  
Last modified: 27 May 2016, 14:43:50 UTC

These error rates are a bit exaggerated since AFAIK they include instantaneous errors which don't really bother much.

Unfortunately that is not true for me. I almost never have a task that errors out immediately. They're thousands of seconds in before they puke. Especially so in the last few months. FWIW I'm not a points ho, but if we got some kind of credit for tasks that error out before finishing, like other projects do, I'd be more inclined to run them; 6-10 hours of run time for nada just irks me when that run time could be productive somewhere else. And yes, I understand that errors still provide useful info. At least I'm assuming they do, and so if they supply useful info we should get some credit. JMHO
ID: 43607 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Richard Haselgrove

Send message
Joined: 11 Jul 09
Posts: 1639
Credit: 10,159,968,649
RAC: 295,172
Level
Trp
Message 43608 - Posted: 27 May 2016, 15:32:27 UTC - in response to Message 43607.  

These error rates are a bit exaggerated since AFAIK they include instantaneous errors which don't really bother much.

Unfortunately that is not true for me. I almost never have a task that errors out immediately. They're thousands of seconds in before they puke. Especially so in the last few months. FWIW I'm not a points ho, but if we got some kind of credit for tasks that error out before finishing, like other projects do, I'd be more inclined to run them; 6-10 hours of run time for nada just irks me when that run time could be productive somewhere else. And yes, I understand that errors still provide useful info. At least I'm assuming they do, and so if they supply useful info we should get some credit. JMHO

On the other hand, I can barely remember a task which errored out here for an unexplained reason. I've certainly had some since the last ones showing, which were for October/November 2013.

I think my most recent failures were because of improper computer shutdown/restarts - power outages due to the winter storms. I don't see any reason why the project should reward me for those - my bad for not investing in a UPS. The machine I'm posting from - 45218 - has no "Unrecoverable error" events for GPUGrid as far back as the logs go (13 January 2016), and it runs GPUGrid constantly when tasks are available.

If you are seeing a much higher error rate, I think you should look closer to home. I don't think the project's applications and tasks are inherently unstable.
ID: 43608 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Erich56

Send message
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 869
Level
Trp
Message 43609 - Posted: 27 May 2016, 16:44:35 UTC - in response to Message 43590.  

Oh, my bad!
That won't work...
I read a couple of posts about this somewhere, but I've clearly messed it up.
Sorry!


Yes, indeed it won't work :-(

One of the comments, a few weeks ago, in the forum was:

The disk space requirement is set in the workunit meta-data. ...

If disk usage was associated with the application, you could re-define it in an app_info.xml: but because it's data, it's correctly assigned to the researcher to configure.


Meanwhile it doesn't bother me any more, since I gave up running BOINC on a RamDisk.
Nevertheless, I think this should be looked into / questioned by the GPUGRID people.
ID: 43609 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Jim1348

Send message
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 43610 - Posted: 27 May 2016, 18:45:26 UTC - in response to Message 43594.  

My 970s are pretty much useless here nowadays and the 750TIs are completely useless. JMHO

This latest batch might be better, though I have just started. But at 3 hours into the run, it looks like a GERARD_CXCL12VOLK will take 12.5 hours to complete on a GTX 970 running at 1365 MHz (Win7 64-bit).
ID: 43610 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Erich56

Send message
Joined: 1 Jan 15
Posts: 1166
Credit: 12,260,898,501
RAC: 869
Level
Trp
Message 43612 - Posted: 27 May 2016, 19:47:46 UTC - in response to Message 43610.  

This latest batch might be better, though I have just started. But at 3 hours into the run, it looks like a GERARD_CXCL12VOLK will take 12.5 hours to complete on a GTX 970 running at 1365 MHz (Win7 64-bit).

Here it took 12.7 hrs on a GTX970 (running at 1367 MHz) - Win10 64-bit.
ID: 43612 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile Retvari Zoltan
Avatar

Send message
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 1
Level
Trp
Message 43613 - Posted: 27 May 2016, 21:09:21 UTC - in response to Message 43594.  

FWIW the ever increasing error rate is why I no longer crunch here. Hours of wasted time and electricity could be better put to use elsewhere like POEM. My 970s are pretty much useless here nowadays and the 750TIs are completely useless. JMHO
According to the performance page the GTX 970 is a pretty productive GPU:

It suggests that the reason for the increasing error rate you are experiencing lies at your end.
The most probable cause of this is too much overclocking, inadequate cooling or PSU. GPUGrid is more demanding than other GPU projects, so the system/settings which work for other projects could be inappropriate for GPUGrid tasks. In some cases factory overclocked cards will not work here until their factory default frequency is reduced.
ID: 43613 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Profile skgiven
Volunteer moderator
Volunteer tester
Avatar

Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 43614 - Posted: 27 May 2016, 21:27:40 UTC - in response to Message 43613.  

The comment was thread specific and based on a GTX970 not being able to return some OPM tasks within 24h. That was the case on WDDM systems.
While a GTX970 is still quite capable of finishing most tasks within 24h it remains to be seen how it fares with the next round of OPM tasks (or whatever they are now being called).
That said, and generally speaking, a GTX970 will likely remain a very good GPU for many months to come. Can't see it being less than a good GPU before the autumn, and in 18 months I expect there will still be plenty chipping away here.
ID: 43614 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
TyphooN [Gridcoin]

Send message
Joined: 29 Jun 14
Posts: 5
Credit: 29,718,557
RAC: 0
Level
Val
Message 43615 - Posted: 28 May 2016, 9:07:39 UTC - in response to Message 43614.  
Last modified: 28 May 2016, 9:22:25 UTC

I wanted to report back in, as my overclock on my 980ti hybrid is now stable. I was out of town for a few days, and my OpenVPN tunnel went down (which I do DNS lookups through). As I was unable to resolve DNS, all of my completed tasks were failing to upload. Once I got back into town, I bounced my OpenVPN tunnel and local DNS server and uploaded 2 complete WUs.

I was awarded 187,100 credit instead of 249,600. What is frustrating about this to me is that all of my other machines, including those running POEM, Einstein, and Milkyway, were all churning away with no issues and no credit hits. On those projects I am awarded the same credit for turning in the work a bit later. With that said, I feel that the deadlines are too short and the penalty is too harsh. I don't understand why the penalty is so quick and severe.

I have had to set my system to store only 0.5 days of work, and store up to an additional 0 days of work. When I do this, I only get 2 GPUGRID tasks. Past that, I am queueing 3 jobs and the credit rewarded gradually grows lower.

For me it makes me not want to queue many jobs, because if I download too many jobs then I am stuck getting low credit indefinitely. This is something that I only have to worry about on GPUGRID. All of my other machines are set to store at least 3 days of work, and store up to an additional 2 days. I urge that the credit penalty should be removed, or only be active at a more reasonable point in the future.

I also delete my second GPUGRID task, because if I suspend BOINC to play a game, the constant processing of WUs is disrupted causing the credit rewarded to plummet. I do like to game from time to time, but it only complicates my usual suspending of BOINC work. Even if I choose to game for too long I might not return the workunit in time to not be punished. So far I have enjoyed my stay here, but can see why some might easily get discouraged.
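
The work-buffer settings described above ("store at least 0.5 days of work" and "up to an additional 0 days") can also be set locally through a preferences override file. The sketch below generates such a file's contents in Python. It assumes the BOINC client reads `global_prefs_override.xml` from its data directory and that the element names are `work_buf_min_days` and `work_buf_additional_days`; both should be checked against the BOINC client documentation before use. The helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

def build_prefs_override(min_days, additional_days):
    """Build the XML text for a local work-buffer override (assumed element names)."""
    root = ET.Element("global_preferences")
    # "Store at least N days of work"
    ET.SubElement(root, "work_buf_min_days").text = str(min_days)
    # "Store up to an additional N days of work"
    ET.SubElement(root, "work_buf_additional_days").text = str(additional_days)
    return ET.tostring(root, encoding="unicode")

# Values taken from the post: 0.5 days minimum, 0 additional days.
prefs_text = build_prefs_override(0.5, 0)
print(prefs_text)
```

The printed text would be saved as `global_prefs_override.xml`, after which the client has to be told to re-read its local preferences.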


ID: 43615 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote
Jacob Klein

Send message
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Level
His
Message 43616 - Posted: 28 May 2016, 9:24:38 UTC - in response to Message 43615.  
Last modified: 28 May 2016, 9:25:35 UTC

Hmm...

Instead of thinking of the lower credits as a "penalty", think of the higher credits as a "bonus". GPUGrid offers "bonuses" for quick returns, which is pretty unique for a BOINC project, and sometimes some scenarios just can't get those bonuses.

For me, I don't care about credit at all. I just ensure that my GPUs can return the GPUGrid tasks within the deadline (within 5 days), such that the project doesn't waste anybody's resources by reassigning those tasks.

Also, regarding your gaming situation, are you aware that you can configure BOINC to automatically suspend when certain applications are running? It's called "Exclusive applications", within BOINC's settings, and I think you may find it quite useful.

Regards,
Jacob
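
The "Exclusive applications" setting mentioned above can be sketched as a `cc_config.xml` fragment. As far as I know, the BOINC client accepts `<exclusive_app>` (suspend all computing) and `<exclusive_gpu_app>` (suspend GPU computing only) entries under `<options>`; treat that as an assumption to verify against the client configuration documentation. The executable name `mygame.exe` and the helper function are placeholders invented for this example.

```python
import xml.etree.ElementTree as ET

def build_cc_config(exclusive_apps, exclusive_gpu_apps):
    """Build cc_config.xml text listing executables that should pause BOINC work."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    # Suspend all BOINC computing while these executables are running.
    for exe in exclusive_apps:
        ET.SubElement(options, "exclusive_app").text = exe
    # Suspend only GPU computing while these executables are running.
    for exe in exclusive_gpu_apps:
        ET.SubElement(options, "exclusive_gpu_app").text = exe
    return ET.tostring(root, encoding="unicode")

# Hypothetical game executable: pause GPU tasks only while it runs.
config_text = build_cc_config([], ["mygame.exe"])
print(config_text)
```

Listing the game only as a GPU-exclusive application leaves CPU tasks running while the GPU is freed for gaming.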
ID: 43616 · Rating: 0 · rate: Rate + / Rate - Report as offensive     Reply Quote

©2025 Universitat Pompeu Fabra