13 hour task, 10K award (Yippeee)
| Author | Message |
|---|---|
Paul D. Buck (Joined: 9 Jun 08; Posts: 1050; Credit: 37,321,185; RAC: 0)
|
This morning I woke to a task that took 13 hours to run, with a time per step of 47 ms. I note that the task had a 53 MB output file and the time to run was about double the normal. Credit asked was 8K, granted 10K (sweet!). The task named 10-KASHIF_HIVPR_dim_ba3-2-100-RND7725 failed for another participant (see Work Unit Page). Looking at the card he/she had, it looks to me like it should have been able to run the task. The stdio from my system had the following lines: `# Amber: readparm : Reading parm file parameters` `# PARM file in AMBER 7 format` `# Encounter 10-12 H-bond term` `WARNING: parameters.cu, line 568: Found zero 10-12 H-bond term.` `WARNING: parameters.cu, line 568: Found zero 10-12 H-bond term.` `MDIO ERROR: cannot open file "restart.coor"` I have no idea what this means, or whether it is an error or an expected outcome. I am only noting it here so that those of us who might see this happen again will be forewarned, and hopefully GDF or my friend ignasi will admit that it is all his fault again ... :) |
X1900AIW (Joined: 12 Sep 08; Posts: 74; Credit: 23,566,124; RAC: 0)
|
Same warning text, but different credits (4352 claimed > 5440 granted): 598868, Name 35-IBUCH_HIVPR_mon_ba8-1-100-RND8412_0 `# Amber: readparm : Reading parm file parameters` `# PARM file in AMBER 7 format` `# Encounter 10-12 H-bond term` `WARNING: parameters.cu, line 568: Found zero 10-12 H-bond term.` `WARNING: parameters.cu, line 568: Found zero 10-12 H-bond term.` `MDIO ERROR: cannot open file "restart.coor"` `# Time per step: 30.302 ms` |
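Both credit pairs quoted so far work out to roughly the same granted-to-claimed ratio. A small Python sketch of that arithmetic follows; the function name is made up for illustration, the 8K/10K pair is the rounded figure from the first post, and nothing in the thread states how the project actually computes granted credit.

```python
def bonus_ratio(claimed: float, granted: float) -> float:
    """Return granted credit divided by claimed credit."""
    if claimed <= 0:
        raise ValueError("claimed credit must be positive")
    return granted / claimed


# Figures quoted in the posts above (the first pair is rounded: "8K" / "10K").
for label, claimed, granted in [
    ("KASHIF task (rounded)", 8000, 10000),
    ("IBUCH task", 4352, 5440),
]:
    print(f"{label}: granted/claimed = {bonus_ratio(claimed, granted):.3f}")
```

If the ratio really is the same for short and long tasks, it would fit the observation that the long tasks pay about the same per hour as the short ones.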
mike047 (Joined: 21 Dec 08; Posts: 47; Credit: 7,330,049; RAC: 0)
|
I have had several of these; they work out to about the same points per hour as the shorter ones. The upload is over 50 MB, though. It completely shuts down my DSL until it has finished uploading ... completely. mike |
Paul D. Buck (Joined: 9 Jun 08; Posts: 1050; Credit: 37,321,185; RAC: 0)
|
"I have had several of these, works out to about the same points per hours as the shorter ones." First one I THINK I have had. I did not go back to check exhaustively. It seems to mess up the DCF (duration correction factor) a little bit, not sure why ... |
|
Joined: 17 Aug 08; Posts: 2705; Credit: 1,311,122,549; RAC: 0 |
See here. I don't know about the warning, though; it appears to be quite common. I guess it's related to the new Amber force-field method. MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 1 Feb 09; Posts: 139; Credit: 575,023; RAC: 0
|
OMG, I got a huge one ... I thought only the big guns would get them, lolz. I wasn't paying attention to what GPUGRID was doing; guess I was wrong. I wonder if I can finish it in time: I got a KASHIF_HIVPR one which has done 12:30 of work and still shows more than 26 hours to go. If I am right, you guys did this one in 13 hours. But you guys do a normal unit in 3 hours, which costs me about 20. Should I abort it? |
|
Joined: 17 Aug 08; Posts: 2705; Credit: 1,311,122,549; RAC: 0 |
It's not 3h, more like a good 5h for the fastest cards. How much of that WU have you done after 12.5h? If it's ~30%, as BOINC thinks, you could just as well spend the other 24h. If your machine runs more than 8h a day you should be able to make it before the deadline. MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 1 Feb 09; Posts: 139; Credit: 575,023; RAC: 0
|
27% done after 13 hours now, so I'll try to see if I can finish it. My machine runs 24/7 playing server :D. I see all the others errored out on this unit. |
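For a rough idea of what "27% after 13 hours" implies, here is a minimal Python sketch that extrapolates the remaining run time. It assumes progress is linear in time, which the thread does not confirm, and the function name is hypothetical.

```python
def estimate_remaining_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: remaining = elapsed / fraction_done - elapsed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_hours = elapsed_hours / fraction_done
    return total_hours - elapsed_hours


# Figure from the post above: 27% done after 13 hours.
remaining = estimate_remaining_hours(13.0, 0.27)
print(f"about {remaining:.1f} h remaining, {13.0 + remaining:.1f} h in total")
```

Under that assumption the task would need roughly 35 more hours, so whether it fits the deadline depends mostly on how many hours a day the machine crunches.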
|
Joined: 1 Feb 09; Posts: 139; Credit: 575,023; RAC: 0
|
Well, I don't have to worry anymore: it suddenly errored out, like with all the others. 13 hours gone. |
Paul D. Buck (Joined: 9 Jun 08; Posts: 1050; Credit: 37,321,185; RAC: 0)
|
Ugh... On 5 of my 6 total cores running tasks, the normal time to run a task is about 6.5 hours. There are the shorter ones, which obviously take a little less time. The KASHIF models seem to take 13-some hours to run. The only good news is that the pay is comparable, or maybe a little better, on an hourly basis. It is a shame that you got one that errored out on you. So far, I have not seen one die. I have to admit that I prefer tasks that run faster than 6 hours each; as a matter of fact, I dearly love those that are 1 hour or less, in that if they fail you don't lose much. |
|
Joined: 1 Feb 09; Posts: 139; Credit: 575,023; RAC: 0
|
The new one I got errored out as well, but much sooner: Unit2, with exit code 98 (0x62). |
|
Joined: 10 Apr 08; Posts: 254; Credit: 16,836,000; RAC: 0
|
"I have no idea what this means or if it is an error or an expected outcome I am only noting it here so that those of us that might see this happen again will be forewarned and hopefully GDF or my friend ignasi will admit that it is all his fault again ... :)" It is nobody's fault. A new feature has been implemented in the scientific application (ACEMD) which permits the use of a different force field, called Amber. This warning message is nothing but that: a warning. It doesn't affect the output. Thanks for caring, ignasi |
Paul D. Buck (Joined: 9 Jun 08; Posts: 1050; Credit: 37,321,185; RAC: 0)
|
"I have no idea what this means or if it is an error or an expected outcome I am only noting it here so that those of us that might see this happen again will be forewarned and hopefully GDF or my friend ignasi will admit that it is all his fault again ... :)" I was only teasing ... :) I just saw something unusual and reported it like a good boy ... And I could not resist the tease ... sorry ... I will be a good boy now. |
|
Joined: 4 Apr 09; Posts: 450; Credit: 539,316,349; RAC: 0
|
If I could clock my GTX295 higher (or if the tasks got slightly smaller), it would be great to have tasks complete in 6 or 12 hours, as that makes for a nice WU-per-day calc. The best I have clocked my shaders successfully is 1554 at stock v (I have not tried higher yet), but due to my own meddling with CPU OC, driver updates, BOINC updates, etc., I have caused many compute errors lately (I apologize to all the project team for that) and have vowed not to change anything for a while ... it is just so tempting ... up the shader ... grrr ... leaving at stock ... 1274 for now :-( Steve |
X1900AIW (Joined: 12 Sep 08; Posts: 74; Credit: 23,566,124; RAC: 0)
|
"The best I have clocked my shaders successfully is 1554 at stock v ... 1274 for now :-(" I thought there are discrete clock rates (steps of +54 MHz): 1242 - 1296 - 1350 - 1404 - 1458 - 1512 - 1566 - 1620 and so on. Test it with RivaTuner. |
|
Joined: 4 Apr 09; Posts: 450; Credit: 539,316,349; RAC: 0
|
I have an EVGA card and am using their "Precision" utility, which is actually quite nice and very easy to use. When I click the "Reset All" button the shader goes to 1274. I can either manually key in numbers (it will accept anything) or push a slider along, which moves in increments of either 7 or 9 (I am at work and don't remember exactly what the step sizes are). Are you saying that the driver will take the settings I enter and internally adjust them into the appropriate +54 step bucket? Do you know at what point it decides to go up or down? What I mean is, if I set it to 1554, is it being adjusted to 1512 or 1566? I have tested in the past with OCCT, but it really suggests that you test with SLI on, and that is not how we crunch, so I have always wondered how applicable that test is to my crunching configuration. I made a guess that if I was OK in OCCT using SLI, I would be even more stable when dropping SLI (the assumption being that there is some overhead needed to make SLI work). I will take a look at RivaTuner tonight. Steve |
|
Joined: 17 Aug 08; Posts: 2705; Credit: 1,311,122,549; RAC: 0 |
"Are you saying that the driver will take the settings I enter and internally adjust them into the appropriate +54 step bucket?" Exactly. The driver will accept any setting and will adjust the real clock to multiples of 54 ... as soon as you're not watching. "Do you know at what point it decides to go up or down? What I mean is if I set it to 1554 is it being adjusted to 1512 or 1566?" Tell us if you find out! I would have supposed that 1554 already means upclocking to 1566, but if you were stable at 1554 and not at e.g. 1570, the actual clock would seem to be 1512. MrS Scanning for our furry friends since Jan 2002 |
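The open question in this exchange can be framed with a short Python sketch. It takes the 54 MHz step size from the posts above as given (it is not verified against any driver documentation) and lists the two candidate clocks for a requested value; the round-to-nearest rule is purely an assumption, since nobody in the thread knows which way the driver actually rounds.

```python
STEP_MHZ = 54  # step size quoted in the thread, not verified against the driver


def neighbouring_steps(requested_mhz: int, step: int = STEP_MHZ) -> tuple[int, int]:
    """Return the multiples of `step` just below and just above the request."""
    lower = (requested_mhz // step) * step
    upper = lower if lower == requested_mhz else lower + step
    return lower, upper


def nearest_step(requested_mhz: int, step: int = STEP_MHZ) -> int:
    """Hypothetical rule: snap to the closest multiple (ties go down)."""
    lower, upper = neighbouring_steps(requested_mhz, step)
    return lower if requested_mhz - lower <= upper - requested_mhz else upper


requested = 1554
print(neighbouring_steps(requested))  # the two candidates discussed above
print(nearest_step(requested))        # what round-to-nearest would pick
```

For 1554 the candidates are 1512 and 1566, and round-to-nearest would pick 1566. If the card is stable at a requested 1554 but not at 1570, that would instead point to the driver rounding down, which is the test MrS proposes.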
X1900AIW (Joined: 12 Sep 08; Posts: 74; Credit: 23,566,124; RAC: 0)
|
First big workunit; it took 12+ hours. It was the missing piece in the variety of my task list and the last proof I needed of the reliability of my overclocking settings. I found BIOS flashing (including fan adjustment) considerably more stable and comfortable than software tuning. 628175, Name 19-KASHIF_HIVPR_dim_ba2-6-100-RND7091_0 (...) CPU time 1133.734 (...) `# Device 0: "GeForce GTX 260"` `# Clock rate: 1512000 kilohertz` `# Total amount of global memory: 939196416 bytes` `# Number of multiprocessors: 27` `# Number of cores: 216` `# Amber: readparm : Reading parm file parameters` `# PARM file in AMBER 7 format` `# Encounter 10-12 H-bond term` `WARNING: parameters.cu, line 568: Found zero 10-12 H-bond term.` `WARNING: parameters.cu, line 568: Found zero 10-12 H-bond term.` `# Time per step: 43.619 ms` `# Approximate elapsed time for entire WU: 43618.990 s` (...) Claimed credit 8076 Granted credit 10096 |
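The two timing lines in that log are enough to back out the size of the work unit and the hourly pay. The Python sketch below parses them and does the arithmetic; the regular expressions are written only to match the two lines as posted and are an assumption about the log format, not a documented interface of the application.

```python
import re

# The two timing lines exactly as posted, plus the granted credit from the same post.
LOG = """\
# Time per step: 43.619 ms
# Approximate elapsed time for entire WU: 43618.990 s
"""
GRANTED_CREDIT = 10096


def parse_timing(log: str) -> tuple[float, float]:
    """Return (time per step in ms, total elapsed time in s) from the log text."""
    step = re.search(r"Time per step:\s*([\d.]+)\s*ms", log)
    total = re.search(r"Approximate elapsed time for entire WU:\s*([\d.]+)\s*s", log)
    if step is None or total is None:
        raise ValueError("timing lines not found in log")
    return float(step.group(1)), float(total.group(1))


step_ms, total_s = parse_timing(LOG)
steps = total_s / (step_ms / 1000.0)  # number of simulation steps in the WU
hours = total_s / 3600.0              # run time in hours
credit_per_hour = GRANTED_CREDIT / hours

print(f"steps ~ {steps:,.0f}, run time ~ {hours:.1f} h, credit/h ~ {credit_per_hour:.0f}")
```

The numbers suggest the work unit is about one million steps, which at roughly 44 ms per step gives the 12+ hours reported above.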
©2025 Universitat Pompeu Fabra