do not run CPU benchmarks [issue not reproduced]
| Author | Message |
|---|---|
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
I could be way off base, but ... do not "Run CPU benchmarks" from BOINC Manager while GPU crunching. I just did this a few minutes ago and it immediately caused my GPU WU to fail with a "compute error" when it resumed computation:

ERROR: mdsim.cu, line 572: Invalid device selected

My guess is that ripping the CPU out from underneath the GPU to do a benchmark is not a good idea. Perhaps one of the people here who is in contact with the BOINC devs might suggest they suspend GPU tasks when performing CPU benchmarks?

4/18/2009 1:58:08 PM Running CPU benchmarks
4/18/2009 1:58:08 PM Suspending computation - running CPU benchmarks
4/18/2009 1:58:39 PM Benchmark results:
4/18/2009 1:58:39 PM Number of CPUs: 8
4/18/2009 1:58:39 PM 3509 floating point MIPS (Whetstone) per CPU
4/18/2009 1:58:39 PM 11138 integer MIPS (Dhrystone) per CPU
4/18/2009 1:58:40 PM Resuming computation
4/18/2009 1:58:43 PM GPUGRID Computation for task xp490000-GIANNI_pYEpYV0304-5-10-RND_0 finished <-- finished because error above
4/18/2009 1:58:43 PM GPUGRID Output file xp490000-GIANNI_pYEpYV0304-5-10-RND_0_1 for task xp490000-GIANNI_pYEpYV0304-5-10-RND_0 absent
4/18/2009 1:58:43 PM GPUGRID Output file xp490000-GIANNI_pYEpYV0304-5-10-RND_0_2 for task xp490000-GIANNI_pYEpYV0304-5-10-RND_0 absent
4/18/2009 1:58:43 PM GPUGRID Output file xp490000-GIANNI_pYEpYV0304-5-10-RND_0_3 for task xp490000-GIANNI_pYEpYV0304-5-10-RND_0 absent |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Which BOINC version? MrS Scanning for our furry friends since Jan 2002 |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
Sorry, I was so frustrated I didn't post any specs ... Vista 64 Ult. BOINC 6.6.20 EVGA GTX 295 - 182.5 |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
Can anyone try to reproduce this issue, either with 6.6.20 or any other version? (I don't have access to my machine right now.) MrS Scanning for our furry friends since Jan 2002 |
Dieter Matuschek Joined: 28 Dec 08 Posts: 58 Credit: 231,884,297 RAC: 0
|
Can anyone try to reproduce this issue <snip> I just did it with BOINC 6.6.20 and Windows XP 32bit SP3: no problems encountered. |
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
So we don't need tests with other BOINC versions. Snow, would you mind trying to reproduce the error? You could either do it at the beginning of a new WU or:
- forbid BOINC network access
- stop BOINC completely
- copy the entire BOINC user data folder
- start BOINC
- test
- in case of a crash: shut BOINC down and copy the backup over the existing files
- otherwise: it may have just been a coincidence

MrS Scanning for our furry friends since Jan 2002 |
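For anyone who would rather script the backup and restore steps from the post above than copy folders by hand, here is a minimal sketch. It is not anything BOINC ships: the function names `backup_data_dir` and `restore_data_dir` are hypothetical, the location of the BOINC user data folder is left to the caller, and the code assumes BOINC has already been stopped before either function is called.

```python
import shutil
from pathlib import Path


def backup_data_dir(data_dir: Path, backup_dir: Path) -> None:
    """Copy the entire data folder to backup_dir (run with BOINC stopped)."""
    if backup_dir.exists():
        # Refuse to overwrite an earlier backup by accident.
        raise FileExistsError(f"backup already exists: {backup_dir}")
    shutil.copytree(data_dir, backup_dir)


def restore_data_dir(data_dir: Path, backup_dir: Path) -> None:
    """Copy the backup over the existing files (run with BOINC stopped).

    Files present in the backup overwrite the current ones; files created
    after the backup was taken are left in place.
    """
    # dirs_exist_ok requires Python 3.8 or newer.
    shutil.copytree(backup_dir, data_dir, dirs_exist_ok=True)
```

Usage would be to call `backup_data_dir(data, backup)` once before the test and `restore_data_dir(data, backup)` only if the WU crashes, with both paths pointing at wherever the data folder lives on the machine in question.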
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
The client is different for Vista 64 bit ... I really don't want to do this again, as I have already messed up my PPD enough recently, but if there are no other takers I suppose I could stop getting new tasks, suspend what I do have, let 1 start and try it out. Hopefully I don't crap out the rest at the same time :-) I lost 6 completed WUs on Friday when my internet connection was broken. When I finally did get back online 6 hours later (well within the deadline), my "lost results" WUs were sent back to me for processing a second time. Steve |
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
OK ... I will do this, but I will be going very slowly to avoid introducing other issues. Please be patient ... I will do my best to stop ranting and get you some real results :-) |
Michael Goetz Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 0
|
BOINC runs benchmarks once a week on its own, so every computer that has been running GPUGRID for more than a week has had benchmarks running and interrupting the WUs at least once. Since this is the only report of a problem, I would have to assume the benchmark system itself is not at fault and that the problem, if there is one, only shows up when benchmarks are triggered manually. Anyway, I'll run a manual benchmark in a moment and report back later today on what happens to the WU in progress. I'm running 32 bit Vista, however, so it won't necessarily tell us anything. Mike EDIT: I'm running 6.6.20, Vista 32 bit SP1, EVGA GTX280 and slightly old driver 180.48. The CPU is a Q6600; CPU clocks are stock and the GPU is factory OC. I ran the benchmarks and the WU is still running. I'll report back when it completes. Want to find one of the largest known primes? Try PrimeGrid. Or help cure disease at WCG.
|
|
Joined: 4 Apr 09 Posts: 450 Credit: 539,316,349 RAC: 0
|
I finished trying to make this happen again, and the good news is that it did not cause a compute error this time. I actually tried it many times (5) and even resumed network activity and tried again (2 times), and everything works fine. While I would be the first to chalk this up to *user hysteria* (I am a dev by profession), as you can see from the log in my OP, it seems a little too convenient that a compute error was generated immediately following a manual benchmark run and the WU got marked completed within 3 seconds. Perhaps we just keep this one in the *maybe someday* pile in case it happens again to someone else, and only at that point spend more time trying to recreate it. My guess is that this probably needs a *perfect storm* of events. Perhaps it was trying to write a checkpoint at the same instant or something. Perhaps it even has something to do with CPU affinity: GPU0 was using CPU0 at the moment it was suspended, the CPU0 caches were swapped out and refreshed to do the bench, and then upon resume the OS tried to use CPU1 to finish what GPU0 needed done and it all got tripped up. Sorry for the alarmist tone and title in my OP; I will try to remain calmer in the future. Steve ps I wish we could edit the title so it does not cause unnecessary concern now that we know that it may not be an issue at all. |
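The "within 3 seconds" figure can be checked directly against the timestamps in the OP's log. The snippet below is only an illustrative sketch: the log lines are copied from the first post, while `parse_line` and `seconds_between` are hypothetical helper names, and the timestamp format is assumed to be the month/day/year 12-hour layout seen in that log.

```python
from datetime import datetime

# Three lines copied from the message log in the first post.
LOG = """\
4/18/2009 1:58:08 PM Running CPU benchmarks
4/18/2009 1:58:40 PM Resuming computation
4/18/2009 1:58:43 PM GPUGRID Computation for task xp490000-GIANNI_pYEpYV0304-5-10-RND_0 finished
"""


def parse_line(line):
    """Split one log line into (timestamp, message)."""
    date, time, ampm, message = line.split(" ", 3)
    stamp = datetime.strptime(f"{date} {time} {ampm}", "%m/%d/%Y %I:%M:%S %p")
    return stamp, message


def seconds_between(log, start_text, end_text):
    """Seconds from the first line containing start_text to the next line containing end_text."""
    start = None
    for line in log.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        stamp, message = parse_line(line)
        if start is None:
            if start_text in message:
                start = stamp
        elif end_text in message:
            return (stamp - start).total_seconds()
    raise ValueError("start or end marker not found in log")


gap = seconds_between(LOG, "Resuming computation", "finished")
print(gap)  # 3.0
```

A gap that short says the task failed almost as soon as it resumed, which fits the timing argument in the post but does not by itself show that the benchmark caused the error.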
|
Joined: 17 Aug 08 Posts: 2705 Credit: 1,311,122,549 RAC: 0 |
I agree: it seems like your BOINC may have been in an unusual state. MrS Scanning for our furry friends since Jan 2002 |
Michael Goetz Joined: 2 Mar 09 Posts: 124 Credit: 124,873,744 RAC: 0
|
|
©2025 Universitat Pompeu Fabra