On new fatty WUs
| Author | Message |
|---|---|
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
"I've completed two new 'fatty' WUs on two different computers." This is not 1.8M steps. It's still 2.5M steps but the last restart till it finished was of 1.8M steps. (maybe the message could be clearer). gdf |
|
Joined: 24 Dec 08 Posts: 738 Credit: 200,909,904 RAC: 0 |
OK, thank you Mark! The main problem with OpenCL is that it doesn't have the library support and developer tools that CUDA has. CUDA has been around longer. Eventually that will change. There is an FFT library supplied with CUDA that a number of the projects make use of. From what I gather there isn't an equivalent one for OpenCL. From the project's point of view OpenCL would be the way to go, as you only need one app (however, you have to compile it with each manufacturer's compiler for it to work on that brand). The coding is different, so the apps that have already been developed need to be rewritten to work under OpenCL, which all takes time and effort. It will get there eventually; it just takes a long time for OpenCL to catch up with CUDA. BOINC blog |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
"This is not 1.8M steps. It's still 2.5M steps but the last restart till it finished was of 1.8M steps. (maybe the message could be clearer)." Oh, now it's perfectly clear. Thank you. I have overclocked my CPU by 20% since then, and GPU usage has risen by 10%. So I guess it would be better to upgrade my CPU (plus motherboard and RAM, of course) to achieve higher GPU usage. |
|
Joined: 19 Aug 07 Posts: 46 Credit: 45,339,082 RAC: 46
|
Have all the fatty WUs been sent out, and how big are the files that get uploaded? "We are talking of 10 nanoseconds of simulation per WU" I don't understand. 10 nanoseconds of simulation per WU isn't very long at all. How come the fatty units are taking twice as long? |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
I thought most of the 500 had been sent and returned but there are a few still available/running. The tasks are about 2 to 4 times as long as normal tasks, though there is quite a variety of task lengths at the present time. |
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
Each of those 500 will be reissued 10 times to achieve 100 ns of total simulation time each. gdf |
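The figures quoted in this thread can be tied together with some quick arithmetic. This is only a sketch: it assumes the 2.5M steps and the 10 ns mentioned above describe the same kind of WU, and that "100 ns ... each" means 100 ns per trajectory, reached after 10 consecutive WUs. The variable names are made up for illustration.

```python
# Figures quoted in the thread; pairing them up like this is an assumption.
STEPS_PER_WU = 2_500_000        # "It's still 2.5M steps"
NS_PER_WU = 10                  # "10 nanoseconds of simulation per WU"
TRAJECTORIES = 500              # size of the batch
TARGET_NS_PER_TRAJECTORY = 100  # "100 ns of total simulation time each"

# 1 ns = 1,000,000 fs, so this is the implied integration timestep.
timestep_fs = NS_PER_WU * 1_000_000 / STEPS_PER_WU

# How many consecutive WUs one trajectory needs to reach the target.
wus_per_trajectory = TARGET_NS_PER_TRAJECTORY // NS_PER_WU

# Totals for the whole batch.
total_wus = TRAJECTORIES * wus_per_trajectory
total_microseconds = TRAJECTORIES * TARGET_NS_PER_TRAJECTORY / 1000

print(f"implied timestep: {timestep_fs} fs")
print(f"WUs per trajectory: {wus_per_trajectory}")
print(f"WUs in total: {total_wus}")
print(f"aggregate simulated time: {total_microseconds} microseconds")
```

Under those assumptions, 10 ns is short in physical terms but still means millions of integration steps per WU, which is where the run time goes.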
|
Joined: 6 Jun 08 Posts: 152 Credit: 328,250,382 RAC: 0
|
Ignasi, It has been a few days now and there is still no answer to this question. Can you please give an explanation for this error? Thanks! Ton (ftpd) Netherlands |
GDF Joined: 14 Mar 07 Posts: 1958 Credit: 629,356 RAC: 0 |
Ignasi was abroad, but what error are you referring to? gdf |
|
Joined: 6 Jun 08 Posts: 152 Credit: 328,250,382 RAC: 0
|
gdf, Thanks for the reply. See message 18476 from skgiven! Ton (ftpd) Netherlands |
|
Joined: 23 Nov 09 Posts: 29 Credit: 17,591,899 RAC: 0
|
I'm curious. I'm currently crunching TRYP WU 1859786, which I got at the end of the server outage. Its info page says it's not a resend but a first-time download. Is this part of the original batch of 500 or a new batch? Oh... belated congratulations on getting the servers back up again: you were missed, but I'm sure Collatz appreciated the overflow ;D
|
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
To quote GDF, "Each of those 500 will be reissued 10 times to achieve 100 ns of total simulation time each". Basically a batch of 500 was created, sent out, returned, and used to build another 500. When these are returned they will create another 500 and so on for 10 times. This highlights the importance of fast turnaround. |
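To see why turnaround matters for chained work like this, here is a rough sketch. It assumes each trajectory is a chain in which the next WU can only be built once the previous one has been returned; the function names and the turnaround figures are hypothetical, not taken from the project's scheduler.

```python
def chain_days(turnarounds_days):
    """Wall-clock days for one trajectory: its WUs run strictly one after another."""
    return sum(turnarounds_days)


def batch_days(chains):
    """The batch is only complete when its slowest chain has finished."""
    return max(chain_days(chain) for chain in chains)


# Hypothetical example: ten consecutive WUs per trajectory.
fast_chain = [1] * 10            # every WU returned within a day
one_slow_host = [1] * 9 + [5]    # a single WU sat on a slow host for 5 days

print(chain_days(fast_chain))                   # days for the quick chain
print(chain_days(one_slow_host))                # days when one step is slow
print(batch_days([fast_chain, one_slow_host]))  # batch waits for the slowest
```

Because the steps are sequential, a delay at any one step is added in full to that trajectory's finish time; it cannot be made up by other hosts working in parallel.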
|
Joined: 6 Jun 08 Posts: 152 Credit: 328,250,382 RAC: 0
|
Skgiven & Ignasi, Two more WUs aborted after many hours of processing on the same machine/card. Can you please take a look and reply? Thanks Ton (ftpd) Netherlands |
|
Joined: 10 Apr 08 Posts: 254 Credit: 16,836,000 RAC: 0
|
|
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Motivated to make a systemic suggestion here. Thanks, |
|
Joined: 12 Feb 07 Posts: 9 Credit: 0 RAC: 0
|
That's a funny error - it looks like a driver bug affecting the 9800 GTX (which is what a GTS250 is, too). What driver version do you have? Matt |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Well, if it’s a driver bug then the same bug in driver 19713 is still there in driver 25896, and it affects both the 9800 GT (511MB) and the 1GB version of the GTS250, suggesting this one is not related to memory size. It would be interesting to see whether the bug is present for 6.11 tasks, i.e. with the CUDA 3.1 based app rather than the 2.3 files. 3.1 might be slower on the older cards, but if there were fewer errors it could be faster overall. ftpd, at the time of the errors did you have other (non-GPUGrid) GPU tasks running? |
|
Joined: 6 Jun 08 Posts: 152 Credit: 328,250,382 RAC: 0
|
@skgiven, At that time only GPUGrid was running; the GTS250 can only do one job at a time. RNA World was running on the CPU. Later that week two other WUs were also aborted. HIVPR? Is that enough information? Good luck Ton (ftpd) Netherlands |
skgiven Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 |
Hi Ton, "At that time only gpugrid was running, gts250 can only do one job." Thanks, I was just trying to determine whether Collatz or other GPU tasks were overwriting your GPUGrid WU's memory, which can occasionally happen if you run two GPU projects on the same computer. They don’t run at the same time, but BOINC can stop one, suspend it in memory, and let the other run. Sometimes this corrupts the task, but I doubt that is the case here (you only ran one Collatz task on that GPU, probably when there was a task shortage). "RNA World was running using CPU." Unless it was running Beta tasks it would not mess up the GPUGrid task. If you had lots of RNA failures on the same system, it would be fair to say that could destabilise the system and cause problems for GPU tasks (as they have to use the CPU too). "Later that week two other WU also were aborted. HIVPR?" Well, I expect in your case you were not video editing or playing a computer game, or you would have said. So I think so. I see you are having to user-abort these tasks, and I can fully understand why: they all crashed on that card. It is a real pain for a cruncher to have to sit over a system that way. I'm not keen on this situation. As Matt said, there is a driver "related" bug: (SWAN : FATAL : Failure executing kernel sync [transpose_float2] [700] Assertion failed: 0, file swanlib_nv.cpp, line 121) I just wanted to make sure there was not something else. However, the same problem has been raised in two other threads: http://www.gpugrid.net/forum_thread.php?id=2274 http://www.gpugrid.net/forum_thread.php?id=2278 We can’t do any more than ignasi did: inform the developers and throw in a few more suggestions, if only to treat the symptoms. The researchers have been working on new applications for a while now, 5% faster last I heard. Hopefully they can find a workaround for this, as well as make the app much faster for GTX460s and so on. |
|
Joined: 6 Jun 08 Posts: 152 Credit: 328,250,382 RAC: 0
|
@skgiven, Hi Kev, I never aborted a WU myself. Games are never played on this machine, just Outlook and Internet Explorer. RNA: no beta jobs. There were Linux beta jobs. No failures. Success! Ton (ftpd) Netherlands |
Retvari Zoltan Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 0
|
I've discovered a new type of fatty WU on one of my computers called *_IBUCH_1_pYEEI_long_*. It has been running for 4 hours 15 minutes and has completed 55% (GTX 480 @ 800MHz, 63% GPU usage, SWAN_SYNC=0, C2Q 9550 @ 3.71GHz). I'm a little surprised. |
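For a rough idea of what those numbers imply, the total run time can be extrapolated from the elapsed time and the progress figure. This is only a sketch: it assumes progress is linear in time, which the thread does not confirm, and the function name is made up for illustration.

```python
def estimate_total_minutes(elapsed_minutes, fraction_done):
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in the range (0, 1]")
    return elapsed_minutes / fraction_done


elapsed = 4 * 60 + 15  # 4 hours 15 minutes, as reported above
total = estimate_total_minutes(elapsed, 0.55)
remaining = total - elapsed

# Split the rounded minute counts into hours and minutes for display.
total_h, total_m = divmod(round(total), 60)
rem_h, rem_m = divmod(round(remaining), 60)
print(f"estimated total: about {total_h} h {total_m} min")
print(f"estimated remaining: about {rem_h} h {rem_m} min")
```

If progress really is linear, that puts this WU at somewhat under eight hours in total on the GTX 480 described.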
©2025 Universitat Pompeu Fabra