More CPU jobs
Author | Message |
---|---|
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
Actually, 14 of the 17 failures in total are on your machines, Thomas, so it might be specific to your case. Generally the WUs seem OK. Each WU should use only 4 GB of memory. |
Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 |
It's running at 1.8 GHz, and I have an Opteron 1220 at 2.8 GHz in my drawers. It's been running since January 2008. My electricity costs me 0.21 euro/kWh and I have 3 computers running 24/7: this Opteron, an AMD E-450 and an A10-6700, which should have 4 cores, but Windows Task Manager says 2 cores and 4 logical processors. My total electricity expenditure is about 60 euro/month. Tullio I forgot to mention my Ulefone smartphone with its arm64-v8a CPU, which runs Android 7.1.1 and works on SETI@home and Einstein@home. |
Joined: 25 Mar 09 Posts: 25 Credit: 582,385 RAC: 0 |
I have an Intel 8-core (16-thread) Xeon server with a 146 GB disk drive (it had two, but one died) and 24 GB of RAM. WUs are allowed to run with 8 cores. I am getting the message that I need 28610.23 MB of disk space; I currently have 9486.42 MB spare, so it needs another 19123.81 MB. I leave 10 GB that BOINC can't use, other programmes use 12.69 GB, and BOINC is using 17.02 GB. Of that 17.02 GB, GPUGrid is using 8.29 GB, even when it is not running anything. If I allowed all my spare space to be used I would have just enough disk space for GPUGrid to run (maybe), but I don't intend to give all that space to BOINC, so I can't download and run some of these work units. If they are 6 GB then there is no problem. Why does GPUGrid need over 8 GB of disk space just to hold the project files? (I have another computer showing the same amount of used disk space, so this is the normal amount used by the project, but why? My other computer has a much larger disk, so it is not having the same issues.) Conan |
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
@Conan: the QM calculations need to store lots of data in memory for best performance. Since we cannot ask for 20 GB of RAM, the software instead writes any calculation data that exceeds the RAM limit (4 GB) to the hard drive. |
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
The current "disk limit" for CPU jobs is set at 20 GB. This is a ballpark estimate to accommodate both the software and libraries (largish by themselves) and the temporary (scratch) data. The software is reused between WUs, but you can reclaim the space by resetting the project. The scratch space is only occupied when a WU is running (or paused). |
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
*"new WUs don't seem to work: they consume a lot of memory, throw computation errors or just rest at 10% progress forever."* In your failures I see "connection errors". It could be firewall filtering or the like. |
Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 |
First SELE task done by my Old Faithful Opteron 1210 running SuSE Linux Leap 42.3. Tullio |
Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 |
I have a funny SELE task on my Linux laptop. It has been stuck at 10% for 14 hours 38 min, while the estimated remaining time keeps rising and is now more than 5 days. Everything looks normal in the "top" command and the machine has lots of disk space. Tullio |
Joined: 10 Sep 10 Posts: 163 Credit: 388,132 RAC: 0 |
*"new WUs don't seem to work: they consume a lot of memory, throw computation errors or just rest at 10% progress forever."* No firewall here, and the same problem. |
Joined: 23 Feb 17 Posts: 21 Credit: 5,528,199,475 RAC: 0 |
As said, those WUs do not work properly. I am moving to another project and will come back if they are fixed. |
Joined: 25 Mar 09 Posts: 25 Credit: 582,385 RAC: 0 |
OK, thanks Toni and Stefan for the information; that explains a lot. I will run what I can. Thanks again, Conan |
Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 |
In the slot of a running task there is an output directory which leads to a report of what the program is doing in physical terms. Maybe some explanation by the admins would be welcome. Tullio |
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
We investigated another algorithm which doesn't use scratch disk space. Unfortunately, in my test it was about 13x slower than the one that uses disk (25 minutes became 5 hours 30 minutes), so it is not a realistic choice for us. After this batch of simulations I will probably have to submit more, which will use more scratch disk (up to 30 GB), so I assume we are going to fill up some disks. |
Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 |
I have plenty of disk space on my two Linux boxes because the slots directory is in my /home/user partition, which has more than 700 GB, on my SuSE Linux Leap 42.3 and Leap 15.0 systems. What amazes me is that QC tasks are always stuck at 10% progress while GPU tasks show progress increasing. Tullio |
Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 |
*"I got plenty of disk space on my two Linux boxes because the slots directory is in my /home/user partition, which has more than 700 GB on my SuSE Linux Leap 42.3 and Leap 15.0 OS. What amazes me is that QC tasks are always stuck at 10% progress while GPU tasks show progress as increasing."* The 10% progress is explained as follows: updating the app (if necessary) counts as 10% and usually happens immediately. The remaining 90% advances as molecules are calculated (e.g. with 5 molecules, each one adds 90%/5 = 18%). However, very big WUs have only one molecule, so there is no apparent progress until the end. (We have no finer-grained progress reporting.) |
Joined: 8 Oct 12 Posts: 98 Credit: 385,652,461 RAC: 0 |
*"I got plenty of disk space on my two Linux boxes because the slots directory is in my /home/user partition, which has more than 700 GB on my SuSE Linux Leap 42.3 and Leap 15.0 OS. What amazes me is that QC tasks are always stuck at 10% progress while GPU tasks show progress as increasing."* So how much space do these WUs need? I'm running 12 at a time with 64 GB of RAM but no swap space. I see that not all 48 threads are at 100%; I suspect it's the lack of swap. |
Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 |
In the old UNIX days a rule of thumb was that you needed swap space twice the size of RAM, which was usually small. Now RAM is plentiful. I have 22 GB of RAM on the Windows 10 PC and 8 GB on each Linux box. GPUGRID CPU tasks use some swap, but most of it goes unused. Tullio |
Joined: 8 Oct 12 Posts: 98 Credit: 385,652,461 RAC: 0 |
I amped the swap to 300 GB, but it only seems to be using RAM. Is this "scratch space" kept in swap, or does the WU use the file directory for storage? I think it is the latter, since the BOINC space usage goes up and down. The thing is, my install directory is only 120 GB... I also have the feeling that I now have 300 GB of swap space for nothing, lol. I am not a smart man. |
Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 |
I see temporary files in the `slots/0` directory. They are named `psi.25019.number`. Tullio |
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 |
Yes, AFAIK it doesn't use swap space, so increasing that will not help. The scratch data is probably in the files Tullio mentioned. They are called `psi.XXXXX.XX`. Usually there are two, and the second can grow significantly. |
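
For anyone who wants to keep an eye on a running CPU WU, the two points made in the thread (progress is 10% for the app update plus 90% split evenly over molecules, and scratch data lives in `psi.*` files in the slot directory) can be sketched in Python roughly as below. This is only an illustration, not project code: the function names `expected_progress` and `scratch_usage_bytes`, the `psi.*` glob pattern and the relative path `slots/0` are assumptions taken from the posts above, and the slot number and BOINC data directory will differ per machine.

```python
from pathlib import Path


def expected_progress(molecules_done: int, molecules_total: int) -> float:
    """Progress fraction as described in the thread (assumed model):
    10% for the app update, then 90% shared evenly among molecules."""
    if molecules_total <= 0:
        raise ValueError("molecules_total must be positive")
    if not 0 <= molecules_done <= molecules_total:
        raise ValueError("molecules_done must be between 0 and molecules_total")
    return 0.10 + 0.90 * molecules_done / molecules_total


def scratch_usage_bytes(slot_dir: Path, pattern: str = "psi.*") -> int:
    """Total size of files matching `pattern` directly inside `slot_dir`."""
    return sum(p.stat().st_size for p in slot_dir.glob(pattern) if p.is_file())


if __name__ == "__main__":
    # A single-molecule WU shows 10% until it finishes.
    print(f"1 molecule, 0 done: {expected_progress(0, 1):.0%}")

    # Hypothetical location: run this from the BOINC data directory.
    slot = Path("slots/0")
    if slot.is_dir():
        print(f"scratch in {slot}: {scratch_usage_bytes(slot) / 1e9:.2f} GB")
```

With a single-molecule WU, `expected_progress(0, 1)` stays at 0.10 until the molecule is finished, which matches the "stuck at 10%" reports, while watching the `psi.*` total grow is one way to see that such a task is still doing work.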