Message boards : Number crunching : BOINC manager v7.8.2 has been released
Author | Message |
---|---|
You can download it from Berkeley's website. | |
ID: 47866 | Rating: 0 | rate: / Reply Quote | |
What are the changes? I can't find any info about it (?). | |
ID: 47869 | Rating: 0 | rate: / Reply Quote | |
You can download it from Berkeley's website. But I advise you not to bother, unless you're a masochist. It's riddled with bugs. | |
ID: 47870 | Rating: 0 | rate: / Reply Quote | |
You can download it from Berkeley's website. Oh, I've just updated my hosts with this version. Why was it released, if it's full of bugs? (That would look nice on the change list: "We've put more bugs in it than we and you combined could imagine") | |
ID: 47871 | Rating: 0 | rate: / Reply Quote | |
Why was it released, if it's full of bugs? That's a very deep question, and needs some background. Somewhat more than two years ago, the US Government's NSF decided not to renew a research grant which paid the salaries of the three key workers who managed and maintained the BOINC project. Those workers lost their jobs. The BOINC project was - nominally - handed over to the community to manage and maintain, but no preparation had been done: the community wasn't ready to receive it. It dropped into their lap, and they did nothing with it. Two months ago, elements of the community came together in a working group to prepare procedures through which the community could pick up the responsibility which had been thrust upon it. I wrote about it here: I'm still a member of the group, and our work is ongoing. In the meantime, the NSF has funded a new project which expands and builds upon the original BOINC project (described here). That needs some additional features in BOINC, and development work has started again. The v7.8 branch/release is really intended as a simple refresh to ensure that there is a stable base for the new NSF work (which won't appear until v7.10), and to apply some needed updates like a new version of VBox compatible with Windows 10. But life is never as simple as that... In the past, BOINC version releases have gone through a slow and extended process of alpha test releases, debugging, and re-releasing. It's taken months. The working group is moving towards more modern software practices where testing is a continual (and largely automated) process as code is written: that should allow new versions to be deployed practically 'on demand' as circumstances - like Windows and OS X updates - require them. The v7.8.2 release is the first attempt at combining the old and the new ways of working. It's revealed where the gaps lie, and we're working to fix them: I had my own first bugfix accepted into the master codebase this morning - yay! Only 19 left to go... 
| |
ID: 47872 | Rating: 0 | rate: / Reply Quote | |
Thank you for your work, and for this explanation. | |
ID: 47873 | Rating: 0 | rate: / Reply Quote | |
Nice to know. Keep us posted. | |
ID: 47874 | Rating: 0 | rate: / Reply Quote | |
I've updated on Win 10 x64 creator's edition, and I have not noticed any bugs. | |
ID: 47875 | Rating: 0 | rate: / Reply Quote | |
In particular, we would like to know the status of virtualization, because we are trying to set up a CPU application in GPUGRID that is available only for Linux. | |
ID: 47876 | Rating: 0 | rate: / Reply Quote | |
Ok, so what are the changes? | |
ID: 47878 | Rating: 0 | rate: / Reply Quote | |
Ok, so what are the changes? At the moment there is no official list of changes. "Ageless" wrote: I asked the release manager and he hopes someone else will do them. So it's anyone's guess. Source: https://boinc.berkeley.edu/dev/forum_thread.php?id=11818 Going by the above lists, there are no new "features" or serious bug fixes (for Windows/Linux). Surprisingly, even the OpenSSL library has not been updated: it is stuck on version 1.0.2g (released 1 Mar 2016), and the bundle of root certificates (ca-bundle.crt) dates from 30 April 2015. Once again I had to update them manually :/ | |
ID: 47879 | Rating: 0 | rate: / Reply Quote | |
I have (or had) BOINC 7.8.2 installed on three Ubuntu 17.04 machines and one Win7 64-bit machine with no problems. That is, until BOINC crashed (manager could not connect to client) on one of the Ubuntu machines. Even after a reboot it did not work, which I don't recall ever seeing before. So I uninstalled BOINC, and went back to 7.6.33, but it was still borked. The only other thing I can think of is that VirtualBox 5.1.28 was installed, but not attached to any projects, and removing it did not fix anything. | |
ID: 47880 | Rating: 0 | rate: / Reply Quote | |
I don't think that's a known problem with v7.8.2 (most of the bugs are more subtle than a downright crash), but it's hard to be sure from the information given. The most common cause of 'manager couldn't connect to client' is that the client isn't running: I'm not sure exactly what logs you get in Linux when an application tries, but fails, to start - that would have been the place to look, but it's probably water under the bridge now. | |
ID: 47881 | Rating: 0 | rate: / Reply Quote | |
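A quick way to test the "client isn't running" theory is to see whether anything is listening on the client's GUI RPC port, which is what the manager connects to. The sketch below is only a rough check: it assumes the default port 31416 on the local machine, and the function name, host and timeout are illustrative choices, not part of BOINC.

```python
import socket


# Assumption: 31416 is the BOINC client's default GUI RPC port; the host and
# timeout defaults are arbitrary choices for this sketch.
def client_is_listening(host="127.0.0.1", port=31416, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    if client_is_listening():
        print("Something is listening on the GUI RPC port; the client is probably running.")
    else:
        print("Nothing is listening on the GUI RPC port; the client is probably not running.")
```

A successful connection only shows that a process is listening there; it does not prove that the manager's authentication will succeed.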
OK, I thought it might be a bit much for a mere bug. I have not reinstalled yet, so I will look for any logs, and post as necessary. | |
ID: 47882 | Rating: 0 | rate: / Reply Quote | |
I experienced the exact same thing yesterday too (as Jim1348)! | |
ID: 47884 | Rating: 0 | rate: / Reply Quote | |
And behold, after we'd all written that, we get a report of an Apple client crashing at startup. After some digging, I found that the client is crashing pretty much immediately, with the following stack trace on thread 0: If anyone can get a similar report out of your Linux crashes (Linux and OS X are pretty similar behind the fancy graphics, apparently), could you post it here, please? | |
ID: 47886 | Rating: 0 | rate: / Reply Quote | |
I found and saved my BOINC logs, but don't know which one it is. But I have wiped out the OS, so can't do a "stack trace" now. Does that do you any good? | |
ID: 47887 | Rating: 0 | rate: / Reply Quote | |
I found and saved my BOINC logs, but don't know which one it is. But I have wiped out the OS, so can't do a "stack trace" now. Does that do you any good? Logs on their own probably won't help, although stderrdae.txt and stdoutdae.txt might have something. Could you look at the end of those files, please, and see if there's any mention of a failure around the time your problems started? | |
ID: 47888 | Rating: 0 | rate: / Reply Quote | |
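For anyone wanting to do the "look at the end of those files" check without scrolling through everything, here is a minimal sketch. The keyword list is a guess at what a failure might look like, and the data directory shown is the usual Debian/Ubuntu package location, which is an assumption; adjust both for your own installation.

```python
from collections import deque
from pathlib import Path

# Assumption: words that might indicate a crash; extend as needed.
KEYWORDS = ("SIGABRT", "SIGSEGV", "overflow", "Stack trace", "Exiting")


def suspicious_tail(path, last_n=200, keywords=KEYWORDS):
    """Return those of the last `last_n` lines of `path` that mention a keyword."""
    with open(path, errors="replace") as fh:
        tail = deque(fh, maxlen=last_n)  # keeps only the final last_n lines
    return [
        line.rstrip("\n")
        for line in tail
        if any(k.lower() in line.lower() for k in keywords)
    ]


if __name__ == "__main__":
    # Assumption: Debian/Ubuntu package data directory; yours may differ.
    data_dir = Path("/var/lib/boinc-client")
    for name in ("stderrdae.txt", "stdoutdae.txt"):
        log = data_dir / name
        if log.exists():
            print(f"== {log}")
            for hit in suspicious_tail(log):
                print(hit)
```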
In stderrdae.txt, I see "buffer overflow" at the beginning and " [vsyscall] SIGABRT: abort called Stack trace (18 frames):" at the end, with a whole lot in between.
[vsyscall] SIGABRT: abort called
Stack trace (18 frames):
/usr/lib/x86_64-linux-gnu/libboinc.so.7(boinc_catch_signal+0x1d8)[0x7fd5504932ac]
/lib/x86_64-linux-gnu/libc.so.6(+0x357f0)[0x7fd5507287f0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x9f)[0x7fd55072877f]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7fd55072a37a]
/lib/x86_64-linux-gnu/libc.so.6(+0x79090)[0x7fd55076c090]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x54)[0x7fd55080df84]
/lib/x86_64-linux-gnu/libc.so.6(+0x118f00)[0x7fd55080bf00]
/lib/x86_64-linux-gnu/libc.so.6(+0x1184b9)[0x7fd55080b4b9]
/lib/x86_64-linux-gnu/libc.so.6(_IO_default_xsputn+0xa9)[0x7fd5507709a9]
/lib/x86_64-linux-gnu/libc.so.6(_IO_vfprintf+0x1ccc)[0x7fd55074255c]
/lib/x86_64-linux-gnu/libc.so.6(__vsprintf_chk+0x84)[0x7fd55080b544]
/lib/x86_64-linux-gnu/libc.so.6(__sprintf_chk+0x7d)[0x7fd55080b49d]
/usr/bin/boinc(+0x9ddec)[0x560aa99fcdec]
/usr/bin/boinc(+0x7f9a9)[0x560aa99de9a9]
/usr/bin/boinc(+0x39e70)[0x560aa9998e70]
/usr/bin/boinc(+0xc4a9)[0x560aa996b4a9]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf1)[0x7fd5507133f1]
/usr/bin/boinc(+0xec9a)[0x560aa996dc9a]
Exiting...
As for stdoutdae.txt, I just see the usual parameters such as upload and download rate, with no obvious problems. I will be happy to zip and send you both of them, if there is a way. | |
ID: 47889 | Rating: 0 | rate: / Reply Quote | |
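A trace like the one above is easier to read once it is split into frames. The sketch below assumes each frame has the shape `module(symbol+offset)[address]`, as in the post, and uses three of the posted frames as sample input; the regular expression and function name are illustrative only. Read from the bottom up, the named frames suggest that a `sprintf` call inside the client ran into glibc's fortify check (`__sprintf_chk`, then `__fortify_fail`, then `abort`), which would fit the "buffer overflow" message at the start of the log.

```python
import re

# Assumed frame shape: module(symbol+offset)[address]; the symbol may be empty.
FRAME = re.compile(r"([^\s(]+)\(([^)]*)\)\[(0x[0-9a-fA-F]+)\]")


def parse_frames(text):
    """Split a flattened backtrace into (module, symbol, offset, address) tuples."""
    frames = []
    for module, inner, address in FRAME.findall(text):
        symbol, _, offset = inner.partition("+")
        frames.append((module, symbol or None, offset, address))
    return frames


# Three frames copied from the trace posted above.
SAMPLE = (
    "/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x54)[0x7fd55080df84] "
    "/lib/x86_64-linux-gnu/libc.so.6(__sprintf_chk+0x7d)[0x7fd55080b49d] "
    "/usr/bin/boinc(+0x9ddec)[0x560aa99fcdec]"
)

for module, symbol, offset, address in parse_frames(SAMPLE):
    print(f"{module:34} {symbol or '(no symbol)':16} +{offset}")
```

The unnamed `/usr/bin/boinc` frames cannot be interpreted from the log alone; resolving them would need the matching binary with debug symbols.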
I don't know yet whether they'll be any help, but please keep them in a safe place in case we need to call for them. Just to be certain, can you be sure (from file timestamps or however) that this stack trace comes from the time when you were running v7.8.2 under, I think you said, Ubuntu 17.04? | |
ID: 47890 | Rating: 0 | rate: / Reply Quote | |
I don't know yet whether they'll be any help, but please keep them in a safe place in case we need to call for them. Just to be certain, can you be sure (from file timestamps or however) that this stack trace comes from the time when you were running v7.8.2 under, I think you said, Ubuntu 17.04? I have been running Ubuntu 17.04 for several weeks, and BOINC 7.8.2 since at least 12 September, which I know from the CPDN results page; probably longer, though I can't tell from the file dates on stderrdae.txt and stdoutdae.txt since they were lost on copying. As for the time stamps, I don't really know, except that the first thing that looks like one is ======= Memory map: ======== 564646ee0000-564646fc6000 r-xp 00000000 08:05 3145802 /usr/bin/boinc and the last one is 7fd550ce0000-7fd550ce1000 rw-p 00026000 08:05 2752569 /lib/x86_64-linux-gnu/ld-2.24.so If that is referring to 08:05 UTC (04:05 EDT), then that is the right time, or for the reboot after I detected it if BOINC was still operational at that point. That would not be more than a couple of hours after it occurred. Beyond that, I will certainly save all the logs and you can PM me here or on BOINC and I will be glad to send them for your expert inspection. | |
ID: 47891 | Rating: 0 | rate: / Reply Quote | |
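A note on the memory-map lines quoted above: they match the layout of Linux's /proc/<pid>/maps, where the fields are address range, permissions, file offset, device, inode and path. If that is what they are, "08:05" is the device number (major 8, minor 5, in hexadecimal, typically a partition such as /dev/sda5), not a clock time, so it says nothing about when the crash happened. A minimal sketch of parsing such a line, with hypothetical names chosen for illustration:

```python
from typing import NamedTuple


class MapEntry(NamedTuple):
    start: int
    end: int
    perms: str
    offset: int
    dev_major: int
    dev_minor: int
    inode: int
    path: str


def parse_maps_line(line):
    """Parse one line in /proc/<pid>/maps layout into a MapEntry."""
    parts = line.split(None, 5)
    addr, perms, offset, dev, inode = parts[:5]
    path = parts[5].strip() if len(parts) > 5 else ""  # anonymous mappings have no path
    start, end = (int(x, 16) for x in addr.split("-"))
    major, minor = (int(x, 16) for x in dev.split(":"))
    return MapEntry(start, end, perms, int(offset, 16), major, minor, int(inode), path)


# First memory-map line quoted in the post above.
entry = parse_maps_line(
    "564646ee0000-564646fc6000 r-xp 00000000 08:05 3145802 /usr/bin/boinc"
)
print(f"device {entry.dev_major}:{entry.dev_minor}, inode {entry.inode}, file {entry.path}")
# prints: device 8:5, inode 3145802, file /usr/bin/boinc
```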
In later Linux kernels vsyscall is disabled. I'm running Debian and can't go past the 4.9 kernel (without fiddling) due to it. Ubuntu 17.04 ships with the 4.10 kernel as default. My machines are Ryzen 1700 and running BOINC 7.8.2 from the Stretch-backports repo. See this thread at Einstein. | |
ID: 47892 | Rating: 0 | rate: / Reply Quote | |
I have (or had) BOINC 7.8.2 installed on three Ubuntu 17.04 machines and one Win7 64-bit machine with no problems. That is, until BOINC crashed (manager could not connect to client) on one of the Ubuntu machines. Even after a reboot it did not work, which I don't recall ever seeing before. So I uninstalled BOINC, and went back to 7.6.33, but it was still borked. The only other thing I can think of is that VirtualBox 5.1.28 was installed, but not attached to any projects, and removing it did not fix anything. Just reporting back on this one for completeness. All reported cases of "won't connect, won't run, won't even run with old version" have now been traced to a newly released batch (batch 658) of CPDN climate models - specifically, WAH2 for the PNW region. These tasks all fail after one simulation month under Linux and OS X (CPDN are trying to track down the reason for that - their problem). When the tasks crash, they leave behind a huge crash dump in stderr_txt, and 51 failed upload messages. BOINC - all current versions - can't cope with that much error information, and fails with the symptoms described here. There are two known recovery routes: a) Delete the file 'account_climateprediction.net.xml' from BOINC's data directory. This detaches you temporarily from the CPDN project, until the problems are resolved and you can re-attach. b) Very carefully, edit client_state.xml to remove the <workunit> and <result> sections for any WAH2 PNW tasks you may have. Set 'no new tasks' for CPDN as soon as you get back control of BOINC. BOINC v7.8.2 is NOT, it turns out, implicated in this problem. A fix has been written, and will be included in the next BOINC release - whenever that is. | |
ID: 47895 | Rating: 0 | rate: / Reply Quote | |
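Recovery route b) above can be scripted instead of done by hand. What follows is only a sketch under several assumptions: that client_state.xml parses as well-formed XML, that `<workunit>` and `<result>` sit directly under the root element, and that they carry `<name>` and `<wu_name>` children. The marker string "wah2_pnw" and the sample task names are hypothetical, so check the real task names on your host first. Stop the BOINC client and keep a backup copy of the file before trying anything like this.

```python
import xml.etree.ElementTree as ET


def drop_tasks(xml_text, marker):
    """Remove top-level <workunit>/<result> elements whose name contains `marker`.

    Returns the cleaned XML text and a list of (tag, name) pairs that were removed.
    """
    root = ET.fromstring(xml_text)
    removed = []
    for child in list(root):  # copy the child list, since we remove while iterating
        if child.tag == "workunit":
            name = child.findtext("name", "")
        elif child.tag == "result":
            name = child.findtext("wu_name", "") or child.findtext("name", "")
        else:
            continue
        if marker in name:
            root.remove(child)
            removed.append((child.tag, name))
    return ET.tostring(root, encoding="unicode"), removed


# Made-up miniature state file; real files contain far more than this.
SAMPLE = """<client_state>
<workunit><name>wah2_pnw_example_1</name></workunit>
<workunit><name>other_task_7</name></workunit>
<result><name>wah2_pnw_example_1_0</name><wu_name>wah2_pnw_example_1</wu_name></result>
<result><name>other_task_7_0</name><wu_name>other_task_7</wu_name></result>
</client_state>"""

cleaned, removed = drop_tasks(SAMPLE, "wah2_pnw")
print(removed)
```

The sketch touches only the two sections named in the post and leaves everything else in the file alone; whether that is sufficient on a real host is not something it can guarantee.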
That is a very nice summary, and I (and a lot of other people) are fortunate that Richard visited this forum at the right time. I would add only that the problem does not appear to affect the Windows version of BOINC on CPDN, though it is not clear why not. | |
ID: 47896 | Rating: 0 | rate: / Reply Quote | |
I would add only that the problem does not appear to affect the Windows version of BOINC on CPDN, though it is not clear why not. Because the Windows version of the CPDN application doesn't crash after the first month, and doesn't produce the huge crash dump. | |
ID: 47897 | Rating: 0 | rate: / Reply Quote | |
All reported cases of "won't connect, won't run, won't even run with old version" have now been traced to a newly released batch (batch 658) of CPDN climate models - specifically, WAH2 for the PNW region. These tasks all fail after one simulation month under Linux and OS X (CPDN are trying to track down the reason for that - their problem). When the tasks crash, they leave behind a huge crash dump in stderr_txt, and 51 failed upload messages. I am pretty sure that this was the problem in my case, as I am running 14 WUs of climateprediction.net alongside gpugrid.net. It is not the first time that climateprediction.net has shut down one of my computers because the model crashed. But as I wanted to install Lubuntu 17.04 and overclock my RAM anyway, I was quick to install everything anew. And now it has worked without any problems for three days. I will handpick the climateprediction.net WUs for the time being. | |
ID: 47898 | Rating: 0 | rate: / Reply Quote | |