Message boards :
News :
Update acemd3 app
Message board moderation
Previous · 1 · 2 · 3 · 4 · 5 . . . 9 · Next
Author | Message |
---|---|
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
Toni, I think the first thing that needs to be fixed is the problem with boost 1.74 library not being included in the app distribution. the app is failing right away because it's not there. you either need to distribute the .so file or statically link it into the acemd3 app so it's not needed separately. manually installing it seems to be a workaround, but that's a tall order to make every Linux user have to perform. ![]() |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
after manually installing the required boost to get past that error, I now get this error on my 3080 Ti system: 09:55:10 (4806): wrapper (7.7.26016): starting Task: https://www.gpugrid.net/result.php?resultid=32632410 I tried purging and reinstalling the nvidia drivers, but no change. it looks like this same error popped up when you first released acemd3 2 years ago: http://www.gpugrid.net/forum_thread.php?id=4935#51970 biodoc wrote: Multiple failures of this task on both windows and linux you then updated the app which fixed the problem at that time, but you didnt post exactly what was changed: http://www.gpugrid.net/forum_thread.php?id=4935&nowrap=true#52022 Toni wrote: It was a cryptic bug in the order loading shared libraries, or something like that. Otherwise unexplainably system-dependent. so whatever kind of change you made between v2.02 and v2.03 seems to be what needs fixing again. ![]() |
![]() ![]() Send message Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 998,578 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
I deployed the new app, which now requires cuda 11.2 and hopefully support all the latest cards. Touching the cuda versions is always a nightmare in boinc scheduler so expect problems. Thank you so much. Those efforts are for noble reasons. Regarding persistent errors: I also manually installed boost as a try at one of my Ubuntu 20.04 hosts, by means of the following commands: sudo add-apt-repository ppa:mhier/libboost-latest But a new task downloaded after that still failed: e3s644_e1s419p0f770-ADRIA_New_KIXcMyb_HIP_AdaptiveBandit-1-2-RND9285_4 Then, I've reset GPUGrid project, and it seems that it did the trick. A new task is currently running on this host, instead of failing after a few seconds past: e4s126_e3s248p0f238-ADRIA_New_KIXcMyb_HIP_AdaptiveBandit-0-2-RND6347_7 49 minutes, 1,919% progress by now. |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
I deployed the new app, which now requires cuda 11.2 and hopefully support all the latest cards. Touching the cuda versions is always a nightmare in boinc scheduler so expect problems. Thanks, I'll try a project reset. though I had already done a project reset after the new app was announced. I guess it can't hurt. ![]() |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
nope, even after the project reset, still the same error process exited with code 195 (0xc3, -61)</message> https://www.gpugrid.net/result.php?resultid=32632487 ![]() |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
sudo add-apt-repository ppa:mhier/libboost-latest small correction here. it's "libboost1.74", not just "boost1.74" ![]() |
![]() ![]() Send message Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 998,578 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
Maybe that your problem is an Ampere-specific one (?). I've catched a new task in another of my hosts after applying the same remedy, and it is also running as expected. e3s263_e1s419p0f938-ADRIA_New_KIXcMyb_HIP_AdaptiveBandit-1-2-RND6959_2 25 minutes, 0,560% progress by now for this second task. Turing GPUs and Nvidia drivers 465.31 on both hosts. Installing libboost1.74 didn't worked for me by itself. Resetting project didn't worked for me by itself. Installing libboost1.74 and resetting project afterfards, did work for both my hosts. I've doublechecked, and the commands that I employed were the previously published at message #57064 Watching Synaptic, this lead to libboost1.74 and libboost1.74-dev were correctly installed. |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
Maybe that your problem is an Ampere-specific one (?). I had this thought. I put in my old 2080ti to the problem-host, and will see if it starts processing, or if it's really a problem with the host-specific configuration. this isn't the first time this has happened though. and Toni previously fixed it with an app update. so it looks like that will be needed again even if it's Ampere-specifc. I think the difference in install commands comes down to the use of apt vs. apt-get. although apt-get still works, transitioning to just apt will be better in the long term. Difference between apt and apt-get ![]() |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
Maybe that your problem is an Ampere-specific one (?). well, it seems it's not Ampere specific. it failed in the same way on my 2080ti here: https://www.gpugrid.net/result.php?resultid=32632521 still the CUDA compiler error unfortunately I can't easily move the 3080ti to another system since it's a watercooled model that requires a custom water loop. ![]() |
![]() Send message Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 614,515 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() |
I just used the ppa method on my other two hosts. But I did not reboot. Picked up another task and it is running. Waiting still on the luck of the draw for the other host without work. |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
I think I finally solved the issue! it's running on the 3080ti finally! first I removed the manual installation of boost. and installed the PPA version. I don't think this was the issue though. while poking around in my OS installs, I discovered that I had the CUDA 11.1 toolkit installed (likely from my previous attempts at building some apps to run on Ampere). I removed this old toolkit, cleaned up any files, rebooted, reset the project and waited for a task to show up. so now it's running finally. now to see how long it'll take a 3080ti ;). it has over 10,000 CUDA cores so I'm hoping for a fast time. 2080ti runs about 12hrs, so it'll be interesting to see how fast I can knock it out. using about 310 watts right now. but with the caveat that ever since I've had this card, I've noticed some weird power limiting behavior. I'm waiting on an RMA now for a new card, and I'm hoping it can really stretch it's legs, plan to still power limit it to about 320W though. ![]() |
![]() ![]() Send message Joined: 24 Sep 10 Posts: 592 Credit: 11,972,186,510 RAC: 998,578 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
Congratulations! Good news...Anxious to see the performance on a 3080 Ti |
Send message Joined: 19 Nov 17 Posts: 1 Credit: 46,790,085 RAC: 0 Level ![]() Scientific publications ![]() |
Well, it will be your problem, not mine. Even having decent hard- and software I am pretty astonished how folks like cosmology and the likes seemingly do not understand how VM and the like works. I am pretty pissed as an amateur, though... |
Send message Joined: 11 Jul 09 Posts: 1639 Credit: 10,159,968,649 RAC: 295,172 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
Just seen my first failures with libboost errors on Linux Mint 20.1, driver 460.80, GTX 1660 super. Applied the PPA and reset the project - waiting on the next tasks now. |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
Well, it will be your problem, not mine. Even having decent hard- and software I am pretty astonished how folks like cosmology and the likes seemingly do not understand how VM and the like works. I am pretty pissed as an amateur, though... what problem are you having specifically? this project has nothing to do with cosmology, and this project does not use VMs. ![]() |
![]() ![]() Send message Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
I think I finally solved the issue! it's running on the 3080ti finally! This is the moment of truth we're all waiting for. My bet is 9h 15m. |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
I think I finally solved the issue! it's running on the 3080ti finally! I’m not sure it’ll be so simple. When I checked earlier, it was tracking a 12.5hr completion time. But the 2080ti was tracking a 14.5hr completion time. Either the new run of tasks are longer, or the CUDA 11.2 app is slower? We’ll have to see. ![]() |
![]() Send message Joined: 13 Dec 17 Posts: 1416 Credit: 9,119,446,190 RAC: 614,515 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() |
I'm curious how you have a real estimated time remaining calculated for a brand new application. AFAIK, you JUST got the application working and I don't believe you have validated ten tasks yet to get an accurate APR which produces the accurate estimated time remaining numbers. All my tasks are in EDF mode and multiple day estimates simply because I have returned exactly one valid task so far. A shorty Cryptic-Scout task. |
Send message Joined: 21 Feb 20 Posts: 1114 Credit: 40,838,348,595 RAC: 4,765,598 Level ![]() Scientific publications ![]() |
I didn’t use the time remaining estimate from BOINC. I estimated it myself based on % complete and elapsed time, assuming a linear completion rate. ![]() |
![]() ![]() Send message Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 1 Level ![]() Scientific publications ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() |
When I checked earlier, it was tracking a 12.5hr completion time. But the 2080ti was tracking a 14.5hr completion time.If the new tasks are longer, the awarded credit should be higher. The present ADRIA_New_KIXcMyb_HIP_AdaptiveBandit workunits "worth" 675.000 credits, while the previous ADRIA_D3RBandit_batch_nmax5000 "worth" 523.125 credits, so the present ones are longer. My estimation was 12h/1.3=9h15m (based on my optimistic 30% performance improvement expectation). Nevertheless we can use the completion times to estimate the actual performance improvement (3080Ti vs 2080Ti): The 3080 Ti completed the task in 44368s (12h 19m 28s) the 2080Ti completed the task in 52642s (14h 37m 22s), so the 3080Ti is "only" 18.65% faster. So the number of the usable CUDA cores in the 30xx series are the half of the advertised number (just as I expected), as 10240/2=5120, 5120/4352=1.1765 (so the 3080Ti has 17.65% more CUDA cores than the 2080Ti has), the CUDA cores of the 3080Ti are 1.4% faster than of the 2080Ti. |
©2025 Universitat Pompeu Fabra