Simultaneously starting MCs

Message boards : Multicore CPUs : Simultaneously starting MCs

DRSMT Send message Joined: 23 Feb 17 Posts: 21 Credit: 5,528,142,362 RAC: 1,893,481 Level Scientific publications	Message 49378 - Posted: 2 May 2018 \| 11:38:08 UTC
	Problem with simultaneously starting Multicore CPU tasks has not been fixed yet!

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49382 - Posted: 2 May 2018 \| 14:48:28 UTC - in response to Message 49378.
	Quote (DRSMT): "Problem with simultaneously starting Multicore CPU tasks has not been fixed yet!"
	I have heard about this error, but maybe I don't understand the symptoms. I have had three 4-thread WUs start at once; sometimes they work, sometimes they don't. Could this be the cause of my errors? The tasks from the system are linked below:
	http://www.gpugrid.net/results.php?hostid=424454

DRSMT Send message Joined: 23 Feb 17 Posts: 21 Credit: 5,528,142,362 RAC: 1,893,481 Level Scientific publications	Message 49383 - Posted: 2 May 2018 \| 16:15:15 UTC - in response to Message 49382.
	If two WUs start at the same time, in most cases one of them fails right at the start and throws a calculation error. This is very inconvenient if you often have to start more than one WU at the same time (in my case up to 20 WUs on my 80-thread machine). I would like to hear a statement from the developer(s), or simply get a bugfix within the WUs, because at the moment I do not see any suitable workaround. I don't think it's just your fault or mine, because several users have been reporting this issue for a while now.

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49614 - Posted: 6 Jun 2018 \| 12:29:25 UTC - in response to Message 49383. Last modified: 6 Jun 2018 \| 12:34:48 UTC
	If it is still failing, please provide a task number for me to check. @Thomas: please try resetting the project.

captainjack Send message Joined: 9 May 13 Posts: 171 Credit: 4,379,345,966 RAC: 9,019,539 Level Scientific publications	Message 49615 - Posted: 6 Jun 2018 \| 12:32:41 UTC Last modified: 6 Jun 2018 \| 12:36:41 UTC
	Toni,
	Here are two tasks that started at the same time. Both failed: 17737243 and 17737150.
	Let me know if you need more info.
	EDIT: Here is a task that started by itself and failed: 17714391

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49616 - Posted: 6 Jun 2018 \| 12:36:17 UTC - in response to Message 49615. Last modified: 6 Jun 2018 \| 12:37:19 UTC
	@captainjack: please try two things for me:
	1. Open a terminal and run the flock command. See if it gives an error (command not found) or a longer message.
	2. Reset the project.
	Thanks
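The first check above can also be scripted. The following is a minimal sketch, assuming a Linux host with Python 3 available; the function name `check_flock` is hypothetical and not part of the project's tooling. It reports either "command not found" or whatever message `flock` prints when run without arguments.

```python
import shutil
import subprocess


def check_flock():
    """Report whether `flock` is on PATH and what it prints when run bare."""
    path = shutil.which("flock")
    if path is None:
        # Same situation as the shell reporting "command not found".
        return "command not found"
    # Run flock with no arguments; it is expected to print a short
    # usage/error message rather than wait for input.
    proc = subprocess.run(
        [path],
        stdin=subprocess.DEVNULL,
        capture_output=True,
        text=True,
        timeout=10,
    )
    message = (proc.stderr or proc.stdout).strip()
    return f"found at {path}: {message}"


if __name__ == "__main__":
    print(check_flock())
```

A "found at" result corresponds to the "longer message" case in step 1.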

captainjack Send message Joined: 9 May 13 Posts: 171 Credit: 4,379,345,966 RAC: 9,019,539 Level Scientific publications	Message 49617 - Posted: 6 Jun 2018 \| 12:50:46 UTC
	Toni,
	The "flock" command was found and asked for more arguments.
	After a project reset, the following two tasks were started and both failed: 17737309 and 17737152.
	The following task was started by itself and failed: 17737252
	What next?

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49618 - Posted: 6 Jun 2018 \| 12:55:33 UTC - in response to Message 49617.
	:( Which means that, for your host, the new app is a regression. I need some enlightenment. T

Stefan Project administrator Project developer Project tester Project scientist Send message Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 Level Scientific publications	Message 49620 - Posted: 6 Jun 2018 \| 13:18:42 UTC
	Statistically, though, the new app seems to have worked on other hosts. We went from 900 WUs to 1500 WUs in progress.

DRSMT Send message Joined: 23 Feb 17 Posts: 21 Credit: 5,528,142,362 RAC: 1,893,481 Level Scientific publications	Message 49621 - Posted: 6 Jun 2018 \| 13:20:38 UTC
	It is just the same with all my computers... Toni, would it help if I sent you, by private message, the remote-access credentials for one of my Linux machines, so you can test it yourself?

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49622 - Posted: 6 Jun 2018 \| 13:40:14 UTC - in response to Message 49621.
	Thomas, that would help, but perhaps let me ask another thing first: What OS do you have, and which procedure did you use to install boinc? (Also, I assume QM tasks were working ok before, right?)

DRSMT Send message Joined: 23 Feb 17 Posts: 21 Credit: 5,528,142,362 RAC: 1,893,481 Level Scientific publications	Message 49623 - Posted: 6 Jun 2018 \| 13:52:34 UTC - in response to Message 49622.
	Sometimes they work and sometimes not. If two or more WUs start at the same time, they all throw calculation errors. That was the state until now. But with the brand-new version you released today, it seems they are not working at all anymore. My operating system is Linux Mint 18.3 64-bit with the current Linux kernel. I installed BOINC with "sudo apt-get install boinc". gcc-5 and g++-5 are installed, as is python-support.

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49624 - Posted: 6 Jun 2018 \| 14:16:27 UTC - in response to Message 49623. Last modified: 6 Jun 2018 \| 14:18:08 UTC
	Ok thanks. This will need a while to debug. As you can see there is no error info to help (as usual in boinc...).

[VENETO] boboviz Send message Joined: 10 Sep 10 Posts: 160 Credit: 388,132 RAC: 0 Level Scientific publications	Message 49625 - Posted: 6 Jun 2018 \| 14:33:24 UTC Last modified: 6 Jun 2018 \| 14:43:00 UTC
	All errors on my VM (with 4 virtual cores).
	<message> process exited with code 195 (0xc3, -61) </message>
	<stderr_txt>
	16:29:25 (2940): wrapper (7.7.26016): starting
	16:29:25 (2940): wrapper (7.7.26016): starting
	16:29:25 (2940): wrapper: running /bin/bash (-c "flock /var/lib/boinc-client/projects/www.gpugrid.net/miniconda.lock ./miniconda-installer -b -u -p /var/lib/boinc-client/projects/www.gpugrid.net/miniconda")
	Please run using "bash" or "sh", but not "." or "source"\n16:29:26 (2940): /bin/bash exited; CPU time 0.000000
	16:29:26 (2940): app exit status: 0x1
	I do not have an app_config.
	Addendum: I also tried stopping all WUs and starting them manually, one by one. Same error.
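The three numbers in "exited with code 195 (0xc3, -61)" are the same one-byte exit status shown in decimal, in hexadecimal, and as a signed 8-bit value. A minimal sketch of that arithmetic follows; the helper name `describe_exit_code` is hypothetical.

```python
def describe_exit_code(code: int) -> str:
    """Format an exit status as decimal, hex, and signed 8-bit value."""
    unsigned = code & 0xFF  # exit statuses fit in one byte
    # Reinterpret the byte as a signed 8-bit integer (two's complement).
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return f"{unsigned} (0x{unsigned:x}, {signed})"


if __name__ == "__main__":
    print(describe_exit_code(195))  # 195 (0xc3, -61)
```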

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49626 - Posted: 6 Jun 2018 \| 14:57:07 UTC - in response to Message 49624.
	Version 3.20 is out.

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49627 - Posted: 6 Jun 2018 \| 15:10:13 UTC - in response to Message 49626.
	Also, you need the libc6-dev package:
	sudo apt install libc6-dev

bormolino Send message Joined: 16 May 13 Posts: 41 Credit: 89,480,947 RAC: 73,192 Level Scientific publications	Message 49629 - Posted: 6 Jun 2018 \| 15:29:59 UTC
	All tasks exit with error:
	14:19:18 (8806): wrapper: running /bin/bash (-c "flock /var/lib/boinc-client/projects/www.gpugrid.net/miniconda.lock ./miniconda-installer -b -u -p /var/lib/boinc-client/projects/www.gpugrid.net/miniconda")
	Please run using "bash" or "sh", but not "." or "source"\n14:19:19 (8806): /bin/bash exited; CPU time 0.001596
	14:19:19 (8806): app exit status: 0x1

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49630 - Posted: 6 Jun 2018 \| 15:32:00 UTC - in response to Message 49629.
	Those were 3.19. See how it goes with 3.20.

captainjack Send message Joined: 9 May 13 Posts: 171 Credit: 4,379,345,966 RAC: 9,019,539 Level Scientific publications	Message 49631 - Posted: 6 Jun 2018 \| 15:51:43 UTC
	When I tried to start two at the same time, one of them ran okay and the other one aborted.
	Task ID of the aborted work unit: 17714759
	Work unit number of the aborted work unit: 13679491
	Running version 3.20.
	Let me know if you need more info.

DRSMT Send message Joined: 23 Feb 17 Posts: 21 Credit: 5,528,142,362 RAC: 1,893,481 Level Scientific publications	Message 49632 - Posted: 6 Jun 2018 \| 17:57:54 UTC
	I already had libc6-dev installed... Earlier I got a bunch of new WUs, but all of them failed after several minutes of calculation (~5-15 minutes).

STARBASEn Send message Joined: 17 Feb 09 Posts: 91 Credit: 1,603,303,394 RAC: 0 Level Scientific publications	Message 49634 - Posted: 6 Jun 2018 \| 21:16:59 UTC
	Started two QC WUs simultaneously using app version 3.20, and one failed with this stderr report:
	<core_client_version>7.9.3</core_client_version>
	<![CDATA[
	<message> process exited with code 195 (0xc3, -61)</message>
	<stderr_txt>
	13:45:21 (5569): wrapper (7.7.26016): starting
	13:45:21 (5569): wrapper (7.7.26016): starting
	13:45:21 (5569): wrapper: running /bin/bash (-c "flock /var/lib/boinc/projects/www.gpugrid.net/miniconda.lock ./miniconda-installer.sh -b -u -p /var/lib/boinc/projects/www.gpugrid.net/miniconda")
	flock: failed to execute ./miniconda-installer.sh: Text file busy
	13:45:30 (5569): /bin/bash exited; CPU time 0.000265
	13:45:30 (5569): app exit status: 0x45
	13:45:30 (5569): called boinc_finish(195)
	</stderr_txt>
	]]>
	The machine is an AMD FX8350 at 4.1 GHz running Fedora 28 (kernel 4.16.13), with both gcc and glibc-devel installed. I had been running both of my 8-core machines in 4-core mode with 2 concurrent QC WUs for several months, with only one simultaneous-start error the whole time, but with no other project competing for the CPU. I found that when running both WCG and QC CPU WUs, simultaneous starts occurred more frequently. Unfortunately, when that happened, failures would leapfrog through the queue quickly and get the computer blacklisted from more work for a day.

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49636 - Posted: 6 Jun 2018 \| 21:35:46 UTC - in response to Message 49634.
	Yes, now simultaneous starts crash with "text file busy". Will look for yet another workaround.
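For readers unfamiliar with "Text file busy": on Linux this is the ETXTBSY error, which is typically returned when a program is executed while some process still has that file open for writing. The thread does not say why the installer was busy, so the sketch below only illustrates the error itself, assuming a Linux host; the function name is hypothetical and other systems may behave differently.

```python
import errno
import os
import stat
import subprocess
import tempfile


def exec_while_open_for_writing():
    """Try to run a script that is still open for writing.

    Returns (errno_while_open, returncode_after_close). On Linux the first
    value is typically errno.ETXTBSY; elsewhere it may be None.
    """
    fd, path = tempfile.mkstemp(suffix=".sh")
    try:
        os.write(fd, b"#!/bin/sh\nexit 0\n")
        os.fchmod(fd, stat.S_IRWXU)  # make the script executable
        err = None
        try:
            # The descriptor above is still open read/write at this point.
            subprocess.run([path])
        except OSError as exc:
            err = exc.errno
        os.close(fd)
        fd = None
        # With the writer gone, the same file can normally be executed.
        try:
            rc = subprocess.run([path]).returncode
        except OSError:
            rc = None  # e.g. temp directory mounted noexec
        return err, rc
    finally:
        if fd is not None:
            os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    err, rc = exec_while_open_for_writing()
    print("while open:", errno.errorcode.get(err, err), "| after close:", rc)
```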

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49640 - Posted: 7 Jun 2018 \| 8:32:32 UTC - in response to Message 49636.
	Attempting a fix in version 3.21.

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49641 - Posted: 7 Jun 2018 \| 10:48:54 UTC
	Just got a couple of errors from last night's WUs, version 3.20. Here are the links:
	https://www.gpugrid.net/result.php?resultid=17740487
	https://www.gpugrid.net/result.php?resultid=17740592
	All the rest ran fine.

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49642 - Posted: 7 Jun 2018 \| 12:03:25 UTC - in response to Message 49641.
	^^ These seem to be connection errors (network down or so)

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49643 - Posted: 7 Jun 2018 \| 12:37:07 UTC - in response to Message 49642.
	Quote (Toni): "^^ These seem to be connection errors (network down or so)"
	Can the network affect computation after the WU is already downloaded?

kain Send message Joined: 3 Sep 14 Posts: 152 Credit: 907,307,369 RAC: 773,263 Level Scientific publications	Message 49644 - Posted: 7 Jun 2018 \| 13:25:26 UTC
	I have exactly the same problems, so I don't think this is connection-related.

captainjack Send message Joined: 9 May 13 Posts: 171 Credit: 4,379,345,966 RAC: 9,019,539 Level Scientific publications	Message 49645 - Posted: 7 Jun 2018 \| 13:26:02 UTC
	Version 321 looks promising. Just finished two that started at the same time and they finished normally. Just started three at the same time and they are all processing as they should.

STARBASEn Send message Joined: 17 Feb 09 Posts: 91 Credit: 1,603,303,394 RAC: 0 Level Scientific publications	Message 49646 - Posted: 7 Jun 2018 \| 14:31:14 UTC
	Success with two QC WUs starting simultaneously using app 3.21. Both WUs are happily crunching at 1.098% completed so far.
	Event Log Excerpt:
	Thu 07 Jun 2018 07:22:04 AM MST | GPUGRID | task m0000000638_2278bdac_n00020-SDOERR_QMML50-0-1-RND5957_0 resumed by user
	Thu 07 Jun 2018 07:22:04 AM MST | GPUGRID | task m0000000643_5ed133e9_n00020-SDOERR_QMML50-0-1-RND3172_0 resumed by user
	Thu 07 Jun 2018 07:22:05 AM MST | GPUGRID | Starting task m0000000638_2278bdac_n00020-SDOERR_QMML50-0-1-RND5957_0
	Thu 07 Jun 2018 07:22:05 AM MST | GPUGRID | Starting task m0000000643_5ed133e9_n00020-SDOERR_QMML50-0-1-RND3172_0

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49647 - Posted: 7 Jun 2018 \| 14:55:56 UTC - in response to Message 49643.
	Quote (Toni): "^^ These seem to be connection errors (network down or so)"
	Quote (PappaLitto): "Can the network affect computation after the WU is already downloaded?"
	Yes. WUs check for the latest version of conda packages/libraries right after start (from conda cloud).

mmonnin Send message Joined: 2 Jul 16 Posts: 337 Credit: 7,773,367,558 RAC: 72,226 Level Scientific publications	Message 49652 - Posted: 7 Jun 2018 \| 20:30:07 UTC - in response to Message 49647.
	Quote (Toni): "^^ These seem to be connection errors (network down or so)"
	Quote (PappaLitto): "Can the network affect computation after the WU is already downloaded?"
	Quote (Toni): "Yes. WUs check for the latest version of conda packages/libraries right after start (from conda cloud)."
	Dang, is that only after the first start? If tasks were started, paused, another started, etc., could networking be disabled after they start? Or does it happen at each start/resume? Not me, but I've heard of some people setting up schedules that allow downloads only at certain times of the day due to varying bandwidth costs.

STARBASEn Send message Joined: 17 Feb 09 Posts: 91 Credit: 1,603,303,394 RAC: 0 Level Scientific publications	Message 49654 - Posted: 7 Jun 2018 \| 21:14:00 UTC
	Today I have had three successful simultaneous starts and returns without error on two different FX8350 machines. As far as I am concerned, this bug has been resolved at least for the Fedora distro.

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49655 - Posted: 8 Jun 2018 \| 1:40:53 UTC
	Haven't gotten a single error with version 3.21 and it's been running all day. What did you change?

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49656 - Posted: 8 Jun 2018 \| 12:01:04 UTC - in response to Message 49655. Last modified: 8 Jun 2018 \| 12:06:29 UTC
	The main change was locking the miniconda directory during the initial installation/update. This in turn required some workarounds. It may not be perfect, but it should be much better.
	Regarding network access: it is attempted at each WU start or restart. The amount of downloaded data should usually be negligible (except the first time).
	Edit to add: the network accesses are only to "conda cloud", a Python distribution and package manager.
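The locking idea described above can be sketched as follows. This is not the project's actual code: it is a minimal illustration, assuming a POSIX host, that uses Python's `fcntl.flock` to serialize an install step behind a lock file (the same idea as the `flock ... miniconda.lock` command seen in the logs). The names `exclusive_lock`, `install_once`, and the `.installed` marker file are hypothetical.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(lock_path):
    """Hold an exclusive advisory lock on lock_path for the with-block."""
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until other holders release
        yield
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)


def install_once(prefix, lock_path):
    """Hypothetical stand-in for the install/update step.

    Tasks that start together queue up on the lock; only the first one
    performs the installation, the others see the marker and skip it.
    """
    with exclusive_lock(lock_path):
        marker = os.path.join(prefix, ".installed")
        if os.path.exists(marker):
            return "already installed"
        os.makedirs(prefix, exist_ok=True)
        # The real installation work would run here, under the lock.
        with open(marker, "w") as handle:
            handle.write("ok\n")
        return "installed"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lock = os.path.join(tmp, "miniconda.lock")
        prefix = os.path.join(tmp, "miniconda")
        print(install_once(prefix, lock))  # installed
        print(install_once(prefix, lock))  # already installed
```

Keeping the lock file outside the directory being installed means the lock survives even if the install step replaces the directory contents.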

[VENETO] boboviz Send message Joined: 10 Sep 10 Posts: 160 Credit: 388,132 RAC: 0 Level Scientific publications	Message 49657 - Posted: 8 Jun 2018 \| 15:18:33 UTC
	No more errors, but some strange behaviour: the remaining time shown for every WU is over 20 h, but WUs are crunched in 40-50 minutes...

Keith Myers Send message Joined: 13 Dec 17 Posts: 1376 Credit: 8,054,655,727 RAC: 6,109,659 Level Scientific publications	Message 49661 - Posted: 9 Jun 2018 \| 1:27:58 UTC
	Didn't get any response from my post in the cpu tasks thread. How do you get the cpu tasks? I tried and failed. Is the QC app still considered a Test app? That was the only Preference toggle I didn't select.

klepel Send message Joined: 23 Dec 09 Posts: 189 Credit: 4,759,881,008 RAC: 618,108 Level Scientific publications	Message 49662 - Posted: 9 Jun 2018 \| 1:59:22 UTC - in response to Message 49661. Last modified: 9 Jun 2018 \| 2:02:42 UTC
	Did you check "Use CPU" as well? You might not have allowed it. No, it is not necessary to check BETA tasks. I have not checked it either, but I do get QC tasks.

Keith Myers Send message Joined: 13 Dec 17 Posts: 1376 Credit: 8,054,655,727 RAC: 6,109,659 Level Scientific publications	Message 49663 - Posted: 9 Jun 2018 \| 18:47:33 UTC - in response to Message 49662.
	Yes, CPU was checked for use in both places, as was the QC app. Didn't get any QC tasks either time I tried. Just GPU tasks. The scheduler request was for both CPU and GPU work. I know there are plenty of CPU tasks to farm out. Couldn't explain why I didn't get any work. I'll have to try again without GPU work checked, I guess.

Jim1348 Send message Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 Level Scientific publications	Message 49664 - Posted: 9 Jun 2018 \| 21:42:15 UTC - in response to Message 49663. Last modified: 9 Jun 2018 \| 22:28:02 UTC
	Quote (Keith Myers): "Yes, CPU was checked for use in both places, as was the QC app. Didn't get any QC tasks either time I tried. Just GPU tasks."
	That is your problem. The BOINC scheduler can get all mixed up when you select both CPU and GPU work on the same project, and you are eventually left high and dry on one or the other. There are several discussions of it at Einstein, which has the same problem since they do both CPU and GPU work. Here is one recent discussion, where the moderator explains why the requester is not getting GPU work:
	https://einsteinathome.org/content/not-getting-gpu-wus-anymore#comment-165295
	I use separate machines for the CPU work and the GPU work on GPUGrid.

Keith Myers Send message Joined: 13 Dec 17 Posts: 1376 Credit: 8,054,655,727 RAC: 6,109,659 Level Scientific publications	Message 49665 - Posted: 9 Jun 2018 \| 23:05:59 UTC - in response to Message 49664.
	Thanks for the reply. I don't know. It worked a couple of months ago when the QC app and tasks first showed up. I was crunching both gpu and cpu at the same time. I know that shutting off a gpu request will probably work just to get some of the new QC tasks along with the latest 3.21 app. I was just wondering how well the app works now with concurrent starts.

Jim1348 Send message Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 Level Scientific publications	Message 49666 - Posted: 10 Jun 2018 \| 2:06:18 UTC - in response to Message 49665.
	Quote (Keith Myers): "I was just wondering how well the app works now with concurrent starts."
	Let's put it this way: I was running three 3.21 QC tasks on my i7-8700 and rebooted. They all resumed normally without error. So it is solved well enough.

kain Send message Joined: 3 Sep 14 Posts: 152 Credit: 907,307,369 RAC: 773,263 Level Scientific publications	Message 49667 - Posted: 10 Jun 2018 \| 9:34:50 UTC
	My Threadripper crunched 50 WUs last night with no issues. Good job :)

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49668 - Posted: 10 Jun 2018 \| 9:44:30 UTC - in response to Message 49663.
	Quote (Keith Myers): "Yes, CPU was checked for use in both places, as was the QC app. Didn't get any QC tasks either time I tried. Just GPU tasks. The scheduler request was for both CPU and GPU work. I know there are plenty of CPU tasks to farm out. Couldn't explain why I didn't get any work. I'll have to try again without GPU work checked, I guess."
	At the risk of stating trivialities: did you check in the log if by chance it's a matter of disk space (either allocated to BOINC, or actually free)? QC tasks are unusually demanding on disk space. Anyway: thanks for trying to make it work :)

kain Send message Joined: 3 Sep 14 Posts: 152 Credit: 907,307,369 RAC: 773,263 Level Scientific publications	Message 49669 - Posted: 10 Jun 2018 \| 9:58:13 UTC
	Not an important thing, but are you planning to implement separate badges for CPU points?

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49673 - Posted: 11 Jun 2018 \| 18:18:03 UTC
	Hello, I just got a new error with v3.21. Does anyone have any idea what could be causing it? Task is available below: http://www.gpugrid.net/result.php?resultid=17764870

Keith Myers Send message Joined: 13 Dec 17 Posts: 1376 Credit: 8,054,655,727 RAC: 6,109,659 Level Scientific publications	Message 49675 - Posted: 12 Jun 2018 \| 4:10:29 UTC - in response to Message 49668.
	Quote (Keith Myers): "Yes, CPU was checked for use in both places, as was the QC app. Didn't get any QC tasks either time I tried. Just GPU tasks. The scheduler request was for both CPU and GPU work. I know there are plenty of CPU tasks to farm out. Couldn't explain why I didn't get any work. I'll have to try again without GPU work checked, I guess."
	Quote (Toni): "At the risk of stating trivialities: did you check in the log if by chance it's a matter of disk space (either allocated to BOINC, or actually free)? QC tasks are unusually demanding on disk space. Anyway: thanks for trying to make it work :)"
	Well, the disk space allotted to BOINC is 10 GB, and I have about 8 GB free for BOINC/project use, so that wasn't the issue. It was probably the request for both GPU and CPU work at the same time. My multiple requests for CPU work just loaded me up with GPU work on top of my normal GPU work. I'm waiting until I clear out the GPU work so I can request CPU work only. We'll see if that makes the scheduler send me CPU work.

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49699 - Posted: 20 Jun 2018 \| 1:38:08 UTC
	Hello, I am having issues with a new Ubuntu 18.04 installation. I have installed the necessary packages but I am still getting errors. Can you let me know what the issue is? Errors linked below http://www.gpugrid.net/results.php?hostid=480159&offset=0&show_names=0&state=5&appid=

Keith Myers Send message Joined: 13 Dec 17 Posts: 1376 Credit: 8,054,655,727 RAC: 6,109,659 Level Scientific publications	Message 49700 - Posted: 20 Jun 2018 \| 7:01:18 UTC - in response to Message 49699.
	Lots of things don't work in 18.04 the way they did in 16.04. I figure the changes in GTK and Python are the root cause of why compute doesn't work the same.

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49701 - Posted: 20 Jun 2018 \| 8:28:20 UTC - in response to Message 49699.
	Quote (PappaLitto): "Hello, I am having issues with a new Ubuntu 18.04 installation. I have installed the necessary packages but I am still getting errors. Can you let me know what the issue is? Errors linked below http://www.gpugrid.net/results.php?hostid=480159&offset=0&show_names=0&state=5&appid="
	Have you carried over the boinc dir from a previous installation? In any case, try resetting the project.

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49702 - Posted: 20 Jun 2018 \| 10:58:25 UTC - in response to Message 49701.
	Quote (PappaLitto): "Hello, I am having issues with a new Ubuntu 18.04 installation. I have installed the necessary packages but I am still getting errors. Can you let me know what the issue is? Errors linked below http://www.gpugrid.net/results.php?hostid=480159&offset=0&show_names=0&state=5&appid="
	Quote (Toni): "Have you carried over the boinc dir from a previous installation? In any case, try resetting the project."
	Unfortunately, I tried resetting the project with no luck. This is a brand-new installation. gcc and libc6-dev were reported as already being the most recent version after I ran sudo apt update. Do I need to run sudo apt upgrade after sudo apt update?

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49703 - Posted: 20 Jun 2018 \| 11:28:42 UTC - in response to Message 49702. Last modified: 20 Jun 2018 \| 11:31:34 UTC
	Can you try installing python-support (if it's not installed already)? I am under the impression that the WUs failing with a segmentation fault is the root cause, and the other errors are consequences.

tullio Send message Joined: 8 May 18 Posts: 190 Credit: 104,426,808 RAC: 0 Level Scientific publications	Message 49705 - Posted: 20 Jun 2018 \| 16:11:36 UTC
	CPU tasks do not fail on my HP laptop with SuSE Leap 42.3 which is constantly updated by SuSE. They fail instead on my SUN workstation, also with SuSE Linux 42.3 which is not updated by SuSE, I don't know why. On the other hand, GPU tasks run on the GTX 750 Ti board on the SUN, giving me huge credits. Tullio

PappaLitto Send message Joined: 21 Mar 16 Posts: 513 Credit: 4,673,458,277 RAC: 13,387 Level Scientific publications	Message 49706 - Posted: 20 Jun 2018 \| 23:37:10 UTC - in response to Message 49703. Last modified: 20 Jun 2018 \| 23:38:22 UTC
	Quote (Toni): "Can you try installing python-support (if it's not installed already)? I am under the impression that the WUs failing with a segmentation fault is the root cause, and the other errors are consequences."
	I tried installing python-support and was able to get different errors! Here is the link to the error page again:
	http://www.gpugrid.net/results.php?hostid=480159&offset=0&show_names=0&state=5&appid=
	The abandoned tasks are due to completely detaching and then re-adding the project, which didn't help.

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 49710 - Posted: 22 Jun 2018 \| 10:58:08 UTC - in response to Message 49706. Last modified: 22 Jun 2018 \| 14:56:29 UTC
	From what I can tell the error is the same, the segmentation fault in pthread. Try to play around with your gcc installation, e.g. see if you have the latest one, g++, and so on. Ubuntu 18.04 is a widespread distro so it's surprising it doesn't work.
