Testing acemd3 windows (thread no longer relevant)

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Message 52190 - Posted: 5 Jul 2019, 17:06:07 UTC
Last modified: 5 Jul 2019, 17:09:50 UTC

Time to test acemd3 for Windows. It worked locally. I have now sent out a few WUs named ...TEST31... and there are a few successes but also several failures.

Common errors appear to be:

* ERR_RESULT_START couldn't start app: CreateProcess() failed - Access is denied.
* 195 (0xc3) Unknown error number # Engine failed: Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)

The workunits SHOULD be sent only to hosts with CUDA 9.2, Kepler or newer cards, and driver >= 397.44, as per [1]. Each WU uses NVRTC to recompile its kernels at runtime; this, together with a driver/card/architecture mismatch, may explain the second error.

As for the first error, still no clue.

[1] https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
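
A minimal sketch of the driver requirement above, assuming the driver version is available as a MAJOR.MINOR string. The function name driver_at_least is hypothetical and is not part of acemd3 or BOINC; the nvidia-smi query in the comment is one way to read the version on a real host.

```shell
# Hypothetical helper: succeeds if driver version $1 is at least version $2.
# Only the first two dot-separated fields (MAJOR.MINOR) are compared.
# The body runs in a subshell so its variables do not leak into the caller.
driver_at_least() (
    have_major=${1%%.*}
    want_major=${2%%.*}
    case "$1" in
        *.*) have_minor=${1#*.}; have_minor=${have_minor%%.*} ;;
        *)   have_minor=0 ;;
    esac
    case "$2" in
        *.*) want_minor=${2#*.}; want_minor=${want_minor%%.*} ;;
        *)   want_minor=0 ;;
    esac
    if [ "$have_major" -ne "$want_major" ]; then
        [ "$have_major" -gt "$want_major" ]
    else
        [ "$have_minor" -ge "$want_minor" ]
    fi
)

# On a real host the installed version could be read with:
#   nvidia-smi --query-gpu=driver_version --format=csv,noheader
if driver_at_least "396.26" "397.44"; then
    echo "driver is new enough for the test WUs"
else
    echo "driver is older than 397.44"
fi
```

A host that fails this check should not have received the TEST31 workunits in the first place, so it is mostly useful for ruling the driver in or out when reading a failed result.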
Toni
Message 52191 - Posted: 5 Jul 2019, 17:11:57 UTC - in response to Message 52190.  

Also, I had to assume that SystemRoot is c:\windows.
Toni
Message 52192 - Posted: 5 Jul 2019, 17:19:53 UTC - in response to Message 52191.  

Error 195 must be the 20x0 cards! We need CUDA 10 for those.
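
Some background on why that fits: Turing (20x0) cards report compute capability 7.5, and the CUDA 9.2 toolchain does not know that architecture; support for it arrived with CUDA 10.0. A rough sketch of the check follows. The function name nvrtc_supports is hypothetical, the table only covers the toolkits mentioned in this thread, and a plain range check is a simplification of the real list of supported architectures.

```shell
# Hypothetical helper: succeeds if toolkit version $2 can target compute capability $1.
# Assumed limits (only the toolkits discussed in this thread):
#   CUDA 9.2  : compute capability 3.0 (Kepler) up to 7.2 (Volta family)
#   CUDA 10.x : compute capability 3.0 (Kepler) up to 7.5 (Turing)
nvrtc_supports() (
    cc=$(printf '%s' "$1" | tr -d '.')    # "7.5" -> "75"
    case "$2" in
        9.2)  max=72 ;;
        10.*) max=75 ;;
        *)    echo "toolkit $2 is not covered by this sketch" >&2; exit 2 ;;
    esac
    [ "$cc" -ge 30 ] && [ "$cc" -le "$max" ]
)

# An RTX 20x0 card (compute capability 7.5) against both toolkits:
for cuda in 9.2 10.0; do
    if nvrtc_supports 7.5 "$cuda"; then
        echo "CUDA $cuda: architecture accepted"
    else
        echo "CUDA $cuda: architecture rejected"
    fi
done
```

With the CUDA 9.2 build the 7.5 architecture is rejected, which would match the "invalid value for --gpu-architecture" message.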
ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 52193 - Posted: 5 Jul 2019, 19:13:23 UTC

Thanks Toni, just in time for a more interesting period in the GPU market (Turing SUPER refresh)!

MrS
Scanning for our furry friends since Jan 2002
Keith Myers
Joined: 13 Dec 17
Posts: 1419
Credit: 9,119,446,190
RAC: 891
Message 52196 - Posted: 6 Jul 2019, 3:01:34 UTC

I just noticed I'm crunching with a new acemd3 2.04 application. These are still only beta tasks, but all have crunched successfully on Linux with CUDA 10.
Toni
Message 52199 - Posted: 6 Jul 2019, 7:24:19 UTC - in response to Message 52196.  
Last modified: 6 Jul 2019, 7:25:11 UTC

Yes, acemd3 on Linux is working as far as I can tell.

The Windows version is next.
flashawk

Joined: 18 Jun 12
Posts: 297
Credit: 3,572,627,986
RAC: 0
Message 52200 - Posted: 6 Jul 2019, 15:33:58 UTC - in response to Message 52193.  

> Thanks Toni, just in time for a more interesting period in the GPU market (Turing SUPER refresh)!
>
> MrS


Is this for the newer NVIDIA 20XX cards? This has been the main reason why I've been holding off on water cooling the rest of my GPUs.
Keith Myers
Message 52201 - Posted: 6 Jul 2019, 19:23:05 UTC - in response to Message 52200.  

Yes, these new beta wrapper apps work correctly with Turing cards.
Aurum
Joined: 12 Jul 17
Posts: 404
Credit: 17,408,899,587
RAC: 0
Message 52202 - Posted: 6 Jul 2019, 22:54:51 UTC - in response to Message 52199.  

> Yes, acemd3 on Linux is working as far as I can tell.

I saw this post and thought it was time to come back. My few Win7 computers are getting work but my Linux rigs are not, even though I checked every box in my Preferences.
Is it too soon for a steady flow of work?

Keith Myers
Message 52203 - Posted: 6 Jul 2019, 23:22:43 UTC - in response to Message 52202.  

Yes, Toni only threw out another limited run of beta tasks. If you didn't grab them right away, you missed them.

I gather we are still a long way from proper generation of non-beta, new acemd3 tasks.

I have a hunch there is another period of beta work still to come for further testing of the Windows application. I don't expect the app to be mainlined until both the Linux and Windows apps are validated.
JStateson
Joined: 31 Oct 08
Posts: 186
Credit: 3,578,903,157
RAC: 0
Message 52245 - Posted: 13 Jul 2019, 18:04:41 UTC - in response to Message 52199.  

> Yes, acemd3 on Linux is working as far as I can tell.
>
> The Windows version is next.

I cannot seem to get work for my NVIDIA Linux system. I just converted it from Windows 10 to Ubuntu 18.04, as Windows could not handle my mix of NVIDIA boards on risers.
tb85-nvidia

GPUGRID | 7/13/2019 1:00:27 PM | Sending scheduler request: To fetch work.
GPUGRID | 7/13/2019 1:00:27 PM | Requesting new tasks for NVIDIA GPU
GPUGRID | 7/13/2019 1:00:29 PM | Scheduler request completed: got 0 new tasks
GPUGRID | 7/13/2019 1:00:29 PM | No tasks sent
GPUGRID | 7/13/2019 1:00:29 PM | No tasks are available for Short runs (2-3 hours on fastest card)
GPUGRID | 7/13/2019 1:00:29 PM | No tasks are available for Long runs (8-12 hours on fastest card)
GPUGRID | 7/13/2019 1:00:29 PM | No tasks are available for New version of ACEMD
GPUGRID | 7/13/2019 1:00:29 PM | No tasks are available for Anaconda Python 3 Environment


I am guessing the Linux app is not ready?
Jim1348

Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Message 52246 - Posted: 13 Jul 2019, 19:14:11 UTC - in response to Message 52245.  

> I am guessing the Linux app is not ready?

My limited understanding is that it is ready, but they are waiting for the Windows version in order to release them both at the same time.

It is too hot for me anyway. They can wait until September.
JStateson
Message 52263 - Posted: 14 Jul 2019, 19:10:53 UTC - in response to Message 52246.  

> It is too hot for me anyway. They can wait until September.

I hear you.

I went to an open-frame mining rig to help with cooling. Windows choked on the 5th GPU. I switched to Ubuntu with a total of 6 GPUs. The NVIDIA driver did not spin the fans enough to keep the cards cool. I spent 2 days figuring out how to enable fan control. Here is a note to myself and anyone else:

sudo apt install nvidia-driver-390
# the above installs the proprietary driver
sudo nvidia-xconfig -a --cool-bits=4
# the above created my 6 GPU entries and enabled fan control for all 6
# it needs to be run again every time a board is added or removed
nvidia-settings &
# the above brings up the 6 devices, where the fan speed can be set
# hopefully there is a way to remember the setting after a reboot
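
For reference, the cool-bits value is a bit mask. According to the NVIDIA driver README, 4 enables manual fan control, 8 enables clock offsets per performance level, and 16 enables overvoltage. A small sketch for adding them up follows; the function name coolbits_value and the feature keywords are made up for illustration and are not nvidia-xconfig options.

```shell
# Hypothetical helper: prints the Coolbits value for the requested features.
# Bit values assumed from the NVIDIA driver README:
#   4 = manual fan control, 8 = clock offsets, 16 = overvoltage
coolbits_value() (
    total=0
    for feature in "$@"; do
        case "$feature" in
            fan)         total=$(( total | 4 )) ;;
            clocks)      total=$(( total | 8 )) ;;
            overvoltage) total=$(( total | 16 )) ;;
            *) echo "unknown feature: $feature" >&2; exit 1 ;;
        esac
    done
    echo "$total"
)

# Possible use on a real host (not executed here):
#   sudo nvidia-xconfig -a --cool-bits="$(coolbits_value fan clocks)"
```

Fan control alone gives 4, as used above; anyone who also wants to set clock offsets through nvidia-settings would need the 8 bit as well, for a value of 12.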
Jim1348

Message 52264 - Posted: 14 Jul 2019, 19:23:04 UTC - in response to Message 52263.  

> The NVIDIA driver did not spin the fans enough to keep the cards cool. I spent 2 days figuring out how to enable fan control.

Thanks. I normally don't bother with controlling fans on my Ubuntu machines, but that may be because I didn't know of any way to do it.
Retvari Zoltan
Joined: 20 Jan 09
Posts: 2380
Credit: 16,897,957,044
RAC: 0
Message 52265 - Posted: 14 Jul 2019, 19:33:42 UTC - in response to Message 52263.  

> sudo apt install nvidia-driver-390
> # the above installs the proprietary driver

I suggest the following (suspend GPU tasks first):
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-driver-430
If the last command fails, run
sudo apt-get install libnvidia-compute-430
and then repeat the previous command.
This way you'll have CUDA 10.2 capable drivers.
(If you prefer the GUI, you can run only the first command, then go to Show Applications -> Software & Updates -> Additional Drivers, select the 430 driver, apply the changes, and wait for the driver to download.)
Zalster
Joined: 26 Feb 14
Posts: 211
Credit: 4,496,324,562
RAC: 0
Message 52266 - Posted: 14 Jul 2019, 20:43:36 UTC - in response to Message 52263.  

>> It is too hot for me anyway. They can wait until September.
>
> I hear you.
>
> I went to an open-frame mining rig to help with cooling. Windows choked on the 5th GPU.

I'm with both of you there. We are hitting over 100F every day, so I shut everything down.

It sounds like you need a bash script that overrides the NVIDIA defaults and turns the fans up to 100% all the time.

Keith was kind enough to send me his, but you need to make several adjustments to Ubuntu to use it.
JStateson
Message 52267 - Posted: 14 Jul 2019, 22:51:57 UTC - in response to Message 52265.  

>> sudo apt install nvidia-driver-390
>> # the above installs the proprietary driver
>
> I suggest the following (suspend GPU tasks first):
> sudo add-apt-repository ppa:graphics-drivers/ppa
> sudo apt-get update
> sudo apt-get install nvidia-driver-430
> If the last command fails, run
> sudo apt-get install libnvidia-compute-430
> and then repeat the previous command.
> This way you'll have CUDA 10.2 capable drivers.
> (If you prefer the GUI, you can run only the first command, then go to Show Applications -> Software & Updates -> Additional Drivers, select the 430 driver, apply the changes, and wait for the driver to download.)


I got errors from the NVIDIA download. My attempt
sudo sh ./NVIDIA-Linux-x86_64-430.34.run
failed within seconds. I then read the instructions, which recommended using a repository and NOT using their download. The best I could find by googling was the 390 driver, but I also read that it fully supports the "10" series of boards, and I don't have any newer boards. I will try your repository when I get to a stopping point (SETI offline).

I have since discovered the "seti special app" for Linux, which can do 6-8 work units in the time it would normally take a GTX 1070 to do a single one. I only looked into this app since the Linux app is not working on GPUGRID. I will probably crunch on SETI with all 6 of my "10" series boards. Maybe I can get into the top 10. I posted some performance graphs here: https://setiathome.berkeley.edu/forum_thread.php?id=81271 If I can get into the top 3, I may not come back to GPUGRID for a while.
Keith Myers
Message 52269 - Posted: 15 Jul 2019, 20:05:55 UTC - in response to Message 52266.  

>>> It is too hot for me anyway. They can wait until September.
>>
>> I hear you.
>>
>> I went to an open-frame mining rig to help with cooling. Windows choked on the 5th GPU.
>
> I'm with both of you there. We are hitting over 100F every day, so I shut everything down.
>
> It sounds like you need a bash script that overrides the NVIDIA defaults and turns the fans up to 100% all the time.
>
> Keith was kind enough to send me his, but you need to make several adjustments to Ubuntu to use it.

Just run a bash script each time you boot the host to set your overclocking and fan control, once you have applied the coolbits tweak in xorg.conf.

This is the one I use for my daily driver. It is all accomplished with nvidia-settings, plus nvidia-smi if you are power limiting.

#!/bin/bash

# PowerMizer mode 1 = prefer maximum performance
/usr/bin/nvidia-settings -a "[gpu:0]/GPUPowerMizerMode=1"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUPowerMizerMode=1"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUPowerMizerMode=1"

# power limit in watts for the first two cards
nvidia-smi -i 0 -pl 215
nvidia-smi -i 1 -pl 215

# manual fan control; the fan index keeps counting up across cards
/usr/bin/nvidia-settings -a "[gpu:0]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:0]/GPUTargetFanSpeed=100"
/usr/bin/nvidia-settings -a "[fan:1]/GPUTargetFanSpeed=100"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:2]/GPUTargetFanSpeed=100"
/usr/bin/nvidia-settings -a "[fan:3]/GPUTargetFanSpeed=100"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUFanControlState=1"
/usr/bin/nvidia-settings -a "[fan:4]/GPUTargetFanSpeed=100"

# logo brightness
/usr/bin/nvidia-settings -a "[gpu:0]/GPULogoBrightness=20"
/usr/bin/nvidia-settings -a "[gpu:1]/GPULogoBrightness=20"
/usr/bin/nvidia-settings -a "[gpu:2]/GPULogoBrightness=20"

# memory and graphics clock offsets; [N] selects the power level
/usr/bin/nvidia-settings -a "[gpu:0]/GPUMemoryTransferRateOffset[4]=600" -a "[gpu:0]/GPUGraphicsClockOffset[4]=60"
/usr/bin/nvidia-settings -a "[gpu:1]/GPUMemoryTransferRateOffset[4]=600" -a "[gpu:1]/GPUGraphicsClockOffset[4]=60"
/usr/bin/nvidia-settings -a "[gpu:2]/GPUMemoryTransferRateOffset[3]=2000" -a "[gpu:2]/GPUGraphicsClockOffset[3]=30"


It only got tricky with the new Turing cards, which have TWO fan interfaces since they have two fans on each card. They also have FOUR power levels compared to Pascal's 3. I had to figure out that you need to keep incrementing the fan index across cards to identify the fans correctly. You also need to change the [X] number to select the power level the overclock applies to.

This example is for two RTX 2080s and one GTX 1080. I should mention that the GPULogoBrightness command DOES NOT work on the Turing cards; that attribute is no longer exposed there. It works fine for Maxwell and Pascal though. So on the Turing cards you either have to live with the logo at full brightness or cover it with varying amounts of opaque tape.
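
The fan numbering described above can be sketched as a small generator, assuming you know how many fans each card exposes, in device order. The function name fan_commands is hypothetical; it only prints the nvidia-settings lines and does not run them, so the output can be checked before it is pasted into a startup script.

```shell
# Hypothetical generator: prints nvidia-settings commands for manual fan control.
# Usage: fan_commands SPEED FANS_ON_GPU0 FANS_ON_GPU1 ...
# The fan index is not reset per card; it keeps counting up across all cards.
fan_commands() (
    speed=$1
    shift
    gpu=0
    fan=0
    for count in "$@"; do
        printf '/usr/bin/nvidia-settings -a "[gpu:%d]/GPUFanControlState=1"\n' "$gpu"
        i=0
        while [ "$i" -lt "$count" ]; do
            printf '/usr/bin/nvidia-settings -a "[fan:%d]/GPUTargetFanSpeed=%d"\n' "$fan" "$speed"
            fan=$((fan + 1))
            i=$((i + 1))
        done
        gpu=$((gpu + 1))
    done
)

# Two Turing cards with two fans each, followed by one single-fan card:
fan_commands 100 2 2 1
```

For the 2, 2, 1 layout this prints the same eight fan lines as the script above, ending with [fan:4] on the third card.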


ExtraTerrestrial Apes
Message 52271 - Posted: 15 Jul 2019, 21:08:55 UTC

Guys, you're having a nice discussion here but please don't take this thread completely off-topic - important news could appear here.

MrS
Scanning for our furry friends since Jan 2002
Zalster
Message 52272 - Posted: 15 Jul 2019, 22:37:58 UTC - in response to Message 52271.  

Apes,

Maybe move these last few posts to a new thread, something like "Turing adjustments for heat?"
©2025 Universitat Pompeu Fabra