Everyone is getting computation errors

Message boards : Number crunching : Everyone is getting computation errors

Stefan
Project administrator
Project developer
Project tester
Project scientist

Message 49917 - Posted: 17 Jul 2018, 8:29:26 UTC
Last modified: 17 Jul 2018, 8:32:04 UTC

Toni, who is probably the most qualified person to update the app with a new ACEMD version, is currently on holiday without good internet. I am also on holiday, although I doubt I could have fixed it anyway. I told the guys at the lab to cancel the GPU workunits until it's fixed, so you might have to wait a few days before we fix it and send out new ones. I'm sorry, but some things are beyond my control.
Maybe Gianni can take a look at it while we are out; I have informed him as well.
Richard Haselgrove

Message 49918 - Posted: 17 Jul 2018, 8:39:55 UTC - in response to Message 49917.  

It might be better to keep the tasks, but deprecate the Windows apps - that way, you would still get some work done (albeit at only ~20% capacity) by your Linux volunteers.
Retvari Zoltan

Message 49919 - Posted: 17 Jul 2018, 8:51:25 UTC - in response to Message 49918.  

+1
[AF] fansyl

Message 49920 - Posted: 17 Jul 2018, 9:09:49 UTC
Last modified: 17 Jul 2018, 9:10:17 UTC

You are entitled to a holiday :-)
Good luck to the whole team in fixing this, even though there is no emergency.
mmonnin

Message 49921 - Posted: 17 Jul 2018, 10:40:13 UTC - in response to Message 49919.  

+2

Still crunching here.
PappaLitto

Message 49922 - Posted: 17 Jul 2018, 10:46:11 UTC

They are probably making sure the results returned so far are valid and scientifically useful; trust in the results after something like this is probably slim.
tullio

Message 49923 - Posted: 17 Jul 2018, 11:26:15 UTC

I am a new user and don't want to criticize, but I see that the minimum quorum is one. Why?
Tullio
Jim1348

Message 49924 - Posted: 17 Jul 2018, 11:37:38 UTC - in response to Message 49918.  

Richard Haselgrove wrote: "It might be better to keep the tasks, but deprecate the Windows apps - that way, you would still get some work done (albeit at only ~20% capacity) by your Linux volunteers."

I will put my GTX 980 on Ubuntu to help. My GTX 1060 that crashed was overheating at 82C or more - it has a bad heatsink or voltage regulator or something.
PappaLitto

Message 49927 - Posted: 17 Jul 2018, 14:24:28 UTC - in response to Message 49924.  

Jim1348 wrote: "My GTX 1060 that crashed was overheating at 82C or more - it has a bad heatsink or voltage regulator or something."

Try taking off the heat sink and changing the thermal paste. A good aftermarket paste will usually be better than stock and will last a lot longer. I recommend Arctic Silver 5, but make sure you don't get any on components: it is slightly capacitive and could cause problems if it bridges electrical contacts.
Jim1348

Message 49928 - Posted: 17 Jul 2018, 15:00:57 UTC - in response to Message 49927.  

PappaLitto wrote: "Try taking off the heat sink and changing the thermal paste."

Yes, I did that a few weeks ago, using Arctic MX-4. It didn't change a thing. I noticed several months ago that it was getting too warm for comfort, and have tried it now in three different machines. One of them has a 120mm rear exhaust fan, a 120mm top exhaust fan, and a 120mm front intake fan. It still ran at 80C, in a cool room. I think it is gone - either a heatpipe, or else the GPU chip itself or voltage regulator is causing too much current to flow.

I have an EVGA GTX 970 though which will work fine until Nvidia decides to release something worth buying at a reasonable price.
Stefan
Project administrator
Project developer
Project tester
Project scientist

Message 49942 - Posted: 18 Jul 2018, 7:19:27 UTC - in response to Message 49923.  

Well, apparently Gianni also knows how to deprecate apps. So now Raimondas will compile the new app version, which may take a few days, and then he will deploy it. I assume we should have some sort of tutorial on this stuff so that more people can do it, but from what I gather managing BOINC is a very esoteric business.
MartinKanne

Message 49991 - Posted: 22 Jul 2018, 14:05:39 UTC

OK, a few days have passed.
Has anyone fixed the problem?
[AF>P4G] anthony

Message 49992 - Posted: 22 Jul 2018, 17:35:40 UTC

Hello,
I also have the same problem. For the past 5 or 6 days, all WUs have failed with computation errors within 2 seconds. My configuration:
GTX 1060
Windows 10
Driver INF file: oem57.inf | Vendor: Nvidia | Class: Display | Version: 24.21.13.9836 | Date: 24/06/2018

Do I have to install the CUDA Toolkit 9.1?
Erich56

Message 49993 - Posted: 22 Jul 2018, 18:55:55 UTC - in response to Message 49992.  
Last modified: 22 Jul 2018, 18:56:20 UTC

[AF>P4G] anthony wrote: "Do I have to install the CUDA Toolkit 9.1?"

No, the only thing that would help is to install Linux.

©2025 Universitat Pompeu Fabra