The hardware enthusiast's corner
Author | Message |
---|---|
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
The problem: A computer crunching 24/7 was intermittently losing its network connection, so it could neither report processed tasks nor ask for new work. The cause: Its wireless network card was not fully inserted into the PCIe x1 socket, resulting in an intermittent bad electrical contact. The solution: After checking, it turned out to be a purely mechanical problem. I corrected it by removing the card from its mounting frame and bending the fixing tabs in the proper (clockwise) direction. As a result, the whole card tilted toward the socket and is now fully inserted into the PCIe x1 socket. |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
In line with my last post: A graphics card without extra power connector(s) receives all its power from the PCIe socket. For example, this GTX 1650 has a rated TDP of 75 W and no power connectors. This corresponds to a current of up to 6.25 A if all of that power is drawn from the +12 V supply (12 V x 6.25 A = 75 W). For this reason, the best possible electrical contact with the PCIe socket is particularly important for this kind of card. Usually there is enough mechanical play in a graphics card's mounting frame to physically reseat its PCB so that it sits deeper in the PCIe socket. In my experience, this mechanical play can vary from about 0.5 to 1.5 millimeters (0.02 to 0.06 inches). It is usually very easy, and reseating the PCB this way takes only a few minutes. Taking the above-mentioned graphics card as an example: This is what its mounting frame looks like. I'll loosen all of the frame's fixing nuts and screws, starting with the two hexagonal female-threaded nuts, marked as 1 and 2 in the previous image, and finishing with all the screws, here marked as 3 and 4. Depending on the kind of card, there may be fewer or more fixings, but they are usually easy to locate. Once all fixings are loose, the mounting frame will show its mechanical play. Holding the PCB at its deepest position relative to the mounting frame, retighten all fixings, starting again with the threaded nuts (here 1 and 2) and finishing with the screws (here 3 and 4). The final result: The graphics card's PCB has come down nearly 1 millimeter. This can be seen by comparing the Before and After images. In a computer whose graphics card intermittently goes unrecognized, this is a possible cause worth ruling out. |
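The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the calculation assumes, as the post does, that the whole 75 W is drawn from the +12 V supply.

```python
MM_PER_INCH = 25.4  # exact by definition


def current_amps(power_watts: float, voltage_volts: float) -> float:
    """Current needed to deliver a given power at a given voltage (I = P / V)."""
    return power_watts / voltage_volts


def mm_to_inches(millimeters: float) -> float:
    """Convert a length in millimeters to inches."""
    return millimeters / MM_PER_INCH


if __name__ == "__main__":
    # Assumption taken from the post: all 75 W comes from the +12 V supply.
    print(f"Current for 75 W at 12 V: {current_amps(75, 12):.2f} A")
    # Range of mechanical play reported in the post.
    for play_mm in (0.5, 1.5):
        print(f"{play_mm} mm = {mm_to_inches(play_mm):.2f} in")
```

The same two helpers can be reused for other cards by changing the TDP or the measured play.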
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
If you are keen on both DIY and computing, how about mixing them? Here is a good example. The only fan on this GTX 1650 graphics card started to fail, and I took the card out of service for the moment to keep it from being damaged by overheating. I asked myself: Should I make a warranty claim and wait perhaps a couple of weeks for the new card to arrive... and lose the fun of solving it myself? I hesitated for about 10 seconds. The answer is in this post. I looked among my retired cards for something to help, and I found this Gigabyte GT640 GV-N640OC-2GI that I would probably not use any more. I like Gigabyte cards because of their usually good design, build quality, and well-dimensioned heatsinks and fans. Comparing the heatsink mounting spacings on both cards, I found them to be nearly identical. And the Gigabyte heatsink's surface and fan were bigger than the original PNY ones (OK!). But comparing the component layouts under the heatsinks, some problems arose. The Gigabyte heatsink was hitting several components on the PNY card: one quartz crystal (Y1), one solid capacitor (C204), and one ferrite core choke (L15). Here is where the DIY part comes into play... - Fretsaw with a metal-cutting blade, to remove some problematic fins. - Mini drill with a ceramic milling bit, to make room in the aluminum where needed. And the mechanical problems are solved. Now it's time for applying thermal paste (http://www.servicenginic.com/Boinc/GPUGrid/Forum/HE/GpuCoolerReplacement/06_Thermal paste.JPG) and assembling the heatsink. One more adaptation was needed, because the fan connectors were not compatible. But with a bit of soldering and heat-shrink sleeve, that was solved too. After this, we can compare the card before and after the change. Now this peculiar hybrid PNY-Gigabyte graphics card is working again! A final question for users who may have experienced a similar situation: Is the fan usually covered by the card's warranty? If so, does the distributor replace the whole card, or only the fan?
Your experiences in this respect would be much appreciated. |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 87,795 |
Does this card run at a lower temperature than before? "One more adaptation was needed, because the fan connectors were not compatible." The original card doesn't have a 3rd pin (tachometer), so the card can't sense if the fan is not rotating. This is not a good setup for crunching. "A final question for users who may have experienced a similar situation: Is the fan usually covered by the card's warranty?" These cards are made for light gaming, not hardcore (24/7) crunching, so crunching (mining) isn't covered by warranty. But GPUs don't have an operating hours counter, so if you don't explicitly state on the RMA form that you used it for crunching, they will replace it. But the replacement will be the same quality, so I usually replace the fans (or the complete heatsink assembly) with better ones. "If so, does the distributor replace the whole card, or only the fan?" It depends, but usually the whole card is replaced, then the broken card is sent to the manufacturer for refurbishing (replacing the fan in this case). |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
"Does this card run at a lower temperature than before?" Yes and no. Peak temperatures are about two degrees lower now, as the new heatsink and fan are bigger than the originals. The explanation continues below. "The original card doesn't have a 3rd pin (tachometer), so the card can't sense if the fan is not rotating." Right. This is by this card's design. However, Fan % is temperature controlled. And also by design, at full load the card seems to "feel comfortable" at 78 ºC. If the temperature tends to drop below this, Fan % is lowered too and the temperature settles at 78 ºC again. But now the stable Fan % is about 10 percentage points lower than with the original heatsink/fan (60 % instead of the previously observed 70 %). "...they will replace it. But the replacement will be the same quality, so I usually replace the fans (or the complete heatsink assembly) with better ones." I thought the same when evaluating the solution. This card is not installed in an easy environment: it sits directly above a GTX 1660 Ti in this dual graphics card computer. |
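The post above describes the fan percentage being adjusted so that the GPU settles at about 78 ºC. A minimal sketch of that kind of feedback step follows. It is not the card's actual firmware logic, which is not documented in this thread; the function name, the gain, and the fan limits are all assumptions chosen for illustration.

```python
def next_fan_percent(current_fan: float, temp_c: float,
                     target_c: float = 78.0, gain: float = 2.0,
                     min_fan: float = 30.0, max_fan: float = 100.0) -> float:
    """One step of a simple incremental fan controller.

    The fan speeds up when the GPU is hotter than the target and slows
    down when it is cooler, so the temperature drifts back to the target.
    gain, min_fan and max_fan are hypothetical values.
    """
    error = temp_c - target_c              # positive when too hot
    proposed = current_fan + gain * error  # move the fan in proportion to the error
    return max(min_fan, min(max_fan, proposed))  # clamp to the allowed range


if __name__ == "__main__":
    fan = 70.0
    for temp in (78.0, 76.0, 75.0, 78.0):
        fan = next_fan_percent(fan, temp)
        print(f"GPU at {temp} C -> fan {fan:.0f} %")
```

With a loop like this, a more capable heatsink reaches the same target temperature at a lower fan percentage, which is consistent with the drop from 70 % to 60 % reported above.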
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
Given the current restrictions in many countries due to the impact of COVID-19, it becomes important to solve our hardware problems by ourselves. Please feel free to share your problems here in a Symptom - Cause - Solution scheme, or your favorite self-taught tricks. It may be of great help to other colleagues. Thank you in advance! |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
- Symptom: A computer controlling an important process suddenly switched off by itself. Repeated attempts to switch it on again ended with it switching off after a few seconds. - Cause: Two of the processor heatsink's fixings had broken, causing the heatsink to tilt and lose tight contact with the processor surface. As a self-protective measure, the system was switching off to prevent processor damage due to overheating. - Solution: * Plan A: The first attempt consisted of repairing the broken fixings with fast-curing cyanoacrylate glue. After two hours of curing, I renewed the processor's thermal paste and reassembled the heatsink. Result: After about three minutes, the fixing springs overcame the glued parts and they broke again. * Plan B: Studying the heatsink mounting hardware carefully, I found a through hole at every corner, very well placed to solve the problem by means of strategically arranged cable ties. Result: Cable ties are strong enough to keep the necessary tension. Problem solved, and everything is working again! Particular conditions for this case: This case comes from a real intervention on the PC controlling a laboratory diagnostic instrument for celiac and autoimmune diseases. I had to carry out this intervention dressed in all the necessary PPE (personal protective equipment), so I was not free to come and go for spare parts. Solving the problem meant that diagnostic results for many patients, which would otherwise have been lost, were successfully retrieved. I took it as a NOW or NEVER situation, and happily it was NOW. Finally: Let this be my modest tribute to all the medical staff and field service colleagues worldwide who are currently working in hard conditions due to the coronavirus crisis. |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
Finally, my adventure with Conductonaut thermal compound ended in an unexpected way. For background, please refer to my previous post dated January 26th 2020. On March 29th, during a regular round of temperature checks, I found that the GTX 1660 Ti card in question was at 83 ºC. (!) Yes, it was running an ACEMD3 WU, but when I first tested Conductonaut this temperature was 60 ºC... I removed the GPU's heatsink and found that the originally liquid-metal Conductonaut had turned into a soft solid metal. In this new state I observed some cracks and irregularities, which explain the bad thermal coupling and the resulting abnormal temperature rise. It was hard to remove the altered compound, first using a plastic spatula and then a fine polishing cotton. I can recommend this kind of silver cleaner, made of cotton impregnated with a fine polishing compound. In the end, the heatsink's copper surface recovered its original appearance. I decided to replace Conductonaut with my regular electrically non-conductive thermal paste, Arctic MX-2. The manufacturer promises a durability of eight years for it. From my own experience I can confirm at least 4 years, because I usually prefer to replace it preventively after about that period. It is easy to apply, thanks to its self-spreading ability. After this, the GTX 1660 Ti returned to work, with its temperature reduced from the previous 83 ºC to 77 ºC. In a rig working 24/7, it is advisable to check temperatures regularly, to prevent overheating of the different components. It will surely increase the life expectancy of the whole system. |
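The regular temperature rounds recommended above can be partly automated on NVIDIA cards. The sketch below assumes the `nvidia-smi` tool is installed and uses its `--query-gpu` and `--format` options; the function names and the 80 ºC alert limit are assumptions made for illustration, not values from the thread.

```python
import subprocess

# nvidia-smi query returning one CSV line per GPU: index, name, temperature, fan %
QUERY = ["nvidia-smi",
         "--query-gpu=index,name,temperature.gpu,fan.speed",
         "--format=csv,noheader,nounits"]


def parse_gpu_readings(csv_text: str) -> list:
    """Turn nvidia-smi CSV output into a list of dictionaries."""
    readings = []
    for line in csv_text.strip().splitlines():
        parts = [field.strip() for field in line.split(",")]
        readings.append({
            "index": int(parts[0]),
            "name": ", ".join(parts[1:-2]),  # tolerate commas inside the name
            "temp_c": int(parts[-2]),
            "fan": parts[-1],                # kept as text; may not be numeric
        })
    return readings


def hot_gpus(readings: list, limit_c: int = 80) -> list:
    """Return the readings whose temperature exceeds limit_c (hypothetical limit)."""
    return [gpu for gpu in readings if gpu["temp_c"] > limit_c]


def check_once(limit_c: int = 80) -> list:
    """Run nvidia-smi once and return the GPUs running above the limit."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return hot_gpus(parse_gpu_readings(result.stdout), limit_c)
```

Calling `check_once()` from a scheduled job (cron, Task Scheduler) would flag a case like the 83 ºC reading above without waiting for the next manual round.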
Joined: 12 Jul 17 Posts: 404 Credit: 17,408,899,587 RAC: 348,486 |
Thermalright TF8 Thermal Compound Paste is the best I've used. It has the highest thermal conductivity at 13.8 W/mK. The best thing about it is that when you remove the CPU cooler after months of use it's still gooey and hasn't solidified like most others. It's the most expensive, until competition comes along. One wants the thinnest continuous layer you can get so use as little as possible and use the spatula to spread it out. I expect it can last for years. https://www.amazon.com/gp/product/B07K442WXV/ref=ppx_yo_dt_b_asin_title_o08_s00?ie=UTF8&psc=1 |
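Why the thinnest continuous layer matters can be seen from the ideal one-dimensional conduction formula, ΔT = P · t / (k · A). The sketch below uses the 13.8 W/mK figure quoted in the post; the power, contact area, and layer thicknesses are hypothetical values picked only for illustration, and the model ignores contact resistance and uneven surfaces.

```python
def paste_temperature_drop(power_w: float, thickness_m: float,
                           conductivity_w_per_mk: float, area_m2: float) -> float:
    """Temperature drop in kelvin across a uniform paste layer.

    Ideal 1-D conduction: delta_T = P * t / (k * A).
    """
    return power_w * thickness_m / (conductivity_w_per_mk * area_m2)


if __name__ == "__main__":
    POWER_W = 100.0          # hypothetical heat flow through the paste
    AREA_M2 = 4e-4           # hypothetical 4 cm^2 contact area
    CONDUCTIVITY = 13.8      # W/mK, the figure quoted in the post
    for thickness_um in (50, 100, 200):  # hypothetical layer thicknesses
        drop = paste_temperature_drop(POWER_W, thickness_um * 1e-6,
                                      CONDUCTIVITY, AREA_M2)
        print(f"{thickness_um} um layer -> {drop:.2f} K across the paste")
```

In this model the temperature drop scales linearly with thickness, so halving the layer halves the penalty, as long as the layer stays continuous.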
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 87,795 |
"Finally, my adventure with Conductonaut thermal compound ended in an unexpected way." This is very strange. I haven't experienced such a change in the liquidity of the Conductonaut, nor in the temperatures of the CPUs / GPUs on which I've changed the thermal grease. |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
"This is very strange. I haven't experienced such a change in the liquidity of the Conductonaut..." I guess that the tested heatsink's core is not made of pure copper, but of some kind of alloy that is not compatible with Conductonaut. |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
As a result of the current COVID-19 regulations in Spain, which require home confinement, a challenge arose: Will I be able to build a new crunching rig from my stored spare and scrapped parts? I started by rescuing an ancient Pentium 4 system "stored" on top of a wardrobe. I removed the motherboard, PSU, and peripherals, and kept that old ATX minitower chassis as the starting point. PSU: The old PSU did not have the proper connectors for current motherboards. I rescued two PSUs from my scrap drawer, one with failed electronics and the other with a failed fan... I replaced the defective fan with the working one, and the PSU problem was solved. Motherboard, CPU, RAM: In my spares drawer I had the ones left over from my last hardware upgrade. There was a new problem: The available chassis is an old model, with the PSU hanging directly above the CPU location. But I found an original Intel low-profile CPU heatsink, and that problem was solved too. From spares, I rescued my last remaining 120 GB SSD and a factory-overclocked GIGABYTE GTX750 graphics card... With all these and a bit of (free ;-) do-it-yourself work, the new rig is a fact without leaving home: Test passed! New system Host ID: 540272 One more detail: Due to the low-power graphics card (38 W TDP) and the reduced CPU heatsink, this is the only one of my rigs with the CPU running hotter (59 ºC) than the GPU (53 ºC) at full load. |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
If we call a problem severe when it prevents a computer from starting, and ridiculous when a trivial circumstance causes a severe problem, then this is one of the most severe-ridiculous problems I've ever found, and more than once. It happened today in one of my rigs. I'm documenting it this afternoon, and I'll publish the solution tomorrow afternoon. - Symptom: When starting the system, it runs for a few seconds, then stops, and nothing happens on subsequent attempts to restart. I opened the system, made a quick check of the contacts, and started it again; this time the start attempt succeeded (fans turning, beep heard...), but only for a few seconds. - Cause: I started to think: PSU failure, CPU heatsink disengaged... and, what if it was...? And that was it! - Solution: ??? You have 24 hours to guess your favorite cause and solution. |
Joined: 20 Jan 09 Posts: 2380 Credit: 16,897,957,044 RAC: 87,795 |
A stuck power button can cause this: first it turns on the system, but if it stays in the "pressed" state it will turn off the system after 4-5 seconds (hard power off). |
Joined: 21 Feb 20 Posts: 1109 Credit: 40,496,283,595 RAC: 3,436,646 |
Bad PSU, bad motherboard, or bad memory |
Joined: 8 Aug 19 Posts: 252 Credit: 458,054,251 RAC: 0 |
"Bad PSU" From my experience, the PSU is most likely to be problematic. Just sayin'. |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
- Symptom: When starting the system, it runs for a few seconds, then stops, and nothing happens on subsequent attempts to restart. - Cause: The Power On button got temporarily stuck in the pressed position, triggering the hard power-off feature, which cuts the supply after a few seconds. While I was tilting and handling the system to check the contacts, the Power On button came free, and then it got stuck again the next time it was pressed. - Solution: It is usually possible to get access to the Power On button switch, most of the time by removing the chassis front panel. Here is an image of the affected switch in its mounting position, and once removed. Nowadays, it is a normally-open push-button. A click must be heard when pushing it, and another click when releasing it. The problem was solved by applying a few drops of ethanol and pushing the button repeatedly until it came free and moved freely. Pretty trivial and ridiculous, but I'm sure that maaany computers have gone to the workshop for a problem like this... On Apr 18th 2020, 19:16:08 UTC, Retvari Zoltan wrote: "A stuck power button can cause this: first it turns on the system, but if it stays in the 'pressed' state it will turn off the system after 4-5 seconds (hard power off)." Congratulations! You have won an image of my special Gold Medal for Outstanding Analyst. (Well... excuse me, it is not exactly gold, it is really high-quality bronze ;-) And my special thanks to Ian&Steve C. and Pop Piasa for participating. |
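The stuck-button behaviour described above can be summarised as a small decision function. This is only a toy model of the symptom: the 4-second threshold is an assumption based on the "4-5 seconds" quoted in the thread, and the function name and result strings are made up for illustration.

```python
HARD_OFF_HOLD_SECONDS = 4.0  # assumption, from the "4-5 seconds" quoted in the thread


def power_button_result(system_on: bool, hold_seconds: float) -> str:
    """Describe what a power-button press does, depending on how long it is held."""
    held_too_long = hold_seconds >= HARD_OFF_HOLD_SECONDS
    if not system_on:
        # The press starts the system; a button that stays pressed then forces it off.
        return "starts, then hard power-off" if held_too_long else "starts"
    # System already running: a long hold forces power off, a short press is a normal request.
    return "hard power-off" if held_too_long else "normal power request"


if __name__ == "__main__":
    print(power_button_result(system_on=False, hold_seconds=0.3))    # healthy button
    print(power_button_result(system_on=False, hold_seconds=999.0))  # stuck button
```

A button that never releases also explains why later attempts do nothing: the switch is still closed, so pressing it again produces no new press event.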
Joined: 21 Feb 20 Posts: 1109 Credit: 40,496,283,595 RAC: 3,436,646 |
Finally I was able to finish up my newest GPUGRID system. It's one of my old SETI systems, but I needed to convert it from USB risers to ribbon risers (plus a motherboard swap) for the increased PCIe bandwidth requirements here. CPU: Intel Xeon E5-2630Lv2 (6c/12t, 2.6 GHz); MB: ASUS P9X79 E-WS; RAM: 32 GB (4x8) DDR3L-1600 ECC UDIMM; GPUs: [7] EVGA RTX 2070; PSUs: 1200 W PCP&C + 1200 W HP server PSU. Went with a 2U Supermicro active CPU cooler so I had enough room for the ribbon risers on the 2 GPUs above it. Replaced the 60 mm fan on it with a Noctua one, since even at 20% speed the stock fan was very noisy. The Noctua fan doesn't cool as well as the stock server fan that came with it, but it's enough for this 60 W chip (temps in the 50s @ 65% load) and it's a lot quieter. |
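To put the riser change in numbers, the sketch below estimates theoretical PCIe link bandwidth per direction from the transfer rate and line encoding of each generation. USB-cable risers carry a single lane (x1); the link width and PCIe generation actually negotiated through the ribbon risers are not stated in the post, so the x8 and x16 rows are assumptions for comparison only.

```python
# Per-lane transfer rate (GT/s) and line-encoding efficiency for each PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),     # 8b/10b encoding
    2: (5.0, 8 / 10),     # 8b/10b encoding
    3: (8.0, 128 / 130),  # 128b/130b encoding
}


def link_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    per_lane_gb_s = rate_gt_s * efficiency / 8  # bits to bytes
    return per_lane_gb_s * lanes


if __name__ == "__main__":
    # x1 matches a USB-style riser; x8 and x16 are assumed ribbon-riser widths.
    for lanes in (1, 8, 16):
        print(f"PCIe 3.0 x{lanes}: {link_bandwidth_gb_s(3, lanes):.2f} GB/s")
```

These are raw link figures; real throughput is lower because of protocol overhead.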
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
🙌 |
Joined: 24 Sep 10 Posts: 588 Credit: 11,424,836,510 RAC: 7,540,279 |
I'm really impressed looking at your systems. Thank you very much for your masterclass. That's what I would describe as high-level computer hardware engineering. And your newborn system is returning processed tasks like a charm... 🙌 Congratulations! |
©2025 Universitat Pompeu Fabra