Corsair Community

Help with new build: CPU hitting 100C and shutting down with H115i



Hi all, thanks for reading. 

New build:  i9-12900K, Asus Z690-i, 32GB Vengeance DDR5 4800MHz, Asus TUF 3080 OC, H115i Pro XT (with the LGA1700 standoffs), tucked nicely into an SSUPD Meshlicious.  All stock, no overclock (except for the factory OC in the 3080).  iCue fans and pump on Balanced.  I've been working with PCs since the 90s, but this is my first water-cooled system, so I could be missing something obvious. 

At idle the CPU reads about 27C-30C, iCue shows coolant temp at 25C, and ambient is about 22C. 

Prime95 causes throttling and a reset in just a few minutes.  HWinfo logs show core temps jumping immediately into the 60s (fine) and climbing (not fine).  The first throttling entry comes after about two minutes with some cores hitting 100C, and then the system resets.  iCue shows a coolant temp of just 27 or 28C.

I have tried re-seating the pump block. I used Kryonaut paste, not too much. 

What else can I try?  What should I test?  What other info would be helpful?

Thanks in advance.

Rob

 


Asus has a BIOS setting, enabled by default, for its "all core optimizer" or something like that. Set it to "enforce all limits" instead and try again. It should be under the AI Tweaker menu. Left enabled, the setting removes some of the built-in Intel TDP limits that would otherwise keep your P95 temps in check; set to disabled, as it should be, it puts them back in place. I was just fighting this with my 12900K as well.

Let me know if this helps.


That seems like power/voltage management.  It's tough to run Prime95 on a motherboard straight out of the box.  It's not cooler related: the coolant temp responds as expected, and if it were a contact problem, you would not be able to hold lower CPU temps at the start.  

 

You might also try with a milder stress test.  The bench test in CPU-Z is linear and makes for a good load test when you're not sure if things are working properly.  


I will note that even with this setting in place I still get ~90C temps and the occasional 100C throttling blip during Small FFT runs etc. With the load settings enforced in the BIOS I also see Current/EDP Limit Throttling, but my Cinebench scores haven't been reduced by more than 1% or so because of this. P95 AVX and the Alder Lake CPU can generate some serious load/heat, and if the TDP limits are removed, it can trip the overcurrent protection on many boards without an OC at all.

 

My average gaming temps are in the 50s, hitting highs of 65ish and lows in the 40s. My idle and ambient temps are otherwise similar to yours. I have re-applied paste to the cooler several times as I swap the CPU into various motherboards to test OC-ability. The paste I used was Arctic Silver MX-4, but I doubt the paste made much of a difference. I did not delid the CPU, but I suspect that there would be gains to be made on Alder Lake here.

 

@c-attack It must be said that not all motherboards come with this setting enabled/disabled the same. "Auto" settings for some brands remove the TDP limits (Asus is one). Some enforce them by default (like MSI). I got lucky with my first attempt being with MSI, but after swapping the chip over to an Asus I spent way longer than I care to admit yesterday trying to figure out why it would crash. Not all defaults are created equal.

Edited by caveman19

Thanks c-attack and caveman for the quick replies. 

Mixed results:  

1. Changing the All Core Optimizer setting seems to have been effective.  It was on Auto (Enabled), now it's Disabled, and Prime95 ran for several minutes with the CPU max temp at 91C--success!

-but-

2. Running 3dMark Timespy caused a reset after about a minute.  HWinfo shows CPU temps averaging 40C, so I'm inclined to rule out CPU thermals as the problem.  Could I be having two different issues here? 

Is there a way to log what is causing the reset?  There's nothing in the Windows event log.  Here's a HWinfo CSV: 

https://www.dropbox.com/s/wheh9slwqp6dj9t/prime95 then timespy with all core limits.CSV?dl=0

Is there something in here that suggests a root cause?  Prime95 starts at row 51 and ends at row 157.  Timespy starts around row 211.  

Thanks again.
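
A minimal sketch of one way to look for a root cause in a sensor log like the one linked above: find the peak of one sensor column inside a range of rows, plus the last value logged before the file ends. The function name and the sample column names are hypothetical; real HWiNFO column headers depend on the board and sensors, so check them in the CSV itself.

```python
import csv
import io


def summarize_column(csv_file, name_fragment, first_row=None, last_row=None):
    """Return (column_name, peak_value, peak_row, last_value) for one sensor column.

    Rows are counted spreadsheet-style, so the header line is row 1.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    # Use the first column whose name contains the fragment (case-insensitive).
    matches = [i for i, name in enumerate(header)
               if name_fragment.lower() in name.lower()]
    if not matches:
        raise KeyError(f"no column containing {name_fragment!r}")
    col = matches[0]

    peak_value = peak_row = last_value = None
    for row_number, row in enumerate(reader, start=2):
        if first_row is not None and row_number < first_row:
            continue
        if last_row is not None and row_number > last_row:
            break
        try:
            value = float(row[col])
        except (ValueError, IndexError):
            continue  # skip blank cells and non-numeric footer rows
        last_value = value
        if peak_value is None or value > peak_value:
            peak_value, peak_row = value, row_number
    return header[col], peak_value, peak_row, last_value


# Made-up sample data, only to show the call shape.
sample = io.StringIO(
    "Time,CPU Package [C],CPU Package Power [W]\n"
    "12:00:01,45,60\n"
    "12:00:02,91,228\n"
    "12:00:03,88,231\n"
)
print(summarize_column(sample, "package power"))
# ('CPU Package Power [W]', 231.0, 4, 231.0)
```

For a real log, open the file with `open(path, newline="", encoding="latin-1")` (the encoding is an assumption; adjust if the header looks garbled) and pass `first_row`/`last_row` to isolate one test, for example the rows where Prime95 or Timespy was running. Comparing the last logged temperature, power, and voltage values before a reset against their peaks is a quick first check on whether the reset was thermal.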

 


2 hours ago, caveman19 said:

 It must be said that not all motherboards come with this setting enabled/disabled the same. "Auto" settings for some brands remove the TDP limits (Asus is one). Some enforce them by default (like MSI). I got lucky with my first attempt being with MSI, but after swapping the chip over to an Asus I spent way longer than I care to admit yesterday trying to figure out why it would crash. Not all defaults are created equal.

And that would be the entire point of finding out what it's using for Vcore at various load states as well as the power draw.  If it's pulling 330W and using 1.49v during Small FFT, you know you don't need to focus on reseating the block nor are you looking for an AIO problem.  You need to find a way to exclude hardware or physical issues.  Getting the BIOS settings nailed down could take weeks of experimentation, but if there is a hardware problem, that needs to be identified as soon as possible.  It appears to throttle you around the 230-240W mark.  This is not an AIO settings issue, but it does escalate rather quickly.  You can't quite rule out a contact issue yet.  

 

I would suggest trying a couple of different stress tests, including the CPU-Z bench test mentioned before.  Bad contact will immediately fail them all in seconds.  However, if some (like Prime) go off the chart and others cruise along smoothly, you know it's a settings issue related to the specific nature of that test.  Whether or not you want or need to set up the board to run things like Prime 95 is another question.  Timespy is not a super strenuous CPU test, but 40C seems wrong.  The highest load when using 3DMark is when the Sysinfo part of the program is checking you out on load.  Those AVX2 instructions will exceed the temps in the actual test by some measure.  In the results tab of Timespy there is a fairly easy to read CPU temp graph.  If it said 40C all the way, something wasn't quite right.  I would expect 60-70C depending on CPU and voltage.  
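
The rule of thumb above can be written down as a tiny decision helper. This is only a sketch of that reasoning, not a diagnostic tool; the function name and the result strings are made up for illustration.

```python
def triage(results):
    """Classify a set of stress-test outcomes.

    `results` maps a test name to True if the test ran without
    throttling or a reset, and False otherwise.
    """
    if not results:
        raise ValueError("need at least one stress-test result")
    failed = [name for name, ok in results.items() if not ok]
    if len(failed) == len(results):
        # Everything fails: look at block contact and mounting first.
        return "suspect hardware: check block contact and mounting"
    if not failed:
        return "no fault reproduced"
    # Some tests pass and some fail: the load profile of the failing test matters.
    return "suspect settings: only " + ", ".join(failed) + " failed"


print(triage({"Prime95 Small FFT": False, "CPU-Z bench": True}))
# suspect settings: only Prime95 Small FFT failed
```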

Edited by c-attack

CPU-Z ran for ten minutes or so, no trouble.  CPU hovered around 70C, and I watched the water temp slowly tick up to 33.7C.  Is it normal for the water to be so far behind the CPU temp?  

Interestingly, CPU-Z showed about 11,000 for most of the test, then after six or seven minutes it dropped to below 1k, and then slowly crept back up to about 3k.  Temps were still at 70C, HWinfo didn't indicate thermal throttling, and core usage showed 100%.  What's happening here?

 


Ok, good. If it can pass another test then your contact is good. That makes the Prime95 result a product of the test’s specific CPU instructions and the multitude of BIOS settings and CPU management options. I am of the opinion you do not need to set up your machine to run Prime95 unless you actually use it to hunt Mersenne prime numbers or run similar programs. If you do need to run programs that are AVX2 heavy, it’s still better to set up for them, not a program you don’t use. 
 

CPU-Z will do that sometimes and I have seen it on my 10900K as well. I suspect it happens when the CPU down shifts to a lower power limit and CPU overreacts but does not recover.  CPU-Z fault, not your gear. 
 

Yes, it is normal for the coolant to be substantially lower than the CPU temp. The coolant is a transport system for moving heat from CPU to radiator to wherever the fans blow it. The CPU’s temp is a product of the socket voltage and CPU material, less what is physically conducted away. All CPU cooling is conductive. Air or water cooled, the rest is heat removal. A quick analogy is the pot of water on the stove. The flame, the bottom of the pot, and the water in the pot will all be different temperatures. The difference here is the radiator and fans are trying to dump the heat out as quickly as they can. 
 

Understanding your coolant temp range will take some time. Minimum possible temp will be the same as the environmental temp around the cooling path. Case temp 32C? Then the coolant is going to be 32C at idle as well. A specific wattage will raise coolant temp a specific amount, less the amount of heat expelled. All of those are hard to track, and you don't really need to, but a 250W load should raise the coolant about 9C with the fans at 1300 rpm for a 280mm radiator. You seem to be within the expected range. The trickier bit is you will see coolant temp rise with case ambient, so gaming often produces a larger fan response than a CPU test. That’s ok since the radiator fans are part of the case temp regulation and you need them to either remove warm air or help bring in more. There can be exceptions with complex case layouts. Just remember you don’t need the fans to be super reactive. +1C coolant = +1C CPU temp. You don’t need to worry about the CPU being +-1C because of fan speed. 
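
As a back-of-the-envelope illustration of the numbers above, here is a small sketch that turns the "250W is about +9C on a 280mm radiator at 1300 rpm" figure into an estimate for other loads. It assumes the rise scales linearly with wattage and that fan speed stays at 1300 rpm; both are simplifying assumptions, and the function name is made up.

```python
REFERENCE_WATTS = 250.0  # load quoted in the post
REFERENCE_RISE_C = 9.0   # coolant rise quoted for a 280mm radiator at 1300 rpm


def estimated_coolant_temp(ambient_c, load_watts):
    """Estimate steady-state coolant temp, assuming rise is linear in load."""
    rise_c = REFERENCE_RISE_C * (load_watts / REFERENCE_WATTS)
    return ambient_c + rise_c


print(estimated_coolant_temp(22.0, 250.0))  # 31.0
print(estimated_coolant_temp(22.0, 125.0))  # 26.5
```

A measured coolant temp a few degrees above the estimate is not alarming by itself, since a quieter fan profile than 1300 rpm or warmer air around the radiator will both push the real number up.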

Edited by c-attack

All seems good now.  Running Cinebench R23 for 30 minutes results in a score of 26332 (+/- normal for this CPU), HWInfo shows CPU temp averaged 80C, water temp 35C, with an ambient of 22C.  The Meshlicious case is basically open on all sides, so "case temp" isn't going to be too far off of room ambient.  Timespy Extreme stress test passes at 99.2% with very little deviation on FPS or temps.  CPU usage peaks at 35%, so the CPU temps never get above 50C. GPU using 335W of power at 1845MHz hitting 65C, so I might try to undervolt that later.

Wrapping this thread up:  changing the ASUS Multicore Enhancement mode to "Disabled - Enforce All Limits" is the only configuration change I made--thanks, Caveman19--and aside from the one time that Timespy caused a reset (mentioned above) it's run stable ever since.  Cinebench, CPU-Z, and Timespy stress tests run for 30 minutes without resets or throttling.  Prime95 still hits 100C on some cores very quickly, after just a couple of minutes, and throttling kicks in, but it doesn't reset.  Per c-attack's note above, I don't really care about Prime95, so I won't bother pursuing a config that accommodates it. 


On 1/7/2022 at 1:33 PM, rob099 said:

Hi all, thanks for reading. 

New build:  i9-12900K, Asus Z690-i, 32GB Vengeance DDR5 4800MHz, Asus TUF 3080 OC, H115i Pro XT (with the LGA1700 standoffs), tucked nicely into an SSUPD Meshlicious.  All stock, no overclock (except for the factory OC in the 3080).  iCue fans and pump on Balanced.  I've been working with PCs since the 90s, but this is my first water-cooled system, so I could be missing something obvious. 

At idle the CPU reads about 27C-30C, iCue shows coolant temp at 25C, and ambient is about 22C. 

Prime95 causes throttling and a reset in just a few minutes.  HWinfo logs show core temps jumping immediately into the 60s (fine) and climbing (not fine).  The first throttling entry comes after about two minutes with some cores hitting 100C, and then the system resets.  iCue shows a coolant temp of just 27 or 28C.

I have tried re-seating the pump block. I used Kryonaut paste, not too much. 

What else can I try?  What should I test?  What other info would be helpful?

Thanks in advance.

Rob

 

I know what is causing all of this. The H1151 1200 socket bracket posts are too long, which gives the standoffs play, so the mount is loose and the heat sink pulls away from the chip surface. I just did a system today using an H115i Pro XT (with the LGA1700 standoffs), but it's the backplate that's the issue. It needs to be shimmed underneath to keep the posts from sticking up too high on the top side of the motherboard, so that when you put in the LGA1700 standoff bolts they are tight to the board. Corsair, you need to add nylon washers to that kit to put on the back of the motherboard, or people are going to burn up their high-dollar CPUs and also cause a fire in their house.

IMG_3576.jpg


10 hours ago, alhazen said:

I know what is causing all of this. The H1151 1200 socket bracket posts are too long, which gives the standoffs play, so the mount is loose and the heat sink pulls away from the chip surface. I just did a system today using an H115i Pro XT (with the LGA1700 standoffs), but it's the backplate that's the issue. It needs to be shimmed underneath to keep the posts from sticking up too high on the top side of the motherboard, so that when you put in the LGA1700 standoff bolts they are tight to the board. Corsair, you need to add nylon washers to that kit to put on the back of the motherboard, or people are going to burn up their high-dollar CPUs and also cause a fire in their house.

 

Hi, thanks for the note and pic.  I don't think that's happening here.  My backplate is not shimmed, I'm using the LGA1700 standoffs, and there's no play in the heatsink.  I can run Cinebench for 30 minutes straight (score 26332--expected for this CPU), CPU temps plateau at 80C after about six minutes, and coolant temp plateaus at 35C after about 10 minutes.  If there were a contact problem, surely the temps would continue to climb?

Also, this article for the LGA1700 standoff kit makes no mention of shimming the backplate:  How to: Use your old retention bracket to mount an Elite Capellix cooler to an LGA 1700 socket – Corsair

 

 


On 1/7/2022 at 7:40 PM, rob099 said:

Interestingly, CPU-Z showed about 11,000 for most of the test, then after six or seven minutes it dropped to below 1k, and then slowly crept back up to about 3k.  Temps were still at 70C, HWinfo didn't indicate thermal throttling, and core usage showed 100%.  What's happening here?

Some of this could potentially be attributed to the "enforce all limits" that we re-enabled in your BIOS. Basically, the Alder Lake chip by default wants something like 241 watts as its PL2 (power limit 2, the short-term turbo limit). It can sustain that power for X number of seconds (generally 56 or so stock) and then reduces total package power to PL1 (power limit 1, the long-term limit), which is lower, though I'm not actually sure of the figure for this chip.

 

One way to get around this without fully removing the PL1 target is to set your PL1 equal to your PL2. To maintain the same thermals you've been seeing so far, I would set both of these to 241 watts. This will cause the turbo boost power time window (that 56 seconds I mentioned) to effectively do nothing.
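
A rough sketch of why matching the two limits neutralizes the time window. It uses Intel's naming, where PL2 is the short-term turbo limit and PL1 the sustained one, and it models the window as a hard timer. That is a simplification: real CPUs track a running average of package power rather than a stopwatch. The function name is made up, and the 125 W value is an illustrative assumption, not a figure from this thread.

```python
def package_power_limit(seconds_under_load, pl1_watts, pl2_watts, tau_seconds):
    """Simplified step model: PL2 applies inside the turbo window, PL1 after it."""
    if seconds_under_load < tau_seconds:
        return pl2_watts
    return pl1_watts


# Limits enforced with a lower sustained limit (125 W is an assumed example value).
print(package_power_limit(10, 125, 241, 56))   # 241
print(package_power_limit(120, 125, 241, 56))  # 125

# PL1 set equal to PL2: the window no longer changes anything.
print(package_power_limit(10, 241, 241, 56))   # 241
print(package_power_limit(120, 241, 241, 56))  # 241
```

In the first case a long benchmark loses power budget once the window expires, which is one plausible reason a score could sag partway through a run. In the second case the limit is the same before and after, so the window has no visible effect.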

 

For my situation (room, ambient temperature, cooler setup, etc.) I've set this value to 288 watts for both PL1 and PL2, but your mileage *will* vary. If you set this too high, certain loads like what can be produced with P95 can trigger over-current protection on many motherboards and/or power supplies. For me, that was right around 310 watts, but sustaining that for long periods of time on anything short of LN2 would be ill-advised. For most CLC setups, power targets over 260 watts or so would be overkill, as most benchmarks don't go over 240 watts or so unless you've gone overclocking with more than ~1.35v vcore under extreme load. I also would not advise pushing more than 1.375v on a CLC with this CPU, as that's just a lot of watts to dissipate and CLCs just are not set up for that.


  • 6 months later...
On 1/9/2022 at 6:30 AM, alhazen said:

.......  Corsair, you need to add nylon washers to that kit to put on the back of the motherboard, or people are going to burn up their high-dollar CPUs and also cause a fire in their house.

No worries, they will pay for all damages, hopefully. I mailed their tech support 6 days ago, no reply yet. How is your system holding up as far as CPU temp goes?
