Corsair Community

AX1200 - Potential "Culprit" in High End Rig Instability


Dresk


I've had a perfectly stable new i7-3960x rig for the past 2 months now. Unfortunately, 2 BSODs have cropped up in recent gaming sessions that have caused me to re-investigate my setup to determine a potential culprit.

 

Though my Corsair forum profile has my system setup, I'll provide the overview here as well.

 

Motherboard : ASUS Rampage IV Extreme X79

CPU : Intel i7-3960X @ 4.8GHz 1.38v (liquid cooled)

Video Card : eVGA GTX 580 1.5GiB Hydrocopper @ 860MHz (+10 over stock)

PSU : Corsair AX1200

Hard Drives : 2 Corsair Force GT SSDs (480GiB, 240GiB), 1TiB Magnetic

Sound Cards : Onboard Disabled, NVIDIA Drivers Not Installed + Devices Disabled in Device Manager, Using X-Fi PCI-E (gamer mode, ~5ms latency)

OS : Windows7 64bit SP1

 

INITIAL SYSTEM TESTING

 

When I first built the system, I spent a lot of time finding the right overclock for my i7-3960X. After much testing, I found 4.8GHz to be stable: IntelBurnTest could run on Xtreme using 7300MiB of memory with no issues. I then ran IntelBurnTest, eVGA OC Scanner and CrystalDiskMark simultaneously, again with no issues. Temps were fine (CPU peaked at 68C, GPU at 42C), voltages seemed good, no problems.

 

THE CRASHES

 

1 week ago I got my first BSOD while playing Batman : Arkham City. Being a UE3 game, it pushes the GPU heavily, and I unofficially uncap the framerate (I have a 120Hz display) and achieve consistent 120FPS+ performance. I immediately re-ran my barrage of tests and found no issues. I considered this first BSOD a fluke related to NVIDIA's drivers.

 

3 nights ago I got my 2nd BSOD while playing The Darkness II. Their engine is proprietary but pushes a decent amount of GPU power through their post effects and PC-specific high-res shadowmap. Otherwise I run the game uncapped (no vsync) and achieve 200+ FPS most of the time.

 

POST CRASHES TESTING

 

After that BSOD, I went back to my testing again. This time, however, I experienced different results.

 

1) On its own, I can run IntelBurnTest on Xtreme Mode (and memtest86+) with 7300MiB out of 8GiB all day long with no issues. As long as the GPU isn't being stressed, my CPU / memory work perfectly.

 

2) On its own, I can run eVGA OC Scanner all day long with no issues. I've tried all different resolutions, power overdraw mode, etc, nothing causes it to crash.

 

3) When combining IntelBurnTest and eVGA OC Scanner however, I can accomplish 3 different results :

A) IntelBurnTest detects an error after 2-3 runs, therefore failing.

B) eVGA OC Scanner crashes.

C) Computer BSODs.
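
The combined run in point 3 boils down to starting both loads at the same moment and recording which one gives up first. A minimal Python sketch of that idea follows. IntelBurnTest and eVGA OC Scanner are GUI tools, so the two commands here are hypothetical stand-ins (small Python one-liners), and `run_concurrently` and `failed` are illustrative names, not part of either tool. Outcome C (a BSOD) takes the harness down with the machine, so only outcomes A and B would show up in its results.

```python
import subprocess
import sys
import time


def run_concurrently(commands, timeout_s):
    """Start every command at once and collect each exit code.

    `commands` maps a label to an argv list. A workload that is still
    running when the shared deadline passes is stopped and recorded as
    None (it survived the test window).
    """
    procs = {name: subprocess.Popen(argv) for name, argv in commands.items()}
    deadline = time.monotonic() + timeout_s
    results = {}
    for name, proc in procs.items():
        remaining = max(0.0, deadline - time.monotonic())
        try:
            results[name] = proc.wait(timeout=remaining)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
            results[name] = None
    return results


def failed(results):
    """Labels of workloads that exited with a non-zero code."""
    return sorted(name for name, code in results.items() if code not in (0, None))


if __name__ == "__main__":
    # Stand-in workloads (assumptions): one exits cleanly, one reports an error.
    demo = {
        "cpu_stress": [sys.executable, "-c", "raise SystemExit(0)"],
        "gpu_stress": [sys.executable, "-c", "raise SystemExit(1)"],
    }
    outcome = run_concurrently(demo, timeout_s=30)
    print(outcome, "failed:", failed(outcome))
```

Running each load alone first and then both together, as in points 1 to 3, only shows that the failure needs combined load. It does not by itself say which component is responsible.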

 

This 3rd point and its results give me some details. I can confidently say that my CPU, memory and GPU all work perfectly fine. Additionally, temperature is not an issue for any component. I can conclude that my extreme overclock works fine, and that my stock overclocked GTX 580 is working fine.

 

THE CULPRITS

 

So, let's break down the potential fault factors.

 

1) CPU : It's HIGHLY unlikely my new CPU is at fault. Nothing has changed, and it runs just fine when benchmarked on its own. It is a first batch i7-3960x, which does carry risks, but the extreme benchmarking without the GPU involved seems to indicate the CPU is fine.

2) Memory : Again, HIGHLY unlikely, for the same reasons as the CPU.

3) Motherboard : The ASUS Rampage IV X79 COULD be at fault. No capacitors are blown and no visible damage is present, but it is a brand new chipset and we all know the risks with those. I'm using the latest BIOS and I haven't read about issues specific to what I am experiencing.

4) OS : HIGHLY unlikely. I am a Computer Scientist and I keep my OS 100% clean. I purposely do not install Windows Updates and I do not install any software, instead installing it under a VMWare environment and creating wrapper executables to portabilize everything. I haven't installed, updated or changed any component of the OS since my system was initially 100% stable.

5) Drivers : I haven't changed a single driver (including video) since my system was initially 100% stable.

 

Now that I've listed what I don't think is responsible, I'll list what I feel is possible.

 

6) PSU : I started using the AX1200 with my new system 2 months ago, but I had purchased it over a year ago. I don't know if Corsair has had any revisions on it, but I've only "successfully" used it for 2 months, and my independent benchmarks tend to show that my components are fine on their own; it's only when the whole system is stressed that issues occur.

7) UPS. Because my house was built in the 70s, and we don't have reinforced copper wiring, our power is not 100% stable. To counter this, I primarily use this UPS : http://www.amazon.com/CyberPower-CP1500PFCLCD-Pure-Sine-1500VA/dp/B00429N19W/ref=sr_1_4?ie=UTF8&qid=1329159284&sr=8-4 . It's a Cyberpower Pure Sinewave 900W UPS. To those who understand UPSs, that does technically mean that if my 1200W PSU ever starts asking for more than 900W, my UPS will fail. However, the highest I have ever seen my system demand is 743W. In addition, to make sure it wasn't the wattage rating of the UPS, I have also connected my system to this Cyberpower 1125W Pure Sinewave UPS (http://www.amazon.com/gp/product/B001RJEF7M/ref=wms_ohs_product) . The results of my tests remained the same, regardless of the UPS used.

8) PCI-E data flooding. My onboard NIC is a PCI-E Intel NIC, and my sound card is connected through PCI-E. Although it would be pretty rare for the issue to arise from nowhere, it is possible that the CPU is having issues with PCI-E flooding (particularly when you have a Sound Blaster in the setup - their latency is very aggressive, but as long as you go through the Windows Mixer it'll never BSOD).

 

SUMMARY

 

EDIT : I've made new post(s) below, reflecting on the user comments and additional testing. I don't believe I agree with the original summary I wrote, but I will leave it here for now.

 

To summarize, it's very important to note that IntelBurnTest has reported a FAILURE (not crashed) while running eVGA OC Scanner. This lets me rule out the GPU as the sole component responsible for the BSODs. It's also very important to note that I can stress my CPU / memory all day long with no issues, as long as the GPU is not also stressed.

 

Which component do I think is the culprit? The Corsair 1200W PSU. It's the least "tested" of all the components, the "oldest" of all the components and the component with the most "to do" when I'm pushing the rig, especially when it comes down to maintaining consistent voltages.

 

Please feel free to add your comments / suggestions to this thread. I am considering simply RMAing the AX1200, but I'm not sure if Corsair does cross-shipping, and if they don't, I'd like more definitive confirmation of the problem before I inconvenience myself with the delivery times.

 

- Dresk


You might try AIDA64. IntelBurnTest has issues. I freaked when it crashed on me within 20 seconds. The AIDA64 tests are widely regarded as the gold standard and are used by the majority of independent testing labs. There are a variety of tests but according to the folks that make AIDA64 if you can run the FPU and GPU tests together for eight hours or so then your system will survive WWIII. It might be worth a shot. It's not free but you sound like someone that might really enjoy the software.

 

I seriously doubt the PSU is at fault but anything's possible. I would be more suspicious of the CPU overclock. That's a pretty high frequency for so low a voltage. You're more of an expert than I am, clearly, but my 3930K @ 4.375GHz is not stable with less than 1.40v.

 

One last thing: My video cards pull one heck of a lot more power than your single 580 (and that's not a good thing but it is what it is) and I have never (repeat never) had an issue with the AX1200. In fact my former PSU (AX 850) had no trouble powering those cards but I upgraded to the 1200 just for giggles. And I spend my time on Crysis 1, Metro 2033, and S.T.A.L.K.E.R. Call of Pripyat, three of the most demanding games ever coded by Humankind.


I wouldn't trust a UPS as far as I could throw it. IMO they ruin the AC sine wave, slow down necessary current surges or draw from the AC line, and do nothing to improve or stabilize your AC power.

 

In the (true) high end audio world, there are components that recreate the AC line power, actually very high power amplifiers (1000+ Watts) that output a 60Hz sine wave. This is done to reduce the distortion on the AC line, among other things. A UPS used as an AC power enhancer is unthinkable and never used in that world.

 

A UPS adds more distortion to the AC line, suffers from "clipping" (rounding or flattening of the peak of the AC wave, the same thing an over-driven amplifier does), and makes it harder for a PC PS to function. I'm surprised your AX1200 doesn't shut off at ~750 Watts output when used with that UPS; most PC PSs shut down at much lower output when used with a UPS. That is actually an indication of the quality of the AX1200, and to an extent of your UPS, but no UPS is free of these inherent limitations. One PC hardware review site I read includes in their PS tests connecting the PS to their UPS and seeing how much power they can draw from it before the PS shuts down. That is not when it's running on battery-only power, but with its normal AC power on.

 

Before you blame the PS, take the UPS out of the way of the PS's incoming power.


Agree with G50EED, think you'd be wasting your money RMAing that AX1200. Frankly I doubt it's any of your hardware. Very slim possibility that your CPU overclock is starting to age, might need to bump the 1.38v a bit. Probably not though. Judging from clues 3 and 8, I would suspect a bit of corruption has slipped into the NVHD audio driver portion of the NVIDIA video driver package or the X-Fi driver package. Corruption in either can cause IRQs to step on each other, resulting in a BSOD while gaming or stress testing. Seriously doubt it's the UPS either. Might try testing without it in line but I'd guess you won't see any difference. I have four APC units of equivalent quality to the CyberPower unit you've pointed to. I've literally yanked the cord out of the wall during an AIDA64 combined CPU/GPU stress test just to see what would happen and the system didn't even notice. Due to the number of power failures I have a year, I wouldn't run without a UPS. Of course with a UPS you get what you pay for.

I've done a lot more testing. I raised my VCore to 1.39v (from 1.38v) and enabled VCore LLC, since it looked like my 1.38v was dropping as low as 1.33v during stress testing. Initially it looked like that fixed the problem, as I had no issues at all during a 2 hour CPU + GPU benchmark.
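
The drop described above (1.38v set, as low as 1.33v under load) can be put into numbers. This is a small sketch that assumes the 1.33v software reading is accurate; `vdroop` is an illustrative helper name, not something from the BIOS or any monitoring tool.

```python
def vdroop(set_v, load_v):
    """Return (droop in millivolts, droop as a percentage of the set voltage)."""
    droop_v = set_v - load_v
    return round(droop_v * 1000, 1), round(droop_v / set_v * 100, 2)


# Values reported in the post: 1.38v configured, 1.33v observed under stress.
print(vdroop(1.38, 1.33))  # (50.0, 3.62)
```

A sag of about 50mV under load is the gap that load-line calibration (LLC) is meant to narrow, which is why enabling it alongside the small bump to 1.39v is a reasonable first thing to try.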

 

Inexplicably, my computer rebooted during the loading of a level in The Darkness II. No BSOD, no beeps, just a cold reboot. This prompted me to run an extended stress test, again, to see where things were at.

 

IntelBurnTest with OC Scanner running was able to successfully run for 6 hours before IntelBurnTest reported a failure. No BSOD happened and no applications crashed. This was in Xtreme mode using 7000MiB of memory.

 

Regarding the sound driver concerns, I do not install the NVIDIA audio drivers and I have the devices in device manager disabled. As far as I know this should prevent them from being involved with any data. The X-Fi is a different story (especially with the latency the drivers have it running at), but I've been using this one for a few years now with no issues.

 

I've heard that UPSs are a double-edged sword. I've been using them in my computer setups for over 8 years now, and this is the first build of mine that hasn't been stable (it's also my first liquid cooled build with quite a massive overclock). I'll do some reading on the comment that a PS would be expected to just shut off at a certain output level when run through a UPS (is that what happened with the non-BSOD reboot?). Because of the comments here, I went back to checking the power draw on my UPS. As fast as my Cyberpower display can update, I see the power usage spiking from ~540W all over the place, constantly: lots of the spikes reach 800W, and some get as high as 860W. In the meantime, if anyone knows of a really, really high quality UPS that won't fudge the power too much, I'm all ears (and price can go as high as $1500 USD). Too many random power surges / losses occur in my house for me to run without some form of battery support.
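
To see how close those readings come to the UPS ratings mentioned in this thread (900W and 1125W), the load can be expressed as a percentage of capacity. This is a rough sketch: the wattages are the display readings quoted in the posts, the display's accuracy and update rate are unknown, and `ups_load_percent` and `headroom_w` are illustrative names.

```python
def ups_load_percent(draw_w, capacity_w):
    """Draw as a percentage of the UPS watt rating."""
    return round(draw_w / capacity_w * 100, 1)


def headroom_w(draw_w, capacity_w):
    """Watts left before the UPS rating is reached."""
    return capacity_w - draw_w


# Readings quoted in the thread (watts).
readings_w = {
    "baseline under load": 540,
    "earlier observed max": 743,
    "common spike": 800,
    "highest spike": 860,
}

for capacity_w in (900, 1125):
    print(f"UPS rated {capacity_w}W")
    for label, draw_w in readings_w.items():
        pct = ups_load_percent(draw_w, capacity_w)
        left = headroom_w(draw_w, capacity_w)
        print(f"  {label}: {draw_w}W = {pct}% of rating, {left}W headroom")
```

On these figures the 860W spikes leave about 40W of margin on the 900W unit, far less than the earlier 743W maximum suggested, while the 1125W unit still has a few hundred watts to spare.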

 

To respond to the IntelBurnTest versus AIDA64 point - it certainly sounds like the latter is the better standard, but the fact that Linpack (IntelBurnTest's back end) is failing at all when it never did before greatly concerns me. I've been using IntelBurnTest for 3+ years now, making sure my 10+ hour burn-ins never had any issues. However, it wasn't until this system that I started benchmarking with both the CPU and GPU simultaneously.

 

I'm somewhat at a loss for what's going on. The random reboot, followed by the fact that the computer only survived 6 hours of benchmarking, makes me think that something fails randomly and probably has nothing to do with benchmarking (i.e. it's going to happen again even if gaming lightly or not doing much). There's more LLC stuff in the BIOS and VRM controls to play with, and I can (and will) be improving my liquid cooling to stop the stuff from even getting above 60C (but I doubt thermals are the issue here).

 

Thanks for the comments, I'll continue testing and trying different configurations to see what I can get to.

 

- Dresk


You might try AIDA64. IntelBurnTest has issues. I freaked when it crashed on me within 20 seconds...

 

I'm surprised that IBT reacted as it did for you, unless it's a glitch with the new Intel CPUs, since IBT has not been updated for a while, correct?

 

In my experience, IBT does not load all the cores to 100% like Prime95 does, or produce as high CPU temps as Prime95 does. I still like IBT, as I think it is more realistic, with its load varying with time, and that change can be stressful in a different way.

 

Dresk, since your UPS has never been a problem for you, it may not be the issue, but it's easy to test. Your problems might just be tuning a very new platform optimally, which is exactly what you have. Have you seen what the CPU load is when you're gaming full blast?


I'm surprised that IBT reacted as it did for you, unless it's a glitch with the new Intel CPUs, since IBT has not been updated for a while, correct?

 

In my experience, IBT does not load all the cores to 100% like Prime95 does, or produce as high CPU temps as Prime95 does. I still like IBT, as I think it is more realistic, with its load varying with time, and that change can be stressful in a different way.

 

Yes, who knows? I started with the premise that I did not have a problem as I have never had a BSOD or a spontaneous restart of any kind. Also, Prime95 will either crash or produce an error (but keep running -- error visible only in the log file) after an hour or so. So while I was concerned I kept coming back to that old saying "If it ain't broke ...". I did find with some research that both IBT and Prime95 are not up to date with X79 and Sandy Bridge-E, and that led me eventually to AIDA64. I was relieved when AIDA ran for 12 hours without incident, but that only caused me to contact the folks at FinalWire to ask how reliable their software was. They convinced me they know what they're talking about. Then came their latest release, which incorporates GPU stress tests as well. They felt pretty strongly that if my PC held up under 8 hours or more of FPU and GPU together, I could rest easy. My box got hot to be sure but it never throttled down or generated an error. And, finally, I had to keep reminding myself that everything works! So I think I'm done tinkering and testing for a while ...


Hi all,

Dresk, given the way you overclock your PC, don't be surprised if you get BSODs.

Your CPU's maximum voltage is 1.35v and you are giving it 1.38v:

http://ark.intel.com/products/63696/Intel-Core-i7-3960X-Processor-Extreme-Edition-(15M-Cache-3_30-GHz)

Your memory also runs at more than 1600MHz if you clock the CPU at 4.8GHz.

It was tested at a maximum speed of 1600MHz 6-6-6-18.

CMT6GX3M3A1600C6 is X58 memory and was not tested in 4-channel mode.

Your GPU is normally 772MHz, and you clock it +10 beyond your factory-overclocked card.

 

With all this overclocking, you can't say for sure that you haven't already burned your CPU or memory.

 

Hope this helps.

Sincerely yours


Turbonerfs101e,

 

People always overvolt past the normal voltage specifications for CPUs when overclocking. It's just what you do when you overclock. The big trick is keeping the CPU cool while doing that, which I am accomplishing through liquid cooling.

 

My memory runs at 1600MHz and at the exact timings it's supposed to. I have an unlocked CPU, so my BCLK frequency is just 100, with all the rest of the overclock going to my CPU multiplier (which is 48). The new Sandy Bridge-E i7 Extremes simplified overclocking by condensing everything into BCLK (uncore frequency, etc. has been removed). So, again, my memory is running exactly at 1600MHz and at the timings it's specced to run at. As for it running in quad channel, well, this shouldn't matter at all; there isn't anything specific to memory that allows it to run in a channel configuration - memory "kits" are sold as kits just to make sure people get the same exact DIMMs for their X channel configuration. I have 4 of the same exact DIMMs (same revision from Corsair).
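
The arithmetic behind that point is just core clock = BCLK x multiplier. A tiny sketch using the figures from this thread (BCLK 100, multiplier 48, and the 3.30GHz stock speed from the Intel page linked above); the function names are illustrative.

```python
def core_clock_mhz(bclk_mhz, multiplier):
    """Core clock in MHz from base clock and CPU multiplier."""
    return bclk_mhz * multiplier


def overclock_percent(stock_mhz, oc_mhz):
    """How far above stock the overclock is, in percent."""
    return round((oc_mhz - stock_mhz) / stock_mhz * 100, 1)


oc = core_clock_mhz(100, 48)
print(oc)                           # 4800
print(overclock_percent(3300, oc))  # 45.5
```

Because the whole overclock sits in the multiplier, BCLK stays at 100, so anything derived from BCLK is left at its normal value.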

 

My videocard comes STOCK overclocked at 850MHz. It also comes stock overvolted. I haven't added any additional voltage to it personally, outside of the manufacturer adjustments. This is perfectly normal for higher end videocards. Again, the trick is keeping it cool, which I am (it never goes above 45C).


So some bad news to report.

 

Last night I let IntelBurnTest run as long as it could (I wasn't testing the GPU). After about 6 hours one of the tests came back as a failure (no BSOD or reset though). This means I can no longer claim my CPU overclock is 100% stable. I was running the CPU at 1.39v with "High" CPU LLC, which is still pretty low compared to what most people need to run their i7-3960X at the speeds I am running. As a result, I've bumped it to 1.40v (people seem to need that voltage to run the 3960X at my speeds), and I'll run the tests again tonight.

 

I talked to some VERY knowledgeable people about UPSs. Turns out, to put things most basically, there are 2 kinds of UPSs : line interactive and double (dual) converting. Every UPS I've been using is line interactive. Line interactive UPSs are cheap and do nothing at all to condition the power; they only do ANYTHING when the power is out, at which point they begin to provide their battery power. Obviously, this is not what I wanted, as clean, consistent power has always been an issue in my household, and this latest system build of mine is really, REALLY pushing the overclock. Let this be a lesson to all users of UPSs : check the topology of the UPS. If you want something that actually conditions your power, you NEED double converting; line interactive doesn't do anything until the power from the line goes out.

 

So, I went ahead and ordered a double converting UPS. These are the real players that literally take the power coming in and completely recondition it, producing their own, consistent output stream. They modify the power at all times, not just when the power to the unit is interrupted. This UPS is the real deal, an enterprise-grade device that should resolve all my power cleanliness issues (here's a link to the model I ordered : http://powerquality.eaton.com/9130-rackmount-ups-specs-1500VA-120V.aspx?CX=3 ).

 

It's strange that my once stable overclock won't function stably anymore. I've talked to people about VRM burn-in and CPU "settle in", which basically means that over time a CPU is going to need a little more juice to keep running at the same frequency. I've never personally experienced this phenomenon, but then again I've never overclocked to this extreme. My motherboard has LOTS of settings regarding the VRM, so I have plenty to play with for testing.

 

At this point I still can't rule out the PSU. I still don't think that it's running "100%" correctly at all times. IntelBurnTest differs from Prime95 in that it has periods where it swings the load sharply, which greatly tests the stability of your PSU (a poor PSU will always have issues with significant load fluctuations {think back to the days of ATis and their aggressive power saving} ).

 

When the UPS arrives, I will do my IntelBurnTest at 1.40v and see if that indeed fixes the issue. I'm suspecting I'll still have problems. If I do, I'm going to have to consider the unfortunate option of reducing my overclock. At the same time, I'll probably be looking into RMAing the PSU to ensure that isn't the issue.


  • 3 years later...

Archived

This topic is now archived and is closed to further replies.
