
Corsair iCUE 5 corrupting SPD data on RAM modules



I'd like to draw attention to an issue I've experienced with Corsair iCUE 5. My computer recently refused to boot with everything at stock configuration, because iCUE had erroneously written bad data to the SPD chips on all four of my RAM modules (all from CMK16GX4M2B3200C16 kits).

Here's the relevant thread I posted on the LTT forums, with my solution post at the end explaining the issue in great depth. System specs also posted there.

The summary is that I had been using iCUE 4 for a long time and got an upgrade notification to install iCUE 5, so I did the update and everything seemed fine. A couple of days later I wanted to clear my BIOS settings and install a new BIOS version. I removed the CMOS battery to clear everything and then attempted to reboot so I could flash the new BIOS, but my system failed to complete POST. After much troubleshooting and banging my head on the wall, I noticed one of my modules was showing the JEDEC 2133 MHz profile as being able to run at 1.0 V, which is entirely wrong and out of spec for desktop DDR4. After several more hours of troubleshooting I was finally able to get into the BIOS again and reapply the XMP profile, which thankfully was not affected, so I could get into Windows and investigate further.

I opened Thaiphoon Burner and it confirmed that all four of my modules had CRC errors, all at the exact same offset, byte 0x0B (11), in their SPD data. After reading the JEDEC DDR4 SPD specification, I determined that the erroneous 1.0 V shown on the JEDEC profile came from byte 11, which determines the voltage the module supports. I had to fix this by hand: booting into Linux, disabling all IO memory and ACPI restrictions, loading the I2C and SMBus kernel modules, individually correcting the corrupted bytes using i2c-tools, and then confirming the CRC was good in Thaiphoon Burner and comparing the result against their database of known good SPD dumps. Big thanks to Softnology for their software, because I wouldn't have been able to solve this without them.
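For anyone who wants to check a saved SPD dump without Thaiphoon Burner, here is a minimal sketch of the CRC check in Python. It assumes the usual DDR4 layout as I understand the JEDEC spec: a CRC-16 (polynomial 0x1021, initial value 0) over bytes 0 to 125, stored least significant byte first in bytes 126 and 127. The function names are made up for illustration, and the demo uses a synthetic all-zero dump, not real module data.

```python
def spd_crc16(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0, processed MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def base_block_crc_ok(spd: bytes) -> bool:
    """Compare the stored CRC (bytes 126-127, assumed LSB first) with a fresh one."""
    if len(spd) < 128:
        raise ValueError("need at least the first 128 bytes of the SPD dump")
    stored = spd[126] | (spd[127] << 8)
    return spd_crc16(spd[:126]) == stored


def with_fresh_crc(spd: bytes) -> bytes:
    """Return a copy of the dump with bytes 126-127 recomputed (hypothetical helper)."""
    fixed = bytearray(spd)
    crc = spd_crc16(bytes(fixed[:126]))
    fixed[126] = crc & 0xFF
    fixed[127] = crc >> 8
    return bytes(fixed)


if __name__ == "__main__":
    # Synthetic dump: all zeros except byte 11, then a valid CRC appended.
    dump = bytearray(128)
    dump[11] = 0x03
    good = with_fresh_crc(bytes(dump))
    print("clean dump CRC ok:", base_block_crc_ok(good))

    # Simulate the corruption described above: byte 11 changes, CRC does not.
    bad = bytearray(good)
    bad[11] = 0x05
    print("corrupted dump CRC ok:", base_block_crc_ok(bytes(bad)))
```

This only verifies a dump that has already been read out. It does not touch the SMBus, so it is safe to run on any machine.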

The bigger issue here is that iCUE 5 must be doing dirty writes on the SMBus. Not only should it not be writing to RAM modules that don't have RGB to begin with, but it also isn't waiting for the SMBus to be free before performing those write operations. A Corsair software engineer needs to look into this. Writing random things to system memory with a kernel-level driver isn't the sort of thing you want to be doing on customers' systems if you want them to ever purchase your hardware, or use your 1.1 GB software suite for controlling fan speeds and RGB lighting, again.

I'd used OpenRGB in the past and never had any issues, and iCUE 4 - even with all of its shortcomings - didn't have these issues either. Please look into this, as it is a serious problem that a lot of customers won't have the knowledge to debug and will just return their memory as faulty.

What concerns me even more now is this: where else was iCUE writing to memory that it really shouldn't? Can I trust the software not to write garbage data into kernel memory? Motherboard firmware regions? Graphics card memory-mapped space? It is deeply concerning and disappointing.

Thank you

Edit: I want to confirm that no other user-installed software had access to the SMBus.

Edited by ADAMPOKE111

I've attached three relevant images below: one showing the main Thaiphoon Burner window with the CRC error, one with the hex dump, and one showing the comparator against a known good SPD dump for my RAM modules. All four of my modules exhibited this issue. I don't know if my RAM's serial number is in there, but I'm not too bothered if that's public now: not only have I lost the original proof of purchase for these, I'm also not sure I'd want another set of Corsair memory in the future if this is the tier of software support and testing that can be expected.

ThaiphoonBurnerBadByte.jpg

ThaiphoonBurnerHexDump.jpg

ThaiphoonBurnerMain.jpg


7 minutes ago, ADAMPOKE111 said:

snip

The only saving grace here is that the software failed in such a way that it seemed to overwrite byte 11 on my modules with 0x00 and 0x05 values. According to the JEDEC DDR4 SPD spec, 0x05 translates to 1.2 V and 1.0 V being operable, but not endurant, so it won't run at either of those voltages. Oddly enough, 0x00 should mean the module isn't operable or endurant at any JEDEC recommended voltage, yet the modules that were corrupted with 0x00 still reported as running at 1.2 V. Perhaps it depends on the BIOS/UEFI/memory initialisation implementation.
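To make the bit meanings concrete, here is a rough sketch of decoding byte 11 in Python. It assumes the layout as I read it in the DDR4 SPD spec: each voltage rail gets a pair of bits, the lower bit meaning "operable" and the upper bit meaning "endurant", with bits 1-0 covering 1.2 V. The spec leaves the other two rails as placeholders, so they are labelled `TBD1` and `TBD2` here (the post above reads the second rail as 1.0 V). The function name is made up for illustration.

```python
from typing import Dict, Tuple

# Rail labels for bit pairs 1-0, 3-2 and 5-4 (assumed layout, see note above).
RAILS = ("1.2 V", "TBD1", "TBD2")


def decode_nominal_voltage(byte11: int) -> Dict[str, Tuple[bool, bool]]:
    """Map each rail to (operable, endurant) flags taken from SPD byte 11."""
    result = {}
    for index, rail in enumerate(RAILS):
        operable = bool((byte11 >> (2 * index)) & 1)
        endurant = bool((byte11 >> (2 * index + 1)) & 1)
        result[rail] = (operable, endurant)
    return result


if __name__ == "__main__":
    # 0x03 is what a healthy 1.2 V module would carry under this layout;
    # 0x05 and 0x00 are the corrupted values reported in this thread.
    for value in (0x03, 0x05, 0x00):
        print(f"0x{value:02X}", decode_nominal_voltage(value))
```

Under this reading, 0x05 sets only the "operable" bit for the first two rails, which matches the description above.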

If the SPD data had been corrupted to 0x20 or 0x30, there's a good chance your memory modules could've been killed instantly by the motherboard if it supports 2.0 V.

Edited by ADAMPOKE111

Is this affecting DDR5 modules as well, or just DDR4?


It seems to be only DDR4, apparently.


Thank you, ADAMPOKE111, for this clear troubleshooting write-up.

 

I have exactly the same issue: I upgraded iCUE to version 5.x on 09-May-2023 and since then my PC has not been able to boot. No Windows prompt, no access to the BIOS, no video signal: the motherboard indicates an issue with the DDR4 RAM.

I concluded that my RAM had been corrupted or broken in some way and suspected the new iCUE version. Your posts confirmed my suspicion.

I'm afraid of having to replace the memory sticks.

 

@Corsair support team, what are the solutions to:

1. Repair or replace the corrupted RAM sticks?

2. Downgrade or remove the iCUE software when there is no way to boot into Windows?

 


Read the Reddit post; it covers this in detail.

iCUE 5 has been pulled, and the download section now lists the latest version of iCUE 4, which is not affected by the bug, so you should roll back until the fixed version 5 is released.

You should also open a support ticket to have your memory replaced by Corsair.
