Corsair Community

SMART Shows read errors after crash


glugglug


2 days ago my computer froze with the disk light on solid, and I ended up hitting the reset button. Shortly afterwards, I got an alert from StableBit Scanner saying the SMART data on the SSD is outside manufacturer-specified tolerances. Specifically, the Read Error Rate has a nominal value of 1, and it isn't supposed to drop below 6.
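
For anyone wondering how a utility arrives at that alert, here is a minimal sketch in Python. It assumes the common SMART convention that an attribute counts as failed when its normalized value is less than or equal to the manufacturer threshold. The `SmartAttribute` class and `is_failing` function are hypothetical names used for illustration only, not part of StableBit Scanner or any other tool.

```python
from dataclasses import dataclass


@dataclass
class SmartAttribute:
    """Hypothetical container for one SMART attribute reading."""
    name: str
    value: int      # normalized ("nominal") value reported by the drive
    threshold: int  # manufacturer-specified threshold


def is_failing(attr: SmartAttribute) -> bool:
    # Assumed convention: failed when normalized value <= threshold.
    return attr.value <= attr.threshold


# Values taken from the post: nominal value 1 against a threshold of 6.
read_error_rate = SmartAttribute("Read Error Rate", value=1, threshold=6)

status = "outside tolerance" if is_failing(read_error_rate) else "ok"
print(f"{read_error_rate.name}: {status}")
```

Note that the check only looks at the normalized value. The raw value is vendor-specific and is not compared against the threshold at all.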

 

The raw value is gradually climbing -- immediately after the crash it was a bit below 60,000 and now it is at 66,025.

 

Apparently the errors are all getting ECC corrected, and the OS hasn't noticed any bad blocks yet (and having StableBit Scanner scan the drive again didn't find any either). Should this be RMA'ed? I am about to go on a vacation, so that would need to wait until after I get back so that I have a chance to image from the old drive to the new one etc. The PC is also the whole-home DVR, so finding an acceptable downtime window is tricky.

 

SMART data in question pictured below. Other apps show results consistent with this, and a remaining disk life of 92%, which seems a bit low for a drive just bought last August or September.

 

http://i.imgur.com/ZvAM0HG.png

http://i.imgur.com/vAs8KnE.png

http://i.imgur.com/EtWQooT.png

 

Also, the E:\ shown in SSD Toolbox as 2.1TB is a 16TB DrivePool.

Link to comment
Share on other sites

How much free space do you have left on average?

 

Technically, the real read error count is only 1, so the high reading is most likely a controller or algorithm error... it could also have been induced by a fault in a different PC component, most famously the PSU.

 

It is also possible that the problem was not in the SSD's flash but in the Neutron's integrated cache.

 

We will probably never know the cause of that incident.

 

Does the drive operate well now, or have any further errors occurred since then?

 

On the other hand your overall SMART-data looks good.

Link to comment
Share on other sites

You mean the raw value of "Soft ECC Correction Rate"... which doesn't tell us much.

 

You were asked to answer some questions instead:

 

How much free space do you have left on average?

 

Does the drive operate well now, or have any further errors occurred since then? <---this means real errors like bluescreens, unexpected shutdowns or REAL read or write errors (not ECC corrections).

 

 

Are you running any additional third-party RAM-cache solutions that hook in in front of the device in Windows?

 

BTW: There's no problem replacing such a drive under warranty right now, if you can afford the trouble in the meantime... since I assume "SSD Life left" continues dropping too. I would contact my trusted reseller (if they are responsible for the warranty in your country) to get it replaced very quickly. :D

 

The problem is that Corsair might tell you "no, no, SMART is irrelevant, especially Soft ECC Correction on the Neutron"... and that might indeed be true, so there would be no need to bother, but I don't really know... you have to ask support. If they tell you the drive needs to be replaced, then request an RMA from Corsair or your supplier (if possible). Aside from SMART values, when a drive simply fails and that is reproducible... there's no need to argue about SMART ;)

 

So the main question is: does it really fail... or are you just bothered by SMART? I still don't get it. Have you already tried scandisk (chkdsk) to repair the NTFS structures?

 

2 days ago my computer froze with the disk light on solid, and I ended up hitting the reset button.

The LED showing "disk access" does not mean the disk caused this... it only tells us the disk was being accessed when **** happened. So we do not know whether the SSD or anything else was the root cause... we simply see the result of a problem: a system freeze. Now you wonder why your SSD controller had a read error at that moment... well, I don't. That's absolutely normal. But thousands of the NTFS indices of your open files might be corrupted now. That's also normal, and it is the main reason why one should shut down an OS installed on NTFS properly and not pull the plug during I/O operations... So whatever is causing the "Soft Correction" might have nothing to do with your hardware operating outside specification, but with your filesystem now being damaged. So you should start by trying to repair it.

 

Edit: In a non-overclocked environment, BSoDs happen mostly because of bad software, including drivers that access hardware incorrectly. The other main cause of BSoDs is overclocking the CPU/RAM/GPU and causing instability that way. Only then, finally, do we have hardware failures causing BSoDs...

 

...but there's no proof we have a hardware error here... except for a "read error" during the freeze... no wonder.

Link to comment
Share on other sites

After a COLD boot (shutdown for ~15 seconds and restart) instead of just a restart, the counter reset. SMART now shows a Read Error Rate nominal value of 66 and a raw value of 362, and all 5 utilities I have list the SMART data as "good" now. The drive must have gotten into some screwy state connected to that crash that made the counter climb.
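
To put the raw readings from this thread side by side, here is a rough Python sketch that labels each step between consecutive samples as climbing or reset. The `describe_changes` helper is a hypothetical name for illustration, and the first sample of 60,000 is only an approximation, since the reading right after the crash was "a bit below 60,000".

```python
def describe_changes(samples):
    """Label the change between each pair of consecutive raw readings."""
    changes = []
    for (_, raw_before), (label, raw_after) in zip(samples, samples[1:]):
        delta = raw_after - raw_before
        if delta > 0:
            kind = "climbing"
        elif delta < 0:
            kind = "reset"  # counter dropped, e.g. after a power cycle
        else:
            kind = "unchanged"
        changes.append((label, delta, kind))
    return changes


# Raw "Read Error Rate" readings reported in this thread.
samples = [
    ("right after crash", 60000),  # approximate: "a bit below 60,000"
    ("two days later", 66025),
    ("after cold boot", 362),
]

for label, delta, kind in describe_changes(samples):
    print(f"{label}: {delta:+d} ({kind})")
```

A negative step like the last one is what suggests the counter was reinitialized by the power cycle rather than the errors having been "repaired".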

 

Oh, and BTW, the raw value I was referring to is the raw value of "Raw Read Error Rate" or "Read Error Rate" in most utilities. Corsair SSD Toolbox seems to be missing that raw value. Look at the CrystalDiskInfo screenshot for example. That same raw number is listed for both Raw Read Error Rate and Soft ECC Correction Rate.

Link to comment
Share on other sites

Archived

This topic is now archived and is closed to further replies.
