Myce.com Latest Updates

Hard Disk SMART data is ineffective at predicting failure

Posted 28 February 2007 21:00 CET by Seán Byrne

As manufacturers release larger hard disks year after year, the last thing anyone wants to see is a failing hard disk, particularly if it holds a lot of content that is not backed up.  Google has recently released an interesting paper detailing failure statistics for the hard disks in its infrastructure.  Its main findings were that the drive’s self-monitoring data (S.M.A.R.T.) does not reliably predict failure and that failure rates do not simply rise in step with drive temperature or usage levels.

What is interesting about Google’s hard disk usage is that, unlike most businesses, which typically use 10,000 RPM and 15,000 RPM SCSI hard disks in their servers, Google uses cheaper consumer-grade serial ATA and parallel ATA 5,400 RPM and 7,200 RPM hard disks.  The study considers a hard drive failed when it is replaced during a repair.  S.M.A.R.T. information was gathered from all drives, with spurious readings excluded.

Going by their statistics, hard drives are especially prone to failure early in life, with about 3% failing in the first three months, and then fail at a fairly steady rate after 2 years, with 5 years being the typical end of life.  When analysing the S.M.A.R.T. data, they found that only four values were closely related to drive failure: the counts for scan errors, sector reallocations, offline reallocations and sectors on probation.  One interesting discovery was that no hard disk had a single spindle failure, or at least none showed a spin retry count in its S.M.A.R.T. data.

Unfortunately, even with these four S.M.A.R.T. values, 56% of the drives that failed did not have a single count in any of them, which means that over half of the failed drives gave not a single warning sign in their S.M.A.R.T. data.  Finally, when it came to temperature, contrary to most expectations, they found that cooler drives were more prone to failing.  Only at very high temperatures did the failure rate start to increase again.
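To make the 56% figure concrete, the following is a minimal Python sketch of how such a "no warning" share could be computed. The records, the field names and the helper `share_without_warning` are all hypothetical and made up for illustration; they are not taken from Google's paper or from any real S.M.A.R.T. tool.

```python
# Hypothetical field names for the four S.M.A.R.T. counts the article mentions.
SIGNALS = ("scan_errors", "reallocations", "offline_reallocations", "probational")

# Made-up records for drives that have already failed (illustration only,
# not data from Google's paper).
failed_drives = [
    {"scan_errors": 0, "reallocations": 0, "offline_reallocations": 0, "probational": 0},
    {"scan_errors": 2, "reallocations": 0, "offline_reallocations": 0, "probational": 0},
    {"scan_errors": 0, "reallocations": 0, "offline_reallocations": 0, "probational": 0},
    {"scan_errors": 0, "reallocations": 5, "offline_reallocations": 1, "probational": 0},
]


def share_without_warning(drives, signals=SIGNALS):
    """Return the fraction of failed drives with a zero count in every signal."""
    if not drives:
        return 0.0
    silent = sum(1 for d in drives if all(d.get(s, 0) == 0 for s in signals))
    return silent / len(drives)


print(f"{share_without_warning(failed_drives):.0%} of failed drives gave no warning")
```

Run over a real population of failed drives, a calculation of this kind is what would produce a figure like the 56% quoted above.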

Even though Google used consumer-grade hard disks, it is worth noting that, unlike those in the average home PC, most of the hard disks in Google’s servers are probably switched on only once at installation, run continuously until they fail and operate in a temperature-controlled environment.  In home PCs, the hard disks are regularly spun up and down, and their temperature varies from room temperature up to operating temperature each time the computer is used.  As a result, it would be interesting to see what the statistics would be like from a large survey of hard disks used in the home.

Further information can be found in this Ars Technica article and in this Google paper. 

guest
No longer with us
Posted on: 01 Mar 07 00:10
So what Google is saying is that S.M.A.R.T. stands for Stupid Moronic Anal Retentive Technology? :B

guest
No longer with us
Posted on: 01 Mar 07 02:18
Hard drives are now at ridiculous sizes. It takes forever to format them, and who needs such big space unless you're pirating movies? They need to do away with current HD technology, as it's now the slowest part of the computer.

cobi64
New on Forum
Posted on: 01 Mar 07 03:52
Hound - So you're saying Google pirates movies? I'd hate to be editing my camcorder video with only a 40gb drive.

Shadowman69
CD Freaks Member
Posted on: 01 Mar 07 12:45
Nothing really new. A lot of users already know this, since their SMART HD failed and only AFTER THE FAILURE did SMART start saying "it seems your HD has a problem...". It's a shame that by then it's usually just too late... :r

DrageMester
Retired Moderator
Posted on: 01 Mar 07 18:31
The last hard drive that failed for me didn't have any SMART warnings until after a partial failure, but it gave me approx. one hour to rescue the newest versions of some files that hadn't been backed up for a week, and then it failed more or less completely. @Shadowman69: Apparently the SMART technology would be more appropriately named SMART-ass technology, since it waits until the hard drive has failed before telling you.

Waethorn
CD Freaks Junior Member
Posted on: 01 Mar 07 19:12
You guys should talk to Steve Gibson! Hard drives will show SMART failures usually only after all the reserved sectors are used up. By that time, though, yes, everything is FUBAR. :r

Waethorn
CD Freaks Junior Member
Posted on: 01 Mar 07 19:13
BTW: I wrote to Steve to ask him about his opinion on the article and on SMART Status in general....just waiting for a reply.

DeadMan
MyCE Resident
Posted on: 01 Mar 07 21:02
Such a shame that they did not name names.

guest
No longer with us
Posted on: 01 Mar 07 23:10
SMART didn't warn me one bit. One day I turned on my computer and click...click...click...

guest
No longer with us
Posted on: 02 Mar 07 14:34
You don't need to worry about "click...click...click...", they are just limping bits.
