The hard drive had started showing symptoms two weeks earlier, while I was on a business trip in Hungary! There were more and more unreadable sectors, and the SMART daemon kept complaining about them daily. When I finally got back to Finland the server was still standing, but barely: many libraries had already been corrupted, and starting new programs was hit-or-miss. But the services that were already running, such as Apache, Zope, Exim, and sshd, were still working.
The original disk had been in continuous use for over 5 years, so I guess it was about time for it to fail. It was an IBM Deskstar, and a good quality product it was. I quickly replaced it with two hard disks that had some mirrored space and Debian GNU/Linux preinstalled. The backups (done with flexbackup) were for the most part OK, but I hadn’t included some important files in them. So I needed to try to recover that data from the failing disk.
I got a copy of Recovery Is Possible, a small Linux CD image that boots up in just about any decent machine with enough RAM and a bootable CD drive, and which contains a good selection of recovery tools. So I set up an old machine with the broken disk and a brand-new disk, and booted up RIP. After adding more memory and replacing the very old CD drive with a more modern one I got the system to boot properly.
The basic tool that did all the work was ddrescue, which was bundled in the RIP distribution. ddrescue needs somewhere to write its log files, so I created a small partition on the new disk for this purpose and mounted it, and then created empty partitions that matched the sizes of the partitions I needed to recover. Then it’s just a matter of running
ddrescue /dev/hda1 /dev/hdb1 /work/rescue_hda1.log
to start the rescue of /dev/hda1 into /dev/hdb1, writing the log file into /work/rescue_hda1.log. ddrescue will try different ways to read problematic areas and can usually read lots of sectors that in normal use would just be classified as unreadable. However, a single run usually doesn’t get everything out, and several runs are required. ddrescue records in its log file which parts of the drive had problems, and subsequent runs retry only the parts that have not been recovered yet. I noticed that the hard drive worked less reliably once it had heated up, so letting the machine cool down between attempts helped. Also, adding the parameter “-n 10” will repeat the process ten times, so you can leave ddrescue to do its work while you go do something else.
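The run, cool down, run again routine can be scripted. The following is only a rough sketch, not what I actually ran: the function name rescue_passes and the variables DDRESCUE and COOLDOWN_SECS are made up for this illustration, and the 30-minute pause is a guess rather than a measured value. It relies on nothing more than the three-argument invocation shown above and on ddrescue picking up where it left off from its log file.

```shell
# Hypothetical helper: repeat a ddrescue run with a cool-down pause in between.
# DDRESCUE and COOLDOWN_SECS are illustrative names, overridable from the environment.
DDRESCUE=${DDRESCUE:-ddrescue}
COOLDOWN_SECS=${COOLDOWN_SECS:-1800}   # assumed 30 minutes; adjust to taste

rescue_passes() {
    src=$1 dst=$2 log=$3 passes=$4
    i=1
    while [ "$i" -le "$passes" ]; do
        echo "pass $i of $passes: $src -> $dst" >&2
        # The log file lets each pass retry only what is still missing.
        "$DDRESCUE" "$src" "$dst" "$log" || echo "pass $i exited non-zero" >&2
        # Let the drive cool off before the next attempt (skip after the last pass).
        if [ "$i" -lt "$passes" ]; then
            sleep "$COOLDOWN_SECS"
        fi
        i=$((i + 1))
    done
}

# Example (not executed here):
#   rescue_passes /dev/hda1 /dev/hdb1 /work/rescue_hda1.log 5
```

Looping from the outside like this is what makes the pause possible, since retries built into ddrescue run back to back. For the built-in retries, check "ddrescue --help" on your version: as far as I know, GNU ddrescue documents the retry count as -r, so the option letter may not be the same everywhere.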
Another good trick was to throw the broken hard drive into the freezer. Just wrap it up in a static-protective pouch, let it cool off at room temperature or in the fridge, then put it in the freezer and leave it there overnight. Then take it out, let it slowly return to room temperature, take it out of the pouch, hook it up to the machine and rerun ddrescue a few times. This helped me with one of the partitions, where the errors were right smack in the ext2 inode tables (roughly equivalent to the file allocation tables of Windows disks), meaning that locating files was a bit of a problem. I did work a bit with lde and recovered some critical files manually, but in the end, after two visits to the freezer, ddrescue was able to recover the inode tables as well and I got the data out much more easily.
The recovered image on the new drive is of course partially broken, since usually not all of the data can be recovered. You can either mount the partition read-only and just copy the data you need, or you can let fsck try to repair the partition. In any case, you then have most of the data on a working disk, from which you can copy it to wherever you need it. I had two partitions that weren’t fully backed up, and of the 11 GB and 6 GB partitions only 15 kB and 22 kB were left unreadable (whereas after just the first ddrescue run more than 200 kB were unreadable).
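To see how much is still unreadable after each run, you can add up the unrecovered areas listed in the log file. The sketch below assumes the log follows the GNU ddrescue layout as I understand it: comment lines starting with "#", one status line, then one line per block giving position, size and a status character, with sizes as 0x-prefixed hex and "+" meaning recovered. The function name unrecovered_bytes is made up for this example, and if your version writes a different format the numbers will be wrong.

```shell
# Hypothetical helper: sum the sizes of all blocks in a ddrescue log file
# that are not marked '+' (recovered). Assumes "pos size status" lines.
unrecovered_bytes() {
    total=0
    while read -r pos size status; do
        # Skip comments and blank lines.
        case "$pos" in
            '#'*|'') continue ;;
        esac
        case "$status" in
            # Non-tried, non-trimmed, non-split/scraped, bad sector.
            '?'|'*'|'/'|'-') total=$(( total + $size )) ;;
            # '+' blocks are recovered; the status line (whose third field
            # is empty or a pass number) matches nothing and is ignored.
        esac
    done < "$1"
    echo "$total"
}

# Example (not executed here):
#   unrecovered_bytes /work/rescue_hda1.log
```

Running this after every pass gives a quick way to tell whether another trip to the freezer is still paying off.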
Miguel says
Hi,
Weird trick, that one with the freezer! What reason do you think (know?) is behind it working? Realignment of the magnetic compound? I’m quite curious, do you have any links? And couldn’t moisture build up on the HD? Even inside? Mmm… but it is supposed to be sealed, so not inside, right?
I’ve had some HD crashes myself too, and though I have used ddrescue, I had never heard of the freezer trick! Maybe I’ll give it a try next time! Or better, let’s hope there is no next time. 🙂
Nice info!
Cheers,
_________
Miguel.
tarmo says
Well, there are a lot of people who have had success with freezing a hard drive. Some report that you should try to do the backup immediately after 2-3 hours of freezing, but for me the frozen disk did not operate at all. It may be that I did not seal the thing too well and condensation prevented the drive from working. But keeping it in the freezer overnight and then letting it thaw at room temperature did work for me. My guess is that the temperature change alone makes the material reorganize itself a bit, and that may be enough to get a bad sector working again.
And yes, if freezing sounds too risky, you can just use the fridge and see if that helps. Tech Republic has a nice collection of tips, entitled 200 ways to revive a hard drive. And you can of course Google for more info.
Bob says
Hi.
I would be very grateful if you could help me. I have an HP Pavilion zt3000 notebook. When I switched it on recently it said ‘1720-SMART Hard Drive detects imminent failure (failing attr : 02)’.
So I tried replacing the hard drive with an old IBM ThinkPad hard disk. However, when I put the IBM hard drive into the HP it doesn’t work. The PC boots up to the screen where it says ‘start windows in safe mode’, ‘start windows with command prompt’ etc. When I press enter it starts booting up again.
Would you please be able to help?