Recovering a hard drive

The hard drive had started showing symptoms two weeks earlier, while I was on a business trip in Hungary! The number of unreadable sectors kept growing, and the SMART daemon complained about them daily. When I finally got back to Finland the server was still standing, but barely: many libraries had already been corrupted, and starting new programs was hit-and-miss. The services that were already running, such as Apache, Zope, Exim, and sshd, kept working.

The original disk had been in continuous use for over 5 years, so I guess it was about time for it to fail. It was an IBM Deskstar, and a good quality product it was. I quickly replaced it with two hard disks that had some mirrored space on them and a preinstalled Debian GNU/Linux. The backups were for the most part OK (done with flexbackup), but I hadn’t included some important files in them. So I needed to try to recover the data from the failing disk.

I got a copy of Recovery Is Possible, a small Linux CD image that boots up in just about any decent machine with enough RAM and a bootable CD drive, and which contains a good selection of recovery tools. So I set up an old machine with the broken disk and a brand-new disk, and booted up RIP. After adding more memory and replacing the very old CD drive with a more modern one I got the system to boot properly.

The basic tool that did all the work was ddrescue, which was bundled in the RIP distribution. ddrescue needs somewhere to write its log file, so I created a small partition on the new disk for this purpose and mounted it, and then created empty partitions matching the sizes of the partitions I needed to recover. Then it’s just a matter of running

ddrescue /dev/hda1 /dev/hdb1 /work/rescue_hda1.log

to start the rescue of /dev/hda1 into /dev/hdb1, writing the log file into /work/rescue_hda1.log. ddrescue will try different ways to read problematic areas and can usually read lots of sectors that in normal use would just be classified as unreadable. However, a single run usually doesn’t get everything out, and several runs are required. ddrescue stores in its log file which parts of the drive had problems, and subsequent runs just retry the parts that have not been recovered yet. I noticed that the hard drive works less reliably once it heats up, so letting the machine cool down between attempts helped. Also, adding the parameter “-r 10” makes ddrescue retry the bad areas up to ten times, so you can leave it to do its work while you go do something else.
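The run, cool down, run again routine could be scripted roughly as follows. This is only a sketch: the three passes, the 30-minute pause, and the DRY_RUN switch are illustrative assumptions, not anything ddrescue itself provides, while the device names and log path are the ones from the command above. With DRY_RUN=1 (the default here) the script only prints what it would run.

```shell
#!/bin/sh
# Sketch: run several ddrescue passes with a cool-down pause between them.
# DRY_RUN, the pass count and the pause length are illustrative assumptions.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: rescue_passes SOURCE TARGET LOGFILE PASSES COOLDOWN_SECONDS
rescue_passes() {
    src=$1 dst=$2 log=$3 passes=$4 cooldown=$5
    i=1
    while [ "$i" -le "$passes" ]; do
        # Reusing the same log file lets each pass skip what is already recovered.
        run ddrescue "$src" "$dst" "$log"
        # Let the drive cool down, except after the last pass.
        if [ "$i" -lt "$passes" ]; then
            run sleep "$cooldown"
        fi
        i=$((i + 1))
    done
}

rescue_passes /dev/hda1 /dev/hdb1 /work/rescue_hda1.log 3 1800
```

Set DRY_RUN=0 only after double-checking the source and target devices, since swapping them would overwrite the failing disk.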

Another good trick was to throw the broken hard drive into the freezer. Just wrap it up in a static-protective pouch, let it cool off to room temperature or in the fridge, then put it in the freezer and leave it there overnight. Then take it out, let it slowly return to room temperature, take it out of the pouch, hook it up to the machine and rerun ddrescue a few times. This helped me with one of the partitions, where the errors were right smack in the ext2 inode tables (roughly equivalent to the file allocation tables of Windows disks), meaning that locating files was a bit of a problem. I did work a bit with lde and recovered some critical files manually, but in the end, after two visits to the freezer, ddrescue was able to recover the inode tables as well and I got the data out much more easily.

The recovered image on the new drive is of course partially broken, since usually not all data can be recovered. You can either mount the partition read-only and just copy the data you need, or you can let fsck try to repair the partition. In any case, you then have most of the data on a working disk, from which you can copy it to wherever you need it. I had two partitions that weren’t fully backed up, and of the 11 GB and 6 GB partitions only 15 kB and 22 kB were left unreadable (while after just the first ddrescue run more than 200 kB were unreadable).
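The copy-out route could look roughly like this. The mount point /mnt/recovered and the destination /srv/restore are hypothetical names used only for illustration, and /dev/hdb1 is the rescued partition from the ddrescue command earlier. e2fsck -n checks the filesystem without changing anything, so it is a safe first look before deciding whether to let fsck actually repair the partition. As written, the script only prints the commands (DRY_RUN=1 is an assumed safety switch).

```shell
#!/bin/sh
# Sketch: inspect the rescued ext2 partition and copy data out read-only.
# Mount point and destination are hypothetical; DRY_RUN is an assumed switch.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: salvage PARTITION MOUNTPOINT DESTINATION
salvage() {
    part=$1 mnt=$2 dest=$3
    # Read-only check: -n answers "no" to every repair question.
    run e2fsck -n "$part"
    run mkdir -p "$mnt" "$dest"
    # Mount read-only so nothing on the rescued image is modified.
    run mount -o ro "$part" "$mnt"
    # Copy everything out, preserving ownership, permissions and timestamps.
    run cp -a "$mnt/." "$dest/"
    run umount "$mnt"
}

salvage /dev/hdb1 /mnt/recovered /srv/restore
```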

Server disk crash

This server’s old hard drive failed last week. “Did you have backups?” Yes, I did. “Had you tested your backups?” No, I hadn’t. It seems my backup routine did not have access to all the files that needed to be backed up, so restoring the server has been a bit of a task. But just about everything is now more or less coming back up. At the same time I decided to get rid of the old static XSLT pages and replace everything with just this blog. At least for now. I’ll write more on the restoration process a bit later…

So why did the backup not work completely? All of the servers I administer have a centralized backup location on one of the servers. There’s a RAID 1 array that receives all backups. The backup software is “flexbackup”, which quite nicely does backups of remote machines over the net. The problem was that I had decided to use the “backup” user account to do the backups, and of course this user did not have enough access to some of the more secure files. The lesson: add the backup user to the groups that have access to stuff that needs to be backed up, e.g. users, staff, www-data, zope, mysql. You get the idea. Also, after setting up the backups, take a look at /var/log/flexbackup (or wherever you’re saving the flexbackup logs), look for “access denied” messages, and see if those files and folders should be included in the backup. If they should, then you need to grant more access to the backup user account.
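Both parts of that lesson can be sketched in a few lines of shell. Treat this as an assumption-laden example rather than my exact setup: it uses usermod -a -G to append groups (on Debian, adduser would do the same job), it searches for the word “denied” in any capitalisation because I am not quoting the exact wording of the log messages, and LOG_DIR and DRY_RUN are hypothetical variables introduced for the example. In dry-run mode the group changes are only printed.

```shell
#!/bin/sh
# Sketch: give the backup user more group access, then scan the logs.
# LOG_DIR and DRY_RUN are hypothetical names introduced for this example.
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: add_backup_groups USER GROUP...
add_backup_groups() {
    user=$1
    shift
    for group in "$@"; do
        # -a appends to the user's existing supplementary groups.
        run usermod -a -G "$group" "$user"
    done
}

# usage: denied_lines LOGDIR
# Prints every log line mentioning "denied" (any case), nothing if none.
denied_lines() {
    grep -ri "denied" "$1" 2>/dev/null || true
}

add_backup_groups backup users staff www-data zope mysql
denied_lines "${LOG_DIR:-/var/log/flexbackup}"
```

Running the log scan after every backup, rather than once after setup, would also catch files that become unreadable later.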

Reducing the static buzz of Treo 600, take 2

OK, the buzz returned, but it only manifests itself when the battery is low on charge. A couple of days ago I reopened the Treo to check that the original aluminium patch was still there, and it was. I then took a couple of slivers of aluminium foil and wrapped them around the entire power wire, so that the foil sits tightly around it and, most importantly, between the power wire and the electronics.

Results so far: The buzz is gone even when the battery is already complaining and asking for a recharge. However, by tweaking the antenna I can generate the buzz, or make it go away. Apparently the antenna contact has some problems. Will need to investigate further.

Reducing the background noise of Treo 600

This article details how to fix the static buzz problem of the Treo 600. The problem stems from an unprotected power wire inside the Treo, and it can be fixed quite easily without paying for maintenance.

My Treo started to have symptoms sometime last summer: the people on the other end of the call complained of static noise that nearly prevented them from hearing me. This was a major nuisance, of course. Sometimes it helped if I moved closer to a window to get better reception. Gradually I noticed that the amount of power left in the battery affected the problem: when the battery was low, the problem appeared more often.

From the palmOne FAQ pages I read that it helps to have a full charge and good reception. I did not know what the cause was, since I’d done lots of stuff (firmware update, dropping the phone on the ground, damaging it, having the antenna a bit loose, installing lots of behaviour-altering software…). When I finally did a Google on the subject, I found clear instructions in English and in French on how to rectify the problem.

Apparently the power cord that connects the battery to the system has four wires, and they are the cause of the static interference. One of the guides said that twisting the wire ends a couple of times helps (in essence this achieves the same protection as twisted-pair cabling, as used in phone lines and standard network cabling), and a Scotsman advised that wrapping the wire in tin foil helps as well. I did both.

As of now (just two days after the operation), I haven’t had any noise problems. But I’ll have to wait and see how things play out.

UPDATE: Reports from people I’ve talked to indicate a significantly clearer sound and no buzz. Excellent!

An excellent French article on opening the Treo and twisting the cable has very good pictures, so even if you don’t know any French you can see how things are done. An English guide to wrapping the tin foil has good pictures as well, but doesn’t cover opening the case in as much detail. So read them both and have them open while you operate on your Treo.

Here’s a quick summary:

  1. Do a full backup of the Treo. Remove the SD card, the SIM card and the stylus.
  2. Remove the screw protectors with a wooden toothpick so as not to damage them.
  3. Unscrew the screws with a size 6 Torx screwdriver (star-shaped tip).
  4. Remove the antenna.
  5. Slide a credit card into the crevice in the side of the case and slide up and down, separating the back and front sides. Repeat on both sides and twist the card a bit to pry the sides loose.
  6. Open by holding the screen downwards and lifting the back, starting from the top, separating the bottom part last.
  7. Pull out the battery wire. Make a note of which way the connectors go into their sockets.
  8. Twist the wire maybe four rotations.
  9. Wrap a small piece of tin foil (15mm * 30mm) around the twisted wire and add a small strip of adhesive to keep it in place.
  10. Reconnect the wire. The foil will easily break, so be careful. Also remember that as soon as you connect the wire, your Treo will have power, will reset and start with the tutorial and preliminary setup. At this point you can simply press the power button to shut down the screen.
  11. If you drop the longish black rubber pad from inside your Treo, it doesn’t break anything, but having it does give you that luxury feeling when sliding the stylus in and out of its holster. The two documents I linked to did not contain instructions on this. But the correct place to put the rubber pad is just below the camera eye, under the rim of the green circuit board. When you place the rubber pad under the circuit board, the small nub at one end goes towards the top of the phone, under the board. Push gently but firmly to set the pad properly in its place.

    UPDATE: Here are two images showing the rubber pad and its correct placement. Thanks to Claude Morin for the images!

  12. You might want to blow out any dust and dirt that may have gathered inside your Treo.
  13. Replace the back cover, first making sure the bottom is properly positioned and then swiveling the rest in place. Everything should click satisfactorily.
  14. Put the screws back in, refit the screw caps, reattach the antenna, and insert the SIM and SD cards.
  15. Push the power button and complete the primary setup. Set the language and the date properly.
  16. Connect to your computer and restore the contents from your backup. Done!

Credits go to XiaoBin and Ablivio. Thanks for the instructions and the pictures!