19 December 2010

It happens to all of us - sooner or later we lose data. Sometimes it’s important, sometimes not, but rest assured it will happen. Even the most careful of us, who take backups with something akin to religious fervor, occasionally make mistakes. And so it was that I got a phone call from a very upset young lady who had just lost six months’ worth of work.

Her company had decided to refresh her PC and told her to drag and drop everything of importance on to the network share. This she did, but was unaware that some of the items had not been copied and were in fact just shortcuts. The weird thing though (or maybe not, I’m not a Windows expert) is that whilst some Excel files copied perfectly fine, one or two copied as shortcuts - and those of course were the important ones. After the copy had been made, the PC was whisked away, formatted and given to another colleague. A few hours later my friend discovered that her spreadsheet was no more and meanwhile her colleague was busy working away on her new machine.

So we have a spreadsheet on a machine that has been formatted, has had Windows reinstalled and is currently in use. The chances of recovering the data weren’t all that great but the work was sufficiently important that it was worth a try. I told her the first thing to do was get hold of the original PC, turn it off and make sure no one went near it. Most operating systems continue to write data to the disk even if they’re otherwise idle. This is actually a good thing as it tends to make the machine more responsive - but the last thing I wanted was for the part of the disk containing the spreadsheet to get overwritten.

Stage two was the most important. I took the hard disk out of the PC, connected it with a handy USB to SATA adapter and cloned it with the disk imaging tool ‘dd’. The disk was 160GB in size so this took some time to complete. However, this is the golden rule of digital forensics - never ever work on the real disk! Always take a copy and then work from a copy of the copy! In a forensic investigation this ensures that you don’t accidentally alter data and break the chain of evidence - but in my case it meant that if I made a mistake, I could easily take another copy of the image and try again. If I made that mistake on the real disk, the data would be lost for good. Remember, always take a disk image first!
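For anyone curious what that imaging step boils down to, here is a minimal Python sketch. It is not what I ran (I used dd), and the function names and paths are made up for illustration. It reads the source in fixed-size chunks, writes each chunk to an image file and keeps a running SHA-256, so you can later check that a working copy still matches the original image.

```python
import hashlib

CHUNK_SIZE = 1024 * 1024  # read 1 MiB at a time (arbitrary choice)


def clone_image(source_path, image_path, chunk_size=CHUNK_SIZE):
    """Copy source_path to image_path byte for byte.

    Returns the SHA-256 hex digest of everything that was copied.
    Both names are hypothetical helpers, not part of any tool.
    """
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # end of the source
                break
            dst.write(chunk)
            digest.update(chunk)
    return digest.hexdigest()


def sha256_of(path, chunk_size=CHUNK_SIZE):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

In use you would call something like clone_image("/dev/sdX", "disk.img") (the device name is a placeholder and depends on your system), note the digest, and compare it against sha256_of() on any copy before working on it. Reading a raw device normally needs administrator rights, and unlike a proper imaging tool this sketch makes no attempt to cope with read errors on a failing disk.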

Once I had the disk image, I used a really handy open source tool called PhotoRec which, as the name suggests, was originally designed for recovering images from damaged memory cards. Despite the name, it’s equally effective at recovering other data - such as missing Excel files. I was able to run the tool against the disk image (I used a demo version of Mount Image Pro v4 to mount the image under Windows) and it started recovering data. Unfortunately, because file names are stored in the filesystem and this method of recovery bypasses the filesystem and looks directly at the disk, it is not able to recover filenames. That is, whilst it can identify an Excel file for what it is, it can’t tell you what it was called. This meant that there were about a hundred Excel files I had to go through by hand, but fortunately Windows 7 has a nice preview pane which made that pretty painless.
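To give a feel for how a tool can spot an Excel file without any help from the filesystem, here is a rough Python sketch of the basic idea: scan the raw image for a known file signature. This is not PhotoRec's actual code, and real carving tools do a good deal more (working out where a file ends, for a start). It assumes the old binary .xls format, which starts with the eight-byte OLE2 compound document header; the function name is my own invention.

```python
# Header bytes at the start of OLE2 compound documents (legacy .xls, .doc, .ppt).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")


def find_signature_offsets(image_path, signature=OLE2_MAGIC, chunk_size=1024 * 1024):
    """Return the byte offsets in image_path where signature occurs.

    Hypothetical helper for illustration only. The image is read in
    chunks, and the last len(signature) - 1 bytes of each buffer are
    carried over so a signature split across two chunks is still found.
    """
    offsets = []
    overlap = len(signature) - 1
    tail = b""
    base = 0  # offset within the image of the first byte of tail + chunk
    with open(image_path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            buf = tail + chunk
            start = 0
            while True:
                idx = buf.find(signature, start)
                if idx == -1:
                    break
                offsets.append(base + idx)
                start = idx + 1
            # Carry over too few bytes to hold a whole signature,
            # so nothing is ever reported twice.
            tail = buf[-overlap:] if overlap else b""
            base += len(buf) - len(tail)
    return offsets
```

Each offset it returns is only a candidate. The same header is shared by old Word and PowerPoint files, so a hit still has to be extracted and opened to see what it really is, and with no filename to go on, that is exactly why you end up previewing the results by hand.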

Fortunately I was able to recover the missing file. This is pretty neat considering it happened after the machine had been reinstalled with a new version of Windows and had been used by someone for the best part of a day. Whilst this is a very happy ending to the story, it does highlight how easy it is for anyone to recover data, even if the machine has been reformatted. If you’re selling a PC or laptop, make sure you fully erase it first or the person you sell it to stands a good chance of recovering data that you thought was properly deleted. Whilst rescuing the spreadsheet I came across files from three years - that is three reinstalls - previous!

For more information on how to properly erase your disks before letting them out of your sight, check out the article I wrote for the first issue of Digital Forensics Magazine here. It’s on page 15.
