Tuesday, 8 May 2007

Tales of Woe

Been ages since I wrote any entries, but in fairness that isn't all my fault. About a week ago I noticed that all the web sites were down - not a usual occurrence, but it has happened before and is usually cured by a reboot.

However, on this occasion, the reboot had no effect. Now, most of the work I do on the servers, I do remotely from the comfort of the office. But this time, a personal visit was warranted. Even in person, the web server looked fine.

Now, all the visits to my web sites get logged in a database. Things like my blog get written to this database too. So, checking the database server was the next step. Unfortunately, the message "Windows could not start because the ntoskrnl.exe file is missing or corrupt" kind of gave the game away.

There was nothing wrong with the web server; it had just got itself into a tizzy trying again and again to talk to the database.

So, the database... Well, I did all the easy things like running up the Repair console, all to no avail. I did run chkdsk, which found some problems, but I couldn't do anything to actually sort them and get the server running. In desperation I ordered an external drive caddy, which at least allowed me to rip the hard disk out of the machine and attach it to another (working) system for a closer look.

I kicked off another chkdsk, which took a mammoth 36 hours to complete (the database disk was quite large!). Even then I got various errors along the lines of "such-and-such a file could not be fixed".

Meanwhile, I did have a fallback. I only upgraded the server's hard disk last July, and nothing really new had been placed onto the server since - so in the worst case, I could put the old disk back and just copy the nightly database backup over the top. So, no data loss - but I would be back with a smaller disk for the moment.

At the time of writing, this is still an option. The machine is up and working with the old disk. But having looked further at the new disk following its chkdsk (which, for all the messages, appeared to be successful) and a defrag, I'm wondering whether the drive might be fixed enough to work.

Haven't tried yet, am going to do this tonight.

Hopefully, when this is complete, I'll be able to get back to blogging about "real" things such as the Cuckoo Fair, and how Alice has started getting some private tuition...

