Gawd o Gawd
Well, the main hard drive on the Cellar's system died. Up and died. Made today a serious pain in the ass, but I think everything is back now, except for a few of the Images of the Day.
Wouldn't wish this on anyone... |
That would explain why I could not access the site earlier today.
|
Ditto.
Ouch, what exactly happened? Give us a slashdot-style rundown =) |
OK, here goes. About a week ago I got a bad block reported on the drive. I stepped up my schedule of backups but noticed that, through the backup process, I could figure out which file contained the bad block. It was a log file; I figured this was good, since it was probably eating up space that hadn't been touched in a while. I segregated the file and marked it as having no permissions at all.
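For anyone who wants to try the same trick, here is a rough sketch of the idea: read every file end to end and note which ones throw an I/O error. This is not the backup tool used above, just an illustration. The function name `find_unreadable_files` and its `open_fn` parameter are made up for the example; `open_fn` is only there so the error path can be exercised without a failing disk.

```python
import os


def find_unreadable_files(root, open_fn=open, chunk_size=1 << 20):
    """Return (path, errno) pairs for files under root that fail when read.

    Assumption: a file sitting on a bad block surfaces as an OSError
    (typically EIO) when it is opened or read all the way through.
    """
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open_fn(path, "rb") as fh:
                    # Read in chunks so large log files do not fill memory.
                    while fh.read(chunk_size):
                        pass
            except OSError as exc:
                bad.append((path, exc.errno))
    return bad
```

Calling `find_unreadable_files("/var/log")` would list the suspect files under that directory, which you could then move aside the way the log file was handled here.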
I ordered another drive but wasn't sure when I'd have a chance to install it. But on Saturday morning, more bad blocks started popping up. The slowly dying drive was now leaping off the cliff. I made a final backup from one SCSI drive to the other - thinking I'd yank the bad drive, install the new one, install a fresh copy of Linux to that drive, and copy the old stuff back from the old drive.

During that process, I was forced at one point to reboot. On reboot the bad drive failed a little harder. Now it failed the automatic fsck (that's roughly ScanDisk for you non-Linux weenies). I manually fsck'd and was able to recover and boot past about 100 file system errors. Now, clearly, bad spots were causing file system corruption. Still, I recovered everything I could, then swapped out the bad drive, then began a fresh install from CD.

Then I made a fatal error. In the Red Hat install, you are given the option of allowing Red Hat to partition for you. In my haste, I decided that the default partitioning scheme wasn't bad and that I should just go for it. What I didn't realize is that Red Hat assumes control of ALL the drives on the system in such a case, not just the first/booting drive; and it went ahead and politely reformatted and repartitioned the second drive as well. Sys admin lesson: never make any assumptions about a vendor's defaults.

Since I had previously given my wife a window of between 1 hour and 7 hours to complete the whole process, I let her know that it would be closer to 7 hours. Then I repartitioned again, this time manually, setting things up precisely how I wanted them, and reinstalled. I had a secondary backup FTPd to another system, besides my main backups which were aging. Putting both of them together on top of a fresh install, without overwriting any system files, I got almost everything back.

Total time spent was about 6.5 hours and it was on the middle day of a three-day weekend. But that's OK.
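The "two backups on top of a fresh install" step can be sketched roughly like this. It is a minimal illustration, not the exact procedure used above. It assumes "without overwriting any system files" means "never replace a file that already exists on the target", and that the more trusted backup is listed first. The function name `restore_without_overwrite` is made up for the example.

```python
import os
import shutil


def restore_without_overwrite(backup_roots, target_root):
    """Copy files from each backup tree into target_root, skipping existing paths.

    Backups are applied in order, so a file restored from an earlier
    backup is never replaced by a copy from a later one. Returns the
    restored paths relative to target_root.
    """
    restored = []
    for backup_root in backup_roots:
        for dirpath, _dirnames, filenames in os.walk(backup_root):
            rel_dir = os.path.relpath(dirpath, backup_root)
            dest_dir = os.path.normpath(os.path.join(target_root, rel_dir))
            os.makedirs(dest_dir, exist_ok=True)
            for name in filenames:
                dest = os.path.join(dest_dir, name)
                # Anything already present (fresh install or earlier backup) wins.
                if os.path.lexists(dest):
                    continue
                shutil.copy2(os.path.join(dirpath, name), dest)
                restored.append(os.path.relpath(dest, target_root))
    return restored
```

Listing the fresher backup first means its copy of a file wins, while the older backup only fills in whatever is still missing. Try it on a scratch directory before pointing it at a live system.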
It was probably the best day for it to happen as many USians are on vacation. |