Corrupt file system: Replace hard disk or not?

Borissh1983 borissh1983 at gmail.com
Fri Sep 22 18:21:29 IDT 2017


I'm assuming you actually ran a SMART self-test to update the counters
(a few hours per scan). What you describe sounds like something caused
by an X issue: there have been several different bugs, both in X itself
and in some DEs, that made the screen "freeze" (the workaround was to
switch to a different VT and back) while the apps themselves continued
to run.

Check your dmesg and other logs for any messages containing strings
such as link_down, exception Emask, failed command or SError. If
nothing like that exists (and you did run a SMART scan), you should be OK.

If any such message exists, it could be either the drive or the cables.
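
If you'd rather not eyeball the logs, here is a rough Python sketch of
that check. It only looks for the four strings mentioned above
(case-insensitively); the function names are made up for illustration,
and reading dmesg may need root on some systems. You could just as well
feed it the contents of /var/log/messages.

```python
import re
import subprocess

# Strings named in the message above; matching is case-insensitive.
PATTERNS = ("link_down", "exception Emask", "failed command", "SError")
_REGEX = re.compile("|".join(re.escape(p) for p in PATTERNS), re.IGNORECASE)


def suspicious_lines(log_text):
    """Return the log lines that contain any of the patterns (hypothetical helper)."""
    return [line for line in log_text.splitlines() if _REGEX.search(line)]


def read_dmesg():
    """Return the kernel ring buffer as text by running dmesg."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    hits = suspicious_lines(read_dmesg())
    for line in hits:
        print(line)
    if not hits:
        print("none of the listed strings found in dmesg")
```

An empty result only means none of these particular strings showed up;
it doesn't prove the drive or the cables are healthy.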

On 9/22/17, Eli Billauer <eli at billauer.co.il> wrote:
> Hello all,
>
> TL;DR: My hard disk's filesystem was corrupt, but the SMART statistics
> are perfect. Should I replace the hard disk?
>
> Full version:
>
> It seems like one of my hard disks has passed its own premature Yom
> Kippur verdict. Rebooting my computer this morning, it failed to mount,
> saying "Group descriptor 32768 checksum is invalid" and forced me into a
> shell.
>
> I made the mistake (?) of running fsck and then aborting it with a
> (proper CTRL-ALT-DEL) reboot, as it took ages. This is a 3 TB disk,
> which isn't necessary for booting, so I removed it from /etc/fstab, and
> brought up the computer fine.
>
> Then I ran fsck on that disk, which generated a log of 125 MB, and
> basically threw everything into /lost+found, leaving nothing in the root
> directory. Hurray.
>
> It's a Western Digital WDC WD30EZRX-00DC0B0, with one big ext4 over LUKS
> over LVM, 4 years in service, containing stuff that doesn't deserve a
> backup. So the damage is limited, but I wonder if I should replace the
> disk.
>
> Despite its age, this disk's SMART status is perfect: No bad sectors, no
> reallocated sectors, nothing. No parameter can be better. I know there's
> a "don't trust SMART" word around, but had a sector failed, I would
> expect that to appear in the statistics. I mean, I do understand that
> SMART can't predict a failure, but doesn't it mean anything?
>
> And there's another thing: The reason I rebooted the computer was that I
> found the screen frozen, but the mouse pointer moved. The time stood
> still at 3:01 (AM). This is highly unusual on my computer, which usually
> runs for months with zero issues.
>
> So I connected with ssh, and saw nothing suspicious: Not in
> /var/log/messages, not in dmesg, not in .xsession-errors. No process was
> busy in particular. From the remote terminal, I couldn't have guessed
> something was wrong. So I issued a reboot from remote, which failed as I
> mentioned above.
>
> Bottom line: The panic instinct is to replace the disk, even though the
> whole computer is due for replacement within a year or so. Money left
> aside, it's a bit of an effort, and involves a lot of scary commands as
> root, which are a risk factor by themselves. I'm not implying that I'm
> stupid enough to mke2fs the wrong disk. Not me. I never err. ;)
>
> Insights are welcome.
>
> Shana Tova,
>     Eli
>
> --
> Web: http://www.billauer.co.il
