XFS problems post RAID crash

sysadmin sysadmin at insinc.com
Wed Jul 27 17:21:32 PDT 2011


I had a look in the XFS mailing list archives and found plenty of people with
similar errors to the ones I'm experiencing (Google didn't find these archives
when I searched earlier).

None of them seemed to be able to solve the problem, so I've resolved to
destroy the array, recreate it, reformat, and transfer the files back.

I realized that the really important files amount to around 300GB. Hopefully
we won't need the other ~2.7TB during the few days it's going to take to
transfer it all back.
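For the restore itself, here's the rough shape of what I'm planning. All device names and paths below are placeholders, not our real ones:

```shell
# Placeholder device/paths -- NOT the real ones.
# After recreating the array: mkfs.xfs /dev/md5 && mount /dev/md5 /mnt/md5
# Then copy the ~300GB of critical files first, the remaining ~2.7TB later.
restore_cmd() {
    # build the rsync invocation for one pass (archive mode, keep hardlinks)
    printf 'rsync -aH --progress %s/ %s/' "$1" "$2"
}
restore_cmd backuphost:/backup/critical /mnt/md5/critical
```

Doing the critical 300GB as its own pass means that data can be back in service within hours rather than days.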

Thanks to all who took the time to look at this issue for me, and thanks in
particular to Lonni for your advice.

Regards,
Nick


sysadmin wrote:
> 
> Thanks again Lonni.
> 
> I could possibly upgrade the kernel on this system, I'll look into it.
> 
> I agree the FS may be beyond repair, but given that I can still retrieve
> files off it in RO mode I think there is a chance, and given the time
> saving it's worth exploring.
> 
> I'm not 100% sure exactly how the RAID failure occurred. Unfortunately the
> system wasn't being monitored so the disks being offlined may not have
> happened at the same time (i.e. it may have been months between failures).
> 
> Given no data scrubbing was occurring my best guess is that MD offlined
> the drives after trying to write to a bad block as described here:
> 
>    http://ashtech.net/~syntax/blog/archives/53-Data-Scrub-with-Linux-RAID-or-Die.html
> 
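For anyone who finds this thread later: the manual way to kick off an md consistency check looks roughly like this (md5 is just an example array name):

```shell
# Build the sysfs path that controls scrubbing for a given md array.
scrub_path() {
    printf '/sys/block/%s/md/sync_action' "$1"
}
# As root: echo check > "$(scrub_path md5)"
# Progress appears in /proc/mdstat; the mismatch count shows up in
# /sys/block/md5/md/mismatch_cnt afterwards.
scrub_path md5
```

Running that from a monthly cron job would have surfaced the bad blocks long before a rebuild needed them.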
> This document (under the section "Why do drive failures come in pairs"):
> 
>    http://www.nber.org/sys-admin/linux-nas-raid.html
> 
> explains that an MD RAID5 failure is actually quite a likely event given
> the size of hard drives these days. Yes, smartd is running - both drives
> are fine according to it.
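Worth noting that an overall PASSED from smartd doesn't rule out bad sectors; checking the reallocated/pending sector attributes directly is more telling. A sketch (the device name is an example):

```shell
# Commands to inspect SMART state directly (run as root on the real device):
#   smartctl -t short /dev/sda     # queue a short self-test
#   smartctl -l selftest /dev/sda  # review the self-test log afterwards
#   smartctl -A /dev/sda           # attributes: watch 5 (Reallocated_Sector_Ct)
#                                  # and 197 (Current_Pending_Sector)
smart_cmd() { printf 'smartctl -A /dev/%s' "$1"; }
smart_cmd sda
```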
> 
> 
> 
> netllama wrote:
>> 
>> The first thing I'd suggest is trying a much newer kernel. That
>> 2.6.25.10 is ancient and could be hitting a number of XFS bugs that
>> have since been fixed.  If that doesn't help, then you likely need to
>> get on the XFS mailing list and ask the experts for guidance. However,
>> if you had HW failure on two disks, it's quite likely that your
>> filesystem is beyond repair.
>> 
>> I do have to ask, how did you end up with two disks in the same array
>> with bad sectors at the same time?  What kind of disks are these?
>> Were you running smartd?
>> 
>> 
>> On Wed, Jul 27, 2011 at 2:29 PM, sysadmin <sysadmin at insinc.com> wrote:
>>>
>>> Hi Lonni thanks for the reply.
>>>
>>> I'm using kernel 2.6.25.10 (CentOS 5).
>>>
>>> I have made a copy of the data on another server via NFS mount - the
>>> transfer took 6 days (~3TB of data). As I stated in my original post,
>>> I can reformat this file system and transfer the data back, but if I
>>> can make the current one work it would save me a lot of time. Not to
>>> mention we have some urgent uses for the space currently being taken
>>> up by the backup.
>>>
>>> Nick
>>>
>>>
>>> netllama wrote:
>>>>
>>>> You never stated what kernel version you were using.  Anyway, if
>>>> you're able to mount the filesystem read only, then I'm unclear why
>>>> you can't make a copy of that data and use that?
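(Side note for the archives: on a damaged XFS, the read-only mount that tends to succeed is the one with log recovery disabled; a sketch, where device and mountpoint are examples:)

```shell
# ro prevents further writes; norecovery skips log replay, which is often
# needed when the log itself is damaged (an XFS-specific mount option).
mount_cmd() { printf 'mount -t xfs -o ro,norecovery /dev/%s %s' "$1" "$2"; }
mount_cmd md5 /mnt/recovery
```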
>>>>
>>>> On Wed, Jul 27, 2011 at 2:20 PM, sysadmin <sysadmin at insinc.com> wrote:
>>>>>
>>>>> Hey all,
>>>>>
>>>>> I had a Linux MD RAID 5 array that had 2 drives go offline due to bad
>>>>> sectors (no data scrubbing was being performed).
>>>>>
>>>>> I've managed to rebuild the array and can mount the XFS file system
>>>>> RO. Some of the files are missing/corrupt but I have managed to
>>>>> transfer many of them off to another system.
>>>>>
>>>>> The only trouble now is that when I mount the file system RW I get
>>>>> problems such as "cannot allocate memory" when doing an ls on a
>>>>> directory, and other such strange problems.
>>>>>
>>>>> Running xfs_repair gives:
>>>>>
>>>>>  10:09 root at servername:~# xfs_repair /dev/md5
>>>>>  Phase 1 - find and verify superblock...
>>>>>  Phase 2 - using internal log
>>>>>         - zero log...
>>>>>         - scan filesystem freespace and inode maps...
>>>>>  bad magic # 0x20000000 in btbno block 5/1098
>>>>>  bad magic # 0 in btcnt block 6/5518
>>>>>  expected level 1 got 0 in btcnt block 6/5518
>>>>>  bad magic # 0 in btcnt block 7/5129
>>>>>  expected level 1 got 0 in btcnt block 7/5129
>>>>>  bad magic # 0x6e73745f in btbno block 8/218842
>>>>>  expected level 0 got 2354 in btbno block 8/218842
>>>>>  bad magic # 0x33340933 in btcnt block 8/218847
>>>>>  expected level 0 got 13104 in btcnt block 8/218847
>>>>>  bad magic # 0x28717476 in btbno block 10/13130259
>>>>>  expected level 1 got 25970 in btbno block 10/13130259
>>>>>  bad magic # 0x2f000000 in btbno block 13/31016602
>>>>>  bad magic # 0 in btcnt block 14/13717213
>>>>>  expected level 1 got 0 in btcnt block 14/13717213
>>>>>  bad magic # 0xf0980300 in btcnt block 15/1358720
>>>>>  bad magic # 0x2e323032 in btbno block 17/28998874
>>>>>  expected level 0 got 11825 in btbno block 17/28998874
>>>>>  bad magic # 0x36332e31 in btcnt block 17/28998875
>>>>>  expected level 0 got 14137 in btcnt block 17/28998875
>>>>>  bad magic # 0 in btbno block 19/91721
>>>>>  block (22,5084) multiply claimed by bno space tree, state - 2
>>>>>  block (22,5085) multiply claimed by bno space tree, state - 2
>>>>>  bcnt freespace btree block claimed (state 1), agno 23, bno 21801, suspect 0
>>>>>  bad magic # 0x2d313509 in btbno block 24/17051
>>>>>  expected level 1 got 12848 in btbno block 24/17051
>>>>>  block (25,22903) multiply claimed by bno space tree, state - 7
>>>>>  block (25,23312) multiply claimed by bno space tree, state - 2
>>>>>  block (25,23313) multiply claimed by bno space tree, state - 2
>>>>>  block (25,23314) multiply claimed by bno space tree, state - 2
>>>>>  block (25,23315) multiply claimed by bno space tree, state - 2
>>>>>  block (25,23316) multiply claimed by bno space tree, state - 2
>>>>>  block (25,23317) multiply claimed by bno space tree, state - 2
>>>>>  bno freespace btree block claimed (state 1), agno 25, bno 22902, suspect 0
>>>>>  bcnt freespace btree block claimed (state 1), agno 25, bno 23400, suspect 0
>>>>>  bad magic # 0x41425442 in btcnt block 27/1293
>>>>>  expected level 1 got 0 in btcnt block 27/1293
>>>>>  bad magic # 0x2000000 in btcnt block 27/16922334
>>>>>
>>>>> No matter which xfs_repair flags I use (even -d with the server
>>>>> booted in single user mode), the repair hangs at this point, i.e.
>>>>> at this exact bad magic line:
>>>>>
>>>>>  bad magic # 0x2000000 in btcnt block 27/16922334
>>>>>
>>>>> Is there anything I can do to recover this file system? Obviously a
>>>>> full recovery would be ideal, but given the RAID crash that's likely
>>>>> impossible. At this stage I'd be happy just to get the file system
>>>>> running with whatever data it's managed to retain. If I have to
>>>>> reformat and transfer the data back onto the system, the transfer
>>>>> will take days.
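For the archives, the usual xfs_repair escalation is roughly the following; note that -L discards the log and can lose recent metadata, so it's a last resort (the device name is an example):

```shell
# Typical escalation order; the filesystem must be unmounted first.
#   xfs_repair -n /dev/md5   # dry run: report only, change nothing
#   xfs_repair /dev/md5      # normal repair
#   xfs_repair -P /dev/md5   # disable prefetching (can help with hangs)
#   xfs_repair -L /dev/md5   # zero the log: LAST RESORT, may lose metadata
repair_cmd() { printf 'xfs_repair %s/dev/%s' "${1:+$1 }" "$2"; }
repair_cmd -n md5
```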
>>>>>
>>>>> I've hunted around on the web and found people having similar,
>>>>> though not quite identical, issues, e.g.:
>>>>>
>>>>>  http://old.nabble.com/bad-magic-and-dubious-inode-td8785248.html#a8785248
>>>>>
>>>>> Any help appreciated.
>> 
>> 
>> 
>> -- 
>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> L. Friedman                                    netllama at gmail.com
>> LlamaLand                       https://netllama.linux-sxs.org
>> 
>> _______________________________________________
>> Linux-users mailing list ( Linux-users at linux-sxs.org )
>> Unsub/Password/Etc: 
>> http://linux-sxs.org/mailman/listinfo/linux-users
>> 
>> Need to chat further on this subject? Check out #linux-users on
>> irc.linux-sxs.org !
>> 
>> 
> 
> 

-- 
View this message in context: http://old.nabble.com/XFS-problems-post-RAID-crash-tp32150573p32152966.html
Sent from the Linux Users (linux-sxs.org) mailing list archive at Nabble.com.




