XFS problems post RAID crash

sysadmin sysadmin at insinc.com
Wed Jul 27 14:20:19 PDT 2011


Hey all,

I had a Linux MD RAID 5 array in which 2 drives went offline due to bad
sectors (no data scrubbing was being performed).

I've managed to rebuild the array and can mount the XFS file system RO. Some
of the files are missing/corrupt but I have managed to transfer many of them
off to another system.

The only trouble now is that when I mount the file system RW I get errors
such as "cannot allocate memory" when doing an ls on a directory, along with
other strange problems.

Running xfs_repair gives:

 10:09 root at servername:~# xfs_repair /dev/md5
 Phase 1 - find and verify superblock...
 Phase 2 - using internal log
         - zero log...
         - scan filesystem freespace and inode maps...
 bad magic # 0x20000000 in btbno block 5/1098
 bad magic # 0 in btcnt block 6/5518
 expected level 1 got 0 in btcnt block 6/5518
 bad magic # 0 in btcnt block 7/5129
 expected level 1 got 0 in btcnt block 7/5129
 bad magic # 0x6e73745f in btbno block 8/218842
 expected level 0 got 2354 in btbno block 8/218842
 bad magic # 0x33340933 in btcnt block 8/218847
 expected level 0 got 13104 in btcnt block 8/218847
 bad magic # 0x28717476 in btbno block 10/13130259
 expected level 1 got 25970 in btbno block 10/13130259
 bad magic # 0x2f000000 in btbno block 13/31016602
 bad magic # 0 in btcnt block 14/13717213
 expected level 1 got 0 in btcnt block 14/13717213
 bad magic # 0xf0980300 in btcnt block 15/1358720
 bad magic # 0x2e323032 in btbno block 17/28998874
 expected level 0 got 11825 in btbno block 17/28998874
 bad magic # 0x36332e31 in btcnt block 17/28998875
 expected level 0 got 14137 in btcnt block 17/28998875
 bad magic # 0 in btbno block 19/91721
 block (22,5084) multiply claimed by bno space tree, state - 2
 block (22,5085) multiply claimed by bno space tree, state - 2
 bcnt freespace btree block claimed (state 1), agno 23, bno 21801, suspect 0
 bad magic # 0x2d313509 in btbno block 24/17051
 expected level 1 got 12848 in btbno block 24/17051
 block (25,22903) multiply claimed by bno space tree, state - 7
 block (25,23312) multiply claimed by bno space tree, state - 2
 block (25,23313) multiply claimed by bno space tree, state - 2
 block (25,23314) multiply claimed by bno space tree, state - 2
 block (25,23315) multiply claimed by bno space tree, state - 2
 block (25,23316) multiply claimed by bno space tree, state - 2
 block (25,23317) multiply claimed by bno space tree, state - 2
 bno freespace btree block claimed (state 1), agno 25, bno 22902, suspect 0
 bcnt freespace btree block claimed (state 1), agno 25, bno 23400, suspect 0
 bad magic # 0x41425442 in btcnt block 27/1293
 expected level 1 got 0 in btcnt block 27/1293
 bad magic # 0x2000000 in btcnt block 27/16922334
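
One way to read the bad magic values above is to treat each as four
big-endian bytes and check whether they are printable ASCII. Below is a
minimal Python sketch of that idea. The helper name decode_magic is made up
for illustration. The only XFS facts it relies on are that free-space btree
blocks normally begin with the magic "ABTB" (by-block tree) or "ABTC"
(by-count tree).

```python
import struct

# Expected on-disk magics for the two free-space btrees (stored big-endian).
XFS_ABTB_MAGIC = 0x41425442  # "ABTB", by-block-number tree (btbno)
XFS_ABTC_MAGIC = 0x41425443  # "ABTC", by-count tree (btcnt)


def decode_magic(value):
    """Return the 4 bytes of a magic number as text, '.' for non-printables."""
    raw = struct.pack(">I", value)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# Values copied from the xfs_repair output above.
seen = [0x6e73745f, 0x33340933, 0x28717476, 0x2e323032,
        0x36332e31, 0x2d313509, 0x41425442, 0x2000000]

for value in seen:
    print(f"0x{value:08x} -> {decode_magic(value)!r}")
```

Several of the values decode to fragments of text ("nst_", ".202", "63.1"),
which suggests those blocks now hold file data rather than btree metadata.
0x41425442 decodes to "ABTB", i.e. what looks like a by-block btree header
sitting where a by-count block was expected.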

No matter which xfs_repair flags I use (even -d with the server booted in
single user mode), the repair hangs at this point, i.e. at this exact bad
magic line:

 bad magic # 0x2000000 in btcnt block 27/16922334
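
To get a feel for how widespread the damage is, the complaints can be tallied
per allocation group. The sketch below is a rough Python illustration, not
part of xfsprogs. It assumes the "N/M" and "(N,M)" pairs in the output are
allocation group number and block within that group. The function name
damaged_ags and the regular expressions are made up for this example, and the
sample text is a handful of lines copied from the output earlier in this
message.

```python
import re
from collections import Counter

# Patterns for the three message shapes seen in the output; group 1 is
# assumed to be the allocation group (AG) number.
AG_PATTERNS = [
    re.compile(r"in bt(?:bno|cnt) block (\d+)/\d+"),
    re.compile(r"block \((\d+),\d+\) multiply claimed"),
    re.compile(r"agno (\d+), bno \d+"),
]


def damaged_ags(log_text):
    """Count xfs_repair complaints per allocation group number."""
    counts = Counter()
    for line in log_text.splitlines():
        for pattern in AG_PATTERNS:
            match = pattern.search(line)
            if match:
                counts[int(match.group(1))] += 1
                break  # one complaint per line is enough
    return counts


sample = """\
bad magic # 0x20000000 in btbno block 5/1098
bad magic # 0 in btcnt block 6/5518
expected level 1 got 0 in btcnt block 6/5518
block (22,5084) multiply claimed by bno space tree, state - 2
bcnt freespace btree block claimed (state 1), agno 23, bno 21801, suspect 0
bad magic # 0x2000000 in btcnt block 27/16922334
"""

for agno, n in sorted(damaged_ags(sample).items()):
    print(f"AG {agno}: {n} complaint(s)")
```

Feeding it the full saved output instead of the sample would show which
allocation groups are affected and how many complaints each one has.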

Is there anything I can do to recover this file system? Obviously a full
recovery would be ideal, but given the RAID crash that is likely impossible.
At this stage I'd be happy if I could just get the file system to run on
whatever data it's managed to retain. If I have to reformat and transfer data
back onto the system, the transfer will take days.

I've hunted around on the web and found people having similar issues, but not
quite the same, e.g.:

 http://old.nabble.com/bad-magic-and-dubious-inode-td8785248.html#a8785248

Any help appreciated.

Thanks.

Nick
-- 
View this message in context: http://old.nabble.com/XFS-problems-post-RAID-crash-tp32150573p32150573.html
Sent from the Linux Users (linux-sxs.org) mailing list archive at Nabble.com.



