XFS problems post RAID crash

sysadmin sysadmin at insinc.com
Wed Jul 27 14:29:23 PDT 2011


Hi Lonni, thanks for the reply.

I'm using kernel 2.6.25.10 (CentOS 5).

I have made a copy of the data on another server via an NFS mount; the
transfer took 6 days (~3TB of data). As I stated in my original post, I can
reformat this file system and transfer the data back, but if I can make the
current one work it would save me a lot of time. Not to mention that we have
some urgent uses for the space currently being taken up by the backup.
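
For a sense of scale, here is a rough sketch of the average rate that copy
works out to. It assumes "~3TB" means 3 * 10**12 bytes and that the full 6
days were spent transferring; both figures are approximate.

```python
# Back-of-the-envelope average throughput for the NFS copy.
# Assumption: "~3TB" is taken as 3 * 10**12 bytes; the real size is approximate.
DATA_BYTES = 3 * 10**12
DAYS = 6

seconds = DAYS * 24 * 60 * 60          # 518400 seconds in 6 days
rate_mb_s = DATA_BYTES / seconds / 10**6

print(f"average rate: {rate_mb_s:.1f} MB/s")
```

Copying the data back at a similar rate would take roughly the same time,
which is why repairing in place is attractive.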

Nick


netllama wrote:
> 
> You never stated what kernel version you were using.  Anyway, if
> you're able to mount the filesystem read only, then I'm unclear why
> you can't make a copy of that data and use that?
> 
> On Wed, Jul 27, 2011 at 2:20 PM, sysadmin <sysadmin at insinc.com> wrote:
>>
>> Hey all,
>>
>> I had a Linux MD RAID 5 array that had 2 drives go offline due to bad
>> sectors (no data scrubbing was being performed).
>>
>> I've managed to rebuild the array and can mount the XFS file system RO.
>> Some of the files are missing/corrupt but I have managed to transfer many
>> of them off to another system.
>>
>> The only trouble now is that when I mount the file system RW I get
>> problems such as "cannot allocate memory" when doing an ls on a
>> directory, and other such strange problems.
>>
>> Running xfs_repair gives:
>>
>>  10:09 root at servername:~# xfs_repair /dev/md5
>>  Phase 1 - find and verify superblock...
>>  Phase 2 - using internal log
>>         - zero log...
>>         - scan filesystem freespace and inode maps...
>>  bad magic # 0x20000000 in btbno block 5/1098
>>  bad magic # 0 in btcnt block 6/5518
>>  expected level 1 got 0 in btcnt block 6/5518
>>  bad magic # 0 in btcnt block 7/5129
>>  expected level 1 got 0 in btcnt block 7/5129
>>  bad magic # 0x6e73745f in btbno block 8/218842
>>  expected level 0 got 2354 in btbno block 8/218842
>>  bad magic # 0x33340933 in btcnt block 8/218847
>>  expected level 0 got 13104 in btcnt block 8/218847
>>  bad magic # 0x28717476 in btbno block 10/13130259
>>  expected level 1 got 25970 in btbno block 10/13130259
>>  bad magic # 0x2f000000 in btbno block 13/31016602
>>  bad magic # 0 in btcnt block 14/13717213
>>  expected level 1 got 0 in btcnt block 14/13717213
>>  bad magic # 0xf0980300 in btcnt block 15/1358720
>>  bad magic # 0x2e323032 in btbno block 17/28998874
>>  expected level 0 got 11825 in btbno block 17/28998874
>>  bad magic # 0x36332e31 in btcnt block 17/28998875
>>  expected level 0 got 14137 in btcnt block 17/28998875
>>  bad magic # 0 in btbno block 19/91721
>>  block (22,5084) multiply claimed by bno space tree, state - 2
>>  block (22,5085) multiply claimed by bno space tree, state - 2
>>  bcnt freespace btree block claimed (state 1), agno 23, bno 21801, suspect 0
>>  bad magic # 0x2d313509 in btbno block 24/17051
>>  expected level 1 got 12848 in btbno block 24/17051
>>  block (25,22903) multiply claimed by bno space tree, state - 7
>>  block (25,23312) multiply claimed by bno space tree, state - 2
>>  block (25,23313) multiply claimed by bno space tree, state - 2
>>  block (25,23314) multiply claimed by bno space tree, state - 2
>>  block (25,23315) multiply claimed by bno space tree, state - 2
>>  block (25,23316) multiply claimed by bno space tree, state - 2
>>  block (25,23317) multiply claimed by bno space tree, state - 2
>>  bno freespace btree block claimed (state 1), agno 25, bno 22902, suspect 0
>>  bcnt freespace btree block claimed (state 1), agno 25, bno 23400, suspect 0
>>  bad magic # 0x41425442 in btcnt block 27/1293
>>  expected level 1 got 0 in btcnt block 27/1293
>>  bad magic # 0x2000000 in btcnt block 27/16922334
>>
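
One way to read the "bad magic" values above is to decode each 32-bit number
as four ASCII bytes. The sketch below does that for a few values copied from
the output. XFS metadata is big-endian on disk, and the free space btree
magics are "ABTB" (by-block tree) and "ABTC" (by-count tree); the helper
name `magic_to_text` is made up for this example.

```python
# Decode xfs_repair "bad magic" values as four big-endian ASCII bytes.
# magic_to_text is a hypothetical helper, not part of any XFS tool.
import struct

def magic_to_text(magic: int) -> str:
    raw = struct.pack(">I", magic)  # XFS on-disk metadata is big-endian
    # Show printable ASCII as-is and everything else as "."
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)

# Values copied from the xfs_repair output above.
for value in (0x41425442, 0x6E73745F, 0x2E323032, 0x36332E31, 0x20000000):
    print(f"{value:#010x} -> {magic_to_text(value)!r}")
```

Values that decode to readable text (for example "nst_" or "63.1") may
indicate that file data ended up where a btree block was expected, and
0x41425442 ("ABTB") in a btcnt block looks like a by-block tree block
sitting where a by-count block should be. That is only an interpretation of
the numbers, not a diagnosis.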
>> No matter which xfs_repair flags I use (even -d with the server booted in
>> single user mode), the repair hangs at this point, i.e. at this exact bad
>> magic line:
>>
>>  bad magic # 0x2000000 in btcnt block 27/16922334
>>
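
To see how widely the damage is spread, the "bad magic" lines can be grouped
by allocation group, since xfs_repair prints block addresses as agno/bno.
This is a minimal sketch over a few lines copied from the output above; the
regular expression and the function name `bad_magic_per_ag` are assumptions
made for this example, not part of xfs_repair.

```python
# Count "bad magic" reports per allocation group (AG).
import re
from collections import Counter

# A few lines copied from the xfs_repair output, not the full log.
SAMPLE = """\
bad magic # 0x20000000 in btbno block 5/1098
bad magic # 0 in btcnt block 6/5518
bad magic # 0x6e73745f in btbno block 8/218842
bad magic # 0x33340933 in btcnt block 8/218847
bad magic # 0x41425442 in btcnt block 27/1293
bad magic # 0x2000000 in btcnt block 27/16922334
"""

# Groups: magic value, tree type, AG number, block number within the AG.
PATTERN = re.compile(r"bad magic # (\S+) in (btbno|btcnt) block (\d+)/(\d+)")

def bad_magic_per_ag(text):
    counts = Counter()
    for match in PATTERN.finditer(text):
        counts[int(match.group(3))] += 1
    return counts

for agno, n in sorted(bad_magic_per_ag(SAMPLE).items()):
    print(f"AG {agno}: {n} bad magic block(s)")
```

Feeding it the complete log instead of `SAMPLE` would give the per-AG
picture for the whole file system.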
>> Is there anything I can do to recover this file system? Obviously a full
>> recovery would be ideal but, given the RAID crash, is likely impossible.
>> At this stage I'd be happy if I could just get the file system to run on
>> whatever data it's managed to retain. If I have to reformat and transfer
>> data back onto the system, the transfer will take days.
>>
>> I've hunted around on the web and found people having similar issues, but
>> not quite the same, e.g.:
>>
>>  http://old.nabble.com/bad-magic-and-dubious-inode-td8785248.html#a8785248
>>
>> Any help appreciated.
> 
> -- 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> L. Friedman                                    netllama at gmail.com
> LlamaLand                       https://netllama.linux-sxs.org
> 





