page allocation failure

C M Reinehr cmr
Sat Feb 17 09:50:08 PST 2007


On Friday 09 February 2007 13:01, Net Llama! wrote:
> Actually, you should google on this a bit, as it looks like it might not
> be as simple as an OOM condition but some kind of driver buffer overflow,
> where it needs to store something in memory somewhere and doesn't have
> enough room.  One report that I read was discussing someone using jumbo
> frames with a weird MTU.
>
> At any rate, I don't think this is a HW problem.

	I think the driver buffer overflow is the most likely explanation. I couldn't
find an exact match, but I was able to find a number of similar reports.
The fix suggested in those is to increase min_free_kbytes. I did that
yesterday and it seems to have eliminated this particular error. At this
point, though, I don't know whether it was the cause of the two recent
crashes. Another possibility is that the kernel didn't really crash but just
became unresponsive. Anyway, one thing at a time.
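
For anyone who finds this thread later, the reason raising min_free_kbytes is plausible shows up in the zone summary of the quoted log: DMA32 had 1388kB free against a min watermark of 4012kB when the e100 receive path asked for a page from interrupt context. Below is a minimal Python sketch of that comparison. The function names are purely illustrative and not part of any tool; the figures are copied from the log, rejoined onto single lines. The /proc path is the standard location of the setting on Linux, but the helper is left uncalled here.

```python
import re

# Zone summary lines copied from the quoted syslog excerpt
# (rejoined onto single lines; only the watermark fields are kept).
ZONE_LINES = """\
DMA free:3992kB min:36kB low:44kB high:52kB
DMA32 free:1388kB min:4012kB low:5012kB high:6016kB
"""

# Capture the zone name, its free memory and its min watermark.
ZONE_RE = re.compile(r"^(\w+) free:(\d+)kB min:(\d+)kB", re.MULTILINE)


def zones_below_min(text):
    """Return (zone, free_kb, min_kb) for every zone with free < min."""
    return [
        (zone, int(free_kb), int(min_kb))
        for zone, free_kb, min_kb in ZONE_RE.findall(text)
        if int(free_kb) < int(min_kb)
    ]


def read_min_free_kbytes(path="/proc/sys/vm/min_free_kbytes"):
    """Read the current min_free_kbytes setting (Linux only)."""
    with open(path) as fh:
        return int(fh.read().strip())


for zone, free_kb, min_kb in zones_below_min(ZONE_LINES):
    print(f"{zone}: free {free_kb} kB is below min {min_kb} kB")
```

A larger min_free_kbytes raises these watermarks, so the kernel starts reclaiming earlier and keeps more pages in hand for allocations that cannot wait. Whether that also explains the two hangs is a separate question.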

Thanks for the help.

cmr
> On Fri, 9 Feb 2007, C M Reinehr wrote:
> > On Friday 09 February 2007 12:02, Net Llama! wrote:
> >> That looks like an OOM to me, not a HW problem.  How much RAM is in
> >> this box?
> >
> > MemTotal:      1026024 kB
> > SwapTotal:      979704 kB
> >
> > Right now, with no backups running:
> >
> > MemFree:        930624 kB
> > SwapFree:       979704 kB
> >
> > I wonder if something could be preventing the swap from being used.
> >
> > # swapon -s
> > Filename				Type		Size	Used	Priority
> > /dev/md2                                partition	979704	0	-1
> >
> > It's quite possible that I did something wrong 18 months ago that only
> > now is surfacing.
> >
> > TIA
> >
> > cmr
> >
> >> On Fri, 9 Feb 2007, C M Reinehr wrote:
> >>> This morning I came in to the office and found one of my servers DOA.
> >>> This server functions primarily as a backup server, performing a
> >>> nightly network backup to disk & tape and has been working well for the
> >>> past 18 or so months until now. I've found it this way on the past two
> >>> Friday mornings (today & last Friday). On Friday mornings the weekly
> >>> cycle of level Full backups starts at 02:35.
> >>>
> >>> I'm guessing that this is an indication of failing hardware, probably
> >>> memory, but would appreciate anyone's insights.
> >>>
> >>> Following is an excerpt of the syslog. The full syslog, from the time
> >>> the problem started until the system stopped responding, is attached.
> >>>
> >>> Thanks in advance!
> >>>
> >>> cmr
> >>>
> >>> Feb  9 02:54:26 Bilskirnir kernel: swapper: page allocation failure.
> >>> order:0, mode:0x20
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> Feb  9 02:54:26 Bilskirnir kernel: Call Trace: <IRQ>
> >>> <ffffffff8024ee08>{__alloc_pages+664}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff80264f76>{kmem_getpages+70}
> >>> <ffffffff8020a04a>{apic_timer_interrupt+98}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff80265cd1>{cache_grow+177}
> >>> <ffffffff80265ecc>{cache_alloc_refill+348}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff802662db>{__kmalloc+91} <ffffffff804b3fd6>{__alloc_skb+86}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff803e33ad>{e100_rx_alloc_skb+29}
> >>> <ffffffff803e376f>{e100_rx_clean+143}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff803e3aa7>{e100_poll+55} <ffffffff803e2cd0>{e100_watchdog+0}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff804ba299>{net_rx_action+121}
> >>> <ffffffff8022d98b>{__do_softirq+75}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff8020a6a6>{call_softirq+30} <ffffffff8020c361>{do_softirq+49}
> >>> Feb  9 02:54:26 Bilskirnir kernel:        <ffffffff8020c317>{do_IRQ+71}
> >>> <ffffffff80207a90>{default_idle+0}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff80209e14>{ret_from_intr+0} <EOI>
> >>> <ffffffff8053517f>{thread_return+0}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff80207abe>{default_idle+46} <ffffffff80207bd4>{cpu_idle+68}
> >>> Feb  9 02:54:26 Bilskirnir kernel:
> >>> <ffffffff806ff8a6>{start_kernel+358}
> >>> <ffffffff806ff2ac>{x86_64_start_kernel+428}
> >>> Feb  9 02:54:26 Bilskirnir kernel: Mem-info:
> >>> Feb  9 02:54:26 Bilskirnir kernel: DMA per-cpu:
> >>> Feb  9 02:54:26 Bilskirnir kernel: cpu 0 hot: high 0, batch 1 used:0
> >>> Feb  9 02:54:26 Bilskirnir kernel: cpu 0 cold: high 0, batch 1 used:0
> >>> Feb  9 02:54:26 Bilskirnir kernel: DMA32 per-cpu:
> >>> Feb  9 02:54:26 Bilskirnir kernel: cpu 0 hot: high 186, batch 31 used:51
> >>> Feb  9 02:54:26 Bilskirnir kernel: cpu 0 cold: high 62, batch 15 used:56
> >>> Feb  9 02:54:26 Bilskirnir kernel: Normal per-cpu: empty
> >>> Feb  9 02:54:26 Bilskirnir kernel: HighMem per-cpu: empty
> >>> Feb  9 02:54:26 Bilskirnir kernel: Free pages:        5380kB (0kB HighMem)
> >>> Feb  9 02:54:26 Bilskirnir kernel: Active:24574 inactive:214681 dirty:82
> >>> writeback:9471 unstable:0 free:1345 slab:14425 mapped:8866 pagetables:262
> >>> Feb  9 02:54:26 Bilskirnir kernel: DMA free:3992kB min:36kB low:44kB
> >>> high:52kB active:88kB inactive:4396kB present:9384kB pages_scanned:25
> >>> all_unreclaimable? no
> >>> Feb  9 02:54:26 Bilskirnir kernel: lowmem_reserve[]: 0 994 994 994
> >>> Feb  9 02:54:26 Bilskirnir kernel: DMA32 free:1388kB min:4012kB
> >>> low:5012kB high:6016kB active:98208kB inactive:854328kB
> >>> present:1018020kB pages_scanned:0 all_unreclaimable? no
> >>> Feb  9 02:54:26 Bilskirnir kernel: lowmem_reserve[]: 0 0 0 0
> >>> Feb  9 02:54:26 Bilskirnir kernel: Normal free:0kB min:0kB low:0kB
> >>> high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0
> >>> all_unreclaimable? no
> >>> Feb  9 02:54:26 Bilskirnir kernel: lowmem_reserve[]: 0 0 0 0
> >>> Feb  9 02:54:26 Bilskirnir kernel: HighMem free:0kB min:128kB low:128kB
> >>> high:128kB active:0kB inactive:0kB present:0kB pages_scanned:0
> >>> all_unreclaimable? no
> >>> Feb  9 02:54:26 Bilskirnir kernel: lowmem_reserve[]: 0 0 0 0
> >>> Feb  9 02:54:26 Bilskirnir kernel: DMA: 0*4kB 1*8kB 1*16kB 0*32kB
> >>> 0*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3992kB
> >>> Feb  9 02:54:26 Bilskirnir kernel: DMA32: 1*4kB 1*8kB 0*16kB 1*32kB
> >>> 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1388kB
> >>> Feb  9 02:54:26 Bilskirnir kernel: Normal: empty
> >>> Feb  9 02:54:26 Bilskirnir kernel: HighMem: empty
> >>> Feb  9 02:54:26 Bilskirnir kernel: Swap cache: add 1, delete 1, find
> >>> 0/0, race 0+0
> >>> Feb  9 02:54:26 Bilskirnir kernel: Free swap  = 979700kB
> >>> Feb  9 02:54:26 Bilskirnir kernel: Total swap = 979704kB
> >>> Feb  9 02:54:26 Bilskirnir kernel: Free swap:       979700kB
> >>> Feb  9 02:54:26 Bilskirnir kernel: 262128 pages of RAM
> >>> Feb  9 02:54:26 Bilskirnir kernel: 5622 reserved pages
> >>> Feb  9 02:54:26 Bilskirnir kernel: 235851 pages shared
> >>> Feb  9 02:54:26 Bilskirnir kernel: 0 pages swap cached
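
As a sanity check on the figures in the log above, the per-zone free lists can be added up and compared with the totals the kernel printed. The sketch below is only an illustration: the helper names are made up, the lines are copied from the log and rejoined, and a 4 kB page size is assumed for this x86_64 box.

```python
import re

# Buddy-allocator free lists copied from the quoted log, one zone per entry.
BUDDY_LINES = {
    "DMA": "0*4kB 1*8kB 1*16kB 0*32kB 0*64kB 1*128kB 1*256kB "
           "1*512kB 1*1024kB 1*2048kB 0*4096kB = 3992kB",
    "DMA32": "1*4kB 1*8kB 0*16kB 1*32kB 1*64kB 0*128kB 1*256kB "
             "0*512kB 1*1024kB 0*2048kB 0*4096kB = 1388kB",
}

PAGE_KB = 4  # assumed page size in kB


def buddy_total_kb(line):
    """Sum count * block size over the entries left of the '=' sign."""
    blocks = line.partition("=")[0]
    return sum(int(n) * int(size) for n, size in re.findall(r"(\d+)\*(\d+)kB", blocks))


def reported_kb(line):
    """Return the total the kernel printed right of the '=' sign."""
    return int(re.search(r"=\s*(\d+)kB", line).group(1))


for zone, line in BUDDY_LINES.items():
    print(zone, buddy_total_kb(line), reported_kb(line))

# Compare with "Free pages: 5380kB" and "free:1345" (pages) from the log.
total_kb = sum(buddy_total_kb(line) for line in BUDDY_LINES.values())
print(total_kb, 1345 * PAGE_KB)
```

The two zones add up to the "Free pages" total, so roughly 5 MB was free out of 1 GB while over 800 MB sat on the inactive list. That looks more like a momentary shortage of free pages than a machine that was genuinely out of memory.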

-- 
Debian 'Etch' - Registered Linux User #241964
--------
"More laws, less justice." -- Marcus Tullius Cicero, ca. 42 BC