SELinux insanity

Tue Dec 15 08:49:19 PST 2020

On Tuesday, December 15, 2020 12:57:55 AM EST Jay Nugent via Linux-users 
wrote:
> >>        an Ubuntu machine will run just perfectly for 1 to 3 days, then
> >>        suddenly the "Load Average" jumps up to 4.xx or 5.xx and upwards
> >>        of 8.xx???  There is nothing running on the box, it is not
> >>        attached to the Internet, and has no users.  DMESG shows that the
> >>        four CPU's suddenly go idle for 23 or more seconds and then
> >>        magically return to processing after tens of seconds.  The ONLY
> >>        way to get the Load Average down is to reboot.  Any ideas???
> > 
> > That's odd.  What can you tell us about the hardware?  How old, what
> > style/vendor/etc...
>     Nice thing is that it runs on only 15 watts (12.0 volts D.C.) and with
> a Buck/Boost regulator will run nicely on my whole-house 12 VDC Solar
> power system.  A nice replacement to the 19" rack mounted POWER HUNGRY
> server I've been running the past 10 years.

Nice!

>     Wrote the following cronjob that writes pertinent information out to a
> text file every 10 minutes:
> 
> #
> dmesg | tail -10 >> /root/temp-load.txt ;
> sensors | grep Core >> /root/temp-load.txt ;
> cat /proc/cpuinfo | grep "cpu MHz" >> /root/temp-load.txt ;
> netstat -t >> /root/temp-load.txt ;
> uptime >> /root/temp-load.txt ;
> echo "----------------------------------------" >> /root/temp-load.txt
> #
> 
> 
>     After generally 3 days of running the LOAD AVERAGE will suddenly jump
> up high and keyboard response gets really sluggish.  As you can see below,
> the Load Average is pretty normal, then in the next sample DMESG reports
> CPUIDLE errors and the Load Average jumps way up...and stays there.
> 
> 
>   13:00:01 up 3 days, 17:28,  2 users,  load average: 0.01, 0.02, 0.00
> ----------------------------------------
> [322833.697550] ba cf f7 53 e3 a5 9b c4 20 48 89
> [322833.697564]  cpuidle_enter+0x17/0x20
> [322833.697565] d8 48 c1 fb
> [322833.697574]  call_cpuidle+0x23/0x40
> [322833.697575] 3f
> [322833.697582]  do_idle+0x18c/0x1f0
> [322833.697587]  cpu_startup_entry+0x73/0x80
> [322833.697593]  start_secondary+0x1ab/0x200
> [322833.697599]  secondary_startup_64+0xa5/0xb0
> [322833.697603] Code: 65 8b 3d 4d 4d c0 6c e8 28 7c 8b ff 48 89 c3 0f 1f
> 44 00 00 31 ff e8 79 d7 8c ff 45 84 ff 0f 85 e4 01 00 00 fb 66 0f 1f 44 00
> 00 <48> 2b 5d d0 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f
> Core 0:       +35.0°C  (high = +105.0°C, crit = +105.0°C)
> Core 1:       +35.0°C  (high = +105.0°C, crit = +105.0°C)
> Core 2:       +33.0°C  (high = +105.0°C, crit = +105.0°C)
> Core 3:       +33.0°C  (high = +105.0°C, crit = +105.0°C)
> cpu MHz         : 1332.957
> cpu MHz         : 1332.836
> cpu MHz         : 1332.446
> cpu MHz         : 1332.776

What's the kernel trace in dmesg?  That is likely your problem.  Your cron job
only keeps the last 10 lines, so the start of the trace is cut off.  Can you
send more of the dmesg output from when it's acting up?  You can send it to me
directly if the list won't allow it.
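
If it helps, here is a rough sketch of something you could run from the same
cron job to grab the whole dmesg buffer once the load goes high, instead of
only the last 10 lines.  The function names, THRESHOLD, and OUTDIR are all
made up for illustration, and the 2.0 cutoff is just a guess on my part, so
adjust to taste.  It assumes /proc/loadavg is readable and that cron runs it
as root so dmesg is allowed.

```shell
#!/bin/sh
# Sketch only: names and defaults below are assumptions, not anything standard.
THRESHOLD="${THRESHOLD:-2.0}"   # 1-minute load average that triggers a capture
OUTDIR="${OUTDIR:-/root}"       # where the full dmesg dumps are written

# Succeeds (exit 0) when load $1 is strictly greater than limit $2.
# awk is used because plain sh cannot compare decimal numbers.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load + 0 > limit + 0) }'
}

# Reads the 1-minute load average and, if it is over THRESHOLD,
# saves the complete kernel ring buffer to a timestamped file.
capture_if_loaded() {
    read -r load1 _ < /proc/loadavg
    if load_exceeds "$load1" "$THRESHOLD"; then
        dmesg > "$OUTDIR/dmesg-$(date +%Y%m%d-%H%M%S).txt"
    fi
}

# Only does anything when invoked as: script --run
if [ "${1:-}" = "--run" ]; then
    capture_if_loaded
fi
```

Calling it with --run every 10 minutes alongside your existing job should
leave a full dump in /root the first time the load average climbs.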

Thanks,
Matt