OT: SCO Release 5 reboots upon printing from filePro

Brian K. White brian at aljex.com
Thu Jan 30 15:32:24 PST 2014


On 1/21/2014 8:02 PM, fpgroups . wrote:
> Thank you all for replying:
>
> (1) reboot is like pressing reset button, no warning, no shutdown warning,
> nothing at all
> (2) It is an old server but I think I can get a couple memory syms - not
> even sure what kind they are
> (3) I will try all of your suggestions starting with memory (very easy) and
> take it from there


That kind of reboot is, without question, a hardware problem.
The 3 most likely culprits, especially given that it's old, are:

1) weak power supply
2) bad capacitors on the motherboard
3) bad ram

You may _also_ now have filesystem corruption from the hardware going
flaky during operation, but that isn't your real problem, and I would
actually NOT run any fsck until AFTER you fix the hardware.

If you can't fix the hardware, then pull the drives and capture dd
images of them using some other machine which is NOT old and flaky.
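
A minimal sketch of that imaging step, assuming the drive has been moved
to a healthy Linux machine with GNU dd. The function name image_disk,
the device /dev/sdX and the output path are placeholders for
illustration, not anything from the original system.

```shell
#!/bin/sh
# image_disk SRC DST: copy SRC to DST block by block, reading SRC only.
# conv=noerror keeps going past read errors; conv=sync pads short or
# failed reads with zeros so offsets in the image stay aligned.
image_disk() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=64k conv=noerror,sync
}

# Hypothetical usage; confirm the device name before running:
# image_disk /dev/sdX /mnt/backup/osr5-disk0.img
```

Once you have the image, work on a copy of it rather than on the
original drive. Note that with conv=sync a single unreadable sector
zeroes the whole 64k block around it, so a smaller bs loses less data
on a disk that really does have bad spots.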

Basically you do not want to be doing any writing to the disk on that
machine while its brains are defective. Especially not fsck with -y
-ofull! You risk turning a cold into a lobotomy.

How to address each problem above:
1) Weak power supply: There are simple power supply testers at computer
stores or online for $20. Or just get a new power supply of the same
type (AT/ATX, 20-pin/24-pin, P4 aux 4-pin/6-pin, etc.) and of equal or
greater wattage.

2) Bad caps: Visually inspect the motherboard. Look at all the larger
aluminum cans that have plastic heat-shrink on the sides and exposed
flat aluminum tops. There are a few different kinds of capacitors, but
those are the electrolytic kind you are interested in. If any of them
have a swollen or domed top instead of a perfectly flat one, those are
bad. If any of them have brown residue, like something leaked out of
them and dried, those are bad. Don't be confused by inspector marks,
which can look almost the same. You can Google for YouTube videos about
how to identify bad caps.

You can actually replace them. It sounds scary since it requires
soldering on a motherboard, but it's actually pretty easy. Moving to new
hardware or a VM is the better long-term solution of course, but this
can get you out of the jam and give you the luxury of time to do a
better migration. You should be able to find a local repair guy who
will do it without much trouble. Best Buy probably won't do it, and the
very first random local independent computer guy you call might not do
it, but you should find someone within a few calls.

3) Bad RAM: memtest86+, and swapping individual sticks in and out. I.e.,
start by not changing anything: just make a memtest86+ boot CD (not to
be confused with memtest86 without the +), boot the CD, and let memtest
run overnight, or until you see errors, which might be right away, or
after an hour, or never.

If you do see errors right away, make a note of the general address 
ranges, then reboot and see if they happen again in the same area.

If they happen again in the same area, remove one stick and run again.
What you do then obviously depends on what you find. Basically try to
find which sticks are good and which are bad.
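
If you want something more systematic than eyeballing the "same area"
check, here is a rough sketch. It assumes you typed the failing
addresses from each run into a text file by hand, one hex address per
line; memtest86+ itself does not write such files, and the function and
file names are made up for illustration.

```shell
#!/bin/sh
# mb_buckets FILE: print the unique megabyte number of each hex
# address listed in FILE (one address per line, optional 0x prefix).
mb_buckets() {
    while read -r addr; do
        [ -n "$addr" ] || continue
        echo $(( 0x${addr#0x} / 1048576 ))
    done < "$1" | sort -un
}

# same_area RUN1 RUN2: print the megabyte numbers that failed in both
# runs. Each list is already unique, so any value that appears twice
# in the combined list must have come from both files.
same_area() {
    { mb_buckets "$1"; mb_buckets "$2"; } | sort -n | uniq -d
}

# Hypothetical usage:
# same_area run1.txt run2.txt
```

Any megabyte that shows up in both runs is a candidate for a genuinely
bad spot, and that is the cue to start pulling sticks.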

Warning: it's not always easy. I have seen cases where all sticks were
fine by themselves, but they errored when used together. And memtest can
give false positives. I have a few servers where memtest86+ and
memtest86, both latest and older versions, report errors even though
there is actually NO problem with the RAM or the motherboard. The
built-in tests in the BIOS don't show any problem, the OSes run fine,
memory testers that run from within an OS don't show any problem, and
I've never seen any actual problem in years of production. The same
sticks in another machine show clean, and memtest shows the same error
on the original machine no matter what sticks are used. That suggests
it's just some kind of incompatibility between memtest and that
motherboard, or some BIOS features and settings that diddle with memory.

But I don't think I've ever seen or heard of a false negative. If you do
happen to get a clean run for 8 hours, then that should be good RAM, or
at least it's good on that motherboard under those BIOS settings.

A clean run would be getting through all tests at least once, ideally 
several times. That's the "pass" number in memtest. One "pass" is one 
pass through every different test. One pass can take hours. If you leave 
it overnight, you are looking for the pass number to be higher than 1 
and no errors on the screen.

-- 
bkw


>
> Thank you all for your input!!!!
>
>
>
> On Mon, Jan 20, 2014 at 5:47 PM, Kenneth Brody <kenbrody at spamcop.net> wrote:
>
>> On 1/20/2014 4:36 PM, fpgroups . wrote:
>>
>>> Marked OT since the problem stems from the OS not filePro.
>>>
>>
>> :-)
>>
>>
>>> If I do:  l |lp
>>>
>>> directory listing prints no problem ...  If I print a report or anything
>>> from within my filePro application server reboots.
>>>
>>> My printer is defined as lp -dlazen -o raw -s within printer
>>> configuration.
>>>
>>> Same happens if I attempt to run a backup using
>>>
>>> tar cfF - /tmp/list |gzip > /my_full_backup.gz
>>>
>>> One more point is as it gets to "login" prompt after rebooting, it goes
>>> through boot cycle again.  Not sure why but if I let it boot by itself, it
>>> is less likely to reboot before it gives me a "login" prompt - When I go
>>> through the motions of pressing ENTER and CTRL-D, it reboots after
>>> pressing CTRL-D.
>>>
>>> I am in trouble I know, but how/what can I do to at least get a clean
>>> backup?
>>>
>>
>> What sort of a reboot happens?  Is it like pressing the reset button,
>> where the system is suddenly and without warning going through the boot
>> cycle?  Is there a "panic" message prior to the reboot?  (I'm not sure if
>> there's a way to get OSR5 to not auto-reboot after a panic.)  Does it start
>> a "normal" shutdown/restart sequence?  Something else entirely?
>>
>> You might consider running a memory diagnostic on the system -- I've seen
>> bad RAM produce some very strange symptoms over the years.
>>
>> --
>> Kenneth Brody
>>