SCSI madness
Net Llama!
netllama
Thu Jan 13 14:50:39 PST 2005
On Thu, 13 Jan 2005, Bill Campbell wrote:
> On Thu, Jan 13, 2005, Net Llama! wrote:
> >On Thu, 13 Jan 2005, A. Khattri wrote:
> >> On Thu, 13 Jan 2005, Net Llama! wrote:
> >>
> >> > I've been tearing my hair out for the past 4 days trying to deal with SCSI
> >> > errors on a box running FC3.
> >> >
> >> > For the past week or so, whenever I boot up, I see the following error on
> >> > the console:
> >> > sym0: SCSI parity error detected: SCR1=132 DBC=50000000 SBCL=0
> >> >
> >> > When the above error occurs, the box functions just fine, but i fear for
> >> > my data. In fact, sometimes, i don't get the error above, and the
> >> > box boots up fine, and then as soon as the login prompt appears, i start
> >> > seeing SCSI reset/abort errors instead. When the reset/abort errors show
> >> > up then i can't do anything, all input fails to register and the box can
> >> > only be reset.
> >> >
> >> > FWIW, i'm using an old symbios 8951 SCSI controller. I've replaced the
> >> > SCSI drive, the SCSI cable, and now the controller, and yet the fscking
> >> > error persists.
> >> >
> >> > I'm at a loss as to how this error can persist when i've replaced every
> >> > component on the chain.
> >> >
> >> > Anyone have any ideas?
> >>
> >>
> >> Did some Googling and this might be a flaky Symbios driver in the 2.6
> >> kernel:
> >>
> >> http://lists.suse.com/archive/suse-linux-e/2004-Oct/1745.html
> >
> >Hrmmm, maybe, but I've been using a 2.6.x kernel on this box for over a
> >year now, and this problem started about a week ago.
>
> That sounds like a hardware or cabling/termination problem, and given that
> it's SCSI, the first thing I would check is that all the cables are firmly
> seated. Removing and replacing the cables may help to clean the contacts.
> Active terminators are far superior to depending on the internal
> termination of the SCSI devices.
>
> Reset/abort errors may well indicate a failing hard drive. I would suspect
> the electronics on the HD rather than the media. Overheating may well be a
> problem.
I replaced the drive yesterday morning and the problem persisted (and is
actually now worse). I've had this exact setup for about the past 3
years. Nothing in this box has changed other than the version of Linux
during that timespan.

I'm not debating that there is a hardware problem, but I'm at a loss as to
where it can be, since I've replaced the controller, the SCSI cable, the
terminator on the cable, and the drive, and the problem has only grown
worse.
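
To check whether the parity error shows up on every boot or only on some, something like the following could count matching lines in saved console or dmesg output. This is only a rough sketch: the pattern is modelled on the single error line quoted above, other kernel or driver versions may word the message differently, and the names `PARITY_RE` and `find_parity_errors` are made up for illustration.

```python
import re

# Pattern based only on the one console line quoted in this thread
# ("sym0: SCSI parity error detected: SCR1=132 DBC=50000000 SBCL=0").
# Assumption: other parity messages from the driver follow the same layout.
PARITY_RE = re.compile(
    r"(?P<ctrl>sym\d+): SCSI parity error detected: "
    r"SCR1=(?P<scr1>\S+) DBC=(?P<dbc>\S+) SBCL=(?P<sbcl>\S+)"
)


def find_parity_errors(lines):
    """Return one dict of reported values for each matching log line."""
    hits = []
    for line in lines:
        match = PARITY_RE.search(line)
        if match:
            hits.append(match.groupdict())
    return hits
```

Feeding it the lines of a saved log (for example `find_parity_errors(open("dmesg.txt"))`, where `dmesg.txt` is a hypothetical file name) gives a list whose length is the number of parity errors logged, which makes it easier to compare one boot against another after swapping a component.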
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Lonni J Friedman netllama at linux-sxs.org
Linux Step-by-step & TyGeMo http://netllama.ipfox.com