SCSI madness
Tim Wunder
tim
Thu Jan 13 15:02:38 PST 2005
On 1/13/2005 2:50 PM, I believe that Net Llama! wrote:
> On Thu, 13 Jan 2005, Bill Campbell wrote:
>
>>On Thu, Jan 13, 2005, Net Llama! wrote:
>>
>>>On Thu, 13 Jan 2005, A. Khattri wrote:
>>>
>>>>On Thu, 13 Jan 2005, Net Llama! wrote:
>>>>
>>>>
>>>>>I've been tearing my hair out for the past 4 days trying to deal with SCSI
>>>>>errors on a box running FC3.
>>>>>
>>>>>For the past week or so, whenever I boot up, I see the following error on
>>>>>the console:
>>>>>sym0: SCSI parity error detected: SCR1=132 DBC=50000000 SBCL=0
>>>>>
>>>>>When the above error occurs, the box functions just fine, but i fear for
>>>>>my data. In fact, sometimes, i don't get the error above, and the
>>>>>box boots up fine, and then as soon as the login prompt appears, i start
>>>>>seeing SCSI reset/abort errors instead. When the reset/abort errors show
>>>>>up then i can't do anything, all input fails to register and the box can
>>>>>only be reset.
>>>>>
>>>>>FWIW, i'm using an old symbios 8951 SCSI controller. I've replaced the
>>>>>SCSI drive, the SCSI cable, and now the controller, and yet the fscking
>>>>>error persists.
>>>>>
>>>>>I'm at a loss as to how this error can persist when i've replaced every
>>>>>component on the chain.
>>>>>
>>>>>Anyone have any ideas?
>>>>
>>>>
>>>>Did some Googling and this might be a flaky Symbios driver in the 2.6
>>>>kernel:
>>>>
>>>>http://lists.suse.com/archive/suse-linux-e/2004-Oct/1745.html
>>>
>>>Hrmmm, maybe, but i've been using a 2.6.x kernel on this box for over a
>>>year now, and this problem started about a week ago.
>>
>>That sounds like a hardware or cabling/termination problem, and given that
>>it's SCSI, the first thing I would check is that all the cables are firmly
>>seated. Removing and replacing the cables may help to clean the contacts.
>>Active terminators are far superior to depending on the internal
>>termination of the SCSI devices.
>>
>>Reset/abort errors may well indicate a failing hard drive. I would suspect
>>the electronics on the HD rather than the media. Overheating may well be a
>>problem.
>
>
> I replaced the drive yesterday morning and the problem persisted (and is
> actually now worse). I've had this exact setup for about the past 3
> years. Nothing in this box has changed other than the version of Linux
> during that timespan.
>
> I'm not debating that there is a hardware problem, but i'm at a loss as to
> where it can be, since i've replaced the controller, the SCSI cable, the
> terminator on the cable, and the drive and the problem has only grown
> worse.
>
RAM, CPU, Power?
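
If you end up swapping those one at a time, it may help to count how often
the parity error shows up in the kernel log before and after each swap. A
minimal sketch in Python, assuming the message format is exactly as quoted
above (the regex and the function name count_parity_errors are made up for
illustration, not taken from the sym driver):

```python
import re
import sys
from collections import Counter

# Assumed format, copied from the console message quoted in this thread:
#   sym0: SCSI parity error detected: SCR1=132 DBC=50000000 SBCL=0
PARITY_RE = re.compile(
    r"(?P<ctrl>sym\d+): SCSI parity error detected: "
    r"SCR1=(?P<scr1>\w+) DBC=(?P<dbc>\w+) SBCL=(?P<sbcl>\w+)"
)


def count_parity_errors(log_text):
    """Return a Counter mapping controller name to number of parity-error lines."""
    counts = Counter()
    for line in log_text.splitlines():
        match = PARITY_RE.search(line)
        if match:
            counts[match.group("ctrl")] += 1
    return counts


if __name__ == "__main__":
    # Reads kernel log text from stdin and prints one line per controller.
    for ctrl, n in sorted(count_parity_errors(sys.stdin.read()).items()):
        print(f"{ctrl}: {n} parity error(s)")
```

Saved under a name of your choosing (say count_parity.py, hypothetical), it
could be fed the output of dmesg on stdin. It only counts matching lines; it
does not try to interpret the register values.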