[Soekris] NET4501 CF/IDE Controller Problem
Jeff Quast
af.dingo at gmail.com
Fri Jul 28 17:07:21 UTC 2006
On 7/27/06, T Sharpe <beaststwo at yahoo.com> wrote:
> I've had a NET4501, comBIOS ver 1.23, with a 1GB
> IBM/Toshiba Microdrive, running Debian Linux 3.0
> (kernel 2.4.19) for about 2 years. The system is
> stable, except for getting the following log entry:
>
> kernel: hda: irq timeout: status=0xd0 { Busy }
> kernel: ide0: unexpected interrupt, status=0x80,
> count=7
> kernel: ide0: reset: success
>
> In the past, after the system had run for months
> without rebooting, I'd get long strings of these,
> followed by messages of more severity and eventually
> leading to things getting bad enough that the Unix
> daemon I wrote would let the hardware watchdog timer
> reboot the system, clearing the problem. This would
> all happen over the course of 15-40 minutes. This
> last happened after the system had been up for 374
> days with no problems.
>
> Since the last reboot, I'm seeing single "sets" of
> these log messages several times per day. The
> messages indicate to me that the kernel is losing the
> ability to talk to the drive and is resetting the IDE
> interface to fix it.
>
> Since there are no IDE cables to be bad, I must assume
> the problem is either the Microdrive, the NET4501
> interfacing hardware, or the comBIOS. I've not seen
> anything like this mentioned in the soekris-tech
> archives nor in the comBIOS changelogs.
>
> Any ideas on how to tell what the problem is, short of
> giving up and replacing parts until the problem goes
> away?
>
> Thanks!
>
> Tim Sharpe
> beaststwo at yahoo.com
My assumption is that the CF card's sectors have reached their
read/write limit:
- This error occurred very rarely for some time and has become more
  frequent recently.
- With such a long uptime, the card has been in use for a very long time.
- The syslog data you sent me privately indicates that file reads and
  writes happen very often (FTP server?).
Does the Unix watchdog daemon you wrote attempt a read/write and
trigger the watchdog if it fails? If so, that just adds more fuel to
the fire.
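
To make the watchdog question concrete, here is a minimal Python sketch of the kind of daemon I am asking about. It is not a reconstruction of the one you wrote. It assumes the standard Linux watchdog device at /dev/watchdog, where any write resets the timer. The names disk_rw_ok and pet_if_healthy, the probe directory /var/tmp, and the 10-second interval are made up for illustration. The read-back may be served from the page cache, so this mostly exercises the write path, and every probe is itself one more write to the card.

```python
import os
import time


def disk_rw_ok(probe_dir, payload=b"watchdog-probe"):
    """Write a small file, force it to disk, read it back, and compare."""
    path = os.path.join(probe_dir, ".wd_probe")  # hypothetical file name
    try:
        with open(path, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the data out to the device
        with open(path, "rb") as f:
            ok = f.read() == payload
        os.remove(path)
        return ok
    except OSError:
        # Any I/O failure counts as an unhealthy disk.
        return False


def pet_if_healthy(watchdog, probe_dir):
    """Write to the watchdog device only when the disk probe passes."""
    if disk_rw_ok(probe_dir):
        watchdog.write(b"\0")
        watchdog.flush()
        return True
    return False


def main(device="/dev/watchdog", probe_dir="/var/tmp", interval=10):
    # Assumed defaults; if the probe keeps failing, the timer is never
    # reset and the hardware watchdog reboots the box.
    with open(device, "wb", buffering=0) as wd:
        while True:
            pet_if_healthy(wd, probe_dir)
            time.sleep(interval)
```

Calling main() starts the loop. If your daemon works roughly like this, a shorter interval means more writes to the card, which is the "fuel to the fire" I mean.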
I would say it's just a CF card with too many writes. I would replace
it with a $20 CF card and see if the issue persists. I'm betting it
won't. This is a very cheap first step toward finding the root cause.
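
To check whether the issue persists after the swap, something like the following Python sketch could count the "irq timeout" entries per day. It assumes classic syslog lines of the form "Mon DD HH:MM:SS host kernel: ...". The hostname and timestamps in the sample are invented; only the kernel message text is taken from the log quoted above.

```python
import re
from collections import Counter

# Classic syslog prefix: "Jul 28 17:07:21 host kernel: <message>"
LINE_RE = re.compile(
    r"^(\w{3})\s+(\d{1,2})\s+\d{2}:\d{2}:\d{2}\s+\S+\s+kernel:\s+(.*)$"
)


def count_irq_timeouts(lines, needle="irq timeout"):
    """Return a Counter mapping (month, day) to the number of matching lines."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and needle in m.group(3):
            counts[(m.group(1), int(m.group(2)))] += 1
    return counts


# Hypothetical sample; hostname and times are made up for illustration.
sample = [
    "Jul 27 03:12:01 net4501 kernel: hda: irq timeout: status=0xd0 { Busy }",
    "Jul 27 03:12:01 net4501 kernel: ide0: reset: success",
    "Jul 27 14:40:09 net4501 kernel: hda: irq timeout: status=0xd0 { Busy }",
    "Jul 28 09:05:33 net4501 kernel: hda: irq timeout: status=0xd0 { Busy }",
]

for (month, day), n in sorted(count_irq_timeouts(sample).items()):
    print(month, day, n)
```

Feeding it the lines of whichever file your syslog writes kernel messages to, before and after replacing the card, should show whether the daily count drops to zero.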