[Soekris] net6501-70: Corrupt LAN NVM configuration after 1.41c BIOS upgrade

Nix nix at esperi.org.uk
Tue Jun 2 01:32:01 CEST 2015


On 1 Jun 2015, Max C. stated:

> Hello -
> I read about the poor performance of the net6501 with the stock 1.41a BIOS so I upgraded it to 1.41c.  After rebooting, I now get this error during the POST:
>
> PXE-E05: The LAN adapter's NVM configuration is corrupted or has not been
> initialized. The Boot Agent cannot continue.
>
> If I continue and allow FreeBSD 10.1p9 to boot I get these kernel messages on startup, and the em* interfaces go missing:
>
> em0: <Intel(R) PRO/1000 Network Connection 7.4.2> port 0x2000-0x201f mem 0xa1000000-0xa101ffff,0xa1020000-0xa1023fff irq 19 at device 0.0 on pci5
> em0: Using MSIX interrupts with 3 vectors
> em0: The EEPROM Checksum Is Not Valid
> device_attach: em0 attach returned 5
>
> How do I recover from this?  Is there some way to reflash the NIC's BIOS??

Yes, but as I understand it, the EEPROM on these cards is
hardware-specific, tied to the way the card is hooked up to the rest of
the system (it's not a BIOS: it's purely data, no code). Maybe another
net6501 owner could dump their EEPROM? That might provide you with a
starting point (probably differing only in the MAC address, which you
can change at runtime anyway).
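
If someone does supply a dump, it would be worth checking exactly where
it differs from the damaged one before writing anything. A minimal
sketch, assuming both dumps have been saved as raw binary files of the
same length (the function name is made up for illustration):

```python
def differing_offsets(mine: bytes, donor: bytes) -> list[int]:
    """Return the byte offsets at which two raw EEPROM images differ."""
    if len(mine) != len(donor):
        raise ValueError("images must be the same length to compare")
    return [i for i, (a, b) in enumerate(zip(mine, donor)) if a != b]


if __name__ == "__main__":
    # Toy data only: two 8-byte "images" that differ at offsets 1 and 6.
    a = bytes([0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77])
    b = bytes([0x00, 0xAA, 0x22, 0x33, 0x44, 0x55, 0xBB, 0x77])
    print(differing_offsets(a, b))
```

If the only differences sit in the MAC address bytes and the checksum
word, the donor image is probably a reasonable starting point; anything
more widespread deserves a closer look first.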

(I'm afraid I only know how to do this from Linux, where it's
ethtool --eeprom-dump and then loop calling ethtool --change-eeprom to
write each byte in turn: an unpleasant interface. I hope FreeBSD has
something better.)
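
The byte-at-a-time loop can at least be generated rather than typed. The
sketch below only builds the ethtool command lines and does not run
them. It assumes you have the current and wanted images as raw bytes,
and that you already know the device-specific magic value ethtool
demands before it will write (that value is not given here and has to
come from the driver). The function name and the sample values are
hypothetical.

```python
def eeprom_write_commands(dev: str, magic: int,
                          current: bytes, wanted: bytes) -> list[list[str]]:
    """Build one 'ethtool --change-eeprom' command per byte that differs."""
    if len(current) != len(wanted):
        raise ValueError("images must be the same length")
    commands = []
    for offset, (old, new) in enumerate(zip(current, wanted)):
        if old != new:
            commands.append([
                "ethtool", "--change-eeprom", dev,
                "magic", hex(magic),     # caller-supplied, device-specific
                "offset", str(offset),
                "value", hex(new),
            ])
    return commands


if __name__ == "__main__":
    # Toy data: only the byte at offset 1 needs rewriting.
    for cmd in eeprom_write_commands("eth0", 0x12345678,
                                     b"\x00\x01\x02", b"\x00\xff\x02"):
        print(" ".join(cmd))
```

Printing the commands first and reading them over before running any of
them seems prudent, given what a bad write can do.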

Oh, and consider yourself lucky that the thing is still on the PCI bus
at all: if its EEPROM is buggered up badly enough, it can vanish off
the bus entirely, and then you have to remove the EEPROM from the NIC
and stick it in an EEPROM writer, assuming you know what magic bytes to
write in there. Obligatory horror story:
<https://lwn.net/Articles/301251/> <https://lwn.net/Articles/304105/>.
Lines like "restoring trashed e1000e adapters appears to be a hard
problem" and "engineers [at Intel] locked themselves into a lab with a
box full of e1000e adapters" are the sort to make even hardened kernel
hackers blanch.

Whatever drugs people at Intel were on when they decided on the
necessity of an EEPROM that's written to routinely and that, if messed
up, causes this much trouble, they must have been strong. (LWN described
it as "the destruction of hardware", and they're not far wrong.)


One other approach: you may be lucky enough to find that you can just
dike out the EEPROM checksum check from the kernel and everything will
work. It's worth a try -- I don't think it can do any *more* harm.
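
Before patching the kernel, it may help to see how far off the checksum
actually is. The sketch below assumes the convention used in Intel's
e1000-family driver sources: the first 64 16-bit words of the EEPROM,
stored little-endian and ending with the checksum word at 0x3F, must sum
to 0xBABA modulo 2^16. Whether that is the whole story for the NICs on
the net6501 is an assumption, and the function names are made up for
illustration.

```python
import struct

NVM_SUM = 0xBABA        # target sum, per the e1000-family driver sources
CHECKSUM_WORD = 0x3F    # index of the word that holds the checksum


def nvm_words(image: bytes) -> tuple[int, ...]:
    """Decode the first 64 little-endian 16-bit words of a raw image."""
    if len(image) < 128:
        raise ValueError("need at least 128 bytes of EEPROM image")
    return struct.unpack("<64H", image[:128])


def checksum_ok(image: bytes) -> bool:
    """True if words 0x00..0x3F sum to NVM_SUM modulo 2**16."""
    return sum(nvm_words(image)) & 0xFFFF == NVM_SUM


def expected_checksum_word(image: bytes) -> int:
    """Value word 0x3F would need for the checksum to validate."""
    words = nvm_words(image)
    return (NVM_SUM - sum(words[:CHECKSUM_WORD])) & 0xFFFF
```

If checksum_ok() fails but the rest of the image looks sane, the
mismatch may be confined to the checksum word itself, in which case
skipping the check in the kernel has a better chance of working.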

-- 
NULL && (void)

