xerotope
8 Posts
1
July 19th, 2021 13:00
I had some luck with a custom-compiled e1000e module, built from the 3.8.7 release on sourceforge.net. I did have to disable an Ubuntu version check and Secure Boot (until I enroll my own MOK). But it boots, and the wired NIC seems to work!
xerotope
8 Posts
1
July 21st, 2021 19:00
Found a bug report on the Linux kernel bug tracker here: https://bugzilla.kernel.org/show_bug.cgi?id=213667
The included patches also seem to fix the issue. Hopefully those make it into the OEM kernel soon!
xerotope
8 Posts
0
October 28th, 2021 18:00
As an update, a fix did make it into the OEM kernel. So now it won't hang on boot trying to write a correct checksum to the NIC NVM. Yay! https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1936998
But, while it boots right up, it won't load the e1000e driver module if the NVM checksum is bad. Booo! I'm still running a custom patched e1000e module (through DKMS) until Dell and our cyber-support team sort this out.
marrcin
1 Rookie
•
11 Posts
0
November 1st, 2021 11:00
Is there any hope that someone will try to fix this? I don't think we can expect another kernel patch for the case where the NVM checksum is wrong.
Windows doesn't complain and the network card works, so it probably ignores the bad checksum by default.
pindi
1 Rookie
•
10 Posts
0
November 21st, 2021 01:00
Hi,
I have the identical problem on my Precision 7560 with Ubuntu 20.04.
The story: after the first installation of Ubuntu 20.04 (and after updating the kernel to 5.11.0-38-generic), the NIC worked but Wi-Fi did not (Bluetooth was OK). After some searching, I found I had to remove the file iwlwifi-ty-a0-gf-a0.pnvm, and then everything was OK, except for a problem that forced me to replace the motherboard.
After the motherboard replacement, SBAM! The NIC didn't work anymore. The problem was:
e1000e 0000:00:1f.6: The NVM Checksum Is Not Valid
e1000e: probe of 0000:00:1f.6 failed with error -5
Then, after upgrading to kernel 5.11.0-41-generic, I installed the related DKMS fix https://github.com/koljah-de/e1000e-dkms-debian/releases and after a reboot the NIC started working again.
Here is the problem: not every reboot now succeeds. Sometimes the system doesn't boot (with the error BUG: soft lockup - CPU#1 stuck for 23s!) or starts very slowly. If I force a restart at that point instead (holding the power button for 10 seconds), everything proceeds and Ubuntu starts as usual, with the NIC working as expected.
Note that every time I enter the BIOS and exit, the next boot fails, ALWAYS! This also prevents me from booting a live ISO, because the Ubuntu ISOs fail to start with the same problem (so I cannot test whether the NIC works with a different version of Ubuntu).
At the moment, as suggested by xerotope, I am forced to disable the NIC in the BIOS and work only with the Wi-Fi module.
Now some questions:
* Why does the NIC work with the old motherboard and not with the replacement?
* Assuming there is some catch in the current Ubuntu installation after the replacement, how can I be sure that reinstalling Ubuntu will solve the problem, given that I cannot test a live ISO?
Thanks in advance!
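For anyone trying to confirm they are hitting the same failure, here is a minimal Python sketch that scans kernel log text (for example, saved dmesg output) for the two e1000e messages quoted in this thread. The function name and the regular expressions are my own and not part of any tool; they assume the message wording matches the lines quoted here exactly. Error -5 corresponds to -EIO.

```python
import re

# Patterns assume the exact wording quoted in this thread.
CHECKSUM_RE = re.compile(
    r"e1000e (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"The NVM Checksum Is Not Valid"
)
PROBE_RE = re.compile(
    r"e1000e: probe of (?P<dev>[0-9a-f:.]+) failed with error (?P<err>-?\d+)"
)


def find_nvm_checksum_failures(log_text):
    """Map each PCI address that reported a bad NVM checksum to its
    probe error code (or None if no probe failure line was seen)."""
    failures = {}
    for line in log_text.splitlines():
        match = CHECKSUM_RE.search(line)
        if match:
            failures.setdefault(match.group("dev"), None)
            continue
        match = PROBE_RE.search(line)
        if match and match.group("dev") in failures:
            failures[match.group("dev")] = int(match.group("err"))
    return failures


# Sample text built from the two messages reported above.
sample_log = (
    "e1000e 0000:00:1f.6: The NVM Checksum Is Not Valid\n"
    "e1000e: probe of 0000:00:1f.6 failed with error -5\n"
)
print(find_nvm_checksum_failures(sample_log))  # {'0000:00:1f.6': -5}
```

An empty result only means these two messages were not found in the text you passed in; it says nothing about other e1000e problems.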
xerotope
8 Posts
0
November 25th, 2021 06:00
@pindi, here are my current configuration steps:
* Use the Ubuntu OEM kernel packages. This is what the Dell ISO includes, but you can also install them through apt-get. I'm using linux-oem-20.04c, which is based on the 5.13 kernel line.
* Additional Dell OEM packages include oem-somerville-meta and oem-somerville-factory-meta.
Now, the OEM kernels include the fix referenced on Launchpad. However, if your NVM checksum is already bad, the driver will still return the error message when it tries to load the e1000e module. As far as I can tell, this problem is caused either by a bad version of the driver trying to write the checksum in some kernels or by a factory error when programming the NIC.
My current workaround is to manually patch the driver kernel module using the patches from the original kernel bug report https://bugzilla.kernel.org/show_bug.cgi?id=213667 . To keep it up to date, I made a DKMS script to rebuild it. However, I hacked this process together, so I don't currently have any step-by-step instructions.
If I get it fully automated, I'll try to put a gist or something on GitHub and link it here. Good luck!
xerotope
8 Posts
0
November 25th, 2021 07:00
Yeah, the bugfix just stops it from hard locking, but the checksum is still checked and the module won't load.
I'm not sure about a timetable for a fix. My company's IT team has been in contact with Dell, and the issue has been escalated to a higher-level Linux engineering team and confirmed. But there is no path to a fix yet (other than USB-C Ethernet dongles, yuck).
I also tried the ethtool/bootutil route, and it seems the consensus is the EEPROM is read-only, even if write protection is disabled in the module parameters.
So for now, patching the kernel module to bypass the checksum check entirely is my workaround.
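To illustrate what "the checksum is still checked" means: as far as I understand the driver's generic check, the 16-bit words 0x00 through 0x3F of the NVM are summed, and the result, truncated to 16 bits, must equal 0xBABA. Below is a rough Python sketch of that rule. The helper names are hypothetical, the sample data is made up, and it assumes a raw dump with little-endian words; the I219 parts may apply additional checks that this does not model.

```python
import struct
from typing import List, Sequence

NVM_CHECKSUM_WORD = 0x3F  # index of the word that holds the checksum
NVM_SUM = 0xBABA          # required 16-bit sum of words 0x00..0x3F


def words_from_dump(raw: bytes) -> List[int]:
    """Decode the first 64 words of a raw NVM dump.
    Assumption: words are stored little-endian in the dump."""
    count = NVM_CHECKSUM_WORD + 1
    if len(raw) < count * 2:
        raise ValueError("dump is shorter than 64 words (128 bytes)")
    return list(struct.unpack("<%dH" % count, raw[: count * 2]))


def checksum_is_valid(words: Sequence[int]) -> bool:
    """True if words 0x00..0x3F sum to 0xBABA modulo 2**16."""
    total = sum(words[: NVM_CHECKSUM_WORD + 1])
    return (total & 0xFFFF) == NVM_SUM


def expected_checksum_word(words: Sequence[int]) -> int:
    """Value word 0x3F would need for the sum to come out as 0xBABA."""
    return (NVM_SUM - sum(words[:NVM_CHECKSUM_WORD])) & 0xFFFF


# Made-up contents: 63 data words plus a wrong checksum word.
sample = [0x1234] * NVM_CHECKSUM_WORD + [0x0000]
fixed = sample[:NVM_CHECKSUM_WORD] + [expected_checksum_word(sample)]

print(checksum_is_valid(sample))  # False
print(checksum_is_valid(fixed))   # True
```

This only shows the arithmetic. It does not write anything to the NIC, and as noted above the EEPROM appears to be read-only on these machines anyway.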
pindi
1 Rookie
•
10 Posts
0
November 25th, 2021 07:00
Hi @xerotope
If I understand correctly, the bugfix released in these kernel versions resolves the hang during OS boot, but does not bypass the "The NVM Checksum Is Not Valid" error, right?
And for some reason, my old motherboard probably had a NIC with a correct checksum, while the current one does not.
But do you think a complete fix for this problem might be released soon?
Unfortunately, it does not even seem possible to change the checksum with ethtool or bootutil64e (probably because writes are blocked --> "Unable to write default configuration to EEPROM").