UEFI PXE grub boot fails for add-on adapter i350 1Gb interface

Bug #1787637 reported by acd
This bug affects 3 people
Affects  Status   Importance  Assigned to  Milestone
MAAS     Invalid  Undecided   Unassigned

Bug Description

Severity:
During Canonical certification, MAAS fails to enlist the node during UEFI PXE boot when using an add-on NIC with an i350 chipset.

Test environment:
1) The node has only one IPMI port, with add-on NIC interfaces as an option
2) Add-on NIC information: 1Gb link speed I350 Gigabit Network Connection
3) MAAS server OS version:
lsb_release -a:
Distributor ID: Ubuntu
Description: Ubuntu 16.04.5 LTS
Release: 16.04
Codename: xenial

dpkg -l '*maas*'|cat:
||/ Name Version Architecture Description
+++-===============================-====================================-============-==================================================

ii maas 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all "Metal as a Service" is a physical cloud and IPAM
ii maas-cert-server 0.3.4-0~201806080105~ubuntu16.04.1 all Ubuntu certification support files for MAAS server
ii maas-cli 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS client and command-line interface
ii maas-common 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS server common files
ii maas-dhcp 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS DHCP server
ii maas-dns 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS DNS server
ii maas-proxy 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS Caching Proxy
ii maas-rack-controller 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all Rack Controller for MAAS
ii maas-region-api 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all Region controller API service for MAAS
ii maas-region-controller 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all Region Controller for MAAS
ii python3-django-maas 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS python API client (Python 3)
ii python3-maas-provisioningserver 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all MAAS server provisioning libraries (Python 3)
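
When comparing MAAS servers (for example the 16.04 and 18.04 setups in this report), it can help to reduce the `dpkg -l` listing to a package-to-version map and group packages by version. The following is a minimal sketch, assuming whitespace-separated `dpkg -l` rows like the ones above; the function names are illustrative and are not part of MAAS or dpkg.

```python
# Sample rows copied from the listing above (columns are whitespace-separated).
DPKG_ROWS = """\
ii maas 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all "Metal as a Service" is a physical cloud and IPAM
ii maas-cert-server 0.3.4-0~201806080105~ubuntu16.04.1 all Ubuntu certification support files for MAAS server
ii maas-rack-controller 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all Rack Controller for MAAS
ii maas-region-controller 2.3.0-6434-gd354690-0ubuntu1~16.04.1 all Region Controller for MAAS
"""


def parse_dpkg_rows(text):
    """Return {package: version} for installed ('ii') rows of `dpkg -l` output."""
    packages = {}
    for line in text.splitlines():
        # Fields: status, name, version, architecture, description.
        fields = line.split(None, 4)
        if len(fields) >= 4 and fields[0] == "ii":
            packages[fields[1]] = fields[2]
    return packages


def versions_by_package_group(packages):
    """Group package names by version so mismatched components stand out."""
    groups = {}
    for name, version in packages.items():
        groups.setdefault(version, []).append(name)
    return groups


packages = parse_dpkg_rows(DPKG_ROWS)
for version, names in versions_by_package_group(packages).items():
    print(version, "->", ", ".join(sorted(names)))
```

For the rows shown, the core MAAS packages share a single version, and only maas-cert-server (which is versioned separately) falls into its own group.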

Below are the symptoms:
1) Attempt to UEFI PXE boot a node under MAAS
2) The normal UEFI PXE boot messages appear on the screen, followed by a "grub>" prompt
3) Boot-up stops at GRUB and never continues booting under MAAS direction

Additional test:
1) PXE boot in Legacy mode under MAAS: the node with the add-on 1Gb NIC works fine.
2) PXE boot in UEFI mode under MAAS using a 10Gb add-on NIC with the X550T driver: the node was enlisted, commissioned, and deployed successfully.
3) PXE boot in UEFI mode under different PXE servers: the node with the 1Gb add-on NIC works fine.
4) The MAAS server was updated to the Ubuntu 18.04 (Bionic) release: the problem persists.
lsb_release -a:
Description: Ubuntu 18.04.1 LTS
Release: 18.04
Codename: bionic
dpkg -l '*maas*'|grep -i maas:
ii maas 2.4.0~beta2-6865-gec43e47e6-0ubuntu1 all "Metal as a Service" is a physical cloud and IPAM
5) Tried a different form factor AOC (standard PCI) with the same i350 chipset: the node still fails to PXE boot in UEFI mode.

This issue is isolated to systems that have an add-on card (AOC) with the i350 chipset. Systems with an X550T AOC or with an onboard i350 chipset UEFI PXE boot without problems.
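
The test results above can be written down as a small matrix to see which single change makes the boot succeed. This is only an illustrative sketch: the field names and the `single_factor_fixes` helper are hypothetical, and it assumes the onboard i350 and X550T results were obtained in UEFI mode under MAAS, as the description implies.

```python
from collections import namedtuple

# One row per test described in the report; field names are illustrative.
Observation = namedtuple("Observation", "nic boot_mode pxe_server booted")

OBSERVATIONS = [
    Observation("i350 add-on", "UEFI", "MAAS", False),
    Observation("i350 add-on", "Legacy", "MAAS", True),
    Observation("X550T add-on", "UEFI", "MAAS", True),
    Observation("i350 add-on", "UEFI", "other", True),
    Observation("i350 add-on (standard PCI)", "UEFI", "MAAS", False),
    # Assumption: the onboard i350 result was observed in UEFI mode under MAAS.
    Observation("i350 onboard", "UEFI", "MAAS", True),
]

FACTORS = ("nic", "boot_mode", "pxe_server")


def single_factor_fixes(failing, observations):
    """List (factor, value) pairs where changing only that factor booted."""
    fixes = []
    for obs in observations:
        if not obs.booted:
            continue
        changed = [f for f in FACTORS if getattr(obs, f) != getattr(failing, f)]
        if len(changed) == 1:
            fixes.append((changed[0], getattr(obs, changed[0])))
    return fixes


fixes = single_factor_fixes(OBSERVATIONS[0], OBSERVATIONS)
for factor, value in fixes:
    print(f"changing {factor} to {value!r} lets the node boot")
```

Each factor on its own (NIC, boot mode, PXE server) has at least one value that lets the node boot, so the failure appears only for the combination of an i350 add-on card, UEFI mode, and MAAS.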

As a workaround, to get the node to boot in UEFI mode, the BIOS must be configured to attempt IPv6 boot before IPv4. With this unusual setting, UEFI PXE boot works fine with the 1Gb NIC interface. This is the same workaround Rod Smith mentioned in comment #21 on Bug #1437024.

This issue has the same symptom (i.e. UEFI PXE grub boot failure) as several other reported bugs. These reports are as follows:
1. Bug #1437353 2015-03-27 UEFI network boot hangs at grub for adapter 82599ES 10-Gigabit SFI/SFP+
2. Bug #1437024 2015-03-26 Failure to PXE-boot from secondary NIC
3. Bug #1711203 2017-08-16 Deployments fail when Secure Boot enabled
4. Bug #1730493 2017-11-06 MAAS is dropped in grub menu when booting in UEFI mode, and Secure Boot not working

The symptom is reproducible whenever a node PXE boots in UEFI mode through an add-on NIC with the i350 chipset.

Changed in maas:
status: New → Invalid
Andres Rodriguez (andreserl) wrote :

Hi Alec,

I'm marking this bug as Invalid as I believe this is either a grub issue, or a firmware issue.

A. Based on your bug report description:

2) PXE boot at UEFI mode under MAAS using 10Gb add-on NIC with X550T driver, the node was enlisted, commissioned and deployed successfully
3) PXE boot at UEFI mode under different PXE servers, the node with 1Gb add-on NIC works fine.

Since (2) and (3) confirm this works on all hardware but the one you are currently testing, this would lead me to believe there may be an issue with the firmware, more than with grub.

B. However, since:

"As a workaround, in order for the node to boot at UEFI mode, bios configuration needed settings are to boot first at IPv6 before IPv4. By doing this unusual procedure, UEFI PXE boot works fine with 1Gb NIC interface. This is the same workaround mentioned by Rod Smith with his comment #21 on Bug #1437024."

Since the workaround on Bug #1437024 works for you, that would lead me to believe this is a duplicate issue. Bug #1437024 is a *grub* issue.

C. "This issue has the same symptom (i.e. UEFI PXE grub boot failure) as reported by other reported bugs. Such reports are the ff"

The bugs in the list you provided are all grub-related issues.

The one I would lean towards the most is Bug #1711203, if you have Secure Boot enabled. As you can see in that bug report, a fix for the shim is in progress.
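
Whether Secure Boot is enabled can be checked on a booted Linux system (for example, after booting the node with the IPv6-first workaround). The following is a minimal sketch, assuming efivarfs is mounted at its usual location; the helper names are illustrative. `mokutil --sb-state` reports the same state from the command line where mokutil is installed.

```python
from pathlib import Path

# SecureBoot lives under the standard EFI global-variable GUID in efivarfs.
SECURE_BOOT_VAR = Path(
    "/sys/firmware/efi/efivars/SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c"
)


def parse_secure_boot(raw):
    """Interpret efivarfs bytes: 4 attribute bytes followed by a 1-byte value."""
    if len(raw) < 5:
        raise ValueError("unexpected SecureBoot variable length: %d" % len(raw))
    return raw[4] == 1


def secure_boot_enabled(path=SECURE_BOOT_VAR):
    """Return True/False, or None if the variable is absent or unreadable."""
    try:
        return parse_secure_boot(path.read_bytes())
    except OSError:
        # Typical on systems booted in Legacy mode, where efivarfs is missing.
        return None


if __name__ == "__main__":
    print("Secure Boot enabled:", secure_boot_enabled())
```

A result of None usually means the system was not booted in UEFI mode, so Bug #1711203 would not apply in that case.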

That said, I would also consider exploring firmware issues, since all other hardware works fine.

Hope this helps.
