grubnet default grub.cfg should try /grub/grub.cfg-${net_default_mac} before /grub/grub.cfg

Bug #1923268 reported by Lee Trager
This bug affects 4 people
Affects                 Status         Importance  Assigned to  Milestone
MAAS                    Fix Released   Undecided   Unassigned
2.7                     Won't Fix      Undecided   Unassigned
2.8                     Fix Committed  Undecided   Unassigned
2.9                     Fix Released   Undecided   Unassigned
grub2-signed (Ubuntu)   Confirmed      Undecided   Unassigned

Bug Description

MAAS uses the signed network GRUB bootloader when a machine network boots on AMD64 and ARM64. The configuration MAAS produces depends on the machine, which is identified by its MAC address. The default grub.cfg embedded in the bootloader downloads /grub/grub.cfg from the remote host. Because that request doesn't include the MAC address, MAAS serves a default configuration file:

configfile /grub/grub.cfg-${net_default_mac}
configfile /grub/grub.cfg-default-amd64

There are two issues with this:

1. This causes an additional unnecessary request.
2. It assumes an unknown machine is amd64.

Can the default grub.cfg embedded in grubnet<arch>.efi be updated to the following?

configfile /grub/grub.cfg-${net_default_mac}
configfile /grub/grub.cfg-default-<ARCH>
configfile /grub/grub.cfg

This would be similar to what PXELinux[1] does.

[1] https://wiki.syslinux.org/wiki/index.php?title=PXELINUX
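
The lookup order requested above can be sketched as follows. This is only an illustration of the proposal, not GRUB or MAAS code: the function name candidate_configs is hypothetical, and the MAC address and architecture are passed in as plain strings instead of being read from GRUB variables.

```python
def candidate_configs(mac: str, arch: str) -> list[str]:
    """Return config paths in the order the proposal would try them."""
    return [
        f"/grub/grub.cfg-{mac}",           # per-machine config, keyed by boot MAC
        f"/grub/grub.cfg-default-{arch}",  # per-architecture default (enlistment)
        "/grub/grub.cfg",                  # generic fallback
    ]
```

With this order, a known machine is served by the first request, and an unknown machine falls through to a default for its own architecture instead of amd64.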

Tags: maas

Lee Trager (ltrager) wrote :

We have noticed that this is breaking enlistment in MAAS on ARM64. Using ${grub_cpu}[1] may work, but MAAS expects Debian architecture names.

[1] https://www.gnu.org/software/grub/manual/grub/grub.html#grub_005fcpu

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in grub2-signed (Ubuntu):
status: New → Confirmed
tags: added: maas

Dimitri John Ledkov (xnox) wrote :

I am slightly confused about what is being asked here.

The default grub.cfg that is built into grubnet comes from here:

https://git.launchpad.net/~ubuntu-core-dev/grub/+git/ubuntu/tree/debian/build-efi-images?h=ubuntu#n71

Can you please make a merge proposal with exactly what you are asking for?

Because it should already be querying for -default-arm64.

Dimitri John Ledkov (xnox) wrote :

Also, I am kind of confused about why there are different per-arch grub.cfg files for enlistment.

There is no reason to have a different grub.cfg per arch, since the grub.cfg that is served can provide everything for all arches.

Dimitri John Ledkov (xnox) wrote :

Can you please paste the contents of MAAS generated:

grub.cfg-default-amd64
grub.cfg-default-arm64

?

Dimitri John Ledkov (xnox) wrote :

Note that upstream and all distro grubs have moved on to query for:

"""
This patch implements a search for a specific configuration when the config
file is on a remote server. It uses the following order:
   1) DHCP client UUID option.
   2) MAC address (in lower case hexadecimal with dash separators);
   3) IP (in upper case hexadecimal) or IPv6;
   4) The original grub.cfg file.
"""

Is that not sufficient in the future?
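
To make the MAC and IP formats in the quoted patch description concrete, here is a small sketch of the two conversions. It only illustrates the formats as described (lower-case hexadecimal with dash separators, and upper-case hexadecimal for IPv4). The helper names are hypothetical, and the quoted text does not say how these strings are combined into a file name, so that part is left out.

```python
import ipaddress


def mac_dash_lower(mac: str) -> str:
    """Convert a colon-separated MAC to lower-case, dash-separated form."""
    return mac.replace(":", "-").lower()


def ipv4_upper_hex(addr: str) -> str:
    """Render an IPv4 address as eight upper-case hexadecimal digits."""
    return ipaddress.IPv4Address(addr).packed.hex().upper()
```

For example, ipv4_upper_hex("10.229.32.21") gives "0AE52015".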

Lee Trager (ltrager) wrote :

When MAAS gets the TFTP/HTTP request for grub.cfg it doesn't know what the client architecture is. MAAS identifies the machine based on the requested URL. MAAS comes with a default grub.cfg[1] which tries loading /grub/grub.cfg-${net_default_mac} and if that fails /grub/grub.cfg-default-amd64. Machines which have been added to MAAS will match on grub.cfg-${net_default_mac} which already has the architecture defined in Postgres. During enlistment the machine is unknown so /grub/grub.cfg-default-amd64 is used. /grub/grub.cfg-default-amd64 always returns the kernel and initrd for AMD64.

This wasn't a problem previously on ARM64 because we were building an unsigned GRUB which embedded a grub.cfg that tried /grub/grub.cfg-${net_default_mac} then /grub/grub.cfg-default-arm64[2].

MAAS responds to the following:
1. grub/grub.cfg-<MAC> - MAC address in IEEE 802 colon-separated format. There is a note in the MAAS source that GRUB will only send MAC addresses in this format[1]. Because of this, dash-separated MAC addresses are not supported.
2. grub/grub.cfg-default-<ARCH> - ARCH is a Debian architecture supported by MAAS (amd64, arm64, etc.)
3. grub/grub.cfg - This simply tries 1 then 2.

While it's possible to add support for other formats, GRUB comes from the MAAS stream, and we can't force existing users to upgrade.
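
The three request paths listed above could be told apart roughly like this. This is a hedged sketch, not the MAAS implementation: the regular expressions and the classify function are hypothetical and only mirror the description in this comment.

```python
import re
from typing import Optional

# Hypothetical patterns mirroring the three paths described above.
MAC_RE = re.compile(
    r"^/?grub/grub\.cfg-(?P<mac>[0-9a-fA-F]{2}(?::[0-9a-fA-F]{2}){5})$"
)
ARCH_RE = re.compile(r"^/?grub/grub\.cfg-default-(?P<arch>[a-z0-9]+)$")
GENERIC_RE = re.compile(r"^/?grub/grub\.cfg$")


def classify(path: str) -> tuple[str, Optional[str]]:
    """Classify a requested path as a MAC, default-arch, or generic request."""
    m = MAC_RE.match(path)
    if m:
        return ("mac", m.group("mac").lower())
    m = ARCH_RE.match(path)
    if m:
        return ("default-arch", m.group("arch"))
    if GENERIC_RE.match(path):
        return ("generic", None)
    return ("unknown", None)
```

Because the MAC pattern only accepts colons, a dash-separated request such as grub/grub.cfg-aa-bb-cc-dd-ee-ff is classified as unknown, which matches the limitation described in item 1.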

[1] https://git.launchpad.net/maas/tree/src/provisioningserver/boot/uefi_amd64.py?h=2.9#n22
[2] https://git.launchpad.net/maas-images/tree/conf/bootloaders.yaml?id=10c44123886a03f828ee19675164ced83928ba27#n46

dann frazier (dannf) wrote :

Couldn't MAAS generate a grub.cfg that dynamically determines the appropriate fallback architecture?

grub> echo $grub_cpu
arm64

Lee Trager (ltrager) wrote :

MAAS expects the Debian architecture to be used. We could create a map for it and respond as you suggest. However, this would only fix new versions of MAAS. MAAS started using bootloaders from the stream in MAAS 2.1. We no longer support MAAS < 2.5, so those versions would remain broken. It would most likely take a few months to backport and qualify the change for MAAS 2.5-2.9. Patching GRUB to try what MAAS expects would be the fastest way to get a fix to all of our users.
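
The map mentioned above might look something like the sketch below. The table is an assumption for illustration: only the arm64 value of grub_cpu appears in this thread, and the names GRUB_CPU_TO_DEBIAN and debian_arch are hypothetical, not taken from MAAS.

```python
from typing import Optional

# Assumed mapping from GRUB's grub_cpu value to a Debian architecture name.
# Only "arm64" is shown in this thread; the other entries are illustrative.
GRUB_CPU_TO_DEBIAN = {
    "x86_64": "amd64",
    "i386": "i386",
    "arm64": "arm64",
}


def debian_arch(grub_cpu: str) -> Optional[str]:
    """Return the Debian architecture for a grub_cpu value, or None if unknown."""
    return GRUB_CPU_TO_DEBIAN.get(grub_cpu)
```

Returning None for an unmapped value, instead of silently falling back to amd64, avoids repeating the assumption this bug is about.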

dann frazier (dannf) wrote :

Understood, and I see you already covered that above. What I don't follow is what caused MAAS enlistment of ARM to break. I mean, it used to work fine. From reading the tea leaves here, was it that MAAS switched from generating its own grubnet image w/ a built-in grub.cfg to a signed (and therefore immutable) grubnet?

Lee Trager (ltrager) wrote :

lp:maas-images produces the bootloaders stream. It was previously pointed to Bionic for i386(pxelinux), amd64(shim+grub-signed), and arm64(grub) while PPC64(grub) is pointed to Xenial. In an attempt to get secure boot working I upgraded i386, amd64, and arm64 to Focal. arm64 and PPC64 had their grub "built" with grub-mkimage as they don't use the signed package. Part of this process included a custom grub.cfg which did the right thing. Our build host is still using Precise and the team managing it doesn't have time to upgrade it. I noticed that grub-signed is now available for arm64 so I moved arm64 to shim+grub-signed.

We test all changes to the stream; however, our current process doesn't test enlistment, only commissioning, testing, and deployment. We noticed this bug in our CI and filed it. A resolution is being discussed this week at the sprint.

Lee Trager (ltrager) wrote :

The attached MP should fix this in newer versions of MAAS. We still may want to change the default grub.cfg for older versions of MAAS as I'm not sure how far we will be backporting it.

dann frazier (dannf) wrote :

Thanks for the quick turnaround on an MP @ltrager!

Changed in maas:
milestone: none → 3.0.0-rc1
status: New → Fix Committed

dann frazier (dannf) wrote :

Unfortunately, it appears this issue goes beyond just enlistment. I manually added 6 new arm64 systems using the MAAS UI, and made sure I set the architecture to "arm64/generic". They then automatically entered commissioning mode. However, they are also failing to commission. MAAS shows "0/arm64" in the "CORES/ARCH" column, but MAAS' GRUB config still wants to provide them with x86 bits (see below). I was able to proceed by interactively editing the GRUB config (s/amd64/arm64/g; s/(linux|initrd)efi/\1/) on each console. For some reason commissioning that way left the network unconfigured, so I then manually configured a NIC on each system and *re-*commissioned, which did work fine.

                             GNU GRUB version 2.04

setparams 'Commission'

    echo 'Booting under MAAS direction...'
    linuxefi (http,10.229.32.21:5248)/images/ubuntu/amd64/ga-20.04/focal/s\
table/boot-kernel nomodeset ro root=squash:http://10.229.32.21:5248/images/\
ubuntu/amd64/ga-20.04/focal/stable/squashfs ip=::::maas-enlist:BOOTIF ip6=o\
ff overlayroot=tmpfs overlayroot_cfgdisk=disabled cc:\{'datasource_list': [\
'MAAS']\}end_cc cloud-config-url=http://10.229.32.21:5248/MAAS/metadata/lat\
est/enlist-preseed/?op=get_enlist_preseed apparmor=0 log_host=10.229.32.21 \
log_port=5247 --- sysrq_always_enabled BOOTIF=01-${net_default_mac}
    initrdefi (http,10.229.32.21:5248)/images/ubuntu/amd64/ga-20.04/focal/s\
table/boot-initrd
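
The two substitutions used in the manual workaround (s/amd64/arm64/g; s/(linux|initrd)efi/\1/) can be expressed as a short sketch. This only illustrates the edit made by hand on the console and is not something MAAS or GRUB runs; the function name retarget_to_arm64 is hypothetical.

```python
import re


def retarget_to_arm64(grub_entry: str) -> str:
    """Apply the two manual substitutions to a GRUB menu entry."""
    # s/amd64/arm64/g: replace every occurrence.
    text = grub_entry.replace("amd64", "arm64")
    # s/(linux|initrd)efi/\1/: first match on each line, since there is no /g.
    lines = [
        re.sub(r"(linux|initrd)efi", r"\1", line, count=1)
        for line in text.split("\n")
    ]
    return "\n".join(lines)
```

After this edit the entry uses the plain linux and initrd commands and points at the arm64 kernel and initrd paths.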

Lee Trager (ltrager) wrote :

When you added the ARM64 machine, did you include the boot MAC address or just the IPMI credentials? When you add just the IPMI credentials, the machine actually goes into enlistment, but MAAS detects it's a known machine based on the IPMI credentials when the BMC is detected.

The associated branch updates the default grub.cfg file, which is stored with all boot images. These files only get updated when a new image is found on images.maas.io or whatever your mirror is. To force the update you can run:

sudo rm -rf /var/lib/maas/boot-resources/*
sudo systemctl restart maas-rackd

dann frazier (dannf) wrote : Re: [Bug 1923268] Re: grubnet default grub.cfg should try /grub/grub.cfg-${net_default_mac} before /grub/grub.cfg

On Wed, May 5, 2021 at 2:15 PM Lee Trager <email address hidden> wrote:
>
> When you added the ARM64 machine, did you include the boot MAC address
> or just the IPMI credentials? When you add just the IPMI credentials, the
> machine actually goes into enlistment, but MAAS detects it's a known
> machine based on the IPMI credentials when the BMC is detected.

The only MAC address I added was the BMC MAC - I didn't add any host
NIC MACs initially, so your explanation makes sense. Let me know if
it'd be helpful for you if I retested w/ host MACs though.

> The associated branch updates the default grub.cfg file which is stored
> with all boot images. These files only get updated when a new image is
> found on images.maas.io or whatever your mirror is. To force the update
> you can run
>
> sudo rm -rf /var/lib/maas/boot-resources/*
> sudo systemctl restart maas-rackd

Understood. We're still running a snap version, so I won't be able to
easily pick up a source code patch. We plan to switch back to a deb
install so we can be more agile in helping with testing patches, but
we haven't scheduled the downtime yet. I can always set up another
MAAS in the lab w/ debs if you need me to test something though.

  -dann

Changed in maas:
status: Fix Committed → Fix Released

dann frazier (dannf) wrote :

I upgraded to 2.9.3~beta1 (9197-g.afe92bb63), which appears to have the fix integrated, but I'm still seeing the same issue. That is, if I PXE boot an unknown arm64 machine, it is given an arm64 GRUB configured to boot amd64 files. I'm therefore reopening.

dann frazier (dannf) wrote :

hm.. I don't have perms to reopen

dann frazier (dannf) wrote :

Good news everybody! False alarm - it seems to be working w/ 2.9.3-beta1 after waiting a bit. I was certain I had waited for images to sync before testing (according to the UI), so not sure what was needed to lock it in, but it does seem to be good.
