~dannf/+git/maas-preseeds:hidon-re-enable-mofed-and-warn

Last commit made on 2024-02-13
Get this branch:
git clone -b hidon-re-enable-mofed-and-warn https://git.launchpad.net/~dannf/+git/maas-preseeds

Branch information

Name:
hidon-re-enable-mofed-and-warn
Repository:
lp:~dannf/+git/maas-preseeds

Recent commits

cb5e8aa... by dann frazier

hidon: Re-enable MOFED installation and try to warn the cert team

9233282... by Mitchell Augustin

Moved update-initramfs to MOFED install function

update-initramfs is called as part of the MOFED installation process, so the call has been moved into that function.

90cdf29... by Mitchell Augustin

Change late.sh so fabric manager and MOFED are only installed on x86

This aligns with our test plans
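
An architecture gate of this kind could be sketched as below. This is not the code from late.sh; the function name `is_x86` and the echoed messages are hypothetical, and only `uname -m` is a real command.

```shell
#!/bin/sh
# Sketch only: decide whether the x86-only install steps should run.
# is_x86 is a hypothetical helper, not a function from late.sh.
is_x86() {
    case "$1" in
        x86_64|amd64) return 0 ;;
        *) return 1 ;;
    esac
}

arch="$(uname -m)"
if is_x86 "$arch"; then
    echo "x86 system ($arch): would install fabric manager and MOFED"
else
    echo "non-x86 system ($arch): skipping fabric manager and MOFED"
fi
```

Keeping the check in a function that takes the architecture string as an argument makes it possible to exercise both branches without access to an arm64 machine.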

af76449... by Mitchell Augustin

Added || true to "ipmitool sel clear" so MAAS deployment can continue if that command fails

On Hinyari, "ipmitool sel clear" will exit with code 1 and output "Unable to clear SEL: Unspecified error",
which causes the entire deployment to fail.

Since this is not a critical step of the deployment, its failure should not abort the whole run. This change allows the deployment
to continue while still printing the error message to the console.
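
The effect of `|| true` can be illustrated with a small sketch. It assumes the script aborts on the first failing command (for example under `set -e`), which the commit message implies but does not state. `clear_sel` is a hypothetical stand-in that imitates the failing `ipmitool sel clear` call.

```shell
#!/bin/sh
# Sketch only: tolerate a failing non-critical command under "set -e".
set -e

# Hypothetical stand-in for "ipmitool sel clear" on an affected system:
# it prints the error and exits with status 1.
clear_sel() {
    echo "Unable to clear SEL: Unspecified error" >&2
    return 1
}

# Without "|| true", set -e would stop the script at this line.
# The error message is still written to stderr.
clear_sel || true

status="continued"
echo "deployment $status"
```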

efa94af... by Mitchell Augustin

Consolidate GPU driver installation in late.sh and add late.sh call to curtin_userdata_grace

Currently, Grace systems do not have their GPU drivers automatically installed. This commit adds
that step for Grace systems.

Additionally, GPU driver installation will now only run on systems with an NVIDIA GPU installed.
The appropriate open/non-open variant of the driver is selected based on the system specs.
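
One way to detect an NVIDIA GPU is to look for NVIDIA's PCI vendor ID (10de) on a display-class device in `lspci -n` output. The sketch below assumes that approach; the commit does not say how late.sh performs the check, and `has_nvidia_gpu` is a hypothetical name. The criteria for choosing the open or non-open driver variant are not given in the source, so they are left out.

```shell
#!/bin/sh
# Sketch only: detect an NVIDIA GPU from "lspci -n" style output.
# Each input line looks like: "<slot> <class>: <vendor>:<device> ..."
# Display controllers have a PCI class starting with 03; NVIDIA's vendor ID is 10de.
has_nvidia_gpu() {
    awk '$2 ~ /^03/ && $3 ~ /^10de:/ { found = 1 } END { exit !found }'
}

if command -v lspci >/dev/null 2>&1 && lspci -n | has_nvidia_gpu; then
    echo "NVIDIA GPU found: would install the GPU driver"
else
    echo "no NVIDIA GPU detected: skipping GPU driver installation"
fi
```

Reading from standard input keeps the detection logic separate from the hardware query, so it can be checked against sample lines.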

8917098... by dann frazier

Use dmidecode to recognize hidon

$HOSTNAME apparently isn't set during deployment
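
A DMI-based check might look like the sketch below. `dmidecode -s system-product-name` is a real invocation (it needs root), but the product string matched here is a placeholder, since the commit does not record the value hidon actually reports. `platform_from_product` and `PRODUCT_NAME` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: map a DMI product name to a short platform label.
# "Example Hidon Product" is a placeholder, not the real hidon string.
platform_from_product() {
    case "$1" in
        *"Example Hidon Product"*) echo "hidon" ;;
        *) echo "other" ;;
    esac
}

# On real hardware, as root, the product name would come from:
#   product="$(dmidecode -s system-product-name)"
# Here it falls back to the placeholder so the sketch runs anywhere.
product="${PRODUCT_NAME:-Example Hidon Product}"
platform="$(platform_from_product "$product")"
echo "platform: $platform"
```

Unlike `$HOSTNAME`, the DMI tables are provided by firmware, so they do not depend on the deployment environment having configured a hostname.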

39a9951... by dann frazier

Don't install MOFED on hidon until we're done with certification

I've left a reminder to undo this later here:
  https://warthogs.atlassian.net/browse/NVDGX-615

5ee5d98... by dann frazier

Start installing MOFED on DGX systems running 22.04 ('jammy')

Using 23.10-1.1.9.0 per:
  https://docs.google.com/spreadsheets/d/1gsRi-yTwFwdCRV2Gi_Kixsic7ZxwNdlG4P0qkyGyYfs/edit#gid=1099522443

https://warthogs.atlassian.net/browse/NVDGX-543

f735a64... by dann frazier

Add a comment to explain the use of a regex pattern for the kernel package

966e5f5... by dann frazier

Work around low-frequency CPU issue on Grace CPUs

Older Grace firmware will leave the CPUs in a low-frequency state until
the cppc_cpufreq kernel module is loaded. Unfortunately, this module
is not available in the MAAS environment. Because of this, MAAS deployments
on these systems are very slow and often time out.

While we wait for either all of our systems to get the firmware fix - or
for MAAS images to start including cppc_cpufreq - let's manually download
and load the module in an early command.

See: https://warthogs.atlassian.net/browse/NVDGX-546
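
The download-and-load step described above could be sketched roughly as follows. The commit does not say where the module is fetched from, so `MODULE_URL` is a hypothetical variable left empty by default, and the function names and the temporary path are likewise assumptions. `wget -O` and `insmod` are the only external commands used.

```shell
#!/bin/sh
# Sketch only: load cppc_cpufreq early if it is not already present.
# SYSFS_MODULES and MODULE_URL are hypothetical knobs for this sketch.
SYSFS_MODULES="${SYSFS_MODULES:-/sys/module}"
MODULE_URL="${MODULE_URL:-}"

# A loaded module appears as a directory under /sys/module.
module_loaded() {
    [ -d "$SYSFS_MODULES/$1" ]
}

load_cppc_cpufreq() {
    if module_loaded cppc_cpufreq; then
        echo "cppc_cpufreq already loaded"
        return 0
    fi
    if [ -z "$MODULE_URL" ]; then
        echo "MODULE_URL not set: nothing to download"
        return 0
    fi
    # Fetch the prebuilt module and insert it; both steps need root.
    wget -O /tmp/cppc_cpufreq.ko "$MODULE_URL" && insmod /tmp/cppc_cpufreq.ko
}

load_cppc_cpufreq
```

Checking for the module first keeps the step harmless on systems whose firmware or MAAS image already provides it.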