lp:~dannf/+git/maas-preseeds

Owned by dann frazier
Get this repository:
git clone https://git.launchpad.net/~dannf/+git/maas-preseeds

Branches

Name Last Modified Last Commit
hidon-cert-done 2024-03-12 13:46:50 UTC
late.sh: Re-enable MOFED on DGX H100 / hidon

Author: dann frazier
Author Date: 2024-03-12 13:46:50 UTC

late.sh: Re-enable MOFED on DGX H100 / hidon

jammy certification is now complete.

Signed-off-by: dann frazier <dann.frazier@canonical.com>

disable-mofed-repo 2024-03-12 13:42:27 UTC
late.sh: Silence shellcheck warnings about "local" usage

Author: dann frazier
Author Date: 2024-02-26 21:32:04 UTC

late.sh: Silence shellcheck warnings about "local" usage

main 2024-02-26 21:32:04 UTC
late.sh: Silence shellcheck warnings about "local" usage

Author: dann frazier
Author Date: 2024-02-26 21:32:04 UTC

late.sh: Silence shellcheck warnings about "local" usage

hidon-disable-mofed-2 2024-02-20 17:02:37 UTC
Disable MOFED installation on hidon again

Author: dann frazier
Author Date: 2024-02-20 17:02:37 UTC

Disable MOFED installation on hidon again

So cert team can run the iperf cert test.

hidon-disable-mofed 2024-02-20 17:02:37 UTC
Disable MOFED installation on hidon again

Author: dann frazier
Author Date: 2024-02-20 17:02:37 UTC

Disable MOFED installation on hidon again

So cert team can run the iperf cert test.

workaround-lp2037682 2024-02-14 19:41:08 UTC
install_gpu_drivers(): Add additional kernel metapackage patterns

Author: dann frazier
Author Date: 2024-02-14 18:22:59 UTC

install_gpu_drivers(): Add additional kernel metapackage patterns

We're currently missing the pattern for linux-nvidia HWE kernels, which
results in us installing the DKMS package instead of the signed modules.
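
One way to picture the pattern matching described above is the sketch below. The function name `gpu_driver_flavor` and the glob patterns are illustrative assumptions, not the contents of late.sh; only the linux-nvidia HWE case is named in the commit message.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel metapackage is one we expect
# to have pre-built signed NVIDIA modules for, or whether we would fall back
# to the DKMS package. The patterns are assumptions for illustration.
gpu_driver_flavor() {
    case "$1" in
        linux-generic|linux-generic-hwe-*) echo signed ;;
        linux-nvidia|linux-nvidia-hwe-*)   echo signed ;;
        *)                                 echo dkms ;;
    esac
}
```

An unrecognized metapackage falls through to `dkms`, which is the failure mode the commit describes for kernels whose pattern was missing.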

fix-hidon-preseed 2024-02-13 18:18:27 UTC
hidon: don't write to /home/ubuntu, it doesn't exist yet

Author: dann frazier
Author Date: 2024-02-13 18:17:45 UTC

hidon: don't write to /home/ubuntu, it doesn't exist yet

Seen on the console:
[ 452.036488] cloud-init[4510]: + dmidecode -s system-product-name
[ 452.052372] cloud-init[4510]: + [ DGXH100 = DGXH100 ]
[ 452.068371] cloud-init[4510]: + touch /home/ubuntu/JEFF-THIS-HAS-MOFED-INSTALLED-DONT-IPERF
[ 452.088364] cloud-init[4510]: touch: cannot touch '/home/ubuntu/JEFF-THIS-HAS-MOFED-INSTALLED-DONT-IPERF': No such file or directory
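
The failure above comes from touching a file in a directory that has not been created yet. The commit message does not say where the marker was moved, so the following is only a sketch of a defensive pattern; `write_marker` and its fallback argument are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: create a marker file in the preferred directory if it
# already exists, otherwise in a fallback directory that is known to exist.
write_marker() {
    marker_name=$1
    preferred_dir=$2
    fallback_dir=$3
    if [ -d "$preferred_dir" ]; then
        touch "$preferred_dir/$marker_name"
    else
        touch "$fallback_dir/$marker_name"
    fi
}
```

For example, `write_marker SOME-MARKER /home/ubuntu /root` would avoid the error when /home/ubuntu is missing at that point in the deployment.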

hidon-re-enable-mofed-and-warn 2024-02-13 14:50:54 UTC
hidon: Re-enable MOFED installation and try to warn the cert team

Author: dann frazier
Author Date: 2024-02-13 14:50:45 UTC

hidon: Re-enable MOFED installation and try to warn the cert team

hidon-take-2 2024-02-01 16:50:40 UTC
Use dmidecode to recognize hidon

Author: dann frazier
Author Date: 2024-02-01 16:50:40 UTC

Use dmidecode to recognize hidon

$HOSTNAME apparently isn't set during deployment
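
A minimal sketch of identifying the system by its DMI product name rather than by $HOSTNAME follows. The function name `is_dgx_h100` is an assumption; the product string `DGXH100` is the one that appears in the console log for this system elsewhere on this page.

```shell
#!/bin/sh
# Succeeds if the given DMI product name is the DGX H100 string.
# The name is passed as an argument so the check can be exercised on
# machines that do not have dmidecode or root access.
is_dgx_h100() {
    [ "$1" = "DGXH100" ]
}

# Possible use during deployment (requires root and dmidecode):
#   if is_dgx_h100 "$(dmidecode -s system-product-name)"; then
#       echo "running on hidon-class hardware"
#   fi
```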

temp-disable-mofed-hidon 2024-02-01 15:27:04 UTC
Don't install MOFED on hidon until we're done with certification

Author: dann frazier
Author Date: 2024-02-01 15:27:04 UTC

Don't install MOFED on hidon until we're done with certification

I've left a reminder to undo this later here:
  https://warthogs.atlassian.net/browse/NVDGX-615

jammy-mofed 2024-01-30 14:59:50 UTC
Start installing MOFED on DGX systems running 22.04 ('jammy')

Author: dann frazier
Author Date: 2024-01-30 14:59:50 UTC

Start installing MOFED on DGX systems running 22.04 ('jammy')

Using 23.10-1.1.9.0 per:
  https://docs.google.com/spreadsheets/d/1gsRi-yTwFwdCRV2Gi_Kixsic7ZxwNdlG4P0qkyGyYfs/edit#gid=1099522443

https://warthogs.atlassian.net/browse/NVDGX-543

grace-updates 2024-01-03 00:38:28 UTC
Add a comment to explain the use of a regex pattern for the kernel package

Author: dann frazier
Author Date: 2024-01-03 00:38:28 UTC

Add a comment to explain the use of a regex pattern for the kernel package

grace-preseed 2024-01-02 15:49:34 UTC
Add a preseed for grace-based systems

Author: dann frazier
Author Date: 2024-01-02 15:49:34 UTC

Add a preseed for grace-based systems

We are currently using our Grace-based systems as proxies for Grace-based
systems that we do not have in-house. Those systems always use the nvidia-
optimized kernel, so we should do the same on our systems.

This preseed will install the optimized nvidia kernel counterpart of the
generic kernel MAAS has been asked to install (since MAAS does not yet
have the ability to request an optimized kernel itself).

Install the nvidia-64k kernel by default, as this is the one we are testing.
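
The mapping from a generic kernel to its nvidia-optimized counterpart could be sketched as below. The function name `nvidia_counterpart` and the naming scheme (replacing the `linux-generic` prefix with `linux-nvidia-64k`) are assumptions for illustration; the actual preseed is described as using a regex pattern, which is not reproduced here.

```shell
#!/bin/sh
# Hypothetical mapping from the generic kernel metapackage MAAS selected to
# an nvidia-64k counterpart. Any suffix after "linux-generic" is preserved.
# Names that do not start with "linux-generic" are returned unchanged.
nvidia_counterpart() {
    case "$1" in
        linux-generic*) echo "linux-nvidia-64k${1#linux-generic}" ;;
        *)              echo "$1" ;;
    esac
}
```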

clear-sel 2023-08-09 22:47:28 UTC
late.sh: Clear the SEL

Author: dann frazier
Author Date: 2023-08-09 22:46:46 UTC

late.sh: Clear the SEL

https://warthogs.atlassian.net/browse/NVDGX-362

hidon 2023-06-08 21:46:49 UTC
Use the R525 NVIDIA GPU driver branch for jammy

Author: dann frazier
Author Date: 2023-05-30 19:57:01 UTC

Use the R525 NVIDIA GPU driver branch for jammy

Ref: https://canonical.my.salesforce.com/5004K00000TIGSiQAP

oot-driver-updates 2023-06-08 21:46:49 UTC
Use the R525 NVIDIA GPU driver branch for jammy

Author: dann frazier
Author Date: 2023-05-30 19:57:01 UTC

Use the R525 NVIDIA GPU driver branch for jammy

Ref: https://canonical.my.salesforce.com/5004K00000TIGSiQAP

revert-install-nvidia-with-curtin 2023-03-15 15:38:47 UTC
Revert "Use curtin to install linux-nvidia"

Author: dann frazier
Author Date: 2023-03-15 15:36:54 UTC

Revert "Use curtin to install linux-nvidia"

Deployments are failing with:
  E: Unable to locate package linux-nvidia

This reverts commit 1b8e33b87ecd8040182741dc82b27f28e4c067d5.

add-cortez-symlinks 2023-02-14 19:44:56 UTC
Add symlinks for cortez

Author: dann frazier
Author Date: 2023-02-14 18:58:04 UTC

Add symlinks for cortez

disable-dynamic-storage 2022-03-30 15:07:01 UTC
curtin_userdata_dgx: Disable dynamic storage configuration

Author: dann frazier
Author Date: 2022-03-30 15:07:01 UTC

curtin_userdata_dgx: Disable dynamic storage configuration

Currently broken in MAAS 3.2 beta1:
  https://bugs.launchpad.net/maas/+bug/1967008

disable-knem 2021-12-13 21:53:47 UTC
Disable knem autoloading to avoid oops (LP: #1929187)

Author: dann frazier
Author Date: 2021-12-13 21:53:47 UTC

Disable knem autoloading to avoid oops (LP: #1929187)

nps=4 2021-12-09 14:43:08 UTC
late.sh: Require the A100 to be in NPS=4 mode now

Author: dann frazier
Author Date: 2021-12-09 14:43:08 UTC

late.sh: Require the A100 to be in NPS=4 mode now

We're now running performance benchmarks in NPS=4 mode, which is how
these systems are configured from the factory.

drop-lp1926985-workaround 2021-11-04 01:07:44 UTC
m400: drop use of workaround PPA for LP: #1926985

Author: dann frazier
Author Date: 2021-11-04 01:05:16 UTC

m400: drop use of workaround PPA for LP: #1926985

The equivalent SRUs have now been released into the archive. LP: #1949048

knem-workaround 2021-10-22 20:55:21 UTC
Workaround issue w/ knem-dkms

Author: dann frazier
Author Date: 2021-10-22 20:55:21 UTC

Workaround issue w/ knem-dkms

Upon install it only builds against the running kernel, which is not
necessarily the installed kernel.

http://partners.nvidia.com/bug/viewbug/3409217
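
The commit message does not show the workaround itself. One possible approach, sketched here under that caveat, is to enumerate every installed kernel and ask DKMS to build for each one instead of only the running kernel. The helper name `installed_kernels` is hypothetical.

```shell
#!/bin/sh
# List kernel versions that have a module tree under the given directory
# (normally /lib/modules), one per line.
installed_kernels() {
    for dir in "$1"/*/; do
        if [ -d "$dir" ]; then
            basename "$dir"
        fi
    done
}

# Possible use (requires root and dkms):
#   for k in $(installed_kernels /lib/modules); do
#       dkms autoinstall -k "$k"
#   done
```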
