Merge ~tai271828/+git/maas-preseeds:mr-linux-nvidia-curtin-templating into ~ce-hyperscale/+git/maas-preseeds:main

Proposed by Taihsiang Ho
Status: Merged
Approved by: dann frazier
Approved revision: f4066bc0bb419e630328e4886e299f9dde13006b
Merged at revision: de45f843188160f36ff7a556d04442efce0b786d
Proposed branch: ~tai271828/+git/maas-preseeds:mr-linux-nvidia-curtin-templating
Merge into: ~ce-hyperscale/+git/maas-preseeds:main
Diff against target: 78 lines (+34/-14)
2 files modified
curtin_userdata_dgx (+5/-0)
late.sh (+29/-14)
Reviewer        Review Type    Date Requested    Status
dann frazier                                     Approve
Review via email: mp+441716@code.launchpad.net

Description of the change

Per our discussion in https://warthogs.atlassian.net/browse/NVDGX-108?focusedCommentId=227032 , let's apply the curtin templating workaround.

- I have deployed blanka with jammy and focal, and they get the corresponding linux-nvidia and linux-generic kernels.
- I have not yet had a chance to try akis to check for regressions, as it was occupied by kernel testing.
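
For readers skimming the diff, the effect of the template condition added to curtin_userdata_dgx can be sketched in shell. This is only an illustration of the release check: `kernel_package_for_release` is a hypothetical helper that is not part of this repository, and "curtin default" simply stands for whatever kernel would be installed when the template emits no `kernel:` stanza.

```shell
#!/bin/sh
# Hypothetical helper (not part of maas-preseeds): mirrors the release check
# in the curtin_userdata_dgx template.
kernel_package_for_release() {
    case "$1" in
        xenial|bionic|focal)
            # The template emits no kernel stanza for these releases.
            echo ""
            ;;
        *)
            # Any other release gets the linux-nvidia kernel package.
            echo "linux-nvidia"
            ;;
    esac
}

for release in focal jammy; do
    pkg="$(kernel_package_for_release "$release")"
    echo "${release}: ${pkg:-<curtin default>}"
done
```

Under these assumptions, the loop reports linux-nvidia for jammy and the default for focal, which matches the two deployments described above.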

dann frazier (dannf) wrote :

I'd suggest cherry-picking the original versions of the first 3 commits instead of reverting the revert commits, but with only that change, +1!

review: Approve
Taihsiang Ho (tai271828) wrote :

Rebased following the cherry-picking suggestion. Landing the MR.

Taihsiang Ho (tai271828) wrote :

Merged. Deploy to Needham MAAS and Taipei MAAS.

Taihsiang Ho (tai271828) wrote :

Deployed*

Preview Diff

diff --git a/curtin_userdata_dgx b/curtin_userdata_dgx
index 7380386..d1c2a4e 100644
--- a/curtin_userdata_dgx
+++ b/curtin_userdata_dgx
@@ -97,6 +97,11 @@ debconf_selections:
 swap:
   size: 0
 
+{{if release not in ["xenial", "bionic", "focal"]}}
+kernel:
+  package: linux-nvidia
+{{endif}}
+
 late_commands:
   maas: ['wget', '--no-proxy', '{{node_disable_pxe_url}}', '--post-data', '{{node_disable_pxe_data}}', '-O', '/dev/null']
   01_maas_preseed: ['curtin', 'in-target', '--', 'git', 'clone', 'https://git.launchpad.net/~ce-hyperscale/+git/maas-preseeds', '/tmp/maas-dgx-preseed']
diff --git a/late.sh b/late.sh
index 37be944..29cfe7e 100755
--- a/late.sh
+++ b/late.sh
@@ -18,9 +18,9 @@ install_mellanox_ofed() {
             MELLANOX_OFED_VERSION=4.9-2.2.4.0
             ;;
         *)
-            # nvidia seems to make future DGX MOFED-less, and DGX OS does not
-            # install MOFED. Let's try to have the same performance
-            # characteristics for jammy and above.
+            # nvidia seems to make future DGX MOFED-less, and DGX OS starting
+            # with jammy does not install MOFED. Let's try to have the same
+            # performance characteristics for jammy and above.
             return 0
     esac
     mlnx_url="${MLNX_REPO}/${MELLANOX_OFED_VERSION}/ubuntu${ubuntu_ver}/mellanox_mlnx_ofed.list"
@@ -127,18 +127,33 @@ fi
 
 pkgs="nvidia-utils-${NVIDIA_DRIVER_VERSION}"
 pkgs="${pkgs} nvidia-kernel-source-${NVIDIA_DRIVER_VERSION}"
-ubuntu_ver="$(lsb_release -rs)"
+# For nvidia flavor it is a bit tricky since the nomenculate is not consistent
+# with generic and other flavors fully. For example,
+# $ dpkg-query -W -f '${Package}\n' linux-nvidia*
+# linux-nvidia
+# linux-nvidia-headers-5.15.0-1015
+# linux-nvidia-source-5.15.0
+# linux-nvidia-tools
+# but the generic headers would look like
+# linux-headers-generic
+# linux-headers-5.15.0-25-generic
+# even more, this is the hwe flavor for example:
+# linux-headers-generic-hwe-20.04
+#
+# So using similar pattern to generalize regex will not work.
+#
+# Lastly, brace expansion is not POSIX. Dash does not support brace expansion
+# so we do not use something like:
+# linux-{generic,nvidia}
+# and let us use the other approaches like a for-loop instead.
+for possible_flavor_pattern in linux-generic* linux-nvidia; do
+    for metapkg in $(dpkg-query -W -f '${Package}\n' "${possible_flavor_pattern}"); do
+        flavor=${metapkg#linux-}
+        pkgs="$pkgs linux-modules-nvidia-${NVIDIA_DRIVER_VERSION}-${flavor}"
+        echo $pkgs
+    done
+done
 
-case $ubuntu_ver in
-    22.04)
-        pkgs="$pkgs linux-modules-nvidia-${NVIDIA_DRIVER_VERSION}-nvidia linux-nvidia"
-        ;;
-    *)
-        for metapkg in $(dpkg-query -W -f '${Package}\n' "linux-generic*"); do
-            flavor=${metapkg#linux-}
-            pkgs="$pkgs linux-modules-nvidia-${NVIDIA_DRIVER_VERSION}-${flavor}"
-        done
-esac
 apt install -y ${pkgs}
 install_fabric_manager
 install_mellanox_ofed
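
The flavor derivation in the new late.sh loop relies on POSIX prefix stripping (`${metapkg#linux-}`). A minimal sketch of that step, runnable without dpkg, might look like the following. The driver version `525` and the `list_installed_metapkgs` function are assumptions made for illustration; late.sh sets the real version and queries dpkg-query for the installed meta packages.

```shell
#!/bin/sh
# Assumed value for illustration only; late.sh defines the real one.
NVIDIA_DRIVER_VERSION=525

# Hypothetical stand-in for the dpkg-query calls, so the sketch runs anywhere.
list_installed_metapkgs() {
    printf '%s\n' linux-generic linux-generic-hwe-20.04 linux-nvidia
}

modules_pkgs=""
for metapkg in $(list_installed_metapkgs); do
    # Remove the leading "linux-" to obtain the flavor, e.g. "generic-hwe-20.04".
    flavor=${metapkg#linux-}
    modules_pkgs="${modules_pkgs} linux-modules-nvidia-${NVIDIA_DRIVER_VERSION}-${flavor}"
done
# Drop the single leading space left by the first concatenation.
modules_pkgs=${modules_pkgs# }
echo "$modules_pkgs"
```

Because the loop iterates over explicit words rather than a brace expansion such as `linux-{generic,nvidia}`, it behaves the same under dash and bash.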
