Merge ~ian-may/+git/autotest-client-tests:ubuntu_nvidia_fs-v2 into ~canonical-kernel-team/+git/autotest-client-tests:master

Proposed by Ian May
Status: Merged
Approved by: Po-Hsu Lin
Approved revision: eca2cad39ae03b2edfd7b57a493f4b68fcf7b2b5
Merge reported by: Po-Hsu Lin
Merged at revision: eca2cad39ae03b2edfd7b57a493f4b68fcf7b2b5
Proposed branch: ~ian-may/+git/autotest-client-tests:ubuntu_nvidia_fs-v2
Merge into: ~canonical-kernel-team/+git/autotest-client-tests:master
Diff against target: 279 lines (+185/-12)
6 files modified
ubuntu_nvidia_fs/control (+12/-0)
ubuntu_nvidia_fs/nvidia-module-lib (+96/-0)
ubuntu_nvidia_fs/ubuntu_nvidia_fs.py (+35/-0)
ubuntu_nvidia_fs/ubuntu_nvidia_fs.sh (+41/-0)
ubuntu_nvidia_server_driver/control (+1/-2)
ubuntu_nvidia_server_driver/ubuntu_nvidia_server_driver.py (+0/-10)
Reviewer          Review Type   Date Requested   Status
Francis Ginther                                  Approve
Po-Hsu Lin                                       Approve
Review via email: mp+430179@code.launchpad.net

This proposal supersedes a proposal from 2022-08-18.

Commit message

Not all DGX systems need 'nvidia-fs' to be run, so I'd like to decouple it from the 'nvidia driver load' test. No functional change to the test.

Description of the change

Thanks for the feedback! For v2, I changed the patch to use 'git mv' to move the nvidia-fs test to a newly created 'ubuntu_nvidia_fs' test directory. I also removed the nvidia-fs triggers from the 'ubuntu_nvidia_server_driver' test. Everything should be properly cleaned up now.

I tested both tests on DGX systems and behavior was as expected.

Revision history for this message
Po-Hsu Lin (cypressyew) wrote : Posted in a previous version of this proposal

Hi Ian,
overall it's looking good. +1 on this.

Some cleanup questions:
 * Do you still want to keep the nvidia-fs/ in ubuntu_nvidia_server_driver?
 * Also, these lines in ubuntu_nvidia_server_driver.py:

22 def run_nvidia_fs_in_lxc(self):
23 cmd = os.path.join(p_dir, "./nvidia-fs/a-c-t-entry.sh")
24 utils.system(cmd)

The same goes for the test_name if statement that checks for nvidia-fs.

It's rather trivial, so I am OK with either keeping or removing these.

review: Approve
Revision history for this message
Francis Ginther (fginther) wrote : Posted in a previous version of this proposal

As Sam asked, is there any reason to keep nvidia-fs under ubuntu_nvidia_server_driver? And if not, why not use 'git mv' on these files to preserve their git history?

Functionally everything looks fine.

review: Needs Information
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

+1, thanks for updating this.

review: Approve
Revision history for this message
Francis Ginther (fginther) wrote :

+1

review: Approve
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Preview Diff

diff --git a/ubuntu_nvidia_fs/control b/ubuntu_nvidia_fs/control
new file mode 100644
index 0000000..75d21a4
--- /dev/null
+++ b/ubuntu_nvidia_fs/control
@@ -0,0 +1,12 @@
+AUTHOR = 'Taihsiang Ho <taihsiang.ho@canonical.com>'
+TIME = 'SHORT'
+NAME = 'nvidia-fs module test'
+TEST_TYPE = 'client'
+TEST_CLASS = 'General'
+TEST_CATEGORY = 'Smoke'
+
+DOC = """
+Perform testing of nvidia-fs module
+"""
+
+job.run_test_detail('ubuntu_nvidia_fs', test_name='nvidia-fs', tag='nvidia-fs', timeout=1500)
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/00-vars b/ubuntu_nvidia_fs/nvidia-fs/00-vars
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/00-vars
rename to ubuntu_nvidia_fs/nvidia-fs/00-vars
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/01-run-test.sh b/ubuntu_nvidia_fs/nvidia-fs/01-run-test.sh
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/01-run-test.sh
rename to ubuntu_nvidia_fs/nvidia-fs/01-run-test.sh
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/02-inside-vm-update-kernel.sh b/ubuntu_nvidia_fs/nvidia-fs/02-inside-vm-update-kernel.sh
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/02-inside-vm-update-kernel.sh
rename to ubuntu_nvidia_fs/nvidia-fs/02-inside-vm-update-kernel.sh
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/03-inside-vm-install-drivers.sh b/ubuntu_nvidia_fs/nvidia-fs/03-inside-vm-install-drivers.sh
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/03-inside-vm-install-drivers.sh
rename to ubuntu_nvidia_fs/nvidia-fs/03-inside-vm-install-drivers.sh
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/04-inside-vm-setup-docker-and-run-test.sh b/ubuntu_nvidia_fs/nvidia-fs/04-inside-vm-setup-docker-and-run-test.sh
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/04-inside-vm-setup-docker-and-run-test.sh
rename to ubuntu_nvidia_fs/nvidia-fs/04-inside-vm-setup-docker-and-run-test.sh
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/05-inside-docker-run-test.sh b/ubuntu_nvidia_fs/nvidia-fs/05-inside-docker-run-test.sh
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/05-inside-docker-run-test.sh
rename to ubuntu_nvidia_fs/nvidia-fs/05-inside-docker-run-test.sh
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/README b/ubuntu_nvidia_fs/nvidia-fs/README
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/README
rename to ubuntu_nvidia_fs/nvidia-fs/README
diff --git a/ubuntu_nvidia_server_driver/nvidia-fs/a-c-t-entry.sh b/ubuntu_nvidia_fs/nvidia-fs/a-c-t-entry.sh
similarity index 100%
rename from ubuntu_nvidia_server_driver/nvidia-fs/a-c-t-entry.sh
rename to ubuntu_nvidia_fs/nvidia-fs/a-c-t-entry.sh
diff --git a/ubuntu_nvidia_fs/nvidia-module-lib b/ubuntu_nvidia_fs/nvidia-module-lib
new file mode 100644
index 0000000..06141bf
--- /dev/null
+++ b/ubuntu_nvidia_fs/nvidia-module-lib
@@ -0,0 +1,96 @@
+# Copyright 2021 Canonical Ltd.
+# Written by:
+# Dann Frazier <dann.frazier@canonical.com>
+# Taihsiang Ho <taihsiang.ho@canonical.com>
+#
+# shellcheck shell=bash
+module_loaded() {
+    module="$1"
+    # Check linux/include/linux/module.h for module_state enumeration
+    # There are the other states like Loading and Unloading besides Live. The
+    # other states usually only take only few microseconds but let's specify
+    # Live explicitly.
+    grep "^${module} " /proc/modules | grep -q Live
+}
+
+get_module_field() {
+    local module="$1"
+    local field="$2"
+    # shellcheck disable=SC2034
+    read -r mod size usecnt deps rest < <(grep "^${module} " /proc/modules)
+    case $field in
+        usecnt)
+            echo "$usecnt"
+            ;;
+        deps)
+            if [ "$deps" = "-" ]; then
+                return 0
+            fi
+            echo "$deps" | tr ',' ' '
+            ;;
+        *)
+            return 1
+    esac
+}
+
+module_in_use() {
+    module="$1"
+
+    usecnt="$(get_module_field "$module" usecnt)"
+
+    if [ "$usecnt" -eq 0 ]; then
+        return 1
+    fi
+    return 0
+}
+
+recursive_remove_module() {
+    local module="$1"
+
+    if ! module_loaded "$module"; then
+        return 0
+    fi
+
+    if ! module_in_use "$module"; then
+        sudo rmmod "$module"
+        return 0
+    fi
+
+    if [ "$(get_module_field "$module" deps)" = "" ]; then
+        echo "ERROR: $module is in use, but has no reverse dependencies"
+        echo "ERROR: Maybe an application is using it."
+        exit 1
+    fi
+    beforecnt="$(get_module_field "$module" usecnt)"
+    for dep in $(get_module_field "$module" deps); do
+        recursive_remove_module "$dep"
+    done
+    aftercnt="$(get_module_field "$module" usecnt)"
+    if [ "$beforecnt" -eq "$aftercnt" ]; then
+        echo "ERROR: Unable to reduce $module use count"
+        exit 1
+    fi
+    recursive_remove_module "$module"
+}
+
+uninstall_all_nvidia_mod_pkgs() {
+    for pkg in $(dpkg-query -f "\${Package}\n" -W 'linux-modules-nvidia-*'); do
+        sudo apt remove --purge "$pkg" -y
+    done
+    if sudo modinfo nvidia; then
+        echo "ERROR: Uninstallation of all nvidia modules failed."
+        exit 1
+    fi
+}
+
+product="$(sudo dmidecode -s baseboard-product-name)"
+pkg_compatible_with_platform() {
+    local pkg="$1"
+    branch="$(echo "$pkg" | cut -d- -f4)"
+
+    if [ "$product" = "DGXA100" ] && [ "$branch" -le "418" ]; then
+        return 1
+    fi
+
+    return 0
+}
diff --git a/ubuntu_nvidia_fs/ubuntu_nvidia_fs.py b/ubuntu_nvidia_fs/ubuntu_nvidia_fs.py
new file mode 100644
index 0000000..77ac0bb
--- /dev/null
+++ b/ubuntu_nvidia_fs/ubuntu_nvidia_fs.py
@@ -0,0 +1,35 @@
+import os
+from autotest.client import test, utils
+
+p_dir = os.path.dirname(os.path.abspath(__file__))
+sh_executable = os.path.join(p_dir, "ubuntu_nvidia_fs.sh")
+
+
+class ubuntu_nvidia_fs(test.test):
+    version = 1
+
+    def initialize(self):
+        pass
+
+    def setup(self):
+        cmd = "{} setup".format(sh_executable)
+        utils.system(cmd)
+
+    def run_nvidia_fs_in_lxc(self):
+        #cmd = os.path.join(p_dir, "./nvidia-fs/a-c-t-entry.sh")
+        #utils.system(cmd)
+        cmd = "{} test".format(sh_executable)
+        utils.system(cmd)
+
+    def run_once(self, test_name):
+        print("HELLO WORLD")
+        if test_name == "nvidia-fs":
+            self.run_nvidia_fs_in_lxc()
+
+            print("")
+            print("{} has run.".format(test_name))
+
+        print("")
+
+    def postprocess_iteration(self):
+        pass
diff --git a/ubuntu_nvidia_fs/ubuntu_nvidia_fs.sh b/ubuntu_nvidia_fs/ubuntu_nvidia_fs.sh
new file mode 100755
index 0000000..62b1a29
--- /dev/null
+++ b/ubuntu_nvidia_fs/ubuntu_nvidia_fs.sh
@@ -0,0 +1,41 @@
+#!/usr/bin/env bash
+#
+# perform Nvidia driver load testing and corresponding pre-setup.
+#
+
+set -eo pipefail
+
+setup() {
+    # pre-setup testing environment and necessary tools
+    # currently there is nothing practically but will be used possibly in the future.
+    echo "begin to pre-setup testing"
+}
+
+run_test() {
+    exe_dir=$(dirname "${BASH_SOURCE[0]}")
+    pushd "${exe_dir}"
+    #./test-each-nvidia-server-driver.sh
+    ./nvidia-fs/a-c-t-entry.sh
+    popd
+}
+
+case $1 in
+    setup)
+        echo ""
+        echo "On setting up necessary test environment..."
+        echo ""
+        setup
+        echo ""
+        echo "Setting up necessary test environment..."
+        echo ""
+        ;;
+    test)
+        echo ""
+        echo "On running test..."
+        echo ""
+        run_test
+        echo ""
+        echo "Running test..."
+        echo ""
+        ;;
+esac
diff --git a/ubuntu_nvidia_server_driver/control b/ubuntu_nvidia_server_driver/control
index a88eff0..3052a3c 100644
--- a/ubuntu_nvidia_server_driver/control
+++ b/ubuntu_nvidia_server_driver/control
@@ -9,5 +9,4 @@ DOC = """
 Perform testing of Nvidia server drivers
 """
 
-job.run_test_detail('ubuntu_nvidia_server_driver', test_name='nvidia-fs', tag='nvidia-fs', timeout=1500)
-job.run_test_detail('ubuntu_nvidia_server_driver', test_name='load', tag='load', timeout=600)
+job.run_test_detail('ubuntu_nvidia_server_driver', test_name='load', tag='load', timeout=1200)
diff --git a/ubuntu_nvidia_server_driver/ubuntu_nvidia_server_driver.py b/ubuntu_nvidia_server_driver/ubuntu_nvidia_server_driver.py
index 6a6f4c5..d0c667a 100644
--- a/ubuntu_nvidia_server_driver/ubuntu_nvidia_server_driver.py
+++ b/ubuntu_nvidia_server_driver/ubuntu_nvidia_server_driver.py
@@ -19,10 +19,6 @@ class ubuntu_nvidia_server_driver(test.test):
         cmd = "{} test".format(sh_executable)
         utils.system(cmd)
 
-    def run_nvidia_fs_in_lxc(self):
-        cmd = os.path.join(p_dir, "./nvidia-fs/a-c-t-entry.sh")
-        utils.system(cmd)
-
     def run_once(self, test_name):
         if test_name == "load":
             self.compare_kernel_modules()
@@ -30,12 +26,6 @@ class ubuntu_nvidia_server_driver(test.test):
             print("")
             print("{} has run.".format(test_name))
 
-        elif test_name == "nvidia-fs":
-            self.run_nvidia_fs_in_lxc()
-
-            print("")
-            print("{} has run.".format(test_name))
-
         print("")
 
     def postprocess_iteration(self):
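
For readers less familiar with /proc/modules, the parsing done by module_loaded, get_module_field and recursive_remove_module in nvidia-module-lib can be approximated in Python. This is an illustrative sketch only and is not part of the merged change: the names ModuleEntry, parse_modules and removal_order are hypothetical, and SAMPLE is made-up data in the /proc/modules layout (name, size, use count, dependent modules, state, address). Unlike the shell helpers, it reads the state field directly instead of grepping the whole line for "Live", and it only computes a removal order; it never calls rmmod and does not re-check use counts.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class ModuleEntry(NamedTuple):
    name: str
    usecnt: int
    deps: List[str]  # modules that depend on this one (reverse dependencies)
    state: str


# Made-up sample in the /proc/modules layout; not taken from a real system.
SAMPLE = """\
nvidia_uvm 1000 0 - Live 0x0000000000000000
nvidia_drm 1000 0 - Live 0x0000000000000000
nvidia_modeset 1000 1 nvidia_drm, Live 0x0000000000000000
nvidia 1000 2 nvidia_uvm,nvidia_modeset, Live 0x0000000000000000
"""


def parse_modules(text: str) -> Dict[str, ModuleEntry]:
    """Parse /proc/modules-style text into {module name: ModuleEntry}."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        name, _size, usecnt, deps, state = fields[:5]
        # "-" means nothing depends on this module; otherwise the field is a
        # comma-separated list that ends with a trailing comma.
        dep_list = [] if deps == "-" else [d for d in deps.split(",") if d]
        entries[name] = ModuleEntry(name, int(usecnt), dep_list, state)
    return entries


def module_loaded(entries: Dict[str, ModuleEntry], module: str) -> bool:
    """True only if the module is listed and its state field is Live."""
    entry = entries.get(module)
    return entry is not None and entry.state == "Live"


def removal_order(entries: Dict[str, ModuleEntry], module: str,
                  seen: Optional[Set[str]] = None) -> List[str]:
    """Return modules in the order they would be removed: dependents first."""
    seen = set() if seen is None else seen
    if module in seen or not module_loaded(entries, module):
        return []
    seen.add(module)
    order = []
    for dep in entries[module].deps:
        order.extend(removal_order(entries, dep, seen))
    order.append(module)
    return order


if __name__ == "__main__":
    print(removal_order(parse_modules(SAMPLE), "nvidia"))
```

On a real machine the same functions could be fed the contents of /proc/modules instead of SAMPLE.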
