Merge ~peter-sabaini/maas:add-nvme-testscript into maas:master

Proposed by Peter Sabaini
Status: Rejected
Rejected by: Blake Rouse
Proposed branch: ~peter-sabaini/maas:add-nvme-testscript
Merge into: maas:master
Diff against target: 142 lines (+136/-0)
1 file modified
src/metadataserver/builtin_scripts/nvme-cli.py (+136/-0)
Reviewer                Review Type  Date Requested  Status
MAAS Lander                                          Needs Fixing
Lee Trager (community)                               Needs Fixing
Review via email: mp+365212@code.launchpad.net
Lee Trager (ltrager) wrote :

Thanks for the contribution! This needs a little work before we can accept it into MAAS. In addition to the comments below, it needs unit tests and must be added to the BUILTIN_SCRIPTS list in src/metadataserver/builtin_scripts/__init__.py.

review: Needs Fixing
MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b add-nvme-testscript lp:~peter-sabaini/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: FAILED
LOG: http://maas-ci-jenkins.internal:8080/job/maas/job/branch-tester/6125/console
COMMIT: 238531241f35306dc8876bf6468a72181348ac4e

review: Needs Fixing
Blake Rouse (blake-rouse) wrote :

Rejecting due to inactivity.

Unmerged commits

2385312... by Peter Sabaini

Add a test script for testing nvme devices.

Generate load on nvme devices for 10min, then check health via
nvme-cli
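
The health check described in this commit message comes down to reading the `critical_warning` field from `nvme smart-log` output. A minimal sketch of that parsing step follows, assuming output lines of the form `critical_warning : 0`. The sample text, the `parse_critical_warning` and `decode_critical_warning` helper names, and the bit labels (which should be checked against the NVMe SMART / Health Information log definition) are illustrative assumptions and not part of the proposed branch.

```python
import re
from typing import List, Optional

# Same pattern the proposed script applies to the raw bytes of `nvme smart-log`.
CRITICAL_RE = re.compile(rb"critical_warning.*: ([0-9]+)")

# Assumed meanings for bits 0-4; verify against the NVMe specification.
CRITICAL_BITS = {
    0: "available spare below threshold",
    1: "temperature outside threshold",
    2: "NVM subsystem reliability degraded",
    3: "media placed in read-only mode",
    4: "volatile memory backup device failed",
}


def parse_critical_warning(stdout: bytes) -> Optional[int]:
    """Return the critical_warning value, or None if the field is missing."""
    match = CRITICAL_RE.search(stdout)
    if match is None:
        return None
    return int(match.group(1).decode())


def decode_critical_warning(value: int) -> List[str]:
    """List the assumed meaning of every set bit in the bitvector."""
    return [label for bit, label in CRITICAL_BITS.items() if value & (1 << bit)]


if __name__ == "__main__":
    # Hypothetical sample output, not captured from a real device.
    sample = b"critical_warning                    : 3\ntemperature : 35 C\n"
    value = parse_critical_warning(sample)
    print(value, decode_critical_warning(value))
```

A value of 0 means no warning bit is set, which is the condition the script treats as healthy.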

Preview Diff

diff --git a/src/metadataserver/builtin_scripts/nvme-cli.py b/src/metadataserver/builtin_scripts/nvme-cli.py
new file mode 100644
index 0000000..1405ed4
--- /dev/null
+++ b/src/metadataserver/builtin_scripts/nvme-cli.py
@@ -0,0 +1,136 @@
+#!/usr/bin/env python3
+#
+# nvme-cli. Generate load on nvme devices for 10min, then check health via nvme-cli
+#
+# --- Start MAAS 1.0 script metadata ---
+# name: nvme-cli.py
+# title: Check nvme health
+# description: Generate load on nvme devices, and check health via nvme-cli
+# tags: storage
+# script_type: testing
+# hardware_type: storage
+# parallel: instance
+# results:
+#   fio_errors:
+#     title: Fio errors
+#     description: Errors during fio run
+#   nvme_critical:
+#     title: NVMe critical bitvector
+#     description: NVMe critical bitvector, should be 0 for full health
+# parameters:
+#   storage: {type: storage}
+# packages: {apt: [nvme-cli, fio]}
+# destructive: true
+# --- End MAAS 1.0 script metadata ---
+
+
+import argparse
+from copy import deepcopy
+import os
+import re
+from subprocess import PIPE, Popen, STDOUT
+import sys
+
+import yaml
+
+
+FIOCMD = [
+    "sudo",
+    "-n",
+    "fio",
+    "--randrepeat=1",
+    "--ioengine=libaio",
+    "--direct=1",
+    "--gtod_reduce=1",
+    "--name=fio_test_verify",
+    "--bs=4M",
+    "--iodepth=64",
+    "--size=100%",
+    "--verify=crc32c-intel",
+    "--runtime=600s",
+    "--readwrite=randwrite",
+]
+
+NVMECMD = ["sudo", "nvme", "smart-log"]
+
+REGEX = b"err=([ 0-9]+):"
+
+
+def run_cmd(cmd):
+    """Execute `cmd` and return its combined output and return code."""
+    proc = Popen(cmd, stdout=PIPE, stderr=STDOUT)
+    # Currently, we are piping stderr to STDOUT.
+    stdout, _ = proc.communicate()
+
+    # Print stdout to the console.
+    if stdout is not None:
+        print("Running command: %s\n" % " ".join(cmd))
+        print(stdout.decode())
+        print("-" * 80)
+    return stdout, proc.returncode
+
+
+def run_fio_test(result_path):
+    """Run fio with the options in FIOCMD and return the `err=` regex match."""
+    stdout, returncode = run_cmd(FIOCMD)
+    if returncode != 0:
+        sys.exit(returncode)
+    if result_path is not None:
+        # Parse the results for the desired information and
+        # then write this to the results file.
+        match = re.search(REGEX, stdout)
+        if match is None:
+            print("Warning: results could not be found.")
+        return match
+
+
+def run_nvme_smartlog(storage):
+    stdout, returncode = run_cmd(NVMECMD + [storage])
+    return stdout, returncode
+
+
+def run_nvme_critical(storage):
+    stdout, returncode = run_nvme_smartlog(storage)
+    if returncode == 0:
+        match = re.search(b"critical_warning.*: ([0-9]+)", stdout)
+        if match is None:
+            print("Warning: nvme results could not be found.")
+        return match
+    else:
+        sys.exit(returncode)
+
+
+def write_results(status, fio_errors, nvme_crit, result_path):
+    results = {"status": status, "results": {"fio_errors": fio_errors, "nvme_critical": nvme_crit}}
+    with open(result_path, "w") as results_file:
+        yaml.safe_dump(results, results_file)
+
+
+def run_nvme(storage):
+    """Execute nvme tests for supplied storage device.
+
+    Performs random write tests with verification, then checks nvme health
+    """
+    result_path = os.environ.get("RESULT_PATH")
+    _, returncode = run_nvme_smartlog(storage)
+    if returncode != 0:  # this is not an NVMe device
+        write_results("passed", 0, 0, result_path)
+        return
+
+    FIOCMD.append("--filename=%s" % storage)
+    random_write_match = run_fio_test(result_path)
+    fio_errors = int(random_write_match.group(1).decode())
+    nvme_match = run_nvme_critical(storage)
+    nvme_crit = int(nvme_match.group(1).decode())
+    if fio_errors == 0 and nvme_crit == 0:
+        status = "passed"
+    else:
+        status = "failed"
+    write_results(status, fio_errors, nvme_crit, result_path)
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="NVMe Hardware Testing.")
+    parser.add_argument("--storage", dest="storage", help="path to storage device you want to test. e.g. /dev/nvme0")
+    args = parser.parse_args()
+    sys.exit(run_nvme(args.storage))
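
The review asks for unit tests. As a rough sketch of what a test for the fio result parsing might look like, the block below lifts the `err=` regex from the script into a hypothetical `parse_fio_errors` helper and exercises it with plain `unittest`. The helper name, the sample fio output lines, and the use of `unittest` rather than MAAS's own test base classes are assumptions made for illustration.

```python
import re
import unittest
from typing import Optional

# Pattern copied from the proposed script.
REGEX = rb"err=([ 0-9]+):"


def parse_fio_errors(stdout: bytes) -> Optional[int]:
    """Return fio's reported error count, or None when no `err=` field exists."""
    match = re.search(REGEX, stdout)
    if match is None:
        return None
    # int() ignores the space padding that follows `err=`.
    return int(match.group(1).decode())


class TestParseFioErrors(unittest.TestCase):
    # The byte strings below are hypothetical stand-ins for fio output.

    def test_reports_zero_errors(self):
        out = b"fio_test_verify: (groupid=0, jobs=1): err= 0: pid=1234\n"
        self.assertEqual(parse_fio_errors(out), 0)

    def test_reports_nonzero_errors(self):
        out = b"fio_test_verify: (groupid=0, jobs=1): err= 5: pid=1234\n"
        self.assertEqual(parse_fio_errors(out), 5)

    def test_missing_field_returns_none(self):
        self.assertIsNone(parse_fio_errors(b"fio: no such device\n"))
```

Saved to a file, these can be run with `python3 -m unittest`. Returning `None` instead of a match object lets the caller record a failed result rather than raise `AttributeError` when fio prints no `err=` field.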
