Merge ~dannf/canonical-server-hwe-utils:tremont-post-move-test into canonical-server-hwe-utils:main

Proposed by dann frazier
Status: Rejected
Rejected by: dann frazier
Proposed branch: ~dannf/canonical-server-hwe-utils:tremont-post-move-test
Merge into: canonical-server-hwe-utils:main
Diff against target: 58 lines (+52/-0)
1 file modified
sysadmin-tools/tremont-to-needham-post-move-test.py (+52/-0)
Reviewer Review Type Date Requested Status
Taihsiang Ho Approve
Hon Ming Hui Needs Fixing
Review via email: mp+439539@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Hon Ming Hui (hm-hui) wrote :

the machine.yaml file should be the real configurations(and special notes) on the whole lab setup. It seems to me that we should make this script neat, and fix the special case in machine.yaml.

That said, pauli and lasell should be handled in machine.yaml (with comment blocks on the "address" field and probably notes there)

review: Needs Fixing
Revision history for this message
Hon Ming Hui (hm-hui) wrote :

Further comment after trying that once.

I realise that it is using fping instead of ping, which is fine I could easily install it.

The main problem for me is that the python unittest result print a complete stack trace which make it difficult (or just very long result) to demonstrate the failure machine.

On the other hand, one feature that fping has while ping do not have is that it could accept a list of IP in single command so that it will print the summary as a result.

e.g.
10.228.68.47 : xmt/rcv/%loss = 1/0/100%
10.228.0.22 : xmt/rcv/%loss = 1/0/100%
10.228.68.18 : xmt/rcv/%loss = 1/0/100%
10.228.68.44 : xmt/rcv/%loss = 1/1/0%, min/avg/max = 83.6/83.6/83.6

But fping is not able to show the machine name in machine.yaml which is also not ideal.

I don't have a concrete idea on how to improve it but I think if the script result could be neat and just show the list of success/failure machines. That will be better.

Revision history for this message
Taihsiang Ho (tai271828) wrote :

According to this comment, which is my "user experience", https://warthogs.atlassian.net/browse/SHENG-1953?focusedCommentId=224561 and my knowledge[1], I would say this script is good enough to move forward. Polishing the output format and make a "smarter check" has lower priority.

[1] Pauli is the HMM of the chassis containing saenger and segers, and I have confirmed that saenger and segers are managed by our MAAS. I don't know how saenger and segers are racked exactly. However, once our MAAS could manage saenger and segers, we don't care how Pauli is setup.

lasell is owned and managed by IS according to this sheet of 18T move https://docs.google.com/spreadsheets/d/1Np7LUkv4KpuQOJCklL5NFGEs_8IwiEgbfL62PI99H0g/edit#gid=0 and lasell is not used in our setup now.

review: Approve
Revision history for this message
dann frazier (dannf) wrote :

Thanks for the reviews! I provided this MP as a way to share the script - I'm not sure if it makes sense to merge it at all given it's usefulness has run its course. If others think we would use it again, then I'm happy to eliminate the pauli/lassell special cases and rename the script to something more general. Otherwise, I think we could just reject.

On fping vs. ping, I reach for fping in cases like this out of habit. ISTR only fping used to support exiting non-zero on failure, but that's a 10+ year old habit, and I don't see a reason we need to use fping. I'd be happy to change that too if we decide to keep this.

As for the readable output, yeah - I agree it's not ideal, it was just the easiest thing to put together in the 20 minutes I had. I don't see having time to prioritize improving it though - so (again, if we want to merge this at all) I think we'd have to keep it as-is, or hand off to someone else if addressing that is required.

Perhaps let us discuss in the scalebot call.

Unmerged commits

c91cbdc... by dann frazier

Add a test script to verify system access after the Tremont->Needham move

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/sysadmin-tools/tremont-to-needham-post-move-test.py b/sysadmin-tools/tremont-to-needham-post-move-test.py
2new file mode 100755
3index 0000000..e5db12f
4--- /dev/null
5+++ b/sysadmin-tools/tremont-to-needham-post-move-test.py
6@@ -0,0 +1,52 @@
7+#!/usr/bin/env python3
8+
9+import argparse
10+import subprocess
11+import sys
12+import unittest
13+import yaml
14+
15+
16+class PingTest(unittest.TestCase):
17+ pass
18+
19+
20+def ping_test_generator(ip):
21+ def test_case(self):
22+ subprocess.check_call(["fping", "-c", "1", ip])
23+
24+ return test_case
25+
26+
27+if __name__ == "__main__":
28+ parser = argparse.ArgumentParser()
29+ parser.add_argument("--machines-yaml", "-m", required=True)
30+ args, remaining = parser.parse_known_args()
31+
32+ with open(args.machines_yaml, "r") as f:
33+ yaml_in = yaml.safe_load(f)
34+
35+ for machine in yaml_in.keys():
36+ if "bmc" in yaml_in[machine]:
37+ bmc = yaml_in[machine]["bmc"]
38+ if "address" not in bmc:
39+ continue
40+ if machine == "pauli":
41+ # we never figured out how to set pauli's bmc's ip
42+ continue
43+ test_name = f"test_ping_{machine}_bmc"
44+ test_case = ping_test_generator(bmc["address"])
45+ setattr(PingTest, test_name, test_case)
46+ if "interfaces" not in yaml_in[machine]:
47+ continue
48+ for interface in yaml_in[machine]["interfaces"]:
49+ iface = yaml_in[machine]["interfaces"][interface]
50+ if "address" not in iface:
51+ continue
52+ if machine == "lasell" and interface == "eth0":
53+ # Was not pingable in Tremont
54+ continue
55+ test_name = f"test_ping_{machine}_{interface}"
56+ test_case = ping_test_generator(iface["address"])
57+ setattr(PingTest, test_name, test_case)
58+ unittest.main(argv=[sys.argv[0]] + remaining)

Subscribers

People subscribed via source and target branches

to all changes: