Merge ~dannf/canonical-server-hwe-utils:tremont-post-move-test into canonical-server-hwe-utils:main

Proposed by dann frazier
Status: Rejected
Rejected by: dann frazier
Proposed branch: ~dannf/canonical-server-hwe-utils:tremont-post-move-test
Merge into: canonical-server-hwe-utils:main
Diff against target: 58 lines (+52/-0)
1 file modified
sysadmin-tools/tremont-to-needham-post-move-test.py (+52/-0)
Reviewer Review Type Date Requested Status
Taihsiang Ho Approve
Hon Ming Hui Needs Fixing
Review via email: mp+439539@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Hon Ming Hui (hm-hui) wrote :

the machine.yaml file should be the real configurations(and special notes) on the whole lab setup. It seems to me that we should make this script neat, and fix the special case in machine.yaml.

That said, pauli and lasell should be handled in machine.yaml (with comment blocks on the "address" field and probably notes there)

review: Needs Fixing
Revision history for this message
Hon Ming Hui (hm-hui) wrote :

Further comment after trying that once.

I realise that it is using fping instead of ping, which is fine I could easily install it.

The main problem for me is that the python unittest result print a complete stack trace which make it difficult (or just very long result) to demonstrate the failure machine.

On the other hand, one feature that fping has while ping do not have is that it could accept a list of IP in single command so that it will print the summary as a result.

e.g.
10.228.68.47 : xmt/rcv/%loss = 1/0/100%
10.228.0.22 : xmt/rcv/%loss = 1/0/100%
10.228.68.18 : xmt/rcv/%loss = 1/0/100%
10.228.68.44 : xmt/rcv/%loss = 1/1/0%, min/avg/max = 83.6/83.6/83.6

But fping is not able to show the machine name in machine.yaml which is also not ideal.

I don't have a concrete idea on how to improve it but I think if the script result could be neat and just show the list of success/failure machines. That will be better.

Revision history for this message
Taihsiang Ho (tai271828) wrote :

According to this comment, which is my "user experience", https://warthogs.atlassian.net/browse/SHENG-1953?focusedCommentId=224561 and my knowledge[1], I would say this script is good enough to move forward. Polishing the output format and make a "smarter check" has lower priority.

[1] Pauli is the HMM of the chassis containing saenger and segers, and I have confirmed that saenger and segers are managed by our MAAS. I don't know how saenger and segers are racked exactly. However, once our MAAS could manage saenger and segers, we don't care how Pauli is setup.

lasell is owned and managed by IS according to this sheet of 18T move https://docs.google.com/spreadsheets/d/1Np7LUkv4KpuQOJCklL5NFGEs_8IwiEgbfL62PI99H0g/edit#gid=0 and lasell is not used in our setup now.

review: Approve
Revision history for this message
dann frazier (dannf) wrote :

Thanks for the reviews! I provided this MP as a way to share the script - I'm not sure if it makes sense to merge it at all given it's usefulness has run its course. If others think we would use it again, then I'm happy to eliminate the pauli/lassell special cases and rename the script to something more general. Otherwise, I think we could just reject.

On fping vs. ping, I reach for fping in cases like this out of habit. ISTR only fping used to support exiting non-zero on failure, but that's a 10+ year old habit, and I don't see a reason we need to use fping. I'd be happy to change that too if we decide to keep this.

As for the readable output, yeah - I agree it's not ideal, it was just the easiest thing to put together in the 20 minutes I had. I don't see having time to prioritize improving it though - so (again, if we want to merge this at all) I think we'd have to keep it as-is, or hand off to someone else if addressing that is required.

Perhaps let us discuss in the scalebot call.

Unmerged commits

c91cbdc... by dann frazier

Add a test script to verify system access after the Tremont->Needham move

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
diff --git a/sysadmin-tools/tremont-to-needham-post-move-test.py b/sysadmin-tools/tremont-to-needham-post-move-test.py
0new file mode 1007550new file mode 100755
index 0000000..e5db12f
--- /dev/null
+++ b/sysadmin-tools/tremont-to-needham-post-move-test.py
@@ -0,0 +1,52 @@
1#!/usr/bin/env python3
2
3import argparse
4import subprocess
5import sys
6import unittest
7import yaml
8
9
10class PingTest(unittest.TestCase):
11 pass
12
13
14def ping_test_generator(ip):
15 def test_case(self):
16 subprocess.check_call(["fping", "-c", "1", ip])
17
18 return test_case
19
20
21if __name__ == "__main__":
22 parser = argparse.ArgumentParser()
23 parser.add_argument("--machines-yaml", "-m", required=True)
24 args, remaining = parser.parse_known_args()
25
26 with open(args.machines_yaml, "r") as f:
27 yaml_in = yaml.safe_load(f)
28
29 for machine in yaml_in.keys():
30 if "bmc" in yaml_in[machine]:
31 bmc = yaml_in[machine]["bmc"]
32 if "address" not in bmc:
33 continue
34 if machine == "pauli":
35 # we never figured out how to set pauli's bmc's ip
36 continue
37 test_name = f"test_ping_{machine}_bmc"
38 test_case = ping_test_generator(bmc["address"])
39 setattr(PingTest, test_name, test_case)
40 if "interfaces" not in yaml_in[machine]:
41 continue
42 for interface in yaml_in[machine]["interfaces"]:
43 iface = yaml_in[machine]["interfaces"][interface]
44 if "address" not in iface:
45 continue
46 if machine == "lasell" and interface == "eth0":
47 # Was not pingable in Tremont
48 continue
49 test_name = f"test_ping_{machine}_{interface}"
50 test_case = ping_test_generator(iface["address"])
51 setattr(PingTest, test_name, test_case)
52 unittest.main(argv=[sys.argv[0]] + remaining)

Subscribers

People subscribed via source and target branches

to all changes: