hw-health-charm

Merge ~aieri/charm-hw-health:lp1832906 into ~nagios-charmers/charm-hw-health:master

Proposed by Andrea Ieri on 2020-06-16

Status:	Superseded
Proposed branch:	~aieri/charm-hw-health:lp1832906
Merge into:	~nagios-charmers/charm-hw-health:master
Diff against target:	814 lines (+306/-102), 17 files modified:
	src/README.md (+25/-0)
	src/files/ipmi/cron_ipmi_sensors.py (+15/-11)
	src/files/mdadm/cron_mdadm.py (+100/-36)
	src/lib/hwhealth/tools.py (+9/-2)
	src/metadata.yaml (+2/-1)
	src/tests/download_nagios_plugin3.py (+1/-1)
	src/tests/functional/test_hwhealth.py (+15/-4)
	src/tests/hw-health-samples/mdadm.output.critical.2 (+33/-0)
	src/tests/hw-health-samples/mdadm.output.warning (+33/-0)
	src/tests/unit/test_check_mdadm.py (+10/-18)
	src/tests/unit/test_check_megacli.py (+7/-7)
	src/tests/unit/test_check_nvme.py (+5/-3)
	src/tests/unit/test_check_sas2ircu.py (+5/-4)
	src/tests/unit/test_check_sas3ircu.py (+5/-3)
	src/tests/unit/test_cron_mdadm.py (+35/-8)
	src/tests/unit/test_hwdiscovery.py (+5/-4)
	src/tox.ini (+1/-0)

Reviewer	Review Type	Date Requested	Status
Stuart Bishop		2020-06-16	Pending
Drew Freiberger		2020-06-16	Pending
Xav Paice		2020-06-16	Pending
BootStack Reviewers		2020-06-16	Pending
Giuseppe Petralia		2020-06-16	Pending
Review via email: mp+385837@code.launchpad.net

This proposal supersedes a proposal from 2019-10-29.

Commit message

Fix issue with cron_mdadm.py which causes degraded state to not be reported

Description of the change

Fix an issue in cron_mdadm.py that prevented a degraded state from being reported.

    A formatting assumption broke detection of the degraded flag in the State
    line of each device report from mdadm --detail <devices>.
    This merge adds code that splits the State flags, checks for both the degraded
    and recovering states, and sets the alert status based on the combination of states.

It also adds direct detection of a "removed" member of the RAID array.

Closes-Bug: 1832906
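
For reviewers who want a picture of the approach described above, here is a minimal sketch of splitting the State flags and looking for a "removed" member. It is not the code from cron_mdadm.py: the function name classify_mdadm_detail, the assumed layout of the mdadm --detail output, and the severity mapping (degraded and recovering gives WARNING, degraded alone or a removed member gives CRITICAL) are all assumptions made for illustration.

```python
import re


def classify_mdadm_detail(detail_text):
    """Classify one `mdadm --detail` report (hypothetical helper).

    Returns a (level, message) tuple. The severity mapping below is an
    assumption for illustration, not the charm's actual policy.
    """
    flags = set()
    removed = False
    for line in detail_text.splitlines():
        # The summary line looks like "State : clean, degraded, recovering".
        match = re.match(r"\s*State\s*:\s*(.*)$", line)
        if match:
            # Split on commas instead of comparing the whole string, so that
            # "clean, degraded" is still recognised as degraded.
            flags = {f.strip() for f in match.group(1).split(",") if f.strip()}
            continue
        # Assumed member-table row for a missing device:
        # "-  0  0  1  removed" (five whitespace-separated fields).
        fields = line.split()
        if len(fields) == 5 and fields[-1] == "removed":
            removed = True

    if removed:
        return "CRITICAL", "array has a removed member"
    if "degraded" in flags and "recovering" in flags:
        return "WARNING", "array is degraded and recovering"
    if "degraded" in flags:
        return "CRITICAL", "array is degraded"
    return "OK", "array state: " + ", ".join(sorted(flags))
```

The point of the sketch is that each flag is tested on its own, so the result no longer depends on the exact spacing or ordering of the State line.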

Revision history for this message

🤖 Canonical IS Merge Bot (canonical-is-mergebot) wrote on 2019-10-29: Posted in a previous version of this proposal

This merge proposal is being monitored by mergebot. Change the status to Approved to merge.

Alvaro Uria (aluria) wrote on 2019-10-29: Posted in a previous version of this proposal

Logic looks good to me. Unit tests exist for this script, which should be extended for the extra flags (rebuilding, "clean, degraded", recovering...). Thank you!

Stuart Bishop (stub) wrote on 2019-10-31: Posted in a previous version of this proposal

All looks good, apart from the use of os.getcwd() in tests per inline comments.

review: Needs Fixing

Drew Freiberger (afreiberger) wrote on 2019-11-01: Posted in a previous version of this proposal

Those tests were directly copied from the tests above and are working as-is. This is a critical patch for maintenance kicking off in 24 hours. I believe this should be approved per the "follow the style of the code already present" and a new bug filed against the charm to clean up all tests for this issue.

Giuseppe Petralia (peppepetra) wrote on 2020-01-30: Posted in a previous version of this proposal

All looks good to me.

review: Approve

Xav Paice (xavpaice) wrote on 2020-02-18: Posted in a previous version of this proposal

https://code.launchpad.net/~peter-sabaini/hw-health-charm/+git/hw-health-charm/+merge/378123 starts to address the getcwd() issues. I think we can complete that change, with these fixed too, after a rebase.

Xav Paice (xavpaice) wrote on 2020-02-18: Posted in a previous version of this proposal

Unfortunately, this change needs rebasing against master; there are currently more merge conflicts than Git can handle.

review: Needs Fixing

Drew Freiberger (afreiberger) wrote on 2020-06-15: Posted in a previous version of this proposal

Change has been rebased and has had work from both Andrea and Peter to resolve comments from Stu.

Drew Freiberger (afreiberger) wrote on 2020-06-15: Posted in a previous version of this proposal

ready for review

review: Needs Resubmitting

Drew Freiberger (afreiberger) wrote on 2020-06-15: Posted in a previous version of this proposal

Green lint/unit/func tests: https://pastebin.canonical.com/p/QbYfYXhGCp/

Stuart Bishop (stub) wrote on 2020-06-16: Posted in a previous version of this proposal

Looks good. Some inline comments on some things worth fixing before landing, in particular moving the lint pragmas into global flake8 config since black is just going to keep making more of them. There are also some remaining os.getcwd() calls in the test suite that can easily be replaced by making them relative to the TESTS_DIR constant that now exists, making the test suite less fragile.

review: Approve

Drew Freiberger (afreiberger) wrote on 2020-06-16: Posted in a previous version of this proposal

Thanks for the feedback, and for picking up those bits! There are some clunky parts that could use some cleanup after yesterday's merge of other contributions and the rebase.

Unmerged commits

6b6fa66... by Andrea Ieri on 2020-06-15

Refactor obtaining samples into a separate shared function

Make imports more robust by switching to absolute paths

cb933c9... by Drew Freiberger on 2019-10-30

Add test sample inputs for new mdadm tests

0d7caac... by Drew Freiberger on 2019-10-29

Updated unit tests for cron_mdadm.py to include new failures

8375697... by Drew Freiberger on 2019-10-29

Fix issue with cron_mdadm.py which causes degraded state to not be reported

There was a formatting assumption which broke the detection of the degraded
flag in the State section of each device report from mdadm --detail <devices>
This merge adds code to split the State flags and check for both degraded and
recovering states and sets alert status based on the combination of states.

Also added is the direct detection of a "removed" member of the raid.

Closes-Bug: 1832906

09564d9... by Márton Kiss on 2020-06-10

Fix LP#1882978 ipmi check alert logic

Fix the logic of ipmi_sensors.out file creation and exception
handling.

9896fd4... by Alvaro Uria on 2020-06-02

Merge remote-tracking branch 'origin/str-bytes-conversion'

Reviewed-on: https://code.launchpad.net/~llama-charmers/charm-hw-health/+git/charm-hw-health/+merge/384228
Reviewed-by: Xav Paice <email address hidden>
Reviewed-by: Alvaro Uria <email address hidden>
Signed-off-by: Alvaro Uria <email address hidden>

d4bf6d5... by Alvaro Uria on 2020-05-22

Merge branch 'feature/add-focal'

Reviewed-on: https://code.launchpad.net/~canonical-is-bootstack/charm-hw-health/+git/charm-hw-health/+merge/383853
Reviewed-by: Paul Goins <email address hidden>
Signed-off-by: Alvaro Uria <email address hidden>

93b270d... by Alvaro Uria on 2020-05-22

Add README details, MegaCli64 checksum

8c01c0d... by Joe Guo on 2020-05-20

cron_ipmi_sensors.py: rm encoding header

It should default to UTF-8.

Signed-off-by: Joe Guo <email address hidden>

56169b4... by Joe Guo on 2020-05-20

cron_ipmi_sensors.py: subprocess.check_output decode

`subprocess.check_output` returns bytes by default, even in Python 3.
Decode the output to str for Python 3.

Signed-off-by: Joe Guo <email address hidden>
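
As a rough illustration of the bytes-versus-str point in this commit, the sketch below runs a command and decodes its output explicitly. The helper name run_and_decode is hypothetical and the UTF-8 encoding is an assumption; the actual change in cron_ipmi_sensors.py may differ.

```python
import subprocess
import sys


def run_and_decode(cmd):
    """Run a command and return its standard output as str.

    Hypothetical helper: check_output returns bytes unless a text mode
    is requested, so the result is decoded explicitly (UTF-8 assumed).
    """
    output = subprocess.check_output(cmd)
    return output.decode("utf-8")


if __name__ == "__main__":
    # Use the current interpreter as a portable stand-in for a real command.
    print(run_and_decode([sys.executable, "-c", "print('ok')"]))
```

An alternative is to pass universal_newlines=True to check_output, which makes it return str directly using the locale's preferred encoding.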

Subscribers

People subscribed via source and target branches

to all changes:

Andrea Ieri

Nagios Charm developers