Merge ~addyess/charm-openstack-service-checks:bugs/lp1887561-check_octavia_filtering into ~llama-charmers/charm-openstack-service-checks:master
Proposed by
Adam Dyess
Status: | Merged |
---|---|
Approved by: | Chris Sanders |
Approved revision: | dcf043ac09c1dea8c1fa18b7f85fa0c559911fa4 |
Merged at revision: | b4a24bff8ea1f9d38d70b1297ef64c63b49327e0 |
Proposed branch: | ~addyess/charm-openstack-service-checks:bugs/lp1887561-check_octavia_filtering |
Merge into: | ~llama-charmers/charm-openstack-service-checks:master |
Diff against target: | 802 lines (+370/-138), 9 files modified: README.md (+41/-0); config.yaml (+20/-0); files/plugins/check_octavia.py (+137/-89); lib/lib_openstack_service_checks.py (+37/-15); tests/unit/conftest.py (+4/-11); tests/unit/test_check_cinder_services.py (+1/-5); tests/unit/test_check_contrail_analytics_alarms.py (+12/-14); tests/unit/test_check_nova_services.py (+1/-4); tests/unit/test_check_octavia.py (+117/-0) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Chris Sanders (community) | Approve | ||
Review via email: mp+387394@code.launchpad.net |
Commit message
Support for filtered alarming of octavia checks
Description of the change
Adam Dyess (addyess) wrote : | # |
Chris Sanders (chris.sanders) wrote : | # |
The commit message here is a very good example, can you add something to the Readme about this as well?
A few in-line comments/questions as well.
review:
Needs Information
Chris Sanders (chris.sanders) : | # |
review:
Approve
Chris Sanders (chris.sanders) wrote : | # |
Let's see those tests ;)
review:
Needs Information
Adam Dyess (addyess) wrote : | # |
CI Results: https:/
Chris Sanders (chris.sanders) wrote : | # |
Alright thanks, +1
review:
Approve
Preview Diff
1 | diff --git a/README.md b/README.md | |||
2 | index 54dceb0..f589fb9 100644 | |||
3 | --- a/README.md | |||
4 | +++ b/README.md | |||
5 | @@ -38,6 +38,47 @@ If such API endpoints use TLS, new checks will monitor the certificates expirati | |||
6 | 38 | 38 | ||
7 | 39 | Alternatively, instead of the above relation, there is also an action "refresh-endpoint-checks" available. Running this action will update the service checks with the current endpoints. | 39 | Alternatively, instead of the above relation, there is also an action "refresh-endpoint-checks" available. Running this action will update the service checks with the current endpoints. |
8 | 40 | 40 | ||
9 | 41 | ## Octavia Checks | ||
10 | 42 | |||
11 | 43 | Knowing when an OpenStack load balancer is having an issue is operationally | ||
12 | 44 | important, and this charm helps manage that. There is both coarse-grained | ||
13 | 45 | control over Octavia checks and more fine-grained control through the | ||
14 | 46 | following config items. | ||
15 | 47 | |||
16 | 48 | ### Coarse Grain | ||
17 | 49 | |||
18 | 50 | * `check-octavia`: `true` or `false` can enable or disable checks | ||
19 | 51 | |||
20 | 52 | ### Fine Grain | ||
21 | 53 | |||
22 | 54 | * `octavia-loadbalancers-ignored` | ||
23 | 55 | * `octavia-amphorae-ignored` | ||
24 | 56 | * `octavia-pools-ignored` | ||
25 | 57 | * `octavia-image-ignored` | ||
26 | 58 | |||
27 | 59 | Each of these config items takes a comma-separated ignore list of keywords. Any | ||
28 | 60 | alarm whose message matches a keyword in the list is excluded from the check result. | ||
29 | 61 | |||
30 | 62 | #### Examples | ||
31 | 63 | |||
32 | 64 | --------------- | ||
33 | 65 | Ignoring a test or non-production loadbalancer with the | ||
34 | 66 | ID=`deadbeef-1234-56789012-dead-beef` which is __INACTIVE__ or __DEGRADED__. | ||
35 | 67 | ```bash | ||
36 | 68 | juju config my-openstack-service-checks octavia-loadbalancers-ignored='deadbeef-1234-56789012-dead-beef,' | ||
37 | 69 | ``` | ||
38 | 70 | |||
39 | 71 | Ignoring all loadbalancers which happen to be __DEGRADED__. | ||
40 | 72 | ```bash | ||
41 | 73 | juju config my-openstack-service-checks octavia-loadbalancers-ignored='DEGRADED,' | ||
42 | 74 | ``` | ||
43 | 75 | |||
44 | 76 | Ignoring amphorae that are stuck in __BOOTING__ state | ||
45 | 77 | ```bash | ||
46 | 78 | juju config my-openstack-service-checks octavia-amphorae-ignored='BOOTING,' | ||
47 | 79 | ``` | ||
48 | 80 | |||
49 | 81 | |||
50 | 41 | ## Compute services monitoring | 82 | ## Compute services monitoring |
51 | 42 | 83 | ||
52 | 43 | Compute services are monitored via the 'os-services' interface. Several thresholds can | 84 | Compute services are monitored via the 'os-services' interface. Several thresholds can |
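As an illustration of the ignore-list behaviour described in the README hunk above, the following sketch builds one alternation pattern from a comma-separated value and drops the alarm messages that match it. The helper names `build_ignore_regex` and `drop_ignored` and the sample messages are hypothetical; the keyword handling is modelled on what `nagios_exit` does in this proposal, where each keyword is used as a regular expression fragment rather than a literal string.

```python
import re


def build_ignore_regex(raw):
    """Join the unique, non-empty keywords into one alternation pattern."""
    keywords = sorted({kw for kw in raw.split(",") if kw})
    return "|".join("(?:{})".format(kw) for kw in keywords)


def drop_ignored(messages, raw):
    """Return the messages that match none of the ignored keywords."""
    pattern = build_ignore_regex(raw)
    if not pattern:
        # An empty pattern would match every message, so skip filtering.
        return list(messages)
    return [msg for msg in messages if not re.search(pattern, msg)]


# Hypothetical alarm messages in the format used by check_octavia.py.
messages = [
    "loadbalancer deadbeef-1234-56789012-dead-beef operating_status is DEGRADED",
    "loadbalancer cafe0000-1234-56789012-cafe-0000 operating_status is ERROR",
    "amphora 1111 status is BOOTING",
]

# The trailing comma seen in the README examples yields an empty keyword,
# which is discarded before the pattern is built.
print(build_ignore_regex("DEGRADED,BOOTING,"))
print(drop_ignored(messages, "DEGRADED,BOOTING,"))
```

Because keywords are matched anywhere in the message, a keyword such as `DEGRADED` silences every alarm containing that word, while a full ID targets a single resource.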
53 | diff --git a/config.yaml b/config.yaml | |||
54 | index 81428e5..d36f4a9 100644 | |||
55 | --- a/config.yaml | |||
56 | +++ b/config.yaml | |||
57 | @@ -15,6 +15,26 @@ options: | |||
58 | 15 | type: boolean | 15 | type: boolean |
59 | 16 | description: | | 16 | description: | |
60 | 17 | Switch to turn on or off check for octavia services. | 17 | Switch to turn on or off check for octavia services. |
61 | 18 | octavia-loadbalancers-ignored: | ||
62 | 19 | type: string | ||
63 | 20 | default: "" | ||
64 | 21 | description: | | ||
65 | 22 | Comma separated list of octavia load balancer alerts to ignore | ||
66 | 23 | octavia-amphorae-ignored: | ||
67 | 24 | type: string | ||
68 | 25 | default: "" | ||
69 | 26 | description: | | ||
70 | 27 | Comma separated list of octavia amphorae alerts to ignore | ||
71 | 28 | octavia-pools-ignored: | ||
72 | 29 | type: string | ||
73 | 30 | default: "" | ||
74 | 31 | description: | | ||
75 | 32 | Comma separated list of octavia pool alerts to ignore | ||
76 | 33 | octavia-image-ignored: | ||
77 | 34 | type: string | ||
78 | 35 | default: "" | ||
79 | 36 | description: | | ||
80 | 37 | Comma separated list of octavia image alerts to ignore | ||
81 | 18 | octavia-amp-image-tag: | 38 | octavia-amp-image-tag: |
82 | 19 | default: "octavia-amphora" | 39 | default: "octavia-amphora" |
83 | 20 | type: string | 40 | type: string |
84 | diff --git a/files/plugins/check_octavia.py b/files/plugins/check_octavia.py | |||
85 | index 09fb6b1..bb6debc 100755 | |||
86 | --- a/files/plugins/check_octavia.py | |||
87 | +++ b/files/plugins/check_octavia.py | |||
88 | @@ -1,13 +1,19 @@ | |||
89 | 1 | #!/usr/bin/env python3 | 1 | #!/usr/bin/env python3 |
90 | 2 | 2 | ||
91 | 3 | import os | ||
92 | 4 | import sys | ||
93 | 5 | import json | ||
94 | 6 | import argparse | 3 | import argparse |
96 | 7 | import subprocess | 4 | import collections |
97 | 8 | from datetime import datetime, timedelta | 5 | from datetime import datetime, timedelta |
98 | 6 | import json | ||
99 | 7 | import os | ||
100 | 8 | import re | ||
101 | 9 | import subprocess | ||
102 | 10 | import sys | ||
103 | 11 | |||
104 | 9 | import openstack | 12 | import openstack |
105 | 10 | 13 | ||
106 | 14 | |||
107 | 15 | Alarm = collections.namedtuple('Alarm', 'lvl, desc') | ||
108 | 16 | DEFAULT_IGNORED = r'' | ||
109 | 11 | NAGIOS_STATUS_OK = 0 | 17 | NAGIOS_STATUS_OK = 0 |
110 | 12 | NAGIOS_STATUS_WARNING = 1 | 18 | NAGIOS_STATUS_WARNING = 1 |
111 | 13 | NAGIOS_STATUS_CRITICAL = 2 | 19 | NAGIOS_STATUS_CRITICAL = 2 |
112 | @@ -21,12 +27,48 @@ NAGIOS_STATUS = { | |||
113 | 21 | } | 27 | } |
114 | 22 | 28 | ||
115 | 23 | 29 | ||
117 | 24 | def nagios_exit(status, message): | 30 | def filter_checks(alarms, ignored=DEFAULT_IGNORED): |
118 | 31 | """ | ||
119 | 32 | Reduce all checks down to an overall check based on the highest level | ||
120 | 33 | not ignored | ||
121 | 34 | |||
122 | 35 | :param List[Tuple] alarms: list of alarms (lvl, message) | ||
123 | 36 | :param str ignored: regular expression of messages to ignore | ||
124 | 37 | :return: tuple of (overall nagios status, summary message) | ||
125 | 38 | """ | ||
126 | 39 | search_re = re.compile(ignored) | ||
127 | 40 | full = [Alarm(lvl, msg) for lvl, msg in alarms] | ||
128 | 41 | ignoring = list(filter(lambda m: search_re.search(m.desc), full)) if ignored else [] | ||
129 | 42 | important = set(full) - set(ignoring) | ||
130 | 43 | |||
131 | 44 | total_crit = len([a for a in full if a.lvl == NAGIOS_STATUS_CRITICAL]) | ||
132 | 45 | important_crit = len([a for a in important if a.lvl == NAGIOS_STATUS_CRITICAL]) | ||
133 | 46 | important_count = len(important) | ||
134 | 47 | if important_crit > 0: | ||
135 | 48 | status = NAGIOS_STATUS_CRITICAL | ||
136 | 49 | elif important_count > 0: | ||
137 | 50 | status = NAGIOS_STATUS_WARNING | ||
138 | 51 | else: | ||
139 | 52 | status = NAGIOS_STATUS_OK | ||
140 | 53 | msg = ( | ||
141 | 54 | "total_alarms[{}], total_crit[{}], total_ignored[{}], " | ||
142 | 55 | "ignoring r'{}'\n" | ||
143 | 56 | .format(len(full), total_crit, len(ignoring), ignored) | ||
144 | 57 | ) | ||
145 | 58 | msg += '\n'.join(_.desc for _ in sorted(important)) | ||
146 | 59 | return status, msg | ||
147 | 60 | |||
148 | 61 | |||
149 | 62 | def nagios_exit(args, results): | ||
150 | 63 | # parse ignored list | ||
151 | 64 | unique = sorted(filter(None, set(args.ignored.split(",")))) | ||
152 | 65 | ignored_re = r'|'.join('(?:{})'.format(_) for _ in unique) | ||
153 | 66 | |||
154 | 67 | status, message = filter_checks(results, ignored=ignored_re) | ||
155 | 25 | assert status in NAGIOS_STATUS, "Invalid Nagios status code" | 68 | assert status in NAGIOS_STATUS, "Invalid Nagios status code" |
156 | 26 | # prefix status name to message | 69 | # prefix status name to message |
157 | 27 | output = '{}: {}'.format(NAGIOS_STATUS[status], message) | 70 | output = '{}: {}'.format(NAGIOS_STATUS[status], message) |
160 | 28 | print(output) # nagios requires print to stdout, no stderr | 71 | return status, output |
159 | 29 | sys.exit(status) | ||
161 | 30 | 72 | ||
162 | 31 | 73 | ||
163 | 32 | def check_loadbalancers(connection): | 74 | def check_loadbalancers(connection): |
164 | @@ -39,77 +81,72 @@ def check_loadbalancers(connection): | |||
165 | 39 | lb_enabled = [lb for lb in lb_all if lb.is_admin_state_up] | 81 | lb_enabled = [lb for lb in lb_all if lb.is_admin_state_up] |
166 | 40 | 82 | ||
167 | 41 | # check provisioning_status is ACTIVE for each lb | 83 | # check provisioning_status is ACTIVE for each lb |
174 | 42 | bad_lbs = [lb for lb in lb_enabled if lb.provisioning_status != 'ACTIVE'] | 84 | bad_lbs = [( |
175 | 43 | if bad_lbs: | 85 | NAGIOS_STATUS_CRITICAL, |
176 | 44 | parts = ['loadbalancer {} provisioning_status is {}'.format( | 86 | 'loadbalancer {} provisioning_status is {}'.format( |
177 | 45 | lb.id, lb.provisioning_status) for lb in bad_lbs] | 87 | lb.id, lb.provisioning_status) |
178 | 46 | message = ', '.join(parts) | 88 | ) for lb in lb_enabled if lb.provisioning_status != 'ACTIVE'] |
173 | 47 | return NAGIOS_STATUS_CRITICAL, message | ||
179 | 48 | 89 | ||
180 | 49 | # raise CRITICAL if operating_status is not ONLINE | 90 | # raise CRITICAL if operating_status is not ONLINE |
187 | 50 | bad_lbs = [lb for lb in lb_enabled if lb.operating_status != 'ONLINE'] | 91 | bad_lbs += [( |
188 | 51 | if bad_lbs: | 92 | NAGIOS_STATUS_CRITICAL, |
189 | 52 | parts = ['loadbalancer {} operating_status is {}'.format( | 93 | 'loadbalancer {} operating_status is {}'.format( |
190 | 53 | lb.id, lb.operating_status) for lb in bad_lbs] | 94 | lb.id, lb.operating_status) |
191 | 54 | message = ', '.join(parts) | 95 | ) for lb in lb_enabled if lb.operating_status != 'ONLINE'] |
186 | 55 | return NAGIOS_STATUS_CRITICAL, message | ||
192 | 56 | 96 | ||
193 | 57 | net_mgr = connection.network | ||
194 | 58 | # check vip port exists for each lb | 97 | # check vip port exists for each lb |
196 | 59 | bad_lbs = [] | 98 | net_mgr = connection.network |
197 | 99 | vip_lbs = [] | ||
198 | 60 | for lb in lb_enabled: | 100 | for lb in lb_enabled: |
199 | 61 | try: | 101 | try: |
200 | 62 | net_mgr.get_port(lb.vip_port_id) | 102 | net_mgr.get_port(lb.vip_port_id) |
201 | 63 | except openstack.exceptions.NotFoundException: | 103 | except openstack.exceptions.NotFoundException: |
208 | 64 | bad_lbs.append(lb) | 104 | vip_lbs.append(lb) |
209 | 65 | if bad_lbs: | 105 | bad_lbs += [( |
210 | 66 | parts = ['vip port {} for loadbalancer {} not found'.format( | 106 | NAGIOS_STATUS_CRITICAL, |
211 | 67 | lb.vip_port_id, lb.id) for lb in bad_lbs] | 107 | 'vip port {} for loadbalancer {} not found'.format( |
212 | 68 | message = ', '.join(parts) | 108 | lb.vip_port_id, lb.id) |
213 | 69 | return NAGIOS_STATUS_CRITICAL, message | 109 | ) for lb in vip_lbs] |
214 | 70 | 110 | ||
215 | 71 | # warn about disabled lbs if no other error found | 111 | # warn about disabled lbs if no other error found |
222 | 72 | lb_disabled = [lb for lb in lb_all if not lb.is_admin_state_up] | 112 | bad_lbs += [( |
223 | 73 | if lb_disabled: | 113 | NAGIOS_STATUS_WARNING, |
224 | 74 | parts = ['loadbalancer {} admin_state_up is False'.format(lb.id) | 114 | 'loadbalancer {} admin_state_up is False'.format(lb.id) |
225 | 75 | for lb in lb_disabled] | 115 | ) for lb in lb_all if not lb.is_admin_state_up] |
220 | 76 | message = ', '.join(parts) | ||
221 | 77 | return NAGIOS_STATUS_WARNING, message | ||
226 | 78 | 116 | ||
228 | 79 | return NAGIOS_STATUS_OK, 'loadbalancers are happy' | 117 | return bad_lbs |
229 | 80 | 118 | ||
230 | 81 | 119 | ||
231 | 82 | def check_pools(connection): | 120 | def check_pools(connection): |
232 | 83 | """check pools status.""" | 121 | """check pools status.""" |
233 | 84 | lb_mgr = connection.load_balancer | 122 | lb_mgr = connection.load_balancer |
234 | 85 | pools_all = lb_mgr.pools() | 123 | pools_all = lb_mgr.pools() |
235 | 124 | |||
236 | 125 | # only check enabled pools | ||
237 | 86 | pools_enabled = [pool for pool in pools_all if pool.is_admin_state_up] | 126 | pools_enabled = [pool for pool in pools_all if pool.is_admin_state_up] |
238 | 87 | 127 | ||
239 | 88 | # check provisioning_status is ACTIVE for each pool | 128 | # check provisioning_status is ACTIVE for each pool |
246 | 89 | bad_pools = [pool for pool in pools_enabled if pool.provisioning_status != 'ACTIVE'] | 129 | bad_pools = [( |
247 | 90 | if bad_pools: | 130 | NAGIOS_STATUS_CRITICAL, |
248 | 91 | parts = ['pool {} provisioning_status is {}'.format( | 131 | 'pool {} provisioning_status is {}'.format( |
249 | 92 | pool.id, pool.provisioning_status) for pool in bad_pools] | 132 | pool.id, pool.provisioning_status) |
250 | 93 | message = ', '.join(parts) | 133 | ) for pool in pools_enabled if pool.provisioning_status != 'ACTIVE'] |
245 | 94 | return NAGIOS_STATUS_CRITICAL, message | ||
251 | 95 | 134 | ||
252 | 96 | # raise CRITICAL if operating_status is ERROR | 135 | # raise CRITICAL if operating_status is ERROR |
259 | 97 | bad_pools = [pool for pool in pools_enabled if pool.operating_status == 'ERROR'] | 136 | bad_pools += [( |
260 | 98 | if bad_pools: | 137 | NAGIOS_STATUS_CRITICAL, |
261 | 99 | parts = ['pool {} operating_status is {}'.format( | 138 | 'pool {} operating_status is {}'.format( |
262 | 100 | pool.id, pool.operating_status) for pool in bad_pools] | 139 | pool.id, pool.operating_status) |
263 | 101 | message = ', '.join(parts) | 140 | ) for pool in pools_enabled if pool.operating_status == 'ERROR'] |
258 | 102 | return NAGIOS_STATUS_CRITICAL, message | ||
264 | 103 | 141 | ||
265 | 104 | # raise WARNING if operating_status is NO_MONITOR | 142 | # raise WARNING if operating_status is NO_MONITOR |
272 | 105 | bad_pools = [pool for pool in pools_enabled if pool.operating_status == 'NO_MONITOR'] | 143 | bad_pools += [( |
273 | 106 | if bad_pools: | 144 | NAGIOS_STATUS_WARNING, |
274 | 107 | parts = ['pool {} operating_status is {}'.format( | 145 | 'pool {} operating_status is {}'.format( |
275 | 108 | pool.id, pool.operating_status) for pool in bad_pools] | 146 | pool.id, pool.operating_status) |
276 | 109 | message = ', '.join(parts) | 147 | ) for pool in pools_enabled if pool.operating_status == 'NO_MONITOR'] |
271 | 110 | return NAGIOS_STATUS_WARNING, message | ||
277 | 111 | 148 | ||
279 | 112 | return NAGIOS_STATUS_OK, 'pools are happy' | 149 | return bad_pools |
280 | 113 | 150 | ||
281 | 114 | 151 | ||
282 | 115 | def check_amphorae(connection): | 152 | def check_amphorae(connection): |
283 | @@ -120,7 +157,7 @@ def check_amphorae(connection): | |||
284 | 120 | resp = lb_mgr.get('/v2/octavia/amphorae') | 157 | resp = lb_mgr.get('/v2/octavia/amphorae') |
285 | 121 | # python api is not available yet, use url | 158 | # python api is not available yet, use url |
286 | 122 | if resp.status_code != 200: | 159 | if resp.status_code != 200: |
288 | 123 | return NAGIOS_STATUS_WARNING, 'amphorae api not working' | 160 | return [(NAGIOS_STATUS_WARNING, 'amphorae api not working')] |
289 | 124 | 161 | ||
290 | 125 | data = json.loads(resp.content) | 162 | data = json.loads(resp.content) |
291 | 126 | # output is like {"amphorae": [{...}, {...}, ...]} | 163 | # output is like {"amphorae": [{...}, {...}, ...]} |
292 | @@ -128,26 +165,20 @@ def check_amphorae(connection): | |||
293 | 128 | 165 | ||
294 | 129 | # raise CRITICAL for ERROR status | 166 | # raise CRITICAL for ERROR status |
295 | 130 | bad_status_list = ('ERROR',) | 167 | bad_status_list = ('ERROR',) |
303 | 131 | bad_items = [item for item in items if item['status'] in bad_status_list] | 168 | bad_amp = [( |
304 | 132 | if bad_items: | 169 | NAGIOS_STATUS_CRITICAL, |
305 | 133 | parts = [ | 170 | 'amphora {} status is {}'.format(item['id'], item['status']) |
306 | 134 | 'amphora {} status is {}'.format(item['id'], item['status']) | 171 | ) for item in items if item['status'] in bad_status_list] |
300 | 135 | for item in bad_items] | ||
301 | 136 | message = ', '.join(parts) | ||
302 | 137 | return NAGIOS_STATUS_CRITICAL, message | ||
307 | 138 | 172 | ||
308 | 139 | # raise WARNING for these status | 173 | # raise WARNING for these status |
309 | 140 | bad_status_list = ( | 174 | bad_status_list = ( |
310 | 141 | 'PENDING_CREATE', 'PENDING_UPDATE', 'PENDING_DELETE', 'BOOTING') | 175 | 'PENDING_CREATE', 'PENDING_UPDATE', 'PENDING_DELETE', 'BOOTING') |
318 | 142 | bad_items = [item for item in items if item['status'] in bad_status_list] | 176 | bad_amp += [( |
319 | 143 | if bad_items: | 177 | NAGIOS_STATUS_WARNING, |
320 | 144 | parts = [ | 178 | 'amphora {} status is {}'.format(item['id'], item['status']) |
321 | 145 | 'amphora {} status is {}'.format(item['id'], item['status']) | 179 | ) for item in items if item['status'] in bad_status_list] |
315 | 146 | for item in bad_items] | ||
316 | 147 | message = ', '.join(parts) | ||
317 | 148 | return NAGIOS_STATUS_WARNING, message | ||
322 | 149 | 180 | ||
324 | 150 | return NAGIOS_STATUS_OK, 'amphorae are happy' | 181 | return bad_amp |
325 | 151 | 182 | ||
326 | 152 | 183 | ||
327 | 153 | def check_image(connection, tag, days): | 184 | def check_image(connection, tag, days): |
328 | @@ -157,28 +188,47 @@ def check_image(connection, tag, days): | |||
329 | 157 | if not images: | 188 | if not images: |
330 | 158 | message = ('Octavia requires image with tag {} to create amphora, ' | 189 | message = ('Octavia requires image with tag {} to create amphora, ' |
331 | 159 | 'but none exist').format(tag) | 190 | 'but none exist').format(tag) |
333 | 160 | return NAGIOS_STATUS_CRITICAL, message | 191 | return [(NAGIOS_STATUS_CRITICAL, message)] |
334 | 161 | 192 | ||
335 | 162 | active_images = [image for image in images if image.status == 'active'] | 193 | active_images = [image for image in images if image.status == 'active'] |
336 | 163 | if not active_images: | 194 | if not active_images: |
338 | 164 | parts = ['{}({})'.format(image.name, image.id) for image in images] | 195 | details = ['{}({})'.format(image.name, image.id) for image in images] |
339 | 165 | message = ('Octavia requires image with tag {} to create amphora, ' | 196 | message = ('Octavia requires image with tag {} to create amphora, ' |
342 | 166 | 'but none is active: {}').format(tag, ', '.join(parts)) | 197 | 'but none are active: {}').format(tag, ', '.join(details)) |
343 | 167 | return NAGIOS_STATUS_CRITICAL, message | 198 | return [(NAGIOS_STATUS_CRITICAL, message)] |
344 | 168 | 199 | ||
345 | 169 | # raise WARNING if image is too old | 200 | # raise WARNING if image is too old |
346 | 170 | when = (datetime.now() - timedelta(days=days)).isoformat() | 201 | when = (datetime.now() - timedelta(days=days)).isoformat() |
347 | 171 | # updated_at str format: '2019-12-05T18:21:25Z' | 202 | # updated_at str format: '2019-12-05T18:21:25Z' |
348 | 172 | fresh_images = [image for image in active_images if image.updated_at > when] | 203 | fresh_images = [image for image in active_images if image.updated_at > when] |
349 | 173 | if not fresh_images: | 204 | if not fresh_images: |
350 | 205 | details = ['{}({})'.format(image.name, image.id) for image in images] | ||
351 | 174 | message = ('Octavia requires image with tag {} to create amphora, ' | 206 | message = ('Octavia requires image with tag {} to create amphora, ' |
354 | 175 | 'but it is older than {} days').format(tag, days) | 207 | 'but all images are older than {} day(s): {}' |
355 | 176 | return NAGIOS_STATUS_WARNING, message | 208 | '').format(tag, days, ', '.join(details)) |
356 | 209 | return [(NAGIOS_STATUS_WARNING, message)] | ||
357 | 177 | 210 | ||
359 | 178 | return NAGIOS_STATUS_OK, 'image is ready' | 211 | return [] |
360 | 179 | 212 | ||
361 | 180 | 213 | ||
363 | 181 | if __name__ == '__main__': | 214 | def process_checks(args): |
364 | 215 | # use closure to make all checks have same signature | ||
365 | 216 | # so we can handle them in same way | ||
366 | 217 | def _check_image(_connection): | ||
367 | 218 | return check_image(_connection, args.amp_image_tag, args.amp_image_days) | ||
368 | 219 | |||
369 | 220 | checks = { | ||
370 | 221 | 'loadbalancers': check_loadbalancers, | ||
371 | 222 | 'amphorae': check_amphorae, | ||
372 | 223 | 'pools': check_pools, | ||
373 | 224 | 'image': _check_image, | ||
374 | 225 | } | ||
375 | 226 | |||
376 | 227 | connection = openstack.connect(cloud='envvars') | ||
377 | 228 | return nagios_exit(args, checks[args.check](connection)) | ||
378 | 229 | |||
379 | 230 | |||
380 | 231 | def main(): | ||
381 | 182 | parser = argparse.ArgumentParser( | 232 | parser = argparse.ArgumentParser( |
382 | 183 | description='Check Octavia status', | 233 | description='Check Octavia status', |
383 | 184 | formatter_class=argparse.ArgumentDefaultsHelpFormatter, | 234 | formatter_class=argparse.ArgumentDefaultsHelpFormatter, |
384 | @@ -195,6 +245,11 @@ if __name__ == '__main__': | |||
385 | 195 | help='which check to run') | 245 | help='which check to run') |
386 | 196 | 246 | ||
387 | 197 | parser.add_argument( | 247 | parser.add_argument( |
388 | 248 | '--ignored', dest="ignored", type=str, | ||
389 | 249 | default=DEFAULT_IGNORED, | ||
390 | 250 | help='Comma separated list of alerts to ignore') | ||
391 | 251 | |||
392 | 252 | parser.add_argument( | ||
393 | 198 | '--amp-image-tag', dest='amp_image_tag', default='octavia-amphora', | 253 | '--amp-image-tag', dest='amp_image_tag', default='octavia-amphora', |
394 | 199 | help='amphora image tag for image check') | 254 | help='amphora image tag for image check') |
395 | 200 | 255 | ||
396 | @@ -211,17 +266,10 @@ if __name__ == '__main__': | |||
397 | 211 | os.environ[key.decode('utf-8')] = value.rstrip().decode('utf-8') | 266 | os.environ[key.decode('utf-8')] = value.rstrip().decode('utf-8') |
398 | 212 | proc.communicate() | 267 | proc.communicate() |
399 | 213 | 268 | ||
404 | 214 | # use closure to make all checks have same signature | 269 | status, message = process_checks(args) |
405 | 215 | # so we can handle them in same way | 270 | print(message) |
406 | 216 | def _check_image(connection): | 271 | sys.exit(status) |
403 | 217 | return check_image(connection, args.amp_image_tag, args.amp_image_days) | ||
407 | 218 | 272 | ||
408 | 219 | checks = { | ||
409 | 220 | 'loadbalancers': check_loadbalancers, | ||
410 | 221 | 'amphorae': check_amphorae, | ||
411 | 222 | 'pools': check_pools, | ||
412 | 223 | 'image': _check_image, | ||
413 | 224 | } | ||
414 | 225 | 273 | ||
417 | 226 | connection = openstack.connect(cloud='envvars') | 274 | if __name__ == '__main__': |
418 | 227 | nagios_exit(*checks[args.check](connection)) | 275 | main() |
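The reduction performed by `filter_checks` in the hunk above can be sketched in isolation. This is a simplified, self-contained re-implementation for illustration only, not the plugin's code: the function name `overall_status`, the short constant names, and the sample alarms are assumptions, and the summary message is omitted. The rule it follows is the one in the diff: CRITICAL if any non-ignored alarm is critical, WARNING if any non-ignored alarm remains, otherwise OK.

```python
import collections
import re

OK, WARNING, CRITICAL = 0, 1, 2
Alarm = collections.namedtuple("Alarm", "lvl, desc")


def overall_status(alarms, ignored_re=""):
    """Reduce (level, message) pairs to one status, skipping ignored alarms."""
    full = [Alarm(lvl, desc) for lvl, desc in alarms]
    if ignored_re:
        important = [a for a in full if not re.search(ignored_re, a.desc)]
    else:
        # No ignore pattern configured: every alarm counts.
        important = full
    if any(a.lvl == CRITICAL for a in important):
        return CRITICAL
    if important:
        return WARNING
    return OK


# Hypothetical alarms shaped like the tuples returned by the check_* functions.
alarms = [
    (CRITICAL, "loadbalancer lb-1 operating_status is DEGRADED"),
    (WARNING, "loadbalancer lb-2 admin_state_up is False"),
]

print(overall_status(alarms))
print(overall_status(alarms, "(?:DEGRADED)"))
print(overall_status(alarms, "(?:DEGRADED)|(?:lb-2)"))
```

With these inputs the three calls yield 2, 1 and 0: ignoring the critical alarm leaves only the warning, and ignoring both leaves nothing to report.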
419 | diff --git a/lib/lib_openstack_service_checks.py b/lib/lib_openstack_service_checks.py | |||
420 | index 5e82395..c74d61d 100644 | |||
421 | --- a/lib/lib_openstack_service_checks.py | |||
422 | +++ b/lib/lib_openstack_service_checks.py | |||
423 | @@ -202,16 +202,8 @@ class OSCHelper(): | |||
424 | 202 | charm_plugin_dir = os.path.join(hookenv.charm_dir(), 'files', 'plugins/') | 202 | charm_plugin_dir = os.path.join(hookenv.charm_dir(), 'files', 'plugins/') |
425 | 203 | host.rsync(charm_plugin_dir, self.plugins_dir, options=['--executability']) | 203 | host.rsync(charm_plugin_dir, self.plugins_dir, options=['--executability']) |
426 | 204 | 204 | ||
437 | 205 | def render_checks(self, creds): | 205 | def _render_nova_checks(self, nrpe): |
438 | 206 | render(source='nagios.novarc', target=self.novarc, context=creds, | 206 | """Nova services health.""" |
429 | 207 | owner='nagios', group='nagios') | ||
430 | 208 | |||
431 | 209 | nrpe = NRPE() | ||
432 | 210 | if not os.path.exists(self.plugins_dir): | ||
433 | 211 | os.makedirs(self.plugins_dir) | ||
434 | 212 | |||
435 | 213 | self.update_plugins() | ||
436 | 214 | # Nova services health | ||
439 | 215 | nova_check_command = os.path.join(self.plugins_dir, 'check_nova_services.py') | 207 | nova_check_command = os.path.join(self.plugins_dir, 'check_nova_services.py') |
440 | 216 | check_command = '{} --warn {} --crit {} --skip-aggregates {} {}'.format( | 208 | check_command = '{} --warn {} --crit {} --skip-aggregates {} {}'.format( |
441 | 217 | nova_check_command, self.nova_warn, self.nova_crit, self.nova_skip_aggregates, | 209 | nova_check_command, self.nova_warn, self.nova_crit, self.nova_skip_aggregates, |
442 | @@ -221,7 +213,8 @@ class OSCHelper(): | |||
443 | 221 | check_cmd=check_command, | 213 | check_cmd=check_command, |
444 | 222 | ) | 214 | ) |
445 | 223 | 215 | ||
447 | 224 | # Neutron agents health | 216 | def _render_neutron_checks(self, nrpe): |
448 | 217 | """Neutron agents health.""" | ||
449 | 225 | if self.is_neutron_agents_check_enabled: | 218 | if self.is_neutron_agents_check_enabled: |
450 | 226 | nrpe.add_check(shortname='neutron_agents', | 219 | nrpe.add_check(shortname='neutron_agents', |
451 | 227 | description='Check that enabled Neutron agents are up', | 220 | description='Check that enabled Neutron agents are up', |
452 | @@ -231,6 +224,7 @@ class OSCHelper(): | |||
453 | 231 | else: | 224 | else: |
454 | 232 | nrpe.remove_check(shortname='neutron_agents') | 225 | nrpe.remove_check(shortname='neutron_agents') |
455 | 233 | 226 | ||
456 | 227 | def _render_cinder_checks(self, nrpe): | ||
457 | 234 | # Cinder services health | 228 | # Cinder services health |
458 | 235 | cinder_check_command = os.path.join(self.plugins_dir, 'check_cinder_services.py') | 229 | cinder_check_command = os.path.join(self.plugins_dir, 'check_cinder_services.py') |
459 | 236 | check_command = '{} {}'.format(cinder_check_command, self.skip_disabled) | 230 | check_command = '{} {}'.format(cinder_check_command, self.skip_disabled) |
460 | @@ -239,6 +233,7 @@ class OSCHelper(): | |||
461 | 239 | check_cmd=check_command, | 233 | check_cmd=check_command, |
462 | 240 | ) | 234 | ) |
463 | 241 | 235 | ||
464 | 236 | def _render_octavia_checks(self, nrpe): | ||
465 | 242 | # only care about octavia after 18.04 | 237 | # only care about octavia after 18.04 |
466 | 243 | if host.lsb_release()['DISTRIB_RELEASE'] >= '18.04': | 238 | if host.lsb_release()['DISTRIB_RELEASE'] >= '18.04': |
467 | 244 | if self.is_octavia_check_enabled: | 239 | if self.is_octavia_check_enabled: |
468 | @@ -246,24 +241,34 @@ class OSCHelper(): | |||
469 | 246 | script = os.path.join(self.plugins_dir, 'check_octavia.py') | 241 | script = os.path.join(self.plugins_dir, 'check_octavia.py') |
470 | 247 | 242 | ||
471 | 248 | for check in ('loadbalancers', 'amphorae', 'pools'): | 243 | for check in ('loadbalancers', 'amphorae', 'pools'): |
472 | 244 | check_cmd = '{} --check {}'.format(script, check) | ||
473 | 245 | ignore = self.charm_config.get('octavia-%s-ignored' % check) | ||
474 | 246 | if ignore: | ||
475 | 247 | check_cmd += ' --ignored {}'.format(ignore) | ||
476 | 249 | nrpe.add_check( | 248 | nrpe.add_check( |
477 | 250 | shortname='octavia_{}'.format(check), | 249 | shortname='octavia_{}'.format(check), |
478 | 251 | description='Check octavia {} status'.format(check), | 250 | description='Check octavia {} status'.format(check), |
480 | 252 | check_cmd='{} --check {}'.format(script, check), | 251 | check_cmd=check_cmd, |
481 | 253 | ) | 252 | ) |
482 | 254 | 253 | ||
483 | 255 | # image check has extra args, add it separately | 254 | # image check has extra args, add it separately |
484 | 256 | check = 'image' | 255 | check = 'image' |
485 | 256 | check_cmd = "{} --check {}".format(script, check) | ||
486 | 257 | check_cmd += " --amp-image-tag {}".format(self.octavia_amp_image_tag) | ||
487 | 258 | check_cmd += " --amp-image-days {}".format(self.octavia_amp_image_days) | ||
488 | 259 | ignore = self.charm_config.get('octavia-%s-ignored' % check) | ||
489 | 260 | if ignore: | ||
490 | 261 | check_cmd += " --ignored {}".format(ignore) | ||
491 | 257 | nrpe.add_check( | 262 | nrpe.add_check( |
492 | 258 | shortname='octavia_{}'.format(check), | 263 | shortname='octavia_{}'.format(check), |
493 | 259 | description='Check octavia {} status'.format(check), | 264 | description='Check octavia {} status'.format(check), |
496 | 260 | check_cmd='{} --check {} --amp-image-tag {} --amp-image-days {}'.format( | 265 | check_cmd=check_cmd, |
495 | 261 | script, check, self.octavia_amp_image_tag, self.octavia_amp_image_days), | ||
497 | 262 | ) | 266 | ) |
498 | 263 | else: | 267 | else: |
499 | 264 | for check in ('loadbalancers', 'amphorae', 'pools', 'image'): | 268 | for check in ('loadbalancers', 'amphorae', 'pools', 'image'): |
500 | 265 | nrpe.remove_check(shortname='octavia_{}'.format(check)) | 269 | nrpe.remove_check(shortname='octavia_{}'.format(check)) |
501 | 266 | 270 | ||
502 | 271 | def _render_contrail_checks(self, nrpe): | ||
503 | 267 | if self.contrail_analytics_vip: | 272 | if self.contrail_analytics_vip: |
504 | 268 | contrail_check_command = '{} --host {}'.format( | 273 | contrail_check_command = '{} --host {}'.format( |
505 | 269 | os.path.join(self.plugins_dir, 'check_contrail_analytics_alarms.py'), | 274 | os.path.join(self.plugins_dir, 'check_contrail_analytics_alarms.py'), |
506 | @@ -279,6 +284,7 @@ class OSCHelper(): | |||
507 | 279 | else: | 284 | else: |
508 | 280 | nrpe.remove_check(shortname='contrail_analytics_alarms') | 285 | nrpe.remove_check(shortname='contrail_analytics_alarms') |
509 | 281 | 286 | ||
510 | 287 | def _render_dns_checks(self, nrpe): | ||
511 | 282 | if len(self.check_dns): | 288 | if len(self.check_dns): |
512 | 283 | nrpe.add_check(shortname='dns_multi', | 289 | nrpe.add_check(shortname='dns_multi', |
513 | 284 | description='Check DNS names are resolvable', | 290 | description='Check DNS names are resolvable', |
514 | @@ -289,8 +295,24 @@ class OSCHelper(): | |||
515 | 289 | ) | 295 | ) |
516 | 290 | else: | 296 | else: |
517 | 291 | nrpe.remove_check(shortname='dns_multi') | 297 | nrpe.remove_check(shortname='dns_multi') |
518 | 292 | nrpe.write() | ||
519 | 293 | 298 | ||
520 | 299 | def render_checks(self, creds): | ||
521 | 300 | render(source='nagios.novarc', target=self.novarc, context=creds, | ||
522 | 301 | owner='nagios', group='nagios') | ||
523 | 302 | |||
524 | 303 | nrpe = NRPE() | ||
525 | 304 | if not os.path.exists(self.plugins_dir): | ||
526 | 305 | os.makedirs(self.plugins_dir) | ||
527 | 306 | |||
528 | 307 | self.update_plugins() | ||
529 | 308 | self._render_nova_checks(nrpe) | ||
530 | 309 | self._render_neutron_checks(nrpe) | ||
531 | 310 | self._render_cinder_checks(nrpe) | ||
532 | 311 | self._render_octavia_checks(nrpe) | ||
533 | 312 | self._render_contrail_checks(nrpe) | ||
534 | 313 | self._render_dns_checks(nrpe) | ||
535 | 314 | |||
536 | 315 | nrpe.write() | ||
537 | 294 | self.create_endpoint_checks(creds) | 316 | self.create_endpoint_checks(creds) |
538 | 295 | 317 | ||
539 | 296 | def _split_url(self, netloc, scheme): | 318 | def _split_url(self, netloc, scheme): |
540 | diff --git a/tests/unit/conftest.py b/tests/unit/conftest.py | |||
541 | index 6797e85..639b91a 100644 | |||
542 | --- a/tests/unit/conftest.py | |||
543 | +++ b/tests/unit/conftest.py | |||
544 | @@ -4,6 +4,10 @@ import sys | |||
545 | 4 | 4 | ||
546 | 5 | import pytest | 5 | import pytest |
547 | 6 | 6 | ||
548 | 7 | TEST_DIR = os.path.dirname(__file__) | ||
549 | 8 | CHECKS_DIR = os.path.join(TEST_DIR, '..', '..', 'files', 'plugins') | ||
550 | 9 | sys.path.append(CHECKS_DIR) | ||
551 | 10 | |||
552 | 7 | 11 | ||
553 | 8 | # If layer options are used, add this to openstackservicechecks | 12 | # If layer options are used, add this to openstackservicechecks |
554 | 9 | # and import layer in lib_openstack_service_checks | 13 | # and import layer in lib_openstack_service_checks |
555 | @@ -77,14 +81,3 @@ def openstackservicechecks(tmpdir, mock_hookenv_config, mock_charm_dir, monkeypa | |||
556 | 77 | monkeypatch.setattr('lib_openstack_service_checks.OSCHelper', lambda: helper) | 81 | monkeypatch.setattr('lib_openstack_service_checks.OSCHelper', lambda: helper) |
557 | 78 | 82 | ||
558 | 79 | return helper | 83 | return helper |
559 | 80 | |||
560 | 81 | |||
561 | 82 | @pytest.fixture(scope='module') | ||
562 | 83 | def check_contrail_analytics(): | ||
563 | 84 | pre = sys.path | ||
564 | 85 | TEST_DIR = os.path.dirname(__file__) | ||
565 | 86 | tests_dir = os.path.join(TEST_DIR, '..', '..', 'files', 'plugins') | ||
566 | 87 | sys.path.append(tests_dir) | ||
567 | 88 | import check_contrail_analytics_alarms as checks # noqa | ||
568 | 89 | yield checks | ||
569 | 90 | sys.path = pre | ||
570 | diff --git a/tests/unit/test_check_cinder_services.py b/tests/unit/test_check_cinder_services.py | |||
571 | index 709b4dc..3428dd6 100644 | |||
572 | --- a/tests/unit/test_check_cinder_services.py | |||
573 | +++ b/tests/unit/test_check_cinder_services.py | |||
574 | @@ -1,11 +1,7 @@ | |||
575 | 1 | import pytest | 1 | import pytest |
576 | 2 | import nagios_plugin3 | 2 | import nagios_plugin3 |
577 | 3 | 3 | ||
583 | 4 | import sys | 4 | import check_cinder_services |
579 | 5 | |||
580 | 6 | sys.path.append("files/plugins") | ||
581 | 7 | |||
582 | 8 | import check_cinder_services # noqa: E402 | ||
584 | 9 | 5 | ||
585 | 10 | 6 | ||
586 | 11 | @pytest.mark.parametrize( | 7 | @pytest.mark.parametrize( |
587 | diff --git a/tests/unit/test_check_contrail_analytics_alarms.py b/tests/unit/test_check_contrail_analytics_alarms.py | |||
588 | index c7fd6e9..886a396 100644 | |||
589 | --- a/tests/unit/test_check_contrail_analytics_alarms.py | |||
590 | +++ b/tests/unit/test_check_contrail_analytics_alarms.py | |||
591 | @@ -1,14 +1,15 @@ | |||
592 | 1 | import json | 1 | import json |
593 | 2 | import os | 2 | import os |
594 | 3 | 3 | ||
595 | 4 | import check_contrail_analytics_alarms | ||
596 | 5 | |||
597 | 4 | TEST_DIR = os.path.dirname(__file__) | 6 | TEST_DIR = os.path.dirname(__file__) |
598 | 5 | 7 | ||
599 | 6 | 8 | ||
601 | 7 | def test_parse_contrail_alarms(check_contrail_analytics): | 9 | def test_parse_contrail_alarms(): |
602 | 8 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: | 10 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: |
603 | 9 | data = json.load(f) | 11 | data = json.load(f) |
606 | 10 | assert hasattr(check_contrail_analytics, 'parse_contrail_alarms') | 12 | parsed = check_contrail_analytics_alarms.parse_contrail_alarms(data) |
605 | 11 | parsed = check_contrail_analytics.parse_contrail_alarms(data) | ||
607 | 12 | assert parsed in """ | 13 | assert parsed in """ |
608 | 13 | CRITICAL: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[0], ignoring r'' | 14 | CRITICAL: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[0], ignoring r'' |
609 | 14 | CRITICAL: vrouter{compute-10.maas, sev=1, ts[2020-06-25 18:29:23.149146]} Vrouter interface(s) down. | 15 | CRITICAL: vrouter{compute-10.maas, sev=1, ts[2020-06-25 18:29:23.149146]} Vrouter interface(s) down. |
610 | @@ -25,12 +26,11 @@ CRITICAL: vrouter{compute-7.maas, sev=1, ts[2020-07-03 18:30:32.481386]} Vrouter | |||
611 | 25 | """ # noqa: ignore=F501 | 26 | """ # noqa: ignore=F501 |
612 | 26 | 27 | ||
613 | 27 | 28 | ||
615 | 28 | def test_parse_contrail_alarms_filter_vrouter_control_9(check_contrail_analytics): | 29 | def test_parse_contrail_alarms_filter_vrouter_control_9(): |
616 | 29 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: | 30 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: |
617 | 30 | data = json.load(f) | 31 | data = json.load(f) |
618 | 31 | assert hasattr(check_contrail_analytics, 'parse_contrail_alarms') | ||
619 | 32 | ignored_re = r'(?:vrouter)|(?:control-9)' | 32 | ignored_re = r'(?:vrouter)|(?:control-9)' |
621 | 33 | parsed = check_contrail_analytics.parse_contrail_alarms(data, ignored=ignored_re) | 33 | parsed = check_contrail_analytics_alarms.parse_contrail_alarms(data, ignored=ignored_re) |
622 | 34 | assert parsed in """ | 34 | assert parsed in """ |
623 | 35 | CRITICAL: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[8], ignoring r'(?:vrouter)|(?:control-9)' | 35 | CRITICAL: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[8], ignoring r'(?:vrouter)|(?:control-9)' |
624 | 36 | WARNING: control-node{control-8-contrail-rmq, sev=0, ts[2020-06-25 18:29:23.684803]} Node Failure. NodeStatus UVE not present. | 36 | WARNING: control-node{control-8-contrail-rmq, sev=0, ts[2020-06-25 18:29:23.684803]} Node Failure. NodeStatus UVE not present. |
625 | @@ -39,32 +39,30 @@ CRITICAL: control-node{control-7-contrail-rmq, sev=1, ts[2020-06-25 18:29:24.377 | |||
626 | 39 | """ # noqa: ignore=F501 | 39 | """ # noqa: ignore=F501 |
627 | 40 | 40 | ||
628 | 41 | 41 | ||
630 | 42 | def test_parse_contrail_alarms_filter_critical(check_contrail_analytics): | 42 | def test_parse_contrail_alarms_filter_critical(): |
631 | 43 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: | 43 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: |
632 | 44 | data = json.load(f) | 44 | data = json.load(f) |
633 | 45 | assert hasattr(check_contrail_analytics, 'parse_contrail_alarms') | ||
634 | 46 | ignored_re = r'(?:CRITICAL)' | 45 | ignored_re = r'(?:CRITICAL)' |
636 | 47 | parsed = check_contrail_analytics.parse_contrail_alarms(data, ignored=ignored_re) | 46 | parsed = check_contrail_analytics_alarms.parse_contrail_alarms(data, ignored=ignored_re) |
637 | 48 | assert parsed in """ | 47 | assert parsed in """ |
638 | 49 | WARNING: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[10], ignoring r'(?:CRITICAL)' | 48 | WARNING: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[10], ignoring r'(?:CRITICAL)' |
639 | 50 | WARNING: control-node{control-8-contrail-rmq, sev=0, ts[2020-06-25 18:29:23.684803]} Node Failure. NodeStatus UVE not present. | 49 | WARNING: control-node{control-8-contrail-rmq, sev=0, ts[2020-06-25 18:29:23.684803]} Node Failure. NodeStatus UVE not present. |
640 | 51 | """ # noqa: ignore=F501 | 50 | """ # noqa: ignore=F501 |
641 | 52 | 51 | ||
642 | 53 | 52 | ||
644 | 54 | def test_parse_contrail_alarms_all_ignored(check_contrail_analytics): | 53 | def test_parse_contrail_alarms_all_ignored(): |
645 | 55 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: | 54 | with open(os.path.join(TEST_DIR, 'contrail_alert_data.json')) as f: |
646 | 56 | data = json.load(f) | 55 | data = json.load(f) |
647 | 57 | assert hasattr(check_contrail_analytics, 'parse_contrail_alarms') | ||
648 | 58 | ignored_re = r'(?:CRITICAL)|(?:WARNING)' | 56 | ignored_re = r'(?:CRITICAL)|(?:WARNING)' |
650 | 59 | parsed = check_contrail_analytics.parse_contrail_alarms(data, ignored=ignored_re) | 57 | parsed = check_contrail_analytics_alarms.parse_contrail_alarms(data, ignored=ignored_re) |
651 | 60 | assert parsed in """ | 58 | assert parsed in """ |
652 | 61 | OK: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[11], ignoring r'(?:CRITICAL)|(?:WARNING)' | 59 | OK: total_alarms[11], unacked_or_sev_gt_0[10], total_ignored[11], ignoring r'(?:CRITICAL)|(?:WARNING)' |
653 | 62 | """ # noqa: ignore=F501 | 60 | """ # noqa: ignore=F501 |
654 | 63 | 61 | ||
655 | 64 | 62 | ||
657 | 65 | def test_parse_contrail_alarms_no_alarms(check_contrail_analytics): | 63 | def test_parse_contrail_alarms_no_alarms(): |
658 | 66 | ignored_re = r'' | 64 | ignored_re = r'' |
660 | 67 | parsed = check_contrail_analytics.parse_contrail_alarms({}, ignored=ignored_re) | 65 | parsed = check_contrail_analytics_alarms.parse_contrail_alarms({}, ignored=ignored_re) |
661 | 68 | assert parsed in """ | 66 | assert parsed in """ |
662 | 69 | OK: total_alarms[0], unacked_or_sev_gt_0[0], total_ignored[0], ignoring r'' | 67 | OK: total_alarms[0], unacked_or_sev_gt_0[0], total_ignored[0], ignoring r'' |
663 | 70 | """ | 68 | """ |
664 | diff --git a/tests/unit/test_check_nova_services.py b/tests/unit/test_check_nova_services.py | |||
665 | index 10c13ed..dad32a6 100644 | |||
666 | --- a/tests/unit/test_check_nova_services.py | |||
667 | +++ b/tests/unit/test_check_nova_services.py | |||
668 | @@ -1,10 +1,7 @@ | |||
669 | 1 | import pytest | 1 | import pytest |
670 | 2 | import nagios_plugin3 | 2 | import nagios_plugin3 |
671 | 3 | 3 | ||
676 | 4 | import sys | 4 | import check_nova_services |
673 | 5 | sys.path.append('files/plugins') | ||
674 | 6 | |||
675 | 7 | import check_nova_services # noqa: E402 | ||
677 | 8 | 5 | ||
678 | 9 | 6 | ||
679 | 10 | @pytest.mark.parametrize('is_skip_disabled,num_nodes', | 7 | @pytest.mark.parametrize('is_skip_disabled,num_nodes', |
680 | diff --git a/tests/unit/test_check_octavia.py b/tests/unit/test_check_octavia.py | |||
681 | 11 | new file mode 100644 | 8 | new file mode 100644 |
682 | index 0000000..08c9357 | |||
683 | --- /dev/null | |||
684 | +++ b/tests/unit/test_check_octavia.py | |||
685 | @@ -0,0 +1,117 @@ | |||
686 | 1 | from datetime import datetime, timedelta | ||
687 | 2 | import json | ||
688 | 3 | import unittest.mock as mock | ||
689 | 4 | from uuid import uuid4 | ||
690 | 5 | |||
691 | 6 | import check_octavia | ||
692 | 7 | import pytest | ||
693 | 8 | |||
694 | 9 | |||
695 | 10 | @mock.patch('check_octavia.openstack.connect') | ||
696 | 11 | @pytest.mark.parametrize('check', [ | ||
697 | 12 | 'loadbalancers', 'pools', "amphorae", "image" | ||
698 | 13 | ]) | ||
699 | 14 | def test_stable_alarms(connect, check): | ||
700 | 15 | args = mock.MagicMock() | ||
701 | 16 | args.ignored = r'' | ||
702 | 17 | args.check = check | ||
703 | 18 | if check == "amphorae": | ||
704 | 19 | # Present 0 Amphora instances | ||
705 | 20 | resp = connect().load_balancer.get() | ||
706 | 21 | resp.status_code = 200 | ||
707 | 22 | resp.content = json.dumps({'amphora': []}) | ||
708 | 23 | elif check == "image": | ||
709 | 24 | # Present 1 Active Fresh Amphora image | ||
710 | 25 | args.amp_image_tag = 'octavia' | ||
711 | 26 | args.amp_image_days = 1 | ||
712 | 27 | amp_image = mock.MagicMock() | ||
713 | 28 | amp_image.status = 'active' | ||
714 | 29 | amp_image.updated_at = datetime.now().isoformat() | ||
715 | 30 | connect().image.images.return_value = [amp_image] | ||
716 | 31 | |||
717 | 32 | status, message = check_octavia.process_checks(args) | ||
718 | 33 | assert message in """ | ||
719 | 34 | OK: total_alarms[0], total_crit[0], total_ignored[0], ignoring r'' | ||
720 | 35 | """ | ||
721 | 36 | assert status == check_octavia.NAGIOS_STATUS_OK | ||
722 | 37 | |||
723 | 38 | |||
724 | 39 | @mock.patch('check_octavia.openstack.connect') | ||
725 | 40 | def test_no_images_is_ignorable(connect): | ||
726 | 41 | args = mock.MagicMock() | ||
727 | 42 | args.ignored = 'none exist' | ||
728 | 43 | args.check = "image" | ||
729 | 44 | # Present 1 Active Fresh Amphora image | ||
730 | 45 | args.amp_image_tag = 'octavia' | ||
731 | 46 | args.amp_image_days = 1 | ||
732 | 47 | connect().image.images.return_value = [] | ||
733 | 48 | |||
734 | 49 | status, message = check_octavia.process_checks(args) | ||
735 | 50 | assert message in """ | ||
736 | 51 | OK: total_alarms[1], total_crit[1], total_ignored[1], ignoring r'(?:none exist)' | ||
737 | 52 | """ | ||
738 | 53 | assert status == check_octavia.NAGIOS_STATUS_OK | ||
739 | 54 | |||
740 | 55 | |||
741 | 56 | @mock.patch('check_octavia.openstack.connect') | ||
742 | 57 | def test_no_images(connect): | ||
743 | 58 | args = mock.MagicMock() | ||
744 | 59 | args.ignored = r'' | ||
745 | 60 | args.check = "image" | ||
746 | 61 | # Present 1 Active Fresh Amphora image | ||
747 | 62 | args.amp_image_tag = 'octavia' | ||
748 | 63 | args.amp_image_days = 1 | ||
749 | 64 | connect().image.images.return_value = [] | ||
750 | 65 | |||
751 | 66 | status, message = check_octavia.process_checks(args) | ||
752 | 67 | assert message in """ | ||
753 | 68 | CRITICAL: total_alarms[1], total_crit[1], total_ignored[0], ignoring r'' | ||
754 | 69 | Octavia requires image with tag octavia to create amphora, but none exist | ||
755 | 70 | """ | ||
756 | 71 | assert status == check_octavia.NAGIOS_STATUS_CRITICAL | ||
757 | 72 | |||
758 | 73 | |||
759 | 74 | @mock.patch('check_octavia.openstack.connect') | ||
760 | 75 | def test_no_active_images(connect): | ||
761 | 76 | args = mock.MagicMock() | ||
762 | 77 | args.ignored = r'' | ||
763 | 78 | args.check = "image" | ||
764 | 79 | # Present 1 Active Fresh Amphora image | ||
765 | 80 | args.amp_image_tag = 'octavia' | ||
766 | 81 | args.amp_image_days = 1 | ||
767 | 82 | amp_image = mock.MagicMock() | ||
768 | 83 | amp_image.name = "bob-the-image" | ||
769 | 84 | amp_image.id = str(uuid4()) | ||
770 | 85 | amp_image.status = 'inactive' | ||
771 | 86 | amp_image.updated_at = datetime.now().isoformat() | ||
772 | 87 | connect().image.images.return_value = [amp_image] | ||
773 | 88 | |||
774 | 89 | status, message = check_octavia.process_checks(args) | ||
775 | 90 | assert message in """ | ||
776 | 91 | CRITICAL: total_alarms[1], total_crit[1], total_ignored[0], ignoring r'' | ||
777 | 92 | Octavia requires image with tag octavia to create amphora, but none are active: bob-the-image({}) | ||
778 | 93 | """.format(amp_image.id) | ||
779 | 94 | assert status == check_octavia.NAGIOS_STATUS_CRITICAL | ||
780 | 95 | |||
781 | 96 | |||
782 | 97 | @mock.patch('check_octavia.openstack.connect') | ||
783 | 98 | def test_no_fresh_images(connect): | ||
784 | 99 | args = mock.MagicMock() | ||
785 | 100 | args.ignored = r'' | ||
786 | 101 | args.check = "image" | ||
787 | 102 | # Present 1 Active Fresh Amphora image | ||
788 | 103 | args.amp_image_tag = 'octavia' | ||
789 | 104 | args.amp_image_days = 1 | ||
790 | 105 | amp_image = mock.MagicMock() | ||
791 | 106 | amp_image.name = "bob-the-image" | ||
792 | 107 | amp_image.id = str(uuid4()) | ||
793 | 108 | amp_image.status = 'active' | ||
794 | 109 | amp_image.updated_at = (datetime.now() - timedelta(days=2)).isoformat() | ||
795 | 110 | connect().image.images.return_value = [amp_image] | ||
796 | 111 | |||
797 | 112 | status, message = check_octavia.process_checks(args) | ||
798 | 113 | assert message in """ | ||
799 | 114 | WARNING: total_alarms[1], total_crit[0], total_ignored[0], ignoring r'' | ||
800 | 115 | Octavia requires image with tag octavia to create amphora, but all images are older than 1 day(s): bob-the-image({}) | ||
801 | 116 | """.format(amp_image.id) | ||
802 | 117 | assert status == check_octavia.NAGIOS_STATUS_WARNING |
This change adds an ignore list of keywords for each of the four octavia checks: loadbalancers, amphorae, pools, and image.
Any alert whose text contains a keyword from the ignore list is ignored in the output of check_octavia. Suppose you have a test or non-production loadbalancer with ID deadbeef-1234-56789012-dead-beef whose alerts you do not want to receive.
You can use this config:
juju config <openstack-service-checks-app> octavia-loadbalancer-ignored='deadbeef-1234-56789012-dead-beef,'
to ignore any checks associated with the loadbalancer such as it being inactive or degraded.
Alternatively, you could silence all degraded loadbalancer alerts with
juju config <openstack-service-checks-app> octavia-loadbalancer-ignored='DEGRADED,'
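As a rough illustration of how such a comma-separated ignore list could be applied, here is a minimal sketch. It is not the code from check_octavia.py: the function names `build_ignore_pattern` and `filter_alarms` and the sample alarm strings are hypothetical. The only detail taken from this proposal is the `(?:keyword)` form of the pattern, which appears in the expected output of the unit tests (for example `ignoring r'(?:none exist)'`). The sketch assumes each keyword is used as a regular expression fragment without escaping.

```python
import re


def build_ignore_pattern(ignored):
    """Turn 'kw1, kw2,' into a regex string such as '(?:kw1)|(?:kw2)'.

    Hypothetical helper: empty entries (e.g. after a trailing comma)
    are dropped, and surrounding whitespace is stripped.
    """
    keywords = [kw.strip() for kw in ignored.split(',') if kw.strip()]
    return '|'.join('(?:{})'.format(kw) for kw in keywords)


def filter_alarms(alarms, ignored):
    """Split alarm lines into (kept, dropped) using the ignore list."""
    pattern = build_ignore_pattern(ignored)
    if not pattern:
        # Nothing to ignore: every alarm is kept.
        return list(alarms), []
    regex = re.compile(pattern)
    kept, dropped = [], []
    for alarm in alarms:
        # An alarm is dropped when any keyword matches anywhere in its text.
        (dropped if regex.search(alarm) else kept).append(alarm)
    return kept, dropped


# Made-up alarm lines, for illustration only.
alarms = [
    'loadbalancer deadbeef-1234-56789012-dead-beef provisioning_status is ERROR',
    'loadbalancer cafe0000 operating_status is DEGRADED',
]
kept, dropped = filter_alarms(alarms, 'deadbeef-1234-56789012-dead-beef,')
print(kept)
print(dropped)
```

With the ignore list set to the loadbalancer ID, only the alarm mentioning that ID is dropped. Setting it to 'DEGRADED,' instead would drop the second alarm and keep the first.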