Merge ~rgildein/juju-verify:bug/1917599 into ~canonical-solutions-engineering/juju-verify:master

Proposed by Robert Gildein
Status: Merged
Approved by: Martin Kalcok
Approved revision: f291f852910e6db8caae92f7f82488f17fdedcdb
Merged at revision: 1681157330e4f7208529c10dbec1b6cbd3fabfd9
Proposed branch: ~rgildein/juju-verify:bug/1917599
Merge into: ~canonical-solutions-engineering/juju-verify:master
Diff against target: 380 lines (+196/-26)
4 files modified
juju_verify/utils/unit.py (+3/-3)
juju_verify/verifiers/ceph.py (+75/-6)
tests/unit/conftest.py (+13/-3)
tests/unit/verifiers/test_ceph.py (+105/-14)
Reviewer Review Type Date Requested Status
🤖 prod-jenkaas-bootstack (community) continuous-integration Approve
Martin Kalcok (community) Approve
Canonical Solutions Engineering Team Pending
Review via email: mp+400024@code.launchpad.net

Commit message

Add check for number of replication

These changes provide a check number of replications. It groups the
units into an application and obtains the minimum number of
replications from the relation of the ceph-osd and ceph-mon
applications. The idea is to find the minimum number from the
replication list calculated from the pools "size" - "min_size".

If no pools are available, the check returns the Result(True).

Output example: [1]

Closes-bug: LP1917599

---
[1]: https://pastebin.ubuntu.com/p/8W4YyBVj6d/
[LP1917599]: https://bugs.launchpad.net/juju-verify/+bug/1917599

To post a comment you must log in.
Revision history for this message
🤖 Canonical IS Merge Bot (canonical-is-mergebot) wrote :

This merge proposal is being monitored by mergebot. Change the status to Approved to merge.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :

A CI job is currently in progress. A follow up comment will be added when it completes.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :
review: Approve (continuous-integration)
Revision history for this message
Alvaro Uria (aluria) wrote :

In the Ceph charms, they use the word "replicas" instead of "replications". I'm fine using any of them as both are correct. However, I've suggested for the output string to use the word "replicas".

Code-related, it looks good and I will give it a final pass tomorrow morning.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :

A CI job is currently in progress. A follow up comment will be added when it completes.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :
review: Approve (continuous-integration)
Revision history for this message
Alvaro Uria (aluria) wrote :

Added a comment about how the minimum value of size/min_size is retrieved.

Revision history for this message
Martin Kalcok (martin-kalcok) wrote :

Overall lgtm, but I do have few recommendations (see inline comments).

Revision history for this message
Alvaro Uria (aluria) wrote :

Amended by comment about min(size - min_size).

Revision history for this message
Robert Gildein (rgildein) wrote :

Thanks for comments.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :

A CI job is currently in progress. A follow up comment will be added when it completes.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :
review: Approve (continuous-integration)
Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :

A CI job is currently in progress. A follow up comment will be added when it completes.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :
review: Approve (continuous-integration)
Revision history for this message
Martin Kalcok (martin-kalcok) wrote :

+1 I'll give final approve after CI run. Thanks.

review: Approve
Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :

A CI job is currently in progress. A follow up comment will be added when it completes.

Revision history for this message
🤖 prod-jenkaas-bootstack (prod-jenkaas-bootstack) wrote :
review: Approve (continuous-integration)
Revision history for this message
🤖 Canonical IS Merge Bot (canonical-is-mergebot) wrote :

Change successfully merged at revision 1681157330e4f7208529c10dbec1b6cbd3fabfd9

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/juju_verify/utils/unit.py b/juju_verify/utils/unit.py
2index 0318cdb..05a9785 100644
3--- a/juju_verify/utils/unit.py
4+++ b/juju_verify/utils/unit.py
5@@ -17,7 +17,7 @@
6 """Helper function to manage Juju unit."""
7 import asyncio
8 import re
9-from typing import List, Dict
10+from typing import List, Dict, Any
11
12 from juju.action import Action
13 from juju.unit import Unit
14@@ -27,8 +27,8 @@ from juju_verify.exceptions import VerificationError, CharmException
15 CHARM_URL_PATTERN = re.compile(r'^(.*):(.*/)?(?P<charm>.*)(-\d+)$')
16
17
18-def run_action_on_units(units: List[Unit], action: str,
19- **params: str) -> Dict[str, Action]:
20+def run_action_on_units(
21+ units: List[Unit], action: str, **params: Any) -> Dict[str, Action]:
22 """Run juju action on specified units.
23
24 :param units: List/Tuple of Unit object
25diff --git a/juju_verify/verifiers/ceph.py b/juju_verify/verifiers/ceph.py
26index 32556cc..96253bd 100644
27--- a/juju_verify/verifiers/ceph.py
28+++ b/juju_verify/verifiers/ceph.py
29@@ -15,8 +15,9 @@
30 # You should have received a copy of the GNU General Public License along with
31 # this program. If not, see https://www.gnu.org/licenses/.
32 """ceph-osd verification."""
33+import json
34 import logging
35-from typing import Dict, Optional
36+from typing import Dict, Optional, Any, List
37
38 from juju.unit import Unit
39
40@@ -36,6 +37,8 @@ class CephCommon(BaseVerifier): # pylint: disable=W0223
41 """Check Ceph cluster health for specific units.
42
43 This will execute `get-health` against each unit provided.
44+
45+ :raises CharmException: if the units do not belong to the ceph-mon charm
46 """
47 verify_charm_unit("ceph-mon", *units)
48 result = Result(success=True)
49@@ -56,12 +59,52 @@ class CephCommon(BaseVerifier): # pylint: disable=W0223
50
51 return result
52
53+ @classmethod
54+ def get_replication_number(cls, unit: Unit) -> Optional[int]:
55+ """Get minimum replication number from ceph-mon unit.
56+
57+ This function runs the `list-pools` action with the parameter 'detail=true'
58+ to get the replication number.
59+ :raises CharmException: if the unit does not belong to the ceph-mon charm
60+ :raises TypeError: if the object pools is not iterable
61+ :raises KeyError: if the pool detail does not contain `size` or `min_size`
62+ :raises json.decoder.JSONDecodeError: if json.loads failed
63+ """
64+ verify_charm_unit("ceph-mon", unit)
65+ action_map = run_action_on_units([unit], "list-pools", detail=True)
66+ action_output = data_from_action(action_map.get(unit.entity_id), "pools")
67+ logger.debug("parse information about pools: %s", action_output)
68+ pools: List[Dict[str, Any]] = json.loads(action_output)
69+
70+ if pools:
71+ return min([pool["size"] - pool["min_size"] for pool in pools])
72+
73+ return None
74+
75
76 class CephOsd(CephCommon):
77 """Implementation of verification checks for the ceph-osd charm."""
78
79 NAME = 'ceph-osd'
80
81+ def __init__(self, units: List[Unit]):
82+ """Ceph-osd charm verifier."""
83+ super().__init__(units=units)
84+ self._ceph_mon_app_map: Optional[Dict[str, Unit]] = None
85+
86+ @property
87+ def ceph_mon_app_map(self) -> Dict[str, Unit]:
88+ """Get a map between ceph-osd applications and the first ceph-mon unit.
89+
90+ :returns: Dictionary with keys as distinct applications of verified units and
91+ values as the first ceph-mon unit obtained from the relation with the
92+ ceph-mon application (<application_name>:mon).
93+ """
94+ if self._ceph_mon_app_map is None:
95+ self._ceph_mon_app_map = self._get_ceph_mon_app_map()
96+
97+ return self._ceph_mon_app_map
98+
99 def _get_ceph_mon_unit(self, app_name: str) -> Optional[Unit]:
100 """Get first ceph-mon unit from relation."""
101 try:
102@@ -74,7 +117,7 @@ class CephOsd(CephCommon):
103
104 return None
105
106- def get_ceph_mon_units(self) -> Dict[str, Unit]:
107+ def _get_ceph_mon_app_map(self) -> Dict[str, Unit]:
108 """Get first ceph-mon units related to verified units.
109
110 This function groups by distinct application names for verified units, and then
111@@ -95,12 +138,38 @@ class CephOsd(CephCommon):
112
113 return ceph_mon_app_map
114
115+ def check_ceph_cluster_health(self) -> Result:
116+ """Check Ceph cluster health for unique ceph-mon units from ceph_mon_app_map."""
117+ unique_ceph_mon_units = set(self.ceph_mon_app_map.values())
118+ return self.check_cluster_health(*unique_ceph_mon_units)
119+
120+ def check_replication_number(self) -> Result:
121+ """Check the minimum number of replications for related applications."""
122+ result = Result(True)
123+
124+ for app_name, ceph_mon_unit in self.ceph_mon_app_map.items():
125+ min_replication_number = self.get_replication_number(ceph_mon_unit)
126+ units = len([unit for unit in self.units if unit.application == app_name])
127+ inactive_units = len(
128+ [unit for unit in self.model.applications[app_name].units
129+ if unit.workload_status != "active"]
130+ )
131+
132+ if (min_replication_number and
133+ (units + inactive_units) > min_replication_number):
134+ result += Result(
135+ False,
136+ f"The minimum number of replicas in '{app_name}' is "
137+ f"{min_replication_number:d} and it's not safe to restart/shutdown "
138+ f"{units:d} units. {inactive_units:d} units are not active."
139+ )
140+
141+ return result
142+
143 def verify_reboot(self) -> Result:
144 """Verify that it's safe to reboot selected ceph-osd units."""
145- ceph_mon_app_map = self.get_ceph_mon_units()
146- # get unique ceph-mon units
147- unique_ceph_mon_units = set(ceph_mon_app_map.values())
148- return aggregate_results(self.check_cluster_health(*unique_ceph_mon_units))
149+ return aggregate_results(self.check_ceph_cluster_health(),
150+ self.check_replication_number())
151
152 def verify_shutdown(self) -> Result:
153 """Verify that it's safe to shutdown selected ceph-osd units."""
154diff --git a/tests/unit/conftest.py b/tests/unit/conftest.py
155index 8a7b55a..17b2f80 100644
156--- a/tests/unit/conftest.py
157+++ b/tests/unit/conftest.py
158@@ -56,7 +56,6 @@ def model(session_mocker, all_units):
159 mock_model = Model()
160 session_mocker.patch.object(Model, 'connect_current')
161 session_mocker.patch.object(Model, 'connect_model')
162- session_mocker.patch.object(Model, 'applications')
163 session_mocker.patch.object(Unit, 'data')
164 session_mocker.patch.object(Unit, 'machine')
165 session_mocker.patch.object(Unit, 'run_action', new_callable=MagicMock)
166@@ -64,15 +63,26 @@ def model(session_mocker, all_units):
167 session_mocker.patch.object(Action, 'status').return_value = 'pending'
168 session_mocker.patch.object(Action, 'data')
169 units = session_mocker.patch('juju.model.Model.units', new_callable=PropertyMock)
170+ applications = session_mocker.patch('juju.model.Model.applications',
171+ new_callable=PropertyMock)
172
173- unit_map = {}
174+ unit_map, app_map = {}, {}
175 for unit_id in all_units:
176 unit = Unit(unit_id, mock_model)
177 charm_name = unit_id.rstrip(digits + '/')
178 unit.data = {'charm-url': 'cs:focal/{}-1'.format(charm_name),
179- "application": charm_name}
180+ "application": charm_name, "workload-status": {"current": "active"}}
181 unit_map[unit_id] = unit
182 unit.run_action.return_value = Action(unit_id + '-action', mock_model)
183+
184+ # add unit to model.applications
185+ if charm_name not in app_map:
186+ app_map[charm_name] = MagicMock()
187+ app_map[charm_name].units = []
188+
189+ app_map[charm_name].units.append(unit)
190+
191 units.return_value = unit_map
192+ applications.return_value = app_map
193
194 return mock_model
195diff --git a/tests/unit/verifiers/test_ceph.py b/tests/unit/verifiers/test_ceph.py
196index f5d16ad..b9fb4b5 100644
197--- a/tests/unit/verifiers/test_ceph.py
198+++ b/tests/unit/verifiers/test_ceph.py
199@@ -15,12 +15,14 @@
200 # You should have received a copy of the GNU General Public License along with
201 # this program. If not, see https://www.gnu.org/licenses/.
202 """CephOsd verifier class test suite."""
203+import json
204 import os
205 from unittest import mock
206 from unittest.mock import MagicMock
207
208 import pytest
209
210+from juju_verify.exceptions import CharmException
211 from juju_verify.verifiers import CephOsd, Result
212 from juju_verify.verifiers.ceph import CephCommon
213
214@@ -74,6 +76,37 @@ def test_check_cluster_health_unknown_state(mock_run_action_on_units, model):
215 assert result == Result(False, "Ceph cluster is in an unknown state")
216
217
218+def test_check_cluster_health_error(model):
219+ """Test check Ceph cluster health raise CharmException."""
220+ with pytest.raises(CharmException):
221+ CephCommon.check_cluster_health(model.units["ceph-osd/0"])
222+
223+
224+@mock.patch("juju_verify.verifiers.ceph.run_action_on_units")
225+def test_get_replication_number(mock_run_action_on_units, model):
226+ """Test get minimum replication number from ceph-mon unit."""
227+ action = MagicMock()
228+ action.data.get.side_effect = {"results": {"pools": json.dumps([
229+ {"pool": 1, "name": "test_1", "size": 5, "min_size": 2},
230+ {"pool": 2, "name": "test_2", "size": 5, "min_size": 2},
231+ {"pool": 3, "name": "test_3", "size": 3, "min_size": 2},
232+ ])}}.get
233+ mock_run_action_on_units.return_value = {"ceph-mon/0": action}
234+
235+ # test find minimum replication in list of 3 pools
236+ assert CephCommon.get_replication_number(model.units["ceph-mon/0"]) == 1
237+
238+ # test return None if list of pools is empty
239+ action.data.get.side_effect = {"results": {"pools": json.dumps([])}}.get
240+ assert CephCommon.get_replication_number(model.units["ceph-mon/0"]) is None
241+
242+
243+def test_get_replication_number_error(model):
244+ """Test get minimum replication number from ceph-mon unit raise CharmException."""
245+ with pytest.raises(CharmException):
246+ CephCommon.get_replication_number(model.units["ceph-osd/0"])
247+
248+
249 def test_get_ceph_mon_unit(model):
250 """Test get ceph-mon unit related to application."""
251 ceph_mon_units = [model.units["ceph-mon/0"], model.units["ceph-mon/1"],
252@@ -81,56 +114,114 @@ def test_get_ceph_mon_unit(model):
253 mock_relation = MagicMock()
254 mock_relation.matches = {"ceph-osd:mon": True}.get
255 mock_relation.provides.application.units = ceph_mon_units
256- mock_relations = MagicMock()
257- mock_relations.relations = [mock_relation]
258- model.applications = {"ceph-osd": mock_relations}
259+ model.applications["ceph-osd"].relations = [mock_relation]
260
261 # return first ceph-mon unit in "ceph-osd:mon" relations
262 unit = CephOsd([model.units["ceph-osd/0"]])._get_ceph_mon_unit("ceph-osd")
263 assert unit == ceph_mon_units[0]
264
265 # return none for non-existent application name
266- model.applications = {"ceph-osd-cluster": mock_relations}
267- unit = CephOsd([model.units["ceph-osd/0"]])._get_ceph_mon_unit("ceph-osd")
268+ unit = CephOsd([model.units["ceph-osd/0"]])._get_ceph_mon_unit("ceph-osd-cluster")
269 assert unit is None
270
271 # return none for application with no units
272 mock_relation.provides.application.units = []
273- model.applications = {"ceph-osd": mock_relations}
274 unit = CephOsd([model.units["ceph-osd/0"]])._get_ceph_mon_unit("ceph-osd")
275 assert unit is None
276
277
278 @mock.patch("juju_verify.verifiers.ceph.CephOsd._get_ceph_mon_unit")
279-def test_get_ceph_mon_units(mock_get_ceph_mon_unit, model):
280+def test_get_ceph_mon_app_map(mock_get_ceph_mon_unit, model):
281 """Test function to get ceph-mon units related to verified units."""
282 ceph_osd_units = [model.units["ceph-osd/0"], model.units["ceph-osd/1"]]
283 mock_get_ceph_mon_unit.return_value = model.units["ceph-mon/0"]
284
285- units = CephOsd(ceph_osd_units).get_ceph_mon_units()
286+ units = CephOsd(ceph_osd_units)._get_ceph_mon_app_map()
287
288 assert units == {"ceph-osd": model.units["ceph-mon/0"]}
289
290
291-@mock.patch("juju_verify.verifiers.ceph.CephOsd.get_ceph_mon_units")
292+@mock.patch("juju_verify.verifiers.ceph.CephOsd._get_ceph_mon_app_map")
293+@mock.patch("juju_verify.verifiers.ceph.CephCommon.check_cluster_health")
294+def test_check_ceph_cluster_health(
295+ mock_check_cluster_health, mock_get_ceph_mon_app_map, model):
296+ """Test check the Ceph cluster health for unique ceph-mon units."""
297+ mock_get_ceph_mon_app_map.return_value = {"ceph-osd": model.units["ceph-mon/0"]}
298+ mock_check_cluster_health.return_value = Result(True)
299+
300+ ceph_osd_verifier = CephOsd([model.units["ceph-osd/0"]])
301+ assert ceph_osd_verifier.check_ceph_cluster_health() == Result(True)
302+ mock_check_cluster_health.assert_called_once_with(model.units["ceph-mon/0"])
303+
304+
305+@mock.patch("juju_verify.verifiers.ceph.CephOsd._get_ceph_mon_app_map")
306+@mock.patch("juju_verify.verifiers.ceph.CephCommon.get_replication_number")
307+def test_check_replication_number(
308+ mock_get_replication_number, mock_get_ceph_mon_app_map, model):
309+ """Test check the minimum number of replications for related applications."""
310+ mock_get_replication_number.return_value = 1
311+ mock_get_ceph_mon_app_map.return_value = {"ceph-osd": model.units["ceph-mon/0"]}
312+
313+ # verified one ceph-osd unit
314+ ceph_osd_verifier = CephOsd([model.units["ceph-osd/0"]])
315+ assert ceph_osd_verifier.check_replication_number() == Result(True)
316+
317+ # verified two ceph-osd unit
318+ ceph_osd_verifier = CephOsd([model.units["ceph-osd/0"], model.units["ceph-osd/1"]])
319+ assert ceph_osd_verifier.check_replication_number() == Result(
320+ False, "The minimum number of replicas in 'ceph-osd' is 1 and "
321+ "it's not safe to restart/shutdown 2 units. 0 units are not active."
322+ )
323+
324+ # verified one ceph-osd unit, if there is an unit that is not in an active state
325+ model.units["ceph-osd/1"].data["workload-status"]["current"] = "blocked"
326+ ceph_osd_verifier = CephOsd([model.units["ceph-osd/0"]])
327+ assert ceph_osd_verifier.check_replication_number() == Result(
328+ False, "The minimum number of replicas in 'ceph-osd' is 1 and "
329+ "it's not safe to restart/shutdown 1 units. 1 units are not active."
330+ )
331+ model.units["ceph-osd/1"].data["workload-status"]["current"] = "active"
332+
333+
334+@mock.patch("juju_verify.verifiers.ceph.CephOsd._get_ceph_mon_app_map")
335 @mock.patch("juju_verify.verifiers.ceph.CephCommon.check_cluster_health")
336-def test_verify_reboot(mock_check_cluster_health, mock_get_ceph_mon_units, model):
337+@mock.patch("juju_verify.verifiers.ceph.CephCommon.get_replication_number")
338+def test_verify_reboot(
339+ mock_get_replication_number,
340+ mock_check_cluster_health,
341+ mock_get_ceph_mon_app_map,
342+ model
343+):
344 """Test reboot verification on CephOsd."""
345- mock_get_ceph_mon_units.return_value = {"ceph-osd": model.units["ceph-mon/0"]}
346+ mock_get_replication_number.return_value = 1
347+ mock_get_ceph_mon_app_map.return_value = {"ceph-osd": model.units["ceph-mon/0"]}
348 mock_check_cluster_health.return_value = Result(True, "Ceph cluster is healthy")
349
350 result = CephOsd([model.units["ceph-osd/0"]]).verify_reboot()
351+ assert result == Result(True, "Ceph cluster is healthy")
352
353+ mock_get_replication_number.return_value = None # empty list of pools
354+ result = CephOsd([model.units["ceph-osd/0"]]).verify_reboot()
355 assert result == Result(True, "Ceph cluster is healthy")
356
357
358-@mock.patch("juju_verify.verifiers.ceph.CephOsd.get_ceph_mon_units")
359+@mock.patch("juju_verify.verifiers.ceph.CephOsd._get_ceph_mon_app_map")
360 @mock.patch("juju_verify.verifiers.ceph.CephCommon.check_cluster_health")
361-def test_verify_shutdown(mock_check_cluster_health, mock_get_ceph_mon_units, model):
362+@mock.patch("juju_verify.verifiers.ceph.CephCommon.get_replication_number")
363+def test_verify_shutdown(
364+ mock_get_replication_number,
365+ mock_check_cluster_health,
366+ mock_get_ceph_mon_app_map,
367+ model
368+):
369 """Test shutdown verification on CephOsd."""
370- mock_get_ceph_mon_units.return_value = {"ceph-osd": model.units["ceph-mon/0"]}
371+ mock_get_replication_number.return_value = 1
372+ mock_get_ceph_mon_app_map.return_value = {"ceph-osd": model.units["ceph-mon/0"]}
373 mock_check_cluster_health.return_value = Result(True, "Ceph cluster is healthy")
374
375 result = CephOsd([model.units["ceph-osd/0"]]).verify_shutdown()
376+ assert result == Result(True, "Ceph cluster is healthy")
377
378+ mock_get_replication_number.return_value = None # empty list of pools
379+ result = CephOsd([model.units["ceph-osd/0"]]).verify_shutdown()
380 assert result == Result(True, "Ceph cluster is healthy")

Subscribers

People subscribed via source and target branches