Following https://tracker.ceph.com/issues/52867 we need to tell ceph
which address family to use via the ms_bind_ipv4 and ms_bind_ipv6 config flags.
I added them to the ceph.conf template and updated the config hook.
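
As a rough illustration, the two flags could be derived from a bind address as sketched below. The helpers `bind_flags` and `render_lines` are hypothetical and are not the charm's actual code; only the option names ms_bind_ipv4 and ms_bind_ipv6 come from Ceph.

```python
import ipaddress


def bind_flags(address):
    """Return ms_bind_* settings for one address (hypothetical helper)."""
    version = ipaddress.ip_address(address).version
    return {
        "ms_bind_ipv4": version == 4,
        "ms_bind_ipv6": version == 6,
    }


def render_lines(flags):
    """Format the flags as ceph.conf-style 'key = value' lines."""
    return ["{} = {}".format(k, str(v).lower()) for k, v in sorted(flags.items())]
```

For an IPv6 address such as fd00::1 this sketch enables ms_bind_ipv6 and disables ms_bind_ipv4.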
Closes-Bug: #2056337
Change-Id: Ib735bd4876b6909762288b97857bccaa597c2b80
(cherry picked from commit 7e61d1b8a050998697aee91eb0db6c988c2c397a)
Ceph reef has a behaviour change where it doesn't always return
version keys for all components. In
I12a1bcd32be2ed8a8e5ee0e304f716f5a190bd57 an attempt was made to fix
this by retrying; however, this code path can also be hit when a
component such as the OSDs is absent. While a cluster without OSDs
wouldn't be functional, it still should not cause the charm to error.
As a fix, just make the OSD component optional when querying for a
version instead of retrying.
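
The idea can be sketched as follows, assuming the version data arrives as a dict keyed by component name (similar in shape to `ceph versions` output). The function name `component_versions`, the component list and the input shape are assumptions for illustration, not the code in src/utils.py.

```python
def component_versions(versions, optional=("osd",)):
    """Collect version strings per component from a `ceph versions`-style dict.

    Components listed in `optional` are skipped when absent or empty;
    any other missing component raises KeyError.
    """
    result = {}
    for component in ("mon", "mgr", "osd"):
        found = versions.get(component) or {}
        if not found:
            if component in optional:
                continue  # e.g. a cluster that has no OSDs yet
            raise KeyError("no version reported for " + component)
        result[component] = sorted(found)
    return result
```

With this shape, a cluster reporting no OSDs simply yields no "osd" entry instead of raising.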
Closes-Bug: #2058636
Resolved Conflicts:
src/utils.py
Change-Id: I5524896c7ad944f6f22fb1498ab0069397b52418
(cherry picked from commit 1c9f3b210d8bf8904143647443133cf35f48d8b7)
Setting the 'mgr/prometheus/rbd_stats_pools' option can fail
if it is attempted too early, even if the cluster is bootstrapped. This is
particularly seen in ceph-radosgw test runs. This patchset thus
adds a retry decorator to work around this issue.
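
For reference, a retry decorator of this kind can be sketched with the standard library alone. The name `retry` and its parameters are assumptions for illustration; the decorator used in the charm may come from a library and differ in detail.

```python
import functools
import time


def retry(attempts=3, delay=0.0, exceptions=(Exception,)):
    """Retry the wrapped call up to `attempts` times, sleeping `delay` seconds between tries."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts: surface the last error
                    time.sleep(delay)
        return wrapper
    return decorator
```

A function that sets the option would be wrapped with `@retry(attempts=..., delay=...)` so that an early failure is retried rather than surfaced immediately.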
Related-Bug: #2042405
Related-Bug: #2058636
Change-Id: Id9b7b903e67154e7d2bb6fecbeef7fac126804a8
(cherry picked from commit d76939ef70bd5016a6e515558de1b9eabe9d0d55)
A job name passed via the prometheus_scrape library doesn't end up as a
static job name in the prometheus configuration file in the COS world
even though COS expects a fixed string. Practically we cannot have a
static job name like job=ceph in any of the alert rules in COS since the
charms will convert the string "ceph" into:
Let's give up the possibility of the static job name and use "up{}" so
it will be annotated with the model name/ID, etc. without any specific
job-related condition. This will break the alert rules when one unit has
more than one scraping endpoint because there will be no way to
distinguish multiple scraping jobs. Ceph MON only has one prometheus
endpoint for the time being so this change shouldn't cause an immediate
issue. Overall, it's not ideal, but it is at least better than the current
state, which is an alert error out of the box.
The following alert rule:
> up{} == 0
will be converted and annotated as:
> up{juju_application="ceph-mon",juju_model="ceph",juju_model_uuid="UUID"} == 0
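
The conversion above can be mimicked with a small sketch. This is only an illustration of the label injection, using a hypothetical `annotate_up` helper; it is not how COS implements the rewrite.

```python
def annotate_up(expr, labels):
    """Insert label matchers into the first empty `{}` selector of expr."""
    matchers = ",".join('{}="{}"'.format(k, v) for k, v in sorted(labels.items()))
    return expr.replace("{}", "{" + matchers + "}", 1)
```

Because the selector carries no job matcher, the result depends only on the Juju topology labels that are injected.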
Closes-Bug: #2044062
Change-Id: I0df8bc0238349b5f03179dfb8f4da95da48140c7
(cherry picked from commit fb3262183102171da5704868d7522290b3a9ede4)