Merge ~sajoupa/charm-k8s-telegraf:sidecar into charm-k8s-telegraf:master

Proposed by Laurent Sesquès
Status: Work in progress
Proposed branch: ~sajoupa/charm-k8s-telegraf:sidecar
Merge into: charm-k8s-telegraf:master
Diff against target: 1088 lines (+572/-297)
11 files modified
Makefile (+3/-3)
README.md (+48/-38)
actions.yaml (+3/-0)
config.yaml (+6/-20)
lib/charms/nginx_ingress_integrator/v0/ingress.py (+198/-0)
metadata.yaml (+19/-2)
requirements.txt (+1/-0)
src/charm.py (+252/-100)
tests/unit/requirements.txt (+1/-0)
tests/unit/scenario.py (+36/-94)
tests/unit/test_charm.py (+5/-40)
Reviewer              Review Type                 Date Requested  Status
BootStack Reviewers   mr tracking; do not claim                   Pending
BootStack Reviewers                                               Pending
BootStack Reviewers                                               Pending
Telegraf Charmers                                                 Pending
Review via email: mp+402499@code.launchpad.net

Commit message

Switch to the sidecar framework
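
For reviewers new to the sidecar framework, a minimal, hypothetical sketch of the pattern this branch adopts. The event, container, and layer names follow the diff below; the real handlers (config validation, ingress, PostgreSQL relation, actions) live in src/charm.py.

```python
#!/usr/bin/env python3
# Minimal sidecar-style charm sketch: a pebble-ready handler replaces the
# old pod-spec events and drives the workload container directly.
from ops.charm import CharmBase
from ops.main import main
from ops.model import ActiveStatus


class TelegrafK8sCharm(CharmBase):
    def __init__(self, *args):
        super().__init__(*args)
        # Fired once the "telegraf" container's Pebble API is reachable.
        self.framework.observe(self.on.telegraf_pebble_ready, self._on_telegraf_pebble_ready)

    def _on_telegraf_pebble_ready(self, event):
        container = event.workload
        # Describe how to run the workload as a Pebble layer, then start it.
        container.add_layer("telegraf", {
            "summary": "telegraf layer",
            "services": {
                "telegraf": {
                    "override": "replace",
                    "command": "/run_telegraf",
                    "startup": "enabled",
                },
            },
        }, combine=True)
        container.autostart()
        self.unit.status = ActiveStatus()


if __name__ == "__main__":
    main(TelegrafK8sCharm)
```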

~sajoupa/charm-k8s-telegraf:sidecar updated
316e11b... by Laurent Sesquès

Dockerfile: use a more suitable name for the CMD

2396cf7... by Laurent Sesquès

import the nginx-ingress-integrator lib

f3c7b62... by Laurent Sesquès

config.yaml: restrict open_port to just 1 port number, and restrict to tcp

604d966... by Laurent Sesquès

rename the charm to telegraf-k8s, add display-name

df32e90... by Laurent Sesquès

charm.py: various fixes, and updates coming from previous commits

9d2bf26... by Laurent Sesquès

update tests following conversion to the sidecar framework

efd9949... by Laurent Sesquès

Makefile: perform a build before running unittests

4315d13... by Laurent Sesquès

Add a relation with postgresql-k8s

4f8a99a... by Laurent Sesquès

make sure _on_config_changed is called when a PG relation is changed

5e88ed7... by Laurent Sesquès

add jinja2 to the unit tests requirements

251cdee... by Laurent Sesquès

remove image_* configs (now using a resource), install ca-certificates in the image, update README, add docs: link to metadata.yaml

acb12a9... by Laurent Sesquès

README: fix markup

c11cdd5... by Laurent Sesquès

README: fix wrong copy/paste. config.yaml: remove obsolete comment for open_port and lp:1876129.

7560863... by Laurent Sesquès

Add a get-prometheus-metrics action

8e657b6... by Laurent Sesquès

small README improvements

468043c... by Laurent Sesquès

README: add a section for the get-prometheus-metrics action

Unmerged commits

468043c... by Laurent Sesquès

README: add a section for the get-prometheus-metrics action

8e657b6... by Laurent Sesquès

small README improvements

7560863... by Laurent Sesquès

Add a get-prometheus-metrics action

c11cdd5... by Laurent Sesquès

README: fix wrong copy/paste. config.yaml: remove obsolete comment for open_port and lp:1876129.

acb12a9... by Laurent Sesquès

README: fix markup

251cdee... by Laurent Sesquès

remove image_* configs (now using a resource), install ca-certificates in the image, update README, add docs: link to metadata.yaml

5e88ed7... by Laurent Sesquès

add jinja2 to the unit tests requirements

4f8a99a... by Laurent Sesquès

make sure _on_config_changed is called when a PG relation is changed

4315d13... by Laurent Sesquès

Add a relation with postgresql-k8s

efd9949... by Laurent Sesquès

Makefile: perform a build before running unittests

Preview Diff

1diff --git a/Makefile b/Makefile
2index 7a71e0f..e2df6b8 100644
3--- a/Makefile
4+++ b/Makefile
5@@ -16,7 +16,7 @@ lint: blacken
6
7 # We actually use the build directory created by charmcraft,
8 # but the .charm file makes a much more convenient sentinel.
9-unittests:
10+unittests: build
11 @tox -e unit
12
13 build:
14@@ -24,7 +24,8 @@ build:
15 @-git rev-parse HEAD > ./repo-info
16 @cd ${CHARM_BUILD_DIR} && TERM=linux charmcraft build -f ${PROJECTPATH}
17
18-test: lint unittests functional
19+# TODO: fix functional tests, broken with juju 2.9.0 + zaza 94abaf1 + microk8s v1.20.6
20+test: lint unittests #functional
21 @echo "Tests completed for charm ${CHARM_NAME}."
22
23 functional: build
24@@ -40,7 +41,6 @@ clean:
25 image-build:
26 @echo "Building the image."
27 @docker build \
28- --no-cache=true \
29 --build-arg VERSION_TO_BUILD=$(VERSION_TO_BUILD) \
30 -t telegraf:$(VERSION_TO_BUILD) \
31 .
32diff --git a/README.md b/README.md
33index 7a29599..3986814 100644
34--- a/README.md
35+++ b/README.md
36@@ -1,52 +1,62 @@
37-# charm-k8s-telegraf
38+# Telegraf Operator
39
40 ## Description
41
42-Telegraf is an agent for collecting, processing, aggregating, and writing metrics.
43+[Telegraf](https://github.com/influxdata/telegraf) is an agent for collecting, processing, aggregating, and writing metrics.
44
45 Telegraf is plugin-driven and has the concept of 4 distinct plugin types:
46- Input Plugins collect metrics from the system, services, or 3rd party APIs
47- Processor Plugins transform, decorate, and/or filter metrics
48- Aggregator Plugins create aggregate metrics (e.g. mean, min, max, quantiles, etc.)
49- Output Plugins write metrics to various destinations
50+ - Input Plugins collect metrics from the system, services, or 3rd party APIs
51+ - Processor Plugins transform, decorate, and/or filter metrics
52+ - Aggregator Plugins create aggregate metrics (e.g. mean, min, max, quantiles, etc.)
53+ - Output Plugins write metrics to various destinations
54
55-Deploying telegraf in a k8s environment makes sense to monitor services or 3rd party
56-APIs, not to gather system stats (telegraf would only monitor itself in its container).
57+This telegraf Charmed Operator addresses two use cases:
+ - monitor charmed k8s workload using a juju relation
59+ - monitor a remote service thanks to telegraf's wide range of input plugins
60
61-It is possible for instance to deploy telegraf on k8s to gather metrics about a github
62-repository with the github input plugin, weather data with the OpenWeatherMap input
63-plugin, or check HTTP/HTTPS connections with the http_response plugin.
64+It is possible for instance to deploy telegraf on k8s to gather metrics about a github repository with the `github` input plugin, weather data with the `OpenWeatherMap` input plugin, or monitor remote HTTP/HTTPS endpoints with the `http_response` plugin.
65
66 ## Usage
67
68-Deploy the charm to a k8s juju model, for example:
69
70- juju deploy cs:~telegraf-charmers/telegraf --config inputs='[[inputs.github]]
71- repositories = [
72- "influxdata/telegraf"
73- ]'
74-
75-In this case, telegraf will expose its metrics using the charm's default output plugin,
76-prometheus on tcp port 9103.
77-
78-## Using a custom image
79-
80-By default the charm will use the telegrafcharmers/telegraf:edge image from dockerhub.
81-To build and push a custom image:
82-
83- git clone https://git.launchpad.net/charm-k8s-telegraf
84- cd charm-k8s-telegraf
85- make image-build
86- docker tag telegraf:latest localhost:32000/telegraf
87- docker push localhost:32000/telegraf
88-
89-Then, to use your new image, either replace the `deploy` step above with:
90-
91- juju deploy ./telegraf.charm --config image_path=localhost:32000/telegraf
92-
93-or, if you've already deployed telegraf:
94-
95- juju config telegraf image_path=localhost:32000/telegraf
96+### Monitor a remote service
97+
98+In this example, we will use telegraf to get metrics about a GitHub repository, using
99+the github input plugin:
100+```
101+juju deploy telegraf-k8s --channel edge --config inputs='[[inputs.github]]\n repositories = ["canonical/operator"]'
102+```
103+Wait for the unit to be active/idle, and the metrics will be available on the default output (`prometheus_client` on port 9103):
104+```
105+$ curl -s <unit IP>:9103/metrics | grep github_repository_stars
106+# HELP github_repository_stars Telegraf collected metric
107+# TYPE github_repository_stars untyped
108+github_repository_stars{host="telegraf-k8s-0",language="Python",license="Apache License 2.0",name="operator",owner="canonical"} 98
109+```
110+
111+### Monitor a charmed k8s workload through a Juju relation
112+
113+In this example, telegraf will get metrics from postgresql:
114+
115+```
116+juju deploy postgresql-k8s
117+juju deploy telegraf-k8s --channel edge
118+juju relate telegraf-k8s:pg postgresql-k8s:db
119+```
120+Wait for the units to be active/idle, and then the metrics can be scraped:
121+```
122+$ curl -s <unit IP>:9103/metrics | grep ^postgresql_
123+postgresql_blk_read_time{datname="",db="postgres",host="telegraf-k8s-0",replica="master",server="dbname=telegraf-k8s host=10.152.183.38 port=5432 user=telegraf-k8s"} 0
124+[...]
125+```
126+
127+## Prometheus metrics - service check
128+
129+By default (and in most cases), the operator will make telegraf expose a prometheus_client endpoint on port tcp/9103.
130+This can be easily tested by running the juju action:
131+```
132+juju run-action telegraf-k8s/0 get-prometheus-metrics --wait
133+```
134
135 ## Testing
136
137diff --git a/actions.yaml b/actions.yaml
138new file mode 100644
139index 0000000..d26e5f0
140--- /dev/null
141+++ b/actions.yaml
142@@ -0,0 +1,3 @@
143+get-prometheus-metrics:
144+ description: >
145+ Scrape the open port for prometheus metrics
146diff --git a/config.yaml b/config.yaml
147index fbe14b5..051169d 100644
148--- a/config.yaml
149+++ b/config.yaml
150@@ -53,27 +53,13 @@ options:
151
152 This setting is required.
153 type: string
154- image_path:
155- type: string
156+ open_port:
157+ default: '9103'
158 description: >
159- The location of the image to use, e.g. "registry.example.com/telegraf:v1".
160-
161- This setting is required.
162- default: 'telegrafcharmers/telegraf:edge'
163- image_username:
164+ TCP port number to open.
165 type: string
166- description: >
167- The username for accessing the registry specified in image_path.
168- default: ''
169- image_password:
170+ external_hostname:
171 type: string
172 description: >
173- The password associated with image_username for accessing the registry specified in image_path.
174- default: ''
175- open_ports:
176- default: '9103:tcp'
177- description: >
178- Comma-separated list of <port number>:<protocol> ports to open. Ex: '9103:tcp,6343:udp'.
179-
180- This setting is required. Even if no port needs to be exposed, a dummy one needs to be set lp:1876129.
181- type: string
182+ External hostname this agent should respond to (required).
183+ default: 'telegraf.internal'
184diff --git a/lib/charms/nginx_ingress_integrator/v0/ingress.py b/lib/charms/nginx_ingress_integrator/v0/ingress.py
185new file mode 100644
186index 0000000..196c314
187--- /dev/null
188+++ b/lib/charms/nginx_ingress_integrator/v0/ingress.py
189@@ -0,0 +1,198 @@
190+"""Library for the ingress relation.
191+
192+This library contains the Requires and Provides classes for handling
193+the ingress interface.
194+
195+Import `IngressRequires` in your charm, with two required options:
196+ - "self" (the charm itself)
197+ - config_dict
198+
199+`config_dict` accepts the following keys:
200+ - service-hostname (required)
201+ - service-name (required)
202+ - service-port (required)
203+ - limit-rps
204+ - limit-whitelist
205+ - max-body-size
206+ - retry-errors
207+ - service-namespace
208+ - session-cookie-max-age
209+ - tls-secret-name
210+
211+See [the config section](https://charmhub.io/nginx-ingress-integrator/configure) for descriptions
212+of each, along with the required type.
213+
214+As an example, add the following to `src/charm.py`:
215+```
216+from charms.nginx_ingress_integrator.v0.ingress import IngressRequires
217+
218+# In your charm's `__init__` method.
219+self.ingress = IngressRequires(self, {"service-hostname": self.config["external_hostname"],
220+ "service-name": self.app.name,
221+ "service-port": 80})
222+
223+# In your charm's `config-changed` handler.
224+self.ingress.update_config({"service-hostname": self.config["external_hostname"]})
225+```
226+And then add the following to `metadata.yaml`:
227+```
228+requires:
229+ ingress:
230+ interface: ingress
231+```
232+"""
233+
234+import logging
235+
236+from ops.charm import CharmEvents
237+from ops.framework import EventBase, EventSource, Object
238+from ops.model import BlockedStatus
239+
240+# The unique Charmhub library identifier, never change it
241+LIBID = "db0af4367506491c91663468fb5caa4c"
242+
243+# Increment this major API version when introducing breaking changes
244+LIBAPI = 0
245+
246+# Increment this PATCH version before using `charmcraft publish-lib` or reset
247+# to 0 if you are raising the major API version
248+LIBPATCH = 6
249+
250+logger = logging.getLogger(__name__)
251+
252+REQUIRED_INGRESS_RELATION_FIELDS = {
253+ "service-hostname",
254+ "service-name",
255+ "service-port",
256+}
257+
258+OPTIONAL_INGRESS_RELATION_FIELDS = {
259+ "limit-rps",
260+ "limit-whitelist",
261+ "max-body-size",
262+ "retry-errors",
263+ "service-namespace",
264+ "session-cookie-max-age",
265+ "tls-secret-name",
266+}
267+
268+
269+class IngressAvailableEvent(EventBase):
270+ pass
271+
272+
273+class IngressCharmEvents(CharmEvents):
274+ """Custom charm events."""
275+
276+ ingress_available = EventSource(IngressAvailableEvent)
277+
278+
279+class IngressRequires(Object):
280+ """This class defines the functionality for the 'requires' side of the 'ingress' relation.
281+
282+ Hook events observed:
283+ - relation-changed
284+ """
285+
286+ def __init__(self, charm, config_dict):
287+ super().__init__(charm, "ingress")
288+
289+ self.framework.observe(charm.on["ingress"].relation_changed, self._on_relation_changed)
290+
291+ self.config_dict = config_dict
292+
293+ def _config_dict_errors(self, update_only=False):
294+ """Check our config dict for errors."""
295+ blocked_message = "Error in ingress relation, check `juju debug-log`"
296+ unknown = [
297+ x
298+ for x in self.config_dict
299+ if x not in REQUIRED_INGRESS_RELATION_FIELDS | OPTIONAL_INGRESS_RELATION_FIELDS
300+ ]
301+ if unknown:
302+ logger.error(
303+ "Ingress relation error, unknown key(s) in config dictionary found: %s",
304+ ", ".join(unknown),
305+ )
306+ self.model.unit.status = BlockedStatus(blocked_message)
307+ return True
308+ if not update_only:
309+ missing = [x for x in REQUIRED_INGRESS_RELATION_FIELDS if x not in self.config_dict]
310+ if missing:
311+ logger.error(
312+ "Ingress relation error, missing required key(s) in config dictionary: %s",
313+ ", ".join(missing),
314+ )
315+ self.model.unit.status = BlockedStatus(blocked_message)
316+ return True
317+ return False
318+
319+ def _on_relation_changed(self, event):
320+ """Handle the relation-changed event."""
321+ # `self.unit` isn't available here, so use `self.model.unit`.
322+ if self.model.unit.is_leader():
323+ if self._config_dict_errors():
324+ return
325+ for key in self.config_dict:
326+ event.relation.data[self.model.app][key] = str(self.config_dict[key])
327+
328+ def update_config(self, config_dict):
329+ """Allow for updates to relation."""
330+ if self.model.unit.is_leader():
331+ self.config_dict = config_dict
332+ if self._config_dict_errors(update_only=True):
333+ return
334+ relation = self.model.get_relation("ingress")
335+ if relation:
336+ for key in self.config_dict:
337+ relation.data[self.model.app][key] = str(self.config_dict[key])
338+
339+
340+class IngressProvides(Object):
341+ """This class defines the functionality for the 'provides' side of the 'ingress' relation.
342+
343+ Hook events observed:
344+ - relation-changed
345+ """
346+
347+ def __init__(self, charm):
348+ super().__init__(charm, "ingress")
349+ # Observe the relation-changed hook event and bind
350+ # self.on_relation_changed() to handle the event.
351+ self.framework.observe(charm.on["ingress"].relation_changed, self._on_relation_changed)
352+ self.charm = charm
353+
354+ def _on_relation_changed(self, event):
355+ """Handle a change to the ingress relation.
356+
357+ Confirm we have the fields we expect to receive."""
358+ # `self.unit` isn't available here, so use `self.model.unit`.
359+ if not self.model.unit.is_leader():
360+ return
361+
362+ ingress_data = {
363+ field: event.relation.data[event.app].get(field)
364+ for field in REQUIRED_INGRESS_RELATION_FIELDS | OPTIONAL_INGRESS_RELATION_FIELDS
365+ }
366+
367+ missing_fields = sorted(
368+ [
369+ field
370+ for field in REQUIRED_INGRESS_RELATION_FIELDS
371+ if ingress_data.get(field) is None
372+ ]
373+ )
374+
375+ if missing_fields:
376+ logger.error(
377+ "Missing required data fields for ingress relation: {}".format(
378+ ", ".join(missing_fields)
379+ )
380+ )
381+ self.model.unit.status = BlockedStatus(
382+ "Missing fields for ingress: {}".format(", ".join(missing_fields))
383+ )
384+
385+ # Create an event that our charm can use to decide it's okay to
386+ # configure the ingress.
387+ self.charm.on.ingress_available.emit()
388diff --git a/metadata.yaml b/metadata.yaml
389index 80f06b7..cd1898c 100644
390--- a/metadata.yaml
391+++ b/metadata.yaml
392@@ -1,11 +1,28 @@
393 # Copyright 2020 Canonical Ltd.
394 # See LICENSE file for licensing details.
395-name: telegraf
396+name: telegraf-k8s
397+display-name: Telegraf
398 description: |
399 Telegraf charm for Kubernetes.
400 Telegraf is an agent for collecting, processing, aggregating, and writing metrics.
401 summary: |
402 Telegraf charm for Kubernetes
403-series: [kubernetes]
404 maintainers:
405 - launchpad.net/~telegraf-charmers
406+docs: https://discourse.charmhub.io/t/telegraf-k8s-docs-index/4587
407+
408+containers:
409+ telegraf:
410+ resource: telegraf-image
411+
412+resources:
413+ telegraf-image:
414+ type: oci-image
415+ description: Docker image for telegraf to run
416+
417+requires:
418+ pg:
419+ interface: pgsql
420+ limit: 1
421+ ingress:
422+ interface: ingress
423diff --git a/requirements.txt b/requirements.txt
424index 2d81d3b..fd6adcd 100644
425--- a/requirements.txt
426+++ b/requirements.txt
427@@ -1 +1,2 @@
428 ops
429+ops-lib-pgsql
430diff --git a/src/charm.py b/src/charm.py
431index 09d477f..42b0d00 100755
432--- a/src/charm.py
433+++ b/src/charm.py
434@@ -2,22 +2,77 @@
435 # Copyright 2020 Canonical Ltd.
436 # See LICENSE file for licensing details.
437
438+from jinja2 import Environment, BaseLoader
439+from urllib.error import HTTPError
440 import logging
441+import pgsql
442+import re
443+import urllib.request
444+import yaml
445
446+from charms.nginx_ingress_integrator.v0.ingress import IngressRequires
447 import ops
448 from ops.charm import CharmBase
449-from ops.main import main
450 from ops.framework import StoredState
451+from ops.main import main
452 from ops.model import (
453 ActiveStatus,
454 BlockedStatus,
455- MaintenanceStatus,
456 )
457+from ops.pebble import ServiceStatus
458
459
460 logger = logging.getLogger(__name__)
461
462-REQUIRED_JUJU_CONFIG = ['image_path', 'inputs', 'outputs', 'open_ports']
463+REQUIRED_JUJU_CONFIG = ['inputs', 'outputs', 'open_port']
464+TELEGRAF_CONFIG_FILE = '/etc/telegraf.conf'
465+
466+POSTGRESQL_TEMPLATE = '''
467+{%- if conn_str %}
468+[[inputs.postgresql_extensible]]
469+ address = "{{conn_str}}"
470+ [inputs.postgresql_extensible.tags]
471+ replica = "master"
472+
473+[[inputs.postgresql_extensible.query]]
474+ sqlquery="SELECT * FROM pg_stat_database"
475+ version=901
476+ withdbname=false
477+ tagvalue=""
478+
479+[[inputs.postgresql_extensible.query]]
480+ sqlquery="SELECT * FROM pg_stat_bgwriter"
481+ version=901
482+ withdbname=false
483+ tagvalue=""
484+
485+[[inputs.postgresql_extensible.query]]
486+ withdbname=false
487+ tagvalue=""
488+ sqlquery="""
489+ SELECT
490+ datname,
491+ EXTRACT(EPOCH FROM clock_timestamp() - MIN(xact_start)) AS oldest_xact,
492+ EXTRACT(EPOCH FROM clock_timestamp()
493+ - MIN(CASE WHEN state='active' THEN query_start ELSE NULL END)) AS oldest_query,
494+ COUNT(NULLIF(wait_event_type IS NOT NULL AND wait_event_type <> 'Activity', False)) AS queries_waiting
495+ FROM pg_stat_activity
496+ GROUP BY datname"""
497+
498+[[inputs.postgresql_extensible.query]]
499+ withdbname=false
500+ tagvalue="transaction_state"
501+ sqlquery="""
502+ SELECT
503+ datname,
504+ CASE WHEN state='active' AND wait_event_type IS NOT NULL THEN 'blocked'
505+ ELSE state END AS transaction_state,
506+ COUNT(*) AS connections
507+ FROM pg_stat_activity
508+ WHERE state IS NOT NULL
509+ GROUP BY datname, state, wait_event_type"""
510+{%- endif %}
511+'''
512
513
514 class TelegrafK8sCharmJujuConfigError(Exception):
515@@ -30,10 +85,174 @@ class TelegrafK8sCharm(CharmBase):
516 def __init__(self, *args):
517 super().__init__(*args)
518
519- self.framework.observe(self.on.start, self._configure_pod)
520- self.framework.observe(self.on.config_changed, self._configure_pod)
521- self.framework.observe(self.on.leader_elected, self._configure_pod)
522- self.framework.observe(self.on.upgrade_charm, self._configure_pod)
523+ self.framework.observe(self.on.config_changed, self._on_config_changed)
524+ self.framework.observe(self.on.upgrade_charm, self._on_upgrade_charm)
525+ self.framework.observe(self.on.telegraf_pebble_ready, self._on_telegraf_pebble_ready)
526+
527+ # actions
528+ self.framework.observe(self.on.get_prometheus_metrics_action, self.on_get_prometheus_metrics_action)
529+
530+ self.ingress = IngressRequires(
531+ self,
532+ {
533+ "service-hostname": self.config["external_hostname"],
534+ "service-name": self.app.name,
535+ "service-port": self.config["open_port"],
536+ },
537+ )
538+
539+ self._stored.set_default(telegraf_pebble_ready=False, reldata={})
540+
541+ self._init_postgresql_relation()
542+
543+ def _get_pebble_config(self, event: ops.framework.EventBase) -> dict:
544+ """Generate pebble config."""
545+ pebble_config = {
546+ "summary": "telegraf layer",
547+ "description": "telegraf layer",
548+ "services": {
549+ "telegraf": {
550+ "override": "replace",
551+ "summary": "telegraf service",
552+ "command": "/run_telegraf",
553+ "startup": "enabled",
554+ }
555+ },
556+ }
557+
558+ try:
559+ self._check_juju_config()
560+ except TelegrafK8sCharmJujuConfigError as e:
561+ self.unit.status = BlockedStatus(str(e))
562+ return {}
563+
564+ # Update pod environment config.
565+ pebble_config["services"]["telegraf"]["environment"] = self._make_pod_env()
566+
567+ return pebble_config
568+
569+ def _on_config_changed(self, event: ops.framework.EventBase) -> None:
570+ """Handle the config changed event."""
571+ if not self._stored.telegraf_pebble_ready:
572+ logger.info(
573+ "Got a config changed event, but the workload isn't ready yet. Doing nothing, config will be "
574+ "picked up when workload is ready."
575+ )
576+ event.defer()
577+ return
578+
579+ pebble_config = self._get_pebble_config(event)
580+ if not pebble_config:
581+ # Charm will be in blocked status.
582+ return
583+
584+ # Ensure the ingress relation has the external hostname and port.
585+ self.ingress.update_config({"service-hostname": self.config["external_hostname"]})
586+ self.ingress.update_config({"service-name": self.app.name})
587+ self.ingress.update_config({"service-port": self.config["open_port"]})
588+
589+ container = self.unit.get_container("telegraf")
590+ plan = container.get_plan().to_dict()
591+ if plan["services"] != pebble_config["services"]:
592+ container.add_layer("telegraf", pebble_config, combine=True)
593+
594+ status = container.get_service("telegraf")
595+ if status.current == ServiceStatus.ACTIVE:
596+ container.stop("telegraf")
597+ container.start("telegraf")
598+
599+ self.unit.status = ActiveStatus()
600+
601+ def _on_upgrade_charm(self, event: ops.framework.EventBase) -> None:
602+ """Handle the upgrade charm event."""
603+ # An 'upgrade-charm' hook (which will also be triggered by an
604+ # 'attach-resource' event) will cause the pod to be rescheduled:
605+ # even though the name remains the same, the IP may change.
606+ # The workload won't be running, so we need to handle that in the
607+ # course of subsequent events that will be triggered after this.
608+ #
609+ # Setting pebble_ready to `False` will ensure a 'config-changed'
610+ # hook waits for the workload to be ready before doing anything.
611+ self._stored.telegraf_pebble_ready = False
612+ # An upgrade-charm hook will be followed by others such as config-changed
613+ # and workload-ready, so just do nothing else for now.
614+ return
615+
616+ def _on_telegraf_pebble_ready(self, event: ops.framework.EventBase) -> None:
617+ """Handle the workload ready event."""
618+ self._stored.telegraf_pebble_ready = True
619+
620+ pebble_config = self._get_pebble_config(event)
621+ if not pebble_config:
622+ # Charm will be in blocked status.
623+ return
624+
625+ container = event.workload
626+ logger.debug("About to add_layer with pebble_config:\n{}".format(yaml.dump(pebble_config)))
627+ # `container.add_layer` accepts str (YAML) or dict or pebble.Layer
628+ # object directly.
629+ container.add_layer("telegraf", pebble_config)
630+ # Start the container and set status.
631+ container.autostart()
632+ self.unit.status = ActiveStatus()
633+
634+ def _init_postgresql_relation(self) -> None:
635+ """Initialization related to the postgresql relation"""
636+ if 'pg' not in self._stored.reldata:
637+ self._stored.reldata['pg'] = {}
638+ self.pg = pgsql.PostgreSQLClient(self, 'pg')
639+ self.framework.observe(self.on.pg_relation_changed, self._on_config_changed)
640+ self.framework.observe(self.pg.on.database_relation_joined, self._on_database_relation_joined)
641+ self.framework.observe(self.pg.on.master_changed, self._on_master_changed)
642+ self.framework.observe(self.pg.on.standby_changed, self._on_standby_changed)
643+
644+ def _on_database_relation_joined(self, event: pgsql.DatabaseRelationJoinedEvent) -> None:
645+ """Handle db-relation-joined."""
646+ if self.model.unit.is_leader():
647+ # Provide requirements to the PostgreSQL server.
648+ event.database = self.app.name # Request database named like the Juju app
649+ elif event.database != self.app.name:
650+ # Leader has not yet set requirements. Defer, in case this unit
651+ # becomes leader and needs to perform that operation.
652+ event.defer()
653+
654+ def _on_master_changed(self, event: pgsql.MasterChangedEvent) -> None:
655+ """Handle changes in the primary database unit."""
656+ if event.database != self.app.name:
657+ # Leader has not yet set requirements. Wait until next
658+ # event, or risk connecting to an incorrect database.
659+ return
660+
661+ self._stored.reldata['pg']['conn_str'] = None if event.master is None else event.master.conn_str
662+ self._stored.reldata['pg']['db_uri'] = None if event.master is None else event.master.uri
663+
664+ if event.master is None:
665+ return
666+
667+ def _remove_fallback_application_name_from_conn_str(self, conn_str):
668+ """Remove the fallback_application_name from conn_str as it's making telegraf Error."""
669+
670+ pattern = r'fallback_application_name=[^\s]+'
671+ return re.sub(pattern, '', conn_str)
672+
673+ def _on_standby_changed(self, event: pgsql.StandbyChangedEvent) -> None:
674+ """Handle changes in the secondary database unit(s)."""
675+ if event.database != self.app.name:
676+ # Leader has not yet set requirements. Wait until next
677+ # event, or risk connecting to an incorrect database.
678+ return
679+
680+ self._stored.reldata['pg']['ro_uris'] = [c.uri for c in event.standbys]
681+
682+ # TODO: Emit event when we add support for read replicas
683+
684+ def on_get_prometheus_metrics_action(self, event):
685+ """Handle the get-prometheus-metrics action."""
686+ try:
687+ response = urllib.request.urlopen('http://127.0.0.1:{}/metrics'.format(self.config["open_port"]))
688+ event.set_results({"prometheus-metrics": response.read(), "result-code": response.status})
689+ except HTTPError as error:
690+ event.set_results({"result-code": error.code})
691
692 def _make_pod_env(self) -> dict:
693 """Return an envConfig with some core configuration.
694@@ -43,11 +262,16 @@ class TelegrafK8sCharm(CharmBase):
695
696 config = self.model.config
697
698+ inputs = config['inputs']
699+ if self._stored.reldata['pg'] and self._stored.reldata['pg']['conn_str']:
700+ conn_str = self._remove_fallback_application_name_from_conn_str(self._stored.reldata['pg']['conn_str'])
701+ inputs = inputs + self._render_template(POSTGRESQL_TEMPLATE, {'conn_str': conn_str})
702+
703 return {
704 'GLOBAL_TAGS': config['global_tags'],
705 'AGENT_CONF': config['agent_conf'],
706 'OUTPUTS': config['outputs'],
707- 'INPUTS': config['inputs'],
708+ 'INPUTS': inputs,
709 }
710
711 def _check_juju_config(self) -> None:
712@@ -72,100 +296,28 @@ class TelegrafK8sCharm(CharmBase):
713 "Required Juju config item(s) not set : {}".format(", ".join(sorted(errors)))
714 )
715
716- port_list = self.model.config['open_ports']
717- for port in port_list.split(","):
718- try:
719- [number, protocol] = port.split(":")
720- number_int = int(number)
721- if number_int < 1024 or number_int >= 65535:
722- logger.error("open_ports wants to open a port out of range: %s", number)
723- raise TelegrafK8sCharmJujuConfigError(
724- "open_ports wants to open a port out of range: {}".format(number)
725- )
726- if protocol.upper() not in ['TCP', 'UDP']:
727- logger.error(
728- "open_ports has wrong format: %s. %s is not a valid protocol. 'tcp' or 'udp' expected.",
729- port,
730- protocol,
731- )
732- raise TelegrafK8sCharmJujuConfigError(
733- "open_ports has wrong format: {}. {} is not a valid protocol. 'tcp' or 'udp' expected.".format(
734- port, protocol
735- )
736- )
737- except ValueError as e:
738- logger.error("Failed to parse open_ports: %s", e)
739- raise TelegrafK8sCharmJujuConfigError("Failed to parse open_ports: {}".format(str(e)))
740-
741- def _make_open_ports_list(self) -> dict:
742- """Return a list of ports to be opened from config['open_ports'].
743-
744- :returns: A list of dicts used for ports in podspec
745- """
746-
747- open_ports = self.model.config['open_ports']
748- if open_ports == '':
749- return None
750- open_ports_list = []
751- for port in open_ports.split(","):
752- number, proto = port.split(":")
753- open_ports_list.append(
754- {'containerPort': int(number), 'protocol': proto.upper(), 'name': '{}-{}'.format(number, proto)}
755- )
756-
757- return open_ports_list
758-
759- def _make_pod_spec(self) -> dict:
760- """Create a pod spec with some core configuration."""
761-
762- config = self.model.config
763- image_details = {
764- 'imagePath': config['image_path'],
765- }
766- if config.get('image_username', None):
767- image_details.update({'username': config['image_username'], 'password': config['image_password']})
768- pod_env = self._make_pod_env()
769- open_ports = self._make_open_ports_list()
770-
771- return {
772- 'version': 3, # otherwise resources are ignored
773- 'containers': [
774- {
775- 'name': self.app.name,
776- "imageDetails": image_details,
777- # TODO: debatable. The idea is that if you want to force an update with the same image name, you
778- # don't need to empty kubelet cache on each node to have the right version.
779- # This implies a performance drop upon start.
780- "imagePullPolicy": "Always",
781- "ports": open_ports,
782- "envConfig": pod_env,
783- },
784- ],
785- }
786-
787- def _configure_pod(self, event: ops.framework.EventBase) -> None:
788- """Assemble the pod spec and apply it, if possible.
789-
790- :param event: Event that triggered the method.
791- """
792-
793- if not self.unit.is_leader():
794- self.unit.status = ActiveStatus()
795- return
796-
797+ port = self.model.config['open_port']
798 try:
799- self._check_juju_config()
800- except TelegrafK8sCharmJujuConfigError as e:
801- self.unit.status = BlockedStatus(str(e))
802- return
803-
804- self.model.unit.status = MaintenanceStatus('Configuring pod')
805-
806- pod_spec = self._make_pod_spec()
807+ port_int = int(port)
808+ if port_int < 1024 or port_int >= 65535:
809+ logger.error("open_port wants to open a port out of range: %s", port_int)
810+ raise TelegrafK8sCharmJujuConfigError(
811+ "open_port wants to open a port out of range: {}".format(port_int)
812+ )
813+ except ValueError as e:
814+ logger.error("Failed to parse open_port: %s", e)
815+ raise TelegrafK8sCharmJujuConfigError("Failed to parse open_port: {}".format(str(e)))
816+
817+ def _render_template(self, tmpl: str, ctx: dict) -> str:
818+ """Render a Jinja2 template
819+
820+ :returns: A rendered Jinja2 template
821+ """
822+ j2env = Environment(loader=BaseLoader())
823+ j2template = j2env.from_string(tmpl)
824
825- self.model.pod.set_spec(pod_spec)
826- self.unit.status = ActiveStatus()
827+ return j2template.render(**ctx)
828
829
830 if __name__ == "__main__": # pragma: no cover
831- main(TelegrafK8sCharm)
832+ main(TelegrafK8sCharm, use_juju_for_storage=True)
833diff --git a/tests/unit/requirements.txt b/tests/unit/requirements.txt
834index 65431fc..6dd0825 100644
835--- a/tests/unit/requirements.txt
836+++ b/tests/unit/requirements.txt
837@@ -1,3 +1,4 @@
838+jinja2
839 mock
840 pytest
841 pytest-cov
842diff --git a/tests/unit/scenario.py b/tests/unit/scenario.py
843index e60569b..2a05beb 100644
844--- a/tests/unit/scenario.py
845+++ b/tests/unit/scenario.py
846@@ -39,52 +39,30 @@ TEST_JUJU_CONFIG = {
847 'logger': ["ERROR:charm:Required Juju config item(s) not set : inputs"],
848 'expected': 'Required Juju config item(s) not set : inputs',
849 },
850- 'good_config': {
851- 'config': {'image_path': 'telegraf:latest'},
852- 'logger': [],
853- 'expected': False,
854- },
855 'empty_ports_list': {
856 'config': {
857- 'open_ports': '',
858+ 'open_port': '',
859 },
860- 'logger': ['ERROR:charm:Required Juju config item(s) not set : open_ports'],
861- 'expected': 'Required Juju config item(s) not set : open_ports',
862+ 'logger': ['ERROR:charm:Required Juju config item(s) not set : open_port'],
863+ 'expected': 'Required Juju config item(s) not set : open_port',
864 },
865 'port_out_of_range': {
866 'config': {
867- 'open_ports': '9103:tcp,-1:udp',
868- 'inputs': '[[inputs.internal]]',
869- 'outputs': '[[outputs.prometheus_client]]',
870- },
871- 'logger': ['ERROR:charm:open_ports wants to open a port out of range: -1'],
872- 'expected': 'open_ports wants to open a port out of range: -1',
873- },
874- 'invalid_protocol': {
875- 'config': {
876- 'open_ports': '9103:tcp,6343:wrong_protocol',
877+ 'open_port': '10',
878 'inputs': '[[inputs.internal]]',
879 'outputs': '[[outputs.prometheus_client]]',
880 },
881- 'logger': [
882- (
883- "ERROR:charm:open_ports has wrong format: 6343:wrong_protocol. wrong_protocol is not a "
884- "valid protocol. 'tcp' or 'udp' expected."
885- )
886- ],
887- 'expected': (
888- "open_ports has wrong format: 6343:wrong_protocol. wrong_protocol is not a valid "
889- "protocol. 'tcp' or 'udp' expected."
890- ),
891+ 'logger': ['ERROR:charm:open_port wants to open a port out of range: 10'],
892+ 'expected': 'open_port wants to open a port out of range: 10',
893 },
894 'invalid_port_number': {
895 'config': {
896- 'open_ports': 'not_an_int:tcp',
897+ 'open_port': 'not_an_int',
898 'inputs': '[[inputs.internal]]',
899 'outputs': '[[outputs.prometheus_client]]',
900 },
901- 'logger': [("ERROR:charm:Failed to parse open_ports: invalid literal for int() with base 10: 'not_an_int'")],
902- 'expected': ("Failed to parse open_ports: invalid literal for int() with base 10: 'not_an_int'"),
903+ 'logger': [("ERROR:charm:Failed to parse open_port: invalid literal for int() with base 10: 'not_an_int'")],
904+ 'expected': ("Failed to parse open_port: invalid literal for int() with base 10: 'not_an_int'"),
905 },
906 }
907
908@@ -109,76 +87,40 @@ TEST_MAKE_POD_ENV = {
909 },
910 }
911
912-TEST_MAKE_OPEN_PORTS_LIST = {
913- 'good_config': {
914- 'config': {
915- 'open_ports': '9103:tcp,6343:udp',
916- },
917- 'expected_ret': [
918- {'containerPort': 9103, 'protocol': 'TCP', 'name': '9103-tcp'},
919- {'containerPort': 6343, 'protocol': 'UDP', 'name': '6343-udp'},
920- ],
921- },
922- 'empty_open_ports': {
923+TEST_GET_PEBBLE_CONFIG = {
924+ 'invalid_port_number': {
925 'config': {
926- 'open_ports': '',
927+ 'open_port': 'not_an_int',
928+ 'inputs': '[[inputs.internal]]',
929+ 'outputs': '[[outputs.prometheus_client]]',
930 },
931- 'expected_ret': None,
932+ 'logger': [("ERROR:charm:Failed to parse open_port: invalid literal for int() with base 10: 'not_an_int'")],
933+ 'expected_ret': {},
934 },
935-}
936-
937-TEST_MAKE_POD_SPEC = {
938- 'basic': {
939+ 'good_config': {
940 'config': {
941- 'agent_conf': ('[agent]\n' ' interval = "10s"\n' ' round_interval = true'),
942+ 'agent_conf': '[agent]',
943+ 'global_tags': '[global_tags]',
944+ 'inputs': '[[inputs.internal]]',
945+ 'outputs': '[[outputs.prometheus_client]]\n listen = ":9103"',
946 },
947- 'pod_spec': {
948- 'version': 3, # otherwise resources are ignored
949- 'containers': [
950- {
951- 'name': 'telegraf',
952- 'imageDetails': {
953- 'imagePath': 'telegrafcharmers/telegraf:edge',
954- },
955- 'imagePullPolicy': 'Always',
956- 'ports': [{'containerPort': 9103, 'protocol': 'TCP', 'name': '9103-tcp'}],
957- 'envConfig': {
958- 'GLOBAL_TAGS': '[global_tags]',
959- 'AGENT_CONF': ('[agent]\n' ' interval = "10s"\n' ' round_interval = true'),
960- 'OUTPUTS': '[[outputs.prometheus_client]]\n listen = ":9103"',
961- 'INPUTS': '[[inputs.internal]]\n collect_memstats = true',
962+ 'expected_ret': {
963+ "summary": "telegraf layer",
964+ "description": "telegraf layer",
965+ "services": {
966+ "telegraf": {
967+ "environment": {
968+ "AGENT_CONF": "[agent]",
969+ "GLOBAL_TAGS": "[global_tags]",
970+ "INPUTS": "[[inputs.internal]]",
971+ "OUTPUTS": '[[outputs.prometheus_client]]\n listen = ":9103"',
972 },
973+ "override": "replace",
974+ "summary": "telegraf service",
975+ "command": "/run_telegraf",
976+ "startup": "enabled",
977 }
978- ],
979- },
980- },
981- 'basic_with_image_username_and_password': {
982- 'config': {
983- 'image_path': 'telegraf:latest',
984- 'image_username': 'test_user',
985- 'image_password': 'test_password',
986- 'agent_conf': ('[agent]\n' ' interval = "10s"\n' ' round_interval = true'),
987- },
988- 'pod_spec': {
989- 'version': 3, # otherwise resources are ignored
990- 'containers': [
991- {
992- 'name': 'telegraf',
993- 'imageDetails': {
994- 'imagePath': 'telegraf:latest',
995- 'username': 'test_user',
996- 'password': 'test_password',
997- },
998- 'imagePullPolicy': 'Always',
999- 'ports': [{'containerPort': 9103, 'protocol': 'TCP', 'name': '9103-tcp'}],
1000- 'envConfig': {
1001- 'GLOBAL_TAGS': '[global_tags]',
1002- 'AGENT_CONF': ('[agent]\n' ' interval = "10s"\n' ' round_interval = true'),
1003- 'OUTPUTS': '[[outputs.prometheus_client]]\n listen = ":9103"',
1004- 'INPUTS': '[[inputs.internal]]\n collect_memstats = true',
1005- },
1006- },
1007- ],
1008+ },
1009 },
1010 },
1011 }
1012diff --git a/tests/unit/test_charm.py b/tests/unit/test_charm.py
1013index 3ee5524..8bc6047 100644
1014--- a/tests/unit/test_charm.py
1015+++ b/tests/unit/test_charm.py
1016@@ -7,10 +7,6 @@ import unittest
1017 from unittest.mock import MagicMock
1018
1019 from ops import testing
1020-from ops.model import (
1021- ActiveStatus,
1022- BlockedStatus,
1023-)
1024 from charm import (
1025 TelegrafK8sCharm,
1026 TelegrafK8sCharmJujuConfigError,
1027@@ -18,10 +14,9 @@ from charm import (
1028
1029 from scenario import (
1030 JUJU_DEFAULT_CONFIG,
1031+ TEST_GET_PEBBLE_CONFIG,
1032 TEST_JUJU_CONFIG,
1033- TEST_MAKE_OPEN_PORTS_LIST,
1034 TEST_MAKE_POD_ENV,
1035- TEST_MAKE_POD_SPEC,
1036 )
1037
1038
1039@@ -49,45 +44,15 @@ class TestTelegrafK8sCharm(unittest.TestCase):
1040 self.assertEqual(self.harness.charm._make_pod_env(), values['expected_ret'])
1041 self.harness.update_config(JUJU_DEFAULT_CONFIG) # You need to clean the config after each run
1042
1043- def test_make_pod_spec(self):
1044- """Check the crafting of the pod spec."""
1045-
1046- self.harness.update_config(JUJU_DEFAULT_CONFIG)
1047-
1048- for scenario, values in TEST_MAKE_POD_SPEC.items():
1049- with self.subTest(scenario=scenario):
1050- self.harness.update_config(values['config'])
1051- self.assertEqual(self.harness.charm._make_pod_spec(), values['pod_spec'])
1052- self.harness.update_config(JUJU_DEFAULT_CONFIG) # You need to clean the config after each run
1053-
1054- def test_configure_pod(self):
1055- """Test the pod configuration."""
1056+ def test_get_pebble_config(self):
1057+ """Test the _get_pebble_config function."""
1058 mock_event = MagicMock()
1059-
1060- self.harness.update_config(JUJU_DEFAULT_CONFIG)
1061-
1062- for is_leader in [True, False]:
1063- self.harness.set_leader(is_leader)
1064- self.harness.charm.unit.status = BlockedStatus("Testing")
1065- self.harness.charm._configure_pod(mock_event)
1066- self.assertEqual(self.harness.charm.unit.status, ActiveStatus())
1067- self.harness.update_config(JUJU_DEFAULT_CONFIG) # You need to clean the config after each run
1068-
1069- self.harness.set_leader(True)
1070- self.harness.update_config({'inputs': ''})
1071- self.harness.charm._configure_pod(mock_event)
1072- self.assertEqual(self.harness.charm.unit.status, BlockedStatus("Required Juju config item(s) not set : inputs"))
1073- self.harness.update_config(JUJU_DEFAULT_CONFIG) # You need to clean the config after each run
1074-
1075- def test_make_open_ports_list(self):
1076- """Test the _make_open_ports_list function."""
1077-
1078 self.harness.update_config(JUJU_DEFAULT_CONFIG)
1079
1080- for scenario, values in TEST_MAKE_OPEN_PORTS_LIST.items():
1081+ for scenario, values in TEST_GET_PEBBLE_CONFIG.items():
1082 with self.subTest(scenario=scenario):
1083 self.harness.update_config(values['config'])
1084- self.assertEqual(self.harness.charm._make_open_ports_list(), values['expected_ret'])
1085+ self.assertEqual(self.harness.charm._get_pebble_config(mock_event), values['expected_ret'])
1086 self.harness.update_config(JUJU_DEFAULT_CONFIG) # You need to clean the config after each run
1087
1088 def test_check_juju_config(self):

Subscribers

People subscribed via source and target branches