Changing the prometheus web-listen-port leaves the charm in a permanent error state
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Grafana Charm | Fix Released | Undecided | Brett Milford | |
Bug Description
* Problem Description *
Changing the related prometheus charm's web-listen-port leaves the grafana charm stuck in an error state in two different ways:
(1) If an update-status hook is scheduled after the port has changed, but before the grafana-
File "charm/
generate_
File "charm/
response = requests.
requests.
(2) Because juju hooks are ordered, this update-status failure prevents the grafana-
(3) Once the grafana-
File "/var/lib/
check_
File "/var/lib/
cur.execute(stmt, values)
sqlite3.
This happens because the relevant code is keying off of the URL to update entries in the database:
"if row[1] == ds["type"] and row[3] == ds["url"]:" from https:/
Since that check fails, the charm attempts to add a new data source; however, the grafana sqlite database has a UNIQUE constraint on (org_id, name), so this also fails.
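The failure mode described above can be reproduced in isolation with a minimal sketch. This is not the charm's code: the table layout is a simplified, hypothetical stand-in for Grafana's data_source table (only the UNIQUE (org_id, name) constraint is taken from this report), the sync_datasource helper is invented for illustration, and the addresses are made up. Only the type/URL comparison mirrors the check quoted above.

```python
import sqlite3


def sync_datasource(conn, ds):
    """Update the row whose type and URL match ds, otherwise insert a new row."""
    cur = conn.cursor()
    # Column order chosen so that row[1] is the type and row[3] is the URL,
    # matching the indices used in the check quoted in the bug description.
    rows = cur.execute("SELECT id, type, name, url FROM data_source").fetchall()
    for row in rows:
        if row[1] == ds["type"] and row[3] == ds["url"]:
            cur.execute(
                "UPDATE data_source SET name = ? WHERE id = ?",
                (ds["name"], row[0]),
            )
            conn.commit()
            return "updated"
    # No match on (type, url): fall through to the INSERT path.
    cur.execute(
        "INSERT INTO data_source (org_id, name, type, url) VALUES (?, ?, ?, ?)",
        (1, ds["name"], ds["type"], ds["url"]),
    )
    conn.commit()
    return "inserted"


conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema; only the UNIQUE constraint comes from the report.
conn.execute(
    "CREATE TABLE data_source ("
    "id INTEGER PRIMARY KEY, org_id INTEGER NOT NULL, name TEXT NOT NULL, "
    "type TEXT NOT NULL, url TEXT NOT NULL, UNIQUE (org_id, name))"
)

ds = {"name": "prometheus", "type": "prometheus", "url": "http://10.0.0.1:9090"}
first = sync_datasource(conn, ds)

# Same data source after the port change: the URL no longer matches,
# so the helper tries to INSERT a second row with the same (org_id, name).
moved = dict(ds, url="http://10.0.0.1:80")
try:
    outcome = sync_datasource(conn, moved)
except sqlite3.IntegrityError as exc:
    outcome = str(exc)

print(first, "|", outcome)
```

The second call never reaches the UPDATE branch, because the stored URL still carries the old port, so the INSERT collides with the existing row's name.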
* Reproducer *
juju deploy cs:grafana --config port=3000 --config install_method=snap
juju deploy cs:prometheus2 prometheus
juju add-relation prometheus:
juju add-relation telegraf:
<wait for deployment to settle>
juju config prometheus web-listen-port=80
* Suggested Solution *
This code was previously modified NOT to check the name, so that users can change the name in the Grafana configuration editor to a friendly name they prefer:
https:/
So to fix this we will need to either revert that ability or find some other way to key the change. Possible ideas:
- storing some kind of tag/metadata
- using the data source description (currently set to "name - Juju generated source")
- using the charmhelpers that store the old configuration data to check the old URL
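As a rough illustration of the last idea, the sketch below matches a row on either the current URL or a previously recorded one. The find_row_to_update helper, the previous_url parameter, the row layout, and the addresses are all hypothetical; how the old URL would actually be obtained from charmhelpers is deliberately left out.

```python
def find_row_to_update(rows, ds, previous_url=None):
    """Return the first row whose type matches and whose URL is the current
    or the previously known URL of the data source, or None if none match."""
    candidate_urls = {ds["url"]}
    if previous_url:
        candidate_urls.add(previous_url)
    for row in rows:
        # Assumed row layout: (id, type, name, url).
        if row[1] == ds["type"] and row[3] in candidate_urls:
            return row
    return None


# A row the user has renamed in the Grafana UI; it still holds the old port.
rows = [(1, "prometheus", "My friendly name", "http://10.0.0.1:9090")]
ds = {"type": "prometheus", "url": "http://10.0.0.1:80"}

without_history = find_row_to_update(rows, ds)
with_history = find_row_to_update(rows, ds, previous_url="http://10.0.0.1:9090")
print(without_history, with_history)
```

Keying on the old URL keeps the user's freedom to rename the data source, since the name is never consulted.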
* Workaround *
(1) Resolve the broken update-status hook that is failing with "Failed to establish a new connection: [Errno 111] Connection refused":
juju resolved grafana/0 --no-retry
(2) Watch "juju debug-log grafana/0" and wait for grafana-
If you get more hook failures with the "Connection refused" error, re-run the resolved command and wait again; you should eventually reach the UNIQUE constraint error.
(3) Manually update the grafana source list to use the new URL
You can attempt to do this through the Grafana UI. Settings Menu -> Data Sources -> Click the relevant entry.
However, in a very simple reproduction environment this page throws an error in the Grafana UI: "TypeError: Cannot read property 'timeInterval' of undefined". I assume this is because the reproduction environment only has prometheus2/grafana and no data source like telegraf. In that case, we can update the sqlite database manually:
juju ssh grafana/0 sudo -i
apt-get install sqlite3
sqlite3 /var/snap/
SELECT * FROM data_source;
UPDATE data_source SET url='http://
.quit
(4) Mark the error resolved WITHOUT specifying --no-retry, so that the hook is retried; it should now succeed.
juju resolved grafana/0
Related branches
- Celia Wang: Approve
- Chris Johnston (community): Approve
- Paul Goins: Needs Fixing
- Diff: 48 lines (+18/-13), 1 file modified: src/reactive/grafana.py (+18/-13)
- Drew Freiberger (community): Approve
- Paul Goins: Approve
- Diff: 53 lines (+22/-2), 2 files modified: src/reactive/grafana.py (+11/-2), src/tests/functional/tests/test_grafana.py (+11/-0)
Changed in charm-grafana: | |
status: | New → Confirmed |
tags: | added: sts |
Changed in charm-grafana: | |
assignee: | nobody → Brett Milford (brettmilford) |
Changed in charm-grafana: | |
milestone: | none → 21.01 |
status: | Confirmed → Fix Committed |
status: | Fix Committed → Fix Released |
Changed in charm-grafana: | |
milestone: | 21.01 → 21.07 |
It's interesting to note that the code doesn't traverse the UPDATE path because we're comparing URLs including the port number, which by this point has changed.
If it's possible to capture and compare the previous URL to be sure we're updating the same entry, that would be ideal.
Otherwise I think it's sufficient to compare the rest of the URL except the port.
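A minimal sketch of the port-insensitive comparison suggested here, using only the standard library. The same_endpoint_ignoring_port helper and the example addresses are hypothetical and are not taken from the merged fix.

```python
from urllib.parse import urlsplit


def same_endpoint_ignoring_port(url_a, url_b):
    """Compare scheme, host and path of two URLs while ignoring the port."""
    a, b = urlsplit(url_a), urlsplit(url_b)
    return (a.scheme, a.hostname, a.path.rstrip("/")) == (
        b.scheme,
        b.hostname,
        b.path.rstrip("/"),
    )


port_changed = same_endpoint_ignoring_port("http://10.0.0.1:9090", "http://10.0.0.1:80")
other_host = same_endpoint_ignoring_port("http://10.0.0.1:9090", "http://10.0.0.2:9090")
print(port_changed, other_host)
```

One trade-off of this approach: two data sources of the same type on the same host that differ only by port would become indistinguishable, which is why comparing against the previous URL is the safer option when it is available.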