Merge lp:~stub/charms/trusty/postgresql/rewrite into lp:charms/trusty/postgresql

Proposed by Stuart Bishop
Status: Merged
Merged at revision: 131
Proposed branch: lp:~stub/charms/trusty/postgresql/rewrite
Merge into: lp:charms/trusty/postgresql
Prerequisite: lp:~stub/charms/trusty/postgresql/rewrite-charmhelpers
Diff against target: 22814 lines (+17181/-4595)
67 files modified
.bzrignore (+3/-1)
Makefile (+64/-49)
README.md (+52/-100)
TODO (+0/-27)
actions.yaml (+11/-0)
actions/actions.py (+42/-7)
config.yaml (+327/-270)
copyright (+6/-7)
hooks/bootstrap.py (+57/-0)
hooks/client.py (+182/-0)
hooks/coordinator.py (+19/-0)
hooks/data-relation-changed (+23/-0)
hooks/data-relation-departed (+23/-0)
hooks/db-admin-relation-departed (+23/-0)
hooks/db-relation-departed (+23/-0)
hooks/decorators.py (+124/-0)
hooks/definitions.py (+86/-0)
hooks/helpers.py (+0/-197)
hooks/hooks.py (+0/-2820)
hooks/leader-elected (+23/-0)
hooks/leader-settings-changed (+23/-0)
hooks/local-monitors-relation-changed (+23/-0)
hooks/master-relation-changed (+23/-0)
hooks/master-relation-departed (+23/-0)
hooks/metrics.py (+64/-0)
hooks/nagios.py (+81/-0)
hooks/nrpe-external-master-relation-changed (+23/-0)
hooks/postgresql.py (+692/-0)
hooks/replication-relation-changed (+23/-0)
hooks/replication.py (+324/-0)
hooks/service.py (+930/-0)
hooks/start (+23/-0)
hooks/stop (+23/-0)
hooks/storage.py (+88/-0)
hooks/syslog-relation-departed (+23/-0)
hooks/syslogrel.py (+72/-0)
hooks/test_hooks.py (+0/-433)
hooks/upgrade.py (+121/-0)
hooks/wal_e.py (+129/-0)
lib/cache_settings.py (+44/-0)
lib/juju-deployer-wrapper.py (+32/-0)
lib/pg_settings_9.1.json (+2792/-0)
lib/pg_settings_9.2.json (+2858/-0)
lib/pg_settings_9.3.json (+2936/-0)
lib/pg_settings_9.4.json (+3050/-0)
lib/pgclient/metadata.yaml (+3/-4)
metadata.yaml (+44/-9)
templates/pg_backup_job.tmpl (+16/-23)
templates/pg_hba.conf.tmpl (+0/-29)
templates/pg_ident.conf.tmpl (+0/-3)
templates/postgres.cron.tmpl (+6/-6)
templates/postgresql.conf.tmpl (+0/-213)
templates/recovery.conf.tmpl (+1/-1)
templates/rsyslog_forward.conf (+2/-2)
templates/start_conf.tmpl (+0/-13)
templates/swiftwal.conf.tmpl (+0/-6)
testing/README (+0/-36)
testing/amuletfixture.py (+241/-0)
testing/jujufixture.py (+0/-297)
tests/00-setup.sh (+0/-15)
tests/01-lint.sh (+0/-3)
tests/02-unit-tests.sh (+0/-3)
tests/03-basic-amulet.py (+0/-19)
tests/obsolete.py (+2/-2)
tests/test_integration.py (+628/-0)
tests/test_postgresql.py (+711/-0)
tests/tests.yaml (+19/-0)
To merge this branch: bzr merge lp:~stub/charms/trusty/postgresql/rewrite
Reviewer: Matt Bruzek (community), status: Approve
Review via email: mp+267646@code.launchpad.net

Description of the change

The PostgreSQL charm is one of the oldest around, crufty and unmaintainable. It's time for a rewrite to make use of modern Juju and improve its reliability.

Tim Van Steenburgh (tvansteenburgh) wrote:
Stuart Bishop (stub) wrote:

They were passing on the new system being set up. I haven't been
watching the old runner.

--
Stuart Bishop <email address hidden>

Stuart Bishop (stub) wrote:

On 17 September 2015 at 14:49, Stuart Bishop
<email address hidden> wrote:
> They were passing on the new system being set up. I haven't been
> watching the old runner.

Looks like the green runs are lost in the mists of time. I'm not sure
what has changed in the month this has been waiting for review.

Stuart Bishop (stub) wrote:

Looking at http://data.vapour.ws/charm-test/charm-bundle-test-azure-651-all-machines-log,
it seems debugging output has been turned up to 11, and I think this
breaks 'juju wait'. Without 'juju wait', it is impossible to get the
tests running reliably.

It's hard to tell through all the noise, but I think here is where the
initial deploy actually terminates:

unit-postgresql-0[21563]: 2015-09-17 01:01:47 INFO
unit.postgresql/0.juju-log server.go:254 db-admin:2: *** End
'db-admin-relation-changed' hook

After this, we keep seeing noise about leadership leases, and other
stuff including the juju run commands that are checking that the logs
are quiet (thus shooting itself in the foot, because the only way to
check if the logs are quiet adds noise to the logs...).
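
The old wait algorithm is roughly the following sketch (illustrative
only, not the plugin's actual code, and the flags vary between juju
versions):

    # Serialize against each unit's hook queue with 'juju run', then
    # look for recent hook activity in the logs. The probe itself
    # generates fresh log traffic, hence the foot-shooting above.
    import subprocess
    import time

    def hook_activity():
        # Grab a window of recent log lines; assume --limit makes
        # debug-log exit after that many lines.
        out = subprocess.check_output(
            ['juju', 'debug-log', '--limit', '200'],
            universal_newlines=True)
        return 'hook' in out

    def wait_until_quiet(poll=10):
        while True:
            # Blocks until every unit's hook queue is idle, adding
            # noise to the very logs inspected above.
            subprocess.check_call(['juju', 'run', '--all', 'true'])
            if not hook_activity():
                return
            time.sleep(poll)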

This never got picked up locally, as I don't even know how to turn on
this output and don't get it here (using juju stable).

Also see Bug #1496130, where we are fixing things so the OpenStack
mojo tests can use the wait algorithm.

What is confusing me though is that the Cassandra tests last passed
Sep 4th, and I would have expected them to have started failing much
earlier (it too relies on juju wait). What is even more confusing, the
last successful run at
http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-703
leads me to the AWS log at
http://data.vapour.ws/charm-test/charm-bundle-test-aws-591-all-machines-log
which contains a whole heap of logging information from other tests,
in addition to the Cassandra logs, which means test isolation was
busted and the results bogus.

Stuart Bishop (stub) wrote:

I see. Prior to ~Sep 4th, tests were often passing but would also
often fail with various provisioning errors. For example, this one is common,
indicating units took longer than 5 minutes to provision:

amulet.helpers.TimeoutError: public-address not set for all units after 300s

After ~Sep 4th, all integration tests started failing consistently
at 'juju wait', for both Cassandra and the PostgreSQL rewrite.

--
Stuart Bishop <email address hidden>

Stuart Bishop (stub) wrote:

I added Juju 1.24-specific behavior to the juju-wait plugin, which avoids needing to sniff the logs. I've updated the package in ppa:stub/juju and I think I've queued a retest.
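
The 1.24 behavior boils down to 'every unit reports an idle
agent-status'. A minimal sketch, assuming the Juju 1.24 status YAML
layout (the packaged plugin also handles subordinates and error
states):

    # Poll 'juju status' until every unit's agent reports idle.
    import subprocess
    import time

    import yaml

    def all_agents_idle():
        status = yaml.safe_load(subprocess.check_output(
            ['juju', 'status', '--format=yaml'],
            universal_newlines=True))
        for service in (status.get('services') or {}).values():
            for unit in (service.get('units') or {}).values():
                # Juju 1.24+ includes agent-status in status output.
                if unit.get('agent-status', {}).get('current') != 'idle':
                    return False
        return True

    def wait(poll=10):
        while not all_agents_idle():
            time.sleep(poll)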

Tim Van Steenburgh (tvansteenburgh) wrote:

> They were passing on the new system being set up. I haven't been
> watching the old runner.
>
Sorry, I'm not familiar with it; where is the new system (or output from it)?

Tim Van Steenburgh (tvansteenburgh) wrote:

> Looking at http://data.vapour.ws/charm-test/charm-bundle-test-azure-651-all-
> machines-log,
> it seems debugging output has been turned up to 11, and I think this
> breaks 'juju wait'. Without 'juju wait', it is impossible to get the
> tests running reliably

I'm not sure why this would be, unless the default log level in juju itself got turned up. CI is currently running 1.24.5.1.

> What is confusing me though is that the Cassandra tests last passed
> Sep 4th, and I would have expected them to have started failing much
> earlier (it too relies on juju wait). What is even more confusing, the
> last successful run at
> http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-703
> leads me to the AWS log at
> http://data.vapour.ws/charm-test/charm-bundle-test-aws-591-all-machines-log
> which contains a whole heap of logging information from other tests,
> in addition to the Cassandra logs, which means test isolation was
> busted and the results bogus.

Wow, that is bizarre. Thanks for pointing this out; I had not seen this before. I'm not convinced that test isolation is busted, but log isolation may be. I'll look into this.

Stuart Bishop (stub) wrote:

stub@aargh:~/charms/postgresql/rewrite$ make test
sudo add-apt-repository -y ppa:juju/stable
gpg: keyring `/tmp/tmppv9t1ria/secring.gpg' created
gpg: keyring `/tmp/tmppv9t1ria/pubring.gpg' created
gpg: requesting key C8068B11 from hkp server keyserver.ubuntu.com
gpg: /tmp/tmppv9t1ria/trustdb.gpg: trustdb created
gpg: key C8068B11: public key "Launchpad Ensemble PPA" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
OK
sudo add-apt-repository -y ppa:stub/juju
gpg: keyring `/tmp/tmpfkk5hwu_/secring.gpg' created
gpg: keyring `/tmp/tmpfkk5hwu_/pubring.gpg' created
gpg: requesting key E4FD7A7A from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpfkk5hwu_/trustdb.gpg: trustdb created
gpg: key E4FD7A7A: public key "Launchpad Stub's Launchpad PPA" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
OK
[...]
Reading package lists... Done
sudo apt-get install -y \
    python3-psycopg2 python3-nose python3-flake8 amulet \
    python3-jinja2 python3-yaml juju-wait bzr \
    python3-nose-cov python3-nose-timer python-swiftclient
Reading package lists... Done
Building dependency tree
Reading state information... Done
bzr is already the newest version.
python-swiftclient is already the newest version.
python3-flake8 is already the newest version.
python3-jinja2 is already the newest version.
python3-nose is already the newest version.
python3-psycopg2 is already the newest version.
python3-yaml is already the newest version.
python3-nose-cov is already the newest version.
python3-nose-timer is already the newest version.
amulet is already the newest version.
juju-wait is already the newest version.
0 to upgrade, 0 to newly install, 0 to remove and 0 not to upgrade.
Charm Proof
I: metadata name (postgresql) must match directory name (rewrite) exactly for local deployment.
Lint check (flake8)
user configuration: /home/stub/.config/flake8
directory hooks
checking hooks/bootstrap.py
checking hooks/client.py
checking hooks/coordinator.py
checking hooks/decorators.py
checking hooks/definitions.py
checking hooks/helpers.py
checking hooks/metrics.py
checking hooks/nagios.py
checking hooks/postgresql.py
checking hooks/replication.py
checking hooks/service.py
checking hooks/storage.py
checking hooks/syslogrel.py
checking hooks/upgrade.py
checking hooks/wal_e.py
directory actions
checking actions/actions.py
directory testing
checking testing/__init__.py
checking testing/amuletfixture.py
directory tests
checking tests/obsolete.py
checking tests/test_integration.py
checking tests/test_postgresql.py
nosetests3 -sv tests/test_postgresql.py --cover-package=bootstrap,nagios,wal_e,syslogrel,replication,definitions,storage,decorators,metrics,upgrade,postgresql,helpers,coordinator,client,service \
    --with-coverage --cover-branches
test_addr_to_range (test_postgresql.TestPostgresql) ... ok
test_connect (test_postgresql.TestPostgresql) ... ok
test_convert_unit (test_postgresql.TestPostgresql) ... ok
test_create_cluster (test_postgresql.TestPostgresql) ... ok
test_drop_cluster (test_postgresql.TestPostgresql) ... ok
test_ensure_database (test_postgresql.TestPostgresql) ... ok
test_ensure_extensions (test_...

Tim Van Steenburgh (tvansteenburgh) wrote:

@stub, which cloud was that on?

I still have no idea why it literally never passes on CI. It usually takes forever to run too, and often hangs and needs to be killed. I haven't had time to dig deeper yet, but I'm hoping to soon.

Stuart Bishop (stub) wrote:

@tim It was just a different report, so the same clouds. I think I was getting confused with the Cassandra runs, which I was doing around the same time.

Previously I had been getting tests to pass, but not all the tests. There were always a few that would fail due to provisioning-type errors, the amulet error failing to list the contents of a remote directory, or a juju-deployer race.

Then around Sept 4th, something changed on your clouds and the juju logging changed. This broke 'juju wait', and thus all the PostgreSQL tests, and the Cassandra tests to boot. I've since been reworking the juju-wait plugin to be less fragile with Juju 1.23 and earlier, and to use 'agent-status' with Juju 1.24 and later. This has got the Cassandra tests green again (and the lp:~stub/charms/trusty/cassandra/spike MP green too).

Stuart Bishop (stub) wrote:

This round is looking good. Tests at http://juju-ci.vapour.ws:8080/job/charm-bundle-test-joyent/729/console were running smoothly until the env.lock bug kicked in.

Tim Van Steenburgh (tvansteenburgh) wrote:

I just emailed the juju list about the env.lock problem. If my multiple $JUJU_HOME idea works, it'll be a quick fix. That particular problem is ruining a lot of otherwise green test runs across many charms and bundles.
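
The idea is to give each concurrent run a private copy of ~/.juju,
something like this sketch:

    # Each run copies the shared juju home and locks only its own
    # copy, so concurrent runs stop contending on env.lock.
    import os
    import shutil
    import subprocess
    import tempfile

    def run_with_private_juju_home(cmd,
                                   src=os.path.expanduser('~/.juju')):
        parent = tempfile.mkdtemp(prefix='juju-home-')
        home = os.path.join(parent, 'juju')
        try:
            shutil.copytree(src, home, symlinks=True)
            env = dict(os.environ, JUJU_HOME=home)
            return subprocess.check_call(cmd, env=env)
        finally:
            shutil.rmtree(parent, ignore_errors=True)

    # e.g. run_with_private_juju_home(['juju', 'status'])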

Stuart Bishop (stub) wrote:

The remaining failure seems genuine (PG93MultiTests.test_failover, which victimizes test_replication). Given 3 nodes, dropping the master and adding a new node at the same time can cause a situation where there is no master. I have not been able to reproduce this locally. I need to trawl the logs from the CI system and try to reproduce manually on a cloud.

248. By Stuart Bishop

Work around Bug #1510000

Stuart Bishop (stub) wrote:

FWIW, this is still waiting for review so I can land it. Despite the failure of one of the tests on the CI system, this branch is still preferable to the existing known-broken charm store charm, which has its tests disabled.

249. By Stuart Bishop

Add timings

250. By Stuart Bishop

Dependency for timestamps

251. By Stuart Bishop

Skip failover tests until Bug #1511659 can be dealt with

252. By Stuart Bishop

Update tests for Juju 1.25 unit naming changes

Matt Bruzek (mbruzek) wrote:

Hello Stuart,

Thanks for the enormous amount of work on this charm! This merge is a huge refactor and a lot of files changed. From what I saw of the code and the extensive tests, it looks great. I see the automated tests are passing on Joyent and Power 8! I also tested this on the KVM local provider and the results were PASS: 13 Total: 13 (7392.816871 sec).

Charm proof now returns no errors or warnings, which was not the case with the previous branch. Thanks for working with Tim and the team to get the automated tests passing. The tests are complex and very thorough. Passing automated tests are very important and will help this charm be of the highest quality in the future.

+1 LGTM

review: Approve

Preview Diff

1=== modified file '.bzrignore'
2--- .bzrignore 2014-10-14 10:12:50 +0000
3+++ .bzrignore 2015-11-02 12:15:35 +0000
4@@ -1,4 +1,6 @@
5 _trial_temp
6 hooks/_trial_temp
7 hooks/local_state.pickle
8-lib/test-client-charm/hooks/charmhelpers
9+lib/pgclient/hooks/charmhelpers/
10+coverage
11+.coverage
12
13=== modified file 'Makefile'
14--- Makefile 2015-06-25 08:13:25 +0000
15+++ Makefile 2015-11-02 12:15:35 +0000
16@@ -1,63 +1,72 @@
17 CHARM_DIR := $(shell pwd)
18 TEST_TIMEOUT := 900
19-SERIES := $(juju get-environment default-series)
20+SERIES := $(shell juju get-environment default-series)
21+HOST_SERIES := $(shell lsb_release -sc)
22
23 default:
24 @echo "One of:"
25- @echo " make testdep"
26+ @echo " make testdeps"
27 @echo " make lint"
28- @echo " make unit_test"
29- @echo " make integration_test"
30- @echo " make integration_test_91"
31- @echo " make integration_test_92"
32- @echo " make integration_test_93"
33- @echo " make integration_test_94"
34- @echo
35- @echo "There is no 'make test'"
36-
37-test_bot_tests:
38- @echo "Installing dependencies and running automatic-testrunner tests"
39- tests/00-setup.sh
40- tests/01-lint.sh
41- tests/02-unit-tests.sh
42- tests/03-basic-amulet.py
43-
44-testdep:
45- tests/00-setup.sh
46-
47-unit_test:
48- @echo "Unit tests of hooks"
49- cd hooks && trial test_hooks.py
50-
51-integration_test:
52- @echo "PostgreSQL integration tests, all non-beta versions, ${SERIES}"
53- trial test.PG91Tests
54- trial test.PG92Tests
55- trial test.PG93Tests
56- trial test.PG94Tests
57-
58-integration_test_91:
59- @echo "PostgreSQL 9.1 integration tests, ${SERIES}"
60- trial test.PG91Tests
61-
62-integration_test_92:
63- @echo "PostgreSQL 9.2 integration tests, ${SERIES}"
64- trial test.PG92Tests
65-
66-integration_test_93:
67- @echo "PostgreSQL 9.3 integration tests, ${SERIES}"
68- trial test.PG93Tests
69-
70-integration_test_94:
71- @echo "PostgreSQL 9.4 integration tests, ${SERIES}"
72- trial test.PG94Tests
73+ @echo " make unittest"
74+ @echo " make integration"
75+ @echo " make coverage (opens browser)"
76+
77+test: testdeps lint unittest integration
78+
79+testdeps:
80+ sudo add-apt-repository -y ppa:juju/stable
81+ sudo add-apt-repository -y ppa:stub/juju
82+ sudo apt-get update
83+ifeq ($(HOST_SERIES),trusty)
84+ sudo apt-get install -y \
85+ python3-psycopg2 python3-nose python3-flake8 amulet \
86+ python3-jinja2 python3-yaml juju-wait bzr python3-amulet \
87+ python-swiftclient moreutils
88+else
89+ sudo apt-get install -y \
90+ python3-psycopg2 python3-nose python3-flake8 amulet \
91+ python3-jinja2 python3-yaml juju-wait bzr python3-amulet \
92+ python3-nose-cov python3-nose-timer python-swiftclient moreutils
93+endif
94
95 lint:
96+ @echo "Charm Proof"
97+ @charm proof
98 @echo "Lint check (flake8)"
99 @flake8 -v \
100- --exclude hooks/charmhelpers,hooks/_trial_temp \
101 --ignore=E402 \
102- hooks actions testing tests test.py
103+ --exclude=hooks/charmhelpers,__pycache__ \
104+ hooks actions testing tests
105+
106+_co=,
107+_empty=
108+_sp=$(_empty) $(_empty)
109+
110+TESTFILES=$(filter-out %/test_integration.py,$(wildcard tests/test_*.py))
111+PACKAGES=$(subst $(_sp),$(_co),$(notdir $(basename $(wildcard hooks/*.py))))
112+
113+NOSE := nosetests3 -sv
114+ifeq ($(HOST_SERIES),trusty)
115+TIMING_NOSE := nosetests3 -sv
116+else
117+TIMING_NOSE := nosetests3 -sv --with-timer
118+endif
119+
120+unittest:
121+ ${NOSE} ${TESTFILES} --cover-package=${PACKAGES} \
122+ --with-coverage --cover-branches
123+ @echo OK: Unit tests pass `date`
124+
125+coverage:
126+ ${NOSE} ${TESTFILES} --cover-package=${PACKAGES} \
127+ --with-coverage --cover-branches \
128+ --cover-erase --cover-html --cover-html-dir=coverage \
129+ --cover-min-percentage=100 || \
130+ (gnome-open coverage/index.html; false)
131+
132+integration:
133+ ${TIMING_NOSE} tests/test_integration.py 2>&1 | ts
134+ @echo OK: Integration tests pass `date`
135
136 sync:
137 @bzr cat \
138@@ -65,3 +74,9 @@
139 > .charm_helpers_sync.py
140 @python .charm_helpers_sync.py -c charm-helpers.yaml
141 @rm .charm_helpers_sync.py
142+
143+
144+# These targets are to separate the test output in the Charm CI system
145+test_integration.py%:
146+ ${TIMING_NOSE} tests/$@ 2>&1 | ts
147+ @echo OK: $@ tests pass `date`
148
149=== modified file 'README.md'
150--- README.md 2015-06-22 18:40:20 +0000
151+++ README.md 2015-11-02 12:15:35 +0000
152@@ -26,14 +26,8 @@
153
154 # Usage
155
156-This charm supports several deployment models:
157-
158- - A single service containing one unit. This provides a 'standalone'
159- environment.
160-
161- - A service containing multiple units. One unit will be a 'master', and every
162- other unit is a 'hot standby'. The charm sets up and maintains replication
163-for you, using standard PostgreSQL streaming replication.
164+This charm can deploy a single standalone PostgreSQL unit, or a service
165+containing a single master unit and one or more replicas.
166
167 To setup a single 'standalone' service::
168
169@@ -42,21 +36,18 @@
170
171 ## Scale Out Usage
172
173-To replicate this 'standalone' database to a 'hot standby', turning the
174-existing unit into a 'master'::
175+To add a replica to an existing service::
176
177 juju add-unit pg-a
178
179-To deploy a new service containing a 'master' and two 'hot standbys'::
180+To deploy a new service containing a master and two hot standby replicas::
181
182- juju deploy -n 2 postgresql pg-b
183- [ ... wait until units are stable ... ]
184- juju add-unit pg-b
185+ juju deploy -n 3 postgresql pg-b
186
187 You can remove units as normal. If the master unit is removed, failover occurs
188-and the most up to date 'hot standby' is promoted to 'master'. The
189-'db-relation-changed' and 'db-admin-relation-changed' hooks are fired, letting
190-clients adjust::
191+and the most up to date hot standby is promoted to the master. The
192+'db-relation-changed' and 'db-admin-relation-changed' hooks are fired,
193+letting clients adjust::
194
195 juju remove-unit pg-b/0
196
197@@ -74,12 +65,6 @@
198
199 ## Known Limitations and Issues
200
201-⚠ Due to current [limitations][1] with juju, you cannot reliably
202-create a service initially containing more than 2 units (eg. juju deploy
203--n 3 postgresql). Instead, you must first create a service with 2 units.
204-Once the environment is stable and all the hooks have finished running,
205-you may add more units.
206-
207 ⚠ Do not attempt to relate client charms to a PostgreSQL service containing
208 multiple units unless you know the charm supports a replicated service.
209
210@@ -106,20 +91,23 @@
211 practice with your database permissions will make your life difficult
212 when you need to recover after failure.
213
214-_Always_ set the 'roles' relationship setting when joining a
215-relationship. _Always_ grant permissions to database roles for _all_
216-database objects your charm creates. _Never_ rely on access permissions
217-given directly to a user, either explicitly or implicitly (such as being
218-the user who created a table). Consider the users you are provided by
219-the PostgreSQL charm as ephemeral. Any rights granted directly to them
220-will be lost if relations are recreated, as the generated usernames will
221-be different. _If you don't follow this advice, you will need to
222-manually repair permissions on all your database objects after any of
223-the available recovery mechanisms._
224+PostgreSQL has comprehensive database security, including ownership
225+and permissions on database objects. By default, any objects a client
226+service creates will be owned by a user with the same name as the
227+client service and inaccessible to other users. To share data, it
228+is best to create new roles, grant the relevant permissions and object
229+ownership to the new roles and finally grant these roles to the users
230+your services can connect as. This also makes disaster recovery easier.
231+If you restore a database into an identical Juju environment, then
232+the service names and usernames will be the same and database permissions
233+will match. However, if you restore a database into an environment
234+with different client service names, then the usernames will not match
235+and the new users will not have access to your data.
236
237 Learn about the SQL `GRANT` statement in the excellent [PostgreSQL
238 reference guide][3].
239
240+
241 ### block-storage-broker
242
243 If you are using external storage provided by the block storage broker,
244@@ -163,10 +151,9 @@
245 will create it if necessary. If your charm sets this, then it must wait
246 until a matching `database` value is presented on the PostgreSQL side of
247 the relation (ie. `relation-get database` returns the value you set).
248-- `roles`: Optional. A comma separated list of database roles to grant the
249+- `roles`: Deprecated. A comma separated list of database roles to grant the
250 database user. Typically these roles will have been granted permissions to
251- access the tables and other database objects. Do not grant permissions
252- directly to juju generated database users, as the charm may revoke them.
253+ access the tables and other database objects.
254 - `extensions`: Optional. A comma separated list of required postgresql
255 extensions.
256
257@@ -210,28 +197,26 @@
258 A PostgreSQL service may contain multiple units (a single master, and
259 optionally one or more hot standbys). The client charm can tell which
260 unit in a relation is the master and which are hot standbys by
261-inspecting the 'state' property on the relation, and it needs to be
262-aware of how many units are in the relation by using the 'relation-list'
263-hook tool.
264-
265-If there is a single PostgreSQL unit related, the state will be
266-'standalone'. All database connections of course go to this unit.
267-
268-If there is more than one PostgreSQL unit related, the client charm
269-must only use units with state set to 'master' or 'hot standby'.
270-The unit with 'master' state can accept read and write connections. The
271-units with 'hot standby' state can accept read-only connections, and
272-any attempted writes will fail. Units with any other state must not be
273-used and should be ignored ('standalone' units are new units joining the
274-service that are not yet setup, and 'failover' state will occur when the
275-master unit is being shutdown and a new master is being elected).
276+inspecting the 'state' property on the relation.
277+
278+The 'standalone' state is deprecated, and when a unit advertises itself
279+as 'standalone' you should treat it like a 'master'. The state only exists
280+for backwards compatibility and will be removed soon.
281+
282+Writes must go to the unit identifying itself as 'master' or 'standalone'.
283+If you send writes to a 'hot standby', they will fail.
284+
285+Reads may go to any unit. Ideally they should be load balanced over
286+the 'hot standby' units. If you need to ensure consistency, you may
287+need to read from the 'master'.
288+
289+Units in any other state, including no state, should not be used and
290+connections to them will likely fail. These units may still be setting
291+up, or performing a maintenance operation such as a failover.
292
293 The client charm needs to watch for state changes in its
294-relation-changed hook. New units may be added to a single unit service,
295-and the client charm must stop using existing 'standalone' unit and wait
296-for 'master' and 'hot standby' units to appear. Units may be removed,
297-possibly causing a 'hot standby' unit to be promoted to a master, or
298-even having the service revert to a single 'standalone' unit.
299+relation-changed hook. During failover, one of the existing 'hot standby'
300+units will change into a 'master'.
301
302
303 ## Example client hooks
304@@ -249,7 +234,6 @@
305 @hook
306 def db_relation_joined():
307 relation_set('database', config('database')) # Explicit database name
308- relation_set('roles', 'reporting,standard') # DB roles required
309
310 @hook('db-relation-changed', 'db-relation-departed')
311 def db_relation_changed():
312@@ -271,9 +255,7 @@
313 conn_str = conn_str_tmpl.format(**relation_get(unit=db_unit))
314 remote_state = relation_get('state', db_unit)
315
316- if remote_state == 'standalone' and len(active_db_units) == 1:
317- master_conn_str = conn_str
318- elif relation_state == 'master':
319+ if remote_state in ('master', 'standalone'):
320 master_conn_str = conn_str
321 elif relation_state == 'hot standby':
322 slave_conn_strs.append(conn_str)
323@@ -284,46 +266,17 @@
324 hooks.execute(sys.argv)
325
326
327-## Upgrade-charm hook notes
328-
329-The PostgreSQL charm has deprecated volume-map and volume-ephemeral-storage
330-configuration options in favor of using the storage subordinate charm for
331-general external storage management. If the installation being upgraded is
332-using these deprecated options, there are a couple of manual steps necessary
333-to finish migration and continue using the current external volumes.
334-Even though all data will remain intact, and PostgreSQL service will continue
335-running, the upgrade-charm hook will intentionally fail and exit 1 as well to
336-raise awareness of the manual procedure which will also be documented in the
337-juju logs on the PostgreSQL units.
338-
339-The following steps must be additionally performed to continue using external
340-volume maps for the PostgreSQL units once juju upgrade-charm is run from the
341-command line:
342- 1. cat > storage.cfg <<EOF
343- storage:
344- provider:block-storage-broker
345- root: /srv/data
346- volume_map: "{postgresql/0: your-vol-id, postgresql/1: your-2nd-vol-id}"
347- EOF
348- 2. juju deploy --config storage.cfg storage
349- 3. juju deploy block-storage-broker
350- 4. juju add-relation block-storage-broker storage
351- 5. juju resolved --retry postgresql/0 # for each postgresql unit running
352- 6. juju add-relation postgresql storage
353-
354-
355 # Point In Time Recovery
356
357-The PostgreSQL charm has experimental support for log shipping and point
358-in time recovery. This feature uses the wal-e[2] tool, and requires access
359-to Amazon S3, Microsoft Azure Block Storage or Swift. This feature is
360-flagged as experimental because it has only been tested with Swift, and
361-not yet been tested under load. It also may require some API changes,
362-particularly on how authentication credentials are accessed when a standard
363-emerges. The charm can be configured to perform regular filesystem backups
364-and ship WAL files to the object store. Hot standbys will make use of the
365-archived WAL files, allowing them to resync after extended netsplits or
366-even let you turn off streaming replication entirely.
367+The PostgreSQL charm has support for log shipping and point in time
368+recovery. This feature uses the wal-e[2] tool, which will be
369+installed from the Launchpad PPA ppa:stub/pgcharm. This feature
370+requires access to either Amazon S3, Microsoft Azure Block Storage or
371+Swift. This feature is experimental because it has only been tested with
372+Swift. The charm can be configured to perform regular filesystem backups
373+and ship WAL files to the object store. Hot standbys will make use of
374+the archived WAL files, allowing them to resync after extended netsplits
375+or even let you turn off streaming replication entirely.
376
377 With a base backup and the WAL archive you can perform point in time
378 recovery, but this is still a manual process and the charm does not
379@@ -336,8 +289,7 @@
380 service.
381
382 To enable the experimental wal-e support with Swift, you will need to
383-use Ubuntu 14.04 (Trusty), and set the service configuration settings
384-similar to the following::
385+set the service configuration settings similar to the following::
386
387 postgresql:
388 wal_e_storage_uri: swift://mycontainer
389
390=== removed file 'TODO'
391--- TODO 2013-02-20 13:42:08 +0000
392+++ TODO 1970-01-01 00:00:00 +0000
393@@ -1,27 +0,0 @@
394-- Fix "run" function in hooks.py to log error rather than exiting silently
395- (exceptions emitted at INFO level)
396-
397-- Pick a better way to get machine specs than hacky bash functions - facter?
398-
399-- Specify usernames when adding a relation, rather than generating them
400- automagically. This will interact better with connection poolers.
401-
402-- If we have three services deployed, related via master/slave, then the
403- repmgr command line tools only work on the service containing the master
404- unit, as the slave-only services have neither ssh nor postgresql access
405- to each other. This may only be a problem with master/slave relationships
406- between services and failover, and it is unclear how that would work
407- anyway (get failover working fist for units within a service).
408-
409-- Drop config_change_command from config.yaml, or make it work.
410- "restart" should be the default, as there may be required config changes
411- to get replication working. "reload" would mean blocking a hook requesting
412- config changes requireing a restart, emitting warnings so the admin can
413- do the restart manually. Is this good juju? An admin can already control
414- when restarts happen by when they change the juju configuration, and can
415- avoid unnecessary restarts when adding replicas by ensuring required
416- configuration changes are made before hand.
417-
418-- No hook is invoked when removing a unit, leaving the host standby PostgreSQL
419- cluster running and attempting to replicate from a master that is refusing
420- its connections. Bug #872264.
421
422=== modified file 'actions.yaml'
423--- actions.yaml 2015-06-25 11:35:33 +0000
424+++ actions.yaml 2015-11-02 12:15:35 +0000
425@@ -2,3 +2,14 @@
426 description: Pause replication replay on a hot standby unit.
427 replication-resume:
428 description: Resume replication replay on a hot standby unit.
429+
430+# Revisit this when actions are more mature. Per Bug #1483525, it seems
431+# impossible to return filenames in our results.
432+# backup:
433+# description: Run backups
434+# params:
435+# type:
436+# type: string
437+# enum: [dump]
438+# description: Type of backup. Currently only 'dump' supported.
439+# default: dump
440
441=== modified file 'actions/actions.py'
442--- actions/actions.py 2015-06-25 11:35:33 +0000
443+++ actions/actions.py 2015-11-02 12:15:35 +0000
444@@ -25,17 +25,21 @@
445 sys.path.append(hooks_dir)
446
447 from charmhelpers.core import hookenv
448-import hooks
449+
450+import postgresql
451
452
453 def replication_pause(params):
454- offset = hooks.postgresql_wal_received_offset()
455- if offset is None:
456+ if not postgresql.is_secondary():
457 hookenv.action_fail('Not a hot standby')
458 return
459+
460+ offset = postgresql.wal_received_offset()
461 hookenv.action_set(dict(offset=offset))
462
463- cur = hooks.db_cursor(autocommit=True)
464+ con = postgresql.connect()
465+ con.autocommit = True
466+ cur = con.cursor()
467 cur.execute('SELECT pg_is_xlog_replay_paused()')
468 if cur.fetchone()[0] is True:
469 hookenv.action_fail('Already paused')
470@@ -45,13 +49,16 @@
471
472
473 def replication_resume(params):
474- offset = hooks.postgresql_wal_received_offset()
475- if offset is None:
476+ if not postgresql.is_secondary():
477 hookenv.action_fail('Not a hot standby')
478 return
479+
480+ offset = postgresql.wal_received_offset()
481 hookenv.action_set(dict(offset=offset))
482
483- cur = hooks.db_cursor(autocommit=True)
484+ con = postgresql.connect()
485+ con.autocommit = True
486+ cur = con.cursor()
487 cur.execute('SELECT pg_is_xlog_replay_paused()')
488 if cur.fetchone()[0] is False:
489 hookenv.action_fail('Already resumed')
490@@ -60,6 +67,34 @@
491 hookenv.action_set(dict(result='Resumed'))
492
493
494+# Revisit this when actions are more mature. Per Bug #1483525, it seems
495+# impossible to return filenames in our results.
496+#
497+# def backup(params):
498+# assert params['type'] == 'dump'
499+# script = os.path.join(helpers.scripts_dir(), 'pg_backup_job')
500+# cmd = ['sudo', '-u', 'postgres', '-H', script, str(params['retention'])]
501+# hookenv.action_set(dict(command=' '.join(cmd)))
502+#
503+# try:
504+# subprocess.check_call(cmd)
505+# except subprocess.CalledProcessError as x:
506+# hookenv.action_fail(str(x))
507+# return
508+#
509+# backups = {}
510+# for filename in os.listdir(backups_dir):
511+# path = os.path.join(backups_dir, filename)
512+# if not os.path.isfile(path):
513+# continue
514+# backups['{}.{}'.format(filename
515+# backups[filename] = dict(name=filename,
516+# size=os.path.getsize(path),
517+# path=path,
518+# scp_path='{}:{}'.format(unit, path))
519+# hookenv.action_set(dict(backups=backups))
520+
521+
522 def main(argv):
523 action = os.path.basename(argv[0])
524 params = hookenv.action_get()
525
526=== modified file 'config.yaml'
527--- config.yaml 2015-07-10 17:05:44 +0000
528+++ config.yaml 2015-11-02 12:15:35 +0000
529@@ -3,12 +3,12 @@
530 default: ""
531 type: string
532 description: >
533- A comma-separated list of IP Addresses (or single IP) admin tools like
534- pgAdmin3 will connect from, this is most useful for developers running
535- juju in local mode who need to connect tools like pgAdmin to a postgres.
536- The IP addresses added here will be included in the pg_hba.conf file
537- allowing ip connections to all databases on the server from the given
538- using md5 password encryption.
539+ A comma-separated list of IP Addresses (or single IP) admin tools
540+ like pgAdmin3 will connect from. The IP addresses added here will
541+ be included in the pg_hba.conf file allowing ip connections to all
542+ databases on the server from the given IP addresses using md5
543+ password encryption. IP address ranges are also supported, using
544+ the standard format described in the PostgreSQL reference guide.
545 locale:
546 default: "C"
547 type: string
548@@ -22,53 +22,294 @@
549 description: >
550 Default encoding used to store text in this service. Can only be
551 set when deploying the first unit of a service.
552- extra-packages:
553+ relation_database_privileges:
554+ default: "ALL"
555+ type: string
556+ description: >
557+ A comma-separated list of database privileges to grant to relation
558+ users on their databases. The default privileges allow connecting
559+ to the database (CONNECT), creating objects such as tables (CREATE),
560+ and creating temporary tables (TEMPORARY). Client charms that create
561+ objects in the database are responsible for granting suitable
562+ access on those objects to other roles and users (or PUBLIC) using
563+ standard GRANT statements.
564+ extra_packages:
565 default: ""
566 type: string
567- description: Extra packages to install on the postgresql service units.
568+ description: >
569+ Space separated list of extra packages to install.
570 dumpfile_location:
571 default: "None"
572 type: string
573 description: >
574 Path to a dumpfile to load into DB when service is initiated.
575 version:
576- default: null
577+ default: ""
578 type: string
579 description: >
580 Version of PostgreSQL that we want to install. Supported versions
581- are "9.1", "9.2", "9.3". The default version for the deployed Ubuntu
582- release is used when the version is not specified.
583- cluster_name:
584- default: "main"
585- type: string
586- description: Name of the cluster we want to install the DBs into
587- listen_ip:
588- default: "*"
589- type: string
590- description: IP to listen on
591+ are "9.1", "9.2", "9.3" & "9.4". The default version for the
592+ deployed Ubuntu release is used when the version is not specified.
593+ extra_pg_conf:
594+ # The defaults here match the defaults chosen by the charm,
595+ # so removing them will not change them. They are listed
596+ # as documentation. The charm actually loads the non-calculated
597+ # defaults from this config.yaml file to make it unlikely it will
598+ # get out of sync with reality.
599+ default: |
600+ # Additional service specific postgresql.conf settings.
601+ listen_addresses='*'
602+ max_connections=100
603+ ssl=true
604+ log_timezone=UTC
605+ log_checkpoints=true
606+ log_connections=true
607+ log_disconnections=true
608+ log_autovacuum_min_duration=-1
609+ log_line_prefix='%t [%p]: [%l-1] db=%d,user=%u '
610+ archive_mode=true
611+ archive_command='/bin/true'
612+ hot_standby=true
613+ max_wal_senders=80
614+ # max_wal_senders=num_units * 2 + 5
615+ # wal_level=hot_standby (<9.4) or logical (>=9.4)
616+ # shared_buffers=total_ram*0.25
617+ # effective_cache_size=total_ram*0.75
618+ default_statistics_target=250
619+ from_collapse_limit=16
620+ join_collapse_limit=16
621+ wal_buffers=-1
622+ checkpoint_completion_target=0.9
623+ password_encryption=true
624+ max_connections=100
625+ type: string
626+ description: >
627+ postgresql.conf settings, one per line in standard key=value
628+ PostgreSQL format. These settings will generally override
629+ any values selected by the charm. The charm however will
630+ attempt to ensure minimum requirements for the charm's
631+ operation are met.
632+ extra_pg_auth:
633+ type: string
634+ default: ""
635+ description: >
636+ A comma-separated list of extra pg_hba.conf auth rules.
637+ This will be written to the pg_hba.conf file, one line per rule.
638+ Note that this should not be needed as db relations already create
639+ those rules the right way. Only use this if you really need to
640+ (e.g. on a development environment), or are connecting juju managed
641+ databases to external managed systems, or configuring replication
642+ between unrelated PostgreSQL services using the manual_replication
643+ option.
644+ performance_tuning:
645+ default: "Mixed"
646+ type: string
647+ description: >
648+ DEPRECATED AND IGNORED. The pgtune project has been abandoned
649+ and the packages dropped from Debian and Ubuntu. The charm
650+ still performs some basic tuning, which users can tweak using
651+ extra_pg_config.
652+ manual_replication:
653+ type: boolean
654+ default: False
655+ description: >
656+ Enable or disable charm managed replication. When manual_replication
657+ is True, the operator is responsible for maintaining recovery.conf
658+ and performing any necessary database mirroring. The charm will
659+ still advertise the unit as standalone, master or hot standby to
660+ relations based on whether the system is in recovery mode or not.
661+ Note that this option makes it possible to create a PostgreSQL
662+ service with multiple master units, which is a very silly thing
663+ to do unless you are also using multi-master software like BDR.
664+ backup_schedule:
665+ default: "13 4 * * *"
666+ type: string
667+ description: Cron-formatted schedule for database backups.
668+ backup_retention_count:
669+ default: 7
670+ type: int
671+ description: Number of recent backups to retain.
672+ nagios_context:
673+ default: "juju"
674+ type: string
675+ description: >
676+ Used by the nrpe subordinate charms.
677+ A string that will be prepended to instance name to set the host name
678+ in nagios. So for instance the hostname would be something like:
679+ juju-postgresql-0
680+ If you're running multiple environments with the same services in them
681+ this allows you to differentiate between them.
682+ nagios_servicegroups:
683+ default: ""
684+ type: string
685+ description: >
686+ A comma-separated list of nagios servicegroups.
687+ If left empty, the nagios_context will be used as the servicegroup.
688+ pgdg:
689+ description: >
690+ Enable the PostgreSQL Global Development Group APT repository
691+ (https://wiki.postgresql.org/wiki/Apt). This package source provides
692+ official PostgreSQL packages for Ubuntu LTS releases beyond those
693+ provided by the main Ubuntu archive.
694+ type: boolean
695+ default: false
696+ install_sources:
697+ description: >
698+ List of extra package sources, per charm-helpers standard.
699+ YAML format.
700+ type: string
701+ default: ""
702+ install_keys:
703+ description: >
704+ List of signing keys for install_sources package sources, per
705+ charmhelpers standard. YAML format.
706+ type: string
707+ default: ""
708+ wal_e_storage_uri:
709+ type: string
710+ default: ""
711+ description: |
712+ EXPERIMENTAL.
713+ Specify storage to be used by WAL-E. Every PostgreSQL service must use
714+ a unique URI. Backups will be unrecoverable if it is not unique. The
715+ URI's scheme must be one of 'swift' (OpenStack Swift), 's3' (Amazon AWS)
716+ or 'wabs' (Windows Azure). For example:
717+ 'swift://some-container/directory/or/whatever'
718+ 's3://some-bucket/directory/or/whatever'
719+ 'wabs://some-bucket/directory/or/whatever'
720+ Setting the wal_e_storage_uri enables regular WAL-E filesystem level
721+ backups (per wal_e_backup_schedule), and log shipping to the configured
722+ storage. Point-in-time recovery becomes possible, as is disabling the
723+ streaming_replication configuration item and relying solely on
724+ log shipping for replication.
725+ wal_e_backup_schedule:
726+ type: string
727+ default: "13 0 * * *"
728+ description: >
729+ EXPERIMENTAL.
730+ Cron-formatted schedule for WAL-E database backups. If
731+ wal_e_backup_schedule is unset, WAL files will never be removed from
732+ WAL-E storage.
733+ wal_e_backup_retention:
734+ type: int
735+ default: 2
736+ description: >
737+ EXPERIMENTAL.
738+ Number of recent base backups and WAL files to retain.
739+ You need enough space for this many backups plus one more, as
740+ an old backup will only be removed after a new one has been
741+ successfully made to replace it.
742+ streaming_replication:
743+ type: boolean
744+ default: true
745+ description: >
746+ Enable streaming replication. Normally, streaming replication is
747+ always used, and any log shipping configured is used as a fallback.
748+ Turning this off without configuring log shipping is an error.
749+ os_username:
750+ type: string
751+ default: ""
752+ description: EXPERIMENTAL. OpenStack Swift username.
753+ os_password:
754+ type: string
755+ default: ""
756+ description: EXPERIMENTAL. OpenStack Swift password.
757+ os_auth_url:
758+ type: string
759+ default: ""
760+ description: EXPERIMENTAL. OpenStack Swift authentication URL.
761+ os_tenant_name:
762+ type: string
763+ default: ""
764+ description: EXPERIMENTAL. OpenStack Swift tenant name.
765+ aws_access_key_id:
766+ type: string
767+ default: ""
768+ description: EXPERIMENTAL. Amazon AWS access key id.
769+ aws_secret_access_key:
770+ type: string
771+ default: ""
772+ description: EXPERIMENTAL. Amazon AWS secret access key.
773+ wabs_account_name:
774+ type: string
775+ default: ""
776+ description: EXPERIMENTAL. Windows Azure account name.
777+ wabs_access_key:
778+ type: string
779+ default: ""
780+ description: EXPERIMENTAL. Windows Azure access key.
781+ package_status:
782+ default: "install"
783+ type: string
784+ description: >
785+ The status of service-affecting packages will be set to this
786+ value in the dpkg database. Useful valid values are "install"
787+ and "hold".
788+ # statsd-compatible metrics
789+ metrics_target:
790+ default: ""
791+ type: string
792+ description: >
793+ Destination for statsd-format metrics, format "host:port". If
794+ not present and valid, metrics are disabled.
795+ metrics_prefix:
796+ default: "dev.$UNIT.postgresql"
797+ type: string
798+ description: >
799+ Prefix for metrics. Special value $UNIT can be used to include the
800+ name of the unit in the prefix.
801+ metrics_sample_interval:
802+ default: 5
803+ type: int
804+ description: Period, in minutes, for the metrics cron job.
805+
806+
807+ # DEPRECATED SETTINGS.
808+ # Remove them one day. They remain here to avoid making existing
809+ # configurations fail.
810+ advisory_lock_restart_key:
811+ default: 765
812+ type: int
813+ description: >
814+ DEPRECATED AND IGNORED.
815+ An advisory lock key used internally by the charm. You do not need
816+ to change it unless it happens to conflict with an advisory lock key
817+ being used by your applications.
818+ extra-packages:
819+ default: ""
820+ type: string
821+ description: DEPRECATED. Use extra_packages.
822 listen_port:
823- default: null
824+ default: -1
825 type: int
826- description: Port to listen on. Default is automatically assigned.
827+ description: >
828+ DEPRECATED. Use extra_pg_conf.
829+ Port to listen on. Default is automatically assigned.
830 max_connections:
831 default: 100
832 type: int
833- description: Maximum number of connections to allow to the PG database
834+ description: >
835+ DEPRECATED. Use extra_pg_conf.
836+ Maximum number of connections to allow to the PG database
837 max_prepared_transactions:
838 default: 0
839 type: int
840 description: >
841- Maximum number of prepared two phase commit transactions, waiting
842- to be committed. Defaults to 0. as using two phase commit without
843- a process to monitor and resolve lost transactions is dangerous.
844+ DEPRECATED. Use extra_pg_conf.
845+ Maximum number of prepared two phase commit transactions, waiting
846+ to be committed. Defaults to 0, as using two phase commit without
847+ a process to monitor and resolve lost transactions is dangerous.
848 ssl:
849 default: "True"
850 type: string
851- description: Whether PostgreSQL should talk SSL
852+ description: >
853+ DEPRECATED. Use extra_pg_conf.
854+ Whether PostgreSQL should talk SSL
855 log_min_duration_statement:
856 default: -1
857 type: int
858 description: >
859+ DEPRECATED. Use extra_pg_conf.
860 -1 is disabled, 0 logs all statements
861 and their durations, > 0 logs only
862 statements running at least this number
863@@ -76,19 +317,21 @@
864 log_checkpoints:
865 default: False
866 type: boolean
867- description: Log checkpoints
868+ description: >
869+ DEPRECATED. Use extra_pg_conf.
870 log_connections:
871 default: False
872 type: boolean
873- description: Log connections
874+ description: DEPRECATED. Use extra_pg_conf.
875 log_disconnections:
876 default: False
877 type: boolean
878- description: Log disconnections
879+ description: DEPRECATED. Use extra_pg_conf.
880 log_temp_files:
881 default: "-1"
882 type: string
883 description: >
884+ DEPRECATED. Use extra_pg_conf.
885 Log creation of temporary files larger than the threshold.
886 -1 disables the feature, 0 logs all temporary files, or specify
887 the threshold size with an optional unit (eg. "512KB", default
888@@ -97,6 +340,7 @@
889 default: "%t [%p]: [%l-1] db=%d,user=%u "
890 type: string
891 description: |
892+ DEPRECATED. Use extra_pg_conf.
893 special values:
894 %a = application name
895 %u = user name
896@@ -119,55 +363,68 @@
897 log_lock_waits:
898 default: False
899 type: boolean
900- description: log lock waits >= deadlock_timeout
901+ description: DEPRECATED. Use extra_pg_conf.
902 log_timezone:
903 default: "UTC"
904 type: string
905- description: Log timezone
906+ description: DEPRECATED. Use extra_pg_conf.
907 autovacuum:
908 default: True
909 type: boolean
910 description: >
911+ DEPRECATED. Use extra_pg_conf.
912 Autovacuum should almost always be running. If you want to turn this
913 off, you are probably following out of date documentation.
914 log_autovacuum_min_duration:
915 default: -1
916 type: int
917 description: >
918+ DEPRECATED. Use extra_pg_conf.
919 -1 disables, 0 logs all actions and their durations, > 0 logs only
920 actions running at least this number of milliseconds.
921 autovacuum_analyze_threshold:
922 default: 50
923 type: int
924- description: min number of row updates before analyze
925+ description: >
926+ DEPRECATED. Use extra_pg_conf.
927+ min number of row updates before analyze
928 autovacuum_vacuum_scale_factor:
929 default: 0.2
930 type: float
931- description: Fraction of table size before vacuum
932+ description: >
933+ DEPRECATED. Use extra_pg_conf.
934+ Fraction of table size before vacuum
935 autovacuum_analyze_scale_factor:
936 default: 0.1
937 type: float
938- description: Fraction of table size before analyze
939+ description: >
940+ DEPRECATED. Use extra_pg_conf.
941+ Fraction of table size before analyze
942 autovacuum_vacuum_cost_delay:
943 default: "20ms"
944 type: string
945 description: >
946+ DEPRECATED. Use extra_pg_conf.
947 Default vacuum cost delay for autovacuum, in milliseconds;
948 -1 means use vacuum_cost_delay
949 search_path:
950 default: "\"$user\",public"
951 type: string
952 description: >
953+ DEPRECATED. Use extra_pg_conf.
954 Comma separated list of schema names for
955 the default SQL search path.
956 standard_conforming_strings:
957 default: True
958 type: boolean
959- description: Standard conforming strings
960+ description: >
961+ DEPRECATED. Use extra_pg_conf.
962+ Standard conforming strings
963 hot_standby:
964 default: False
965 type: boolean
966 description: >
967+ DEPRECATED. Use extra_pg_conf.
968 Hot standby or warm standby. When True, queries can be run against
969 the database when in recovery or standby mode (ie. replicated).
970 Overridden when service contains multiple units.
971@@ -175,6 +432,7 @@
972 default: False
973 type: boolean
974 description: >
975+ DEPRECATED. Use extra_pg_conf.
976 Hot standby feedback, informing a master about in progress
977 transactions on a streaming hot standby and allowing the master to
978 defer cleanup and avoid query cancelations on the hot standby.
979@@ -182,6 +440,7 @@
980 default: minimal
981 type: string
982 description: >
983+ DEPRECATED. Use extra_pg_conf.
984 'minimal', 'archive', 'hot_standby' or 'logical'. Defines how much
985 information is written to the WAL. Set to 'minimal' for stand alone
986 databases and 'hot_standby' for replicated setups. Overridden by
987@@ -190,6 +449,7 @@
988 default: 0
989 type: int
990 description: >
991+ DEPRECATED. Use extra_pg_conf.
992 Maximum number of hot standbys that can connect using
993 streaming replication. Set this to the expected maximum number of
994 hot standby units to avoid unnecessary blocking and database restarts.
995@@ -198,6 +458,7 @@
996 default: 0
997 type: int
998 description: >
999+ DEPRECATED. Use extra_pg_conf.
1000 Number of old WAL files to keep, providing a larger buffer for
1001 streaming hot standbys to catch up from when lagged. Each WAL file
1002 is 16MB in size. The WAL files are the buffer of how far a
1003@@ -208,6 +469,7 @@
1004 default: 5000
1005 type: int
1006 description: >
1007+ DEPRECATED. Use extra_pg_conf.
1008 Value of wal_keep_segments used when this service is replicated.
1009 This setting only exists to provide a sane default when replication
1010 is requested (so it doesn't fail) and nobody bothered to change the
1011@@ -216,6 +478,7 @@
1012 default: False
1013 type: boolean
1014 description: >
1015+ DEPRECATED. Use extra_pg_conf.
1016 Enable archiving of WAL files using the command specified by
1017 archive_command. If archive_mode is enabled and archive_command not
1018 set, then archiving is deferred until archive_command is set and the
1019@@ -224,45 +487,38 @@
1020 default: ""
1021 type: string
1022 description: >
1023+ DEPRECATED. Use extra_pg_conf.
1024 Command used to archive WAL files when archive_mode is set and
1025 wal_level > minimal.
1026 work_mem:
1027 default: "1MB"
1028 type: string
1029 description: >
1030- Working Memory.
1031+ DEPRECATED. Use extra_pg_conf. Working Memory.
1032 Ignored unless 'performance_tuning' is set to 'manual'.
1033 maintenance_work_mem:
1034 default: "1MB"
1035 type: string
1036 description: >
1037- Maintenance working memory.
1038+ DEPRECATED. Use extra_pg_conf. Maintenance working memory.
1039 Ignored unless 'performance_tuning' is set to 'manual'.
1040- performance_tuning:
1041- default: "Mixed"
1042- type: string
1043- description: >
1044- Possible values here are "manual", "DW" (data warehouse),
1045- "OLTP" (online transaction processing), "Web" (web application),
1046- "Desktop" or "Mixed". When this is set to a value other than
1047- "manual", the charm invokes the pgtune tool to tune a number
1048- of performance parameters based on the specified load type.
1049- pgtune gathers information about the node on which you are deployed and
1050- tries to make intelligent guesses about what tuning parameters to set
1051- based on available RAM and CPU under the assumption that it's the only
1052- significant service running on this node.
1053 kernel_shmall:
1054 default: 0
1055 type: int
1056- description: Total amount of shared memory available, in bytes.
1057+ description: >
1058+ DEPRECATED and ignored.
1059+ Total amount of shared memory available, in bytes.
1060 kernel_shmmax:
1061 default: 0
1062 type: int
1063- description: The maximum size, in bytes, of a shared memory segment.
1064+ description: >
1065+ DEPRECATED and ignored.
1066+ The maximum size, in bytes, of a shared memory segment.
1067 shared_buffers:
1068 default: ""
1069 type: string
1070 description: >
1071+ DEPRECATED. Use extra_pg_conf.
1072 The amount of memory the database server uses for shared memory
1073 buffers. This string should be of the format '###MB'.
1074 Ignored unless 'performance_tuning' is set to 'manual'.
1075@@ -270,6 +526,7 @@
1076 default: ""
1077 type: string
1078 description: >
1079+ DEPRECATED. Use extra_pg_conf.
1080 Effective cache size is an estimate of how much memory is available for
1081 disk caching within the database. (50% to 75% of system memory). This
1082 string should be of the format '###MB'. Ignored unless
1083@@ -278,6 +535,7 @@
1084 default: -1
1085 type: int
1086 description: >
1087+ DEPRECATED. Use extra_pg_conf.
1088 Sets the default statistics target for table columns without a
1089 column-specific target set via ALTER TABLE SET STATISTICS.
1090 Leave unchanged to use the server default, which in recent
1091@@ -288,6 +546,7 @@
1092 default: -1
1093 type: int
1094 description: >
1095+ DEPRECATED. Use extra_pg_conf.
1096 Sets the from_collapse_limit and join_collapse_limit query planner
1097 options, controlling the maximum number of tables that can be joined
1098+ before the planner turns off the table collapse query optimization.
1099@@ -295,35 +554,41 @@
1100 default: "1MB"
1101 type: string
1102 description: >
1103+ DEPRECATED. Use extra_pg_conf.
1104 The maximum number of temporary buffers used by each database session.
1105 wal_buffers:
1106 default: "-1"
1107 type: string
1108 description: >
1109+ DEPRECATED. Use extra_pg_conf.
1110 min 32kB, -1 sets based on shared_buffers (change requires restart).
1111 Ignored unless 'performance_tuning' is set to 'manual'.
1112 checkpoint_segments:
1113 default: 10
1114 type: int
1115 description: >
1116+ DEPRECATED. Use extra_pg_conf.
1117 in logfile segments, min 1, 16MB each.
1118 Ignored unless 'performance_tuning' is set to 'manual'.
1119 checkpoint_completion_target:
1120 default: 0.9
1121 type: float
1122 description: >
1123- checkpoint target duration time, as a fraction of checkpoint_timeout.
1124- Range [0.0, 1.0].
1125+ DEPRECATED. Use extra_pg_conf.
1126+ checkpoint target duration time, as a fraction of checkpoint_timeout.
1127+ Range [0.0, 1.0].
1128 checkpoint_timeout:
1129 default: ""
1130 type: string
1131 description: >
1132+ DEPRECATED. Use extra_pg_conf.
1133 Maximum time between automatic WAL checkpoints. range '30s-1h'.
1134 If left empty, the default postgresql value will be used.
1135 fsync:
1136 type: boolean
1137 default: True
1138 description: >
1139+ DEPRECATED. Use extra_pg_conf.
1140 Turns forced synchronization on/off. If fsync is turned off, database
1141 failures are likely to involve database corruption and require
1142 recreating the unit
1143@@ -331,231 +596,23 @@
1144 type: boolean
1145 default: True
1146 description: >
1147+ DEPRECATED. Use extra_pg_conf.
1148 Immediate fsync after commit.
1149 full_page_writes:
1150 type: boolean
1151 default: True
1152 description: >
1153+ DEPRECATED. Use extra_pg_conf.
1154 Recover from partial page writes.
1155 random_page_cost:
1156 default: 4.0
1157 type: float
1158- description: Random page cost
1159- extra_pg_auth:
1160- type: string
1161- default: ""
1162- description: >
1163- A comma separated extra pg_hba.conf auth rules.
1164- This will be written to the pg_hba.conf file, one line per rule.
1165- Note that this should not be needed as db relations already create
1166- those rules the right way. Only use this if you really need too
1167- (e.g. on a development environment), or are connecting juju managed
1168- databases to external managed systems, or configuring replication
1169- between unrelated PostgreSQL services using the manual_replication
1170- option.
1171- manual_replication:
1172- type: boolean
1173- default: False
1174- description: >
1175- Enable or disable charm managed replication. When manual_replication
1176- is True, the operator is responsible for maintaining recovery.conf
1177- and performing any necessary database mirroring. The charm will
1178- still advertise the unit as standalone, master or hot standby to
1179- relations based on whether the system is in recovery mode or not.
1180- Note that this option makes it possible to create a PostgreSQL
1181- service with multiple master units, which is probably a very silly
1182- thing to do.
1183+ description: >
1184+ DEPRECATED. Use extra_pg_conf. Random page cost.
1185 backup_dir:
1186 default: "/var/lib/postgresql/backups"
1187 type: string
1188- description: Directory to place backups in
1189- backup_schedule:
1190- default: "13 4 * * *"
1191- type: string
1192- description: Cron-formatted schedule for database backups.
1193- backup_retention_count:
1194- default: 7
1195- type: int
1196- description: Number of recent backups to retain.
1197- nagios_context:
1198- default: "juju"
1199- type: string
1200- description: >
1201- Used by the nrpe-external-master subordinate charm.
1202- A string that will be prepended to instance name to set the host name
1203- in nagios. So for instance the hostname would be something like:
1204- juju-postgresql-0
1205- If you're running multiple environments with the same services in them
1206- this allows you to differentiate between them.
1207- nagios_additional_servicegroups:
1208- default: ""
1209- type: string
1210- description: >
1211- Used by the nrpe-external-master subordinate charm.
1212- A comma-separated list of servicegroups to include along with
1213- nagios_context when generating nagios service check configs.
1214- This is useful for nagios installations where servicegroups
1215- are used to apply special treatment to particular checks.
1216- pgdg:
1217- description: >
1218- Enable the PostgreSQL Global Development Group APT repository
1219- (https://wiki.postgresql.org/wiki/Apt). This package source provides
1220- official PostgreSQL packages for Ubuntu LTS releases beyond those
1221- provided by the main Ubuntu archive.
1222- type: boolean
1223- default: false
1224- install_sources:
1225- description: >
1226- List of extra package sources, per charm-helpers standard.
1227- YAML format.
1228- type: string
1229- default: null
1230- install_keys:
1231- description: >
1232- List of signing keys for install_sources package sources, per
1233- charmhelpers standard. YAML format.
1234- type: string
1235- default: null
1236- extra_archives:
1237- default: ""
1238- type: string
1239- description: >
1240- DEPRECATED & IGNORED. Use install_sources and install_keys.
1241- advisory_lock_restart_key:
1242- default: 765
1243- type: int
1244- description: >
1245- An advisory lock key used internally by the charm. You do not need
1246- to change it unless it happens to conflict with an advisory lock key
1247- being used by your applications.
1248- # Swift backups and PITR via SwiftWAL
1249- swiftwal_container_prefix:
1250- type: string
1251- default: null
1252- description: >
1253- EXPERIMENTAL.
1254- Swift container prefix for SwiftWAL to use. Must be set if any
1255- SwiftWAL features are enabled. This will become a simple
1256- swiftwal_container config item when proper leader election is
1257- implemented in juju.
1258- swiftwal_backup_schedule:
1259- type: string
1260- default: null
1261- description: >
1262- EXPERIMENTAL.
1263- Cron-formatted schedule for SwiftWAL database backups.
1264- swiftwal_backup_retention:
1265- type: int
1266- default: 2
1267- description: >
1268- EXPERIMENTAL.
1269- Number of recent base backups to retain. You need enough space in
1270- Swift for this many backups plus one more, as an old backup will only
1271- be removed after a new one has been successfully made to replace it.
1272- swiftwal_log_shipping:
1273- type: boolean
1274- default: false
1275- description: >
1276- EXPERIMENTAL.
1277- Archive WAL files into Swift. If swiftwal_backup_schedule is set,
1278- allows point-in-time recovery and WAL files are removed
1279- automatically with old backups. If swiftwal_backup_schedule is not set
1280- then WAL files are never removed. Enabling this option will override
1281- the archive_mode and archive_command settings.
1282- wal_e_storage_uri:
1283- type: string
1284- default: null
1285- description: |
1286- EXPERIMENTAL.
1287- Specify storage to be used by WAL-E. Every PostgreSQL service must use
1288- a unique URI. Backups will be unrecoverable if it is not unique. The
1289- URI's scheme must be one of 'swift' (OpenStack Swift), 's3' (Amazon AWS)
1290- or 'wabs' (Windows Azure). For example:
1291- 'swift://some-container/directory/or/whatever'
1292- 's3://some-bucket/directory/or/whatever'
1293- 'wabs://some-bucket/directory/or/whatever'
1294- Setting the wal_e_storage_uri enables regular WAL-E filesystem level
1295- backups (per wal_e_backup_schedule), and log shipping to the configured
1296- storage. Point-in-time recovery becomes possible, as is disabling the
1297- streaming_replication configuration item and relying solely on
1298- log shipping for replication.
1299- wal_e_backup_schedule:
1300- type: string
1301- default: "13 0 * * *"
1302- description: >
1303- EXPERIMENTAL.
1304- Cron-formatted schedule for WAL-E database backups. If
1305- wal_e_backup_schedule is unset, WAL files will never be removed from
1306- WAL-E storage.
1307- wal_e_backup_retention:
1308- type: int
1309- default: 2
1310- description: >
1311- EXPERIMENTAL.
1312- Number of recent base backups and WAL files to retain.
1313- You need enough space for this many backups plus one more, as
1314- an old backup will only be removed after a new one has been
1315- successfully made to replace it.
1316- streaming_replication:
1317- type: boolean
1318- default: true
1319- description: >
1320- Enable streaming replication. Normally, streaming replication is
1321- always used, and any log shipping configured is used as a fallback.
1322- Turning this off without configuring log shipping is an error.
1323- os_username:
1324- type: string
1325- default: null
1326- description: EXPERIMENTAL. OpenStack Swift username.
1327- os_password:
1328- type: string
1329- default: null
1330- description: EXPERIMENTAL. OpenStack Swift password.
1331- os_auth_url:
1332- type: string
1333- default: null
1334- description: EXPERIMENTAL. OpenStack Swift authentication URL.
1335- os_tenant_name:
1336- type: string
1337- default: null
1338- description: EXPERIMENTAL. OpenStack Swift tenant name.
1339- aws_access_key_id:
1340- type: string
1341- default: null
1342- description: EXPERIMENTAL. Amazon AWS access key id.
1343- aws_secret_access_key:
1344- type: string
1345- default: null
1346- description: EXPERIMENTAL. Amazon AWS secret access key.
1347- wabs_account_name:
1348- type: string
1349- default: null
1350- description: EXPERIMENTAL. Windows Azure account name.
1351- wabs_access_key:
1352- type: string
1353- default: null
1354- description: EXPERIMENTAL. Windows Azure access key.
1355- package_status:
1356- default: "install"
1357- type: string
1358- description: >
1359- The status of service-affecting packages will be set to this
1360- value in the dpkg database. Useful valid values are "install"
1361- and "hold".
1362- # statsd-compatible metrics
1363- metrics_target:
1364- default: ""
1365- type: string
1366- description: >
1367- Destination for statsd-format metrics, format "host:port". If
1368- not present and valid, metrics disabled.
1369- metrics_prefix:
1370- default: "dev.$UNIT.postgresql"
1371- type: string
1372- description: >
1373- Prefix for metrics. Special value $UNIT can be used to include the
1374- name of the unit in the prefix.
1375- metrics_sample_interval:
1376- default: 5
1377- type: int
1378- description: Period for metrics cron job to run in minutes
1379+ description: >
1380+ DEPRECATED. Directory to place backups in. If you change this,
1381+ your backups will go to this path and not the 'backups' Juju
1382+ storage mount.
1383
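
Most of the tuning knobs above are now deprecated in favour of the single extra_pg_conf option, which accepts raw postgresql.conf directives. A minimal sketch of folding the old per-option settings into one extra_pg_conf value (the option names mirror the deprecated keys above; the values and everything else here are illustrative, not part of this branch):

    # Build a postgresql.conf fragment from settings previously managed
    # by individual charm config options.
    old_settings = {
        'work_mem': '16MB',
        'maintenance_work_mem': '64MB',
        'checkpoint_completion_target': 0.9,
    }
    extra_pg_conf = '\n'.join(
        '{} = {}'.format(k, v) for k, v in sorted(old_settings.items()))
    print(extra_pg_conf)
    # With Juju 1.x the result would then be applied with something like
    # "juju set postgresql extra_pg_conf=...".
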
1384=== modified file 'copyright'
1385--- copyright 2011-07-10 09:53:13 +0000
1386+++ copyright 2015-11-02 12:15:35 +0000
1387@@ -1,17 +1,16 @@
1388 Format: http://dep.debian.net/deps/dep5/
1389
1390 Files: *
1391-Copyright: Copyright 2011, Canonical Ltd., All Rights Reserved.
1392+Copyright: Copyright 2011-2015, Canonical Ltd.
1393 License: GPL-3
1394 This program is free software: you can redistribute it and/or modify
1395- it under the terms of the GNU General Public License as published by
1396- the Free Software Foundation, either version 3 of the License, or
1397- (at your option) any later version.
1398+ it under the terms of the GNU General Public License version 3, as
1399+ published by the Free Software Foundation.
1400 .
1401 This program is distributed in the hope that it will be useful,
1402- but WITHOUT ANY WARRANTY; without even the implied warranty of
1403- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
1404- GNU General Public License for more details.
1405+ but WITHOUT ANY WARRANTY; without even the implied warranties of
1406+ MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1407+ PURPOSE. See the GNU General Public License for more details.
1408 .
1409 You should have received a copy of the GNU General Public License
1410 along with this program. If not, see <http://www.gnu.org/licenses/>.
1411
1412=== added file 'hooks/bootstrap.py'
1413--- hooks/bootstrap.py 1970-01-01 00:00:00 +0000
1414+++ hooks/bootstrap.py 2015-11-02 12:15:35 +0000
1415@@ -0,0 +1,57 @@
1416+#!/usr/bin/python3
1417+
1418+# Copyright 2015 Canonical Ltd.
1419+#
1420+# This file is part of the PostgreSQL Charm for Juju.
1421+#
1422+# This program is free software: you can redistribute it and/or modify
1423+# it under the terms of the GNU General Public License version 3, as
1424+# published by the Free Software Foundation.
1425+#
1426+# This program is distributed in the hope that it will be useful, but
1427+# WITHOUT ANY WARRANTY; without even the implied warranties of
1428+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1429+# PURPOSE. See the GNU General Public License for more details.
1430+#
1431+# You should have received a copy of the GNU General Public License
1432+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1433+
1434+from charmhelpers import fetch
1435+from charmhelpers.core import hookenv
1436+
1437+
1438+def bootstrap():
1439+ try:
1440+ import psycopg2 # NOQA: flake8
1441+ import jinja2 # NOQA: flake8
1442+ except ImportError:
1443+ packages = ['python3-psycopg2', 'python3-jinja2']
1444+ fetch.apt_install(packages, fatal=True)
1445+ import psycopg2 # NOQA: flake8
1446+
1447+
1448+def block_on_bad_juju():
1449+ if not hookenv.has_juju_version('1.24'):
1450+ hookenv.status_set('blocked', 'Requires Juju 1.24 or higher')
1451+ # Error state, since we don't have 1.24 to give a nice blocked state.
1452+ raise SystemExit(1)
1453+
1454+
1455+def upgrade_charm():
1456+ block_on_bad_juju()
1457+ # This needs to be imported after bootstrap() or required Python
1458+ # packages may not have been installed.
1459+ import upgrade
1460+ upgrade.upgrade_charm()
1461+
1462+
1463+def default_hook():
1464+ block_on_bad_juju()
1465+ # This needs to be imported after bootstrap() or required Python
1466+ # packages may not have been installed.
1467+ import definitions
1468+
1469+ hookenv.log('*** Start {!r} hook'.format(hookenv.hook_name()))
1470+ sm = definitions.get_service_manager()
1471+ sm.manage()
1472+ hookenv.log('*** End {!r} hook'.format(hookenv.hook_name()))
1473
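
bootstrap() exists because the hooks are Python 3 scripts that can run on a fresh unit before python3-psycopg2 and python3-jinja2 have been installed, so the heavy imports are deferred until after apt_install has run. A minimal sketch of the deferred-import pattern the comments above describe (the function is illustrative, not part of this branch):

    # Never import psycopg2 at module level in code reachable before
    # bootstrap(); defer the import into the function body instead.
    def server_version(dsn):
        import psycopg2  # safe here: bootstrap() has installed it by now
        with psycopg2.connect(dsn) as con:
            cur = con.cursor()
            cur.execute('SHOW server_version')
            return cur.fetchone()[0]
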
1474=== added file 'hooks/client.py'
1475--- hooks/client.py 1970-01-01 00:00:00 +0000
1476+++ hooks/client.py 2015-11-02 12:15:35 +0000
1477@@ -0,0 +1,182 @@
1478+# Copyright 2015 Canonical Ltd.
1479+#
1480+# This file is part of the PostgreSQL Charm for Juju.
1481+#
1482+# This program is free software: you can redistribute it and/or modify
1483+# it under the terms of the GNU General Public License version 3, as
1484+# published by the Free Software Foundation.
1485+#
1486+# This program is distributed in the hope that it will be useful, but
1487+# WITHOUT ANY WARRANTY; without even the implied warranties of
1488+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1489+# PURPOSE. See the GNU General Public License for more details.
1490+#
1491+# You should have received a copy of the GNU General Public License
1492+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1493+
1494+from charmhelpers.core import hookenv, host
1495+
1496+from decorators import relation_handler, master_only
1497+import helpers
1498+import postgresql
1499+
1500+
1501+@relation_handler('db', 'db-admin', 'master')
1502+def publish_db_relations(rel):
1503+ if postgresql.is_master():
1504+ db_relation_master(rel)
1505+ else:
1506+ db_relation_mirror(rel)
1507+ db_relation_common(rel)
1508+
1509+
1510+def _credential_types(rel):
1511+ superuser = (rel.relname in ('db-admin', 'master'))
1512+ replication = (rel.relname == 'master')
1513+ return (superuser, replication)
1514+
1515+
1516+def db_relation_master(rel):
1517+ '''The master generates credentials and negotiates resources.'''
1518+ master = rel.local
1519+ # Pick one remote unit as representative. They should all converge.
1520+ for remote in rel.values():
1521+ break
1522+
1523+ # The requested database name, the existing database name, or use
1524+ # the remote service name as a default. We no longer use the
1525+ # relation id for the database name or usernames, as when a
1526+ # database dump is restored into a new Juju environment we
1527+ # are more likely to have matching service names than relation ids
1528+ # and less likely to have to perform manual permission and ownership
1529+ # cleanups.
1530+ if 'database' in remote:
1531+ master['database'] = remote['database']
1532+ elif 'database' not in master:
1533+ master['database'] = remote.service
1534+
1535+ superuser, replication = _credential_types(rel)
1536+
1537+ if 'user' not in master:
1538+ user = postgresql.username(remote.service, superuser=superuser,
1539+ replication=replication)
1540+ password = host.pwgen()
1541+ master['user'] = user
1542+ master['password'] = password
1543+
1544+ # schema_user has never been documented and is deprecated.
1545+ if not superuser:
1546+ master['schema_user'] = user
1547+ master['schema_password'] = password
1548+
1549+ hookenv.log('** Master providing {} ({}/{})'.format(rel,
1550+ master['database'],
1551+ master['user']))
1552+
1553+ # Reflect these settings back so the client knows when they have
1554+ # taken effect.
1555+ if not replication:
1556+ master['roles'] = remote.get('roles')
1557+ master['extensions'] = remote.get('extensions')
1558+
1559+
1560+def db_relation_mirror(rel):
1561+ '''Non-masters mirror relation information from the master.'''
1562+ master = postgresql.master()
1563+ master_keys = ['database', 'user', 'password', 'roles',
1564+ 'schema_user', 'schema_password', 'extensions']
1565+ master_info = rel.peers.get(master)
1566+ if master_info is None:
1567+ hookenv.log('Waiting for {} to join {}'.format(master, rel))
1568+ return
1569+ hookenv.log('Mirroring {} database credentials from {}'.format(rel,
1570+ master))
1571+ rel.local.update({k: master_info.get(k) for k in master_keys})
1572+
1573+
1574+def db_relation_common(rel):
1575+ '''Publish unit specific relation details.'''
1576+ local = rel.local
1577+ if 'database' not in local:
1578+ return # Not yet ready.
1579+
1580+ # Version number, allowing clients to adjust or block if their
1581+ # expectations are not met.
1582+ local['version'] = postgresql.version()
1583+
1584+ # Calculate the state of this unit. 'standalone' will disappear
1585+ # in a future version of this interface, as this state was
1586+ # only needed to deal with race conditions now solved by
1587+ # Juju leadership.
1588+ if postgresql.is_primary():
1589+ if hookenv.is_leader() and len(helpers.peers()) == 0:
1590+ local['state'] = 'standalone'
1591+ else:
1592+ local['state'] = 'master'
1593+ else:
1594+ local['state'] = 'hot standby'
1595+
1596+ # Host is the private ip address, but this might change and
1597+ # become the address of an attached proxy or alternative peer
1598+ # if this unit is in maintenance.
1599+ local['host'] = hookenv.unit_private_ip()
1600+
1601+ # Port will be 5432, unless the user has overridden it or
1602+ # something very weird happened when the packages where installed.
1603+ local['port'] = str(postgresql.port())
1604+
1605+ # The list of remote units on this relation granted access.
1606+ # This is to avoid the race condition where a new client unit
1607+ # joins an existing client relation and sees valid credentials,
1608+ # before we have had a chance to grant it access.
1609+ local['allowed-units'] = ' '.join(rel.keys())
1610+
1611+
1612+@master_only
1613+@relation_handler('db', 'db-admin', 'master')
1614+def ensure_db_relation_resources(rel):
1615+ '''Create the database resources needed for the relation.'''
1616+
1617+ master = rel.local
1618+
1619+ hookenv.log('Ensuring database {!r} and user {!r} exist for {}'
1620+ ''.format(master['database'], master['user'], rel))
1621+
1622+ # First create the database, if it isn't already.
1623+ postgresql.ensure_database(master['database'])
1624+
1625+ # Next, connect to the database to create the rest in a transaction.
1626+ con = postgresql.connect(database=master['database'])
1627+
1628+ superuser, replication = _credential_types(rel)
1629+ postgresql.ensure_user(con, master['user'], master['password'],
1630+ superuser=superuser, replication=replication)
1631+ if not superuser:
1632+ postgresql.ensure_user(con,
1633+ master['schema_user'],
1634+ master['schema_password'])
1635+
1636+ # Grant specified privileges on the database to the user. This comes
1637+ # from the PostgreSQL service configuration, as allowing the
1638+ # relation to specify how much access it gets is insecure.
1639+ config = hookenv.config()
1640+ privs = set(filter(None,
1641+ config['relation_database_privileges'].split(',')))
1642+ postgresql.grant_database_privileges(con, master['user'],
1643+ master['database'], privs)
1644+ if not superuser:
1645+ postgresql.grant_database_privileges(con, master['schema_user'],
1646+ master['database'], privs)
1647+
1648+ # Reset the roles granted to the user as requested.
1649+ if 'roles' in master:
1650+ roles = filter(None, master.get('roles', '').split(','))
1651+ postgresql.grant_user_roles(con, master['user'], roles)
1652+
1653+ # Create requested extensions. We never drop extensions, as there
1654+ # may be dependent objects.
1655+ if 'extensions' in master:
1656+ extensions = filter(None, master.get('extensions', '').split(','))
1657+ postgresql.ensure_extensions(con, extensions)
1658+
1659+ con.commit() # Don't throw away our changes.
1660
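
For reference, a hypothetical client charm consuming the 'db' interface published above might assemble its connection details as below (client-side code is not part of this branch). Checking allowed-units avoids the race documented in db_relation_common():

    from charmhelpers.core import hookenv

    def connection_string():
        rel = hookenv.relation_get()  # settings of the PostgreSQL unit
        if not all(rel.get(k) for k in
                   ('database', 'user', 'password', 'host', 'port')):
            return None  # credentials not yet published
        if hookenv.local_unit() not in rel.get('allowed-units', '').split():
            return None  # published, but our access not yet granted
        return ('dbname={database} user={user} password={password} '
                'host={host} port={port}'.format(**rel))
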
1661=== removed symlink 'hooks/config-changed'
1662=== target was u'hooks.py'
1663=== added file 'hooks/coordinator.py'
1664--- hooks/coordinator.py 1970-01-01 00:00:00 +0000
1665+++ hooks/coordinator.py 2015-11-02 12:15:35 +0000
1666@@ -0,0 +1,19 @@
1667+# Copyright 2015 Canonical Ltd.
1668+#
1669+# This file is part of the PostgreSQL Charm for Juju.
1670+#
1671+# This program is free software: you can redistribute it and/or modify
1672+# it under the terms of the GNU General Public License version 3, as
1673+# published by the Free Software Foundation.
1674+#
1675+# This program is distributed in the hope that it will be useful, but
1676+# WITHOUT ANY WARRANTY; without even the implied warranties of
1677+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1678+# PURPOSE. See the GNU General Public License for more details.
1679+#
1680+# You should have received a copy of the GNU General Public License
1681+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1682+
1683+from charmhelpers.coordinator import Serial
1684+
1685+coordinator = Serial()
1686
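
Serial is a charmhelpers.coordinator lock manager that grants a named lock to at most one unit at a time, which is how the charm serialises cluster-wide restarts. A sketch of the intended usage, assuming the acquire() semantics of charmhelpers.coordinator (the restart helper is hypothetical):

    from coordinator import coordinator

    def maybe_restart():
        # acquire() returns True once this unit holds the lock; otherwise
        # the request is queued and a later hook run will be granted it.
        if coordinator.acquire('restart'):
            restart_postgresql()  # hypothetical restart helper
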
1687=== added file 'hooks/data-relation-changed'
1688--- hooks/data-relation-changed 1970-01-01 00:00:00 +0000
1689+++ hooks/data-relation-changed 2015-11-02 12:15:35 +0000
1690@@ -0,0 +1,23 @@
1691+#!/usr/bin/python3
1692+
1693+# Copyright 2015 Canonical Ltd.
1694+#
1695+# This file is part of the PostgreSQL Charm for Juju.
1696+#
1697+# This program is free software: you can redistribute it and/or modify
1698+# it under the terms of the GNU General Public License version 3, as
1699+# published by the Free Software Foundation.
1700+#
1701+# This program is distributed in the hope that it will be useful, but
1702+# WITHOUT ANY WARRANTY; without even the implied warranties of
1703+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1704+# PURPOSE. See the GNU General Public License for more details.
1705+#
1706+# You should have received a copy of the GNU General Public License
1707+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1708+
1709+import bootstrap
1710+
1711+if __name__ == '__main__':
1712+ bootstrap.bootstrap()
1713+ bootstrap.default_hook()
1714
1715=== removed symlink 'hooks/data-relation-changed'
1716=== target was u'hooks.py'
1717=== added file 'hooks/data-relation-departed'
1718--- hooks/data-relation-departed 1970-01-01 00:00:00 +0000
1719+++ hooks/data-relation-departed 2015-11-02 12:15:35 +0000
1720@@ -0,0 +1,23 @@
1721+#!/usr/bin/python3
1722+
1723+# Copyright 2015 Canonical Ltd.
1724+#
1725+# This file is part of the PostgreSQL Charm for Juju.
1726+#
1727+# This program is free software: you can redistribute it and/or modify
1728+# it under the terms of the GNU General Public License version 3, as
1729+# published by the Free Software Foundation.
1730+#
1731+# This program is distributed in the hope that it will be useful, but
1732+# WITHOUT ANY WARRANTY; without even the implied warranties of
1733+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1734+# PURPOSE. See the GNU General Public License for more details.
1735+#
1736+# You should have received a copy of the GNU General Public License
1737+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1738+
1739+import bootstrap
1740+
1741+if __name__ == '__main__':
1742+ bootstrap.bootstrap()
1743+ bootstrap.default_hook()
1744
1745=== removed symlink 'hooks/data-relation-departed'
1746=== target was u'hooks.py'
1747=== removed symlink 'hooks/data-relation-joined'
1748=== target was u'hooks.py'
1749=== removed symlink 'hooks/db-admin-relation-broken'
1750=== target was u'hooks.py'
1751=== removed symlink 'hooks/db-admin-relation-changed'
1752=== target was u'hooks.py'
1753=== added file 'hooks/db-admin-relation-departed'
1754--- hooks/db-admin-relation-departed 1970-01-01 00:00:00 +0000
1755+++ hooks/db-admin-relation-departed 2015-11-02 12:15:35 +0000
1756@@ -0,0 +1,23 @@
1757+#!/usr/bin/python3
1758+
1759+# Copyright 2015 Canonical Ltd.
1760+#
1761+# This file is part of the PostgreSQL Charm for Juju.
1762+#
1763+# This program is free software: you can redistribute it and/or modify
1764+# it under the terms of the GNU General Public License version 3, as
1765+# published by the Free Software Foundation.
1766+#
1767+# This program is distributed in the hope that it will be useful, but
1768+# WITHOUT ANY WARRANTY; without even the implied warranties of
1769+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1770+# PURPOSE. See the GNU General Public License for more details.
1771+#
1772+# You should have received a copy of the GNU General Public License
1773+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1774+
1775+import bootstrap
1776+
1777+if __name__ == '__main__':
1778+ bootstrap.bootstrap()
1779+ bootstrap.default_hook()
1780
1781=== removed symlink 'hooks/db-admin-relation-joined'
1782=== target was u'hooks.py'
1783=== removed symlink 'hooks/db-relation-broken'
1784=== target was u'hooks.py'
1785=== removed symlink 'hooks/db-relation-changed'
1786=== target was u'hooks.py'
1787=== added file 'hooks/db-relation-departed'
1788--- hooks/db-relation-departed 1970-01-01 00:00:00 +0000
1789+++ hooks/db-relation-departed 2015-11-02 12:15:35 +0000
1790@@ -0,0 +1,23 @@
1791+#!/usr/bin/python3
1792+
1793+# Copyright 2015 Canonical Ltd.
1794+#
1795+# This file is part of the PostgreSQL Charm for Juju.
1796+#
1797+# This program is free software: you can redistribute it and/or modify
1798+# it under the terms of the GNU General Public License version 3, as
1799+# published by the Free Software Foundation.
1800+#
1801+# This program is distributed in the hope that it will be useful, but
1802+# WITHOUT ANY WARRANTY; without even the implied warranties of
1803+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1804+# PURPOSE. See the GNU General Public License for more details.
1805+#
1806+# You should have received a copy of the GNU General Public License
1807+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1808+
1809+import bootstrap
1810+
1811+if __name__ == '__main__':
1812+ bootstrap.bootstrap()
1813+ bootstrap.default_hook()
1814
1815=== removed symlink 'hooks/db-relation-joined'
1816=== target was u'hooks.py'
1817=== added file 'hooks/decorators.py'
1818--- hooks/decorators.py 1970-01-01 00:00:00 +0000
1819+++ hooks/decorators.py 2015-11-02 12:15:35 +0000
1820@@ -0,0 +1,124 @@
1821+# Copyright 2015 Canonical Ltd.
1822+#
1823+# This file is part of the PostgreSQL Charm for Juju.
1824+#
1825+# This program is free software: you can redistribute it and/or modify
1826+# it under the terms of the GNU General Public License version 3, as
1827+# published by the Free Software Foundation.
1828+#
1829+# This program is distributed in the hope that it will be useful, but
1830+# WITHOUT ANY WARRANTY; without even the implied warranties of
1831+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1832+# PURPOSE. See the GNU General Public License for more details.
1833+#
1834+# You should have received a copy of the GNU General Public License
1835+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1836+from functools import wraps
1837+
1838+from charmhelpers import context
1839+from charmhelpers.core import hookenv
1840+from charmhelpers.core.hookenv import DEBUG
1841+
1842+import helpers
1843+
1844+
1845+def data_ready_action(func):
1846+ '''Decorate func to be used as a data_ready item.'''
1847+ @wraps(func)
1848+ def wrapper(service_name=None):
1849+ if hookenv.remote_unit():
1850+ hookenv.log("** Action {}/{} ({})".format(hookenv.hook_name(),
1851+ func.__name__,
1852+ hookenv.remote_unit()))
1853+ else:
1854+ hookenv.log("** Action {}/{}".format(hookenv.hook_name(),
1855+ func.__name__))
1856+ return func()
1857+ return wrapper
1858+
1859+
1860+class requirement:
1861+ '''Decorate a function so it can be used as a required_data item.
1862+
1863+ Function must return True if requirements are met. Sets the unit state
1864+ to blocked if requirements are not met and the unit is not already blocked.
1865+ '''
1866+ def __init__(self, func):
1867+ self._func = func
1868+
1869+ def __bool__(self):
1870+ name = self._func.__name__
1871+ if self._func():
1872+ hookenv.log('** Requirement {} passed'.format(name))
1873+ return True
1874+ else:
1875+ if hookenv.status_get() != 'blocked':
1876+ helpers.status_set('blocked',
1877+ 'Requirement {} failed'.format(name))
1878+ return False
1879+
1880+
1881+def relation_handler(*relnames):
1882+ '''Invoke the decorated function once per matching relation.
1883+
1884+ The decorated function should accept the Relation() instance
1885+ as its single parameter.
1886+ '''
1887+ assert relnames, 'relation names required'
1888+
1889+ def decorator(func):
1890+ @wraps(func)
1891+ def wrapper(service_name=None):
1892+ rels = context.Relations()
1893+ for relname in relnames:
1894+ for rel in rels[relname].values():
1895+ if rel:
1896+ func(rel)
1897+ return wrapper
1898+ return decorator
1899+
1900+
1901+def leader_only(func):
1902+ '''Only run on the service leader.'''
1903+ @wraps(func)
1904+ def wrapper(*args, **kw):
1905+ if hookenv.is_leader():
1906+ return func(*args, **kw)
1907+ else:
1908+ hookenv.log('Not the leader', DEBUG)
1909+ return wrapper
1910+
1911+
1912+def not_leader(func):
1913+ '''Only run if not the service leader.'''
1914+ @wraps(func)
1915+ def wrapper(*args, **kw):
1916+ if not hookenv.is_leader():
1917+ return func(*args, **kw)
1918+ else:
1919+ hookenv.log("I'm the leader", DEBUG)
1920+ return wrapper
1921+
1922+
1923+def master_only(func):
1924+ '''Only run on the appointed master.'''
1925+ @wraps(func)
1926+ def wrapper(*args, **kw):
1927+ import postgresql
1928+ if postgresql.is_master():
1929+ return func(*args, **kw)
1930+ else:
1931+ hookenv.log('Not the master', DEBUG)
1932+ return wrapper
1933+
1934+
1935+def not_master(func):
1936+ '''Don't run on the appointed master.'''
1937+ @wraps(func)
1938+ def wrapper(*args, **kw):
1939+ import postgresql
1940+ if postgresql.is_master():
1941+ hookenv.log("I'm the master", DEBUG)
1942+ else:
1943+ return func(*args, **kw)
1944+ return wrapper
1945
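
These decorators are designed to stack, so a single data_ready item can be limited to the master and fanned out across every active relation of the named types, as ensure_db_relation_resources in hooks/client.py does. An illustrative composition (the function and the key it sets are examples only):

    from decorators import master_only, relation_handler

    @master_only
    @relation_handler('db', 'db-admin')
    def publish_example(rel):
        # Runs once per established db/db-admin relation, and only on
        # the master unit; 'rel' is a charmhelpers context Relation.
        rel.local['example-key'] = 'example-value'  # illustrative only
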
1946=== added file 'hooks/definitions.py'
1947--- hooks/definitions.py 1970-01-01 00:00:00 +0000
1948+++ hooks/definitions.py 2015-11-02 12:15:35 +0000
1949@@ -0,0 +1,86 @@
1950+# Copyright 2011-2015 Canonical Ltd.
1951+#
1952+# This file is part of the PostgreSQL Charm for Juju.
1953+#
1954+# This program is free software: you can redistribute it and/or modify
1955+# it under the terms of the GNU General Public License version 3, as
1956+# published by the Free Software Foundation.
1957+#
1958+# This program is distributed in the hope that it will be useful, but
1959+# WITHOUT ANY WARRANTY; without even the implied warranties of
1960+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
1961+# PURPOSE. See the GNU General Public License for more details.
1962+#
1963+# You should have received a copy of the GNU General Public License
1964+# along with this program. If not, see <http://www.gnu.org/licenses/>.
1965+
1966+from charmhelpers.core import services
1967+
1968+import client
1969+import nagios
1970+import replication
1971+import service
1972+import storage
1973+import syslogrel
1974+import wal_e
1975+
1976+
1977+SERVICE_DEFINITION = [
1978+ dict(service='postgresql',
1979+ required_data=[service.valid_config],
1980+ data_ready=[service.preinstall,
1981+ service.configure_sources,
1982+ service.install_packages,
1983+ service.ensure_package_status,
1984+ service.update_kernel_settings,
1985+ service.appoint_master,
1986+
1987+ service.wait_for_peers, # Exit if there are no peers.
1988+
1989+ nagios.ensure_nagios_credentials,
1990+ replication.ensure_replication_credentials,
1991+ replication.publish_replication_details,
1992+
1993+ # Exit if required leader settings are not set.
1994+ service.wait_for_leader,
1995+
1996+ service.ensure_cluster,
1997+ service.update_pgpass,
1998+ service.update_pg_hba_conf,
1999+ service.update_pg_ident_conf,
2000+ service.update_postgresql_conf,
2001+ syslogrel.handle_syslog_relations,
2002+ storage.handle_storage_relation,
2003+ wal_e.update_wal_e_env_dir,
2004+ service.request_restart,
2005+
2006+ service.wait_for_restart, # Exit if cannot restart yet.
2007+
2008+ replication.promote_master,
2009+ storage.remount,
2010+ replication.clone_master, # Exit if cannot clone yet.
2011+ replication.update_recovery_conf,
2012+ service.restart_or_reload,
2013+
2014+ replication.ensure_replication_user,
2015+ nagios.ensure_nagios_user,
2016+ service.install_administrative_scripts,
2017+ service.update_postgresql_crontab,
2018+
2019+ client.publish_db_relations,
2020+ client.ensure_db_relation_resources,
2021+
2022+ service.update_pg_hba_conf, # Again, after client setup.
2023+ service.reload_config,
2024+
2025+ service.set_active,
2026+
2027+ # At the end, as people check the end of logs
2028+ # most frequently.
2029+ service.emit_deprecated_option_warnings],
2030+ start=[service.open_ports],
2031+ stop=[service.stop_postgresql, service.close_ports])]
2032+
2033+
2034+def get_service_manager():
2035+ return services.ServiceManager(SERVICE_DEFINITION)
2036
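
Under the charmhelpers services framework, every data_ready callable runs in order on every hook once all required_data items evaluate true, so each step must be idempotent; the "Exit if" steps above short-circuit the remainder of the sequence by raising SystemExit. A minimal sketch of those semantics (the service and steps are illustrative):

    from charmhelpers.core import services

    def first_step(service_name):
        pass  # idempotent: safe to repeat on every hook invocation

    def second_step(service_name):
        raise SystemExit(0)  # aborts the remaining data_ready steps

    manager = services.ServiceManager([
        dict(service='example',
             required_data=[{'ready': True}],  # a falsy item blocks all steps
             data_ready=[first_step, second_step]),
    ])
    manager.manage()
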
2037=== added file 'hooks/helpers.py'
2038--- hooks/helpers.py 1970-01-01 00:00:00 +0000
2039+++ hooks/helpers.py 2015-11-02 12:15:35 +0000
2040@@ -0,0 +1,151 @@
2041+# Copyright 2015 Canonical Ltd.
2042+#
2043+# This file is part of the PostgreSQL Charm for Juju.
2044+#
2045+# This program is free software: you can redistribute it and/or modify
2046+# it under the terms of the GNU General Public License version 3, as
2047+# published by the Free Software Foundation.
2048+#
2049+# This program is distributed in the hope that it will be useful, but
2050+# WITHOUT ANY WARRANTY; without even the implied warranties of
2051+# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
2052+# PURPOSE. See the GNU General Public License for more details.
2053+#
2054+# You should have received a copy of the GNU General Public License
2055+# along with this program. If not, see <http://www.gnu.org/licenses/>.
2056+
2057+from contextlib import contextmanager
2058+import os
2059+import shutil
2060+import stat
2061+import tempfile
2062+
2063+import yaml
2064+
2065+from charmhelpers import context
2066+from charmhelpers.core import hookenv, host
2067+from charmhelpers.core.hookenv import INFO, CRITICAL
2068+
2069+
2070+def status_set(status_or_msg, msg=None):
2071+ '''Set the unit status message, and log the change too.'''
2072+ if msg is None:
2073+ msg = status_or_msg
2074+ status = hookenv.status_get()
2075+ else:
2076+ status = status_or_msg
2077+
2078+ if status == 'blocked':
2079+ lvl = CRITICAL
2080+ else:
2081+ lvl = INFO
2082+ hookenv.log('{}: {}'.format(status, msg), lvl)
2083+ hookenv.status_set(status, msg)
2084+
2085+
2086+def distro_codename():
2087+ """Return the distro release code name, eg. 'precise' or 'trusty'."""
2088+ return host.lsb_release()['DISTRIB_CODENAME']
2089+
2090+
2091+def extra_packages():
2092+ config = hookenv.config()
2093+ packages = set()
2094+
2095+ packages.update(set(config['extra_packages'].split()))
2096+ packages.update(set(config['extra-packages'].split())) # Deprecated.
2097+
2098+ if config['wal_e_storage_uri']:
2099+ packages.add('daemontools')
2100+ packages.add('wal-e')
2101+
2102+ return packages
2103+
2104+
2105+def peers():
2106+ '''Return the set of peers, not including the local unit.'''
2107+ rel = context.Relations().peer
2108+ return frozenset(rel.keys()) if rel else frozenset()
2109+
2110+
2111+def rewrite(path, content):
2112+ '''Rewrite a file atomically, preserving ownership and permissions.'''
2113+ attr = os.lstat(path)
2114+ write(path, content,
2115+ mode=stat.S_IMODE(attr.st_mode),
2116+ user=attr[stat.ST_UID],
2117+ group=attr[stat.ST_GID])
2118+
2119+
2120+def write(path, content, mode=0o640, user='root', group='root'):
2121+ '''Write a file atomically.'''
2122+ open_mode = 'wb' if isinstance(content, bytes) else 'w'
2123+ with tempfile.NamedTemporaryFile(mode=open_mode, delete=False, dir=os.path.dirname(path)) as f:
2124+ try:
2125+ f.write(content)
2126+ f.flush()
2127+ shutil.chown(f.name, user, group)
2128+ os.chmod(f.name, mode)
2129+ os.replace(f.name, path)
2130+ finally:
2131+ if os.path.exists(f.name):
2132+ os.unlink(f.name)
2133+
2134+
2135+def makedirs(path, mode=0o750, user='root', group='root'):
2136+ if os.path.exists(path):
2137+ assert os.path.isdir(path), '{} is not a directory'.format(path)
2138+ else:
2139+ os.makedirs(path, mode=mode)
2140+ shutil.chown(path, user, group)
2141+ os.chmod(path, mode)
2142+
2143+
2144+@contextmanager
2145+def switch_cwd(new_working_directory='/tmp'):
2146+ 'Switch working directory.'
2147+ org_dir = os.getcwd()
2148+ os.chdir(new_working_directory)
2149+ try:
2150+ yield new_working_directory
2151+ finally:
2152+ os.chdir(org_dir)
2153+
2154+
2155+def config_yaml():
2156+ config_yaml_path = os.path.join(hookenv.charm_dir(), 'config.yaml')
2157+ with open(config_yaml_path, 'r') as f:
2158+ return yaml.safe_load(f)
2159+
2160+
2161+def deprecated_config_in_use():
2162+ options = config_yaml()['options']
2163+ config = hookenv.config()
2164+ deprecated = [key for key in options
2165+ if ('DEPRECATED' in options[key]['description'] and
2166+ config[key] != options[key]['default'])]
2167+ return set(deprecated)
2168+
2169+
2170+def cron_dir():
2171+ '''Where we put crontab files.'''
2172+ return '/etc/cron.d'
2173+
2174+
2175+def scripts_dir():
2176+ '''Where the charm puts administrative scripts.'''
2177+ return '/var/lib/postgresql/scripts'
2178+
2179+
2180+def logs_dir():
2181+ '''Where the charm administrative scripts log their output.'''
2182+ return '/var/lib/postgresql/logs'
2183+
2184+
2185+def backups_dir():
2186+ '''Where pg_dump backups are stored.'''
2187+ return hookenv.config()['backup_dir']
2188+
2189+
2190+def backups_log_path():
2191+ return os.path.join(logs_dir(), 'backups.log')
2192
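
write() achieves its atomicity by writing a complete temporary file alongside the target and renaming it into place with os.replace(), so a reader never observes a half-written config. A usage sketch (paths and contents illustrative):

    import helpers

    pg_hba = '/etc/postgresql/9.4/main/pg_hba.conf'
    helpers.write(pg_hba, 'local all postgres peer\n',
                  mode=0o640, user='postgres', group='postgres')

    # rewrite() keeps the file's existing ownership and permissions.
    helpers.rewrite(pg_hba, 'local all postgres peer\nlocal all all md5\n')
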
2193=== removed file 'hooks/helpers.py'
2194--- hooks/helpers.py 2015-02-24 16:57:31 +0000
2195+++ hooks/helpers.py 1970-01-01 00:00:00 +0000
2196@@ -1,197 +0,0 @@
2197-# Copyright 2012 Canonical Ltd. This software is licensed under the
2198-# GNU Affero General Public License version 3 (see the file LICENSE).
2199-
2200-"""Helper functions for writing hooks in python."""
2201-
2202-__metaclass__ = type
2203-__all__ = [
2204- 'get_config',
2205- 'juju_status',
2206- 'log',
2207- 'log_entry',
2208- 'log_exit',
2209- 'make_charm_config_file',
2210- 'relation_get',
2211- 'relation_set',
2212- 'unit_info',
2213- 'wait_for_machine',
2214- 'wait_for_page_contents',
2215- 'wait_for_relation',
2216- 'wait_for_unit']
2217-
2218-from contextlib import contextmanager
2219-import json
2220-import operator
2221-from shelltoolbox import (
2222- command,
2223- run,
2224- script_name)
2225-import os
2226-import tempfile
2227-import time
2228-import urllib2
2229-import yaml
2230-
2231-
2232-log = command('juju-log')
2233-
2234-
2235-def log_entry():
2236- log("--> Entering {}".format(script_name()))
2237-
2238-
2239-def log_exit():
2240- log("<-- Exiting {}".format(script_name()))
2241-
2242-
2243-def get_config():
2244- config_get = command('config-get', '--format=json')
2245- return json.loads(config_get())
2246-
2247-
2248-def relation_get(*args):
2249- cmd = command('relation-get')
2250- return cmd(*args).strip()
2251-
2252-
2253-def relation_set(**kwargs):
2254- cmd = command('relation-set')
2255- args = ['{}={}'.format(k, v) for k, v in kwargs.items()]
2256- return cmd(*args)
2257-
2258-
2259-def make_charm_config_file(charm_config):
2260- charm_config_file = tempfile.NamedTemporaryFile()
2261- charm_config_file.write(yaml.dump(charm_config))
2262- charm_config_file.flush()
2263- # The NamedTemporaryFile instance is returned instead of just the name
2264- # because we want to take advantage of garbage collection-triggered
2265- # deletion of the temp file when it goes out of scope in the caller.
2266- return charm_config_file
2267-
2268-
2269-def juju_status(key):
2270- return yaml.safe_load(run('juju', 'status'))[key]
2271-
2272-
2273-def get_charm_revision(service_name):
2274- service = juju_status('services')[service_name]
2275- return int(service['charm'].split('-')[-1])
2276-
2277-
2278-def unit_info(service_name, item_name, data=None):
2279- services = juju_status('services') if data is None else data['services']
2280- service = services.get(service_name)
2281- if service is None:
2282- # XXX 2012-02-08 gmb:
2283- # This allows us to cope with the race condition that we
2284- # have between deploying a service and having it come up in
2285- # `juju status`. We could probably do with cleaning it up so
2286- # that it fails a bit more noisily after a while.
2287- return ''
2288- units = service['units']
2289- item = units.items()[0][1][item_name]
2290- return item
2291-
2292-
2293-@contextmanager
2294-def maintain_charm_revision(path=None):
2295- if path is None:
2296- path = os.path.join(os.path.dirname(__file__), '..', 'revision')
2297- revision = open(path).read()
2298- try:
2299- yield revision
2300- finally:
2301- with open(path, 'w') as f:
2302- f.write(revision)
2303-
2304-
2305-def upgrade_charm(service_name, timeout=120):
2306- next_revision = get_charm_revision(service_name) + 1
2307- start_time = time.time()
2308- run('juju', 'upgrade-charm', service_name)
2309- while get_charm_revision(service_name) != next_revision:
2310- if time.time() - start_time >= timeout:
2311- raise RuntimeError('timeout waiting for charm to be upgraded')
2312- time.sleep(0.1)
2313- return next_revision
2314-
2315-
2316-def wait_for_machine(num_machines=1, timeout=300):
2317- """Wait `timeout` seconds for `num_machines` machines to come up.
2318-
2319- This wait_for... function can be called by other wait_for functions
2320- whose timeouts might be too short in situations where only a bare
2321- Juju setup has been bootstrapped.
2322- """
2323- # You may think this is a hack, and you'd be right. The easiest way
2324- # to tell what environment we're working in (LXC vs EC2) is to check
2325- # the dns-name of the first machine. If it's localhost we're in LXC
2326- # and we can just return here.
2327- if juju_status('machines')[0]['dns-name'] == 'localhost':
2328- return
2329- start_time = time.time()
2330- while True:
2331- # Drop the first machine, since it's the Zookeeper and that's
2332- # not a machine that we need to wait for. This will only work
2333- # for EC2 environments, which is why we return early above if
2334- # we're in LXC.
2335- machine_data = juju_status('machines')
2336- non_zookeeper_machines = [
2337- machine_data[key] for key in machine_data.keys()[1:]]
2338- if len(non_zookeeper_machines) >= num_machines:
2339- all_machines_running = True
2340- for machine in non_zookeeper_machines:
2341- if machine['instance-state'] != 'running':
2342- all_machines_running = False
2343- break
2344- if all_machines_running:
2345- break
2346- if time.time() - start_time >= timeout:
2347- raise RuntimeError('timeout waiting for service to start')
2348- time.sleep(0.1)
2349-
2350-
2351-def wait_for_unit(service_name, timeout=480):
2352- """Wait `timeout` seconds for a given service name to come up."""
2353- wait_for_machine(num_machines=1)
2354- start_time = time.time()
2355- while True:
2356- state = unit_info(service_name, 'state')
2357- if 'error' in state or state == 'started':
2358- break
2359- if time.time() - start_time >= timeout:
2360- raise RuntimeError('timeout waiting for service to start')
2361- time.sleep(0.1)
2362- if state != 'started':
2363- raise RuntimeError('unit did not start, state: ' + state)
2364-
2365-
2366-def wait_for_relation(service_name, relation_name, timeout=120):
2367- """Wait `timeout` seconds for a given relation to come up."""
2368- start_time = time.time()
2369- while True:
2370- relation = unit_info(service_name, 'relations').get(relation_name)
2371- if relation is not None and relation['state'] == 'up':
2372- break
2373- if time.time() - start_time >= timeout:
2374- raise RuntimeError('timeout waiting for relation to be up')
2375- time.sleep(0.1)
2376-
2377-
2378-def wait_for_page_contents(url, contents, timeout=120, validate=None):
2379- if validate is None:
2380- validate = operator.contains
2381- start_time = time.time()
2382- while True:
2383- try:
2384- stream = urllib2.urlopen(url)
2385- except (urllib2.HTTPError, urllib2.URLError):
2386- pass
2387- else:
2388- page = stream.read()
2389- if validate(page, contents):
2390- return page
2391- if time.time() - start_time >= timeout:
2392- raise RuntimeError('timeout waiting for contents of ' + url)
2393- time.sleep(0.1)
2394
2395=== removed file 'hooks/hooks.py'
2396--- hooks/hooks.py 2015-08-11 11:15:27 +0000
2397+++ hooks/hooks.py 1970-01-01 00:00:00 +0000
2398@@ -1,2820 +0,0 @@
2399-#!/usr/bin/env python
2400-# vim: et ai ts=4 sw=4:
2401-
2402-from contextlib import contextmanager
2403-import commands
2404-import cPickle as pickle
2405-from distutils.version import StrictVersion
2406-import glob
2407-from grp import getgrnam
2408-import os
2409-from pwd import getpwnam
2410-import re
2411-import shutil
2412-import socket
2413-import subprocess
2414-import sys
2415-from tempfile import NamedTemporaryFile
2416-from textwrap import dedent
2417-import time
2418-import urlparse
2419-
2420-from charmhelpers import fetch
2421-from charmhelpers.core import hookenv, host
2422-from charmhelpers.core.hookenv import (
2423- CRITICAL, ERROR, WARNING, INFO, DEBUG)
2424-
2425-try:
2426- import psycopg2
2427- from jinja2 import Template
2428-except ImportError:
2429- fetch.apt_update(fatal=True)
2430- fetch.apt_install(['python-psycopg2', 'python-jinja2'], fatal=True)
2431- import psycopg2
2432- from jinja2 import Template
2433-
2434-from psycopg2.extensions import AsIs
2435-from jinja2 import Environment, FileSystemLoader
2436-
2437-
2438-hooks = hookenv.Hooks()
2439-
2440-
2441-def log(msg, lvl=INFO):
2442- '''Log a message.
2443-
2444- Per Bug #1208787, log messages sent via juju-log are being lost.
2445- Spit messages out to a log file to work around the problem.
2446- It is also rather nice to have the log messages we explicitly emit
2447- in a separate log file, rather than just mashed up with all the
2448- juju noise.
2449- '''
2450- myname = hookenv.local_unit().replace('/', '-')
2451- ts = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime())
2452- with open('{}/{}-debug.log'.format(juju_log_dir, myname), 'a') as f:
2453- f.write('{} {}: {}\n'.format(ts, lvl, msg))
2454- hookenv.log(msg, lvl)
2455-
2456-
2457-def pg_version():
2458- '''Return pg_version to use.
2459-
2460- Return "version" config item if set, else use version from "postgresql"
2461- package candidate, saving it in local_state for later.
2462- '''
2463- config_data = hookenv.config()
2464- if 'pg_version' in local_state:
2465- version = local_state['pg_version']
2466- elif 'version' in config_data:
2467- version = config_data['version']
2468- else:
2469- log("map version from distro release ...")
2470- version_map = {'precise': '9.1',
2471- 'trusty': '9.3'}
2472- version = version_map.get(distro_codename())
2473- if not version:
2474- log("No PG version map for distro_codename={}, "
2475- "you'll need to explicitly set it".format(distro_codename()),
2476- CRITICAL)
2477- sys.exit(1)
2478- log("version={} from distro_codename='{}'".format(
2479- version, distro_codename()))
2480- # save it for later
2481- local_state.setdefault('pg_version', version)
2482- local_state.save()
2483-
2484- assert version, "pg_version couldn't find a version to use"
2485- return version
2486-
2487-
2488-def distro_codename():
2489- """Return the distro release code name, eg. 'precise' or 'trusty'."""
2490- return host.lsb_release()['DISTRIB_CODENAME']
2491-
2492-
2493-def render_template(template_name, vars):
2494- # deferred import so install hook can install jinja2
2495- templates_dir = os.path.join(os.environ['CHARM_DIR'], 'templates')
2496- template_env = Environment(loader=FileSystemLoader(templates_dir))
2497- template = template_env.get_template(template_name)
2498- return template.render(vars)
2499-
2500-
2501-class State(dict):
2502- """Encapsulate state common to the unit for republishing to relations."""
2503- def __init__(self, state_file):
2504- super(State, self).__init__()
2505- self._state_file = state_file
2506- self.load()
2507-
2508- def load(self):
2509- '''Load stored state from local disk.'''
2510- if os.path.exists(self._state_file):
2511- state = pickle.load(open(self._state_file, 'rb'))
2512- else:
2513- state = {}
2514- self.clear()
2515-
2516- self.update(state)
2517-
2518- def save(self):
2519- '''Store state to local disk.'''
2520- state = {}
2521- state.update(self)
2522- old_mask = os.umask(0o077) # This file contains database passwords!
2523- try:
2524- pickle.dump(state, open(self._state_file, 'wb'))
2525- finally:
2526- os.umask(old_mask)
2527-
2528- def publish(self):
2529- """Publish relevant unit state to relations"""
2530-
2531- def add(state_dict, key):
2532- if key in self:
2533- state_dict[key] = self[key]
2534-
2535- client_state = {}
2536- add(client_state, 'state')
2537-
2538- for relid in hookenv.relation_ids('db'):
2539- hookenv.relation_set(relid, client_state)
2540-
2541- for relid in hookenv.relation_ids('db-admin'):
2542- hookenv.relation_set(relid, client_state)
2543-
2544- replication_state = dict(client_state)
2545-
2546- add(replication_state, 'replication_password')
2547- add(replication_state, 'port')
2548- add(replication_state, 'wal_received_offset')
2549- add(replication_state, 'following')
2550- add(replication_state, 'client_relations')
2551-
2552- authorized = self.get('authorized', None)
2553- if authorized:
2554- replication_state['authorized'] = ' '.join(sorted(authorized))
2555-
2556- for relid in hookenv.relation_ids('replication'):
2557- hookenv.relation_set(relid, replication_state)
2558-
2559- for relid in hookenv.relation_ids('master'):
2560- hookenv.relation_set(relid, state=self.get('state'))
2561-
2562- log('saving local state', DEBUG)
2563- self.save()
2564-
2565-
2566-def volume_get_all_mounted():
2567- command = ("mount |egrep %s" % external_volume_mount)
2568- status, output = commands.getstatusoutput(command)
2569- if status != 0:
2570- return None
2571- return output
2572-
2573-
2574-def postgresql_autostart(enabled):
2575- postgresql_config_dir = _get_postgresql_config_dir()
2576- startup_file = os.path.join(postgresql_config_dir, 'start.conf')
2577- if enabled:
2578- log("Enabling PostgreSQL startup in {}".format(startup_file))
2579- mode = 'auto'
2580- else:
2581- log("Disabling PostgreSQL startup in {}".format(startup_file))
2582- mode = 'manual'
2583- template_file = "{}/templates/start_conf.tmpl".format(hookenv.charm_dir())
2584- contents = Template(open(template_file).read()).render({'mode': mode})
2585- host.write_file(
2586- startup_file, contents, 'postgres', 'postgres', perms=0o644)
2587-
2588-
2589-def run(command, exit_on_error=True, quiet=False):
2590- '''Run a command and return the output.'''
2591- if not quiet:
2592- log("Running {!r}".format(command), DEBUG)
2593- p = subprocess.Popen(
2594- command, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
2595- shell=isinstance(command, basestring))
2596- p.stdin.close()
2597- lines = []
2598- for line in p.stdout:
2599- if line:
2600- # LP:1274460 & LP:1259490 mean juju-log is no where near as
2601- # useful as we would like, so just shove a copy of the
2602- # output to stdout for logging.
2603- # log("> {}".format(line), DEBUG)
2604- if not quiet:
2605- print line
2606- lines.append(line)
2607- elif p.poll() is not None:
2608- break
2609-
2610- p.wait()
2611-
2612- if p.returncode == 0:
2613- return '\n'.join(lines)
2614-
2615- if p.returncode != 0 and exit_on_error:
2616- log("ERROR: {}".format(p.returncode), ERROR)
2617- sys.exit(p.returncode)
2618-
2619- raise subprocess.CalledProcessError(p.returncode, command,
2620- '\n'.join(lines))
2621-
2622-
2623-def postgresql_is_running():
2624- '''Return true if PostgreSQL is running.'''
2625- for version, name, _, status in lsclusters(slice(4)):
2626- if (version, name) == (pg_version(), hookenv.config('cluster_name')):
2627- if 'online' in status.split(','):
2628- log('PostgreSQL is running', DEBUG)
2629- return True
2630- else:
2631- log('PostgreSQL is not running', DEBUG)
2632- return False
2633- assert False, 'Cluster {} {} not found'.format(
2634- pg_version(), hookenv.config('cluster_name'))
2635-
2636-
2637-def postgresql_stop():
2638- '''Shutdown PostgreSQL.'''
2639- if postgresql_is_running():
2640- run([
2641- 'pg_ctlcluster', '--force',
2642- pg_version(), hookenv.config('cluster_name'), 'stop'])
2643- log('PostgreSQL shut down')
2644-
2645-
2646-def postgresql_start():
2647- '''Start PostgreSQL if it is not already running.'''
2648- if not postgresql_is_running():
2649- run([
2650- 'pg_ctlcluster', pg_version(),
2651- hookenv.config('cluster_name'), 'start'])
2652- log('PostgreSQL started')
2653-
2654-
2655-def postgresql_restart():
2656- '''Restart PostgreSQL, or start it if it is not already running.'''
2657- if postgresql_is_running():
2658- with restart_lock(hookenv.local_unit(), True):
2659- run([
2660- 'pg_ctlcluster', '--force',
2661- pg_version(), hookenv.config('cluster_name'), 'restart'])
2662- log('PostgreSQL restarted')
2663- else:
2664- postgresql_start()
2665-
2666- assert postgresql_is_running()
2667-
2668- # Store a copy of our known live configuration so
2669- # postgresql_reload_or_restart() can make good choices.
2670- if 'saved_config' in local_state:
2671- local_state['live_config'] = local_state['saved_config']
2672- local_state.save()
2673-
2674-
2675-def postgresql_reload():
2676- '''Make PostgreSQL reload its configuration.'''
2677- # reload returns a reliable exit status
2678- if postgresql_is_running():
2679- # I'm using the PostgreSQL function to avoid as much indirection
2680- # as possible.
2681- success = run_select_as_postgres('SELECT pg_reload_conf()')[1][0][0]
2682- assert success, 'Failed to reload PostgreSQL configuration'
2683- log('PostgreSQL configuration reloaded')
2684- return postgresql_start()
2685-
2686-
2687-def requires_restart():
2688- '''Check for configuration changes requiring a restart to take effect.'''
2689- if not postgresql_is_running():
2690- return True
2691-
2692- saved_config = local_state.get('saved_config', None)
2693- if not saved_config:
2694- log("No record of postgresql.conf state. Better restart.")
2695- return True
2696-
2697- live_config = local_state.setdefault('live_config', {})
2698-
2699- # Pull in a list of PostgreSQL settings.
2700- cur = db_cursor()
2701- cur.execute("SELECT name, context FROM pg_settings")
2702- restart = False
2703- for name, context in cur.fetchall():
2704- live_value = live_config.get(name, None)
2705- new_value = saved_config.get(name, None)
2706-
2707- if new_value != live_value:
2708- if live_config:
2709- log("Changed {} from {!r} to {!r}".format(
2710- name, live_value, new_value), DEBUG)
2711- if context == 'postmaster':
2712- # A setting has changed that requires PostgreSQL to be
2713- # restarted before it will take effect.
2714- restart = True
2715- log('{} changed from {} to {}. Restart required.'.format(
2716- name, live_value, new_value), DEBUG)
2717- return restart
2718-
2719-
2720-def postgresql_reload_or_restart():
2721- """Reload PostgreSQL configuration, restarting if necessary."""
2722- if requires_restart():
2723- log("Configuration change requires PostgreSQL restart", WARNING)
2724- postgresql_restart()
2725- assert not requires_restart(), "Configuration changes failed to apply"
2726- else:
2727- postgresql_reload()
2728-
2729- local_state['saved_config'] = local_state['live_config']
2730- local_state.save()
2731-
2732-
2733-def get_service_port():
2734- '''Return the port PostgreSQL is listening on.'''
2735- for version, name, port in lsclusters(slice(3)):
2736- if (version, name) == (pg_version(), hookenv.config('cluster_name')):
2737- return int(port)
2738-
2739- assert False, 'No port found for {!r} {!r}'.format(
2740- pg_version(), hookenv.config('cluster_name'))
2741-
2742-
2743-def lsclusters(s=slice(0, -1)):
2744- for line in run('pg_lsclusters', quiet=True).splitlines()[1:]:
2745- if line:
2746- yield line.split()[s]
2747-
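# For illustration, `pg_lsclusters` output looks roughly like this
# (the header row is skipped above; column spacing is an assumption):
#
#   Ver Cluster Port Status Owner    Data directory
#   9.1 main    5432 online postgres /var/lib/postgresql/9.1/main
#
# so lsclusters(slice(3)) would yield ['9.1', 'main', '5432'].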
2748-
2749-def createcluster():
2750- with switch_cwd('/tmp'): # Ensure cwd is readable as the postgres user
2751- create_cmd = [
2752- "pg_createcluster",
2753- "--locale", hookenv.config('locale'),
2754- "-e", hookenv.config('encoding')]
2755- if hookenv.config('listen_port'):
2756- create_cmd.extend(["-p", str(hookenv.config('listen_port'))])
2757- version = pg_version()
2758- create_cmd.append(version)
2759- create_cmd.append(hookenv.config('cluster_name'))
2760-
2761- # With 9.3+, we make an opinionated decision to always enable
2762- # data checksums. This seems to be best practice. We could
2763- # turn this into a configuration item if there is need. There
2764- # is no way to enable this option on existing clusters.
2765- if StrictVersion(version) >= StrictVersion('9.3'):
2766- create_cmd.extend(['--', '--data-checksums'])
2767-
2768- run(create_cmd)
2769- # Ensure SSL certificates exist, as we enable SSL by default.
2770- create_ssl_cert(os.path.join(
2771- postgresql_data_dir, pg_version(), hookenv.config('cluster_name')))
2772-
2773-
2774-def _get_system_ram():
2775- """ Return the system ram in Megabytes """
2776- import psutil
2777- return psutil.phymem_usage()[0] / (1024 ** 2)
2778-
2779-
2780-def _get_page_size():
2781- """ Return the operating system's configured PAGE_SIZE """
2782- return int(run("getconf PAGE_SIZE")) # frequently 4096
2783-
2784-
2785-def _run_sysctl(postgresql_sysctl):
2786- """sysctl -p postgresql_sysctl, helper for easy test mocking."""
2787- # Do not error out when this fails. It is not likely to work under LXC.
2788- return run("sysctl -p {}".format(postgresql_sysctl), exit_on_error=False)
2789-
2790-
2791-def create_postgresql_config(config_file):
2792- '''Create the postgresql.conf file'''
2793- config_data = hookenv.config()
2794- if not config_data.get('listen_port', None):
2795- config_data['listen_port'] = get_service_port()
2796- if config_data["performance_tuning"].lower() != "manual":
2797- total_ram = _get_system_ram()
2798- config_data["kernel_shmmax"] = (int(total_ram) * 1024 * 1024) + 1024
2799- config_data["kernel_shmall"] = config_data["kernel_shmmax"]
2800-
2801- # XXX: This is very messy - should probably be a subordinate charm
2802- lines = ["kernel.sem = 250 32000 100 1024\n"]
2803- if config_data["kernel_shmall"] > 0:
2804- # Convert config kernel_shmall (bytes) to pages
2805- page_size = _get_page_size()
2806- num_pages = config_data["kernel_shmall"] / page_size
2807- if (config_data["kernel_shmall"] % page_size) > 0:
2808- num_pages += 1
2809- lines.append("kernel.shmall = %s\n" % num_pages)
2810- if config_data["kernel_shmmax"] > 0:
2811- lines.append("kernel.shmmax = %s\n" % config_data["kernel_shmmax"])
2812- host.write_file(postgresql_sysctl, ''.join(lines), perms=0o600)
2813- _run_sysctl(postgresql_sysctl)
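        # A worked example, assuming 2048MB of RAM and 4096 byte pages:
        #   kernel_shmmax = 2048 * 1024 * 1024 + 1024 = 2147484672 bytes
        #   kernel.shmall = 2147484672 / 4096, rounded up = 524289 pages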
2814-
2815- # Our config file specifies a default wal_level that only works
2816- # with PostgreSQL 9.4. Downgrade this for earlier versions of
2817- # PostgreSQL. We have this default so more things Just Work.
2818- if pg_version() < '9.4' and config_data['wal_level'] == 'logical':
2819- config_data['wal_level'] = 'hot_standby'
2820-
2821- # If we are replicating, some settings may need to be overridden to
2822- # certain minimum levels.
2823- num_slaves = slave_count()
2824- if num_slaves > 0:
2825- log('{} hot standbys in peer relation.'.format(num_slaves))
2826- log('Ensuring minimal replication settings')
2827- config_data['hot_standby'] = True
2828- if config_data['wal_level'] != 'logical':
2829- config_data['wal_level'] = 'hot_standby'
2830- config_data['wal_keep_segments'] = max(
2831- config_data['wal_keep_segments'],
2832- config_data['replicated_wal_keep_segments'])
2833- # We need this set even if config_data['streaming_replication']
2834- # is False, because the replication connection is still needed
2835- # by pg_basebackup to build a hot standby.
2836- config_data['max_wal_senders'] = max(
2837- num_slaves, config_data['max_wal_senders'])
2838-
2839- # Log shipping to Swift using SwiftWAL. This could be for
2840- # non-streaming replication, or for PITR.
2841- if config_data.get('swiftwal_log_shipping', None):
2842- config_data['archive_mode'] = True
2843- if config_data['wal_level'] != 'logical':
2844- config_data['wal_level'] = 'hot_standby'
2845- config_data['archive_command'] = swiftwal_archive_command()
2846-
2847- if config_data.get('wal_e_storage_uri', None):
2848- config_data['archive_mode'] = True
2849- if config_data['wal_level'] != 'logical':
2850- config_data['wal_level'] = 'hot_standby'
2851- config_data['archive_command'] = wal_e_archive_command()
2852-
2853- # Send config data to the template
2854- # Return it as pg_config
2855- charm_dir = hookenv.charm_dir()
2856- template_file = "{}/templates/postgresql.conf.tmpl".format(charm_dir)
2857- if not config_data.get('version', None):
2858- config_data['version'] = pg_version()
2859- pg_config = Template(
2860- open(template_file).read()).render(config_data)
2861- host.write_file(
2862- config_file, pg_config,
2863- owner="postgres", group="postgres", perms=0o600)
2864-
2865- # Create or update files included from postgresql.conf.
2866- configure_log_destination(os.path.dirname(config_file))
2867-
2868- tune_postgresql_config(config_file)
2869-
2870- local_state['saved_config'] = dict(config_data)
2871- local_state.save()
2872-
2873-
2874-def tune_postgresql_config(config_file):
2875- tune_workload = hookenv.config('performance_tuning').lower()
2876- if tune_workload == "manual":
2877- return # Requested no autotuning.
2878-
2879- if tune_workload == "auto":
2880- tune_workload = "mixed" # Pre-pgtune backwards compatibility.
2881-
2882- with NamedTemporaryFile() as tmp_config:
2883- run(['pgtune', '-i', config_file, '-o', tmp_config.name,
2884- '-T', tune_workload,
2885- '-c', str(hookenv.config('max_connections'))])
2886- host.write_file(
2887- config_file, open(tmp_config.name, 'r').read(),
2888- owner='postgres', group='postgres', perms=0o600)
2889-
2890-
2891-def create_postgresql_ident(output_file):
2892- '''Create the pg_ident.conf file.'''
2893- ident_data = {}
2894- charm_dir = hookenv.charm_dir()
2895- template_file = "{}/templates/pg_ident.conf.tmpl".format(charm_dir)
2896- pg_ident_template = Template(open(template_file).read())
2897- host.write_file(
2898- output_file, pg_ident_template.render(ident_data),
2899- owner="postgres", group="postgres", perms=0o600)
2900-
2901-
2902-def generate_postgresql_hba(
2903- output_file, user=None, schema_user=None, database=None):
2904- '''Create the pg_hba.conf file.'''
2905-
2906- # Per Bug #1117542, when generating the postgresql_hba file we
2907- # need to cope with private-address being either an IP address
2908- # or a hostname.
2909- def munge_address(addr):
2910- # http://stackoverflow.com/q/319279/196832
2911- try:
2912- socket.inet_aton(addr)
2913- return "%s/32" % addr
2914- except socket.error:
2915- # It's not an IP address.
2916- # XXX workaround for MAAS bug
2917- # https://bugs.launchpad.net/maas/+bug/1250435
2918- # If it's a CNAME, use the A record it points to.
2919- # If it fails for some reason, return the original address
2920- try:
2921- output = run("dig +short -t CNAME %s" % addr, True).strip()
2922- except:
2923- return addr
2924- if len(output) != 0:
2925- return output.rstrip(".") # trailing dot
2926- return addr
2927-
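    # Illustrative behaviour of munge_address (the CNAME chase shells
    # out to dig; the results below are assumptions about typical DNS):
    #   munge_address('10.0.3.5')  -> '10.0.3.5/32'
    #   munge_address('db1.maas')  -> the CNAME target (sans trailing
    #                                 dot) if 'db1.maas' is a CNAME,
    #                                 otherwise 'db1.maas' unchanged
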
2928- config_data = hookenv.config()
2929- allowed_units = set()
2930- relation_data = []
2931- relids = hookenv.relation_ids('db') + hookenv.relation_ids('db-admin')
2932- for relid in relids:
2933- local_relation = hookenv.relation_get(
2934- unit=hookenv.local_unit(), rid=relid)
2935-
2936- # We might see relations that have not yet been setup enough.
2937- # At a minimum, the relation-joined hook needs to have been run
2938- # on the server so we have information about the usernames and
2939- # databases to allow in.
2940- if 'user' not in local_relation:
2941- continue
2942-
2943- for unit in hookenv.related_units(relid):
2944- relation = hookenv.relation_get(unit=unit, rid=relid)
2945-
2946- relation['relation-id'] = relid
2947- relation['unit'] = unit
2948-
2949- if relid.startswith('db-admin:'):
2950- relation['user'] = 'all'
2951- relation['database'] = 'all'
2952- elif relid.startswith('db:'):
2953- relation['user'] = local_relation.get('user', user)
2954- relation['schema_user'] = local_relation.get('schema_user',
2955- schema_user)
2956- relation['database'] = local_relation.get('database', database)
2957-
2958- if ((relation['user'] is None
2959- or relation['schema_user'] is None
2960- or relation['database'] is None)):
2961- # Missing info in relation for this unit, so skip it.
2962- continue
2963- else:
2964- raise RuntimeError(
2965- 'Unknown relation type {}'.format(repr(relid)))
2966-
2967- allowed_units.add(unit)
2968- relation['private-address'] = munge_address(
2969- relation['private-address'])
2970- relation_data.append(relation)
2971-
2972- log(str(relation_data), INFO)
2973-
2974- # Replication connections. Each unit needs to be able to connect to
2975- # every other unit's postgres database and the magic replication
2976- # database. It also needs to be able to connect to its own postgres
2977- # database.
2978- relids = hookenv.relation_ids('replication')
2979- for relid in relids:
2980- for unit in hookenv.related_units(relid):
2981- relation = hookenv.relation_get(unit=unit, rid=relid)
2982- remote_addr = munge_address(relation['private-address'])
2983- remote_replication = {'database': 'replication',
2984- 'user': 'juju_replication',
2985- 'private-address': remote_addr,
2986- 'relation-id': relid,
2987- 'unit': unit,
2988- }
2989- relation_data.append(remote_replication)
2990- remote_pgdb = {'database': 'postgres',
2991- 'user': 'juju_replication',
2992- 'private-address': remote_addr,
2993- 'relation-id': relid,
2994- 'unit': unit,
2995- }
2996- relation_data.append(remote_pgdb)
2997-
2998- # More replication connections, this time from external services.
2999- # Somewhat different than before, as we do not share credentials
3000- # and services using 9.4's logical replication feature will want
3001- # to specify the database name.
3002- relids = hookenv.relation_ids('master')
3003- for relid in relids:
3004- for unit in hookenv.related_units(relid):
3005- remote_rel = hookenv.relation_get(unit=unit, rid=relid)
3006- local_rel = hookenv.relation_get(unit=hookenv.local_unit(),
3007- rid=relid)
3008- remote_addr = munge_address(remote_rel['private-address'])
3009- remote_replication = {'database': 'replication',
3010- 'user': local_rel['user'],
3011- 'private-address': remote_addr,
3012- 'relation-id': relid,
3013- 'unit': unit,
3014- }
3015- relation_data.append(remote_replication)
3016- if 'database' in local_rel:
3017- remote_pgdb = {'database': local_rel['database'],
3018- 'user': local_rel['user'],
3019- 'private-address': remote_addr,
3020- 'relation-id': relid,
3021- 'unit': unit,
3022- }
3023- relation_data.append(remote_pgdb)
3024-
3025- # Local hooks also need permissions to setup replication.
3026- for relid in hookenv.relation_ids('replication'):
3027- local_replication = {'database': 'postgres',
3028- 'user': 'juju_replication',
3029- 'private-address': munge_address(
3030- hookenv.unit_private_ip()),
3031- 'relation-id': relid,
3032- 'unit': hookenv.local_unit(),
3033- }
3034- relation_data.append(local_replication)
3035-
3036- # Admin IP addresses, for people using tools like pgAdmin III from
3037- # outside the Juju environment. We accept a single IP or a comma
3038- # separated list of IPs; these are added to the relations that end
3039- # up in pg_hba.conf, granting those addresses access to the server.
3040- if config_data["admin_addresses"] != '':
3041- if "," in config_data["admin_addresses"]:
3042- admin_ip_list = config_data["admin_addresses"].split(",")
3043- else:
3044- admin_ip_list = [config_data["admin_addresses"]]
3045-
3046- for admin_ip in admin_ip_list:
3047- admin_host = {
3048- 'database': 'all',
3049- 'user': 'all',
3050- 'private-address': munge_address(admin_ip)}
3051- relation_data.append(admin_host)
3052-
3053- extra_pg_auth = [pg_auth.strip() for pg_auth in
3054- config_data["extra_pg_auth"].split(',') if pg_auth]
3055-
3056- template_file = "{}/templates/pg_hba.conf.tmpl".format(hookenv.charm_dir())
3057- pg_hba_template = Template(open(template_file).read())
3058- pg_hba_rendered = pg_hba_template.render(extra_pg_auth=extra_pg_auth,
3059- access_list=relation_data)
3060- host.write_file(
3061- output_file, pg_hba_rendered,
3062- owner="postgres", group="postgres", perms=0o600)
3063- postgresql_reload()
3064-
3065- # Loop through all db relations, making sure each knows the list of
3066- # allowed hosts that was just added. lp:#1187508
3067- # We sort the list to ensure stability, probably unnecessarily.
3068- for relid in hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'):
3069- hookenv.relation_set(
3070- relid, {"allowed-units": " ".join(unit_sorted(allowed_units))})
3071-
3072-
3073-def install_postgresql_crontab(output_file):
3074- '''Create the postgres user's crontab'''
3075- config_data = hookenv.config()
3076- config_data['scripts_dir'] = postgresql_scripts_dir
3077- config_data['swiftwal_backup_command'] = swiftwal_backup_command()
3078- config_data['swiftwal_prune_command'] = swiftwal_prune_command()
3079- config_data['wal_e_backup_command'] = wal_e_backup_command()
3080- config_data['wal_e_prune_command'] = wal_e_prune_command()
3081-
3082- charm_dir = hookenv.charm_dir()
3083- template_file = "{}/templates/postgres.cron.tmpl".format(charm_dir)
3084- crontab_template = Template(
3085- open(template_file).read()).render(config_data)
3086- host.write_file(output_file, crontab_template, perms=0o600)
3087-
3088-
3089-def create_recovery_conf(master_host, master_port, restart_on_change=False):
3090- if hookenv.config('manual_replication'):
3091- log('manual_replication; should not be here', CRITICAL)
3092- raise RuntimeError('manual_replication; should not be here')
3093-
3094- version = pg_version()
3095- cluster_name = hookenv.config('cluster_name')
3096- postgresql_cluster_dir = os.path.join(
3097- postgresql_data_dir, version, cluster_name)
3098-
3099- recovery_conf_path = os.path.join(postgresql_cluster_dir, 'recovery.conf')
3100- if os.path.exists(recovery_conf_path):
3101- old_recovery_conf = open(recovery_conf_path, 'r').read()
3102- else:
3103- old_recovery_conf = None
3104-
3105- charm_dir = hookenv.charm_dir()
3106- streaming_replication = hookenv.config('streaming_replication')
3107- template_file = "{}/templates/recovery.conf.tmpl".format(charm_dir)
3108- params = dict(
3109- host=master_host, port=master_port,
3110- password=local_state['replication_password'],
3111- streaming_replication=streaming_replication)
3112- if hookenv.config('wal_e_storage_uri'):
3113- params['restore_command'] = wal_e_restore_command()
3114- elif hookenv.config('swiftwal_log_shipping'):
3115- params['restore_command'] = swiftwal_restore_command()
3116- recovery_conf = Template(open(template_file).read()).render(params)
3117- log(recovery_conf, DEBUG)
3118- host.write_file(
3119- os.path.join(postgresql_cluster_dir, 'recovery.conf'),
3120- recovery_conf, owner="postgres", group="postgres", perms=0o600)
3121-
3122- if restart_on_change and old_recovery_conf != recovery_conf:
3123- log("recovery.conf updated. Restarting to take effect.")
3124- postgresql_restart()
3125-
3126-
3127-def ensure_swift_container(container):
3128- from swiftclient import client as swiftclient
3129- config = hookenv.config()
3130- con = swiftclient.Connection(
3131- authurl=config.get('os_auth_url', ''),
3132- user=config.get('os_username', ''),
3133- key=config.get('os_password', ''),
3134- tenant_name=config.get('os_tenant_name', ''),
3135- auth_version='2.0',
3136- retries=0)
3137- try:
3138- con.head_container(container)
3139- except swiftclient.ClientException:
3140- con.put_container(container)
3141-
3142-
3143-def wal_e_envdir():
3144- '''The envdir(1) environment location used to drive WAL-E.'''
3145- return os.path.join(_get_postgresql_config_dir(), 'wal-e.env')
3146-
3147-
3148-def create_wal_e_envdir():
3149- '''Regenerate the envdir(1) environment used to drive WAL-E.'''
3150- config = hookenv.config()
3151- env = dict(
3152- SWIFT_AUTHURL=config.get('os_auth_url', ''),
3153- SWIFT_TENANT=config.get('os_tenant_name', ''),
3154- SWIFT_USER=config.get('os_username', ''),
3155- SWIFT_PASSWORD=config.get('os_password', ''),
3156- AWS_ACCESS_KEY_ID=config.get('aws_access_key_id', ''),
3157- AWS_SECRET_ACCESS_KEY=config.get('aws_secret_access_key', ''),
3158- WABS_ACCOUNT_NAME=config.get('wabs_account_name', ''),
3159- WABS_ACCESS_KEY=config.get('wabs_access_key', ''),
3160- WALE_SWIFT_PREFIX='',
3161- WALE_S3_PREFIX='',
3162- WALE_WABS_PREFIX='')
3163-
3164- uri = config.get('wal_e_storage_uri', None)
3165-
3166- if uri:
3167- # Until juju provides us with proper leader election, we have a
3168- # state where units do not know if they are alone or part of a
3169- # cluster. To avoid units stomping on each others WAL and backups,
3170- # we use a unique container for each unit when they are not
3171- # part of the peer relation. Once they are part of the peer
3172- # relation, they share a container.
3173- if local_state.get('state', 'standalone') == 'standalone':
3174- if not uri.endswith('/'):
3175- uri += '/'
3176- uri += hookenv.local_unit().split('/')[-1]
3177-
3178- parsed_uri = urlparse.urlparse(uri)
3179-
3180- required_env = []
3181- if parsed_uri.scheme == 'swift':
3182- env['WALE_SWIFT_PREFIX'] = uri
3183- required_env = ['SWIFT_AUTHURL', 'SWIFT_TENANT',
3184- 'SWIFT_USER', 'SWIFT_PASSWORD']
3185- ensure_swift_container(parsed_uri.netloc)
3186- elif parsed_uri.scheme == 's3':
3187- env['WALE_S3_PREFIX'] = uri
3188- required_env = ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY']
3189- elif parsed_uri.scheme == 'wabs':
3190- env['WALE_WABS_PREFIX'] = uri
3191- required_env = ['WABS_ACCOUNT_NAME', 'WABS_ACCESS_KEY']
3192- else:
3193- log('Invalid wal_e_storage_uri {}'.format(uri), ERROR)
3194-
3195- for env_key in required_env:
3196- if not env[env_key].strip():
3197- log('Missing {}'.format(env_key), ERROR)
3198-
3199- # Regenerate the envdir(1) environment recommended by WAL-E.
3200- # All possible keys are rewritten to ensure we remove old secrets.
3201- host.mkdir(wal_e_envdir(), 'postgres', 'postgres', 0o750)
3202- for k, v in env.items():
3203- host.write_file(
3204- os.path.join(wal_e_envdir(), k), v.strip(),
3205- 'postgres', 'postgres', 0o640)
3206-
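# The envdir ends up holding one small file per variable, e.g.
# (illustrative; the actual directory comes from wal_e_envdir()):
#   .../wal-e.env/SWIFT_AUTHURL
#   .../wal-e.env/WALE_SWIFT_PREFIX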
3207-
3208-def wal_e_archive_command():
3209- '''Return the archive_command needed in postgresql.conf.'''
3210- return 'envdir {} wal-e wal-push %p'.format(wal_e_envdir())
3211-
3212-
3213-def wal_e_restore_command():
3214- return 'envdir {} wal-e wal-fetch "%f" "%p"'.format(wal_e_envdir())
3215-
3216-
3217-def wal_e_backup_command():
3218- postgresql_cluster_dir = os.path.join(
3219- postgresql_data_dir, pg_version(), hookenv.config('cluster_name'))
3220- return 'envdir {} wal-e backup-push {}'.format(
3221- wal_e_envdir(), postgresql_cluster_dir)
3222-
3223-
3224-def wal_e_prune_command():
3225- return 'envdir {} wal-e delete --confirm retain {}'.format(
3226- wal_e_envdir(), hookenv.config('wal_e_backup_retention'))
3227-
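# Rendered into postgresql.conf (archive) and recovery.conf (restore)
# these look roughly like the following; the config directory path is
# an illustrative assumption:
#   archive_command = 'envdir /etc/postgresql/9.4/main/wal-e.env wal-e wal-push %p'
#   restore_command = 'envdir /etc/postgresql/9.4/main/wal-e.env wal-e wal-fetch "%f" "%p"'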
3228-
3229-def swiftwal_config():
3230- postgresql_config_dir = _get_postgresql_config_dir()
3231- return os.path.join(postgresql_config_dir, "swiftwal.conf")
3232-
3233-
3234-def create_swiftwal_config():
3235- if not hookenv.config('swiftwal_container_prefix'):
3236- return
3237-
3238- # Until juju provides us with proper leader election, we have a
3239- # state where units do not know if they are alone or part of a
3240- # cluster. To avoid units stomping on each others WAL and backups,
3241- # we use a unique Swift container for each unit when they are not
3242- # part of the peer relation. Once they are part of the peer
3243- # relation, they share a container.
3244- if local_state.get('state', 'standalone') == 'standalone':
3245- container = '{}_{}'.format(hookenv.config('swiftwal_container_prefix'),
3246- hookenv.local_unit().split('/')[-1])
3247- else:
3248- container = hookenv.config('swiftwal_container_prefix')
3249-
3250- template_file = os.path.join(hookenv.charm_dir(),
3251- 'templates', 'swiftwal.conf.tmpl')
3252- params = dict(hookenv.config())
3253- params['swiftwal_container'] = container
3254- content = Template(open(template_file).read()).render(params)
3255- host.write_file(swiftwal_config(), content, "postgres", "postgres", 0o600)
3256-
3257-
3258-def swiftwal_archive_command():
3259- '''Return the archive_command needed in postgresql.conf'''
3260- return 'swiftwal --config={} archive-wal %p'.format(swiftwal_config())
3261-
3262-
3263-def swiftwal_restore_command():
3264- '''Return the restore_command needed in recovery.conf'''
3265- return 'swiftwal --config={} restore-wal %f %p'.format(swiftwal_config())
3266-
3267-
3268-def swiftwal_backup_command():
3269- '''Return the backup command needed in postgres' crontab'''
3270- cmd = 'swiftwal --config={} backup --port={}'.format(swiftwal_config(),
3271- get_service_port())
3272- if not hookenv.config('swiftwal_log_shipping'):
3273- cmd += ' --xlog'
3274- return cmd
3275-
3276-
3277-def swiftwal_prune_command():
3278- '''Return the backup & wal pruning command needed in postgres' crontab'''
3279- config = hookenv.config()
3280- args = '--keep-backups={} --keep-wals={}'.format(
3281- config.get('swiftwal_backup_retention', 0),
3282- max(config['wal_keep_segments'],
3283- config['replicated_wal_keep_segments']))
3284- return 'swiftwal --config={} prune {}'.format(swiftwal_config(), args)
3285-
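# In the postgres crontab these render roughly as follows (the config
# path and retention values are illustrative assumptions):
#   swiftwal --config=/etc/postgresql/9.4/main/swiftwal.conf backup --port=5432 --xlog
#   swiftwal --config=/etc/postgresql/9.4/main/swiftwal.conf prune \
#       --keep-backups=2 --keep-wals=80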
3286-
3287-def update_service_port():
3288- old_port = local_state.get('listen_port', None)
3289- new_port = get_service_port()
3290- if old_port != new_port:
3291- if new_port:
3292- hookenv.open_port(new_port)
3293- if old_port:
3294- hookenv.close_port(old_port)
3295- local_state['listen_port'] = new_port
3296- local_state.save()
3297-
3298-
3299-def create_ssl_cert(cluster_dir):
3300- # PostgreSQL expects SSL certificates in the datadir.
3301- server_crt = os.path.join(cluster_dir, 'server.crt')
3302- server_key = os.path.join(cluster_dir, 'server.key')
3303- if not os.path.exists(server_crt):
3304- os.symlink('/etc/ssl/certs/ssl-cert-snakeoil.pem',
3305- server_crt)
3306- if not os.path.exists(server_key):
3307- os.symlink('/etc/ssl/private/ssl-cert-snakeoil.key',
3308- server_key)
3309-
3310-
3311-def set_password(user, password):
3312- if not os.path.isdir("passwords"):
3313- os.makedirs("passwords")
3314- old_umask = os.umask(0o077)
3315- try:
3316- with open("passwords/%s" % user, "w") as pwfile:
3317- pwfile.write(password)
3318- finally:
3319- os.umask(old_umask)
3320-
3321-
3322-def get_password(user):
3323- try:
3324- with open("passwords/%s" % user) as pwfile:
3325- return pwfile.read()
3326- except IOError:
3327- return None
3328-
3329-
3330-def db_cursor(autocommit=False, db='postgres', user='postgres',
3331- host=None, port=None, timeout=30):
3332- if port is None:
3333- port = get_service_port()
3334- if host:
3335- conn_str = "dbname={} host={} port={} user={}".format(
3336- db, host, port, user)
3337- else:
3338- conn_str = "dbname={} port={} user={}".format(db, port, user)
3339- # There are often race conditions in opening database connections,
3340- # such as a reload having just happened to change pg_hba.conf
3341- # settings or a hot standby being restarted and needing to catch up
3342- # with its master. To protect our automation against these sorts of
3343- # race conditions, by default we always retry failed connections
3344- # until a timeout is reached.
3345- start = time.time()
3346- while True:
3347- try:
3348- with pgpass():
3349- conn = psycopg2.connect(conn_str)
3350- break
3351- except psycopg2.Error, x:
3352- if time.time() > start + timeout:
3353- log("Database connection {!r} failed".format(
3354- conn_str), CRITICAL)
3355- raise
3356- log("Unable to open connection ({}), retrying.".format(x))
3357- time.sleep(1)
3358- conn.autocommit = autocommit
3359- return conn.cursor()
3360-
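# Typical usage; the connection is retried for up to `timeout` seconds
# before the last psycopg2 error is re-raised:
#   cur = db_cursor(autocommit=True)
#   cur.execute('SELECT pg_is_in_recovery()')
#   in_recovery = cur.fetchone()[0]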
3361-
3362-def run_sql_as_postgres(sql, *parameters):
3363- cur = db_cursor(autocommit=True)
3364- try:
3365- cur.execute(sql, parameters)
3366- return cur.statusmessage
3367- except psycopg2.ProgrammingError:
3368- log(sql, CRITICAL)
3369- raise
3370-
3371-
3372-def run_select_as_postgres(sql, *parameters):
3373- cur = db_cursor()
3374- cur.execute(sql, parameters)
3375- # NB. Need to suck in the results before the rowcount is valid.
3376- results = cur.fetchall()
3377- return (cur.rowcount, results)
3378-
3379-
3380-def validate_config():
3381- """
3382- Sanity check charm configuration, aborting the script if
3383- we have bogus config values or config changes the charm does not yet
3384- (or cannot) support.
3385- """
3386- valid = True
3387- config_data = hookenv.config()
3388-
3389- version = config_data.get('version', None)
3390- if version:
3391- if version not in ('9.1', '9.2', '9.3', '9.4'):
3392- valid = False
3393- log("Invalid or unsupported version {!r} requested".format(
3394- version), CRITICAL)
3395-
3396- if config_data['cluster_name'] != 'main':
3397- valid = False
3398- log("Cluster names other than 'main' do not work per LP:1271835",
3399- CRITICAL)
3400-
3401- if config_data['listen_ip'] != '*':
3402- valid = False
3403- log("listen_ip values other than '*' do not work per LP:1271837",
3404- CRITICAL)
3405-
3406- valid_workloads = [
3407- 'dw', 'oltp', 'web', 'mixed', 'desktop', 'manual', 'auto']
3408- requested_workload = config_data['performance_tuning'].lower()
3409- if requested_workload not in valid_workloads:
3410- valid = False
3411- log('Invalid performance_tuning setting {}'.format(requested_workload),
3412- CRITICAL)
3413- if requested_workload == 'auto':
3414- log("'auto' performance_tuning deprecated. Using 'mixed' tuning",
3415- WARNING)
3416-
3417- unchangeable_config = [
3418- 'locale', 'encoding', 'version', 'cluster_name', 'pgdg']
3419-
3420- for name in unchangeable_config:
3421- if config_data._prev_dict is not None and config_data.changed(name):
3422- valid = False
3423- log("Cannot change {!r} setting after install.".format(name))
3424- local_state[name] = config_data.get(name, None)
3425- local_state.save()
3426-
3427- package_status = config_data['package_status']
3428- if package_status not in ['install', 'hold']:
3429- valid = False
3430- log("package_status must be 'install' or 'hold' not '{}'"
3431- "".format(package_status), CRITICAL)
3432-
3433- if not valid:
3434- sys.exit(99)
3435-
3436-
3437-def ensure_package_status(package, status):
3438- selections = '{} {}\n'.format(package, status)
3439- dpkg = subprocess.Popen(
3440- ['dpkg', '--set-selections'], stdin=subprocess.PIPE)
3441- dpkg.communicate(input=selections)
3442-
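# Equivalent to the shell idiom, shown purely as an illustration:
#   echo "postgresql-9.4 hold" | dpkg --set-selections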
3443-
3444-# -----------------------------------------------------------------------------
3445-# Core logic for permanent storage changes:
3446-# NOTE the only 2 "True" return points:
3447-# 1) symlink already pointing to existing storage (no-op)
3448-# 2) new storage properly initialized:
3449-# - if fresh new storage dir: rsync existing data
3450-# - manipulate /var/lib/postgresql/VERSION/CLUSTER symlink
3451-# -----------------------------------------------------------------------------
3452-def config_changed_volume_apply(mount_point):
3453- version = pg_version()
3454- cluster_name = hookenv.config('cluster_name')
3455- data_directory_path = os.path.join(
3456- postgresql_data_dir, version, cluster_name)
3457-
3458- assert(data_directory_path)
3459-
3460- if not os.path.exists(data_directory_path):
3461- log(
3462- "postgresql data dir {} not found, "
3463- "not applying changes.".format(data_directory_path),
3464- CRITICAL)
3465- return False
3466-
3467- new_pg_dir = os.path.join(mount_point, "postgresql")
3468- new_pg_version_cluster_dir = os.path.join(
3469- new_pg_dir, version, cluster_name)
3470- if not mount_point:
3471- log(
3472- "invalid mount point = {}, "
3473- "not applying changes.".format(mount_point), ERROR)
3474- return False
3475-
3476- if ((os.path.islink(data_directory_path) and
3477- os.readlink(data_directory_path) == new_pg_version_cluster_dir and
3478- os.path.isdir(new_pg_version_cluster_dir))):
3479- log(
3480- "postgresql data dir '{}' already points "
3481- "to {}, skipping storage changes.".format(
3482- data_directory_path, new_pg_version_cluster_dir))
3483- log(
3484- "existing-symlink: to fix/avoid UID changes from "
3485- "previous units, doing: "
3486- "chown -R postgres:postgres {}".format(new_pg_dir))
3487- run("chown -R postgres:postgres %s" % new_pg_dir)
3488- return True
3489-
3490- # Create a directory structure below "new" mount_point as
3491- # external_volume_mount/postgresql/9.1/main
3492- for new_dir in [new_pg_dir,
3493- os.path.join(new_pg_dir, version),
3494- new_pg_version_cluster_dir]:
3495- if not os.path.isdir(new_dir):
3496- log("mkdir %s".format(new_dir))
3497- host.mkdir(new_dir, owner="postgres", perms=0o700)
3498- # Carefully build this symlink, e.g.:
3499- # /var/lib/postgresql/9.1/main ->
3500- # external_volume_mount/postgresql/9.1/main
3501- # but keep previous "main/" directory, by renaming it to
3502- # main-$TIMESTAMP
3503- if not postgresql_stop() and postgresql_is_running():
3504- log("postgresql_stop() failed - can't migrate data.", ERROR)
3505- return False
3506- if not os.path.exists(os.path.join(
3507- new_pg_version_cluster_dir, "PG_VERSION")):
3508- log("migrating PG data {}/ -> {}/".format(
3509- data_directory_path, new_pg_version_cluster_dir), WARNING)
3510- # avoid copying the PID file to perm storage (shouldn't be any...)
3511- command = "rsync -a --exclude postmaster.pid {}/ {}/".format(
3512- data_directory_path, new_pg_version_cluster_dir)
3513- log("run: {}".format(command))
3514- run(command)
3515- try:
3516- os.rename(data_directory_path, "{}-{}".format(
3517- data_directory_path, int(time.time())))
3518- log("NOTICE: symlinking {} -> {}".format(
3519- new_pg_version_cluster_dir, data_directory_path))
3520- os.symlink(new_pg_version_cluster_dir, data_directory_path)
3521- run("chown -h postgres:postgres {}".format(data_directory_path))
3522- log(
3523- "after-symlink: to fix/avoid UID changes from "
3524- "previous units, doing: "
3525- "chown -R postgres:postgres {}".format(new_pg_dir))
3526- run("chown -R postgres:postgres {}".format(new_pg_dir))
3527- return True
3528- except OSError:
3529- log("failed to symlink {} -> {}".format(
3530- data_directory_path, mount_point), CRITICAL)
3531- return False
3532-
3533-
3534-def reset_manual_replication_state():
3535- '''In manual replication mode, the state of the local database cluster
3536- is outside of Juju's control. We need to detect and update the charm
3537- state to match reality.
3538- '''
3539- if hookenv.config('manual_replication'):
3540- if os.path.exists('recovery.conf'):
3541- local_state['state'] = 'hot standby'
3542- elif slave_count():
3543- local_state['state'] = 'master'
3544- else:
3545- local_state['state'] = 'standalone'
3546- local_state.publish()
3547-
3548-
3549-@hooks.hook()
3550-def config_changed(force_restart=False, mount_point=None):
3551- validate_config()
3552- config_data = hookenv.config()
3553- update_repos_and_packages()
3554-
3555- if mount_point is not None:
3556- # config_changed_volume_apply will stop the service if it finds
3557- # it necessary, ie: new volume setup
3558- if config_changed_volume_apply(mount_point=mount_point):
3559- postgresql_autostart(True)
3560- else:
3561- postgresql_autostart(False)
3562- postgresql_stop()
3563- mounts = volume_get_all_mounted()
3564- if mounts:
3565- log("current mounted volumes: {}".format(mounts))
3566- log(
3567- "Disabled and stopped postgresql service "
3568- "(config_changed_volume_apply failure)", ERROR)
3569- sys.exit(1)
3570-
3571- reset_manual_replication_state()
3572-
3573- postgresql_config_dir = _get_postgresql_config_dir(config_data)
3574- postgresql_config = os.path.join(postgresql_config_dir, "postgresql.conf")
3575- postgresql_hba = os.path.join(postgresql_config_dir, "pg_hba.conf")
3576- postgresql_ident = os.path.join(postgresql_config_dir, "pg_ident.conf")
3577-
3578- create_postgresql_config(postgresql_config)
3579- create_postgresql_ident(postgresql_ident) # Do this before pg_hba.conf.
3580- generate_postgresql_hba(postgresql_hba)
3581- create_ssl_cert(os.path.join(
3582- postgresql_data_dir, pg_version(), config_data['cluster_name']))
3583- create_swiftwal_config()
3584- create_wal_e_envdir()
3585- update_service_port()
3586- update_nrpe_checks()
3587- write_metrics_cronjob('/usr/local/bin/postgres_to_statsd.py',
3588- '/etc/cron.d/postgres_metrics')
3589-
3590- # If an external mountpoint has caused an old, existing DB to be
3591- # mounted, we need to ensure that all the users, databases, roles
3592- # etc. exist with known passwords.
3593- if local_state['state'] in ('standalone', 'master'):
3594- client_relids = (
3595- hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'))
3596- for relid in client_relids:
3597- rel = hookenv.relation_get(rid=relid, unit=hookenv.local_unit())
3598- client_rel = None
3599- for unit in hookenv.related_units(relid):
3600- client_rel = hookenv.relation_get(unit=unit, rid=relid)
3601- if not client_rel:
3602- continue # No client units - in between departed and broken?
3603-
3604- database = rel.get('database')
3605- if database is None:
3606- continue # The relation exists, but we haven't joined it yet.
3607-
3608- roles = filter(None, (client_rel.get('roles') or '').split(","))
3609- user = rel.get('user')
3610- if user:
3611- admin = relid.startswith('db-admin')
3612- password = create_user(user, admin=admin)
3613- reset_user_roles(user, roles)
3614- hookenv.relation_set(relid, password=password)
3615-
3616- schema_user = rel.get('schema_user')
3617- if schema_user:
3618- schema_password = create_user(schema_user)
3619- hookenv.relation_set(relid, schema_password=schema_password)
3620-
3621- if user and schema_user and not (
3622- database is None or database == 'all'):
3623- ensure_database(user, schema_user, database)
3624-
3625- if force_restart:
3626- postgresql_restart()
3627- postgresql_reload_or_restart()
3628-
3629- # In case the log_line_prefix has changed, inform syslog consumers.
3630- for relid in hookenv.relation_ids('syslog'):
3631- hookenv.relation_set(
3632- relid, log_line_prefix=hookenv.config('log_line_prefix'))
3633-
3634-
3635-@hooks.hook()
3636-def install(run_pre=True, force_restart=True):
3637- if run_pre:
3638- for f in glob.glob('exec.d/*/charm-pre-install'):
3639- if os.path.isfile(f) and os.access(f, os.X_OK):
3640- subprocess.check_call(['sh', '-c', f])
3641-
3642- validate_config()
3643-
3644- config_data = hookenv.config()
3645- update_repos_and_packages()
3646- if 'state' not in local_state:
3647- log('state not in {}'.format(local_state.keys()), DEBUG)
3648- # Fresh installation. Because this function is invoked by both
3649- # the install hook and the upgrade-charm hook, we need to guard
3650- # any non-idempotent setup. We should probably fix this; it
3651- # seems rather fragile.
3652- local_state.setdefault('state', 'standalone')
3653- log(repr(local_state.keys()), DEBUG)
3654-
3655- # Drop the cluster created when the postgresql package was
3656- # installed, and rebuild it with the requested locale and encoding.
3657- version = pg_version()
3658- for ver, name in lsclusters(slice(2)):
3659- if version == ver and name == 'main':
3660- run("pg_dropcluster --stop {} main".format(version))
3661- listen_port = config_data.get('listen_port', None)
3662- if listen_port:
3663- port_opt = "--port={}".format(config_data['listen_port'])
3664- else:
3665- port_opt = ''
3666- createcluster()
3667- assert (
3668- not port_opt
3669- or get_service_port() == config_data['listen_port']), (
3670- 'allocated port {!r} != {!r}'.format(
3671- get_service_port(), config_data['listen_port']))
3672- local_state['port'] = get_service_port()
3673- log('publishing state', DEBUG)
3674- local_state.publish()
3675-
3676- postgresql_backups_dir = (
3677- config_data['backup_dir'].strip() or
3678- os.path.join(postgresql_data_dir, 'backups'))
3679-
3680- host.mkdir(postgresql_backups_dir, owner="postgres", perms=0o755)
3681- host.mkdir(postgresql_scripts_dir, owner="postgres", perms=0o755)
3682- host.mkdir(postgresql_logs_dir, owner="postgres", perms=0o755)
3683- paths = {
3684- 'base_dir': postgresql_data_dir,
3685- 'backup_dir': postgresql_backups_dir,
3686- 'scripts_dir': postgresql_scripts_dir,
3687- 'logs_dir': postgresql_logs_dir,
3688- }
3689- charm_dir = hookenv.charm_dir()
3690- template_file = "{}/templates/pg_backup_job.tmpl".format(charm_dir)
3691- backup_job = Template(open(template_file).read()).render(paths)
3692- host.write_file(
3693- os.path.join(postgresql_scripts_dir, 'dump-pg-db'),
3694- open('scripts/pgbackup.py', 'r').read(), perms=0o755)
3695- host.write_file(
3696- os.path.join(postgresql_scripts_dir, 'pg_backup_job'),
3697- backup_job, perms=0o755)
3698- install_postgresql_crontab(postgresql_crontab)
3699-
3700- # Create this empty log file on installation to avoid triggering
3701- # spurious monitoring system alerts, per Bug #1329816.
3702- if not os.path.exists(backup_log):
3703- host.write_file(backup_log, '', 'postgres', 'postgres', 0o664)
3704-
3705- hookenv.open_port(get_service_port())
3706-
3707- # Ensure at least minimal access granted for hooks to run.
3708- # Reload because we are using the default cluster setup and started
3709- # when we installed the PostgreSQL packages.
3710- config_changed(force_restart=force_restart)
3711-
3712- snapshot_relations()
3713-
3714-
3715-@hooks.hook()
3716-def upgrade_charm():
3717- """Handle saving state during an upgrade-charm hook.
3718-
3719- When upgrading from an installation using volume-map, we migrate
3720- that installation to use the storage subordinate charm by remounting
3721- a mountpath that the storage subordinate maintains. We exit(1) only to
3722- raise visibility to manual procedure that we log in juju logs below for the
3723- juju admin to finish the migration by relating postgresql to the storage
3724- and block-storage-broker services. These steps are generalised in the
3725- README as well.
3726- """
3727- install(run_pre=False, force_restart=False)
3728- snapshot_relations()
3729- version = pg_version()
3730- cluster_name = hookenv.config('cluster_name')
3731- data_directory_path = os.path.join(
3732- postgresql_data_dir, version, cluster_name)
3733- if (os.path.islink(data_directory_path)):
3734- link_target = os.readlink(data_directory_path)
3735- if "/srv/juju" in link_target:
3736- # Then we just upgraded from an installation that was using
3737- # charm config volume_map definitions. We need to stop postgresql
3738- # and remount the device where the storage subordinate expects to
3739- # control the mount in the future if relations/units change
3740- volume_id = link_target.split("/")[3]
3741- unit_name = hookenv.local_unit()
3742- new_mount_root = external_volume_mount
3743- new_pg_version_cluster_dir = os.path.join(
3744- new_mount_root, "postgresql", version, cluster_name)
3745- if not os.path.exists(new_mount_root):
3746- os.mkdir(new_mount_root)
3747- log("\n"
3748- "WARNING: %s unit has external volume id %s mounted via the\n"
3749- "deprecated volume-map and volume-ephemeral-storage\n"
3750- "configuration parameters.\n"
3751- "These parameters are no longer available in the postgresql\n"
3752- "charm in favor of using the volume_map parameter in the\n"
3753- "storage subordinate charm.\n"
3754- "We are migrating the attached volume to a mount path which\n"
3755- "can be managed by the storage subordinate charm. To\n"
3756- "continue using this volume_id with the storage subordinate\n"
3757- "follow this procedure.\n-----------------------------------\n"
3758- "1. cat > storage.cfg <<EOF\nstorage:\n"
3759- " provider: block-storage-broker\n"
3760- " root: %s\n"
3761- " volume_map: \"{%s: %s}\"\nEOF\n2. juju deploy "
3762- "--config storage.cfg storage\n"
3763- "3. juju deploy block-storage-broker\n4. juju add-relation "
3764- "block-storage-broker storage\n5. juju resolved --retry "
3765- "%s\n6. juju add-relation postgresql storage\n"
3766- "-----------------------------------\n" %
3767- (unit_name, volume_id, new_mount_root, unit_name, volume_id,
3768- unit_name), WARNING)
3769- postgresql_stop()
3770- os.unlink(data_directory_path)
3771- log("Unmounting external storage due to charm upgrade: %s" %
3772- link_target)
3773- try:
3774- subprocess.check_output(
3775- "umount /srv/juju/%s" % volume_id, shell=True)
3776- # Since e2label truncates labels to 16 characters use only the
3777- # first 16 characters of the volume_id as that's what was
3778- # set by old versions of postgresql charm
3779- subprocess.check_call(
3780- "mount -t ext4 LABEL=%s %s" %
3781- (volume_id[:16], new_mount_root), shell=True)
3782- except subprocess.CalledProcessError, e:
3783- log("upgrade-charm mount migration failed. %s" % str(e), ERROR)
3784- sys.exit(1)
3785-
3786- log("NOTICE: symlinking {} -> {}".format(
3787- new_pg_version_cluster_dir, data_directory_path))
3788- os.symlink(new_pg_version_cluster_dir, data_directory_path)
3789- run("chown -h postgres:postgres {}".format(data_directory_path))
3790- postgresql_start() # Will exit(1) if issues
3791- log("Remount and restart success for this external volume.\n"
3792- "This current running installation will break upon\n"
3793- "add/remove postgresql units or relations if you do not\n"
3794- "follow the above procedure to ensure your external\n"
3795- "volumes are preserved by the storage subordinate charm.",
3796- WARNING)
3797- # So juju admins can see the hook fail and note the steps to fix
3798- # per our WARNINGs above
3799- sys.exit(1)
3800-
3801-
3802-@hooks.hook()
3803-def start():
3804- postgresql_reload_or_restart()
3805-
3806-
3807-@hooks.hook()
3808-def stop():
3809- if postgresql_is_running():
3810- with restart_lock(hookenv.local_unit(), True):
3811- postgresql_stop()
3812-
3813-
3814-def quote_identifier(identifier):
3815- r'''Quote an identifier, such as a table or role name.
3816-
3817- In SQL, identifiers are quoted using " rather than ' (which is reserved
3818- for strings).
3819-
3820- >>> print(quote_identifier('hello'))
3821- "hello"
3822-
3823- Quotes and Unicode are handled if you make use of them in your
3824- identifiers.
3825-
3826- >>> print(quote_identifier("'"))
3827- "'"
3828- >>> print(quote_identifier('"'))
3829- """"
3830- >>> print(quote_identifier("\\"))
3831- "\"
3832- >>> print(quote_identifier('\\"'))
3833- "\"""
3834- >>> print(quote_identifier('\\ aargh \u0441\u043b\u043e\u043d'))
3835- U&"\\ aargh \0441\043b\043e\043d"
3836- '''
3837- try:
3838- return '"%s"' % identifier.encode('US-ASCII').replace('"', '""')
3839- except UnicodeEncodeError:
3840- escaped = []
3841- for c in identifier:
3842- if c == '\\':
3843- escaped.append('\\\\')
3844- elif c == '"':
3845- escaped.append('""')
3846- else:
3847- c = c.encode('US-ASCII', 'backslashreplace')
3848- # Note Python only supports 32 bit unicode, so we use
3849- # the 4 hexdigit PostgreSQL syntax (\1234) rather than
3850- # the 6 hexdigit format (\+123456).
3851- if c.startswith('\\u'):
3852- c = '\\' + c[2:]
3853- escaped.append(c)
3854- return 'U&"%s"' % ''.join(escaped)
3855-
3856-
3857-def sanitize(s):
3858- s = s.replace(':', '_')
3859- s = s.replace('-', '_')
3860- s = s.replace('/', '_')
3861- s = s.replace('"', '_')
3862- s = s.replace("'", '_')
3863- return s
3864-
3865-
3866-def user_name(relid, remote_unit, admin=False, schema=False):
3867- # Per Bug #1160530, don't append the remote unit number to the user name.
3868- components = [sanitize(relid), sanitize(re.split("/", remote_unit)[0])]
3869- if admin:
3870- components.append("admin")
3871- elif schema:
3872- components.append("schema")
3873- return "_".join(components)
3874-
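# A sketch of the resulting names, assuming relation id 'db:0' and
# remote unit 'wordpress/3':
#   >>> user_name('db:0', 'wordpress/3')
#   'db_0_wordpress'
#   >>> user_name('db:0', 'wordpress/3', admin=True)
#   'db_0_wordpress_admin'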
3875-
3876-def user_exists(user):
3877- sql = "SELECT rolname FROM pg_roles WHERE rolname = %s"
3878- if run_select_as_postgres(sql, user)[0] > 0:
3879- return True
3880- else:
3881- return False
3882-
3883-
3884-def create_user(user, admin=False, replication=False):
3885- password = get_password(user)
3886- if password is None:
3887- password = host.pwgen()
3888- set_password(user, password)
3889- if user_exists(user):
3890- log("Updating {} user".format(user))
3891- action = ["ALTER ROLE"]
3892- else:
3893- log("Creating {} user".format(user))
3894- action = ["CREATE ROLE"]
3895- action.append('%s WITH LOGIN')
3896- if admin:
3897- action.append('SUPERUSER')
3898- else:
3899- action.append('NOSUPERUSER')
3900- if replication:
3901- action.append('REPLICATION')
3902- else:
3903- action.append('NOREPLICATION')
3904- action.append('PASSWORD %s')
3905- sql = ' '.join(action)
3906- run_sql_as_postgres(sql, AsIs(quote_identifier(user)), password)
3907- return password
3908-
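# For a fresh non-admin, non-replication user this emits SQL along the
# lines of (identifier quoted via quote_identifier, password bound by
# psycopg2; the role name is an illustrative assumption):
#   CREATE ROLE "db_0_wordpress" WITH LOGIN NOSUPERUSER NOREPLICATION
#       PASSWORD '...'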
3909-
3910-def reset_user_roles(user, roles):
3911- wanted_roles = set(roles)
3912-
3913- sql = """
3914- SELECT role.rolname
3915- FROM
3916- pg_roles AS role,
3917- pg_roles AS member,
3918- pg_auth_members
3919- WHERE
3920- member.oid = pg_auth_members.member
3921- AND role.oid = pg_auth_members.roleid
3922- AND member.rolname = %s
3923- """
3924- existing_roles = set(r[0] for r in run_select_as_postgres(sql, user)[1])
3925-
3926- roles_to_grant = wanted_roles.difference(existing_roles)
3927-
3928- for role in roles_to_grant:
3929- ensure_role(role)
3930-
3931- if roles_to_grant:
3932- log("Granting {} to {}".format(",".join(roles_to_grant), user), INFO)
3933-
3934- for role in roles_to_grant:
3935- run_sql_as_postgres(
3936- "GRANT %s TO %s",
3937- AsIs(quote_identifier(role)), AsIs(quote_identifier(user)))
3938-
3939- roles_to_revoke = existing_roles.difference(wanted_roles)
3940-
3941- if roles_to_revoke:
3942- log("Revoking {} from {}".format(",".join(roles_to_grant), user), INFO)
3943-
3944- for role in roles_to_revoke:
3945- run_sql_as_postgres(
3946- "REVOKE %s FROM %s",
3947- AsIs(quote_identifier(role)), AsIs(quote_identifier(user)))
3948-
3949-
3950-def ensure_role(role):
3951- sql = "SELECT oid FROM pg_roles WHERE rolname = %s"
3952- if run_select_as_postgres(sql, role)[0] == 0:
3953- sql = "CREATE ROLE %s INHERIT NOLOGIN"
3954- run_sql_as_postgres(sql, AsIs(quote_identifier(role)))
3955-
3956-
3957-def ensure_database(user, schema_user, database):
3958- sql = "SELECT datname FROM pg_database WHERE datname = %s"
3959- if run_select_as_postgres(sql, database)[0] != 0:
3960- # DB already exists
3961- pass
3962- else:
3963- sql = "CREATE DATABASE %s"
3964- run_sql_as_postgres(sql, AsIs(quote_identifier(database)))
3965- sql = "GRANT ALL PRIVILEGES ON DATABASE %s TO %s"
3966- run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
3967- AsIs(quote_identifier(schema_user)))
3968- sql = "GRANT CONNECT ON DATABASE %s TO %s"
3969- run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
3970- AsIs(quote_identifier(user)))
3971-
3972-
3973-def ensure_extensions(extensions, database):
3974- if extensions:
3975- cur = db_cursor(db=database, autocommit=True)
3976- try:
3977- cur.execute('SELECT extname FROM pg_extension')
3978- installed_extensions = frozenset(x[0] for x in cur.fetchall())
3979- log("ensure_extensions({}), have {}"
3980- .format(extensions, installed_extensions),
3981- DEBUG)
3982- extensions_set = frozenset(extensions)
3983- extensions_to_create = \
3984- extensions_set.difference(installed_extensions)
3985- for ext in extensions_to_create:
3986- log("creating extension {}".format(ext), DEBUG)
3987- cur.execute('CREATE EXTENSION %s',
3988- (AsIs(quote_identifier(ext)),))
3989- finally:
3990- cur.close()
3991-
3992-
3993-def snapshot_relations():
3994- '''Snapshot our relation information into local state.
3995-
3996- We need this information to be available in -broken
3997- hooks letting us actually clean up properly. Bug #1190996.
3998- '''
3999- log("Snapshotting relations", DEBUG)
4000- local_state['relations'] = hookenv.relations()
4001- local_state.save()
4002-
4003-
4004-# Each database unit needs to publish connection details to the
4005-# client. This is problematic, because 1) the user and database are
4006-# only created on the master unit and this is replicated to the
4007-# slave units outside of juju and 2) we have no control over the
4008-# order that units join the relation.
4009-#
4010-# The simplest approach of generating usernames and passwords in
4011-# the master units db-relation-joined hook fails because slave
4012-# units may well have already run their hooks and found no
4013-# connection details to republish. When the master unit publishes
4014-# the connection details it only triggers relation-changed hooks
4015-# on the client units, not the relation-changed hook on other peer
4016-# units.
4017-#
4018-# A more complex approach is for the first database unit that joins
4019-# the relation to generate the usernames and passwords and publish
4020-# this to the relation. Subsequent units can retrieve this
4021-# information and republish it. Of course, the master unit also
4022-# creates the database and users when it joins the relation.
4023-# This approach should work reliably on the server side. However,
4024-# there is a window from when a slave unit joins a client relation
4025-# until the master unit has joined that relation when the
4026-# credentials published by the slave unit are invalid. These
4027-# credentials will only become valid after the master unit has
4028-# actually created the user and database.
4029-#
4030-# The implemented approach is for the master unit's
4031-# db-relation-joined hook to create the user and database and
4032-# publish the connection details, and in addition update a list
4033-# of active relations to the service's peer 'replication' relation.
4034-# After the master unit has updated the peer relationship, the
4035-# slave unit's peer replication-relation-changed hook will
4036-# be triggered and it will have an opportunity to republish the
4037-# connection details. Of course, it may not be able to do so if the
4038-# slave unit's db-relation-joined hook has not yet been run, so we must
4039-# also attempt to republish the connection settings there.
4040-# This way we are guaranteed at least one chance to republish the
4041-# connection details after the database and user have actually been
4042-# created and both the master and slave units have joined the
4043-# relation.
4044-#
4045-# The order of relevant hooks firing may be:
4046-#
4047-# master db-relation-joined (publish)
4048-# slave db-relation-joined (republish)
4049-# slave replication-relation-changed (noop)
4050-#
4051-# slave db-relation-joined (noop)
4052-# master db-relation-joined (publish)
4053-# slave replication-relation-changed (republish)
4054-#
4055-# master db-relation-joined (publish)
4056-# slave replication-relation-changed (noop; slave not yet joined db rel)
4057-# slave db-relation-joined (republish)
4058-
4059-
4060-@hooks.hook('db-relation-joined', 'db-relation-changed')
4061-def db_relation_joined_changed():
4062- reset_manual_replication_state()
4063- if local_state['state'] == 'hot standby':
4064- publish_hot_standby_credentials()
4065- return
4066-
4067- # By default, we create a database named after the remote
4068- # servicename. The remote service can override this by setting
4069- # the database property on the relation.
4070- database = hookenv.relation_get('database')
4071- if not database:
4072- database = hookenv.remote_unit().split('/')[0]
4073-
4074- # Generate a unique username for this relation to use.
4075- user = user_name(hookenv.relation_id(), hookenv.remote_unit())
4076-
4077- roles = filter(None, (hookenv.relation_get('roles') or '').split(","))
4078-
4079- extensions = filter(None,
4080- (hookenv.relation_get('extensions') or '').split(","))
4081-
4082- log('{} unit publishing credentials'.format(local_state['state']))
4083-
4084- password = create_user(user)
4085- reset_user_roles(user, roles)
4086- schema_user = "{}_schema".format(user)
4087- schema_password = create_user(schema_user)
4088- ensure_database(user, schema_user, database)
4089- ensure_extensions(extensions, database)
4090- host = hookenv.unit_private_ip()
4091- port = get_service_port()
4092- state = local_state['state'] # master, hot standby, standalone
4093-
4094- # Publish connection details.
4095- connection_settings = dict(
4096- user=user, password=password,
4097- schema_user=schema_user, schema_password=schema_password,
4098- host=host, database=database, port=port, state=state)
4099- log("Connection settings {!r}".format(connection_settings), DEBUG)
4100- hookenv.relation_set(relation_settings=connection_settings)
4101-
4102- # Update the peer relation, notifying any hot standby units
4103- # to republish connection details to the client relation.
4104- local_state['client_relations'] = ' '.join(sorted(
4105- hookenv.relation_ids('db') + hookenv.relation_ids('db-admin')))
4106- log("Client relations {}".format(local_state['client_relations']))
4107- local_state.publish()
4108-
4109- postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
4110- generate_postgresql_hba(postgresql_hba, user=user,
4111- schema_user=schema_user,
4112- database=database)
4113-
4114- snapshot_relations()
4115-
4116-
4117-@hooks.hook('db-admin-relation-joined', 'db-admin-relation-changed')
4118-def db_admin_relation_joined_changed():
4119- reset_manual_replication_state()
4120- if local_state['state'] == 'hot standby':
4121- publish_hot_standby_credentials()
4122- return
4123-
4124- user = user_name(
4125- hookenv.relation_id(), hookenv.remote_unit(), admin=True)
4126-
4127- log('{} unit publishing credentials'.format(local_state['state']))
4128-
4129- password = create_user(user, admin=True)
4130- host = hookenv.unit_private_ip()
4131- port = get_service_port()
4132- state = local_state['state'] # master, hot standby, standalone
4133-
4134- # Publish connection details.
4135- connection_settings = dict(
4136- user=user, password=password,
4137- host=host, database='all', port=port, state=state)
4138- log("Connection settings {!r}".format(connection_settings), DEBUG)
4139- hookenv.relation_set(relation_settings=connection_settings)
4140-
4141- # Update the peer relation, notifying any hot standby units
4142- # to republish connection details to the client relation.
4143- local_state['client_relations'] = ' '.join(
4144- hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'))
4145- log("Client relations {}".format(local_state['client_relations']))
4146- local_state.publish()
4147-
4148- postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
4149- generate_postgresql_hba(postgresql_hba)
4150-
4151- snapshot_relations()
4152-
4153-
4154-@hooks.hook()
4155-def db_relation_broken():
4156- relid = hookenv.relation_id()
4157- if relid not in local_state['relations']['db']:
4158- # This was to be a hot standby, but it had not yet got as far as
4159- # receiving and handling credentials from the master.
4160- log("db-relation-broken called before relation finished setup", DEBUG)
4161- return
4162-
4163- # The relation no longer exists, so we can't pull the database name
4164- # we used from there. Instead, we have to persist this information
4165- # ourselves.
4166- relation = local_state['relations']['db'][relid]
4167- unit_relation_data = relation[hookenv.local_unit()]
4168-
4169- if local_state['state'] in ('master', 'standalone'):
4170- user = unit_relation_data.get('user', None)
4171- database = unit_relation_data['database']
4172-
4173- # We need to check that the database still exists before
4174- # attempting to revoke privileges because the local PostgreSQL
4175- # cluster may have been rebuilt by another hook.
4176- sql = "SELECT datname FROM pg_database WHERE datname = %s"
4177- if run_select_as_postgres(sql, database)[0] != 0:
4178- sql = "REVOKE ALL PRIVILEGES ON DATABASE %s FROM %s"
4179- run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
4180- AsIs(quote_identifier(user)))
4181- run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
4182- AsIs(quote_identifier(user + "_schema")))
4183-
4184- postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
4185- generate_postgresql_hba(postgresql_hba)
4186-
4187- # Cleanup our local state.
4188- snapshot_relations()
4189-
4190-
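A note on the REVOKE pattern above: database and role names are SQL identifiers, not values, so they cannot be bound as ordinary query parameters; the code quotes them with the charm's quote_identifier() helper and splices them in via psycopg2's AsIs. A minimal sketch of the same pattern, with 'mydb' and 'myuser' as placeholder names:

    from psycopg2.extensions import AsIs

    # quote_identifier() is this charm's helper for safe identifier quoting.
    sql = "REVOKE ALL PRIVILEGES ON DATABASE %s FROM %s"
    run_sql_as_postgres(sql, AsIs(quote_identifier('mydb')),
                        AsIs(quote_identifier('myuser')))
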
4191-@hooks.hook()
4192-def db_admin_relation_broken():
4193- if local_state['state'] in ('master', 'standalone'):
4194- user = hookenv.relation_get('user', unit=hookenv.local_unit())
4195- if user:
4196- # We need to check that the user still exists before
4197- # attempting to revoke privileges because the local PostgreSQL
4198- # cluster may have been rebuilt by another hook.
4199- sql = "SELECT usename FROM pg_user WHERE usename = %s"
4200- if run_select_as_postgres(sql, user)[0] != 0:
4201- sql = "ALTER USER %s NOSUPERUSER"
4202- run_sql_as_postgres(sql, AsIs(quote_identifier(user)))
4203-
4204- postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
4205- generate_postgresql_hba(postgresql_hba)
4206-
4207- # Cleanup our local state.
4208- snapshot_relations()
4209-
4210-
4211-def update_repos_and_packages():
4212- need_upgrade = False
4213-
4214- version = pg_version()
4215-
4216- # Add the PGDG APT repository if it is enabled. Setting this boolean
4217- # is simpler than requiring the magic URL and key be added to
4218- # install_sources and install_keys. In addition, per Bug #1271148,
4219- # install_keys is likely a security hole for this sort of remote
4220- # archive. Instead, we keep a copy of the signing key in the charm
4221- # and can add it securely.
4222- pgdg_list = '/etc/apt/sources.list.d/pgdg_{}.list'.format(
4223- sanitize(hookenv.local_unit()))
4224- pgdg_key = 'ACCC4CF8'
4225-
4226- if hookenv.config('pgdg'):
4227- if not os.path.exists(pgdg_list):
4228- # We need to upgrade, as if we have Ubuntu main packages
4229- # installed they may be incompatible with the PGDG ones.
4230- # This is unlikely to ever happen outside of the test suite,
4231- # and never if you don't reuse machines.
4232- need_upgrade = True
4233- run("apt-key add lib/{}.asc".format(pgdg_key))
4234-            with open(pgdg_list, 'w') as f:
4235-                f.write('deb {} {}-pgdg main'.format('http://apt.postgresql.org/pub/repos/apt/', distro_codename()))
4236- if version == '9.4':
4237- pgdg_94_list = '/etc/apt/sources.list.d/pgdg_94_{}.list'.format(
4238- sanitize(hookenv.local_unit()))
4239- if not os.path.exists(pgdg_94_list):
4240- need_upgrade = True
4241-                with open(pgdg_94_list, 'w') as f:
4242-                    f.write('deb {} {}-pgdg main 9.4'.format(
4243-                        'http://apt.postgresql.org/pub/repos/apt/',
4244-                        distro_codename()))
4245-
4246- elif os.path.exists(pgdg_list):
4247- log(
4248- "PGDG apt source not requested, but already in place in this "
4249- "container", WARNING)
4250- # We can't just remove a source, as we may have packages
4251- # installed that conflict with ones from the other configured
4252- # sources. In particular, if we have postgresql-common installed
4253- # from the PGDG Apt source, PostgreSQL packages from Ubuntu main
4254- # will fail to install.
4255- # os.unlink(pgdg_list)
4256-
4257- # Try to optimize our calls to fetch.configure_sources(), as it
4258- # cannot do this itself due to lack of state.
4259- if (need_upgrade
4260- or local_state.get('install_sources', None)
4261- != hookenv.config('install_sources')
4262- or local_state.get('install_keys', None)
4263- != hookenv.config('install_keys')):
4264- # Support the standard mechanism implemented by charm-helpers. Pulls
4265- # from the default 'install_sources' and 'install_keys' config
4266- # options. This also does 'apt-get update', pulling in the PGDG data
4267- # if we just configured it.
4268- fetch.configure_sources(True)
4269- local_state['install_sources'] = hookenv.config('install_sources')
4270- local_state['install_keys'] = hookenv.config('install_keys')
4271- local_state.save()
4272-
4273- # Ensure that the desired database locale is possible.
4274- if hookenv.config('locale') != 'C':
4275- run(["locale-gen", "{}.{}".format(
4276- hookenv.config('locale'), hookenv.config('encoding'))])
4277-
4278- if need_upgrade:
4279- run("apt-get -y upgrade")
4280-
4281- # It might have been better for debversion and plpython to only get
4282- # installed if they were listed in the extra-packages config item,
4283- # but they predate this feature.
4284- packages = ["python-psutil", # to obtain system RAM from python
4285- "libc-bin", # for getconf
4286- "postgresql-{}".format(version),
4287- "postgresql-contrib-{}".format(version),
4288- "postgresql-plpython-{}".format(version),
4289- "python-jinja2", "python-psycopg2"]
4290-
4291- # PGDG currently doesn't have debversion for 9.3 & 9.4. Put this back
4292- # when it does.
4293- if not (hookenv.config('pgdg') and version in ('9.3', '9.4')):
4294- packages.append("postgresql-{}-debversion".format(version))
4295-
4296- if hookenv.config('performance_tuning').lower() != 'manual':
4297- packages.append('pgtune')
4298-
4299- if hookenv.config('swiftwal_container_prefix'):
4300- packages.append('swiftwal')
4301-
4302- if hookenv.config('wal_e_storage_uri'):
4303- packages.extend(['wal-e', 'daemontools'])
4304-
4305- packages.extend((hookenv.config('extra-packages') or '').split())
4306- packages = fetch.filter_installed_packages(packages)
4307- # Set package state for main postgresql package if installed
4308- if 'postgresql-{}'.format(version) not in packages:
4309- ensure_package_status('postgresql-{}'.format(version),
4310- hookenv.config('package_status'))
4311- fetch.apt_update(fatal=True)
4312- fetch.apt_install(packages, fatal=True)
4313-
4314-
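For reference, the PGDG source file written above holds a single apt line; on a trusty unit it renders as follows (distro_codename() supplies the codename, so this is illustrative for other series):

    deb http://apt.postgresql.org/pub/repos/apt/ trusty-pgdg main

The optional 9.4 list adds the same line with a trailing '9.4' component.
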
4315-@contextmanager
4316-def pgpass():
4317- passwords = {}
4318-
4319- # Replication.
4320- # pg_basebackup only works with the password in .pgpass, or entered
4321- # at the command prompt.
4322- if 'replication_password' in local_state:
4323- passwords['juju_replication'] = local_state['replication_password']
4324-
4325- pgpass_contents = '\n'.join(
4326- "*:*:*:{}:{}".format(username, password)
4327- for username, password in passwords.items())
4328- pgpass_file = NamedTemporaryFile()
4329- pgpass_file.write(pgpass_contents)
4330- pgpass_file.flush()
4331- os.chown(pgpass_file.name, getpwnam('postgres').pw_uid, -1)
4332- os.chmod(pgpass_file.name, 0o400)
4333- org_pgpassfile = os.environ.get('PGPASSFILE', None)
4334- os.environ['PGPASSFILE'] = pgpass_file.name
4335- try:
4336- yield pgpass_file.name
4337- finally:
4338- if org_pgpassfile is None:
4339- del os.environ['PGPASSFILE']
4340- else:
4341- os.environ['PGPASSFILE'] = org_pgpassfile
4342-
4343-
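A minimal usage sketch for the pgpass() context manager above: libpq tools such as pg_basebackup read credentials from the file named by $PGPASSFILE, so wrapping the call keeps the password off the command line. master_host and master_port are placeholders here; the flags mirror the pg_basebackup invocation used later in clone_database():

    # Sketch only: master_host and master_port are placeholders.
    with pgpass():
        subprocess.check_call([
            'sudo', '-E',  # -E so the postgres user sees PGPASSFILE
            '-u', 'postgres', 'pg_basebackup', '-D', '/tmp/clone',
            '--no-password', '-h', master_host, '-p', master_port,
            '--username=juju_replication'])
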
4344-def authorized_by(unit):
4345- '''Return True if the peer has authorized our database connections.'''
4346- for relid in hookenv.relation_ids('replication'):
4347- relation = hookenv.relation_get(unit=unit, rid=relid)
4348- authorized = relation.get('authorized', '').split()
4349- return hookenv.local_unit() in authorized
4350-
4351-
4352-def promote_database():
4353- '''Take the database out of recovery mode.'''
4354- config_data = hookenv.config()
4355- version = pg_version()
4356- cluster_name = config_data['cluster_name']
4357- postgresql_cluster_dir = os.path.join(
4358- postgresql_data_dir, version, cluster_name)
4359- recovery_conf = os.path.join(postgresql_cluster_dir, 'recovery.conf')
4360- if os.path.exists(recovery_conf):
4361- # Rather than using 'pg_ctl promote', we do the promotion
4362- # this way to avoid creating a timeline change. Switch this
4363- # to using 'pg_ctl promote' once PostgreSQL propagates
4364- # timeline changes via streaming replication.
4365- os.unlink(recovery_conf)
4366- postgresql_restart()
4367-
4368-
4369-def follow_database(master):
4370- '''Connect the database as a streaming replica of the master.'''
4371- master_relation = hookenv.relation_get(unit=master)
4372- create_recovery_conf(
4373- master_relation['private-address'],
4374- master_relation['port'], restart_on_change=True)
4375-
4376-
4377-def elected_master():
4378- """Return the unit that should be master, or None if we don't yet know."""
4379- if local_state['state'] == 'master':
4380- log("I am already the master", DEBUG)
4381- return hookenv.local_unit()
4382-
4383- if local_state['state'] == 'hot standby':
4384- log("I am already following {}".format(
4385- local_state['following']), DEBUG)
4386- return local_state['following']
4387-
4388- replication_relid = hookenv.relation_ids('replication')[0]
4389- replication_units = hookenv.related_units(replication_relid)
4390-
4391- if local_state['state'] == 'standalone':
4392- log("I'm a standalone unit wanting to participate in replication")
4393- existing_replication = False
4394- for unit in replication_units:
4395- # If another peer thinks it is the master, believe it.
4396- remote_state = hookenv.relation_get(
4397- 'state', unit, replication_relid)
4398- if remote_state == 'master':
4399- log("{} thinks it is the master, believing it".format(
4400- unit), DEBUG)
4401- return unit
4402-
4403- # If we find a peer that isn't standalone, we know
4404-            # replication has already been set up at some point.
4405- if remote_state != 'standalone':
4406- existing_replication = True
4407-
4408- # If we are joining a peer relation where replication has
4409-        # already been set up, but there is currently no master, wait
4410- # until one of the remaining participating units has been
4411- # promoted to master. Only they have the data we need to
4412- # preserve.
4413- if existing_replication:
4414- log("Peers participating in replication need to elect a master",
4415- DEBUG)
4416- return None
4417-
4418- # There are no peers claiming to be master, and there is no
4419- # election in progress, so lowest numbered unit wins.
4420- units = replication_units + [hookenv.local_unit()]
4421- master = unit_sorted(units)[0]
4422- if master == hookenv.local_unit():
4423- log("I'm Master - lowest numbered unit in new peer group")
4424- return master
4425- else:
4426- log("Waiting on {} to declare itself Master".format(master), DEBUG)
4427- return None
4428-
4429- if local_state['state'] == 'failover':
4430- former_master = local_state['following']
4431- log("Failover from {}".format(former_master))
4432-
4433- units_not_in_failover = set()
4434- candidates = set()
4435- for unit in replication_units:
4436- if unit == former_master:
4437- log("Found dying master {}".format(unit), DEBUG)
4438- continue
4439-
4440- relation = hookenv.relation_get(unit=unit, rid=replication_relid)
4441-
4442- if relation['state'] == 'master':
4443- log("{} says it already won the election".format(unit),
4444- INFO)
4445- return unit
4446-
4447- if relation['state'] == 'failover':
4448- candidates.add(unit)
4449-
4450- elif relation['state'] != 'standalone':
4451- units_not_in_failover.add(unit)
4452-
4453- if units_not_in_failover:
4454- log("{} unaware of impending election. Deferring result.".format(
4455- " ".join(unit_sorted(units_not_in_failover))))
4456- return None
4457-
4458- log("Election in progress")
4459- winner = None
4460- winning_offset = -1
4461- candidates.add(hookenv.local_unit())
4462- # Sort the unit lists so we get consistent results in a tie
4463- # and lowest unit number wins.
4464- for unit in unit_sorted(candidates):
4465- relation = hookenv.relation_get(unit=unit, rid=replication_relid)
4466- if int(relation['wal_received_offset']) > winning_offset:
4467- winner = unit
4468- winning_offset = int(relation['wal_received_offset'])
4469-
4470- # All remaining hot standbys are in failover mode and have
4471- # reported their wal_received_offset. We can declare victory.
4472- if winner == hookenv.local_unit():
4473- log("I won the election, announcing myself winner")
4474- return winner
4475- else:
4476- log("Waiting for {} to announce its victory".format(winner),
4477- DEBUG)
4478- return None
4479-
4480-
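The failover branch above boils down to a simple rule: the candidate with the most received WAL wins, with ties broken by lowest unit number. A distilled sketch, assuming every candidate has already published its wal_received_offset:

    def pick_failover_winner(offsets):
        '''offsets: dict mapping unit name to wal_received_offset (int).

        Iterating in unit_sorted() order and only replacing the leader
        on a strictly greater offset means ties go to the lowest
        numbered unit, matching the election loop above.
        '''
        winner, winning_offset = None, -1
        for unit in unit_sorted(offsets):
            if offsets[unit] > winning_offset:
                winner, winning_offset = unit, offsets[unit]
        return winner

    # pick_failover_winner({'postgresql/1': 100, 'postgresql/2': 100})
    # returns 'postgresql/1'.
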
4481-@hooks.hook('replication-relation-joined', 'replication-relation-changed')
4482-def replication_relation_joined_changed():
4483- config_changed() # Ensure minimal replication settings.
4484-
4485- # Now that pg_hba.conf has been regenerated and loaded, inform related
4486- # units that they have been granted replication access.
4487- authorized_units = set()
4488- for unit in hookenv.related_units():
4489- authorized_units.add(unit)
4490- local_state['authorized'] = authorized_units
4491-
4492- if hookenv.config('manual_replication'):
4493- log('manual_replication, nothing to do')
4494- return
4495-
4496- master = elected_master()
4497-
4498- # Handle state changes:
4499- # - Fresh install becoming the master
4500- # - Fresh install becoming a hot standby
4501- # - Hot standby being promoted to master
4502-
4503- if master is None:
4504- log("Master is not yet elected. Deferring.")
4505-
4506- elif master == hookenv.local_unit():
4507- if local_state['state'] != 'master':
4508- log("I have elected myself master")
4509- promote_database()
4510- if 'following' in local_state:
4511- del local_state['following']
4512- if 'wal_received_offset' in local_state:
4513- del local_state['wal_received_offset']
4514- if 'paused_at_failover' in local_state:
4515- del local_state['paused_at_failover']
4516- local_state['state'] = 'master'
4517-
4518- # Publish credentials to hot standbys so they can connect.
4519- replication_password = create_user(
4520- 'juju_replication', replication=True)
4521- local_state['replication_password'] = replication_password
4522- local_state['client_relations'] = ' '.join(
4523- hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'))
4524- local_state.publish()
4525-
4526- else:
4527- log("I am master and remain master")
4528-
4529- elif not authorized_by(master):
4530- log("I need to follow {} but am not yet authorized".format(master))
4531-
4532- else:
4533- log("Syncing replication_password from {}".format(master), DEBUG)
4534- local_state['replication_password'] = hookenv.relation_get(
4535- 'replication_password', master)
4536-
4537- if 'following' not in local_state:
4538- log("Fresh unit. I will clone {} and become a hot standby".format(
4539- master))
4540-
4541- master_ip = hookenv.relation_get('private-address', master)
4542- master_port = hookenv.relation_get('port', master)
4543- assert master_port is not None, 'No master port set'
4544-
4545- clone_database(master, master_ip, master_port)
4546-
4547- local_state['state'] = 'hot standby'
4548- local_state['following'] = master
4549- if 'wal_received_offset' in local_state:
4550- del local_state['wal_received_offset']
4551-
4552- elif local_state['following'] == master:
4553- log("I am a hot standby already following {}".format(master))
4554-
4555- # Replication connection details may have changed, so
4556- # ensure we are still following.
4557- follow_database(master)
4558-
4559- else:
4560- log("I am a hot standby following new master {}".format(master))
4561- follow_database(master)
4562- if not local_state.get("paused_at_failover", None):
4563- run_sql_as_postgres("SELECT pg_xlog_replay_resume()")
4564- local_state['state'] = 'hot standby'
4565- local_state['following'] = master
4566- del local_state['wal_received_offset']
4567- del local_state['paused_at_failover']
4568-
4569- publish_hot_standby_credentials()
4570- postgresql_hba = os.path.join(
4571- _get_postgresql_config_dir(), "pg_hba.conf")
4572- generate_postgresql_hba(postgresql_hba)
4573-
4574-    # The Swift container name may have changed, so regenerate the SwiftWAL
4575- # config. This can go away when we have real leader election and can
4576- # safely share a single container.
4577- create_swiftwal_config()
4578- create_wal_e_envdir()
4579-
4580- local_state.publish()
4581-
4582-
4583-def publish_hot_standby_credentials():
4584- '''
4585- If a hot standby joins a client relation before the master
4586- unit, it is unable to publish connection details. However,
4587-    when the master does join, it updates the client_relations
4588-    value in the peer relation, causing the replication-relation-changed
4589-    hook to be invoked. This gives us a second opportunity to publish
4590-    connection details.
4591-
4592-    This function is invoked from both the client and peer
4593-    relation-changed hooks. One of these will work, depending on the
4594-    order in which the master and hot standby joined the client relation.
4595- '''
4596- master = local_state['following']
4597- if not master:
4598- log("I will be a hot standby, but no master yet")
4599- return
4600-
4601- if not authorized_by(master):
4602- log("Master {} has not yet authorized us".format(master))
4603- return
4604-
4605- client_relations = hookenv.relation_get(
4606- 'client_relations', master, hookenv.relation_ids('replication')[0])
4607-
4608- if client_relations is None:
4609- log("Master {} has not yet joined any client relations".format(
4610- master), DEBUG)
4611- return
4612-
4613- # Build the set of client relations that both the master and this
4614- # unit have joined.
4615- possible_client_relations = set(hookenv.relation_ids('db') +
4616- hookenv.relation_ids('db-admin') +
4617- hookenv.relation_ids('master'))
4618- active_client_relations = possible_client_relations.intersection(
4619- set(client_relations.split()))
4620-
4621- for client_relation in active_client_relations:
4622- # We need to pull the credentials from the master unit's
4623- # end of the client relation.
4624- log('Hot standby republishing credentials from {} to {}'.format(
4625- master, client_relation))
4626-
4627- connection_settings = hookenv.relation_get(
4628- unit=master, rid=client_relation)
4629-
4630- # Override unit specific connection details
4631- connection_settings['host'] = hookenv.unit_private_ip()
4632- connection_settings['port'] = get_service_port()
4633- connection_settings['state'] = local_state['state']
4634- requested_db = hookenv.relation_get('database')
4635- # A hot standby might have seen a database name change before
4636- # the master, so override. This is no problem because we block
4637- # until this database has been created on the master and
4638- # replicated through to this unit.
4639- if requested_db:
4640- connection_settings['database'] = requested_db
4641-
4642-        # Block until the users and database have replicated, so we know
4643-        # the connection details we publish are actually valid. This will
4644-        # normally be pretty much instantaneous. Do not block if we are
4645-        # running in manual replication mode, as it is outside of juju's
4646-        # control when replication is actually set up and running.
4647- if not hookenv.config('manual_replication'):
4648- timeout = 60
4649- start = time.time()
4650- while time.time() < start + timeout:
4651- cur = db_cursor(autocommit=True)
4652-                cur.execute('SELECT datname FROM pg_database WHERE datname = %s', (connection_settings['database'],))
4653-                if cur.fetchone() is not None or connection_settings['database'] == 'all':
4654-                    break
4655- del cur
4656- log('Waiting for database {} to be replicated'.format(
4657- connection_settings['database']))
4658- time.sleep(10)
4659-
4660- log("Relation {} connection settings {!r}".format(
4661- client_relation, connection_settings), DEBUG)
4662- hookenv.relation_set(
4663- client_relation, relation_settings=connection_settings)
4664-
4665-
4666-@hooks.hook()
4667-def replication_relation_departed():
4668- '''A unit has left the replication peer group.'''
4669- remote_unit = hookenv.remote_unit()
4670-
4671- assert remote_unit is not None
4672-
4673- log("{} has left the peer group".format(remote_unit))
4674-
4675- # If we are the last unit standing, we become standalone
4676- remaining_peers = set(hookenv.related_units(hookenv.relation_id()))
4677- remaining_peers.discard(remote_unit) # Bug #1192433
4678-
4679- # True if we were following the departed unit.
4680- following_departed = (local_state.get('following', None) == remote_unit)
4681-
4682- if remaining_peers and not following_departed:
4683- log("Remaining {}".format(local_state['state']))
4684-
4685- elif remaining_peers and following_departed:
4686- # If the unit being removed was our master, prepare for failover.
4687- # We need to suspend replication to ensure that the replay point
4688- # remains consistent throughout the election, and publish that
4689- # replay point. Once all units have entered this steady state,
4690-        # we can identify the most up-to-date hot standby and promote it
4691- # to be the new master.
4692- log("Entering failover state")
4693- cur = db_cursor(autocommit=True)
4694- cur.execute("SELECT pg_is_xlog_replay_paused()")
4695- already_paused = cur.fetchone()[0]
4696- local_state["paused_at_failover"] = already_paused
4697- if not already_paused:
4698- cur.execute("SELECT pg_xlog_replay_pause()")
4699- # Switch to failover state. Don't cleanup the 'following'
4700- # setting because having access to the former master is still
4701- # useful.
4702- local_state['state'] = 'failover'
4703- local_state['wal_received_offset'] = postgresql_wal_received_offset()
4704-
4705- else:
4706- log("Last unit standing. Switching from {} to standalone.".format(
4707- local_state['state']))
4708- promote_database()
4709- local_state['state'] = 'standalone'
4710- if 'following' in local_state:
4711- del local_state['following']
4712- if 'wal_received_offset' in local_state:
4713- del local_state['wal_received_offset']
4714- if 'paused_at_failover' in local_state:
4715- del local_state['paused_at_failover']
4716-
4717- config_changed()
4718- local_state.publish()
4719-
4720-
4721-@hooks.hook()
4722-def replication_relation_broken():
4723- # This unit has been removed from the service.
4724- promote_database()
4725- config_changed()
4726-
4727-
4728-@contextmanager
4729-def switch_cwd(new_working_directory):
4730- org_dir = os.getcwd()
4731- os.chdir(new_working_directory)
4732- try:
4733- yield new_working_directory
4734- finally:
4735- os.chdir(org_dir)
4736-
4737-
4738-@contextmanager
4739-def restart_lock(unit, exclusive):
4740-    '''Acquire the database restart lock on the given unit.
4741-
4742- A database needing a restart should grab an exclusive lock before
4743- doing so. To block a remote database from doing a restart, grab a shared
4744- lock.
4745- '''
4746- key = long(hookenv.config('advisory_lock_restart_key'))
4747- if exclusive:
4748- lock_function = 'pg_advisory_lock'
4749- else:
4750- lock_function = 'pg_advisory_lock_shared'
4751- q = 'SELECT {}({})'.format(lock_function, key)
4752-
4753- # We will get an exception if the database is rebooted while waiting
4754- # for a shared lock. If the connection is killed, we retry a few
4755- # times to cope.
4756- num_retries = 3
4757-
4758- for count in range(0, num_retries):
4759- try:
4760- if unit == hookenv.local_unit():
4761- cur = db_cursor(autocommit=True)
4762- else:
4763- host = hookenv.relation_get('private-address', unit)
4764- port = hookenv.relation_get('port', unit)
4765- cur = db_cursor(
4766- autocommit=True, db='postgres', user='juju_replication',
4767- host=host, port=port)
4768- cur.execute(q)
4769- break
4770- except psycopg2.Error:
4771- if count == num_retries - 1:
4772- raise
4773-
4774- try:
4775- yield
4776- finally:
4777- # Close our connection, swallowing any exceptions as the database
4778- # may be being rebooted now we have released our lock.
4779- try:
4780- del cur
4781- except psycopg2.Error:
4782- pass
4783-
4784-
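PostgreSQL advisory locks are session scoped: they are held until explicitly released or the connection closes, which is why restart_lock() simply keeps the cursor alive for the duration of the block and drops it afterwards. A standalone sketch of the underlying mechanism (connection details assumed):

    import psycopg2

    conn = psycopg2.connect('dbname=postgres')  # connection details assumed
    conn.autocommit = True
    cur = conn.cursor()
    # Blocks until the exclusive lock on key 1234 is granted.
    cur.execute('SELECT pg_advisory_lock(%s)', (1234,))
    try:
        pass  # ... safe to restart; shared-lock holders are excluded ...
    finally:
        conn.close()  # session ends, lock is released
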
4785-def clone_database(master_unit, master_host, master_port):
4786- with restart_lock(master_unit, False):
4787- postgresql_stop()
4788- log("Cloning master {}".format(master_unit))
4789-
4790- config_data = hookenv.config()
4791- version = pg_version()
4792- cluster_name = config_data['cluster_name']
4793- postgresql_cluster_dir = os.path.join(
4794- postgresql_data_dir, version, cluster_name)
4795- postgresql_config_dir = _get_postgresql_config_dir(config_data)
4796- cmd = [
4797- 'sudo', '-E', # -E needed to locate pgpass file.
4798- '-u', 'postgres', 'pg_basebackup', '-D', postgresql_cluster_dir,
4799- '--xlog', '--checkpoint=fast', '--no-password',
4800- '-h', master_host, '-p', master_port,
4801- '--username=juju_replication']
4802- log(' '.join(cmd), DEBUG)
4803-
4804- if os.path.isdir(postgresql_cluster_dir):
4805- shutil.rmtree(postgresql_cluster_dir)
4806-
4807- try:
4808-            # Change to a directory the postgres user can read; the
4809-            # .pgpass file is needed too.
4810- with switch_cwd('/tmp'), pgpass():
4811- # Clone the master with pg_basebackup.
4812- output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
4813- log(output, DEBUG)
4814- # SSL certificates need to exist in the datadir.
4815- create_ssl_cert(postgresql_cluster_dir)
4816- create_recovery_conf(master_host, master_port)
4817- except subprocess.CalledProcessError as x:
4818-            # We failed, and this cluster is broken. Rebuild a
4819-            # working cluster so start/stop etc. work and we
4820-            # can retry hooks again. Even assuming the charm is
4821-            # functioning correctly, the clone may still fail,
4822-            # e.g. due to lack of disk space.
4823- log(x.output, ERROR)
4824- log("Clone failed, local db destroyed", ERROR)
4825- if os.path.exists(postgresql_cluster_dir):
4826- shutil.rmtree(postgresql_cluster_dir)
4827- if os.path.exists(postgresql_config_dir):
4828- shutil.rmtree(postgresql_config_dir)
4829- createcluster()
4830- config_changed()
4831- raise
4832- finally:
4833- postgresql_start()
4834- wait_for_db()
4835-
4836-
4837-def slave_count():
4838- num_slaves = 0
4839- for relid in hookenv.relation_ids('replication'):
4840- num_slaves += len(hookenv.related_units(relid))
4841- for relid in hookenv.relation_ids('master'):
4842- num_slaves += len(hookenv.related_units(relid))
4843- return num_slaves
4844-
4845-
4846-def postgresql_is_in_backup_mode():
4847- version = pg_version()
4848- cluster_name = hookenv.config('cluster_name')
4849- postgresql_cluster_dir = os.path.join(
4850- postgresql_data_dir, version, cluster_name)
4851-
4852- return os.path.exists(
4853- os.path.join(postgresql_cluster_dir, 'backup_label'))
4854-
4855-
4856-def pg_basebackup_is_running():
4857- cur = db_cursor(autocommit=True)
4858- cur.execute("""
4859- SELECT count(*) FROM pg_stat_activity
4860- WHERE usename='juju_replication' AND application_name='pg_basebackup'
4861- """)
4862- return cur.fetchone()[0] > 0
4863-
4864-
4865-def postgresql_wal_received_offset():
4866- """How much WAL we have.
4867-
4868- WAL is replicated asynchronously from the master to hot standbys.
4869- The more WAL a hot standby has received, the better a candidate it
4870- makes for master during failover.
4871-
4872- Note that this is not quite the same as how in sync the hot standby is.
4873- That depends on how much WAL has been replayed. WAL is replayed after
4874- it is received.
4875- """
4876- cur = db_cursor(autocommit=True)
4877- cur.execute('SELECT pg_is_in_recovery(), pg_last_xlog_receive_location()')
4878- is_in_recovery, xlog_received = cur.fetchone()
4879- if is_in_recovery:
4880- return wal_location_to_bytes(xlog_received)
4881- return None
4882-
4883-
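The received-versus-replayed distinction in the docstring can be checked directly on a hot standby; these function names exist in the 9.x releases this charm targets (they were renamed to pg_last_wal_* in PostgreSQL 10). A sketch using the file's db_cursor() helper:

    # Sketch: compare received vs replayed WAL on a hot standby.
    cur = db_cursor(autocommit=True)
    cur.execute('SELECT pg_last_xlog_receive_location(),'
                '       pg_last_xlog_replay_location()')
    received, replayed = cur.fetchone()
    # received is always at or ahead of replayed: WAL is applied
    # only after it has been received.
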
4884-def wal_location_to_bytes(wal_location):
4885-    """Convert a WAL location (logid/offset) to a byte count for comparison."""
4886- logid, offset = wal_location.split('/')
4887- return int(logid, 16) * 16 * 1024 * 1024 * 255 + int(offset, 16)
4888-
4889-
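The 255 in the formula reflects the historical (pre-9.3) xlog layout, where each logical WAL file held 255 usable 16 MB segments (0xFF000000 bytes), the last segment of each file being skipped. A worked example:

    # wal_location_to_bytes('2/10')
    #   = int('2', 16) * 16 * 1024 * 1024 * 255 + int('10', 16)
    #   = 2 * 4278190080 + 16
    #   = 8556380176
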
4890-def wait_for_db(
4891- timeout=120, db='postgres', user='postgres', host=None, port=None):
4892- '''Wait until the db is fully up.'''
4893- db_cursor(db=db, user=user, host=host, port=port, timeout=timeout)
4894-
4895-
4896-def unit_sorted(units):
4897- """Return a sorted list of unit names."""
4898- return sorted(
4899- units, lambda a, b: cmp(int(a.split('/')[-1]), int(b.split('/')[-1])))
4900-
4901-
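Note that both the cmp() builtin and comparison-function sorting are Python 2 only, consistent with the rest of this file; an equivalent under Python 3 would use a key function:

    # Python 3 equivalent (sketch):
    def unit_sorted(units):
        '''Return unit names sorted by their numeric suffix.'''
        return sorted(units, key=lambda u: int(u.split('/')[-1]))
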
4902-def delete_metrics_cronjob(cron_path):
4903- try:
4904- os.unlink(cron_path)
4905- except OSError:
4906- pass
4907-
4908-
4909-def write_metrics_cronjob(script_path, cron_path):
4910- config_data = hookenv.config()
4911-
4912- # need the following two configs to be valid
4913- metrics_target = config_data['metrics_target'].strip()
4914- metrics_sample_interval = config_data['metrics_sample_interval']
4915- if (not metrics_target
4916- or ':' not in metrics_target
4917- or not metrics_sample_interval):
4918- log("Required config not found or invalid "
4919- "(metrics_target, metrics_sample_interval), "
4920- "disabling statsd metrics", DEBUG)
4921- delete_metrics_cronjob(cron_path)
4922- return
4923-
4924- charm_dir = os.environ['CHARM_DIR']
4925- statsd_host, statsd_port = metrics_target.split(':', 1)
4926- metrics_prefix = config_data['metrics_prefix'].strip()
4927- metrics_prefix = metrics_prefix.replace(
4928- "$UNIT", hookenv.local_unit().replace('.', '-').replace('/', '-'))
4929-
4930- # ensure script installed
4931- charm_script = os.path.join(charm_dir, 'files', 'metrics',
4932- 'postgres_to_statsd.py')
4933- host.write_file(script_path, open(charm_script, 'rb').read(), perms=0755)
4934-
4935- # write the crontab
4936- with open(cron_path, 'w') as cronjob:
4937- cronjob.write(render_template("metrics_cronjob.template", {
4938- 'interval': config_data['metrics_sample_interval'],
4939- 'script': script_path,
4940- 'metrics_prefix': metrics_prefix,
4941- 'metrics_sample_interval': metrics_sample_interval,
4942- 'statsd_host': statsd_host,
4943- 'statsd_port': statsd_port,
4944- }))
4945-
4946-
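The $UNIT substitution above produces a statsd-safe metrics prefix from the Juju unit name; for example, with a hypothetical metrics_prefix of 'dev.$UNIT.postgresql':

    # With metrics_prefix = 'dev.$UNIT.postgresql' and unit 'postgresql/0':
    prefix = 'dev.$UNIT.postgresql'.replace(
        '$UNIT', 'postgresql/0'.replace('.', '-').replace('/', '-'))
    # prefix == 'dev.postgresql-0.postgresql'
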
4947-@hooks.hook('nrpe-external-master-relation-changed')
4948-def update_nrpe_checks():
4949- config_data = hookenv.config()
4950- try:
4951- nagios_uid = getpwnam('nagios').pw_uid
4952- nagios_gid = getgrnam('nagios').gr_gid
4953- except Exception:
4954- hookenv.log("Nagios user not set up.", hookenv.DEBUG)
4955- return
4956-
4957- try:
4958- nagios_password = create_user('nagios')
4959- pg_pass_entry = '*:*:*:nagios:%s' % (nagios_password)
4960- with open('/var/lib/nagios/.pgpass', 'w') as target:
4961- os.fchown(target.fileno(), nagios_uid, nagios_gid)
4962- os.fchmod(target.fileno(), 0400)
4963- target.write(pg_pass_entry)
4964- except psycopg2.InternalError:
4965- if config_data['manual_replication']:
4966- log("update_nrpe_checks(): manual_replication: "
4967- "ignoring psycopg2.InternalError caught creating 'nagios' "
4968- "postgres role; assuming we're already replicating")
4969- else:
4970- raise
4971-
4972- relids = hookenv.relation_ids('nrpe-external-master')
4973- relations = []
4974- for relid in relids:
4975- for unit in hookenv.related_units(relid):
4976- relations.append(hookenv.relation_get(unit=unit, rid=relid))
4977-
4978- if len(relations) == 1 and 'nagios_hostname' in relations[0]:
4979- nagios_hostname = relations[0]['nagios_hostname']
4980- log("update_nrpe_checks: Obtained nagios_hostname ({}) "
4981- "from nrpe-external-master relation.".format(nagios_hostname))
4982- else:
4983- unit = hookenv.local_unit()
4984- unit_name = unit.replace('/', '-')
4985- nagios_hostname = "%s-%s" % (config_data['nagios_context'], unit_name)
4986- log("update_nrpe_checks: Deduced nagios_hostname ({}) from charm "
4987- "config (nagios_hostname not found in nrpe-external-master "
4988- "relation, or wrong number of relations "
4989- "found)".format(nagios_hostname))
4990-
4991- nrpe_service_file = \
4992- '/var/lib/nagios/export/service__{}_check_pgsql.cfg'.format(
4993- nagios_hostname)
4994- nagios_logdir = '/var/log/nagios'
4995- if not os.path.exists(nagios_logdir):
4996- os.mkdir(nagios_logdir)
4997- os.chown(nagios_logdir, nagios_uid, nagios_gid)
4998- for f in os.listdir('/var/lib/nagios/export/'):
4999- if re.search('.*check_pgsql.cfg', f):
5000- os.remove(os.path.join('/var/lib/nagios/export/', f))
The diff has been truncated for viewing.
