Merge lp:~stub/charms/trusty/postgresql/rewrite into lp:charms/trusty/postgresql

Proposed by Stuart Bishop
Status: Merged
Merged at revision: 131
Proposed branch: lp:~stub/charms/trusty/postgresql/rewrite
Merge into: lp:charms/trusty/postgresql
Prerequisite: lp:~stub/charms/trusty/postgresql/rewrite-charmhelpers
Diff against target: 22814 lines (+17181/-4595)
67 files modified
.bzrignore (+3/-1)
Makefile (+64/-49)
README.md (+52/-100)
TODO (+0/-27)
actions.yaml (+11/-0)
actions/actions.py (+42/-7)
config.yaml (+327/-270)
copyright (+6/-7)
hooks/bootstrap.py (+57/-0)
hooks/client.py (+182/-0)
hooks/coordinator.py (+19/-0)
hooks/data-relation-changed (+23/-0)
hooks/data-relation-departed (+23/-0)
hooks/db-admin-relation-departed (+23/-0)
hooks/db-relation-departed (+23/-0)
hooks/decorators.py (+124/-0)
hooks/definitions.py (+86/-0)
hooks/helpers.py (+0/-197)
hooks/hooks.py (+0/-2820)
hooks/leader-elected (+23/-0)
hooks/leader-settings-changed (+23/-0)
hooks/local-monitors-relation-changed (+23/-0)
hooks/master-relation-changed (+23/-0)
hooks/master-relation-departed (+23/-0)
hooks/metrics.py (+64/-0)
hooks/nagios.py (+81/-0)
hooks/nrpe-external-master-relation-changed (+23/-0)
hooks/postgresql.py (+692/-0)
hooks/replication-relation-changed (+23/-0)
hooks/replication.py (+324/-0)
hooks/service.py (+930/-0)
hooks/start (+23/-0)
hooks/stop (+23/-0)
hooks/storage.py (+88/-0)
hooks/syslog-relation-departed (+23/-0)
hooks/syslogrel.py (+72/-0)
hooks/test_hooks.py (+0/-433)
hooks/upgrade.py (+121/-0)
hooks/wal_e.py (+129/-0)
lib/cache_settings.py (+44/-0)
lib/juju-deployer-wrapper.py (+32/-0)
lib/pg_settings_9.1.json (+2792/-0)
lib/pg_settings_9.2.json (+2858/-0)
lib/pg_settings_9.3.json (+2936/-0)
lib/pg_settings_9.4.json (+3050/-0)
lib/pgclient/metadata.yaml (+3/-4)
metadata.yaml (+44/-9)
templates/pg_backup_job.tmpl (+16/-23)
templates/pg_hba.conf.tmpl (+0/-29)
templates/pg_ident.conf.tmpl (+0/-3)
templates/postgres.cron.tmpl (+6/-6)
templates/postgresql.conf.tmpl (+0/-213)
templates/recovery.conf.tmpl (+1/-1)
templates/rsyslog_forward.conf (+2/-2)
templates/start_conf.tmpl (+0/-13)
templates/swiftwal.conf.tmpl (+0/-6)
testing/README (+0/-36)
testing/amuletfixture.py (+241/-0)
testing/jujufixture.py (+0/-297)
tests/00-setup.sh (+0/-15)
tests/01-lint.sh (+0/-3)
tests/02-unit-tests.sh (+0/-3)
tests/03-basic-amulet.py (+0/-19)
tests/obsolete.py (+2/-2)
tests/test_integration.py (+628/-0)
tests/test_postgresql.py (+711/-0)
tests/tests.yaml (+19/-0)
To merge this branch: bzr merge lp:~stub/charms/trusty/postgresql/rewrite
Reviewer Review Type Date Requested Status
Matt Bruzek (community) Approve
Review via email: mp+267646@code.launchpad.net

Description of the change

The PostgreSQL charm is one of the oldest around, crufty and unmaintainable. It's time for a rewrite to make use of modern Juju and improve its reliability.

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :
Revision history for this message
Stuart Bishop (stub) wrote :

They were passing on the new system being setup. I haven't been
watching the old runner.

--
Stuart Bishop <email address hidden>

Revision history for this message
Stuart Bishop (stub) wrote :

On 17 September 2015 at 14:49, Stuart Bishop
<email address hidden> wrote:
> They were passing on the new system being setup. I haven't been
> watching the old runner.

Looks like the green runs are lost in the mists of time. I'm not sure
what has changed in the month this has been waiting for review.

Revision history for this message
Stuart Bishop (stub) wrote :

Looking at http://data.vapour.ws/charm-test/charm-bundle-test-azure-651-all-machines-log,
it seems debugging output has been turned up to 11, and I think this
breaks 'juju wait'. Without 'juju wait', it is impossible to get the
tests running reliably.

It's hard to tell through all the noise, but I think this is where the
initial deploy actually terminates:

unit-postgresql-0[21563]: 2015-09-17 01:01:47 INFO
unit.postgresql/0.juju-log server.go:254 db-admin:2: *** End
'db-admin-relation-changed' hook

After this, we keep seeing noise about leadership leases, and other
stuff including the juju run commands that are checking that the logs
are quiet (thus shooting itself in the foot, because the only way to
check if the logs are quiet adds noise to the logs...).
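
(For illustration, a log-based wait is essentially the sketch below. The
file name and timings are assumptions and the real juju wait plugin is
more involved, but it shows why any background chatter stops the wait
from ever completing.)

    # Minimal sketch of a log-quiescence wait: consider the environment
    # settled once an aggregated log (eg. a fetched copy of
    # all-machines.log) has been silent for idle_secs. Leadership lease
    # noise, and the very 'juju run' used to fetch the log, keep
    # resetting the timer.
    import os
    import time

    def wait_for_quiet_logs(path='all-machines.log', idle_secs=60):
        last_size = os.path.getsize(path)
        last_change = time.time()
        while time.time() - last_change < idle_secs:
            time.sleep(10)
            size = os.path.getsize(path)
            if size != last_size:
                last_size, last_change = size, time.time()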

This never got picked up locally, as I don't even know how to turn on
this output and don't get it here (using juju stable).

Also see Bug #1496130, where we are fixing things so the OpenStack
mojo tests can use the wait algorithm.

What is confusing me though is that the Cassandra tests last passed
Sep 4th, and I would have expected them to have started failing much
earlier (it too relies on juju wait). What is even more confusing, the
last successful run at
http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-703
leads me to the AWS log at
http://data.vapour.ws/charm-test/charm-bundle-test-aws-591-all-machines-log
which contains a whole heap of logging information from other tests,
in addition to the Cassandra logs, which means test isolation was
busted and the results bogus.

Revision history for this message
Stuart Bishop (stub) wrote :

I see. Prior to ~Sep 4th, tests were often passing but would also
often fail with various provisioning errors. eg. This one is common,
indicating units took longer than 5 minutes to provision:

amulet.helpers.TimeoutError: public-address not set for all units after 300s

After ~Sep 4th, all integration tests started failing consistently
at 'juju wait', for both Cassandra and the PostgreSQL rewrite.

--
Stuart Bishop <email address hidden>

Revision history for this message
Stuart Bishop (stub) wrote :

I added Juju 1.24 specific behavior to the juju wait plugin, which avoids needing to sniff the logs. I've updated the package in ppa:stub/juju and I think I've queued a retest.
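
(Roughly, the 1.24 code path polls status instead of logs, as in the
sketch below. The agent-status field with its 'idle' value is what Juju
1.24 reports in status output; the real plugin handles many more edge
cases, so treat this as an illustration only.)

    # Sketch only: treat the environment as quiescent when every unit
    # agent reports an 'idle' agent-status in juju status output.
    import subprocess
    import yaml

    def all_agents_idle():
        out = subprocess.check_output(['juju', 'status', '--format=yaml'])
        status = yaml.safe_load(out)
        for service in (status.get('services') or {}).values():
            for unit in (service.get('units') or {}).values():
                if (unit.get('agent-status') or {}).get('current') != 'idle':
                    return False
        return True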

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

> They were passing on the new system being setup. I haven't been
> watching the old runner.
>
Sorry, I'm not familiar with it. Where is the new system (or output from it)?

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

> Looking at http://data.vapour.ws/charm-test/charm-bundle-test-azure-651-all-
> machines-log,
> it seems debugging output has been turned up to 11, and I think this
> breaks 'juju wait'. Without 'juju wait', it is impossible to get the
> tests running reliably

I'm not sure why this would be, unless the default log level in juju itself got turned up. CI is currently running 1.24.5.1.

> What is confusing me though is that the Cassandra tests last passed
> Sep 4th, and I would have expected them to have started failing much
> earlier (it too relies on juju wait). What is even more confusing, the
> last successful run at
> http://reports.vapour.ws/charm-test-details/charm-bundle-test-parent-703
> leads me to the AWS log at
> http://data.vapour.ws/charm-test/charm-bundle-test-aws-591-all-machines-log
> which contains a whole heap of logging information from other tests,
> in addition to the Cassandra logs, which means test isolation was
> busted and the results bogus.

Wow, that is bizarre. Thanks for pointing this out, I had not seen this before. I'm not convinced that test isolation is busted, but log isolation may be. I'll look into this.

Revision history for this message
Stuart Bishop (stub) wrote :

stub@aargh:~/charms/postgresql/rewrite$ make test
sudo add-apt-repository -y ppa:juju/stable
gpg: keyring `/tmp/tmppv9t1ria/secring.gpg' created
gpg: keyring `/tmp/tmppv9t1ria/pubring.gpg' created
gpg: requesting key C8068B11 from hkp server keyserver.ubuntu.com
gpg: /tmp/tmppv9t1ria/trustdb.gpg: trustdb created
gpg: key C8068B11: public key "Launchpad Ensemble PPA" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
OK
sudo add-apt-repository -y ppa:stub/juju
gpg: keyring `/tmp/tmpfkk5hwu_/secring.gpg' created
gpg: keyring `/tmp/tmpfkk5hwu_/pubring.gpg' created
gpg: requesting key E4FD7A7A from hkp server keyserver.ubuntu.com
gpg: /tmp/tmpfkk5hwu_/trustdb.gpg: trustdb created
gpg: key E4FD7A7A: public key "Launchpad Stub's Launchpad PPA" imported
gpg: Total number processed: 1
gpg: imported: 1 (RSA: 1)
OK
[...]
Reading package lists... Done
sudo apt-get install -y \
    python3-psycopg2 python3-nose python3-flake8 amulet \
    python3-jinja2 python3-yaml juju-wait bzr \
    python3-nose-cov python3-nose-timer python-swiftclient
Reading package lists... Done
Building dependency tree
Reading state information... Done
bzr is already the newest version.
python-swiftclient is already the newest version.
python3-flake8 is already the newest version.
python3-jinja2 is already the newest version.
python3-nose is already the newest version.
python3-psycopg2 is already the newest version.
python3-yaml is already the newest version.
python3-nose-cov is already the newest version.
python3-nose-timer is already the newest version.
amulet is already the newest version.
juju-wait is already the newest version.
0 to upgrade, 0 to newly install, 0 to remove and 0 not to upgrade.
Charm Proof
I: metadata name (postgresql) must match directory name (rewrite) exactly for local deployment.
Lint check (flake8)
user configuration: /home/stub/.config/flake8
directory hooks
checking hooks/bootstrap.py
checking hooks/client.py
checking hooks/coordinator.py
checking hooks/decorators.py
checking hooks/definitions.py
checking hooks/helpers.py
checking hooks/metrics.py
checking hooks/nagios.py
checking hooks/postgresql.py
checking hooks/replication.py
checking hooks/service.py
checking hooks/storage.py
checking hooks/syslogrel.py
checking hooks/upgrade.py
checking hooks/wal_e.py
directory actions
checking actions/actions.py
directory testing
checking testing/__init__.py
checking testing/amuletfixture.py
directory tests
checking tests/obsolete.py
checking tests/test_integration.py
checking tests/test_postgresql.py
nosetests3 -sv tests/test_postgresql.py --cover-package=bootstrap,nagios,wal_e,syslogrel,replication,definitions,storage,decorators,metrics,upgrade,postgresql,helpers,coordinator,client,service \
    --with-coverage --cover-branches
test_addr_to_range (test_postgresql.TestPostgresql) ... ok
test_connect (test_postgresql.TestPostgresql) ... ok
test_convert_unit (test_postgresql.TestPostgresql) ... ok
test_create_cluster (test_postgresql.TestPostgresql) ... ok
test_drop_cluster (test_postgresql.TestPostgresql) ... ok
test_ensure_database (test_postgresql.TestPostgresql) ... ok
test_ensure_extensions (test_...

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

@stub, which cloud was that on?

I still have no idea why it literally never passes on CI. It usually takes forever to run too, and often hangs and needs to be killed. I haven't had time to dig deeper yet but I'm hoping to soon.

Revision history for this message
Stuart Bishop (stub) wrote :

@tim It was just a different report, so the same clouds. I think I was getting confused with the Cassandra runs, which I was doing around the same time.

Previously I had been getting tests to pass, but not all the tests. There were always a few that would fail due to provisioning-type errors, an amulet error listing the contents of a remote directory, or a juju-deployer race.

Then around Sept 4th, something changed on your clouds and the juju logging changed. This broke 'juju wait', and thus all the PostgreSQL tests, and the Cassandra tests to boot. I've since been reworking the juju wait plugin to be less fragile with Juju 1.23 and earlier, and to use 'agent-status' with Juju 1.24 and later. This has got the Cassandra tests green again (and the lp:~stub/charms/trusty/cassandra/spike MP green too).

Revision history for this message
Stuart Bishop (stub) wrote :

This round is looking good. Tests at http://juju-ci.vapour.ws:8080/job/charm-bundle-test-joyent/729/console were running smoothly until the env.lock bug kicked in.

Revision history for this message
Tim Van Steenburgh (tvansteenburgh) wrote :

I just emailed the juju list about the env.lock problem. If my multiple $JUJU_HOME idea works, it'll be a quick fix. That particular problem is ruining a lot of otherwise green test runs across many charms and bundles.
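
(A minimal sketch of that idea: give each run a private $JUJU_HOME so
environments.yaml, caches and, crucially, env.lock are no longer shared
between runs. Paths and the wrapper itself are illustrative only.)

    # Hypothetical isolation wrapper for concurrent test runs.
    import os
    import subprocess
    import tempfile

    def run_isolated(cmd):
        env = dict(os.environ, JUJU_HOME=tempfile.mkdtemp(prefix='juju-home-'))
        subprocess.check_call(cmd, env=env)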

Revision history for this message
Stuart Bishop (stub) wrote :

The remaining failure seems genuine (PG93MultiTests.test_failover, which takes test_replication down with it). Given 3 nodes, dropping the master and adding a new node at the same time can cause a situation where there is no master. I have not been able to reproduce this locally. I need to trawl the logs from the CI system and try to reproduce it manually on a cloud.
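
(A hypothetical reproduction of the race, assuming postgresql/0 happens
to be the master; the actual master must be identified from the relation
or the logs:)

    import subprocess

    subprocess.check_call(['juju', 'deploy', '-n', '3', 'postgresql'])
    # ... wait for the service to settle, eg. with juju wait ...
    # Drop the master and add a new unit at the same time.
    subprocess.check_call(['juju', 'remove-unit', 'postgresql/0'])
    subprocess.check_call(['juju', 'add-unit', 'postgresql'])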

248. By Stuart Bishop

Work around Bug #1510000

Revision history for this message
Stuart Bishop (stub) wrote :

FWIW, this is still waiting for review so I can land it. Despite the failure of one of the tests on the CI system, this branch is still preferable to the existing charm store charm, which is known to be broken and has its tests disabled.

249. By Stuart Bishop

Add timings

250. By Stuart Bishop

Dependency for timestamps

251. By Stuart Bishop

Skip failover tests until Bug #1511659 can be dealt with

252. By Stuart Bishop

Update tests for Juju 1.25 unit naming changes

Revision history for this message
Matt Bruzek (mbruzek) wrote :

Hello Stuart,

Thanks for the enormous amount of work on this charm! This merge is a huge refactor, and a lot of files changed. From what I saw of the code and the extensive tests it looks great. I see the automated tests are passing on Joyent and Power 8! I also tested this on the KVM local provider and the results were PASS: 13 Total: 13 (7392.816871 sec).

Charm proof now returns no errors or warnings, which was not the case with the previous branch. Thanks for working with Tim and the team to get the automated tests passing. The tests are complex and very thorough. Passing automated tests are very important and will help this charm be of the highest quality in the future.

+1 LGTM

review: Approve

Preview Diff

=== modified file '.bzrignore'
--- .bzrignore	2014-10-14 10:12:50 +0000
+++ .bzrignore	2015-11-02 12:15:35 +0000
@@ -1,4 +1,6 @@
 _trial_temp
 hooks/_trial_temp
 hooks/local_state.pickle
-lib/test-client-charm/hooks/charmhelpers
+lib/pgclient/hooks/charmhelpers/
+coverage
+.coverage
=== modified file 'Makefile'
--- Makefile	2015-06-25 08:13:25 +0000
+++ Makefile	2015-11-02 12:15:35 +0000
@@ -1,63 +1,72 @@
 CHARM_DIR := $(shell pwd)
 TEST_TIMEOUT := 900
-SERIES := $(juju get-environment default-series)
+SERIES := $(shell juju get-environment default-series)
+HOST_SERIES := $(shell lsb_release -sc)
 
 default:
 	@echo "One of:"
-	@echo " make testdep"
+	@echo " make testdeps"
 	@echo " make lint"
-	@echo " make unit_test"
-	@echo " make integration_test"
-	@echo " make integration_test_91"
-	@echo " make integration_test_92"
-	@echo " make integration_test_93"
-	@echo " make integration_test_94"
-	@echo
-	@echo "There is no 'make test'"
-
-test_bot_tests:
-	@echo "Installing dependencies and running automatic-testrunner tests"
-	tests/00-setup.sh
-	tests/01-lint.sh
-	tests/02-unit-tests.sh
-	tests/03-basic-amulet.py
-
-testdep:
-	tests/00-setup.sh
-
-unit_test:
-	@echo "Unit tests of hooks"
-	cd hooks && trial test_hooks.py
-
-integration_test:
-	@echo "PostgreSQL integration tests, all non-beta versions, ${SERIES}"
-	trial test.PG91Tests
-	trial test.PG92Tests
-	trial test.PG93Tests
-	trial test.PG94Tests
-
-integration_test_91:
-	@echo "PostgreSQL 9.1 integration tests, ${SERIES}"
-	trial test.PG91Tests
-
-integration_test_92:
-	@echo "PostgreSQL 9.2 integration tests, ${SERIES}"
-	trial test.PG92Tests
-
-integration_test_93:
-	@echo "PostgreSQL 9.3 integration tests, ${SERIES}"
-	trial test.PG93Tests
-
-integration_test_94:
-	@echo "PostgreSQL 9.4 integration tests, ${SERIES}"
-	trial test.PG94Tests
+	@echo " make unittest"
+	@echo " make integration"
+	@echo " make coverage (opens browser)"
+
+test: testdeps lint unittest integration
+
+testdeps:
+	sudo add-apt-repository -y ppa:juju/stable
+	sudo add-apt-repository -y ppa:stub/juju
+	sudo apt-get update
+ifeq ($(HOST_SERIES),trusty)
+	sudo apt-get install -y \
+	    python3-psycopg2 python3-nose python3-flake8 amulet \
+	    python3-jinja2 python3-yaml juju-wait bzr python3-amulet \
+	    python-swiftclient moreutils
+else
+	sudo apt-get install -y \
+	    python3-psycopg2 python3-nose python3-flake8 amulet \
+	    python3-jinja2 python3-yaml juju-wait bzr python3-amulet \
+	    python3-nose-cov python3-nose-timer python-swiftclient moreutils
+endif
 
 lint:
+	@echo "Charm Proof"
+	@charm proof
 	@echo "Lint check (flake8)"
 	@flake8 -v \
-	    --exclude hooks/charmhelpers,hooks/_trial_temp \
 	    --ignore=E402 \
-	    hooks actions testing tests test.py
+	    --exclude=hooks/charmhelpers,__pycache__ \
+	    hooks actions testing tests
+
+_co=,
+_empty=
+_sp=$(_empty) $(_empty)
+
+TESTFILES=$(filter-out %/test_integration.py,$(wildcard tests/test_*.py))
+PACKAGES=$(subst $(_sp),$(_co),$(notdir $(basename $(wildcard hooks/*.py))))
+
+NOSE := nosetests3 -sv
+ifeq ($(HOST_SERIES),trusty)
+TIMING_NOSE := nosetests3 -sv
+else
+TIMING_NOSE := nosetests3 -sv --with-timer
+endif
+
+unittest:
+	${NOSE} ${TESTFILES} --cover-package=${PACKAGES} \
+	    --with-coverage --cover-branches
+	@echo OK: Unit tests pass `date`
+
+coverage:
+	${NOSE} ${TESTFILES} --cover-package=${PACKAGES} \
+	    --with-coverage --cover-branches \
+	    --cover-erase --cover-html --cover-html-dir=coverage \
+	    --cover-min-percentage=100 || \
+	    (gnome-open coverage/index.html; false)
+
+integration:
+	${TIMING_NOSE} tests/test_integration.py 2>&1 | ts
+	@echo OK: Integration tests pass `date`
 
 sync:
 	@bzr cat \
@@ -65,3 +74,9 @@
 	    > .charm_helpers_sync.py
 	@python .charm_helpers_sync.py -c charm-helpers.yaml
 	@rm .charm_helpers_sync.py
+
+
+# These targets are to separate the test output in the Charm CI system
+test_integration.py%:
+	${TIMING_NOSE} tests/$@ 2>&1 | ts
+	@echo OK: $@ tests pass `date`
=== modified file 'README.md'
--- README.md	2015-06-22 18:40:20 +0000
+++ README.md	2015-11-02 12:15:35 +0000
@@ -26,14 +26,8 @@
 
 # Usage
 
-This charm supports several deployment models:
-
- - A single service containing one unit. This provides a 'standalone'
-   environment.
-
- - A service containing multiple units. One unit will be a 'master', and every
-   other unit is a 'hot standby'. The charm sets up and maintains replication
-   for you, using standard PostgreSQL streaming replication.
+This charm can deploy a single standalone PostgreSQL unit, or a service
+containing a single master unit and one or more replicas.
 
 To setup a single 'standalone' service::
 
@@ -42,21 +36,18 @@
 
 ## Scale Out Usage
 
-To replicate this 'standalone' database to a 'hot standby', turning the
-existing unit into a 'master'::
+To add a replica to an existing service::
 
     juju add-unit pg-a
 
-To deploy a new service containing a 'master' and two 'hot standbys'::
+To deploy a new service containing a master and two hot standby replicas::
 
-    juju deploy -n 2 postgresql pg-b
-    [ ... wait until units are stable ... ]
-    juju add-unit pg-b
+    juju deploy -n 3 postgresql pg-b
 
 You can remove units as normal. If the master unit is removed, failover occurs
-and the most up to date 'hot standby' is promoted to 'master'. The
-'db-relation-changed' and 'db-admin-relation-changed' hooks are fired, letting
-clients adjust::
+and the most up to date hot standby is promoted to the master. The
+'db-relation-changed' and 'db-admin-relation-changed' hooks are fired,
+letting clients adjust::
 
     juju remove-unit pg-b/0
 
@@ -74,12 +65,6 @@
 
 ## Known Limitations and Issues
 
-⚠ Due to current [limitations][1] with juju, you cannot reliably
-create a service initially containing more than 2 units (eg. juju deploy
--n 3 postgresql). Instead, you must first create a service with 2 units.
-Once the environment is stable and all the hooks have finished running,
-you may add more units.
-
 ⚠ Do not attempt to relate client charms to a PostgreSQL service containing
 multiple units unless you know the charm supports a replicated service.
 
@@ -106,20 +91,23 @@
 practice with your database permissions will make your life difficult
 when you need to recover after failure.
 
-_Always_ set the 'roles' relationship setting when joining a
-relationship. _Always_ grant permissions to database roles for _all_
-database objects your charm creates. _Never_ rely on access permissions
-given directly to a user, either explicitly or implicitly (such as being
-the user who created a table). Consider the users you are provided by
-the PostgreSQL charm as ephemeral. Any rights granted directly to them
-will be lost if relations are recreated, as the generated usernames will
-be different. _If you don't follow this advice, you will need to
-manually repair permissions on all your database objects after any of
-the available recovery mechanisms._
+PostgreSQL has comprehensive database security, including ownership
+and permissions on database objects. By default, any objects a client
+service creates will be owned by a user with the same name as the
+client service and inaccessible to other users. To share data, it
+is best to create new roles, grant the relevant permissions and object
+ownership to the new roles, and finally grant these roles to the users
+your services can connect as. This also makes disaster recovery easier.
+If you restore a database into an identical Juju environment, then
+the service names and usernames will be the same and database permissions
+will match. However, if you restore a database into an environment
+with different client service names then the usernames will not match
+and the new users will not have access to your data.
 
 Learn about the SQL `GRANT` statement in the excellent [PostgreSQL
 reference guide][3].
 
+
 ### block-storage-broker
 
 If you are using external storage provided by the block storage broker,
@@ -163,10 +151,9 @@
 will create it if necessary. If your charm sets this, then it must wait
 until a matching `database` value is presented on the PostgreSQL side of
 the relation (ie. `relation-get database` returns the value you set).
-- `roles`: Optional. A comma separated list of database roles to grant the
+- `roles`: Deprecated. A comma separated list of database roles to grant the
   database user. Typically these roles will have been granted permissions to
-  access the tables and other database objects. Do not grant permissions
-  directly to juju generated database users, as the charm may revoke them.
+  access the tables and other database objects.
- `extensions`: Optional. A comma separated list of required postgresql
   extensions.
 
@@ -210,28 +197,26 @@
 A PostgreSQL service may contain multiple units (a single master, and
 optionally one or more hot standbys). The client charm can tell which
 unit in a relation is the master and which are hot standbys by
-inspecting the 'state' property on the relation, and it needs to be
-aware of how many units are in the relation by using the 'relation-list'
-hook tool.
-
-If there is a single PostgreSQL unit related, the state will be
-'standalone'. All database connections of course go to this unit.
-
-If there is more than one PostgreSQL unit related, the client charm
-must only use units with state set to 'master' or 'hot standby'.
-The unit with 'master' state can accept read and write connections. The
-units with 'hot standby' state can accept read-only connections, and
-any attempted writes will fail. Units with any other state must not be
-used and should be ignored ('standalone' units are new units joining the
-service that are not yet setup, and 'failover' state will occur when the
-master unit is being shutdown and a new master is being elected).
+inspecting the 'state' property on the relation.
+
+The 'standalone' state is deprecated, and when a unit advertises itself
+as 'standalone' you should treat it like a 'master'. The state only exists
+for backwards compatibility and will be removed soon.
+
+Writes must go to the unit identifying itself as 'master' or 'standalone'.
+If you send writes to a 'hot standby', they will fail.
+
+Reads may go to any unit. Ideally they should be load balanced over
+the 'hot standby' units. If you need to ensure consistency, you may
+need to read from the 'master'.
+
+Units in any other state, including no state, should not be used and
+connections to them will likely fail. These units may still be setting
+up, or performing a maintenance operation such as a failover.
 
 The client charm needs to watch for state changes in its
-relation-changed hook. New units may be added to a single unit service,
-and the client charm must stop using existing 'standalone' unit and wait
-for 'master' and 'hot standby' units to appear. Units may be removed,
-possibly causing a 'hot standby' unit to be promoted to a master, or
-even having the service revert to a single 'standalone' unit.
+relation-changed hook. During failover, one of the existing 'hot standby'
+units will change into a 'master'.
 
 
 ## Example client hooks
@@ -249,7 +234,6 @@
     @hook
     def db_relation_joined():
         relation_set('database', config('database'))  # Explicit database name
-        relation_set('roles', 'reporting,standard')  # DB roles required
 
     @hook('db-relation-changed', 'db-relation-departed')
     def db_relation_changed():
@@ -271,9 +255,7 @@
         conn_str = conn_str_tmpl.format(**relation_get(unit=db_unit)
         remote_state = relation_get('state', db_unit)
 
-        if remote_state == 'standalone' and len(active_db_units) == 1:
-            master_conn_str = conn_str
-        elif relation_state == 'master':
+        if remote_state in ('master', 'standalone'):
             master_conn_str = conn_str
         elif relation_state == 'hot standby':
             slave_conn_strs.append(conn_str)
@@ -284,46 +266,17 @@
     hooks.execute(sys.argv)
 
 
-## Upgrade-charm hook notes
-
-The PostgreSQL charm has deprecated volume-map and volume-ephemeral-storage
-configuration options in favor of using the storage subordinate charm for
-general external storage management. If the installation being upgraded is
-using these deprecated options, there are a couple of manual steps necessary
-to finish migration and continue using the current external volumes.
-Even though all data will remain intact, and PostgreSQL service will continue
-running, the upgrade-charm hook will intentionally fail and exit 1 as well to
-raise awareness of the manual procedure which will also be documented in the
-juju logs on the PostgreSQL units.
-
-The following steps must be additionally performed to continue using external
-volume maps for the PostgreSQL units once juju upgrade-charm is run from the
-command line:
-  1. cat > storage.cfg <<EOF
-     storage:
-       provider: block-storage-broker
-       root: /srv/data
-       volume_map: "{postgresql/0: your-vol-id, postgresql/1: your-2nd-vol-id}"
-     EOF
-  2. juju deploy --config storage.cfg storage
-  3. juju deploy block-storage-broker
-  4. juju add-relation block-storage-broker storage
-  5. juju resolved --retry postgresql/0  # for each postgresql unit running
-  6. juju add-relation postgresql storage
-
-
 # Point In Time Recovery
 
-The PostgreSQL charm has experimental support for log shipping and point
-in time recovery. This feature uses the wal-e[2] tool, and requires access
-to Amazon S3, Microsoft Azure Block Storage or Swift. This feature is
-flagged as experimental because it has only been tested with Swift, and
-not yet been tested under load. It also may require some API changes,
-particularly on how authentication credentials are accessed when a standard
-emerges. The charm can be configured to perform regular filesystem backups
-and ship WAL files to the object store. Hot standbys will make use of the
-archived WAL files, allowing them to resync after extended netsplits or
-even let you turn off streaming replication entirely.
+The PostgreSQL charm has support for log shipping and point in time
+recovery. This feature uses the wal-e[2] tool, which will be
+installed from the Launchpad PPA ppa:stub/pgcharm. This feature
+requires access to either Amazon S3, Microsoft Azure Block Storage or
+Swift. This feature is experimental because it has only been tested with
+Swift. The charm can be configured to perform regular filesystem backups
+and ship WAL files to the object store. Hot standbys will make use of
+the archived WAL files, allowing them to resync after extended netsplits
+or even let you turn off streaming replication entirely.
 
 With a base backup and the WAL archive you can perform point in time
 recovery, but this is still a manual process and the charm does not
@@ -336,8 +289,7 @@
 service.
 
 To enable the experimental wal-e support with Swift, you will need to
-use Ubuntu 14.04 (Trusty), and set the service configuration settings
-similar to the following::
+set the service configuration settings similar to the following::
 
     postgresql:
         wal_e_storage_uri: swift://mycontainer
=== removed file 'TODO'
--- TODO	2013-02-20 13:42:08 +0000
+++ TODO	1970-01-01 00:00:00 +0000
@@ -1,27 +0,0 @@
-- Fix "run" function in hooks.py to log error rather than exiting silently
-  (exceptions emitted at INFO level)
-
-- Pick a better way to get machine specs than hacky bash functions - facter?
-
-- Specify usernames when adding a relation, rather than generating them
-  automagically. This will interact better with connection poolers.
-
-- If we have three services deployed, related via master/slave, then the
-  repmgr command line tools only work on the service containing the master
-  unit, as the slave-only services have neither ssh nor postgresql access
-  to each other. This may only be a problem with master/slave relationships
-  between services and failover, and it is unclear how that would work
-  anyway (get failover working fist for units within a service).
-
-- Drop config_change_command from config.yaml, or make it work.
-  "restart" should be the default, as there may be required config changes
-  to get replication working. "reload" would mean blocking a hook requesting
-  config changes requireing a restart, emitting warnings so the admin can
-  do the restart manually. Is this good juju? An admin can already control
-  when restarts happen by when they change the juju configuration, and can
-  avoid unnecessary restarts when adding replicas by ensuring required
-  configuration changes are made before hand.
-
-- No hook is invoked when removing a unit, leaving the host standby PostgreSQL
-  cluster running and attempting to replicate from a master that is refusing
-  its connections. Bug #872264.
=== modified file 'actions.yaml'
--- actions.yaml	2015-06-25 11:35:33 +0000
+++ actions.yaml	2015-11-02 12:15:35 +0000
@@ -2,3 +2,14 @@
   description: Pause replication replay on a hot standby unit.
 replication-resume:
   description: Resume replication replay on a hot standby unit.
+
+# Revisit this when actions are more mature. Per Bug #1483525, it seems
+# impossible to return filenames in our results.
+# backup:
+#   description: Run backups
+#   params:
+#     type:
+#       type: string
+#       enum: [dump]
+#       description: Type of backup. Currently only 'dump' supported.
+#       default: dump
=== modified file 'actions/actions.py'
--- actions/actions.py	2015-06-25 11:35:33 +0000
+++ actions/actions.py	2015-11-02 12:15:35 +0000
@@ -25,17 +25,21 @@
 sys.path.append(hooks_dir)
 
 from charmhelpers.core import hookenv
-import hooks
+
+import postgresql
 
 
 def replication_pause(params):
-    offset = hooks.postgresql_wal_received_offset()
-    if offset is None:
+    if not postgresql.is_secondary():
         hookenv.action_fail('Not a hot standby')
         return
+
+    offset = postgresql.wal_received_offset()
     hookenv.action_set(dict(offset=offset))
 
-    cur = hooks.db_cursor(autocommit=True)
+    con = postgresql.connect()
+    con.autocommit = True
+    cur = con.cursor()
     cur.execute('SELECT pg_is_xlog_replay_paused()')
     if cur.fetchone()[0] is True:
         hookenv.action_fail('Already paused')
@@ -45,13 +49,16 @@
 
 
 def replication_resume(params):
-    offset = hooks.postgresql_wal_received_offset()
-    if offset is None:
+    if not postgresql.is_secondary():
         hookenv.action_fail('Not a hot standby')
         return
+
+    offset = postgresql.wal_received_offset()
     hookenv.action_set(dict(offset=offset))
 
-    cur = hooks.db_cursor(autocommit=True)
+    con = postgresql.connect()
+    con.autocommit = True
+    cur = con.cursor()
     cur.execute('SELECT pg_is_xlog_replay_paused()')
     if cur.fetchone()[0] is False:
         hookenv.action_fail('Already resumed')
@@ -60,6 +67,34 @@
     hookenv.action_set(dict(result='Resumed'))
 
 
+# Revisit this when actions are more mature. Per Bug #1483525, it seems
+# impossible to return filenames in our results.
+#
+# def backup(params):
+#     assert params['type'] == 'dump'
+#     script = os.path.join(helpers.scripts_dir(), 'pg_backup_job')
+#     cmd = ['sudo', '-u', 'postgres', '-H', script, str(params['retention'])]
+#     hookenv.action_set(dict(command=' '.join(cmd)))
+#
+#     try:
+#         subprocess.check_call(cmd)
+#     except subprocess.CalledProcessError as x:
+#         hookenv.action_fail(str(x))
+#         return
+#
+#     backups = {}
+#     for filename in os.listdir(backups_dir):
+#         path = os.path.join(backups_dir, filename)
+#         if not is.path.isfile(path):
+#             continue
+#         backups['{}.{}'.format(filename
+#         backups[filename] = dict(name=filename,
+#                                  size=os.path.getsize(path),
+#                                  path=path,
+#                                  scp_path='{}:{}'.format(unit, path))
+#     hookenv.action_set(dict(backups=backups))
+
+
 def main(argv):
     action = os.path.basename(argv[0])
     params = hookenv.action_get()
 
=== modified file 'config.yaml'
--- config.yaml	2015-07-10 17:05:44 +0000
+++ config.yaml	2015-11-02 12:15:35 +0000
@@ -3,12 +3,12 @@
   default: ""
   type: string
   description: >
-    A comma-separated list of IP Addresses (or single IP) admin tools like
-    pgAdmin3 will connect from, this is most useful for developers running
-    juju in local mode who need to connect tools like pgAdmin to a postgres.
-    The IP addresses added here will be included in the pg_hba.conf file
-    allowing ip connections to all databases on the server from the given
-    using md5 password encryption.
+    A comma-separated list of IP Addresses (or single IP) admin tools
+    like pgAdmin3 will connect from. The IP addresses added here will
+    be included in the pg_hba.conf file allowing ip connections to all
+    databases on the server from the given IP addresses using md5
+    password encryption. IP address ranges are also supported, using
+    the standard format described in the PostgreSQL reference guide.
 locale:
   default: "C"
   type: string
@@ -22,53 +22,294 @@
   description: >
     Default encoding used to store text in this service. Can only be
     set when deploying the first unit of a service.
-extra-packages:
+relation_database_privileges:
+  default: "ALL"
+  type: string
+  description: >
+    A comma-separated list of database privileges to grant to relation
+    users on their databases. The defaults allow connecting to the
+    database (CONNECT), creating objects such as tables (CREATE), and
+    creating temporary tables (TEMPORARY). Client charms that create
+    objects in the database are responsible for granting suitable
+    access on those objects to other roles and users (or PUBLIC) using
+    standard GRANT statements.
+extra_packages:
   default: ""
   type: string
-  description: Extra packages to install on the postgresql service units.
+  description: >
+    Space separated list of extra packages to install.
 dumpfile_location:
   default: "None"
   type: string
   description: >
     Path to a dumpfile to load into DB when service is initiated.
 version:
-  default: null
+  default: ""
   type: string
   description: >
     Version of PostgreSQL that we want to install. Supported versions
-    are "9.1", "9.2", "9.3". The default version for the deployed Ubuntu
-    release is used when the version is not specified.
-cluster_name:
-  default: "main"
-  type: string
-  description: Name of the cluster we want to install the DBs into
-listen_ip:
-  default: "*"
-  type: string
-  description: IP to listen on
+    are "9.1", "9.2", "9.3" & "9.4". The default version for the
+    deployed Ubuntu release is used when the version is not specified.
+extra_pg_conf:
+  # The defaults here match the defaults chosen by the charm,
+  # so removing them will not change them. They are listed
+  # as documentation. The charm actually loads the non-calculated
+  # defaults from this config.yaml file to make it unlikely it will
+  # get out of sync with reality.
+  default: |
+    # Additional service specific postgresql.conf settings.
+    listen_addresses='*'
+    max_connections=100
+    ssl=true
+    log_timezone=UTC
+    log_checkpoints=true
+    log_connections=true
+    log_disconnections=true
+    log_autovacuum_min_duration=-1
+    log_line_prefix='%t [%p]: [%l-1] db=%d,user=%u '
+    archive_mode=true
+    archive_command='/bin/true'
+    hot_standby=true
+    max_wal_senders=80
+    # max_wal_senders=num_units * 2 + 5
+    # wal_level=hot_standby (<9.4) or logical (>=9.4)
+    # shared_buffers=total_ram*0.25
+    # effective_cache_size=total_ram*0.75
+    default_statistics_target=250
+    from_collapse_limit=16
+    join_collapse_limit=16
+    wal_buffers=-1
+    checkpoint_completion_target=0.9
+    password_encryption=true
+    max_connections=100
+  type: string
+  description: >
+    postgresql.conf settings, one per line in standard key=value
+    PostgreSQL format. These settings will generally override
+    any values selected by the charm. The charm however will
+    attempt to ensure minimum requirements for the charm's
+    operation are met.
+extra_pg_auth:
+  type: string
+  default: ""
+  description: >
+    A comma separated list of extra pg_hba.conf auth rules.
+    This will be written to the pg_hba.conf file, one line per rule.
+    Note that this should not be needed as db relations already create
+    those rules the right way. Only use this if you really need to
+    (e.g. on a development environment), or are connecting juju managed
+    databases to external managed systems, or configuring replication
+    between unrelated PostgreSQL services using the manual_replication
+    option.
+performance_tuning:
+  default: "Mixed"
+  type: string
+  description: >
+    DEPRECATED AND IGNORED. The pgtune project has been abandoned
+    and the packages dropped from Debian and Ubuntu. The charm
+    still performs some basic tuning, which users can tweak using
+    extra_pg_conf.
+manual_replication:
+  type: boolean
+  default: False
+  description: >
+    Enable or disable charm managed replication. When manual_replication
+    is True, the operator is responsible for maintaining recovery.conf
+    and performing any necessary database mirroring. The charm will
+    still advertise the unit as standalone, master or hot standby to
+    relations based on whether the system is in recovery mode or not.
+    Note that this option makes it possible to create a PostgreSQL
+    service with multiple master units, which is a very silly thing
+    to do unless you are also using multi-master software like BDR.
+backup_schedule:
+  default: "13 4 * * *"
+  type: string
+  description: Cron-formatted schedule for database backups.
+backup_retention_count:
+  default: 7
+  type: int
+  description: Number of recent backups to retain.
+nagios_context:
+  default: "juju"
+  type: string
+  description: >
+    Used by the nrpe subordinate charms.
+    A string that will be prepended to instance name to set the host name
+    in nagios. So for instance the hostname would be something like:
+        juju-postgresql-0
+    If you're running multiple environments with the same services in them
+    this allows you to differentiate between them.
+nagios_servicegroups:
+  default: ""
+  type: string
+  description: >
+    A comma-separated list of nagios servicegroups.
+    If left empty, the nagios_context will be used as the servicegroup
+pgdg:
+  description: >
+    Enable the PostgreSQL Global Development Group APT repository
+    (https://wiki.postgresql.org/wiki/Apt). This package source provides
+    official PostgreSQL packages for Ubuntu LTS releases beyond those
+    provided by the main Ubuntu archive.
+  type: boolean
+  default: false
+install_sources:
+  description: >
+    List of extra package sources, per charm-helpers standard.
+    YAML format.
+  type: string
+  default: ""
+install_keys:
+  description: >
+    List of signing keys for install_sources package sources, per
+    charmhelpers standard. YAML format.
+  type: string
+  default: ""
+wal_e_storage_uri:
+  type: string
+  default: ""
+  description: |
+    EXPERIMENTAL.
+    Specify storage to be used by WAL-E. Every PostgreSQL service must use
+    a unique URI. Backups will be unrecoverable if it is not unique. The
+    URI's scheme must be one of 'swift' (OpenStack Swift), 's3' (Amazon AWS)
+    or 'wabs' (Windows Azure). For example:
+        'swift://some-container/directory/or/whatever'
+        's3://some-bucket/directory/or/whatever'
+        'wabs://some-bucket/directory/or/whatever'
+    Setting the wal_e_storage_uri enables regular WAL-E filesystem level
+    backups (per wal_e_backup_schedule), and log shipping to the configured
+    storage. Point-in-time recovery becomes possible, as is disabling the
+    streaming_replication configuration item and relying solely on
+    log shipping for replication.
+wal_e_backup_schedule:
+  type: string
+  default: "13 0 * * *"
+  description: >
+    EXPERIMENTAL.
+    Cron-formatted schedule for WAL-E database backups. If
+    wal_e_backup_schedule is unset, WAL files will never be removed from
+    WAL-E storage.
+wal_e_backup_retention:
+  type: int
+  default: 2
+  description: >
+    EXPERIMENTAL.
+    Number of recent base backups and WAL files to retain.
+    You need enough space for this many backups plus one more, as
+    an old backup will only be removed after a new one has been
+    successfully made to replace it.
+streaming_replication:
+  type: boolean
+  default: true
+  description: >
+    Enable streaming replication. Normally, streaming replication is
+    always used, and any log shipping configured is used as a fallback.
+    Turning this off without configuring log shipping is an error.
+os_username:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. OpenStack Swift username.
+os_password:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. OpenStack Swift password.
+os_auth_url:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. OpenStack Swift authentication URL.
+os_tenant_name:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. OpenStack Swift tenant name.
+aws_access_key_id:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. Amazon AWS access key id.
+aws_secret_access_key:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. Amazon AWS secret access key.
+wabs_account_name:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. Windows Azure account name.
+wabs_access_key:
+  type: string
+  default: ""
+  description: EXPERIMENTAL. Windows Azure access key.
+package_status:
+  default: "install"
+  type: string
+  description: >
+    The status of service-affecting packages will be set to this
+    value in the dpkg database. Useful valid values are "install"
+    and "hold".
+# statsd-compatible metrics
+metrics_target:
+  default: ""
+  type: string
+  description: >
+    Destination for statsd-format metrics, format "host:port". If
+    not present and valid, metrics disabled.
+metrics_prefix:
+  default: "dev.$UNIT.postgresql"
+  type: string
+  description: >
+    Prefix for metrics. Special value $UNIT can be used to include the
+    name of the unit in the prefix.
+metrics_sample_interval:
+  default: 5
+  type: int
+  description: Period for metrics cron job to run in minutes
+
+
+# DEPRECATED SETTINGS.
+# Remove them one day. They remain here to avoid making existing
+# configurations fail.
+advisory_lock_restart_key:
+  default: 765
+  type: int
+  description: >
+    DEPRECATED AND IGNORED.
+    An advisory lock key used internally by the charm. You do not need
+    to change it unless it happens to conflict with an advisory lock key
+    being used by your applications.
+extra-packages:
+  default: ""
+  type: string
+  description: DEPRECATED. Use extra_packages.
 listen_port:
-  default: null
+  default: -1
   type: int
-  description: Port to listen on. Default is automatically assigned.
+  description: >
+    DEPRECATED. Use extra_pg_conf.
+    Port to listen on. Default is automatically assigned.
 max_connections:
   default: 100
   type: int
-  description: Maximum number of connections to allow to the PG database
+  description: >
+    DEPRECATED. Use extra_pg_conf.
+    Maximum number of connections to allow to the PG database
 max_prepared_transactions:
   default: 0
   type: int
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Maximum number of prepared two phase commit transactions, waiting
     to be committed. Defaults to 0. as using two phase commit without
     a process to monitor and resolve lost transactions is dangerous.
 ssl:
   default: "True"
   type: string
-  description: Whether PostgreSQL should talk SSL
+  description: >
+    DEPRECATED. Use extra_pg_conf.
+    Whether PostgreSQL should talk SSL
 log_min_duration_statement:
   default: -1
   type: int
   description: >
+    DEPRECATED. Use extra_pg_conf.
     -1 is disabled, 0 logs all statements
     and their durations, > 0 logs only
     statements running at least this number
@@ -76,19 +317,21 @@
 log_checkpoints:
   default: False
   type: boolean
-  description: Log checkpoints
+  description: >
+    DEPRECATED. Use extra_pg_conf.
 log_connections:
   default: False
   type: boolean
-  description: Log connections
+  description: DEPRECATED. Use extra_pg_conf.
 log_disconnections:
   default: False
   type: boolean
-  description: Log disconnections
+  description: DEPRECATED. Use extra_pg_conf.
 log_temp_files:
   default: "-1"
   type: string
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Log creation of temporary files larger than the threshold.
     -1 disables the feature, 0 logs all temporary files, or specify
     the threshold size with an optional unit (eg. "512KB", default
@@ -97,6 +340,7 @@
   default: "%t [%p]: [%l-1] db=%d,user=%u "
   type: string
   description: |
+    DEPRECATED. Use extra_pg_conf.
     special values:
       %a = application name
       %u = user name
@@ -119,55 +363,68 @@
 log_lock_waits:
   default: False
   type: boolean
-  description: log lock waits >= deadlock_timeout
+  description: DEPRECATED. Use extra_pg_conf.
 log_timezone:
   default: "UTC"
   type: string
-  description: Log timezone
+  description: DEPRECATED. Use extra_pg_conf.
 autovacuum:
   default: True
   type: boolean
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Autovacuum should almost always be running. If you want to turn this
     off, you are probably following out of date documentation.
 log_autovacuum_min_duration:
   default: -1
   type: int
   description: >
+    DEPRECATED. Use extra_pg_conf.
     -1 disables, 0 logs all actions and their durations, > 0 logs only
     actions running at least this number of milliseconds.
 autovacuum_analyze_threshold:
   default: 50
   type: int
-  description: min number of row updates before analyze
+  description: >
+    DEPRECATED. Use extra_pg_conf.
+    min number of row updates before analyze
 autovacuum_vacuum_scale_factor:
   default: 0.2
   type: float
-  description: Fraction of table size before vacuum
+  description: >
+    DEPRECATED. Use extra_pg_conf.
+    Fraction of table size before vacuum
 autovacuum_analyze_scale_factor:
   default: 0.1
   type: float
-  description: Fraction of table size before analyze
+  description: >
+    DEPRECATED. Use extra_pg_conf.
+    Fraction of table size before analyze
 autovacuum_vacuum_cost_delay:
   default: "20ms"
   type: string
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Default vacuum cost delay for autovacuum, in milliseconds;
     -1 means use vacuum_cost_delay
 search_path:
   default: "\"$user\",public"
   type: string
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Comma separated list of schema names for
     the default SQL search path.
 standard_conforming_strings:
   default: True
   type: boolean
-  description: Standard conforming strings
+  description: >
+    DEPRECATED. Use extra_pg_conf.
+    Standard conforming strings
 hot_standby:
   default: False
   type: boolean
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Hot standby or warm standby. When True, queries can be run against
     the database when in recovery or standby mode (ie. replicated).
     Overridden when service contains multiple units.
@@ -175,6 +432,7 @@
   default: False
   type: boolean
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Hot standby feedback, informing a master about in progress
     transactions on a streaming hot standby and allowing the master to
     defer cleanup and avoid query cancelations on the hot standby.
@@ -182,6 +440,7 @@
   default: minimal
   type: string
   description: >
+    DEPRECATED. Use extra_pg_conf.
     'minimal', 'archive', 'hot_standby' or 'logical'. Defines how much
     information is written to the WAL. Set to 'minimal' for stand alone
     databases and 'hot_standby' for replicated setups. Overridden by
@@ -190,6 +449,7 @@
   default: 0
   type: int
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Maximum number of hot standbys that can connect using
     streaming replication. Set this to the expected maximum number of
     hot standby units to avoid unnecessary blocking and database restarts.
@@ -198,6 +458,7 @@
   default: 0
   type: int
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Number of old WAL files to keep, providing a larger buffer for
     streaming hot standbys to catch up from when lagged. Each WAL file
     is 16MB in size. The WAL files are the buffer of how far a
@@ -208,6 +469,7 @@
   default: 5000
   type: int
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Value of wal_keep_segments used when this service is replicated.
     This setting only exists to provide a sane default when replication
     is requested (so it doesn't fail) and nobody bothered to change the
@@ -216,6 +478,7 @@
   default: False
   type: boolean
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Enable archiving of WAL files using the command specified by
     archive_command. If archive_mode is enabled and archive_command not
     set, then archiving is deferred until archive_command is set and the
@@ -224,45 +487,38 @@
   default: ""
   type: string
   description: >
+    DEPRECATED. Use extra_pg_conf.
     Command used to archive WAL files when archive_mode is set and
     wal_level > minimal.
 work_mem:
   default: "1MB"
   type: string
   description: >
-    Working Memory.
+    DEPRECATED. Use extra_pg_conf. Working Memory.
     Ignored unless 'performance_tuning' is set to 'manual'.
 maintenance_work_mem:
   default: "1MB"
   type: string
   description: >
-    Maintenance working memory.
+    DEPRECATED. Use extra_pg_conf. Maintenance working memory.
240 Ignored unless 'performance_tuning' is set to 'manual'.504 Ignored unless 'performance_tuning' is set to 'manual'.
241 performance_tuning:
242 default: "Mixed"
243 type: string
244 description: >
245 Possible values here are "manual", "DW" (data warehouse),
246 "OLTP" (online transaction processing), "Web" (web application),
247 "Desktop" or "Mixed". When this is set to a value other than
248 "manual", the charm invokes the pgtune tool to tune a number
249 of performance parameters based on the specified load type.
250 pgtune gathers information about the node on which you are deployed and
251 tries to make intelligent guesses about what tuning parameters to set
252 based on available RAM and CPU under the assumption that it's the only
253 significant service running on this node.
254 kernel_shmall:505 kernel_shmall:
255 default: 0506 default: 0
256 type: int507 type: int
257 description: Total amount of shared memory available, in bytes.508 description: >
509 DEPRECATED and ignored.
510 Total amount of shared memory available, in bytes.
258 kernel_shmmax:511 kernel_shmmax:
259 default: 0512 default: 0
260 type: int513 type: int
261 description: The maximum size, in bytes, of a shared memory segment.514 description: >
515 DEPRECATED and ignored.
516 The maximum size, in bytes, of a shared memory segment.
262 shared_buffers:517 shared_buffers:
263 default: ""518 default: ""
264 type: string519 type: string
265 description: >520 description: >
521 DEPRECATED. Use extra_pg_conf.
266 The amount of memory the database server uses for shared memory522 The amount of memory the database server uses for shared memory
267 buffers. This string should be of the format '###MB'.523 buffers. This string should be of the format '###MB'.
268 Ignored unless 'performance_tuning' is set to 'manual'.524 Ignored unless 'performance_tuning' is set to 'manual'.
@@ -270,6 +526,7 @@
270 default: ""526 default: ""
271 type: string527 type: string
272 description: >528 description: >
529 DEPRECATED. Use extra_pg_conf.
273 Effective cache size is an estimate of how much memory is available for530 Effective cache size is an estimate of how much memory is available for
274 disk caching within the database. (50% to 75% of system memory). This531 disk caching within the database. (50% to 75% of system memory). This
275 string should be of the format '###MB'. Ignored unless532 string should be of the format '###MB'. Ignored unless
@@ -278,6 +535,7 @@
278 default: -1535 default: -1
279 type: int536 type: int
280 description: >537 description: >
538 DEPRECATED. Use extra_pg_conf.
281 Sets the default statistics target for table columns without a539 Sets the default statistics target for table columns without a
282 column-specific target set via ALTER TABLE SET STATISTICS.540 column-specific target set via ALTER TABLE SET STATISTICS.
283 Leave unchanged to use the server default, which in recent541 Leave unchanged to use the server default, which in recent
@@ -288,6 +546,7 @@
288 default: -1 546 default: -1
289 type: int547 type: int
290 description: >548 description: >
549 DEPRECATED. Use extra_pg_conf.
291 Sets the from_collapse_limit and join_collapse_limit query planner550 Sets the from_collapse_limit and join_collapse_limit query planner
292 options, controlling the maximum number of tables that can be joined551 options, controlling the maximum number of tables that can be joined
293 before the turns off the table collapse query optimization.552 before the turns off the table collapse query optimization.
@@ -295,35 +554,41 @@
295 default: "1MB"554 default: "1MB"
296 type: string555 type: string
297 description: >556 description: >
557 DEPRECATED. Use extra_pg_conf.
298 The maximum number of temporary buffers used by each database session.558 The maximum number of temporary buffers used by each database session.
299 wal_buffers:559 wal_buffers:
300 default: "-1"560 default: "-1"
301 type: string561 type: string
302 description: >562 description: >
563 DEPRECATED. Use extra_pg_conf.
303 min 32kB, -1 sets based on shared_buffers (change requires restart).564 min 32kB, -1 sets based on shared_buffers (change requires restart).
304 Ignored unless 'performance_tuning' is set to 'manual'.565 Ignored unless 'performance_tuning' is set to 'manual'.
305 checkpoint_segments:566 checkpoint_segments:
306 default: 10 567 default: 10
307 type: int568 type: int
308 description: >569 description: >
570 DEPRECATED. Use extra_pg_conf.
309 in logfile segments, min 1, 16MB each.571 in logfile segments, min 1, 16MB each.
310 Ignored unless 'performance_tuning' is set to 'manual'.572 Ignored unless 'performance_tuning' is set to 'manual'.
311 checkpoint_completion_target:573 checkpoint_completion_target:
312 default: 0.9574 default: 0.9
313 type: float575 type: float
314 description: >576 description: >
315 checkpoint target duration time, as a fraction of checkpoint_timeout.577 DEPRECATED. Use extra_pg_conf.
316 Range [0.0, 1.0].578 checkpoint target duration time, as a fraction of checkpoint_timeout.
579 Range [0.0, 1.0].
317 checkpoint_timeout:580 checkpoint_timeout:
318 default: ""581 default: ""
319 type: string582 type: string
320 description: >583 description: >
584 DEPRECATED. Use extra_pg_conf.
321 Maximum time between automatic WAL checkpoints. range '30s-1h'.585 Maximum time between automatic WAL checkpoints. range '30s-1h'.
322 If left empty, the default postgresql value will be used.586 If left empty, the default postgresql value will be used.
323 fsync:587 fsync:
324 type: boolean588 type: boolean
325 default: True589 default: True
326 description: >590 description: >
591 DEPRECATED. Use extra_pg_conf.
327 Turns forced synchronization on/off. If fsync is turned off, database592 Turns forced synchronization on/off. If fsync is turned off, database
328 failures are likely to involve database corruption and require593 failures are likely to involve database corruption and require
329 recreating the unit594 recreating the unit
@@ -331,231 +596,23 @@
331 type: boolean596 type: boolean
332 default: True597 default: True
333 description: >598 description: >
599 DEPRECATED. Use extra_pg_conf.
334 Immediate fsync after commit.600 Immediate fsync after commit.
335 full_page_writes:601 full_page_writes:
336 type: boolean602 type: boolean
337 default: True603 default: True
338 description: >604 description: >
605 DEPRECATED. Use extra_pg_conf.
339 Recover from partial page writes.606 Recover from partial page writes.
340 random_page_cost:607 random_page_cost:
341 default: 4.0608 default: 4.0
342 type: float609 type: float
343 description: Random page cost610 description: >
344 extra_pg_auth:611 DEPRECATED. Use extra_pg_conf. Random page cost
345 type: string
346 default: ""
347 description: >
348 A comma separated extra pg_hba.conf auth rules.
349 This will be written to the pg_hba.conf file, one line per rule.
350 Note that this should not be needed as db relations already create
351 those rules the right way. Only use this if you really need too
352 (e.g. on a development environment), or are connecting juju managed
353 databases to external managed systems, or configuring replication
354 between unrelated PostgreSQL services using the manual_replication
355 option.
356 manual_replication:
357 type: boolean
358 default: False
359 description: >
360 Enable or disable charm managed replication. When manual_replication
361 is True, the operator is responsible for maintaining recovery.conf
362 and performing any necessary database mirroring. The charm will
363 still advertise the unit as standalone, master or hot standby to
364 relations based on whether the system is in recovery mode or not.
365 Note that this option makes it possible to create a PostgreSQL
366 service with multiple master units, which is probably a very silly
367 thing to do.
368 backup_dir:612 backup_dir:
369 default: "/var/lib/postgresql/backups"613 default: "/var/lib/postgresql/backups"
370 type: string614 type: string
371 description: Directory to place backups in615 description: >
372 backup_schedule:616 DEPRECATED. Directory to place backups in. If you change this,
373 default: "13 4 * * *"617 your backups will go to this path and not the 'backups' Juju
374 type: string618 storage mount.
375 description: Cron-formatted schedule for database backups.
376 backup_retention_count:
377 default: 7
378 type: int
379 description: Number of recent backups to retain.
380 nagios_context:
381 default: "juju"
382 type: string
383 description: >
384 Used by the nrpe-external-master subordinate charm.
385 A string that will be prepended to instance name to set the host name
386 in nagios. So for instance the hostname would be something like:
387 juju-postgresql-0
388 If you're running multiple environments with the same services in them
389 this allows you to differentiate between them.
390 nagios_additional_servicegroups:
391 default: ""
392 type: string
393 description: >
394 Used by the nrpe-external-master subordinate charm.
395 A comma-separated list of servicegroups to include along with
396 nagios_context when generating nagios service check configs.
397 This is useful for nagios installations where servicegroups
398 are used to apply special treatment to particular checks.
399 pgdg:
400 description: >
401 Enable the PostgreSQL Global Development Group APT repository
402 (https://wiki.postgresql.org/wiki/Apt). This package source provides
403 official PostgreSQL packages for Ubuntu LTS releases beyond those
404 provided by the main Ubuntu archive.
405 type: boolean
406 default: false
407 install_sources:
408 description: >
409 List of extra package sources, per charm-helpers standard.
410 YAML format.
411 type: string
412 default: null
413 install_keys:
414 description: >
415 List of signing keys for install_sources package sources, per
416 charmhelpers standard. YAML format.
417 type: string
418 default: null
419 extra_archives:
420 default: ""
421 type: string
422 description: >
423 DEPRECATED & IGNORED. Use install_sources and install_keys.
424 advisory_lock_restart_key:
425 default: 765
426 type: int
427 description: >
428 An advisory lock key used internally by the charm. You do not need
429 to change it unless it happens to conflict with an advisory lock key
430 being used by your applications.
431 # Swift backups and PITR via SwiftWAL
432 swiftwal_container_prefix:
433 type: string
434 default: null
435 description: >
436 EXPERIMENTAL.
437 Swift container prefix for SwiftWAL to use. Must be set if any
438 SwiftWAL features are enabled. This will become a simple
439 swiftwal_container config item when proper leader election is
440 implemented in juju.
441 swiftwal_backup_schedule:
442 type: string
443 default: null
444 description: >
445 EXPERIMENTAL.
446 Cron-formatted schedule for SwiftWAL database backups.
447 swiftwal_backup_retention:
448 type: int
449 default: 2
450 description: >
451 EXPERIMENTAL.
452 Number of recent base backups to retain. You need enough space in
453 Swift for this many backups plus one more, as an old backup will only
454 be removed after a new one has been successfully made to replace it.
455 swiftwal_log_shipping:
456 type: boolean
457 default: false
458 description: >
459 EXPERIMENTAL.
460 Archive WAL files into Swift. If swiftwal_backup_schedule is set,
461 allows point-in-time recovery and WAL files are removed
462 automatically with old backups. If swiftwal_backup_schedule is not set
463 then WAL files are never removed. Enabling this option will override
464 the archive_mode and archive_command settings.
465 wal_e_storage_uri:
466 type: string
467 default: null
468 description: |
469 EXPERIMENTAL.
470 Specify storage to be used by WAL-E. Every PostgreSQL service must use
471 a unique URI. Backups will be unrecoverable if it is not unique. The
472 URI's scheme must be one of 'swift' (OpenStack Swift), 's3' (Amazon AWS)
473 or 'wabs' (Windows Azure). For example:
474 'swift://some-container/directory/or/whatever'
475 's3://some-bucket/directory/or/whatever'
476 'wabs://some-bucket/directory/or/whatever'
477 Setting the wal_e_storage_uri enables regular WAL-E filesystem level
478 backups (per wal_e_backup_schedule), and log shipping to the configured
479 storage. Point-in-time recovery becomes possible, as is disabling the
480 streaming_replication configuration item and relying solely on
481 log shipping for replication.
482 wal_e_backup_schedule:
483 type: string
484 default: "13 0 * * *"
485 description: >
486 EXPERIMENTAL.
487 Cron-formatted schedule for WAL-E database backups. If
488 wal_e_backup_schedule is unset, WAL files will never be removed from
489 WAL-E storage.
490 wal_e_backup_retention:
491 type: int
492 default: 2
493 description: >
494 EXPERIMENTAL.
495 Number of recent base backups and WAL files to retain.
496 You need enough space for this many backups plus one more, as
497 an old backup will only be removed after a new one has been
498 successfully made to replace it.
499 streaming_replication:
500 type: boolean
501 default: true
502 description: >
503 Enable streaming replication. Normally, streaming replication is
504 always used, and any log shipping configured is used as a fallback.
505 Turning this off without configuring log shipping is an error.
506 os_username:
507 type: string
508 default: null
509 description: EXPERIMENTAL. OpenStack Swift username.
510 os_password:
511 type: string
512 default: null
513 description: EXPERIMENTAL. OpenStack Swift password.
514 os_auth_url:
515 type: string
516 default: null
517 description: EXPERIMENTAL. OpenStack Swift authentication URL.
518 os_tenant_name:
519 type: string
520 default: null
521 description: EXPERIMENTAL. OpenStack Swift tenant name.
522 aws_access_key_id:
523 type: string
524 default: null
525 description: EXPERIMENTAL. Amazon AWS access key id.
526 aws_secret_access_key:
527 type: string
528 default: null
529 description: EXPERIMENTAL. Amazon AWS secret access key.
530 wabs_account_name:
531 type: string
532 default: null
533 description: EXPERIMENTAL. Windows Azure account name.
534 wabs_access_key:
535 type: string
536 default: null
537 description: EXPERIMENTAL. Windows Azure access key.
538 package_status:
539 default: "install"
540 type: string
541 description: >
542 The status of service-affecting packages will be set to this
543 value in the dpkg database. Useful valid values are "install"
544 and "hold".
545 # statsd-compatible metrics
546 metrics_target:
547 default: ""
548 type: string
549 description: >
550 Destination for statsd-format metrics, format "host:port". If
551 not present and valid, metrics disabled.
552 metrics_prefix:
553 default: "dev.$UNIT.postgresql"
554 type: string
555 description: >
556 Prefix for metrics. Special value $UNIT can be used to include the
557 name of the unit in the prefix.
558 metrics_sample_interval:
559 default: 5
560 type: int
561 description: Period for metrics cron job to run in minutes
562619
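Nearly every tuning option above is now just a pointer at the consolidated extra_pg_conf option, so migrating a deployment off the deprecated knobs amounts to collapsing them into one multi-line value of ordinary postgresql.conf directives. A sketch, assuming a service deployed as 'postgresql' and the Juju 1.x CLI; the directive values are illustrative only:

    juju set postgresql extra_pg_conf="
    work_mem = 16MB
    maintenance_work_mem = 64MB
    checkpoint_completion_target = 0.9
    autovacuum_vacuum_scale_factor = 0.2
    "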
=== modified file 'copyright'
--- copyright 2011-07-10 09:53:13 +0000
+++ copyright 2015-11-02 12:15:35 +0000
@@ -1,17 +1,16 @@
 Format: http://dep.debian.net/deps/dep5/
 
 Files: *
-Copyright: Copyright 2011, Canonical Ltd., All Rights Reserved.
+Copyright: Copyright 2011-2015, Canonical Ltd.
 License: GPL-3
  This program is free software: you can redistribute it and/or modify
- it under the terms of the GNU General Public License as published by
- the Free Software Foundation, either version 3 of the License, or
- (at your option) any later version.
+ it under the terms of the GNU General Public License version 3, as
+ published by the Free Software Foundation.
  .
  This program is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- GNU General Public License for more details.
+ but WITHOUT ANY WARRANTY; without even the implied warranties of
+ MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
+ PURPOSE. See the GNU General Public License for more details.
  .
  You should have received a copy of the GNU General Public License
  along with this program. If not, see <http://www.gnu.org/licenses/>.
 
=== added file 'hooks/bootstrap.py'
--- hooks/bootstrap.py 1970-01-01 00:00:00 +0000
+++ hooks/bootstrap.py 2015-11-02 12:15:35 +0000
@@ -0,0 +1,57 @@
1#!/usr/bin/python3
2
3# Copyright 2015 Canonical Ltd.
4#
5# This file is part of the PostgreSQL Charm for Juju.
6#
7# This program is free software: you can redistribute it and/or modify
8# it under the terms of the GNU General Public License version 3, as
9# published by the Free Software Foundation.
10#
11# This program is distributed in the hope that it will be useful, but
12# WITHOUT ANY WARRANTY; without even the implied warranties of
13# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
14# PURPOSE. See the GNU General Public License for more details.
15#
16# You should have received a copy of the GNU General Public License
17# along with this program. If not, see <http://www.gnu.org/licenses/>.
18
19from charmhelpers import fetch
20from charmhelpers.core import hookenv
21
22
23def bootstrap():
24 try:
25 import psycopg2 # NOQA: flake8
26 import jinja2 # NOQA: flake8
27 except ImportError:
28 packages = ['python3-psycopg2', 'python3-jinja2']
29 fetch.apt_install(packages, fatal=True)
30 import psycopg2 # NOQA: flake8
31
32
33def block_on_bad_juju():
34 if not hookenv.has_juju_version('1.24'):
35 hookenv.status_set('blocked', 'Requires Juju 1.24 or higher')
36 # Error state, since we don't have 1.24 to give a nice blocked state.
37 raise SystemExit(1)
38
39
40def upgrade_charm():
41 block_on_bad_juju()
42 # This needs to be imported after bootstrap() or required Python
43 # packages may not have been installed.
44 import upgrade
45 upgrade.upgrade_charm()
46
47
48def default_hook():
49 block_on_bad_juju()
50 # This needs to be imported after bootstrap() or required Python
51 # packages may not have been installed.
52 import definitions
53
54 hookenv.log('*** Start {!r} hook'.format(hookenv.hook_name()))
55 sm = definitions.get_service_manager()
56 sm.manage()
57 hookenv.log('*** End {!r} hook'.format(hookenv.hook_name()))
058
=== added file 'hooks/client.py'
--- hooks/client.py 1970-01-01 00:00:00 +0000
+++ hooks/client.py 2015-11-02 12:15:35 +0000
@@ -0,0 +1,182 @@
1# Copyright 2015 Canonical Ltd.
2#
3# This file is part of the PostgreSQL Charm for Juju.
4#
5# This program is free software: you can redistribute it and/or modify
6# it under the terms of the GNU General Public License version 3, as
7# published by the Free Software Foundation.
8#
9# This program is distributed in the hope that it will be useful, but
10# WITHOUT ANY WARRANTY; without even the implied warranties of
11# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
12# PURPOSE. See the GNU General Public License for more details.
13#
14# You should have received a copy of the GNU General Public License
15# along with this program. If not, see <http://www.gnu.org/licenses/>.
16
17from charmhelpers.core import hookenv, host
18
19from decorators import relation_handler, master_only
20import helpers
21import postgresql
22
23
24@relation_handler('db', 'db-admin', 'master')
25def publish_db_relations(rel):
26 if postgresql.is_master():
27 db_relation_master(rel)
28 else:
29 db_relation_mirror(rel)
30 db_relation_common(rel)
31
32
33def _credential_types(rel):
34 superuser = (rel.relname in ('db-admin', 'master'))
35 replication = (rel.relname == 'master')
36 return (superuser, replication)
37
38
39def db_relation_master(rel):
40 '''The master generates credentials and negotiates resources.'''
41 master = rel.local
42 # Pick one remote unit as representative. They should all converge.
43 for remote in rel.values():
44 break
45
46 # The requested database name, the existing database name, or use
47 # the remote service name as a default. We no longer use the
48 # relation id for the database name or usernames, as when a
49 # database dump is restored into a new Juju environment we
50 # are more likely to have matching service names than relation ids
51 # and less likely to have to perform manual permission and ownership
52 # cleanups.
53 if 'database' in remote:
54 master['database'] = remote['database']
55 elif 'database' not in master:
56 master['database'] = remote.service
57
58 superuser, replication = _credential_types(rel)
59
60 if 'user' not in master:
61 user = postgresql.username(remote.service, superuser=superuser,
62 replication=replication)
63 password = host.pwgen()
64 master['user'] = user
65 master['password'] = password
66
67 # schema_user has never been documented and is deprecated.
68 if not superuser:
69 master['schema_user'] = user
70 master['schema_password'] = password
71
72 hookenv.log('** Master providing {} ({}/{})'.format(rel,
73 master['database'],
74 master['user']))
75
76 # Reflect these settings back so the client knows when they have
77 # taken effect.
78 if not replication:
79 master['roles'] = remote.get('roles')
80 master['extensions'] = remote.get('extensions')
81
82
83def db_relation_mirror(rel):
84 '''Non-masters mirror relation information from the master.'''
85 master = postgresql.master()
86 master_keys = ['database', 'user', 'password', 'roles',
87 'schema_user', 'schema_password', 'extensions']
88 master_info = rel.peers.get(master)
89 if master_info is None:
90 hookenv.log('Waiting for {} to join {}'.format(master, rel))
91 return
92 hookenv.log('Mirroring {} database credentials from {}'.format(rel,
93 master))
94 rel.local.update({k: master_info.get(k) for k in master_keys})
95
96
97def db_relation_common(rel):
98 '''Publish unit specific relation details.'''
99 local = rel.local
100 if 'database' not in local:
101 return # Not yet ready.
102
103 # Version number, allowing clients to adjust or block if their
104 # expectations are not met.
105 local['version'] = postgresql.version()
106
107 # Calculate the state of this unit. 'standalone' will disappear
108 # in a future version of this interface, as this state was
109 # only needed to deal with race conditions now solved by
110 # Juju leadership.
111 if postgresql.is_primary():
112 if hookenv.is_leader() and len(helpers.peers()) == 0:
113 local['state'] = 'standalone'
114 else:
115 local['state'] = 'master'
116 else:
117 local['state'] = 'hot standby'
118
119 # Host is the private ip address, but this might change and
120 # become the address of an attached proxy or alternative peer
121 # if this unit is in maintenance.
122 local['host'] = hookenv.unit_private_ip()
123
124 # Port will be 5432, unless the user has overridden it or
125 # something very weird happened when the packages were installed.
126 local['port'] = str(postgresql.port())
127
128 # The list of remote units on this relation granted access.
129 # This is to avoid the race condition where a new client unit
130 # joins an existing client relation and sees valid credentials,
131 # before we have had a chance to grant it access.
132 local['allowed-units'] = ' '.join(rel.keys())
133
134
135@master_only
136@relation_handler('db', 'db-admin', 'master')
137def ensure_db_relation_resources(rel):
138 '''Create the database resources needed for the relation.'''
139
140 master = rel.local
141
142 hookenv.log('Ensuring database {!r} and user {!r} exist for {}'
143 ''.format(master['database'], master['user'], rel))
144
145 # First create the database, if it doesn't already exist.
146 postgresql.ensure_database(master['database'])
147
148 # Next, connect to the database to create the rest in a transaction.
149 con = postgresql.connect(database=master['database'])
150
151 superuser, replication = _credential_types(rel)
152 postgresql.ensure_user(con, master['user'], master['password'],
153 superuser=superuser, replication=replication)
154 if not superuser:
155 postgresql.ensure_user(con,
156 master['schema_user'],
157 master['schema_password'])
158
159 # Grant specified privileges on the database to the user. This comes
160 # from the PostgreSQL service configuration, as allowing the
161 # relation to specify how much access it gets is insecure.
162 config = hookenv.config()
163 privs = set(filter(None,
164 config['relation_database_privileges'].split(',')))
165 postgresql.grant_database_privileges(con, master['user'],
166 master['database'], privs)
167 if not superuser:
168 postgresql.grant_database_privileges(con, master['schema_user'],
169 master['database'], privs)
170
171 # Reset the roles granted to the user as requested.
172 if 'roles' in master:
173 roles = filter(None, master.get('roles', '').split(','))
174 postgresql.grant_user_roles(con, master['user'], roles)
175
176 # Create requested extensions. We never drop extensions, as there
177 # may be dependent objects.
178 if 'extensions' in master:
179 extensions = filter(None, master.get('extensions', '').split(','))
180 postgresql.ensure_extensions(con, extensions)
181
182 con.commit() # Don't throw away our changes.
0183
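The observable contract of client.py is the relation data itself. A sketch of what a joined client unit might read back once the master has published credentials and allowed-units includes it; the unit names and generated values below are illustrative, not taken from a real run:

    $ juju run --unit myapp/0 'relation-get -r db:1 - postgresql/0'
    allowed-units: myapp/0 myapp/1
    database: myapp
    host: 10.0.3.15
    password: ohQuahx6eibaeLai
    port: "5432"
    schema_password: ohQuahx6eibaeLai
    schema_user: myapp
    state: master
    user: myapp
    version: "9.4"

Clients are expected to wait until their own unit name appears in allowed-units before connecting, which is exactly the race db_relation_common documents.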
=== removed symlink 'hooks/config-changed'
=== target was u'hooks.py'
=== added file 'hooks/coordinator.py'
--- hooks/coordinator.py 1970-01-01 00:00:00 +0000
+++ hooks/coordinator.py 2015-11-02 12:15:35 +0000
@@ -0,0 +1,19 @@
1# Copyright 2015 Canonical Ltd.
2#
3# This file is part of the PostgreSQL Charm for Juju.
4#
5# This program is free software: you can redistribute it and/or modify
6# it under the terms of the GNU General Public License version 3, as
7# published by the Free Software Foundation.
8#
9# This program is distributed in the hope that it will be useful, but
10# WITHOUT ANY WARRANTY; without even the implied warranties of
11# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
12# PURPOSE. See the GNU General Public License for more details.
13#
14# You should have received a copy of the GNU General Public License
15# along with this program. If not, see <http://www.gnu.org/licenses/>.
16
17from charmhelpers.coordinator import Serial
18
19coordinator = Serial()
020
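coordinator.py is one line of policy: a Serial coordinator grants a lock to one unit at a time, which is what keeps rolling restarts rolling. A sketch of the acquire/grant round trip as I read the charmhelpers.coordinator API; the 'restart' lock name and the exact flow are assumptions here, matching the request_restart/wait_for_restart steps in definitions.py rather than quoted from service.py:

    from charmhelpers.core import hookenv, host
    from coordinator import coordinator

    def request_restart():
        # acquire() returns True once the leader has granted this unit
        # the lock. Otherwise it records the request and returns False;
        # the hook exits and retries on a later invocation.
        if coordinator.acquire('restart'):
            host.service_restart('postgresql')
        else:
            hookenv.log('Waiting for restart lock', hookenv.DEBUG)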
=== added file 'hooks/data-relation-changed'
--- hooks/data-relation-changed 1970-01-01 00:00:00 +0000
+++ hooks/data-relation-changed 2015-11-02 12:15:35 +0000
@@ -0,0 +1,23 @@
1#!/usr/bin/python3
2
3# Copyright 2015 Canonical Ltd.
4#
5# This file is part of the PostgreSQL Charm for Juju.
6#
7# This program is free software: you can redistribute it and/or modify
8# it under the terms of the GNU General Public License version 3, as
9# published by the Free Software Foundation.
10#
11# This program is distributed in the hope that it will be useful, but
12# WITHOUT ANY WARRANTY; without even the implied warranties of
13# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
14# PURPOSE. See the GNU General Public License for more details.
15#
16# You should have received a copy of the GNU General Public License
17# along with this program. If not, see <http://www.gnu.org/licenses/>.
18
19import bootstrap
20
21if __name__ == '__main__':
22 bootstrap.bootstrap()
23 bootstrap.default_hook()
024
=== removed symlink 'hooks/data-relation-changed'
=== target was u'hooks.py'
=== added file 'hooks/data-relation-departed'
--- hooks/data-relation-departed 1970-01-01 00:00:00 +0000
+++ hooks/data-relation-departed 2015-11-02 12:15:35 +0000
@@ -0,0 +1,23 @@
1#!/usr/bin/python3
2
3# Copyright 2015 Canonical Ltd.
4#
5# This file is part of the PostgreSQL Charm for Juju.
6#
7# This program is free software: you can redistribute it and/or modify
8# it under the terms of the GNU General Public License version 3, as
9# published by the Free Software Foundation.
10#
11# This program is distributed in the hope that it will be useful, but
12# WITHOUT ANY WARRANTY; without even the implied warranties of
13# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
14# PURPOSE. See the GNU General Public License for more details.
15#
16# You should have received a copy of the GNU General Public License
17# along with this program. If not, see <http://www.gnu.org/licenses/>.
18
19import bootstrap
20
21if __name__ == '__main__':
22 bootstrap.bootstrap()
23 bootstrap.default_hook()
024
=== removed symlink 'hooks/data-relation-departed'
=== target was u'hooks.py'
=== removed symlink 'hooks/data-relation-joined'
=== target was u'hooks.py'
=== removed symlink 'hooks/db-admin-relation-broken'
=== target was u'hooks.py'
=== removed symlink 'hooks/db-admin-relation-changed'
=== target was u'hooks.py'
=== added file 'hooks/db-admin-relation-departed'
--- hooks/db-admin-relation-departed 1970-01-01 00:00:00 +0000
+++ hooks/db-admin-relation-departed 2015-11-02 12:15:35 +0000
@@ -0,0 +1,23 @@
1#!/usr/bin/python3
2
3# Copyright 2015 Canonical Ltd.
4#
5# This file is part of the PostgreSQL Charm for Juju.
6#
7# This program is free software: you can redistribute it and/or modify
8# it under the terms of the GNU General Public License version 3, as
9# published by the Free Software Foundation.
10#
11# This program is distributed in the hope that it will be useful, but
12# WITHOUT ANY WARRANTY; without even the implied warranties of
13# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
14# PURPOSE. See the GNU General Public License for more details.
15#
16# You should have received a copy of the GNU General Public License
17# along with this program. If not, see <http://www.gnu.org/licenses/>.
18
19import bootstrap
20
21if __name__ == '__main__':
22 bootstrap.bootstrap()
23 bootstrap.default_hook()
024
=== removed symlink 'hooks/db-admin-relation-joined'
=== target was u'hooks.py'
=== removed symlink 'hooks/db-relation-broken'
=== target was u'hooks.py'
=== removed symlink 'hooks/db-relation-changed'
=== target was u'hooks.py'
=== added file 'hooks/db-relation-departed'
--- hooks/db-relation-departed 1970-01-01 00:00:00 +0000
+++ hooks/db-relation-departed 2015-11-02 12:15:35 +0000
@@ -0,0 +1,23 @@
1#!/usr/bin/python3
2
3# Copyright 2015 Canonical Ltd.
4#
5# This file is part of the PostgreSQL Charm for Juju.
6#
7# This program is free software: you can redistribute it and/or modify
8# it under the terms of the GNU General Public License version 3, as
9# published by the Free Software Foundation.
10#
11# This program is distributed in the hope that it will be useful, but
12# WITHOUT ANY WARRANTY; without even the implied warranties of
13# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
14# PURPOSE. See the GNU General Public License for more details.
15#
16# You should have received a copy of the GNU General Public License
17# along with this program. If not, see <http://www.gnu.org/licenses/>.
18
19import bootstrap
20
21if __name__ == '__main__':
22 bootstrap.bootstrap()
23 bootstrap.default_hook()
024
=== removed symlink 'hooks/db-relation-joined'
=== target was u'hooks.py'
=== added file 'hooks/decorators.py'
--- hooks/decorators.py 1970-01-01 00:00:00 +0000
+++ hooks/decorators.py 2015-11-02 12:15:35 +0000
@@ -0,0 +1,124 @@
1# Copyright 2015 Canonical Ltd.
2#
3# This file is part of the PostgreSQL Charm for Juju.
4#
5# This program is free software: you can redistribute it and/or modify
6# it under the terms of the GNU General Public License version 3, as
7# published by the Free Software Foundation.
8#
9# This program is distributed in the hope that it will be useful, but
10# WITHOUT ANY WARRANTY; without even the implied warranties of
11# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
12# PURPOSE. See the GNU General Public License for more details.
13#
14# You should have received a copy of the GNU General Public License
15# along with this program. If not, see <http://www.gnu.org/licenses/>.
16from functools import wraps
17
18from charmhelpers import context
19from charmhelpers.core import hookenv
20from charmhelpers.core.hookenv import DEBUG
21
22import helpers
23
24
25def data_ready_action(func):
26 '''Decorate func to be used as a data_ready item.'''
27 @wraps(func)
28 def wrapper(service_name=None):
29 if hookenv.remote_unit():
30 hookenv.log("** Action {}/{} ({})".format(hookenv.hook_name(),
31 func.__name__,
32 hookenv.remote_unit()))
33 else:
34 hookenv.log("** Action {}/{}".format(hookenv.hook_name(),
35 func.__name__))
36 return func()
37 return wrapper
38
39
40class requirement:
41 '''Decorate a function so it can be used as a required_data item.
42
43 Function must return True if requirements are met. Sets the unit state
44 to blocked if requirements are not met and the unit is not already blocked.
45 '''
46 def __init__(self, func):
47 self._func = func
48
49 def __bool__(self):
50 name = self._func.__name__
51 if self._func():
52 hookenv.log('** Requirement {} passed'.format(name))
53 return True
54 else:
55 if hookenv.status_get() != 'blocked':
56 helpers.status_set('blocked',
57 'Requirement {} failed'.format(name))
58 return False
59
60
61def relation_handler(*relnames):
62 '''Invoke the decorated function once per matching relation.
63
64 The decorated function should accept the Relation() instance
65 as its single parameter.
66 '''
67 assert relnames, 'relation names required'
68
69 def decorator(func):
70 @wraps(func)
71 def wrapper(service_name=None):
72 rels = context.Relations()
73 for relname in relnames:
74 for rel in rels[relname].values():
75 if rel:
76 func(rel)
77 return wrapper
78 return decorator
79
80
81def leader_only(func):
82 '''Only run on the service leader.'''
83 @wraps(func)
84 def wrapper(*args, **kw):
85 if hookenv.is_leader():
86 return func(*args, **kw)
87 else:
88 hookenv.log('Not the leader', DEBUG)
89 return wrapper
90
91
92def not_leader(func):
93 '''Don't run on the service leader.'''
94 @wraps(func)
95 def wrapper(*args, **kw):
96 if not hookenv.is_leader():
97 return func(*args, **kw)
98 else:
99 hookenv.log("I'm the leader", DEBUG)
100 return wrapper
101
102
103def master_only(func):
104 '''Only run on the appointed master.'''
105 @wraps(func)
106 def wrapper(*args, **kw):
107 import postgresql
108 if postgresql.is_master():
109 return func(*args, **kw)
110 else:
111 hookenv.log('Not the master', DEBUG)
112 return wrapper
113
114
115def not_master(func):
116 '''Don't run on the appointed master.'''
117 @wraps(func)
118 def wrapper(*args, **kw):
119 import postgresql
120 if postgresql.is_master():
121 hookenv.log("I'm the master", DEBUG)
122 else:
123 return func(*args, **kw)
124 return wrapper
0125
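These decorators compose, and the stack order matters: the outermost guard runs first. A hypothetical handler showing the same pattern client.py uses (the 'granted' key is invented purely for illustration):

    @master_only
    @relation_handler('db', 'db-admin')
    def note_grant(rel):
        # Runs once per established db/db-admin relation, and only on
        # the unit postgresql.is_master() points at; everywhere else
        # the decorators short-circuit with a DEBUG log entry.
        rel.local['granted'] = 'yes'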
=== added file 'hooks/definitions.py'
--- hooks/definitions.py 1970-01-01 00:00:00 +0000
+++ hooks/definitions.py 2015-11-02 12:15:35 +0000
@@ -0,0 +1,86 @@
1# Copyright 2011-2015 Canonical Ltd.
2#
3# This file is part of the PostgreSQL Charm for Juju.
4#
5# This program is free software: you can redistribute it and/or modify
6# it under the terms of the GNU General Public License version 3, as
7# published by the Free Software Foundation.
8#
9# This program is distributed in the hope that it will be useful, but
10# WITHOUT ANY WARRANTY; without even the implied warranties of
11# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
12# PURPOSE. See the GNU General Public License for more details.
13#
14# You should have received a copy of the GNU General Public License
15# along with this program. If not, see <http://www.gnu.org/licenses/>.
16
17from charmhelpers.core import services
18
19import client
20import nagios
21import replication
22import service
23import storage
24import syslogrel
25import wal_e
26
27
28SERVICE_DEFINITION = [
29 dict(service='postgresql',
30 required_data=[service.valid_config],
31 data_ready=[service.preinstall,
32 service.configure_sources,
33 service.install_packages,
34 service.ensure_package_status,
35 service.update_kernel_settings,
36 service.appoint_master,
37
38 service.wait_for_peers, # Exit if there are no peers.
39
40 nagios.ensure_nagios_credentials,
41 replication.ensure_replication_credentials,
42 replication.publish_replication_details,
43
44 # Exit if required leader settings are not set.
45 service.wait_for_leader,
46
47 service.ensure_cluster,
48 service.update_pgpass,
49 service.update_pg_hba_conf,
50 service.update_pg_ident_conf,
51 service.update_postgresql_conf,
52 syslogrel.handle_syslog_relations,
53 storage.handle_storage_relation,
54 wal_e.update_wal_e_env_dir,
55 service.request_restart,
56
57 service.wait_for_restart, # Exit if cannot restart yet.
58
59 replication.promote_master,
60 storage.remount,
61 replication.clone_master, # Exit if cannot clone yet.
62 replication.update_recovery_conf,
63 service.restart_or_reload,
64
65 replication.ensure_replication_user,
66 nagios.ensure_nagios_user,
67 service.install_administrative_scripts,
68 service.update_postgresql_crontab,
69
70 client.publish_db_relations,
71 client.ensure_db_relation_resources,
72
73 service.update_pg_hba_conf, # Again, after client setup.
74 service.reload_config,
75
76 service.set_active,
77
78 # At the end, as people check the end of logs
79 # most frequently.
80 service.emit_deprecated_option_warnings],
81 start=[service.open_ports],
82 stop=[service.stop_postgresql, service.close_ports])]
83
84
85def get_service_manager():
86 return services.ServiceManager(SERVICE_DEFINITION)
087
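The data_ready list above re-runs in full on every hook, so each step must be idempotent, and the requirement objects in required_data gate the whole pipeline because the services framework only tests them for truthiness. Roughly, and paraphrased from memory rather than quoted from charmhelpers.core.services.base:

    # Approximate sketch of the gate inside
    # ServiceManager.reconfigure_services():
    for service in service_definitions:
        if all(bool(req) for req in service.get('required_data', [])):
            for callback in service.get('data_ready', []):
                callback(service['service'])

This is why requirement.__bool__ in decorators.py can set the blocked status as a side effect of merely being checked.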
=== added file 'hooks/helpers.py'
--- hooks/helpers.py 1970-01-01 00:00:00 +0000
+++ hooks/helpers.py 2015-11-02 12:15:35 +0000
@@ -0,0 +1,151 @@
1# Copyright 2015 Canonical Ltd.
2#
3# This file is part of the PostgreSQL Charm for Juju.
4#
5# This program is free software: you can redistribute it and/or modify
6# it under the terms of the GNU General Public License version 3, as
7# published by the Free Software Foundation.
8#
9# This program is distributed in the hope that it will be useful, but
10# WITHOUT ANY WARRANTY; without even the implied warranties of
11# MERCHANTABILITY, SATISFACTORY QUALITY, or FITNESS FOR A PARTICULAR
12# PURPOSE. See the GNU General Public License for more details.
13#
14# You should have received a copy of the GNU General Public License
15# along with this program. If not, see <http://www.gnu.org/licenses/>.
16
17from contextlib import contextmanager
18import os
19import shutil
20import stat
21import tempfile
22
23import yaml
24
25from charmhelpers import context
26from charmhelpers.core import hookenv, host
27from charmhelpers.core.hookenv import INFO, CRITICAL
28
29
30def status_set(status_or_msg, msg=None):
31 '''Set the unit status message, and log the change too.'''
32 if msg is None:
33 msg = status_or_msg
34 status = hookenv.status_get()
35 else:
36 status = status_or_msg
37
38 if status == 'blocked':
39 lvl = CRITICAL
40 else:
41 lvl = INFO
42 hookenv.log('{}: {}'.format(status, msg), lvl)
43 hookenv.status_set(status, msg)
44
45
46def distro_codename():
47 """Return the distro release code name, eg. 'precise' or 'trusty'."""
48 return host.lsb_release()['DISTRIB_CODENAME']
49
50
51def extra_packages():
52 config = hookenv.config()
53 packages = set()
54
55 packages.update(set(config['extra_packages'].split()))
56 packages.update(set(config['extra-packages'].split())) # Deprecated.
57
58 if config['wal_e_storage_uri']:
59 packages.add('daemontools')
60 packages.add('wal-e')
61
62 return packages
63
64
65def peers():
66 '''Return the set of peers, not including the local unit.'''
67 rel = context.Relations().peer
68 return frozenset(rel.keys()) if rel else frozenset()
69
70
71def rewrite(path, content):
72 '''Rewrite a file atomically, preserving ownership and permissions.'''
73 attr = os.lstat(path)
74 write(path, content,
75 mode=stat.S_IMODE(attr.st_mode),
76 user=attr[stat.ST_UID],
77 group=attr[stat.ST_GID])
78
79
80def write(path, content, mode=0o640, user='root', group='root'):
81 '''Write a file atomically.'''
82 open_mode = 'wb' if isinstance(content, bytes) else 'w'
83 with tempfile.NamedTemporaryFile(mode=open_mode, delete=False) as f:
84 try:
85 f.write(content)
86 f.flush()
87 shutil.chown(f.name, user, group)
88 os.chmod(f.name, mode)
89 os.replace(f.name, path)
90 finally:
91 if os.path.exists(f.name):
92 os.unlink(f.name)
93
94
95def makedirs(path, mode=0o750, user='root', group='root'):
96 if os.path.exists(path):
97 assert os.path.isdir(path), '{} is not a directory'.format(path)
98 else:
99 os.makedirs(path, mode=mode)
100 shutil.chown(path, user, group)
101 os.chmod(path, mode)
102
103
104@contextmanager
105def switch_cwd(new_working_directory='/tmp'):
106 'Switch working directory.'
107 org_dir = os.getcwd()
108 os.chdir(new_working_directory)
109 try:
110 yield new_working_directory
111 finally:
112 os.chdir(org_dir)
113
114
115def config_yaml():
116 config_yaml_path = os.path.join(hookenv.charm_dir(), 'config.yaml')
117 with open(config_yaml_path, 'r') as f:
118 return yaml.load(f)
119
120
121def deprecated_config_in_use():
122 options = config_yaml()['options']
123 config = hookenv.config()
124 deprecated = [key for key in options
125 if ('DEPRECATED' in options[key]['description'] and
126 config[key] != options[key]['default'])]
127 return set(deprecated)
128
129
130def cron_dir():
131 '''Where we put crontab files.'''
132 return '/etc/cron.d'
133
134
135def scripts_dir():
136 '''Where the charm puts administrative scripts.'''
137 return '/var/lib/postgresql/scripts'
138
139
140def logs_dir():
141 '''Where the charm administrative scripts log their output.'''
142 return '/var/lib/postgresql/logs'
143
144
145def backups_dir():
146 '''Where pg_dump backups are stored.'''
147 return hookenv.config()['backup_dir']
148
149
150def backups_log_path():
151 return os.path.join(logs_dir(), 'backups.log')
0152
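write() and rewrite() are the charm's atomic file primitives: content lands in a temp file, ownership and mode are applied there, and os.replace() swaps it into place, so readers never see a half-written or world-readable config. A usage sketch, with illustrative paths and content; note the temp file is created in the default temp dir, so the swap is only atomic while /tmp shares a filesystem with the destination:

    import helpers

    # Preserve pg_hba.conf's existing owner and mode while replacing it.
    helpers.rewrite('/etc/postgresql/9.4/main/pg_hba.conf', hba_content)

    # Install an administrative script owned by postgres.
    helpers.write('/var/lib/postgresql/scripts/dump_all.sh', script_text,
                  mode=0o755, user='postgres', group='postgres')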
=== removed file 'hooks/helpers.py'
--- hooks/helpers.py 2015-02-24 16:57:31 +0000
+++ hooks/helpers.py 1970-01-01 00:00:00 +0000
@@ -1,197 +0,0 @@
1# Copyright 2012 Canonical Ltd. This software is licensed under the
2# GNU Affero General Public License version 3 (see the file LICENSE).
3
4"""Helper functions for writing hooks in python."""
5
6__metaclass__ = type
7__all__ = [
8 'get_config',
9 'juju_status',
10 'log',
11 'log_entry',
12 'log_exit',
13 'make_charm_config_file',
14 'relation_get',
15 'relation_set',
16 'unit_info',
17 'wait_for_machine',
18 'wait_for_page_contents',
19 'wait_for_relation',
20 'wait_for_unit']
21
22from contextlib import contextmanager
23import json
24import operator
25from shelltoolbox import (
26 command,
27 run,
28 script_name)
29import os
30import tempfile
31import time
32import urllib2
33import yaml
34
35
36log = command('juju-log')
37
38
39def log_entry():
40 log("--> Entering {}".format(script_name()))
41
42
43def log_exit():
44 log("<-- Exiting {}".format(script_name()))
45
46
47def get_config():
48 config_get = command('config-get', '--format=json')
49 return json.loads(config_get())
50
51
52def relation_get(*args):
53 cmd = command('relation-get')
54 return cmd(*args).strip()
55
56
57def relation_set(**kwargs):
58 cmd = command('relation-set')
59 args = ['{}={}'.format(k, v) for k, v in kwargs.items()]
60 return cmd(*args)
61
62
63def make_charm_config_file(charm_config):
64 charm_config_file = tempfile.NamedTemporaryFile()
65 charm_config_file.write(yaml.dump(charm_config))
66 charm_config_file.flush()
67 # The NamedTemporaryFile instance is returned instead of just the name
68 # because we want to take advantage of garbage collection-triggered
69 # deletion of the temp file when it goes out of scope in the caller.
70 return charm_config_file
71
72
73def juju_status(key):
74 return yaml.safe_load(run('juju', 'status'))[key]
75
76
77def get_charm_revision(service_name):
78 service = juju_status('services')[service_name]
79 return int(service['charm'].split('-')[-1])
80
81
82def unit_info(service_name, item_name, data=None):
83 services = juju_status('services') if data is None else data['services']
84 service = services.get(service_name)
85 if service is None:
86 # XXX 2012-02-08 gmb:
87 # This allows us to cope with the race condition that we
88 # have between deploying a service and having it come up in
89 # `juju status`. We could probably do with cleaning it up so
90 # that it fails a bit more noisily after a while.
91 return ''
92 units = service['units']
93 item = units.items()[0][1][item_name]
94 return item
95
96
97@contextmanager
98def maintain_charm_revision(path=None):
99 if path is None:
100 path = os.path.join(os.path.dirname(__file__), '..', 'revision')
101 revision = open(path).read()
102 try:
103 yield revision
104 finally:
105 with open(path, 'w') as f:
106 f.write(revision)
107
108
109def upgrade_charm(service_name, timeout=120):
110 next_revision = get_charm_revision(service_name) + 1
111 start_time = time.time()
112 run('juju', 'upgrade-charm', service_name)
113 while get_charm_revision(service_name) != next_revision:
114 if time.time() - start_time >= timeout:
115 raise RuntimeError('timeout waiting for charm to be upgraded')
116 time.sleep(0.1)
117 return next_revision
118
119
120def wait_for_machine(num_machines=1, timeout=300):
121 """Wait `timeout` seconds for `num_machines` machines to come up.
122
123 This wait_for... function can be called by other wait_for functions
124 whose timeouts might be too short in situations where only a bare
125 Juju setup has been bootstrapped.
126 """
127 # You may think this is a hack, and you'd be right. The easiest way
128 # to tell what environment we're working in (LXC vs EC2) is to check
129 # the dns-name of the first machine. If it's localhost we're in LXC
130 # and we can just return here.
131 if juju_status('machines')[0]['dns-name'] == 'localhost':
132 return
133 start_time = time.time()
134 while True:
135 # Drop the first machine, since it's the Zookeeper and that's
136 # not a machine that we need to wait for. This will only work
137 # for EC2 environments, which is why we return early above if
138 # we're in LXC.
139 machine_data = juju_status('machines')
140 non_zookeeper_machines = [
141 machine_data[key] for key in machine_data.keys()[1:]]
142 if len(non_zookeeper_machines) >= num_machines:
143 all_machines_running = True
144 for machine in non_zookeeper_machines:
145 if machine['instance-state'] != 'running':
146 all_machines_running = False
147 break
148 if all_machines_running:
149 break
150 if time.time() - start_time >= timeout:
151 raise RuntimeError('timeout waiting for service to start')
152 time.sleep(0.1)
153
154
155def wait_for_unit(service_name, timeout=480):
156 """Wait `timeout` seconds for a given service name to come up."""
157 wait_for_machine(num_machines=1)
158 start_time = time.time()
159 while True:
160 state = unit_info(service_name, 'state')
161 if 'error' in state or state == 'started':
162 break
163 if time.time() - start_time >= timeout:
164 raise RuntimeError('timeout waiting for service to start')
165 time.sleep(0.1)
166 if state != 'started':
167 raise RuntimeError('unit did not start, state: ' + state)
168
169
170def wait_for_relation(service_name, relation_name, timeout=120):
171 """Wait `timeout` seconds for a given relation to come up."""
172 start_time = time.time()
173 while True:
174 relation = unit_info(service_name, 'relations').get(relation_name)
175 if relation is not None and relation['state'] == 'up':
176 break
177 if time.time() - start_time >= timeout:
178 raise RuntimeError('timeout waiting for relation to be up')
179 time.sleep(0.1)
180
181
182def wait_for_page_contents(url, contents, timeout=120, validate=None):
183 if validate is None:
184 validate = operator.contains
185 start_time = time.time()
186 while True:
187 try:
188 stream = urllib2.urlopen(url)
189 except (urllib2.HTTPError, urllib2.URLError):
190 pass
191 else:
192 page = stream.read()
193 if validate(page, contents):
194 return page
195 if time.time() - start_time >= timeout:
196 raise RuntimeError('timeout waiting for contents of ' + url)
197 time.sleep(0.1)
1980
=== removed file 'hooks/hooks.py'
--- hooks/hooks.py 2015-08-11 11:15:27 +0000
+++ hooks/hooks.py 1970-01-01 00:00:00 +0000
@@ -1,2820 +0,0 @@
1#!/usr/bin/env python
2# vim: et ai ts=4 sw=4:
3
4from contextlib import contextmanager
5import commands
6import cPickle as pickle
7from distutils.version import StrictVersion
8import glob
9from grp import getgrnam
10import os
11from pwd import getpwnam
12import re
13import shutil
14import socket
15import subprocess
16import sys
17from tempfile import NamedTemporaryFile
18from textwrap import dedent
19import time
20import urlparse
21
22from charmhelpers import fetch
23from charmhelpers.core import hookenv, host
24from charmhelpers.core.hookenv import (
25 CRITICAL, ERROR, WARNING, INFO, DEBUG)
26
27try:
28 import psycopg2
29 from jinja2 import Template
30except ImportError:
31 fetch.apt_update(fatal=True)
32 fetch.apt_install(['python-psycopg2', 'python-jinja2'], fatal=True)
33 import psycopg2
34 from jinja2 import Template
35
36from psycopg2.extensions import AsIs
37from jinja2 import Environment, FileSystemLoader
38
39
40hooks = hookenv.Hooks()
41
42
43def log(msg, lvl=INFO):
44 '''Log a message.
45
46 Per Bug #1208787, log messages sent via juju-log are being lost.
47 Spit messages out to a log file to work around the problem.
48 It is also rather nice to have the log messages we explicitly emit
49 in a separate log file, rather than just mashed up with all the
50 juju noise.
51 '''
52 myname = hookenv.local_unit().replace('/', '-')
53 ts = time.strftime('%Y-%m-%d %H:%M:%S', time.gmtime())
54 with open('{}/{}-debug.log'.format(juju_log_dir, myname), 'a') as f:
55 f.write('{} {}: {}\n'.format(ts, lvl, msg))
56 hookenv.log(msg, lvl)
57
58
59def pg_version():
60 '''Return pg_version to use.
61
62 Return "version" config item if set, else use version from "postgresql"
63 package candidate, saving it in local_state for later.
64 '''
65 config_data = hookenv.config()
66 if 'pg_version' in local_state:
67 version = local_state['pg_version']
68 elif 'version' in config_data:
69 version = config_data['version']
70 else:
71 log("map version from distro release ...")
72 version_map = {'precise': '9.1',
73 'trusty': '9.3'}
74 version = version_map.get(distro_codename())
75 if not version:
76 log("No PG version map for distro_codename={}, "
77 "you'll need to explicitly set it".format(distro_codename()),
78 CRITICAL)
79 sys.exit(1)
80 log("version={} from distro_codename='{}'".format(
81 version, distro_codename()))
82 # save it for later
83 local_state.setdefault('pg_version', version)
84 local_state.save()
85
86 assert version, "pg_version couldn't find a version to use"
87 return version
88
89
90def distro_codename():
91 """Return the distro release code name, eg. 'precise' or 'trusty'."""
92 return host.lsb_release()['DISTRIB_CODENAME']
93
94
95def render_template(template_name, vars):
96 # deferred import so install hook can install jinja2
97 templates_dir = os.path.join(os.environ['CHARM_DIR'], 'templates')
98 template_env = Environment(loader=FileSystemLoader(templates_dir))
99 template = template_env.get_template(template_name)
100 return template.render(vars)
101
102
103class State(dict):
104 """Encapsulate state common to the unit for republishing to relations."""
105 def __init__(self, state_file):
106 super(State, self).__init__()
107 self._state_file = state_file
108 self.load()
109
110 def load(self):
111 '''Load stored state from local disk.'''
112 if os.path.exists(self._state_file):
113 state = pickle.load(open(self._state_file, 'rb'))
114 else:
115 state = {}
116 self.clear()
117
118 self.update(state)
119
120 def save(self):
121 '''Store state to local disk.'''
122 state = {}
123 state.update(self)
124 old_mask = os.umask(0o077) # This file contains database passwords!
125 try:
126 pickle.dump(state, open(self._state_file, 'wb'))
127 finally:
128 os.umask(old_mask)
129
130 def publish(self):
131 """Publish relevant unit state to relations"""
132
133 def add(state_dict, key):
134 if key in self:
135 state_dict[key] = self[key]
136
137 client_state = {}
138 add(client_state, 'state')
139
140 for relid in hookenv.relation_ids('db'):
141 hookenv.relation_set(relid, client_state)
142
143 for relid in hookenv.relation_ids('db-admin'):
144 hookenv.relation_set(relid, client_state)
145
146 replication_state = dict(client_state)
147
148 add(replication_state, 'replication_password')
149 add(replication_state, 'port')
150 add(replication_state, 'wal_received_offset')
151 add(replication_state, 'following')
152 add(replication_state, 'client_relations')
153
154 authorized = self.get('authorized', None)
155 if authorized:
156 replication_state['authorized'] = ' '.join(sorted(authorized))
157
158 for relid in hookenv.relation_ids('replication'):
159 hookenv.relation_set(relid, replication_state)
160
161 for relid in hookenv.relation_ids('master'):
162 hookenv.relation_set(relid, state=self.get('state'))
163
164 log('saving local state', DEBUG)
165 self.save()
166
167
168def volume_get_all_mounted():
169 command = ("mount |egrep %s" % external_volume_mount)
170 status, output = commands.getstatusoutput(command)
171 if status != 0:
172 return None
173 return output
174
175
176def postgresql_autostart(enabled):
177 postgresql_config_dir = _get_postgresql_config_dir()
178 startup_file = os.path.join(postgresql_config_dir, 'start.conf')
179 if enabled:
180 log("Enabling PostgreSQL startup in {}".format(startup_file))
181 mode = 'auto'
182 else:
183 log("Disabling PostgreSQL startup in {}".format(startup_file))
184 mode = 'manual'
185 template_file = "{}/templates/start_conf.tmpl".format(hookenv.charm_dir())
186 contents = Template(open(template_file).read()).render({'mode': mode})
187 host.write_file(
188 startup_file, contents, 'postgres', 'postgres', perms=0o644)
189
190
191def run(command, exit_on_error=True, quiet=False):
192 '''Run a command and return the output.'''
193 if not quiet:
194 log("Running {!r}".format(command), DEBUG)
195 p = subprocess.Popen(
196 command, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
197 shell=isinstance(command, basestring))
198 p.stdin.close()
199 lines = []
200 for line in p.stdout:
201 if line:
202 # LP:1274460 & LP:1259490 mean juju-log is nowhere near as
203 # useful as we would like, so just shove a copy of the
204 # output to stdout for logging.
205 # log("> {}".format(line), DEBUG)
206 if not quiet:
207 print line.rstrip('\n')
208 lines.append(line.rstrip('\n'))  # store without trailing newline
209 elif p.poll() is not None:
210 break
211
212 p.wait()
213
214 if p.returncode == 0:
215 return '\n'.join(lines)
216
217 if exit_on_error:
218 log("ERROR: {}".format(p.returncode), ERROR)
219 sys.exit(p.returncode)
220
221 raise subprocess.CalledProcessError(p.returncode, command,
222 '\n'.join(lines))
223
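# Behaviour sketch for run() (example invocations, not from the charm):
# a string command runs through the shell, a list does not, per the
# isinstance() check above.
#
#     run(['pg_lsclusters'], quiet=True)     # argv list, no shell
#     run("mount | egrep /srv/data",         # shell pipeline
#         exit_on_error=False)               # raises CalledProcessError
#                                            # on failure instead of
#                                            # exiting the hook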
224
225def postgresql_is_running():
226 '''Return true if PostgreSQL is running.'''
227 for version, name, _, status in lsclusters(slice(4)):
228 if (version, name) == (pg_version(), hookenv.config('cluster_name')):
229 if 'online' in status.split(','):
230 log('PostgreSQL is running', DEBUG)
231 return True
232 else:
233 log('PostgreSQL is not running', DEBUG)
234 return False
235 assert False, 'Cluster {} {} not found'.format(
236 pg_version(), hookenv.config('cluster_name'))
237
238
239def postgresql_stop():
240 '''Shutdown PostgreSQL.'''
241 if postgresql_is_running():
242 run([
243 'pg_ctlcluster', '--force',
244 pg_version(), hookenv.config('cluster_name'), 'stop'])
245 log('PostgreSQL shut down')
246
247
248def postgresql_start():
249 '''Start PostgreSQL if it is not already running.'''
250 if not postgresql_is_running():
251 run([
252 'pg_ctlcluster', pg_version(),
253 hookenv.config('cluster_name'), 'start'])
254 log('PostgreSQL started')
255
256
257def postgresql_restart():
258 '''Restart PostgreSQL, or start it if it is not already running.'''
259 if postgresql_is_running():
260 with restart_lock(hookenv.local_unit(), True):
261 run([
262 'pg_ctlcluster', '--force',
263 pg_version(), hookenv.config('cluster_name'), 'restart'])
264 log('PostgreSQL restarted')
265 else:
266 postgresql_start()
267
268 assert postgresql_is_running()
269
270 # Store a copy of our known live configuration so
271 # postgresql_reload_or_restart() can make good choices.
272 if 'saved_config' in local_state:
273 local_state['live_config'] = local_state['saved_config']
274 local_state.save()
275
276
277def postgresql_reload():
278 '''Make PostgreSQL reload its configuration.'''
279 # reload returns a reliable exit status
280 if postgresql_is_running():
281 # I'm using the PostgreSQL function to avoid as much indirection
282 # as possible.
283 success = run_select_as_postgres('SELECT pg_reload_conf()')[1][0][0]
284 assert success, 'Failed to reload PostgreSQL configuration'
285 log('PostgreSQL configuration reloaded')
286 return postgresql_start()
287
288
289def requires_restart():
290 '''Check for configuration changes requiring a restart to take effect.'''
291 if not postgresql_is_running():
292 return True
293
294 saved_config = local_state.get('saved_config', None)
295 if not saved_config:
296 log("No record of postgresql.conf state. Better restart.")
297 return True
298
299 live_config = local_state.setdefault('live_config', {})
300
301 # Pull in a list of PostgreSQL settings.
302 cur = db_cursor()
303 cur.execute("SELECT name, context FROM pg_settings")
304 restart = False
305 for name, context in cur.fetchall():
306 live_value = live_config.get(name, None)
307 new_value = saved_config.get(name, None)
308
309 if new_value != live_value:
310 if live_config:
311 log("Changed {} from {!r} to {!r}".format(
312 name, live_value, new_value), DEBUG)
313 if context == 'postmaster':
314 # A setting has changed that requires PostgreSQL to be
315 # restarted before it will take effect.
316 restart = True
317 log('{} changed from {} to {}. Restart required.'.format(
318 name, live_value, new_value), DEBUG)
319 return restart
320
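# Illustration (standard PostgreSQL behaviour, not charm-specific):
# pg_settings labels every setting with a context, and only the
# 'postmaster' context forces a restart. For example:
#
#     shared_buffers  -> context 'postmaster' -> restart required
#     log_line_prefix -> context 'sighup'     -> reload is enough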
321
322def postgresql_reload_or_restart():
323 """Reload PostgreSQL configuration, restarting if necessary."""
324 if requires_restart():
325 log("Configuration change requires PostgreSQL restart", WARNING)
326 postgresql_restart()
327 assert not requires_restart(), "Configuration changes failed to apply"
328 else:
329 postgresql_reload()
330
331 local_state['saved_config'] = local_state['live_config']
332 local_state.save()
333
334
335def get_service_port():
336 '''Return the port PostgreSQL is listening on.'''
337 for version, name, port in lsclusters(slice(3)):
338 if (version, name) == (pg_version(), hookenv.config('cluster_name')):
339 return int(port)
340
341 assert False, 'No port found for {!r} {!r}'.format(
342 pg_version(), hookenv.config('cluster_name'))
343
344
345def lsclusters(s=slice(0, -1)):
346 for line in run('pg_lsclusters', quiet=True).splitlines()[1:]:
347 if line:
348 yield line.split()[s]
349
350
351def createcluster():
352 with switch_cwd('/tmp'): # Ensure cwd is readable as the postgres user
353 create_cmd = [
354 "pg_createcluster",
355 "--locale", hookenv.config('locale'),
356 "-e", hookenv.config('encoding')]
357 if hookenv.config('listen_port'):
358 create_cmd.extend(["-p", str(hookenv.config('listen_port'))])
359 version = pg_version()
360 create_cmd.append(version)
361 create_cmd.append(hookenv.config('cluster_name'))
362
363 # With 9.3+, we make an opinionated decision to always enable
364 # data checksums. This seems to be best practice. We could
365 # turn this into a configuration item if there is need. There
366 # is no way to enable this option on existing clusters.
367 if StrictVersion(version) >= StrictVersion('9.3'):
368 create_cmd.extend(['--', '--data-checksums'])
369
370 run(create_cmd)
371 # Ensure SSL certificates exist, as we enable SSL by default.
372 create_ssl_cert(os.path.join(
373 postgresql_data_dir, pg_version(), hookenv.config('cluster_name')))
374
375
376def _get_system_ram():
377 """ Return the system ram in Megabytes """
378 import psutil
379 return psutil.phymem_usage()[0] / (1024 ** 2)
380
381
382def _get_page_size():
383 """ Return the operating system's configured PAGE_SIZE """
384 return int(run("getconf PAGE_SIZE")) # frequently 4096
385
386
387def _run_sysctl(postgresql_sysctl):
388 """sysctl -p postgresql_sysctl, helper for easy test mocking."""
389 # Do not error out when this fails. It is not likely to work under LXC.
390 return run("sysctl -p {}".format(postgresql_sysctl), exit_on_error=False)
391
392
393def create_postgresql_config(config_file):
394 '''Create the postgresql.conf file'''
395 config_data = hookenv.config()
396 if not config_data.get('listen_port', None):
397 config_data['listen_port'] = get_service_port()
398 if config_data["performance_tuning"].lower() != "manual":
399 total_ram = _get_system_ram()
400 config_data["kernel_shmmax"] = (int(total_ram) * 1024 * 1024) + 1024
401 config_data["kernel_shmall"] = config_data["kernel_shmmax"]
402
403 # XXX: This is very messy - should probably be a subordinate charm
404 lines = ["kernel.sem = 250 32000 100 1024\n"]
405 if config_data["kernel_shmall"] > 0:
406 # Convert config kernel_shmall (bytes) to pages
407 page_size = _get_page_size()
408 num_pages = config_data["kernel_shmall"] / page_size
409 if (config_data["kernel_shmall"] % page_size) > 0:
410 num_pages += 1
411 lines.append("kernel.shmall = %s\n" % num_pages)
412 if config_data["kernel_shmmax"] > 0:
413 lines.append("kernel.shmmax = %s\n" % config_data["kernel_shmmax"])
414 host.write_file(postgresql_sysctl, ''.join(lines), perms=0o600)
415 _run_sysctl(postgresql_sysctl)
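        # Worked example, assuming 2048MB of RAM and a 4096 byte page
        # size: kernel_shmmax = 2048*1024*1024 + 1024 = 2147484672
        # bytes, and kernel.shmall = ceil(2147484672 / 4096) = 524289
        # pages -- shmmax is measured in bytes, shmall in pages.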
416
417 # Our config file specifies a default wal_level that only works
418 # with PostgreSQL 9.4. Downgrade this for earlier versions of
419 # PostgreSQL. We have this default so more things Just Work.
420 if (StrictVersion(pg_version()) < StrictVersion('9.4')
        and config_data['wal_level'] == 'logical'):
421 config_data['wal_level'] = 'hot_standby'
422
423 # If we are replicating, some settings may need to be overridden to
424 # certain minimum levels.
425 num_slaves = slave_count()
426 if num_slaves > 0:
427 log('{} hot standbys in peer relation.'.format(num_slaves))
428 log('Ensuring minimal replication settings')
429 config_data['hot_standby'] = True
430 if config_data['wal_level'] != 'logical':
431 config_data['wal_level'] = 'hot_standby'
432 config_data['wal_keep_segments'] = max(
433 config_data['wal_keep_segments'],
434 config_data['replicated_wal_keep_segments'])
435 # We need this set even if config_data['streaming_replication']
436 # is False, because the replication connection is still needed
437 # by pg_basebackup to build a hot standby.
438 config_data['max_wal_senders'] = max(
439 num_slaves, config_data['max_wal_senders'])
440
441 # Log shipping to Swift using SwiftWAL. This could be for
442 # non-streaming replication, or for PITR.
443 if config_data.get('swiftwal_log_shipping', None):
444 config_data['archive_mode'] = True
445 if config_data['wal_level'] != 'logical':
446 config_data['wal_level'] = 'hot_standby'
447 config_data['archive_command'] = swiftwal_archive_command()
448
449 if config_data.get('wal_e_storage_uri', None):
450 config_data['archive_mode'] = True
451 if config_data['wal_level'] != 'logical':
452 config_data['wal_level'] = 'hot_standby'
453 config_data['archive_command'] = wal_e_archive_command()
454
455 # Send config data to the template
456 # Return it as pg_config
457 charm_dir = hookenv.charm_dir()
458 template_file = "{}/templates/postgresql.conf.tmpl".format(charm_dir)
459 if not config_data.get('version', None):
460 config_data['version'] = pg_version()
461 pg_config = Template(
462 open(template_file).read()).render(config_data)
463 host.write_file(
464 config_file, pg_config,
465 owner="postgres", group="postgres", perms=0o600)
466
467 # Create or update files included from postgresql.conf.
468 configure_log_destination(os.path.dirname(config_file))
469
470 tune_postgresql_config(config_file)
471
472 local_state['saved_config'] = dict(config_data)
473 local_state.save()
474
475
476def tune_postgresql_config(config_file):
477 tune_workload = hookenv.config('performance_tuning').lower()
478 if tune_workload == "manual":
479 return # Requested no autotuning.
480
481 if tune_workload == "auto":
482 tune_workload = "mixed" # Pre-pgtune backwards compatibility.
483
484 with NamedTemporaryFile() as tmp_config:
485 run(['pgtune', '-i', config_file, '-o', tmp_config.name,
486 '-T', tune_workload,
487 '-c', str(hookenv.config('max_connections'))])
488 host.write_file(
489 config_file, open(tmp_config.name, 'r').read(),
490 owner='postgres', group='postgres', perms=0o600)
491
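# The pgtune call above is equivalent to something like the following
# (paths and values illustrative only):
#
#     pgtune -i /etc/postgresql/9.3/main/postgresql.conf \
#            -o /tmp/tmpXXXXXX -T mixed -c 100
#
# pgtune rewrites the memory and planner settings for the chosen
# workload; the result is then copied back over postgresql.conf.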
492
493def create_postgresql_ident(output_file):
494 '''Create the pg_ident.conf file.'''
495 ident_data = {}
496 charm_dir = hookenv.charm_dir()
497 template_file = "{}/templates/pg_ident.conf.tmpl".format(charm_dir)
498 pg_ident_template = Template(open(template_file).read())
499 host.write_file(
500 output_file, pg_ident_template.render(ident_data),
501 owner="postgres", group="postgres", perms=0o600)
502
503
504def generate_postgresql_hba(
505 output_file, user=None, schema_user=None, database=None):
506 '''Create the pg_hba.conf file.'''
507
508 # Per Bug #1117542, when generating the postgresql_hba file we
509 # need to cope with private-address being either an IP address
510 # or a hostname.
511 def munge_address(addr):
512 # http://stackoverflow.com/q/319279/196832
513 try:
514 socket.inet_aton(addr)
515 return "%s/32" % addr
516 except socket.error:
517 # It's not an IP address.
518 # XXX workaround for MAAS bug
519 # https://bugs.launchpad.net/maas/+bug/1250435
520 # If it's a CNAME, use the A record it points to.
521 # If it fails for some reason, return the original address
522 try:
523 output = run("dig +short -t CNAME %s" % addr,
                 exit_on_error=False).strip()
524 except subprocess.CalledProcessError:
525 return addr
526 if len(output) != 0:
527 return output.rstrip(".") # trailing dot
528 return addr
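    # Expected behaviour (examples only):
    #   munge_address('192.168.1.1') -> '192.168.1.1/32'
    #   munge_address('db1.maas')    -> the CNAME target if db1.maas is
    #                                   a CNAME, else 'db1.maas' as-is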
529
530 config_data = hookenv.config()
531 allowed_units = set()
532 relation_data = []
533 relids = hookenv.relation_ids('db') + hookenv.relation_ids('db-admin')
534 for relid in relids:
535 local_relation = hookenv.relation_get(
536 unit=hookenv.local_unit(), rid=relid)
537
538 # We might see relations that have not yet been setup enough.
539 # At a minimum, the relation-joined hook needs to have been run
540 # on the server so we have information about the usernames and
541 # databases to allow in.
542 if 'user' not in local_relation:
543 continue
544
545 for unit in hookenv.related_units(relid):
546 relation = hookenv.relation_get(unit=unit, rid=relid)
547
548 relation['relation-id'] = relid
549 relation['unit'] = unit
550
551 if relid.startswith('db-admin:'):
552 relation['user'] = 'all'
553 relation['database'] = 'all'
554 elif relid.startswith('db:'):
555 relation['user'] = local_relation.get('user', user)
556 relation['schema_user'] = local_relation.get('schema_user',
557 schema_user)
558 relation['database'] = local_relation.get('database', database)
559
560 if ((relation['user'] is None
561 or relation['schema_user'] is None
562 or relation['database'] is None)):
563 # Missing info in relation for this unit, so skip it.
564 continue
565 else:
566 raise RuntimeError(
567 'Unknown relation type {}'.format(repr(relid)))
568
569 allowed_units.add(unit)
570 relation['private-address'] = munge_address(
571 relation['private-address'])
572 relation_data.append(relation)
573
574 log(str(relation_data), INFO)
575
576 # Replication connections. Each unit needs to be able to connect to
577 # every other unit's postgres database and the magic replication
578 # database. It also needs to be able to connect to its own postgres
579 # database.
580 relids = hookenv.relation_ids('replication')
581 for relid in relids:
582 for unit in hookenv.related_units(relid):
583 relation = hookenv.relation_get(unit=unit, rid=relid)
584 remote_addr = munge_address(relation['private-address'])
585 remote_replication = {'database': 'replication',
586 'user': 'juju_replication',
587 'private-address': remote_addr,
588 'relation-id': relid,
589 'unit': unit,
590 }
591 relation_data.append(remote_replication)
592 remote_pgdb = {'database': 'postgres',
593 'user': 'juju_replication',
594 'private-address': remote_addr,
595 'relation-id': relid,
596 'unit': unit,
597 }
598 relation_data.append(remote_pgdb)
599
600 # More replication connections, this time from external services.
601 # Somewhat different than before, as we do not share credentials
602 # and services using 9.4's logical replication feature will want
603 # to specify the database name.
604 relids = hookenv.relation_ids('master')
605 for relid in relids:
606 for unit in hookenv.related_units(relid):
607 remote_rel = hookenv.relation_get(unit=unit, rid=relid)
608 local_rel = hookenv.relation_get(unit=hookenv.local_unit(),
609 rid=relid)
610 remote_addr = munge_address(remote_rel['private-address'])
611 remote_replication = {'database': 'replication',
612 'user': local_rel['user'],
613 'private-address': remote_addr,
614 'relation-id': relid,
615 'unit': unit,
616 }
617 relation_data.append(remote_replication)
618 if 'database' in local_rel:
619 remote_pgdb = {'database': local_rel['database'],
620 'user': local_rel['user'],
621 'private-address': remote_addr,
622 'relation-id': relid,
623 'unit': unit,
624 }
625 relation_data.append(remote_pgdb)
626
627 # Local hooks also need permissions to setup replication.
628 for relid in hookenv.relation_ids('replication'):
629 local_replication = {'database': 'postgres',
630 'user': 'juju_replication',
631 'private-address': munge_address(
632 hookenv.unit_private_ip()),
633 'relation-id': relid,
634 'unit': hookenv.local_unit(),
635 }
636 relation_data.append(local_replication)
637
638 # Admin IP addresses for people using tools like pgAdminIII in a
639 # local Juju environment. We accept a single IP or a comma separated
640 # list of IPs; these are appended to pg_hba.conf, granting the
641 # addresses socket access to the postgres server.
642 if config_data["admin_addresses"] != '':
643 if "," in config_data["admin_addresses"]:
644 admin_ip_list = config_data["admin_addresses"].split(",")
645 else:
646 admin_ip_list = [config_data["admin_addresses"]]
647
648 for admin_ip in admin_ip_list:
649 admin_host = {
650 'database': 'all',
651 'user': 'all',
652 'private-address': munge_address(admin_ip)}
653 relation_data.append(admin_host)
654
655 extra_pg_auth = [pg_auth.strip() for pg_auth in
656 config_data["extra_pg_auth"].split(',') if pg_auth]
657
658 template_file = "{}/templates/pg_hba.conf.tmpl".format(hookenv.charm_dir())
659 pg_hba_template = Template(open(template_file).read())
660 pg_hba_rendered = pg_hba_template.render(extra_pg_auth=extra_pg_auth,
661 access_list=relation_data)
662 host.write_file(
663 output_file, pg_hba_rendered,
664 owner="postgres", group="postgres", perms=0o600)
665 postgresql_reload()
666
667 # Loop through all db relations, making sure each knows the list of
668 # allowed hosts that was just added. lp:#1187508
669 # We sort the list to ensure stability, probably unnecessarily.
670 for relid in hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'):
671 hookenv.relation_set(
672 relid, {"allowed-units": " ".join(unit_sorted(allowed_units))})
673
674
675def install_postgresql_crontab(output_file):
676 '''Create the postgres user's crontab'''
677 config_data = hookenv.config()
678 config_data['scripts_dir'] = postgresql_scripts_dir
679 config_data['swiftwal_backup_command'] = swiftwal_backup_command()
680 config_data['swiftwal_prune_command'] = swiftwal_prune_command()
681 config_data['wal_e_backup_command'] = wal_e_backup_command()
682 config_data['wal_e_prune_command'] = wal_e_prune_command()
683
684 charm_dir = hookenv.charm_dir()
685 template_file = "{}/templates/postgres.cron.tmpl".format(charm_dir)
686 crontab_template = Template(
687 open(template_file).read()).render(config_data)
688 host.write_file(output_file, crontab_template, perms=0o600)
689
690
691def create_recovery_conf(master_host, master_port, restart_on_change=False):
692 if hookenv.config('manual_replication'):
693 log('manual_replication; should not be here', CRITICAL)
694 raise RuntimeError('manual_replication; should not be here')
695
696 version = pg_version()
697 cluster_name = hookenv.config('cluster_name')
698 postgresql_cluster_dir = os.path.join(
699 postgresql_data_dir, version, cluster_name)
700
701 recovery_conf_path = os.path.join(postgresql_cluster_dir, 'recovery.conf')
702 if os.path.exists(recovery_conf_path):
703 old_recovery_conf = open(recovery_conf_path, 'r').read()
704 else:
705 old_recovery_conf = None
706
707 charm_dir = hookenv.charm_dir()
708 streaming_replication = hookenv.config('streaming_replication')
709 template_file = "{}/templates/recovery.conf.tmpl".format(charm_dir)
710 params = dict(
711 host=master_host, port=master_port,
712 password=local_state['replication_password'],
713 streaming_replication=streaming_replication)
714 if hookenv.config('wal_e_storage_uri'):
715 params['restore_command'] = wal_e_restore_command()
716 elif hookenv.config('swiftwal_log_shipping'):
717 params['restore_command'] = swiftwal_restore_command()
718 recovery_conf = Template(open(template_file).read()).render(params)
719 log(recovery_conf, DEBUG)
720 host.write_file(
721 os.path.join(postgresql_cluster_dir, 'recovery.conf'),
722 recovery_conf, owner="postgres", group="postgres", perms=0o600)
723
724 if restart_on_change and old_recovery_conf != recovery_conf:
725 log("recovery.conf updated. Restarting to take effect.")
726 postgresql_restart()
727
728
729def ensure_swift_container(container):
730 from swiftclient import client as swiftclient
731 config = hookenv.config()
732 con = swiftclient.Connection(
733 authurl=config.get('os_auth_url', ''),
734 user=config.get('os_username', ''),
735 key=config.get('os_password', ''),
736 tenant_name=config.get('os_tenant_name', ''),
737 auth_version='2.0',
738 retries=0)
739 try:
740 con.head_container(container)
741 except swiftclient.ClientException:
742 con.put_container(container)
743
744
745def wal_e_envdir():
746 '''The envdir(1) environment location used to drive WAL-E.'''
747 return os.path.join(_get_postgresql_config_dir(), 'wal-e.env')
748
749
750def create_wal_e_envdir():
751 '''Regenerate the envdir(1) environment used to drive WAL-E.'''
752 config = hookenv.config()
753 env = dict(
754 SWIFT_AUTHURL=config.get('os_auth_url', ''),
755 SWIFT_TENANT=config.get('os_tenant_name', ''),
756 SWIFT_USER=config.get('os_username', ''),
757 SWIFT_PASSWORD=config.get('os_password', ''),
758 AWS_ACCESS_KEY_ID=config.get('aws_access_key_id', ''),
759 AWS_SECRET_ACCESS_KEY=config.get('aws_secret_access_key', ''),
760 WABS_ACCOUNT_NAME=config.get('wabs_account_name', ''),
761 WABS_ACCESS_KEY=config.get('wabs_access_key', ''),
762 WALE_SWIFT_PREFIX='',
763 WALE_S3_PREFIX='',
764 WALE_WABS_PREFIX='')
765
766 uri = config.get('wal_e_storage_uri', None)
767
768 if uri:
769 # Until juju provides us with proper leader election, we have a
770 # state where units do not know if they are alone or part of a
771 # cluster. To avoid units stomping on each other's WAL and backups,
772 # we use a unique container for each unit when they are not
773 # part of the peer relation. Once they are part of the peer
774 # relation, they share a container.
775 if local_state.get('state', 'standalone') == 'standalone':
776 if not uri.endswith('/'):
777 uri += '/'
778 uri += hookenv.local_unit().split('/')[-1]
779
780 parsed_uri = urlparse.urlparse(uri)
781
782 required_env = []
783 if parsed_uri.scheme == 'swift':
784 env['WALE_SWIFT_PREFIX'] = uri
785 required_env = ['SWIFT_AUTHURL', 'SWIFT_TENANT',
786 'SWIFT_USER', 'SWIFT_PASSWORD']
787 ensure_swift_container(parsed_uri.netloc)
788 elif parsed_uri.scheme == 's3':
789 env['WALE_S3_PREFIX'] = uri
790 required_env = ['AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY']
791 elif parsed_uri.scheme == 'wabs':
792 env['WALE_WABS_PREFIX'] = uri
793 required_env = ['WABS_ACCOUNT_NAME', 'WABS_ACCESS_KEY']
794 else:
795 log('Invalid wal_e_storage_uri {}'.format(uri), ERROR)
796
797 for env_key in required_env:
798 if not env[env_key].strip():
799 log('Missing {}'.format(env_key), ERROR)
800
801 # Regenerate the envdir(1) environment recommended by WAL-E.
802 # All possible keys are rewritten to ensure we remove old secrets.
803 host.mkdir(wal_e_envdir(), 'postgres', 'postgres', 0o750)
804 for k, v in env.items():
805 host.write_file(
806 os.path.join(wal_e_envdir(), k), v.strip(),
807 'postgres', 'postgres', 0o640)
808
809
810def wal_e_archive_command():
811 '''Return the archive_command needed in postgresql.conf.'''
812 return 'envdir {} wal-e wal-push %p'.format(wal_e_envdir())
813
814
815def wal_e_restore_command():
816 return 'envdir {} wal-e wal-fetch "%f" "%p"'.format(wal_e_envdir())
817
818
819def wal_e_backup_command():
820 postgresql_cluster_dir = os.path.join(
821 postgresql_data_dir, pg_version(), hookenv.config('cluster_name'))
822 return 'envdir {} wal-e backup-push {}'.format(
823 wal_e_envdir(), postgresql_cluster_dir)
824
825
826def wal_e_prune_command():
827 return 'envdir {} wal-e delete --confirm retain {}'.format(
828 wal_e_envdir(), hookenv.config('wal_e_backup_retention'))
829
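# For reference, these helpers render to commands like the following
# (the envdir path and retention value are illustrative):
#
#   archive: envdir /etc/postgresql/wal-e.env wal-e wal-push %p
#   restore: envdir /etc/postgresql/wal-e.env wal-e wal-fetch "%f" "%p"
#   backup:  envdir /etc/postgresql/wal-e.env wal-e backup-push \
#            /var/lib/postgresql/9.3/main
#   prune:   envdir /etc/postgresql/wal-e.env wal-e delete --confirm \
#            retain 2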
830
831def swiftwal_config():
832 postgresql_config_dir = _get_postgresql_config_dir()
833 return os.path.join(postgresql_config_dir, "swiftwal.conf")
834
835
836def create_swiftwal_config():
837 if not hookenv.config('swiftwal_container_prefix'):
838 return
839
840 # Until juju provides us with proper leader election, we have a
841 # state where units do not know if they are alone or part of a
842 # cluster. To avoid units stomping on each other's WAL and backups,
843 # we use a unique Swift container for each unit when they are not
844 # part of the peer relation. Once they are part of the peer
845 # relation, they share a container.
846 if local_state.get('state', 'standalone') == 'standalone':
847 container = '{}_{}'.format(hookenv.config('swiftwal_container_prefix'),
848 hookenv.local_unit().split('/')[-1])
849 else:
850 container = hookenv.config('swiftwal_container_prefix')
851
852 template_file = os.path.join(hookenv.charm_dir(),
853 'templates', 'swiftwal.conf.tmpl')
854 params = dict(hookenv.config())
855 params['swiftwal_container'] = container
856 content = Template(open(template_file).read()).render(params)
857 host.write_file(swiftwal_config(), content, "postgres", "postgres", 0o600)
858
859
860def swiftwal_archive_command():
861 '''Return the archive_command needed in postgresql.conf'''
862 return 'swiftwal --config={} archive-wal %p'.format(swiftwal_config())
863
864
865def swiftwal_restore_command():
866 '''Return the restore_command needed in recovery.conf'''
867 return 'swiftwal --config={} restore-wal %f %p'.format(swiftwal_config())
868
869
870def swiftwal_backup_command():
871 '''Return the backup command needed in postgres' crontab'''
872 cmd = 'swiftwal --config={} backup --port={}'.format(swiftwal_config(),
873 get_service_port())
874 if not hookenv.config('swiftwal_log_shipping'):
875 cmd += ' --xlog'
876 return cmd
877
878
879def swiftwal_prune_command():
880 '''Return the backup & wal pruning command needed in postgres' crontab'''
881 config = hookenv.config()
882 args = '--keep-backups={} --keep-wals={}'.format(
883 config.get('swiftwal_backup_retention', 0),
884 max(config['wal_keep_segments'],
885 config['replicated_wal_keep_segments']))
886 return 'swiftwal --config={} prune {}'.format(swiftwal_config(), args)
887
888
889def update_service_port():
890 old_port = local_state.get('listen_port', None)
891 new_port = get_service_port()
892 if old_port != new_port:
893 if new_port:
894 hookenv.open_port(new_port)
895 if old_port:
896 hookenv.close_port(old_port)
897 local_state['listen_port'] = new_port
898 local_state.save()
899
900
901def create_ssl_cert(cluster_dir):
902 # PostgreSQL expects SSL certificates in the datadir.
903 server_crt = os.path.join(cluster_dir, 'server.crt')
904 server_key = os.path.join(cluster_dir, 'server.key')
905 if not os.path.exists(server_crt):
906 os.symlink('/etc/ssl/certs/ssl-cert-snakeoil.pem',
907 server_crt)
908 if not os.path.exists(server_key):
909 os.symlink('/etc/ssl/private/ssl-cert-snakeoil.key',
910 server_key)
911
912
913def set_password(user, password):
914 if not os.path.isdir("passwords"):
915 os.makedirs("passwords")
916 old_umask = os.umask(0o077)
917 try:
918 with open("passwords/%s" % user, "w") as pwfile:
919 pwfile.write(password)
920 finally:
921 os.umask(old_umask)
922
923
924def get_password(user):
925 try:
926 with open("passwords/%s" % user) as pwfile:
927 return pwfile.read()
928 except IOError:
929 return None
930
931
932def db_cursor(autocommit=False, db='postgres', user='postgres',
933 host=None, port=None, timeout=30):
934 if port is None:
935 port = get_service_port()
936 if host:
937 conn_str = "dbname={} host={} port={} user={}".format(
938 db, host, port, user)
939 else:
940 conn_str = "dbname={} port={} user={}".format(db, port, user)
941 # There are often race conditions in opening database connections,
942 # such as a reload having just happened to change pg_hba.conf
943 # settings or a hot standby being restarted and needing to catch up
944 # with its master. To protect our automation against these sorts of
945 # race conditions, by default we always retry failed connections
946 # until a timeout is reached.
947 start = time.time()
948 while True:
949 try:
950 with pgpass():
951 conn = psycopg2.connect(conn_str)
952 break
953 except psycopg2.Error, x:
954 if time.time() > start + timeout:
955 log("Database connection {!r} failed".format(
956 conn_str), CRITICAL)
957 raise
958 log("Unable to open connection ({}), retrying.".format(x))
959 time.sleep(1)
960 conn.autocommit = autocommit
961 return conn.cursor()
962
963
964def run_sql_as_postgres(sql, *parameters):
965 cur = db_cursor(autocommit=True)
966 try:
967 cur.execute(sql, parameters)
968 return cur.statusmessage
969 except psycopg2.ProgrammingError:
970 log(sql, CRITICAL)
971 raise
972
973
974def run_select_as_postgres(sql, *parameters):
975 cur = db_cursor()
976 cur.execute(sql, parameters)
977 # NB. Need to suck in the results before the rowcount is valid.
978 results = cur.fetchall()
979 return (cur.rowcount, results)
980
981
982def validate_config():
983 """
984 Sanity check charm configuration, aborting the script if
985 we have bogus config values or config changes the charm does not yet
986 (or cannot) support.
987 """
988 valid = True
989 config_data = hookenv.config()
990
991 version = config_data.get('version', None)
992 if version:
993 if version not in ('9.1', '9.2', '9.3', '9.4'):
994 valid = False
995 log("Invalid or unsupported version {!r} requested".format(
996 version), CRITICAL)
997
998 if config_data['cluster_name'] != 'main':
999 valid = False
1000 log("Cluster names other than 'main' do not work per LP:1271835",
1001 CRITICAL)
1002
1003 if config_data['listen_ip'] != '*':
1004 valid = False
1005 log("listen_ip values other than '*' do not work per LP:1271837",
1006 CRITICAL)
1007
1008 valid_workloads = [
1009 'dw', 'oltp', 'web', 'mixed', 'desktop', 'manual', 'auto']
1010 requested_workload = config_data['performance_tuning'].lower()
1011 if requested_workload not in valid_workloads:
1012 valid = False
1013 log('Invalid performance_tuning setting {}'.format(requested_workload),
1014 CRITICAL)
1015 if requested_workload == 'auto':
1016 log("'auto' performance_tuning deprecated. Using 'mixed' tuning",
1017 WARNING)
1018
1019 unchangeable_config = [
1020 'locale', 'encoding', 'version', 'cluster_name', 'pgdg']
1021
1022 for name in unchangeable_config:
1023 if config_data._prev_dict is not None and config_data.changed(name):
1024 valid = False
1025 log("Cannot change {!r} setting after install.".format(name))
1026 local_state[name] = config_data.get(name, None)
1027 local_state.save()
1028
1029 package_status = config_data['package_status']
1030 if package_status not in ['install', 'hold']:
1031 valid = False
1032 log("package_status must be 'install' or 'hold' not '{}'"
1033 "".format(package_status), CRITICAL)
1034
1035 if not valid:
1036 sys.exit(99)
1037
1038
1039def ensure_package_status(package, status):
1040 selections = '{} {}\n'.format(package, status)
1041 dpkg = subprocess.Popen(
1042 ['dpkg', '--set-selections'], stdin=subprocess.PIPE)
1043 dpkg.communicate(input=selections)
1044
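# Equivalent shell (illustrative package name):
#
#     echo "postgresql-9.3 hold" | dpkg --set-selections
#
# A status of 'hold' stops apt from upgrading the package; 'install'
# releases the hold.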
1045
1046# -----------------------------------------------------------------------------
1047# Core logic for permanent storage changes:
1048# NOTE the only 2 "True" return points:
1049# 1) symlink already pointing to existing storage (no-op)
1050# 2) new storage properly initialized:
1051# - if fresh new storage dir: rsync existing data
1052# - manipulate /var/lib/postgresql/VERSION/CLUSTER symlink
1053# -----------------------------------------------------------------------------
1054def config_changed_volume_apply(mount_point):
1055 version = pg_version()
1056 cluster_name = hookenv.config('cluster_name')
1057 data_directory_path = os.path.join(
1058 postgresql_data_dir, version, cluster_name)
1059
1060 assert(data_directory_path)
1061
1062 if not os.path.exists(data_directory_path):
1063 log(
1064 "postgresql data dir {} not found, "
1065 "not applying changes.".format(data_directory_path),
1066 CRITICAL)
1067 return False
1068
1069 new_pg_dir = os.path.join(mount_point, "postgresql")
1070 new_pg_version_cluster_dir = os.path.join(
1071 new_pg_dir, version, cluster_name)
1072 if not mount_point:
1073 log(
1074 "invalid mount point = {}, "
1075 "not applying changes.".format(mount_point), ERROR)
1076 return False
1077
1078 if ((os.path.islink(data_directory_path) and
1079 os.readlink(data_directory_path) == new_pg_version_cluster_dir and
1080 os.path.isdir(new_pg_version_cluster_dir))):
1081 log(
1082 "postgresql data dir '{}' already points "
1083 "to {}, skipping storage changes.".format(
1084 data_directory_path, new_pg_version_cluster_dir))
1085 log(
1086 "existing-symlink: to fix/avoid UID changes from "
1087 "previous units, doing: "
1088 "chown -R postgres:postgres {}".format(new_pg_dir))
1089 run("chown -R postgres:postgres %s" % new_pg_dir)
1090 return True
1091
1092 # Create a directory structure below "new" mount_point as
1093 # external_volume_mount/postgresql/9.1/main
1094 for new_dir in [new_pg_dir,
1095 os.path.join(new_pg_dir, version),
1096 new_pg_version_cluster_dir]:
1097 if not os.path.isdir(new_dir):
1098 log("mkdir %s".format(new_dir))
1099 host.mkdir(new_dir, owner="postgres", perms=0o700)
1100 # Carefully build this symlink, e.g.:
1101 # /var/lib/postgresql/9.1/main ->
1102 # external_volume_mount/postgresql/9.1/main
1103 # but keep previous "main/" directory, by renaming it to
1104 # main-$TIMESTAMP
1105 if not postgresql_stop() and postgresql_is_running():
1106 log("postgresql_stop() failed - can't migrate data.", ERROR)
1107 return False
1108 if not os.path.exists(os.path.join(
1109 new_pg_version_cluster_dir, "PG_VERSION")):
1110 log("migrating PG data {}/ -> {}/".format(
1111 data_directory_path, new_pg_version_cluster_dir), WARNING)
1112 # Avoid copying the PID file to perm storage (there shouldn't be one)
1113 command = "rsync -a --exclude postmaster.pid {}/ {}/".format(
1114 data_directory_path, new_pg_version_cluster_dir)
1115 log("run: {}".format(command))
1116 run(command)
1117 try:
1118 os.rename(data_directory_path, "{}-{}".format(
1119 data_directory_path, int(time.time())))
1120 log("NOTICE: symlinking {} -> {}".format(
1121 new_pg_version_cluster_dir, data_directory_path))
1122 os.symlink(new_pg_version_cluster_dir, data_directory_path)
1123 run("chown -h postgres:postgres {}".format(data_directory_path))
1124 log(
1125 "after-symlink: to fix/avoid UID changes from "
1126 "previous units, doing: "
1127 "chown -R postgres:postgres {}".format(new_pg_dir))
1128 run("chown -R postgres:postgres {}".format(new_pg_dir))
1129 return True
1130 except OSError:
1131 log("failed to symlink {} -> {}".format(
1132 data_directory_path, mount_point), CRITICAL)
1133 return False
1134
1135
1136def reset_manual_replication_state():
1137 '''In manual replication mode, the state of the local database cluster
1138 is outside of Juju's control. We need to detect and update the charm
1139 state to match reality.
1140 '''
1141 if hookenv.config('manual_replication'):
1142 if os.path.exists('recovery.conf'):
1143 local_state['state'] = 'hot standby'
1144 elif slave_count():
1145 local_state['state'] = 'master'
1146 else:
1147 local_state['state'] = 'standalone'
1148 local_state.publish()
1149
1150
1151@hooks.hook()
1152def config_changed(force_restart=False, mount_point=None):
1153 validate_config()
1154 config_data = hookenv.config()
1155 update_repos_and_packages()
1156
1157 if mount_point is not None:
1158 # config_changed_volume_apply will stop the service if it finds
1159 # it necessary, ie: new volume setup
1160 if config_changed_volume_apply(mount_point=mount_point):
1161 postgresql_autostart(True)
1162 else:
1163 postgresql_autostart(False)
1164 postgresql_stop()
1165 mounts = volume_get_all_mounted()
1166 if mounts:
1167 log("current mounted volumes: {}".format(mounts))
1168 log(
1169 "Disabled and stopped postgresql service "
1170 "(config_changed_volume_apply failure)", ERROR)
1171 sys.exit(1)
1172
1173 reset_manual_replication_state()
1174
1175 postgresql_config_dir = _get_postgresql_config_dir(config_data)
1176 postgresql_config = os.path.join(postgresql_config_dir, "postgresql.conf")
1177 postgresql_hba = os.path.join(postgresql_config_dir, "pg_hba.conf")
1178 postgresql_ident = os.path.join(postgresql_config_dir, "pg_ident.conf")
1179
1180 create_postgresql_config(postgresql_config)
1181 create_postgresql_ident(postgresql_ident) # Do this before pg_hba.conf.
1182 generate_postgresql_hba(postgresql_hba)
1183 create_ssl_cert(os.path.join(
1184 postgresql_data_dir, pg_version(), config_data['cluster_name']))
1185 create_swiftwal_config()
1186 create_wal_e_envdir()
1187 update_service_port()
1188 update_nrpe_checks()
1189 write_metrics_cronjob('/usr/local/bin/postgres_to_statsd.py',
1190 '/etc/cron.d/postgres_metrics')
1191
1192 # If an external mountpoint has caused an old, existing DB to be
1193 # mounted, we need to ensure that all the users, databases, roles
1194 # etc. exist with known passwords.
1195 if local_state['state'] in ('standalone', 'master'):
1196 client_relids = (
1197 hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'))
1198 for relid in client_relids:
1199 rel = hookenv.relation_get(rid=relid, unit=hookenv.local_unit())
1200 client_rel = None
1201 for unit in hookenv.related_units(relid):
1202 client_rel = hookenv.relation_get(unit=unit, rid=relid)
1203 if not client_rel:
1204 continue # No client units - in between departed and broken?
1205
1206 database = rel.get('database')
1207 if database is None:
1208 continue # The relation exists, but we haven't joined it yet.
1209
1210 roles = filter(None, (client_rel.get('roles') or '').split(","))
1211 user = rel.get('user')
1212 if user:
1213 admin = relid.startswith('db-admin')
1214 password = create_user(user, admin=admin)
1215 reset_user_roles(user, roles)
1216 hookenv.relation_set(relid, password=password)
1217
1218 schema_user = rel.get('schema_user')
1219 if schema_user:
1220 schema_password = create_user(schema_user)
1221 hookenv.relation_set(relid, schema_password=schema_password)
1222
1223 if user and schema_user and not (
1224 database is None or database == 'all'):
1225 ensure_database(user, schema_user, database)
1226
1227 if force_restart:
1228 postgresql_restart()
1229 postgresql_reload_or_restart()
1230
1231 # In case the log_line_prefix has changed, inform syslog consumers.
1232 for relid in hookenv.relation_ids('syslog'):
1233 hookenv.relation_set(
1234 relid, log_line_prefix=hookenv.config('log_line_prefix'))
1235
1236
1237@hooks.hook()
1238def install(run_pre=True, force_restart=True):
1239 if run_pre:
1240 for f in glob.glob('exec.d/*/charm-pre-install'):
1241 if os.path.isfile(f) and os.access(f, os.X_OK):
1242 subprocess.check_call(['sh', '-c', f])
1243
1244 validate_config()
1245
1246 config_data = hookenv.config()
1247 update_repos_and_packages()
1248 if 'state' not in local_state:
1249 log('state not in {}'.format(local_state.keys()), DEBUG)
1250 # Fresh installation. Because this function is invoked by both
1251 # the install hook and the upgrade-charm hook, we need to guard
1252 # any non-idempotent setup. We should probably fix this; it
1253 # seems rather fragile.
1254 local_state.setdefault('state', 'standalone')
1255 log(repr(local_state.keys()), DEBUG)
1256
1257 # Drop the cluster created when the postgresql package was
1258 # installed, and rebuild it with the requested locale and encoding.
1259 version = pg_version()
1260 for ver, name in lsclusters(slice(2)):
1261 if version == ver and name == 'main':
1262 run("pg_dropcluster --stop {} main".format(version))
1263 listen_port = config_data.get('listen_port', None)
1264 if listen_port:
1265 port_opt = "--port={}".format(config_data['listen_port'])
1266 else:
1267 port_opt = ''
1268 createcluster()
1269 assert (
1270 not port_opt
1271 or get_service_port() == config_data['listen_port']), (
1272 'allocated port {!r} != {!r}'.format(
1273 get_service_port(), config_data['listen_port']))
1274 local_state['port'] = get_service_port()
1275 log('publishing state', DEBUG)
1276 local_state.publish()
1277
1278 postgresql_backups_dir = (
1279 config_data['backup_dir'].strip() or
1280 os.path.join(postgresql_data_dir, 'backups'))
1281
1282 host.mkdir(postgresql_backups_dir, owner="postgres", perms=0o755)
1283 host.mkdir(postgresql_scripts_dir, owner="postgres", perms=0o755)
1284 host.mkdir(postgresql_logs_dir, owner="postgres", perms=0o755)
1285 paths = {
1286 'base_dir': postgresql_data_dir,
1287 'backup_dir': postgresql_backups_dir,
1288 'scripts_dir': postgresql_scripts_dir,
1289 'logs_dir': postgresql_logs_dir,
1290 }
1291 charm_dir = hookenv.charm_dir()
1292 template_file = "{}/templates/pg_backup_job.tmpl".format(charm_dir)
1293 backup_job = Template(open(template_file).read()).render(paths)
1294 host.write_file(
1295 os.path.join(postgresql_scripts_dir, 'dump-pg-db'),
1296 open('scripts/pgbackup.py', 'r').read(), perms=0o755)
1297 host.write_file(
1298 os.path.join(postgresql_scripts_dir, 'pg_backup_job'),
1299 backup_job, perms=0o755)
1300 install_postgresql_crontab(postgresql_crontab)
1301
1302 # Create this empty log file on installation to avoid triggering
1303 # spurious monitoring system alerts, per Bug #1329816.
1304 if not os.path.exists(backup_log):
1305 host.write_file(backup_log, '', 'postgres', 'postgres', 0o664)
1306
1307 hookenv.open_port(get_service_port())
1308
1309 # Ensure at least minimal access granted for hooks to run.
1310 # Reload because we are using the default cluster setup and started
1311 # when we installed the PostgreSQL packages.
1312 config_changed(force_restart=force_restart)
1313
1314 snapshot_relations()
1315
1316
1317@hooks.hook()
1318def upgrade_charm():
1319 """Handle saving state during an upgrade-charm hook.
1320
1321 When upgrading from an installation using volume-map, we migrate
1322 that installation to use the storage subordinate charm by remounting
1323 a mountpath that the storage subordinate maintains. We exit(1) only
1324 to raise the visibility of the manual procedure, logged to the juju
1325 logs below, that the juju admin must follow to finish the migration
1326 by relating postgresql to the storage and block-storage-broker
1327 services. These steps are also covered in the README.
1328 """
1329 install(run_pre=False, force_restart=False)
1330 snapshot_relations()
1331 version = pg_version()
1332 cluster_name = hookenv.config('cluster_name')
1333 data_directory_path = os.path.join(
1334 postgresql_data_dir, version, cluster_name)
1335 if os.path.islink(data_directory_path):
1336 link_target = os.readlink(data_directory_path)
1337 if "/srv/juju" in link_target:
1338 # Then we just upgraded from an installation that was using
1339 # charm config volume_map definitions. We need to stop postgresql
1340 # and remount the device where the storage subordinate expects to
1341 # control the mount in the future if relations/units change
1342 volume_id = link_target.split("/")[3]
1343 unit_name = hookenv.local_unit()
1344 new_mount_root = external_volume_mount
1345 new_pg_version_cluster_dir = os.path.join(
1346 new_mount_root, "postgresql", version, cluster_name)
1347 if not os.path.exists(new_mount_root):
1348 os.mkdir(new_mount_root)
1349 log("\n"
1350 "WARNING: %s unit has external volume id %s mounted via the\n"
1351 "deprecated volume-map and volume-ephemeral-storage\n"
1352 "configuration parameters.\n"
1353 "These parameters are no longer available in the postgresql\n"
1354 "charm in favor of using the volume_map parameter in the\n"
1355 "storage subordinate charm.\n"
1356 "We are migrating the attached volume to a mount path which\n"
1357 "can be managed by the storage subordinate charm. To\n"
1358 "continue using this volume_id with the storage subordinate\n"
1359 "follow this procedure.\n-----------------------------------\n"
1360 "1. cat > storage.cfg <<EOF\nstorage:\n"
1361 " provider: block-storage-broker\n"
1362 " root: %s\n"
1363 " volume_map: \"{%s: %s}\"\nEOF\n2. juju deploy "
1364 "--config storage.cfg storage\n"
1365 "3. juju deploy block-storage-broker\n4. juju add-relation "
1366 "block-storage-broker storage\n5. juju resolved --retry "
1367 "%s\n6. juju add-relation postgresql storage\n"
1368 "-----------------------------------\n" %
1369 (unit_name, volume_id, new_mount_root, unit_name, volume_id,
1370 unit_name), WARNING)
1371 postgresql_stop()
1372 os.unlink(data_directory_path)
1373 log("Unmounting external storage due to charm upgrade: %s" %
1374 link_target)
1375 try:
1376 subprocess.check_output(
1377 "umount /srv/juju/%s" % volume_id, shell=True)
1378 # Since e2label truncates labels to 16 characters use only the
1379 # first 16 characters of the volume_id as that's what was
1380 # set by old versions of postgresql charm
1381 subprocess.check_call(
1382 "mount -t ext4 LABEL=%s %s" %
1383 (volume_id[:16], new_mount_root), shell=True)
1384 except subprocess.CalledProcessError, e:
1385 log("upgrade-charm mount migration failed. %s" % str(e), ERROR)
1386 sys.exit(1)
1387
1388 log("NOTICE: symlinking {} -> {}".format(
1389 new_pg_version_cluster_dir, data_directory_path))
1390 os.symlink(new_pg_version_cluster_dir, data_directory_path)
1391 run("chown -h postgres:postgres {}".format(data_directory_path))
1392 postgresql_start() # Will exit(1) if issues
1393 log("Remount and restart success for this external volume.\n"
1394 "This current running installation will break upon\n"
1395 "add/remove postgresql units or relations if you do not\n"
1396 "follow the above procedure to ensure your external\n"
1397 "volumes are preserved by the storage subordinate charm.",
1398 WARNING)
1399 # So juju admins can see the hook fail and note the steps to fix
1400 # per our WARNINGs above
1401 sys.exit(1)
1402
1403
1404@hooks.hook()
1405def start():
1406 postgresql_reload_or_restart()
1407
1408
1409@hooks.hook()
1410def stop():
1411 if postgresql_is_running():
1412 with restart_lock(hookenv.local_unit(), True):
1413 postgresql_stop()
1414
1415
1416def quote_identifier(identifier):
1417 r'''Quote an identifier, such as a table or role name.
1418
1419 In SQL, identifiers are quoted using " rather than ' (which is reserved
1420 for strings).
1421
1422 >>> print(quote_identifier('hello'))
1423 "hello"
1424
1425 Quotes and Unicode are handled if you make use of them in your
1426 identifiers.
1427
1428 >>> print(quote_identifier("'"))
1429 "'"
1430 >>> print(quote_identifier('"'))
1431 """"
1432 >>> print(quote_identifier("\\"))
1433 "\"
1434 >>> print(quote_identifier('\\"'))
1435 "\"""
1436 >>> print(quote_identifier('\\ aargh \u0441\u043b\u043e\u043d'))
1437 U&"\\ aargh \0441\043b\043e\043d"
1438 '''
1439 try:
1440 return '"%s"' % identifier.encode('US-ASCII').replace('"', '""')
1441 except UnicodeEncodeError:
1442 escaped = []
1443 for c in identifier:
1444 if c == '\\':
1445 escaped.append('\\\\')
1446 elif c == '"':
1447 escaped.append('""')
1448 else:
1449 c = c.encode('US-ASCII', 'backslashreplace')
1450 # backslashreplace gives us \uXXXX escapes for BMP codepoints,
1451 # which map onto PostgreSQL's 4 hexdigit syntax (\1234) rather
1452 # than the 6 hexdigit format (\+123456).
1453 if c.startswith('\\u'):
1454 c = '\\' + c[2:]
1455 escaped.append(c)
1456 return 'U&"%s"' % ''.join(escaped)
1457
1458
1459def sanitize(s):
1460 s = s.replace(':', '_')
1461 s = s.replace('-', '_')
1462 s = s.replace('/', '_')
1463 s = s.replace('"', '_')
1464 s = s.replace("'", '_')
1465 return s
1466
1467
1468def user_name(relid, remote_unit, admin=False, schema=False):
1469 # Per Bug #1160530, don't append the remote unit number to the user name.
1470 components = [sanitize(relid), sanitize(re.split("/", remote_unit)[0])]
1471 if admin:
1472 components.append("admin")
1473 elif schema:
1474 components.append("schema")
1475 return "_".join(components)
1476
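# Examples, doctest-style, derived from sanitize() above:
#
#     >>> user_name('db:12', 'wordpress/0')
#     'db_12_wordpress'
#     >>> user_name('db-admin:3', 'nagios/1', admin=True)
#     'db_admin_3_nagios_admin'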
1477
1478def user_exists(user):
1479 sql = "SELECT rolname FROM pg_roles WHERE rolname = %s"
1480 if run_select_as_postgres(sql, user)[0] > 0:
1481 return True
1482 else:
1483 return False
1484
1485
1486def create_user(user, admin=False, replication=False):
1487 password = get_password(user)
1488 if password is None:
1489 password = host.pwgen()
1490 set_password(user, password)
1491 if user_exists(user):
1492 log("Updating {} user".format(user))
1493 action = ["ALTER ROLE"]
1494 else:
1495 log("Creating {} user".format(user))
1496 action = ["CREATE ROLE"]
1497 action.append('%s WITH LOGIN')
1498 if admin:
1499 action.append('SUPERUSER')
1500 else:
1501 action.append('NOSUPERUSER')
1502 if replication:
1503 action.append('REPLICATION')
1504 else:
1505 action.append('NOREPLICATION')
1506 action.append('PASSWORD %s')
1507 sql = ' '.join(action)
1508 run_sql_as_postgres(sql, AsIs(quote_identifier(user)), password)
1509 return password
1510
1511
1512def reset_user_roles(user, roles):
1513 wanted_roles = set(roles)
1514
1515 sql = """
1516 SELECT role.rolname
1517 FROM
1518 pg_roles AS role,
1519 pg_roles AS member,
1520 pg_auth_members
1521 WHERE
1522 member.oid = pg_auth_members.member
1523 AND role.oid = pg_auth_members.roleid
1524 AND member.rolname = %s
1525 """
1526 existing_roles = set(r[0] for r in run_select_as_postgres(sql, user)[1])
1527
1528 roles_to_grant = wanted_roles.difference(existing_roles)
1529
1530 for role in roles_to_grant:
1531 ensure_role(role)
1532
1533 if roles_to_grant:
1534 log("Granting {} to {}".format(",".join(roles_to_grant), user), INFO)
1535
1536 for role in roles_to_grant:
1537 run_sql_as_postgres(
1538 "GRANT %s TO %s",
1539 AsIs(quote_identifier(role)), AsIs(quote_identifier(user)))
1540
1541 roles_to_revoke = existing_roles.difference(wanted_roles)
1542
1543 if roles_to_revoke:
1544 log("Revoking {} from {}".format(",".join(roles_to_grant), user), INFO)
1545
1546 for role in roles_to_revoke:
1547 run_sql_as_postgres(
1548 "REVOKE %s FROM %s",
1549 AsIs(quote_identifier(role)), AsIs(quote_identifier(user)))
1550
1551
1552def ensure_role(role):
1553 sql = "SELECT oid FROM pg_roles WHERE rolname = %s"
1554 if run_select_as_postgres(sql, role)[0] == 0:
1555 sql = "CREATE ROLE %s INHERIT NOLOGIN"
1556 run_sql_as_postgres(sql, AsIs(quote_identifier(role)))
1557
1558
1559def ensure_database(user, schema_user, database):
1560 sql = "SELECT datname FROM pg_database WHERE datname = %s"
1561 if run_select_as_postgres(sql, database)[0] != 0:
1562 # DB already exists
1563 pass
1564 else:
1565 sql = "CREATE DATABASE %s"
1566 run_sql_as_postgres(sql, AsIs(quote_identifier(database)))
1567 sql = "GRANT ALL PRIVILEGES ON DATABASE %s TO %s"
1568 run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
1569 AsIs(quote_identifier(schema_user)))
1570 sql = "GRANT CONNECT ON DATABASE %s TO %s"
1571 run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
1572 AsIs(quote_identifier(user)))
1573
1574
1575def ensure_extensions(extensions, database):
1576 if extensions:
1577 cur = db_cursor(db=database, autocommit=True)
1578 try:
1579 cur.execute('SELECT extname FROM pg_extension')
1580 installed_extensions = frozenset(x[0] for x in cur.fetchall())
1581 log("ensure_extensions({}), have {}"
1582 .format(extensions, installed_extensions),
1583 DEBUG)
1584 extensions_set = frozenset(extensions)
1585 extensions_to_create = \
1586 extensions_set.difference(installed_extensions)
1587 for ext in extensions_to_create:
1588 log("creating extension {}".format(ext), DEBUG)
1589 cur.execute('CREATE EXTENSION %s',
1590 (AsIs(quote_identifier(ext)),))
1591 finally:
1592 cur.close()
1593
1594
1595def snapshot_relations():
1596 '''Snapshot our relation information into local state.
1597
1598 We need this information to be available in -broken
1599 hooks letting us actually clean up properly. Bug #1190996.
1600 '''
1601 log("Snapshotting relations", DEBUG)
1602 local_state['relations'] = hookenv.relations()
1603 local_state.save()
1604
1605
1606# Each database unit needs to publish connection details to the
1607# client. This is problematic, because 1) the user and database are
1608# only created on the master unit and this is replicated to the
1609# slave units outside of juju and 2) we have no control over the
1610# order that units join the relation.
1611#
1612# The simplest approach of generating usernames and passwords in
1613# the master units db-relation-joined hook fails because slave
1614# units may well have already run their hooks and found no
1615# connection details to republish. When the master unit publishes
1616# the connection details it only triggers relation-changed hooks
1617# on the client units, not the relation-changed hook on other peer
1618# units.
1619#
1620# A more complex approach is for the first database unit that joins
1621# the relation to generate the usernames and passwords and publish
1622# this to the relation. Subsequent units can retrieve this
1623# information and republish it. Of course, the master unit also
1624# creates the database and users when it joins the relation.
1625# This approach should work reliably on the server side. However,
1626# there is a window from when a slave unit joins a client relation
1627# until the master unit has joined that relation when the
1628# credentials published by the slave unit are invalid. These
1629# credentials will only become valid after the master unit has
1630# actually created the user and database.
1631#
1632# The implemented approach is for the master unit's
1633# db-relation-joined hook to create the user and database and
1634# publish the connection details, and in addition update a list
1635# of active relations to the service's peer 'replication' relation.
1636# After the master unit has updated the peer relationship, the
1637# slave unit's peer replication-relation-changed hook will
1638# be triggered and it will have an opportunity to republish the
1639# connection details. Of course, it may not be able to do so if the
1640 # slave unit's db-relation-joined hook has not yet been run, so we
1641 # must also attempt to republish the connection settings there.
1642# This way we are guaranteed at least one chance to republish the
1643# connection details after the database and user have actually been
1644# created and both the master and slave units have joined the
1645# relation.
1646#
1647# The order of relevant hooks firing may be:
1648#
1649# master db-relation-joined (publish)
1650# slave db-relation-joined (republish)
1651# slave replication-relation-changed (noop)
1652#
1653# slave db-relation-joined (noop)
1654# master db-relation-joined (publish)
1655# slave replication-relation-changed (republish)
1656#
1657# master db-relation-joined (publish)
1658# slave replication-relation-changed (noop; slave not yet joined db rel)
1659# slave db-relation-joined (republish)
1660
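# The rule that falls out of the above, in pseudo-code (the helper on
# the else branch is a paraphrase, not an actual function here):
#
#     if local_state['state'] == 'hot standby':
#         publish_hot_standby_credentials()   # republish master's creds
#     else:
#         create_user_and_database_and_publish()  # master/standalone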
1661
1662@hooks.hook('db-relation-joined', 'db-relation-changed')
1663def db_relation_joined_changed():
1664 reset_manual_replication_state()
1665 if local_state['state'] == 'hot standby':
1666 publish_hot_standby_credentials()
1667 return
1668
1669 # By default, we create a database named after the remote
1670 # servicename. The remote service can override this by setting
1671 # the database property on the relation.
1672 database = hookenv.relation_get('database')
1673 if not database:
1674 database = hookenv.remote_unit().split('/')[0]
1675
1676 # Generate a unique username for this relation to use.
1677 user = user_name(hookenv.relation_id(), hookenv.remote_unit())
1678
1679 roles = filter(None, (hookenv.relation_get('roles') or '').split(","))
1680
1681 extensions = filter(None,
1682 (hookenv.relation_get('extensions') or '').split(","))
1683
1684 log('{} unit publishing credentials'.format(local_state['state']))
1685
1686 password = create_user(user)
1687 reset_user_roles(user, roles)
1688 schema_user = "{}_schema".format(user)
1689 schema_password = create_user(schema_user)
1690 ensure_database(user, schema_user, database)
1691 ensure_extensions(extensions, database)
1692 host = hookenv.unit_private_ip()
1693 port = get_service_port()
1694 state = local_state['state'] # master, hot standby, standalone
1695
1696 # Publish connection details.
1697 connection_settings = dict(
1698 user=user, password=password,
1699 schema_user=schema_user, schema_password=schema_password,
1700 host=host, database=database, port=port, state=state)
1701 log("Connection settings {!r}".format(connection_settings), DEBUG)
1702 hookenv.relation_set(relation_settings=connection_settings)
1703
1704 # Update the peer relation, notifying any hot standby units
1705 # to republish connection details to the client relation.
1706 local_state['client_relations'] = ' '.join(sorted(
1707 hookenv.relation_ids('db') + hookenv.relation_ids('db-admin')))
1708 log("Client relations {}".format(local_state['client_relations']))
1709 local_state.publish()
1710
1711 postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
1712 generate_postgresql_hba(postgresql_hba, user=user,
1713 schema_user=schema_user,
1714 database=database)
1715
1716 snapshot_relations()
1717
1718
1719@hooks.hook('db-admin-relation-joined', 'db-admin-relation-changed')
1720def db_admin_relation_joined_changed():
1721 reset_manual_replication_state()
1722 if local_state['state'] == 'hot standby':
1723 publish_hot_standby_credentials()
1724 return
1725
1726 user = user_name(
1727 hookenv.relation_id(), hookenv.remote_unit(), admin=True)
1728
1729 log('{} unit publishing credentials'.format(local_state['state']))
1730
1731 password = create_user(user, admin=True)
1732 host = hookenv.unit_private_ip()
1733 port = get_service_port()
1734 state = local_state['state'] # master, hot standby, standalone
1735
1736 # Publish connection details.
1737 connection_settings = dict(
1738 user=user, password=password,
1739 host=host, database='all', port=port, state=state)
1740 log("Connection settings {!r}".format(connection_settings), DEBUG)
1741 hookenv.relation_set(relation_settings=connection_settings)
1742
1743 # Update the peer relation, notifying any hot standby units
1744 # to republish connection details to the client relation.
1745 local_state['client_relations'] = ' '.join(
1746 hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'))
1747 log("Client relations {}".format(local_state['client_relations']))
1748 local_state.publish()
1749
1750 postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
1751 generate_postgresql_hba(postgresql_hba)
1752
1753 snapshot_relations()
1754
1755
1756@hooks.hook()
1757def db_relation_broken():
1758 relid = hookenv.relation_id()
1759 if relid not in local_state['relations']['db']:
1760 # This was to be a hot standby, but it had not yet got as far as
1761 # receiving and handling credentials from the master.
1762 log("db-relation-broken called before relation finished setup", DEBUG)
1763 return
1764
1765 # The relation no longer exists, so we can't pull the database name
1766 # we used from there. Instead, we have to persist this information
1767 # ourselves.
1768 relation = local_state['relations']['db'][relid]
1769 unit_relation_data = relation[hookenv.local_unit()]
1770
1771 if local_state['state'] in ('master', 'standalone'):
1772 user = unit_relation_data.get('user', None)
1773 database = unit_relation_data['database']
1774
1775 # We need to check that the database still exists before
1776 # attempting to revoke privileges because the local PostgreSQL
1777 # cluster may have been rebuilt by another hook.
1778 sql = "SELECT datname FROM pg_database WHERE datname = %s"
1779 if run_select_as_postgres(sql, database)[0] != 0:
1780 sql = "REVOKE ALL PRIVILEGES ON DATABASE %s FROM %s"
1781 run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
1782 AsIs(quote_identifier(user)))
1783 run_sql_as_postgres(sql, AsIs(quote_identifier(database)),
1784 AsIs(quote_identifier(user + "_schema")))
1785
1786 postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
1787 generate_postgresql_hba(postgresql_hba)
1788
1789 # Cleanup our local state.
1790 snapshot_relations()
1791
1792
1793@hooks.hook()
1794def db_admin_relation_broken():
1795 if local_state['state'] in ('master', 'standalone'):
1796 user = hookenv.relation_get('user', unit=hookenv.local_unit())
1797 if user:
1798 # We need to check that the user still exists before
1799 # attempting to revoke privileges because the local PostgreSQL
1800 # cluster may have been rebuilt by another hook.
1801 sql = "SELECT usename FROM pg_user WHERE usename = %s"
1802 if run_select_as_postgres(sql, user)[0] != 0:
1803 sql = "ALTER USER %s NOSUPERUSER"
1804 run_sql_as_postgres(sql, AsIs(quote_identifier(user)))
1805
1806 postgresql_hba = os.path.join(_get_postgresql_config_dir(), "pg_hba.conf")
1807 generate_postgresql_hba(postgresql_hba)
1808
1809 # Cleanup our local state.
1810 snapshot_relations()
1811
1812
1813def update_repos_and_packages():
1814 need_upgrade = False
1815
1816 version = pg_version()
1817
1818 # Add the PGDG APT repository if it is enabled. Setting this boolean
1819 # is simpler than requiring the magic URL and key be added to
1820 # install_sources and install_keys. In addition, per Bug #1271148,
1821 # install_keys is likely a security hole for this sort of remote
1822 # archive. Instead, we keep a copy of the signing key in the charm
1823 # and can add it securely.
1824 pgdg_list = '/etc/apt/sources.list.d/pgdg_{}.list'.format(
1825 sanitize(hookenv.local_unit()))
1826 pgdg_key = 'ACCC4CF8'
1827
1828 if hookenv.config('pgdg'):
1829 if not os.path.exists(pgdg_list):
1830 # We need to upgrade, as if we have Ubuntu main packages
1831 # installed they may be incompatible with the PGDG ones.
1832 # This is unlikely to ever happen outside of the test suite,
1833 # and never if you don't reuse machines.
1834 need_upgrade = True
1835 run("apt-key add lib/{}.asc".format(pgdg_key))
1836 open(pgdg_list, 'w').write('deb {} {}-pgdg main'.format(
1837 'http://apt.postgresql.org/pub/repos/apt/', distro_codename()))
1838 if version == '9.4':
1839 pgdg_94_list = '/etc/apt/sources.list.d/pgdg_94_{}.list'.format(
1840 sanitize(hookenv.local_unit()))
1841 if not os.path.exists(pgdg_94_list):
1842 need_upgrade = True
1843 open(pgdg_94_list, 'w').write(
1844 'deb {} {}-pgdg main 9.4'.format(
1845 'http://apt.postgresql.org/pub/repos/apt/',
1846 distro_codename()))
1847
1848 elif os.path.exists(pgdg_list):
1849 log(
1850 "PGDG apt source not requested, but already in place in this "
1851 "container", WARNING)
1852 # We can't just remove a source, as we may have packages
1853 # installed that conflict with ones from the other configured
1854 # sources. In particular, if we have postgresql-common installed
1855 # from the PGDG Apt source, PostgreSQL packages from Ubuntu main
1856 # will fail to install.
1857 # os.unlink(pgdg_list)
1858
1859 # Try to optimize our calls to fetch.configure_sources(), as it
1860 # cannot do this itself due to lack of state.
1861 if (need_upgrade
1862 or local_state.get('install_sources', None)
1863 != hookenv.config('install_sources')
1864 or local_state.get('install_keys', None)
1865 != hookenv.config('install_keys')):
1866 # Support the standard mechanism implemented by charm-helpers. Pulls
1867 # from the default 'install_sources' and 'install_keys' config
1868 # options. This also does 'apt-get update', pulling in the PGDG data
1869 # if we just configured it.
1870 fetch.configure_sources(True)
1871 local_state['install_sources'] = hookenv.config('install_sources')
1872 local_state['install_keys'] = hookenv.config('install_keys')
1873 local_state.save()
1874
1875 # Ensure that the desired database locale is possible.
1876 if hookenv.config('locale') != 'C':
1877 run(["locale-gen", "{}.{}".format(
1878 hookenv.config('locale'), hookenv.config('encoding'))])
1879
1880 if need_upgrade:
1881 run("apt-get -y upgrade")
1882
1883 # It might have been better for debversion and plpython to only get
1884 # installed if they were listed in the extra-packages config item,
1885 # but they predate this feature.
1886 packages = ["python-psutil", # to obtain system RAM from python
1887 "libc-bin", # for getconf
1888 "postgresql-{}".format(version),
1889 "postgresql-contrib-{}".format(version),
1890 "postgresql-plpython-{}".format(version),
1891 "python-jinja2", "python-psycopg2"]
1892
1893 # PGDG currently doesn't have debversion for 9.3 & 9.4. Put this back
1894 # when it does.
1895 if not (hookenv.config('pgdg') and version in ('9.3', '9.4')):
1896 packages.append("postgresql-{}-debversion".format(version))
1897
1898 if hookenv.config('performance_tuning').lower() != 'manual':
1899 packages.append('pgtune')
1900
1901 if hookenv.config('swiftwal_container_prefix'):
1902 packages.append('swiftwal')
1903
1904 if hookenv.config('wal_e_storage_uri'):
1905 packages.extend(['wal-e', 'daemontools'])
1906
1907 packages.extend((hookenv.config('extra-packages') or '').split())
1908 packages = fetch.filter_installed_packages(packages)
1909 # Set package state for main postgresql package if installed
1910 if 'postgresql-{}'.format(version) not in packages:
1911 ensure_package_status('postgresql-{}'.format(version),
1912 hookenv.config('package_status'))
1913 fetch.apt_update(fatal=True)
1914 fetch.apt_install(packages, fatal=True)
1915
1916
1917@contextmanager
1918def pgpass():
1919 passwords = {}
1920
1921 # Replication.
1922 # pg_basebackup only works with the password in .pgpass, or entered
1923 # at the command prompt.
1924 if 'replication_password' in local_state:
1925 passwords['juju_replication'] = local_state['replication_password']
1926
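    # .pgpass entries have the form hostname:port:database:username:password;
    # the leading wildcards below make each entry match any connection made
    # as that user.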
1927 pgpass_contents = '\n'.join(
1928 "*:*:*:{}:{}".format(username, password)
1929 for username, password in passwords.items())
1930 pgpass_file = NamedTemporaryFile()
1931 pgpass_file.write(pgpass_contents)
1932 pgpass_file.flush()
1933 os.chown(pgpass_file.name, getpwnam('postgres').pw_uid, -1)
1934 os.chmod(pgpass_file.name, 0o400)
1935 org_pgpassfile = os.environ.get('PGPASSFILE', None)
1936 os.environ['PGPASSFILE'] = pgpass_file.name
1937 try:
1938 yield pgpass_file.name
1939 finally:
1940 if org_pgpassfile is None:
1941 del os.environ['PGPASSFILE']
1942 else:
1943 os.environ['PGPASSFILE'] = org_pgpassfile
1944
1945
1946def authorized_by(unit):
1947 '''Return True if the peer has authorized our database connections.'''
1948 for relid in hookenv.relation_ids('replication'):
1949 relation = hookenv.relation_get(unit=unit, rid=relid)
1950 authorized = relation.get('authorized', '').split()
1951 return hookenv.local_unit() in authorized
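For example (illustrative data only), the master postgres/0 might publish
peer relation settings such as:

    # {'state': 'master',
    #  'authorized': 'postgres/1 postgres/2',
    #  'replication_password': '...'}
    #
    # so authorized_by('postgres/0') run on postgres/1 returns True once
    # 'postgres/1' appears in postgres/0's 'authorized' list.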
1952
1953
1954def promote_database():
1955 '''Take the database out of recovery mode.'''
1956 config_data = hookenv.config()
1957 version = pg_version()
1958 cluster_name = config_data['cluster_name']
1959 postgresql_cluster_dir = os.path.join(
1960 postgresql_data_dir, version, cluster_name)
1961 recovery_conf = os.path.join(postgresql_cluster_dir, 'recovery.conf')
1962 if os.path.exists(recovery_conf):
1963 # Rather than using 'pg_ctl promote', we do the promotion
1964 # this way to avoid creating a timeline change. Switch this
1965 # to using 'pg_ctl promote' once PostgreSQL propagates
1966 # timeline changes via streaming replication.
1967 os.unlink(recovery_conf)
1968 postgresql_restart()
1969
1970
1971def follow_database(master):
1972 '''Connect the database as a streaming replica of the master.'''
1973 master_relation = hookenv.relation_get(unit=master)
1974 create_recovery_conf(
1975 master_relation['private-address'],
1976 master_relation['port'], restart_on_change=True)
1977
1978
1979def elected_master():
1980 """Return the unit that should be master, or None if we don't yet know."""
1981 if local_state['state'] == 'master':
1982 log("I am already the master", DEBUG)
1983 return hookenv.local_unit()
1984
1985 if local_state['state'] == 'hot standby':
1986 log("I am already following {}".format(
1987 local_state['following']), DEBUG)
1988 return local_state['following']
1989
1990 replication_relid = hookenv.relation_ids('replication')[0]
1991 replication_units = hookenv.related_units(replication_relid)
1992
1993 if local_state['state'] == 'standalone':
1994 log("I'm a standalone unit wanting to participate in replication")
1995 existing_replication = False
1996 for unit in replication_units:
1997 # If another peer thinks it is the master, believe it.
1998 remote_state = hookenv.relation_get(
1999 'state', unit, replication_relid)
2000 if remote_state == 'master':
2001 log("{} thinks it is the master, believing it".format(
2002 unit), DEBUG)
2003 return unit
2004
2005 # If we find a peer that isn't standalone, we know
2006 # replication has already been setup at some point.
2007 if remote_state != 'standalone':
2008 existing_replication = True
2009
2010 # If we are joining a peer relation where replication has
2011 # already been setup, but there is currently no master, wait
2012 # until one of the remaining participating units has been
2013 # promoted to master. Only they have the data we need to
2014 # preserve.
2015 if existing_replication:
2016 log("Peers participating in replication need to elect a master",
2017 DEBUG)
2018 return None
2019
2020 # There are no peers claiming to be master, and there is no
2021 # election in progress, so lowest numbered unit wins.
2022 units = replication_units + [hookenv.local_unit()]
2023 master = unit_sorted(units)[0]
2024 if master == hookenv.local_unit():
2025 log("I'm Master - lowest numbered unit in new peer group")
2026 return master
2027 else:
2028 log("Waiting on {} to declare itself Master".format(master), DEBUG)
2029 return None
2030
2031 if local_state['state'] == 'failover':
2032 former_master = local_state['following']
2033 log("Failover from {}".format(former_master))
2034
2035 units_not_in_failover = set()
2036 candidates = set()
2037 for unit in replication_units:
2038 if unit == former_master:
2039 log("Found dying master {}".format(unit), DEBUG)
2040 continue
2041
2042 relation = hookenv.relation_get(unit=unit, rid=replication_relid)
2043
2044 if relation['state'] == 'master':
2045 log("{} says it already won the election".format(unit),
2046 INFO)
2047 return unit
2048
2049 if relation['state'] == 'failover':
2050 candidates.add(unit)
2051
2052 elif relation['state'] != 'standalone':
2053 units_not_in_failover.add(unit)
2054
2055 if units_not_in_failover:
2056 log("{} unaware of impending election. Deferring result.".format(
2057 " ".join(unit_sorted(units_not_in_failover))))
2058 return None
2059
2060 log("Election in progress")
2061 winner = None
2062 winning_offset = -1
2063 candidates.add(hookenv.local_unit())
2064 # Sort the unit lists so we get consistent results in a tie
2065 # and lowest unit number wins.
2066 for unit in unit_sorted(candidates):
2067 relation = hookenv.relation_get(unit=unit, rid=replication_relid)
2068 if int(relation['wal_received_offset']) > winning_offset:
2069 winner = unit
2070 winning_offset = int(relation['wal_received_offset'])
2071
2072 # All remaining hot standbys are in failover mode and have
2073 # reported their wal_received_offset. We can declare victory.
2074 if winner == hookenv.local_unit():
2075 log("I won the election, announcing myself winner")
2076 return winner
2077 else:
2078 log("Waiting for {} to announce its victory".format(winner),
2079 DEBUG)
2080 return None
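To make the tie-breaking concrete, a sketch of the failover election with
hypothetical units and offsets:

    # Candidates are scanned in unit_sorted() order, and a later candidate
    # only wins with a strictly greater offset, so ties go to the lowest
    # numbered unit. Offsets here are illustrative.
    offsets = {'postgres/1': 1000, 'postgres/2': 2000, 'postgres/10': 2000}
    winner, winning_offset = None, -1
    for unit in ('postgres/1', 'postgres/2', 'postgres/10'):  # sorted order
        if offsets[unit] > winning_offset:
            winner, winning_offset = unit, offsets[unit]
    assert winner == 'postgres/2'  # beats postgres/10 despite the tie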
2081
2082
2083@hooks.hook('replication-relation-joined', 'replication-relation-changed')
2084def replication_relation_joined_changed():
2085 config_changed() # Ensure minimal replication settings.
2086
2087 # Now that pg_hba.conf has been regenerated and loaded, inform related
2088 # units that they have been granted replication access.
2089 authorized_units = set()
2090 for unit in hookenv.related_units():
2091 authorized_units.add(unit)
2092 local_state['authorized'] = authorized_units
2093
2094 if hookenv.config('manual_replication'):
2095 log('manual_replication, nothing to do')
2096 return
2097
2098 master = elected_master()
2099
2100 # Handle state changes:
2101 # - Fresh install becoming the master
2102 # - Fresh install becoming a hot standby
2103 # - Hot standby being promoted to master
2104
2105 if master is None:
2106 log("Master is not yet elected. Deferring.")
2107
2108 elif master == hookenv.local_unit():
2109 if local_state['state'] != 'master':
2110 log("I have elected myself master")
2111 promote_database()
2112 if 'following' in local_state:
2113 del local_state['following']
2114 if 'wal_received_offset' in local_state:
2115 del local_state['wal_received_offset']
2116 if 'paused_at_failover' in local_state:
2117 del local_state['paused_at_failover']
2118 local_state['state'] = 'master'
2119
2120 # Publish credentials to hot standbys so they can connect.
2121 replication_password = create_user(
2122 'juju_replication', replication=True)
2123 local_state['replication_password'] = replication_password
2124 local_state['client_relations'] = ' '.join(
2125 hookenv.relation_ids('db') + hookenv.relation_ids('db-admin'))
2126 local_state.publish()
2127
2128 else:
2129 log("I am master and remain master")
2130
2131 elif not authorized_by(master):
2132 log("I need to follow {} but am not yet authorized".format(master))
2133
2134 else:
2135 log("Syncing replication_password from {}".format(master), DEBUG)
2136 local_state['replication_password'] = hookenv.relation_get(
2137 'replication_password', master)
2138
2139 if 'following' not in local_state:
2140 log("Fresh unit. I will clone {} and become a hot standby".format(
2141 master))
2142
2143 master_ip = hookenv.relation_get('private-address', master)
2144 master_port = hookenv.relation_get('port', master)
2145 assert master_port is not None, 'No master port set'
2146
2147 clone_database(master, master_ip, master_port)
2148
2149 local_state['state'] = 'hot standby'
2150 local_state['following'] = master
2151 if 'wal_received_offset' in local_state:
2152 del local_state['wal_received_offset']
2153
2154 elif local_state['following'] == master:
2155 log("I am a hot standby already following {}".format(master))
2156
2157 # Replication connection details may have changed, so
2158 # ensure we are still following.
2159 follow_database(master)
2160
2161 else:
2162 log("I am a hot standby following new master {}".format(master))
2163 follow_database(master)
2164 if not local_state.get("paused_at_failover", None):
2165 run_sql_as_postgres("SELECT pg_xlog_replay_resume()")
2166 local_state['state'] = 'hot standby'
2167 local_state['following'] = master
2168 del local_state['wal_received_offset']
2169 del local_state['paused_at_failover']
2170
2171 publish_hot_standby_credentials()
2172 postgresql_hba = os.path.join(
2173 _get_postgresql_config_dir(), "pg_hba.conf")
2174 generate_postgresql_hba(postgresql_hba)
2175
2176    # Swift container name may have changed, so regenerate the SwiftWAL
2177 # config. This can go away when we have real leader election and can
2178 # safely share a single container.
2179 create_swiftwal_config()
2180 create_wal_e_envdir()
2181
2182 local_state.publish()
2183
2184
2185def publish_hot_standby_credentials():
2186 '''
2187 If a hot standby joins a client relation before the master
2188 unit, it is unable to publish connection details. However,
2189    when the master does join, it updates the client_relations
2190    value in the peer relation, causing the replication-relation-changed
2191    hook to be invoked. This gives us a second opportunity to publish
2192 connection details.
2193
2194 This function is invoked from both the client and peer
2195    relation-changed hooks. One of these will work, depending on the
2196    order in which the master and hot standby joined the client relation.
2197 '''
2198 master = local_state['following']
2199 if not master:
2200 log("I will be a hot standby, but no master yet")
2201 return
2202
2203 if not authorized_by(master):
2204 log("Master {} has not yet authorized us".format(master))
2205 return
2206
2207 client_relations = hookenv.relation_get(
2208 'client_relations', master, hookenv.relation_ids('replication')[0])
2209
2210 if client_relations is None:
2211 log("Master {} has not yet joined any client relations".format(
2212 master), DEBUG)
2213 return
2214
2215 # Build the set of client relations that both the master and this
2216 # unit have joined.
2217 possible_client_relations = set(hookenv.relation_ids('db') +
2218 hookenv.relation_ids('db-admin') +
2219 hookenv.relation_ids('master'))
2220 active_client_relations = possible_client_relations.intersection(
2221 set(client_relations.split()))
2222
2223 for client_relation in active_client_relations:
2224 # We need to pull the credentials from the master unit's
2225 # end of the client relation.
2226 log('Hot standby republishing credentials from {} to {}'.format(
2227 master, client_relation))
2228
2229 connection_settings = hookenv.relation_get(
2230 unit=master, rid=client_relation)
2231
2232 # Override unit specific connection details
2233 connection_settings['host'] = hookenv.unit_private_ip()
2234 connection_settings['port'] = get_service_port()
2235 connection_settings['state'] = local_state['state']
2236 requested_db = hookenv.relation_get('database')
2237 # A hot standby might have seen a database name change before
2238 # the master, so override. This is no problem because we block
2239 # until this database has been created on the master and
2240 # replicated through to this unit.
2241 if requested_db:
2242 connection_settings['database'] = requested_db
2243
2244        # Block until the users and database have replicated, so we
2245        # know the connection details we publish are actually valid.
2246        # This will normally be pretty much instantaneous. Do not block
2247        # if we are running in manual replication mode, as it is outside
2248        # of juju's control when replication is actually set up and running.
2249 if not hookenv.config('manual_replication'):
2250 timeout = 60
2251 start = time.time()
2252 while time.time() < start + timeout:
2253 cur = db_cursor(autocommit=True)
2254                cur.execute(
                        'select datname from pg_database where datname = %s',
                        (connection_settings['database'],))
2255 if cur.fetchone() is not None:
2256 break
2257 del cur
2258 log('Waiting for database {} to be replicated'.format(
2259 connection_settings['database']))
2260 time.sleep(10)
2261
2262 log("Relation {} connection settings {!r}".format(
2263 client_relation, connection_settings), DEBUG)
2264 hookenv.relation_set(
2265 client_relation, relation_settings=connection_settings)
2266
2267
2268@hooks.hook()
2269def replication_relation_departed():
2270 '''A unit has left the replication peer group.'''
2271 remote_unit = hookenv.remote_unit()
2272
2273 assert remote_unit is not None
2274
2275 log("{} has left the peer group".format(remote_unit))
2276
2277 # If we are the last unit standing, we become standalone
2278 remaining_peers = set(hookenv.related_units(hookenv.relation_id()))
2279 remaining_peers.discard(remote_unit) # Bug #1192433
2280
2281 # True if we were following the departed unit.
2282 following_departed = (local_state.get('following', None) == remote_unit)
2283
2284 if remaining_peers and not following_departed:
2285 log("Remaining {}".format(local_state['state']))
2286
2287 elif remaining_peers and following_departed:
2288 # If the unit being removed was our master, prepare for failover.
2289 # We need to suspend replication to ensure that the replay point
2290 # remains consistent throughout the election, and publish that
2291 # replay point. Once all units have entered this steady state,
2292 # we can identify the most up to date hot standby and promote it
2293 # to be the new master.
2294 log("Entering failover state")
2295 cur = db_cursor(autocommit=True)
2296 cur.execute("SELECT pg_is_xlog_replay_paused()")
2297 already_paused = cur.fetchone()[0]
2298 local_state["paused_at_failover"] = already_paused
2299 if not already_paused:
2300 cur.execute("SELECT pg_xlog_replay_pause()")
2301 # Switch to failover state. Don't cleanup the 'following'
2302 # setting because having access to the former master is still
2303 # useful.
2304 local_state['state'] = 'failover'
2305 local_state['wal_received_offset'] = postgresql_wal_received_offset()
2306
2307 else:
2308 log("Last unit standing. Switching from {} to standalone.".format(
2309 local_state['state']))
2310 promote_database()
2311 local_state['state'] = 'standalone'
2312 if 'following' in local_state:
2313 del local_state['following']
2314 if 'wal_received_offset' in local_state:
2315 del local_state['wal_received_offset']
2316 if 'paused_at_failover' in local_state:
2317 del local_state['paused_at_failover']
2318
2319 config_changed()
2320 local_state.publish()
2321
2322
2323@hooks.hook()
2324def replication_relation_broken():
2325 # This unit has been removed from the service.
2326 promote_database()
2327 config_changed()
2328
2329
2330@contextmanager
2331def switch_cwd(new_working_directory):
2332 org_dir = os.getcwd()
2333 os.chdir(new_working_directory)
2334 try:
2335 yield new_working_directory
2336 finally:
2337 os.chdir(org_dir)
2338
2339
2340@contextmanager
2341def restart_lock(unit, exclusive):
2342    '''Acquire the database restart lock on the given unit.
2343
2344 A database needing a restart should grab an exclusive lock before
2345 doing so. To block a remote database from doing a restart, grab a shared
2346 lock.
2347 '''
2348 key = long(hookenv.config('advisory_lock_restart_key'))
2349 if exclusive:
2350 lock_function = 'pg_advisory_lock'
2351 else:
2352 lock_function = 'pg_advisory_lock_shared'
2353 q = 'SELECT {}({})'.format(lock_function, key)
2354
2355 # We will get an exception if the database is rebooted while waiting
2356 # for a shared lock. If the connection is killed, we retry a few
2357 # times to cope.
2358 num_retries = 3
2359
2360 for count in range(0, num_retries):
2361 try:
2362 if unit == hookenv.local_unit():
2363 cur = db_cursor(autocommit=True)
2364 else:
2365 host = hookenv.relation_get('private-address', unit)
2366 port = hookenv.relation_get('port', unit)
2367 cur = db_cursor(
2368 autocommit=True, db='postgres', user='juju_replication',
2369 host=host, port=port)
2370 cur.execute(q)
2371 break
2372 except psycopg2.Error:
2373 if count == num_retries - 1:
2374 raise
2375
2376 try:
2377 yield
2378 finally:
2379        # Close our connection, which releases the lock. Swallow any
2380        # exceptions, as the database may be rebooted once it is released.
2381        try:
2382            cur.connection.close()
2383        except psycopg2.Error:
2384            pass
2385
2386
2387def clone_database(master_unit, master_host, master_port):
2388 with restart_lock(master_unit, False):
2389 postgresql_stop()
2390 log("Cloning master {}".format(master_unit))
2391
2392 config_data = hookenv.config()
2393 version = pg_version()
2394 cluster_name = config_data['cluster_name']
2395 postgresql_cluster_dir = os.path.join(
2396 postgresql_data_dir, version, cluster_name)
2397 postgresql_config_dir = _get_postgresql_config_dir(config_data)
2398 cmd = [
2399 'sudo', '-E', # -E needed to locate pgpass file.
2400 '-u', 'postgres', 'pg_basebackup', '-D', postgresql_cluster_dir,
2401 '--xlog', '--checkpoint=fast', '--no-password',
2402 '-h', master_host, '-p', master_port,
2403 '--username=juju_replication']
2404 log(' '.join(cmd), DEBUG)
2405
2406 if os.path.isdir(postgresql_cluster_dir):
2407 shutil.rmtree(postgresql_cluster_dir)
2408
2409 try:
2410            # Change to a directory the postgres user can read; the
2411            # .pgpass file is needed too.
2412 with switch_cwd('/tmp'), pgpass():
2413 # Clone the master with pg_basebackup.
2414 output = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
2415 log(output, DEBUG)
2416 # SSL certificates need to exist in the datadir.
2417 create_ssl_cert(postgresql_cluster_dir)
2418 create_recovery_conf(master_host, master_port)
2419 except subprocess.CalledProcessError as x:
2420 # We failed, and this cluster is broken. Rebuild a
2421 # working cluster so start/stop etc. works and we
2422 # can retry hooks again. Even assuming the charm is
2423 # functioning correctly, the clone may still fail
2424 # due to eg. lack of disk space.
2425 log(x.output, ERROR)
2426 log("Clone failed, local db destroyed", ERROR)
2427 if os.path.exists(postgresql_cluster_dir):
2428 shutil.rmtree(postgresql_cluster_dir)
2429 if os.path.exists(postgresql_config_dir):
2430 shutil.rmtree(postgresql_config_dir)
2431 createcluster()
2432 config_changed()
2433 raise
2434 finally:
2435 postgresql_start()
2436 wait_for_db()
2437
2438
2439def slave_count():
2440 num_slaves = 0
2441 for relid in hookenv.relation_ids('replication'):
2442 num_slaves += len(hookenv.related_units(relid))
2443 for relid in hookenv.relation_ids('master'):
2444 num_slaves += len(hookenv.related_units(relid))
2445 return num_slaves
2446
2447
2448def postgresql_is_in_backup_mode():
2449 version = pg_version()
2450 cluster_name = hookenv.config('cluster_name')
2451 postgresql_cluster_dir = os.path.join(
2452 postgresql_data_dir, version, cluster_name)
2453
2454 return os.path.exists(
2455 os.path.join(postgresql_cluster_dir, 'backup_label'))
2456
2457
2458def pg_basebackup_is_running():
2459 cur = db_cursor(autocommit=True)
2460 cur.execute("""
2461 SELECT count(*) FROM pg_stat_activity
2462 WHERE usename='juju_replication' AND application_name='pg_basebackup'
2463 """)
2464 return cur.fetchone()[0] > 0
2465
2466
2467def postgresql_wal_received_offset():
2468 """How much WAL we have.
2469
2470 WAL is replicated asynchronously from the master to hot standbys.
2471 The more WAL a hot standby has received, the better a candidate it
2472 makes for master during failover.
2473
2474 Note that this is not quite the same as how in sync the hot standby is.
2475 That depends on how much WAL has been replayed. WAL is replayed after
2476 it is received.
2477 """
2478 cur = db_cursor(autocommit=True)
2479 cur.execute('SELECT pg_is_in_recovery(), pg_last_xlog_receive_location()')
2480 is_in_recovery, xlog_received = cur.fetchone()
2481 if is_in_recovery:
2482 return wal_location_to_bytes(xlog_received)
2483 return None
2484
2485
2486def wal_location_to_bytes(wal_location):
2487 """Convert WAL + offset to num bytes, so they can be compared."""
2488 logid, offset = wal_location.split('/')
2489 return int(logid, 16) * 16 * 1024 * 1024 * 255 + int(offset, 16)
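A quick worked example: each log id spans 255 16MB segments (0xFF000000
bytes) on the PostgreSQL versions this charm supports, and converting to
bytes matters because comparing the raw strings lexicographically would
rank '10/0' below '9/0':

    >>> wal_location_to_bytes('9/0')    # 9 * 255 * 16MB
    38503710720
    >>> wal_location_to_bytes('10/0')   # 0x10 == 16 log ids, further along
    68451041280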
2490
2491
2492def wait_for_db(
2493 timeout=120, db='postgres', user='postgres', host=None, port=None):
2494 '''Wait until the db is fully up.'''
2495 db_cursor(db=db, user=user, host=host, port=port, timeout=timeout)
2496
2497
2498def unit_sorted(units):
2499 """Return a sorted list of unit names."""
2500    return sorted(
2501        units, key=lambda unit: int(unit.split('/')[-1]))
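For instance, numeric ordering avoids the lexicographic trap with two-digit
unit numbers:

    >>> unit_sorted(['postgres/10', 'postgres/2', 'postgres/1'])
    ['postgres/1', 'postgres/2', 'postgres/10']
    >>> sorted(['postgres/10', 'postgres/2'])  # plain sort gets this wrong
    ['postgres/10', 'postgres/2']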
2502
2503
2504def delete_metrics_cronjob(cron_path):
2505 try:
2506 os.unlink(cron_path)
2507 except OSError:
2508 pass
2509
2510
2511def write_metrics_cronjob(script_path, cron_path):
2512 config_data = hookenv.config()
2513
2514 # need the following two configs to be valid
2515 metrics_target = config_data['metrics_target'].strip()
2516 metrics_sample_interval = config_data['metrics_sample_interval']
2517 if (not metrics_target
2518 or ':' not in metrics_target
2519 or not metrics_sample_interval):
2520 log("Required config not found or invalid "
2521 "(metrics_target, metrics_sample_interval), "
2522 "disabling statsd metrics", DEBUG)
2523 delete_metrics_cronjob(cron_path)
2524 return
2525
2526 charm_dir = os.environ['CHARM_DIR']
2527 statsd_host, statsd_port = metrics_target.split(':', 1)
2528 metrics_prefix = config_data['metrics_prefix'].strip()
2529 metrics_prefix = metrics_prefix.replace(
2530 "$UNIT", hookenv.local_unit().replace('.', '-').replace('/', '-'))
2531
2532 # ensure script installed
2533 charm_script = os.path.join(charm_dir, 'files', 'metrics',
2534 'postgres_to_statsd.py')
2535 host.write_file(script_path, open(charm_script, 'rb').read(), perms=0755)
2536
2537 # write the crontab
2538 with open(cron_path, 'w') as cronjob:
2539 cronjob.write(render_template("metrics_cronjob.template", {
2540 'interval': config_data['metrics_sample_interval'],
2541 'script': script_path,
2542 'metrics_prefix': metrics_prefix,
2543 'metrics_sample_interval': metrics_sample_interval,
2544 'statsd_host': statsd_host,
2545 'statsd_port': statsd_port,
2546 }))
2547
2548
2549@hooks.hook('nrpe-external-master-relation-changed')
2550def update_nrpe_checks():
2551 config_data = hookenv.config()
2552 try:
2553 nagios_uid = getpwnam('nagios').pw_uid
2554 nagios_gid = getgrnam('nagios').gr_gid
2555 except Exception:
2556 hookenv.log("Nagios user not set up.", hookenv.DEBUG)
2557 return
2558
2559 try:
2560 nagios_password = create_user('nagios')
2561 pg_pass_entry = '*:*:*:nagios:%s' % (nagios_password)
2562 with open('/var/lib/nagios/.pgpass', 'w') as target:
2563 os.fchown(target.fileno(), nagios_uid, nagios_gid)
2564 os.fchmod(target.fileno(), 0400)
2565 target.write(pg_pass_entry)
2566 except psycopg2.InternalError:
2567 if config_data['manual_replication']:
2568 log("update_nrpe_checks(): manual_replication: "
2569 "ignoring psycopg2.InternalError caught creating 'nagios' "
2570 "postgres role; assuming we're already replicating")
2571 else:
2572 raise
2573
2574 relids = hookenv.relation_ids('nrpe-external-master')
2575 relations = []
2576 for relid in relids:
2577 for unit in hookenv.related_units(relid):
2578 relations.append(hookenv.relation_get(unit=unit, rid=relid))
2579
2580 if len(relations) == 1 and 'nagios_hostname' in relations[0]:
2581 nagios_hostname = relations[0]['nagios_hostname']
2582 log("update_nrpe_checks: Obtained nagios_hostname ({}) "
2583 "from nrpe-external-master relation.".format(nagios_hostname))
2584 else:
2585 unit = hookenv.local_unit()
2586 unit_name = unit.replace('/', '-')
2587 nagios_hostname = "%s-%s" % (config_data['nagios_context'], unit_name)
2588 log("update_nrpe_checks: Deduced nagios_hostname ({}) from charm "
2589 "config (nagios_hostname not found in nrpe-external-master "
2590 "relation, or wrong number of relations "
2591 "found)".format(nagios_hostname))
2592
2593 nrpe_service_file = \
2594 '/var/lib/nagios/export/service__{}_check_pgsql.cfg'.format(
2595 nagios_hostname)
2596 nagios_logdir = '/var/log/nagios'
2597 if not os.path.exists(nagios_logdir):
2598 os.mkdir(nagios_logdir)
2599 os.chown(nagios_logdir, nagios_uid, nagios_gid)
2600 for f in os.listdir('/var/lib/nagios/export/'):
2601 if re.search('.*check_pgsql.cfg', f):
2602 os.remove(os.path.join('/var/lib/nagios/export/', f))
The diff has been truncated for viewing.
