MAAS

Merge lp:~allenap/maas/dev-services-shutdown into lp:~maas-committers/maas/trunk

Proposed by Gavin Panella on 2013-01-02

Status: Merged
Approved by: Gavin Panella on 2013-01-03
Approved revision: no longer in the source branch.
Merged at revision: 1419
Proposed branch: lp:~allenap/maas/dev-services-shutdown
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 30 lines (+5/-4), 2 files modified
    services/cluster-worker/run (+1/-1)
    services/region-worker/run (+4/-3)

To merge this branch: bzr merge lp:~allenap/maas/dev-services-shutdown

Reviewer	Review Type	Date Requested	Status
Raphaël Badin (community)		2013-01-02	Approve on 2013-01-03
Review via email: mp+141645@code.launchpad.net

Commit message

Use pgrphack to ensure that the celeryd development services shut down correctly.

Previously the region-worker service would leave processes behind, and the cluster-worker service would hang if not started via fghack.
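
For context on the commit message: daemontools' pgrphack runs its child in a separate process group. The sketch below is an illustration, not MAAS code. It assumes util-linux `setsid` as a stand-in for pgrphack, and the toy service, the `worker-1`/`worker-2` marker files and the one-second pauses are all hypothetical. It shows the general idea that once a service leads its own process group, a signal sent to the negative process group ID reaches the forked workers as well as the parent. How the MAAS development services deliver their stop signal is not shown in this proposal.

```shell
#!/usr/bin/env bash
# Illustration only: a toy "service" whose parent forks two workers.
# None of these names come from MAAS or daemontools.
set -u

markers="$(mktemp -d)"

# Each worker records that it saw SIGTERM (by touching the path passed
# as $0), then exits. The extra sleep stands in for a grandchild process.
worker='trap "touch \"$0\"; exit 0" TERM; sleep 60 & wait'

# setsid makes the toy service the leader of a new session and process
# group. In a non-interactive script setsid does not fork here, so $! is
# both the leader's PID and the process group ID.
setsid bash -c '
    bash -c "$1" "$2/worker-1" &
    bash -c "$1" "$2/worker-2" &
    wait
' toy-service "$worker" "$markers" &
pgid=$!

sleep 1  # Crude: give the workers time to install their traps.

# A negative PID targets every process in the group, not just the leader.
kill -TERM -- "-${pgid}"

# Wait briefly for the workers to react, then list what they recorded.
for _ in 1 2 3 4 5; do
    [ -e "${markers}/worker-1" ] && [ -e "${markers}/worker-2" ] && break
    sleep 1
done
ls "${markers}"
```

If both workers received the signal, the listing contains `worker-1` and `worker-2`. Sending the signal to `${pgid}` alone (without the leading minus) would only terminate the parent and leave the workers running, which resembles the "processes left behind" symptom described above.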

Gavin Panella (allenap) wrote on 2013-01-02:

On IRC:

> <rvba> ... I'm still having an error when I try to start the services
> again (after they have been stopped):
>
> {{{
> ==> logs/webapp/current <==
> [02/Jan/2013 18:28:36] "GET /accounts/login/?next=%2F HTTP/1.1" 200 53437
> ^C--> Stop `web`
> --> Stop `region-worker`
> --> Stop `database`
> --> Stop `txlongpoll`
> --> Stop `pserv`
> --> Stop `dns`
> --> Stop `webapp`
> --> Stop `reloader`
> --> Stop `cluster-worker`
>
> rvb@leaf:~/canonical/dev-services-shutdown$ make run
> --> Start `web`
> setlock: fatal: unable to lock /run/lock/maas.dev.web: temporary failure
> }}}
>
> …Am I missing something?

This means that either a supervise process (started by `make
services/<name>/@supervise` or something that depends on that) is
still running, or the service is running somewhere else, even in
another branch (possibly invoked by `make services/<name>/@run`).

Try `fuser -v /run/lock/maas.dev.web` to see which process is still
holding that lock.

Raphaël Badin (rvb) wrote on 2013-01-02:

rvb@leaf:~/canonical/dev-services-shutdown$ fuser -v /run/lock/maas.dev.web
                     USER PID ACCESS COMMAND
/run/lock/maas.dev.web:
                     rvb 27125 F.... fghack
                     rvb 27132 F.... apache2

Sometimes I also get:

--> Start `web`
--> Start `region-worker`
--> Start `database`
setlock: fatal: unable to lock /run/lock/maas.dev.database: temporary failure

rvb@leaf:~/canonical/dev-services-shutdown$ fuser -v /run/lock/maas.dev.database
                     USER PID ACCESS COMMAND
/run/lock/maas.dev.database:
                     rvb 28243 F.... postgres
                     rvb 28272 F.... postgres
                     rvb 28273 F.... postgres
                     rvb 28274 F.... postgres
                     rvb 28275 F.... postgres

Note that it does not happen every time I run 'make run'…

Raphaël Badin (rvb) wrote on 2013-01-03:

As I said this morning, it looks like the services (celery, apache2, postgres) take a few seconds to actually stop, and you run into trouble if you try to start them again right after svok has told them to shut down. If you wait a couple of seconds after hitting Ctrl-C, the services restart fine.

This is not ideal, but it is already a massive improvement over what we have now.

This is probably worth backporting to 1.2 btw.

review: Approve
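
The "wait a couple of seconds" workaround from the comment above could be scripted roughly as follows. This is a minimal sketch, not part of the branch: `wait_for_unlocked` is a hypothetical helper, the timeout values are arbitrary, and it assumes `fuser` (from psmisc) is installed. It treats "some process still has the lock file open" as "the service has not finished stopping", which is the same check `fuser -v` was used for earlier in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the MAAS tree.
set -u

# wait_for_unlocked LOCKFILE [TIMEOUT_SECONDS]
# Returns 0 once no process has LOCKFILE open, 1 if the timeout expires.
wait_for_unlocked() {
    local lock="$1" timeout="${2:-10}" waited=0
    # fuser -s is silent and exits 0 while at least one process is
    # still using the file.
    while fuser -s "${lock}" 2>/dev/null; do
        if [ "${waited}" -ge "${timeout}" ]; then
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Example use with the lock files seen in this thread (commented out so
# that the sketch has no side effects when run as-is):
#
# for name in web database; do
#     wait_for_unlocked "/run/lock/maas.dev.${name}" 15 || exit 1
# done
# make run
```

Polling like this only papers over the slow shutdown; it does not make the services stop any faster.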

Preview Diff

=== modified file 'services/cluster-worker/run'
--- services/cluster-worker/run 2012-12-11 13:30:57 +0000
+++ services/cluster-worker/run 2013-01-02 17:01:23 +0000
@@ -22,5 +22,5 @@
 export CLUSTER_UUID="adfd3977-f251-4f2c-8d61-745dbd690bfc"
 
 script="$(readlink -f bin/maas-provision)"
-exec fghack "${script}" start-cluster-controller \
+exec pgrphack "${script}" start-cluster-controller \
     http://0.0.0.0:5240/ -u "$(id -un)" -g "$(id -gn)"

=== modified file 'services/region-worker/run'
--- services/region-worker/run 2012-10-05 12:08:51 +0000
+++ services/region-worker/run 2013-01-02 17:01:23 +0000
@@ -14,11 +14,12 @@
 # because there are race issues when restarting.
 [ -z "${logdir:-}" ] || exec &>> "${logdir}/current"
 
-# XXX JeroenVermeulen 2012-08-23, bug=1040529: Use fghack to kludge around
-# hanging celery shutdown.
 export PYTHONPATH=etc/:src/
 script="$(readlink -f bin/celeryd)"
-exec fghack "${script}" \
+# XXX GavinPanella 2013-01-02, bug=1040529: celeryd does not shutdown
+# correctly when signalled: processes are often left behind. However,
+# pgrphack works around this, ensuring a complete shutdown.
+exec pgrphack "${script}" \
     --loglevel INFO --beat --queues celery,master \
     --schedule=run/celerybeat-region-schedule \
     --config=democeleryconfig
