Merge ~ack/maas:1871423-supervisord-backoff into maas:master

Proposed by Alberto Donato
Status: Merged
Approved by: Alberto Donato
Approved revision: 61f73ef7c776182f7d4176188be3df44649c90cc
Merge reported by: MAAS Lander
Merged at revision: not available
Proposed branch: ~ack/maas:1871423-supervisord-backoff
Merge into: maas:master
Diff against target: 132 lines (+38/-0)
3 files modified
snap/local/tree/usr/share/maas/supervisord.conf.template (+10/-0)
src/provisioningserver/utils/service_monitor.py (+1/-0)
src/provisioningserver/utils/tests/test_service_monitor.py (+27/-0)
Reviewer            Review Type    Date Requested    Status
Björn Tillenius                                      Approve
MAAS Lander                                          Approve
Review via email: mp+381881@code.launchpad.net

Commit message

LP: #1871423 - handle BACKOFF status from supervisord
LP: #1871582 - add "startsecs" to supervisor program stanzas to avoid quick respawns
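
For context, supervisor's "startsecs" option is the number of seconds a program must stay up after launch before supervisor treats the start as successful; a process that exits sooner goes to BACKOFF instead of RUNNING. The following is a minimal sketch, not part of this branch, of how one might check that every program stanza in a rendered supervisord.conf sets it. The function name and the cut-down SAMPLE text are hypothetical, and the "ntp" stanza deliberately omits the option for illustration. It assumes the template directives ({{if ...}}/{{endif}}) have already been rendered out, since configparser cannot read them.

```python
import configparser

# Hypothetical, cut-down rendered config. The second stanza omits
# startsecs on purpose so the check has something to report.
SAMPLE = """\
[program:dhcpd]
process_name=dhcpd
stdout_logfile=%(ENV_SNAP_COMMON)s/log/dhcpd.log
startsecs=10

[program:ntp]
process_name=ntp
stdout_logfile=%(ENV_SNAP_COMMON)s/log/chrony.log
"""


def programs_missing_startsecs(text):
    """Return the names of [program:x] stanzas that do not set startsecs."""
    # interpolation=None keeps %(ENV_...)s values as literal strings.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [
        section.split(":", 1)[1]
        for section in parser.sections()
        if section.startswith("program:")
        and not parser.has_option(section, "startsecs")
    ]


print(programs_missing_startsecs(SAMPLE))
```

Run against the sample, this lists only the stanza without the option.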

MAAS Lander (maas-lander) wrote :

UNIT TESTS
-b 1871423-supervisord-backoff lp:~ack/maas/+git/maas into -b master lp:~maas-committers/maas

STATUS: SUCCESS
COMMIT: 61f73ef7c776182f7d4176188be3df44649c90cc

review: Approve
Björn Tillenius (bjornt) wrote :

We discussed on IRC that mapping BACKOFF to DEAD seems a bit weird. But considering that the systemd states "activating" and "reloading" map to DEAD as well, I guess it's not a problem.

So +1 for now, but we should rethink how we do service monitoring at some point.

review: Approve
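
The mapping under discussion can be illustrated with a small standalone sketch. It is not the MAAS implementation: ServiceState is a hypothetical stand-in for SERVICE_STATE, the table repeats only the entries visible in the diff (the real table may contain more), and returning None for an unmapped state is an assumption made here, since the diff does not show how the monitor handles that case.

```python
from enum import Enum


class ServiceState(Enum):
    """Hypothetical stand-in for MAAS's SERVICE_STATE enum."""

    ON = "on"
    OFF = "off"
    DEAD = "dead"


# Only the pairs visible in the diff; the real table may hold more.
SUPERVISOR_TO_STATE = {
    "STARTING": ServiceState.ON,
    "BACKOFF": ServiceState.DEAD,
    "RUNNING": ServiceState.ON,
    "STOPPED": ServiceState.OFF,
    "FATAL": ServiceState.DEAD,
}


def parse_status_line(line):
    """Split one `supervisorctl status` line into (name, state, detail)."""
    parts = line.split(None, 2)
    if len(parts) < 2:
        raise ValueError("not a status line: %r" % line)
    detail = parts[2].strip() if len(parts) > 2 else ""
    return parts[0], parts[1], detail


def state_for(line):
    """Map a status line to a ServiceState, or None if unmapped (assumption)."""
    _, state, _ = parse_status_line(line)
    return SUPERVISOR_TO_STATE.get(state)


print(state_for("rackd   BACKOFF   Respawning too fast"))
```

With the BACKOFF entry present, a service that keeps exiting during startup is reported as DEAD rather than falling through the table.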


Preview Diff

diff --git a/snap/local/tree/usr/share/maas/supervisord.conf.template b/snap/local/tree/usr/share/maas/supervisord.conf.template
index d911830..78f5eac 100644
--- a/snap/local/tree/usr/share/maas/supervisord.conf.template
+++ b/snap/local/tree/usr/share/maas/supervisord.conf.template
@@ -24,6 +24,7 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/postgresql.log
+startsecs=10
 {{endif}}


@@ -36,6 +37,7 @@ killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/regiond.log
 serverurl=unix://%(ENV_SNAP_DATA)s/supervisord/sock
+startsecs=10
 {{endif}}


@@ -48,6 +50,7 @@ killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/rackd.log
 serverurl=unix://%(ENV_SNAP_DATA)s/supervisord/sock
+startsecs=10

 [program:dhcpd]
 process_name=dhcpd
@@ -57,6 +60,7 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/dhcpd.log
+startsecs=10

 [program:dhcpd6]
 process_name=dhcpd6
@@ -66,6 +70,7 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/dhcpd6.log
+startsecs=10

 [program:http]
 process_name=http
@@ -74,6 +79,7 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/nginx.log
+startsecs=10
 {{endif}}

 {{if rackd or regiond}}
@@ -84,6 +90,7 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/named.log
+startsecs=10

 [program:ntp]
 process_name=ntp
@@ -92,6 +99,7 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/chrony.log
+startsecs=10

 [program:proxy]
 process_name=proxy
@@ -101,6 +109,7 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/proxy.log
+startsecs=10

 [program:syslog]
 process_name=syslog
@@ -110,4 +119,5 @@ stopasgroup=true
 killasgroup=true
 redirect_stderr=true
 stdout_logfile=%(ENV_SNAP_COMMON)s/log/rsyslog.log
+startsecs=10
 {{endif}}
diff --git a/src/provisioningserver/utils/service_monitor.py b/src/provisioningserver/utils/service_monitor.py
index 5770f01..5ce21ed 100644
--- a/src/provisioningserver/utils/service_monitor.py
+++ b/src/provisioningserver/utils/service_monitor.py
@@ -243,6 +243,7 @@ class ServiceMonitor:
     # Used to convert the supervisor state to the `SERVICE_STATE` enum.
     SUPERVISOR_TO_STATE = {
         "STARTING": SERVICE_STATE.ON,
+        "BACKOFF": SERVICE_STATE.DEAD,
         "RUNNING": SERVICE_STATE.ON,
         "STOPPED": SERVICE_STATE.OFF,
         "FATAL": SERVICE_STATE.DEAD,
diff --git a/src/provisioningserver/utils/tests/test_service_monitor.py b/src/provisioningserver/utils/tests/test_service_monitor.py
index 8dda950..c06f75e 100644
--- a/src/provisioningserver/utils/tests/test_service_monitor.py
+++ b/src/provisioningserver/utils/tests/test_service_monitor.py
@@ -1248,6 +1248,33 @@ class TestServiceMonitor(MAASTestCase):
         self.assertEqual("Result: exit-code", process_state)

     @inlineCallbacks
+    def test___loadSupervisorServiceState_backoff_returns_dead(self):
+        service = make_fake_service(SERVICE_STATE.ON)
+        service_monitor = self.make_service_monitor([service])
+        supervisor_status_output = (
+            dedent(
+                """\
+                %s BACKOFF Respawning too fast
+                """
+            )
+            % (service.snap_service_name)
+        )
+
+        mock_execSupervisorServiceAction = self.patch(
+            service_monitor, "_execSupervisorServiceAction"
+        )
+        mock_execSupervisorServiceAction.return_value = (
+            1,
+            supervisor_status_output,
+            "",
+        )
+        active_state, process_state = yield (
+            service_monitor._loadSupervisorServiceState(service)
+        )
+        self.assertEqual(SERVICE_STATE.DEAD, active_state)
+        self.assertEqual("Result: exit-code", process_state)
+
+    @inlineCallbacks
     def test___ensureService_logs_warning_in_mismatch_process_state(self):
         service = make_fake_service(SERVICE_STATE.ON)
         service_monitor = self.make_service_monitor([service])
