Merge ~sergiodj/ubuntu/+source/rabbitmq-server:bug1784757-fix-server-restart-hang into ubuntu/+source/rabbitmq-server:ubuntu/bionic-devel

Proposed by Sergio Durigan Junior
Status: Merged
Approved by: Bryce Harrington
Approved revision: 7c060ba77e6db5d81490e90c4630c9955aa7eb77
Merged at revision: 7c060ba77e6db5d81490e90c4630c9955aa7eb77
Proposed branch: ~sergiodj/ubuntu/+source/rabbitmq-server:bug1784757-fix-server-restart-hang
Merge into: ubuntu/+source/rabbitmq-server:ubuntu/bionic-devel
Diff against target: 58 lines (+17/-3)
3 files modified
debian/changelog (+11/-0)
debian/control (+1/-0)
debian/rabbitmq-server.service (+5/-3)
Reviewer Review Type Date Requested Status
Christian Ehrhardt  (community) Approve
Canonical Server Pending
Review via email: mp+387177@code.launchpad.net

Description of the change

The systemd file rabbitmq-server.service on Bionic uses "Type=simple" when defining the service, but unfortunately this doesn't work very well for rabbitmq-server. In certain situations, systemd will fail to keep track of a start/stop/restart event, and will hang for 90 seconds before giving the prompt back to the user. Another problem is that rabbitmq-server must start after the epmd service, so we need to explicitly declare this dependency in the service file.

Although I was able to reproduce this almost 100% of the time, there were rare occasions when the restart procedure finished normally. I was also only able to reproduce it using a bionic VM, not a container. If you have multipass or lxd configured to launch VMs, that should be easy.

The steps are:

$ lxc launch ubuntu-daily:bionic --vm bug1784757-rabbitmq-server # or use multipass
$ lxc shell bug1784757-rabbitmq-server
# apt update
# apt install rabbitmq-server -y
# systemctl restart rabbitmq-server.service

In a normal scenario, the restart should take around 3 seconds or less. With the bug, it takes around 90 seconds. If you can't reproduce it, try running "systemctl restart" again. A quick way to trigger it is to run a for loop like:

# for i in $(seq 10); do time systemctl restart rabbitmq-server.service ; done

The fix to this bug involved cherry picking two commits from the Debian package:

https://salsa.debian.org/openstack-team/third-party/rabbitmq-server/commit/9764f7e6ffeaa99325b8e5ff41e80b1c4b2894d3
https://salsa.debian.org/openstack-team/third-party/rabbitmq-server/commit/c5b99f5c4811f460b73c8c8505a98fa55dcd752b

The first one was not entirely cherry-picked because it contained unrelated bits.

I also had to add "socat" to the list of dependencies of the package, since rabbitmq-server.service is now "Type=notify", and rabbitmq-server communicates with systemd-notify via a socket.

There is a PPA with the proposed fix here:

https://launchpad.net/~sergiodj/+archive/ubuntu/rabbitmq-server-bug1784757

The package doesn't contain dep8 tests, but I made sure it installs correctly and can be restarted without problems.

To post a comment you must log in.
Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

I'm waiting until https://bugs.launchpad.net/ubuntu/+source/rabbitmq-server/+bug/1874075 migrates before I set this MP to ready.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Moving "SyslogIdentifier=rabbitmq" around is just noise.
I'd understand if this is a conffile that would thereby match the next versions file.
But it is at /lib/systemd/system/rabbitmq-server.service so is there any need to move this line around?

I understand you have picked changes that later happened to the service file, but for the following changes I'm not sure if they are
a) needed to fix the immediate problem
b) don't have side effects that are unwanted in an SRU

+WorkingDirectory=/var/lib/rabbitmq
+Group=rabbitmq
+UMask=0027

Would you mind to re-vise the MP either explaining why those are needed OR dropping these changes for the SRU?

review: Needs Fixing
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Also the SRU bug isn't ready, but since this is meant to be still WIP that might be ok.

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

On Monday, July 13 2020, Christian Ehrhardt  wrote:

> Moving "SyslogIdentifier=rabbitmq" around is just noise.
> I'd understand if this is a conffile that would thereby match the next versions file.
> But it is at /lib/systemd/system/rabbitmq-server.service so is there any need to move this line around?

Thanks for the review, Christian. Out of curiosity, I had marked this
MP as WIP; was it you who marked it for review, or was it Launchpad?

> I understand you have picked changes that later happened to the service file, but for the following changes I'm not sure if they are
> a) needed to fix the immediate problem
> b) don't have side effects that are unwanted in an SRU
>
> +WorkingDirectory=/var/lib/rabbitmq
> +Group=rabbitmq
> +UMask=0027
>
> Would you mind to re-vise the MP either explaining why those are needed OR dropping these changes for the SRU?

You are correct here. In fact, I've been working on another bug that
involves backporting changes to the .service file, and I reached the
same conclusion there: it is better to just backport the changes needed
for the SRU (obviously), instead of backporting unrelated changes that
are present in the upstream commit.

I will drop the changes you mentioned above and force-push the
branch again.

Thanks,

--
Sergio
GPG key ID: E92F D0B3 6B14 F1F4 D8E0 EB2F 106D A1C8 C3CB BF14

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

I've simplified the backported patches, adjusted the changelog, and written the SRU info now.

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

I didn't change it away from WIP, must have been some automatism - let us set it back.

Thanks for the updates - SRU LGTM now.
The cahnges now seem properly cut down to just the SRU - thanks.

+1 once the former version SRU is completed

review: Approve
Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

It did not auto-set the state this time.

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

Changing the state to Needs Review since rabbitmq-server has entered -updates.

Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

Actually, Christian has already approved it, so I think this is ready for upload.

Revision history for this message
Bryce Harrington (bryce) wrote :

Upload sponsored:

$ git ubuntu tag --upload
$ git push pkg upload/3.6.10-1ubuntu0.4
Enumerating objects: 15, done.
Counting objects: 100% (15/15), done.
Delta compression using up to 12 threads
Compressing objects: 100% (10/10), done.
Writing objects: 100% (10/10), 1.71 KiB | 291.00 KiB/s, done.
Total 10 (delta 7), reused 0 (delta 0)
cd ..
To ssh://git.launchpad.net/ubuntu/+source/rabbitmq-server
 * [new tag] upload/3.6.10-1ubuntu0.4 -> upload/3.6.10-1ubuntu0.4
$ dput ubuntu ${changes}
Checking signature on .changes
gpg: /home/bryce/pkg/RabbitmqServer/review-mp387177/rabbitmq-server_3.6.10-1ubuntu0.4_source.changes: Valid signature from E603B2578FB8F0FB
Checking signature on .dsc
gpg: /home/bryce/pkg/RabbitmqServer/review-mp387177/rabbitmq-server_3.6.10-1ubuntu0.4.dsc: Valid signature from E603B2578FB8F0FB
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading rabbitmq-server_3.6.10-1ubuntu0.4.dsc: done.
  Uploading rabbitmq-server_3.6.10-1ubuntu0.4.debian.tar.xz: done.
  Uploading rabbitmq-server_3.6.10-1ubuntu0.4_source.buildinfo: done.
  Uploading rabbitmq-server_3.6.10-1ubuntu0.4_source.changes: done.
Successfully uploaded packages.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/debian/changelog b/debian/changelog
2index 6c65c62..a16d4a3 100644
3--- a/debian/changelog
4+++ b/debian/changelog
5@@ -1,3 +1,14 @@
6+rabbitmq-server (3.6.10-1ubuntu0.4) bionic; urgency=medium
7+
8+ * Improve systemd service file in order to avoid a 90-second hang
9+ when restarting rabbitmq-server. (LP: #1784757) (Closes #902136)
10+ - d/rabbitmq-server.service: Make sure the service starts after
11+ epmd. Use "Type=notify" for the service.
12+ - d/control: Depend on socat, which is needed for the communication
13+ between rabbitmq-server and systemd when "Type=notify" is used.
14+
15+ -- Sergio Durigan Junior <sergio.durigan@canonical.com> Tue, 07 Jul 2020 10:26:18 -0400
16+
17 rabbitmq-server (3.6.10-1ubuntu0.3) bionic; urgency=medium
18
19 * d/rabbitmq-server.service:
20diff --git a/debian/control b/debian/control
21index 912f8a8..b66338e 100644
22--- a/debian/control
23+++ b/debian/control
24@@ -29,6 +29,7 @@ Depends: adduser,
25 erlang-nox (>= 1:13.b.3) | esl-erlang,
26 lsb-base,
27 logrotate,
28+ socat,
29 ${misc:Depends},
30 ${python:Depends},
31 Description: AMQP server written in Erlang
32diff --git a/debian/rabbitmq-server.service b/debian/rabbitmq-server.service
33index d3500e9..5cc5244 100644
34--- a/debian/rabbitmq-server.service
35+++ b/debian/rabbitmq-server.service
36@@ -1,17 +1,19 @@
37 [Unit]
38 Description=RabbitMQ Messaging Server
39-After=network.target
40+After=network.target epmd@0.0.0.0.socket
41+Wants=network.target epmd@0.0.0.0.socket
42
43 [Service]
44-Type=simple
45+Type=notify
46 User=rabbitmq
47 SyslogIdentifier=rabbitmq
48+NotifyAccess=all
49+TimeoutStartSec=3600
50 LimitNOFILE=65536
51 TimeoutStartSec=600
52 Restart=on-failure
53 RestartSec=10
54 ExecStart=/usr/sbin/rabbitmq-server
55-ExecStartPost=/usr/lib/rabbitmq/bin/rabbitmq-server-wait
56 ExecStop=/usr/sbin/rabbitmqctl stop
57
58 [Install]

Subscribers

People subscribed via source and target branches