Merge lp:~stgraber/apport/bug1445064 into lp:~apport-hackers/apport/trunk

Proposed by Stéphane Graber
Status: Merged
Merged at revision: 3052
Proposed branch: lp:~stgraber/apport/bug1445064
Merge into: lp:~apport-hackers/apport/trunk
Diff against target: 168 lines (+121/-4)
4 files modified
data/apport (+54/-4)
lib/systemd/system/apport-forward.socket (+12/-0)
lib/systemd/system/apport-forward@.service (+11/-0)
test/test_apport_forward.py (+44/-0)
To merge this branch: bzr merge lp:~stgraber/apport/bug1445064
Reviewer Review Type Date Requested Status
Martin Pitt (community) Needs Information
Review via email: mp+283752@code.launchpad.net

Description of the change

This is the implementation of https://bugs.launchpad.net/ubuntu/+source/apport/+bug/1445064

The original implementation of this feature was a security nightmare and had to be reverted. This new design should be safe as the host will never actually execute any code, just contact a pre-existing apport setup and forward the crash to it.

This change introduces a couple of systemd units to setup a new /run/apport.socket UNIX socket. Upon receiving a container crash, the host apport looks for that socket in the crashed process' root. If found, it attempts to connect to it, then send the command line arguments it received along with its stdin.

Systemd in the container spawns apport whenever a crash happens, apport then connects to the host through the fd it receives from systemd and receives the stdin fd as well as the argument list from the host.

It then mangles its own sys.stdin and sys.argv to match what's sent by the host, then continues execution as normal.

The code change is pretty self contained, only new external dependency is on python3-systemd.

To post a comment you must log in.
Revision history for this message
Stéphane Graber (stgraber) wrote :

Ok, so the code looks good to me, the only problem is that it doesn't work.

It's not entirely clear to me why it doesn't. The host does get the crash, finds the container's socket and sends the crash over to it. The container's apport kicks in, parses the argument line just fine and then fails during actual crash parsing due to the crashed process no longer existing.

I don't get why the crashed process disappears though, I thought the kernel was supposed to keep it around until the crash handler exists, however the host side of the crash handler sure is still running (in fact doesn't seem to want to die for some reason)...

So basically the two remaining issues are:
 - The crashed process appears to be destroyed by the kernel before the handler exits.
 - The handler seems to never exit, apparently stuck on recv() of the unix socket, despite the remote end having long exited. This may be a weirdness of systemd's socket handler, you'd think that with accept=True it'd close the socket on exit of the service, but it doesn't appear to do so.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Correction, the timeout does kick in, so the handler eventually exits, it just doesn't do so when the container apport handler dies.

lp:~stgraber/apport/bug1445064 updated
3049. By Stéphane Graber

Implement apport crash forwarding for containers

Revision history for this message
Stéphane Graber (stgraber) wrote :

Implemented a new, cleaner and working version of this change. I've confirmed it to work and it's now ready for review.

lp:~stgraber/apport/bug1445064 updated
3050. By Stéphane Graber

Move systemd.daemon import under the LISTEN_FDS check

3051. By Stéphane Graber

Add test for apport_forward. This doesn't integrate in the test framework!

Revision history for this message
Martin Pitt (pitti) wrote :

Stephane, some nitpicks and questions after initial review. I'll work on the test suite integration now.

review: Needs Fixing
Revision history for this message
Martin Pitt (pitti) wrote :

Based on this MP I did some followup commits to lp:~pitti/apport/container-forward which turn your sketch test into a proper test case, fix the above remarks, and some other minor fallout that I noticed during testing. Stephane, can you please have a quick look at this to ensure I didn't do anything stupid?

I tested this in an LXD xenial container, and it's working nicely. Thanks again!

review: Needs Information
Revision history for this message
Stéphane Graber (stgraber) wrote :

Replied to the comments.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Your changes look good to me, the ImportError try/except certainly isn't harmful even if I wouldn't expect it to be needed for a trusty backport.

I'm not sure whether that Also statement is actually useful for systemd but you know systemd way better than I do so I'll assume you're right :)

Revision history for this message
Martin Pitt (pitti) :

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'data/apport'
2--- data/apport 2015-11-04 20:04:52 +0000
3+++ data/apport 2016-01-27 09:28:56 +0000
4@@ -14,7 +14,7 @@
5 # the full text of the license.
6
7 import sys, os, os.path, subprocess, time, traceback, pwd, io
8-import signal, inspect, grp, fcntl
9+import signal, inspect, grp, fcntl, socket, atexit, array
10
11 import apport, apport.fileutils
12
13@@ -303,6 +303,36 @@
14 #
15 #################################################################
16
17+# Systemd socket activation
18+if "LISTEN_FDS" in os.environ:
19+ from systemd.daemon import listen_fds
20+
21+ # Extract and validate the fd
22+ fds = listen_fds()
23+ if len(fds) < 1:
24+ print("Invalid socket activation, no fd provided")
25+ sys.exit(1)
26+
27+ # Open the socket
28+ sock = socket.fromfd(int(fds[0]), socket.AF_UNIX, socket.SOCK_STREAM)
29+ atexit.register(sock.shutdown, socket.SHUT_RDWR)
30+
31+ # Replace stdin by the socket activation fd
32+ sys.stdin.close()
33+
34+ fds = array.array("i")
35+ msg, ancdata, flags, addr = sock.recvmsg(4096, socket.CMSG_LEN(fds.itemsize))
36+ for cmsg_level, cmsg_type, cmsg_data in ancdata:
37+ if (cmsg_level == socket.SOL_SOCKET and cmsg_type == socket.SCM_RIGHTS):
38+ fds.fromstring(cmsg_data[:len(cmsg_data) - (len(cmsg_data) % fds.itemsize)])
39+
40+ sys.stdin = os.fdopen(int(fds[0]), "r")
41+
42+ # Replace argv by the arguments received over the socket
43+ sys.argv = [sys.argv[0]]
44+ sys.argv += msg.decode().split()
45+
46+# Normal startup
47 if len(sys.argv) not in (4, 5):
48 try:
49 print('Usage: %s <pid> <signal number> <core file ulimit> [global pid]' % sys.argv[0])
50@@ -318,10 +348,30 @@
51 # Check if we received a valid global PID (kernel >= 3.12). If we do,
52 # then compare it with the local PID. If they don't match, it's an
53 # indication that the crash originated from another PID namespace.
54-# Simply log an entry in the host error log and exit 0.
55+# Attempt to forward it to the container
56 if len(sys.argv) == 5 and sys.argv[4].isdigit() and sys.argv[4] != sys.argv[1]:
57- error_log('host pid %s crashed in a container without apport support' %
58- sys.argv[4])
59+ host_pid = int(sys.argv[4])
60+
61+ if not os.path.exists("/proc/%d/root/run/apport.socket" % host_pid):
62+ error_log('host pid %s crashed in a container without apport support' %
63+ sys.argv[4])
64+ sys.exit(0)
65+
66+ sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
67+ try:
68+ sock.connect("/proc/%d/root/run/apport.socket" % host_pid)
69+ except:
70+ error_log('host pid %s crashed in a container with a broken apport' %
71+ sys.argv[4])
72+ sys.exit(0)
73+
74+ args = "%s %s %s" % (sys.argv[1], sys.argv[2], sys.argv[3])
75+
76+ try:
77+ sock.sendmsg([args.encode()], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [0]))])
78+ sock.shutdown(socket.SHUT_RDWR)
79+ except socket.timeout:
80+ error_log('Container apport failed to process crash within 30s')
81 sys.exit(0)
82
83 check_lock()
84
85=== added directory 'lib'
86=== added directory 'lib/systemd'
87=== added directory 'lib/systemd/system'
88=== added file 'lib/systemd/system/apport-forward.socket'
89--- lib/systemd/system/apport-forward.socket 1970-01-01 00:00:00 +0000
90+++ lib/systemd/system/apport-forward.socket 2016-01-27 09:28:56 +0000
91@@ -0,0 +1,12 @@
92+[Unit]
93+Description=Unix socket for apport crash forwarding
94+
95+[Socket]
96+ListenStream=/run/apport.socket
97+SocketMode=0600
98+Accept=yes
99+MaxConnections=10
100+Backlog=5
101+
102+[Install]
103+WantedBy=sockets.target
104
105=== added file 'lib/systemd/system/apport-forward@.service'
106--- lib/systemd/system/apport-forward@.service 1970-01-01 00:00:00 +0000
107+++ lib/systemd/system/apport-forward@.service 2016-01-27 09:28:56 +0000
108@@ -0,0 +1,11 @@
109+[Unit]
110+Description=Apport crash forwarding receiver
111+Requires=apport-forward.socket
112+
113+[Service]
114+Type=forking
115+ExecStart=/usr/share/apport/apport
116+
117+[Install]
118+WantedBy=multi-user.target
119+Also=status.socket
120
121=== added file 'test/test_apport_forward.py'
122--- test/test_apport_forward.py 1970-01-01 00:00:00 +0000
123+++ test/test_apport_forward.py 2016-01-27 09:28:56 +0000
124@@ -0,0 +1,44 @@
125+#!/usr/bin/python3
126+
127+import array
128+import os
129+import socket
130+import sys
131+import tempfile
132+
133+path = tempfile.mktemp()
134+
135+server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
136+server.bind(path)
137+server.listen(1)
138+
139+parent_pid = os.getpid()
140+
141+if os.fork() == 0:
142+ client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
143+ client.connect(path)
144+
145+ blob = tempfile.mktemp()
146+ with open(blob, "w+") as fd:
147+ fd.write("blob")
148+
149+ with open(blob, "r") as fd:
150+ args = "%s 11 100" % parent_pid
151+ client.sendmsg([args.encode()], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
152+ array.array("i", [fd.fileno()]))])
153+ os.remove(blob)
154+ sys.exit(0)
155+
156+conn, addr = server.accept()
157+
158+os.dup2(conn.fileno(), 3)
159+child = os.fork()
160+if child == 0:
161+ os.environ["LISTEN_PID"] = "%s" % os.getpid()
162+ os.environ["LISTEN_FDNAMES"] = "connection"
163+ os.environ["LISTEN_FDS"] = "1"
164+
165+ os.execl("data/apport", "apport")
166+
167+os.waitpid(child, 0)
168+os.remove(path)

Subscribers

People subscribed via source and target branches