Merge lp:~stgraber/apport/bug1445064 into lp:~apport-hackers/apport/trunk

Proposed by Stéphane Graber
Status: Merged
Merged at revision: 3052
Proposed branch: lp:~stgraber/apport/bug1445064
Merge into: lp:~apport-hackers/apport/trunk
Diff against target: 168 lines (+121/-4)
4 files modified
data/apport (+54/-4)
lib/systemd/system/apport-forward.socket (+12/-0)
lib/systemd/system/apport-forward@.service (+11/-0)
test/test_apport_forward.py (+44/-0)
To merge this branch: bzr merge lp:~stgraber/apport/bug1445064
Reviewer Review Type Date Requested Status
Martin Pitt (community) Needs Information
Review via email: mp+283752@code.launchpad.net

Description of the change

This is the implementation of https://bugs.launchpad.net/ubuntu/+source/apport/+bug/1445064

The original implementation of this feature was a security nightmare and had to be reverted. This new design should be safe as the host will never actually execute any code, just contact a pre-existing apport setup and forward the crash to it.

This change introduces a couple of systemd units to setup a new /run/apport.socket UNIX socket. Upon receiving a container crash, the host apport looks for that socket in the crashed process' root. If found, it attempts to connect to it, then send the command line arguments it received along with its stdin.

Systemd in the container spawns apport whenever a crash happens, apport then connects to the host through the fd it receives from systemd and receives the stdin fd as well as the argument list from the host.

It then mangles its own sys.stdin and sys.argv to match what's sent by the host, then continues execution as normal.

The code change is pretty self contained, only new external dependency is on python3-systemd.

To post a comment you must log in.
Revision history for this message
Stéphane Graber (stgraber) wrote :

Ok, so the code looks good to me, the only problem is that it doesn't work.

It's not entirely clear to me why it doesn't. The host does get the crash, finds the container's socket and sends the crash over to it. The container's apport kicks in, parses the argument line just fine and then fails during actual crash parsing due to the crashed process no longer existing.

I don't get why the crashed process disappears though, I thought the kernel was supposed to keep it around until the crash handler exists, however the host side of the crash handler sure is still running (in fact doesn't seem to want to die for some reason)...

So basically the two remaining issues are:
 - The crashed process appears to be destroyed by the kernel before the handler exits.
 - The handler seems to never exit, apparently stuck on recv() of the unix socket, despite the remote end having long exited. This may be a weirdness of systemd's socket handler, you'd think that with accept=True it'd close the socket on exit of the service, but it doesn't appear to do so.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Correction, the timeout does kick in, so the handler eventually exits, it just doesn't do so when the container apport handler dies.

lp:~stgraber/apport/bug1445064 updated
3049. By Stéphane Graber

Implement apport crash forwarding for containers

Revision history for this message
Stéphane Graber (stgraber) wrote :

Implemented a new, cleaner and working version of this change. I've confirmed it to work and it's now ready for review.

lp:~stgraber/apport/bug1445064 updated
3050. By Stéphane Graber

Move systemd.daemon import under the LISTEN_FDS check

3051. By Stéphane Graber

Add test for apport_forward. This doesn't integrate in the test framework!

Revision history for this message
Martin Pitt (pitti) wrote :

Stephane, some nitpicks and questions after initial review. I'll work on the test suite integration now.

review: Needs Fixing
Revision history for this message
Martin Pitt (pitti) wrote :

Based on this MP I did some followup commits to lp:~pitti/apport/container-forward which turn your sketch test into a proper test case, fix the above remarks, and some other minor fallout that I noticed during testing. Stephane, can you please have a quick look at this to ensure I didn't do anything stupid?

I tested this in an LXD xenial container, and it's working nicely. Thanks again!

review: Needs Information
Revision history for this message
Stéphane Graber (stgraber) wrote :

Replied to the comments.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Your changes look good to me, the ImportError try/except certainly isn't harmful even if I wouldn't expect it to be needed for a trusty backport.

I'm not sure whether that Also statement is actually useful for systemd but you know systemd way better than I do so I'll assume you're right :)

Revision history for this message
Martin Pitt (pitti) :

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'data/apport'
--- data/apport 2015-11-04 20:04:52 +0000
+++ data/apport 2016-01-27 09:28:56 +0000
@@ -14,7 +14,7 @@
14# the full text of the license.14# the full text of the license.
1515
16import sys, os, os.path, subprocess, time, traceback, pwd, io16import sys, os, os.path, subprocess, time, traceback, pwd, io
17import signal, inspect, grp, fcntl17import signal, inspect, grp, fcntl, socket, atexit, array
1818
19import apport, apport.fileutils19import apport, apport.fileutils
2020
@@ -303,6 +303,36 @@
303#303#
304#################################################################304#################################################################
305305
306# Systemd socket activation
307if "LISTEN_FDS" in os.environ:
308 from systemd.daemon import listen_fds
309
310 # Extract and validate the fd
311 fds = listen_fds()
312 if len(fds) < 1:
313 print("Invalid socket activation, no fd provided")
314 sys.exit(1)
315
316 # Open the socket
317 sock = socket.fromfd(int(fds[0]), socket.AF_UNIX, socket.SOCK_STREAM)
318 atexit.register(sock.shutdown, socket.SHUT_RDWR)
319
320 # Replace stdin by the socket activation fd
321 sys.stdin.close()
322
323 fds = array.array("i")
324 msg, ancdata, flags, addr = sock.recvmsg(4096, socket.CMSG_LEN(fds.itemsize))
325 for cmsg_level, cmsg_type, cmsg_data in ancdata:
326 if (cmsg_level == socket.SOL_SOCKET and cmsg_type == socket.SCM_RIGHTS):
327 fds.fromstring(cmsg_data[:len(cmsg_data) - (len(cmsg_data) % fds.itemsize)])
328
329 sys.stdin = os.fdopen(int(fds[0]), "r")
330
331 # Replace argv by the arguments received over the socket
332 sys.argv = [sys.argv[0]]
333 sys.argv += msg.decode().split()
334
335# Normal startup
306if len(sys.argv) not in (4, 5):336if len(sys.argv) not in (4, 5):
307 try:337 try:
308 print('Usage: %s <pid> <signal number> <core file ulimit> [global pid]' % sys.argv[0])338 print('Usage: %s <pid> <signal number> <core file ulimit> [global pid]' % sys.argv[0])
@@ -318,10 +348,30 @@
318# Check if we received a valid global PID (kernel >= 3.12). If we do,348# Check if we received a valid global PID (kernel >= 3.12). If we do,
319# then compare it with the local PID. If they don't match, it's an349# then compare it with the local PID. If they don't match, it's an
320# indication that the crash originated from another PID namespace.350# indication that the crash originated from another PID namespace.
321# Simply log an entry in the host error log and exit 0.351# Attempt to forward it to the container
322if len(sys.argv) == 5 and sys.argv[4].isdigit() and sys.argv[4] != sys.argv[1]:352if len(sys.argv) == 5 and sys.argv[4].isdigit() and sys.argv[4] != sys.argv[1]:
323 error_log('host pid %s crashed in a container without apport support' %353 host_pid = int(sys.argv[4])
324 sys.argv[4])354
355 if not os.path.exists("/proc/%d/root/run/apport.socket" % host_pid):
356 error_log('host pid %s crashed in a container without apport support' %
357 sys.argv[4])
358 sys.exit(0)
359
360 sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
361 try:
362 sock.connect("/proc/%d/root/run/apport.socket" % host_pid)
363 except:
364 error_log('host pid %s crashed in a container with a broken apport' %
365 sys.argv[4])
366 sys.exit(0)
367
368 args = "%s %s %s" % (sys.argv[1], sys.argv[2], sys.argv[3])
369
370 try:
371 sock.sendmsg([args.encode()], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", [0]))])
372 sock.shutdown(socket.SHUT_RDWR)
373 except socket.timeout:
374 error_log('Container apport failed to process crash within 30s')
325 sys.exit(0)375 sys.exit(0)
326376
327check_lock()377check_lock()
328378
=== added directory 'lib'
=== added directory 'lib/systemd'
=== added directory 'lib/systemd/system'
=== added file 'lib/systemd/system/apport-forward.socket'
--- lib/systemd/system/apport-forward.socket 1970-01-01 00:00:00 +0000
+++ lib/systemd/system/apport-forward.socket 2016-01-27 09:28:56 +0000
@@ -0,0 +1,12 @@
1[Unit]
2Description=Unix socket for apport crash forwarding
3
4[Socket]
5ListenStream=/run/apport.socket
6SocketMode=0600
7Accept=yes
8MaxConnections=10
9Backlog=5
10
11[Install]
12WantedBy=sockets.target
013
=== added file 'lib/systemd/system/apport-forward@.service'
--- lib/systemd/system/apport-forward@.service 1970-01-01 00:00:00 +0000
+++ lib/systemd/system/apport-forward@.service 2016-01-27 09:28:56 +0000
@@ -0,0 +1,11 @@
1[Unit]
2Description=Apport crash forwarding receiver
3Requires=apport-forward.socket
4
5[Service]
6Type=forking
7ExecStart=/usr/share/apport/apport
8
9[Install]
10WantedBy=multi-user.target
11Also=status.socket
012
=== added file 'test/test_apport_forward.py'
--- test/test_apport_forward.py 1970-01-01 00:00:00 +0000
+++ test/test_apport_forward.py 2016-01-27 09:28:56 +0000
@@ -0,0 +1,44 @@
1#!/usr/bin/python3
2
3import array
4import os
5import socket
6import sys
7import tempfile
8
9path = tempfile.mktemp()
10
11server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
12server.bind(path)
13server.listen(1)
14
15parent_pid = os.getpid()
16
17if os.fork() == 0:
18 client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
19 client.connect(path)
20
21 blob = tempfile.mktemp()
22 with open(blob, "w+") as fd:
23 fd.write("blob")
24
25 with open(blob, "r") as fd:
26 args = "%s 11 100" % parent_pid
27 client.sendmsg([args.encode()], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
28 array.array("i", [fd.fileno()]))])
29 os.remove(blob)
30 sys.exit(0)
31
32conn, addr = server.accept()
33
34os.dup2(conn.fileno(), 3)
35child = os.fork()
36if child == 0:
37 os.environ["LISTEN_PID"] = "%s" % os.getpid()
38 os.environ["LISTEN_FDNAMES"] = "connection"
39 os.environ["LISTEN_FDS"] = "1"
40
41 os.execl("data/apport", "apport")
42
43os.waitpid(child, 0)
44os.remove(path)

Subscribers

People subscribed via source and target branches