[MIR] lxc

Bug #509647 reported by Stéphane Graber
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
lxc (Ubuntu)
Fix Released
Wishlist
Unassigned

Bug Description

Binary package hint: lxc

The LXC team would like the MIR team to reconsider promotion of LXC to main.

The reason is that since the last request back in Lucid, the kernel has had a lot of time to stabilize and improve for the various calls used by lxc.
We also added apparmor confinement by default a few cycles ago.

LXC is used by quite a lot of people and is the default backend for JuJu charms development.

Serge Hallyn and myself are active upstream contributors and maintainers of the staging branch, so issues tend to be resolved very quickly.
We've also been maintaining LXC in precise and quantal very actively, by SRUing every fix that lands in the development release and offering backports for more complex features.

The staging LXC git tree is automatically imported on Launchpad and daily builds for precise, quantal and raring are triggered automatically.

Upstream itself only contains a limited set of test, mostly around the newly introduced liblxc API, however, Serge maintains a separate integration testing branch which we run before upload and will be integrated into autopkgtest and into the upstream dailies once we have some time to do so.

For build-depends: The only build-deps not currently in main is libseccomp for which I'll be filing a separate MIR (bug 1082431) . LXC itself doesn't strictly require this library but the feature is rather nice to have, so I think we should get it promoted too.

I believe all the dependencies are already in main (outside of libseccomp and lxc itself).

LXC doesn't ship any daemon or setuid binary by default, some people choose to mark some of the binaries as setuid or grant extra capabilities, but we don't recommend doing so and don't do it by default.

The LXC package provides two upstart jobs, one to automatically start containers at boot time (if marked as auto-started) and another to setup a "lxcbr0" bridge with a dnsmasq DHCP server running on it, similar to libvirt's virbr0.

Our package isn't usually in sync or even merged with Debian because of disagreements with the Debian maintainer. Our package tends to be much closer to upstream and the upstream staging branch.
We currently carry a lot of patches in our package, but all of them are direct cherry-picks from the staging branch. As a result, as soon as upstream tags 0.9~alpha1, we are expecting to be down to just 4-5 patches remaining.

Martin Pitt (pitti)
Changed in lxc (Ubuntu):
assignee: nobody → Martin Pitt (pitti)
Revision history for this message
Martin Pitt (pitti) wrote :

I spoke with Kees, and he said that he keeps finding security problems in lxc every time he looks at it, and that it's still too young and immature to be put into an LTS. So this should be reconsidered for 10.10.

Changed in lxc (Ubuntu):
assignee: Martin Pitt (pitti) → nobody
status: New → Won't Fix
Revision history for this message
Kees Cook (kees) wrote :

To clarify, it's actually the kernel CLONE_NEW* interfaces that I'm concerned with. There are troublesome leakages in /proc/sys for example. Given the LTS nature of Lucid, I'd really like lxc (and really, the kernel) more time to shake out some of these issues. As an example:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0e1a6ef2dea88101b056b6d9984f3325c5efced3
When compared to other new kernel interfaces, I think this is prudent (i.e. eCryptfs has had several nasty security issues since it started seeing more use, but we kept implementations out of main for a few cycles).

Changed in lxc (Ubuntu):
status: Won't Fix → New
description: updated
description: updated
Changed in lxc (Ubuntu):
assignee: nobody → Jamie Strandboge (jdstrand)
Changed in lxc (Ubuntu):
status: New → Confirmed
status: Confirmed → Triaged
importance: Undecided → Wishlist
Dave Walker (davewalker)
Changed in lxc (Ubuntu):
status: Triaged → New
Revision history for this message
Stéphane Graber (stgraber) wrote :

Just a quick update here that I updated libseccomp in Ubuntu to address the few issues raised in its MIR (bug 1082431).
I think that was the only blocking bits for lxc's MIR, so it'd be nice if an MIR team member could now review LXC.

Changed in lxc (Ubuntu):
assignee: Jamie Strandboge (jdstrand) → Seth Arnold (seth-arnold)
Revision history for this message
Seth Arnold (seth-arnold) wrote :
Download full text (5.1 KiB)

I audited lxc version 0.9.0~rc1-0ubuntu3 as checked into Raring.

This should not be considered a complete security audit, but rather
a quick gauge of maintainability.

- lxc provides userspace convenience wrappers around the Linux kernel's
  containers implementation to make using containers convenient
- There's no cryptography
- No traditional networking, though extensive control of network
  interfaces, bridges, and routing
- Some Unix domain sockets and Netlink domain sockets
- Uses libapparmor, libcap, libc, liblxc, libpthread, libseccomp, libutil
- Can start containers and dnsmasq at boot
- dnsmasq runs as specific lxc-dnsmasq user
- Carefully daemonizes, with nice mechanism to report errors to
  grandparent for good human-based interaction
- postrm does not remove user created during postinst
- No DBus services
- No setuid
- No sudo fragments
- No cron jobs
- Many "failed to load external entity" errors in build logs, probably
  just an incorrect docbook option
- No PIE, no immediate binding
- Variable stack protection (likely false negatives, small binary)
- Variable Foritify (likely false negatives, small binary)
- Consistent read-only bindings

Code quality varied; the Python bindings seem least mature, though the
average quality was otherwise good.

Python bindings:

- convert_tuple_to_char_pointer_array() appears to write to unallocated memory
- convert_tuple_to_char_pointer_array() error-case memory leak
- Container_get_cgroup_item() error-case memory leak
- Container_get_config_item() error-case memory leak
- Container_get_keys() error-case memory leak

convert_tuple_to_char_pointer_array() is serious, if I've properly
understood the function. This must be reviewed by someone else (or
valgrind) before the Python bindings can reasonably be promoted to main.
The presence of this problem in the Python bindings makes me wonder if
they can be split apart and left in universe for the time being.

Everything else:

- Calling lxc_container_free() _after_ releasing the privlock feels wrong
   - lxc_container_get() may follow stale pointers while locking
   - lxc_container_put() could be freeing an object that was acquired

Probably lxc_container_free() should be called with the privlock still
held, and the lock destroyed after the object no longer exists. If
any existing multiprocess or multithreaded code relies upon shared
lxc_container objects, this must be addressed before being promoted
to main. If no current code relies upon shared lxc_container objects,
this can be handled as a simple bug report.

- run_script() shouldn't build string for popen(3) from argv array; this
  should do the difficult pipe2(2) and dup2(2) itself and use execve(2),
  execv(3), or execl(3) directly
- run_script() shouldn't check length against INT_MAX, the argument size
  limit should be roughly two megabytes (see xargs --show-limits for
  size..) -- ideally, this would be removed entirely when replaced with
  an execve(2) version

I didn't follow the callers all the way to the configuration variables
that were used in building these strings; if they can only be set by root,
then this can be handled as a simple bug. If they can be set by a user
account o...

Read more...

Changed in lxc (Ubuntu):
assignee: Seth Arnold (seth-arnold) → MIR approval team (ubuntu-mir)
Revision history for this message
Serge Hallyn (serge-hallyn) wrote : Re: [Bug 509647] Re: [MIR] lxc

Hi Seth, thanks very much for the review.

> Everything else:
>
> - Calling lxc_container_free() _after_ releasing the privlock feels wrong
> - lxc_container_get() may follow stale pointers while locking
> - lxc_container_put() could be freeing an object that was acquired

I think this was an unintented bug. Note that

 1. lxc_container_free() sets c->privlock to NULL
 2. lxcunlock() returns an error -2 if c->privlock is NULL
 3. lxclock() doesn't do so - I probably accidentally dropped
    or forgot to add in that check.

since lxc_container_get() and _put() both return false if lxclock
fails, I adding the needed check in lxclock() this should suffice.

Does that sound ok to you for this item?

If you're up for it, I also wouldn't mind chatting sometime about
the goals and design of the locking. (The comment at top of
lxccontainer.c pretty accurately sums up my goals. It doesn't
mention that running containers are already mutexed by the
monitor UNIX socket in $lxc_path.)

-serge

Revision history for this message
Seth Arnold (seth-arnold) wrote :

Urgh. sorry for losing track of this bug. I forgot to subscribe after submitting my comment.

I believe your proposed additional check would be sufficient. I _think_ better might be to destroy the lock pointer in the shared structure when freeing the object but before unlocking -- preventing other threads from acquiring the lock once the free path has started.

Revision history for this message
Stéphane Graber (stgraber) wrote :

Hi Seth,

I just wanted to comment on the python side of things.

I'm the author of the binding and sadly it's now used by some of the very well used bits of LXC (lxc-ls and lxc-start-ephemeral to list the most populars), so I don't think building without this is really an option for us.

However I'm very very interested in having those issues resolved. To be clear, this binding is probably the first bit of C code that I actually wrote from scratch and released and so I certainly expected the memory management to be mostly crap. We've fixed a bunch of issues already but your comments are really useful to fix the rest of those.

I'll try to set aside some time tomorrow to poke at those and will provide a patch here for Serge and you (if you're interested) to review.

The trick with those bindings and the way LXC work is that it's almost impossible to run those through valgrind to detect leaks so we almost entirely depend on the eyes of our reviewers and even though we're pretty strict on code reviews nowadays, there's only so much you can find on thousands of line long diffs ;)

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Quoting Seth Arnold (<email address hidden>):
> Urgh. sorry for losing track of this bug. I forgot to subscribe after
> submitting my comment.
>
> I believe your proposed additional check would be sufficient. I _think_
> better might be to destroy the lock pointer in the shared structure when
> freeing the object but before unlocking -- preventing other threads from
> acquiring the lock once the free path has started.

Just to help myself think through this,

freer | racing get()er
lxc_container_put()
\ lxclock(c->privlock) lxc_container_get()
\ c->numthreads = 0 \ lxclock(c->privlock) -> waits
\ lxcunlock \
\ lxc_container_free \ lxclock() returns
                                  \ c->numthreads < 1 -> return
\ \ (free stuff)
\ \ sem_destroy(privlock)

So lxclock() needs to check that sem is not NULL as you said before;
Setting c->privlock = NULL before calling sem_destroy on (a copy of :)
it will help another race - I think that is what you are suggesting
here.

I'm trying to figure out if there is still a possible race with that,
where the get()er dereferences c->privlock, then put()er sets
c->privlock = NULL and sem_destroy()s it, then get()er calls
sem_wait(), resulting in 'undefined behavior' per the man page.

Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Oh but I'm being silly - once the use count goes to 0 it
cannot go back up, so we can check for that before trying
to take the lock.

Revision history for this message
Stéphane Graber (stgraber) wrote :

I just pushed the remaining python C extension fixes to saucy now and will SRU to raring.

Seth, would it be possible for you to recheck the binding as it stands in saucy?
https://github.com/lxc/lxc/blob/staging/src/python-lxc/lxc.c is the current version of the file

I think this was the last issue that you noted, so once we get that one re-checked, I'll start poking people to get LXC promoted in saucy.

Revision history for this message
Jamie Strandboge (jdstrand) wrote :

Seth, Stephane says the bindings have been updated. Can you take another look?

Changed in lxc (Ubuntu):
assignee: MIR approval team (ubuntu-mir) → Seth Arnold (seth-arnold)
Revision history for this message
Seth Arnold (seth-arnold) wrote :

I reviewed lxc 0.9.0-0ubuntu18 as checked into saucy. This is not
a complete security audit but only a quick gauge of code cleanliness.

I previously reviewed lxc (0.9.0~rc1-0ubuntu3), details here:
https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/509647/comments/4

The code quality of the Python bindings has improved drastically.
The lock ordering with lxc_container_free() has been addressed.

Well done on both counts.

Many of the less-important problems I found are still available to
be fixed (an opportunity for someone who is looking to get started in
contributing to Ubuntu, perhaps) but one issue remains that is still a
blocker for main: most binaries are lacking one or more of the security
hardening tools offered by the toolchain.

So: Please enable PIE, stack protection, and immediate binding for all
binaries. This is the final hurdle. :)

Thanks

Changed in lxc (Ubuntu):
assignee: Seth Arnold (seth-arnold) → nobody
Revision history for this message
Stéphane Graber (stgraber) wrote :

I can build with:
export DEB_BUILD_MAINT_OPTIONS = hardening=+stackprotector,+fortify,+format,+relro,+bindnow

But adding +pie causes a FTBFS: http://lxc.dev.stgraber.org/stgraber/20130809-1020/ubuntu-saucy-amd64/lxc_0.9.0.0~stgraber~20130809-1020-0ubuntu1~ppa1~saucy1_amd64-20130809-1021.build

Revision history for this message
Stéphane Graber (stgraber) wrote :

That's because we can't use PIE for the library, now to figure out how to have it ignored there...

Revision history for this message
Stéphane Graber (stgraber) wrote :

Ok, so dpkg-buildflags is too much of a pain to get right, so I'll just go the lazy way and use hardening-wrapper.

Here's the result going this way:
stgraber@castiana:~# hardening-check /usr/bin/lxc-monitor
/usr/bin/lxc-monitor:
 Position Independent Executable: yes
 Stack protected: yes
 Fortify Source functions: yes
 Read-only relocations: yes
 Immediate binding: yes

stgraber@castiana:~# hardening-check /usr/lib/x86_64-linux-gnu/liblxc.so.0
/usr/lib/x86_64-linux-gnu/liblxc.so.0:
 Position Independent Executable: no, regular shared library (ignored)
 Stack protected: yes
 Fortify Source functions: yes (some protected functions found)
 Read-only relocations: yes
 Immediate binding: yes

Revision history for this message
Seth Arnold (seth-arnold) wrote :

Stéphane, thanks for fixing this.

Security team ACK for main.

Changed in lxc (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.