mesos:1.6.x

Last commit made on 2018-07-25
Get this branch:
git clone -b 1.6.x https://git.launchpad.net/mesos

Branch information

Name:
1.6.x
Repository:
lp:mesos

Recent commits

2bc0086... by Greg Mann <email address hidden>

Bumped the Mesos version to 1.6.2.

70649a2... by Benjamin Bannier <email address hidden>

Added balloon framework metric for tasks which were running.

The framework currently exposes metric counters for various expected
and unexpected task termination reasons. Interpreting these counters
can be non-trivial since tasks might fail due to benign, but unknown
external reasons.

This patch adds a counter for the tasks which actually made it to the
running stage which can be correlated with the different terminal task
counts.

Review: https://reviews.apache.org/r/67928

2f6f381... by Benjamin Mahler <email address hidden>

Add MESOS-8418 to the 1.6.2 CHANGELOG.

9129ff1... by Benjamin Mahler <email address hidden>

Improved performance of cgroups::read by verifying after failure.

It turns out that cgroups::verify is expensive and leads to a severe
performance issue on the agent during container metrics collection
if there are a lot of containers on the agent. See MESOS-8418.

Since cgroups::verify serves to provide a helpful error message,
this patch preserves the error message, but only if the read fails.

Longer term, there probably needs to be some re-structuring of the
code to make verification caller-controlled, or perhaps the verify
code can occur consistently post-operation (as done in this patch),
or perhaps verify can be optimized substantially.

Review: https://reviews.apache.org/r/67923/
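
The verify-after-failure idea in the commit above can be illustrated with a minimal sketch. It is not the Mesos implementation: `readFile`, `verifyPath`, `ReadResult` and the `verifyCalls` counter are hypothetical names, and a plain file read stands in for reading a cgroups control file.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Outcome of a read: `ok` tells whether `value` or `error` is meaningful.
struct ReadResult {
  bool ok;
  std::string value;
  std::string error;
};

// Counts calls to the (pretend) expensive check, for illustration only.
int verifyCalls = 0;

// Hypothetical stand-in for an expensive validity check such as
// cgroups::verify. Returns a description of the problem, or an empty
// string if it finds none.
std::string verifyPath(const std::string& path) {
  ++verifyCalls;
  std::ifstream file(path.c_str());
  if (file.is_open()) {
    return std::string();
  }
  return "'" + path + "' does not exist or is not readable";
}

// Attempts the read first and runs the check only if the read failed,
// so the success path never pays for verification.
ReadResult readFile(const std::string& path) {
  std::ifstream file(path.c_str());
  if (file.is_open()) {
    std::string contents((std::istreambuf_iterator<char>(file)),
                         std::istreambuf_iterator<char>());
    return ReadResult{true, contents, ""};
  }

  // Failure path: verification is used purely to build a helpful message.
  std::string reason = verifyPath(path);
  if (reason.empty()) {
    reason = "failed to read '" + path + "'";
  }
  return ReadResult{false, "", reason};
}
```

In this sketch a successful `readFile` call leaves `verifyCalls` untouched, while a failed call increments it once and returns the message produced by the check.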

ae82dd5... by Greg Mann <email address hidden>

Added MESOS-8987 to the 1.6.1 CHANGELOG.

778430e... by Gastón Kleiman <email address hidden>

Prevented master from asking agents to shutdown on auth failures.

The Mesos master sends a `ShutdownMessage` to an agent if there is an
authentication or an authorization error during agent (re)registration.

Upon receipt of this message, the agent kills all its tasks and commits
suicide. This means that transient auth errors can lead to whole agents
being killed along with their tasks.

This patch prevents the master from sending a `ShutdownMessage` in these
cases.

Review: https://reviews.apache.org/r/67791/

052af58... by Alexander Rojas <email address hidden>

Fixed improperly guarded future.

This patch fixes a bug where a code path could crash by calling
`Future<T>::get()` on a failed future.

Review: https://reviews.apache.org/r/67722
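
As a rough illustration of what guarding a future means here, the sketch below checks the state before calling `get()`. The `Future` class is a toy written for this example, not `process::Future`; it only borrows the method names `isReady()`, `isFailed()`, `get()` and `failure()`, and it assumes `T` is default-constructible. The `describe` helper is hypothetical.

```cpp
#include <cstdlib>
#include <string>

// Toy future for illustration only. It is NOT process::Future.
template <typename T>
class Future {
public:
  static Future ready(const T& value) {
    Future future;
    future.state_ = READY;
    future.value_ = value;
    return future;
  }

  static Future failed(const std::string& message) {
    Future future;
    future.state_ = FAILED;
    future.message_ = message;
    return future;
  }

  bool isReady() const { return state_ == READY; }
  bool isFailed() const { return state_ == FAILED; }

  // Aborts the program when the future holds no value, which is the
  // kind of crash an unguarded call runs into.
  const T& get() const {
    if (!isReady()) {
      std::abort();
    }
    return value_;
  }

  const std::string& failure() const { return message_; }

private:
  enum State { PENDING, READY, FAILED };

  Future() : state_(PENDING), value_() {}

  State state_;
  T value_;
  std::string message_;
};

// Guarded access: `get()` is reached only after `isReady()` succeeded,
// and the failed state is handled explicitly.
std::string describe(const Future<int>& future) {
  if (future.isReady()) {
    return "ready: " + std::to_string(future.get());
  }
  if (future.isFailed()) {
    return "failed: " + future.failure();
  }
  return "pending";
}
```

With this toy type, `describe(Future<int>::failed("boom"))` takes the failure branch and never reaches `get()`.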

f86c9f8... by Benjamin Mahler <email address hidden>

Added MESOS-9024 to the 1.6.1 CHANGELOG.

b01c0a8... by Benjamin Mahler <email address hidden>

Reduced likelihood of a stack overflow in libprocess socket recv path.

Currently, the socket recv path is implemented using an asynchronous
loop with callbacks. Without using `process::loop`, this pattern is
prone to a stack overflow in the case that all asynchronous calls
complete synchronously. This is possible with sockets if the socket
is always ready for reading. The crash has been reported in MESOS-9024,
so the stack overflow has been encountered in practice.

This patch updates the recv path to leverage `process::loop`, which
is supposed to prevent stack overflows in asynchronous loops. However,
it is still possible for `process::loop` to stack overflow due to
MESOS-8852. In practice, I expect that even without MESOS-8852 fixed,
users won't see any stack overflows in the recv path.

Review: https://reviews.apache.org/r/67824
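
The failure mode described in the commit above can be sketched without libprocess. In the example below, `Source`, `Stats`, `recvNested` and `recvLooped` are hypothetical names; `Source::recv` always completes synchronously, which models a socket that is always ready for reading. The looped variant is a simplified stand-in for the general idea of iterating instead of nesting callbacks, not a reimplementation of `process::loop`.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>

// Hypothetical data source whose reads always complete synchronously.
struct Source {
  std::size_t chunks;  // Number of chunks still available.

  // Hands the next chunk size to `callback` right away; 0 means end of stream.
  void recv(const std::function<void(std::size_t)>& callback) {
    if (chunks == 0) {
      callback(0);
      return;
    }
    --chunks;
    callback(64);
  }
};

// Bookkeeping used to observe how deeply the receive function nests.
struct Stats {
  std::size_t bytes = 0;
  std::size_t depth = 0;
  std::size_t maxDepth = 0;
};

// Callback-driven loop: every synchronous completion starts the next
// read from inside the callback, so nesting grows by one per chunk.
void recvNested(Source& source, Stats& stats) {
  ++stats.depth;
  stats.maxDepth = std::max(stats.maxDepth, stats.depth);
  source.recv([&](std::size_t n) {
    if (n > 0) {
      stats.bytes += n;
      recvNested(source, stats);
    }
  });
  --stats.depth;
}

// Iterative variant: the callback only records whether to continue, and
// the next read is issued from the loop, so nesting stays constant.
void recvLooped(Source& source, Stats& stats) {
  ++stats.depth;
  stats.maxDepth = std::max(stats.maxDepth, stats.depth);
  bool more = true;
  while (more) {
    more = false;
    source.recv([&](std::size_t n) {
      if (n > 0) {
        stats.bytes += n;
        more = true;
      }
    });
  }
  --stats.depth;
}
```

Both functions read the same number of bytes from equally sized sources. The nested variant's `maxDepth` grows with the number of chunks, while the looped variant's stays at 1, which is why a long enough run of synchronous completions can exhaust the stack only in the nested form.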

d565c17... by Zhitao Li <email address hidden>

Added MESOS-9049 to the 1.6.1 CHANGELOG.