Mir

Merge lp:~vanvugt/mir/fatal-error into lp:mir

fatal-error
Merge into development-branch

Proposed by Daniel van Vugt on 2014-05-14

Status:

Rejected

Rejected by:

Daniel van Vugt on 2014-07-23

Proposed branch:

lp:~vanvugt/mir/fatal-error

Merge into:

lp:mir

Diff against target:

587 lines (+256/-79)

13 files modified

include/shared/mir/fatal.h (+52/-0)
src/platform/graphics/mesa/CMakeLists.txt (+2/-2)
src/platform/graphics/mesa/display_buffer.cpp (+9/-8)
src/platform/graphics/mesa/real_kms_output.cpp (+18/-29)
src/shared/CMakeLists.txt (+1/-0)
src/shared/fatal/CMakeLists.txt (+3/-0)
src/shared/fatal/fatal.cpp (+43/-0)
tests/acceptance-tests/CMakeLists.txt (+1/-0)
tests/acceptance-tests/test_server_shutdown.cpp (+34/-0)
tests/unit-tests/CMakeLists.txt (+2/-0)
tests/unit-tests/graphics/mesa/test_display.cpp (+27/-27)
tests/unit-tests/graphics/mesa/test_real_kms_output.cpp (+15/-13)
tests/unit-tests/test_fatal.cpp (+49/-0)

To merge this branch:

bzr merge lp:~vanvugt/mir/fatal-error

High

Won't Fix

Link a bug report

Reviewer	Review Type	Date Requested	Status
Cemil Azizoglu (community)			Disapprove on 2014-07-22
Robert Carr (community)			Needs Fixing on 2014-07-21
Alexandros Frantzis (community)		2014-05-14	Needs Fixing on 2014-07-21
PS Jenkins bot (community)	continuous-integration		Approve on 2014-07-14
Alan Griffiths		2014-05-20	Needs Fixing on 2014-07-09
Chris Halse Rogers		2014-07-02	Approve on 2014-07-07
Gerry Boland (community)		2014-06-06	Needs Information on 2014-06-06
Michał Sawicz		2014-06-06	Pending
Review via email: mp+219471@code.launchpad.net

Commit message

Add support for "fatal errors". These are situations we don't expect to ever
happen in the absence of bugs and don't have logic to gracefully recover
from. To ensure a useful core file results from fatal errors, we must
implement them by raising a fatal signal (SIGABRT). Exceptions are frequently
useless in this scenario because stack unwinding clobbers all forensic
information including the stack trace that you need to debug and diagnose the
cause. (LP: #1316867)

Description of the change

The need for this feature has been a long time coming. In 2013 when we asked
for public testing of XMir we were bombarded by crash reports containing
unusable core files, the result of exceptions. And so the root causes of
those crashes mostly (and still do) remain unsolved.

Caveats:
  1. I have not converted everything that could be converted to mir::abort().
     Just enough to resolve LP: #1316867 covering the major sources of
     crashes reported by users.
  2. GoogleTest is reporting warnings in the presence of threads when calling
     EXPECT_DEATH/EXIT. These appear to be either spurious or harmless as
     the warnings still occur in simple single-threaded tests like:
        mir_unit_tests --gtest_filter="FatalTest.*"

Revision history for this message

Alexandros Frantzis (afrantzis) wrote on 2014-05-14:

This negates the benefits of previous work to ensure proper shutdown in case of server problems, since many (most?) of our exceptions, especially in the platforms, are not recoverable and hence could be changed to mir::abort(). From that perspective this MP causes a regression for https://bugs.launchpad.net/mir/+bug/1189770 .

Like last time we had this discussion, it's a matter of what we deem to be more important: user experience or crash information. Ideally we want both, and there are some avenues worth exploring that may allow us to achieve that middle ground. For example we could get the core with 'gcore' and pipe that to whatever command/file is in /proc/sys/kernel/core_pattern, mimicking a crash, and then graciously shutdown mir.

On the other hand, if the init system is able to bring USC/unity8 back up properly after an abort (which is probably the case) then this is a problem mostly for developers (e.g. crashing a test server and not being able to change VTs) not for normal users. So, perhaps the best way forward is to be able to configure what mir::abort() does. For example, it could std::abort by default, or optionally throw + gcore to a file.

"Needs discussion"

review: Needs Information

Revision history for this message

PS Jenkins bot (ps-jenkins) wrote on 2014-05-14:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/mir-team-mir-development-branch-ci/1618/rebuild

review: Approve (continuous-integration)

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-05-15:

"last time we had this discussion" I feel we made the wrong decision by allowing the offending change to land:
https://code.launchpad.net/~afrantzis/mir/fix-1189770/+merge/217652

The reason being we have a collection of hidden critical bugs that will never be resolvable without this change. They're hiding in the duplicates list on the right-hand-side of:
https://bugs.launchpad.net/mir/+bug/1237332

I believe the user experience of eventually not crashing at all is more important than recovering the screen but never fixing the crashes. Although we could achieve both if we utilize the existing logic that catches crashes and does lightweight recovery: server/run_mir.cpp
I think the screen/VT recovery should be there.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-05-16:

Alexandros, I think we should resolve your concern as bug 1320114. I now realize that's a problem which pre-dates this proposal as the fix for bug 1189770 should have included it.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-05-16:

It's worth noting that you can safely do a little cleanup after a fatal signal because you're only making the stack deeper in doing so. No unwinding (loss of stack frames) occurs, so it's fine.

Revision history for this message

PS Jenkins bot (ps-jenkins) wrote on 2014-05-20:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/mir-team-mir-development-branch-ci/1676/rebuild

review: Needs Fixing (continuous-integration)

Revision history for this message

PS Jenkins bot (ps-jenkins) wrote on 2014-05-20:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/mir-team-mir-development-branch-ci/1678/rebuild

review: Approve (continuous-integration)

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-05-21:

> This negates the benefits of previous work to ensure proper shutdown in case
> of server problems, since many (most?) of our exceptions, especially in the
> platforms, are not recoverable and hence could be changed to mir::abort().
> From that perspective this MP causes a regression for
> https://bugs.launchpad.net/mir/+bug/1189770 .

I presume that by rasing SIGABRT we'll run the cleanup handler installed by run_mir(). Why doesn't that adequately address lp:1189770?

> Like last time we had this discussion, it's a matter of what we deem to be
> more important: user experience or crash information. Ideally we want both,
> and there are some avenues worth exploring that may allow us to achieve that
> middle ground. For example we could get the core with 'gcore' and pipe that to
> whatever command/file is in /proc/sys/kernel/core_pattern, mimicking a crash,
> and then graciously shutdown mir.

User experience is always key. And anything that damages that should be optional (like the above or a debug mode that cores immediately without any cleanup).

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-05-22:

Alan,
Due to a small oversight, the fix for bug 1189770 is missing a fix for the signal/crash case. So I logged that as a new bug 1320114 and Alexandros is making progress toward resolving it in:
lp:~afrantzis/mir/emergency-cleanup

So fatal signals such as SIGABRT raised by mir::abort() do still go through our emergency cleanup code. We just accidentally forgot to include VT/DRM recovery in that code path. But alf is working toward fixing that(?).

Thus when all the fixes are in place Mir will both recover the screen *and* dump clean core files. However they're not strictly dependent activities, and I'd rather keep proposals separate and smaller where possible.

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-06-06:

I'm as keen as anyone to get clean cores in our development. But it is not so clear this is the best option for downstream developers - who may find it "impolite" for a library they are using to kill their whole process.

Adding Gerry and Michal to reviewers list for a "downstream" viewpoint.

review: Abstain

Revision history for this message

Michał Sawicz (saviq) wrote on 2014-06-06:

If things are recoverable at all, sure it would be best to not abrt. But then there's probably only so many instances when that's the case. In theory we could try and restart Mir, but since the shell is the client, that will probably be rather difficult, so I can't see many instances where Mir can die while the shell can live on...

Revision history for this message

Gerry Boland (gerboland) wrote on 2014-06-06:

I'm all for better stack traces. But I'd like to know if Mir aborts, what happens to threads that the shell owns? Does shell have any way to clean up its own state?

review: Needs Information

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-06-06:

> I'm all for better stack traces. But I'd like to know if Mir aborts, what
> happens to threads that the shell owns? Does shell have any way to clean up
> its own state?

Add a handler to the_emergency_cleanup()...

Revision history for this message

PS Jenkins bot (ps-jenkins) wrote on 2014-07-02:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/mir-team-mir-development-branch-ci/2056/rebuild

review: Approve (continuous-integration)

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-03:

> > I'm all for better stack traces. But I'd like to know if Mir aborts, what
> > happens to threads that the shell owns? Does shell have any way to clean up
> > its own state?
>
> Add a handler to the_emergency_cleanup()...

Although it's worth noting that this will be run in a signal context, so you'll be severely restricted as to what you can reliably do! (The emergency cleanup code currently looks like it allocates memory in the signal context which runs a reasonable chance of deadlocking).

Globally this seems like a reasonable idea, but is there really no way to get useful debugging information out of C++ in the presence of exceptions?

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-04:

Apparently not. C++ exceptions are mostly designed to be caught and handled (resolved). In some cases you do retain the original stack of an exception, but that only happens in bare threads where no one is catching. So that's a rather unrealistic scenario (unless you disallow all non-specific catches?).

Once you've caught (or even rethrown?) an exception, the original error details are lost. Because your thread only has one stack.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-04:

Another way to approach the problem would be to a:
   class UncatchableException
   {
   public:
      UncatchableException() { abort(); }
   }

That means you can still "throw" instead of using the new syntax "mir::abort". Although I'm not sure I like that.

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-04:

Thinking about this some more, I suspect that this is not actually what we really want.

Core files aren't actually very useful for diagnosing things like bug# 1237332, or the many ways unity-system-compositor can fail to start with some form of DRM error. We really want not only the current state, but also the historical state - what DRM devices have been considered, what were their statuses, etc.

For SEGV-type crashes we are forced to go poking around in a core file because we haven't considered the problem ahead of time, but as I understand the code we currently get those, and even there we'd have a better time with historical state as well.

So, while dumping a more useful core on fatal exception would be more useful than the current status, I think what we *actually* want to do is dump about 300x more information to logs by default - of unsuccessful actions, but equally importantly of successful actions. We need to understand what happened leading up to the error condition, and that's often not captured in a core, however pristine.

Dump each udev device probed, each successful modeset, successful EGL initialisation, etc. Things which happen infrequently or only at startup are good targets for logging.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-07:

Absolutely not. I've been doing core analysis from the wild for many years. And the most useless bug reports are always those from projects that try to do their own error handling/logging. Just don't do it. A developer with a debugger and correct symbols can always discover more from a clean core file. As can launchpad's automatic retracer.

With correct symbols (which is generally the hardest part, but feasible), you can analyse data structures by name and dive into all your variables in a meaningful way, post-mortem. It's just not feasible to know in advance what your crashes will be and which variables are the most relevant to debugging them. That's what core files solve.

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-07:

Hm.

So, I think we're actually talking about two different things here: crashes and “crashes”, which are really uncaught exceptions.

You obviously can't know what your crashes will be in advance. On the other hand, you know *exactly* what your “crashes” will be, because you deliberately put them in there.

A core file is good (or at least usually sufficient) for crashes, because the cause of the crash is the thing the program is trying to do _right now_, so it's highly likely that the state of interest is the current state.

For uncaught exceptions the current state is frequently not the interesting state - when unity-system-compositor fails to start with “could not find drm device”, the interesting data is in previous iterations of the “probe this udev device” loop. It's idiomatic to try and do a bunch of things, and if they *all* fail, throw.

Clean core files are useful, and mir::abort() looks to be useful, so I'm not disapproving.

However, most of the places you're calling mir::abort() in this branch are potentially temporary failures, and it _is_ inappropriate to abort there - failure to schedule pageflips, failure to set one crtc (out of potentially more than one active crtc), and failure in eglSwapBuffers are all potentially transitory or recoverable problems which this branch turns into a SIGABRT.

If we're raising these exceptions on a non-main thread we should stop doing that (or catch them and propagate them _to_ the main thread), though.

review: Needs Fixing

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-07:

Some of this is coming from Xorg experience. X writes ~500 lines of output during normal startup, and this output is extremely useful in debugging the most common failures; just a core file would be much _less_ useful for most of those.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-07:

The "potentially temporary failures" thing is a real issue I know. Although it's not feasible to go rewriting all that logic to support recovery in the short term. It's a separate long term activity and not even clear how/if they are really recoverable at all.

Making some of these fatal errors into recoverable errors is a nice enhancement for the future but should not be considered a prerequisite for this. Or else we'll never get a fix for bug 1316867 in time for RTM/utopic.

I like the idea of more logging. Although that too is a separate activity. It can be done orthogonally to error handling. Your logging does not have to be (and shouldn't be) done just when errors occur.

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-07:

Under the acknowledgement that we'll need to rip this out and implement a proper solution when the time is available, I find this reasonable.

review: Approve

Revision history for this message

Alexandros Frantzis (afrantzis) wrote on 2014-07-07:

Another solution, trying to combine the best of both worlds, was discussed during standup today: let the client of the library decide whether runtime errors like the ones discussed in this MP should abort, throw or do something completely different.

By default the library should be polite with client code and throw exceptions. Certain clients, which can safely handle Mir abort()s and from which we want comprehensive debugging information, can explicitly configure mir to abort(). I imagine that both unity8 and USC will want to do that.

"Needs discussion"

review: Needs Information

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-07-07:

> Another solution, trying to combine the best of both worlds, was discussed
> during standup today: let the client of the library decide whether runtime
> errors like the ones discussed in this MP should abort, throw or do something
> completely different.
>
> By default the library should be polite with client code and throw exceptions.
> Certain clients, which can safely handle Mir abort()s and from which we want
> comprehensive debugging information, can explicitly configure mir to abort().
> I imagine that both unity8 and USC will want to do that.
>
> "Needs discussion"

We as a library have no way of knowing what else is running in the same process nor whether it is acceptable to terminate the process at any given time. Hence, it is extremely impolite for to abort process unless the client code has caused it in some way. (I.e. either the client has requested the abort or has broken a precondition.)

I'm not saying it is a primary use case, but it is conceivable that a process with another primary purpose spins up a Mir compositor briefly and expects to continue after it closes down. With errors reported via exceptions that is possible, with errors reported by "abort()" it isn't. (FWIW I'd even like it to be fully supported to run multiple Mir servers in a process without an error in one bringing the whole lot down.)

So long as the client code uses the Mir API correctly I don't think we should ever abort. (That doesn't preclude the client installing handlers for "very bad things" into Mir that abort *in client code* with our stack intact.)

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-08:

Firstly keep in mind that the code does crash already, with unhandled (and undebuggable) exceptions. The change proposed here just changes them into being "debuggable".

Secondly, wanting to handle and recover from errors is a separate activity. The offending exceptions have been in the Mir code for years now and as yet no one has made them recoverable. So doing that is a large long term task and shouldn't block this one.

<IMHO>It's much more important that we be able to debug these problems and actually stop them from happening than the current situation where one can indeed clean up some things on an exception, but not report enough information to us to ever help to fix the underlying cause.</IMHO>

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-07-09:

> Firstly keep in mind that the code does crash already, with unhandled (and
> undebuggable) exceptions. The change proposed here just changes them into
> being "debuggable".

There's only a crash if the exception propagating from run_mir() isn't caught.

That is a (questionable) decision made by client code and not something we impose in the Mir library. (The only example code we provide that fails to handle the exception is minimal_server.cpp.)

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-09:

No, these exceptions are raised in the compositing thread, right? Client code has no opportunity to catch them.

-----Original Message-----
From: "Alan Griffiths" <email address hidden>
Sent: ‎9/‎07/‎2014 19:23
To: "Daniel van Vugt" <email address hidden>
Subject: Re: [Merge] lp:~vanvugt/mir/fatal-error into lp:mir/devel

> Firstly keep in mind that the code does crash already, with unhandled (and
> undebuggable) exceptions. The change proposed here just changes them into
> being "debuggable".

There's only a crash if the exception propagating from run_mir() isn't caught.

That is a (questionable) decision made by client code and not something we impose in the Mir library. (The only example code we provide that fails to handle the exception is minimal_server.cpp.)
--
https://code.launchpad.net/~vanvugt/mir/fatal-error/+merge/219471
You are reviewing the proposed merge of lp:~vanvugt/mir/fatal-error into lp:mir/devel.

Revision history for this message

Alexandros Frantzis (afrantzis) wrote on 2014-07-09:

> No, these exceptions are raised in the compositing thread, right? Client code
> has no opportunity to catch them.

Any exception that reaches the top of the compositing threads (i.e. it is not handled by that thread), is caught, the server is torn down, and the exception is rethrown when run_mir() finishes.

See the code in multi_threaded_compositor.cpp and the mir::terminate_with_current_exception() function for more.

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-07-09:

> > No, these exceptions are raised in the compositing thread, right? Client
> code
> > has no opportunity to catch them.
>
> Any exception that reaches the top of the compositing threads (i.e. it is not
> handled by that thread), is caught, the server is torn down, and the exception
> is rethrown when run_mir() finishes.
>
> See the code in multi_threaded_compositor.cpp and the
> mir::terminate_with_current_exception() function for more.

Right.

What this proposal does is take code that leads to an orderly shutdown with an exception the client can (and is expected to) handle into an unconditional crash.

As I've stated before in this thread I don't think a library should behave this way.

review: Needs Fixing

Revision history for this message

Kevin DuBois (kdub) wrote on 2014-07-09:

I don't think its very well behaved for the library to abort on the user, but on the other hand do appreciate the extra debug info. Seeing as we know when that drm (or hwc for that matter) errors are not handled, maybe we could pack the exception with all the info we could glean (say via glibc's backtrace()) and hope that the client code prints it. (or we could just spew to to stderr). Obviously, this is not as nice as a coredump, but it seems to me to be an okay compromise between the concerns.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-10:

The existing behaviour ("orderly shutdown") is in no way more useful that what is proposed here. We have now spent _years_ unable to fix our crashes because of the existing behaviour. That is unacceptable.

The assertion that "the client can handle" these exceptions is false. Clients can't handle them because even we don't know how to handle them. If we did, then we would be doing so. For clients to blindly catch those exceptions is not useful, because the system is left with a blank screen and doesn't recover.

Core files are significantly more useful to fixing bugs than exception details. Particularly since we have an automated system in place that analyses cores for us. Please let us use it so we can fix these crashes.

Simply recovering/retrying things that fail before you ever understand how they can fail is too dangerous and risks getting stuck in retry loops and/or crashing again after multiple failures in less useful locations.

Unfortunately because most developers ignore bugs the severity of the situation is not fully understood by most developers. But last time we asked for public testing of Mir we were bombarded by crash reports which to this day remain unsolved. I want to avoid that happening again.

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-07-10:

I don't think its very well behaved for the library to abort on the user.

I think it is important that the user (in this case U8 and/or USC) can get a core.

Alexandros has suggested an approach that meets both these concerns. (https://code.launchpad.net/~vanvugt/mir/fatal-error/+merge/219471/comments/544375)

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-11:

Absolutely crashing is not good behaviour. But if that's the fastest path to resolving the root cause then it's the best option.

I don't think making it configurable is a good idea. Because that gives shell/server developers the option of regressing on LP: #1316867 when there is no existing DRM logic in place for a clean retry/recovery to occur. I mean if a shell did catch such an error as an exception there is no code path back into the Mir DRM code to recover at all. It's not useful to have exceptions when there's no catch/recovery plan. And I think that would be irresponsible of us to allow shell developers to experience Mir bugs which don't generate meaningful crash reports we can work with and resolve.

Revision history for this message

Cemil Azizoglu (cemil-azizoglu) wrote on 2014-07-11:

I wonder if there is a way we can recover the VT/display using a separate executable tied to a watchdog, or a hot-key combination of sorts which can be triggered automatically or through a hot-key combination, to get the display back into a sane state after a fatal crash.

And perhaps a good history logging mechanism, as suggested by Chris, which we can "play back" post-mortem in a developer binary where we have aborts instead of exceptions.

I am also of the opinion that the user-experience must not be compromised just because we have bugs in the system we want to address as quickly as possible. And there will continue to be bugs even after we collect many core files and address the ones that have caused them. But we should not forget that providing a good user-experience is key - piss your users off enough many times and there won't be very many left to exercise bugs in the code and help us collect core files for us to fix.

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-07-11:

Additional state reporting and logging is a part of the solution. But not all of it.

There are two sets of users to consider here:

1. the users of Mir - shell/server developers exemplified by U8 and U-S-C
2. the users of U8

The former are the ones that should decide whether an abort is acceptable behavior for their program. Not the developers of Mir.

It may even be argued that the latter should be offered a choice too. Then beta testers can run with one option and final users with another.

For example, an option like this:

--on-fatal-error arg (=throw) Action to take on a fatal error [throw|abort|break]

In U8 and/or U-S-C development we can change the default to "abort" in order to ensure better diagnostics. But reconfigure in a production environment to "throw".

With "throw" there should be a bespoke a FatalError exception that is understood by mir::report_exception() and reads:

    to get additional debug information in a core:
       $ ulimit -c unlimited
    and rerun with "--on-fatal-error abort"

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-07-11:

I think the current "orderly shutdown" behavior is a good option to have and if we break it (even with the intent of restoring it "eventually") it will be hard to recover.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-14:

We already do recover the VT/display on fatal signals. That was Fix Released in Mir 0.4.0, although we failed to document it as fixed so I just updated bug 1320114 now. I was on vacation when that change landed.

mir::abort() follows an orderly shutdown procedure. It does recover the screen and provides a _more_ friendly user experience in that it allows us to finally debug the crashes and fix them once and for all. Thus, more friendly in that the number of crashes will be reduced.

I don't see what benefit exceptions buy you in this case. Our fatal signal handler is just as good at recovering the screen and providing a nice user experience. The only difference is that exceptions would prevent us from getting good crash reports and being able to fix bugs.

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-14:

Exceptions allow users of our library to also partake of the
clean-shutdown goodness.

They may want to do things other than SIGABRT when their Mir server
crashes.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-14:

I thought users could utilize the emergency recovery code?... Or maybe it needs more API still?...
https://code.launchpad.net/~afrantzis/mir/platform-emergency-cleanup/+merge/222662
Inserting new handlers with that _should_ allow you to clean up on both exceptions and fatal signals.

If you relied on exceptions alone for clean up then you're missing out, because that will never catch crashes like SIGSEGV.

Revision history for this message

Chris Halse Rogers (raof) wrote on 2014-07-14:

That assumes that terminating is the correct behavior, and that everything they want to do is safe in a signal context, which it probably isn't.

-----Original Message-----
From: "Daniel van Vugt" <email address hidden>
Sent: ‎14/‎07/‎2014 18:32
To: "Daniel van Vugt" <email address hidden>
Subject: Re: [Merge] lp:~vanvugt/mir/fatal-error into lp:mir/devel

If you relied on exceptions alone for clean up then you're missing out, because that will never catch crashes like SIGSEGV.
--
https://code.launchpad.net/~vanvugt/mir/fatal-error/+merge/219471
You are reviewing the proposed merge of lp:~vanvugt/mir/fatal-error into lp:mir/devel.

Revision history for this message

PS Jenkins bot (ps-jenkins) wrote on 2014-07-14:

Click here to trigger a rebuild:
http://s-jenkins.ubuntu-ci:8080/job/mir-team-mir-development-branch-ci/2144/rebuild

review: Approve (continuous-integration)

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-21:

Blocked on Alan's "Needs Fixing". Not sure what needs fixing though. Looks like a general disapproval of the whole concept without any alternate solutions for bug 1316867 being proposed yet.

Revision history for this message

Alexandros Frantzis (afrantzis) wrote on 2014-07-21:

> Blocked on Alan's "Needs Fixing". Not sure what needs fixing though. Looks
> like a general disapproval of the whole concept without any alternate
> solutions for bug 1316867 being proposed yet.

What "Needs Fixing" is:

1. Aborting as the only action for fatal errors.
2. Aborting as the default action for fatal errors.

Both of would be solved by introducing a mechanism that allows the _users_ of libmirserver to decide what to do in case of fatal errors. The default implementation would be to throw an exception, which, as discussed above, is the most safe and polite reaction.

I expect unity8 and USC to override the fatal error handler and abort() on error to provide sufficient debugging information in the wild, but that's ultimately their decision to make.

review: Needs Fixing

Revision history for this message

Robert Carr (robertcarr) wrote on 2014-07-21:

It seems like exceptions are more flexible to the library user, who knows what they want to do. Installing emergency cleanup handlers isn't a sufficient substitute (signal context). I don't like abort by default because of this.

review: Approve

Revision history for this message

Robert Carr (robertcarr) wrote on 2014-07-21:

I clicked the wrong review status

review: Needs Fixing

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-22:

I'm confused. I've already explained why exceptions are unacceptable here.

Giving people the option of regressing on the bug without providing them with any DRM recovery logic to use as an alternative solution in exception handlers is pointless. It fails to address bug 1316867, defeating the point of this whole exercise.

Revision history for this message

Daniel van Vugt (vanvugt) wrote on 2014-07-22:

I'm happy to say "Won't Fix" on the bug. But not happy to wear the blame for it.

Revision history for this message

Alan Griffiths (alan-griffiths) wrote on 2014-07-22:

I don't understand your confusion. Ever since the first review comment it has been explained why exceptions are not only acceptable but also the desired default for a library.

The fixes outlined by Alexandros do indeed need some additional work in unity-mir and/or unity8 to resolve bug 1316867 but those changes are under our control. It would also means that users of Mir will not be swearing at us for writing a library that summarily terminates their process.

Revision history for this message

Cemil Azizoglu (cemil-azizoglu) wrote on 2014-07-22:

We're kind of going round in circles. So I thought I should clarify my position and state where the red-line lies as far as I'm concerned :

- It's important for us not to abort on the user unconditionally just to make our lives easier.

I'd be happy with any configurable way that this is handled as long as abort is not the default. In its current form, I'll go ahead and disapprove.

review: Disapprove

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Daniel van Vugt

Emanuele Antonio Faraone

Gerry Boland

Mir development team

 === added file 'include/shared/mir/fatal.h'
 --- include/shared/mir/fatal.h	1970-01-01 00:00:00 +0000
 +++ include/shared/mir/fatal.h	2014-07-14 06:58:31 +0000
@@ -0,0 +1,52 @@
++/*
++ * Copyright © 2014 Canonical Ltd.
++ *
++ * This program is free software: you can redistribute it and/or modify
++ * it under the terms of the GNU Lesser General Public License version 3 as
++ * published by the Free Software Foundation.
++ *
++ * This program is distributed in the hope that it will be useful,
++ * but WITHOUT ANY WARRANTY; without even the implied warranty of
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++ * GNU Lesser General Public License for more details.
++ *
++ * You should have received a copy of the GNU Lesser General Public License
++ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
++ *
++ * ---
++ * Fatal error handling - Fatal errors are situations we don't expect to ever
++ * happen and don't have logic to gracefully recover from. The most useful
++ * thing you can do in that situation is abort to get a clean core file and
++ * stack trace to maximize the chances of it being readable.
++ *
++ * Author: Daniel van Vugt <daniel.van.vugt@canonical.com>
++ */
++
++#ifndef MIR_FATAL_H_
++#define MIR_FATAL_H_
++
++namespace mir
++{
++/**
++ * Kill the program and dump core as cleanly as possible. Note the reason
++ * parameter is a simple char* so its value is clearly visible in stack trace
++ * output.
++ * mir::abort() is strictly for "this should never happen" situations that
++ * you cannot recover from.
++ *   \param [in] reason  A printf-style format string.
++ */
++void abort(char const* reason, ...);
++
++/**
++ * Mir-specific assert() function that is NEVER optimized out, always tested
++ * even in release builds.
++ */
++#define mir_assert(expr) \
++{ \
++    if (!(expr)) ::mir::_fail_assertion(__FILE__, __LINE__, #expr); \
++}
++void _fail_assertion(char const* file, int line, char const* expr);
++
++} // namespace mir
++
++#endif // MIR_FATAL_H_
 === modified file 'src/platform/graphics/mesa/CMakeLists.txt'
 --- src/platform/graphics/mesa/CMakeLists.txt	2014-04-22 23:15:16 +0000
 +++ src/platform/graphics/mesa/CMakeLists.txt	2014-07-14 06:58:31 +0000
@@ -44,9 +44,9 @@
    LIBRARY_OUTPUT_DIRECTORY ${LIBRARY_OUTPUT_PATH}/mesa
+ )
--target_link_libraries(
--  mirplatformgraphicsmesa
++target_link_libraries(mirplatformgraphicsmesa
    mirplatform
++  mirfatal
    ${Boost_PROGRAM_OPTIONS_LIBRARY}
    ${DRM_LDFLAGS} ${DRM_LIBRARIES}
 === modified file 'src/platform/graphics/mesa/display_buffer.cpp'
 --- src/platform/graphics/mesa/display_buffer.cpp	2014-07-09 08:22:57 +0000
 +++ src/platform/graphics/mesa/display_buffer.cpp	2014-07-14 06:58:31 +0000
@@ -22,6 +22,7 @@
  #include "mir/graphics/display_report.h"
  #include "bypass.h"
  #include "gbm_buffer.h"
++#include "mir/fatal.h"
  #include <boost/throw_exception.hpp>
  #include <GLES2/gl2.h>
@@ -143,18 +144,18 @@
      glClear(GL_COLOR_BUFFER_BIT);
      if (!egl.swap_buffers())
--        BOOST_THROW_EXCEPTION(std::runtime_error("Failed to perform initial surface buffer swap"));
++        mir::abort("Failed to perform initial surface buffer swap");
      listener->report_successful_egl_buffer_swap_on_construction();
      scheduled_bufobj = get_front_buffer_object();
      if (!scheduled_bufobj)
--        BOOST_THROW_EXCEPTION(std::runtime_error("Failed to get frontbuffer"));
++        mir::abort("Failed to get frontbuffer");
      for (auto& output : outputs)
+     {
          if (!output->set_crtc(scheduled_bufobj->get_drm_fb_id()))
--            BOOST_THROW_EXCEPTION(std::runtime_error("Failed to set DRM crtc"));
++            mir::abort("Failed to set DRM crtc");
+     }
      egl.release_current();
@@ -256,7 +257,7 @@
       * corresponding to the front buffer.
       */
      if (!bypass_buf && !egl.swap_buffers())
--        BOOST_THROW_EXCEPTION(std::runtime_error("Failed to perform initial surface buffer swap"));
++        mir::abort("Failed to perform initial surface buffer swap");
      mgm::BufferObject *bufobj;
      if (bypass_buf)
@@ -271,7 +272,7 @@
+     }
      if (!bufobj)
--        BOOST_THROW_EXCEPTION(std::runtime_error("Failed to get front buffer object"));
++        mir::abort("Failed to get front buffer object");
      /*
       * Schedule the current front buffer object for display, and wait
@@ -284,14 +285,14 @@
+     {
          if (!bypass_buf)
              bufobj->release();
--        BOOST_THROW_EXCEPTION(std::runtime_error("Failed to schedule page flip"));
++        mir::abort("Failed to schedule page flip");
+     }
      else if (needs_set_crtc)
+     {
          for (auto& output : outputs)
+         {
              if (!output->set_crtc(bufobj->get_drm_fb_id()))
--                BOOST_THROW_EXCEPTION(std::runtime_error("Failed to set DRM crtc"));
++                mir::abort("Failed to set DRM crtc");
+         }
          needs_set_crtc = false;
+     }
@@ -397,7 +398,7 @@
+ {
      if (!egl.make_current())
+     {
--        BOOST_THROW_EXCEPTION(std::runtime_error("Failed to make EGL surface current"));
++        mir::abort("Failed to make EGL surface current");
+     }
+ }
 === modified file 'src/platform/graphics/mesa/real_kms_output.cpp'
 --- src/platform/graphics/mesa/real_kms_output.cpp	2014-07-09 10:48:47 +0000
 +++ src/platform/graphics/mesa/real_kms_output.cpp	2014-07-14 06:58:31 +0000
@@ -18,11 +18,7 @@
  #include "real_kms_output.h"
  #include "page_flipper.h"
--
--#include <boost/throw_exception.hpp>
--#include <boost/exception/info.hpp>
--
--#include <stdexcept>
++#include "mir/fatal.h"
  #include <vector>
  #include <string.h> // strcmp
@@ -166,7 +162,7 @@
      connector = resources.connector(connector_id);
      if (!connector)
--        BOOST_THROW_EXCEPTION(std::runtime_error("No DRM connector found\n"));
++        mir::abort("No DRM connector found");
      // TODO: What if we can't locate the DPMS property?
      for (int i = 0; i < connector->count_props; i++)
@@ -202,9 +198,10 @@
  bool mgm::RealKMSOutput::set_crtc(uint32_t fb_id)
+ {
      if (!ensure_crtc())
--        BOOST_THROW_EXCEPTION(std::runtime_error("Output " +
--            connector_name(connector.get()) +
--            " has no associated CRTC to set a framebuffer on"));
++    {
++        mir::abort("Output %s has no associated CRTC to set a framebuffer on",
++                    connector_name(connector.get()).c_str());
++    }
      auto ret = drmModeSetCrtc(drm_fd, current_crtc->crtc_id,
                                fb_id, fb_offset.dx.as_int(), fb_offset.dy.as_int(),
@@ -235,12 +232,8 @@
 , 0, 0, nullptr, 0, nullptr);
      if (result)
+     {
--        std::string const msg =
--            "Couldn't clear output " + connector_name(connector.get());
--
--        BOOST_THROW_EXCEPTION(
--            ::boost::enable_error_info(std::runtime_error(msg))
--                << (boost::error_info<KMSOutput, decltype(result)>(result)));
++        mir::abort("Couldn't clear output %s (drmModeSetCrtc = %d)",
++                   connector_name(connector.get()).c_str(), result);
+     }
      current_crtc = nullptr;
@@ -252,10 +245,10 @@
      if (power_mode != mir_power_mode_on)
          return true;
      if (!current_crtc)
--        BOOST_THROW_EXCEPTION(std::runtime_error("Output " +
--            connector_name(connector.get()) +
--            " has no associated CRTC to schedule page flips on"));
--
++    {
++        mir::abort("Output %s has no associated CRTC to schedule page flips on",
++                   connector_name(connector.get()).c_str());
++    }
      return page_flipper->schedule_flip(current_crtc->crtc_id, fb_id);
+ }
@@ -265,10 +258,10 @@
      if (power_mode != mir_power_mode_on)
          return;
      if (!current_crtc)
--        BOOST_THROW_EXCEPTION(std::runtime_error("Output " +
--            connector_name(connector.get()) +
--            " has no associated CRTC to wait on"));
--
++    {
++        mir::abort("Output %s has no associated CRTC to wait on",
++                   connector_name(connector.get()).c_str());
++    }
      page_flipper->wait_for_flip(current_crtc->crtc_id);
+ }
@@ -283,9 +276,7 @@
                  gbm_bo_get_width(buffer),
                  gbm_bo_get_height(buffer)))
+         {
--            BOOST_THROW_EXCEPTION(
--                ::boost::enable_error_info(std::runtime_error("drmModeSetCursor() failed"))
--                    << (boost::error_info<KMSOutput, decltype(result)>(result)));
++            mir::abort("drmModeSetCursor failed (returned %d)", result);
+         }
          has_cursor_ = true;
@@ -300,9 +291,7 @@
                                              destination.x.as_uint32_t(),
                                              destination.y.as_uint32_t()))
+         {
--            BOOST_THROW_EXCEPTION(
--                ::boost::enable_error_info(std::runtime_error("drmModeMoveCursor() failed"))
--                    << (boost::error_info<KMSOutput, decltype(result)>(result)));
++            mir::abort("drmModeMoveCursor failed (returned %d)", result);
+         }
+     }
+ }
 === modified file 'src/shared/CMakeLists.txt'
 --- src/shared/CMakeLists.txt	2014-07-11 03:33:03 +0000
 +++ src/shared/CMakeLists.txt	2014-07-14 06:58:31 +0000
@@ -14,6 +14,7 @@
  add_subdirectory(sharedlibrary)
  add_subdirectory(thread)
  add_subdirectory(testdraw)
++add_subdirectory(fatal)
  set(
    MIR_COMMON_PLATFORM_LIBRARIES
 === added directory 'src/shared/fatal'
 === added file 'src/shared/fatal/CMakeLists.txt'
 --- src/shared/fatal/CMakeLists.txt	1970-01-01 00:00:00 +0000
 +++ src/shared/fatal/CMakeLists.txt	2014-07-14 06:58:31 +0000
@@ -0,0 +1,3 @@
++add_library(mirfatal STATIC
++  fatal.cpp
++)
 === added file 'src/shared/fatal/fatal.cpp'
 --- src/shared/fatal/fatal.cpp	1970-01-01 00:00:00 +0000
 +++ src/shared/fatal/fatal.cpp	2014-07-14 06:58:31 +0000
@@ -0,0 +1,43 @@
++/*
++ * Copyright © 2014 Canonical Ltd.
++ *
++ * This program is free software: you can redistribute it and/or modify it
++ * under the terms of the GNU Lesser General Public License version 3,
++ * as published by the Free Software Foundation.
++ *
++ * This program is distributed in the hope that it will be useful,
++ * but WITHOUT ANY WARRANTY; without even the implied warranty of
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++ * GNU Lesser General Public License for more details.
++ *
++ * You should have received a copy of the GNU Lesser General Public License
++ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
++*
++ * Author: Daniel van Vugt <daniel.van.vugt@canonical.com>
++ */
++
++#include "mir/fatal.h"
++#include <cstdlib>
++#include <cstdio>
++#include <cstdarg>
++
++void mir::abort(char const* reason, ...)
++{
++    va_list args;
++
++    // Keep this as simple as possible, avoiding any object construction and
++    // minimizing the potential for heap operations between the error location
++    // and the abort().
++    va_start(args, reason);
++    fprintf(stderr, "Mir fatal error: ");
++    vfprintf(stderr, reason, args);
++    fprintf(stderr, "\n");
++    va_end(args);
++
++    std::abort();
++}
++
++void mir::_fail_assertion(char const* file, int line, char const* expr)
++{
++    mir::abort("Assertion failed at %s:%d: %s", file, line, expr);
++}
 === modified file 'tests/acceptance-tests/CMakeLists.txt'
 --- tests/acceptance-tests/CMakeLists.txt	2014-07-11 14:04:15 +0000
 +++ tests/acceptance-tests/CMakeLists.txt	2014-07-14 06:58:31 +0000
@@ -64,6 +64,7 @@
    mirserver
    mirclient
    mirplatform
++  mirfatal
    mir-test
    mir-test-framework
 === modified file 'tests/acceptance-tests/test_server_shutdown.cpp'
 --- tests/acceptance-tests/test_server_shutdown.cpp	2014-07-09 07:54:29 +0000
 +++ tests/acceptance-tests/test_server_shutdown.cpp	2014-07-14 06:58:31 +0000
@@ -23,6 +23,7 @@
  #include "mir/compositor/display_buffer_compositor_factory.h"
  #include "mir/input/composite_event_filter.h"
  #include "mir/run_mir.h"
++#include "mir/fatal.h"
  #include "mir_test_framework/display_server_test_fixture.h"
  #include "mir_test/fake_event_hub_input_configuration.h"
@@ -416,6 +417,39 @@
      });
+ }
++TEST_F(ServerShutdown, server_removes_endpoint_on_mir_abort)
++{   // Even fatal errors sometimes need to be caught for critical cleanup...
++    struct ServerConfig : TestingServerConfiguration
++    {
++        void on_start() override
++        {
++            sync.wait_for_signal_ready_for();
++            mir::abort("Bang");
++        }
++
++        mtf::CrossProcessSync sync;
++    };
++
++    ServerConfig server_config;
++    launch_server_process(server_config);
++
++    run_in_test_process([&]
++    {
++        ASSERT_TRUE(file_exists(server_config.the_socket_file()));
++
++        server_config.sync.signal_ready();
++
++        auto result = wait_for_shutdown_server_process();
++        EXPECT_EQ(mtf::TerminationReason::child_terminated_by_signal, result.reason);
++        // Under valgrind the server process is reported as being terminated
++        // by SIGKILL because of multithreading madness.
++        // TODO: Investigate if we can do better than this workaround
++        EXPECT_TRUE(result.signal == SIGABRT || result.signal == SIGKILL);
++
++        EXPECT_FALSE(file_exists(server_config.the_socket_file()));
++    });
++}
++
  struct OnSignal : ServerShutdown, ::testing::WithParamInterface<int> {};
  TEST_P(OnSignal, removes_endpoint_on_signal)
 === modified file 'tests/unit-tests/CMakeLists.txt'
 --- tests/unit-tests/CMakeLists.txt	2014-07-11 21:53:44 +0000
 +++ tests/unit-tests/CMakeLists.txt	2014-07-14 06:58:31 +0000
@@ -20,6 +20,7 @@
    test_thread_name.cpp
    test_default_emergency_cleanup.cpp
    test_basic_observers.cpp
++  test_fatal.cpp
+ )
  add_subdirectory(options/)
@@ -50,6 +51,7 @@
    mirdraw
    mirtestdraw
    mirlogging
++  mirfatal
    mir-test
    mir-test-doubles
 === modified file 'tests/unit-tests/graphics/mesa/test_display.cpp'
 --- tests/unit-tests/graphics/mesa/test_display.cpp	2014-06-23 16:29:17 +0000
 +++ tests/unit-tests/graphics/mesa/test_display.cpp	2014-07-14 06:58:31 +0000
@@ -450,39 +450,39 @@
+ {
      using namespace testing;
--    auto const crtc_id = get_connected_crtc_id();
--
--    setup_post_update_expectations();
--
--    {
--        InSequence s;
--
--        /* New FB flip failure */
--        EXPECT_CALL(mock_drm, drmModePageFlip(mock_drm.fake_drm.fd(),
--                                              crtc_id,
--                                              fake.fb_id2,
--                                              _, _))
--            .Times(Exactly(1))
--            .WillOnce(Return(-1));
--
--        /* Release failed bufobj */
--        EXPECT_CALL(mock_gbm, gbm_surface_release_buffer(mock_gbm.fake_gbm.surface, fake.bo2))
--            .Times(Exactly(1));
--
--        /* Release scheduled_bufobj (at destruction time) */
--        EXPECT_CALL(mock_gbm, gbm_surface_release_buffer(mock_gbm.fake_gbm.surface, fake.bo1))
--            .Times(Exactly(1));
--    }
--
--    EXPECT_THROW(
--    {
++    EXPECT_DEATH(
++    {
++        auto const crtc_id = get_connected_crtc_id();
++
++        setup_post_update_expectations();
++
++        {
++            InSequence s;
++
++            /* New FB flip failure */
++            EXPECT_CALL(mock_drm, drmModePageFlip(mock_drm.fake_drm.fd(),
++                                                  crtc_id,
++                                                  fake.fb_id2,
++                                                  _, _))
++                .Times(Exactly(1))
++                .WillOnce(Return(-1));
++
++            /* Release failed bufobj */
++            EXPECT_CALL(mock_gbm, gbm_surface_release_buffer(mock_gbm.fake_gbm.surface, fake.bo2))
++                .Times(Exactly(1));
++
++            /* Release scheduled_bufobj (at destruction time) */
++            EXPECT_CALL(mock_gbm, gbm_surface_release_buffer(mock_gbm.fake_gbm.surface, fake.bo1))
++                .Times(Exactly(1));
++        }
++
          auto display = create_display(create_platform());
          display->for_each_display_buffer([](mg::DisplayBuffer& db)
+         {
              db.post_update();
          });
--    }, std::runtime_error);
++    }, "Mir fatal error: Failed to schedule page flip");
+ }
  TEST_F(MesaDisplayTest, successful_creation_of_display_reports_successful_setup_of_native_resources)
 === modified file 'tests/unit-tests/graphics/mesa/test_real_kms_output.cpp'
 --- tests/unit-tests/graphics/mesa/test_real_kms_output.cpp	2014-03-06 06:05:17 +0000
 +++ tests/unit-tests/graphics/mesa/test_real_kms_output.cpp	2014-07-14 06:58:31 +0000
@@ -23,7 +23,7 @@
  #include "mir_test_doubles/mock_drm.h"
--#include <stdexcept>
++#include <csignal>
  #include <gtest/gtest.h>
  #include <gmock/gmock.h>
@@ -227,12 +227,15 @@
                                mt::fake_shared(mock_page_flipper)};
      EXPECT_FALSE(output.set_crtc(fb_id));
--    EXPECT_THROW({
--        output.schedule_page_flip(fb_id);
--    }, std::runtime_error);
--    EXPECT_THROW({
--        output.wait_for_page_flip();
--    }, std::runtime_error);
++
++    EXPECT_EXIT({output.schedule_page_flip(fb_id);},
++                KilledBySignal(SIGABRT),
++                "Mir fatal error: Output VGA-0 has no associated CRTC to schedule page flips on");
++
++    // Yes, gtest is so awesome we can die multiple times...
++    EXPECT_EXIT({output.wait_for_page_flip();},
++                KilledBySignal(SIGABRT),
++                "Mir fatal error: Output VGA-0 has no associated CRTC to wait on");
+ }
  TEST_F(RealKMSOutputTest, clear_crtc_gets_crtc_if_none_is_current)
@@ -285,11 +288,10 @@
      mgm::RealKMSOutput output{mock_drm.fake_drm.fd(), connector_ids[0],
                                mt::fake_shared(mock_page_flipper)};
--    EXPECT_CALL(mock_drm, drmModeSetCrtc(_, crtc_ids[0], 0, 0, 0, nullptr, 0, nullptr))
--        .Times(1)
--        .WillOnce(Return(-1));
++    ON_CALL(mock_drm, drmModeSetCrtc(_, crtc_ids[0], 0, 0, 0, nullptr, 0, nullptr))
++        .WillByDefault(Return(-1));
--    EXPECT_THROW({
--        output.clear_crtc();
--    }, std::runtime_error);
++    EXPECT_EXIT({output.clear_crtc();},
++                KilledBySignal(SIGABRT),
++                "Mir fatal error: Couldn't clear output VGA-0 \\(drmModeSetCrtc = -1\\)");
+ }
 === added file 'tests/unit-tests/test_fatal.cpp'
 --- tests/unit-tests/test_fatal.cpp	1970-01-01 00:00:00 +0000
 +++ tests/unit-tests/test_fatal.cpp	2014-07-14 06:58:31 +0000
@@ -0,0 +1,49 @@
++/*
++ * Copyright © 2014 Canonical Ltd.
++ *
++ * This program is free software: you can redistribute it and/or modify it
++ * under the terms of the GNU General Public License version 3,
++ * as published by the Free Software Foundation.
++ *
++ * This program is distributed in the hope that it will be useful,
++ * but WITHOUT ANY WARRANTY; without even the implied warranty of
++ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
++ * GNU General Public License for more details.
++ *
++ * You should have received a copy of the GNU General Public License
++ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
++ *
++ * Author: Daniel van Vugt <daniel.van.vugt@canonical.com>
++ */
++
++#include "mir/fatal.h"
++#include <gtest/gtest.h>
++#include <csignal>
++
++using namespace testing;
++
++TEST(FatalTest, abort_formats_message_to_stderr)
++{
++    EXPECT_DEATH({mir::abort("%s had %d %s %s", "Mary", 1, "little", "lamb");},
++                 "Mary had 1 little lamb");
++}
++
++TEST(FatalTest, abort_raises_sigabrt)
++{
++    EXPECT_EXIT({mir::abort("Hello world");},
++                KilledBySignal(SIGABRT),
++                "Hello world");
++}
++
++TEST(FatalTest, true_assertion_raises_nothing)
++{
++    mir_assert(1 + 1 == 2);
++}
++
++TEST(FatalTest, false_assertion_emits_message_and_aborts)
++{
++    EXPECT_EXIT({mir_assert(2+2==5);},
++                KilledBySignal(SIGABRT),
++                "Mir fatal error: Assertion failed at .* 2\\+2==5");
++}
++