Mir

Merge lp:~vanvugt/mir/log-unhandled-exceptions into lp:mir

Proposed by Daniel van Vugt
Status: Rejected
Rejected by: Daniel van Vugt
Proposed branch: lp:~vanvugt/mir/log-unhandled-exceptions
Merge into: lp:mir
Diff against target: 31 lines (+3/-1)
2 files modified
src/include/server/mir/frontend/template_protobuf_message_processor.h (+2/-1)
tests/CMakeLists.txt (+1/-0)
To merge this branch: bzr merge lp:~vanvugt/mir/log-unhandled-exceptions
Reviewer Review Type Date Requested Status
Chris Halse Rogers Disapprove
Kevin DuBois (community) Abstain
Mir CI Bot continuous-integration Approve
Alan Griffiths Needs Fixing
Review via email: mp+310398@code.launchpad.net

Commit message

Log internal errors to the server log instead of silently hiding them.
This problem was first observed on VirtualBox where buffer allocation
fails but no errors are logged anywhere and nothing appears on screen.
(related to LP: #1639745)

To post a comment you must log in.
Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

FAILED: Continuous integration, rev:3814
https://mir-jenkins.ubuntu.com/job/mir-ci/2135/
Executed test runs:
    FAILURE: https://mir-jenkins.ubuntu.com/job/build-mir/2764/console
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-0-fetch/2827
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=vivid+overlay/2819
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=xenial+overlay/2819
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=yakkety/2819
    FAILURE: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=yakkety/2793/console
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial+overlay/2793
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial+overlay/2793/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=yakkety/2793
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=yakkety/2793/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=android,release=vivid+overlay/2793
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=android,release=vivid+overlay/2793/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=android,release=vivid+overlay/2793
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=android,release=vivid+overlay/2793/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial+overlay/2793
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial+overlay/2793/artifact/output/*zip*/output.zip

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/2135/rebuild

review: Needs Fixing (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :
3815. By Daniel van Vugt

Try again

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

+ mir::log_error(result_message.error());

Assuming the surrounding code that returns an error status to the client is vaguely correct[1], then I think log_warning() is more appropriate. The server is able to continue operation, and isn't even disconnecting the application session which remains usable.

[1] The "mir::ClientVisibleError" catch clause was clearly added to select exceptions that allow continued operation on the connection, it seems very odd that other exceptions are handled the same way. (But even if this code rethrew the exception, I think "warning" is more appropriate.)

review: Needs Fixing
Revision history for this message
Mir CI Bot (mir-ci-bot) wrote :

PASSED: Continuous integration, rev:3815
https://mir-jenkins.ubuntu.com/job/mir-ci/2136/
Executed test runs:
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-mir/2767
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-0-fetch/2830
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=vivid+overlay/2822
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=xenial+overlay/2822
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-1-sourcepkg/release=yakkety/2822
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=yakkety/2796
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=clang,platform=mesa,release=yakkety/2796/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial+overlay/2796
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=xenial+overlay/2796/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=yakkety/2796
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=amd64,compiler=gcc,platform=mesa,release=yakkety/2796/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=android,release=vivid+overlay/2796
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=cross-armhf,compiler=gcc,platform=android,release=vivid+overlay/2796/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=android,release=vivid+overlay/2796
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=android,release=vivid+overlay/2796/artifact/output/*zip*/output.zip
    SUCCESS: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial+overlay/2796
        deb: https://mir-jenkins.ubuntu.com/job/build-2-binpkg-mir/arch=i386,compiler=gcc,platform=mesa,release=xenial+overlay/2796/artifact/output/*zip*/output.zip

Click here to trigger a rebuild:
https://mir-jenkins.ubuntu.com/job/mir-ci/2136/rebuild

review: Approve (continuous-integration)
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

I did consider error vs warning. But I think this one is an error. Errors which don't break the server state, but do break the correctness of what's on screen, are arguably still errors. A problem which breaks the server state should have a higher severity again: "critical" or call fatal_error() ...

enum class Severity
{
    critical = 0,
    error = 1,
    warning = 2,
    informational = 3,
    debug = 4
};

Although I get your point. Makes me think "error" perhaps shouldn't be in the list. Just have warning and critical to avoid the whole debate.

Revision history for this message
Alan Griffiths (alan-griffiths) wrote :

Reasonable people can differ on the distinction between errors and warnings, and different projects can have different views.

Sadly, grepping the Mir codebase for log_warning|log_error doesn't give conclusive evidence for either POV. I guess we need a more few votes.

We have to wait for a few more votes.

Revision history for this message
Gerry Boland (gerboland) wrote :

As an info-point, this is Qt's definition of the different grades of output:

    qDebug() is used for writing custom debug output.
    qInfo() is used for informational messages.
    qWarning() is used to report warnings and recoverable errors in your application.
    qCritical() is used for writing critical error messages and reporting system errors.
    qFatal() is used for writing fatal error messages shortly before exiting.

In practice, I seldom see qInfo being used. qCritical and qFatal are almost identical, qCritical we use as a means to identify that system state from now on may be inconsistent, and qFatal calls exit() right after printing.

Revision history for this message
Kevin DuBois (kdub) wrote :

@call for discussion on error vs warning.
I think error is alright if graphical corruption/black boxes appear. Its one of mir's guarantees that we display to screen. OTOH, i'd consider something configure_display throwing because there's no session more of a warning or info from the server's perspective. Where the log is right now in the template doesn't have enough information about the exception to figure out an appropriate logging level.

@new point of discussion:
If the client generated an error though, it would seem appropriate for it it appear in the client logs, as opposed to generating server log errors.

Revision history for this message
Kevin DuBois (kdub) wrote :

> As an info-point, this is Qt's definition of the different grades of output:
>
> qDebug() is used for writing custom debug output.
> qInfo() is used for informational messages.
> qWarning() is used to report warnings and recoverable errors in your
> application.
> qCritical() is used for writing critical error messages and reporting
> system errors.
> qFatal() is used for writing fatal error messages shortly before exiting.
>
> In practice, I seldom see qInfo being used. qCritical and qFatal are almost
> identical, qCritical we use as a means to identify that system state from now
> on may be inconsistent, and qFatal calls exit() right after printing.

Yeah, maybe there isn't enough distinction between error and warning to merit the two options in the codebase. I guess my review is abstain for now, but really only until there's a team-opinion on the matter.

review: Abstain
Revision history for this message
Chris Halse Rogers (raof) wrote :

I'm actually not sure the *server* should log anything in this circumstance; the client is getting a perfectly acceptable error return. Server logging for client errors runs the risk of a DOS attack via filesystem exhaustion.

If we log anything in the server I'd be leaning toward info/warning; this is not a server error condition. The server is behaving appropriately.

If the client silently eats the error, that's a client bug and should be fixed there.

(@Alan: mir::ClientVisibleError is there for problems for which have a specific client-side error [ie: MirError] defined for them; it's conceptually the same handling as below, just providing the client with a more structured report of what went wrong)

review: Abstain
Revision history for this message
Daniel van Vugt (vanvugt) wrote :

> I think error is alright if graphical corruption/black boxes appear.

That's indeed what we will soon see on VirtualBox (Mir 0.25). Instead of Mir crashing the screen will just be black. We should log a message indicating why the screen is black.

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

And to answer Chris -- I checked for that but it appeared clients are just hanging. There didn't seem to be any error coming back to clients for them to report, but I don't think I spent enough time debugging that to find out why or what was hanging.

Regardless, yes the screen will probably just be black on vbox starting in Mir 0.25 so that's a visual error we ideally need to log about. Hence this branch.

Revision history for this message
Chris Halse Rogers (raof) wrote :

Um, if the clients are hanging then we've clearly got a Mir client bug. The server is properly responding to the client request!

Revision history for this message
Daniel van Vugt (vanvugt) wrote :

Indeed we have over 400 bugs on the backlog. Add that to the list, probably. But I don't see it as a reason to not add this log message to the server.

Furthermore the debate over warning vs error log level is trivial and counter-productive. That's a minor issue which can be changed later if we have sufficient reason.

Revision history for this message
Chris Halse Rogers (raof) wrote :

OK, so given that I think this change is unnecessary and marginally harmful.

The problem which prompted this change was “clients don't render with no error”, and we've identified that this *is* a problem.

Working out, and fixing, why the client is silently ignoring errors will be a more useful change than having the server log perfectly well-handled exceptions.

review: Disapprove
Revision history for this message
Chris Halse Rogers (raof) wrote :

(weak disapprove; feel free to override if others disagree)

Revision history for this message
Andreas Pokorny (andreas-pokorny) wrote :

I think the location is too generic to decide the about the log level. We could come up with different cases that people would not consider errors. Since the location is generic it should be a debug/info message.

I think the concern Christopher raises could be addressed by having better logging facilities.

Unmerged revisions

3815. By Daniel van Vugt

Try again

3814. By Daniel van Vugt

Log internal errors to the server log instead of silently hiding them.
This problem was first observed on VirtualBox where buffer allocation
fails but no errors are logged anywhere and nothing appears on screen.
(related to LP: #1639745)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/include/server/mir/frontend/template_protobuf_message_processor.h'
2--- src/include/server/mir/frontend/template_protobuf_message_processor.h 2016-04-14 06:59:49 +0000
3+++ src/include/server/mir/frontend/template_protobuf_message_processor.h 2016-11-09 09:05:20 +0000
4@@ -23,7 +23,7 @@
5 #include "mir/frontend/message_processor.h"
6 #include "mir/cookie/authority.h"
7 #include "mir/client_visible_error.h"
8-
9+#include "mir/log.h"
10 #include "mir_protobuf.pb.h"
11
12 #include <google/protobuf/stubs/common.h>
13@@ -96,6 +96,7 @@
14 result_message.set_error("Error processing request: "s +
15 x.what() + "\nInternal error details: " + boost::diagnostic_information(x));
16 self->send_response(invocation.id(), &result_message);
17+ mir::log_error(result_message.error());
18 }
19 }
20
21
22=== modified file 'tests/CMakeLists.txt'
23--- tests/CMakeLists.txt 2016-11-07 16:46:53 +0000
24+++ tests/CMakeLists.txt 2016-11-09 09:05:20 +0000
25@@ -3,6 +3,7 @@
26 message(FATAL_ERROR "Umockdev not found, cannot build without disabling tests (via MIR_ENABLE_TESTS).")
27 endif()
28
29+add_definitions(-DMIR_LOG_COMPONENT_FALLBACK="Tests")
30 include_directories(${MIR_3RD_PARTY_INCLUDE_DIRECTORIES})
31 include_directories(${MIR_ANDROID_INCLUDE_DIRECTORIES})
32

Subscribers

People subscribed via source and target branches