Apport

Merge lp:~ev/apport/recoverable-errors into lp:~apport-hackers/apport/trunk

Proposed by Evan on 2012-06-25

Status:	Merged
Merged at revision:	2440
Proposed branch:	lp:~ev/apport/recoverable-errors
Merge into:	lp:~apport-hackers/apport/trunk
Diff against target:	323 lines (+243/-6), 6 files modified
	bin/apport-recoverable-problem (+46/-0)
	gtk/apport-gtk (+20/-3)
	kde/apport-kde (+17/-3)
	test/test_recoverable_problem.py (+83/-0)
	test/test_ui_gtk.py (+39/-0)
	test/test_ui_kde.py (+38/-0)
To merge this branch:	bzr merge lp:~ev/apport/recoverable-errors

Reviewer	Review Type	Date Requested	Status
Martin Pitt (community)		2012-06-25	Approve on 2012-07-13
Review via email: mp+111840@code.launchpad.net

Description of the change

This branch adds a DBus service for reporting 'recoverable errors.' These are
problems which the application can handle, but still wishes to notify the user
and http://errors.ubuntu.com about.

As an example, the application may wish to notify the user because handling the
error resulted in degraded functionality. The user interface may fail to load
items, or the action just performed may not return any data. The developer may
wish for these types of errors to appear on http://errors.ubuntu.com so that
they may correct the root cause of the report.

I've implemented the DBus service and tests for both it and the UI frontends.

Thanks!

Martin Pitt (pitti) wrote on 2012-06-26:

I like the general idea, thanks for this Evan!

I would like to discuss/reconsider whether this really should be a D-BUS service; it would be the only piece of Apport functionality that is. All similar cases (package failure, gcc internal compiler error, uncaught Java exception, uncaught Python exception, GPU hang, etc.) have a CLI API in /usr/share/apport/ instead. E.g. gcc calls /usr/share/apport/gcc_ice_hook and submits the data on stdin and as CLI arguments. This approach reduces the assumptions about how much of the system is working; adding dbus, dbus-python, and a working session bus are no small additions here. Also, by doing this on the session bus you exclude a lot of programs from this functionality: system services, user programs running from cron, or remote users.

Some nitpicks about the code (referencing line numbers in the "Preview diff" below):

5: please use /usr/bin/apport in trunk for now, as we keep the code working for Python 2 still (backports, retracer environment only has Python 2, etc.)

26, 37, others: Can we rename it to "RecoverableProblem"? It might not be an actual crash (in fact, the way you describe it, it is particularly _not_ a crash)

43-46: "Traceback" is Python specific, but InterpreterPath applies to all non-ELF programs (also Shell, Perl, Ruby, Java, JS, etc.); I would not assume having a stack trace here; if the caller actually has one, it can still supply it through the arbitrary key/value pairs (additional_keys), and strongly recommend (in the documentation for that functionality) that the caller adds a "DuplicateSignature" field.

26, 48: additional_keys is misleading, as it's a dictionary (but this might go away with turning this into CLI)

test_apport_service.py: Unless you convert this to CLI, this needs to launch a local D-BUS and run on this (Gio.TestDBus is very nice for this, but only works in Quantal; dbus-launch works everywhere). During package builds and in autopkgtest we won't have a running session (nor system) bus.

Thanks!

review: Needs Fixing

Evan (ev) wrote on 2012-07-05:

On Tue, Jun 26, 2012 at 9:32 AM, Martin Pitt <email address hidden> wrote:
> I like the general idea, thanks for this Evan!

Thanks!

> I would like to discuss/reconsider whether this really should be a D-BUS service; it would be the only piece of Apport functionality that is. All similar cases (package failure, gcc internal compiler error, uncaught Java exception, uncaught Python exception, GPU hang, etc.) have a CLI API in /usr/share/apport/ instead. E.g. gcc calls /usr/share/apport/gcc_ice_hook and submits the data on stdin and as CLI arguments. This approach reduces the assumptions about how much of the system is working; adding dbus, dbus-python, and a working session bus are no small additions here. Also, by doing this on the session bus you exclude a lot of programs from this functionality: system services, user programs running from cron, or remote users.

I do agree that is ideal. I believe my original motivation for not
going down that road was that it's hard to provide structure through a
stdin pipe, and command line arguments have a maximum length.

I suppose we end up with something fairly loose no matter what, given
that the DBus service took a small number of fixed arguments and
pushed the rest through a dictionary. Would you then be happy with
something akin to the JVM hook, where it takes key-value pairs
separated by null bytes to form the report, getting the PID from
getppid?

Sanity checks could then be done on the other end of the pipe to
ensure that there is enough data present to generate a signature. This
might be via the reporter pushing a backtrace
(http://www.gnu.org/software/libc/manual/html_node/Backtraces.html),
traceback, or DuplicateSignature through the pipe as a key value pair.
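
To make the proposal concrete, here is a minimal Python sketch of the format discussed above: key-value pairs joined by NUL bytes, with a sanity check on the receiving end of the pipe. It is an illustration only, not the code from this branch. The helper names (encode_fields, decode_fields, has_signature_source) and the "Backtrace" key are hypothetical; only "Traceback" and "DuplicateSignature" are named in this thread.

```python
# Assumed key names that could feed a duplicate signature.
# 'Backtrace' is a hypothetical key chosen for this sketch.
SIGNATURE_KEYS = ('DuplicateSignature', 'Traceback', 'Backtrace')


def encode_fields(fields):
    """Join key/value pairs into one NUL-separated string (sender side)."""
    parts = []
    for key, value in fields.items():
        # A NUL inside a key or value would corrupt the framing.
        if '\0' in key or '\0' in value:
            raise ValueError('NUL byte in field %r' % key)
        parts.extend((key, value))
    return '\0'.join(parts)


def decode_fields(payload):
    """Split a NUL-separated string back into a dict (receiver side)."""
    items = payload.split('\0')
    # Every key needs a value, so the item count must be even.
    if len(items) % 2 != 0:
        raise ValueError('expected an even number of NUL-separated fields')
    return dict(zip(items[0::2], items[1::2]))


def has_signature_source(fields):
    """Return True if at least one signature-bearing key is non-empty."""
    return any(fields.get(key) for key in SIGNATURE_KEYS)
```

A receiver could reject a report for which has_signature_source() returns False, since nothing in it would allow duplicates to be grouped.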

Martin Pitt (pitti) wrote on 2012-07-06:

Evan Dandrea [2012-07-05 14:43 -0000]:
> I suppose we end up with something fairly loose no matter what, given
> that the DBus service took a small number of fixed arguments and
> pushed the rest through a dictionary. Would you then be happy with
> something akin to the JVM hook, where it takes key-value pairs
> separated by null bytes to form the report, getting the PID from
> getppid?

Poor man's marshaller :-) Yes, I think that would be suitable as long
as we only need textual data. Binary data will most likely have null
bytes in them. But this does not seem to be a significant limitation
to me. If we need something more elaborate, we could also use an
existing marshaller such as pickle (but that would be difficult to
build from C clients) or GVariant.

Martin

--
Martin Pitt | http://www.piware.de
Ubuntu Developer (www.ubuntu.com) | Debian Developer (www.debian.org)
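
The limitation with binary data can be shown in a few lines of Python. The sketch below demonstrates how a value containing a NUL byte breaks the framing. It then shows one possible workaround, base64-encoding the value on the sender side. Base64 is an assumption made for this illustration; it is not one of the marshallers suggested in this thread (pickle, GVariant). The split_pairs helper and the "Screenshot" key are hypothetical.

```python
import base64


def split_pairs(payload):
    """Split NUL-separated bytes into a dict, rejecting odd field counts."""
    items = payload.split(b'\0')
    if len(items) % 2 != 0:
        raise ValueError('expected an even number of NUL-separated fields')
    return dict(zip(items[0::2], items[1::2]))


# A binary value with an embedded NUL byte (hypothetical example data).
raw = b'\x89PNG\0\x01'

# Naive framing: the embedded NUL is indistinguishable from a separator,
# so the receiver sees three fields instead of two.
naive = b'\0'.join([b'Screenshot', raw])

# Text-safe framing: the base64 alphabet never contains a NUL byte.
safe = b'\0'.join([b'Screenshot', base64.b64encode(raw)])
```

With these definitions, split_pairs(naive) raises ValueError, while base64.b64decode(split_pairs(safe)[b'Screenshot']) recovers the original bytes.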

lp:~ev/apport/recoverable-errors updated on 2012-07-12

2405. By Evan on 2012-07-12: Instead of creating a DBus service to receive recoverable error reports, feed them into a binary with nul-separated key value pairs.
2406. By Evan on 2012-07-12: Make sure we're looking at the right report.
2407. By Evan on 2012-07-12: Refactor test for recoverable problems. Test for incomplete data.
2408. By Evan on 2012-07-12: One more.
2409. By Evan on 2012-07-12: Drop explicit python3 shebang.
2410. By Evan on 2012-07-12: Drop the DBus service.
2411. By Evan on 2012-07-12: s/RecoverableCrash/RecoverableProblem/g
2412. By Evan on 2012-07-12: Merge with trunk.
2413. By Evan on 2012-07-12: pep8 fixes.
2414. By Evan on 2012-07-12: Handle slight path differences in running tests.
2415. By Evan on 2012-07-12: Copyright header.

Evan (ev) wrote on 2012-07-12:

Okay, I've turned this into a separate binary with nul-separated key-value pairs sent over stdin. It's pretty simple, so perhaps I've missed something. :)

Martin Pitt (pitti) wrote on 2012-07-13:

This is indeed a lot simpler and more robust.

I made the following changes:

- Move from bin to data/ and rename to recoverable_problem, to be consistent with the other scripts of that kind.
- Add an error message for an odd number of fields in stdin
- Drop the check for /proc/sys/kernel/core_pattern from test/test_recoverable_problem.py; it is unrelated to this test
- In test_recoverable_problem.py, simplify call_recoverable_problem() by dropping the os.pipe() stuff and just feeding "data" as argument of communicate()
- Add docstrings to the tests

Merged into trunk now. Thanks a lot!

review: Approve
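
The communicate() simplification mentioned in the list above can be sketched roughly as follows. This is an illustration, not the merged test code. call_with_stdin is a hypothetical name, and STAND_IN_CMD is a hypothetical stand-in child process that only mimics the odd-field-count check (exit status 1 on an odd number of NUL-separated fields), so that the sketch runs without Apport installed.

```python
import subprocess
import sys

# Hypothetical stand-in for the real hook: exits with status 1 when stdin
# holds an odd number of NUL-separated fields, and 0 otherwise.
# bytes(1) is a single NUL byte; using it avoids escaping inside the string.
STAND_IN_CMD = [
    sys.executable, '-c',
    'import sys; n = len(sys.stdin.buffer.read().split(bytes(1))); '
    'sys.exit(n % 2)',
]


def call_with_stdin(cmd, data):
    """Run cmd, feed data (bytes) to its stdin, and raise on failure."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
    # communicate() writes the data, closes stdin, and waits for exit,
    # which replaces the manual os.pipe()/os.fdopen() handling.
    proc.communicate(data)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd[0])
```

Calling call_with_stdin(STAND_IN_CMD, b'hello\0there') returns normally, while call_with_stdin(STAND_IN_CMD, b'hello') raises subprocess.CalledProcessError, which is the behaviour test_incomplete_data relies on.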

Preview Diff

 === added file 'bin/apport-recoverable-problem'
 --- bin/apport-recoverable-problem	1970-01-01 00:00:00 +0000
 +++ bin/apport-recoverable-problem	2012-07-12 16:28:35 +0000
@@ -0,0 +1,46 @@
++#!/usr/bin/python
++
++'''Report an error that can be recovered from.
++
++This application should be called with its standard input pipe fed a
++nul-separated list of key-value pairs.
++'''
++
++# Copyright (C) 2012 Canonical Ltd.
++# Author: Evan Dandrea <ev@ubuntu.com>
++#
++# This program is free software; you can redistribute it and/or modify it
++# under the terms of the GNU General Public License as published by the
++# Free Software Foundation; either version 2 of the License, or (at your
++# option) any later version.  See http://www.gnu.org/copyleft/gpl.html for
++# the full text of the license.
++
++import apport.report
++import sys
++import os
++
++
++def main():
++    report = apport.report.Report('RecoverableProblem')
++    items = sys.stdin.read().split('\0')
++    if len(items) % 2 != 0:
++        sys.exit(1)
++
++    while items:
++        key = items.pop(0)
++        if not items:
++            break
++        value = items.pop(0)
++        report[key] = value
++
++    report.pid = os.getppid()
++    if not report.pid:
++        sys.exit(1)
++    report.add_os_info()
++    report.add_proc_info(report.pid)
++    report.add_user_info()
++    with open(apport.fileutils.make_report_path(report), 'wb') as fp:
++        report.write(fp)
++
++if __name__ == '__main__':
++    main()
 === modified file 'gtk/apport-gtk'
 --- gtk/apport-gtk	2012-07-03 16:16:29 +0000
 +++ gtk/apport-gtk	2012-07-12 16:28:35 +0000
@@ -264,8 +264,12 @@
                  icon = self.desktop_info.get('icon')
                  n = self.desktop_info['name']
                  n = GLib.markup_escape_text(n)
--                self.w('title_label').set_label('<big><b>%s</b></big>' %
--                    _('The application %s has closed unexpectedly.') % n)
++                if report_type == 'RecoverableProblem':
++                    t = _('The application %s has experienced '
++                          'an internal error.') % n
++                else:
++                    t = _('The application %s has closed unexpectedly.') % n
++                self.w('title_label').set_label('<big><b>%s</b></big>' % t)
                  self.w('subtitle_label').hide()
                  if 'ProcCmdline' in self.report:
@@ -277,7 +281,11 @@
                      self.w('continue_button').set_label(_('Continue'))
              else:
                  icon = 'distributor-logo'
--                title_text = self.get_system_application_title()
++                if report_type == 'RecoverableProblem':
++                    title_text = _('The application %s has experienced '
++                                   'an internal error.') % self.cur_package
++                else:
++                    title_text = self.get_system_application_title()
                  self.w('title_label').set_label('<big><b>%s</b></big>' %
                                                  title_text)
                  self.w('subtitle_label').show()
@@ -292,6 +300,15 @@
              else:
                  self.w('ignore_future_problems').hide()
++            if report_type == 'RecoverableProblem':
++                body = self.report.get('DialogBody', '')
++                if body:
++                    del self.report['DialogBody']
++                    self.w('subtitle_label').show()
++                    # Set a maximum size for the dialog body, so developers do
++                    # not try to shove entire log files into this dialog.
++                    self.w('subtitle_label').set_label(body[:1024])
++
          if icon:
              from gi.repository import GdkPixbuf
              builtin = Gtk.IconLookupFlags.USE_BUILTIN
 === modified file 'kde/apport-kde'
 --- kde/apport-kde	2012-06-06 15:45:04 +0000
 +++ kde/apport-kde	2012-07-12 16:28:35 +0000
@@ -193,9 +193,14 @@
              # Regular crash.
              if desktop_info:
                  icon = desktop_info.get('icon')
--                self.heading.setText(_('The application %s has closed '
--                                       'unexpectedly.') %
--                                       desktop_info['name'])
++                if report_type == 'RecoverableProblem':
++                    self.heading.setText(_('The application %s has experienced '
++                                           'an internal error.') %
++                                           desktop_info['name'])
++                else:
++                    self.heading.setText(_('The application %s has closed '
++                                           'unexpectedly.') %
++                                           desktop_info['name'])
                  self.text.hide()
                  if 'ProcCmdline' in report:
                      self.closed_button.show()
@@ -219,6 +224,15 @@
              else:
                  self.ignore_future_problems.hide()
++            if report_type == 'RecoverableProblem':
++                body = report.get('DialogBody', '')
++                if body:
++                    del report['DialogBody']
++                    # Set a maximum size for the dialog body, so developers do
++                    # not try to shove entire log files into this dialog.
++                    self.text.setText(body[:1024])
++                    self.text.show()
++
          if icon:
              base = QIcon.fromTheme(icon).pixmap(42, 42)
              overlay = QIcon.fromTheme('dialog-error').pixmap(16, 16)
 === added file 'test/test_recoverable_problem.py'
 --- test/test_recoverable_problem.py	1970-01-01 00:00:00 +0000
 +++ test/test_recoverable_problem.py	2012-07-12 16:28:35 +0000
@@ -0,0 +1,83 @@
++'''Test apport-recoverable-error'''
++
++# Copyright (C) 2012 Canonical Ltd.
++# Author: Evan Dandrea <ev@ubuntu.com>
++#
++# This program is free software; you can redistribute it and/or modify it
++# under the terms of the GNU General Public License as published by the
++# Free Software Foundation; either version 2 of the License, or (at your
++# option) any later version.  See http://www.gnu.org/copyleft/gpl.html for
++# the full text of the license.
++
++import unittest
++import sys
++import os
++import subprocess
++import tempfile
++import time
++import shutil
++import apport.report
++
++
++class T(unittest.TestCase):
++    def setUp(self):
++        self.report_dir = tempfile.mkdtemp()
++        self.addCleanup(shutil.rmtree, self.report_dir)
++        os.environ['APPORT_REPORT_DIR'] = self.report_dir
++
++    def wait_for_report(self):
++        cwd = os.getcwd().replace('/', '_')
++        base = sys.argv[0]
++        if base.startswith('./'):
++            base = base[2:]
++        base = base.replace('/', '_')
++        path = '%s_%s.%d.crash' % (cwd, base, os.getuid())
++        path = os.path.join(self.report_dir, path)
++        seconds = 0
++        while not os.path.exists(path):
++            time.sleep(1)
++            seconds += 1
++            self.assertTrue(seconds < 10, 'timeout while waiting for %s to be created.' % path)
++        return path
++
++    def call_recoverable_problem(self, data):
++        (r, w) = os.pipe()
++        cmd = ['apport-recoverable-problem']
++        with os.fdopen(r, 'r') as r:
++            with os.fdopen(w, 'w') as w:
++                proc = subprocess.Popen(cmd, stdin=r, close_fds=True)
++                w.write(data)
++                w.flush()
++        proc.communicate()
++        if proc.returncode != 0:
++            raise subprocess.CalledProcessError(proc.returncode, cmd[0])
++
++    def test_recoverable_problem(self):
++        self.call_recoverable_problem('hello\0there')
++        path = self.wait_for_report()
++        with open(path, 'rb') as report_path:
++            report = apport.report.Report()
++            report.load(report_path)
++            self.assertEqual(report['hello'], 'there')
++            self.assertTrue('Pid:\t%d' % os.getpid() in report['ProcStatus'])
++
++    def test_incomplete_data(self):
++        self.assertRaises(subprocess.CalledProcessError,
++                          self.call_recoverable_problem, 'hello')
++
++        self.assertRaises(subprocess.CalledProcessError,
++                          self.call_recoverable_problem,
++                          'hello\0there\0extraneous')
++
++        self.assertRaises(subprocess.CalledProcessError,
++                          self.call_recoverable_problem,
++                          'hello\0\0there')
++
++
++with open('/proc/sys/kernel/core_pattern') as f:
++    core_pattern = f.read().strip()
++    if core_pattern[0] != '|':
++        sys.stderr.write('kernel crash dump helper is not active; please enable before running this test.\n')
++        sys.exit(0)
++
++unittest.main()
 === modified file 'test/test_ui_gtk.py'
 --- test/test_ui_gtk.py	2012-06-28 13:10:30 +0000
 +++ test/test_ui_gtk.py	2012-07-12 16:28:35 +0000
@@ -403,6 +403,45 @@
          self.assertTrue(self.app.w('details_scrolledwindow').get_property('visible'))
          self.assertTrue(self.app.w('dialog_crash_new').get_resizable())
++    def test_recoverable_crash_layout(self):
++        '''
++        +-----------------------------------------------------------------+
++        | [ logo ] The application Foo has experienced an internal error. |
++        |          Developer-specified error text.                        |
++        |                                                                 |
++        |            [x] Send an error report to help fix this problem.   |
++        |                                                                 |
++        | [ Show Details ]                                   [ Continue ] |
++        +-----------------------------------------------------------------+
++        '''
++        self.app.report['ProblemType'] = 'RecoverableProblem'
++        self.app.report['Package'] = 'apport 1.2.3~0ubuntu1'
++        self.app.report['DialogBody'] = 'Some developer-specified error text.'
++        with tempfile.NamedTemporaryFile() as fp:
++            fp.write(b'''[Desktop Entry]
++Version=1.0
++Name=Apport
++Type=Application''')
++            fp.flush()
++            self.app.report['DesktopFile'] = fp.name
++            GLib.idle_add(Gtk.main_quit)
++            self.app.ui_present_report_details(True)
++        self.assertEqual(self.app.w('dialog_crash_new').get_title(),
++                         self.distro)
++        msg = 'The application Apport has experienced an internal error.'
++        self.assertEqual(self.app.w('title_label').get_text(), msg)
++        msg = 'Some developer-specified error text.'
++        self.assertEqual(self.app.w('subtitle_label').get_text(), msg)
++        self.assertTrue(self.app.w('subtitle_label').get_property('visible'))
++        send_error_report = self.app.w('send_error_report')
++        self.assertTrue(send_error_report.get_property('visible'))
++        self.assertTrue(send_error_report.get_active())
++        self.assertTrue(self.app.w('show_details').get_property('visible'))
++        self.assertTrue(self.app.w('continue_button').get_property('visible'))
++        self.assertEqual(self.app.w('continue_button').get_label(),
++                         _('Continue'))
++        self.assertFalse(self.app.w('closed_button').get_property('visible'))
++
      def test_administrator_disabled_reporting(self):
          GLib.idle_add(Gtk.main_quit)
          self.app.ui_present_report_details(False)
 === modified file 'test/test_ui_kde.py'
 --- test/test_ui_kde.py	2012-07-09 05:52:40 +0000
 +++ test/test_ui_kde.py	2012-07-12 16:28:35 +0000
@@ -281,6 +281,44 @@
          self.assertTrue(self.app.dialog.cancel_button.isVisible())
          self.assertTrue(self.app.dialog.treeview.isVisible())
++    def test_recoverable_crash_layout(self):
++        '''
++        +-----------------------------------------------------------------+
++        | [ logo ] The application Foo has experienced an internal error. |
++        |          Developer-specified error text.                        |
++        |                                                                 |
++        |            [x] Send an error report to help fix this problem.   |
++        |                                                                 |
++        | [ Show Details ]                                   [ Continue ] |
++        +-----------------------------------------------------------------+
++        '''
++        self.app.report['ProblemType'] = 'RecoverableProblem'
++        self.app.report['Package'] = 'apport 1.2.3~0ubuntu1'
++        self.app.report['DialogBody'] = 'Some developer-specified error text.'
++
++        with tempfile.NamedTemporaryFile() as fp:
++            fp.write(b'''[Desktop Entry]
++Version=1.0
++Name=Apport
++Type=Application''')
++            fp.flush()
++            self.app.report['DesktopFile'] = fp.name
++            QTimer.singleShot(0, QCoreApplication.quit)
++            self.app.ui_present_report_details(True)
++        self.assertEqual(self.app.dialog.windowTitle(),
++                         self.distro.split()[0])
++        msg = 'The application Apport has experienced an internal error.'
++        self.assertEqual(self.app.dialog.heading.text(), msg)
++        msg = 'Some developer-specified error text.'
++        self.assertEqual(self.app.dialog.text.text(), msg)
++        self.assertTrue(self.app.dialog.text.isVisible())
++        self.assertTrue(self.app.dialog.send_error_report.isVisible())
++        self.assertTrue(self.app.dialog.send_error_report.isChecked())
++        self.assertTrue(self.app.dialog.details.isVisible())
++        self.assertTrue(self.app.dialog.continue_button.isVisible())
++        self.assertEqual(self.app.dialog.continue_button.text(), _('Continue'))
++        self.assertFalse(self.app.dialog.closed_button.isVisible())
++
      @patch.object(MainUserInterface, 'open_url')
      def test_1_crash_nodetails(self, *args):
          '''Crash report without showing details'''
