[phone] crash files are only uploaded on boot when not running in the foreground

Bug #1340604 reported by Jean-Baptiste Lallement
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
whoopsie (Ubuntu)
Fix Released
Undecided
Unassigned

Bug Description

On the phone (mako utopic #125) it seems that crash files are only uploaded on boot and not when the crash occurs. whoopsie is running but doesn't seem to detect that a crash file is ready for upload.

= Test Case =
1. In a shell run:
    $ bash -c 'kill -SEGV $$'
2. Wait until the file _bin_bash.32011.uploaded is created

-> Check that after a moment whoopsie-upload-all is running
-> Check the progress of the upload in /var/log/upstart/apport-noui.log
-> Upon upload, check that /var/log/syslog contains a OOPS entry from whoopsie

= Actual Result =
whoopsie-upload-all times out and the crash file is not uploaded. If the device is rebooted, the crash is uploaded immediately.

ProblemType: Bug
DistroRelease: Ubuntu 14.10
Package: whoopsie 0.2.34
Uname: Linux 3.4.0-5-mako armv7l
ApportVersion: 2.14.4-0ubuntu2
Architecture: armhf
CrashReports:
 640:32011:114:63916:2014-07-11 10:44:24.101108181 +0200:2014-07-11 10:44:07.613514715 +0200:/var/crash/_bin_bash.32011.crash
 644:32011:114:0:2014-07-11 10:44:24.101108181 +0200:2014-07-11 10:44:24.101108181 +0200:/var/crash/_bin_bash.32011.upload
Date: Fri Jul 11 10:45:02 2014
InstallationDate: Installed on 2014-07-11 (0 days ago)
InstallationMedia: Ubuntu Utopic Unicorn (development branch) - armhf (20140711-020204)
RelatedPackageVersions: apport-noui 2.14.4-0ubuntu2
SourcePackage: whoopsie
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
Jean-Baptiste Lallement (jibel) wrote :
summary: - [phone] crash files are only uploaded on reboot
+ [phone] crash files are only uploaded on boot
Revision history for this message
Paul Larson (pwlars) wrote : Re: [phone] crash files are only uploaded on boot

Looking at syslog on my phone after a fresh install, I'm seeing this:
Jul 16 18:14:03 ubuntu-phablet whoopsie[2080]: whoopsie 0.2.34 starting up.
Jul 16 18:14:03 ubuntu-phablet whoopsie[2080]: Using lock path: /var/lock/whoopsie/lock
Jul 16 18:14:28 ubuntu-phablet whoopsie[2154]: Could not connect to the system bus: Timeout was reached
Jul 16 18:14:29 ubuntu-phablet whoopsie[2154]: offline

Revision history for this message
Paul Larson (pwlars) wrote :

I think one thing going on here is that there's some too-aggressive timeout that is causing whoopsie to think it's offline before the network has a chance to fully come up. If I modify the upstart job to make whoopsie start later, or even just restart whoopsie after boot, it seems to continue working for me for the remainder of the boot.

Revision history for this message
Joe Talbott (joetalbott) wrote :

From my reading of the source this error, "Could not connect to the system bus: Timeout was reached", is related to checking if network-manager is up.

Revision history for this message
Joe Talbott (joetalbott) wrote :

So after further digging what I think is happening is this: when whoopsie calls monitor_connectivity() and we timeout getting the system bus monitor_connectivity() returns FALSE and that's it. I think adding a retry around the call to monitor_connectivity() might help mitigate the problem until we can determine why it's timing out getting the system bus.

Revision history for this message
Paul Larson (pwlars) wrote :

Yeah, I seem to be getting stuck at
system_bus = g_bus_get_sync (G_BUS_TYPE_SYSTEM, NULL, &err);
    if (err) {
        log_msg ("Could not connect to the system bus: %s\n", err->message);
        g_error_free (err);
        return FALSE;
    }
I was thinking about retrying it just that part, but it may be safer to retry the whole thing. I'd like to get feedback from Evan or Brian since they are more familiar with this code about whether there's a better way to go about this.

Revision history for this message
Evan (ev) wrote :

Stellar work, guys. I do find it odd that this is causing whoopsie to sit around doing nothing, given that online_state defaults to true, but let's assume for a minute that dbus is indeed the problem.

The upstart job doesn't have any non-runlevel service dependencies, so it could conceivably come up before the system bus is started. Coming up before Network Manager should be okay, thanks to dbus service activation.

If we cannot talk to Network Manager, we assume Internet connectivity. This covers the case where NetworkManager or DBus have caught fire. Besides, if we cannot submit a crash file, we'll find out pretty quickly.

However, this code was written before whoopsie was running on phones and I now worry we could end up in a situation where we're trying to send lots of large crash files on a 3G or paid connection.

As much as it pains me to add runtime dependencies to whoopsie, perhaps we should solve this by putting 'start on started dbus' and 'stop on stopping dbus' in the upstart job. What do you guys think? Brian? Does this fix the bug for you, Paul and Joe?

Revision history for this message
Paul Larson (pwlars) wrote :

Changing the upstart job to have it start/stop with dbus does not work for me. It still thinks it's not online and I get messages like this in syslog:
Jul 22 16:18:06 ubuntu-phablet whoopsie[850]: Not online; processing later (/var/crash/_usr_lib_arm-linux-gnueabihf_url-dispatcher_update-directory.32011.crash)

If I instead have it start/stop with network-manager, I don't get the messages about it being offline. However it does seem like it gets stuck, or it's waiting on some lock in whoopsie still. I see the .upload file come up, but whoopsie doesn't seem to be doing anything.

Revision history for this message
Paul Larson (pwlars) wrote :

Also, if I start whoopsie with --assume-online, it seems to work more reliably. I think Brian noted this before as well.

Revision history for this message
Brian Murray (brian-murray) wrote :

In the past there were issues with how whoopsie logged (LP: #1245524), when I run whoopsie in the foreground on a Nexus 4 that is not connected to a wireless network, but is connected to a cell network I saw the following in my terminal.

sudo CRASH_DB_URL=https://daisy.ubuntu.com APPORT_REPORT_DIR=/var/crash/ whoopsie -f

Using lock path: /var/crash/whoopsie_lock
Network connection found that may be a paid data plan.offline

However, when I did not run whoopsie in the foreground and ran it as a service I saw the following in syslog.

Jul 22 12:38:31 ubuntu-phablet whoopsie[2434]: whoopsie 0.2.34 starting up.
Jul 22 12:38:31 ubuntu-phablet whoopsie[2434]: Using lock path: /var/lock/whoopsie/lock

This leads me to believe it is a problem with the some part of this:

    if (!foreground) {
        close_log ();
        daemonize ();
        open_log();
    }

However, it would be good if someone could confirm my experience. I don't think you'd need to be connected to a cell network, rather just not connected to wifi to see an offline message in the log file.

Revision history for this message
Brian Murray (brian-murray) wrote :

Okay, I've tested my theory some more by doing the following. I set APPORT_REPORT_DIR in /etc/init/whoopsie.conf to /crashes a directory I created with permissions 770 and then started whoopsie. I saw the following in my log file.

Jul 22 15:31:05 ubuntu-phablet whoopsie[6728]: whoopsie 0.2.34 starting up.
Jul 22 15:31:05 ubuntu-phablet whoopsie[6728]: Using lock path: /crashes/whoopsie_lock
Jul 22 15:31:05 ubuntu-phablet whoopsie[6728]: Could not create lock file: Bad file descriptor
Jul 22 15:31:05 ubuntu-phablet whoopsie[6730]: Could not read directory (13): /crashes

The "Could not read directory" message comes from the monitor_directory code in monitor.c, so this is unrelated to how whoopsie logs or becomes a daemon.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package whoopsie - 0.2.35

---------------
whoopsie (0.2.35) utopic; urgency=medium

  [ Brian Murray ]
  * Add a space before the autoreport tag in bug reports.

  [ Martin Pitt ]
  * data/whoopsie.conf: Start whoopsie in the foreground and drop the "expect
    fork". This provides proper logging to /var/log/upstart/, simplifies
    things for upstart, and also curiously seems to cause crash reports to now
    actually get picked up and uploaded. (LP: #1340604)
  * debian/control: Fix Vcs-* tags to make debcheckout actually work.
  * Bump Standards-Version to 3.9.5 (no changes necessary).
 -- Martin Pitt <email address hidden> Thu, 24 Jul 2014 14:36:06 +0200

Changed in whoopsie (Ubuntu):
status: New → Fix Released
Revision history for this message
Martin Pitt (pitti) wrote :

This probably shouldn't be closed without further confirmation/investigation, and perhaps we still want to fix operation in background mode.

Changed in whoopsie (Ubuntu):
status: Fix Released → New
summary: - [phone] crash files are only uploaded on boot
+ [phone] crash files are only uploaded on boot when not running in the
+ foreground
Revision history for this message
Paul Larson (pwlars) wrote :

I'm running the new version locally right now, and it seems to be working. I did have a .crash file that existed before I did the upgrade that didn't get processed even after rebooting after the upgrade. However, when I made something crash, both it and the new .crash file were processed right away.

Revision history for this message
Brian Murray (brian-murray) wrote : Re: [Bug 1340604] Re: [phone] crash files are only uploaded on boot when not running in the foreground

On Thu, Jul 24, 2014 at 02:58:34PM -0000, Paul Larson wrote:
> I'm running the new version locally right now, and it seems to be
> working. I did have a .crash file that existed before I did the upgrade
> that didn't get processed even after rebooting after the upgrade.

Did the .upload file exist for that .crash file? If it did and
whoopsie-upload-all had already timed out waiting for whoopsie to upload
the crash file then that scenario makes sense to me. You might look at
the apport-noui log file in /var/log/upstart to see if that particular
crash file was previously detected.

--
Brian Murray

Revision history for this message
Brian Murray (brian-murray) wrote :

With the move to systemd I don't think we'll want to fix operation in background mode for upstart, so I am setting this back to Fix Released.

Changed in whoopsie (Ubuntu):
status: New → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.