dbus daemon spinning with NM AP properties change event

Bug #1394204 reported by Pat McGowan
30
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Canonical System Image
Fix Released
Critical
Unassigned
location-service (Ubuntu RTM)
Fix Released
Undecided
Unassigned
network-manager (Ubuntu)
Invalid
Critical
Unassigned
network-manager (Ubuntu RTM)
Invalid
Critical
Unassigned

Bug Description

I am consistently seeing this in the Bluefin office this week
Mako running 133

Phone gets warm in pocket, wake up the phone and attach
top shows dbus-daemon at 100%+ cpu and network-manager next at 10-20%
It never quiets own, at least not for over 5 mins

3g data is turned off
Cellular is roaming to local carrier fwiw
Seems to happen whenever I leave the AP coverage, whether I return or not

Reproducing the issue: The problem is easily reproducible when forcing NM to roam between multiple access points and never quite finishing the association to one AP. At home, I reproduce by
roaming between different floors, where each floor has got its own WiFi. Any sufficiently complex office WiFi setup should trigger it, too.

Related branches

Revision history for this message
Pat McGowan (pat-mcgowan) wrote :
Revision history for this message
Pat McGowan (pat-mcgowan) wrote :
Revision history for this message
Pat McGowan (pat-mcgowan) wrote :
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :
Download full text (3.3 KiB)

I see lots of these messages:

Nov 19 08:03:43 ubuntu-phablet NetworkManager[1448]: <warn> Activation (wlan0/wireless): association took too long, failing activation.
Nov 19 08:03:43 ubuntu-phablet NetworkManager[1448]: <info> (wlan0): device state change: config -> failed (reason 'SSID not found') [50 120 53]
Nov 19 08:03:43 ubuntu-phablet NetworkManager[1448]: <info> NetworkManager state is now DISCONNECTED
Nov 19 08:03:43 ubuntu-phablet NetworkManager[1448]: <warn> Activation (wlan0) failed for connection 'Canonical-5GHz-an'
Nov 19 08:03:43 ubuntu-phablet NetworkManager[1448]: <info> (wlan0): device state change: failed -> disconnected (reason 'none') [120 30 0]
Nov 19 08:03:43 ubuntu-phablet NetworkManager[1448]: <info> (wlan0): deactivating device (reason 'none') [0]
Nov 19 08:03:43 ubuntu-phablet kernel: [51481.120852] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2908: 11d AP Bssid 84:18:3a:44:da:ec chan= 36, rssi = -50, countryCode US
Nov 19 08:03:43 ubuntu-phablet kernel: [51481.121920] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2908: 11d AP Bssid 84:18:3a:44:da:e8 chan= 8, rssi = -40, countryCode US
Nov 19 08:03:43 ubuntu-phablet kernel: [51481.122835] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2928: Information about current country Bssid 84:18:3a:04:da:e8 chan= 8, rssi = -40, countryCode US
Nov 19 08:03:43 ubuntu-phablet wpa_supplicant[1828]: wlan0: Failed to initiate sched scan
Nov 19 08:03:43 ubuntu-phablet wpa_supplicant[1828]: wlan0: CTRL-EVENT-SCAN-STARTED
Nov 19 08:03:44 ubuntu-phablet kernel: [51482.020195] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2908: 11d AP Bssid 84:18:3a:08:65:b8 chan= 4, rssi = -84, countryCode US
Nov 19 08:03:44 ubuntu-phablet kernel: [51482.021142] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2908: 11d AP Bssid 84:18:3a:44:da:ec chan= 36, rssi = -50, countryCode US
Nov 19 08:03:44 ubuntu-phablet kernel: [51482.021935] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2908: 11d AP Bssid 84:18:3a:44:da:e8 chan= 8, rssi = -38, countryCode US
Nov 19 08:03:44 ubuntu-phablet kernel: [51482.022912] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2908: 11d AP Bssid 84:18:3a:04:da:e8 chan= 8, rssi = -37, countryCode US
Nov 19 08:03:44 ubuntu-phablet kernel: [51482.023614] wlan: [1317:E :SME] csrMoveTempScanResultsToMainList: 2928: Information about current country Bssid 84:18:3a:04:da:e8 chan= 8, rssi = -37, countryCode US
Nov 19 08:03:45 ubuntu-phablet NetworkManager[1448]: <info> (wlan0): supplicant interface state: scanning -> disconnected
Nov 19 08:03:45 ubuntu-phablet NetworkManager[1448]: <warn> Couldn't disconnect supplicant interface: This interface is not connected

It tells me you're likely just outside of the range of the accesspoints, and since it comes in and out of scan results regularly, NM will keep trying to authenticate while the AP is available.

A couple of things to try:
 - Switch to 2.4GHz, see if the behavior is the same, this could give us some interesting data points.
 - Disable location, which causes significantly more wifi scans to be triggered, and reducing that may also improve the situation.
 - Try to attach strace to...

Read more...

Changed in network-manager-applet (Ubuntu):
status: New → Incomplete
Revision history for this message
Pat McGowan (pat-mcgowan) wrote :

I just confirmed this happens on the krillin with 2,4Ghz as well

Revision history for this message
Pat McGowan (pat-mcgowan) wrote :

This is also occurring on other krillins here, will try to attach strace

Revision history for this message
Pat McGowan (pat-mcgowan) wrote :

This is from a spin which occured just moving within the office

Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

I don't know why this was moved to nm-applet, since it's phone-specific.... ;)

affects: network-manager-applet (Ubuntu) → network-manager (Ubuntu)
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

I'm unable to reproduce this issue at home. It could be that it has to do with the sheer number of access-points in your location, which obviously would cause more updates to LastSeen timestamp for each accesspoint, and more AP changes signals.

Is NetworkManager itself spinning? It would be useful to have an excerpt of top; as well as a full debug log for NetworkManager when this happens. A strace of NetworkManager could be useful too, since dbus-daemon speaks to a lot more applications.

To enable debugging in NetworkManager, edit the /etc/init/network-manager.conf file and add to the exec NetworkManager line:
--log-level=debug --log-domains=CORE,WIFI,WIFI_SCAN

Revision history for this message
Pat McGowan (pat-mcgowan) wrote :
Revision history for this message
Pat McGowan (pat-mcgowan) wrote :
Revision history for this message
Pat McGowan (pat-mcgowan) wrote :

Latest testing results:
- cannot reproduce running vivid
- updated dbus to latest utopic version (1.8.8-1ubuntu2.1) and the problem still occurs

Changed in network-manager (Ubuntu):
assignee: Mathieu Trudel-Lapierre (mathieu-tl) → Ricardo Mendoza (ricmm)
status: Incomplete → In Progress
Changed in canonical-devices-system-image:
status: New → Confirmed
importance: Undecided → Critical
milestone: none → ww51-2014
Bill Filler (bfiller)
Changed in network-manager (Ubuntu RTM):
importance: Undecided → Critical
assignee: nobody → Ricardo Mendoza (ricmm)
Changed in location-service:
importance: Undecided → Critical
assignee: nobody → Thomas Voß (thomas-voss)
Revision history for this message
Thomas Voß (thomas-voss) wrote :

I tracked down the root cause to the location service reacting to changes to the "PrimaryConnection" property of org.freedesktop.NetworkManager by reaching out to all devices associated with the primary connection, and querying their properties, both for wifi devices and cross-referencing to Ofono for modem devices. This behavior of not relying on cached values causes excessive traffic on the bus. The branch attached to this bug mitigates the issue by:

  (1.) Relying on the cached devices for querying device properties. This is sufficient, as the cache is always consistent in this particular scenario.
  (2.) Not explicitly reacting to modem state changes coming in via Ofono but instead only reacting to changes to the PrimaryConnection property of NetworkManager.

description: updated
description: updated
Changed in canonical-devices-system-image:
milestone: ww51-2014 → ww03-2015
Revision history for this message
Loïc Minier (lool) wrote :

I've prepared some test binaries with the proposed fix at http://people.canonical.com/~lool/lp-1394204/

Revision history for this message
Loïc Minier (lool) wrote :

(there are some binaries for rtm in silo 7)

Loïc Minier (lool)
Changed in location-service:
status: New → In Progress
Changed in location-service (Ubuntu RTM):
status: New → In Progress
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package location-service - 2.1+15.04.20150106~rtm-0ubuntu1

---------------
location-service (2.1+15.04.20150106~rtm-0ubuntu1) 14.09; urgency=low

  [ thomas-voss ]
  * Fix #1394204 by: (LP: #1394204)

  [ Ubuntu daily release ]
  * New rebuild forced

location-service (2.1+15.04.20141110.1-0ubuntu1) vivid; urgency=low

  [ thomas-voss ]
  * Make sure that devices being added/removed by NetworkManager are
    handled correctly. (LP: #1390490)
 -- Ubuntu daily release <email address hidden> Tue, 06 Jan 2015 18:50:55 +0000

Changed in location-service (Ubuntu RTM):
status: In Progress → Fix Released
Revision history for this message
Pat McGowan (pat-mcgowan) wrote :

This is substantially fixed with the location service change, can open another bug if other symptoms arise

Changed in canonical-devices-system-image:
status: Confirmed → Fix Released
Revision history for this message
Tony Espy (awe) wrote :

Based upon tvoss' findings in comment #13, I'm closing out the network-manager tasks as Invalid. If anyone disagrees, feel free to re-open, but please provide updated logs and/or steps to reproduce if doing so.

Changed in network-manager (Ubuntu):
assignee: Ricardo Mendoza (ricmm) → nobody
Changed in network-manager (Ubuntu RTM):
assignee: Ricardo Mendoza (ricmm) → nobody
Changed in network-manager (Ubuntu):
status: In Progress → Invalid
Changed in network-manager (Ubuntu RTM):
status: New → Invalid
no longer affects: location-service
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.