[checkbox-gui] org.freedesktop.DBus.Error.NoReply

Bug #1318504 reported by Yung Shen
14
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Checkbox GUI (Legacy)
Invalid
Critical
Unassigned
Next Generation Checkbox (CLI)
Fix Released
Critical
Zygmunt Krynicki
PlainBox (Toolkit)
Invalid
Critical
Unassigned

Bug Description

Having trouble while resuming from accidental closing certification testing window,
testcase panel does not prompt, pause and resume does not function, but exit button works.
and received this in terminal:

app_id : "com.canonical.checkbox-gui"
"org.freedesktop.DBus.Error.NoReply" "Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken."
GuiEngine::RunJobs - Done

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-24-generic 3.13.0-24.46 [modified: boot/vmlinuz-3.13.0-24-generic]
ProcVersionSignature: Ubuntu 3.13.0-24.46-generic 3.13.9
Uname: Linux 3.13.0-24-generic x86_64
NonfreeKernelModules: wl
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: ubuntu 1595 F.... pulseaudio
 /dev/snd/controlC0: ubuntu 1595 F.... pulseaudio
CurrentDesktop: Unity
CurrentDmesg:
 [ 32.389520] init: plymouth-upstart-bridge main process ended, respawning
 [ 46.399177] audit_printk_skb: 123 callbacks suppressed
 [ 46.399192] type=1400 audit(1399876506.669:62): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/lib/cups/backend/cups-pdf" pid=1974 comm="apparmor_parser"
 [ 46.399223] type=1400 audit(1399876506.669:63): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/cupsd" pid=1974 comm="apparmor_parser"
 [ 46.400983] type=1400 audit(1399876506.669:64): apparmor="STATUS" operation="profile_replace" profile="unconfined" name="/usr/sbin/cupsd" pid=1974 comm="apparmor_parser"
Date: Mon May 12 02:52:44 2014
HibernationDevice: RESUME=UUID=0d0a5734-dacf-4544-bc61-f614fa0b28c2
InstallationDate: Installed on 2014-05-09 (2 days ago)
InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Release amd64 (20140417)
MachineType: LENOVO 814VU01
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=bec53543-4e1e-4d3f-82b1-ed26ab1392ce ro rootdelay=60 quiet splash initcall_debug vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware 1.127
SourcePackage: linux
StagingDrivers: rts5139
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 05/29/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: HPET18WW(1.03)
dmi.board.asset.tag: Not Available
dmi.board.name: 814VU01
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvrHPET18WW(1.03):bd05/29/2012:svnLENOVO:pn814VU01:pvrLenovoB475e:rvnLENOVO:rn814VU01:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 814VU01
dmi.product.version: Lenovo B475e
dmi.sys.vendor: LENOVO

Related branches

Revision history for this message
Yung Shen (kaxing) wrote :
affects: linux (Ubuntu) → checkbox (Ubuntu)
Revision history for this message
Yung Shen (kaxing) wrote :

Can be reproduce with this plainbox session.

Revision history for this message
Zygmunt Krynicki (zyga) wrote : Re: [Bug 1318504] Re: [checkbox-gui] org.freedesktop.DBus.Error.NoReply

Thanks

On Mon, May 12, 2014 at 11:49 AM, Yung Shen <email address hidden>wrote:

> Can be reproduce with this plainbox session.
>
> ** Attachment added: "plainbox-cannot-resume.tar.xz"
>
> https://bugs.launchpad.net/ubuntu/+source/checkbox/+bug/1318504/+attachment/4110448/+files/plainbox-cannot-resume.tar.xz
>
> --
> You received this bug notification because you are a member of Checkbox
> Bug Wranglers, which is subscribed to checkbox in Ubuntu.
> https://bugs.launchpad.net/bugs/1318504
>
> Title:
> [checkbox-gui] org.freedesktop.DBus.Error.NoReply
>
> To manage notifications about this bug go to:
>
> https://bugs.launchpad.net/ubuntu/+source/checkbox/+bug/1318504/+subscriptions
>

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Got a simliar issue on 2013106-13836

Steps:
1. Run client-cert whitelist
2. Reboot on switching GPU job (without closing c-c-c)
3. Run c-c-c again, select re-run

c-c-c didn't resume the switch GPU job, it stuck at bluetooth detect instead

checkbox 0.17.10+bzr2819+pkg76+legacy1~ubuntu14.04.1
checkbox-certification 0.20+bzr648+201402141719~ubuntu14.04.1
checkbox-certification-client 0.20+bzr648+201402141719~ubuntu14.04.1
checkbox-certification-tools 0.20+bzr648+201402141719~ubuntu14.04.1
checkbox-gui 0.17.9~dev+bzr2988+pkg2~ubuntu14.04.1
checkbox-ng 0.4~dev+bzr2988+pkg7~ubuntu14.04.1
checkbox-ng-service 0.4~dev+bzr2988+pkg7~ubuntu14.04.1
checkbox-qt 0.17.10+bzr2819+pkg76+legacy1~ubuntu14.04.1
plainbox-provider-checkbox 0.5~dev+bzr2988+pkg4~ubuntu14.04.1
python3-checkbox 0.17.10+bzr2819+pkg76+legacy1~ubuntu14.04.1
python3-checkbox-ng 0.4~dev+bzr2988+pkg7~ubuntu14.04.1
python3-checkbox-support 0.3~dev+bzr2988+pkg2~ubuntu14.04.1
plainbox 0.6~dev+bzr2988+pkg3~ubuntu14.04.1
plainbox-provider-certification-client 0.1~dev+bzr2988+pkg14~ubuntu14.04.1
plainbox-provider-resource-generic 0.4~dev+bzr2988+pkg5~ubuntu14.04.1
plainbox-secure-policy 0.6~dev+bzr2988+pkg3~ubuntu14.04.1
python3-plainbox 0.6~dev+bzr2988+pkg3~ubuntu14.04.1

plainboc cache directory attached.

Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Execution log attached.

app_id : "com.canonical.checkbox-gui"
"org.freedesktop.DBus.Error.NoReply" "Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken."
GuiEngine::RunJobs - Done

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Can you also attach logs from ~/.cache/plainbox/logs please?

Changed in checkbox (Ubuntu):
status: New → Incomplete
Revision history for this message
Taihsiang Ho (tai271828) wrote :

CID 201305-13636 on 14.04 could reproduce the bug.

Steps to reproduce this bug: the same as what has been described in comment #4

However, if selecting "continue" in step 3, it could keep running.
It seems not to continue from the expected items(not sure), this need to try reproducing again.

My ~/.cache/plainbox/logs is empty in this case.

Revision history for this message
Taihsiang Ho (tai271828) wrote :

.cache/plainbox for comment #7

Revision history for this message
Daniel Manrique (roadmr) wrote :

I have the same issue, my system is getting stuck after selecting "rerun last test", it gets stuck running "miscellanea/submission-resources".

I can reproduce this pretty easily, so I can provide any needed diagnostic information; I should have access to the system in a couple of hours.

I'll mark as confirmed since more than one person is having this and the required diagnostic information has been supplied.

Changed in checkbox (Ubuntu):
status: Incomplete → Confirmed
importance: Undecided → High
Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Gargh (meaningless anger cry).

I'll try to understand what this is about

On Wed, May 21, 2014 at 1:17 PM, Daniel Manrique <
<email address hidden>> wrote:

> I have the same issue, my system is getting stuck after selecting "rerun
> last test", it gets stuck running "miscellanea/submission-resources".
>
> I can reproduce this pretty easily, so I can provide any needed
> diagnostic information; I should have access to the system in a couple
> of hours.
>
> I'll mark as confirmed since more than one person is having this and the
> required diagnostic information has been supplied.
>
> ** Changed in: checkbox (Ubuntu)
> Status: Incomplete => Confirmed
>
> ** Changed in: checkbox (Ubuntu)
> Importance: Undecided => High
>
> --
> You received this bug notification because you are a member of Checkbox
> Bug Wranglers, which is subscribed to checkbox in Ubuntu.
> https://bugs.launchpad.net/bugs/1318504
>
> Title:
> [checkbox-gui] org.freedesktop.DBus.Error.NoReply
>
> To manage notifications about this bug go to:
>
> https://bugs.launchpad.net/ubuntu/+source/checkbox/+bug/1318504/+subscriptions
>

Revision history for this message
Daniel Manrique (roadmr) wrote :

Bumping to critical as this is seriously blocking us. The good news is, I produced a good set of debugging information.

How I came to this situation:

- started canonical-certification-client
- selected certification whitelist, didn't deselect any jobs
- went through the list until I hit the reboot job, rebooted.
- upon reboot, I restarted c-c-client, selected "rerun last job" (in order to mark the reboot test as passed).

At this point, c-c-client shows an unrelated test (submission-resources) in the bottom bar, and gets stuck there. A message about dbus replies (as seen in original report) is shown.

I killed everything, restarted c-c-client, and selected "continue" this time, and this time it got stuck on esata/storage-test. Interestingly, esata/storage-test depends on esata/insert, which I skipped manually because this system has no sata. Also, miscellanea/submission-resources (from my first run) has a failed dependency (info/requirements, fails by design). So a first theory is that this happens when the job to be run has a dependency that didn't run.

Anyway, I'm attaching the plainbox debug log (I produced it by disabling the service file, then manually running checkbox -D -C service in a terminal), the console log from checkbox-gui, and the archived session that led to this problem.

Changed in checkbox (Ubuntu):
importance: High → Critical
Revision history for this message
Daniel Manrique (roadmr) wrote :
Revision history for this message
Daniel Manrique (roadmr) wrote :
Revision history for this message
Daniel Manrique (roadmr) wrote :
Revision history for this message
Daniel Manrique (roadmr) wrote :

Moving to checkbox-gui under the initial assumption that it's the culprit

affects: checkbox (Ubuntu) → checkbox-gui
Revision history for this message
Taihsiang Ho (tai271828) wrote :

more information associated to comment #11:

If I click "continue" to try to resume c-c-client, it stuck on 'fire wire HDD' or 'disk/hdd-parking'.
That is, for each resume and 'continue', the stuck item is not the same.

After the stuck item was shown,
I select 'skip' and 'Continue', it looks like the program was reading the status information of all the tested/skipped items.
And finally I came to the item where I just stopped at, say 'reboot' in the case of comment #11.

Revision history for this message
Taihsiang Ho (tai271828) wrote :

If I tried to resume the program by 'rerun' and killed the program because of the dbus replies.
Then launching c-c-client again without rebooting the system, it looks like the process stuck on the message shown below:

ubuntu@201305-13636:~$ canonical-certification-client
GuiEngine::GuiEngine
GuiEngine::GuiEngine - Done
GuiEngine::Initialise
GuiEngine - Initialising
GuiEngine::RefreshPBObjects
Could not introspect this object
GuiEngine::RefreshPBObjects - Done
GuiEngine::Initialise() - Done
WhiteListModelFactory::CreateWhiteListModel()
Recreating TestItemModel
WhiteListModelFactory::CreateWhiteListModel() - Done

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

That seems to indicate that the service has crashed. Can you confirm or deny that took place?

Still reading other logs

Sent from my BlackBerry 10 smartphone.
  Original Message
From: Daniel Manrique
Sent: środa, 21 maja 2014 20:16
To: <email address hidden>
Reply To: Bug 1318504
Subject: [Bug 1318504] Re: [checkbox-gui] org.freedesktop.DBus.Error.NoReply

Moving to checkbox-gui under the initial assumption that it's the
culprit

** Package changed: checkbox (Ubuntu) => checkbox-gui

--
You received this bug notification because you are a member of Checkbox
Bug Wranglers, which is subscribed to checkbox in Ubuntu.
https://bugs.launchpad.net/bugs/1318504

Title:
[checkbox-gui] org.freedesktop.DBus.Error.NoReply

To manage notifications about this bug go to:
https://bugs.launchpad.net/checkbox-gui/+bug/1318504/+subscriptions

Zygmunt Krynicki (zyga)
tags: added: 14-04-blocker trusy
removed: trusty
Ara Pulido (ara)
tags: added: trusty
removed: trusy
Revision history for this message
Zygmunt Krynicki (zyga) wrote :

BTW: when attaching logs add debug + crash + errors as debug is *only* the debug stream.

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

According to the session in comment 12 plainbox was running 2013.com.canonical.certification::esata/storage-test

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

The log records however don't show that file at all so it's possible that test was never started (we set the running_job_name *before* starting the job since the job can crash the whole box and we would have no way to skip that test the next time)

Revision history for this message
Sylvain Pineau (sylvain-pineau) wrote :

Since bug #1315337 is now fixed and somehow corrupted the jobs list known from checkbox-gui, I'd like to know if this bug still happens.

Revision history for this message
Sylvain Pineau (sylvain-pineau) wrote :

ok, retested it still happens.

Attached is my own session data.

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Reproduced. Thanks this will help me a lot

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

I strongly believe this is a deadlock in the threading code.

Revision history for this message
Zygmunt Krynicki (zyga) wrote :
Download full text (4.2 KiB)

(gdb) info threads
  Id Target Id Frame
* 3 Thread 0x7fffeea3e700 (LWP 28196) "python3" sem_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
  2 Thread 0x7fffefffb700 (LWP 28184) "gmain" 0x00007ffff78e3fbd in poll ()
    at ../sysdeps/unix/syscall-template.S:81
  1 Thread 0x7ffff7fc8740 (LWP 28179) "python3" sem_wait ()
    at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85

(gdb) thread 1
[Switching to thread 1 (Thread 0x7ffff7fc8740 (LWP 28179))]
#0 sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
85 ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S: Nie ma takiego pliku ani katalogu.
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3/dist-packages/checkbox_ng/service.py", line 1514, in _result_ready
    with self._result_lock:
  File "/usr/lib/python3.4/concurrent/futures/_base.py", line 373, in add_done_callback
    fn(self)
  File "/usr/lib/python3/dist-packages/checkbox_ng/service.py", line 1506, in _run_and_set_callback
    future.add_done_callback(self._result_ready)
  File "/usr/lib/python3/dist-packages/checkbox_ng/service.py", line 1496, in _decide_on_what_to_do
    self._result_future = self._run_and_set_callback()
  File "/usr/lib/python3/dist-packages/checkbox_ng/service.py", line 1268, in PrimeJob
    primed_job_wrapper._decide_on_what_to_do()
  File "/usr/lib/python3/dist-packages/checkbox_ng/service.py", line 245, in wrapper
    retval = func(**bound.arguments)
  File "/usr/lib/python3/dist-packages/checkbox_ng/dbus_ex/decorators.py", line 223, in sanity
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/dbus/service.py", line 707, in _message_c---Type <return> to continue, or q <return> to quit---
b
    retval = candidate_method(self, *args, **keywords)
  File "/usr/lib/python3/dist-packages/gi/overrides/GLib.py", line 523, in run
    super(MainLoop, self).run()
  File "/usr/lib/python3/dist-packages/checkbox_ng/commands/service.py", line 97, in run
    loop.run()
  File "/usr/lib/python3/dist-packages/checkbox_ng/commands/service.py", line 127, in invoked
    return ServiceInvocation(self.provider_list, self.config, ns).run()
  File "/usr/lib/python3/dist-packages/plainbox/impl/clitools.py", line 509, in dispatch_command
    return ns.command.invoked(ns)
  File "/usr/lib/python3/dist-packages/plainbox/impl/clitools.py", line 513, in dispatch_and_catch_exceptions
    return self.dispatch_command(ns)
  File "/usr/lib/python3/dist-packages/plainbox/impl/clitools.py", line 291, in main
    return self.dispatch_and_catch_exceptions(ns)
  File "/usr/lib/python3/dist-packages/checkbox_ng/main.py", line 163, in main
    raise SystemExit(CheckBoxNGTool().main(argv))
  File "/usr/bin/checkbox", line 9, in <module>
    load_entry_point('checkbox-ng==0.4.dev', 'console_scripts', 'checkbox')()
(gdb)

(gdb) thread 3
[Switching to thread 3 (Thread 0x7fffeea3e700 (LWP 28196))]
#0 sem_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S:85
85 in ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_wait.S
(gdb) py-bt
Traceback (most recent call first):
  File "/usr/lib/python3.4/threading.py", line 289, in wait
    waiter.acquire()
 ...

Read more...

Changed in plainbox:
status: New → Confirmed
Changed in checkbox-ng:
status: New → Confirmed
importance: Undecided → Critical
Changed in plainbox:
importance: Undecided → Critical
Changed in checkbox-gui:
status: Confirmed → Invalid
Revision history for this message
Zygmunt Krynicki (zyga) wrote :

And I no longer think it's directly caused by checkbox-gui

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

Just before it hangs

INFO checkbox.ng.service: Removing result <MemoryJobResult> from DBus
DEBUG checkbox.ng.service: _result_changed(<MemoryJobResult>, <DiskJobResult execution_duration:0.16682171821594238 io_log_filename:'/home/sylvain/.cache/plainbox/sessions/pbox-7p8_h2ig.session/io-logs/2013.com.canonical.certification__usb.record.gz' outcome:'pass' return_code:0>)
INFO checkbox.ng.service: Adding result <DiskJobResult execution_duration:0.16682171821594238 io_log_filename:'/home/sylvain/.cache/plainbox/sessions/pbox-7p8_h2ig.session/io-logs/2013.com.canonical.certification__usb.record.gz' outcome:'pass' return_code:0> to DBus
DEBUG checkbox.ng.service: Created DBus wrapper 140737193839128 for: <DiskJobResult execution_duration:0.16682171821594238 io_log_filename:'/home/sylvain/.cache/plainbox/sessions/pbox-7p8_h2ig.session/io-logs/2013.com.canonical.certification__usb.record.gz' outcome:'pass' return_code:0>
DEBUG checkbox.ng.service: Published DBus wrapper for <DiskJobResult execution_duration:0.16682171821594238 io_log_filename:'/home/sylvain/.cache/plainbox/sessions/pbox-7p8_h2ig.session/io-logs/2013.com.canonical.certification__usb.record.gz' outcome:'pass' return_code:0> as /plainbox/DiskJobResult/140737193839336
INFO checkbox.ng.service: Removing result <MemoryJobResult> from DBus
INFO checkbox.ng.service: PersistentSave()
INFO checkbox.ng.service: PrimeJob(<plainbox.impl.session.legacy.SessionStateLegacyAPICompatImpl object at 0x7fffef95bd58>, <JobDefinition id:'2013.com.canonical.certification::power-management/suspend-30-cycle-log-attach' plugin:'attachment'>)
DEBUG checkbox.ng.service: Created DBus wrapper 140737197401400 for: <plainbox.impl.highlevel.PrimedJob object at 0x7fffeea867a8>
DEBUG checkbox.ng.service: Published DBus wrapper for <plainbox.impl.highlevel.PrimedJob object at 0x7fffeea867a8> as /plainbox/PrimedJob/140737197402024
INFO checkbox.ng.service: Running <JobDefinition id:'2013.com.canonical.certification::power-management/suspend-30-cycle-log-attach' plugin:'attachment'> right away
[New Thread 0x7fffedb52700 (LWP 30218)]
DEBUG checkbox.ng.service: _result_ready(<Future at 0x7fffeea86a18 state=finished returned MemoryJobResult>)

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

The job we're trying to run there runs:

'[ -e $PLAINBOX_SESSION_SHARE/suspend_30_cycles.log ] && cat $PLAINBOX_SESSION_SHARE/suspend_30_cycles.log'

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

That file is present in the session and looks regular. It's also very small so the job can probably finish pretty fast

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

I solved the problem :-)

Revision history for this message
Zygmunt Krynicki (zyga) wrote :

The problem is right here:

    def _decide_on_what_to_do(self):
        if self.native.job.startup_user_interaction_required:
            logger.info(
                _("Sending ShowInteractiveUI() and not starting the job..."))
            # Ask the application to show the interaction GUI
            self._session_wrapper.ShowInteractiveUI(self)
        else:
            logger.info(_("Running %r right away"), self.native.job)

NOTE: here we're taking the result lock and call the method below:

            with self._result_lock:
                self._result_future = self._run_and_set_callback()

    def _run_and_set_callback(self):
        """
        Internal method of PrimedJobWrapper

        Starts the future for the job result, adds a callbacks to it and
        returns the future.
        """
        future = self.native.run()
        logger.info("Added callback %r to future %r", self._result_ready, future)

NOTE: here we're running under the lock and we call the add_done_callback() method. By this time the future is already executed and ready. Calling the method now calls the callback immediately. This calls the last method in the pack below

        future.add_done_callback(self._result_ready)
        logger.info("Returning")
        return future

    def _result_ready(self, result_future):
        """
        Internal method called when the result future becomes ready
        """
        logger.debug("_result_ready(%r)", result_future)

NOTE: here we try to to re-acquire the same lock. Bam. Deadlock

        with self._result_lock:

Changed in checkbox-ng:
status: Confirmed → Triaged
Changed in plainbox:
status: Confirmed → Invalid
Changed in checkbox-ng:
assignee: nobody → Zygmunt Krynicki (zkrynicki)
milestone: none → 0.4
Zygmunt Krynicki (zyga)
Changed in checkbox-ng:
status: Triaged → In Progress
Zygmunt Krynicki (zyga)
tags: added: sru
Changed in checkbox-ng:
status: In Progress → Fix Committed
Daniel Manrique (roadmr)
Changed in checkbox-ng:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.