Comment 10 for bug 1377237

Zygmunt Krynicki (zyga) wrote :

19:41 < zyga> roadmr: I can hear you now
19:41 < zyga> roadmr: I've killed dropbox
19:41 < zyga> roadmr: sorry
19:41 < zyga> yes
19:41 < zyga> it's good
19:41 < zyga> it's just laggy
19:41 < zyga> but I can
19:41 < zyga> ok
19:41 < zyga> so we have all the tests
19:41 < zyga> which one should I keep?
19:41 < zyga> hehe
19:42 < zyga> so why?
19:42 < zyga> run?
19:42 < zyga> yes
19:52 < roadmr> zyga: see my last comment, an easy workaround seems to be piping dbus-launch to a file or something. Kinda scary
                because I don't know why that works (voodoo), but it may help figure things out
19:52 < roadmr> zyga: console log and compressed session attached, though the logs for that job in the session look empty :/
19:52 < roadmr> (even the io logs)
19:55 < zyga> roadmr: hmmmmmmmmmmmm
19:55 < roadmr> zyga: fun, eh? :)
19:55 < zyga> so maybe something is stuck on a PIPE buffer
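
(editor's note: a minimal sketch of the "stuck on a PIPE buffer" theory, not the actual plainbox code path. It assumes Linux, bash and GNU coreutils; the FIFO name, the 4 MiB size and the 2 second timeout are illustrative choices. It shows that a writer stalls once a pipe's kernel buffer is full and nothing is reading from the other end.)

```shell
#!/bin/bash
# Sketch: a writer blocks when a pipe is full and nobody drains it.
# Assumptions: Linux, bash, GNU coreutils (timeout, head). Names are illustrative.
workdir=$(mktemp -d)
fifo="$workdir/demo.fifo"
mkfifo "$fifo"

# Open the FIFO read-write on fd 3: a read end exists, but nothing ever reads it.
exec 3<>"$fifo"

# Try to push 4 MiB into the pipe. The default pipe capacity on Linux is 64 KiB,
# so the writer should stall; timeout(1) kills it after 2 seconds (exit status 124).
timeout 2 sh -c 'head -c 4194304 /dev/zero >&3'
writer_status=$?

if [ "$writer_status" -eq 124 ]; then
    echo "writer blocked on a full pipe (killed by timeout)"
else
    echo "writer finished with status $writer_status"
fi

# Clean up the descriptor and the temporary directory.
exec 3>&-
rm -rf "$workdir"
```

(if a job's output pipe were left unread in the same way, the job would appear to hang just like this writer does; whether that is what happens here is still only a theory.)
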
19:57 < zyga> roadmr: do you have the log files from plainbox itself?
19:57 < zyga> ~/.cache/plainbox/logs/
19:57 < roadmr> zyga: oh... that one was empty :/
19:57 < zyga> roadmr: ok
19:57 < zyga> roadmr: HMM
19:57 < zyga> roadmr: /me tries something
19:58 < roadmr> zyga: if you want to see differences in environment, a dummy job that just does "env" is usually what I do
19:58 < roadmr> er, env >/tmp/some-file
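
(editor's note: a rough sketch of the "dummy job that just does env" idea, assuming bash and standard POSIX tools. The file names and the DEMO_ONLY_IN_BAD variable are made up for illustration; in the real case the two dumps would come from the working and the hanging way of starting the service.)

```shell
#!/bin/bash
# Sketch: dump the environment from two contexts and compare them.
# File names and DEMO_ONLY_IN_BAD are hypothetical, for illustration only.
workdir=$(mktemp -d)

# "Good" context: the current environment, sorted so the diff is stable.
env | sort > "$workdir/good.env"

# "Bad" context, simulated here by adding one extra variable.
DEMO_ONLY_IN_BAD=1 env | sort > "$workdir/bad.env"

# diff exits non-zero when the files differ, which is the expected case here.
env_diff=$(diff "$workdir/good.env" "$workdir/bad.env")
echo "$env_diff"

rm -rf "$workdir"
```
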
19:59 < roadmr> cgregan: I pushed the whitelist change to the testing branch and am building a release candidate now, if you're
                happy with everything else we can skip testing it and just do the same process for the release package
19:59 < zyga> roadmr: I was going to try that, did you get both env outputs?
20:00 < roadmr> zyga: no, but I can do it easily, give me 5 mins
20:00 < zyga> k
20:01 < cgregan> yeah...just push to prod roadmr
20:01 < cgregan> I will do a little sanity check.
20:01 < roadmr> cgregan: ok, here I go then
20:03 < zyga> roadmr: so why do we use dbus-launch there?
20:03 < roadmr> zyga: no idea, we'd have to ask spineau, he wrote that part of the job
20:04 * roadmr would benefit from 2 computers side-by-side instead of having to 180 the chair to reach the one behind...
20:07 < zyga> roadmr: ok I have a theory
20:07 < zyga> roadmr: hehe
20:07 < zyga> :)
20:07 < roadmr> :) glad to see it's at least an interesting bug...
20:08 < roadmr> almost done collecting env data
20:12 < zyga> roadmr: so first part of the theory
20:12 < zyga> roadmr: using dbus-launch is totally broken
20:12 < zyga> roadmr: it's leaving processes behind
20:12 < zyga> roadmr: it starts new dbus daemons each time
20:14 < zyga> roadmr: that part is non-controversial and can be quickly tested by using dbus-launch and reading the manual page
20:14 < roadmr> ok... yes I thought starting a new dbus bus was weird
20:15 < zyga> roadmr: so what happens when you redirect to a file
20:15 < zyga> it "all works"
20:15 < zyga> I think that's because..
20:16 < zyga> the file that we save is actually in a temporary directory
20:17 < zyga> no, that makes no sense :/
20:17 < zyga> no idea how that changes anything
20:17 < roadmr> zyga: a head-scratcher... ok, I attached good and bad environments
20:17 < zyga> roadmr: quick test please:
20:17 < zyga> roadmr: hack the job
20:18 < zyga> roadmr: to have: 'set -x'
20:18 < zyga> roadmr: (the one that hangs)
20:18 < roadmr> ahh :)
20:18 < zyga> roadmr: and quickly retry that
20:18 < zyga> roadmr: don't kill the session
20:18 < zyga> roadmr: ssh-in
20:18 < zyga> roadmr: and look at ~/.cache/plainbox/session/last-session/io-logs/$name
20:18 < zyga> (where $name is the job name)
20:18 < roadmr> zyga: checking...
20:19 < zyga> roadmr: I'll file a bug on plainbox
20:19 < zyga> roadmr: plainbox needs to become systemd and control jobs started in a session :)
20:19 < roadmr> zyga: oh doh, the one I filed is in "checkbox", feel free to move that one and/or add new tasks to it
20:19 < zyga> roadmr: no worries :)
20:21 < zyga> roadmr: next test in line:
20:21 < zyga> roadmr: remove dbus-launch, just use gsettings
20:21 < zyga> roadmr: keep set -x
20:21 < zyga> roadmr: and let's see if that hangs
20:22 < zyga> roadmr: I don't understand the significance of additional processes that lurk around
20:22 < zyga> roadmr: but that one is clearly broken IMHO and if we remove it it might just 'fix itself'
20:24 < roadmr> zyga: ok, for test 1, I put set -x at the beginning of the job. There's no change, the job still stalls, and the io
                log files are all empty (there, but length 0)
20:24 < zyga> hmmmmmmmmmmmmmmmmm
20:24 < zyga> roadmr: that might be buffering :/
20:24 < zyga> roadmr: add a trick, wrap *all* of the job command in one big ( ) and > it to a file
20:25 < zyga> roadmr: so we see that without the extra crappy buffering
20:25 < roadmr> zyga: :D ok, let's see
20:25 < zyga> roadmr: (plainbox doesn't use splice(2))
20:25 < zyga> roadmr: I like bugs like this one
20:25 < zyga> roadmr: hard problems always teach you something new ;)
20:25 < zyga> :)
20:26 < zyga> ah, I meant tee(2) (that's a syscall now)
20:26 < zyga> I thought that was one big syscall
20:27 < roadmr> cgregan: tarballs are building now, should be up in about 5 minutes
20:27 < cgregan> cool
20:27 < cgregan> re-imaging
20:30 < roadmr> zyga: XD even with the ()> wrapper, the file is still empty :/
20:30 < zyga> roadmr: did you redirect 2 or one?
20:31 < zyga> roadmr: -x goes to stderr
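
(editor's note: a small sketch of why the "( ) >file" wrapper came back empty, assuming bash or another POSIX shell. The log file names are illustrative. The "set -x" trace is written to stderr, so redirecting only stdout misses it, while adding 2>&1 captures it.)

```shell
#!/bin/bash
# Sketch: 'set -x' tracing goes to stderr, so '>' alone does not capture it.
# File names are illustrative.
workdir=$(mktemp -d)

# Redirect stdout only; stderr goes to a separate file so it can be inspected.
( set -x; echo hello ) > "$workdir/stdout-only.log" 2> "$workdir/stderr.log"

# Redirect both streams into one file, as in '( ... ) >blah 2>&1'.
( set -x; echo hello ) > "$workdir/both.log" 2>&1

# Count the trace lines ("+ echo hello") that ended up in each file.
stdout_trace_lines=$(grep -c 'echo hello' "$workdir/stdout-only.log")
stderr_trace_lines=$(grep -c 'echo hello' "$workdir/stderr.log")
both_trace_lines=$(grep -c 'echo hello' "$workdir/both.log")

echo "trace lines: stdout-only=$stdout_trace_lines stderr=$stderr_trace_lines both=$both_trace_lines"

rm -rf "$workdir"
```
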
20:31 < zyga> roadmr: test that job with plainbox dev script or plainbox run
20:31 < roadmr> zyga: oh doh :) ok, let me retry
20:31 < roadmr> (obviously didn't do that)
20:31 < zyga> roadmr: before re-running it in the gui
20:31 < zyga> roadmr: faster cycle for you
20:32 < roadmr> it's OK, I'm also working on the cdts release, context switching is what slows me down a bit
20:32 < zyga> roadmr: ok, work on cdts first
20:32 < zyga> roadmr: I'll be back
20:32 < roadmr> zyga: sure, I'll update the bug with that stuff
20:35 -!- kissiel [<email address hidden>] has joined #Cert
20:35 < zyga> kissiel: hey
20:35 < zyga> kissiel: :)
20:35 < zyga> kissiel: interesting issues :)
20:36 < kissiel> zyga, shoot
20:36 < zyga> kissiel: what's up, no friday?
20:36 < zyga> kissiel: https://bugs.launchpad.net/checkbox/+bug/1377237
20:36 < mup> Bug #1377237: screenshot_fullscreen_video doesn't work if run from checkbox-gui with auto-started checkbox service
             <Checkbox:Confirmed> <https://launchpad.net/bugs/1377237>
20:36 < zyga> kissiel: ideas welcome
20:36 < zyga> kissiel: my bet is on a full PIPE that nothing is reading somehow
20:36 < zyga> kissiel: but not sure
20:36 < zyga> kissiel: and not sure why it works when checkbox service is started from the console
20:36 < kissiel> zyga, http://plainbox.readthedocs.org/en/latest/author/tutorial.html :P
20:36 < roadmr> zyga: pasted the giant () and guess what, with that, it *works* :D
20:36 < zyga> kissiel: this is a problem that only occurs when checkbox service is auto-started by dbus
20:37 < zyga> roadmr: that is kind of expected
20:37 < kissiel> zyga, IDK if it's me, but i cannot get new provider to work
20:37 < zyga> roadmr: it's a pipe that is now not being fed
20:37 < zyga> roadmr: now change that 2> to a 2>&1 | tee file.foo
20:37 < zyga> roadmr: it will now hang
20:37 < kissiel> zyga, and I really need this to test my time measurement things
20:37 < zyga> roadmr: but we'll grab a log
20:37 < zyga> roadmr: (my theory)
20:37 < roadmr> zyga: I'm doing () >blah 2>&1
20:37 < zyga> kissiel: hehe
20:37 < zyga> kissiel: ok
20:37 < zyga> kissiel: what happens?
20:38 < zyga> kissiel: I know
20:38 < zyga> kissiel: ./manage.py develop -u
20:38 < zyga> kissiel: ./manage.py develop -d $PROVIDERPATH
20:38 * kissiel types too slow
20:38 < zyga> kissiel: you are welcome :)
20:38 < roadmr> cgregan: ok, tarballs are up to production and flatpage is updated (the packages one only)
20:39 < cgregan> cool
20:39 < zyga> kissiel: feel free to report a bug on manage.py develop to HINT the user that this is not going to do what she thinks
              because PROVIDERPATH is set but -d is not used and default directory is not on PROVIDERPATH at this time
20:39 < cgregan> roadmr: thanks for the help
20:39 < zyga> kissiel: should be a two-liner to fix
20:40 < kissiel> zyga, I thought this would have something to do with venv
20:40 < roadmr> cgregan: np, I think it went pretty smoothly, even with last-minute fixes the release process is more predictable
20:40 < kissiel> zyga, and yes, thanks :)

(some update on the working theory)