jujuclient.EnvError: <Env Error - Details: { u'Error': u'watcher was stopped', u'RequestId': 9, u'Response': { }}

Bug #1284183 reported by Chris Johnston
This bug affects 2 people
Affects: juju-core
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

When deploying the CI Airline we are seeing this error:

2014-02-24 12:43:18 Waiting for units before adding relations
Traceback (most recent call last):
  File "/tmp/tmpMXiOxi/deployer/cli.py", line 224, in <module>
    main()
  File "/tmp/tmpMXiOxi/deployer/cli.py", line 119, in main
    run()
  File "/tmp/tmpMXiOxi/deployer/cli.py", line 216, in run
    importer.Importer(env, deployment, options).run()
  File "/tmp/tmpMXiOxi/deployer/action/importer.py", line 191, in run
    self.wait_for_units()
  File "/tmp/tmpMXiOxi/deployer/action/importer.py", line 166, in wait_for_units
    int(timeout), watch=self.options.watch, no_exit=ignore_error)
  File "/tmp/tmpMXiOxi/deployer/env/go.py", line 225, in wait_for_units
    self.client.wait_for_units(timeout, goal_state, callback=callback)
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 406, in wait_for_units
    return WaitForUnits(watch, goal_state).run(callback)
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 690, in run
    for change_set in self.watch:
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 216, in next
    return super(TimeoutWatcher, self).next()
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 178, in next
    'Id': self.watcher_id})
  File "/usr/lib/python2.7/dist-packages/jujuclient.py", line 145, in _rpc
    raise EnvError(result)
jujuclient.EnvError: <Env Error - Details:
 { u'Error': u'watcher was stopped', u'RequestId': 9, u'Response': { }}
 >
Problem deploying "ci-airline-staging": Command '['juju-deployer', '-v', '-c', '/tmp/tmpoUnBMa/deployer/relations.yaml', '-c', '/tmp/tmpoUnBMa/deploye$/production-only.yaml', '-c', '/tmp/tmpoUnBMa/deployer/ppa-assigner.yaml', '-c', '/tmp/tmpoUnBMa/deployer/image-builder.yaml', '-c', '/tmp/tmpoUnBMa/deployer/juju-gui.yaml', '-c', '/tmp/tmpoUnBMa/deployer/lander.yaml', '-c', '/tmp/tmpoUnBMa/deployer/ticket-system.yaml', '-c', '/tmp/tmpoUnBMa/deployer/branch-source-builder.yaml', '-c', '/tmp/tmpoUnBMa/deployer/test-runner.yaml', 'ci-airline-staging']' returned non-zero exit status 1

machine-0: http://paste.ubuntu.com/6986643/

We have tried this on 1 GB and 2 GB canonistack instances. It has happened in lcy01 and lcy02.

07:36:34 rogpeppe | hazmat, cjohnston: it looks like it might be a problem with mongo - the port it's getting a timeout error from is mongod's port
07:37:18 rogpeppe | hmm, but that's somewhat odd
07:37:54 cjohnston | hazmat: 'status' meaning 'juju status' ?
07:38:40 rogpeppe | cjohnston: yes
07:38:54 cjohnston | juju status works
07:39:21 rogpeppe | cjohnston: ok, so it seems like this is only a transient error, which is something
07:40:00 cjohnston | http://paste.ubuntu.com/6986741/
07:40:22 rogpeppe | cjohnston: thanks. that all looks healthy.
07:40:42 rogpeppe | cjohnston: did you find out about the issue from a GUI error?
07:41:30 hazmat | rogpeppe, from a deployer error, he's been reproducing this consistently for a week
07:41:43 cjohnston | http://paste.ubuntu.com/6986666/

juju-core: 1.17.3-0ubuntu1

To reproduce:

bzr branch lp:ubuntu-ci-services-itself
cd ubuntu-ci-services-itself
juju bootstrap
sshuttle
./juju-deployer/deploy.py

Tags: api status
description: updated
Curtis Hovey (sinzui)
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
tags: added: api status
Para Siva (psivaa) wrote :

syslog from the bootstrap node

Mark Ramm (mark-ramm)
Changed in juju-core:
milestone: none → 1.18.0
Kapil Thangavelu (hazmat) wrote :

For folks running into this via deployer: note that jujuclient 0.17.3 (in trusty, or installable via PyPI) works around the API server disconnecting all clients by automatically reconnecting watches.
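
A minimal sketch of the reconnect-on-stopped-watch idea, for anyone who cannot upgrade right away. This is not jujuclient's actual code: `WatchStopped` and `resilient_watch` are hypothetical names, and `connect` is assumed to be a caller-supplied factory that opens a fresh connection and returns an iterator of change sets.

```python
import time


class WatchStopped(Exception):
    """Hypothetical stand-in for the 'watcher was stopped' EnvError."""


def resilient_watch(connect, max_reconnects=3, delay=0.0):
    """Yield change sets from a watch, re-creating it when it is stopped.

    `connect` is a hypothetical zero-argument factory that opens a new
    connection and returns an iterator of change sets. It is an assumption
    of this sketch, not part of any real library.
    """
    reconnects = 0
    watch = connect()
    while True:
        try:
            change_set = next(watch)
        except StopIteration:
            # The watch ended normally; nothing more to deliver.
            return
        except WatchStopped:
            # The server dropped the watch: give up after too many
            # attempts, otherwise open a fresh one and keep going.
            if reconnects >= max_reconnects:
                raise
            reconnects += 1
            time.sleep(delay)
            watch = connect()
            continue
        yield change_set
```

A caller would iterate over `resilient_watch(connect)` in place of the raw watch. Changes that happened while the watch was down may be missed, so the goal state should be re-checked against a full status after a reconnect.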

Dimiter Naydenov (dimitern) wrote :

Following the steps, I can't seem to reproduce this issue on trunk, so perhaps the problem is already fixed.

Deploy output: https://pastebin.canonical.com/106065/

juju status output after completion: https://pastebin.canonical.com/106066/

Changed in juju-core:
milestone: 1.20.0 → next-stable
Curtis Hovey (sinzui)
Changed in juju-core:
importance: High → Medium
milestone: next-stable → none
Kapil Thangavelu (hazmat) wrote :

To be clear, the watch being stopped is basically a symptom of any number of issues with the API server: high load, establishing HA, restarting, errors, etc.

Changed in juju-core:
status: Triaged → Fix Released