Merge lp:~andrewjbeach/juju-ci-tools/fix-wait-for-started into lp:juju-ci-tools
Status: | Superseded |
---|---|
Proposed branch: | lp:~andrewjbeach/juju-ci-tools/fix-wait-for-started |
Merge into: | lp:juju-ci-tools |
Diff against target: | 420 lines (+346/-2), 2 files modified: jujupy.py (+194/-1), tests/test_jujupy.py (+152/-1) |
To merge this branch: | bzr merge lp:~andrewjbeach/juju-ci-tools/fix-wait-for-started |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Juju Release Engineering | Pending | ||
Review via email: mp+310553@code.launchpad.net |
This proposal has been superseded by a proposal from 2016-11-14.
Description of the change
Adds Status.errors and translates it into an Exception.
Its intended use is in _wait_for_status and other wait_for_* functions so that they can produce meaningful error messages when something goes wrong.
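A rough sketch of the intended use, not the code in this branch: a wait loop that, on timeout, reports an error found in status instead of a bare timeout. The flat status mapping, `find_error` and `wait_for` are hypothetical names and shapes chosen for illustration; real Juju status is more deeply nested.

```python
import time


class StatusError(Exception):
    """Generic error found in status (name from this proposal)."""


def find_error(status):
    """Return a StatusError for the first item in an error state, else None.

    `status` is an assumed, simplified mapping of item name to a dict with
    'current' and 'message' keys.
    """
    for name, item in sorted(status.items()):
        if item.get('current') == 'error':
            return StatusError('{}: {}'.format(name, item.get('message', '')))
    return None


def wait_for(get_status, is_done, timeout=600, interval=1, sleep=time.sleep):
    """Poll get_status() until is_done(status) is true or the timeout passes.

    On timeout, prefer an error translated from the last status over a
    generic timeout message.
    """
    deadline = time.time() + timeout
    while True:
        status = get_status()
        if is_done(status):
            return status
        if time.time() >= deadline:
            error = find_error(status)
            if error is not None:
                raise error
            raise Exception('Timed out after {} seconds.'.format(timeout))
        sleep(interval)
```

With this shape a test that never reaches its target state fails with something like `mysql/0: hook failed` rather than only a timeout.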
Unmerged revisions
- 1655. By Andrew James Beach
  Cleaned up Status1X.iter_status.
- 1654. By Andrew James Beach
  Clean up, tests and trying to get compatibility with Juju 1.X.
- 1653. By Andrew James Beach
  It is datetime.strptime, not timedelta.strptime.
- 1652. By Andrew James Beach
  to_exception now returns None if the StatusItem is not an error. Changed StatusItem's __init__.
- 1651. By Andrew James Beach
  Small fix in check_for_errors.
- 1650. By Andrew James Beach
  Clean-up in response to latest round of feedback. Such as clearing out the debugging code and adding the StatusItem class.
- 1649. By Andrew James Beach
  Didn't mean to cut that test out.
- 1648. By Andrew James Beach
  New status->exception translation. A few more types of Errors and some 'surface' functions with ignore_recoverable. Tests have been updated, but should probably be redone.
- 1647. By Andrew James Beach
  Fixed a few odds and ends for lint, I also changed StatusError sorting to a faster key based method.
- 1646. By Andrew James Beach
  Worked on ordering, recoverable and InstallError.
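To illustrate revisions 1652 and 1653, here is a minimal sketch of a status item whose `to_exception` returns None when the item is not an error, and whose timestamp is parsed with `datetime.strptime`. The constructor arguments, the `datetime_since` property and the timestamp format are assumptions for illustration, not the signatures in this branch.

```python
from datetime import datetime

# Assumed timestamp format for the 'since' field; adjust to the real output.
SINCE_FORMAT = '%d %b %Y %H:%M:%SZ'


class StatusError(Exception):
    """Generic error found in status (name from this proposal)."""


class StatusItem(object):
    """Hypothetical stand-in for the StatusItem class mentioned above."""

    def __init__(self, name, current, message='', since=None):
        self.name = name
        self.current = current
        self.message = message
        self.since = since

    @property
    def datetime_since(self):
        """Parse 'since' into a datetime, or return None if it is unset."""
        if self.since is None:
            return None
        # strptime lives on datetime; timedelta has no such method.
        return datetime.strptime(self.since, SINCE_FORMAT)

    def to_exception(self):
        """Return a StatusError for an error item, or None otherwise."""
        if self.current != 'error':
            return None
        return StatusError('{}: {}'.format(self.name, self.message))
```

Returning None for healthy items lets a caller loop over every item and keep only the results that are not None.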
Good questions.
1. Let's treat StatusError as the generic error seen in status; we prefer specific errors about what has errored. I want to test for errors in status all the time to exit the test early, but we need extra intelligence to know when to defer raising :(
2. I know from experience that Juju 2 is great at retries. It sees the errors in status as we do, and attempts to resolve them. So getting status with errors should not cause us to raise an error. We need to wait for a period in which we believe Juju should have addressed the error.
But I know that machine errors are not recoverable at this time. Once we see one in status, we can give up.
3. The priority is:
MachineError: not recoverable, raise when first seen.
  We might want subtypes in the future. Image not found, for example, can be a human or canonical error.
  Machine failed to start is a substrate error.
UnitError: juju will retry. There isn't a right answer for how long retries need to recover, but 10 minutes often works.
  HookError is the most common UnitError and juju retries. (I really love this feature)
  InstallError is almost always fatal.
AppError: This is often a summary of UnitError. It might also show config errors which are not unit specific.
StatusError: something else
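One possible way to encode the priority list above, offered as a sketch rather than the classes in the branch. The class names come from this discussion; the `priority` and `recoverable` attributes, the helper `by_priority`, and the choice to mark AppError and plain StatusError as recoverable are assumptions.

```python
class StatusError(Exception):
    """Generic error seen in status; lowest priority."""
    priority = 3
    recoverable = True  # assumption: the comment does not say either way


class MachineError(StatusError):
    """Not recoverable; raise when first seen."""
    priority = 0
    recoverable = False


class UnitError(StatusError):
    """Juju will retry, so treat as recoverable."""
    priority = 1
    recoverable = True


class HookError(UnitError):
    """Most common UnitError; Juju retries hooks."""


class InstallError(UnitError):
    """Almost always fatal."""
    recoverable = False


class AppError(StatusError):
    """Often a summary of UnitError."""
    priority = 2


def by_priority(errors):
    """Order errors from most to least important using a key function."""
    return sorted(errors, key=lambda error: error.priority)
```

Subclasses inherit the priority of their parent, so a HookError sorts alongside other unit errors while InstallError only overrides `recoverable`.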
I wonder if the call to check status needs an arg to distinguish between fatal and recoverable errors.
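Such an argument might look like the sketch below. The names `check_for_errors` and `ignore_recoverable` appear in the revision log, but the signature here, the `recoverable` attribute, the helper `check_during_wait` and the 600 second grace period (the "10 minutes often works" figure) are assumptions for illustration.

```python
class StatusError(Exception):
    """Generic status error; assumed recoverable for this sketch."""
    recoverable = True


class MachineError(StatusError):
    """Machine errors are not recoverable."""
    recoverable = False


class UnitError(StatusError):
    """Juju retries unit errors."""
    recoverable = True


def check_for_errors(errors, ignore_recoverable=False):
    """Raise the first error, skipping recoverable ones when asked to."""
    for error in errors:
        if ignore_recoverable and error.recoverable:
            continue
        raise error


def check_during_wait(errors, elapsed, grace=600):
    """Raise fatal errors at once; raise recoverable ones after the grace period.

    `elapsed` and `grace` are in seconds.
    """
    check_for_errors(errors, ignore_recoverable=elapsed < grace)
```

A wait loop could call `check_during_wait` on every poll: a MachineError ends the test early, while a hook failure only becomes fatal once Juju has had time to retry.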