The tool is now a confined snap and available on the Snap Store. This
commit updates the README with:
- embedded Snap Store image link in Introduction
- additional section on how to install the charm
- removed --classic flag when installing the charm locally
A timeout mechanism is needed for long-running async actions that can get
stuck, such as controller backups and application backups via actions.
Specifically, if an action gets "stuck", python-libjuju's wait() call will
wait indefinitely.
This commit introduces a run_with_timeout util for commands that can
potentially run forever. We also introduce JujuTimeoutError that will be
raised when the task takes too long to finish.
We're also adding a default timeout of 10 minutes, hardcoded in
constants.py. This is expected to be refactored into a configurable value.
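
A minimal sketch of how such a helper could look, assuming it wraps
asyncio.wait_for. The signature, the task_name parameter and the
DEFAULT_TIMEOUT_SECONDS constant name are illustrative assumptions, not
necessarily what this commit adds:

```python
import asyncio

# Assumed constant name; the commit only says the 10 minute default
# lives in constants.py.
DEFAULT_TIMEOUT_SECONDS = 10 * 60


class JujuTimeoutError(Exception):
    """Raised when an awaited task takes longer than its timeout."""


async def run_with_timeout(coro, task_name, timeout=DEFAULT_TIMEOUT_SECONDS):
    """Await coro, raising JujuTimeoutError if it exceeds timeout seconds."""
    try:
        # wait_for cancels the wrapped awaitable once the timeout expires.
        return await asyncio.wait_for(coro, timeout=timeout)
    except asyncio.TimeoutError as exc:
        raise JujuTimeoutError(
            f"{task_name} timed out after {timeout} seconds"
        ) from exc
```

A caller would wrap the potentially stuck awaitable, for example
`await run_with_timeout(action.wait(), "backup action")`, where `action`
stands for an already started libjuju action.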
Refactor backup_app exception handling into one except block.
There was a lot of duplicated code for handling the different possible errors
during backup_app execution. Since the flow is basically the same for each
(i.e. log the error and add the error to the tracker), we can simplify this
by having the error classes give more info and making the _log and add_error
calls more generic.
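
The shape of the consolidated handler could be sketched as follows. The
BackupError base class, its description attribute, the do_backup callable
and the errors list are hypothetical stand-ins for the real error classes,
_log and add_error, which are not shown in this message:

```python
import logging

logger = logging.getLogger(__name__)


class BackupError(Exception):
    """Hypothetical base class; subclasses carry their own description."""

    description = "backup failed"


class ActionError(BackupError):
    description = "backup action failed"


class BackupTimeoutError(BackupError):
    description = "backup timed out"


def backup_app(app_name, do_backup, errors):
    """Run do_backup(app_name); record any BackupError in one place."""
    try:
        do_backup(app_name)
    except BackupError as exc:
        # One generic log-and-record path instead of one block per error.
        logger.error("%s: %s (%s)", app_name, exc.description, exc)
        errors.append({"app": app_name, "error": exc.description})
        return False
    return True
```

Because each error class supplies its own description, adding a new
failure mode only needs a new subclass, not another except block.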
Add handling in the case of no unit being a leader.
In Juju, there are rare cases where no unit is elected a leader.
This can occur if all units are in a lost state. This would cause
our get_leader function to return None, which would break the
subsequent calls that use that leader unit.
To handle this, we introduce error handling for the case of no
leader. get_leader now raises a NoLeaderError if
no leader is returned. Additionally, the error bubbles up to
process.py which records the error and gracefully continues with
the rest of the apps.
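
A simplified sketch of that flow is below. The Unit dataclass with a plain
is_leader boolean and the process_apps function are assumptions made for
illustration; the real code queries leadership through python-libjuju and
lives in process.py:

```python
from dataclasses import dataclass


class NoLeaderError(Exception):
    """Raised when no unit of an application is the leader."""


@dataclass
class Unit:
    # Stand-in for a Juju unit; only the fields the sketch needs.
    name: str
    is_leader: bool


def get_leader(units):
    """Return the leader unit, or raise NoLeaderError instead of None."""
    for unit in units:
        if unit.is_leader:
            return unit
    raise NoLeaderError("no leader found among units")


def process_apps(apps):
    """Resolve a leader per app, recording failures and moving on."""
    leaders, errors = {}, []
    for app_name, units in apps.items():
        try:
            leaders[app_name] = get_leader(units).name
        except NoLeaderError as exc:
            # Record the error and continue with the remaining apps.
            errors.append((app_name, str(exc)))
    return leaders, errors
```

Raising instead of returning None keeps the failure at the point where it
is detected, so later calls never receive a missing leader unit.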