Config file changes don't only happen in response to config.changed
events; they also happen on ES- and MongoDB-related events and on
leadership events. Thus, clearing the graylog_api.configured flag
only during one of these code paths may lead to problems. For
example, if ES or MongoDB is not immediately available and becomes
available later on, the resulting config changes may not trigger a
re-run of the config handlers which run against the Graylog API.
The fix is simple: we should clear graylog_api.configured any time we
need to restart Graylog.
Additionally, I removed the remove_state('graylog_api.configured')
call in configure_graylog() since any config file change will trigger
a restart, thus also clearing this flag.
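A minimal sketch of the idea, not the charm's actual code: the set below
stands in for the reactive flag store, and restart_graylog() and its
restart_service argument are hypothetical names. The only point shown is
that the flag is cleared in the restart path itself, so every event that
leads to a restart also forces the API handlers to run again.

```python
# Stand-in for the reactive flag store. The real charm clears the flag
# with remove_state('graylog_api.configured'); this set is an assumption
# made so the sketch runs on its own.
flags = set()


def restart_graylog(restart_service=lambda: None):
    """Restart Graylog and force the API handlers to run again.

    restart_service is a hypothetical callable that restarts the service.
    """
    restart_service()
    # Clear the flag on every restart, whichever event caused it.
    flags.discard('graylog_api.configured')
```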
1. report_status() now checks the REST API as one of its "waiting_apps",
and will show in the unit's status message that it is waiting on the
REST API to come alive if it is not available.
2. configure_graylog_api(), the handler that runs before a chain of
handlers which interact with the Graylog REST API, now performs a
REST API liveness check with a 2-minute timeout. This gives the
REST API plenty of time to come online after e.g. a Graylog restart,
and ensures that it is up and usable before the other chained
handlers proceed. This is especially important since
GraylogApi.request() does not raise exceptions for failed API calls.
(This should probably be addressed but would significantly increase
the scope of the needed changes; adding this check should at least
reduce the chance of hitting failures when accessing the REST API.)
3. Several handlers which depend on the REST API did not have a
@when('graylog_api.configured') decorator, and thus bypassed the
REST API check. The decorator has been added to those handlers.
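The liveness check described in item 2 could look roughly like the sketch
below. This is an illustration under assumptions, not the code in the
charm: wait_for_api() and its probe, interval, clock and sleep parameters
are hypothetical names, and the 5-second polling interval is made up. Only
the 2-minute timeout comes from the change itself.

```python
import time


def wait_for_api(probe, timeout=120.0, interval=5.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll probe() until it returns True or timeout seconds elapse.

    probe is a hypothetical zero-argument callable that returns True
    once the REST API answers. It is expected to return False rather
    than raise when the API is unreachable.
    """
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            # Give up; the caller should not proceed to the API handlers.
            return False
        sleep(interval)
```

A handler would call wait_for_api() first and return early on False, so
that the chained handlers only run once the API has answered.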
This is an artifact from when I was experimenting with a combination
of zaza and libjuju methodologies. As zaza does everything I need,
I've removed the libjuju imports and, as of this commit, am removing
the zaza-specific differentiation.