The current charm does not indicate to the end user when a specific
resource is not running. Neither does it indicate when a node is offline
or stopped.
Validate that configured resources are actually running and let the end
user know if they are not.
This commit adds two new options, failed_actions_alert_type and
failed_actions_threshold, which map onto the check_crm options
--failedactions and --failcounts, respectively.
The default option values make check_crm generate critical alerts if
actions failed once.
The actions check can be entirely bypassed if failed_actions_alert_type
is set to 'ignore'.
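As a rough illustration of the mapping described above, the two charm options could be turned into check_crm arguments along these lines. This is only a sketch: the helper name build_check_crm_args is hypothetical, the default threshold of 1 is inferred from "failed once", and the way values are attached to each flag is an assumption rather than the charm's actual code.

```python
def build_check_crm_args(alert_type="critical", threshold=1):
    """Return extra check_crm arguments for the two charm options.

    Hypothetical helper. The flag names come from the commit message;
    the default threshold and the "flag value" layout are assumptions.
    """
    return [
        "--failedactions", str(alert_type),  # failed_actions_alert_type
        "--failcounts", str(threshold),      # failed_actions_threshold
    ]


if __name__ == "__main__":
    # Defaults: critical alert as soon as an action has failed once.
    print(" ".join(build_check_crm_args()))
    # 'ignore' is passed through so that check_crm skips the actions check.
    print(" ".join(build_check_crm_args("ignore", 3)))
```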
As explained here[0], setting failure-timeout means that the cib will 'forget'
that a resource agent action failed by setting failcount to 0:
- if $failure-timeout seconds have elapsed from the last failure
- if an event wakes up the policy engine (i.e. at the global resource
recheck in an idle cluster)
By default the failure-timeout will be set to 0, which disables the feature,
however this change allows for tuning.
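The two conditions above can be read as a small predicate. The following is a simplified model for illustration only, not Pacemaker code: the function name and parameters are hypothetical, and it assumes both conditions must hold before the failcount is reset.

```python
def failcount_is_forgotten(failure_timeout, seconds_since_failure,
                           policy_engine_ran):
    """Simplified model of when the failcount is reset to 0.

    Assumes both conditions from the commit message must hold:
    the timeout has elapsed AND the policy engine has been woken up.
    """
    if failure_timeout <= 0:
        # 0 is the default and disables the feature entirely.
        return False
    expired = seconds_since_failure >= failure_timeout
    return expired and policy_engine_ran


if __name__ == "__main__":
    print(failcount_is_forgotten(0, 3600, True))    # feature disabled
    print(failcount_is_forgotten(60, 30, True))     # timeout not yet elapsed
    print(failcount_is_forgotten(60, 120, False))   # policy engine not yet run
    print(failcount_is_forgotten(60, 120, True))    # failure is forgotten
```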
Choose whether to ignore/warn/crit on failed actions
This commit adds a new option to check_crm named --failedactions.
Possible values are 'warning', 'critical', or anything else (which is
treated as equivalent to 'ignore').
The default is 'critical' to be backward compatible.
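The value handling described above might look roughly like this when expressed in Python. check_crm itself is not written this way; the function name is hypothetical and the sketch only mirrors the stated rule, using the standard Nagios plugin exit codes.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def failed_actions_status(alert_type, failed_action_count):
    """Map a --failedactions value to a status (illustrative only)."""
    if failed_action_count == 0:
        return OK
    if alert_type == "warning":
        return WARNING
    if alert_type == "critical":
        return CRITICAL
    # Any other value is treated as 'ignore'.
    return OK


if __name__ == "__main__":
    for value in ("critical", "warning", "ignore", "something-else"):
        print(value, failed_actions_status(value, 1))
```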
5357776... by OpenDev Sysadmins <email address hidden>
OpenDev Migration Patch
This commit was bulk generated and pushed by the OpenDev sysadmins
as a part of the Git hosting and code review systems migration
detailed in these mailing list posts:
Attempts have been made to correct repository namespaces and
hostnames based on simple pattern matching, but it's possible some
were updated incorrectly or missed entirely. Please reach out to us
via the contact information listed at https://opendev.org/ with any
questions you may have.
Stonith is being disabled at the global cluster level despite it being needed
for pacemaker-remote nodes.
The legacy hacluster charm option 'stonith_enable' covers the main 'member'
nodes and if it is set to false then stonith resources are not created for
them and the stonith-enabled cluster parameter is set to false. However, in a
masakari deployment stonith is not required for the member nodes but is
required for the remote nodes. In this case the stonith-enabled cluster option
should be set to true.
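The intended decision can be summarised in a few lines. This is a sketch under stated assumptions, not the charm's implementation: the function and parameter names are hypothetical, and it assumes the charm can tell whether pacemaker-remote nodes require stonith.

```python
def stonith_enabled_value(stonith_enable, remote_nodes_need_stonith):
    """Return the value for the stonith-enabled cluster property.

    Hypothetical helper: stonith stays enabled cluster-wide if either the
    member nodes (stonith_enable option) or the remote nodes need it.
    """
    enabled = bool(stonith_enable) or bool(remote_nodes_need_stonith)
    return "true" if enabled else "false"


if __name__ == "__main__":
    # Masakari-style deployment: members do not need stonith, remotes do.
    print(stonith_enabled_value(False, True))
    # No stonith anywhere: the property can be set to false.
    print(stonith_enabled_value(False, False))
```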
When setting up resources for pacemaker remote nodes use the IP
address supplied by the remote node for communication. This
ensures that communication happens over the desired network
space.