adding units to large deployments is slow

Bug #1746134 reported by Paul Collins
This bug affects 2 people
Affects     Status         Importance   Assigned to   Milestone
NTP Charm   Fix Released   High         Unassigned

Bug Description

PS4.5 has approximately 90 ntp units. Adding a unit seems to entail invoking the peer relation hook many times, possibly up to 180, which at ~4s per invocation is 12 minutes. This leads to a) the unit itself taking a long time to settle; and b) its hooks blocking other hooks on the same machine, which slows down adding new units overall. This became particularly acute thanks to LP:1746119, which resulted in nova-compute having to run its hooks many more times than is typical in order to complete deploying.
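
For reference, the arithmetic behind the 12-minute figure can be sketched as below. This is only an illustration of the estimate in the description: the function name is hypothetical, and the 180 invocations and ~4s per hook are the reporter's approximate figures, not measurements.

```python
# Rough estimate of how long a new unit spends in peer relation hooks.
# The inputs are the approximate figures from the bug description;
# estimated_hook_seconds is a hypothetical helper, not part of the charm.

def estimated_hook_seconds(invocations: int, seconds_per_hook: float) -> float:
    """Total serial hook time, assuming hooks run one after another."""
    return invocations * seconds_per_hook


invocations = 180        # "possibly up to 180 times"
seconds_per_hook = 4.0   # "~4s per invocation"

total = estimated_hook_seconds(invocations, seconds_per_hook)
print(f"{total:.0f} s = {total / 60:.0f} min")  # 720 s = 12 min
```

Because hooks on one machine run serially, this time is also time during which other charms' hooks on that machine are waiting.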

Paul Gear (paulgear)
Changed in ntp-charm:
status: New → Confirmed
importance: Undecided → High
Paul Gear (paulgear)
Changed in ntp-charm:
assignee: nobody → Paul Gear (paulgear)
Paul Gear (paulgear) wrote : Re: [Bug 1746134] [NEW] adding units to large deployments is slow

I'm trying to optimise this along with the changes for chrony support in
bionic.  I've optimised out a relation-set from the relation-changed
hook, but I'm not hopeful of further performance improvements without
some abuse of update-status as a config-changed caching mechanism.  Once
I have the new code in place I'll do some performance comparisons and
note the results here.

Paul Gear (paulgear)
Changed in ntp-charm:
status: Confirmed → Fix Committed
Paul Gear (paulgear) wrote :

New code has been released as cs:ntp-29 (candidate channel). I'll update once I have some performance comparisons to report.

Paul Gear (paulgear) wrote :

I've completed several performance tests and have come to the conclusion that auto_peers' use of the peer relation is a fundamental performance bottleneck. To that end, I'm going to submit an MP to deprecate that feature until a better-performing implementation can be created.

Paul Gear (paulgear) wrote :

I've pushed https://code.launchpad.net/~paulgear/ntp-charm/+git/ntp-charm/+merge/364236 to mark auto_peers as deprecated, built cs:~ntp-team/ntp-23 (candidate channel) and started a CI run, but CI seems a bit borked, so I'm leaving this as Fix Committed for now.

Changed in ntp-charm:
assignee: Paul Gear (paulgear) → nobody
Haw Loeung (hloeung)
Changed in ntp-charm:
status: Fix Committed → Fix Released