telegraf mysql (mysql-root) and postgresql (pgsql) relations are not container-scoped

Bug #1718259 reported by Dmitrii Shcherbakov
This bug affects 1 person
Affects: Telegraf Charm
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned

Bug Description

Following the investigation in bug 1717590, it was discovered that the mysql and pgsql relations do not have "container" scope (Juju-level scope, not reactive conversation-level scope).

https://bugs.launchpad.net/charm-percona-cluster/+bug/1717590/comments/14

https://git.launchpad.net/telegraf-charm/tree/metadata.yaml?h=built#n18
"requires":
  "mysql":
    "interface": "mysql-root"
  "postgresql":
    "interface": "pgsql"

Relation scope is a Juju-level concept, not a reactive-library-level one, so it needs to be set properly in metadata.yaml to avoid passing extra relation data:

https://github.com/juju/juju/blob/juju-2.2.4/apiserver/uniter/subordinaterelationwatcher.go#L27-L33
// newSubordinateRelationsWatcher creates a watcher that will notify
// about relation lifecycle events for subordinateApp, but filtered to
// be relevant to a unit deployed to a container with the
// principalName app. Global relations will be included, but only
// container-scoped relations for the principal application will be
// emitted - other container-scoped relations will be filtered out.

https://jujucharms.com/docs/2.2/authors-subordinate-applications
"Principal application: A traditional application or charm in whose container subordinate applications will execute.

Subordinate application/charm: An application designed for and deployed to the running container of another application unit.

Container relation: A scope:container relationship. While modeled identically to traditional, scope: global, relationships, juju only implements the relationship between the units belonging to the same container."

So, we need to make sure that telegraf, while being a subordinate charm ("subordinate": "true" in metadata.yaml), also has a proper relation scope for the mysql (and pgsql) relation, so that mysql-relation-{joined,changed} events are not fired for other telegraf units that do not collect data from mysql and do not reside on the same host.

For context: telegraf collects metrics from mysql, so a single agent only needs to reside on the same host as a particular mysql instance and talk to it: https://github.com/influxdata/telegraf/tree/master/plugins/inputs/mysql


description: updated
tags: added: cdo-qa cdo-qa-blocker foundations-engine
Revision history for this message
Stuart Bishop (stub) wrote :

The pgsql interface is not container scoped and does not function correctly if used as a container scoped relation, which is why telegraf does not try to declare it as container scoped.

As best I can tell, what should happen when one end of a relation declares the scope as container and the other as global is undefined (although it does happen to work with many simpler relations). For PostgreSQL, the particular problem is that the relation becomes container scoped and the PostgreSQL units are unable to see their peers, breaking the implementation because the standby units never see the master unit join the relation.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

~stub,

That's odd. At least from what I can see in the Juju code, there are no checks for otherEnd's scope and there are watchers for different relation objects.

If this end is global - should send
If this end is not global and the other end is the principal for this watcher - should send
If this end is not global and the other end is not a principal (subordinate <-> subordinate) - should send

https://github.com/juju/juju/blob/juju-2.2.4/apiserver/uniter/subordinaterelationwatcher.go#L118-L139
func (w *subRelationsWatcher) shouldSendCheck(key string) (bool, error) {
...
 if thisEnd.Scope == charm.ScopeGlobal {
  return true, nil
 }

 // Only allow container relations if the other end is our
 // principal or the other end is a subordinate.
 otherEnds, err := rel.RelatedEndpoints(w.app.Name())
 if err != nil {
  return false, errors.Trace(err)
 }
 for _, otherEnd := range otherEnds {
  if otherEnd.ApplicationName == w.principalName {
   return true, nil
  }
  otherApp, err := w.backend.Application(otherEnd.ApplicationName)
  if err != nil {
   return false, errors.Trace(err)
  }
  if !otherApp.IsPrincipal() {
   return true, nil
  }
 }

No checks for scope here either:
https://github.com/juju/juju/blob/juju-2.2.4/apiserver/uniter/uniter.go#L1691-L1705

This behavior of setting relation scope to 'container' if one of two sides has ScopeContainer goes way back (to 5 years ago):
https://github.com/juju/juju/blame/juju-2.2.4/state/state.go#L1661-L1668
func (st *State) AddRelation(eps ...Endpoint) (r *Relation, err error) {
...
 // If either endpoint has container scope, so must the other; and the
 // applications's series must also match, because they'll be deployed to
 // the same machines.
 matchSeries := true
 if eps[0].Scope == charm.ScopeContainer {
  eps[1].Scope = charm.ScopeContainer
 } else if eps[1].Scope == charm.ScopeContainer {
  eps[0].Scope = charm.ScopeContainer

eps ~ Endpoint ~ Relation
https://github.com/juju/juju/blob/juju-2.2.4/api/uniter/endpoint.go#L13-L15
Relation
https://github.com/juju/charm/blob/v6-unstable/meta.go#L125-L134

---

Which Juju version was used when you noticed the undefined behavior with one side Global and another side Container-scoped?

Revision history for this message
Stuart Bishop (stub) wrote :

You might be right, and the issue could have been on the client/subordinate side in interface:pgsql. Updates I've made to that interface since I opened Bug #1560262 may have fixed this, and switching the scope to container may now just work.

Ante Karamatić (ivoks)
tags: added: cpe-onsite
removed: cpec
Revision history for this message
Stuart Bishop (stub) wrote :

Can anyone state what the actual bug here is, by the way? My understanding is that the current approach of using the non-container-scoped PostgreSQL relation as it is defined is correct, rather than choosing to use a different scope from the primary charm and hoping it works. I ask because I think people might be trying to fix the wrong problem.

The PostgreSQL charm defines pgsql as a non-container-scoped relation, which indicates to me that the other end should agree that it is non-container-scoped and use a standard relation, even if the other end is a subordinate and has other container-scoped relations such as juju-info to the primary charm.

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

Let me first define a term that I am going to use:

instance = object in object oriented programming terms = a copy of a struct

relation instance = an object for two units in question

If there are N units of app1 and M units of app2, there will be N x M relation objects.

A relation with the same name can be reused for multiple applications.

N (app1), M (app2), K(app3)

app1 <-> rel_example_name <-> app2
app1 <-> rel_example_name <-> app3

In total I will have N x M + N x K = N (M + K) relation objects.

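If I have subordinates, there is no need for me to have N x M or N x K relation objects - I will just have N1 = M and N2 = K, where N1 and N2 are the numbers of app1 units deployed alongside app2 and app3.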

So, in total, there will be: M + K container-scoped relations.

If I don't make them container-scoped, they will be treated as normal relations and I lose the benefit of a subordinate application. The units will still be placed locally, but there will be additional event lifecycles (joined, changed, etc.) for units that are not each other's principal or subordinate.

So, if app1 is a subordinate but relations are not container scoped, I will still get N (M + K).
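
To make the arithmetic above concrete, here is a minimal Go sketch. The function names and the unit counts are hypothetical, chosen only for illustration; nothing here is taken from the Juju code base. It assumes one subordinate app1 unit per principal unit, so N = M + K.

```go
package main

import "fmt"

// globalPairs counts unit-to-unit relation objects when every app1 unit
// is related to every unit of each related application (scope: global).
func globalPairs(app1Units int, relatedUnits ...int) int {
	total := 0
	for _, u := range relatedUnits {
		total += app1Units * u
	}
	return total
}

// containerPairs counts relation objects when app1 is a subordinate with
// container-scoped relations: each principal unit is paired only with the
// single app1 unit that shares its container.
func containerPairs(relatedUnits ...int) int {
	total := 0
	for _, u := range relatedUnits {
		total += u
	}
	return total
}

func main() {
	m, k := 3, 2 // hypothetical unit counts for app2 and app3
	n := m + k   // assumption: one subordinate app1 unit per principal unit

	fmt.Println("global scope, N(M+K):", globalPairs(n, m, k))
	fmt.Println("container scope, M+K:", containerPairs(m, k))
}
```

With these example numbers the global case yields 25 relation objects and the container-scoped case yields 5, which is the gap in event lifecycles being discussed.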

Juju code assumes that both ends of a relation "instance" (think objects in object-oriented programming) for two particular units should resolve to the narrowest scope.

If a charm on the other end is container-scoped, all copies of the relation objects in Juju will be container-scoped.

This makes sense because if I wanted to have app4 as a non-subordinate but use the same relation, I would have an option to do so:

app1 (scope: container) <-> rel_example_name <-> app2 (scope: global) => final scope: container

app1 (scope: container) <-> rel_example_name <-> app3 (scope: global) => final scope: container

app4 (scope: global) <-> rel_example_name <-> app3 (scope: global) => final scope: global
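
The three cases above can be reproduced with a small self-contained Go sketch of the rule quoted earlier from AddRelation (if either endpoint has container scope, so must the other). The Scope type and effectiveScope function are simplified stand-ins written for this example, not Juju's actual types or API.

```go
package main

import "fmt"

// Scope is a simplified, hypothetical stand-in for Juju's relation scope type.
type Scope string

const (
	ScopeGlobal    Scope = "global"
	ScopeContainer Scope = "container"
)

// effectiveScope applies the rule from the AddRelation excerpt: if either
// endpoint declares container scope, the relation is container-scoped.
func effectiveScope(a, b Scope) Scope {
	if a == ScopeContainer || b == ScopeContainer {
		return ScopeContainer
	}
	return ScopeGlobal
}

func main() {
	// Application names and scopes follow the examples in this comment.
	examples := []struct {
		left, right string
		a, b        Scope
	}{
		{"app1", "app2", ScopeContainer, ScopeGlobal},
		{"app1", "app3", ScopeContainer, ScopeGlobal},
		{"app4", "app3", ScopeGlobal, ScopeGlobal},
	}
	for _, e := range examples {
		fmt.Printf("%s (scope: %s) <-> rel_example_name <-> %s (scope: %s) => final scope: %s\n",
			e.left, e.a, e.right, e.b, effectiveScope(e.a, e.b))
	}
}
```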

This is also beneficial when I don't control app2 and app3 but want to leverage container scope (like with telegraf and postgres or mysql).

There is, however, a bug which prevents us from using that: https://bugs.launchpad.net/juju/+bug/1721295

This is in addition to the upgradability problem:
https://bugs.launchpad.net/juju/+bug/1721289
https://bugs.launchpad.net/juju/+bug/1721295

The end goal is to cut down the number of event lifecycles while maintaining ownership and stability of the charms we are relating to.

Revision history for this message
Xav Paice (xavpaice) wrote :

Is this bug still active and needing further work, or have changes in Juju made this no longer an issue?

Changed in charm-telegraf:
status: New → Incomplete
Changed in charm-telegraf:
importance: Undecided → Low
Revision history for this message
Xav Paice (xavpaice) wrote :

Confirmed that this is still an issue, for the mysql relation at least.

Changed in charm-telegraf:
status: Incomplete → Triaged
importance: Low → Medium
Eric Chen (eric-chen)
Changed in charm-telegraf:
status: Triaged → Won't Fix