pt-table-checksum 2.1.4 doesn't detect diffs on Percona XtraDB Cluster nodes

Bug #1062563 reported by Daniel Nichter
This bug affects 1 person
Affects: Percona Toolkit (moved to https://jira.percona.com/projects/PT)
Status: Fix Released
Importance: Medium
Assigned to: Brian Fraser

Bug Description

Although v2.1.4 supports Percona XtraDB Cluster, there's a bug that prevents it from detecting diffs on cluster nodes:

         $slave_lag_cxns = $slaves;

should be

         $slave_lag_cxns = [ map { $_ } @$slaves ];

because right after that assignment, cluster nodes are removed from $slave_lag_cxns. If $slave_lag_cxns is a ref to the same array as $slaves, removing nodes from it also removes them from $slaves, so the diffs check on $slaves has nothing to check.
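
The aliasing problem described above can be reproduced outside the tool. The following is a minimal sketch, not the actual pt-table-checksum code: the node hashes, the is_cluster_node flag, the helper names, and the grep-based removal step are all assumptions made for illustration. Only the two assignment forms come from this report.

```perl
use strict;
use warnings;

# Hypothetical node list; the real tool holds connection objects here.
sub make_slaves {
    return [
        { name => 'node1',    is_cluster_node => 1 },
        { name => 'node2',    is_cluster_node => 1 },
        { name => 'replica1', is_cluster_node => 0 },
    ];
}

# Buggy variant: $slave_lag_cxns is a second reference to the same array.
sub lag_cxns_by_alias {
    my ($slaves) = @_;
    my $slave_lag_cxns = $slaves;
    # Assumed removal step: drop cluster nodes in place.
    @$slave_lag_cxns = grep { !$_->{is_cluster_node} } @$slave_lag_cxns;
    return $slave_lag_cxns;
}

# Fixed variant: a shallow copy gives $slave_lag_cxns its own array,
# so removing elements from it leaves $slaves untouched.
sub lag_cxns_by_copy {
    my ($slaves) = @_;
    my $slave_lag_cxns = [ map { $_ } @$slaves ];
    @$slave_lag_cxns = grep { !$_->{is_cluster_node} } @$slave_lag_cxns;
    return $slave_lag_cxns;
}

my $aliased = make_slaves();
lag_cxns_by_alias($aliased);
print "alias: ", scalar(@$aliased), " node(s) left to check for diffs\n";

my $copied = make_slaves();
lag_cxns_by_copy($copied);
print "copy:  ", scalar(@$copied), " node(s) left to check for diffs\n";
```

With the aliased assignment, the original list shrinks to the single non-cluster node; with the copy, all three nodes remain available for the diffs check. The copy is shallow, so both arrays still share the same node hashes, which is enough here because only array membership changes.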

Changed in percona-toolkit:
assignee: nobody → Daniel Nichter (daniel-nichter)
summary: - pt-table-checksum v2.1.4 doesn't detect diffs on Percona XtraDB cluster
+ pt-table-checksum v2.1.4 doesn't detect diffs on Percona XtraDB Cluster
nodes
Changed in percona-toolkit:
milestone: none → 2.1.5
Changed in percona-toolkit:
status: In Progress → Fix Committed
summary: - pt-table-checksum v2.1.4 doesn't detect diffs on Percona XtraDB Cluster
+ pt-table-checksum 2.1.4 doesn't detect diffs on Percona XtraDB Cluster
nodes
Changed in percona-toolkit:
status: Fix Committed → Fix Released
Brian Fraser (fraserbn) wrote :

(This was partially fixed in 2.1.5; tagging this to 2.1.6 so we remember to merge the rest of the fix.)

Changed in percona-toolkit:
assignee: Daniel Nichter (daniel-nichter) → Brian Fraser (fraserbn)
importance: Critical → Medium
milestone: 2.1.5 → 2.1.6
status: Fix Released → In Progress
Brian Fraser (fraserbn) wrote :

Hopefully Launchpad will actually submit this, this time around. Here's the original commit message of the fix in the new branch, which summarizes what it does:

  This commit adds two warnings; one for the case of master -> cluster,
  and one for cluster1 -> cluster2.
  The code that checks if two nodes belong to the same cluster
  is "best effort" -- it will generally DTRT, but will fail in
  the case that all of the following are true:
   * Both nodes have the same cluster name
   * They aren't in a master-slave relationship
   * Both nodes have a different wsrep_cluster_address,
     or both of their addresses aren't 'gcomm://'; that is,
     they both aren't the first node of a cluster.

  Which can happen in the case that
  Cluster 1 (pt_cluster_name):
   node1 -> addr points to node2
   node2 -> addr points to node3
   node3 -> addr points to node1

  Cluster 2 (pt_cluster_name):
   _node1 -> addr points to _node2
   _node2 -> addr points to _node3
   _node3 -> addr points to _node1

  node1 is a master to _node1.
  The dsns table has all of the nodes.

  ptc will think that cluster 2 is only _node1, and cluster 1
  is everything else. Further heuristics could check if we've
  seen a second cluster and check the addresses from there,
  but that isn't currently worth the effort.

Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: In Progress → Fix Committed
Brian Fraser (fraserbn)
Changed in percona-toolkit:
status: Fix Committed → Fix Released
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PT-588
