Merge lp:~percona-toolkit-dev/percona-toolkit/pt-tc-likely-to-skip-table-when-row-count-is-around-chunk-size-times-chunk-size-limit-1389041 into lp:~percona-toolkit-dev/percona-toolkit/release-2.2.14

Proposed by Frank Cizmich
Status: Needs review
Proposed branch: lp:~percona-toolkit-dev/percona-toolkit/pt-tc-likely-to-skip-table-when-row-count-is-around-chunk-size-times-chunk-size-limit-1389041
Merge into: lp:~percona-toolkit-dev/percona-toolkit/release-2.2.14
Diff against target: 81 lines (+47/-1)
3 files modified
bin/pt-table-checksum (+1/-1)
t/pt-table-checksum/basics.t (+37/-0)
t/pt-table-checksum/samples/create_skip_table.sql (+9/-0)
To merge this branch: bzr merge lp:~percona-toolkit-dev/percona-toolkit/pt-tc-likely-to-skip-table-when-row-count-is-around-chunk-size-times-chunk-size-limit-1389041
Reviewer Review Type Date Requested Status
Daniel Nichter Needs Information
Review via email: mp+254818@code.launchpad.net

Description of the change

Problem:
pt-table-checksum often skips a table when its size is close to --chunk-size * --chunk-size-limit
This is because it checks if the slave table exceeds that size and decides it's too big to do in one chunk.
Since the size of the slave table is just an estimate of the EXPLAIN query, it is often the case that it exceeds the size of the master table by only a few rows. This leads to it skipping tables that could really be done in one chunk with no problem.

Solution:
Add a 20% tolerance to the size estimate.

Note: includes test

To post a comment you must log in.
Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Why not just increase --chunk-size-limit? There's always going to be "if only it was just N% bigger!" So this seems to just move the problem because what if the new border is --chunk-size * --chunk-size-limit * 1.2?

review: Needs Information
Revision history for this message
Frank Cizmich (frank-cizmich) wrote :

Daniel, at first glance it seems as you say, but look what is happening:

chunk-size-limit * chunk-size (without the extra 20%) has already been used earlier in the code to decide that the table will be done in one chunk ( see line 6469 )

The part where I added the 20% tolerance is when it is checking if the slave table can also be done in 1 chunk. This happens *after* it has been decided that the master table can do it.

( see line 9926 )

"if ( $nibble_iter->one_nibble() ) {" # if we're going to do it in one chunk ..
  ... check and be a bit more tolerant with slave here...

The net effect is that now a table will not be skipped for minor discrepancies.

Unmerged revisions

614. By Frank Cizmich

dont skip when slave rowcount barely higher than chunk-size*chunk-size-limit

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bin/pt-table-checksum'
2--- bin/pt-table-checksum 2015-01-23 10:19:56 +0000
3+++ bin/pt-table-checksum 2015-03-31 19:14:15 +0000
4@@ -9939,7 +9939,7 @@
5 PTDEBUG && _d('Table on', $slave->name(),
6 'has', $n_rows, 'rows');
7 if ( $n_rows
8- && $n_rows > ($tbl->{chunk_size} * $chunk_size_limit) )
9+ && $n_rows > ($tbl->{chunk_size} * $chunk_size_limit) * 1.2 ) # 20% tolerance
10 {
11 PTDEBUG && _d('Table too large on', $slave->name());
12 push @too_large, [$slave->name(), $n_rows || 0];
13
14=== modified file 't/pt-table-checksum/basics.t'
15--- t/pt-table-checksum/basics.t 2015-01-05 22:15:27 +0000
16+++ t/pt-table-checksum/basics.t 2015-03-31 19:14:15 +0000
17@@ -57,6 +57,7 @@
18 }
19
20
21+
22 # ############################################################################
23 # Default checksum and results. The tool does not technically require any
24 # options on well-configured systems (which the test env cannot be). With
25@@ -526,6 +527,42 @@
26 "sql_mode ONLY_FULL_GROUP_BY is overidden"
27 );
28
29+# #############################################################################
30+# Bug #1389041 : Tool sometimes skips tables when
31+# row count ~ chunk-size*chunk-size-limit
32+# https://bugs.launchpad.net/percona-toolkit/+bug/1389041
33+# #############################################################################
34+$sb->load_file('master', "t/pt-table-checksum/samples/create_skip_table.sql");
35+
36+# insert 40 rows in master;
37+for (1..40) {
38+ $master_dbh->do("INSERT INTO test.test VALUES (NULL)");
39+}
40+$sb->wait_for_slaves();
41+# insert 20 more rows in slave1;
42+for (1..20) {
43+ $slave1_dbh->do("INSERT INTO test.test VALUES (NULL)");
44+}
45+$sb->wait_for_slaves();
46+
47+# now master has 40 rows, and slaves have 60
48+
49+$output = output(
50+ sub { $exit_status = pt_table_checksum::main(@args,
51+ qw(--databases test --tables test --chunk-size 25 --chunk-size-limit 2)) },
52+ stderr => 1,
53+);
54+
55+ok(
56+ $output !~ /skipping table test/i,
57+ "Doesn't skip table when row count discrepancies are small. (bug 1389041)"
58+);
59+
60+
61+# cleanup
62+$master_dbh->do("DROP DATABASE test");
63+$sb->wait_for_slaves();
64+
65
66 # #############################################################################
67 # Done.
68
69=== added file 't/pt-table-checksum/samples/create_skip_table.sql'
70--- t/pt-table-checksum/samples/create_skip_table.sql 1970-01-01 00:00:00 +0000
71+++ t/pt-table-checksum/samples/create_skip_table.sql 2015-03-31 19:14:15 +0000
72@@ -0,0 +1,9 @@
73+CREATE DATABASE IF NOT EXISTS `test`;
74+
75+USE `test`;
76+
77+DROP TABLE IF EXISTS `test`;
78+CREATE TABLE `test` (
79+ `a` int PRIMARY KEY AUTO_INCREMENT
80+) ENGINE=InnoDB;
81+

Subscribers

People subscribed via source and target branches

to all changes: