Merge lp:~percona-toolkit-dev/percona-toolkit/pt-fingerprint-broken-fingerprinting-of-specific-INSERT-REPLACE-queries-1289516 into lp:percona-toolkit/2.2
Status: | Rejected |
---|---|
Rejected by: | Frank Cizmich |
Proposed branch: | lp:~percona-toolkit-dev/percona-toolkit/pt-fingerprint-broken-fingerprinting-of-specific-INSERT-REPLACE-queries-1289516 |
Merge into: | lp:percona-toolkit/2.2 |
Diff against target: |
434 lines (+103/-34) 8 files modified
bin/pt-fingerprint (+18/-6) bin/pt-index-usage (+18/-6) bin/pt-kill (+18/-6) bin/pt-query-digest (+2/-2) bin/pt-table-usage (+18/-6) bin/pt-upgrade (+18/-6) lib/QueryRewriter.pm (+2/-2) t/lib/QueryRewriter.t (+9/-0) |
To merge this branch: | bzr merge lp:~percona-toolkit-dev/percona-toolkit/pt-fingerprint-broken-fingerprinting-of-specific-INSERT-REPLACE-queries-1289516 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
David Bennett (community) | Needs Fixing | ||
Daniel Nichter | Pending | ||
Review via email: mp+220869@code.launchpad.net |
Commit message
Description of the change
Changed QueryRewrite's function fingerprint so it isn't confused by string values with parentheses inside.
eg:
insert into foo values(
Daniel Nichter (daniel-nichter) wrote : | # |
Frank Cizmich (frank-cizmich) wrote : | # |
Agreed. The fix only addresses a related subset of the original issue. A
perfect fix might be impossible with regexp, or at least quite complex
and costly.
On 05/27/2014 10:37 PM, Daniel Nichter wrote:
> The test does not seem to address the issue originally reported:
>
> $ pt-fingerprint --query "REPLACE INTO a VALUES('INSERT INTO foo VALUES (1),(2)')"
>
> replace into a values('insert into foo values(?+)
>
> That query is going to be very difficult to fingerprinted, perhaps not even possible because fingerprinting is currently heuristic, pattern-based.
--
Frank Cizmich, Software Engineer, Percona
Tel: +1-888-401-3401 Ext: 630
Skype: percona.fcizmich
Montevideo, Uruguay (UTC -3)
www.percona.com
www.mysqlperfor
David Bennett (dbpercona) wrote : | # |
Size of duplicate full text indexes is reported as 0.
CREATE TABLE `ai5` (
`id` int(11) NOT NULL,
`c1` int(11) DEFAULT NULL,
`c2` int(11) DEFAULT NULL,
`c3` varchar(64) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `c3` (`c3`),
FULLTEXT KEY `c3_2` (`c3`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
$ ./pt-duplicate-
# #######
# test.ai5
# #######
# c3_2 (`c3`)
# PRIMARY (`id`)
# c3 (`c3`)
# c3_2 is a duplicate of c3
# Key definitions:
# FULLTEXT KEY `c3_2` (`c3`)
# FULLTEXT KEY `c3` (`c3`),
# Column types:
# `c3` varchar(64) default null
# To remove this duplicate index, execute:
ALTER TABLE `test`.`ai5` DROP INDEX `c3_2`;
# #######
# Summary of indexes
# #######
# Size Duplicate Indexes 0
# Total Duplicate Indexes 1
# Total Indexes 3
David Bennett (dbpercona) wrote : | # |
Comment 630864 made on wrong merge proposal, disregard.
Unmerged revisions
- 597. By Frank Cizmich
-
fixes query digest for inserts with values with parentheses inside - issue 1289516
Preview Diff
1 | === modified file 'bin/pt-fingerprint' | |||
2 | --- bin/pt-fingerprint 2014-02-20 08:10:16 +0000 | |||
3 | +++ bin/pt-fingerprint 2014-05-25 01:57:50 +0000 | |||
4 | @@ -1529,8 +1529,8 @@ | |||
5 | 1529 | 1529 | ||
6 | 1530 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments | 1530 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
7 | 1531 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ | 1531 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
10 | 1532 | my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ | 1532 | my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
11 | 1533 | my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW | 1533 | my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
12 | 1534 | 1534 | ||
13 | 1535 | 1535 | ||
14 | 1536 | sub new { | 1536 | sub new { |
15 | @@ -1545,7 +1545,8 @@ | |||
16 | 1545 | $query =~ s/$mlc_re//go; | 1545 | $query =~ s/$mlc_re//go; |
17 | 1546 | $query =~ s/$olc_re//go; | 1546 | $query =~ s/$olc_re//go; |
18 | 1547 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version | 1547 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
20 | 1548 | $query =~ s/$vlc_re//go; | 1548 | my $qualifier = $1 || ''; |
21 | 1549 | $query =~ s/$vlc_re/$qualifier/go; | ||
22 | 1549 | } | 1550 | } |
23 | 1550 | return $query; | 1551 | return $query; |
24 | 1551 | } | 1552 | } |
25 | @@ -1607,8 +1608,8 @@ | |||
26 | 1607 | && return $query; | 1608 | && return $query; |
27 | 1608 | $query =~ m/\A\s*(call\s+\S+)\(/i | 1609 | $query =~ m/\A\s*(call\s+\S+)\(/i |
28 | 1609 | && return lc($1); # Warning! $1 used, be careful. | 1610 | && return lc($1); # Warning! $1 used, be careful. |
31 | 1610 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { | 1611 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
32 | 1611 | $query = $beginning; # Shorten multi-value INSERT statements ASAP | 1612 | $query = $beginning; # Shorten multi-value INSERT statements ASAP |
33 | 1612 | } | 1613 | } |
34 | 1613 | 1614 | ||
35 | 1614 | $query =~ s/$mlc_re//go; | 1615 | $query =~ s/$mlc_re//go; |
36 | @@ -1668,6 +1669,13 @@ | |||
37 | 1668 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; | 1669 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
38 | 1669 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; | 1670 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
39 | 1670 | 1671 | ||
40 | 1672 | if ( $query =~ m/\A\s*LOAD/i ) { | ||
41 | 1673 | my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; | ||
42 | 1674 | $tbl ||= ''; | ||
43 | 1675 | $tbl =~ s/`//g; | ||
44 | 1676 | return "LOAD DATA $tbl"; | ||
45 | 1677 | } | ||
46 | 1678 | |||
47 | 1671 | if ( $query =~ m/\Aadministrator command:/ ) { | 1679 | if ( $query =~ m/\Aadministrator command:/ ) { |
48 | 1672 | $query =~ s/administrator command:/ADMIN/; | 1680 | $query =~ s/administrator command:/ADMIN/; |
49 | 1673 | $query = uc $query; | 1681 | $query = uc $query; |
50 | @@ -1680,7 +1688,7 @@ | |||
51 | 1680 | PTDEBUG && _d($query); | 1688 | PTDEBUG && _d($query); |
52 | 1681 | 1689 | ||
53 | 1682 | $query = uc $query; | 1690 | $query = uc $query; |
55 | 1683 | $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; | 1691 | $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
56 | 1684 | $query =~ s/\s+COUNT[^)]+\)//g; | 1692 | $query =~ s/\s+COUNT[^)]+\)//g; |
57 | 1685 | 1693 | ||
58 | 1686 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; | 1694 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
59 | @@ -1695,6 +1703,7 @@ | |||
60 | 1695 | eval $QueryParser::tbl_ident; | 1703 | eval $QueryParser::tbl_ident; |
61 | 1696 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; | 1704 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
62 | 1697 | if ( $dds) { | 1705 | if ( $dds) { |
63 | 1706 | $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; | ||
64 | 1698 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; | 1707 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
65 | 1699 | $obj = uc $obj if $obj; | 1708 | $obj = uc $obj if $obj; |
66 | 1700 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); | 1709 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
67 | @@ -1761,6 +1770,9 @@ | |||
68 | 1761 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; | 1770 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
69 | 1762 | $query = $verbs; | 1771 | $query = $verbs; |
70 | 1763 | } | 1772 | } |
71 | 1773 | elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { | ||
72 | 1774 | return $verbs; | ||
73 | 1775 | } | ||
74 | 1764 | else { | 1776 | else { |
75 | 1765 | my @tables = $self->__distill_tables($query, $table, %args); | 1777 | my @tables = $self->__distill_tables($query, $table, %args); |
76 | 1766 | $query = join(q{ }, $verbs, @tables); | 1778 | $query = join(q{ }, $verbs, @tables); |
77 | 1767 | 1779 | ||
78 | === modified file 'bin/pt-index-usage' | |||
79 | --- bin/pt-index-usage 2014-02-20 08:10:16 +0000 | |||
80 | +++ bin/pt-index-usage 2014-05-25 01:57:50 +0000 | |||
81 | @@ -2374,8 +2374,8 @@ | |||
82 | 2374 | 2374 | ||
83 | 2375 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments | 2375 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
84 | 2376 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ | 2376 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
87 | 2377 | my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ | 2377 | my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
88 | 2378 | my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW | 2378 | my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
89 | 2379 | 2379 | ||
90 | 2380 | 2380 | ||
91 | 2381 | sub new { | 2381 | sub new { |
92 | @@ -2390,7 +2390,8 @@ | |||
93 | 2390 | $query =~ s/$mlc_re//go; | 2390 | $query =~ s/$mlc_re//go; |
94 | 2391 | $query =~ s/$olc_re//go; | 2391 | $query =~ s/$olc_re//go; |
95 | 2392 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version | 2392 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
97 | 2393 | $query =~ s/$vlc_re//go; | 2393 | my $qualifier = $1 || ''; |
98 | 2394 | $query =~ s/$vlc_re/$qualifier/go; | ||
99 | 2394 | } | 2395 | } |
100 | 2395 | return $query; | 2396 | return $query; |
101 | 2396 | } | 2397 | } |
102 | @@ -2452,8 +2453,8 @@ | |||
103 | 2452 | && return $query; | 2453 | && return $query; |
104 | 2453 | $query =~ m/\A\s*(call\s+\S+)\(/i | 2454 | $query =~ m/\A\s*(call\s+\S+)\(/i |
105 | 2454 | && return lc($1); # Warning! $1 used, be careful. | 2455 | && return lc($1); # Warning! $1 used, be careful. |
108 | 2455 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { | 2456 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
109 | 2456 | $query = $beginning; # Shorten multi-value INSERT statements ASAP | 2457 | $query = $beginning; # Shorten multi-value INSERT statements ASAP |
110 | 2457 | } | 2458 | } |
111 | 2458 | 2459 | ||
112 | 2459 | $query =~ s/$mlc_re//go; | 2460 | $query =~ s/$mlc_re//go; |
113 | @@ -2513,6 +2514,13 @@ | |||
114 | 2513 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; | 2514 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
115 | 2514 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; | 2515 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
116 | 2515 | 2516 | ||
117 | 2517 | if ( $query =~ m/\A\s*LOAD/i ) { | ||
118 | 2518 | my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; | ||
119 | 2519 | $tbl ||= ''; | ||
120 | 2520 | $tbl =~ s/`//g; | ||
121 | 2521 | return "LOAD DATA $tbl"; | ||
122 | 2522 | } | ||
123 | 2523 | |||
124 | 2516 | if ( $query =~ m/\Aadministrator command:/ ) { | 2524 | if ( $query =~ m/\Aadministrator command:/ ) { |
125 | 2517 | $query =~ s/administrator command:/ADMIN/; | 2525 | $query =~ s/administrator command:/ADMIN/; |
126 | 2518 | $query = uc $query; | 2526 | $query = uc $query; |
127 | @@ -2525,7 +2533,7 @@ | |||
128 | 2525 | PTDEBUG && _d($query); | 2533 | PTDEBUG && _d($query); |
129 | 2526 | 2534 | ||
130 | 2527 | $query = uc $query; | 2535 | $query = uc $query; |
132 | 2528 | $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; | 2536 | $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
133 | 2529 | $query =~ s/\s+COUNT[^)]+\)//g; | 2537 | $query =~ s/\s+COUNT[^)]+\)//g; |
134 | 2530 | 2538 | ||
135 | 2531 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; | 2539 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
136 | @@ -2540,6 +2548,7 @@ | |||
137 | 2540 | eval $QueryParser::tbl_ident; | 2548 | eval $QueryParser::tbl_ident; |
138 | 2541 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; | 2549 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
139 | 2542 | if ( $dds) { | 2550 | if ( $dds) { |
140 | 2551 | $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; | ||
141 | 2543 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; | 2552 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
142 | 2544 | $obj = uc $obj if $obj; | 2553 | $obj = uc $obj if $obj; |
143 | 2545 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); | 2554 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
144 | @@ -2606,6 +2615,9 @@ | |||
145 | 2606 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; | 2615 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
146 | 2607 | $query = $verbs; | 2616 | $query = $verbs; |
147 | 2608 | } | 2617 | } |
148 | 2618 | elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { | ||
149 | 2619 | return $verbs; | ||
150 | 2620 | } | ||
151 | 2609 | else { | 2621 | else { |
152 | 2610 | my @tables = $self->__distill_tables($query, $table, %args); | 2622 | my @tables = $self->__distill_tables($query, $table, %args); |
153 | 2611 | $query = join(q{ }, $verbs, @tables); | 2623 | $query = join(q{ }, $verbs, @tables); |
154 | 2612 | 2624 | ||
155 | === modified file 'bin/pt-kill' | |||
156 | --- bin/pt-kill 2014-02-20 08:10:16 +0000 | |||
157 | +++ bin/pt-kill 2014-05-25 01:57:50 +0000 | |||
158 | @@ -4652,8 +4652,8 @@ | |||
159 | 4652 | 4652 | ||
160 | 4653 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments | 4653 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
161 | 4654 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ | 4654 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
164 | 4655 | my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ | 4655 | my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
165 | 4656 | my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW | 4656 | my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
166 | 4657 | 4657 | ||
167 | 4658 | 4658 | ||
168 | 4659 | sub new { | 4659 | sub new { |
169 | @@ -4668,7 +4668,8 @@ | |||
170 | 4668 | $query =~ s/$mlc_re//go; | 4668 | $query =~ s/$mlc_re//go; |
171 | 4669 | $query =~ s/$olc_re//go; | 4669 | $query =~ s/$olc_re//go; |
172 | 4670 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version | 4670 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
174 | 4671 | $query =~ s/$vlc_re//go; | 4671 | my $qualifier = $1 || ''; |
175 | 4672 | $query =~ s/$vlc_re/$qualifier/go; | ||
176 | 4672 | } | 4673 | } |
177 | 4673 | return $query; | 4674 | return $query; |
178 | 4674 | } | 4675 | } |
179 | @@ -4730,8 +4731,8 @@ | |||
180 | 4730 | && return $query; | 4731 | && return $query; |
181 | 4731 | $query =~ m/\A\s*(call\s+\S+)\(/i | 4732 | $query =~ m/\A\s*(call\s+\S+)\(/i |
182 | 4732 | && return lc($1); # Warning! $1 used, be careful. | 4733 | && return lc($1); # Warning! $1 used, be careful. |
185 | 4733 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { | 4734 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
186 | 4734 | $query = $beginning; # Shorten multi-value INSERT statements ASAP | 4735 | $query = $beginning; # Shorten multi-value INSERT statements ASAP |
187 | 4735 | } | 4736 | } |
188 | 4736 | 4737 | ||
189 | 4737 | $query =~ s/$mlc_re//go; | 4738 | $query =~ s/$mlc_re//go; |
190 | @@ -4791,6 +4792,13 @@ | |||
191 | 4791 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; | 4792 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
192 | 4792 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; | 4793 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
193 | 4793 | 4794 | ||
194 | 4795 | if ( $query =~ m/\A\s*LOAD/i ) { | ||
195 | 4796 | my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; | ||
196 | 4797 | $tbl ||= ''; | ||
197 | 4798 | $tbl =~ s/`//g; | ||
198 | 4799 | return "LOAD DATA $tbl"; | ||
199 | 4800 | } | ||
200 | 4801 | |||
201 | 4794 | if ( $query =~ m/\Aadministrator command:/ ) { | 4802 | if ( $query =~ m/\Aadministrator command:/ ) { |
202 | 4795 | $query =~ s/administrator command:/ADMIN/; | 4803 | $query =~ s/administrator command:/ADMIN/; |
203 | 4796 | $query = uc $query; | 4804 | $query = uc $query; |
204 | @@ -4803,7 +4811,7 @@ | |||
205 | 4803 | PTDEBUG && _d($query); | 4811 | PTDEBUG && _d($query); |
206 | 4804 | 4812 | ||
207 | 4805 | $query = uc $query; | 4813 | $query = uc $query; |
209 | 4806 | $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; | 4814 | $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
210 | 4807 | $query =~ s/\s+COUNT[^)]+\)//g; | 4815 | $query =~ s/\s+COUNT[^)]+\)//g; |
211 | 4808 | 4816 | ||
212 | 4809 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; | 4817 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
213 | @@ -4818,6 +4826,7 @@ | |||
214 | 4818 | eval $QueryParser::tbl_ident; | 4826 | eval $QueryParser::tbl_ident; |
215 | 4819 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; | 4827 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
216 | 4820 | if ( $dds) { | 4828 | if ( $dds) { |
217 | 4829 | $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; | ||
218 | 4821 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; | 4830 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
219 | 4822 | $obj = uc $obj if $obj; | 4831 | $obj = uc $obj if $obj; |
220 | 4823 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); | 4832 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
221 | @@ -4884,6 +4893,9 @@ | |||
222 | 4884 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; | 4893 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
223 | 4885 | $query = $verbs; | 4894 | $query = $verbs; |
224 | 4886 | } | 4895 | } |
225 | 4896 | elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { | ||
226 | 4897 | return $verbs; | ||
227 | 4898 | } | ||
228 | 4887 | else { | 4899 | else { |
229 | 4888 | my @tables = $self->__distill_tables($query, $table, %args); | 4900 | my @tables = $self->__distill_tables($query, $table, %args); |
230 | 4889 | $query = join(q{ }, $verbs, @tables); | 4901 | $query = join(q{ }, $verbs, @tables); |
231 | 4890 | 4902 | ||
232 | === modified file 'bin/pt-query-digest' | |||
233 | --- bin/pt-query-digest 2014-02-20 08:10:16 +0000 | |||
234 | +++ bin/pt-query-digest 2014-05-25 01:57:50 +0000 | |||
235 | @@ -2878,8 +2878,8 @@ | |||
236 | 2878 | && return $query; | 2878 | && return $query; |
237 | 2879 | $query =~ m/\A\s*(call\s+\S+)\(/i | 2879 | $query =~ m/\A\s*(call\s+\S+)\(/i |
238 | 2880 | && return lc($1); # Warning! $1 used, be careful. | 2880 | && return lc($1); # Warning! $1 used, be careful. |
241 | 2881 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { | 2881 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
242 | 2882 | $query = $beginning; # Shorten multi-value INSERT statements ASAP | 2882 | $query = $beginning; # Shorten multi-value INSERT statements ASAP |
243 | 2883 | } | 2883 | } |
244 | 2884 | 2884 | ||
245 | 2885 | $query =~ s/$mlc_re//go; | 2885 | $query =~ s/$mlc_re//go; |
246 | 2886 | 2886 | ||
247 | === modified file 'bin/pt-table-usage' | |||
248 | --- bin/pt-table-usage 2014-02-20 08:10:16 +0000 | |||
249 | +++ bin/pt-table-usage 2014-05-25 01:57:50 +0000 | |||
250 | @@ -2112,8 +2112,8 @@ | |||
251 | 2112 | 2112 | ||
252 | 2113 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments | 2113 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
253 | 2114 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ | 2114 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
256 | 2115 | my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ | 2115 | my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
257 | 2116 | my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW | 2116 | my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
258 | 2117 | 2117 | ||
259 | 2118 | 2118 | ||
260 | 2119 | sub new { | 2119 | sub new { |
261 | @@ -2128,7 +2128,8 @@ | |||
262 | 2128 | $query =~ s/$mlc_re//go; | 2128 | $query =~ s/$mlc_re//go; |
263 | 2129 | $query =~ s/$olc_re//go; | 2129 | $query =~ s/$olc_re//go; |
264 | 2130 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version | 2130 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
266 | 2131 | $query =~ s/$vlc_re//go; | 2131 | my $qualifier = $1 || ''; |
267 | 2132 | $query =~ s/$vlc_re/$qualifier/go; | ||
268 | 2132 | } | 2133 | } |
269 | 2133 | return $query; | 2134 | return $query; |
270 | 2134 | } | 2135 | } |
271 | @@ -2190,8 +2191,8 @@ | |||
272 | 2190 | && return $query; | 2191 | && return $query; |
273 | 2191 | $query =~ m/\A\s*(call\s+\S+)\(/i | 2192 | $query =~ m/\A\s*(call\s+\S+)\(/i |
274 | 2192 | && return lc($1); # Warning! $1 used, be careful. | 2193 | && return lc($1); # Warning! $1 used, be careful. |
277 | 2193 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { | 2194 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
278 | 2194 | $query = $beginning; # Shorten multi-value INSERT statements ASAP | 2195 | $query = $beginning; # Shorten multi-value INSERT statements ASAP |
279 | 2195 | } | 2196 | } |
280 | 2196 | 2197 | ||
281 | 2197 | $query =~ s/$mlc_re//go; | 2198 | $query =~ s/$mlc_re//go; |
282 | @@ -2251,6 +2252,13 @@ | |||
283 | 2251 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; | 2252 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
284 | 2252 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; | 2253 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
285 | 2253 | 2254 | ||
286 | 2255 | if ( $query =~ m/\A\s*LOAD/i ) { | ||
287 | 2256 | my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; | ||
288 | 2257 | $tbl ||= ''; | ||
289 | 2258 | $tbl =~ s/`//g; | ||
290 | 2259 | return "LOAD DATA $tbl"; | ||
291 | 2260 | } | ||
292 | 2261 | |||
293 | 2254 | if ( $query =~ m/\Aadministrator command:/ ) { | 2262 | if ( $query =~ m/\Aadministrator command:/ ) { |
294 | 2255 | $query =~ s/administrator command:/ADMIN/; | 2263 | $query =~ s/administrator command:/ADMIN/; |
295 | 2256 | $query = uc $query; | 2264 | $query = uc $query; |
296 | @@ -2263,7 +2271,7 @@ | |||
297 | 2263 | PTDEBUG && _d($query); | 2271 | PTDEBUG && _d($query); |
298 | 2264 | 2272 | ||
299 | 2265 | $query = uc $query; | 2273 | $query = uc $query; |
301 | 2266 | $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; | 2274 | $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
302 | 2267 | $query =~ s/\s+COUNT[^)]+\)//g; | 2275 | $query =~ s/\s+COUNT[^)]+\)//g; |
303 | 2268 | 2276 | ||
304 | 2269 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; | 2277 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
305 | @@ -2278,6 +2286,7 @@ | |||
306 | 2278 | eval $QueryParser::tbl_ident; | 2286 | eval $QueryParser::tbl_ident; |
307 | 2279 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; | 2287 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
308 | 2280 | if ( $dds) { | 2288 | if ( $dds) { |
309 | 2289 | $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; | ||
310 | 2281 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; | 2290 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
311 | 2282 | $obj = uc $obj if $obj; | 2291 | $obj = uc $obj if $obj; |
312 | 2283 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); | 2292 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
313 | @@ -2344,6 +2353,9 @@ | |||
314 | 2344 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; | 2353 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
315 | 2345 | $query = $verbs; | 2354 | $query = $verbs; |
316 | 2346 | } | 2355 | } |
317 | 2356 | elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { | ||
318 | 2357 | return $verbs; | ||
319 | 2358 | } | ||
320 | 2347 | else { | 2359 | else { |
321 | 2348 | my @tables = $self->__distill_tables($query, $table, %args); | 2360 | my @tables = $self->__distill_tables($query, $table, %args); |
322 | 2349 | $query = join(q{ }, $verbs, @tables); | 2361 | $query = join(q{ }, $verbs, @tables); |
323 | 2350 | 2362 | ||
324 | === modified file 'bin/pt-upgrade' | |||
325 | --- bin/pt-upgrade 2014-02-20 08:10:16 +0000 | |||
326 | +++ bin/pt-upgrade 2014-05-25 01:57:50 +0000 | |||
327 | @@ -4559,8 +4559,8 @@ | |||
328 | 4559 | 4559 | ||
329 | 4560 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments | 4560 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
330 | 4561 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ | 4561 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
333 | 4562 | my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ | 4562 | my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
334 | 4563 | my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW | 4563 | my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
335 | 4564 | 4564 | ||
336 | 4565 | 4565 | ||
337 | 4566 | sub new { | 4566 | sub new { |
338 | @@ -4575,7 +4575,8 @@ | |||
339 | 4575 | $query =~ s/$mlc_re//go; | 4575 | $query =~ s/$mlc_re//go; |
340 | 4576 | $query =~ s/$olc_re//go; | 4576 | $query =~ s/$olc_re//go; |
341 | 4577 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version | 4577 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
343 | 4578 | $query =~ s/$vlc_re//go; | 4578 | my $qualifier = $1 || ''; |
344 | 4579 | $query =~ s/$vlc_re/$qualifier/go; | ||
345 | 4579 | } | 4580 | } |
346 | 4580 | return $query; | 4581 | return $query; |
347 | 4581 | } | 4582 | } |
348 | @@ -4637,8 +4638,8 @@ | |||
349 | 4637 | && return $query; | 4638 | && return $query; |
350 | 4638 | $query =~ m/\A\s*(call\s+\S+)\(/i | 4639 | $query =~ m/\A\s*(call\s+\S+)\(/i |
351 | 4639 | && return lc($1); # Warning! $1 used, be careful. | 4640 | && return lc($1); # Warning! $1 used, be careful. |
354 | 4640 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { | 4641 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
355 | 4641 | $query = $beginning; # Shorten multi-value INSERT statements ASAP | 4642 | $query = $beginning; # Shorten multi-value INSERT statements ASAP |
356 | 4642 | } | 4643 | } |
357 | 4643 | 4644 | ||
358 | 4644 | $query =~ s/$mlc_re//go; | 4645 | $query =~ s/$mlc_re//go; |
359 | @@ -4698,6 +4699,13 @@ | |||
360 | 4698 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; | 4699 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
361 | 4699 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; | 4700 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
362 | 4700 | 4701 | ||
363 | 4702 | if ( $query =~ m/\A\s*LOAD/i ) { | ||
364 | 4703 | my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; | ||
365 | 4704 | $tbl ||= ''; | ||
366 | 4705 | $tbl =~ s/`//g; | ||
367 | 4706 | return "LOAD DATA $tbl"; | ||
368 | 4707 | } | ||
369 | 4708 | |||
370 | 4701 | if ( $query =~ m/\Aadministrator command:/ ) { | 4709 | if ( $query =~ m/\Aadministrator command:/ ) { |
371 | 4702 | $query =~ s/administrator command:/ADMIN/; | 4710 | $query =~ s/administrator command:/ADMIN/; |
372 | 4703 | $query = uc $query; | 4711 | $query = uc $query; |
373 | @@ -4710,7 +4718,7 @@ | |||
374 | 4710 | PTDEBUG && _d($query); | 4718 | PTDEBUG && _d($query); |
375 | 4711 | 4719 | ||
376 | 4712 | $query = uc $query; | 4720 | $query = uc $query; |
378 | 4713 | $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; | 4721 | $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
379 | 4714 | $query =~ s/\s+COUNT[^)]+\)//g; | 4722 | $query =~ s/\s+COUNT[^)]+\)//g; |
380 | 4715 | 4723 | ||
381 | 4716 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; | 4724 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
382 | @@ -4725,6 +4733,7 @@ | |||
383 | 4725 | eval $QueryParser::tbl_ident; | 4733 | eval $QueryParser::tbl_ident; |
384 | 4726 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; | 4734 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
385 | 4727 | if ( $dds) { | 4735 | if ( $dds) { |
386 | 4736 | $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; | ||
387 | 4728 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; | 4737 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
388 | 4729 | $obj = uc $obj if $obj; | 4738 | $obj = uc $obj if $obj; |
389 | 4730 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); | 4739 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
390 | @@ -4791,6 +4800,9 @@ | |||
391 | 4791 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; | 4800 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
392 | 4792 | $query = $verbs; | 4801 | $query = $verbs; |
393 | 4793 | } | 4802 | } |
394 | 4803 | elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { | ||
395 | 4804 | return $verbs; | ||
396 | 4805 | } | ||
397 | 4794 | else { | 4806 | else { |
398 | 4795 | my @tables = $self->__distill_tables($query, $table, %args); | 4807 | my @tables = $self->__distill_tables($query, $table, %args); |
399 | 4796 | $query = join(q{ }, $verbs, @tables); | 4808 | $query = join(q{ }, $verbs, @tables); |
400 | 4797 | 4809 | ||
401 | === modified file 'lib/QueryRewriter.pm' | |||
402 | --- lib/QueryRewriter.pm 2013-09-27 02:16:19 +0000 | |||
403 | +++ lib/QueryRewriter.pm 2014-05-25 01:57:50 +0000 | |||
404 | @@ -164,8 +164,8 @@ | |||
405 | 164 | # mysqldump's INSERT statements will have long values() lists, don't waste | 164 | # mysqldump's INSERT statements will have long values() lists, don't waste |
406 | 165 | # time on them... they also tend to segfault Perl on some machines when you | 165 | # time on them... they also tend to segfault Perl on some machines when you |
407 | 166 | # get to the "# Collapse IN() and VALUES() lists" regex below! | 166 | # get to the "# Collapse IN() and VALUES() lists" regex below! |
410 | 167 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { | 167 | if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
411 | 168 | $query = $beginning; # Shorten multi-value INSERT statements ASAP | 168 | $query = $beginning; # Shorten multi-value INSERT statements ASAP |
412 | 169 | } | 169 | } |
413 | 170 | 170 | ||
414 | 171 | $query =~ s/$mlc_re//go; | 171 | $query =~ s/$mlc_re//go; |
415 | 172 | 172 | ||
416 | === modified file 't/lib/QueryRewriter.t' | |||
417 | --- t/lib/QueryRewriter.t 2013-09-27 02:16:19 +0000 | |||
418 | +++ t/lib/QueryRewriter.t 2014-05-25 01:57:50 +0000 | |||
419 | @@ -285,6 +285,15 @@ | |||
420 | 285 | 'union all fingerprints together', | 285 | 'union all fingerprints together', |
421 | 286 | ); | 286 | ); |
422 | 287 | 287 | ||
423 | 288 | # Issue 1289516 | ||
424 | 289 | is( | ||
425 | 290 | $qr->fingerprint(q[insert into foo values(1,"(2),(3)",4,"(5),(6)")]), | ||
426 | 291 | 'insert into foo values(?+)', | ||
427 | 292 | 'not confused by parentheses inside a quoted string', | ||
428 | 293 | ); | ||
429 | 294 | |||
430 | 295 | |||
431 | 296 | |||
432 | 288 | # Issue 322: mk-query-digest segfault before report | 297 | # Issue 322: mk-query-digest segfault before report |
433 | 289 | is( | 298 | is( |
434 | 290 | $qr->fingerprint( load_file('t/lib/samples/huge_replace_into_values.txt') ), | 299 | $qr->fingerprint( load_file('t/lib/samples/huge_replace_into_values.txt') ), |
The test does not seem to address the issue originally reported:
$ pt-fingerprint --query "REPLACE INTO a VALUES('INSERT INTO foo VALUES (1),(2)')"
replace into a values('insert into foo values(?+)
That query is going to be very difficult to fingerprinted, perhaps not even possible because fingerprinting is currently heuristic, pattern-based.