Merge lp:~percona-toolkit-dev/percona-toolkit/pt-fingerprint-broken-fingerprinting-of-specific-INSERT-REPLACE-queries-1289516 into lp:percona-toolkit/2.2
Status: | Rejected |
---|---|
Rejected by: | Frank Cizmich |
Proposed branch: | lp:~percona-toolkit-dev/percona-toolkit/pt-fingerprint-broken-fingerprinting-of-specific-INSERT-REPLACE-queries-1289516 |
Merge into: | lp:percona-toolkit/2.2 |
Diff against target: |
434 lines (+103/-34) 8 files modified
bin/pt-fingerprint (+18/-6) bin/pt-index-usage (+18/-6) bin/pt-kill (+18/-6) bin/pt-query-digest (+2/-2) bin/pt-table-usage (+18/-6) bin/pt-upgrade (+18/-6) lib/QueryRewriter.pm (+2/-2) t/lib/QueryRewriter.t (+9/-0) |
To merge this branch: | bzr merge lp:~percona-toolkit-dev/percona-toolkit/pt-fingerprint-broken-fingerprinting-of-specific-INSERT-REPLACE-queries-1289516 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
David Bennett (community) | Needs Fixing | ||
Daniel Nichter | Pending | ||
Review via email: mp+220869@code.launchpad.net |
Commit message
Description of the change
Changed QueryRewrite's function fingerprint so it isn't confused by string values with parentheses inside.
eg:
insert into foo values(
Daniel Nichter (daniel-nichter) wrote : | # |
Frank Cizmich (frank-cizmich) wrote : | # |
Agreed. The fix only addresses a related subset of the original issue. A
perfect fix might be impossible with regexp, or at least quite complex
and costly.
On 05/27/2014 10:37 PM, Daniel Nichter wrote:
> The test does not seem to address the issue originally reported:
>
> $ pt-fingerprint --query "REPLACE INTO a VALUES('INSERT INTO foo VALUES (1),(2)')"
>
> replace into a values('insert into foo values(?+)
>
> That query is going to be very difficult to fingerprinted, perhaps not even possible because fingerprinting is currently heuristic, pattern-based.
--
Frank Cizmich, Software Engineer, Percona
Tel: +1-888-401-3401 Ext: 630
Skype: percona.fcizmich
Montevideo, Uruguay (UTC -3)
www.percona.com
www.mysqlperfor
David Bennett (dbpercona) wrote : | # |
Size of duplicate full text indexes is reported as 0.
CREATE TABLE `ai5` (
`id` int(11) NOT NULL,
`c1` int(11) DEFAULT NULL,
`c2` int(11) DEFAULT NULL,
`c3` varchar(64) DEFAULT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `c3` (`c3`),
FULLTEXT KEY `c3_2` (`c3`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
$ ./pt-duplicate-
# #######
# test.ai5
# #######
# c3_2 (`c3`)
# PRIMARY (`id`)
# c3 (`c3`)
# c3_2 is a duplicate of c3
# Key definitions:
# FULLTEXT KEY `c3_2` (`c3`)
# FULLTEXT KEY `c3` (`c3`),
# Column types:
# `c3` varchar(64) default null
# To remove this duplicate index, execute:
ALTER TABLE `test`.`ai5` DROP INDEX `c3_2`;
# #######
# Summary of indexes
# #######
# Size Duplicate Indexes 0
# Total Duplicate Indexes 1
# Total Indexes 3
David Bennett (dbpercona) wrote : | # |
Comment 630864 made on wrong merge proposal, disregard.
Unmerged revisions
- 597. By Frank Cizmich
-
fixes query digest for inserts with values with parentheses inside - issue 1289516
Preview Diff
1 | === modified file 'bin/pt-fingerprint' |
2 | --- bin/pt-fingerprint 2014-02-20 08:10:16 +0000 |
3 | +++ bin/pt-fingerprint 2014-05-25 01:57:50 +0000 |
4 | @@ -1529,8 +1529,8 @@ |
5 | |
6 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
7 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
8 | -my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ |
9 | -my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW |
10 | +my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
11 | +my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
12 | |
13 | |
14 | sub new { |
15 | @@ -1545,7 +1545,8 @@ |
16 | $query =~ s/$mlc_re//go; |
17 | $query =~ s/$olc_re//go; |
18 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
19 | - $query =~ s/$vlc_re//go; |
20 | + my $qualifier = $1 || ''; |
21 | + $query =~ s/$vlc_re/$qualifier/go; |
22 | } |
23 | return $query; |
24 | } |
25 | @@ -1607,8 +1608,8 @@ |
26 | && return $query; |
27 | $query =~ m/\A\s*(call\s+\S+)\(/i |
28 | && return lc($1); # Warning! $1 used, be careful. |
29 | - if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { |
30 | - $query = $beginning; # Shorten multi-value INSERT statements ASAP |
31 | + if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
32 | + $query = $beginning; # Shorten multi-value INSERT statements ASAP |
33 | } |
34 | |
35 | $query =~ s/$mlc_re//go; |
36 | @@ -1668,6 +1669,13 @@ |
37 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
38 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
39 | |
40 | + if ( $query =~ m/\A\s*LOAD/i ) { |
41 | + my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; |
42 | + $tbl ||= ''; |
43 | + $tbl =~ s/`//g; |
44 | + return "LOAD DATA $tbl"; |
45 | + } |
46 | + |
47 | if ( $query =~ m/\Aadministrator command:/ ) { |
48 | $query =~ s/administrator command:/ADMIN/; |
49 | $query = uc $query; |
50 | @@ -1680,7 +1688,7 @@ |
51 | PTDEBUG && _d($query); |
52 | |
53 | $query = uc $query; |
54 | - $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
55 | + $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
56 | $query =~ s/\s+COUNT[^)]+\)//g; |
57 | |
58 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
59 | @@ -1695,6 +1703,7 @@ |
60 | eval $QueryParser::tbl_ident; |
61 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
62 | if ( $dds) { |
63 | + $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; |
64 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
65 | $obj = uc $obj if $obj; |
66 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
67 | @@ -1761,6 +1770,9 @@ |
68 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
69 | $query = $verbs; |
70 | } |
71 | + elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { |
72 | + return $verbs; |
73 | + } |
74 | else { |
75 | my @tables = $self->__distill_tables($query, $table, %args); |
76 | $query = join(q{ }, $verbs, @tables); |
77 | |
78 | === modified file 'bin/pt-index-usage' |
79 | --- bin/pt-index-usage 2014-02-20 08:10:16 +0000 |
80 | +++ bin/pt-index-usage 2014-05-25 01:57:50 +0000 |
81 | @@ -2374,8 +2374,8 @@ |
82 | |
83 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
84 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
85 | -my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ |
86 | -my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW |
87 | +my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
88 | +my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
89 | |
90 | |
91 | sub new { |
92 | @@ -2390,7 +2390,8 @@ |
93 | $query =~ s/$mlc_re//go; |
94 | $query =~ s/$olc_re//go; |
95 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
96 | - $query =~ s/$vlc_re//go; |
97 | + my $qualifier = $1 || ''; |
98 | + $query =~ s/$vlc_re/$qualifier/go; |
99 | } |
100 | return $query; |
101 | } |
102 | @@ -2452,8 +2453,8 @@ |
103 | && return $query; |
104 | $query =~ m/\A\s*(call\s+\S+)\(/i |
105 | && return lc($1); # Warning! $1 used, be careful. |
106 | - if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { |
107 | - $query = $beginning; # Shorten multi-value INSERT statements ASAP |
108 | + if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
109 | + $query = $beginning; # Shorten multi-value INSERT statements ASAP |
110 | } |
111 | |
112 | $query =~ s/$mlc_re//go; |
113 | @@ -2513,6 +2514,13 @@ |
114 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
115 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
116 | |
117 | + if ( $query =~ m/\A\s*LOAD/i ) { |
118 | + my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; |
119 | + $tbl ||= ''; |
120 | + $tbl =~ s/`//g; |
121 | + return "LOAD DATA $tbl"; |
122 | + } |
123 | + |
124 | if ( $query =~ m/\Aadministrator command:/ ) { |
125 | $query =~ s/administrator command:/ADMIN/; |
126 | $query = uc $query; |
127 | @@ -2525,7 +2533,7 @@ |
128 | PTDEBUG && _d($query); |
129 | |
130 | $query = uc $query; |
131 | - $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
132 | + $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
133 | $query =~ s/\s+COUNT[^)]+\)//g; |
134 | |
135 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
136 | @@ -2540,6 +2548,7 @@ |
137 | eval $QueryParser::tbl_ident; |
138 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
139 | if ( $dds) { |
140 | + $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; |
141 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
142 | $obj = uc $obj if $obj; |
143 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
144 | @@ -2606,6 +2615,9 @@ |
145 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
146 | $query = $verbs; |
147 | } |
148 | + elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { |
149 | + return $verbs; |
150 | + } |
151 | else { |
152 | my @tables = $self->__distill_tables($query, $table, %args); |
153 | $query = join(q{ }, $verbs, @tables); |
154 | |
155 | === modified file 'bin/pt-kill' |
156 | --- bin/pt-kill 2014-02-20 08:10:16 +0000 |
157 | +++ bin/pt-kill 2014-05-25 01:57:50 +0000 |
158 | @@ -4652,8 +4652,8 @@ |
159 | |
160 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
161 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
162 | -my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ |
163 | -my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW |
164 | +my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
165 | +my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
166 | |
167 | |
168 | sub new { |
169 | @@ -4668,7 +4668,8 @@ |
170 | $query =~ s/$mlc_re//go; |
171 | $query =~ s/$olc_re//go; |
172 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
173 | - $query =~ s/$vlc_re//go; |
174 | + my $qualifier = $1 || ''; |
175 | + $query =~ s/$vlc_re/$qualifier/go; |
176 | } |
177 | return $query; |
178 | } |
179 | @@ -4730,8 +4731,8 @@ |
180 | && return $query; |
181 | $query =~ m/\A\s*(call\s+\S+)\(/i |
182 | && return lc($1); # Warning! $1 used, be careful. |
183 | - if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { |
184 | - $query = $beginning; # Shorten multi-value INSERT statements ASAP |
185 | + if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
186 | + $query = $beginning; # Shorten multi-value INSERT statements ASAP |
187 | } |
188 | |
189 | $query =~ s/$mlc_re//go; |
190 | @@ -4791,6 +4792,13 @@ |
191 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
192 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
193 | |
194 | + if ( $query =~ m/\A\s*LOAD/i ) { |
195 | + my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; |
196 | + $tbl ||= ''; |
197 | + $tbl =~ s/`//g; |
198 | + return "LOAD DATA $tbl"; |
199 | + } |
200 | + |
201 | if ( $query =~ m/\Aadministrator command:/ ) { |
202 | $query =~ s/administrator command:/ADMIN/; |
203 | $query = uc $query; |
204 | @@ -4803,7 +4811,7 @@ |
205 | PTDEBUG && _d($query); |
206 | |
207 | $query = uc $query; |
208 | - $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
209 | + $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
210 | $query =~ s/\s+COUNT[^)]+\)//g; |
211 | |
212 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
213 | @@ -4818,6 +4826,7 @@ |
214 | eval $QueryParser::tbl_ident; |
215 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
216 | if ( $dds) { |
217 | + $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; |
218 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
219 | $obj = uc $obj if $obj; |
220 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
221 | @@ -4884,6 +4893,9 @@ |
222 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
223 | $query = $verbs; |
224 | } |
225 | + elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { |
226 | + return $verbs; |
227 | + } |
228 | else { |
229 | my @tables = $self->__distill_tables($query, $table, %args); |
230 | $query = join(q{ }, $verbs, @tables); |
231 | |
232 | === modified file 'bin/pt-query-digest' |
233 | --- bin/pt-query-digest 2014-02-20 08:10:16 +0000 |
234 | +++ bin/pt-query-digest 2014-05-25 01:57:50 +0000 |
235 | @@ -2878,8 +2878,8 @@ |
236 | && return $query; |
237 | $query =~ m/\A\s*(call\s+\S+)\(/i |
238 | && return lc($1); # Warning! $1 used, be careful. |
239 | - if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { |
240 | - $query = $beginning; # Shorten multi-value INSERT statements ASAP |
241 | + if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
242 | + $query = $beginning; # Shorten multi-value INSERT statements ASAP |
243 | } |
244 | |
245 | $query =~ s/$mlc_re//go; |
246 | |
247 | === modified file 'bin/pt-table-usage' |
248 | --- bin/pt-table-usage 2014-02-20 08:10:16 +0000 |
249 | +++ bin/pt-table-usage 2014-05-25 01:57:50 +0000 |
250 | @@ -2112,8 +2112,8 @@ |
251 | |
252 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
253 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
254 | -my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ |
255 | -my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW |
256 | +my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
257 | +my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
258 | |
259 | |
260 | sub new { |
261 | @@ -2128,7 +2128,8 @@ |
262 | $query =~ s/$mlc_re//go; |
263 | $query =~ s/$olc_re//go; |
264 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
265 | - $query =~ s/$vlc_re//go; |
266 | + my $qualifier = $1 || ''; |
267 | + $query =~ s/$vlc_re/$qualifier/go; |
268 | } |
269 | return $query; |
270 | } |
271 | @@ -2190,8 +2191,8 @@ |
272 | && return $query; |
273 | $query =~ m/\A\s*(call\s+\S+)\(/i |
274 | && return lc($1); # Warning! $1 used, be careful. |
275 | - if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { |
276 | - $query = $beginning; # Shorten multi-value INSERT statements ASAP |
277 | + if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
278 | + $query = $beginning; # Shorten multi-value INSERT statements ASAP |
279 | } |
280 | |
281 | $query =~ s/$mlc_re//go; |
282 | @@ -2251,6 +2252,13 @@ |
283 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
284 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
285 | |
286 | + if ( $query =~ m/\A\s*LOAD/i ) { |
287 | + my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; |
288 | + $tbl ||= ''; |
289 | + $tbl =~ s/`//g; |
290 | + return "LOAD DATA $tbl"; |
291 | + } |
292 | + |
293 | if ( $query =~ m/\Aadministrator command:/ ) { |
294 | $query =~ s/administrator command:/ADMIN/; |
295 | $query = uc $query; |
296 | @@ -2263,7 +2271,7 @@ |
297 | PTDEBUG && _d($query); |
298 | |
299 | $query = uc $query; |
300 | - $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
301 | + $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
302 | $query =~ s/\s+COUNT[^)]+\)//g; |
303 | |
304 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
305 | @@ -2278,6 +2286,7 @@ |
306 | eval $QueryParser::tbl_ident; |
307 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
308 | if ( $dds) { |
309 | + $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; |
310 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
311 | $obj = uc $obj if $obj; |
312 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
313 | @@ -2344,6 +2353,9 @@ |
314 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
315 | $query = $verbs; |
316 | } |
317 | + elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { |
318 | + return $verbs; |
319 | + } |
320 | else { |
321 | my @tables = $self->__distill_tables($query, $table, %args); |
322 | $query = join(q{ }, $verbs, @tables); |
323 | |
324 | === modified file 'bin/pt-upgrade' |
325 | --- bin/pt-upgrade 2014-02-20 08:10:16 +0000 |
326 | +++ bin/pt-upgrade 2014-05-25 01:57:50 +0000 |
327 | @@ -4559,8 +4559,8 @@ |
328 | |
329 | my $olc_re = qr/(?:--|#)[^'"\r\n]*(?=[\r\n]|\Z)/; # One-line comments |
330 | my $mlc_re = qr#/\*[^!].*?\*/#sm; # But not /*!version */ |
331 | -my $vlc_re = qr#/\*.*?[0-9+].*?\*/#sm; # For SHOW + /*!version */ |
332 | -my $vlc_rf = qr#^(SHOW).*?/\*![0-9+].*?\*/#sm; # Variation for SHOW |
333 | +my $vlc_re = qr#/\*.*?[0-9]+.*?\*/#sm; # For SHOW + /*!version */ |
334 | +my $vlc_rf = qr#^(?:SHOW).*?/\*![0-9]+(.*?)\*/#sm; # Variation for SHOW |
335 | |
336 | |
337 | sub new { |
338 | @@ -4575,7 +4575,8 @@ |
339 | $query =~ s/$mlc_re//go; |
340 | $query =~ s/$olc_re//go; |
341 | if ( $query =~ m/$vlc_rf/i ) { # contains show + version |
342 | - $query =~ s/$vlc_re//go; |
343 | + my $qualifier = $1 || ''; |
344 | + $query =~ s/$vlc_re/$qualifier/go; |
345 | } |
346 | return $query; |
347 | } |
348 | @@ -4637,8 +4638,8 @@ |
349 | && return $query; |
350 | $query =~ m/\A\s*(call\s+\S+)\(/i |
351 | && return lc($1); # Warning! $1 used, be careful. |
352 | - if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { |
353 | - $query = $beginning; # Shorten multi-value INSERT statements ASAP |
354 | + if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
355 | + $query = $beginning; # Shorten multi-value INSERT statements ASAP |
356 | } |
357 | |
358 | $query =~ s/$mlc_re//go; |
359 | @@ -4698,6 +4699,13 @@ |
360 | $query =~ m/\A\s*UNLOCK TABLES/i && return "UNLOCK"; |
361 | $query =~ m/\A\s*xa\s+(\S+)/i && return "XA_$1"; |
362 | |
363 | + if ( $query =~ m/\A\s*LOAD/i ) { |
364 | + my ($tbl) = $query =~ m/INTO TABLE\s+(\S+)/i; |
365 | + $tbl ||= ''; |
366 | + $tbl =~ s/`//g; |
367 | + return "LOAD DATA $tbl"; |
368 | + } |
369 | + |
370 | if ( $query =~ m/\Aadministrator command:/ ) { |
371 | $query =~ s/administrator command:/ADMIN/; |
372 | $query = uc $query; |
373 | @@ -4710,7 +4718,7 @@ |
374 | PTDEBUG && _d($query); |
375 | |
376 | $query = uc $query; |
377 | - $query =~ s/\s+(?:GLOBAL|SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
378 | + $query =~ s/\s+(?:SESSION|FULL|STORAGE|ENGINE)\b/ /g; |
379 | $query =~ s/\s+COUNT[^)]+\)//g; |
380 | |
381 | $query =~ s/\s+(?:FOR|FROM|LIKE|WHERE|LIMIT|IN)\b.+//ms; |
382 | @@ -4725,6 +4733,7 @@ |
383 | eval $QueryParser::tbl_ident; |
384 | my ( $dds ) = $query =~ /^\s*($QueryParser::data_def_stmts)\b/i; |
385 | if ( $dds) { |
386 | + $query =~ s/\s+IF(?:\s+NOT)?\s+EXISTS/ /i; |
387 | my ( $obj ) = $query =~ m/$dds.+(DATABASE|TABLE)\b/i; |
388 | $obj = uc $obj if $obj; |
389 | PTDEBUG && _d('Data def statment:', $dds, 'obj:', $obj); |
390 | @@ -4791,6 +4800,9 @@ |
391 | map { $verbs =~ s/$_/$alias_for{$_}/ } keys %alias_for; |
392 | $query = $verbs; |
393 | } |
394 | + elsif ( $verbs && $verbs =~ m/^LOAD DATA/ ) { |
395 | + return $verbs; |
396 | + } |
397 | else { |
398 | my @tables = $self->__distill_tables($query, $table, %args); |
399 | $query = join(q{ }, $verbs, @tables); |
400 | |
401 | === modified file 'lib/QueryRewriter.pm' |
402 | --- lib/QueryRewriter.pm 2013-09-27 02:16:19 +0000 |
403 | +++ lib/QueryRewriter.pm 2014-05-25 01:57:50 +0000 |
404 | @@ -164,8 +164,8 @@ |
405 | # mysqldump's INSERT statements will have long values() lists, don't waste |
406 | # time on them... they also tend to segfault Perl on some machines when you |
407 | # get to the "# Collapse IN() and VALUES() lists" regex below! |
408 | - if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\(.*?\))\s*,\s*\(/is ) { |
409 | - $query = $beginning; # Shorten multi-value INSERT statements ASAP |
410 | + if ( my ($beginning) = $query =~ m/\A((?:INSERT|REPLACE)(?: IGNORE)?\s+INTO.+?VALUES\s*\("[^"]*?"|[^"]+?\))\s*,\s*\(/is ) { |
411 | + $query = $beginning; # Shorten multi-value INSERT statements ASAP |
412 | } |
413 | |
414 | $query =~ s/$mlc_re//go; |
415 | |
416 | === modified file 't/lib/QueryRewriter.t' |
417 | --- t/lib/QueryRewriter.t 2013-09-27 02:16:19 +0000 |
418 | +++ t/lib/QueryRewriter.t 2014-05-25 01:57:50 +0000 |
419 | @@ -285,6 +285,15 @@ |
420 | 'union all fingerprints together', |
421 | ); |
422 | |
423 | +# Issue 1289516 |
424 | +is( |
425 | + $qr->fingerprint(q[insert into foo values(1,"(2),(3)",4,"(5),(6)")]), |
426 | + 'insert into foo values(?+)', |
427 | + 'not confused by parentheses inside a quoted string', |
428 | +); |
429 | + |
430 | + |
431 | + |
432 | # Issue 322: mk-query-digest segfault before report |
433 | is( |
434 | $qr->fingerprint( load_file('t/lib/samples/huge_replace_into_values.txt') ), |
The test does not seem to address the issue originally reported:
$ pt-fingerprint --query "REPLACE INTO a VALUES('INSERT INTO foo VALUES (1),(2)')"
replace into a values('insert into foo values(?+)
That query is going to be very difficult to fingerprinted, perhaps not even possible because fingerprinting is currently heuristic, pattern-based.