Merge lp:~percona-toolkit-dev/percona-toolkit/fix-917770-pt-config-diff-uninit-value-crash into lp:percona-toolkit/2.1
- fix-917770-pt-config-diff-uninit-value-crash
- Merge into 2.1
Status: | Merged |
---|---|
Approved by: | Brian Fraser |
Approved revision: | 469 |
Merged at revision: | 478 |
Proposed branch: | lp:~percona-toolkit-dev/percona-toolkit/fix-917770-pt-config-diff-uninit-value-crash |
Merge into: | lp:percona-toolkit/2.1 |
Diff against target: |
521 lines (+240/-152) 4 files modified
bin/pt-config-diff (+78/-66) lib/MySQLConfig.pm (+105/-85) t/lib/MySQLConfig.t (+26/-1) t/pt-config-diff/samples/bug_917770.cnf (+31/-0) |
To merge this branch: | bzr merge lp:~percona-toolkit-dev/percona-toolkit/fix-917770-pt-config-diff-uninit-value-crash |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Brian Fraser (community) | Approve | ||
Daniel Nichter | Approve | ||
Review via email: mp+137301@code.launchpad.net |
This proposal supersedes a proposal from 2012-11-26.
Commit message
Description of the change
Daniel Nichter (daniel-nichter) wrote : Posted in a previous version of this proposal | # |
Brian Fraser (fraserbn) wrote : | # |
New fix deals with the actual root cause, and adds two safety checks against undefs. Plus a test case : D
Daniel Nichter (daniel-nichter) wrote : | # |
64 + # We were passed an undef for the value, or we're dealing with an empty
65 + # line
66 + if ( !defined($item) ) {
67 + $item = '';
68 + }
69 if ( $item ) {
Are the new lines necessary? Line 69 will be false if $item is either undef or '', so setting it to '' doesn't seem to do anything except introduce the possibility of no-name items. I think this is better handled around
+ map { $_ =~ m/^--([
and its non-"--" variant in another call. Granted, those split/grep/map are a poor place: in the actual sub call. So perhaps it should be refactored into sub which can match for the item=val and if the item part is undef then warn about the failure to parse that line and skip it. Doing that, however, will mean that we also need to skip section headers like "[client]" in addition to comment and blank lines.
Maybe the refactored sub can take a pattern arg, m/^--([
- 469. By Brian Fraser
-
Refactor _parse_varvals.
Now it takes two arguments: A regexp and a string to match against.
_parse_varvals itself was split in three: _preprocess_varvals,
_parse_varvals, and _process_val.This also modifies the three places that call _parse_varvals; For
two, no real changes were needed, but parse_mysqld() required a fix
to deal with the two final lines of mysqld --help --verbose:To see what values a running MySQL server is using, type
'mysqladmin variables' instead of 'mysqld --verbose --help'.
Daniel Nichter (daniel-nichter) wrote : | # |
Looks good. The refactored code is much saner.
Brian Fraser (fraserbn) : | # |
Preview Diff
1 | === modified file 'bin/pt-config-diff' |
2 | --- bin/pt-config-diff 2012-11-19 18:47:13 +0000 |
3 | +++ bin/pt-config-diff 2012-12-04 08:13:22 +0000 |
4 | @@ -2145,15 +2145,16 @@ |
5 | PTDEBUG && _d("mysqld help output doesn't list option files"); |
6 | } |
7 | |
8 | - if ( $output !~ m/^-+ -+$/mg ) { |
9 | + if ( $output !~ m/^-+ -+$(.+?)(?:\n\n.+)?\z/sm ) { |
10 | PTDEBUG && _d("mysqld help output doesn't list vars and vals"); |
11 | return; |
12 | } |
13 | |
14 | - my $varvals = substr($output, (pos $output) + 1, length $output); |
15 | + my $varvals = $1; |
16 | |
17 | my ($config, undef) = _parse_varvals( |
18 | - $varvals =~ m/\G^(\S+)(.*)\n/mg |
19 | + qr/^(\S+)(.*)$/, |
20 | + $varvals, |
21 | ); |
22 | |
23 | return $config, \@opt_files; |
24 | @@ -2168,7 +2169,8 @@ |
25 | my ($output) = @args{@required_args}; |
26 | |
27 | my ($config, $dupes) = _parse_varvals( |
28 | - map { $_ =~ m/^--([^=]+)(?:=(.*))?$/ } split("\n", $output) |
29 | + qr/^--([^=]+)(?:=(.*))?$/, |
30 | + $output, |
31 | ); |
32 | |
33 | return $config, $dupes; |
34 | @@ -2186,84 +2188,94 @@ |
35 | die "Failed to parse the [mysqld] section" unless $mysqld_section; |
36 | |
37 | my ($config, $dupes) = _parse_varvals( |
38 | - map { $_ =~ m/^([^=]+)(?:=(.*))?$/ } |
39 | - grep { $_ !~ m/^\s*#/ } # no # comment lines |
40 | - split("\n", $mysqld_section) |
41 | + qr/^([^=]+)(?:=(.*))?$/, |
42 | + $mysqld_section, |
43 | ); |
44 | |
45 | return $config, $dupes; |
46 | } |
47 | |
48 | +sub _preprocess_varvals { |
49 | + my ($re, $to_parse) = @_; |
50 | + |
51 | + my %vars; |
52 | + LINE: |
53 | + foreach my $line ( split /\n/, $to_parse ) { |
54 | + next LINE if $line =~ m/^\s*$/; # no empty lines |
55 | + next LINE if $line =~ m/^\s*#/; # no # comment lines |
56 | + |
57 | + if ( $line !~ $re ) { |
58 | + PTDEBUG && _d("Line <", $line, "> didn't match $re"); |
59 | + next LINE; |
60 | + } |
61 | + |
62 | + my ($var, $val) = ($1, $2); |
63 | + |
64 | + $var =~ tr/-/_/; |
65 | + |
66 | + if ( !defined $val ) { |
67 | + $val = ''; |
68 | + } |
69 | + |
70 | + for my $item ($var, $val) { |
71 | + $item =~ s/^\s+//; |
72 | + $item =~ s/\s+$//; |
73 | + } |
74 | + |
75 | + push @{$vars{$var} ||= []}, $val |
76 | + } |
77 | + |
78 | + return \%vars; |
79 | +} |
80 | + |
81 | sub _parse_varvals { |
82 | - my ( @varvals ) = @_; |
83 | + my ( $vars ) = _preprocess_varvals(@_); |
84 | |
85 | my %config; |
86 | |
87 | - my $duplicate_var = 0; |
88 | my %duplicates; |
89 | |
90 | - my $var; # current variable (e.g. datadir) |
91 | - my $val; # value for current variable |
92 | - ITEM: |
93 | - foreach my $item ( @varvals ) { |
94 | - if ( $item ) { |
95 | - $item =~ s/^\s+//; |
96 | - $item =~ s/\s+$//; |
97 | - } |
98 | - |
99 | - if ( !$var ) { |
100 | - $var = $item; |
101 | - |
102 | - $var =~ s/-/_/g; |
103 | - |
104 | - if ( exists $config{$var} && !$can_be_duplicate{$var} ) { |
105 | - PTDEBUG && _d("Duplicate var:", $var); |
106 | - $duplicate_var = 1; # flag on, save all the var's values |
107 | - } |
108 | - } |
109 | - else { |
110 | - my $val = $item; |
111 | - PTDEBUG && _d("Var:", $var, "val:", $val); |
112 | - |
113 | - if ( !defined $val ) { |
114 | - $val = ''; |
115 | - } |
116 | - else { |
117 | - $val =~ s/ |
118 | - \A # Start of value |
119 | - (['"]) # Opening quote |
120 | - (.*) # Value |
121 | - \1 # Closing quote |
122 | - [\n\r]*\z # End of value |
123 | - /$2/x; |
124 | - if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) { |
125 | - my %factor_for = ( |
126 | - k => 1_024, |
127 | - m => 1_048_576, |
128 | - g => 1_073_741_824, |
129 | - t => 1_099_511_627_776, |
130 | - ); |
131 | - $val = $num * $factor_for{lc $factor}; |
132 | - } |
133 | - elsif ( $val =~ m/No default/ ) { |
134 | - $val = ''; |
135 | - } |
136 | - } |
137 | - |
138 | - if ( $duplicate_var ) { |
139 | - push @{$duplicates{$var}}, $config{$var}; |
140 | - $duplicate_var = 0; # flag off for next var |
141 | - } |
142 | - |
143 | - $config{$var} = $val; |
144 | - |
145 | - $var = undef; # next item should be a var |
146 | - } |
147 | + while ( my ($var, $vals) = each %$vars ) { |
148 | + my $val = _process_val( pop @$vals ); |
149 | + if ( @$vals && !$can_be_duplicate{$var} ) { |
150 | + PTDEBUG && _d("Duplicate var:", $var); |
151 | + foreach my $current_val ( map { _process_val($_) } @$vals ) { |
152 | + push @{$duplicates{$var} ||= []}, $current_val; |
153 | + } |
154 | + } |
155 | + |
156 | + PTDEBUG && _d("Var:", $var, "val:", $val); |
157 | + |
158 | + $config{$var} = $val; |
159 | } |
160 | |
161 | return \%config, \%duplicates; |
162 | } |
163 | |
164 | +sub _process_val { |
165 | + my ($val) = @_; |
166 | + $val =~ s/ |
167 | + \A # Start of value |
168 | + (['"]) # Opening quote |
169 | + (.*) # Value |
170 | + \1 # Closing quote |
171 | + [\n\r]*\z # End of value |
172 | + /$2/x; |
173 | + if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) { |
174 | + my %factor_for = ( |
175 | + k => 1_024, |
176 | + m => 1_048_576, |
177 | + g => 1_073_741_824, |
178 | + t => 1_099_511_627_776, |
179 | + ); |
180 | + $val = $num * $factor_for{lc $factor}; |
181 | + } |
182 | + elsif ( $val =~ m/No default/ ) { |
183 | + $val = ''; |
184 | + } |
185 | + return $val; |
186 | +} |
187 | + |
188 | sub _mimic_show_variables { |
189 | my ( %args ) = @_; |
190 | my @required_args = qw(vars format); |
191 | |
192 | === modified file 'lib/MySQLConfig.pm' |
193 | --- lib/MySQLConfig.pm 2012-07-30 14:50:42 +0000 |
194 | +++ lib/MySQLConfig.pm 2012-12-04 08:13:22 +0000 |
195 | @@ -259,18 +259,28 @@ |
196 | # help TRUE |
197 | # abort-slave-event-count 0 |
198 | # So we search for that line of hypens. |
199 | - if ( $output !~ m/^-+ -+$/mg ) { |
200 | + # |
201 | + # It also ends with something like |
202 | + # |
203 | + # wait_timeout 28800 |
204 | + # |
205 | + # To see what values a running MySQL server is using, type |
206 | + # 'mysqladmin variables' instead of 'mysqld --verbose --help'. |
207 | + # |
208 | + # So try to find it by locating a double newline, and strip it away |
209 | + if ( $output !~ m/^-+ -+$(.+?)(?:\n\n.+)?\z/sm ) { |
210 | PTDEBUG && _d("mysqld help output doesn't list vars and vals"); |
211 | return; |
212 | } |
213 | |
214 | - # Cut off everything before the list of vars and vals. |
215 | - my $varvals = substr($output, (pos $output) + 1, length $output); |
216 | + # Grab the varval list. |
217 | + my $varvals = $1; |
218 | |
219 | # Parse the "var val" lines. 2nd retval is duplicates but there |
220 | # shouldn't be any with mysqld. |
221 | my ($config, undef) = _parse_varvals( |
222 | - $varvals =~ m/\G^(\S+)(.*)\n/mg |
223 | + qr/^(\S+)(.*)$/, |
224 | + $varvals, |
225 | ); |
226 | |
227 | return $config, \@opt_files; |
228 | @@ -288,7 +298,8 @@ |
229 | |
230 | # Parse the "--var=val" lines. |
231 | my ($config, $dupes) = _parse_varvals( |
232 | - map { $_ =~ m/^--([^=]+)(?:=(.*))?$/ } split("\n", $output) |
233 | + qr/^--([^=]+)(?:=(.*))?$/, |
234 | + $output, |
235 | ); |
236 | |
237 | return $config, $dupes; |
238 | @@ -309,107 +320,116 @@ |
239 | |
240 | # Parse the "var=val" lines. |
241 | my ($config, $dupes) = _parse_varvals( |
242 | - map { $_ =~ m/^([^=]+)(?:=(.*))?$/ } |
243 | - grep { $_ !~ m/^\s*#/ } # no # comment lines |
244 | - split("\n", $mysqld_section) |
245 | + qr/^([^=]+)(?:=(.*))?$/, |
246 | + $mysqld_section, |
247 | ); |
248 | |
249 | return $config, $dupes; |
250 | } |
251 | |
252 | -# Parses a list of variables and their values ("varvals"), returns two |
253 | +# Called by _parse_varvals(), takes two arguments: a regex, and |
254 | +# a string to match against. The string will be split in lines, |
255 | +# and each line will be matched against the regex. |
256 | +# The regex must return to captures, although the second doesn't |
257 | +# have to match anything. |
258 | +# Returns a hashref of arrayrefs ala |
259 | +# { port => [ 12345, 12346 ], key_buffer_size => [ "16M" ] } |
260 | +sub _preprocess_varvals { |
261 | + my ($re, $to_parse) = @_; |
262 | + |
263 | + my %vars; |
264 | + LINE: |
265 | + foreach my $line ( split /\n/, $to_parse ) { |
266 | + next LINE if $line =~ m/^\s*$/; # no empty lines |
267 | + next LINE if $line =~ m/^\s*#/; # no # comment lines |
268 | + |
269 | + if ( $line !~ $re ) { |
270 | + PTDEBUG && _d("Line <", $line, "> didn't match $re"); |
271 | + next LINE; |
272 | + } |
273 | + |
274 | + my ($var, $val) = ($1, $2); |
275 | + |
276 | + # Variable names are usually specified like "log-bin" |
277 | + # but in SHOW VARIABLES they're all like "log_bin". |
278 | + $var =~ tr/-/_/; |
279 | + |
280 | + if ( !defined $val ) { |
281 | + $val = ''; |
282 | + } |
283 | + |
284 | + # Strip leading and trailing whitespace. |
285 | + for my $item ($var, $val) { |
286 | + $item =~ s/^\s+//; |
287 | + $item =~ s/\s+$//; |
288 | + } |
289 | + |
290 | + push @{$vars{$var} ||= []}, $val |
291 | + } |
292 | + |
293 | + return \%vars; |
294 | +} |
295 | + |
296 | +# Parses a string of variables and their values ("varvals"), returns two |
297 | # hashrefs: one with normalized variable=>value, the other with duplicate |
298 | -# vars. The varvals list should start with a var at index 0 and its value |
299 | -# at index 1 then repeat for the next var-val pair. |
300 | +# vars. |
301 | sub _parse_varvals { |
302 | - my ( @varvals ) = @_; |
303 | + my ( $vars ) = _preprocess_varvals(@_); |
304 | |
305 | # Config built from parsing the given varvals. |
306 | my %config; |
307 | |
308 | # Discover duplicate vars. |
309 | - my $duplicate_var = 0; |
310 | my %duplicates; |
311 | |
312 | - # Keep track if item is var or val because each needs special modifications. |
313 | - my $var; # current variable (e.g. datadir) |
314 | - my $val; # value for current variable |
315 | - ITEM: |
316 | - foreach my $item ( @varvals ) { |
317 | - if ( $item ) { |
318 | - # Strip leading and trailing whitespace. |
319 | - $item =~ s/^\s+//; |
320 | - $item =~ s/\s+$//; |
321 | - } |
322 | - |
323 | - if ( !$var ) { |
324 | - # No var means this item is (should be) the next var in the list. |
325 | - $var = $item; |
326 | - |
327 | - # Variable names are usually specified like "log-bin" |
328 | - # but in SHOW VARIABLES they're all like "log_bin". |
329 | - $var =~ s/-/_/g; |
330 | - |
331 | + while ( my ($var, $vals) = each %$vars ) { |
332 | + my $val = _process_val( pop @$vals ); |
333 | + # If the variable has duplicates, then @$vals will contain |
334 | + # the rest of the values |
335 | + if ( @$vals && !$can_be_duplicate{$var} ) { |
336 | # The var is a duplicate (in the bad sense, i.e. where user is |
337 | # probably unaware that there's two different values for this var |
338 | - # but only the last is used) if we've seen it already and it cannot |
339 | - # be duplicated. We don't have its value yet (next loop iter), |
340 | - # so we set a flag to indicate that we should save the duplicate value. |
341 | - if ( exists $config{$var} && !$can_be_duplicate{$var} ) { |
342 | - PTDEBUG && _d("Duplicate var:", $var); |
343 | - $duplicate_var = 1; # flag on, save all the var's values |
344 | - } |
345 | - } |
346 | - else { |
347 | - # $var is set so this item should be its value. |
348 | - my $val = $item; |
349 | - PTDEBUG && _d("Var:", $var, "val:", $val); |
350 | - |
351 | - # Avoid crashing on undef comparison. Also, SHOW VARIABLES uses |
352 | - # blank strings, not NULL/undef. |
353 | - if ( !defined $val ) { |
354 | - $val = ''; |
355 | - } |
356 | - else { |
357 | - $val =~ s/ |
358 | - \A # Start of value |
359 | - (['"]) # Opening quote |
360 | - (.*) # Value |
361 | - \1 # Closing quote |
362 | - [\n\r]*\z # End of value |
363 | - /$2/x; |
364 | - if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) { |
365 | - # value is a size like 1k, 16M, etc. |
366 | - my %factor_for = ( |
367 | - k => 1_024, |
368 | - m => 1_048_576, |
369 | - g => 1_073_741_824, |
370 | - t => 1_099_511_627_776, |
371 | - ); |
372 | - $val = $num * $factor_for{lc $factor}; |
373 | - } |
374 | - elsif ( $val =~ m/No default/ ) { |
375 | - $val = ''; |
376 | - } |
377 | - } |
378 | - |
379 | - if ( $duplicate_var ) { |
380 | - # Save the var's last value before we overwrite it with this |
381 | - # current value. |
382 | - push @{$duplicates{$var}}, $config{$var}; |
383 | - $duplicate_var = 0; # flag off for next var |
384 | - } |
385 | - |
386 | - # Save this var-val. |
387 | - $config{$var} = $val; |
388 | - |
389 | - $var = undef; # next item should be a var |
390 | - } |
391 | + # but only the last is used). |
392 | + PTDEBUG && _d("Duplicate var:", $var); |
393 | + foreach my $current_val ( map { _process_val($_) } @$vals ) { |
394 | + push @{$duplicates{$var} ||= []}, $current_val; |
395 | + } |
396 | + } |
397 | + |
398 | + PTDEBUG && _d("Var:", $var, "val:", $val); |
399 | + |
400 | + # Save this var-val. |
401 | + $config{$var} = $val; |
402 | } |
403 | |
404 | return \%config, \%duplicates; |
405 | } |
406 | |
407 | +sub _process_val { |
408 | + my ($val) = @_; |
409 | + $val =~ s/ |
410 | + \A # Start of value |
411 | + (['"]) # Opening quote |
412 | + (.*) # Value |
413 | + \1 # Closing quote |
414 | + [\n\r]*\z # End of value |
415 | + /$2/x; |
416 | + if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) { |
417 | + # value is a size like 1k, 16M, etc. |
418 | + my %factor_for = ( |
419 | + k => 1_024, |
420 | + m => 1_048_576, |
421 | + g => 1_073_741_824, |
422 | + t => 1_099_511_627_776, |
423 | + ); |
424 | + $val = $num * $factor_for{lc $factor}; |
425 | + } |
426 | + elsif ( $val =~ m/No default/ ) { |
427 | + $val = ''; |
428 | + } |
429 | + return $val; |
430 | +} |
431 | + |
432 | # Sub: _mimic_show_variables |
433 | # Make the variables' values mimic SHOW VARIABLES. Different output formats |
434 | # list values differently. To make comparisons easier, outputs are made to |
435 | |
436 | === modified file 't/lib/MySQLConfig.t' |
437 | --- t/lib/MySQLConfig.t 2012-06-03 17:54:32 +0000 |
438 | +++ t/lib/MySQLConfig.t 2012-12-04 08:13:22 +0000 |
439 | @@ -9,7 +9,7 @@ |
440 | use strict; |
441 | use warnings FATAL => 'all'; |
442 | use English qw(-no_match_vars); |
443 | -use Test::More tests => 30; |
444 | +use Test::More; |
445 | |
446 | use MySQLConfig; |
447 | use DSNParser; |
448 | @@ -828,6 +828,30 @@ |
449 | } |
450 | |
451 | # ############################################################################# |
452 | +# Use of uninitialized value in substitution (s///) at pt-config-diff line 1996 |
453 | +# https://bugs.launchpad.net/percona-toolkit/+bug/917770 |
454 | +# ############################################################################# |
455 | + |
456 | +$config = eval { |
457 | + new MySQLConfig( |
458 | + file => "$trunk/t/pt-config-diff/samples/bug_917770.cnf", |
459 | + TextResultSetParser => $trp, |
460 | + ); |
461 | +}; |
462 | + |
463 | +is( |
464 | + $EVAL_ERROR, |
465 | + '', |
466 | + "Bug 917770: Lives ok on lines with just spaces" |
467 | +); |
468 | + |
469 | +is( |
470 | + $config->format(), |
471 | + 'option_file', |
472 | + "Detect option_file type" |
473 | +); |
474 | + |
475 | +# ############################################################################# |
476 | # Done. |
477 | # ############################################################################# |
478 | { |
479 | @@ -841,4 +865,5 @@ |
480 | '_d() works' |
481 | ); |
482 | ok($sb->ok(), "Sandbox servers") or BAIL_OUT(__FILE__ . " broke the sandbox"); |
483 | +done_testing; |
484 | exit; |
485 | |
486 | === added file 't/pt-config-diff/samples/bug_917770.cnf' |
487 | --- t/pt-config-diff/samples/bug_917770.cnf 1970-01-01 00:00:00 +0000 |
488 | +++ t/pt-config-diff/samples/bug_917770.cnf 2012-12-04 08:13:22 +0000 |
489 | @@ -0,0 +1,31 @@ |
490 | +[client] |
491 | +user = msandbox |
492 | +password = msandbox |
493 | +port = PORT |
494 | +socket = /tmp/PORT/mysql_sandboxPORT.sock |
495 | + |
496 | +[mysqld] |
497 | +port = PORTa |
498 | +socket = /tmp/PORT/mysql_sandboxPORT.sock |
499 | +pid-file = /tmp/PORT/data/mysql_sandboxPORT.pid |
500 | +basedir = PERCONA_TOOLKIT_SANDBOX |
501 | +datadir = /tmp/PORT/data |
502 | +key_buffer_size = 16M |
503 | +innodb_buffer_pool_size = 16M |
504 | +innodb_data_home_dir = /tmp/PORT/data |
505 | +innodb_log_group_home_dir = /tmp/PORT/data |
506 | +innodb_data_file_path = ibdata1:10M:autoextend |
507 | +innodb_log_file_size = 5M |
508 | +log-bin = mysql-bin |
509 | +relay_log = mysql-relay-bin |
510 | +log_slave_updates |
511 | +server-id = PORT |
512 | +report-host = 127.0.0.1 |
513 | +report-port = PORT |
514 | +log-error = /tmp/PORT/data/mysqld.log |
515 | +innodb_lock_wait_timeout = 3 |
516 | +general_log |
517 | +general_log_file = genlog |
518 | +# These next two lines have a space in them, and trigger the bug |
519 | + |
520 | + |
521 | \ No newline at end of file |
We need a reproducible test case. I wrote that regex and I know that if a group fails to match the corresponding $N will be undef not empty string, so pt-config-diff should be resilient to undef values.