Merge lp:~percona-toolkit-dev/percona-toolkit/fix-917770-pt-config-diff-uninit-value-crash into lp:percona-toolkit/2.1

Proposed by Brian Fraser
Status: Merged
Approved by: Brian Fraser
Approved revision: 469
Merged at revision: 478
Proposed branch: lp:~percona-toolkit-dev/percona-toolkit/fix-917770-pt-config-diff-uninit-value-crash
Merge into: lp:percona-toolkit/2.1
Diff against target: 521 lines (+240/-152)
4 files modified
bin/pt-config-diff (+78/-66)
lib/MySQLConfig.pm (+105/-85)
t/lib/MySQLConfig.t (+26/-1)
t/pt-config-diff/samples/bug_917770.cnf (+31/-0)
To merge this branch: bzr merge lp:~percona-toolkit-dev/percona-toolkit/fix-917770-pt-config-diff-uninit-value-crash
Reviewer Review Type Date Requested Status
Brian Fraser (community) Approve
Daniel Nichter Approve
Review via email: mp+137301@code.launchpad.net

This proposal supersedes a proposal from 2012-11-26.

To post a comment you must log in.
Revision history for this message
Daniel Nichter (daniel-nichter) wrote : Posted in a previous version of this proposal

We need a reproducible test case. I wrote that regex and I know that if a group fails to match the corresponding $N will be undef not empty string, so pt-config-diff should be resilient to undef values.

review: Needs Resubmitting
Revision history for this message
Brian Fraser (fraserbn) wrote :

New fix deals with the actual root cause, and adds two safety checks against undefs. Plus a test case : D

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

64 + # We were passed an undef for the value, or we're dealing with an empty
65 + # line
66 + if ( !defined($item) ) {
67 + $item = '';
68 + }
69 if ( $item ) {

Are the new lines necessary? Line 69 will be false if $item is either undef or '', so setting it to '' doesn't seem to do anything except introduce the possibility of no-name items. I think this is better handled around

+ map { $_ =~ m/^--([^=]+)(?:=(.*))?$/ ? ($1, $2) : () }

and its non-"--" variant in another call. Granted, those split/grep/map are a poor place: in the actual sub call. So perhaps it should be refactored into sub which can match for the item=val and if the item part is undef then warn about the failure to parse that line and skip it. Doing that, however, will mean that we also need to skip section headers like "[client]" in addition to comment and blank lines.

Maybe the refactored sub can take a pattern arg, m/^--([^=]+)(?:=(.*))?$/ or m/^([^=]+)(?:=(.*))?$/, because skipping blank, header, and section lines is common to both inputs. If the code to handle EOL comments is ever implemented, it could be put in this sub, too.

469. By Brian Fraser

Refactor _parse_varvals.

Now it takes two arguments: A regexp and a string to match against.
_parse_varvals itself was split in three: _preprocess_varvals,
_parse_varvals, and _process_val.

This also modifies the three places that call _parse_varvals; For
two, no real changes were needed, but parse_mysqld() required a fix
to deal with the two final lines of mysqld --help --verbose:

   To see what values a running MySQL server is using, type
   'mysqladmin variables' instead of 'mysqld --verbose --help'.

Revision history for this message
Daniel Nichter (daniel-nichter) wrote :

Looks good. The refactored code is much saner.

review: Approve
Revision history for this message
Brian Fraser (fraserbn) :
review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bin/pt-config-diff'
2--- bin/pt-config-diff 2012-11-19 18:47:13 +0000
3+++ bin/pt-config-diff 2012-12-04 08:13:22 +0000
4@@ -2145,15 +2145,16 @@
5 PTDEBUG && _d("mysqld help output doesn't list option files");
6 }
7
8- if ( $output !~ m/^-+ -+$/mg ) {
9+ if ( $output !~ m/^-+ -+$(.+?)(?:\n\n.+)?\z/sm ) {
10 PTDEBUG && _d("mysqld help output doesn't list vars and vals");
11 return;
12 }
13
14- my $varvals = substr($output, (pos $output) + 1, length $output);
15+ my $varvals = $1;
16
17 my ($config, undef) = _parse_varvals(
18- $varvals =~ m/\G^(\S+)(.*)\n/mg
19+ qr/^(\S+)(.*)$/,
20+ $varvals,
21 );
22
23 return $config, \@opt_files;
24@@ -2168,7 +2169,8 @@
25 my ($output) = @args{@required_args};
26
27 my ($config, $dupes) = _parse_varvals(
28- map { $_ =~ m/^--([^=]+)(?:=(.*))?$/ } split("\n", $output)
29+ qr/^--([^=]+)(?:=(.*))?$/,
30+ $output,
31 );
32
33 return $config, $dupes;
34@@ -2186,84 +2188,94 @@
35 die "Failed to parse the [mysqld] section" unless $mysqld_section;
36
37 my ($config, $dupes) = _parse_varvals(
38- map { $_ =~ m/^([^=]+)(?:=(.*))?$/ }
39- grep { $_ !~ m/^\s*#/ } # no # comment lines
40- split("\n", $mysqld_section)
41+ qr/^([^=]+)(?:=(.*))?$/,
42+ $mysqld_section,
43 );
44
45 return $config, $dupes;
46 }
47
48+sub _preprocess_varvals {
49+ my ($re, $to_parse) = @_;
50+
51+ my %vars;
52+ LINE:
53+ foreach my $line ( split /\n/, $to_parse ) {
54+ next LINE if $line =~ m/^\s*$/; # no empty lines
55+ next LINE if $line =~ m/^\s*#/; # no # comment lines
56+
57+ if ( $line !~ $re ) {
58+ PTDEBUG && _d("Line <", $line, "> didn't match $re");
59+ next LINE;
60+ }
61+
62+ my ($var, $val) = ($1, $2);
63+
64+ $var =~ tr/-/_/;
65+
66+ if ( !defined $val ) {
67+ $val = '';
68+ }
69+
70+ for my $item ($var, $val) {
71+ $item =~ s/^\s+//;
72+ $item =~ s/\s+$//;
73+ }
74+
75+ push @{$vars{$var} ||= []}, $val
76+ }
77+
78+ return \%vars;
79+}
80+
81 sub _parse_varvals {
82- my ( @varvals ) = @_;
83+ my ( $vars ) = _preprocess_varvals(@_);
84
85 my %config;
86
87- my $duplicate_var = 0;
88 my %duplicates;
89
90- my $var; # current variable (e.g. datadir)
91- my $val; # value for current variable
92- ITEM:
93- foreach my $item ( @varvals ) {
94- if ( $item ) {
95- $item =~ s/^\s+//;
96- $item =~ s/\s+$//;
97- }
98-
99- if ( !$var ) {
100- $var = $item;
101-
102- $var =~ s/-/_/g;
103-
104- if ( exists $config{$var} && !$can_be_duplicate{$var} ) {
105- PTDEBUG && _d("Duplicate var:", $var);
106- $duplicate_var = 1; # flag on, save all the var's values
107- }
108- }
109- else {
110- my $val = $item;
111- PTDEBUG && _d("Var:", $var, "val:", $val);
112-
113- if ( !defined $val ) {
114- $val = '';
115- }
116- else {
117- $val =~ s/
118- \A # Start of value
119- (['"]) # Opening quote
120- (.*) # Value
121- \1 # Closing quote
122- [\n\r]*\z # End of value
123- /$2/x;
124- if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) {
125- my %factor_for = (
126- k => 1_024,
127- m => 1_048_576,
128- g => 1_073_741_824,
129- t => 1_099_511_627_776,
130- );
131- $val = $num * $factor_for{lc $factor};
132- }
133- elsif ( $val =~ m/No default/ ) {
134- $val = '';
135- }
136- }
137-
138- if ( $duplicate_var ) {
139- push @{$duplicates{$var}}, $config{$var};
140- $duplicate_var = 0; # flag off for next var
141- }
142-
143- $config{$var} = $val;
144-
145- $var = undef; # next item should be a var
146- }
147+ while ( my ($var, $vals) = each %$vars ) {
148+ my $val = _process_val( pop @$vals );
149+ if ( @$vals && !$can_be_duplicate{$var} ) {
150+ PTDEBUG && _d("Duplicate var:", $var);
151+ foreach my $current_val ( map { _process_val($_) } @$vals ) {
152+ push @{$duplicates{$var} ||= []}, $current_val;
153+ }
154+ }
155+
156+ PTDEBUG && _d("Var:", $var, "val:", $val);
157+
158+ $config{$var} = $val;
159 }
160
161 return \%config, \%duplicates;
162 }
163
164+sub _process_val {
165+ my ($val) = @_;
166+ $val =~ s/
167+ \A # Start of value
168+ (['"]) # Opening quote
169+ (.*) # Value
170+ \1 # Closing quote
171+ [\n\r]*\z # End of value
172+ /$2/x;
173+ if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) {
174+ my %factor_for = (
175+ k => 1_024,
176+ m => 1_048_576,
177+ g => 1_073_741_824,
178+ t => 1_099_511_627_776,
179+ );
180+ $val = $num * $factor_for{lc $factor};
181+ }
182+ elsif ( $val =~ m/No default/ ) {
183+ $val = '';
184+ }
185+ return $val;
186+}
187+
188 sub _mimic_show_variables {
189 my ( %args ) = @_;
190 my @required_args = qw(vars format);
191
192=== modified file 'lib/MySQLConfig.pm'
193--- lib/MySQLConfig.pm 2012-07-30 14:50:42 +0000
194+++ lib/MySQLConfig.pm 2012-12-04 08:13:22 +0000
195@@ -259,18 +259,28 @@
196 # help TRUE
197 # abort-slave-event-count 0
198 # So we search for that line of hypens.
199- if ( $output !~ m/^-+ -+$/mg ) {
200+ #
201+ # It also ends with something like
202+ #
203+ # wait_timeout 28800
204+ #
205+ # To see what values a running MySQL server is using, type
206+ # 'mysqladmin variables' instead of 'mysqld --verbose --help'.
207+ #
208+ # So try to find it by locating a double newline, and strip it away
209+ if ( $output !~ m/^-+ -+$(.+?)(?:\n\n.+)?\z/sm ) {
210 PTDEBUG && _d("mysqld help output doesn't list vars and vals");
211 return;
212 }
213
214- # Cut off everything before the list of vars and vals.
215- my $varvals = substr($output, (pos $output) + 1, length $output);
216+ # Grab the varval list.
217+ my $varvals = $1;
218
219 # Parse the "var val" lines. 2nd retval is duplicates but there
220 # shouldn't be any with mysqld.
221 my ($config, undef) = _parse_varvals(
222- $varvals =~ m/\G^(\S+)(.*)\n/mg
223+ qr/^(\S+)(.*)$/,
224+ $varvals,
225 );
226
227 return $config, \@opt_files;
228@@ -288,7 +298,8 @@
229
230 # Parse the "--var=val" lines.
231 my ($config, $dupes) = _parse_varvals(
232- map { $_ =~ m/^--([^=]+)(?:=(.*))?$/ } split("\n", $output)
233+ qr/^--([^=]+)(?:=(.*))?$/,
234+ $output,
235 );
236
237 return $config, $dupes;
238@@ -309,107 +320,116 @@
239
240 # Parse the "var=val" lines.
241 my ($config, $dupes) = _parse_varvals(
242- map { $_ =~ m/^([^=]+)(?:=(.*))?$/ }
243- grep { $_ !~ m/^\s*#/ } # no # comment lines
244- split("\n", $mysqld_section)
245+ qr/^([^=]+)(?:=(.*))?$/,
246+ $mysqld_section,
247 );
248
249 return $config, $dupes;
250 }
251
252-# Parses a list of variables and their values ("varvals"), returns two
253+# Called by _parse_varvals(), takes two arguments: a regex, and
254+# a string to match against. The string will be split in lines,
255+# and each line will be matched against the regex.
256+# The regex must return to captures, although the second doesn't
257+# have to match anything.
258+# Returns a hashref of arrayrefs ala
259+# { port => [ 12345, 12346 ], key_buffer_size => [ "16M" ] }
260+sub _preprocess_varvals {
261+ my ($re, $to_parse) = @_;
262+
263+ my %vars;
264+ LINE:
265+ foreach my $line ( split /\n/, $to_parse ) {
266+ next LINE if $line =~ m/^\s*$/; # no empty lines
267+ next LINE if $line =~ m/^\s*#/; # no # comment lines
268+
269+ if ( $line !~ $re ) {
270+ PTDEBUG && _d("Line <", $line, "> didn't match $re");
271+ next LINE;
272+ }
273+
274+ my ($var, $val) = ($1, $2);
275+
276+ # Variable names are usually specified like "log-bin"
277+ # but in SHOW VARIABLES they're all like "log_bin".
278+ $var =~ tr/-/_/;
279+
280+ if ( !defined $val ) {
281+ $val = '';
282+ }
283+
284+ # Strip leading and trailing whitespace.
285+ for my $item ($var, $val) {
286+ $item =~ s/^\s+//;
287+ $item =~ s/\s+$//;
288+ }
289+
290+ push @{$vars{$var} ||= []}, $val
291+ }
292+
293+ return \%vars;
294+}
295+
296+# Parses a string of variables and their values ("varvals"), returns two
297 # hashrefs: one with normalized variable=>value, the other with duplicate
298-# vars. The varvals list should start with a var at index 0 and its value
299-# at index 1 then repeat for the next var-val pair.
300+# vars.
301 sub _parse_varvals {
302- my ( @varvals ) = @_;
303+ my ( $vars ) = _preprocess_varvals(@_);
304
305 # Config built from parsing the given varvals.
306 my %config;
307
308 # Discover duplicate vars.
309- my $duplicate_var = 0;
310 my %duplicates;
311
312- # Keep track if item is var or val because each needs special modifications.
313- my $var; # current variable (e.g. datadir)
314- my $val; # value for current variable
315- ITEM:
316- foreach my $item ( @varvals ) {
317- if ( $item ) {
318- # Strip leading and trailing whitespace.
319- $item =~ s/^\s+//;
320- $item =~ s/\s+$//;
321- }
322-
323- if ( !$var ) {
324- # No var means this item is (should be) the next var in the list.
325- $var = $item;
326-
327- # Variable names are usually specified like "log-bin"
328- # but in SHOW VARIABLES they're all like "log_bin".
329- $var =~ s/-/_/g;
330-
331+ while ( my ($var, $vals) = each %$vars ) {
332+ my $val = _process_val( pop @$vals );
333+ # If the variable has duplicates, then @$vals will contain
334+ # the rest of the values
335+ if ( @$vals && !$can_be_duplicate{$var} ) {
336 # The var is a duplicate (in the bad sense, i.e. where user is
337 # probably unaware that there's two different values for this var
338- # but only the last is used) if we've seen it already and it cannot
339- # be duplicated. We don't have its value yet (next loop iter),
340- # so we set a flag to indicate that we should save the duplicate value.
341- if ( exists $config{$var} && !$can_be_duplicate{$var} ) {
342- PTDEBUG && _d("Duplicate var:", $var);
343- $duplicate_var = 1; # flag on, save all the var's values
344- }
345- }
346- else {
347- # $var is set so this item should be its value.
348- my $val = $item;
349- PTDEBUG && _d("Var:", $var, "val:", $val);
350-
351- # Avoid crashing on undef comparison. Also, SHOW VARIABLES uses
352- # blank strings, not NULL/undef.
353- if ( !defined $val ) {
354- $val = '';
355- }
356- else {
357- $val =~ s/
358- \A # Start of value
359- (['"]) # Opening quote
360- (.*) # Value
361- \1 # Closing quote
362- [\n\r]*\z # End of value
363- /$2/x;
364- if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) {
365- # value is a size like 1k, 16M, etc.
366- my %factor_for = (
367- k => 1_024,
368- m => 1_048_576,
369- g => 1_073_741_824,
370- t => 1_099_511_627_776,
371- );
372- $val = $num * $factor_for{lc $factor};
373- }
374- elsif ( $val =~ m/No default/ ) {
375- $val = '';
376- }
377- }
378-
379- if ( $duplicate_var ) {
380- # Save the var's last value before we overwrite it with this
381- # current value.
382- push @{$duplicates{$var}}, $config{$var};
383- $duplicate_var = 0; # flag off for next var
384- }
385-
386- # Save this var-val.
387- $config{$var} = $val;
388-
389- $var = undef; # next item should be a var
390- }
391+ # but only the last is used).
392+ PTDEBUG && _d("Duplicate var:", $var);
393+ foreach my $current_val ( map { _process_val($_) } @$vals ) {
394+ push @{$duplicates{$var} ||= []}, $current_val;
395+ }
396+ }
397+
398+ PTDEBUG && _d("Var:", $var, "val:", $val);
399+
400+ # Save this var-val.
401+ $config{$var} = $val;
402 }
403
404 return \%config, \%duplicates;
405 }
406
407+sub _process_val {
408+ my ($val) = @_;
409+ $val =~ s/
410+ \A # Start of value
411+ (['"]) # Opening quote
412+ (.*) # Value
413+ \1 # Closing quote
414+ [\n\r]*\z # End of value
415+ /$2/x;
416+ if ( my ($num, $factor) = $val =~ m/(\d+)([KMGT])b?$/i ) {
417+ # value is a size like 1k, 16M, etc.
418+ my %factor_for = (
419+ k => 1_024,
420+ m => 1_048_576,
421+ g => 1_073_741_824,
422+ t => 1_099_511_627_776,
423+ );
424+ $val = $num * $factor_for{lc $factor};
425+ }
426+ elsif ( $val =~ m/No default/ ) {
427+ $val = '';
428+ }
429+ return $val;
430+}
431+
432 # Sub: _mimic_show_variables
433 # Make the variables' values mimic SHOW VARIABLES. Different output formats
434 # list values differently. To make comparisons easier, outputs are made to
435
436=== modified file 't/lib/MySQLConfig.t'
437--- t/lib/MySQLConfig.t 2012-06-03 17:54:32 +0000
438+++ t/lib/MySQLConfig.t 2012-12-04 08:13:22 +0000
439@@ -9,7 +9,7 @@
440 use strict;
441 use warnings FATAL => 'all';
442 use English qw(-no_match_vars);
443-use Test::More tests => 30;
444+use Test::More;
445
446 use MySQLConfig;
447 use DSNParser;
448@@ -828,6 +828,30 @@
449 }
450
451 # #############################################################################
452+# Use of uninitialized value in substitution (s///) at pt-config-diff line 1996
453+# https://bugs.launchpad.net/percona-toolkit/+bug/917770
454+# #############################################################################
455+
456+$config = eval {
457+ new MySQLConfig(
458+ file => "$trunk/t/pt-config-diff/samples/bug_917770.cnf",
459+ TextResultSetParser => $trp,
460+ );
461+};
462+
463+is(
464+ $EVAL_ERROR,
465+ '',
466+ "Bug 917770: Lives ok on lines with just spaces"
467+);
468+
469+is(
470+ $config->format(),
471+ 'option_file',
472+ "Detect option_file type"
473+);
474+
475+# #############################################################################
476 # Done.
477 # #############################################################################
478 {
479@@ -841,4 +865,5 @@
480 '_d() works'
481 );
482 ok($sb->ok(), "Sandbox servers") or BAIL_OUT(__FILE__ . " broke the sandbox");
483+done_testing;
484 exit;
485
486=== added file 't/pt-config-diff/samples/bug_917770.cnf'
487--- t/pt-config-diff/samples/bug_917770.cnf 1970-01-01 00:00:00 +0000
488+++ t/pt-config-diff/samples/bug_917770.cnf 2012-12-04 08:13:22 +0000
489@@ -0,0 +1,31 @@
490+[client]
491+user = msandbox
492+password = msandbox
493+port = PORT
494+socket = /tmp/PORT/mysql_sandboxPORT.sock
495+
496+[mysqld]
497+port = PORTa
498+socket = /tmp/PORT/mysql_sandboxPORT.sock
499+pid-file = /tmp/PORT/data/mysql_sandboxPORT.pid
500+basedir = PERCONA_TOOLKIT_SANDBOX
501+datadir = /tmp/PORT/data
502+key_buffer_size = 16M
503+innodb_buffer_pool_size = 16M
504+innodb_data_home_dir = /tmp/PORT/data
505+innodb_log_group_home_dir = /tmp/PORT/data
506+innodb_data_file_path = ibdata1:10M:autoextend
507+innodb_log_file_size = 5M
508+log-bin = mysql-bin
509+relay_log = mysql-relay-bin
510+log_slave_updates
511+server-id = PORT
512+report-host = 127.0.0.1
513+report-port = PORT
514+log-error = /tmp/PORT/data/mysqld.log
515+innodb_lock_wait_timeout = 3
516+general_log
517+general_log_file = genlog
518+# These next two lines have a space in them, and trigger the bug
519+
520+
521\ No newline at end of file

Subscribers

People subscribed via source and target branches