Code review comment for lp:~jaypipes/drizzle/bugs

Revision history for this message
Jay Pipes (jaypipes) wrote :

yeah, yeah :P showoff.

Monty Taylor wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Fair enough. Just to be annoying... you know you can do this:
>
> /* initialize match_vector to OUT_VECTOR_SIZE elements containing 0 */
> vector<int> match_vector(OUT_VECTOR_SIZE, 0);
>
> int32_t result= pcre_exec(_re
> , NULL /* No extra data */
> , data_len
> , 0 /* Start at offset 0 of subject...*/
> , 0 /* Default options */
> , &match_vector[0]
> , OUT_VECTOR_SIZE
> );
>
> and from there on treat match_vector like an actual std::vector, right?
> One of the fun bits with vector - guaranteed contiguous memory storage.
>
> Monty
>
> Jay Pipes wrote:
>> Oh, it's the PCRE library terminology, that's all. The "match vector"
>> is an address of the set of integer offsets that you pass to pcre_exec().
>>
>> Monty Taylor wrote:
>> Looks good to me in general. Why aren't you using a std::vector for
>> match vector?
>>
>> Jay Pipes wrote:
>>>>> Jay Pipes has proposed merging lp:~jaypipes/drizzle/bugs into lp:drizzle.
>>>>>
>>>>> Requested reviews:
>>>>> Brian Aker (brianaker)
>>>>> Drizzle-developers (drizzle-developers)
>>>>> Related bugs:
>>>>> #384531 compile failure centos 5.2
>>>>> https://bugs.launchpad.net/bugs/384531
>>>>> #402831 Query Logging output missing a comma
>>>>> https://bugs.launchpad.net/bugs/402831
>>>>> #402855 session->connect_utime doesn't appear to be populated
>>>>> https://bugs.launchpad.net/bugs/402855
>>>>> #409711 plugin/command_log/command_log.cc can't be compiled on 32 bit platforms
>>>>> https://bugs.launchpad.net/bugs/409711
>>>>> #434128 Build failure on debian
>>>>> https://bugs.launchpad.net/bugs/434128
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Fixes LP Bug #500031:
>>>>>
>>>>> "dbt2 fails with 1024 connections"
>>>>>
>>>>> After investigation into this, I discovered that there was a
>>>>> race condition in TemporalFormat::match():
>>>>>
>>>>> TemporalFormat::_re is the compiled PCRE regular expression object
>>>>> inside each of the TemporalFormat objects, which are shared
>>>>> among all threads and live in global scope.
>>>>>
>>>>> Unfortunately, TemporalFormat::match() was using the member
>>>>> variable TemporalFormat::_match_vector as its match state.
>>>>> At high concurrency, this means that the following race
>>>>> condition could happen:
>>>>>
>>>>> Thread 1 executes pcre_exec() and finds a match, therefore
>>>>> populating TemporalFormat::_match_vector of integers
>>>>> with the position offsets of the matched pieces of the temporal object.
>>>>>
>>>>> Thread 1, during construction of the Temporal output of
>>>>> TemporalFormat::match(), uses these _match_vector position
>>>>> offsets in calling std::string::substr on a copy of the
>>>>> matched string, essentially "cutting up" the string
>>>>> into year, month, day, etc.
>>>>>
>>>>> Thread 2 executes pcre_exec() and also finds a match,
>>>>> thereby changing TemporalFormat::_match_vector to something
>>>>> different
>>>>>
>>>>> Thread 1 continues trying to use std::string::substr(),
>>>>> but now uses offsets that are invalid for its string,
>>>>> thereby producing an out_of_range exception.
>>>>>
>>>>> The solution is to pull the TemporalFormat::_match_vector
>>>>> member variable and instead put a function-scope-level
>>>>> match_vector variable on the stack inside TemporalFormat::match().
>>>>>
>>>>>
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAks0UUgACgkQ2Jv7/VK1RgHwjwCfYGWm+qrV7aTUHYV7yA8sQwUl
> UIoAoOhqhm0dlsnB1YwliyzZHCB5PoB+
> =d3OQ
> -----END PGP SIGNATURE-----

« Back to merge proposal