Merge lp:~gl-az/percona-server/5.5-bug1053342 into lp:percona-server/5.5

Proposed by George Ormond Lorch III
Status: Merged
Approved by: Laurynas Biveinis
Approved revision: no longer in the source branch.
Merged at revision: 316
Proposed branch: lp:~gl-az/percona-server/5.5-bug1053342
Merge into: lp:percona-server/5.5
Diff against target: 11 lines (+1/-1)
1 file modified
Percona-Server/sql/log_event.h (+1/-1)
To merge this branch: bzr merge lp:~gl-az/percona-server/5.5-bug1053342
Reviewer Review Type Date Requested Status
Laurynas Biveinis (community) Approve
Review via email: mp+128562@code.launchpad.net

Description of the change

Fixed upstream slave crash bug http://bugs.mysql.com/bug.php?id=66890.

Issue was due to the thd->mem_root heap being released prematurely during the processing of deferred events.

In sql/log_event.cc:6060 User_var_log_event::do_apply_event(), a test is made to determine if deferred events are being collected and processed. If true, a call to this->set_deferred is made to set a flag indicating that 'this' is a deferred event. This means that the thd->mem_root should not be freed on the completion of User_var_log_event::do_apply_event().

The problem lies in the way User_var_log_event::set_deferred() is implemented. The class User_var_log_event has an internal member 'bool deferred' that should be set to true when User_var_log_event::set_deferred() is called. The original code was setting 'deferred' to the state of another member 'char* val'. At the time User_var_log_event::set_deferred() is being called, the 'val' member is NULL, therefore leaving the 'deferred' member at 0/false instead of the true that was intended.

Changing User_var_log_event::set_deferred() in sql/log_event.h:2596 to be '{ deferred= true; }' instead of '{ deferred= val; }' fixes the issue. This is easily seen by running the slave with Valgrind before and after. Valgrind reports 15434 errors of both 'Invalid read' and 'Invalid write' before the fix and 0 errors and 0 leaks after the fix.

http://jenkins.percona.com/view/PS%205.5/job/percona-server-5.5-param/538/

No MTR test case was added due to the non-deterministic nature of the failure.

Valgrind report before:
==7905== Memcheck, a memory error detector
==7905== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==7905== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==7905== Command: ./bin/mysqld --defaults-file=/usr/local/dev/percona-server/5.5-26461-slave/etc/my.cnf --basedir=/usr/local/dev/percona-server/5.5-26461-debug --user=root --console
==7905== Parent PID: 7359
==7905==
==7905== Thread 20:
==7905== Invalid read of size 8
==7905== at 0x596588: check_single_table_access(THD*, unsigned long, TABLE_LIST*, bool) (sql_parse.cc:4772)
==7905== by 0x59667E: check_one_table_access(THD*, unsigned long, TABLE_LIST*) (sql_parse.cc:4818)
==7905== by 0x597785: update_precheck(THD*, TABLE_LIST*) (sql_parse.cc:7107)
==7905== by 0x59C670: mysql_execute_command(THD*) (sql_parse.cc:2917)
==7905== by 0x5A061F: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==7905== by 0x76B703: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int) (log_event.cc:3471)
==7905== by 0x76BC8E: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:3214)
==7905== by 0x52DFC8: apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) (log_event.h:1143)
==7905== by 0x53756B: exec_relay_log_event(THD*, Relay_log_info*) (slave.cc:2560)
==7905== by 0x5386BB: handle_slave_sql (slave.cc:3385)
==7905== by 0x33D2607850: start_thread (in /lib64/libpthread-2.12.so)
==7905== by 0x33D1EE811C: clone (in /lib64/libc-2.12.so)
==7905== Address 0x14fca620 is 608 bytes inside a block of size 8,208 free'd
==7905== at 0x4A069D9: free (vg_replace_malloc.c:427)
==7905== by 0x7F4CDE: my_free (my_malloc.c:128)
==7905== by 0x7EDBE4: free_root (my_alloc.c:371)
==7905== by 0x763E73: User_var_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:6142)
==7905== by 0x63AFC7: Deferred_log_events::execute(Relay_log_info*) (log_event.h:1143)
==7905== by 0x763F6E: slave_execute_deferred_events(THD*) (log_event.cc:5633)
==7905== by 0x5A0054: mysql_execute_command(THD*) (sql_parse.cc:2164)
==7905== by 0x5A061F: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==7905== by 0x76B703: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int) (log_event.cc:3471)
==7905== by 0x76BC8E: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:3214)
==7905== by 0x52DFC8: apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) (log_event.h:1143)
==7905== by 0x53756B: exec_relay_log_event(THD*, Relay_log_info*) (slave.cc:2560)
==7905==
==7905== Invalid read of size 8
==7905== at 0x59659B: check_single_table_access(THD*, unsigned long, TABLE_LIST*, bool) (sql_parse.cc:4776)
==7905== by 0x59667E: check_one_table_access(THD*, unsigned long, TABLE_LIST*) (sql_parse.cc:4818)
==7905== by 0x597785: update_precheck(THD*, TABLE_LIST*) (sql_parse.cc:7107)
==7905== by 0x59C670: mysql_execute_command(THD*) (sql_parse.cc:2917)
==7905== by 0x5A061F: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==7905== by 0x76B703: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int) (log_event.cc:3471)
==7905== by 0x76BC8E: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:3214)
==7905== by 0x52DFC8: apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) (log_event.h:1143)
==7905== by 0x53756B: exec_relay_log_event(THD*, Relay_log_info*) (slave.cc:2560)
==7905== by 0x5386BB: handle_slave_sql (slave.cc:3385)
==7905== by 0x33D2607850: start_thread (in /lib64/libpthread-2.12.so)
==7905== by 0x33D1EE811C: clone (in /lib64/libc-2.12.so)
==7905== Address 0x14fca5e0 is 544 bytes inside a block of size 8,208 free'd
==7905== at 0x4A069D9: free (vg_replace_malloc.c:427)
==7905== by 0x7F4CDE: my_free (my_malloc.c:128)
==7905== by 0x7EDBE4: free_root (my_alloc.c:371)
==7905== by 0x763E73: User_var_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:6142)
==7905== by 0x63AFC7: Deferred_log_events::execute(Relay_log_info*) (log_event.h:1143)
==7905== by 0x763F6E: slave_execute_deferred_events(THD*) (log_event.cc:5633)
==7905== by 0x5A0054: mysql_execute_command(THD*) (sql_parse.cc:2164)
==7905== by 0x5A061F: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==7905== by 0x76B703: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int) (log_event.cc:3471)
==7905== by 0x76BC8E: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:3214)
==7905== by 0x52DFC8: apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) (log_event.h:1143)
==7905== by 0x53756B: exec_relay_log_event(THD*, Relay_log_info*) (slave.cc:2560)

... snip ...

==7905== Invalid write of size 4
==7905== at 0x70281C: Item_func_get_user_var::fix_length_and_dec() (item.h:1229)
==7905== by 0x70066E: Item_func::fix_fields(THD*, Item**) (item_func.cc:227)
==7905== by 0x700548: Item_func::fix_fields(THD*, Item**) (item_func.cc:200)
==7905== by 0x700548: Item_func::fix_fields(THD*, Item**) (item_func.cc:200)
==7905== by 0x700A56: Item_func_set_user_var::fix_fields(THD*, Item**) (item_func.cc:4244)
==7905== by 0x553F5B: setup_fields(THD*, Item**, List<Item>&, enum_mark_columns, List<Item>*, bool) (sql_base.cc:7949)
==7905== by 0x6164BC: mysql_update(THD*, TABLE_LIST*, List<Item>&, List<Item>&, Item*, unsigned int, st_order*, unsigned long long, enum_duplicates, bool, unsigned long long*, unsigned long long*) (sql_update.cc:362)
==7905== by 0x59C73B: mysql_execute_command(THD*) (sql_parse.cc:2940)
==7905== by 0x5A061F: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==7905== by 0x76B703: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int) (log_event.cc:3471)
==7905== by 0x76BC8E: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:3214)
==7905== by 0x52DFC8: apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) (log_event.h:1143)
==7905== Address 0x150df750 is 2,816 bytes inside a block of size 8,160 free'd
==7905== at 0x4A069D9: free (vg_replace_malloc.c:427)
==7905== by 0x7F4CDE: my_free (my_malloc.c:128)
==7905== by 0x7EDBE4: free_root (my_alloc.c:371)
==7905== by 0x763E73: User_var_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:6142)
==7905== by 0x63AFC7: Deferred_log_events::execute(Relay_log_info*) (log_event.h:1143)
==7905== by 0x763F6E: slave_execute_deferred_events(THD*) (log_event.cc:5633)
==7905== by 0x5A0054: mysql_execute_command(THD*) (sql_parse.cc:2164)
==7905== by 0x5A061F: mysql_parse(THD*, char*, unsigned int, Parser_state*) (sql_parse.cc:5811)
==7905== by 0x76B703: Query_log_event::do_apply_event(Relay_log_info const*, char const*, unsigned int) (log_event.cc:3471)
==7905== by 0x76BC8E: Query_log_event::do_apply_event(Relay_log_info const*) (log_event.cc:3214)
==7905== by 0x52DFC8: apply_event_and_update_pos(Log_event*, THD*, Relay_log_info*) (log_event.h:1143)
==7905== by 0x53756B: exec_relay_log_event(THD*, Relay_log_info*) (slave.cc:2560)
==7905==
==7905==
==7905== HEAP SUMMARY:
==7905== in use at exit: 0 bytes in 0 blocks
==7905== total heap usage: 58,496 allocs, 58,496 frees, 58,071,208 bytes allocated
==7905==
==7905== All heap blocks were freed -- no leaks are possible
==7905==
==7905== For counts of detected and suppressed errors, rerun with: -v
==7905== ERROR SUMMARY: 15434 errors from 758 contexts (suppressed: 4 from 4)

Valgrind report after:
==11123== Memcheck, a memory error detector
==11123== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==11123== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==11123== Command: ./bin/mysqld --defaults-file=/usr/local/dev/percona-server/5.5-26461-slave/etc/my.cnf --basedir=/usr/local/dev/percona-server/5.5-26461-debug --user=root --console
==11123== Parent PID: 7359
==11123==
==11123==
==11123== HEAP SUMMARY:
==11123== in use at exit: 0 bytes in 0 blocks
==11123== total heap usage: 57,056 allocs, 57,056 frees, 54,847,051 bytes allocated
==11123==
==11123== All heap blocks were freed -- no leaks are possible
==11123==
==11123== For counts of detected and suppressed errors, rerun with: -v
==11123== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 4)

To post a comment you must log in.
Revision history for this message
George Ormond Lorch III (gl-az) wrote :

26461

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) :
review: Approve
Revision history for this message
Stewart Smith (stewart) wrote :

Wow... that was it? Amazing. The sad thing, I saw that and went "wtf is that, that has to be buggy somehow" earlier. I should have listened to my gut instinct there... dammit. Replication code can fool you that way.

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

> Wow... that was it? Amazing. The sad thing, I saw that and went "wtf is that,
> that has to be buggy somehow" earlier. I should have listened to my gut
> instinct there... dammit. Replication code can fool you that way.

Yup, sure was. I tried in many ways to understand the original intent and just could not come up with a viable reason why it was done that way. First, there was no need for any class flag since the thd->mem_root was being freed in the same method where the deferred check was being made. Second, no idea why 'bool deferred' was being set to a 'char* val' which would always be NULL at that point in the code execution. History provided no insight to why the original fix was made the way it was. Nothing else, anywhere in the code was accessing the public class member 'deferred' nor calling set_deferred() or is_deferred() methods. So, one has to assume that the only purpose of the flag and methods was to provide for the skipping of the mem_root free when processing deferred events. Laurynas and I spend some good time too trying to find some logical reason why this proposed fix is wrong and just could not. Maybe we will be proven wrong sometime down the road.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'Percona-Server/sql/log_event.h'
2--- Percona-Server/sql/log_event.h 2012-09-17 13:08:32 +0000
3+++ Percona-Server/sql/log_event.h 2012-10-08 18:34:22 +0000
4@@ -2593,7 +2593,7 @@
5 and which case the applier adjusts execution path.
6 */
7 bool is_deferred() { return deferred; }
8- void set_deferred() { deferred= val; }
9+ void set_deferred() { deferred= true; }
10 #endif
11 bool is_valid() const { return 1; }
12

Subscribers

People subscribed via source and target branches