Transaction log losing data when a bad query (ERROR 1093) is encountered mid-transaction.

Bug #644760 reported by Patrick Crews
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Drizzle
Fix Released
High
David Shrewsbury
7.0
Fix Released
High
David Shrewsbury

Bug Description

The transaction log is appearing to lose data from row-touching queries with AUTOCOMMIT=OFF after a new START TRANSACTION has been issued.

With AUTOCOMMIT = OFF, this query:
UPDATE `c` SET `col_bigint` = 4 ORDER BY `col_bigint`,`col_bigint_key`,`col_bigint_not_null`,`col_bigint_not_null_key`,`col_char_10`,`col_char_1024`,`col_char_1024_key`,`col_char_1024_not_null`,`col_char_1024_not_null_key`,`col_char_10_key`,`col_char_10_not_null`,`col_char_10_not_null_key`,`col_enum`,`col_enum_key`,`col_enum_not_null`,`col_enum_not_null_key`,`col_int`,`col_int_key`,`col_int_not_null`,`col_int_not_null_key`,`col_text`,`col_text_key`,`col_text_not_null`,`col_text_not_null_key`,`pk` LIMIT 7

Appears to not be logged. The diff is produced via drizzledump output of the original server and a server populated by the output of transaction_reader called on the original server's transaction.log file.

This appears to affect several other transactions as well. Only the simplest diff is provided as example.

Still working on a repeatable test case / troubleshooting what exactly is happening here.

# 2010-09-20T13:06:06 Executing diff --unified /tmp//translog_rpl_dump_14883_9306.sql /tmp//translog_rpl_dump_14883_9307.sql
--- /tmp//translog_rpl_dump_14883_9306.sql 2010-09-20 13:06:06.641540272 -0400
+++ /tmp//translog_rpl_dump_14883_9307.sql 2010-09-20 13:06:06.731542932 -0400
@@ -278,7 +278,7 @@
   KEY `col_char_10_key` (`col_char_10_key`),
   KEY `col_char_1024_key` (`col_char_1024_key`(255))
 ) ENGINE=InnoDB COLLATE = utf8_general_ci;
-INSERT INTO `c` VALUES (1,NULL,'',2,'TJNCD',NULL,'h','h','','v',0,'q',747700224,'e',NULL,NULL,'h','o',8,7,'z','k','BKJRK','q',4);
+INSERT INTO `c` VALUES (1,NULL,'',2,'TJNCD',NULL,'h','h','','v',0,'q',747700224,'e',NULL,NULL,'h','o',8,7,'z','k','BKJRK','q',NULL);

Tags: replication

Related branches

Changed in drizzle:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
Patrick Crews (patrick-crews) wrote :

Attached is trimmed randgen output. For clarity, I have removed all bad queries (impossible WHERE's, etc) and SELECT's (I doubt they contributed to this).

What I have tried to do is compare randgen queries to the SQL generated by transaction_reader when called against the server's transaction log.

From my analysis, it appears that a block of queries was somehow missed / ignored by the log as queries that came before and after this block are visible in the transaction_reader output.

I can provide assistance with running the randgen test if you are interested in duplicating things yourself that way. Otherwise, we can work on trying to produce a test-run'able test case.

Revision history for this message
Stewart Smith (stewart) wrote : Re: [Bug 644760] [NEW] Transaction log appears to lose data on implicit commit after a large number of queries have been executed.

FYI: I want to completely kill implicit commit.. it's just dumb.

--
Stewart Smith

Changed in drizzle:
assignee: nobody → David Shrewsbury (dshrews)
tags: added: replication
Changed in drizzle:
status: Confirmed → In Progress
Revision history for this message
David Shrewsbury (dshrews) wrote : Re: Transaction log appears to lose data on implicit commit after a large number of queries have been executed.

Yes, definitely need a test case. :)

Revision history for this message
Patrick Crews (patrick-crews) wrote :

I tried to cook up a test case based on my filtered randgen output, but for whatever reason, the bug didn't present itself.
I will try to make a test-suite .test file based on all of the executed queries, but in the mean time you can duplicate this via the randgen command line:
perl ./gentest.pl --dsn=dbi:drizzle:host=localhost:port=9306:user=root:password="":database=test --gendata=conf/drizzle/translog.zz --grammar=conf/drizzle/transaction_log_single_thread_drizzle.yy --queries=50 --threads=1 --debug --Validator=DrizzleTransactionLog

To use the Validator, you will need another drizzle server up and running. The Validator expects the server to be listening on port 9307, but you can edit randgen/lib/GenTest/Validator/DrizzleTransactionLog.pm and set the $validator_port variable to whatever you would like.

You can expect to see output like:
# 2010-09-22T17:42:34 Differences between the two servers were found after comparing dumpfiles
# 2010-09-22T17:42:34 diff command: diff --unified /tmp//translog_rpl_dump_9484_9306.sql /tmp//translog_rpl_dump_9484_9307.sql
# 2010-09-22T17:42:34 Master dumpfile: /tmp//translog_rpl_dump_9484_9306.sql
# 2010-09-22T17:42:34 Slave dumpfile: /tmp//translog_rpl_dump_9484_9307.sql
# 2010-09-22T17:42:35 Killing periodic reporting process with pid 9483...
# 2010-09-22T17:42:35 Kill GenTest::ErrorFilter(9482)
# 2010-09-22T17:42:35 Test completed with failure status STATUS_REPLICATION_FAILURE (103)

Revision history for this message
Patrick Crews (patrick-crews) wrote :

I've linked a branch with a test - transaction_log.bug644670, that will result in a buggy log.

./test-run --suite=transaction_log bug64470

If parsing the print_transaction_message() output is tough (I find it is on long entries), you can call transaction_reader against the transaction_log to get SQL output.

As noted in the original description, there should be an entry for UPDATE'ing table c, but none will be present in the generated SQL.

from the /tests dir:
../drizzled/message/transaction_reader var/master-data/transaction.log

Revision history for this message
Patrick Crews (patrick-crews) wrote :

The bug is being caused by this query (line 387 in the test case).

It would appear that encountering the 1093 ERROR is causing log data to be lost.

#--error ER_UPDATE_TABLE_USED
#DELETE FROM `c` WHERE `col_char_10_not_null_key` IN ( SELECT `col_enum_key` FROM `c` WHERE `pk` = 112 ) ORDER BY `col_bigint`,`col_bigint_key`,`col_bigint_not_null`,`col_bigint_not_null_key`,`col_char_10`,`col_char_1024`,`col_char_1024_key`,`col_char_1024_not_null`,`col_char_1024_not_null_key`,`col_char_10_key`,`col_char_10_not_null`,`col_char_10_not_null_key`,`col_enum`,`col_enum_key`,`col_enum_not_null`,`col_enum_not_null_key`,`col_int`,`col_int_key`,`col_int_not_null`,`col_int_not_null_key`,`col_text`,`col_text_key`,`col_text_not_null`,`col_text_not_null_key`,`pk` LIMIT 4 ;
# 2010-09-20T13:05:55 Query: DELETE FROM `c` WHERE `col_char_10_not_null_key` IN ( SELECT `col_enum_key` FROM `c` WHERE `pk` = 112 ) ORDER BY `col_bigint`,`col_bigint_key`,`col_bigint_not_null`,`col_bigint_not_null_key`,`col_char_10`,`col_char_1024`,`col_char_1024_key`,`col_char_1024_not_null`,`col_char_1024_not_null_key`,`col_char_10_key`,`col_char_10_not_null`,`col_char_10_not_null_key`,`col_enum`,`col_enum_key`,`col_enum_not_null`,`col_enum_not_null_key`,`col_int`,`col_int_key`,`col_int_not_null`,`col_int_not_null_key`,`col_text`,`col_text_key`,`col_text_not_null`,`col_text_not_null_key`,`pk` LIMIT 4 failed: 1093 You can't specify target table '%s' for update in FROM clause. Further errors of this kind will be suppressed.;

summary: - Transaction log appears to lose data on implicit commit after a large
- number of queries have been executed.
+ Transaction log losing data when a bad query (ERROR 1093) is encountered
+ mid-transaction.
Revision history for this message
David Shrewsbury (dshrews) wrote :

For my sanity, inserting a complete test. All statements added to the transaction before the error are being removed:

use test;
create table t (a int not null auto_increment primary key, b varchar(20));

start transaction;
insert into t (b) values ('a'), ('b'), ('c'), ('d'), ('e');
commit;

SET GLOBAL transaction_log_truncate_debug= true;

start transaction;
update t set b = 'z' where a in (1,3,5);
delete from t where a in (select a from t where a < 3);
update t set b = 'y' where a in (2,4);
commit;

select print_transaction_message('transaction.log', ENTRY_OFFSET) from data_dictionary.transaction_log_entries\G

Changed in drizzle:
status: In Progress → Fix Committed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.