Rollbacked inserts remain permanently in table

Bug #315564 reported by Philip Stoev
2
Affects Status Importance Assigned to Milestone
PBXT
Fix Released
Undecided
Vladimir Kolesnikov

Bug Description

When executing a concurrent workload, records that were inserted in a transaction that was subsequently rollbacked remain permanently visible in the table. Even after REPAIR TABLE, the records are still there. This is different from any other transaction isolation issues where the table is temporarily inconsistent -- for this bug, the inconsistency is permanent.

A test case will be uploaded shortly.

Related branches

Revision history for this message
Philip Stoev (pstoev) wrote :

To run, please branch the Random Query Generator from and then execute:

$ perl runall.pl \
  --mem \
  --queries=1000000 \
  --duration=300 \
  --basedir=/build/bzr/mysql-6.0 \
  --mysqld=--plugin-dir=/build/bzr/pbxt/src/.libs/\
  --mysqld=--plugin-load=PBXT=libpbxt.so \
  --engine=PBXT \
  --grammar=bug315564.yy \
  --gendata=conf/transactions.zz \
  --validator=DatabaseConsistency \
  --mysqld=--loose-lock-wait-timeout=1 \
  --mysqld=--log-output=file

This will start a random workload containing UPDATE and INSERT statements. Shortly after takeoff, the test will complain that the average of all integers in the table has diverged from the desired. This is because NULL values have appeared in the table, even though the SQL grammar specifies that every INSERT of NULLs should be followed by a ROLLBACK. You can verify in the query log that this is indeed the case.

Revision history for this message
Vladimir Kolesnikov (vkolesnikov) wrote :
Download full text (4.0 KiB)

Philip,

Thanks for the report. However I cannot reproduce the problem on 5.1.30. What I get is:

....
# 23:04:56 Validators: DatabaseConsistency
# 23:04:56 Starting 10 processes, 1000000 queries each, duration 300 seconds.
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: COMMIT
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:56 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: SET AUTOCOMMIT=OFF
# 23:04:57 Bad checksum table: table10_pbxt_int_autoinc; sum: 2400; affected_rows:0E0; query: START TRANSACTION
# 23:04:57 Started periodic reporting process...
# 23:05:02 Killing periodic reporting process with pid 22114...
....

so the tool reports some problem, in the query log I can see the following (an excerpt):

                    5 Query SET AUTOCOMMIT=OFF
                    5 Query START TRANSACTION
                    6 Query SELECT SUM(`int_key`) + SUM(`int`)
                        FROM `table10_pbxt_int_autoinc`
                    5 Query SHOW TABLES
                    5 Query START TRANSACTION
                    6 Query SELECT SUM(`int_key`) + SUM(`int`)
                        FROM `table10_pbxt_int_autoinc`
                    5 Query INSERT INTO `table10_pbxt_int_autoinc` ( `pk` ) VALUES ( NULL ) , ( NULL ) , ( NULL )
                    6 Query SELECT SUM(`int_key`) + SUM(...

Read more...

Changed in pbxt:
assignee: nobody → vkolesnikov
status: New → Incomplete
Revision history for this message
Philip Stoev (pstoev) wrote :

Hello,

Please pull from lp:~randgen/randgen/main and try again. I have not tried with 5.1, only with 6.0.

Philip Stoev

Changed in pbxt:
status: Incomplete → New
Revision history for this message
Vladimir Kolesnikov (vkolesnikov) wrote :

Ok, now I can repro it with 5.1.30:

...
# 01:40:49 Bad average for table: table10_pbxt_int_autoinc; average1: 200.0000; average2: 198.9455; count: 569; affected_rows: 0E0; query: ROLLBACK
# 01:40:49 Bad average for table: table10_pbxt_int_autoinc; average1: 200.0000; average2: 198.9455; count: 569; affected_rows: 0E0; query: COMMIT
# 01:40:49 Bad average for table: table10_pbxt_int_autoinc; average1: 200.0000; average2: 198.9455; count: 569; affected_rows: 0E0; query: ROLLBACK
# 01:40:49 Bad average for table: table10_pbxt_int_autoinc; average1: 200.0000; average2: 198.9455; count: 569; affected_rows: 0E0; query: ROLLBACK
# 01:40:49 Bad average for table: table10_pbxt_int_autoinc; average1: 200.0000; average2: 198.9455; count: 569; affected_rows: 0E0; query: COMMIT
#
...

and

mysql> select * from table10_pbxt_int_autoinc where `int` is null;
+---------+------+------+
| int_key | pk | int |
+---------+------+------+
| NULL | 1283 | NULL |
| NULL | 1284 | NULL |
| NULL | 1285 | NULL |
+---------+------+------+
3 rows in set (0.00 sec)

I assume this is what you meant.

Changed in pbxt:
status: New → Confirmed
Revision history for this message
Philip Stoev (pstoev) wrote :

Yes this is what I meant. Those NULLs should not be there.

Please accept my apologies for the low quality of the bug reports -- I am freaking out a bit those days :-) I will work with you to move all the bugs to the "Verified" state.

Thank you.

Revision history for this message
Vladimir Kolesnikov (vkolesnikov) wrote :

Philip,

you don't have to apologize. Your reports are very valuable for us. Even if you are not sure or cannot permanently reproduce a problem it better to report than not.

Thanks again!

Changed in pbxt:
status: Confirmed → In Progress
Changed in pbxt:
status: In Progress → Fix Committed
Revision history for this message
John H. Embretsen (johnemb) wrote :

The issue of uncommitted (rolled back) records being visible in separate concurrent REPEATABLE-READ transactions has also been observed with the test suite described in bug 346808, available at http://launchpadlibrarian.net/24210604/stress_tx_rr.tgz, against PBXT 1.0.07-rc and MySQL 6.0.8, using 50 client threads.

Unable to determine if the issue has been fixed, due to crashing bug https://bugs.launchpad.net/bugs/346808

Changed in pbxt:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.