Cluster does not follow MDL semantics
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC) | Confirmed | Medium | Alexey Zhebel | |
Bug Description
Whenever a DDL statement is executed on the source node, and a query or transaction on that same node holds MDL locks, the transaction holding the locks is aborted and the DDL does not wait. This goes against MDL semantics, and if it is done on purpose it should be documented.
Test case follows.
-- Create the required table and populate it with data:
CREATE TABLE `t1` (
`i` int(11) NOT NULL AUTO_INCREMENT,
`c` char(32) NOT NULL DEFAULT 'dummary_text',
PRIMARY KEY (`i`)
) ENGINE=InnoDB AUTO_INCREMENT=518 DEFAULT CHARSET=latin1;
insert into t1 values(null, 'dummary_text');
insert into t1 values(null, 'dummary_text');
insert into t1 values(null, 'dummary_text');
-- Run the test queries
node2_session1> start transaction;
Query OK, 0 rows affected (0.00 sec)
node2_session1> select count(*) from t1; -- this query holds metadata locks until the transaction ends
+----------+
| count(*) |
+----------+
| 3 |
+----------+
1 row in set (0.00 sec)
node2_session2> alter table t1 engine=innodb; -- this goes through without getting blocked
Query OK, 3 rows affected (1.87 sec)
Records: 3 Duplicates: 0 Warnings: 0
node2_session1> select count(*) from t1; -- deadlock is reported here
ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction
As you can see, both the transaction and the ALTER were started on the same node, and the ALTER TABLE caused the transaction to be aborted. The following can be seen in the error log of the same node:
130316 4:15:56 [Note] WSREP: cluster conflict due to high priority abort for threads:
130316 4:15:56 [Note] WSREP: Winning thread:
THD: 25, mode: total order, state: executing, conflict: no conflict, seqno: 551518
SQL: alter table t1 engine=innodb
130316 4:15:56 [Note] WSREP: Victim thread:
THD: 24, mode: local, state: idle, conflict: no conflict, seqno: 0
SQL: (null)
130316 4:15:56 [Note] WSREP: MDL conflict db=test table=t1 ticket=3 solved by abort
130316 4:15:56 [Note] WSREP: cluster conflict due to high priority abort for threads:
130316 4:15:56 [Note] WSREP: Winning thread:
THD: 25, mode: total order, state: executing, conflict: no conflict, seqno: 551518
SQL: alter table t1 engine=innodb
130316 4:15:56 [Note] WSREP: Victim thread:
THD: 24, mode: local, state: idle, conflict: aborting, seqno: 0
SQL: (null)
This is a deviation from the MDL behaviour and hence should be documented.
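Until this behaviour is documented or changed, applications running against the cluster should be prepared to retry transactions aborted this way. A minimal sketch, assuming the same t1 table as in the test case above:

```sql
-- The aborted transaction returns ER_LOCK_DEADLOCK (1213); the session itself
-- is still usable, so the application can roll back and retry:
ROLLBACK;
START TRANSACTION;
SELECT COUNT(*) FROM t1;  -- re-acquires the metadata lock after the DDL finished
COMMIT;
```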
tags: | added: doc |
Changed in percona-xtradb-cluster: | |
status: | New → Triaged |
assignee: | nobody → Hrvoje Matijakovic (hrvojem) |
milestone: | none → 5.5.30-24.8 |
Changed in percona-xtradb-cluster: | |
status: | Triaged → In Progress |
Changed in percona-xtradb-cluster: | |
status: | In Progress → Fix Committed |
Changed in percona-xtradb-cluster: | |
status: | Fix Committed → Confirmed |
Changed in percona-xtradb-cluster: | |
milestone: | 5.5.31-23.7.5 → none |
Changed in percona-xtradb-cluster: | |
importance: | Undecided → Medium |
Changed in percona-xtradb-cluster: | |
assignee: | Hrvoje Matijakovic (hrvojem) → Alexey Zhebel (alexey-zhebel) |
DDL is not transactional in MySQL, and therefore requires special treatment with synchronous replication.
When DDL is processed under total order isolation (wsrep_OSU_method=TOI, which is the default), the DDL statement is replicated up front to the cluster, i.e. the cluster assigns a global transaction ID for the DDL statement before DDL processing begins. Every node in the cluster is then responsible for executing the DDL in the given slot in the sequence of incoming transactions, and this DDL execution has to happen with high priority. So local transactions have to yield, and even MDL does not protect them.
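The method in effect can be checked on each node; a quick check, assuming a wsrep-enabled build:

```sql
-- Shows the active DDL replication method (TOI by default)
SHOW VARIABLES LIKE 'wsrep_OSU_method';
```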
If DDL is configured to run under the rolling schema upgrade (RSU) method, it would be possible to honor the MDL policy.
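For example, the ALTER from the test case could be run under RSU instead. A sketch: under RSU the statement is not replicated and applies only to the local node, so it has to be repeated on every node in turn:

```sql
SET SESSION wsrep_OSU_method = 'RSU';  -- desync this node for the DDL
ALTER TABLE t1 ENGINE=InnoDB;          -- not replicated; local MDL rules apply
SET SESSION wsrep_OSU_method = 'TOI';  -- restore the default method
```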