Comment 2 for bug 1280896

Revision history for this message
jfr (j-francoz) wrote :

Hello

The problem seems to be still present.
I think that, due to the snapshot isolation between the nodes of the cluster, we can have two transactions doing an insert on the same table with the same data for the unique constraint.

Exemple :
create table uniq (u varchar(10), unique key `unique_key`(`u`));

echo "insert into uniq (u) values ('const');" | mysql -h maria01 &
echo "insert into uniq (u) values ('const');" | mysql -h maria02 &

I've got this error each time :
140328 15:42:38 [ERROR] Slave SQL: Could not execute Write_rows event on table test-lbn.uniq; Duplicate entry 'const' for key 'unique_key', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 85, Error_code: 1062
140328 15:42:38 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 5116
140328 15:42:38 [Warning] WSREP: Failed to apply app buffer: seqno: 5116, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 2th time
140328 15:42:38 [ERROR] Slave SQL: Could not execute Write_rows event on table test-lbn.uniq; Duplicate entry 'const' for key 'unique_key', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 85, Error_code: 1062
140328 15:42:38 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 5116
140328 15:42:38 [Warning] WSREP: Failed to apply app buffer: seqno: 5116, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 3th time
140328 15:42:38 [ERROR] Slave SQL: Could not execute Write_rows event on table test-lbn.uniq; Duplicate entry 'const' for key 'unique_key', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 85, Error_code: 1062
140328 15:42:38 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 5116
140328 15:42:38 [Warning] WSREP: Failed to apply app buffer: seqno: 5116, status: 1
         at galera/src/trx_handle.cpp:apply():340
Retrying 4th time
140328 15:42:38 [ERROR] Slave SQL: Could not execute Write_rows event on table test-lbn.uniq; Duplicate entry 'const' for key 'unique_key', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 85, Error_code: 1062
140328 15:42:38 [Warning] WSREP: RBR event 2 Write_rows apply warning: 121, 5116
140328 15:42:38 [Note] WSREP: commit failed for reason: 3
140328 15:42:38 [Note] WSREP: conflict state: 0
140328 15:42:38 [Note] WSREP: cluster conflict due to certification failure for threads:
140328 15:42:38 [Note] WSREP: Victim thread:
   THD: 13, mode: local, state: executing, conflict: cert failure, seqno: 5117
   SQL: insert into uniq (u) values ('const')
140328 15:42:38 [Note] WSREP: wsrep retrying AC query: insert into uniq (u) values ('const')
140328 15:42:38 [ERROR] WSREP: Failed to apply trx: source: dd3ef4a4-b686-11e3-9fec-67b866b47c4d version: 3 local: 0 state: APPLYING flags: 1 conn_id: 45 trx_id: 47302914 seqnos (l: 6, g: 5116, s: 5115, d: 5115, ts: 967608690227148)
140328 15:42:38 [ERROR] WSREP: Failed to apply trx 5116 4 times
140328 15:42:38 [ERROR] WSREP: Node consistency compromized, aborting...
140328 15:42:38 [Note] WSREP: Closing send monitor...
140328 15:42:38 [Note] WSREP: Closed send monitor.
140328 15:42:38 [Note] WSREP: gcomm: terminating thread
140328 15:42:38 [Note] WSREP: gcomm: joining thread
140328 15:42:38 [Note] WSREP: gcomm: closing backend
140328 15:42:38 [Note] WSREP: commit failed for reason: 3
140328 15:42:38 [Note] WSREP: conflict state: 0
140328 15:42:38 [Note] WSREP: cluster conflict due to certification failure for threads:
140328 15:42:38 [Note] WSREP: Victim thread:
   THD: 13, mode: local, state: executing, conflict: cert failure, seqno: 5118
   SQL: insert into uniq (u) values ('const')
140328 15:42:38 [Note] WSREP: cert failure, thd: 13 is_AC: 1, retry: 1 - 1 SQL: insert into uniq (u) values ('const')
140328 15:42:38 [Note] WSREP: releasing retry_query: conf 0 sent 0 kill 0 errno 1213 SQL insert into uniq (u) values ('const')
140328 15:42:38 [Note] WSREP: view(view_id(NON_PRIM,17b7a7cb-b687-11e3-b6d6-0f5b9fb2f831,57) memb {
        17b7a7cb-b687-11e3-b6d6-0f5b9fb2f831,0
} joined {
} left {
} partitioned {
        c39f6170-b367-11e3-817a-d32320dd52da,0
        dd3ef4a4-b686-11e3-9fec-67b866b47c4d,0
})
140328 15:42:38 [Note] WSREP: New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1
140328 15:42:38 [Note] WSREP: view((empty))
140328 15:42:38 [Note] WSREP: gcomm: closed
140328 15:42:38 [Note] WSREP: Flow-control interval: [16, 16]
140328 15:42:38 [Note] WSREP: Received NON-PRIMARY.
140328 15:42:38 [Note] WSREP: Shifting SYNCED -> OPEN (TO: 5118)
140328 15:42:38 [Note] WSREP: Received self-leave message.
140328 15:42:38 [Note] WSREP: Flow-control interval: [0, 0]
140328 15:42:38 [Note] WSREP: Received SELF-LEAVE. Closing connection.
140328 15:42:38 [Note] WSREP: Shifting OPEN -> CLOSED (TO: 5118)
140328 15:42:38 [Note] WSREP: RECV thread exiting 0: Success
140328 15:42:38 [Note] WSREP: recv_thread() joined.
140328 15:42:38 [Note] WSREP: Closing replication queue.
140328 15:42:38 [Note] WSREP: Closing slave action queue.
140328 15:42:38 [Note] WSREP: /usr/sbin/mysqld: Terminated.
140328 15:42:38 mysqld_safe Number of processes running now: 0
140328 15:42:38 mysqld_safe WSREP: not restarting wsrep node automatically
140328 15:42:38 mysqld_safe mysqld from pid file /var/run/lbn-daemon/mysql-1/mysql-1.pid ended

Maybe the unique constraint checking should be done before certifying the transaction ?

rpm -qa|egrep -i '(maria|galera)'
galera-25.3.2-1.rhel6.x86_64
MariaDB-compat-5.5.36-1.el6.x86_64
MariaDB-common-5.5.36-1.el6.x86_64
MariaDB-client-5.5.36-1.el6.x86_64
MariaDB-Galera-server-5.5.36-1.el6.x86_64
perl-DBD-MySQL-Maria-4.013-4.el6_lbn.x86_64

Thanks