Bug #1092593 “crash-resistant replication doesn't work when Inno...” : Bugs : Percona Server moved to https://jira.percona.com/projects/PS

Revision history for this message

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2013-01-09:

#1

Not seeing the expected result. But logs need to be further verified.

The hosts were prepared as described in the bug report.

Full slave error log is here http://sprunge.us/RERT (ignore the 1062 error for now)

Few show slave status outputs: http://sprunge.us/MFNJ

Interesting to note:

a)

show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: Percona
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 75733104
               Relay_Log_File: archie-relay-bin.000004
                Relay_Log_Pos: 14733102
        Relay_Master_Log_File: mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 72653284
              Relay_Log_Space: 39013793
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 1
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 392
1 row in set (0.00 sec)

mysql>
mysql> show slave status\G
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)
ERROR:
Can't connect to the server

b)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 79107507, file name mysql-bin.000004
InnoDB: and relay log file
InnoDB: position 21187325, file name ./archie-relay-bin.000004
InnoDB: Last MySQL binlog file position 0 22958867, file name /var/lib/mysql/mysql-bin.000002

As can be seen, the position looks good.

@Nickolay, can you verify the above results? I crashed it twice so as to get more results.

Not seeing the expected result. But logs need to be further verified.

The hosts were prepared as  described in the bug report.

Full slave error log is here  http://sprunge.us/RERT  (ignore the 1062 error for now)

Few show slave status outputs:  http://sprunge.us/MFNJ

Interesting to note:

a)

show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Queueing master event to the relay log
                  Master_Host: Percona
                  Master_User: root
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000004
          Read_Master_Log_Pos: 75733104
               Relay_Log_File: archie-relay-bin.000004
                Relay_Log_Pos: 14733102
        Relay_Master_Log_File: mysql-bin.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 72653284
              Relay_Log_Space: 39013793
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 1
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 392
1 row in set (0.00 sec)

mysql>
mysql> show slave status\G
ERROR 2006 (HY000): MySQL server has gone away
No connection. Trying to reconnect...
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (111)
ERROR:
Can't connect to the server

b)

InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 79107507, file name mysql-bin.000004
InnoDB: and relay log file
InnoDB: position 21187325, file name ./archie-relay-bin.000004
InnoDB: Last MySQL binlog file position 0 22958867, file name /var/lib/mysql/mysql-bin.000002

As can be seen, the position looks good.

@Nickolay, can you verify the above results?  I crashed it twice so as to get more results.

Revision history for this message

Nickolay Ihalainen (ihanick) wrote on 2013-04-12:

#2

Download full text (3.7 KiB)

I have 3 Virtual machines:
c1:
[mysqld]
server_id=1
innodb_file_per_table
log-bin=mysql1_binlog

c2: slave of c1, donor for c3, future master of c3
[mysqld]
server_id=2
innodb_file_per_table
log-bin=mysql2_binlog
relay-log=mysql2_relaylog
innodb_recovery_update_relay_log=1
innodb_flush_log_at_trx_commit=0
log_slave_updates

c3: new server,
[mysqld]
server_id=3
innodb_file_per_table
log-bin=mysql3_binlog
relay-log=mysql3_relaylog
innodb_recovery_update_relay_log=1
innodb_flush_log_at_trx_commit=0

Steps to reproduce:
1) setup c1 with single innodb table
2) setup replication from c1(master) to c2(slave)
3) on c1 insert rows:
start while true ; do mysql test -e 'insert x values(1000);' ; done
4) c2, send kill -9 to mysqld_safe & mysqld
5) copy /var/lib/mysql from c2 to c3
6) close connections via tcp 3306 from c3 to c1
7) start mysql on c2
8) start mysql on c3
9) on c3: stop slave; change master (not working) ; reset slave;change master; start slave;
9) check on c3 if replication is up&running
10) on c3 kill -9 on mysqld pid
11) replication is not working now on c3, because it wont to use relay log names from c2.
SHOW SLAVE STATUS\G shows:
             Master_Log_File: mysql2_binlog.000015
          Read_Master_Log_Pos: 211757
               Relay_Log_File: mysql2_relaylog.000008
                Relay_Log_Pos: 839409
        Relay_Master_Log_File: mysql1_binlog.000002

InnoDB: Warning: innodb_overwrite_relay_log_info is enabled. Updates by other storage engines may not be synchronized.
InnoDB: relay-log.info is detected.
InnoDB: relay log: position 842094, file name ./mysql2_relaylog.000008
InnoDB: master log: position 2341964, file name mysql1_binlog.000002

InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 2339279, file name mysql1_binlog.000002
InnoDB: and relay log file
InnoDB: position 839409, file name ./mysql2_relaylog.000008
InnoDB: Last MySQL binlog file position 0 797205, file name ./mysql2_binlog.000014

              Master_Log_File: mysql1_binlog.000002
          Read_Master_Log_Pos: 2341964
               Relay_Log_File: mysql2_relaylog.000008
                Relay_Log_Pos: 839409
        Relay_Master_Log_File: mysql1_binlog.000002
             Slave_IO_Running: No
            Slave_SQL_Running: No

Can't run change master:
change master to master_host='192.168.2.148', master_user='repl', master_password='pass',master_log_file='mysql2_binlog.000014', master_log_pos=797205;
130412 21:40:26 [ERROR] Failed to open the relay log './mysql2_relaylog.000008' (relay_log_pos 839409)

reset slave;
now change master is working

kill -9 to mysqld on new server

130412 21:42:05 [Note] Plugin 'FEDERATED' is disabled.
InnoDB: Warning: innodb_overwrite_relay_log_info is enabled. Updates by other storage engines may not be synchronized.
InnoDB: relay-log.info is detected.
InnoDB: relay log: position 211907, file name ./mysql3_relaylog.000003
InnoDB: master log: position 211757, file name mysql2_binlog.000015

InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 2339279, file name mysql1_binlog.000002
InnoDB: and relay log f...

I have 3 Virtual machines:
c1:
[mysqld]
server_id=1
innodb_file_per_table
log-bin=mysql1_binlog

c2: slave of c1, donor for c3, future master of c3
[mysqld]
server_id=2
innodb_file_per_table
log-bin=mysql2_binlog
relay-log=mysql2_relaylog
innodb_recovery_update_relay_log=1
innodb_flush_log_at_trx_commit=0
log_slave_updates

c3: new server, 
[mysqld]
server_id=3
innodb_file_per_table
log-bin=mysql3_binlog
relay-log=mysql3_relaylog
innodb_recovery_update_relay_log=1
innodb_flush_log_at_trx_commit=0

Steps to reproduce:
1) setup c1 with single innodb table
2) setup replication from c1(master) to c2(slave)
3) on c1 insert rows:
start while true ; do mysql test -e 'insert x values(1000);' ; done 
4) c2, send kill -9 to mysqld_safe & mysqld
5) copy /var/lib/mysql from c2 to c3
6) close connections via tcp 3306 from c3 to c1
7) start mysql on c2
8) start mysql on c3
9) on c3: stop slave; change master (not working) ; reset slave;change master; start slave;
9) check on c3 if replication is up&running
10) on c3  kill -9 on mysqld pid
11) replication is not working now on c3, because it wont to use relay log names from c2.
SHOW SLAVE STATUS\G shows:
             Master_Log_File: mysql2_binlog.000015
          Read_Master_Log_Pos: 211757
               Relay_Log_File: mysql2_relaylog.000008
                Relay_Log_Pos: 839409
        Relay_Master_Log_File: mysql1_binlog.000002

InnoDB: Warning: innodb_overwrite_relay_log_info is enabled. Updates by other storage engines may not be synchronized.
InnoDB: relay-log.info is detected.
InnoDB: relay log: position 842094, file name ./mysql2_relaylog.000008
InnoDB: master log: position 2341964, file name mysql1_binlog.000002

InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 2339279, file name mysql1_binlog.000002
InnoDB: and relay log file
InnoDB: position 839409, file name ./mysql2_relaylog.000008
InnoDB: Last MySQL binlog file position 0 797205, file name ./mysql2_binlog.000014

Master_Log_File: mysql1_binlog.000002
          Read_Master_Log_Pos: 2341964
               Relay_Log_File: mysql2_relaylog.000008
                Relay_Log_Pos: 839409
        Relay_Master_Log_File: mysql1_binlog.000002
             Slave_IO_Running: No
            Slave_SQL_Running: No

Can't run change master:
change master to master_host='192.168.2.148', master_user='repl', master_password='pass',master_log_file='mysql2_binlog.000014', master_log_pos=797205;
130412 21:40:26 [ERROR] Failed to open the relay log './mysql2_relaylog.000008' (relay_log_pos 839409)

reset slave;
now change master is working

kill -9 to mysqld on new server

130412 21:42:05 [Note] Plugin 'FEDERATED' is disabled.
InnoDB: Warning: innodb_overwrite_relay_log_info is enabled. Updates by other storage engines may not be synchronized.
InnoDB: relay-log.info is detected.
InnoDB: relay log: position 211907, file name ./mysql3_relaylog.000003
InnoDB: master log: position 211757, file name mysql2_binlog.000015

InnoDB: Apply batch completed
InnoDB: In a MySQL replication slave the last master binlog file
InnoDB: position 2339279, file name mysql1_binlog.000002
InnoDB: and relay log file
InnoDB: position 839409, file name ./mysql2_relaylog.000008
InnoDB: Last MySQL binlog file position 0 797205, file name ./mysql2_binlog.000014
130412 21:42:05  InnoDB: Waiting for the background threads to start
130412 21:42:06 Percona XtraDB (http://www.percona.com) 5.5.30-rel30.2 started; log sequence number 8346274
InnoDB: relay-log.info was overwritten.
130412 21:42:06 [Note] Recovering after a crash using mysql3_binlog
130412 21:42:06 [Note] Starting crash recovery...
130412 21:42:06 [Note] Crash recovery finished.
130412 21:42:06 [ERROR] Failed to open the relay log './mysql2_relaylog.000008' (relay_log_pos 839409)

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2013-04-14:

#3

Nickolay -

What VM software and client OSes do you use?

Revision history for this message

Sergei Glushchenko (sergei.glushchenko) wrote on 2013-04-17:

#4

I am able to reproduce this following instructions provided by Nickolay.

Revision history for this message

Sergei Glushchenko (sergei.glushchenko) wrote on 2013-04-18:

#5

The workaround is to turn on log-slave-updates on c3 as well.

Sergei Glushchenko (sergei.glushchenko) on 2013-05-07

summary:

- crash resistant replication doesn't work correctly after change master
- or reset slave
+ crash-resistant replication doesn't work when InnoDB operates with
+ binary log disabled

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2013-05-13:

#6

Will port the testcase to 5.6.

Revision history for this message

David Smid (datasmid) wrote on 2013-07-05:

#7

Regression on Fedora 19 with 5.5.32-31.0

[12:23:00][Step 2/4] rpl.rpl_truncate_3innodb 'mix' [ pass ] 612
[12:23:00][Step 2/4] rpl.rpl_typeconv_innodb 'mix' [ skipped ] Doesn't support --binlog-format='mixed'
[12:23:15][Step 2/4] rpl.rpl_percona_bug1092593 'mix' [ fail ] Found warnings/errors in server log file!
[12:23:15][Step 2/4] Test ended at 2013-07-05 12:23:15
[12:23:15][Step 2/4] line
[12:23:15][Step 2/4] 130705 13:23:06 InnoDB: Error: table `mysqld.2`.`#sql2784_5e_1` does not exist in the InnoDB internal
[12:23:15][Step 2/4] 130705 13:23:06 InnoDB: Error: table `mysqld.2`.`#sql2784_e8_1` does not exist in the InnoDB internal
[12:23:15][Step 2/4] ^ Found warnings in /home/white_dev_01_d_nl_eng_tc_01/buildAgent/work/ae1dfd3f186721d8/debug/mysql-test/var/log/mysqld.2.err
[12:23:15][Step 2/4] ok
[12:23:15][Step 2/4]
[12:23:15][Step 2/4] - saving '/home/white_dev_01_d_nl_eng_tc_01/buildAgent/work/ae1dfd3f186721d8/debug/mysql-test/var/log/rpl.rpl_percona_bug1092593-mix/' to '/home/white_dev_01_d_nl_eng_tc_01/buildAgent/work/ae1dfd3f186721d8/debug/mysql-test/var/log/rpl.rpl_percona_bug1092593-mix/'
[12:23:15][Step 2/4] mysql-test-run: *** ERROR: Not all tests completed
[12:23:15][Step 2/4]
[12:23:15][Step 2/4] Only 341 of 2518 completed.

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2013-07-09:

#8

David -

Please open a new bug for this. Thanks!

Revision history for this message

Shahriyar Rzayev (rzayev-sehriyar) wrote on 2018-01-25:

#9

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-87

	Status	Importance	Assigned to	Milestone
Percona Server moved to https://jira.percona.com/projects/PS	Fix Released	Medium	Sergei Glushchenko	Percona Server moved to https://jira.percona.com/projects/PS 5.6.11-60.3
5.1	Fix Released	High	Sergei Glushchenko	Percona Server moved to https://jira.percona.com/projects/PS 5.1.69-14.7
5.5	Fix Released	High	Sergei Glushchenko	Percona Server moved to https://jira.percona.com/projects/PS 5.5.31-30.3
5.6	Fix Released	Medium	Sergei Glushchenko	Percona Server moved to https://jira.percona.com/projects/PS 5.6.11-60.3

Percona Server moved to https://jira.percona.com/projects/PS

crash-resistant replication doesn't work when InnoDB operates with binary log disabled

Bug Description

Related branches

Other bug subscribers

Remote bug watches