InnoDB log scanning failure results in SST failure

Bug #1375383 reported by David Bennett
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona XtraBackup moved to https://jira.percona.com/projects/PXB
Fix Released
Medium
Sergei Glushchenko
2.1
Won't Fix
Undecided
Unassigned
2.2
Fix Released
Medium
Sergei Glushchenko

Bug Description

Running automated tests using percona-xtradb-cluster-tests/t/xb_galera_sst_advanced-v2.sh with percona-xtradb-cluster-tests/conf/conf10.cnf-node1. 1 out of 12 SST tests failed due to failure during XtraBackup InnoDB log scanning.

===== Platform: Centos 6.5 - x86_64

  updated to 2014-09-23
  direct on hardware - Dell PowerEdge R200

===== Software Versions

  Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64.tar.gz
    (Jenkins debug build)

  percona-xtrabackup-2.2.5-5026-Linux-x86_64.tar.gz
    Built with -DCMAKE_CXX_FLAGS=-m64 -DCMAKE_C_FLAGS=-m64 -DWITH_DEBUG=1

====== node 1 (donor) configuration

# xbstream+nc+compress+progress+time+encrypt2+threads + thread pool
[mysqld]
core_file
thread_handling=pool-of-threads
extra_port=50002

[xtrabackup]
core_file
compress
compact
parallel=4
compress-threads=4

[sst]
streamfmt=xbstream
time=1
progress=1
encrypt=2
tca=/tmp/certs/node1.crt
tcert=/tmp/certs/node2.pem
use_extra=1

====== node 1 (donor) pertinent mysqld log excerpt

...
WSREP_SST: [INFO] Streaming with xbstream (20140928 03:09:24.179)
WSREP_SST: [INFO] Using socat as streamer (20140928 03:09:24.181)
WSREP_SST: [INFO] Using openssl based encryption with socat: with crt and pem (20140928 03:09:24.187)
WSREP_SST: [INFO] Encrypting with PEM /tmp/certs/node2.pem, CA: /tmp/certs/node1.crt (20140928 03:09:24.189)
WSREP_SST: [INFO] Using /tmp/xbtemp.fide8C/tmp.qZX0ykPqxZ as innobackupex temporary directory (20140928 03:09:24.424)
WSREP_SST: [INFO] SST through extra_port 50002 (20140928 03:09:24.436)
WSREP_SST: [INFO] Streaming GTID file before SST (20140928 03:09:24.438)
WSREP_SST: [INFO] Evaluating xbstream -c ${INFO_FILE} | socat -u stdio openssl-connect:127.0.0.1:26168,cert=/tmp/certs/node2.pem,cafile=/tmp/certs/node1.crt; RC=( ${PIPESTATUS[@]} ) (20140928 03:09:24.442)
WSREP_SST: [INFO] NOTE: donor-OpenSSL-Encrypted-2-gtid took 0 seconds (20140928 03:09:24.495)
WSREP_SST: [INFO] Sleeping before data transfer for SST (20140928 03:09:24.508)
2014-09-28 03:09:25 20391 [Note] WSREP: (576ea228, 'tcp://127.0.0.1:22821') turning message relay requesting off
WSREP_SST: [INFO] Streaming the backup to joiner at 127.0.0.1 26168 (20140928 03:09:34.511)
WSREP_SST: [INFO] Evaluating innobackupex --defaults-file=/home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var1/my.cnf --no-version-check $tmpopts $INNOEXTRA --galera-info --st
ream=$sfmt $itmpdir 2>${LOGDIR}/innobackup.backup.14092803091411888164.log | pv -f -i 10 -N donor -s 11649024 | socat -u stdio openssl-connect:127.0.0.1:26168,cert=/tmp/certs/node2.pem,cafile=/tmp/certs/node1.crt; RC=( ${PIPESTATUS[@]}
) (20140928 03:09:34.515)
    donor: 695kB 0:00:10 [69.1kB/s] [> ] 6% ETA 0:02:33^M donor: 695kB 0:00:20 [ 0B/s ] [> ] 6% ETA 0:05:07^M donor: 695kB 0:00:30 [ 0B/s ] [> ] 6% ETA 0:07
:41^M donor: 695kB 0:00:31 [22.3kB/s] [> ] 6%
WSREP_SST: [INFO] NOTE: donor-OpenSSL-Encrypted-2-SST took 31 seconds (20140928 03:10:05.752)
WSREP_SST: [ERROR] innobackupex finished with error: 1. Check /home/dbennett/logs/innobackup.backup.14092803091411888164.log (20140928 03:10:05.755)
WSREP_SST: [ERROR] Cleanup after exit with status:22 (20140928 03:10:05.757)
WSREP_SST: [INFO] Cleaning up temporary directories (20140928 03:10:05.760)
2014-09-28 03:10:05 20391 [ERROR] WSREP: Failed to read from: wsrep_sst_xtrabackup-v2 --role 'donor' --address '127.0.0.1:26168/xtrabackup_sst' --auth 'root:password' --socket '/tmp/xbtemp.fide8C/mysql.sock.ttXtu6' --datadir '/home/dbenne
tt/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var1/data/' --defaults-file '/home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/v
ar1/my.cnf' '' --gtid '576f4183-46de-11e4-a6c4-364d612c6151:7'
2014-09-28 03:10:05 20391 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'donor' --address '127.0.0.1:26168/xtrabackup_sst' --auth 'root:password' --socket '/tmp/xbtemp.fide8C/mysql.sock.ttXtu6' --datadir '/ho
me/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var1/data/' --defaults-file '/home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests
/var/w1/var1/my.cnf' '' --gtid '576f4183-46de-11e4-a6c4-364d612c6151:7': 22 (Invalid argument)
2014-09-28 03:10:05 20391 [ERROR] WSREP: Command did not run: wsrep_sst_xtrabackup-v2 --role 'donor' --address '127.0.0.1:26168/xtrabackup_sst' --auth 'root:password' --socket '/tmp/xbtemp.fide8C/mysql.sock.ttXtu6' --datadir '/home/dbenne
tt/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var1/data/' --defaults-file '/home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/v
ar1/my.cnf' '' --gtid '576f4183-46de-11e4-a6c4-364d612c6151:7'
2014-09-28 03:10:05 20391 [Warning] WSREP: 0.0 (c-p130-u1404-245.webazilla.com): State transfer to 1.0 (c-p130-u1404-245.webazilla.com) failed: -22 (Invalid argument)

====== node 1 (donor) pertinent xtrabackup log excerpt

xtrabackup: open files limit requested 0, set to 1024
xtrabackup: using the following InnoDB configuration:
xtrabackup: innodb_data_home_dir = ./
xtrabackup: innodb_data_file_path = ibdata1:10M:autoextend
xtrabackup: innodb_log_group_home_dir = ./
xtrabackup: innodb_log_files_in_group = 2
xtrabackup: innodb_log_file_size = 50331648
>> log scanned up to (2462007)
xtrabackup: Starting 4 threads for parallel data files transfer
[01] Compressing and streaming ./ibdata1
[03] Compressing and streaming ./mysql/slave_relay_log_info.ibd
[04] Compressing and streaming ./mysql/slave_master_info.ibd
[02] Compressing and streaming ./sbtest/sbtest1.ibd
[03] ...done
[04] ...done
[03] Compressing and streaming ./mysql/innodb_table_stats.ibd
[04] Compressing and streaming ./mysql/slave_worker_info.ibd
[04] ...done
[04] Compressing and streaming ./mysql/innodb_index_stats.ibd
[03] ...done
[04] ...done
[01] ...done
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
>> log scanned up to (2462007)
InnoDB: Tried to read 1048576 bytes at offset 6291456. Was only able to read 16384.
InnoDB: File (unknown): 'read' returned OS error 0. Cannot continue operation
innobackupex: Error: The xtrabackup child process has died at /home/dbennett/work/7/percona-xtrabackup-2.2.5-Linux-x86_64/bin/innobackupex line 2681.

====== node 2 (joiner) pertinent mysqld log excerpt

...
WSREP_SST: [INFO] Decompression with 4 threads (20140928 03:10:05.769)
WSREP_SST: [INFO] Evaluating find /home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var901/data/ -type f -name '*.qp' -printf '%p\n%h\n' | pv -f -s 12 -l -N Decompression | xargs -n 2 qpress -T4d (20140928 03:10:05.772)
Decompression: 12 0:00:00 [ 167k/s] [=====================>] 100%
WSREP_SST: [INFO] NOTE: Joiner-Decompression took 0 seconds (20140928 03:10:05.817)
WSREP_SST: [INFO] Removing qpress files after decompression (20140928 03:10:05.819)
WSREP_SST: [INFO] Preparing the backup at /home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var901/data/ (20140928 03:10:05.823)
WSREP_SST: [INFO] Evaluating innobackupex --no-version-check --apply-log $rebuildcmd ${DATA} &>${LOGDIR}/innobackup.prepare.14092803091411888163.log (20140928 03:10:05.826)
WSREP_SST: [ERROR] Cleanup after exit with status:1 (20140928 03:10:06.024)
WSREP_SST: [INFO] Removing the sst_in_progress file (20140928 03:10:06.026)
2014-09-28 03:10:06 20559 [ERROR] WSREP: Process completed with error: wsrep_sst_xtrabackup-v2 --role 'joiner' --address '127.0.0.1:26168' --auth 'root:password' --datadir '/home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var901/data/' --defaults-file '/home/dbennett/work/7/Percona-XtraDB-Cluster-5.6.20-25.7.893.Linux.x86_64/percona-xtradb-cluster-tests/var/w1/var901/my.cnf' --parent '20559' '' : 1 (Operation not permitted)
2014-09-28 03:10:06 20559 [ERROR] WSREP: Failed to read uuid:seqno from joiner script.
2014-09-28 03:10:06 20559 [ERROR] WSREP: SST failed: 1 (Operation not permitted)
2014-09-28 03:10:06 20559 [ERROR] Aborting

====== node 2 (joiner) pertinent xtrabackup log excerpt

...
xtrabackup: Error: cannot open ./xtrabackup_checkpoints
xtrabackup: error: xtrabackup_read_metadata()
xtrabackup: This target seems not to have correct metadata...
2014-09-28 03:10:06 7f8c0d376720 InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
xtrabackup: Warning: cannot open ./xtrabackup_logfile. will try to find.
2014-09-28 03:10:06 7f8c0d376720 InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
  xtrabackup: Fatal error: cannot find ./xtrabackup_logfile.
xtrabackup: Error: xtrabackup_init_temp_log() failed.
innobackupex: Error:
innobackupex: ibbackup failed at /home/dbennett/work/7/percona-xtrabackup-2.2.5-Linux-x86_64/bin/innobackupex line 2619

Related branches

Revision history for this message
Alexey Kopytov (akopytov) wrote :
Revision history for this message
David Bennett (dbpercona) wrote :

Following the comments, it looks like this was addressed in the MySQL 5.7 code base by implementing read/write retry logic:

  http://bazaar.launchpad.net/~mysql/mysql-server/5.7/revision/4334#storage/innobase/os/os0file.cc

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-707

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.