Streaming issue with xbcrypt

Bug #1273196 reported by Raghavendra D Prabhu
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Percona XtraBackup moved to https://jira.percona.com/projects/PXB
Fix Released
High
George Ormond Lorch III
2.1
Fix Released
High
George Ormond Lorch III
2.2
Fix Released
High
George Ormond Lorch III

Bug Description

xbcrypt fails to parse packet (or so it seems).

Session 1:

mkdir /tmp/x1
dd if=/dev/urandom of=bigfile bs=1M count=100
cd /tmp/x1

xbstream -c bigfile | xbcrypt --encrypt-algo=AES256 --encrypt-key=6F3AD9F428143F133FD7D50D77D91EA4 | gzip | pv -f -i 10 -N donor -s 22806528 -L 2m | socat -u stdio TCP:127.0.0.1:21792

2014/01/27 16:30:47 socat[52746] E write(3, 0x1aec5f0, 8192): Broken pipe===========================================================================================================> ] 91% ETA 0:00:00

Session 2:
mkdir /tmp/x2
cd /tmp/x2
socat -u TCP-LISTEN:21792,reuseaddr stdio | pv -f -i 10 -N joiner | gzip -dc | xbcrypt --encrypt-algo=AES256 --encrypt-key=6F3AD9F428143F133FD7D50D77D91EA4 -d | xbstream -x -C .

xbcrypt:xb_crypt_read_chunk: unable to read chunk iv size at offset 0x1ef5f2e. ]
   joiner: 31.1MiB 0:00:16 [1.84MiB/s] [ <=> ]
2014/01/27 16:30:47 socat[52714] E write(1, 0xfe37f0, 8192): Broken pipe

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Another test yielded this: (I removed rate limiting in pv to see if that is causing it)

xbstream -c bigfile | xbcrypt --encrypt-algo=AES256 --encrypt-key=6F3AD9F428143F133FD7D50D77D91EA4 | gzip | pv -f -i 10 -N donor | socat -u stdio TCP:127.0.0.1:21792

2014/01/27 16:40:56 socat[55215] E write(3, 0x1bf55f0, 8192): Broken pipe ]
donor: 123MiB 0:00:16 [7.72MiB/s] [ <=>

Session 2:

socat -u TCP-LISTEN:21792,reuseaddr stdio | pv -f -i 10 -N joiner | gzip -dc | xbcrypt --encrypt-algo=AES256 --encrypt-key=6F3AD9F428143F133FD7D50D77D91EA4 -d | xbstream -x -C .
xbcrypt:xb_crypt_read_chunk: unable to read chunk header data at offset 0x7ba7ca3. ]
joiner: 123MiB 0:00:20 [5.99MiB/s] [ <=> ]
2014/01/27 16:40:56 socat[55169] E write(1, 0x130f7f0, 8192): Broken pipe

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

This issue manifests in other ways with xbcrypt itself as in here

https://launchpadlibrarian.net/163779506/%2Aview%2A.txt or https://launchpadlibrarian.net/163779661/%2Aview%2A.txt

[Reported here https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1272982 ]

Tags: pxc

Related branches

tags: added: pxc
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Removing xbcrypt from the above pipe chain allows the transfer to succeed. So, it is indeed an issue with xbcrypt.

description: updated
summary: - Issue with xbcrypt/xbstream
+ Streaming issue with xbcrypt
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Note, that the failure occurs during SST (hence with innobackupex in session 1) with this, the test case is a simplification of the process. The SST config is here https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1272982/comments/3

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

One more thing, even after xbcrypt fails with "xbcrypt:xb_crypt_read_chunk: unable to read chunk header data at offset 0x....", xbstream doesn't exit. (even though others - socat,gzip, pv etc. exit). Isn't this a bug as well? ie. does xbstream handle SIGPIPE well

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Xbstream will be looping around this

read(0, "", 327721) = 0
read(0, "", 327721) = 0
read(0, "", 327721) = 0
read(0, "", 327721) = 0

even after others have exited due to broken pipe.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

I added abort() in error handling path of xb_crypt_read_chunk.

Here is the backtrace:

Thread 1 (Thread 0x7f88c4bdf740 (LWP 19526)):
#0 0x00007f88c3370369 in raise () from /usr/lib/libc.so.6
No symbol table info available.
#1 0x00007f88c3371768 in abort () from /usr/lib/libc.so.6
No symbol table info available.
#2 0x00000000004040fc in xb_crypt_read_chunk (crypt=crypt@entry=0x2511ab0, buf=buf@entry=0x7fff43c7e800, olen=olen@entry=0x7fff43c7e820, elen=elen@entry=0x7fff43c7e810, iv=iv@entry=0x7fff43c7e830,
    ivlen=ivlen@entry=0x7fff43c7e840) at xbcrypt_read.c:244
        tmpbuf = "\020\000CRYP02\000\000\000\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\361\343\200\002"
        ptr = <optimized out>
        tmp = <optimized out>
        checksum = <optimized out>
        checksum_exp = <optimized out>
        version = <optimized out>
        bytesread = <optimized out>
        result = <optimized out>
        __FUNCTION__ = "xb_crypt_read_chunk"
#3 0x0000000000402f57 in mode_decrypt (fileout=1, filein=0) at xbcrypt.c:346
        xbcrypt_file = 0x2511ab0
        ivbuf = 0x0
        ivsize = 0
        cipher_handle = 0x25114b0
        chunkbuf = 0x0
        chunksize = 0
        originalsize = 0
        decryptbuf = 0x2521b20
        decryptbufsize = 65536
        ttlbytesread = 32441550
        ttlchunksread = 525
        result = <optimized out>
        gcry_error = <optimized out>
#4 main (argc=0, argv=0x7fff43c7ea00) at xbcrypt.c:248
        gcry_error = <optimized out>
        filein = 0
        fileout = 1

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

OK, a few things...
- First, the xbstream looping after failure is due to the MY_FULL_IO
- Second, the xbcrypt failures are due to NOT using either MY_FULL_IO or manually retrying reads when less than the full expected amount has been read. A simple fix to loop and retry and watch for 0 bytes read (to prevent failure looping like xbstream) until the ful expected amount has been read seems to fix. I'll post up a branch here shortly for you to do a source build of and try out.

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

OK Raghu, please try the fixes at lp:~gl-az/percona-xtrabackup/bug1273196-2.1 ... I will be starting jenkins and merge to 2.2, etc...

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

@George,

Quick smoke-test (of the testcase provided in description) reveals this bug to be fixed now.

I have also built from your branch for PXC jenkins tests - http://jenkins.percona.com/user/raghavendra-prabhu/my-views/view/PXC/job/percona-xtrabackup-2.1-binaries/ - this should further verify it.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Jenkins tests are also good about this.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXB-151

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.