Merge lp:~akopytov/percona-xtrabackup/bug1294782-2.1 into lp:percona-xtrabackup/2.1

Proposed by Alexey Kopytov
Status: Merged
Approved by: Alexey Kopytov
Approved revision: no longer in the source branch.
Merged at revision: 741
Proposed branch: lp:~akopytov/percona-xtrabackup/bug1294782-2.1
Merge into: lp:percona-xtrabackup/2.1
Diff against target: 45 lines (+5/-12)
2 files modified
innobackupex.pl (+4/-11)
src/xtrabackup.cc (+1/-1)
To merge this branch: bzr merge lp:~akopytov/percona-xtrabackup/bug1294782-2.1
Reviewer Review Type Date Requested Status
Alexey Kopytov (community) Approve
Review via email: mp+217728@code.launchpad.net

Description of the change

    Bug #1294782: Innobackupex hangs on fatal error in some cases

    The problem was that the code in both innobackupex and xtrabackup, that
    was supposed to make sure no child processes are left running in case
    innobackupex got killed or failed with an error, relied on the fact the
    SIGTERM and SIGINT signals were not blocked by the xtrabackup process
    (or any other child processes spawned by innobackupex).

    When innobackupex terminates gracefully, it calls kill_child_processes()
    to send SIGTERM to its children.

    There’s also (Linux-specific) auto-termination logic in xtrabackup for
    cases when innobackupex is killed or terminates ungracefully. It worked
    as follows: the xtrabackup process called prctl(PR_SET_PDEATHSIG,
    SIGINT) on start, so that SIGINT is sent automatically in case the parent
    process (innobackupex) dies.

    However, both SIGTERM and SIGINT might be blocked by the
    process that had invoked innobackupex, for example, by the PXC server
    processes doing an SST, in which case they were also blocked by the
    xtrabackup process, since the signal mask is inherited by child
    processes.

    Fixed by replacing SIGTERM in innobackupex and SIGINT in xtrabackup
    auto-termination with SIGKILL.

    Also removed some ancient code around FTWRL. It has been unnecessary for
    a long time, which has been reported in duplicate bug #1311120.

    No test case, as it is impossible in Bash to create a signal mask that
    would be inherited by child processes. Signal masks created by ‘trap’
    are only inherited by built-in commands according to the manual.

http://jenkins.percona.com/view/PXB 2.1/job/percona-xtrabackup-2.1-param/565/

To post a comment you must log in.
Revision history for this message
Alexey Kopytov (akopytov) :
review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'innobackupex.pl'
2--- innobackupex.pl 2014-03-01 13:29:34 +0000
3+++ innobackupex.pl 2014-04-30 09:33:56 +0000
4@@ -242,7 +242,7 @@
5 's' => ' ');
6
7 # signal that is sent to child processes when they are killed
8-my $kill_signal = 15;
9+my $kill_signal = 9;
10
11 # current local time
12 my $now;
13@@ -3376,16 +3376,9 @@
14 usleep(1);
15 }
16 }
17- if (compare_versions($mysql_server_version, '4.0.22') == 0
18- || compare_versions($mysql_server_version, '4.1.7') == 0) {
19- # MySQL server version is 4.0.22 or 4.1.7
20- mysql_query($con, "COMMIT");
21- mysql_query($con, "FLUSH TABLES WITH READ LOCK");
22- } else {
23- # MySQL server version is other than 4.0.22 or 4.1.7
24- mysql_query($con, "FLUSH TABLES WITH READ LOCK");
25- mysql_query($con, "COMMIT");
26- }
27+
28+ mysql_query($con, "FLUSH TABLES WITH READ LOCK");
29+
30 # stop query killer process
31 if ($option_kill_long_queries_timeout) {
32 stop_query_killer();
33
34=== modified file 'src/xtrabackup.cc'
35--- src/xtrabackup.cc 2014-04-27 11:31:56 +0000
36+++ src/xtrabackup.cc 2014-04-30 09:33:56 +0000
37@@ -5836,7 +5836,7 @@
38 /* Ensure xtrabackup process is killed when the parent one
39 (innobackupex) is terminated with an unhandled signal */
40
41- if (prctl(PR_SET_PDEATHSIG, SIGINT)) {
42+ if (prctl(PR_SET_PDEATHSIG, SIGKILL)) {
43 msg("prctl() failed with errno = %d\n", errno);
44 exit(EXIT_FAILURE);
45 }

Subscribers

People subscribed via source and target branches

to all changes: