Merge lp:~akopytov/percona-xtrabackup/bug1170806-2.0 into lp:percona-xtrabackup/2.0

Proposed by Alexey Kopytov
Status: Merged
Approved by: Laurynas Biveinis
Approved revision: no longer in the source branch.
Merged at revision: 541
Proposed branch: lp:~akopytov/percona-xtrabackup/bug1170806-2.0
Merge into: lp:percona-xtrabackup/2.0
Diff against target: 52 lines (+11/-12)
1 file modified
src/xtrabackup.cc (+11/-12)
To merge this branch: bzr merge lp:~akopytov/percona-xtrabackup/bug1170806-2.0
Reviewer Review Type Date Requested Status
Laurynas Biveinis (community) Approve
George Ormond Lorch III (community) g2 Approve
Review via email: mp+160660@code.launchpad.net

Description of the change

    Bug #1170806: innobackupex continuing with FTWRL even when xtrabackup
                  failed

    The problem was that when the log copying thread failed, it did not
    terminate the xtrabackup process immediately. Instead only the log
    copying thread was terminated. The main xtrabackup thread would later
    check if the 'log_copying_succeed' flag is FALSE, and terminated the
    process, but that happened later, after xtrabackup was done with copying
    InnoDB files and signaled innobackupex to lock tables and proceed with
    copying non-InnoDB files.

    The fix is to terminate the entire xtrabackup process immediately on log
    copying failure. There is no reasons to delay checking its status
    and fail much later.

http://jenkins.percona.com/view/XtraBackup/job/percona-xtrabackup-2.0-param/418/

To post a comment you must log in.
Revision history for this message
George Ormond Lorch III (gl-az) wrote :

Looks good except for one more general question. There seems to be a lot of exit(EXIT_FAILURE) which bypasses all resource cleanup operations. Basic OS resources (memory, open files, etc) are not a big deal as they will be cleaned up on exit, but we do use temp files and some external libraries/tools that may also have some temporary resources left behind if not de-initialized/shut down cleanly. Should we consider a new blueprint to implement a better cleanup on error?

review: Approve (g2)
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) :
review: Approve
Revision history for this message
Alexey Kopytov (akopytov) wrote :

Hi George,

On Wed, 24 Apr 2013 16:33:24 -0000, George Ormond Lorch III wrote:
> Review: Approve g2
>
> Looks good except for one more general question. There seems to be a lot of exit(EXIT_FAILURE) which bypasses all resource cleanup operations. Basic OS resources (memory, open files, etc) are not a big deal as they will be cleaned up on exit, but we do use temp files and some external libraries/tools that may also have some temporary resources left behind if not de-initialized/shut down cleanly. Should we consider a new blueprint to implement a better cleanup on error?
>

So files is the only concern. All temporary files should be properly
created so that they are removed automatically on process termination
(bug #1172016 was a result of a typo). And I'm not sure we should put
any effort into cleaning up non-temporary files (basically, data files).
We don't remove backup data on a failed backup currently. Even if we do
that on exit(), they will still be left behind on a crash.

Revision history for this message
George Ormond Lorch III (gl-az) wrote :

Yeah, I agree, I was only concerned with the 'hidden' stuff such as temps in non-obvious locations. I agree that any resulting real data is up to the user to clean.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'src/xtrabackup.cc'
--- src/xtrabackup.cc 2013-04-24 12:44:51 +0000
+++ src/xtrabackup.cc 2013-04-24 14:21:37 +0000
@@ -1355,7 +1355,6 @@
1355lsn_t log_copy_scanned_lsn;1355lsn_t log_copy_scanned_lsn;
1356ibool log_copying = TRUE;1356ibool log_copying = TRUE;
1357ibool log_copying_running = FALSE;1357ibool log_copying_running = FALSE;
1358ibool log_copying_succeed = FALSE;
13591358
1360ibool xtrabackup_logfile_is_renamed = FALSE;1359ibool xtrabackup_logfile_is_renamed = FALSE;
13611360
@@ -4542,17 +4541,22 @@
45424541
4543 counter++;4542 counter++;
4544 if(counter >= SLEEPING_PERIOD * 5) {4543 if(counter >= SLEEPING_PERIOD * 5) {
4545 if(xtrabackup_copy_logfile(log_copy_scanned_lsn, FALSE))4544
4546 goto end;4545 if(xtrabackup_copy_logfile(log_copy_scanned_lsn,
4546 FALSE)) {
4547 exit(EXIT_FAILURE);
4548 }
4549
4547 counter = 0;4550 counter = 0;
4548 }4551 }
4549 }4552 }
45504553
4551 /* last copying */4554 /* last copying */
4552 if(xtrabackup_copy_logfile(log_copy_scanned_lsn, TRUE))4555 if(xtrabackup_copy_logfile(log_copy_scanned_lsn, TRUE)) {
4553 goto end;4556
45544557 exit(EXIT_FAILURE);
4555 log_copying_succeed = TRUE;4558 }
4559
4556end:4560end:
4557 log_copying_running = FALSE;4561 log_copying_running = FALSE;
4558 my_thread_end();4562 my_thread_end();
@@ -5613,11 +5617,6 @@
5613 }5617 }
5614 msg("\n");5618 msg("\n");
56155619
5616 if (!log_copying_succeed) {
5617 msg("xtrabackup: Error: log_copying_thread failed.\n");
5618 exit(EXIT_FAILURE);
5619 }
5620
5621 /* Signal innobackupex that log copying has stopped and it may now5620 /* Signal innobackupex that log copying has stopped and it may now
5622 unlock tables, so we can possibly stream xtrabackup_logfile later5621 unlock tables, so we can possibly stream xtrabackup_logfile later
5623 without holding the lock. */5622 without holding the lock. */

Subscribers

People subscribed via source and target branches