Merge lp:~akopytov/percona-xtrabackup/BT27412-bug1055989-2.0 into lp:percona-xtrabackup/2.0

Proposed by Alexey Kopytov
Status: Merged
Approved by: Alexey Kopytov
Approved revision: no longer in the source branch.
Merged at revision: 478
Proposed branch: lp:~akopytov/percona-xtrabackup/BT27412-bug1055989-2.0
Merge into: lp:percona-xtrabackup/2.0
Diff against target: 202 lines (+99/-28)
2 files modified
innobackupex (+52/-6)
src/xtrabackup.c (+47/-22)
To merge this branch: bzr merge lp:~akopytov/percona-xtrabackup/BT27412-bug1055989-2.0
Reviewer Review Type Date Requested Status
Alexey Kopytov (community) Approve
Laurynas Biveinis (community) Needs Fixing
Review via email: mp+134413@code.launchpad.net

Description of the change

    Bug #1055989: innobackupex waits for streaming temporary log file to
                  finish before unlocking tables

    The problem was that in XtraBackup 2.0 streaming xtrabackup_logfile from
    a temporary file was performed in the xtrabackup binary and
    innobackupex waited for the xtrabackup process to exit before unlocking
    the tables. Streaming the log file might take a long time, while
    unnecessarily keeping the server locked with FTWRL. In fact, releasing
    the lock can be performed as soon as the xtrabackup binary stops
    duplicating the redo log.

    Fixed by:

    - splitting resume_ibbackup() into two functions. resume_ibbackup()
      itself now resumes the xtrabackup process and then just waits for the
      log copying to stop rather than process termination as previously. The
      new function wait_for_ibbackup_finish() now waits for process
      termination and obtains its exit status. This function is called after
      releasing the FTWRL lock.

    - implementing a new synchronization point between innobackupex and
      xtrabackup using the same mechanism (the xtrabackup_suspended file)
      which is used for synchronization before starting FTWRL: when
      innobackupex wants to wait until log copying finishes, it removes
      xtrabackup_suspended and waits until it is created again by
      xtrabackup. So xtrabackup creates the file to signal innobackupex when
      log copying is complete and it is now safe to release the FTWRL lock.

To post a comment you must log in.
Revision history for this message
Alexey Kopytov (akopytov) wrote :
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Alexey -

The patch is OK with one point: the second use of xtrabackup_suspended file has nothing to do with xtrabackup being suspended when it's created. Also I have a feature in works that suspends/resumes XB at a different point. I think it's a good idea to have unique file names for all the "sync points": avoid "wires-crossed" sync bugs (it's hard to imagine how one could happen currently, but who what else we will develop before we merge IB and XB :), help with troubleshooting if XB crashed and there's a directory listing only, etc.

review: Needs Fixing
Revision history for this message
Alexey Kopytov (akopytov) wrote :

Laurynas,

Thanks for the review. I agree there are currently no problems caused by reusing xtrabackup_suspended file, though the name looks a bit confusing. I could rename it to xtrabackup_lock or something, but I try to do small incremental changes, and not change something unless I absolutely have to.

It does the job for this specific fix. In case we need more fine-grained locking (before merging innobackupex and xtrabackup), we may consider adding more lock files in the future.

Setting to approved, as there are no comments other than those about the lock file name.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'innobackupex'
--- innobackupex 2012-11-07 06:44:45 +0000
+++ innobackupex 2012-11-15 09:01:29 +0000
@@ -425,12 +425,14 @@
425 # (or finalize the backup by syncing changes if using rsync)425 # (or finalize the backup by syncing changes if using rsync)
426 backup_files(0);426 backup_files(0);
427427
428 # resume ibbackup and wait till it has finished428 # resume ibbackup and wait till log copying is finished
429 my $ibbackup_exit_code = resume_ibbackup();429 resume_ibbackup();
430430
431 # release read locks on all tables431 # release read locks on all tables
432 mysql_unlockall() if !$option_no_lock;432 mysql_unlockall() if !$option_no_lock;
433433
434 my $ibbackup_exit_code = wait_for_ibbackup_finish();
435
434 if ( $option_safe_slave_backup && $sql_thread_started) {436 if ( $option_safe_slave_backup && $sql_thread_started) {
435 print STDERR "$prefix: Starting slave SQL thread\n";437 print STDERR "$prefix: Starting slave SQL thread\n";
436 mysql_send('START SLAVE SQL_THREAD;');438 mysql_send('START SLAVE SQL_THREAD;');
@@ -921,12 +923,56 @@
921923
922924
923#925#
924# resume_ibbackup subroutine signals ibbackup to complete its execution926# resume_ibbackup subroutine signals ibbackup to finish log copying by deleting
925# by deleting the 'ibbackup_suspended' file.927# the 'xtrabackup_suspended' file and waits until log copying is stopped,
928# i.e. until 'xtrabackup_suspended' is created again.
929#
930# If this functions detects that the xtrabackup process has terminated, it also
931# sets ibbackup_exit_code and resets ibbackup_pid to avoid trying to reap a
932# non-existing process later
926#933#
927sub resume_ibbackup {934sub resume_ibbackup {
928 print STDERR "$prefix Resuming ibbackup\n\n";935 my $pid = -1;
929 unlink $suspend_file || Die "Failed to delete '$suspend_file': $!";936
937 $now = current_time();
938 print STDERR "$now $prefix Waiting for log copying to finish\n\n";
939 unlink $suspend_file || Die "Failed to delete '$suspend_file': $!";
940
941 for (;;) {
942 # check if the xtrabackup process is still alive _before_ checking if
943 # the file exists to avoid a race condition when the file is created and
944 # the process terminates right after we do the file check
945 $pid = waitpid($ibbackup_pid, &WNOHANG);
946
947 last if -e $suspend_file;
948
949 if ($ibbackup_pid == $pid) {
950 # The file doesn't exist, but the process has terminated
951 Die "ibbackup child process has died";
952 }
953
954 sleep 1;
955 }
956
957 if ($pid == $ibbackup_pid) {
958 $ibbackup_exit_code = $CHILD_ERROR >> 8;
959 $ibbackup_pid = '';
960 }
961
962 unlink $suspend_file || Die "Failed to delete '$suspend_file': $!";
963}
964
965#
966# wait for ibbackup to finish and return its exit code
967#
968sub wait_for_ibbackup_finish {
969 if (!$ibbackup_pid) {
970 # The process has already been reaped.
971 return $ibbackup_exit_code;
972 }
973
974 $now = current_time();
975 print STDERR "$now $prefix Waiting for ibbackup (pid=$ibbackup_pid) to finish\n";
930976
931 # wait for ibbackup to finish977 # wait for ibbackup to finish
932 waitpid($ibbackup_pid, 0);978 waitpid($ibbackup_pid, 0);
933979
=== modified file 'src/xtrabackup.c'
--- src/xtrabackup.c 2012-11-08 23:42:03 +0000
+++ src/xtrabackup.c 2012-11-15 09:01:29 +0000
@@ -4195,6 +4195,32 @@
4195 return(TRUE);4195 return(TRUE);
4196}4196}
41974197
4198/***********************************************************************
4199Create an empty file with a given path and close it.
4200@return TRUE on succees, FALSE on error. */
4201static ibool
4202xb_create_suspend_file(const char *path)
4203{
4204 ibool success;
4205 os_file_t suspend_file = XB_FILE_UNDEFINED;
4206
4207 /* xb_file_create reads srv_unix_file_flush_method */
4208 suspend_file = xb_file_create(path, OS_FILE_CREATE,
4209 OS_FILE_NORMAL, OS_DATA_FILE,
4210 &success);
4211
4212 if (success && suspend_file != XB_FILE_UNDEFINED) {
4213
4214 os_file_close(suspend_file);
4215
4216 return(TRUE);
4217 }
4218
4219 msg("xtrabackup: Error: failed to create file '%s'\n", path);
4220
4221 return(FALSE);
4222}
4223
4198static void4224static void
4199xtrabackup_backup_func(void)4225xtrabackup_backup_func(void)
4200{4226{
@@ -4204,6 +4230,8 @@
4204 datasink_t *ds;4230 datasink_t *ds;
4205 ds_ctxt_t *ds_ctxt = NULL;4231 ds_ctxt_t *ds_ctxt = NULL;
4206 ds_ctxt_t *meta_ds_ctxt;4232 ds_ctxt_t *meta_ds_ctxt;
4233 char suspend_path[FN_REFLEN];
4234 ibool success;
42074235
4208#ifdef USE_POSIX_FADVISE4236#ifdef USE_POSIX_FADVISE
4209 msg("xtrabackup: uses posix_fadvise().\n");4237 msg("xtrabackup: uses posix_fadvise().\n");
@@ -4648,36 +4676,25 @@
46484676
4649 /* suspend-at-end */4677 /* suspend-at-end */
4650 if (xtrabackup_suspend_at_end) {4678 if (xtrabackup_suspend_at_end) {
4651 os_file_t suspend_file = XB_FILE_UNDEFINED;4679 ibool exists;
4652 char suspend_path[FN_REFLEN];
4653 ibool success, exists;
4654 os_file_type_t type;4680 os_file_type_t type;
46554681
4656 sprintf(suspend_path, "%s%s", xtrabackup_target_dir,4682 sprintf(suspend_path, "%s%s", xtrabackup_target_dir,
4657 "/xtrabackup_suspended");4683 "/xtrabackup_suspended");
4658
4659 srv_normalize_path_for_win(suspend_path);4684 srv_normalize_path_for_win(suspend_path);
4660 /* xb_file_create reads srv_unix_file_flush_method */
4661 suspend_file = xb_file_create(suspend_path, OS_FILE_OVERWRITE,
4662 OS_FILE_NORMAL, OS_DATA_FILE,
4663 &success);
46644685
4665 if (!success) {4686 if (!xb_create_suspend_file(suspend_path)) {
4666 msg("xtrabackup: Error: failed to create file "4687 exit(EXIT_FAILURE);
4667 "'xtrabackup_suspended'\n");
4668 }4688 }
46694689
4670 if (suspend_file != XB_FILE_UNDEFINED)
4671 os_file_close(suspend_file);
4672
4673 exists = TRUE;4690 exists = TRUE;
4674 while (exists) {4691 while (exists) {
4675 os_thread_sleep(200000); /*0.2 sec*/4692 os_thread_sleep(200000); /*0.2 sec*/
4676 success = os_file_status(suspend_path, &exists, &type);4693 success = os_file_status(suspend_path, &exists, &type);
4677 if (!success)4694 /* success == FALSE if file exists, but stat() failed.
4678 break;4695 os_file_status() prints an error message in this case */
4696 ut_a(success);
4679 }4697 }
4680 xtrabackup_suspend_at_end = FALSE; /* suspend is 1 time */
4681 }4698 }
46824699
4683 /* Reinit the datasink */4700 /* Reinit the datasink */
@@ -4724,6 +4741,19 @@
4724 }4741 }
4725 msg("\n");4742 msg("\n");
47264743
4744 if (!log_copying_succeed) {
4745 msg("xtrabackup: Error: log_copying_thread failed.\n");
4746 exit(EXIT_FAILURE);
4747 }
4748
4749 /* Signal innobackupex that log copying has stopped and it may now
4750 unlock tables, so we can possibly stream xtrabackup_logfile later
4751 without holding the lock. */
4752 if (xtrabackup_suspend_at_end &&
4753 !xb_create_suspend_file(suspend_path)) {
4754 exit(EXIT_FAILURE);
4755 }
4756
4727 if(!xtrabackup_incremental) {4757 if(!xtrabackup_incremental) {
4728 strcpy(metadata_type, "full-backuped");4758 strcpy(metadata_type, "full-backuped");
4729 metadata_from_lsn = ut_dulint_zero;4759 metadata_from_lsn = ut_dulint_zero;
@@ -4753,11 +4783,6 @@
47534783
4754 }4784 }
47554785
4756 if (!log_copying_succeed) {
4757 msg("xtrabackup: Error: log_copying_thread failed.\n");
4758 exit(EXIT_FAILURE);
4759 }
4760
4761 /* Stream the transaction log from the temporary file */4786 /* Stream the transaction log from the temporary file */
4762 if (!xtrabackup_log_only && xtrabackup_stream &&4787 if (!xtrabackup_log_only && xtrabackup_stream &&
4763 xtrabackup_stream_temp_logfile(dst_log_fd, ds_ctxt)) {4788 xtrabackup_stream_temp_logfile(dst_log_fd, ds_ctxt)) {

Subscribers

People subscribed via source and target branches