- as we discussed previously, the major problem with the current approach is that when log archiving is enabled the maximum checkpoint age is reduced by 1 log file size, which in the most common configuration with 2 log files means the available log space is reduced in half. but we will get to this issue later after implementing the XtraBackup part of the task; - another major problem that I discovered is that log archiving is not crash-safe: a crash may lead to 1) partially written archive files and 2) log archiving itself not being able to recover without user intervention; We should fix that, let's discuss the details later; Less significant problems: - no error handling in innobase_purge_archive_logs(). The function comment says "@return non-zero if error", but it always returns zero. The same applies to purge_archived_logs(), it should return an error code. Not having error handling in those functions is bad because it's currently impossible for a user to know after running PURGE ARCHIVED LOGS whether it was successful or not without looking at the error log. Instead, the PURGE statement itself should return an error to the client. - while you're fixing innobase_purge_archive_logs() you may want to replace strstr() with is_prefix() and strlen(IB_ARCHIVED_LOGS_PREFIX) with a constant, e.g. IB_ARCHIVED_LOGS_PREFIX_LEN. - as Laurynas wrote, the following change fixes a comment typo in the upstream comment. please revert it: > === modified file 'Percona-Server/sql/sql_parse.cc' > --- Percona-Server/sql/sql_parse.cc 2011-11-24 16:33:30 +0000 > +++ Percona-Server/sql/sql_parse.cc 2012-05-05 14:06:22 +0000 > @@ -2491,7 +2491,7 @@ mysql_execute_command(THD *thd) > > if (check_global_access(thd, SUPER_ACL)) > goto error; > - /* PURGE MASTER LOGS BEFORE 'data' */ > + /* PURGE MASTER LOGS BEFORE 'date' */ > it= (Item *)lex->value_list.head(); > if ((!it->fixed && it->fix_fields(lex->thd, &it)) || > it->check_cols(1)) > - another change in the upstream code for no apparent reasons. Note that it changes code inside #ifdef UNIV_LOG_ARCHIVE which is never (and in fact, cannot be) enabled in Percona Server: > @@ -852,7 +897,7 @@ log_init(void) > #ifdef UNIV_LOG_ARCHIVE > /* Under MySQL, log archiving is always off */ > log_sys->archiving_state = LOG_ARCH_OFF; > - log_sys->archived_lsn = log_sys->lsn; > + log_sys->archived_lsn = 0; > log_sys->next_archived_lsn = 0; > > log_sys->n_pending_archive_ios = 0; > that initialization should probably be outside of #ifdef UNIV_LOG_ARCHIVE? - I wonder if we a failure in os_file_rename_throttled() should result in a fatal error, i.e. an assertion failure. Currently, we do crash if a read or a write fail on a throttled cross-device copy. However, when rename() fails, we just return FALSE, which only results in a "Note:" message in the error log. Which looks inconsistent to me. E.g. if the destination directory does not exist, or we have insufficient privileges to write to it, logs will not be archived, so without looking in the error log one might incorrectly assume everything is OK with his backups. - when we have to do a cross-device file copy, that operation may potentially take a long time, especially when it's throttled. As that operation may also block the server on shutdown, I wonder if we should also have a diagnostic message in the error log when file copying (rather than renaming) is in progress. Something like "Archiving ib_logfileN to ib_log_archive_NNNN" and "Archiving ib_logfileN completed". I think it would be even better to abort throttled copying on shutdown and resume/restart it later on startup. - the 'err' variable in the following code from os_file_rename_throttled() is not used anywhere else, so you don't really need it and can just use "if (errno != EXDEV)"? > + err = (ulint) errno; > + > + if (err != EXDEV) { > + os_file_handle_error_no_exit(oldpath, "rename"); > + return(FALSE); > + } > - I think the usage of POSIX_FADV_DONTNEED is still wrong. XtraBackup uses it after each read/write operation, i.e. after reading/writing every 1 MB. In os_file_rename_throttled(), however, we only use it after copying a file, which might be insufficient for large log files. Suppose we have to copy a 2 GB log file, so we will end up thrashing the page case with them, and only tell the kernel we don't really need them _after_ copying all 2 GB. - it looks inconsistent that we use POSIX_FADV_DONTNEED and throttling for log file copying, but don't use them for new file creation. So when creating a new file the server may suffer from the same problems we try to avoid on file copy. - this part of purge_archive_logs() looks wrong: > + while(!os_file_readdir_next_file(log_xtra_log_archive_dir, dir, > + &fileinfo) ) { > + if (fileinfo.name != strstr(fileinfo.name, > + IB_ARCHIVED_LOGS_PREFIX)) { > + } > I think a continue statement is supposed to be there (or merge it with the next "if")? Please also use strncmp() instead of strstr(), as is_prefix() is not available outside of ha_innodb.cc - this message in purge_archived_logs() is not very informative: > + fprintf(stderr, > + "InnoDB: Note: read archived log last" > + " modification date for: %s.\n", > + archived_log_filename); > without looking into the code, it's impossible to tell whether if means we are going to purge a log file, or skip it. do we really need that message? - a minor grammar issue: s/was failed/failed/. I think it occurs in 3 different messages in the patch.