Code review comment for lp:~sergei.glushchenko/percona-server/ps56-univ-log-archive

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Sergei,

Introducing new fields to the on-disk data format doesn't look good.

It looks like we can just avoid using LOG_CHECKPOINT_ARCHIVED_FILE_NO
and LOG_CHECKPOINT_ARCHIVED_OFFSET altogether. Upstream InnoDB code
stores archived_file_no and archived_offset on disk, because it uses
file naming based on ordinal file numbers. So on a restart it has to get
the file number from the checkpoint header to figure out the file _name_
that should be used for further archive writes. It normally doesn't need
the offset, because on a clean restart or when log archiving is
re-enabled, it starts a new file (i.e. with a zero file offset). The
only case when the offset is used is when InnoDB recovers from a
crash. In which case it will proceed with writing to the incompletely
written file starting from archived_offset, so it can eventually write
the file full and mark it as completed.

Now we use LSN-based file naming, so we don't have that problem. With
the new archived_file_no semantics we have archived_lsn =
archived_file_no + archived_offset. We can still maintain them
internally for simplicity, just don't write them to disk.

We have to do more work on a restart though to avoid incompletely
written files in case of a crash:

1. Read archived_lsn from the checkpoint header.
2. Scan innodb_log_arch_dir and find the log file with the maximum LSN
   in its name
3. If no files are found, set archived_file_no to archived_lsn,
   archived_offset to 0 and return (i.e. start a new file)
4. If LOG_FILE_ARCH_COMPLETED in the found file is TRUE (i.e. we are restarting from a
   clean shutdown) do the same as in 3.
5. Assert that archived_lsn lies within the incompletely written file we
   have found (i.e. file_start_lsn + group->file_size > archived_lsn)
6. set archived_file_no to file_start_lsn, archived_offset to
   (archived_lsn - file_start_lsn) and return

Another option would be to give up on the LSN-based file names idea, but
it wouldn't be a complete solution, because as you noted, the offset
should still be a 64-bit value now when log files can be > 4G.

Which means there are actually 2 options: either extend the on-disk
format, or do more work on restart.

Other minor things I noticed when looking at the code:

- too long line:
: IB_ARCHIVED_LOGS_PREFIX, IB_ARCHIVED_LOGS_PREFIX_LEN)) {

- spaces instead of tabs here:

: if (fil_space_contains_node(group->archive_space_id,
: archived_log_filename) ||

review: Needs Fixing

« Back to merge proposal