Percona Server moved to https://jira.percona.com/projects/PS

Code review comment for lp:~laurynas-biveinis/percona-server/bug1012715-5.1

bug1012715-5.1
Merge into 5.1

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2012-07-04:

> On 07/03/2012 08:35 PM, Laurynas Biveinis wrote:
> => Namely, why is the assumption "slave position update is a part of
> actual transaction changes" wrong in the light of this? If a prepared
> transaction is rolled back, the old slave position is restored from an
> undo seg. If it is 2pc-committed, the new slave position becomes permanent.
> >
>
> Right, it didn't occur to me that updates of the trx header fields are
> done as a part of the current transaction. If they are rolled back when
> a prepared transaction is rolled back, then the fix is correct. Can we
> have it covered by the test case?

Good idea, I will work on this if the points below are agreed upon. I will probably use the explicit XA syntax.

> Though it's not clear then what the difference with the binlog position
> update is, i.e. what following means:
>
> > + /* Update the replication position info inside InnoDB. This
> is
> > + different from the binlog position update that happens during
> > + XA COMMIT. In contrast to that, the slave position is an

My understanding is that binlog position record in InnoDB means "InnoDB transactions are committed (not prepared) up to this binlog position." Thus it cannot possibly go back and in contrast to the slave info log, it is not part of the transaction itself, but rather InnoDB/binlog metadata of sorts. I briefly experimented with moving this to PREPARE too and broke crash recovery even worse. I can research to provide more info if you want me to.

> > Maybe you view this fix as incomplete or working by chance because of the
> following (which is the actual action sequence that happens in this bug): on
> crash recovery the relay status log overwrite will happen before the XA
> rollback, thus the slave position will point to as if the transaction was
> fully committed. Then the transaction will be rolled back, (which would
> require overwriting relay status log with the old position) and replayed from
> the binlog, (which would require overwriting the relay status log position
> again, but binlog does not have the required info for that). This results in
> correct positions, although with a shortcut taken. The assumption here that
> all replicated InnoDB XA prepared transactions will be eventually committed.
> It is not perfect but IMHO "slave position is a part of transaction itself" is
> a step to the right direction.
> >
>
> I don't understand this. When and how a replay from binlog of a rolled
> back XA transaction occurs? If I'm not mistaken, a roll back happens
> when the corresponding event is _not_ in the binary log (or binlog not
> used at all).

Right, sorry, memory failed me. The transaction is never rolledback, it sits there in prepared state on crash recovery and is committed during the binlog crash recovery. The fix still works as designed.

> On 07/03/2012 08:35 PM, Laurynas Biveinis wrote:
> => Namely, why is the assumption "slave position update is a part of
> actual transaction changes" wrong in the light of this? If a prepared
> transaction is rolled back, the old slave position is restored from an
> undo seg. If it is 2pc-committed, the new slave position becomes permanent.
> >
> 
> Right, it didn't occur to me that updates of the trx header fields are
> done as a part of the current transaction. If they are rolled back when
> a prepared transaction is rolled back, then the fix is correct. Can we
> have it covered by the test case?

Good idea, I will work on this if the points below are agreed upon. I will probably use the explicit XA syntax.

> Though it's not clear then what the difference with the binlog position
> update is, i.e. what following means:
> 
> > +             /* Update the replication position info inside InnoDB.  This
> is
> > +             different from the binlog position update that happens during
> > +             XA COMMIT.  In contrast to that, the slave position is an

> > Maybe you view this fix as incomplete or working by chance because of the
> following (which is the actual action sequence that happens in this bug): on
> crash recovery the relay status log overwrite will happen before the XA
> rollback, thus the slave position will point to as if the transaction was
> fully committed. Then the transaction will be rolled back, (which would
> require overwriting relay status log with the old position) and replayed from
> the binlog, (which would require overwriting the relay status log position
> again, but binlog does not have the required info for that). This results in
> correct positions, although with a shortcut taken. The assumption here that
> all replicated InnoDB XA prepared transactions will be eventually committed.
> It is not perfect but IMHO "slave position is a part of transaction itself" is
> a step to the right direction.
> >
> 
> I don't understand this. When and how a replay from binlog of a rolled
> back XA transaction occurs? If I'm not mistaken, a roll back happens
> when the corresponding event is _not_ in the binary log (or binlog not
> used at all).

« Back to merge proposal