maria:10.6-MDEV-31949-gtid_prepare_fail_paths

Last commit made on 2023-10-18
Get this branch:
git clone -b 10.6-MDEV-31949-gtid_prepare_fail_paths https://git.launchpad.net/maria

Branch merges

Branch information

Name:
10.6-MDEV-31949-gtid_prepare_fail_paths
Repository:
lp:maria

Recent commits

acb9c9e... by Brandon Nesterenko

MDEV-31949: rpl_xa_prepare_gtid_fail deterministic paths

happy_xac is where the XA COMMIT completes before noticing the
error signalled by the prior XAP

sad_xac is where the XA COMMIT notices the error signalled by
the prior XAC and rolls back, leaving a dangling XAP.

bd37485... by Brandon Nesterenko

MDEV-31949: Fix rpl_xa_prepare_gtid_fail

rpl.rpl_xa_prepare_gtid_fail would sporadically fail because an XA COMMIT
running concurrently with a failed prepare could either complete
successfully or be rolled back due to the failure of the prior commit. The
test expected the completion case; in the rollback case, the XA transaction
would be left in a prepared state with a lock on its table. This then
caused a lock time-out during test cleanup, when the test tries to drop that
table.

The fix is to extend the test to check if the transaction is still
prepared, and silently commit it if so, thus releasing the locks.
The non-determinism is fine (i.e. DEBUG_SYNC isn't needed to force
one path), as the verification performed by the test has already
completed. This is just for cleanup.

96bd9e6... by Andrei <email address hidden>

MDEV-31949 parallel slave xa Round-Robin distribution

XA-Prepare group of events

  XA START xid
  ...
  XA END xid
  XA PREPARE xid

and its XA-"complete" terminator

  XA COMMIT or
  XA ROLLBACK

are now distributed Round-Robin across the slave parallel workers.
The former hash-based policy was shown to contribute to execution
latency by creating a big queue of binlog-ordered transactions
to commit, many times larger than the size of the worker pool.

Acronyms and notations used below:

  XAP := XA-Prepare event or the whole prepared XA group of events
  XAC := XA-"complete", which is a solitary group of events
  |W| := the size of the slave worker pool
  Subscripts like `_k' denote order in a corresponding sequence
     (e.g binlog file).

KEY CHANGES:

The parallel slave
------------------
driver thread now maintains a list of XAPs currently
in processing. Its purpose is to avoid "wild" parallel execution of XAs
with duplicate xids (unlikely, but that's the user's right).
The list is arranged as a sliding window of size 2*|W| to account
for the possibility of XAP_k -> XAP_k+2|W|-1, the largest (in the
group-of-events count sense) dependency.
Say k=1 and |W|, the number of workers, is 4. As transactions are distributed
Round-Robin, it's possible to have T^*_1 -> T^*_8 as the largest
dependency ('*' marks the dependents) at runtime.
This can be seen from the worker queues, as in the picture below.
Let the worker queues Q_i develop downward:

  Q1 ... Q4
  1^* 2 3 4
  5 6 7 8^*

Worker #1 is assigned T_1 and T_5.
Worker #4 can take on its T_8 while T_1 is still at the
beginning of its processing, even before the XA START of that XAP.
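
The arithmetic behind the picture can be sketched as follows. This is a
minimal illustration only: worker_for(), xap_window_size() and
farthest_dependent() are hypothetical helper names, not functions from the
server code, and transactions and workers are assumed to be numbered from 1.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers for illustration; not part of the MariaDB sources.
// Transactions are numbered from 1 in binlog order, workers from 1 to |W|.

// Round-Robin assignment: T_1 -> worker 1, T_2 -> worker 2, ..., wrapping.
std::size_t worker_for(std::size_t trx_no, std::size_t pool_size) {
    assert(trx_no >= 1 && pool_size >= 1);
    return (trx_no - 1) % pool_size + 1;
}

// Size of the sliding window of in-flight XAPs described above: 2*|W|.
std::size_t xap_window_size(std::size_t pool_size) {
    return 2 * pool_size;
}

// Index of the farthest dependent covered by the window:
// T_k -> T_{k + 2|W| - 1}.
std::size_t farthest_dependent(std::size_t trx_no, std::size_t pool_size) {
    return trx_no + xap_window_size(pool_size) - 1;
}
```

With |W| = 4, worker_for() puts T_1 and T_5 on worker 1 and T_8 on worker 4,
and farthest_dependent(1, 4) gives 8, matching T^*_1 -> T^*_8.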

XA related
----------
XID_cache_element is extended with two pointers to resolve
two types of dependencies: the duplicate xid XAP_k -> XAP_k+i
and the ordinary completion on the prepare XAP_k -> XAC_k+j.
The former is handled by a wait-for-xid protocol conducted by
xid_cache_delete() and xid_cache_insert_maybe_wait().
The latter is done analogously by xid_cache_search_maybe_wait() and
slave_applier_reset_xa_trans().
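
A wait-for-xid protocol of this general shape can be sketched with a mutex
and a condition variable. This is an assumption-laden model of the
duplicate-xid case only: XidRegistry and its methods are hypothetical, and the
real XID cache and XID_cache_element are organized differently.

```cpp
#include <condition_variable>
#include <mutex>
#include <set>
#include <string>

// Hypothetical model of a wait-for-xid protocol; not the server's XID cache.
class XidRegistry {
public:
    // Blocks while another transaction still holds the same xid,
    // then claims the xid for the caller.
    void insert_maybe_wait(const std::string& xid) {
        std::unique_lock<std::mutex> guard(mutex_);
        released_.wait(guard, [&] { return active_.count(xid) == 0; });
        active_.insert(xid);
    }

    // Releases the xid and wakes up every waiter so it can re-check.
    void remove(const std::string& xid) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            active_.erase(xid);
        }
        released_.notify_all();
    }

    // Reports whether the xid is currently claimed.
    bool is_active(const std::string& xid) {
        std::lock_guard<std::mutex> guard(mutex_);
        return active_.count(xid) != 0;
    }

private:
    std::mutex mutex_;
    std::condition_variable released_;
    std::set<std::string> active_;
};
```

In this model a second XAP with the same xid would sit in
insert_maybe_wait() until the first one calls remove().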

XA-"complete" is allowed to go forward before its XAP parent
has released the xid (all recovery concerns are covered in MDEV-21496,
MDEV-21777).
Yet the XAC is going to wait for it at a critical
point of execution, which is when it "completes" the work in the Engine.

CAVEAT: storage/innobase/trx/trx0undo.cc changes are due to the possibly
        fixed MDEV-32144.
        TODO: to be verified.

Thanks to Brandon Nesterenko at mariadb.com for initial review and
a lot of creative efforts to advance with this work!

ee5cadd... by THIRUNARAYANAN BALATHANDAYUTHAPANI

MDEV-28122 Optimize table crash while applying online log

- InnoDB fails to check the overflow buffer while applying
the operation to the table that was rebuilt. This is caused
by commit 3cef4f8f0fc88ae5bfae4603d8d600ec84cc70a9 (MDEV-515).

cca9547... by Monty <email address hidden>

Post fix for MDEV-32449

1c55445... by Monty <email address hidden>

MDEV-32449 Server crashes in Alter_info::add_stat_drop_index upon CREATE TABLE

Fixed missing initialization of Alter_info()

This could cause crashes in some CREATE TABLE-like scenarios
where some generated indexes were automatically dropped.

I also added a test that we do not try to drop from index_stats for
temporary tables.

ec277a7... by Monty <email address hidden>

Do not create histograms for single column unique key

The intention was always to not create histograms for single value
unique keys (as histograms are not useful in this case), but because of
a bug in the code this was still done.

The changes in the test cases were mainly because hist_size is now NULL
for these kinds of columns.

18fa00a... by Vlad Lesin

MDEV-32272 lock_release_on_prepare_try() does not release lock if supremum bit is set along with other bits set in lock's bitmap

The error is caused by MDEV-30165 fix with the following commit:
d13a57ae8181f2a8fbee86838d5476740e050d50

There is a logical error in lock_release_on_prepare_try():

        if (supremum_bit)
          lock_rec_unlock_supremum(*cell, lock);
        else
          lock_rec_dequeue_from_page(lock, false);

There can be other bits set in the lock's bitmap, and the lock
type can meet the release criteria, but the above logic
releases only the supremum bit of the lock.

The fix is to release the lock if it meets the release criteria, and
otherwise to unlock the supremum if the supremum is locked.
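
The difference can be illustrated with a toy model. Everything here is an
assumption made for illustration: RecLock, kSupremumBit, the releasable flag
and both functions are hypothetical and do not reproduce InnoDB's lock_t or
the actual patch.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical model of a record lock; not InnoDB's lock_t.
constexpr std::size_t kSupremumBit = 1;  // assumed position of the supremum bit

struct RecLock {
    std::bitset<8> bitmap;  // one bit per locked record on the page
    bool releasable;        // whether the lock type meets the release criteria
};

// The quoted branch in isolation: if the supremum bit is set, only that
// bit is cleared and every other bit of the lock stays locked.
void release_before_fix(RecLock& lock) {
    if (lock.bitmap.test(kSupremumBit))
        lock.bitmap.reset(kSupremumBit);
    else
        lock.bitmap.reset();  // models dequeuing the whole lock
}

// The fix as described: release the whole lock if it qualifies,
// otherwise unlock only the supremum if it is locked.
void release_after_fix(RecLock& lock) {
    if (lock.releasable)
        lock.bitmap.reset();
    else if (lock.bitmap.test(kSupremumBit))
        lock.bitmap.reset(kSupremumBit);
}
```

For a releasable lock with the supremum bit and one more bit set, the first
function leaves the extra bit locked, while the second clears the bitmap.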

There is also a test for the case, which was reported by the QA team. I
placed it in a separate file, because it requires a debug build.

Reviewed by: Marko Mäkelä

cbad0bc... by THIRUNARAYANAN BALATHANDAYUTHAPANI

MDEV-31098 InnoDB Recovery doesn't display encryption message when no encryption configuration passed

- InnoDB fails to report the error when encryption configuration
wasn't passed. This patch addresses the issue by adding
the error while loading the tablespace and deferring the
tablespace creation.

8bf17c5... by Monty <email address hidden>

MDEV-32388 MSAN / Valgrind errors in Item_func_like::get_mm_leaf upon query from partitioned table

The problem was that RANGE_OPT_PARAM was not completely initialized in
some cases.
Added bzero() to ensure that all elements are always initialized.
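
The idea of zeroing the whole structure up front can be sketched as below.
RangeParamSketch and its members are hypothetical stand-ins, not the real
RANGE_OPT_PARAM, and std::memset is used here as the portable equivalent of
bzero().

```cpp
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for a parameter struct; not the real RANGE_OPT_PARAM.
struct RangeParamSketch {
    void* table;
    unsigned keys;
    bool using_real_indexes;
};

// Zero every byte so that no member is left uninitialized,
// regardless of which members a given code path sets later.
void init_param(RangeParamSketch& param) {
    static_assert(std::is_trivially_copyable<RangeParamSketch>::value,
                  "memset is only safe on trivially copyable types");
    std::memset(&param, 0, sizeof(param));
}
```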