maria:merge-10.11-MDEV-33668

Last commit made on 2024-04-17
Get this branch:
git clone -b merge-10.11-MDEV-33668 https://git.launchpad.net/maria

Branch information

Name:
merge-10.11-MDEV-33668
Repository:
lp:maria

Recent commits

e28b2c5... by Brandon Nesterenko

Restore thread existence check

9a215d6... by Andrei <email address hidden>

compilation fixed

ab85a00... by Kristian Nielsen

MDEV-33668: Refactor parallel replication round-robin scheduling to use explicit FIFO

This is a preparatory patch to facilitate the next commit to improve
the scheduling of XA transactions in parallel replication.

When choosing the scheduling bucket for the next event group in
rpl_parallel_entry::choose_thread(), use an explicit FIFO for the
round-robin selection instead of a simple cyclic counter i := (i+1) % N.

This makes it possible to schedule XA COMMIT/ROLLBACK dependencies explicitly
without changing the round-robin scheduling of other event groups.
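
To make the difference concrete, here is a minimal sketch contrasting a cyclic counter with an explicit FIFO. It is not the actual rpl_parallel_entry code: the class names, the pick() method, and the choice to move an explicitly picked slot to the tail are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical illustration only; not the MariaDB implementation.

// Round-robin with a simple cyclic counter: i := (i+1) % N.
class CyclicScheduler {
public:
  explicit CyclicScheduler(std::size_t n) : n_(n) { assert(n > 0); }
  std::size_t next() {
    std::size_t cur = i_;
    i_ = (i_ + 1) % n_;
    return cur;
  }
private:
  std::size_t n_;
  std::size_t i_ = 0;
};

// Round-robin with an explicit FIFO of slot indexes.
class FifoScheduler {
public:
  explicit FifoScheduler(std::size_t n) {
    assert(n > 0);
    for (std::size_t i = 0; i < n; ++i) fifo_.push_back(i);
  }
  // Normal case: take the slot at the head and re-queue it at the tail.
  std::size_t next() {
    std::size_t slot = fifo_.front();
    fifo_.pop_front();
    fifo_.push_back(slot);
    return slot;
  }
  // Explicit case (assumed behaviour): use a specific slot and move it to
  // the tail, leaving the relative order of all other slots untouched.
  std::size_t pick(std::size_t slot) {
    for (auto it = fifo_.begin(); it != fifo_.end(); ++it) {
      if (*it == slot) { fifo_.erase(it); break; }
    }
    fifo_.push_back(slot);
    return slot;
  }
private:
  std::deque<std::size_t> fifo_;
};
```

With a plain counter there is no natural way to "use slot 0 now" without disturbing the sequence; with the FIFO, an explicit pick only reorders the picked slot.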

Reviewed-by: Andrei Elkin <email address hidden>
Signed-off-by: Kristian Nielsen <email address hidden>

MDEV-33668: More precise dependency tracking of XA XID in parallel replication

Keep track of each recently active XID, recording which worker it was queued
on. If an XID might still be active, choose the same worker to queue event
groups that refer to the same XID to avoid conflicts.

Otherwise, schedule the XID freely in the next round-robin slot.

This way, XA PREPARE can normally be scheduled without restrictions (unless
duplicate XID transactions come close together). This improves scheduling
and parallelism over the old method, where the worker thread to schedule XA
PREPARE on was fixed based on a hash value of the XID.

XA COMMIT will normally be scheduled on the same worker as XA PREPARE, but
can be a different one if the XA PREPARE is far back in the event history.
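
The scheduling rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: XidTracker, its members, and the "window" criterion for deciding whether an XID might still be active (a fixed number of recently scheduled event groups) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical sketch; names and the "window" heuristic are assumptions.
class XidTracker {
public:
  XidTracker(std::size_t workers, std::uint64_t window)
    : workers_(workers), window_(window) { assert(workers > 0); }

  // Returns the worker on which to queue an event group referring to xid.
  std::size_t schedule(const std::string& xid) {
    const std::uint64_t now = ++seq_;
    std::size_t worker;
    auto it = recent_.find(xid);
    if (it != recent_.end() && now - it->second.seq <= window_) {
      // The XID might still be active: reuse its worker to avoid conflicts.
      worker = it->second.worker;
    } else {
      // Unknown or far back in history: take the next round-robin slot.
      worker = next_slot_;
      next_slot_ = (next_slot_ + 1) % workers_;
    }
    recent_[xid] = Entry{worker, now};
    return worker;
  }

private:
  struct Entry { std::size_t worker; std::uint64_t seq; };
  std::unordered_map<std::string, Entry> recent_;
  std::size_t workers_;
  std::size_t next_slot_ = 0;
  std::uint64_t window_;
  std::uint64_t seq_ = 0;
};
```

In this sketch an XA COMMIT that closely follows its XA PREPARE lands on the same worker, while one that arrives after the window has passed is scheduled like any other event group.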

Testcase and code for trimming dynamic array due to Andrei.

Reviewed-by: Andrei Elkin <email address hidden>
Signed-off-by: Kristian Nielsen <email address hidden>

9705d62... by THIRUNARAYANAN BALATHANDAYUTHAPANI

MDEV-33809 Bulk insert or DDL fails if a BLOB is too long

SyncFileIO should issue multiple read/write calls
when the length is greater than the os_file_request_size_max value.
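
As a rough sketch of the idea, a large request can be split into several calls, each no larger than a maximum request size. The helper name chunked_io and its callback signature are hypothetical and do not reflect the actual SyncFileIO interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical helper, not the InnoDB code. io(offset, len) performs one
// read or write and returns the number of bytes transferred (0 = stop).
std::size_t chunked_io(
    std::size_t total, std::size_t max_request,
    const std::function<std::size_t(std::size_t, std::size_t)>& io) {
  assert(max_request > 0);
  std::size_t done = 0;
  while (done < total) {
    // Never ask for more than the per-request maximum.
    const std::size_t len = std::min(total - done, max_request);
    const std::size_t n = io(done, len);
    if (n == 0) break;   // error or end of file: give up
    assert(n <= len);
    done += n;           // a short transfer simply continues the loop
  }
  return done;
}
```

The caller compares the return value with the requested total to detect an incomplete transfer.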

863f599... by THIRUNARAYANAN BALATHANDAYUTHAPANI

MDEV-33868 Assertion `trx->bulk_insert' failed in innodb_prepare_commit_versioned

- This issue is caused by commit 188c5da72a0057e4572b1d7e4efcd0c39332a839 (MDEV-32453).
InnoDB fails to end the bulk insert for the table after
applying the bulk insert operation. This leads to an assertion
failure during the commit process.

cac0fc9... by Jan Lindström <email address hidden>

MDEV-32974 : Member fails to join due to old seqno in GTID

Before MDEV-15158, wsrep xid information was stored in only one place:
in the TRX_SYS page. Starting with 10.3, it is not stored there but
in the rollback segment header pages, and the latest one is what
matters. MDEV-19229 allows the undo tablespaces to be rebuilt when
innodb_undo_tablespaces is changed on startup. Previously it was not
possible to change that parameter.

As a result of these changes, rollback segment header pages could
contain several wsrep xids. When undo tablespaces were rebuilt,
there was an attempt to restore the wsrep xid to a rollback
segment header page, but because there were several of them, the latest
wsrep xid was overwritten with an older one.

trx_rseg_read_wsrep_checkpoint
trx_rseg_init_wsrep_xid
 Return true if the xid that was read is a wsrep xid, false if not

trx_rseg_mem_restore
 Try to read the wsrep xid and, if it is found, copy it to
 trx_sys.recovered_wsrep_xid if the xid that was read has a larger
 seqno.
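
The "keep the larger seqno" rule can be illustrated with a small sketch. The RsegXid struct and latest_wsrep_seqno() are hypothetical stand-ins and are not the InnoDB data structures or functions named above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the xid found in one rollback segment header.
struct RsegXid {
  bool is_wsrep;        // true if the stored xid is a wsrep xid
  std::int64_t seqno;   // wsrep sequence number (meaningful if is_wsrep)
};

// Returns the largest wsrep seqno found, or -1 if no header has a wsrep xid.
std::int64_t latest_wsrep_seqno(const std::vector<RsegXid>& headers) {
  std::int64_t recovered = -1;
  for (const RsegXid& x : headers) {
    if (!x.is_wsrep) continue;          // ignore non-wsrep xids
    if (x.seqno > recovered) recovered = x.seqno;  // only ever move forward
  }
  return recovered;
}
```

Because the recovered value is replaced only by a larger seqno, the order in which the header pages are scanned no longer matters.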

5faf2fd... by Marko Mäkelä

MDEV-33585 fixup: GCC -Wsign-compare

d8a60dd... by Oleksandr "Sanja" Byelkin

Fix a typo which led to a compiler error on 32-bit systems

42bda68... by Marko Mäkelä

MDEV-33585 follow-up optimization

log_t: Define buf_size, max_buf_free as 32-bit and next_checkpoint_no
as byte (we only need a bit) and rearrange some data members,
so that on AMD64 we can fit log_sys.latch and log_sys.log in
the same 64-byte cache line.

mtr_t::commit_log(), mtr_t::commit_logger: A part of mtr_t::commit()
split into a separate function, so that we will not unnecessarily invoke
log_sys.get_write_target() when running on a memory-mapped log file,
or log_sys.is_pmem().

Reviewed by: Vladislav Vaintroub
Tested by: Matthias Leich

0892e6d... by Marko Mäkelä

MDEV-33585 The maximum innodb_log_buffer_size is too large

On Microsoft Windows, ReadFile() as well as WriteFile() limit the size
of the request to DWORD, which is 32 bits (at most 4 GiB - 1) also on
64-bit systems.

On FreeBSD, sysctl debug.iosize_max_clamp could limit the size of a
write request to INT_MAX. The size of a read request is always limited
to INT_MAX. This would allow the request size to be 4095 bytes more than
the Linux limit (0x7ffff000 according to "man 2 read" and "man 2 write").

On OpenBSD, Solaris and possibly NetBSD, the read request size is limited
to SSIZE_T_MAX, which would be half the current maximum
innodb_log_buffer_size. This should not be much of an issue anyway,
because on contemporary 64-bit platforms, the virtual addresses are
limited to 48 bits.

IBM AIX documentation mentions OFF_MAX which would apply when
a 64-bit application is running on a 32-bit kernel.

Let us declare innodb_log_buffer_size as 32-bit unsigned and make the
maximum 0x7ffff000, to be compatible with the least common
denominator (Linux).
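
The arithmetic behind these limits can be checked with a few compile-time assertions. The constant and function names below are made up for illustration; only the numeric values come from the text above.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical names; values taken from the description above.
constexpr std::uint32_t kLinuxIoMax = 0x7ffff000;  // Linux read/write limit
constexpr std::uint32_t kIntMax =
    static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
constexpr std::uint32_t kDwordMax =
    std::numeric_limits<std::uint32_t>::max();     // 4 GiB - 1

static_assert(kIntMax - kLinuxIoMax == 4095,
              "INT_MAX is 4095 bytes above the Linux limit");
static_assert(kLinuxIoMax < kIntMax && kIntMax < kDwordMax,
              "the Linux limit is the smallest of the three");

// Clamp a requested buffer size to the least common denominator.
constexpr std::uint32_t clamp_log_buffer_size(std::uint64_t requested) {
  return requested > kLinuxIoMax ? kLinuxIoMax
                                 : static_cast<std::uint32_t>(requested);
}
```

Any size that passes this clamp fits in a single request on each of the platforms whose limits are quoted as concrete numbers here.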

The maximum innodb_sort_buffer_size already was 64 MiB,
which is not a problem.

SyncFileIO::execute(): Assert that the size of a synchronous read or
write request is limited to the maximum.

Reviewed by: Vladislav Vaintroub