maria:ib_fix_plugin_init

Last commit made on 2022-04-27
Get this branch:
git clone -b ib_fix_plugin_init https://git.launchpad.net/maria

Branch information

Name:
ib_fix_plugin_init
Repository:
lp:maria

Recent commits

a915d31... by Nikita Malyavin

fix ha_innobase plugin initialization race with purge table open

Now that purge opens a table very early, at the undo node parsing stage,
the open can happen before plugin initialization ends.
This causes an ASAN global-buffer-overflow and a segfault.

The problem is that the `resolve_sysvars` function, which normalizes table
options (used in parse_engine_table_options during table open), is called
after purge starts (srv_start() call in innodb_init()).

To solve this, an additional resolve_sysvars call is added before srv_start
in innodb_init.
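
The ordering problem can be illustrated with a toy model. This is a sketch
only, not the actual InnoDB code: ToyPlugin, resolve_options, open_table and
start_background are hypothetical names. resolve_options stands in for
resolve_sysvars, and start_background stands in for srv_start(), which may
open a table right away.

```cpp
#include <cassert>

// Toy model only: hypothetical names, not the real ha_innobase API.
struct ToyPlugin {
    bool options_resolved = false;          // set once option defaults are normalized
    bool opened_with_valid_options = false; // result of the most recent table open

    // Stand-in for resolve_sysvars(): normalizes table option defaults.
    void resolve_options() { options_resolved = true; }

    // Stand-in for a table open performed by purge.
    void open_table() { opened_with_valid_options = options_resolved; }

    // Stand-in for srv_start(): background work may open a table immediately.
    void start_background() { open_table(); }
};

// Order before the fix: background work starts before options are resolved.
bool init_racy() {
    ToyPlugin p;
    p.start_background();
    p.resolve_options();
    return p.opened_with_valid_options;
}

// Order after the fix: options are resolved before background work starts.
bool init_fixed() {
    ToyPlugin p;
    p.resolve_options();
    p.start_background();
    return p.opened_with_valid_options;
}
```

In this model init_racy() reports an open that saw unresolved options, while
init_fixed() does not.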

93ddc2b... by Nikita Malyavin

MDEV-26951 gcol.innodb_virtual_basic fails

In this commit, the ultimate solution is given: no more error-prone retries
in the depths of purge!

All the mdl acquisitions now happen in row_purge_parse_undo_rec().

Rationale:
* innodb_acquire_mdl also reopens the table data dictionary.
* It reacquires it and never releases it.
* It can open to a different table pointer. Therefore, the data dictionary
 should be updated at the caller side.
* Either we should release the data dictionary opened in innodb_acquire_mdl
 and reopen in row_purge_parse_undo_rec, or keep it, but propagate the
 changed pointer. If not, then SEGFAULT, data corruption, or an unreleased
 table instance are possible.

This commit chooses the first way -- keep the data dictionary opened.
It is technically hard to propagate table changes from any
innobase_allocate_row_for_vcol call, so it is decided to have the table
properly opened by the moment of allocation. Besides, this can also stand
as a crude port of 10.5 behavior.

innodb_find_table_for_vc: innodb_acquire_mdl call is removed. THDVAR check
  is now only reasonable for an assertion
row_purge_parse_undo_rec: acquire mdl (and open a mariadb table) for any
  table that has vcol indexes. For the rest of the tables, dict_sys.latch
  is acquired.
dict_table_t::vc_templ: the allocations are supposed to be done under
  dict_sys.mutex protection. The NULL check should also be atomic with the
  allocation (see switch (type) in row_purge_parse_undo_rec for example)
innobase_create_v_templ: is introduced as a shortcut for an allocation.
innodb_acquire_mdl: table_name_parse is now protected by dict_sys.latch.
  table->release() is also made under MDL or latch. Rationale:
    failing assertion `ctx0->old_table->get_ref_count() == 1` in
    ha_innobase::commit_inplace_alter_table.
  After maria_table open, dict_sys.latch is acquired.
    This is also to protect table_name_parse after table open.
    We can't open it only when mariadb_table fails to open,
    because the tables could've been renamed in any combination while
    latches are released, and therefore MDL wouldn't protect this table's
    data dictionary (because it can be a different table now).
purge_vcol_info_t: everything is removed except table.

5ffb9de... by Nikita Malyavin

ib_purge: add random sleeps under special debug mode

c519aa3... by Jan Lindström

MDEV-24143 : Galera nodes "randomly" crashing in Item_func_release_lock::val_int

Fixed in MDEV-27713. Added an additional test case.

507030c... by mkaruza <email address hidden>

MDEV-27713 Crash after a conflict of applier thread with stored procedure call by event scheduler

When a thread is BF aborted by a high priority service, ULL (user level
locks) need to be removed and released. Directly releasing a lock of the
MDL_EXPLICIT type does not also clear `thd->ull_hash`. The
`mysql_ull_cleanup` method properly clears all information about ULL locks
for the thread.
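
Why releasing the lock alone is not enough can be shown with a toy model.
This is a sketch only, not the server code: ToySession, held_locks,
ull_registry, release_lock_only and cleanup_all are hypothetical names, with
ull_registry standing in for `thd->ull_hash` and cleanup_all standing in for
`mysql_ull_cleanup`.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Toy model only: hypothetical names, not the real THD / MDL API.
struct ToySession {
    std::unordered_set<std::string> held_locks;         // stand-in for explicit MDL locks
    std::unordered_map<std::string, int> ull_registry;  // stand-in for thd->ull_hash

    // Take a user level lock: both structures are updated together.
    void get_lock(const std::string& name) {
        held_locks.insert(name);
        ++ull_registry[name];
    }

    // Releases the lock only; the registry keeps a stale entry.
    void release_lock_only(const std::string& name) { held_locks.erase(name); }

    // Stand-in for mysql_ull_cleanup: releases every lock and clears the registry.
    void cleanup_all() {
        held_locks.clear();
        ull_registry.clear();
    }

    // The session is consistent when every registry entry refers to a held lock.
    bool consistent() const {
        for (const auto& entry : ull_registry)
            if (held_locks.count(entry.first) == 0) return false;
        return true;
    }
};
```

After release_lock_only the registry still names a lock that is no longer
held, which is the kind of stale state a later release could trip over;
cleanup_all leaves both structures empty.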

Reviewed-by: Jan Lindström <email address hidden>

304f75c... by mkaruza <email address hidden>

MDEV-27568 Parallel async replication hangs on a Galera node

Using parallel slave applying can cause a deadlock between DDL and
other events. A GTID with a lower seqno can be blocked in galera when the node
has entered TOI mode, but a DDL GTID with a higher seqno can be blocked
before the previous GTIDs are applied locally.

The fix is to check prior commits before entering TOI.

Reviewed-by: Jan Lindström <email address hidden>

c63eab2... by Daniele Sciascia <email address hidden>

MDEV-28055: Galera ps-protocol fixes

* Fix test galera.MW-44 to make it work with --ps-protocol
* Skip test galera.MW-328C under --ps-protocol. This test
  relies on wsrep_retry_autocommit, which has no effect
  under ps-protocol.
* Return WSREP related errors on COM_STMT_PREPARE commands
  Change wsrep_command_no_result() to allow sending back errors
  when a statement is prepared. For example, to handle deadlock
  error due to BF aborted transaction during prepare.
* Add sync waiting before statement prepare
  When a statement is prepared, tables used in the statement may be
  opened and checked for existence. Because of that, some tests (for
  example galera_create_table_as_select) that CREATE a table on one node
  and then SELECT from the same table on another node may fail with errors
  because the table does not exist there yet.
  To make tests behave similarly under normal and PS protocol, we add a
  call to sync wait before preparing statements that would sync wait
  during normal execution.

Reviewed-by: Jan Lindström <email address hidden>

39ed400... by Daniele Sciascia <email address hidden>

Fixup for MDEV-27553

Update wsrep-lib, which contains a fixup introduced with MDEV-27553.
Also, adapt the corresponding test: after an apply failure on ROLLBACK,
the node will disconnect from the cluster.

Reviewed-by: Jan Lindström <email address hidden>

97582f1... by sjaakola <email address hidden>

MDEV-27649 PS conflict handling causing node crash

Handle BF abort for prepared statement execution so that EXECUTE processing continues
until parameter setup is complete, before BF abort bails out of the statement execution.

The THD class has a new boolean member, wsrep_delayed_BF_abort, which is set if a BF abort is observed
in do_command() right after reading the client's packet, and if the client has sent a PS execute command.
In that case, the deadlock error is not returned to the client immediately; the PS execution
is started instead. The PS execution loop now checks whether wsrep_delayed_BF_abort is set, and
stops the PS execution after the type information has been assigned for the PS.
With this, the PS protocol type information, which is present in the first PS EXECUTE command, is not lost
even if the first PS EXECUTE command was marked to abort.
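
The delayed-abort idea can be sketched with a toy model. This is not the server code:
ToyStatement, ToyConnection, ExecResult and execute are hypothetical names, and delayed_bf_abort
stands in for wsrep_delayed_BF_abort. In this model the parameter types arrive with the first EXECUTE
and are recorded before the pending abort is honored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model only: hypothetical names, not the real THD / prepared statement API.
struct ToyStatement {
    std::vector<std::string> param_types;  // empty until an EXECUTE supplies the types
};

enum class ExecResult { Ok, Deadlock };

struct ToyConnection {
    bool delayed_bf_abort = false;  // stand-in for wsrep_delayed_BF_abort

    ExecResult execute(ToyStatement& stmt,
                       const std::vector<std::string>& types_in_packet) {
        // Record the type information first, so it survives an aborted EXECUTE.
        if (stmt.param_types.empty()) stmt.param_types = types_in_packet;

        // Only now honor the pending abort and report the deadlock.
        if (delayed_bf_abort) {
            delayed_bf_abort = false;
            return ExecResult::Deadlock;
        }
        return ExecResult::Ok;
    }
};
```

With the flag set, the first execute reports a deadlock but the statement keeps its parameter types,
so a second execute that carries no type information can still proceed.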

Reviewed-by: Jan Lindström <email address hidden>

8e9e1c3... by sjaakola <email address hidden>

MDEV-27649 Crash with PS execute after BF abort

This commit contains a test for reproducing the issue in MDEV-27649,
where a transaction executing a prepared statement is BF aborted.
The scenario in MDEV-27649 has a transaction which has prepared a PS,
but not yet executed it, and this transaction is then BF aborted in this state.
When the BF aborted transaction tries to execute the PS, it will receive a deadlock error.
But when it tries to execute the PS a second time, the node crashes.

The mtr test galera.galera_bf_abort_ps_bind exercises this scenario.

However, the mtr test platform does not have a mechanism to control the execution of a PS in the required detail.
For this purpose, mysqltest.cc was extended to contain 4 new commands:
PS_prepare - to prepare a prepared statement
PS_bind - to bind values for parameters for the PS
PS_execute - to execute the PS
PS_close - to close the PS

The support for controlling prepared statements in mtr scripts is quite minimal
in this commit. Limitations are:
* only one PS can be used by a connection, at a time
* only input parameters can be bound for the PS
* only varchar, integer or float type of parameters can be bound

added the result

fixes

Reviewed-by: Jan Lindström <email address hidden>