maria:bb-10.6-MDEV-27983

Last commit made on 2022-08-29
Get this branch:
git clone -b bb-10.6-MDEV-27983 https://git.launchpad.net/maria

Branch information

Name:
bb-10.6-MDEV-27983
Repository:
lp:maria

Recent commits

8ba5e85... by Marko Mäkelä

MDEV-27983: InnoDB hangs after loading a ROW_FORMAT=COMPRESSED page

If multiple threads invoke buf_page_get_low() on a ROW_FORMAT=COMPRESSED
page that does not reside in the buffer pool, then one of the threads
will end up acquiring an exclusive page latch (the "if" statement
right before the new wait_for_unzip: label) and other threads will
end up waiting for a shared latch while holding a buffer-fix.
The exclusive latch holder would then wait for the buffer-fixes to
be released while the buffer-fix holders are waiting for the shared latch.

buf_page_get_low(): Prevent the hang that was introduced
in commit 9436c778c3adba7c29dab5649668433d71e086f2 (MDEV-27058),
by releasing the buffer-fix, sleeping briefly, and retrying the
page lookup.
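
The release-and-retry idea can be illustrated with a small single-process sketch. This is not the InnoDB code: Block, try_unzip(), try_get_page(), get_page() and release_page() are hypothetical names, and the exclusive page latch is reduced to a simple "no buffer-fixes are held" condition. It only shows why a waiter that gives up its buffer-fix before sleeping no longer blocks the thread that must decompress the page.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for a buffer pool block; not InnoDB's buf_page_t.
struct Block {
  std::atomic<unsigned> fix_count{0};    // number of buffer-fixes held
  std::atomic<bool> unzip_pending{true}; // page still needs decompression
};

// Decompressing side: may only proceed while nobody holds a buffer-fix.
// Returns false if it would have to keep waiting.
bool try_unzip(Block &b) {
  if (b.fix_count.load() != 0) return false;
  b.unzip_pending.store(false);
  return true;
}

// One lookup attempt. On success the caller keeps its buffer-fix.
// If decompression is still pending, the buffer-fix is released again,
// so that this waiter does not block the decompressing side.
bool try_get_page(Block &b) {
  b.fix_count.fetch_add(1);
  if (!b.unzip_pending.load()) return true;
  b.fix_count.fetch_sub(1);
  return false;
}

// Retry loop: release the buffer-fix, sleep, look the page up again.
// Returns the number of retries that were needed.
unsigned get_page(Block &b) {
  unsigned retries = 0;
  while (!try_get_page(b)) {
    ++retries;
    std::this_thread::sleep_for(std::chrono::microseconds(100));
  }
  return retries;
}

// Give back a buffer-fix obtained from a successful lookup.
void release_page(Block &b) {
  assert(b.fix_count.load() > 0);
  b.fix_count.fetch_sub(1);
}
```

In this model, a waiter that kept its buffer-fix while waiting would make try_unzip() return false forever, which corresponds to the hang described above.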

0fbcb0a... by Marko Mäkelä

MDEV-29383 Assertion mysql_mutex_assert_owner(&log_sys.flush_order_mutex) failed in mtr_t::commit()

In commit 0b47c126e31cddda1e94588799599e138400bcf8 (MDEV-13542)
a few calls to mtr_t::memo_push() were moved before a write latch
on the page was acquired. This introduced a race condition:

1. is_block_dirtied() returned false to mtr_t::memo_push()
2. buf_page_t::write_complete() was executed, the block marked clean,
and a page latch released
3. The page latch was acquired by the caller of mtr_t::memo_push(),
and mtr_t::m_made_dirty was not set even though the block is in
a clean state.

The impact of this race condition is that crash recovery and backups
may fail.

btr_cur_latch_leaves(), btr_store_big_rec_extern_fields(),
btr_free_externally_stored_field(), trx_purge_free_segment():
Acquire the page latch before invoking mtr_t::memo_push().
This fixes the regression caused by MDEV-13542.
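
The ordering problem can be reproduced deterministically in a toy model. Everything below is a hypothetical simplification rather than the InnoDB implementation: Latch, Block, Mtr and write_complete() only mimic the roles of the page latch, the buffer block, the mini-transaction and the page write completion, and the "concurrent" write completion is injected through a callback at the point where the interleaving matters.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical minimal latch; try_lock() fails while the latch is held.
struct Latch {
  std::atomic<bool> held{false};
  bool try_lock() { return !held.exchange(true); }
  void unlock() { assert(held.load()); held.store(false); }
};

// Hypothetical block that starts out dirty (it has a pending write).
struct Block {
  Latch latch;
  bool dirty = true;
};

// Hypothetical mini-transaction: made_dirty must be set if the block
// is clean at the time this mini-transaction starts to modify it.
struct Mtr {
  bool made_dirty = false;
  void memo_push(const Block &b) { if (!b.dirty) made_dirty = true; }
};

// Page write completion: needs the latch and marks the block clean.
bool write_complete(Block &b) {
  if (!b.latch.try_lock()) return false;
  b.dirty = false;
  b.latch.unlock();
  return true;
}

// Racy order: inspect the block first, acquire the latch afterwards.
void push_then_latch(Mtr &m, Block &b, const std::function<void()> &gap) {
  m.memo_push(b);                  // sees a dirty block
  gap();                           // the write may complete here
  while (!b.latch.try_lock()) {}   // latch acquired too late
}

// Fixed order: acquire the latch first, then inspect the block.
void latch_then_push(Mtr &m, Block &b, const std::function<void()> &gap) {
  while (!b.latch.try_lock()) {}
  gap();                           // write completion is blocked by the latch
  m.memo_push(b);                  // sees the state it will actually modify
}
```

With push_then_latch() the block can end up clean while made_dirty is still false; with latch_then_push() the flag always matches the block state observed under the latch.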

Side note: It would suffice to set mtr_t::m_made_dirty at the time
we set the MTR_MEMO_MODIFY flag for a block. Currently that flag is
unnecessarily set if a mini-transaction acquires a page latch on
a page that is in a clean state, and will not actually modify the block.
This may cause unnecessary acquisitions of log_sys.flush_order_mutex
on mtr_t::commit().

mtr_t::free(): If the block had been exclusively latched in this
mini-transaction, set the m_made_dirty flag so that the flush order mutex
will be acquired during mtr_t::commit(). This should have been part of
commit 4179f93d28035ea2798cb1c16feeaaef87ab4775 (MDEV-18976).
It was necessary to change mtr_t::free() so that
WriteOPT_PAGE_CHECKSUM::operator() would be able to avoid writing
checksums for freed pages.

76bb671... by Marko Mäkelä

Merge 10.5 into 10.6

9929301... by Marko Mäkelä

Merge 10.4 into 10.5

851058a... by Marko Mäkelä

Merge 10.3 into 10.4

d1a80c4... by Marko Mäkelä

MDEV-29384 Hangs caused by innodb_adaptive_hash_index=ON

buf_defer_drop_ahi(): Remove. Ever since
commit c7f8cfc9e733517cff4aaa6f6eaca625a3afc098 (MDEV-27700)
it is safe to invoke btr_search_drop_page_hash_index(block, true)
to remove an orphan adaptive hash index.

Any attempt to upgrade page latches is prone to deadlocks. Recently,
we observed a few hangs that involved nothing more than a small table
consisting of one clustered index page, one secondary index page and
some undo pages.

2f6a728... by Sergei Golubchik

update a global_suppressions() list

followup for "remove invalid options from warning messages"

8ff1096... by Vlad Lesin

MDEV-29081 trx_t::lock.was_chosen_as_deadlock_victim race in lock_wait_end()

The issue is that trx_t::lock.was_chosen_as_deadlock_victim can be reset
before the transaction checks it and sets trx_t::error_state.

The fix is to reset trx_t::lock.was_chosen_as_deadlock_victim only in
trx_t::commit_in_memory(), which is invoked on full rollback. There is
also no need for a separate bit in
trx_t::lock.was_chosen_as_deadlock_victim to flag that the transaction was
chosen as a victim of Galera conflict resolution; the same variable can be
used for both cases, except in debug builds. Debug builds need to
distinguish deadlock victims from Galera abort victims for debug checks.
Also, there is no need to check for deadlock in lock_table_enqueue_waiting()
for Galera, because the corresponding check is present in lock_wait().

The local variable "error_state" in lock_wait() was replaced with
trx->error_state, because before this change
lock_sys_t::cancel<false>(trx, lock) and lock_sys.deadlock_check() could
change trx->error_state, which could then be overwritten with the value
of the local "error_state" variable.

The lock_wait_suspend_thread_enter DEBUG_SYNC point name was misleading,
because lock_wait_suspend_thread was eliminated in e71e613. The sync point
was renamed to lock_wait_start.
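
The hazard of keeping a local copy of error_state, as described above for lock_wait(), can be shown with a short sketch. The types and functions here (Err, Trx, callee_sets_deadlock(), wait_with_local_copy(), wait_in_place()) are hypothetical and only mirror the shape of the problem; they are not the InnoDB definitions.

```cpp
// Hypothetical error codes and transaction object.
enum class Err { success, lock_wait, deadlock };

struct Trx {
  Err error_state = Err::success;
};

// Stand-in for a callee that may update the transaction's error state,
// in the way the commit message describes for the deadlock check.
void callee_sets_deadlock(Trx &trx) { trx.error_state = Err::deadlock; }

// Old pattern: work on a local copy and write it back at the end.
// The value that the callee stored in trx.error_state is lost.
Err wait_with_local_copy(Trx &trx) {
  Err error_state = Err::lock_wait;  // local working copy
  callee_sets_deadlock(trx);         // updates trx.error_state
  trx.error_state = error_state;     // overwrites the callee's result
  return trx.error_state;
}

// New pattern: operate on trx.error_state directly, so that an update
// made by the callee stays visible.
Err wait_in_place(Trx &trx) {
  trx.error_state = Err::lock_wait;
  callee_sets_deadlock(trx);
  return trx.error_state;
}
```

In this sketch wait_with_local_copy() reports lock_wait even though the callee flagged a deadlock, while wait_in_place() reports the deadlock.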

Reviewed by: Marko Mäkelä, Jan Lindström.

f2a53b6... by Marko Mäkelä

btr_search_drop_page_hash_index(): Remove a racy debug check

a3fd9e6... by Vladislav Vaintroub

MDEV-29367 Refactor tpool::cache

Removed the use of std::vector's push_back() and pop_back() to make it
more obvious that memory in the vectors won't be reallocated.

Also, "borrowed" elements can now be debugged a little better, because
they are placed at the start of the m_cache vector.
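
A fixed-size cache of this kind might look roughly as follows. This is a sketch under assumptions, not the tpool::cache source: the class name FixedCache, the members m_base and m_pos, the borrowed() accessor, and returning nullptr from get() when nothing is free are all illustrative choices, and locking is omitted to keep the example short. Only the m_cache name is taken from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size object cache. Both vectors are sized once in
// the constructor and never grow, so element addresses remain stable.
template <typename T>
class FixedCache {
  std::vector<T> m_base;    // storage for all elements
  std::vector<T *> m_cache; // slots [m_pos, size) hold the free elements
  std::size_t m_pos = 0;    // number of elements currently borrowed

public:
  explicit FixedCache(std::size_t n) : m_base(n), m_cache(n) {
    for (std::size_t i = 0; i < n; i++) m_cache[i] = &m_base[i];
  }

  // Borrow an element, or return nullptr if none is free.
  // Slots below m_pos keep the pointer most recently handed out from
  // them, which makes borrowed elements visible in a debugger.
  T *get() {
    if (m_pos == m_cache.size()) return nullptr;
    return m_cache[m_pos++];
  }

  // Return a previously borrowed element by indexed assignment;
  // no push_back() or pop_back() is involved.
  void put(T *element) {
    assert(m_pos > 0);
    m_cache[--m_pos] = element;
  }

  std::size_t borrowed() const { return m_pos; }
};
```

Because get() and put() only move an index and assign into existing slots, it is evident from the code that neither vector can reallocate after construction.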