maria:st-10.5-MDEV-10962-deadlock-deletes

Last commit made on 2023-07-06
Get this branch:
git clone -b st-10.5-MDEV-10962-deadlock-deletes https://git.launchpad.net/maria

Branch merges

Branch information

Name:
st-10.5-MDEV-10962-deadlock-deletes
Repository:
lp:maria

Recent commits

c040313... by Vlad Lesin

MDEV-10962 Deadlock with 3 concurrent DELETEs by unique key

PROBLEM:
A deadlock was possible when a transaction tried to "upgrade" an already
held Record Lock to Next Key Lock.

SOLUTION:
This patch is based on observations that:
(1) a Next Key Lock is equivalent to Record Lock combined with Gap Lock
(2) a GAP Lock never has to wait for any other lock
In case we request a Next Key Lock, we check if we already own a Record
Lock of equal or stronger mode, and if so, then we change the requested
lock type to GAP Lock, which we either already have, or can be granted
immediately, as GAP locks don't conflict with any other lock types.
(We don't consider Insert Intention Locks a Gap Lock in above statements).

The reason of why we don't upgrage Record Lock to Next Key Lock is the
following.

Imagine a transaction which does something like this:

for each row {
    request lock in LOCK_X|LOCK_REC_NOT_GAP mode
    request lock in LOCK_S mode
}

If we upgraded lock from Record Lock to Next Key lock, there would be
created only two lock_t structs for each page, one for
LOCK_X|LOCK_REC_NOT_GAP mode and one for LOCK_S mode, and then used
their bitmaps to mark all records from the same page.

The situation would look like this:

request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 1:
// -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for
// 1
request lock in LOCK_S mode on row 1:
// -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1,
// so it upgrades it to X
request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2:
// -> creates a new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode (because we
// don't have any after we've upgraded!) and sets bit for 2
request lock in LOCK_S mode on row 2:
// -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2,
// so it upgrades it to X
    ...etc...etc..

Each iteration of the loop creates a new lock_t struct, and in the end we
have a lot (one for each record!) of LOCK_X locks, each with single bit
set in the bitmap. Soon we run out of space for lock_t structs.

If we create LOCK_GAP instead of lock upgrading, the above scenario works
like the following:

// -> creates new lock_t for LOCK_X|LOCK_REC_NOT_GAP mode and sets bit for
// 1
request lock in LOCK_S mode on row 1:
// -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 1,
// so it creates LOCK_S|LOCK_GAP only and sets bit for 1
request lock in LOCK_X|LOCK_REC_NOT_GAP mode on row 2:
// -> reuses the lock_t for LOCK_X|LOCK_REC_NOT_GAP by setting bit for 2
request lock in LOCK_S mode on row 2:
// -> notices that we already have LOCK_X|LOCK_REC_NOT_GAP on the row 2,
// so it reuses LOCK_S|LOCK_GAP setting bit for 2

In the end we have just two locks per page, one for each mode:
LOCK_X|LOCK_REC_NOT_GAP and LOCK_S|LOCK_GAP.
Another benefit of this solution is that it avoids not-entirely
const-correct, (and otherwise looking risky) "upgrading".

The fix was ported from
mysql/mysql-server@bfba840dfa7794b988c59c94658920dbe556075d
mysql/mysql-server@75cefdb1f73b8f8ac8e22b10dfb5073adbdfdfb0

Reviewed by: Marko Mäkelä

fdab2c4... by Alexander Barkov

MDEV-31578 DECLARE CURSOR: "Memory not freed: 280 bytes lost" on syntax error

When CURSOR parameters get parsed, their sp_assignment_lex instances
(one instance per parameter) get collected to List<sp_assignment_lex>.

These instances get linked to sphead only in the end of the list.
If a syntax error happened in the middle of the parameter list,
these instances were not deleted, which caused memory leaks.

Fix:

using a Bison %destructor to free rules of the <sp_assignment_lex_list>
type (on syntax errors).

Afte the fix these sp_assignment_lex instances from CURSOR parameters
deleted as follows:

- If the CURSOR statement was fully parsed, then these instances
  get properly linked to sp_head structures, so they are deleted
  during ~sp_head (this did not change)

- If the CURSOR statement failed on a syntax error, then by Bison's
  %destructor (this is being added in the current patch).

0d3720c... by Alexander Barkov

MDEV-30680 Warning: Memory not freed: 280 on mangled query, LeakSanitizer: detected memory leaks

The parser works as follows:

The rule expr_lex returns a pointer to a newly created sp_expr_lex
instance which is not linked to any MariaDB structures yet - it is
pointed only from a Bison stack variable. The sp_expr_lex instance
gets linked to other structures (such as sp_instr_jump_if_not) later,
after scanning some following grammar.

Problem before the fix:
If a parse error happened immediately after expr_lex (before it got linked),
the created sp_expr_lex value got lost causing a memory leak.

Fix:

- Using Bison's "destructor" directive to free the results of expr_lex
  on parse/oom errors.

- Moving the call for LEX::cleanup_lex_after_parse_error() from
  MYSQL_YYABORT and yyerror inside parse_sql().
  This is needed because Bison calls destructors after yyerror(),
  while it's important to delete the sp_expr_lex instance before
  LEX::cleanup_lex_after_parse_error().
  The latter frees the memory root containing the sp_expr_lex instance.

  After this change the code block are executed in the following order:

  - yyerror() -- now only raises the error to DA (no cleanup done any more)
  - %destructor { delete $$; } <expr_lex> -- destructs the sp_expr_lex instance
  - LEX::cleanup_lex_after_parse_error() -- frees the memory root containing
                                              the sp_expr_lex instance

- Removing the "delete sublex" related code from restore_lex():
  - restore_lex() is called in most cases on success, when delete is not needed.
  - There is one place when restore_lex() is called on error:
    In sp_create_assignment_instr(). But in this case LEX::sp_lex_in_use
    is true anyway.
    The patch adds a new DBUG_ASSERT(lex->sp_lex_in_use) to guard this.

33877cf... by Marko Mäkelä

Fix WITH_UBSAN GCC -Wconversion

24faa5d... by Yuchen Pei <email address hidden>

MDEV-30542 Fixing spider/bugfix.self_reference_multi

The server needs to have a unique name

84dbd02... by Marko Mäkelä

MDEV-31487: Recovery or backup failure after innodb_undo_log_truncate=ON

recv_sys_t::parse(): For undo tablespace truncation mini-transactions,
remember the start_lsn instead of the end LSN. This is what we expect
after commit 461402a56432fa5cd0f6d53118ccce23081fca18 (MDEV-30479).

0b61f4e... by Sergey Petrunia

Fix comment

bd076d4... by THIRUNARAYANAN BALATHANDAYUTHAPANI

MDEV-31442 page_cleaner thread aborts while releasing the tablespace

- InnoDB shouldn't acquire the tablespace when it is being stopped
or closed

f7e9ac0... by Sergey Petrunia

MDEV-31449: Assertion s->table->opt_range_condition_rows <= s->found_records

Fix a typo in make_join_statistics(): when updating statistics for
derived table, set s->table->... not "table->..."

0e2e70c... by Sergey Petrunia

MDEV-31479: Inconsistency between MRR and SQL layer costs can cause poor query plan

(Same as
TODO-3938: best_access_path shows negative costs for mrr=on)

best_access_path() assumes that quick select cost includes
(quick->rows/TIME_FOR_COMPARE) as a cost of checking the attached
part of the WHERE condition.

It calls adjust_quick_cost() to subtract addition from quick's cost.

The problem was that DS-MRR cost formula didn't include this cost.
For very large tables, adjust_quick_cost() would produce a negative
cost which would cause assert in debug build or bad query plan choice
in release builds.

Approved-by: Monty <email address hidden>