maria:bb-10.9-midenok

Last commit made on 2023-10-06
Get this branch:
git clone -b bb-10.9-midenok https://git.launchpad.net/maria

Branch information

Name:
bb-10.9-midenok
Repository:
lp:maria

Recent commits

a421873... by midenok

MDEV-25547 Auto-create: Undetected deadlock lasts longer than the configured timeout

While the principle of fixing the stated problem is simple (stop
forcing result=false after ER_LOCK_WAIT_TIMEOUT), there is a much more
serious problem. While the waiting threads wait for the creating thread
to complete, they run the loop continuously, acquiring and releasing
MDL_SHARED_WRITE. If the creating thread is blocked for a long time
waiting for MDL_EXCLUSIVE (and the lock wait timeout is usually very
long, as MDEV-25547 states), the waiting threads hog the CPU for all
of that time.

The patch introduces vers_create_signal, based on a condition variable,
which holds back the waiting threads while the creating thread is busy
adding the partition. The problem is that there is no ready common
storage for vers_create_signal, as the share may be freed once we have
released MDL_SHARED_WRITE.

To solve this we allocate vers_auto_create_signal from the heap and
store it in TABLE_SHARE temporarily, only to pass it to the waiting
threads. Then we relay it into Open_table_context so that it can be
used at a time when TABLE_SHARE may already have been freed.

After they release MDL_SHARED_WRITE, all the waiting threads are
blocked until the creating thread signals that they can continue. But
there is another problem: how to free vers_auto_create_signal. If it is
done by the waiting threads, this leads to a race condition, as the
waiting happens in a zone free from locks. For example, a waiting
thread may decide it is the last one and free the signal at the very
moment another waiting thread wants to initiate another wait.

So freeing vers_auto_create_signal is only possible from the creating
thread: it is under MDL_EXCLUSIVE, which guarantees that all the
waiting threads are already in the zone free from locks and that all
of them have stated they want to wait. The creating thread can
therefore know when the last waiting thread has ended its wait. This
is implemented by a waiters count and a feedback signal.

The waiting threads declare going_wait() under MDL_SHARED_WRITE, while
the creating thread has not yet acquired MDL_EXCLUSIVE. This increments
the waiters counter. Once the creating thread acquires MDL_EXCLUSIVE,
all the waiting threads are on their way to wait() and the waiters
counter will not increase anymore. After the creating thread has done
its main logic, it broadcasts to the waiters so that they start to exit
the wait and decrement the waiters counter. When the waiters counter
reaches zero, it is time for the creating thread to free
vers_auto_create_signal.
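
The waiters-count and feedback-signal protocol described above can be
sketched in standard C++ as follows. This is only an illustration under
stated assumptions, not the code of the patch: the class CreateSignal,
the method finish() and the helper run_demo() are hypothetical names,
std::condition_variable stands in for the server's own primitives, and
calling going_wait() before each waiter thread starts imitates
registration under MDL_SHARED_WRITE.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the heap-allocated signal object.
class CreateSignal {
public:
  // Waiter: announce the intention to wait (in the text this happens
  // while the waiter still holds its shared lock).
  void going_wait() {
    std::lock_guard<std::mutex> guard(mutex_);
    ++waiters_;
  }

  // Waiter: block until the creator broadcasts or the timeout expires.
  // Returns true if woken by the broadcast, false on timeout.
  bool wait(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> guard(mutex_);
    bool signalled = done_cv_.wait_for(guard, timeout,
                                       [this] { return done_; });
    if (--waiters_ == 0)
      empty_cv_.notify_one();  // feedback signal to the creator
    return signalled;
  }

  // Creator: wake all waiters, then block until the last one has left.
  // After this returns, no waiter touches the object anymore.
  void finish() {
    std::unique_lock<std::mutex> guard(mutex_);
    done_ = true;
    done_cv_.notify_all();
    empty_cv_.wait(guard, [this] { return waiters_ == 0; });
  }

private:
  std::mutex mutex_;
  std::condition_variable done_cv_;   // creator -> waiters
  std::condition_variable empty_cv_;  // last waiter -> creator
  int waiters_ = 0;
  bool done_ = false;
};

// Hypothetical demo: returns how many waiters were woken by the broadcast.
int run_demo(int n_waiters) {
  auto *signal = new CreateSignal;
  std::atomic<int> woken{0};
  std::vector<std::thread> threads;
  for (int i = 0; i < n_waiters; ++i) {
    signal->going_wait();  // registered before the creator proceeds
    threads.emplace_back([signal, &woken] {
      if (signal->wait(std::chrono::seconds(5)))
        ++woken;
    });
  }
  signal->finish();  // the creator's main work would precede this call
  delete signal;     // only the creator frees the signal
  for (auto &t : threads)
    t.join();
  return woken.load();
}
```

In this sketch the counter can only grow before finish() is called,
which mirrors the guarantee the text derives from MDL_EXCLUSIVE; the
sketch itself does not enforce that ordering.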

The waiting logic is based on mysql_cond_timedwait(), so we cannot
delay beyond the lock_wait_timeout limit. And since there were other
delays (acquiring MDL, for example), we use only the remainder of the
lock_wait_timeout duration counted from the query start time.
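
The remainder computation mentioned above might look like the sketch
below. The helper remaining_wait() and its parameters are hypothetical
and use std::chrono rather than the server's own time types; the only
point shown is that time already spent since query start is subtracted
from the timeout, with a floor of zero.

```cpp
#include <cassert>
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: how long a waiter may still block, given that
// the lock_wait_timeout budget started running at query start.
std::chrono::milliseconds remaining_wait(
    Clock::time_point query_start, Clock::time_point now,
    std::chrono::milliseconds lock_wait_timeout) {
  auto elapsed =
      std::chrono::duration_cast<std::chrono::milliseconds>(now - query_start);
  if (elapsed >= lock_wait_timeout)
    return std::chrono::milliseconds(0);  // budget already used up
  return lock_wait_timeout - elapsed;
}
```

For example, with a 10 second timeout and 3 seconds already spent
acquiring locks, the timed wait would be given 7 seconds.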

8e1341e... by midenok

MDEV-25547 MDL rollback and auto-create tracing

Use command like this:

  mtr --mysqld=--debug=T:d,mdl,auto-create,cond,query:i:o,/tmp/ac.log

fecec02... by midenok

MDEV-29872 MSAN/Valgrind uninitialised value errors in TABLE::vers_switch_partition

Delayed_insert has its own THD (initialized at mysql_insert()) and
hence its own LEX. Delayed_insert initializes only a few parameters of
the LEX, and 'duplicates' is not among them. Now we copy this missing
parameter from the parser LEX (as well as sql_command).

2a85eb9... by midenok

MDEV-31903 Server crashes in _ma_reset_history upon UNLOCK table with auto-create history partitions

When INSERT does auto-create for t1, all its handler instances are
closed by alter_close_table(). At that point, further down the stack,
maria_close() clears share->state_history. Later, when we unlock the
tables, the Aria transaction manager accesses the old share instance
(the one from before t1 was closed) and tries to reset its
state_history.

The problem is that maria_close() didn't remove the table from the
transaction's list (used_tables). The fix calls
_ma_remove_table_from_trnman(), which is triggered by
HA_EXTRA_PREPARE_FOR_RENAME.
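
The idea of the fix can be illustrated with a simplified model. All
types and functions below (Share, Transaction, close_for_rename() and
so on) are hypothetical stand-ins and not Aria code; the sketch only
shows that a table is unregistered from the transaction's list before
its share is invalidated, so the later unlock step never visits it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a table share with its history state.
struct Share {
  int state_history = 1;
};

// Hypothetical stand-in for a transaction tracking the tables it used.
struct Transaction {
  std::vector<Share *> used_tables;
};

// Unregister a share from the transaction's list.
void remove_table_from_transaction(Transaction &trn, Share *share) {
  auto &tables = trn.used_tables;
  tables.erase(std::remove(tables.begin(), tables.end(), share),
               tables.end());
}

// Close path: unregister first, then clear the history.
void close_for_rename(Transaction &trn, Share *share) {
  remove_table_from_transaction(trn, share);
  share->state_history = 0;
}

// Unlock path: reset history of every table still registered and
// return how many shares were visited.
std::size_t reset_history_on_unlock(Transaction &trn) {
  std::size_t visited = 0;
  for (Share *share : trn.used_tables) {
    share->state_history = 0;
    ++visited;
  }
  return visited;
}
```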

3e0009d... by Oleksandr "Sanja" Byelkin

Merge branch '10.6' into 10.9

0d16eb3... by Oleksandr "Sanja" Byelkin

Merge branch '10.5' into 10.6

7e65025... by Oleksandr "Sanja" Byelkin

Merge branch '10.4' into 10.5

2aea938... by Monty

MDEV-31893 Valgrind reports issues in main.join_cache_notasan

This is also related to
MDEV-31348 Assertion `last_key_entry >= end_pos' failed in virtual bool
           JOIN_CACHE_HASHED::put_record()

Valgrind exposed a problem with the join_cache for hash joins:
==25636== Conditional jump or move depends on uninitialised value(s)
==25636== at 0xA8FF4E: JOIN_CACHE_HASHED::init_hash_table()
          (sql_join_cache.cc:2901)

The reason for this was that avg_record_length contained a random value
if one had used SET optimizer_switch='optimize_join_buffer_size=off'.

This causes either 'random size' memory to be allocated (up to
join_buffer_size) which can increase memory usage or, if avg_record_length
is less than the row size, memory overwrites in thd->mem_root, which is
bad.

Fixed by setting avg_record_length in JOIN_CACHE_HASHED::init()
before it's used.

There is no test case for MDEV-31893 as valgrind of join_cache_notasan
checks that.
I added a test case for MDEV-31348.

f692b2b... by Oleksandr "Sanja" Byelkin

Merge branch '10.6' into 10.9

# Conflicts:
# mysql-test/main/sp.result
# mysql-test/main/sp.test

8d210fc... by Sergey Petrunia

MDEV-31877: ASAN errors in Exec_time_tracker::get_cycles with innodb slow log verbosity

Remove redundant delete_explain_query() calls in
sp_instr_set::exec_core(), sp_instr_set_row_field::exec_core(),
sp_instr_set_row_field_by_name::exec_core().

These calls are made before the SP instruction's tables are
"closed" by the close_thread_tables() call.

When we call close_thread_tables() after that, we can no longer
collect the engine's counter variables, as they use data
structures that are located in the Explain Data Structures.

Also, these delete_explain_query() calls are redundant, as
sp_lex_keeper::reset_lex_and_exec_core() has another
delete_explain_query() call, which is located in the right
location after the close_thread_tables() call.
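
The ordering issue can be modelled with a small sketch. The types and
functions below (ExplainQuery, Session, exec_and_cleanup()) are
hypothetical and heavily simplified; in particular the null check
replaces what would be an invalid memory access in the real server.
The sketch only shows why the counters must be collected before the
structure that stores them is deleted.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the Explain data that stores the counters.
struct ExplainQuery {
  long engine_reads = 0;
};

// Hypothetical stand-in for the per-statement state.
struct Session {
  std::unique_ptr<ExplainQuery> explain = std::make_unique<ExplainQuery>();
  long pending_engine_reads = 0;

  // Models the table-closing step: flush engine counters into the
  // Explain data. Returns false if that data is already gone.
  bool close_tables() {
    if (!explain)
      return false;  // counters cannot be collected anymore
    explain->engine_reads += pending_engine_reads;
    pending_engine_reads = 0;
    return true;
  }

  // Models deletion of the Explain data.
  void delete_explain() { explain.reset(); }
};

// The order kept after the fix: close the tables first, delete after.
bool exec_and_cleanup(Session &session) {
  bool collected = session.close_tables();
  session.delete_explain();
  return collected;
}
```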