maria:10.2-MDEV-21910

Last commit made on 2020-03-13
Get this branch:
git clone -b 10.2-MDEV-21910 https://git.launchpad.net/maria

Branch information

Name:
10.2-MDEV-21910
Repository:
lp:maria

Recent commits

dd68a68... by Jan Lindström

MDEV-21910 : Killing thread on Galera could cause mutex deadlock

The following issues are involved here:

Whenever a Galera BF (brute force) transaction decides to abort a conflicting
transaction, it kills that thread using thd::awake().

A user KILL [QUERY|CONNECTION] ... for a thread also calls thd::awake().

Whenever one of these actions is executed, we hold a number of InnoDB
internal mutexes and thd mutexes. Sometimes these mutexes are taken in a
different order, causing a mutex deadlock.

Let's consider an example, starting from a Galera BF transaction deciding to
abort a conflicting transaction (let's call this thread1):

thread1:
lock_rec_other_has_conflicting
  here we hold both lock_sys mutex and trx mutex for conflicting lock transaction.
wsrep_innobase_kill_one_trx
  For victim we call wsrep_thd_awake

Next, in thread2, we assume a user executes KILL QUERY on some other executing
query (note that it can't be a BF query).

thread2:
sql_kill()
kill_one_thread
find_thread_by_id
  takes LOCK_thd_kill
thd->awake_no_mutex()

thread1:
thd->awake(KILL_QUERY)
  Tries to acquire LOCK_thd_kill, but it is held by thread2, so we wait (note
  that we still hold lock_sys mutex and trx mutex).

thread2:
ha_kill_query()
kill_handlerton
innobase_kill_query
lock_trx_handle_wait
  lock_mutex_enter() must wait as thread1 is holding it

Thus thread1 (holding lock_sys mutex and trx mutex) waits for LOCK_thd_kill,
which thread2 holds, and thread2 (holding LOCK_thd_kill) waits for lock_sys
mutex, which thread1 holds:
thread1 waits -> thread2 waits -> thread1 ==> mutex deadlock.
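
The cycle described above can be checked mechanically with a wait-for graph.
The following is only an illustrative sketch, not code from the patch:
WaitForGraph, waits_on_itself() and mdev21910_scenario() are hypothetical
names, and the graph just encodes "thread A waits for a mutex held by
thread B" as an edge A -> B.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: an edge A -> B means "A waits for a mutex that B holds".
using WaitForGraph = std::map<std::string, std::set<std::string>>;

// Returns true if `start` can reach itself by following wait-for edges,
// i.e. if it is part of a deadlock cycle.
bool waits_on_itself(const WaitForGraph& graph, const std::string& start) {
    std::set<std::string> seen;
    std::vector<std::string> stack{start};
    while (!stack.empty()) {
        const std::string current = stack.back();
        stack.pop_back();
        const auto it = graph.find(current);
        if (it == graph.end()) continue;  // current waits for nobody
        for (const auto& next : it->second) {
            if (next == start) return true;
            if (seen.insert(next).second) stack.push_back(next);
        }
    }
    return false;
}

// The scenario from the commit message, reduced to two edges.
WaitForGraph mdev21910_scenario() {
    return {
        {"thread1", {"thread2"}},  // wants LOCK_thd_kill, held by thread2
        {"thread2", {"thread1"}},  // wants lock_sys mutex, held by thread1
    };
}
```

Calling waits_on_itself(mdev21910_scenario(), "thread1") reports the cycle;
removing either edge makes it disappear.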

In this patch we fix the Galera BF and user kill cases so that we enqueue the
victim thread to a list while we hold InnoDB mutexes, and then release them.
A new background thread picks a victim thread from this new list and calls
thd::awake() with no InnoDB mutexes held. The idea is similar to the
replication background kill. This fix enforces that we always take the mutexes
in the order LOCK_thd_data -> lock sys mutex -> trx mutex.
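
A minimal sketch of the enqueue-then-kill-in-background idea, using only the
C++ standard library. This is not the code of the patch: KillQueue,
demo_kill_order() and the numeric victim ids are hypothetical, and the kill
callback merely stands in for thd::awake().

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the victim list, its mutex and its condition
// variable.
class KillQueue {
public:
    // Safe to call while "InnoDB" mutexes are held: only the short-lived
    // queue mutex is taken, never a per-thread kill mutex.
    void enqueue(unsigned long victim_id) {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            victims_.push_back(victim_id);
        }
        cond_.notify_one();
    }

    void shutdown() {
        {
            std::lock_guard<std::mutex> guard(mutex_);
            stop_ = true;
        }
        cond_.notify_one();
    }

    // Body of the background thread: drains the queue, then exits on stop.
    template <typename KillFn>
    void run(KillFn kill) {
        std::unique_lock<std::mutex> lock(mutex_);
        for (;;) {
            cond_.wait(lock, [this] { return stop_ || !victims_.empty(); });
            if (victims_.empty()) return;  // stop requested, nothing left
            const unsigned long id = victims_.front();
            victims_.pop_front();
            lock.unlock();
            kill(id);  // stands in for thd::awake(); no queue mutex held
            lock.lock();
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<unsigned long> victims_;
    bool stop_ = false;
};

// Demo: victims are enqueued under a mock "lock_sys" mutex and killed later
// by the background thread, in FIFO order.
std::vector<unsigned long> demo_kill_order() {
    KillQueue queue;
    std::vector<unsigned long> killed;
    std::thread killer([&] {
        queue.run([&](unsigned long id) { killed.push_back(id); });
    });
    std::mutex lock_sys;  // mock of the InnoDB lock_sys mutex
    {
        std::lock_guard<std::mutex> innodb(lock_sys);
        queue.enqueue(11);
        queue.enqueue(42);
    }  // "InnoDB" mutex released here, before any kill is required to run
    queue.shutdown();
    killer.join();
    return killed;
}
```

The point of the design is that the enqueuing side never needs a thd mutex
while it holds InnoDB mutexes, so the inverted lock order from the example
cannot occur on that path.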

wsrep_mysqld.cc
 Here we introduce a list where victim threads are stored,
 a condition variable used to wake up the background thread,
 and a mutex to protect the list.

wsrep_thd.cc
 Create a new background thread to handle victim thread
 abort. We may take wsrep_thd_LOCK mutex here but not any
 InnoDB mutexes.

wsrep_innobase_kill_one_trx
 Remove all the wsrep code that was moved to wsrep_thd.cc.
 We just enqueue the required information to the background kill
 list and cancel the victim trx lock wait, if there is one.
 Here we hold the InnoDB lock sys mutex and trx mutex, so
 we can't take the wsrep_thd_LOCK mutex.

wsrep_abort_transaction
 Cleanup only.

ed21202... by Marko Mäkelä

Fix GCC 10.0 -Wstringop-overflow

myrg_open(): Reduce the scope of the variable 'end' and
simplify the code.

For some reason, I got no warning for this code in the 10.2
branch, only 10.3 or later.

The ENGINE=MERGE is covered by the tests main.merge, main.merge_debug,
and main.merge-big.

d9d3c22... by Sujatha Sivakumar

MDEV-10047: table-based master info repository

Problem:
=======
When we upgrade from "mysql" to "mariadb" and the slave uses tables as
repositories, their data is completely ignored and no warning is issued in the
error log.

Fix:
===
"mysql_upgrade" test should check for the presence of data in the
"mysql.slave_master_info" and "mysql.slave_relay_log_info" tables. When the
tables contain data, the upgrade script should report a warning that tells
users that the data in the repository tables will be ignored.

9f858f3... by Marko Mäkelä

Fix clang 10 warnings

_ma_fetch_keypage(): Correct an assertion that used to always hold.
Thanks to clang -Wint-in-bool-context for flagging this.

double_to_datetime_with_warn(): Suppress -Wimplicit-int-float-conversion
by adding a cast. LONGLONG_MAX converted to double will actually be
LONGLONG_MAX+1.
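
The rounding mentioned above can be illustrated with a short sketch. It
assumes an IEEE 754 double with round-to-nearest and uses the standard
LLONG_MAX in place of the server's LONGLONG_MAX macro; the helper names are
hypothetical and this is not the code of the commit.

```cpp
#include <cassert>
#include <climits>

// True when converting LLONG_MAX (2^63 - 1) to double rounds up to 2^63.
// A double has a 53-bit significand, so 2^63 - 1 is not representable.
bool llong_max_rounds_up() {
    const double as_double = static_cast<double>(LLONG_MAX);
    return as_double == 9223372036854775808.0;  // exactly 2^63
}

// Hypothetical range check that compares against 2^63 directly, with a
// strict upper bound, instead of relying on the rounded LLONG_MAX.
bool fits_in_longlong(double value) {
    return value >= -9223372036854775808.0 && value < 9223372036854775808.0;
}
```

The explicit cast does not change the value; it only makes the lossy
conversion visible to the compiler and to the reader.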

2e8b0c5... by Marko Mäkelä

MDEV-21933 INFORMATION_SCHEMA.INNODB_SYS_TABLESPACES accesses SYS_DATAFILES

All tablespace metadata is buffered in fil_system. There is an LRU
mechanism, but that only controls the opening and closing of
fil_node_t::handle.

It is much more efficient and less error-prone to access data file names
by looking up the fil_space_t object rather than by essentially joining
each row with an access to SYS_DATAFILES via the InnoDB internal SQL parser.

dict_get_first_path(): Declare static. The function may only be needed
when loading or updating the data dictionary. Also, change a condition
in order to avoid a bogus GCC 10 -Wstringop-overflow warning for
mem_strdupl() about len==ULINT_UNDEFINED.

i_s_sys_tablespaces_fill_table(): Do not access any InnoDB internal
dictionary tables other than SYS_TABLESPACES.

47382a2... by Marko Mäkelä

Fix GCC 10 -Wclass-memaccess

a8566f7... by Marko Mäkelä

Fix GCC 10 -Wstringop-truncation

2c8fa28... by Marko Mäkelä

Update libmariadb

This fixes GCC 10.0.1 -Wstringop-truncation and some typos.

32904dc... by Marko Mäkelä

Merge 10.1 into 10.2

7b082fb... by Marko Mäkelä

Merge 5.5 into 10.1