MDEV-13915: STOP SLAVE takes very long time on a busy system
The problem is that a parallel replica would not immediately stop
running/queued transactions when issued STOP SLAVE. That is, it
allowed the current group of transactions to run, and sometimes the
transactions which belong to the next group could be started and run
through commit after STOP SLAVE was issued too, if the last group
had started committing. This would lead to long periods to wait for
all waiting transactions to finish.
This patch updates a parallel replica to try and abort immediately
and roll-back any ongoing transactions. The exception to this is any
transactions which are non-transactional (e.g. those modifying
sequences or non-transactional tables), and any prior transactions,
will be run to completion.
The specifics are as follows:
1. A new stage was added to SHOW PROCESSLIST output for the SQL
Thread when it is waiting for a replica thread to either rollback or
finish its transaction before stopping. This stage presents as
“Waiting for worker thread to stop”
2. Worker threads which error or are killed no longer perform GCO
cleanup if there is a concurrently running prior transaction. This
is because a worker thread scheduled to run in a future GCO could be
killed and incorrectly perform cleanup of the active GCO.
3. Refined cases when the FL_TRANSACTIONAL flag is added to GTID
binlog events to disallow adding it to transactions which modify
both transactional and non-transactional engines when the binlogging
configuration allow the modifications to exist in the same event,
i.e. when using binlog_direct_non_trans_update == 0 and
binlog_format == statement.
4. A few existing MTR tests relied on the completion of certain
transactions after issuing STOP SLAVE, and were re-recorded
(potentially with added synchronizations) under the new rollback
behavior.
Reviewed By:
============
Andrei Elkin <email address hidden>
trx_purge_truncate_rseg_history(): Add a parameter to specify if
the entire rollback segment is safe to be freed. If not, we may
still be able to invoke trx_undo_truncate_start() and free some pages.
MDEV-31234 fixup: Allow innodb_undo_log_truncate=ON after upgrade
trx_purge_truncate_history(): Relax a condition that would prevent
undo log truncation if the undo log tablespaces were "contaminated"
by the bug that commit e0084b9d315f10e3ceb578b65e144d751b208bf1 fixed.
That is, trx_purge_truncate_rseg_history() would have invoked
flst_remove() on TRX_RSEG_HISTORY but not reduced TRX_RSEG_HISTORY_SIZE.
To avoid any regression with normal operation, we implement this
fixup during slow shutdown only. The condition on the history list
being empty is necessary: without it, in the test
innodb.undo_truncate_recover there may be much fewer than the
expected 90,000 calls to row_purge() before the truncation.
That is, we would truncate the undo tablespace before actually having
processed all undo log records in it.
To truncate such "contaminated" or "bloated" undo log tablespaces
(when using innodb_undo_tablespaces=2 or more)
you can execute the following SQL:
BEGIN;INSERT mysql.innodb_table_stats VALUES('','',DEFAULT,0,0,0);ROLLBACK;
SET GLOBAL innodb_undo_log_truncate=ON, innodb_fast_shutdown=0;
SHUTDOWN;
The first line creates a dummy InnoDB transaction, to ensure that there
will be some history to be purged during shutdown and that the undo
tablespaces will be truncated.
MDEV-28285 Unexpected result when combining DISTINCT, subselect and LIMIT
The problem was that when JOIN_TAB::remove_duplicates() noticed there
can only be one possible row in the output, it adjusted limits but
didn't take into account any possible offset.
Fixed by not adjusting limit offset when setting one-row-limit.
This ensures that no mtr test can change install.db after it's initial
creation as changing it while as another thread is coping it will lead to
failures in at least InnoDB and Aria recovery.
Fixed spider/bugfix.mdev_30370 that was wrongly used install.db
MDEV-6768 Wrong result with aggregate with join with no result set
When a query does implicit grouping and join operation produces an empty
result set, a NULL-complemented row combination is generated.
However, constant table fields still show non-NULL values.
What happens in the is that end_send_group() is called with a
const row but without any rows matching the WHERE clause.
This last part is shown by 'join->first_record' not being set.
This causes item->no_rows_in_result() to be called for all items to reset
all sum functions to their initial state. However fields are not set
to NULL.
The used fix is to produce NULL-complemented records for constant tables
as well. Also, reset the constant table's records back in case we're
in a subquery which may get re-executed.
An alternative fix would have item->no_rows_in_result() also work
with Item_field objects.
There is some other issues with the code:
- join->no_rows_in_result_called is used but never set.
- Tables that are used with group functions are not properly marked as
maybe_null, which is required if the table rows should be regarded as
null-complemented (not existing).
- The code that tries to detect if mixed_implicit_grouping should be set
didn't take into account all usage of fields and sum functions.
- Item_func::restore_to_before_no_rows_in_result() called the wrong
function.
- join->clear() does not use a table_map argument to clear_tables(),
which caused it to ignore constant tables.
- unclear_tables() does not correctly restore status to what is
was before clear_tables().
Main bug fix was to always use a table_map argument to clear_tables() and
always use join->clear() and clear_tables() together with unclear_tables().
Other fixes:
- Fixed Item_func::restore_to_before_no_rows_in_result()
- Set 'join->no_rows_in_result_called' when no_rows_in_result_set()
is called.
- Removed not used argument from setup_end_select_func().
- More code comments
- Ensure that end_send_group() modifies the same fields as are in the
result set.
- Changed return_zero_rows() to use pointers instead of references,
similar to the rest of the code.