maria:bb-10.11-selectivity-rebase-nov

Last commit made on 2022-11-16
Get this branch:
git clone -b bb-10.11-selectivity-rebase-nov https://git.launchpad.net/maria

Branch merges

Branch information

Name:
bb-10.11-selectivity-rebase-nov
Repository:
lp:maria

Recent commits

dc456e3... by Monty <email address hidden>

Fixed bug in Aria with aria_log files that are exactly 8K

In the case one has an old Aria log file that ands with a Aria checkpoint
and the server restarts after next recovery, just after created a
new Aria log file (of 8K), the Aria recovery code would abort.
If one would try to delete all Aria log files after this (but not the
aria_control_file), the server would crash during recovery.

The problem was that translog_get_last_page_addr() would regard a log file
of exactly 8K as illegal and the rest of the code could not handle this
case.

Another issue was that if there was a crash directly after the log file
head was written to the next page, the code in translog_get_next_chunk()
would crash.

This patch fixes most of the issues, but not all. For Sanja to look at!

Things fixed:
- Added code to ignore 8K log files.
- Removed ASSERT in translog_get_next_chunk() that checks if page only
  contains the log page header.

58374bb... by Monty <email address hidden>

Small improvements to aria recovery

I spent 4 hours on work and 12 hours of testing to try to find
the reason for aria crashing in recovery when starting a new test,
in which case the 'data directory' should be a copy of "install.db",
but aria_log.00000001 content was not correct.

The following changes are mostly done to make it a bit easier to find out
more in case of future similar crashes:

- Mark last_checkpoint_lsn volatile (safety).
- Write checkpoint message to aria_recovery.trace
- When compling with DBUG, use checksum's for Aria log pages.
- Added a message to mtr --verbose when copying the data directory.
- Removed extra linefeed in Aria recovery message (cleanup)

a83548d... by Monty <email address hidden>

Added rowid_filter support to Aria

This includes:
- cleanup and optimization of filtering and pushdown engine code.
- Adjusted costs for rowid filters (based on extensive testing
  and profiling).

This made a small two changes to the handler_rowid_filter_is_active()
API:
- One should not call it with a zero pointer!
- One does not need to call handler_rowid_filter_is_active() for every
  row anymore. It is enough to check if filter is active by calling it
  call it during index_init() or when handler::rowid_filter_changed()
  is called

The changes was to avoid unnecessary function calls and checks if
pushdown conditions and rowid_filter is not used.

Updated costs for rowid_filter_lookup() to be closer to reality.
The old cost was based only on rowid_compare_cost. This is now
changed to take into account the overhead in checking the rowid.

Changed the Range_rowid_filter class to use DYNAMIC_ARRAY directly
instead of Dynamic_array<>. This was done to be able to use the new
append_dynamic() functions which gives a notable speed improvment
compared to the old code. Removing the abstraction also makes
the code easier to understand.

The cost of filtering is now slightly lower than before, which
is reflected in some test cases that is now using rowid filters.

686c476... by Monty <email address hidden>

Set thd->query() for internal (startup) transactions

This helps with debugging as 'Query: ' in DBUG traces will show something
useful, for internal transactions, instead of just "".

4aa6719... by Sergey Petrunia

Attempt to get to compile with ColumnStore

4faa973... by Monty <email address hidden>

Disabled spider.partition_mrr until MDEV-29947 is fixed in 10.9

7b5a98a... by Monty <email address hidden>

Don't do zerofill of Aria table if it's already zerofilled

This will speed up using tables that are already zerofilled
with aria_chk --zerofill.

c6f47ea... by Monty <email address hidden>

Added test cases for preceding test

This includes all test changes from
"Changing all cost calculation to be given in milliseconds"
and forwards.

Some of the things that caused changes in the result files:

- As part of fixing tests, I added 'echo' to some comments to be able to
  easier find out where things where wrong.
- MATERIALIZED has now a higher cost compared to X than before. Because
  of this some MATERIALIZED types have changed to DEPENDEND SUBQUERY.
  - Some test cases that required MATERIALIZED to repeat a bug was
    changed by adding more rows to force MATERIALIZED to happen.
- 'Filtered' in SHOW EXPLAIN has in many case changed from 100.00 to
  something smaller. This is because now filtered also takes into
  account the smallest possible ref access and filters, even if they
  where not used. Another reason for 'Filtered' being smaller is that
  we now also take into account implicit filtering done for subqueries
  using FIRSTMATCH.
  (main.subselect_no_exists_to_in)
  This is caluculated in best_access_path() and stored in records_out.
- Table orders has changed because more accurate costs.
- 'index' and 'ALL' for small tables has changed to use 'range' or
   'ref' because of optimizer_scan_setup_cost.
- index can be changed to 'range' as 'range' optimizer assumes we don't
  have to read the blocks from disk that range optimizer has already read.
  This can be confusing in the case where there is no obvious where clause
  but instead there is a hidden 'key_column > NULL' added by the optimizer.
  (main.subselect_no_exists_to_in)
- Scan on primary clustered key does not report 'Using Index' anymore
  (It's a table scan, not an index scan).
- For derived tables, the number of rows is now 100 instead of 2,
  which can be seen in EXPLAIN.
- More tests have "Using index for group by" as the cost of this
  optimization is now more correct (lower).
- A primary key could be preferred for a normal key, even if it would
  access more rows, as it's faster to do 1 lokoup and 3 'index_next' on a
  clustered primary key than one lookup trough a secondary.
  (main.stat_tables_innodb)

Notes:

- There was a 4.7% more calls to best_extension_by_limited_search() in
  the main.greedy_optimizer test. However examining the test results
  it looked that the plans where slightly better (eq_ref where more
  chained together) so I assume this is ok.
- I have verified a few test cases where there was notable/unexpected
  changes in the plan and in all cases the new optimizer plans where
  faster. (main.greedy_optimizer and some others)

fde0983... by Monty <email address hidden>

MDEV-29677 Wrong result with join query and innodb fulltext search

InnoDB FTS scan was used by a subquery. A subquery execution may start
a table read and continue until it finds the first matching record
combination. This can happen before the table read returns EOF.

The next time the subquery is executed, it will start another table read.
InnoDB FTS table read fails to re-initialize its data structures in this
scenario and will try to continue the scan started at the first execution.

Fixed by ha_innobase::ft_init() to stop the FTS scan if there is one.

Author: Sergei Petrunia <email address hidden>
Reviewer: Monty

231598a... by Monty <email address hidden>

Fixes for 'Filtering'

- table_after_join_selectivity() should use records_init (new bug)
- get_examined_rows() changed to double to get similar results
  as in MariaDB 10.11
- Fixed bug where table_after_join_selectivity() did not correct
  selectivity in the case where a RANGE is used instead of a REF.
  This can happen if the range can use more key_parts than the REF.
  WHERE key_part1=10 and key_part2 < 10

Other things:
- Use JT_RANGE instead of JT_ALL for RANGE access in all parts of the code.
  Before we used JT_ALL for RANGE.
- Force RANGE be used in best_access_path() if the range used more key
  parts than ref. In the original code, this was done much later in
  make_join_select)(). However we need to know in
  table_after_join_selectivity() if we have used RANGE or not.
- Added more information about filtering to optimizer_trace.