maria:bb-10.10-spetrunia

Last commit made on 2022-06-18
Get this branch:
git clone -b bb-10.10-spetrunia https://git.launchpad.net/maria

Branch merges

Branch information

Name:
bb-10.10-spetrunia
Repository:
lp:maria

Recent commits

95fe701... by Sergey Petrunia

Cleanups for: 0762dd928 Improve pruning in greedy_search by sorting tables during search

4a888d5... by Sergey Petrunia

Test result updates for 3ae31f66

a788c83... by Monty <email address hidden>

Temporary push to fix a crash in mariabackup

b8a8498... by Monty <email address hidden>

Reduced size of POSITION

Replaced Cost_estimate prefix_cost with a double as prefix_cost was
only used to store and retrive total prefix cost.

This also speeds up things (a bit) as don't have to call
Cost_estimate::total_cost() for every access to the prefix_cost.

Sizeof POSITION decreased from 304 to 256.

bf876c2... by Monty <email address hidden>

Added current_cost and best_cost to optimizer trace when pruning by cost

This helps a lot when trying to find out why a table combination was pruned

3ae31f6... by Sergey Petrunia

Cleanup for: Added EQ_REF chaining to the greedy_optimizer

5abb6bf... by Monty <email address hidden>

Added EQ_REF chaining to the greedy_optimizer

MDEV-28073 Slow query performance in MariaDB when using many table

The idea is to prefer and chain EQ_REF tables (tables that uses an
unique key to find a row) when searching for the best table combination.
This significantly reduces row combinations that has to be examined.
This is optimization is enabled when setting optimizer_prune_level=2
(which is now default).

Implementation:
- optimizer_prune_level has a new level, 2, which enables EQ_REF
  optimization in addition to the pruning done by level 1.
  Level 2 is now default.
- Added JOIN::eq_ref_tables that contains bits of tables that could use
  potentially use EQ_REF access in the query. This is calculated
  in sort_and_filter_keyuse()

Under optimizer_prune_level=2:
- When the greedy_optimizer notices that the preceding table was an
  EQ_REF table, it tries to add an EQ_REF table next. If an EQ_REF
  table exists, only this one will be considered at this level.
  We also collect all EQ_REF tables chained by the next levels and these
  are ignored on the starting level as we have already examined these.
  If no EQ_REF table exists, we continue as normal.

This optimization speeds up the greedy_optimizer combination test with
~25%

Other things:
- I ported the changes in MySQL 5.7 to greedy_optimizer.test to MariaDB
  to be able to ensure we can handle all cases that MySQL can do.
- I have run all tests with --mysqld=--optimizer_prune_level=1 to verify that
  there where no test changes.

d9fa07e... by Monty <email address hidden>

Remove unnecessary testing of join dependency when sorting tables

The dependency checking is not needed when using
best_extension_by_limited_search() as this function ensures
that we don't use tables that are depending on any of the remaning
tables.

b4d5adb... by Monty <email address hidden>

Added get_allowed_nj_tables() to speed up gready_search()

"Get the tables that one is allowed to have as the next table in the
current plan"

Main author: Sergei Petrunia <email address hidden>
Co author: Monty

0762dd9... by Monty <email address hidden>

Improve pruning in greedy_search by sorting tables during search

MDEV-28073 Slow query performance in MariaDB when using many tables

The faster we can find a good query plan, the more options we have for
finding and pruning (ignoring) bad plans.

This patch adds sorting of plans to best_extension_by_limited_search().
The plans, from best_access_path() are sorted according to the numbers
of found rows. This allows us to faster find 'good tables' and we are
thus able to eliminate 'bad plans' faster.

One side effect of this patch is that if two tables have equal cost,
the table that which was used earlier in the query is preferred.
This allows users to improve plans by reordering eq_ref tables in the
order they would like them to be uses.

Result changes caused by the patch:
- Traces are different as now we print the cost for using tables before
  we start considering them in the plan.
- Table order are changed for some plans. In most cases this is because
  the plans are equal and tables are in this case sorted according to
  their usage in the original query.
- A few plans was changed as the optimizer was able to find a better
  plan (that was pruned by the original code).

Other things:

- Added a new statistic variable: "optimizer_join_prefixes_check_calls",
  which counts number of calls to best_extension_by_limited_search().
  This can be used to check the prune efficiency in greedy_search().
- Added variable "JOIN_TAB::embedded_dependent" to be able to handle
  XX IN (SELECT..) in the greedy_optimizer. The idea is that we
  should prune a table if any of the tables in embedded_dependent is
  not yet read.
- When using many tables in a query, there will be some additional
  memory usage as we need to pre-allocate table of
  table_count*table_count*sizeof(POSITION) objects (POSITION is 312
  bytes for now) to hold the pre-calculated best_access_path()
  information. This memory usage is offset by the expected
  performance improvement when using many tables in a query.
- Removed the code from an earlier patch to keep the table order in
  join->best_ref in the original order. This is not needed anymore as we
  are now sorting the tables for each best_extension_by_limited_search()
  call.