maria:bb-10.11-selectivity

Last commit made on 2022-11-15
Get this branch:
git clone -b bb-10.11-selectivity https://git.launchpad.net/maria

Branch merges

Branch information

Name:
bb-10.11-selectivity
Repository:
lp:maria

Recent commits

e693936... by Monty <email address hidden>

fixed test result

31d300a... by Monty <email address hidden>

Fixed bug in Aria with aria_log files that are exactly 8K

In the case one has an old Aria log file that ands with a Aria checkpoint
and the server restarts after next recovery, just after created a
new Aria log file (of 8K), the Aria recovery code would abort.
If one would try to delete all Aria log files after this (but not the
aria_control_file), the server would crash during recovery.

The problem was that translog_get_last_page_addr() would regard a log file
of exactly 8K as illegal and the rest of the code could not handle this
case.

Another issue was that if there was a crash directly after the log file
head was written to the next page, the code in translog_get_next_chunk()
would crash.

This patch fixes most of the issues, but not all. For Sanja to look at!

Things fixed:
- Added code to ignore 8K log files.
- Removed ASSERT in translog_get_next_chunk() that checks if page only
  contains the log page header.

5b99d2a... by Monty <email address hidden>

fixup! 5cafbf82183347b1c3a211bf19b8d997ae379bec (microseconds)

2def9a4... by Monty <email address hidden>

Small improvements to aria recovery

I spent 4 hours on work and 12 hours of testing to try to find
the reason for aria crashing in recovery when starting a new test,
in which case the 'data directory' should be a copy of "install.db",
but aria_log.00000001 content was not correct.

The following changes are mostly done to make it a bit easier to find out
more in case of future similar crashes:

- Mark last_checkpoint_lsn volatile (safety).
- Write checkpoint message to aria_recovery.trace
- When compling with DBUG, use checksum's for Aria log pages.
- Added a message to mtr --verbose when copying the data directory.
- Removed extra linefeed in Aria recovery message (cleanup)

9c2204b... by Monty <email address hidden>

Added rowid_filter support to Aria

This includes:
- cleanup and optimization of filtering and pushdown engine code.
- Adjusted costs for rowid filters (based on extensive testing
  and profiling).

This made a small change to the handler_rowid_filter_is_active()
API: One should not call it with a zero pointer!

The changes was to avoid unnecessary function calls and checks if
pushdown conditions and rowid_filter is not used.

The cost of filtering is now slightly lower than before, which
is reflected in the test cases.

8852eff... by Monty <email address hidden>

Set thd->query() for internal (startup) transactions

This helps with debugging as 'Query: ' in DBUG traces will show something
useful, for internal transactions, instead of just "".

d8fe0f3... by Monty <email address hidden>

Disabled spider.partition_mrr until MDEV-29947 is fixed in 10.9

2c9b3f8... by Monty <email address hidden>

Don't do zerofill of Aria table if it's already zerofilled

This will speed up using tables that are already zerofilled
with aria_chk --zerofill.

d049d3f... by Monty <email address hidden>

fixup! 5cafbf82183347b1c3a211bf19b8d997ae379bec (test cases)

5cafbf8... by Monty <email address hidden>

Added test cases for preceding test

This includes all test changes from
"Changing all cost calculation to be given in milliseconds"
and forwards.

Some of the things that caused changes in the result files:

- As part of fixing tests, I added 'echo' to some comments to be able to
  easier find out where things where wrong.
- MATERIALIZED has now a higher cost compared to X than before. Because
  of this some MATERIALIZED types have changed to DEPENDEND SUBQUERY.
  - Some test cases that required MATERIALIZED to repeat a bug was
    changed by adding more rows to force MATERIALIZED to happen.
- 'Filtered' in SHOW EXPLAIN has in many case changed from 100.00 to
  something smaller. This is because now filtered also takes into
  account the smallest possible ref access and filters, even if they
  where not used. Another reason for 'Filtered' being smaller is that
  we now also take into account implicit filtering done for subqueries
  using FIRSTMATCH.
  (main.subselect_no_exists_to_in)
  This is caluculated in best_access_path() and stored in records_out.
- Table orders has changed because more accurate costs.
- 'index' and 'ALL' for small tables has changed to use 'range' or
   'ref' because of optimizer_scan_setup_cost.
- index can be changed to 'range' as 'range' optimizer assumes we don't
  have to read the blocks from disk that range optimizer has already read.
  This can be confusing in the case where there is no obvious where clause
  but instead there is a hidden 'key_column > NULL' added by the optimizer.
  (main.subselect_no_exists_to_in)
- Scan on primary clustered key does not report 'Using Index' anymore
  (It's a table scan, not an index scan).
- For derived tables, the number of rows is now 100 instead of 2,
  which can be seen in EXPLAIN.
- More tests have "Using index for group by" as the cost of this
  optimization is now more correct (lower).
- A primary key could be preferred for a normal key, even if it would
  access more rows, as it's faster to do 1 lokoup and 3 'index_next' on a
  clustered primary key than one lookup trough a secondary.
  (main.stat_tables_innodb)

Notes:

- There was a 4.7% more calls to best_extension_by_limited_search() in
  the main.greedy_optimizer test. However examining the test results
  it looked that the plans where slightly better (eq_ref where more
  chained together) so I assume this is ok.
- I have verified a few test cases where there was notable/unexpected
  changes in the plan and in all cases the new optimizer plans where
  faster. (main.greedy_optimizer and some others)