maria:bb-11.0-ycp-mdev-26247

Last commit made on 2023-10-05
Get this branch:
git clone -b bb-11.0-ycp-mdev-26247 https://git.launchpad.net/maria

Branch merges

Branch information

Name:
bb-11.0-ycp-mdev-26247
Repository:
lp:maria

Recent commits

4f6687f... by Yuchen Pei <email address hidden>

MDEV-26247 Re-implement spider gbh query rewrite of tables

Spider GBH's query rewrite of table joins is overly complex and
error-prone. We replace it with something closer to what
dbug_print() (more specifically, print_join()) does, but catered to
spider.

More specifically, we replace the body of
spider_db_mbase_util::append_from_and_tables() with a call to
spider_db_mbase_util::append_join(), and remove downstream append_X
functions.

We make it handle const tables by rewriting them as (select 1). This
fixes the main issue in MDEV-26247.

We also ban semijoin from spider gbh, which fixes MDEV-31645 and
MDEV-30392, as semi-join is an "internal" join, and "semi join" does
not parse, and it is different from "join" in that it deduplicates the
right hand side

Not all queries passed to a group by handler are valid (MDEV-32273),
for example, a join on expr may refer outer fields not in the current
context. We detect this during the handler creation when walking the
join. See also gbh_outer_fields_in_join.test.

It also skips eliminated tables, which fixes MDEV-26193.

e9b7e7e... by Yuchen Pei <email address hidden>

MDEV-26247 clean up spider_group_by_handler::init_scan()

0a67479... by Yuchen Pei <email address hidden>

MDEV-26247 Clean up spider_fields

Spider gbh query rewrite should get table for fields in a simple way.
Add a method spider_fields::find_table that searches its table holders
to find table for a given field. This way we will be able to get rid
of the first pass during the gbh creation where field_chains and
field_holders are created.

We also check that the field belongs to a spider table while walking
through the query, so we could remove
all_query_fields_are_query_table_members(). However, this requires an
earlier creation of the table_holder so that tables are added before
checking. We do that, and in doing so, also decouple table_holder and
spider_fields

Remove unused methods and fields. Add comments.

75f5906... by Yuchen Pei <email address hidden>

MDEV-26247 Remove some unused spider methods

Two methods from spider_fields. There are probably more of these
conn_holder related methods that can be removed

reappend_tables_part()
reappend_tables()

69c1706... by Yuchen Pei <email address hidden>

MDEV-28998 remove a known reference to a SPIDER_CONN when it is freed

6387499... by Yuchen Pei <email address hidden>

MDEV-32157 MDEV-28856 Spider: Tests, documentation, small fixes and cleanups

Removed some redundant hint related string literals from
spd_db_conn.cc

Clean up SPIDER_PARAM_*_[CHAR]LEN[S]

Adding tests covering monitoring_kind=2. What it does is that it reads
from mysql.spider_link_mon_servers with matching db_name, table_name,
link_id, and does not do anything about that...

How monitoring_* can be useful: in the deprecated spider high
availability feature, when one remote fails, spider will try another
remote, which apparently makes use of these table parameters.

A test covering the query_cache_sync table param. Some further tests
on some spider table params.

Wrapper should be case insensitive.

Code documentation on spider priority binary tree.

Add an assertion that static_key_cardinality is always -1. All tests
pass still

51e7989... by Yuchen Pei <email address hidden>

MDEV-32157 MDEV-28856 Spider: drop server in tests

This helps eliminate "server exists" failures

Also, spider/bugfix.mdev_29676, when enabled after MDEV-29525 is
pushed will fail because we have not --recorded the result. But the
failure will only emerge when working on MDEV-31138 where we manually
re-enable this test, so let's worry about that then.

61c2946... by Yuchen Pei <email address hidden>

MDEV-29502 Fix some issues with spider direct aggregate

The direct aggregate mechanism sems to be only intended to work when
otherwise a full table scan query will be executed from the spider
node and the aggregation done at the spider node too. Typically this
happens in sub_select(). In the test spider.direct_aggregate_part
direct aggregate allows to send COUNT statements directly to the data
nodes and adds up the results at the spider node, instead of iterating
over the rows one by one at the spider node.

By contrast, the group by handler (GBH) typically sends aggregated
queries directly to data nodes, in which case DA does not improve the
situation here.

That is why we should fix it by disabling DA when GBH is used.

There are other reasons supporting this change. First, the creation of
GBH results in a call to change_to_use_tmp_fields() (as opposed to
setup_copy_fields()) which causes the spider DA function
spider_db_fetch_for_item_sum_funcs() to work on wrong items. Second,
the spider DA function only calls direct_add() on the items, and the
follow-up add() needs to be called by the sql layer code. In
do_select(), after executing the query with the GBH, it seems that the
required add() would not necessarily be called.

Disabling DA when GBH is used does fix the bug. There are a few
other things included in this commit to improve the situation with
spider DA:

1. Add a session variable that allows user to disable DA completely,
this will help as a temporary measure if/when further bugs with DA
emerge.

2. Move the increment of direct_aggregate_count to the spider DA
function. Currently this is done in rather bizarre and random
locations.

3. Fix the spider_db_mbase_row creation so that the last of its row
field (sentinel) is NULL. The code is already doing a null check, but
somehow the sentinel field is on an invalid address, causing the
segfaults. With a correct implementation of the row creation, we can
avoid such segfaults.

906b506... by Yuchen Pei <email address hidden>

MDEV-31673 MDEV-29502 Remove spider_db_handler::need_lock_before_set_sql_for_exec

This function trivially returns false

6dfe8d1... by Yuchen Pei <email address hidden>

MDEV-31117 clean up spider connection info parsing

Spider connection string is a comma-separated parameter definitions,
where each definition is of the form "<param_title> <param_value>",
where <param_value> is quote delimited on both ends, with backslashes
acting as an escaping prefix.

The code however treated param title the same way as param value when
assigning, and have nonsensical fields like delim_title_len and
delim_title. We remove these.

We also clean up the spider comment connection string parsing,
including:

- Factoring out some code from the parsing function
- Rewriting the struct `st_spider_param_string_parse`, including
  replacing its messy methods with cleaner ones
- And any necessary changes caused by the above changes