maria:bb-11.2-mdev-22534-item-ref-propagate-unsquashed

Last commit made on 2023-08-22
Get this branch:
git clone -b bb-11.2-mdev-22534-item-ref-propagate-unsquashed https://git.launchpad.net/maria

Branch merges

Branch information

Name:
bb-11.2-mdev-22534-item-ref-propagate-unsquashed
Repository:
lp:maria

Recent commits

ca390e5... by Yuchen Pei <email address hidden>

MDEV-22534 [demo] Assert const in Item*ref::propagate_equal_fields()

All tests prefixed with main.subselect pass:

mtr --suite main --do-test=subselect

pending CI for regressions.

3eeab3c... by Yuchen Pei <email address hidden>

MDEV-22534 Avoid inconsistent Item_ref::propagate_equal_fields

When the Item_ref points to another Item_ref, delegate the
propagate_equal_fields to the latter.

This fixes the inconsitency in equal_field of the underlying
Item_field, when an Item_ref points to a Item_direct_view_ref, as the
latter has its own propagate_equal_fields() that may "move" the
equal_field of the Item_field to itself.

ffc61fb... by Yuchen Pei <email address hidden>

MDEV-22534 Simplify the workaround for MDEV-31269

Skip exists2in transformations if we are in a ps execution

a87f227... by Yuchen Pei <email address hidden>

MDEV-22534 Decorrelate IN subquery

Transform

in (select inner_col' from inner_table where inner_col = outer_col)

to

, outer_col in (select inner_col', inner_col from inner_table)

Achieved by implementing Item_in_subselect::exists2in_processor(),
accompanied with comprehensive test coverage. Factored out common code
between the two transformations.

Caveat:

- Cannot recognise bad item mismatch in equalities that causes
  materialization to not materialize down the road

6219630... by Yuchen Pei <email address hidden>

MDEV-29502 Fix some issues with spider direct aggregate

The direct aggregate mechanism sems to be only intended to work when
otherwise a full table scan query will be executed from the spider
node and the aggregation done at the spider node too. Typically this
happens in sub_select(). In the test spider.direct_aggregate_part
direct aggregate allows to send COUNT statements directly to the data
nodes and adds up the results at the spider node, instead of iterating
over the rows one by one at the spider node.

By contrast, the group by handler (GBH) typically sends aggregated
queries directly to data nodes, in which case DA does not improve the
situation here.

That is why we should fix it by disabling DA when GBH is used.

There are other reasons supporting this change. First, the creation of
GBH results in a call to change_to_use_tmp_fields() (as opposed to
setup_copy_fields()) which causes the spider DA function
spider_db_fetch_for_item_sum_funcs() to work on wrong items. Second,
the spider DA function only calls direct_add() on the items, and the
follow-up add() needs to be called by the sql layer code. In
do_select(), after executing the query with the GBH, it seems that the
required add() would not necessarily be called.

Disabling DA when GBH is used does fix the bug. There are a few
other things included in this commit to improve the situation with
spider DA:

1. Add a session variable that allows user to disable DA completely,
this will help as a temporary measure if/when further bugs with DA
emerge.

2. Move the increment of direct_aggregate_count to the spider DA
function. Currently this is done in rather bizarre and random
locations.

3. Fix the spider_db_mbase_row creation so that the last of its row
field (sentinel) is NULL. The code is already doing a null check, but
somehow the sentinel field is on an invalid address, causing the
segfaults. With a correct implementation of the row creation, we can
avoid such segfaults.

ad1c2f1... by Yuchen Pei <email address hidden>

MDEV-31117 clean up spider connection info parsing

Spider connection string is a comma-separated parameter definitions,
where each definition is of the form "<param_title> <param_value>",
where <param_value> is quote delimited on both ends, with backslashes
acting as an escaping prefix.

The code however treated param title the same way as param value when
assigning, and have nonsensical fields like delim_title_len and
delim_title. We remove these.

We also clean up the spider comment connection string parsing,
including:

- Factoring out some code from the parsing function
- Rewriting the struct `st_spider_param_string_parse`, including
  replacing its messy methods with cleaner ones
- And any necessary changes caused by the above changes

c5b42d9... by Yuchen Pei

MDEV-22979 MDEV-27233 MDEV-28218 Fixing spider init bugs

Fix spider init bugs (MDEV-22979, MDEV-27233, MDEV-28218) while
preventing regression on old ones (MDEV-30370, MDEV-29904)

Two things are changed:

First, Spider initialisation is made fully synchronous, i.e. it no
longer happens in a background thread. Adapted from the original fix
by nayuta for MDEV-27233. This change itself would cause failure when
spider is initialised early, by plugin-load-add, due to dependency on
Aria and udf function creation, which are fixed in the second and
third parts below. Requires SQL Service, thus porting earlier versions
requires MDEV-27595

Second, if spider is initialised before udf_init(), create udf by
inserting into `mysql.func`, otherwise do it by `CREATE FUNCTION` as
usual. This change may be generalised in MDEV-31401.

Also factor out some clean-up queries from deinit_spider.inc for use
of spider init tests.

A minor caveat is that early spider initialisation will fail if the
server is bootstrapped for the first time, due to missing `mysql`
database which needs to be created by the bootstrap script.

32af3e1... by Yuchen Pei

MDEV-27095 clean up spd_init_query.h

Removing procedures that were created and dropped during init.

This also fixes a race condition where mtr test with
plugin-load-add=ha_spider.so causes post test check to fail as it
expects the procedures to still be there.

9ba5034... by Yuchen Pei

MDEV-27095 installing one spider plugin should not trigger others

There are several plugins in ha_spider: spider, spider_alloc_mem,
spider_wrapper_protocols, spider_rewrite etc.

INSTALL PLUGIN foo SONAME ha_spider causes all the other ones to be
installed by the init queries where foo is any of the plugins.

This introduces unnecessary complexiy. For example it reads
mysql.plugins to find all other plugins, causing the hack of moving
spider plugin init to a separate thread.

To install all spider related plugins, install soname ha_spider should
be used instead.

This also fixes spurious rows in mysql.plugin when installing say only
the spider plugin with `plugin-load-add=SPIDER=ha_spider.so`:

select * from mysql.plugin;
name dl
spider_alloc_mem ha_spider.so # should not be here
spider_wrapper_protocols ha_spider.so # should not be here

Adapted from part of the reverted commit
c160a115b8b6dcd54bb3daf1a751ee9c68b7ee47.

a3a9188... by Yuchen Pei

MDEV-31400 Simple plugin dependency resolution

We introduce simple plugin dependency. A plugin init function may
return HA_ERR_RETRY_INIT. If this happens during server startup when
the server is trying to initialise all plugins, the failed plugins
will be retried, until no more plugins succeed in initialisation or
want to be retried.

This will fix spider init bugs which is caused in part by its
dependency on Aria for initialisation.

The reason we need a new return code, instead of treating every
failure as a request for retry, is that it may be impossible to clean
up after a failed plugin initialisation. Take InnoDB for example, it
has a global variable `buf_page_cleaner_is_active`, which may not
satisfy an assertion during a second initialisation try, probably
because InnoDB does not expect the initialisation to be called
twice.