Percona Server moved to https://jira.percona.com/projects/PS

Code review comment for lp:~laurynas-biveinis/percona-server/tokudb-clustering-query-opt

tokudb-clustering-query-opt
Merge into 5.6

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2014-03-19:

> - all MY_TEST() occurrences in the patch are unnecessary (“if (a ? 1 :
> 0)” is equivalent to “if (a)”)

Heh, fixed.

> - HA_CLUSTERED_INDEX can only be set for secondary indexes. Making
> index_flags() return HA_CLUSTERED_INDEX for PK in InnoDB tables
> would simplify a couple of cumbersome checks like “if (key == PK
> && primary_key_is_clustered(key) ||
> index_flags(key) & HA_CLUSTERED_INDEX)

That's possible, I tested it, but honestly I don't like it much. There
is only a couple of those cumbersome checks where it helps, they diff
easily against upstream, it makes "no new code runs if no index has
HA_CLUSTERED_INDEX in flags" reasoning less trivial, and I guess
XtraDB downstream will like not introducing new index flag better.

> - in the following code the check for best_clust_is_pk is redundant:
>
> if (sec_clustering_only)
> return best_clust_is_pk ? MAX_KEY : best_clustered;

Fixed.

> - no check for tab->do_loosescan() around the find_shortest_key(...,
> true) call?

Fixed.

> - the additional argument to find_shortest_key() and the logic around
> it look fishy. The intention was that if we are going to do a full
> table scan, but have no covering keys on the table (if we had, that
> would be converted to an index scan by existing logic), look for the
> shortest TokuDB CLUSTERING key in tab->keys and convert the table
> scan to an index scan on the shortest one, if any. But I see nothing
> that would guarantee TokuDB CLUSTERING keys to be present in
> tab->keys?

Query optimizer "index" join type has a case where a (non-clustering)
key is read to read data rows in that key order. The existing logic
thus adds such keys, and if they happen to be clustering, we'll avoid
the data row read. It might miss some of the keys, but IMHO safe to
leave for future work by Tokutek.

> Moreover, tab->keys is a list of possible keys that could
> be used to optimize an index scan with conditions on that
> index. Which is excluded by requiring either tab->select or
> tab->select->quick to be NULL.

Yes. But we indeed have the case of tab->select == NULL,
tab->select->quick == NULL, and non-empty tab->keys here, probably
due to the above. I guess tab->keys isn't pruned if quick select
setup fails.

> On top of that, the second argument to find_shortest_key() is
> already supposed to indicate which keys we should be looking at. So
> introducing another argument with the sane semantics looks
> redundant. Can we use a pre-created bitmap of CLUSTERING keys
> instead?

Yes, it's a good idea. The question is how to create such bitmap. For
now I create it locally using tab->keys. This might miss some of the
keys, but it's local, good enough for Tokutek, and easy to extend in
the future as needed. An alternative would be add new key bitmap to
the table struct, and add index hint processing.

With this bitmap, the only change in the find_shortest_key is not to
assume that primary key is cheaper than a secondary key on all
columns, if the latter is clustered.

> - these changes are obviously asking for regression tests. Combined
> with my previous suggestion on disabling CLUSTERING keys for
> non-TokuDB tables, this means TokuDB SE should be a prerequisite for
> this MP. TokuDB probably already has this covered in its own test
> suite? Even if it doesn’t, basing this MP on a tree with TokuDB
> included would allow to create a proper test case.

Replied offline previously. I have tried to arrange the patch so that
no new code runs if no HA_CLUSTERED_INDEX is returned in
index_flags. The only bit that does run is the following in
make_join_readinfo.

+ else if (!(tab->select && tab->select->quick))
+ {
+ DBUG_ASSERT(table->covering_keys.is_clear_all());
...
+ }

>   - all MY_TEST() occurrences in the patch are unnecessary (“if (a ? 1 :
>     0)” is equivalent to “if (a)”)

Heh, fixed.

>   - HA_CLUSTERED_INDEX can only be set for secondary indexes. Making
>     index_flags() return HA_CLUSTERED_INDEX for PK in InnoDB tables
>     would simplify a couple of cumbersome checks like “if (key == PK
>     && primary_key_is_clustered(key) ||
>     index_flags(key) & HA_CLUSTERED_INDEX)

>   - in the following code the check for best_clust_is_pk is redundant:
> 
>   if (sec_clustering_only)
>     return best_clust_is_pk ? MAX_KEY : best_clustered;

Fixed.

>   - no check for tab->do_loosescan() around the find_shortest_key(...,
>     true) call?

Fixed.

>   - the additional argument to find_shortest_key() and the logic around
>     it look fishy. The intention was that if we are going to do a full
>     table scan, but have no covering keys on the table (if we had, that
>     would be converted to an index scan by existing logic), look for the
>     shortest TokuDB CLUSTERING key in tab->keys and convert the table
>     scan to an index scan on the shortest one, if any. But I see nothing
>     that would guarantee TokuDB CLUSTERING keys to be present in
>     tab->keys?

>     Moreover, tab->keys is a list of possible keys that could
>     be used to optimize an index scan with conditions on that
>     index. Which is excluded by requiring either tab->select or
>     tab->select->quick to be NULL.

>     On top of that, the second argument to find_shortest_key() is
>     already supposed to indicate which keys we should be looking at. So
>     introducing another argument with the sane semantics looks
>     redundant. Can we use a pre-created bitmap of CLUSTERING keys
>     instead?

With this bitmap, the only change in the find_shortest_key is not to
assume that primary key is cheaper than a secondary key on all
columns, if the latter is clustered.

>   - these changes are obviously asking for regression tests. Combined
>     with my previous suggestion on disabling CLUSTERING keys for
>     non-TokuDB tables, this means TokuDB SE should be a prerequisite for
>     this MP. TokuDB probably already has this covered in its own test
>     suite? Even if it doesn’t, basing this MP on a tree with TokuDB
>     included would allow to create a proper test case.

+          else if (!(tab->select && tab->select->quick))
+          {
+            DBUG_ASSERT(table->covering_keys.is_clear_all());
...
+          }

« Back to merge proposal