Merge into 5.6 : tokudb-clustering-query-opt : Code : Percona Server moved to https://jira.percona.com/projects/PS

Status:	Merged
Approved by:	Alexey Kopytov on 2014-03-28
Approved revision:	no longer in the source branch.
Merged at revision:	572
Proposed branch:	lp:~laurynas-biveinis/percona-server/tokudb-clustering-query-opt
Merge into:	lp:percona-server/5.6
Prerequisite:	lp:~laurynas-biveinis/percona-server/tokudb-multiple-clust-keys
Diff against target:	109 lines (+37/-8) 3 files modified sql/handler.h (+1/-0) sql/sql_planner.cc (+4/-2) sql/sql_select.cc (+32/-6)
To merge this branch:	bzr merge lp:~laurynas-biveinis/percona-server/tokudb-clustering-query-opt
Related bugs:	Link a bug report
Related blueprints:	Query optimizer support for secondary clustering keys (Medium)

Reviewer	Review Type	Date Requested	Status
Alexey Kopytov (community)		2014-03-19	Approve on 2014-03-28
Review via email: mp+211810@code.launchpad.net

This proposal supersedes a proposal from 2014-01-23.

Description of the change

2nd MP:

- removed MY_TEST
- added missing tab->do_loosescan guard
- removed the extra arg to find_shortest_key. Its call to find the shortest clustering key was replaced by passing the clustering tab->keys subset instead.
- adjusted find_shortest_key not to prefer PK over secondary key over all table columns if the secondary key is clustering.

http://jenkins.percona.com/job/percona-server-5.6-param/554/

1st MP:

Implement (probably partial) query optimizer support for secondary
clustering keys,
https://blueprints.launchpad.net/percona-server/+spec/multiple-clustering-keys-query-opt.

- Declare new index flag HA_CLUSTERED_INDEX, which a storage engine is
supposed to return in index_flags() for any secondary clustered index.

- Extend find_shortest_key() to consider any clustering keys as
  suitable for full table scans too. Moreover, add new arg bool
  sec_clustering_only, which, if true, causes find_shortest_key() to
  consider secondary clustering keys only. Pass false from all
  existing callers to keep the current behavior.

- Modify Optimize_table_order::best_access_path() to treat secondary
clustering keys as covering keys (no row read required) for access
time estimates.

- In make_join_readinfo(), consider secondary clustering keys for the
join if no suitable covering keys and before resorting to table
scan.

- In test_if_skip_sort_order(), skip sorting for secondary clustered
keys too.

- In test_if_cheaper_ordering(), treat secondary clustered keys as
covering keys.

http://jenkins.percona.com/job/percona-server-5.6-param/490/

Revision history for this message

Alexey Kopytov (akopytov) wrote on 2014-01-27: Posted in a previous version of this proposal

#

- all MY_TEST() occurrences in the patch are unnecessary (“if (a ? 1 :
0)” is equivalent to “if (a)”)

  - HA_CLUSTERED_INDEX can only be set for secondary indexes. Making
    index_flags() return HA_CLUSTERED_INDEX for PK in InnoDB tables
    would simplify a couple of cumbersome checks like “if (key == PK
    && primary_key_is_clustered(key) ||
    index_flags(key) & HA_CLUSTERED_INDEX)

- in the following code the check for best_clust_is_pk is redundant:

if (sec_clustering_only)
return best_clust_is_pk ? MAX_KEY : best_clustered;

- no check for tab->do_loosescan() around the find_shortest_key(...,
true) call?

  - the additional argument to find_shortest_key() and the logic around
    it look fishy. The intention was that if we are going to do a full
    table scan, but have no covering keys on the table (if we had, that
    would be converted to an index scan by existing logic), look for the
    shortest TokuDB CLUSTERING key in tab->keys and convert the table
    scan to an index scan on the shortest one, if any. But I see nothing
    that would guarantee TokuDB CLUSTERING keys to be present in
    tab->keys? Moreover, tab->keys is a list of possible keys that could
    be used to optimize an index scan with conditions on that
    index. Which is excluded by requiring either tab->select or
    tab->select->quick to be NULL.

    On top of that, the second argument to find_shortest_key() is
    already supposed to indicate which keys we should be looking at. So
    introducing another argument with the sane semantics looks
    redundant. Can we use a pre-created bitmap of CLUSTERING keys
    instead?

  - these changes are obviously asking for regression tests. Combined
    with my previous suggestion on disabling CLUSTERING keys for
    non-TokuDB tables, this means TokuDB SE should be a prerequisite for
    this MP. TokuDB probably already has this covered in its own test
    suite? Even if it doesn’t, basing this MP on a tree with TokuDB
    included would allow to create a proper test case.

- all MY_TEST() occurrences in the patch are unnecessary (“if (a ? 1 :
    0)” is equivalent to “if (a)”)

- HA_CLUSTERED_INDEX can only be set for secondary indexes. Making
    index_flags() return HA_CLUSTERED_INDEX for PK in InnoDB tables
    would simplify a couple of cumbersome checks like “if (key == PK
    && primary_key_is_clustered(key) ||
    index_flags(key) & HA_CLUSTERED_INDEX)

- in the following code the check for best_clust_is_pk is redundant:

if (sec_clustering_only)
    return best_clust_is_pk ? MAX_KEY : best_clustered;

- no check for tab->do_loosescan() around the find_shortest_key(...,
    true) call?
 
  - the additional argument to find_shortest_key() and the logic around
    it look fishy. The intention was that if we are going to do a full
    table scan, but have no covering keys on the table (if we had, that
    would be converted to an index scan by existing logic), look for the
    shortest TokuDB CLUSTERING key in tab->keys and convert the table
    scan to an index scan on the shortest one, if any. But I see nothing
    that would guarantee TokuDB CLUSTERING keys to be present in
    tab->keys? Moreover, tab->keys is a list of possible keys that could
    be used to optimize an index scan with conditions on that
    index. Which is excluded by requiring either tab->select or
    tab->select->quick to be NULL.

On top of that, the second argument to find_shortest_key() is
    already supposed to indicate which keys we should be looking at. So
    introducing another argument with the sane semantics looks
    redundant. Can we use a pre-created bitmap of CLUSTERING keys
    instead?
     
  - these changes are obviously asking for regression tests. Combined
    with my previous suggestion on disabling CLUSTERING keys for
    non-TokuDB tables, this means TokuDB SE should be a prerequisite for
    this MP. TokuDB probably already has this covered in its own test
    suite? Even if it doesn’t, basing this MP on a tree with TokuDB
    included would allow to create a proper test case.

review: Needs Fixing

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2014-03-19:

#

Download full text (3.8 KiB)

> - all MY_TEST() occurrences in the patch are unnecessary (“if (a ? 1 :
> 0)” is equivalent to “if (a)”)

Heh, fixed.

> - HA_CLUSTERED_INDEX can only be set for secondary indexes. Making
> index_flags() return HA_CLUSTERED_INDEX for PK in InnoDB tables
> would simplify a couple of cumbersome checks like “if (key == PK
> && primary_key_is_clustered(key) ||
> index_flags(key) & HA_CLUSTERED_INDEX)

That's possible, I tested it, but honestly I don't like it much. There
is only a couple of those cumbersome checks where it helps, they diff
easily against upstream, it makes "no new code runs if no index has
HA_CLUSTERED_INDEX in flags" reasoning less trivial, and I guess
XtraDB downstream will like not introducing new index flag better.

> - in the following code the check for best_clust_is_pk is redundant:
>
> if (sec_clustering_only)
> return best_clust_is_pk ? MAX_KEY : best_clustered;

Fixed.

> - no check for tab->do_loosescan() around the find_shortest_key(...,
> true) call?

Fixed.

> - the additional argument to find_shortest_key() and the logic around
> it look fishy. The intention was that if we are going to do a full
> table scan, but have no covering keys on the table (if we had, that
> would be converted to an index scan by existing logic), look for the
> shortest TokuDB CLUSTERING key in tab->keys and convert the table
> scan to an index scan on the shortest one, if any. But I see nothing
> that would guarantee TokuDB CLUSTERING keys to be present in
> tab->keys?

Query optimizer "index" join type has a case where a (non-clustering)
key is read to read data rows in that key order. The existing logic
thus adds such keys, and if they happen to be clustering, we'll avoid
the data row read. It might miss some of the keys, but IMHO safe to
leave for future work by Tokutek.

> Moreover, tab->keys is a list of possible keys that could
> be used to optimize an index scan with conditions on that
> index. Which is excluded by requiring either tab->select or
> tab->select->quick to be NULL.

Yes. But we indeed have the case of tab->select == NULL,
tab->select->quick == NULL, and non-empty tab->keys here, probably
due to the above. I guess tab->keys isn't pruned if quick select
setup fails.

> On top of that, the second argument to find_shortest_key() is
> already supposed to indicate which keys we should be looking at. So
> introducing another argument with the sane semantics looks
> redundant. Can we use a pre-created bitmap of CLUSTERING keys
> instead?

Yes, it's a good idea. The question is how to create such bitmap. For
now I create it locally using tab->keys. This might miss some of the
keys, but it's local, good enough for Tokutek, and easy to extend in
the future as needed. An alternative would be add new key bitmap to
the table struct, and add index hint processing.

With this bitmap, the only change in the find_shortest_key is not to
assume that primary key is cheaper than a secondary key on all
columns, if the latter is clustered.

> - these changes are obviously asking for regression tests. Combined
> with my previous su...

>   - all MY_TEST() occurrences in the patch are unnecessary (“if (a ? 1 :
>     0)” is equivalent to “if (a)”)

Heh, fixed.

>   - HA_CLUSTERED_INDEX can only be set for secondary indexes. Making
>     index_flags() return HA_CLUSTERED_INDEX for PK in InnoDB tables
>     would simplify a couple of cumbersome checks like “if (key == PK
>     && primary_key_is_clustered(key) ||
>     index_flags(key) & HA_CLUSTERED_INDEX)

That's possible, I tested it, but honestly I don't like it much. There
is only a couple of those cumbersome checks where it helps, they diff
easily against upstream, it makes "no new code runs if no index has
HA_CLUSTERED_INDEX in flags" reasoning less trivial, and I guess
XtraDB downstream will like not introducing new index flag better.

>   - in the following code the check for best_clust_is_pk is redundant:
> 
>   if (sec_clustering_only)
>     return best_clust_is_pk ? MAX_KEY : best_clustered;

Fixed.

>   - no check for tab->do_loosescan() around the find_shortest_key(...,
>     true) call?

Fixed.

>   - the additional argument to find_shortest_key() and the logic around
>     it look fishy. The intention was that if we are going to do a full
>     table scan, but have no covering keys on the table (if we had, that
>     would be converted to an index scan by existing logic), look for the
>     shortest TokuDB CLUSTERING key in tab->keys and convert the table
>     scan to an index scan on the shortest one, if any. But I see nothing
>     that would guarantee TokuDB CLUSTERING keys to be present in
>     tab->keys?

Query optimizer "index" join type has a case where a (non-clustering)
key is read to read data rows in that key order. The existing logic
thus adds such keys, and if they happen to be clustering, we'll avoid
the data row read. It might miss some of the keys, but IMHO safe to
leave for future work by Tokutek.

>     Moreover, tab->keys is a list of possible keys that could
>     be used to optimize an index scan with conditions on that
>     index. Which is excluded by requiring either tab->select or
>     tab->select->quick to be NULL.

Yes. But we indeed have the case of tab->select == NULL,
tab->select->quick == NULL, and non-empty tab->keys here, probably
due to the above. I guess tab->keys isn't pruned if quick select
setup fails.

>     On top of that, the second argument to find_shortest_key() is
>     already supposed to indicate which keys we should be looking at. So
>     introducing another argument with the sane semantics looks
>     redundant. Can we use a pre-created bitmap of CLUSTERING keys
>     instead?

Yes, it's a good idea. The question is how to create such bitmap. For
now I create it locally using tab->keys. This might miss some of the
keys, but it's local, good enough for Tokutek, and easy to extend in
the future as needed. An alternative would be add new key bitmap to
the table struct, and add index hint processing.

With this bitmap, the only change in the find_shortest_key is not to
assume that primary key is cheaper than a secondary key on all
columns, if the latter is clustered.

>   - these changes are obviously asking for regression tests. Combined
>     with my previous suggestion on disabling CLUSTERING keys for
>     non-TokuDB tables, this means TokuDB SE should be a prerequisite for
>     this MP. TokuDB probably already has this covered in its own test
>     suite? Even if it doesn’t, basing this MP on a tree with TokuDB
>     included would allow to create a proper test case.

Replied offline previously. I have tried to arrange the patch so that
no new code runs if no HA_CLUSTERED_INDEX is returned in
index_flags. The only bit that does run is the following in
make_join_readinfo.

+          else if (!(tab->select && tab->select->quick))
+          {
+            DBUG_ASSERT(table->covering_keys.is_clear_all());
...
+          }

Revision history for this message

Laurynas Biveinis (laurynas-biveinis) wrote on 2014-03-26:

#

Repushed branch because the prerequisite branch changed, at the same time rebased on the current trunk. No changes in the patch.

http://jenkins.percona.com/job/percona-server-5.6-param/564/

Revision history for this message

Alexey Kopytov (akopytov) on 2014-03-28:

#

review: Approve

 === modified file 'sql/handler.h'
 --- sql/handler.h	2014-03-26 10:32:59 +0000
 +++ sql/handler.h	2014-03-26 10:32:59 +0000
@@ -261,6 +261,7 @@
  */
  #define HA_KEY_SCAN_NOT_ROR     128
  #define HA_DO_INDEX_COND_PUSHDOWN  256 /* Supports Index Condition Pushdown */
++#define HA_CLUSTERED_INDEX      512     /* Data is clustered on this key */
 === modified file 'sql/sql_planner.cc'
 --- sql/sql_planner.cc	2013-12-05 17:23:10 +0000
 +++ sql/sql_planner.cc	2014-03-26 10:32:59 +0000
@@ -648,7 +648,8 @@
              /* Limit the number of matched rows */
              tmp= records;
              set_if_smaller(tmp, (double) thd->variables.max_seeks_for_key);
--            if (table->covering_keys.is_set(key))
++            if (table->covering_keys.is_set(key)
++                || (table->file->index_flags(key, 0, 0) & HA_CLUSTERED_INDEX))
+             {
                /* we can use only index tree */
                tmp= record_count * table->file->index_only_read_time(key, tmp);
@@ -823,7 +824,8 @@
              /* Limit the number of matched rows */
              set_if_smaller(tmp, (double) thd->variables.max_seeks_for_key);
--            if (table->covering_keys.is_set(key))
++            if (table->covering_keys.is_set(key)
++                || (table->file->index_flags(key, 0, 0) & HA_CLUSTERED_INDEX))
+             {
                /* we can use only index tree */
                tmp= record_count * table->file->index_only_read_time(key, tmp);
 === modified file 'sql/sql_select.cc'
 --- sql/sql_select.cc	2014-02-17 11:12:40 +0000
 +++ sql/sql_select.cc	2014-03-26 10:32:59 +0000
@@ -2910,7 +2910,28 @@
  	    tab->read_first_record= join_read_first;
              tab->type=JT_INDEX_SCAN;      // Read with index_first / index_next
+ 	  }
--	}
++          else if (!(tab->select && tab->select->quick))
++          {
++            DBUG_ASSERT(table->covering_keys.is_clear_all());
++            if (!tab->do_loosescan())
++            {
++              key_map clustering_keys;
++              for (uint i= 0; i < table->s->keys; i++)
++              {
++                if (tab->keys.is_set(i)
++                    && table->file->index_flags(i, 0, 0) & HA_CLUSTERED_INDEX)
++                  clustering_keys.set_bit(i);
++              }
++              uint index= find_shortest_key(table, &clustering_keys);
++              if (index != MAX_KEY)
++              {
++                tab->index= index;
++                tab->read_first_record= join_read_first;
++                tab->type= JT_INDEX_SCAN;
++              }
++            }
++          }
++        }
          if (tab->select && tab->select->quick &&
              tab->select->quick->index != MAX_KEY && ! tab->table->key_read)
            push_index_cond(tab, tab->select->quick->index, icp_other_tables_ok,
@@ -3644,13 +3665,14 @@
+   {
      /*
       If the primary key is clustered and found shorter key covers all table
--     fields then primary key scan normally would be faster because amount of
--     data to scan is the same but PK is clustered.
++     fields and is not clustering then primary key scan normally would be
++     faster because amount of data to scan is the same but PK is clustered.
       It's safe to compare key parts with table fields since duplicate key
       parts aren't allowed.
       */
      if (best == MAX_KEY ||
--        table->key_info[best].user_defined_key_parts >= table->s->fields)
++        ((table->key_info[best].user_defined_key_parts >= table->s->fields)
++         && !(table->file->index_flags(best, 0, 0) & HA_CLUSTERED_INDEX)))
        best= usable_clustered_pk;
+   }
    return best;
@@ -4101,7 +4123,9 @@
          (tab->type == JT_ALL &&
           tab->join->primary_tables > tab->join->const_tables + 1) &&
           ((unsigned) best_key != table->s->primary_key ||
--          !table->file->primary_key_is_clustered()))
++          !table->file->primary_key_is_clustered()) &&
++        !(best_key >= 0
++          && (table->file->index_flags(best_key, 0, 0) & HA_CLUSTERED_INDEX)))
+     {
        can_skip_sorting= false;
        goto fix_ICP;
@@ -5529,7 +5553,9 @@
        bool is_covering= table->covering_keys.is_set(nr) ||
                          (nr == table->s->primary_key &&
--                        table->file->primary_key_is_clustered());
++                        table->file->primary_key_is_clustered()) ||
++                        (table->file->index_flags(nr, 0, 0)
++                         & HA_CLUSTERED_INDEX);
        /*
          Don't use an index scan with ORDER BY without limit.

Percona Server moved to https://jira.percona.com/projects/PS

Merge lp:~laurynas-biveinis/percona-server/tokudb-clustering-query-opt into lp:percona-server/5.6

Commit message

Description of the change

Preview Diff

Subscribers