Wrong result with multipart keys, in_to_exists=on, NOT IN in mwl#89

Bug #777597 reported by Philip Stoev
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
MariaDB
Fix Released
High
Timour Katchaounov

Bug Description

Not repeatable in maria-5.3, maria-5.2. The following query

SELECT * FROM t1 WHERE ( 6 ) NOT IN ( SELECT t2.f3 FROM t2 JOIN t3 ON t3.f10 = t2.f10);;

returns rows when executed with in-to-exists even though the subquery returns (6),(6) which means that the NOT IN predicate should be FALSE, making the entire WHERE condition FALSE.

The following things seem to be required:
- multipart index
- 1 row in the other table in the subquery

Innodb is required for this particular test case, but the bug was just observed with MyISAM.

EXPLAIN:

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY t1 ALL NULL NULL NULL NULL 4
2 DEPENDENT SUBQUERY t3 system NULL NULL NULL NULL 1
2 DEPENDENT SUBQUERY t2 ref_or_null f10 f10 10 const,const 2 Using where; Using index

test case:

--source include/have_innodb.inc

CREATE TABLE t1 ( f4 int);
INSERT IGNORE INTO t1 VALUES (2),(2),(2),(2);

CREATE TABLE t2 ( f3 int, f10 int, KEY (f10,f3) ) ENGINE=InnoDB;
INSERT IGNORE INTO t2 VALUES (6, 1), (6, 1);

CREATE TABLE t3 ( f10 int );
INSERT IGNORE INTO t3 VALUES (1);

SET SESSION optimizer_switch='in_to_exists=on,materialization=off';
SELECT * FROM t1 WHERE ( 6 ) NOT IN ( SELECT t2.f3 FROM t2 JOIN t3 ON t3.f10 = t2.f10);

bzr version-info:

revision-id: <email address hidden>
date: 2011-05-05 01:35:03 +0300
build-date: 2011-05-05 08:59:05 +0300
revno: 2981
branch-nick: maria-5.3-mwl89

Related branches

Changed in maria:
milestone: none → 5.3
assignee: nobody → Timour Katchaounov (timour)
Revision history for this message
Philip Stoev (pstoev-askmonty) wrote : Re: Wrong result with multipart keys, in_to_exists=on, NOT IN in maria-5.3-mwl89

Bug was just observed with MyISAM. I have corrected the title and the description.

summary: - Wrong result with innodb, multipart keys, in_to_exists=on, NOT IN in
+ Wrong result with multipart keys, in_to_exists=on, NOT IN in
maria-5.3-mwl89
description: updated
Changed in maria:
status: New → Confirmed
importance: Undecided → High
Changed in maria:
status: Confirmed → In Progress
Revision history for this message
Timour Katchaounov (timour) wrote :

Analysis:
--------------------------------------------

In the call chain during optimization of the subquery:

#0 Item_cache::is_null (this=0x7f060c0b8ca8) at item.h:3519
#1 0x00000000005c296f in Item_direct_ref::is_null (this=0x7f060c0b8dd0) at item.cc:6715
#2 0x0000000000730f9d in merge_key_fields (start=0x7f060c0c9468, new_fields=0x7f060c0c9498, end=0x7f060c0c94c8, and_level=3) at sql_select.cc:3490
#3 0x0000000000731dc4 in add_key_fields (join=0x7f060c0c2140, key_fields=0x7fff115817a0, and_level=0x7fff115817c0, cond=0x7f060c0c9370, usable_tables=18446744073709551615, sargables=0x7fff11581830) at sql_select.cc:3880
#4 0x00000000007334bf in update_ref_and_keys (thd=0x7f060c015368, keyuse=0x7fff11581810, join_tab=0x7f060c0b9c58, tables=2, cond=0x7f060c0c9370, normal_tables=18446744073709551615, select_lex=0x7f060c0a24a8, sargables=0x7fff11581830) at sql_select.cc:4382

The call to is_null() returns TRUE, which is wrong. This results in
Item_null replacing the field 'f3' in the KEY_FIELD, then this Item_null
is used for index access, producing a wrong result.

The inspecion of the Item_direct_ref->Item_cache_str object shows that:
  value_cached = false,
  null_value = true,
  value = NULL,
while the cache "source" Item_cache_str::example contains the string 'bbb'.
Therefore somehow the Item_cache_str is not updated from its value.

Let's investigate where/how Item_cache_str is updated.

* In 5.3 the relevant calls happen in the following order:

outer_join->optimize()
make_join_select()
  if (const_cond && !const_cond->val_int())
  Item_func_not::val_int
............
    Item_in_optimizer::val_int
      cache->store(args[0]);
      cache->cache_value(); <== This is where the cache is updated
      ......
      Item_in_subselect::exec
      JOIN::optimize
      make_join_statistics
      update_ref_and_keys
      ....
      add_key_fields
      merge_key_fields <== At this point the cache is available

* In 5.3-MWL#89
- first the optimizer optimizes all subqueries
  - expensive constant conditions are not evaluated during optimization,
    so we don't call Item_in_optimizer::val_int()
  - call make_join_statistics -> add_key_fields -> merge_key_fields,
    at this point the cache is not updated
- only after that execution will call Item_in_optimizer::val_int
  which will evaluate the cache too late

summary: - Wrong result with multipart keys, in_to_exists=on, NOT IN in
- maria-5.3-mwl89
+ Wrong result with multipart keys, in_to_exists=on, NOT IN in mwl#89
Changed in maria:
status: In Progress → Fix Committed
Changed in maria:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.