drizzle date and time functions corrupt blob data in pbxt engine

Bug #562349 reported by Vladimir Kolesnikov
This bug affects 1 person
Affects   Status         Importance   Assigned to        Milestone
Drizzle   Fix Released   High         David Shrewsbury
7.0       Fix Released   High         David Shrewsbury
Cherry    Won't Fix      High         Unassigned
Dexter    Won't Fix      High         Unassigned

Bug Description

The log is a bit long, so I will describe the problem with SQL and comments...

CREATE TABLE t1 (
  int_fld INT NOT NULL
, date_fld DATE NOT NULL
, datetime_fld DATETIME NOT NULL
, timestamp_fld TIMESTAMP NOT NULL
, char_fld CHAR(22) NOT NULL
, varchar_fld VARCHAR(22) NOT NULL
, text_fld TEXT NOT NULL
, blob_fld BLOB NOT NULL
) engine=pbxt;

INSERT INTO t1 VALUES (
20071130
, "2007-11-30"
, "2007-11-30 16:30:19"
, "2007-11-30T16:30:19"
, "2007-11-30 16:30:19"
, "2007-11-30 16:30:19"
, "2007-11-30T16:30:19"
, "2007-11-30T16:30:19"
);

-- the "pure" select queries below work fine and return correct values

SELECT int_fld FROM t1;
SELECT date_fld FROM t1;
SELECT datetime_fld FROM t1;
SELECT timestamp_fld FROM t1;
SELECT char_fld FROM t1;
SELECT varchar_fld FROM t1;
SELECT text_fld FROM t1;
SELECT blob_fld FROM t1;

-- of the queries below, only the last one, SELECT HOUR(blob_fld) FROM t1, fails

SELECT HOUR(int_fld) FROM t1;
SELECT HOUR(date_fld) FROM t1;
SELECT HOUR(datetime_fld) FROM t1;
SELECT HOUR(timestamp_fld) FROM t1;
SELECT HOUR(char_fld) FROM t1;
SELECT HOUR(varchar_fld) FROM t1;
SELECT HOUR(text_fld) FROM t1;
SELECT HOUR(blob_fld) FROM t1;

...
drizzle> SELECT HOUR(blob_fld) FROM t1;
ERROR 1686 (HY000): Received an invalid datetime value ''.

Details:

The fields text_fld and blob_fld have the following layout inside the engine: <text_fld-length><text_fld-data><blob_fld-length><blob_fld-data>

After the query SELECT HOUR(text_fld) FROM t1; the <blob_fld-length> value gets zeroed (by code outside of the engine), so the next query, SELECT HOUR(blob_fld) FROM t1;, sees a zero-length field. I guess the problem might be that Drizzle's function assumes that the buffer returned by the engine is Drizzle's own buffer, and appends a zero byte to the string before parsing it.
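
To make the suspected mechanism concrete, here is a minimal C++ sketch. It assumes a simplified row layout with a one-byte length in front of each field; the real PBXT record format and the Drizzle code path may differ, and the names make_row, blob_length and terminate_text_in_place are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Assumed layout (simplified): <text-length><text-data><blob-length><blob-data>,
// with each length stored in a single byte. Not the real PBXT format.
std::vector<std::uint8_t> make_row(const std::string& text, const std::string& blob)
{
  assert(text.size() < 256 && blob.size() < 256);  // lengths must fit in one byte
  std::vector<std::uint8_t> row;
  row.push_back(static_cast<std::uint8_t>(text.size()));
  row.insert(row.end(), text.begin(), text.end());
  row.push_back(static_cast<std::uint8_t>(blob.size()));
  row.insert(row.end(), blob.begin(), blob.end());
  return row;
}

// Reads the length byte of the second field (the blob).
std::size_t blob_length(const std::vector<std::uint8_t>& row)
{
  const std::size_t text_len = row[0];
  return row[1 + text_len];
}

// Mimics a caller that NUL-terminates the text field directly in the row buffer.
// The byte one past the text data is the blob's length byte, so it gets zeroed.
void terminate_text_in_place(std::vector<std::uint8_t>& row)
{
  const std::size_t text_len = row[0];
  row[1 + text_len] = 0;
}
```

With both fields set to "2007-11-30T16:30:19", blob_length reports 19 before terminate_text_in_place is called and 0 afterwards, which would match the empty value in the error message.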


Jay Pipes (jaypipes) wrote :

So, I'm looking into this, but what I'm wondering is: if the bug is in the kernel code, why doesn't it show up with the InnoDB plugin? Any insights?

Vladimir Kolesnikov (vkolesnikov) wrote :

Jay,

I didn't look into the Drizzle code. All I can tell is that we return a non-corrupted buffer. This can be tested by running the first set of SELECTs multiple times: they always return the correct values. But once you call a date-time function on the previous field (text_fld), the buffer of blob_fld gets corrupted and the returned value will always be "" (unless it's refreshed in PBXT's internal cache or the server is restarted).

Jay Pipes (jaypipes) wrote :

OK, thx Vlad! I'll look further into it today (conference taking lots of my time... :) )

-jay

Changed in drizzle:
status: New → Confirmed
Paul McCullagh (paul-mccullagh) wrote :

I looked into this before the conference as well. As far as I can tell, the problem is that somewhere in Drizzle a zero terminator is being set in order to convert the 'text_fld' field to an HOUR value.

So the zero terminator overwrites one byte in the buffer. In the case of PBXT this buffer is pointing directly into the PBXT sequential scan buffer cache.

The byte happens to be the size of the second BLOB (blob_fld), which follows directly after text_fld. So this overwrite sets the size of the value in the blob_fld field to zero.

And that is why the error occurs.

I presume InnoDB returns a pointer to a copy of the data, and therefore this overwrite does not affect it.
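
The difference between the two engine behaviours can be sketched as follows. This is only an illustration under assumptions: ScanCache, field_by_pointer, field_by_copy and terminate are hypothetical names, the cache holds four bytes of text followed by a one-byte length for the next field, and the claim about InnoDB is the presumption stated above, not something verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an engine's scan cache: four bytes of text data
// followed directly by the one-byte length (4) of the next field.
struct ScanCache
{
  char bytes[5] = {'1', '6', ':', '3', 4};
};

// Style A (as described for PBXT): hand out a pointer straight into the cache.
char* field_by_pointer(ScanCache& cache) { return cache.bytes; }

// Style B (as presumed for InnoDB): hand out a private copy with one spare byte.
std::vector<char> field_by_copy(const ScanCache& cache)
{
  std::vector<char> copy(cache.bytes, cache.bytes + 4);
  copy.push_back('\0');  // room for a terminator, owned by the copy
  return copy;
}

// A caller that terminates whatever buffer it was given.
void terminate(char* data, std::size_t length, std::size_t capacity)
{
  assert(length < capacity);  // stay inside the buffer we were told about
  data[length] = '\0';
}
```

Terminating the pointer from field_by_pointer zeroes the neighbouring length byte inside the cache, while terminating the buffer from field_by_copy leaves the cache untouched. That would explain why the same kernel code misbehaves with one engine and not the other.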

Monty Taylor (mordred) wrote : Re: [Bug 562349] Re: drizzle date and time functions corrupt blob data in pbxt engine

On 4/20/2010 3:56 PM, Paul McCullagh wrote:
> I looked into this before the conference as well. As far as I can tell
> the problem is that somewhere in Drizzle a zero terminator is being set
> in order to do the conversion of the 'text_fld' field to an HOUR value.
>
> So the zero terminator overwrites one byte in the buffer. In the case of
> PBXT this buffer is pointing directly into the PBXT sequential scan
> buffer cache.
>
> The byte happens to be the size of the second BLOB (blob_fld), which
> follows directly after text_fld. So this overwrite sets the size of
> the value in the blob_fld field to zero.
>
> And that is why the error occurs.
>
> I presume InnoDB returns a pointer to a copy of the data, and therefore
> this overwrite does not affect it.
>

Sigh.

(setting random \0 bytes in the middle of a buffer)--

Jay Pipes (jaypipes)
Changed in drizzle:
milestone: 2010-05-10 → 2010-05-24
Changed in drizzle:
milestone: 2010-05-24 → none
Changed in drizzle:
assignee: Jay Pipes (jaypipes) → nobody
David Shrewsbury (dshrews) wrote :

Blocked by bug 618758.

Paul McCullagh (paul-mccullagh) wrote :

A bug fix for this problem has been pushed to lp:~paul-mccullagh/drizzle/bug-fix-618758.

David Shrewsbury (dshrews) wrote :

Adding a comment to remember where I'm at in debugging:

So initially I thought it was conversion to a string that may be overwriting the record buffer, but it looks like the problem may be within table.cc:

(gdb) c
Continuing.

Breakpoint 3, drizzled::Field_blob::val_str (this=0x2f1cfe8, val_ptr=0x2e4d668)
    at drizzled/field/blob.cc:325
325 memcpy(&blob,ptr+packlength,sizeof(char*));
(gdb) p this->ptr
$13 = (unsigned char *) 0x2e4fcde "\023"
(gdb) x/40xb 0x2e4fcde
0x2e4fcde: 0x13 0x00 0x00 0x00 0x56 0x00 0x13 0x03
0x2e4fce6: 0x00 0x00 0x00 0x00 0x13 0x00 0x00 0x00 <-- 0x13 is the length of blob field
0x2e4fcee: 0x6a 0x00 0x13 0x03 0x00 0x00 0x00 0x00
0x2e4fcf6: 0x00 0x00 0xff 0x00 0x00 0x00 0x00 0x00
0x2e4fcfe: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
(gdb) watch *((int*)0x2e4fcea)
Hardware watchpoint 4: *((int*)0x2e4fcea)
(gdb) c
Continuing.

Breakpoint 2, drizzled::Item_func_hour::val_int (this=0x2e4d748)
    at drizzled/function/time/hour.cc:86
86 if (! temporal_datetime.from_string(res->c_ptr(), res->length()))
(gdb) c
Continuing.
Hardware watchpoint 4: *((int*)0x2e4fcea)

Old value = 19
New value = 0
memcpy () at ../sysdeps/x86_64/memcpy.S:120
120 ../sysdeps/x86_64/memcpy.S: No such file or directory.
 in ../sysdeps/x86_64/memcpy.S
(gdb) list
115 in ../sysdeps/x86_64/memcpy.S
(gdb) bt
#0 memcpy () at ../sysdeps/x86_64/memcpy.S:120
#1 0x000000000081c89f in drizzled::Table::restoreRecordAsDefault (
    this=0x2f1ac00) at drizzled/table.cc:1864
#2 0x000000000081c8c0 in drizzled::Table::emptyRecord (this=0x2f1ac00)
    at drizzled/table.cc:1873
#3 0x0000000000773fe6 in drizzled::ReadRecord::init_read_record (
    this=0x2e2ba98, session_arg=0x2e4c1d0, table_arg=0x2f1ac00,
    select_arg=0x2e2bcb0, use_record_cache=1, print_error_arg=true)
    at drizzled/records.cc:89
#4 0x00000000007c5c83 in drizzled::join_init_read_record (tab=0x2e2ba10)
    at drizzled/sql_select.cc:3967
#5 0x00000000007c4ac0 in drizzled::sub_select (join=0x2e4da30,
    join_tab=0x2e2ba10, end_of_records=false) at drizzled/sql_select.cc:3563
#6 0x00000000007c46bf in drizzled::do_select (join=0x2e4da30,
    fields=0x2e4cf40, table=0x0) at drizzled/sql_select.cc:3333
#7 0x00000000006f2d0f in drizzled::Join::exec (this=0x2e4da30)
    at drizzled/join.cc:1695
#8 0x00000000007bd91b in drizzled::mysql_select (session=0x2e4c1d0,
    rref_pointer_array=0x2e4d000, tables=0x2e4d860, wild_num=0, fields=...,
    conds=0x0, og_num=0, order=0x0, group=0x0, having=0x0,
    select_options=2147500032, result=0x2e4da10, unit=0x2e4cc48,
    select_lex=0x2e4ce48) at drizzled/sql_select.cc:427
#9 0x00000000007bd1f1 in drizzled::handle_select (session=0x2e4c1d0,
    lex=0x2e4cc28, result=0x2e4da10, setup_tables_done_option=0)
    at drizzled/sql_select.cc:146
#10 0x00000000007b98d8 in drizzled::execute_sqlcom_select (session=0x2e4c1d0,
    all_tables=0x2e4d860) at drizzled/sql_parse.cc:544
#11 0x0000000000812e7f in drizzled::statement::Select::execute (this=0x2e50080)
    at drizzled/statement/select.cc:32
#12 0x00000000007b9463 in mysql_execute_command (session=0x2e4c1d0)
    at drizzled/sql_parse.cc:479
#13 0x00000000007ba1b2 in drizzled::mysql_parse (sess...


David Shrewsbury (dshrews) wrote :

Forget that last comment... supposed to happen.

Andrew Hutchings (linuxjedi) wrote :

Fix for hour.cc:

=== modified file 'drizzled/function/time/hour.cc'
--- drizzled/function/time/hour.cc 2010-02-04 08:14:46 +0000
+++ drizzled/function/time/hour.cc 2010-09-09 17:38:41 +0000
@@ -57,7 +57,13 @@
   char time_buff[DRIZZLE_MAX_LENGTH_DATETIME_AS_STRING];
   String tmp_time(time_buff,sizeof(time_buff), &my_charset_utf8_bin);
   String *time_res= args[0]->val_str(&tmp_time);
-  if (! temporal_time.from_string(time_res->c_ptr(), time_res->length()))
+  if (time_res && time_res != &tmp_time)
+  {
+    tmp_time.copy(*time_res);
+  }
+
+
+  if (! temporal_time.from_string(tmp_time.c_ptr(), tmp_time.length()))
   {
     /*
      * OK, we failed to match the first argument as a string
@@ -83,7 +89,12 @@
           char buff[DRIZZLE_MAX_LENGTH_DATETIME_AS_STRING];
           String tmp(buff,sizeof(buff), &my_charset_utf8_bin);
           String *res= args[0]->val_str(&tmp);
-          if (! temporal_datetime.from_string(res->c_ptr(), res->length()))
+          if (res && res != &tmp)
+          {
+            tmp.copy(*res);
+          }
+
+          if (! temporal_datetime.from_string(tmp.c_ptr(), tmp.length()))
           {
             /*
             * Could not interpret the function argument as a temporal value,
@@ -111,7 +122,12 @@

           res= args[0]->val_str(&tmp);

-          my_error(ER_INVALID_DATETIME_VALUE, MYF(0), res->c_ptr());
+          if (res && res != &tmp)
+          {
+            tmp.copy(*res);
+          }
+
+          my_error(ER_INVALID_DATETIME_VALUE, MYF(0), tmp.c_ptr());
           return 0;
         }
     }
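
The idea behind the patch, reduced to a standalone C++ sketch: copy the value into storage the caller owns before asking for a terminated string, so the terminator never lands in someone else's buffer. ValueRef and terminated_copy are hypothetical stand-ins for illustration; they are not Drizzle's String class or its API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a value returned by a field. It may alias memory
// owned by someone else, for example an engine's scan cache.
struct ValueRef
{
  const char* data;
  std::size_t length;
};

// Copies the value into caller-owned storage unless it already lives there,
// then returns a terminated string backed by that storage only.
const char* terminated_copy(const ValueRef* res, std::string& own)
{
  assert(res == nullptr || res->data != nullptr);
  if (res && res->data != own.data())
  {
    own.assign(res->data, res->length);
  }
  return own.c_str();  // the terminator belongs to 'own', not to the source
}
```

The extra copy costs a few bytes per call, but it removes any dependence on whether the engine handed out a pointer into its own cache.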
