Server crashes in add_identifier on concurrent ALTER TABLE and SHOW ENGINE INNODB STATUS

Bug #1017192 reported by Elena Stepanova
44
This bug affects 6 people
Affects Status Importance Assigned to Milestone
MySQL Server
Unknown
Unknown
Percona Server moved to https://jira.percona.com/projects/PS
Fix Released
Critical
Sergei Glushchenko
5.5
Fix Released
Critical
Sergei Glushchenko

Bug Description

Percona XtraDB 1.1.8-27.0 started

#3 <signal handler called>
#4 0x000000000066373a in add_identifier (thd=0x2854b30, to_p=0x7f6f26df5c6a "\337&o\177", end_p=0x7f6f26df5d40 "#sql-4d62_2#P#p0", name=0x7f6f26df5d4e "p0", name_len=4294967293) at sql/sql_table.cc:104
#5 0x000000000066433d in explain_filename (thd=0x2854b30, from=0x7f6f26df5d40 "#sql-4d62_2#P#p0", to=0x7f6f26df5c40 "`#sql-4d62_2#P#p0` /* Temporary Partition \337&o\177", to_length=256, explain_mode=EXPLAIN_PARTITIONS_AS_COMMENT) at sql/sql_table.cc:356
#6 0x00000000008c9ac0 in innobase_convert_identifier (buf=0x7f6f26df5ef7 "`t1` /* P", buflen=569, id=0x7f6f1c002b45 "#sql-4d62_2#P#p0", idlen=16, thd=0x2854b30, file_id=1) at storage/innobase/handler/ha_innodb.cc:2243
#7 0x00000000008c9df4 in innobase_convert_name (buf=0x7f6f26df5ef0 "`test`.`t1` /* P", buflen=576, id=0x7f6f1c002b40 "test/#sql-4d62_2#P#p0", idlen=21, thd=0x2854b30, table_id=1) at storage/innobase/handler/ha_innodb.cc:2329
#8 0x000000000095aeba in ut_print_namel (f=0x26d0260, trx=0x28f6798, table_id=1, name=0x7f6f1c002b40 "test/#sql-4d62_2#P#p0", namelen=21) at storage/innobase/ut/ut0ut.c:544
#9 0x000000000095ae30 in ut_print_name (f=0x26d0260, trx=0x28f6798, table_id=1, name=0x7f6f1c002b40 "test/#sql-4d62_2#P#p0") at storage/innobase/ut/ut0ut.c:520
#10 0x0000000000a1ecc8 in lock_table_print (file=0x26d0260, lock=0x28ee1c8) at storage/innobase/lock/lock0lock.c:4319
#11 0x0000000000a1faad in lock_print_info_all_transactions (file=0x26d0260) at storage/innobase/lock/lock0lock.c:4680
#12 0x00000000009233c5 in srv_printf_innodb_monitor (file=0x26d0260, nowait=0, trx_start=0x7f6f26df68d0, trx_end=0x7f6f26df68d8) at storage/innobase/srv/srv0srv.c:2243
#13 0x00000000008d7c4f in innodb_show_status (hton=0x2288840, thd=0x28f9830, stat_print=0x756c8f <stat_print(THD*, char const*, uint, char const*, uint, char const*, uint)>) at storage/innobase/handler/ha_innodb.cc:10347
#14 0x00000000008d86e9 in innobase_show_status (hton=0x2288840, thd=0x28f9830, stat_print=0x756c8f <stat_print(THD*, char const*, uint, char const*, uint, char const*, uint)>, stat_type=HA_ENGINE_STATUS) at storage/innobase/handler/ha_innodb.cc:10587
#15 0x0000000000756fde in ha_show_status (thd=0x28f9830, db_type=0x2288840, stat=HA_ENGINE_STATUS) at sql/handler.cc:5055
#16 0x00000000005e6873 in mysql_execute_command (thd=0x28f9830) at sql/sql_parse.cc:2479
#17 0x00000000005ef600 in mysql_parse (thd=0x28f9830, rawbuf=0x2904990 "SHOW ENGINE INNODB STATUS", length=25, parser_state=0x7f6f26df7ff0) at sql/sql_parse.cc:5809
#18 0x00000000005e34f0 in dispatch_command (command=COM_QUERY, thd=0x28f9830, packet=0x28fc961 "", packet_length=25) at sql/sql_parse.cc:1060
#19 0x00000000005e27d1 in do_command (thd=0x28f9830) at sql/sql_parse.cc:788
#20 0x00000000006c9024 in do_handle_one_connection (thd_arg=0x28f9830) at sql/sql_connect.cc:1441
#21 0x00000000006c8a0a in handle_one_connection (arg=0x28f9830) at sql/sql_connect.cc:1348
#22 0x0000000000aa5fa7 in pfs_spawn_thread (arg=0x2857e80) at storage/perfschema/pfs.cc:1015
#23 0x00007f6f2eea0efc in start_thread (arg=0x7f6f26df9700) at pthread_create.c:304
#24 0x00007f6f2e21359d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112

bzr version-info
revision-id: <email address hidden>
date: 2012-06-21 14:36:22 +0200
build-date: 2012-06-24 20:18:16 +0400
revno: 270

# MTR test case:

--source include/have_partition.inc
--source include/have_innodb.inc

CREATE TABLE IF NOT EXISTS t1 (`a` INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1),(2),(3),(4);

--connect (con1,localhost,root,,)
--let $run = 100

while ($run)
{
        --connection default
        --send ALTER TABLE t1 PARTITION BY HASH (`a`) PARTITIONS 4
        --connection con1
        --let $show = 300
        while ($show)
        {
                SHOW ENGINE INNODB STATUS;
                --dec $show
        }
        --connection default
        --reap
        --dec $run
}

DROP TABLE t1;

# End of MTR test case

# If it does not work, try the RQG grammar
# cat 1.yy

query_init:
        CREATE TABLE IF NOT EXISTS t1 (`a` INT) ENGINE=InnoDB
        ; INSERT INTO t1 VALUES (1),(2),(3),(4);

thread1:
        SHOW ENGINE INNODB STATUS;

query:
        ALTER TABLE t1 PARTITION BY HASH (`a`) PARTITIONS 4 ;

# End of RQG grammar
# run as

perl ./runall.pl \
--threads=2 \
--queries=100M \
--duration=300 \
--skip-gendata \
--grammar=1.yy \
--basedir=<your basedir>

Tags: upstream

Related branches

Stewart Smith (stewart)
Changed in percona-server:
importance: Undecided → High
Revision history for this message
Roel Van de Paar (roel11) wrote :

See bug#1036506 which is a possible duplicate

Revision history for this message
Roel Van de Paar (roel11) wrote :

See bug 1036506 which is a possible duplicate

Stewart Smith (stewart)
Changed in percona-server:
status: New → In Progress
importance: High → Critical
assignee: nobody → Stewart Smith (stewart)
milestone: none → 5.5.27-28.1
Revision history for this message
nathan hruby (nhruby-2) wrote :

I don't get this on 5.5.24, but those packages seem to have evaporated form the Percona RPM repos. Can they be put back at least until the problem can be fixed?

Revision history for this message
nathan hruby (nhruby-2) wrote :

Hi,

So in 5.5.27 we just experienced a deadlock with ALTER TABLE on regular, non-partitioned tables. This caused SHOW INNODB STATUS and SHOW GLOBAL STATUS as well as any access to the database that contained the ALTER'ed table to simple freeze forever. Would this bug be a cause of that, or shoudl I submit a new bug?

We do not experience this on 5.5.24 using identical hardware and similar OS builds.

-n

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Nathan -

Please have a look at bug 1040735, if it matches your symptoms.

Revision history for this message
yinfeng (yinfeng-zwx) wrote :

I can't reproduce this bug on Percona Server 5.5.18 we are using now .

Can anyone confirm if this bug affects 5.5.18?

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

According to https://mariadb.atlassian.net/browse/MDEV-364, it is not reproducible on 5.5.24. Presumably 5.5.18 is also fine.

no longer affects: maria
Revision history for this message
Roel Van de Paar (roel11) wrote :

Bug #1036506 still crashes. Un-duplicated that one.

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

I tried both mtr test case and RQG grammar on upstream 5.5.28 many times and I am not able to repeat the crash there (even though source code in sql/sql_table.cc looks the same as original, non-patched one. Going to try on other Linux version and hardware later.

Revision history for this message
Valerii Kravchuk (valerii-kravchuk) wrote :

Older Percona server versions (like 5.5.27) are easy to crash this way with RQG:

...
121207 20:58:46 [Note] /home/openxs/dbs/p5.5/bin/mysqld: ready for connections.
Version: '5.5.27-log' socket: '/home/openxs/dbs/p5.5/mysql-test/var/tmp/master.sock' port: 19300 Source distribution
17:59:23 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Please help us make Percona Server better by reporting any
bugs at http://bugs.percona.com/

key_buffer_size=1048576
read_buffer_size=131072
max_used_connections=3
max_threads=151
thread_count=3
connection_count=3
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 60853 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x33af570
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fdf781a3e58 thread_stack 0x40000
/home/openxs/dbs/p5.5/bin/mysqld(my_print_stacktrace+0x2e)[0x7643be]
/home/openxs/dbs/p5.5/bin/mysqld(handle_fatal_signal+0x41a)[0x6592fa]
/lib64/libpthread.so.0(+0xf500)[0x7fdf7aba2500]
/home/openxs/dbs/p5.5/bin/mysqld[0x5bfc33]
/home/openxs/dbs/p5.5/bin/mysqld(_Z16explain_filenameP3THDPKcPcj26enum_explain_filename_mode+0x363)[0x5c0143]
/home/openxs/dbs/p5.5/bin/mysqld[0x7ab507]
/home/openxs/dbs/p5.5/bin/mysqld[0x7abca9]
/home/openxs/dbs/p5.5/bin/mysqld[0x806461]
/home/openxs/dbs/p5.5/bin/mysqld[0x88102e]
/home/openxs/dbs/p5.5/bin/mysqld[0x881c8b]
/home/openxs/dbs/p5.5/bin/mysqld[0x7e401e]
/home/openxs/dbs/p5.5/bin/mysqld[0x7b2b65]
/home/openxs/dbs/p5.5/bin/mysqld(_Z14ha_show_statusP3THDP10handlerton12ha_stat_type+0x333)[0x65f823]
/home/openxs/dbs/p5.5/bin/mysqld(_Z21mysql_execute_commandP3THD+0x22f7)[0x56e207]
/home/openxs/dbs/p5.5/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x33b)[0x5705ab]
/home/openxs/dbs/p5.5/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1419)[0x572199]
/home/openxs/dbs/p5.5/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x1a6)[0x601fa6]
/home/openxs/dbs/p5.5/bin/mysqld(handle_one_connection+0x4a)[0x60200a]
/lib64/libpthread.so.0(+0x7851)[0x7fdf7ab9a851]
/lib64/libc.so.6(clone+0x6d)[0x7fdf79b0511d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fdf540048e0): SHOW ENGINE INNODB STATUS
Connection ID (thread ID): 5
...

Oracle's 5.6.8-rc passed without crashes in the same environment.

Revision history for this message
Sergei Glushchenko (sergei.glushchenko) wrote :

Both bugs (#1036506, #1017192) are abount crashing explain_filename when input filename looks like "#sql-1979_11#P#p0". This is introduced in upstream revision 0.16394.68. Percona Server 5.1 is not affected. MariaDB released fix for it in revision 3413.1.78

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Then we should have an upstream bug report. So we need to verify it against upstream 5.5. Currently we only have a report that it's not reproducible with upstream 5.6.8-rc.

Revision history for this message
Roel Van de Paar (roel11) wrote :

Ref note #2 on bug 1036506 re: testing against upstream for that bug.

Revision history for this message
Sergei Glushchenko (sergei.glushchenko) wrote :
Download full text (28.9 KiB)

I'm able to reproduce on mysql 5.5 trunk by just injecting call

 ut_print_name(stderr, trx, TRUE, table->name);

at the start of row_create_table_for_mysql.

[sergei@centos6 mysql-test]$ cat ../../mysql-test/t/crash.test
--source include/have_partition.inc
--source include/have_innodb.inc

CREATE TABLE IF NOT EXISTS t1 (`a` INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1),(2),(3),(4);

--connect (con1,localhost,root,,)

ALTER TABLE t1 PARTITION BY HASH (`a`) PARTITIONS 4;

DROP TABLE t1;

[sergei@centos6 mysql-test]$ ./mtr crash.test
Logging: /home/sergei/percona/upstream/55-et-crash/mysql-test/mysql-test-run.pl crash.test
MySQL Version 5.5.28
Checking supported features...
 - skipping ndbcluster
 - skipping SSL, mysqld not compiled with SSL
Collecting tests...
vardir: /home/sergei/percona/upstream/55-et-crash/build/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/home/sergei/percona/upstream/55-et-crash/build/mysql-test/var'...
Installing system database...
Using server port 51161

==============================================================================

TEST RESULT TIME (ms) or COMMENT
--------------------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
CREATE TABLE IF NOT EXISTS t1 (`a` INT) ENGINE=InnoDB;
INSERT INTO t1 VALUES (1),(2),(3),(4);
main.crash [ fail ]
        Test ended at 2012-12-21 23:57:44

CURRENT_TEST: main.crash
mysqltest: At line 9: query 'ALTER TABLE t1 PARTITION BY HASH (`a`) PARTITIONS 4' failed: 2013: Lost connection to MySQL server during query

Server [mysqld.1 - pid: 61076, winpid: 61076, exit: 256] failed during test run
Server log from this test:
----------SERVER LOG START-----------
121221 19:57:37 InnoDB: The InnoDB memory heap is disabled
121221 19:57:37 InnoDB: Mutexes and rw_locks use GCC atomic builtins
121221 19:57:37 InnoDB: Compressed tables use zlib 1.2.3
121221 19:57:37 InnoDB: Using Linux native AIO
121221 19:57:37 InnoDB: Initializing buffer pool, size = 8.0M
121221 19:57:37 InnoDB: Completed initialization of buffer pool
121221 19:57:37 InnoDB: highest supported file format is Barracuda.
121221 19:57:37 InnoDB: Waiting for the background threads to start
121221 19:57:38 InnoDB: 1.1.8 started; log sequence number 1595675
121221 19:57:38 [Warning] /home/sergei/percona/upstream/55-et-crash/build/sql/mysqld: unknown variable 'loose-debug-sync-timeout=300'
121221 19:57:38 [Note] Server hostname (bind-address): '0.0.0.0'; port: 13000
121221 19:57:38 [Note] - '0.0.0.0' resolves to '0.0.0.0';
121221 19:57:38 [Note] Server socket created on IP: '0.0.0.0'.
121221 19:57:38 [Note] Event Scheduler: Loaded 0 events
121221 19:57:38 [Note] /home/sergei/percona/upstream/55-et-crash/build/sql/mysqld: ready for connections.
Version: '5.5.28-log' socket: '/home/sergei/percona/upstream/55-et-crash/build/mysql-test/var/tmp/mysqld.1.sock' port: 13000 Source distribution
`test`.`t1`16:57:39 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the librari...

Revision history for this message
Alexey Kopytov (akopytov) wrote :

Sergei, can you file an upstream bug then, and link it from this one?

Revision history for this message
Roel Van de Paar (roel11) wrote :

Bug status for added clarity:

Upstream bug is not repeatable (see comment '[27 Dec 2012 23:20] Sveta Smirnova' @ upstream bug), but remains verified as logic is valid.

So, ftm, this bug is repeatable in PS only.

For this, a [ported] PS fix @ https://code.launchpad.net/~sergei.glushchenko/percona-server/ps55-bug1017192

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

Since the upstream bug is closed, let's track the upstream fix here (revnos for Oracle 5.6 tree):

    ------------------------------------------------------------
    revno: 5010.1.6
    committer: Tor Didriksen <email address hidden>
    branch nick: 5.6-review
    timestamp: Thu 2013-04-11 13:53:51 +0200
    message:
      Bug#16051728 SERVER CRASHES IN ADD_IDENTIFIER ON CONCURRENT ALTER TABLE

      Post push fix: do not include mytap header,
      the declarations there may break other code.

      Also: CHECK_CXX_COMPILER_FLAG rather than CHECK_C_COMPILER_FLAG for -Wno-null-dereference
------------------------------------------------------------
revno: 4936
committer: Tor Didriksen <email address hidden>
branch nick: 5.6
timestamp: Thu 2013-03-21 18:26:13 +0100
message:
  Bug#16051728 SERVER CRASHES IN ADD_IDENTIFIER ON CONCURRENT ALTER TABLE

  Post push fix: the unit test broke merge_large_tests-t
  since it redefined some global settings.

  Also: no underscores in test names or test case names.
    ------------------------------------------------------------
    revno: 2875.437.26
    committer: Annamalai Gurusami <email address hidden>
    branch nick: mysql-5.5
    timestamp: Thu 2013-03-21 11:40:43 +0530
    message:
      Bug #16051728 SERVER CRASHES IN ADD_IDENTIFIER ON CONCURRENT ALTER TABLE AND
      SHOW ENGINE INNOD

      Problem:

      The purpose of explain_filename() is to provide useful additional
      information regarding the partitions given the filename. This function
      was returning an error when it was not able to parse the given filename.
      For example, within InnoDB, temporary files are created with #sql-
      prefix. But this function was not able to parse it correctly.

      Solution:

      It is not an error, if explain_filename() could not parse the given
      filename. If there is no partition information to explain, then silently
      return from the function.

      rb#1940 approved by mattiasj

tags: added: upstream
Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-344

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.