Bug #1154558 “InnoDB: Assertion failure in thread N in file dict...” : Bugs : Percona Server moved to https://jira.percona.com/projects/PS

renton (renton) wrote on 2013-03-13:

#1

crash_log.txt (264.6 KiB, text/plain)

Valerii Kravchuk (valerii-kravchuk) wrote on 2013-03-13:

#2

This is suspicious in your log:

---TRANSACTION 108A468, ACTIVE 120653 sec
mysql tables in use 7, locked 0
MySQL thread id 189440, OS thread handle 0x795f0f75d700, query id 78664538 81.177.174.10 gb_tmt Sending data
SELECT product_id from products_attr_products WHERE (attr_value LIKE "%" AND attr_id=25) AND product_id IN (SELECT product_id from products_attr_products WHERE (attr_value LIKE "%" AND attr_id=24) AND product_id IN (SELECT product_id from products_attr_products WHERE (attr_value LIKE "%" AND attr_id=23) AND product_id IN (SELECT product_id from products_attr_products WHERE (attr_value LIKE "%" AND attr_id=22) AND product_id IN (SELECT product_id from products_attr_products WHERE (attr_value LIKE "%" AND attr_id=21) AND product_id IN (SELECT product_id from products_attr_products WHERE (attr_va
Trx read view will not see trx with id >= 108A469, sees < 108A469

Is it normal to have transactions active for such a long time? Did you have similar ones in 5.1.x?

The assertion failure happened in the dict_index_remove_from_cache() function, which has this 600-second limit hardcoded:

        for (;;) {
                ulint ref_count = btr_search_info_get_ref_count(info, index->id);
                if (ref_count == 0) {
                        break;
                }

                /* Sleep for 10ms before trying again. */
                os_thread_sleep(10000);
                ++retries;

                if (retries % 500 == 0) {
                        /* No luck after 5 seconds of wait. */
                        fprintf(stderr, "InnoDB: Error: Waited for"
                                        " %lu secs for hash index"
                                        " ref_count (%lu) to drop"
                                        " to 0.\n"
                                        "index: \"%s\""
                                        " table: \"%s\"\n",
                                        retries/100,
                                        ref_count,
                                        index->name,
                                        table->name);
                }

                /* To avoid a hang here we commit suicide if the
                ref_count doesn't drop to zero in 600 seconds. */
                if (retries >= 60000) {
                        ut_error;
                }
        }

I'm not sure what to do about this, other than making the adaptive hash index smaller or switching it off entirely.
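
To make the timing in the loop above easier to check, here is a minimal standalone sketch of the same retry arithmetic. It is not InnoDB code: fake_ref_count(), simulate_wait(), retries_to_seconds() and check_timing() are hypothetical names made up for illustration, and the real 10 ms sleep is only counted, not performed. Under those assumptions it shows that a warning is due every 500 retries (5 seconds) and that the give-up point of 60000 retries corresponds to 600 seconds of sleeping.

```c
#include <assert.h>

#define SLEEP_USEC  10000UL  /* 10 ms per retry, as in the excerpt */
#define WARN_EVERY  500UL    /* retries between warning messages */
#define MAX_RETRIES 60000UL  /* retries before the excerpt calls ut_error */

typedef struct {
    unsigned long retries;   /* number of (simulated) sleeps */
    unsigned long warnings;  /* number of warnings that would be printed */
    int gave_up;             /* 1 where the real loop would hit ut_error */
} wait_result;

/* Hypothetical stand-in for btr_search_info_get_ref_count(): the count
   stays at 3 until `release_after` polls have happened, then drops to 0. */
unsigned long fake_ref_count(unsigned long polls_done,
                             unsigned long release_after)
{
    return polls_done >= release_after ? 0UL : 3UL;
}

/* Same control flow as the quoted loop, with the sleep only counted. */
wait_result simulate_wait(unsigned long release_after)
{
    wait_result r = {0UL, 0UL, 0};

    for (;;) {
        if (fake_ref_count(r.retries, release_after) == 0) {
            break;
        }
        ++r.retries;                      /* one 10 ms sleep */
        if (r.retries % WARN_EVERY == 0) {
            ++r.warnings;                 /* "Waited for N secs ..." */
        }
        if (r.retries >= MAX_RETRIES) {
            r.gave_up = 1;                /* ut_error in the real code */
            break;
        }
    }
    return r;
}

/* Sleep time accumulated after `retries` sleeps, in whole seconds. */
unsigned long retries_to_seconds(unsigned long retries)
{
    return retries * SLEEP_USEC / 1000000UL;
}

/* Usage example: the numbers that follow from the constants above. */
void check_timing(void)
{
    wait_result quick = simulate_wait(500UL);      /* released at 5 s */
    wait_result stuck = simulate_wait(1000000UL);  /* never released in time */

    assert(!quick.gave_up && quick.warnings == 1);
    assert(retries_to_seconds(quick.retries) == 5);
    assert(stuck.gave_up && stuck.warnings == 120);
    assert(retries_to_seconds(stuck.retries) == 600);
}
```

This counts only the time spent sleeping, so the wall-clock time before the assertion fires would be at least 600 seconds, not exactly 600.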

Valerii Kravchuk (valerii-kravchuk) wrote on 2013-03-13:

#3

Please also send the output of:

show global variables like 'innodb%';

I wonder if innodb_lazy_drop_table could be involved somehow.

Changed in percona-server:
assignee:	nobody → Valerii Kravchuk (valerii-kravchuk)
status:	New → Incomplete

renton (renton) wrote on 2013-03-13:

#4

While examining the bug tracker today I found similar traces and switched off innodb_lazy_drop_table everywhere. I'll wait and see whether it crashes again.

Raghavendra D Prabhu (raghavendra-prabhu) wrote on 2013-03-14:

#5

@renton,

Thanks for the bug report and the inference. Let us know whether it crashes with lazy drop table disabled. A similar crash has been reported here: https://bugs.launchpad.net/percona-server/+bug/1128848

From the crash it looks like you were dropping gb_x_zernplus.c8qey_extensions?

renton (renton) wrote on 2013-03-15:

#6

It hasn't hung for about two days with lazy drop table switched off. Still, I think too little time has passed to conclude that switching off this option has really helped: with lazy_drop on, it could either run without crashing for about three days or hang three times a day.
I took a closer look at the log of our task queue, and it really did contain a database drop command before the failure.

> 2013-03-11 23:47:14 zernplus drop mySQL gb_x_zernplus (server 77)

130311 23:25:34 [ERROR] /opt/mysql+/5.5.29-rel30.0-451/bin/mysqld: Table './gb_gazrs/cms_online' is marked as crashed and last (automatic?) repair failed
InnoDB: Error: Waited for 5 secs for hash index ref_count (3) to drop to 0.
index: "PRIMARY" table: "gb_x_zernplus/c8qey_extensions"

So I'll keep waiting.
Thank you for your answers.

Raghavendra D Prabhu (raghavendra-prabhu) on 2013-03-18

Changed in percona-server:
status:	Incomplete → New

renton (renton) wrote on 2013-03-31:

#7

MySQL has not crashed again with lazy_drop_table switched off.

Laurynas Biveinis (laurynas-biveinis) wrote on 2013-04-03:

#8

The symptoms and the workaround are identical to those of bug 1086227. I am therefore closing this one as its duplicate; please follow bug 1086227 for further tracking of this bug.

InnoDB: Assertion failure in thread N in file dict0dict.c line 1900