innodb_adaptive_hash_index_partitions may cause hangup

Bug #791030 reported by Yasufumi Kinoshita
This bug affects 4 people
Affects: Percona Server (moved to https://jira.percona.com/projects/PS) | Status: Fix Released | Importance: High | Assigned to: Unassigned
Affects: 5.5 | Status: Fix Released | Importance: High | Assigned to: Unassigned

Bug Description

Setting innodb_adaptive_hash_index_partitions > 1 may cause a hang.
I sometimes hit this hang during high-spec benchmarks.

The following part of innodb_adaptive_hash_index_partitions.patch:

+	} else if (btr_search_index_num > 1) {
+		rw_lock_t*	btr_search_latch;
+
+		/* FIXME: This may be optimistic implementation still. */
+		btr_search_latch = (rw_lock_t*)(block->btr_search_latch);
+		if (UNIV_LIKELY(!btr_search_latch)) {
+			if (block->is_hashed) {
+				goto retry;
+			}
+			return;
+		}
+		rw_lock_s_lock(btr_search_latch);

is still somewhat insufficient.

Related branches

Changed in percona-server:
assignee: nobody → Yasufumi Kinoshita (yasufumi-kinoshita)
status: New → Confirmed
importance: Undecided → Medium
Stewart Smith (stewart)
Changed in percona-server:
importance: Medium → High
status: Confirmed → Triaged
Revision history for this message
Yasufumi Kinoshita (yasufumi-kinoshita) wrote :

Next problematic place is:

--------------------------------------------------
			hash index semaphore! */

 #ifndef UNIV_SEARCH_DEBUG
-			if (!trx->has_search_latch) {
-				rw_lock_s_lock(&btr_search_latch);
-				trx->has_search_latch = TRUE;
+			if (!(trx->has_search_latch
+			      & ((ulint)1 << (index->id % btr_search_index_num)))) {
+				rw_lock_s_lock(btr_search_get_latch(index->id));
+				trx->has_search_latch |=
+					(ulint)1 << (index->id % btr_search_index_num);
			}
 #endif
			switch (row_sel_try_search_shortcut_for_mysql(
--------------------------------------------------

We should decide on an explicit latch order between the btr_search_latch_part[n] latches and rewrite the patch for this function.
This spot is more likely to cause a hang than the one reported before.

Revision history for this message
Yasufumi Kinoshita (yasufumi-kinoshita) wrote :

I found the true reason for the hangup. I will fix it today.

It seems necessary to release the s-latches on all of the hash index partitions if another thread is waiting for any one of them.

The current implementation (when someone waits for an s-latch that the trx holds, release only that one) might cause a wait chain across btr_search_latch_part[n]; however, a plain "release all" also seems insufficient.

Changed in percona-server:
status: Triaged → Fix Committed
Stewart Smith (stewart)
Changed in percona-server:
milestone: none → 5.5.13-20.4
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

This issue is not fixed yet. Setting multiple AHI partitions (mostly in combination with multiple buffer pools) triggers a latch deadlock.

--Thread 139558755972864 has waited at btr0sea.c line 1197 for 930.00 seconds the semaphore:
X-lock (wait_ex) on RW-latch at 0x9390f558 'btr_search_latch_part[i]'
a writer (thread id 139558755972864) has reserved it in mode wait exclusive
number of readers 1, waiters flag 1, lock_word: ffffffffffffffff
Last time read locked in file btr0sea.c line 1099
Last time write locked in file /home/jenkins/workspace/percona-server-5.5-rpms/label_exp/centos6-64/target/BUILD/Percona-Server-5.5.27-rel28.1/Percona-Server-5.5.27-rel28.1/storage/innobase/btr/btr0sea.c line 669
InnoDB: Warning: a long semaphore wait:

As you can see, 139558755972864 is waiting on itself.

The call chain is as follows: (will provide full trace later)

    pthread_cond_wait
    os_cond_wait
    os_event_wait_low
    sync_array_wait_event
    rw_lock_x_lock_wait
    rw_lock_x_lock_low
    rw_lock_x_lock_func
    pfs_rw_lock_x_lock_func
    btr_search_drop_page_hash_index
    buf_LRU_free_block
    buf_LRU_free_from_common_LRU_list
    buf_LRU_search_and_free_block
    buf_LRU_get_free_block
    buf_block_alloc
    btr_search_check_free_space_in_heap
    btr_search_info_update_slow
    btr_search_info_update
    btr_cur_search_to_nth_level
    btr_pcur_open_with_no_init_func
    row_sel_try_search_shortcut_for_mysql
    row_search_for_mysql
    ha_innobase::index_read
    join_read_key
    sub_select
    evaluate_join_record
    sub_select
    evaluate_join_record
    sub_select
    evaluate_join_record
    sub_select
    evaluate_join_record
    sub_select
    evaluate_join_record
    sub_select
    evaluate_join_record
    sub_select
    do_select
    JOIN::exec
    mysql_select
    handle_select
    execute_sqlcom_select
    mysql_execute_command
    mysql_parse
    dispatch_command
    do_handle_one_connection
    handle_one_connection
    start_thread
    clone

The suspect in question is btr_search_drop_page_hash_index, since it deals with multiple partitions differently, and the FIXME mentioned in the description is still present in the code.

=================================
 if (btr_search_index_num > 1) {
  rw_lock_t* btr_search_latch;

  /* FIXME: This may be optimistic implementation still. */
  btr_search_latch = (rw_lock_t*)(block->btr_search_latch);
  if (UNIV_LIKELY(!btr_search_latch)) {
   if (block->index) {
    goto retry;
   }
   return;
  }
  ......
  ..
=================================================

It has also been reported in lp:331659

tags: added: i26423
Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

While checking an AHI-related issue, I noticed that
mem_heap_create_block uses buf_block_alloc, which round-robins
through the available buffer pools. So, with multiple
partitions and multiple buffer pools, could the latching
intended for one pool end up applied to another, resulting in
a deadlock?

Revision history for this message
vineet khanna (khannavin) wrote :

Hi,

I am having

mysql> show variables like 'innodb_adaptive_hash_index_partitions';
+---------------------------------------+-------+
| Variable_name | Value |
+---------------------------------------+-------+
| innodb_adaptive_hash_index_partitions | 1 |
+---------------------------------------+-------+

Still I am getting a SEMAPHORES issue:

----------
SEMAPHORES
----------
OS WAIT ARRAY INFO: reservation count 4641988, signal count 327450531
--Thread 1219737920 has waited at row0sel.c line 3698 for 0.0000 seconds the semaphore:
S-lock on RW-latch at 0x29ea4b18 'btr_search_latch_part[i]'
number of readers 0, waiters flag 0, lock_word: 100000
Last time read locked in file btr0sea.c line 918
Last time write locked in file /home/jenkins/workspace/percona-server-5.5-rpms/label_exp/centos5-64/target/BUILD/Percona-Server-5.5.22-rel25.2/Percona-Server-5.5.22-rel25.2/storage/innobase/btr/btr0sea.c line 669
--Thread 1218406720 has waited at btr0sea.c line 1508 for 0.0000 seconds the semaphore:
S-lock on RW-latch at 0x29ea4b18 'btr_search_latch_part[i]'
number of readers 0, waiters flag 0, lock_word: 100000
Last time read locked in file btr0sea.c line 918
Last time write locked in file /home/jenkins/workspace/percona-server-5.5-rpms/label_exp/centos5-64/target/BUILD/Percona-Server-5.5.22-rel25.2/Percona-Server-5.5.22-rel25.2/storage/innobase/btr/btr0sea.c line 669
--Thread 1219205440 has waited at btr0sea.c line 1508 for 0.0000 seconds the semaphore:
S-lock on RW-latch at 0x29ea4b18 'btr_search_latch_part[i]'
number of readers 0, waiters flag 0, lock_word: 100000
Last time read locked in file btr0sea.c line 918

Please advise whether this is the same issue.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote : Re: [Launchpad] [Bug 791030] Re: innodb_adaptive_hash_index_partitions may cause hangup

* On Tue, Dec 11, 2012 at 07:18:42AM -0000, vineet khanna <email address hidden> wrote:
>Hi,
>
>I am having
>
>mysql> show variables like 'innodb_adaptive_hash_index_partitions';
>+---------------------------------------+-------+
>| Variable_name | Value |
>+---------------------------------------+-------+
>| innodb_adaptive_hash_index_partitions | 1 |
>+---------------------------------------+-------+
>
>Still i am getting SEMAPHORES issue:
>
>----------
>SEMAPHORES
>----------
>OS WAIT ARRAY INFO: reservation count 4641988, signal count 327450531
>--Thread 1219737920 has waited at row0sel.c line 3698 for 0.0000 seconds the semaphore:
>S-lock on RW-latch at 0x29ea4b18 'btr_search_latch_part[i]'
>number of readers 0, waiters flag 0, lock_word: 100000
>Last time read locked in file btr0sea.c line 918
>Last time write locked in file /home/jenkins/workspace/percona-server-5.5-rpms/label_exp/centos5-64/target/BUILD/Percona-Server-5.5.22-rel25.2/Percona-Server-5.5.22-rel25.2/storage/innobase/btr/btr0sea.c line 669
>--Thread 1218406720 has waited at btr0sea.c line 1508 for 0.0000 seconds the semaphore:
>S-lock on RW-latch at 0x29ea4b18 'btr_search_latch_part[i]'
>number of readers 0, waiters flag 0, lock_word: 100000
>Last time read locked in file btr0sea.c line 918
>Last time write locked in file /home/jenkins/workspace/percona-server-5.5-rpms/label_exp/centos5-64/target/BUILD/Percona-Server-5.5.22-rel25.2/Percona-Server-5.5.22-rel25.2/storage/innobase/btr/btr0sea.c line 669
>--Thread 1219205440 has waited at btr0sea.c line 1508 for 0.0000 seconds the semaphore:
>S-lock on RW-latch at 0x29ea4b18 'btr_search_latch_part[i]'
>number of readers 0, waiters flag 0, lock_word: 100000
>Last time read locked in file btr0sea.c line 918
>
>
>Please update if its a same issue.
>
It is not. In your case, setting
innodb_adaptive_hash_index_partitions to a non-default value (say
8) should ameliorate your issue.

The lp issue is about circular lock deadlock like for
139558755972864 in comment#3.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

To add to the previous comment: the main issue is mostly noticeable when there are multiple buffer pools and multiple AHI partitions, so setting AHI partitions to a value like 8 should be fine.

Revision history for this message
vineet khanna (khannavin) wrote :

Hi Raghavendra,

As per your comment:

"innodb_adaptive_hash_index_partitions to non-default value (say 8) should ameliorate your issue."

Will this change expose me to Bug 791030, or will it resolve my problem?

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

It should resolve your problem. As mentioned in #8, it mostly manifests in the presence of multiple buffer pools and multiple AHI partitions.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

I am still able to reproduce it (but not crash it, since that requires a much larger buffer pool) with the config in http://sprunge.us/aESA and a backtrace (though not captured just in time) in http://sprunge.us/INjC

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Following sysbench was used in #11:

sysbench --test=./oltp.lua --db-driver=mysql --mysql-engine-trx=yes --mysql-table-engine=innodb --mysql-user=root --mysql-password=test --oltp-table-size=30000 --num-threads=32 --max-requests=1000000 --oltp-tables-count=18 run

Revision history for this message
Alexey Kopytov (akopytov) wrote :

@Raghavendra,

Can you report it as another bug?

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Sure, will do.

Revision history for this message
Raghavendra D Prabhu (raghavendra-prabhu) wrote :

Reported as lp:1100760

tags: added: xtradb
Revision history for this message
Alexey Kopytov (akopytov) wrote :

Closing this, as the linked branch has been merged, and remaining issues reported elsewhere.

tags: added: ahi-partitions
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

It is not clear why the committed fix for this bug gathers a bitmask of all the X waiters.

The original 5.5 code is

	if (UNIV_UNLIKELY(rw_lock_get_writer(&btr_search_latch)
			  != RW_LOCK_NOT_LOCKED)
	    && trx->has_search_latch) {

		/* There is an x-latch request on the adaptive hash index:
		release the s-latch to reduce starvation and wait for
		BTR_SEA_TIMEOUT rounds before trying to keep it again over
		calls from MySQL */

		rw_lock_s_unlock(&btr_search_latch);
		trx->has_search_latch = FALSE;

The current XtraDB 5.5 code is

	should_release = 0;
	for (i = 0; i < btr_search_index_num; i++) {
		/* we should check all latches (fix Bug#791030) */
		if (UNIV_UNLIKELY(rw_lock_get_writer(btr_search_latch_part[i])
				  != RW_LOCK_NOT_LOCKED)) {
			should_release |= ((ulint)1 << i);
		}
	}

	if (UNIV_UNLIKELY(should_release)) {

		for (i = 0; i < btr_search_index_num; i++) {
			/* we should release all s-latches (fix Bug#791030) */
			if (trx->has_search_latch & ((ulint)1 << i)) {
				rw_lock_s_unlock(btr_search_latch_part[i]);
				trx->has_search_latch &= (~((ulint)1 << i));
			}
		}

Thus, it checks all the latches for X waiters and collects a bitmask of them. That bitmask is only used if any bit is set, and if so, the S latches that the current transaction believes it holds are released. The individual bit information in should_release is not used at all, and the should_release and trx->has_search_latch bitmasks are not necessarily identical.

It can be rewritten as

	should_release = false;
	for (i = 0; i < btr_search_index_num; i++) {
		/* we should check all latches (fix Bug#791030) */
		if (UNIV_UNLIKELY(rw_lock_get_writer(btr_search_latch_part[i])
				  != RW_LOCK_NOT_LOCKED)) {
			should_release = true;
			break;
		}
	}

	if (UNIV_UNLIKELY(should_release)) {
		trx_search_latch_release_if_reserved();
	}

Revision history for this message
Alexey Kopytov (akopytov) wrote :

And even that is suboptimal. The proposed code just exits the loop early as soon as it finds a latch with X waiters. Two problems:

1. the current thread will release all latches, even if there are only X waiters on latches it doesn't own.
2. the current thread will release all latches, though it may be sufficient to release only some.

I will comment separately on the 5.6 AHI port.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

The questionable code has been removed with the fix for bug 1218347.

Revision history for this message
Shahriyar Rzayev (rzayev-sehriyar) wrote :

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PS-480
