Merge lp:~vlad-lesin/percona-server/5.6-logical-readahead into lp:percona-server/5.6

Proposed by Vlad Lesin
Status: Rejected
Rejected by: Laurynas Biveinis
Proposed branch: lp:~vlad-lesin/percona-server/5.6-logical-readahead
Merge into: lp:percona-server/5.6
Diff against target: 2278 lines (+1520/-71)
40 files modified
client/client_priv.h (+4/-1)
client/mysqldump.c (+38/-0)
mysql-test/include/have_native_aio.inc (+6/-0)
mysql-test/suite/innodb/r/innodb_logical_read_ahead.result (+47/-0)
mysql-test/suite/innodb/r/innodb_logical_read_ahead_correctness.result (+88/-0)
mysql-test/suite/innodb/r/innodb_merge_read.result (+30/-0)
mysql-test/suite/innodb/t/innodb_logical_read_ahead-master.opt (+2/-0)
mysql-test/suite/innodb/t/innodb_logical_read_ahead.test (+55/-0)
mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness-master.opt (+2/-0)
mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness.test (+107/-0)
mysql-test/suite/innodb/t/innodb_merge_read-master.opt (+1/-0)
mysql-test/suite/innodb/t/innodb_merge_read.test (+42/-0)
mysql-test/suite/sys_vars/r/innodb_lra_n_node_recs_before_sleep_basic.result (+28/-0)
mysql-test/suite/sys_vars/r/innodb_lra_size_basic.result (+28/-0)
mysql-test/suite/sys_vars/r/innodb_lra_sleep_basic.result (+30/-0)
mysql-test/suite/sys_vars/r/innodb_lra_test_basic.result (+8/-0)
mysql-test/suite/sys_vars/t/innodb_lra_n_node_recs_before_sleep_basic.test (+14/-0)
mysql-test/suite/sys_vars/t/innodb_lra_size_basic-master.opt (+1/-0)
mysql-test/suite/sys_vars/t/innodb_lra_size_basic.test (+14/-0)
mysql-test/suite/sys_vars/t/innodb_lra_sleep_basic.test (+14/-0)
mysql-test/suite/sys_vars/t/innodb_lra_test_basic-master.opt (+1/-0)
mysql-test/suite/sys_vars/t/innodb_lra_test_basic.test (+8/-0)
storage/innobase/btr/btr0cur.cc (+1/-0)
storage/innobase/btr/btr0pcur.cc (+26/-18)
storage/innobase/buf/buf0rea.cc (+33/-10)
storage/innobase/fil/fil0fil.cc (+8/-3)
storage/innobase/handler/ha_innodb.cc (+61/-0)
storage/innobase/include/btr0pcur.h (+3/-2)
storage/innobase/include/btr0pcur.ic (+51/-17)
storage/innobase/include/buf0rea.h (+37/-0)
storage/innobase/include/fil0fil.h (+5/-2)
storage/innobase/include/os0file.h (+24/-6)
storage/innobase/include/os0file.ic (+6/-2)
storage/innobase/include/srv0srv.h (+42/-0)
storage/innobase/include/trx0trx.h (+96/-1)
storage/innobase/os/os0file.cc (+99/-9)
storage/innobase/row/row0purge.cc (+9/-0)
storage/innobase/row/row0sel.cc (+319/-0)
storage/innobase/srv/srv0srv.cc (+9/-0)
storage/innobase/trx/trx0trx.cc (+123/-0)
To merge this branch: bzr merge lp:~vlad-lesin/percona-server/5.6-logical-readahead
Reviewer: Laurynas Biveinis (community)
Review status: Needs Resubmitting
Review via email: mp+216857@code.launchpad.net

Description of the change

Porting the logical read-ahead feature from the Facebook branch. The original patches are here:

https://github.com/facebook/mysql-5.6/commit/f9d1a5332eb2c82c028638d3b93b5a3592a69ffa
https://github.com/facebook/mysql-5.6/commit/f8e361952612d00979f7cf744f487e48b15cb5a6
https://github.com/facebook/mysql-5.6/commit/f69a4ea522bce24e4cdcc7696d5fad29587cf87a

The main difference from the original implementation is that buffering and batched submission of multiple I/O requests is enabled only for logical read-ahead in this branch, whereas the original enables it by default for all operations. See the explanation in the commit comments.

Jenkins testing:
http://jenkins.percona.com/view/PS 5.6/job/percona-server-5.6-param/589

Laurynas Biveinis (laurynas-biveinis) wrote :

This is not a full review, but addressing this bit can happen in parallel with the rest of the review: the MP needs a blueprint, or even several blueprints, corresponding to each commit (the commits IMHO are well-split). The blueprints must be self-contained.

review: Needs Information
Laurynas Biveinis (laurynas-biveinis) wrote :

Review of commit 576. I think it would be easier if it had its own MP (probably the other two commits too).

    Code

    - s/ibool/bool/g (in all three commits if applies)
    - I'd add defensive asserts at os_aio_free that for all arrays,
      count[x] == 0. This would catch any unsubmitted buffered read
      request, and any request buffered on the non-read array.
    - s/ut_malloc+memset(0)/calloc in os_aio_array_create
    - Why does buf_read_recv_pages call
      os_aio_linux_dispatch_read_array_submit? It does not appear to
      submit any buffered requests.
    - fil_extend_space_to_desired_size calling os_aio with
      should_buffer == TRUE is a (benign) typo?
    - The abstraction level for buffered request submitting seems to
      be off. I'd rename os_aio_linux_dispatch_read_array_submit to
      os_aio_submit_buffered_requests and push #ifdef LINUX_NATIVE_AIO
      down to it.
    - buf_read_page_low header comment @return tag: edit to "1 if
      read request is issued or buffered" to clarify that the function
      returns the same for both buffered and immediately issued read
      requests.
    - s/read/ready in the buf_read_page_low should_buffer
      comment. (https://github.com/facebook/mysql-5.6/commit/b1b4c7977136d57170f8bf500aaedba740b1c333)
    - Make sure the patch does not break the build with performance
      schema configured out
    - os_aio_linux_dispatch_read_array_submit would need a /*===*/
      comment, os_aio_func should_buffer arg declaration is misaligned
      </pedantic>
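
Two of the points above, the calloc substitution and the defensive assert on free, can be illustrated with a minimal sketch. The toy_array type and its functions are hypothetical stand-ins, not the os_aio_array_t code from the branch; the sketch assumes only that each array tracks a count of buffered, not yet submitted requests.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an AIO array; not the InnoDB structure. */
struct toy_array {
  size_t n_buffered; /* requests buffered but not yet submitted */
  int   *slots;      /* per-request state, zero-initialised */
};

/* calloc() returns zeroed memory, replacing a malloc + memset(0) pair. */
static struct toy_array *toy_array_create(size_t n_slots)
{
  struct toy_array *a = calloc(1, sizeof(*a));
  if (a == NULL)
    return NULL;
  a->slots = calloc(n_slots, sizeof(*a->slots));
  if (a->slots == NULL) {
    free(a);
    return NULL;
  }
  return a;
}

/* Defensive check: freeing an array that still holds buffered requests
   means a request was never submitted, so fail loudly in debug builds. */
static void toy_array_free(struct toy_array *a)
{
  assert(a->n_buffered == 0);
  free(a->slots);
  free(a);
}
```

With such a check in place, a request buffered on an array that is never flushed shows up as an assertion failure at shutdown instead of being dropped silently.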

    Testcase

    - --disable_warnings/DROP TABLE IF EXISTS/--enable_warnings idiom
      is obsolete and should be removed. (in all three commits if
      applies)
    - innodb_merge_read.test needs --source
      include/have_innodb_16k.inc, as the number of readahead requests
      is likely to differ for other page sizes.
    - innodb_merge_read-master.opt is redundant, as
      --innodb-use-native-aio=1 is the default. The source
      include/have_native_aio.inc check in the testcase itself is
      enough.
    - newline at the end of have_native_aio.inc
    - I'd extend the innodb_merge_read testcase to check that linear
      read ahead read buffering works for compressed tablespaces too
      (there is code if (zip_size) then read(... should_buffer) else
      read(... should_buffer)). That would cause move of the testcase
      to the innodb_zip suite as well.
    - (wishlist) Consider submitting
      https://github.com/webscalesql/webscalesql-5.6/commit/32e49b4d66eaa392d9f06198596db4b16e8b8d04
      for Percona Server so that we exercise AIO with MTR --mem.

review: Needs Fixing
Laurynas Biveinis (laurynas-biveinis) wrote :

Work on this MP must continue on github.

review: Needs Resubmitting

Unmerged revisions

578. By Vlad Lesin

Add mysqldump support for logical read ahead

Summary:
Adds options to mysqldump:
 --lra-size=X
 --lra-sleep=X
 --lra-n-node-recs-before-sleep=X

These just inject SET statements to set these session variables.
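
A minimal sketch of how one option value turns into a SET statement. format_lra_set is a hypothetical helper written for illustration; the actual patch formats the statements inline with my_snprintf in connect_to_db, and sends innodb_lra_sleep and innodb_lra_n_node_recs_before_sleep only when the lra size option is non-zero.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of mysqldump: formats
   "SET <variable>=<value>" into buf. Returns the statement length,
   or -1 if buf is too small to hold the whole statement. */
static int format_lra_set(char *buf, size_t buf_size,
                          const char *variable, unsigned long value)
{
  int len;

  assert(buf != NULL && variable != NULL);
  len = snprintf(buf, buf_size, "SET %s=%lu", variable, value);
  return (len < 0 || (size_t) len >= buf_size) ? -1 : len;
}
```

For example, calling it with "innodb_lra_size" and 1024 produces the statement SET innodb_lra_size=1024, which the dump session would then send before reading any table.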

The original implementation is here:
https://github.com/facebook/mysql-5.6/commit/f69a4ea522bce24e4cdcc7696d5fad29587cf87a

577. By Vlad Lesin

When the session variable innodb_lra_size is set to N, we issue async
read requests for the next M logical pages where the total size of the M
pages on disk is N megabytes. The max allowed value of innodb_lra_size
is 16384, which corresponds to prefetching 16 GB of data. We may choose
to use smaller values in production.

The original implementation can be found here:
https://github.com/facebook/mysql-5.6/commit/f8e361952612d00979f7cf744f487e48b15cb5a6

This implementation does not contain code for flashcache.
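
The relation between innodb_lra_size and the number of prefetched pages can be sketched as simple arithmetic. lra_pages_for_size is a hypothetical helper, and it assumes every page occupies the same number of bytes on disk (16 KB in the example values), which would not hold for compressed tablespaces.

```c
#include <assert.h>

/* Hypothetical helper: number of pages whose combined on-disk size is
   lra_size_mb megabytes, assuming a uniform page size in bytes. */
static unsigned long long lra_pages_for_size(unsigned long long lra_size_mb,
                                             unsigned long long page_size_bytes)
{
  assert(page_size_bytes > 0);
  return (lra_size_mb * 1024ULL * 1024ULL) / page_size_bytes;
}
```

Under that assumption, innodb_lra_size=1 covers 64 pages of 16 KB, and the maximum of 16384 covers 1048576 pages, that is 16 GB.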

576. By Vlad Lesin

Merge aio page read requests

Summary:
Tries to submit multiple aio page read requests together to improve read
performance.

The original code and description can be found here:
https://github.com/facebook/mysql-5.6/commit/f9d1a5332eb2c82c028638d3b93b5a3592a69ffa

The difference between this and the original implementation is that, in the
original, the fil_io() macro invokes the _fil_io() function with I/O buffering
enabled by default. This can cause errors when the caller waits for the I/O to
finish immediately after invoking fil_io().

For example, log_archive_do() waits for I/O completion on the
log_sys->archive_lock mutex, but the mutex is never unlocked because the I/Os
were buffered and not yet submitted, so io_handler_thread() never processes
their completion in fil_aio_wait(). The same kind of error could potentially
occur elsewhere, so I/O buffering is disabled by default and is enabled only
for the logical read-ahead code.

Preview Diff

=== modified file 'client/client_priv.h'
--- client/client_priv.h 2014-02-25 17:05:01 +0000
+++ client/client_priv.h 2014-04-23 10:58:56 +0000
@@ -106,7 +106,10 @@
   OPT_INNODB_OPTIMIZE_KEYS,
   OPT_REWRITE_DB,
   OPT_LOCK_FOR_BACKUP,
-  OPT_MAX_CLIENT_OPTION
+  OPT_MAX_CLIENT_OPTION,
+  OPT_LRA_SIZE,
+  OPT_LRA_SLEEP,
+  OPT_LRA_N_NODE_RECS_BEFORE_SLEEP
 };
 
 /**
=== modified file 'client/mysqldump.c'
--- client/mysqldump.c 2014-03-03 17:51:33 +0000
+++ client/mysqldump.c 2014-04-23 10:58:56 +0000
@@ -133,6 +133,9 @@
 /* Server supports character_set_results session variable? */
 static my_bool server_supports_switching_charsets= TRUE;
 static ulong opt_compatible_mode= 0;
+static ulong opt_lra_size = 0;
+static ulong opt_lra_sleep = 0;
+static ulong opt_lra_n_node_recs_before_sleep = 0;
 #define MYSQL_OPT_MASTER_DATA_EFFECTIVE_SQL 1
 #define MYSQL_OPT_MASTER_DATA_COMMENTED_SQL 2
 #define MYSQL_OPT_SLAVE_DATA_EFFECTIVE_SQL 1
@@ -567,6 +570,18 @@
    "Default authentication client-side plugin to use.",
    &opt_default_auth, &opt_default_auth, 0,
    GET_STR, REQUIRED_ARG, 0, 0, 0, 0, 0, 0},
+  {"lra_size", OPT_LRA_SIZE,
+   "Set innodb_lra_size for the session of this dump.",
+   &opt_lra_size, &opt_lra_size, 0,
+   GET_ULONG, REQUIRED_ARG, 0, 0, 16384, 0, 0, 0},
+  {"lra_sleep", OPT_LRA_SLEEP,
+   "Set innodb_lra_sleep for the session of this dump.",
+   &opt_lra_sleep, &opt_lra_sleep, 0,
+   GET_ULONG, REQUIRED_ARG, 0, 0, 1000, 0, 0, 0},
+  {"lra_n_node_recs_before_sleep", OPT_LRA_N_NODE_RECS_BEFORE_SLEEP,
+   "Set innodb_lra_n_node_recs_before_sleep for the session of this dump.",
+   &opt_lra_n_node_recs_before_sleep, &opt_lra_n_node_recs_before_sleep, 0,
+   GET_ULONG, REQUIRED_ARG, 1024, 128, ULONG_MAX, 0, 0, 0},
   {0, 0, 0, 0, 0, 0, GET_NO_ARG, NO_ARG, 0, 0, 0, 0, 0, 0}
 };
 
@@ -1611,6 +1626,29 @@
     if (mysql_query_with_error_report(mysql, 0, buff))
       DBUG_RETURN(1);
   }
+
+  if (opt_lra_size)
+  {
+    my_snprintf(buff, sizeof(buff), "SET innodb_lra_size=%lu", opt_lra_size);
+    if (mysql_query_with_error_report(mysql, 0, buff))
+      DBUG_RETURN(1);
+    if (opt_lra_sleep)
+    {
+      my_snprintf(buff, sizeof(buff), "SET innodb_lra_sleep=%lu",
+                  opt_lra_sleep);
+      if (mysql_query_with_error_report(mysql, 0, buff))
+        DBUG_RETURN(1);
+    }
+    if (opt_lra_n_node_recs_before_sleep)
+    {
+      my_snprintf(buff, sizeof(buff),
+                  "SET innodb_lra_n_node_recs_before_sleep=%lu",
+                  opt_lra_n_node_recs_before_sleep);
+      if (mysql_query_with_error_report(mysql, 0, buff))
+        DBUG_RETURN(1);
+    }
+  }
+
   DBUG_RETURN(0);
 } /* connect_to_db */
 
=== added file 'mysql-test/include/have_native_aio.inc'
--- mysql-test/include/have_native_aio.inc 1970-01-01 00:00:00 +0000
+++ mysql-test/include/have_native_aio.inc 2014-04-23 10:58:56 +0000
@@ -0,0 +1,6 @@
+--disable_query_log
+if (`select @@global.innodb_use_native_aio != 1`)
+{
+  --skip native AIO is not in use
+}
+--enable_query_log
\ No newline at end of file
=== added file 'mysql-test/suite/innodb/r/innodb_logical_read_ahead.result'
--- mysql-test/suite/innodb/r/innodb_logical_read_ahead.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/r/innodb_logical_read_ahead.result 2014-04-23 10:58:56 +0000
@@ -0,0 +1,47 @@
+DROP TABLE if exists t1;
+CREATE TABLE t1 (a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+INSERT INTO t1 VALUES (0, REPEAT('a',256));
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+show global status like "innodb_buffered_aio_submitted";
+Variable_name Value
+Innodb_buffered_aio_submitted 0
+show global status like "innodb_logical_read_ahead_misses";
+Variable_name Value
+Innodb_logical_read_ahead_misses 0
+show global status like "innodb_logical_read_ahead_prefetched";
+Variable_name Value
+Innodb_logical_read_ahead_prefetched 0
+show global status like "innodb_logical_read_ahead_in_buf_pool";
+Variable_name Value
+Innodb_logical_read_ahead_in_buf_pool 0
+SET SESSION innodb_lra_size=1024;
+SET SESSION innodb_lra_n_node_recs_before_sleep=128;
+SET SESSION innodb_lra_sleep=100;
+checksum table t1;
+Table Checksum
+test.t1 2920207201
+show global status like "innodb_logical_read_ahead_misses";
+Variable_name Value
+Innodb_logical_read_ahead_misses 0
+select variable_value > 1000 from information_schema.global_status where variable_name="innodb_logical_read_ahead_prefetched";
+variable_value > 1000
+1
+select variable_value < 100 from information_schema.global_status where variable_name="innodb_logical_read_ahead_in_buf_pool";
+variable_value < 100
+1
+DROP TABLE t1;
=== added file 'mysql-test/suite/innodb/r/innodb_logical_read_ahead_correctness.result'
--- mysql-test/suite/innodb/r/innodb_logical_read_ahead_correctness.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/r/innodb_logical_read_ahead_correctness.result 2014-04-23 10:58:56 +0000
@@ -0,0 +1,88 @@
+DROP TABLE IF EXISTS t1_small;
+DROP TABLE IF EXISTS t1;
+DROP TABLE IF EXISTS t1_lra;
+DROP TABLE IF EXISTS t2_small;
+DROP TABLE IF EXISTS t3_small;
+CREATE TABLE t1_small(a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+SET SESSION innodb_lra_size=1;
+SELECT * FROM t1_small;
+a b
+SET SESSION innodb_lra_size=0;
+INSERT INTO t1_small(b) VALUES(REPEAT('a',256));
+SET SESSION innodb_lra_size=1;
+SELECT a, LENGTH(b) FROM t1_small;
+a LENGTH(b)
+1 256
+SET SESSION innodb_lra_size=0;
+DROP TABLE t1_small;
+CREATE TABLE `t2_small` (
+`id1` bigint(20) unsigned NOT NULL DEFAULT '0',
+`time` bigint(20) unsigned NOT NULL DEFAULT '0',
+`id2` bigint(20) unsigned NOT NULL DEFAULT '0',
+`id2_type` int(10) unsigned DEFAULT NULL,
+`data` text,
+`status` tinyint(3) unsigned DEFAULT NULL,
+PRIMARY KEY (`id1`,`time`,`id2`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1;
+SET SESSION innodb_lra_size=1;
+SELECT * FROM t2_small;
+id1 time id2 id2_type data status
+DROP TABLE t2_small;
+CREATE TABLE `t3_small` (
+`id` bigint(20) NOT NULL,
+`a` text,
+`b` text,
+`c` text,
+`d` text,
+`e` text,
+`f` text,
+`g` text,
+PRIMARY KEY (`id`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1;
+SET SESSION innodb_lra_size=1;
+SELECT * FROM t3_small;
+id a b c d e f g
+DROP TABLE t3_small;
+CREATE TABLE t1(a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+CREATE TABLE t1_lra(a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+INSERT INTO t1 VALUES (0, REPEAT('a',256));
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1_lra SELECT * FROM t1;
+CHECKSUM TABLE t1;
+Table Checksum
+test.t1 2793042655
+SET SESSION innodb_lra_size=1;
+SET SESSION innodb_lra_n_node_recs_before_sleep=128;
+SET SESSION innodb_lra_sleep=100;
+CHECKSUM TABLE t1_lra;
+Table Checksum
+test.t1_lra 2793042655
+DELETE FROM t1 WHERE a >= 5480 AND a < 5520;
+DELETE FROM t1 WHERE a >= 5520 AND a < 5550;
+CHECKSUM TABLE t1;
+Table Checksum
+test.t1 1005864202
+SET GLOBAL innodb_lra_test=1;
+DELETE FROM t1_lra WHERE a >= 5480 AND a < 5520;
+DELETE FROM t1_lra WHERE a >= 5520 AND a < 5550;
+SET SESSION innodb_lra_size=1;
+SET SESSION innodb_lra_n_node_recs_before_sleep=128;
+SET SESSION innodb_lra_sleep=100;
+CHECKSUM TABLE t1_lra;
+Table Checksum
+test.t1_lra 1005864202
+DROP TABLE t1;
+DROP TABLE t1_lra;
=== added file 'mysql-test/suite/innodb/r/innodb_merge_read.result'
--- mysql-test/suite/innodb/r/innodb_merge_read.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/r/innodb_merge_read.result 2014-04-23 10:58:56 +0000
@@ -0,0 +1,30 @@
+DROP TABLE if exists t1;
+CREATE TABLE t1 (a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+INSERT INTO t1 VALUES (0, REPEAT('a',256));
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+show global status like "innodb_buffered_aio_submitted";
+Variable_name Value
+Innodb_buffered_aio_submitted 0
+select * from t1;
+select count(*) from t1;
+count(*)
+65536
+show global status like "innodb_buffered_aio_submitted";
+Variable_name Value
+Innodb_buffered_aio_submitted 2397
+DROP TABLE t1;
=== added file 'mysql-test/suite/innodb/t/innodb_logical_read_ahead-master.opt'
--- mysql-test/suite/innodb/t/innodb_logical_read_ahead-master.opt 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/t/innodb_logical_read_ahead-master.opt 2014-04-23 10:58:56 +0000
@@ -0,0 +1,2 @@
+--innodb_use_native_aio=1
+--force-restart
=== added file 'mysql-test/suite/innodb/t/innodb_logical_read_ahead.test'
--- mysql-test/suite/innodb/t/innodb_logical_read_ahead.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/t/innodb_logical_read_ahead.test 2014-04-23 10:58:56 +0000
@@ -0,0 +1,55 @@
+--source include/have_innodb.inc
+--source include/have_native_aio.inc
+
+--disable_warnings
+DROP TABLE if exists t1;
+--enable_warnings
+
+# Create table.
+CREATE TABLE t1 (a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+
+# Populate table.
+INSERT INTO t1 VALUES (0, REPEAT('a',256));
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+
+--source include/restart_mysqld.inc
+
+show global status like "innodb_buffered_aio_submitted";
+show global status like "innodb_logical_read_ahead_misses";
+show global status like "innodb_logical_read_ahead_prefetched";
+show global status like "innodb_logical_read_ahead_in_buf_pool";
+
+# set the logical read ahead large enough to prefetch
+# the entire table.
+SET SESSION innodb_lra_size=1024;
+SET SESSION innodb_lra_n_node_recs_before_sleep=128;
+SET SESSION innodb_lra_sleep=100;
+checksum table t1;
+
+# there should be no misses, all pages must have been
+# prefetched by the logical read ahead.
+show global status like "innodb_logical_read_ahead_misses";
+# the total number of pages prefetched must be close to the number
+# of leaf pages of the table.
+select variable_value > 1000 from information_schema.global_status where variable_name="innodb_logical_read_ahead_prefetched";
+# innodb_logical_read_ahead_in_buf_pool is the number of pages
+# of the table that were already in the buffer pool while doing the scan.
+# This should be small.
+select variable_value < 100 from information_schema.global_status where variable_name="innodb_logical_read_ahead_in_buf_pool";
+
+DROP TABLE t1;
=== added file 'mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness-master.opt'
--- mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness-master.opt 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness-master.opt 2014-04-23 10:58:56 +0000
@@ -0,0 +1,2 @@
+--innodb_use_native_aio=1
+--force-restart
=== added file 'mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness.test'
--- mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/t/innodb_logical_read_ahead_correctness.test 2014-04-23 10:58:56 +0000
@@ -0,0 +1,107 @@
+--source include/have_debug.inc
+--source include/have_innodb.inc
+--source include/have_native_aio.inc
+
+--disable_warnings
+DROP TABLE IF EXISTS t1_small;
+DROP TABLE IF EXISTS t1;
+DROP TABLE IF EXISTS t1_lra;
+DROP TABLE IF EXISTS t2_small;
+DROP TABLE IF EXISTS t3_small;
+--enable_warnings
+
+# The small table is for checking against a bug where the table's only page is the
+# root page. In such a case the function called for getting the parent page caused
+# the server to crash.
+CREATE TABLE t1_small(a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+
+SET SESSION innodb_lra_size=1;
+SELECT * FROM t1_small;
+
+SET SESSION innodb_lra_size=0;
+INSERT INTO t1_small(b) VALUES(REPEAT('a',256));
+SET SESSION innodb_lra_size=1;
+SELECT a, LENGTH(b) FROM t1_small;
+SET SESSION innodb_lra_size=0;
+
+DROP TABLE t1_small;
+
+CREATE TABLE `t2_small` (
+  `id1` bigint(20) unsigned NOT NULL DEFAULT '0',
+  `time` bigint(20) unsigned NOT NULL DEFAULT '0',
+  `id2` bigint(20) unsigned NOT NULL DEFAULT '0',
+  `id2_type` int(10) unsigned DEFAULT NULL,
+  `data` text,
+  `status` tinyint(3) unsigned DEFAULT NULL,
+  PRIMARY KEY (`id1`,`time`,`id2`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1;
+
+SET SESSION innodb_lra_size=1;
+SELECT * FROM t2_small;
+DROP TABLE t2_small;
+
+CREATE TABLE `t3_small` (
+  `id` bigint(20) NOT NULL,
+  `a` text,
+  `b` text,
+  `c` text,
+  `d` text,
+  `e` text,
+  `f` text,
+  `g` text,
+  PRIMARY KEY (`id`)
+) ENGINE=InnoDB DEFAULT CHARSET=latin1;
+
+SET SESSION innodb_lra_size=1;
+SELECT * FROM t3_small;
+DROP TABLE t3_small;
+
+CREATE TABLE t1(a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+CREATE TABLE t1_lra(a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+
+# Populate tables.
+INSERT INTO t1 VALUES (0, REPEAT('a',256));
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+INSERT INTO t1(b) SELECT b FROM t1;
+
+INSERT INTO t1_lra SELECT * FROM t1;
+
+--source include/restart_mysqld.inc
+
+CHECKSUM TABLE t1;
+
+SET SESSION innodb_lra_size=1;
+SET SESSION innodb_lra_n_node_recs_before_sleep=128;
+SET SESSION innodb_lra_sleep=100;
+CHECKSUM TABLE t1_lra;
+
+--source include/restart_mysqld.inc
+
+DELETE FROM t1 WHERE a >= 5480 AND a < 5520;
+DELETE FROM t1 WHERE a >= 5520 AND a < 5550;
+
+CHECKSUM TABLE t1;
+
+SET GLOBAL innodb_lra_test=1;
+DELETE FROM t1_lra WHERE a >= 5480 AND a < 5520;
+DELETE FROM t1_lra WHERE a >= 5520 AND a < 5550;
+
+SET SESSION innodb_lra_size=1;
+SET SESSION innodb_lra_n_node_recs_before_sleep=128;
+SET SESSION innodb_lra_sleep=100;
+CHECKSUM TABLE t1_lra;
+
+DROP TABLE t1;
+DROP TABLE t1_lra;
=== added file 'mysql-test/suite/innodb/t/innodb_merge_read-master.opt'
--- mysql-test/suite/innodb/t/innodb_merge_read-master.opt 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/t/innodb_merge_read-master.opt 2014-04-23 10:58:56 +0000
@@ -0,0 +1,1 @@
+--innodb-use-native-aio=1
=== added file 'mysql-test/suite/innodb/t/innodb_merge_read.test'
--- mysql-test/suite/innodb/t/innodb_merge_read.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/innodb/t/innodb_merge_read.test 2014-04-23 10:58:56 +0000
@@ -0,0 +1,42 @@
+--source include/have_innodb.inc
+--source include/have_native_aio.inc
+
+--disable_warnings
+DROP TABLE if exists t1;
+--enable_warnings
+
+# Create table.
+CREATE TABLE t1 (a INT NOT NULL PRIMARY KEY AUTO_INCREMENT, b VARCHAR(256)) ENGINE=INNODB;
+
+# Populate table.
+INSERT INTO t1 VALUES (0, REPEAT('a',256));
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+INSERT INTO t1 SELECT 0, b FROM t1;
+
+--source include/restart_mysqld.inc
+
+show global status like "innodb_buffered_aio_submitted";
+
+--disable_result_log
+select * from t1;
+--enable_result_log
+
+select count(*) from t1;
+
+show global status like "innodb_buffered_aio_submitted";
+
+DROP TABLE t1;
=== added file 'mysql-test/suite/sys_vars/r/innodb_lra_n_node_recs_before_sleep_basic.result'
--- mysql-test/suite/sys_vars/r/innodb_lra_n_node_recs_before_sleep_basic.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/r/innodb_lra_n_node_recs_before_sleep_basic.result 2014-04-23 10:58:56 +0000
@@ -0,0 +1,28 @@
+SET GLOBAL innodb_lra_n_node_recs_before_sleep = 128;
+SELECT @@GLOBAL.innodb_lra_n_node_recs_before_sleep;
+@@GLOBAL.innodb_lra_n_node_recs_before_sleep
+128
+SET SESSION innodb_lra_n_node_recs_before_sleep=1000000;
+SELECT @@SESSION.innodb_lra_n_node_recs_before_sleep;
+@@SESSION.innodb_lra_n_node_recs_before_sleep
+1000000
+SET SESSION innodb_lra_n_node_recs_before_sleep=0;
+Warnings:
+Warning 1292 Truncated incorrect innodb_lra_n_node_recs_before_sl value: '0'
+SELECT @@SESSION.innodb_lra_n_node_recs_before_sleep;
+@@SESSION.innodb_lra_n_node_recs_before_sleep
+128
+SET SESSION innodb_lra_n_node_recs_before_sleep=16384;
+SELECT @@SESSION.innodb_lra_n_node_recs_before_sleep;
+@@SESSION.innodb_lra_n_node_recs_before_sleep
+16384
+SET GLOBAL innodb_lra_n_node_recs_before_sleep=-1;
+Warnings:
+Warning 1292 Truncated incorrect innodb_lra_n_node_recs_before_sl value: '-1'
+SELECT @@GLOBAL.innodb_lra_n_node_recs_before_sleep;
+@@GLOBAL.innodb_lra_n_node_recs_before_sleep
+128
+SET GLOBAL innodb_lra_n_node_recs_before_sleep = default;
+SELECT @@GLOBAL.innodb_lra_n_node_recs_before_sleep;
+@@GLOBAL.innodb_lra_n_node_recs_before_sleep
+1024
=== added file 'mysql-test/suite/sys_vars/r/innodb_lra_size_basic.result'
--- mysql-test/suite/sys_vars/r/innodb_lra_size_basic.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/r/innodb_lra_size_basic.result 2014-04-23 10:58:56 +0000
@@ -0,0 +1,28 @@
+SET GLOBAL innodb_lra_size = 128;
+SELECT @@GLOBAL.innodb_lra_size;
+@@GLOBAL.innodb_lra_size
+128
+SET SESSION innodb_lra_size=1000000;
+Warnings:
+Warning 1292 Truncated incorrect innodb_lra_size value: '1000000'
+SELECT @@SESSION.innodb_lra_size;
+@@SESSION.innodb_lra_size
+16384
+SET SESSION innodb_lra_size=0;
+SELECT @@SESSION.innodb_lra_size;
+@@SESSION.innodb_lra_size
+0
+SET SESSION innodb_lra_size=16384;
+SELECT @@SESSION.innodb_lra_size;
+@@SESSION.innodb_lra_size
+16384
+SET GLOBAL innodb_lra_size=-1;
+Warnings:
+Warning 1292 Truncated incorrect innodb_lra_size value: '-1'
+SELECT @@GLOBAL.innodb_lra_size;
+@@GLOBAL.innodb_lra_size
+0
+SET GLOBAL innodb_lra_size = default;
+SELECT @@GLOBAL.innodb_lra_size;
+@@GLOBAL.innodb_lra_size
+0
=== added file 'mysql-test/suite/sys_vars/r/innodb_lra_sleep_basic.result'
--- mysql-test/suite/sys_vars/r/innodb_lra_sleep_basic.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/r/innodb_lra_sleep_basic.result 2014-04-23 10:58:56 +0000
@@ -0,0 +1,30 @@
+SET GLOBAL innodb_lra_sleep = 128;
+SELECT @@GLOBAL.innodb_lra_sleep;
+@@GLOBAL.innodb_lra_sleep
+128
+SET SESSION innodb_lra_sleep=1000000;
+Warnings:
+Warning 1292 Truncated incorrect innodb_lra_sleep value: '1000000'
+SELECT @@SESSION.innodb_lra_sleep;
+@@SESSION.innodb_lra_sleep
+1000
+SET SESSION innodb_lra_sleep=0;
+SELECT @@SESSION.innodb_lra_sleep;
+@@SESSION.innodb_lra_sleep
+0
+SET SESSION innodb_lra_sleep=16384;
+Warnings:
+Warning 1292 Truncated incorrect innodb_lra_sleep value: '16384'
+SELECT @@SESSION.innodb_lra_sleep;
+@@SESSION.innodb_lra_sleep
+1000
+SET GLOBAL innodb_lra_sleep=-1;
+Warnings:
+Warning 1292 Truncated incorrect innodb_lra_sleep value: '-1'
+SELECT @@GLOBAL.innodb_lra_sleep;
+@@GLOBAL.innodb_lra_sleep
+0
+SET GLOBAL innodb_lra_sleep = default;
+SELECT @@GLOBAL.innodb_lra_sleep;
+@@GLOBAL.innodb_lra_sleep
+50
=== added file 'mysql-test/suite/sys_vars/r/innodb_lra_test_basic.result'
--- mysql-test/suite/sys_vars/r/innodb_lra_test_basic.result 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/r/innodb_lra_test_basic.result 2014-04-23 10:58:56 +0000
@@ -0,0 +1,8 @@
+set global innodb_lra_test=1;
+select @@global.innodb_lra_test;
+@@global.innodb_lra_test
+1
+set global innodb_lra_test=default;
+select @@global.innodb_lra_test;
+@@global.innodb_lra_test
+0
=== added file 'mysql-test/suite/sys_vars/t/innodb_lra_n_node_recs_before_sleep_basic.test'
--- mysql-test/suite/sys_vars/t/innodb_lra_n_node_recs_before_sleep_basic.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/t/innodb_lra_n_node_recs_before_sleep_basic.test 2014-04-23 10:58:56 +0000
@@ -0,0 +1,14 @@
+--source include/have_innodb.inc
+
+SET GLOBAL innodb_lra_n_node_recs_before_sleep = 128;
+SELECT @@GLOBAL.innodb_lra_n_node_recs_before_sleep;
+SET SESSION innodb_lra_n_node_recs_before_sleep=1000000;
+SELECT @@SESSION.innodb_lra_n_node_recs_before_sleep;
+SET SESSION innodb_lra_n_node_recs_before_sleep=0;
+SELECT @@SESSION.innodb_lra_n_node_recs_before_sleep;
+SET SESSION innodb_lra_n_node_recs_before_sleep=16384;
+SELECT @@SESSION.innodb_lra_n_node_recs_before_sleep;
+SET GLOBAL innodb_lra_n_node_recs_before_sleep=-1;
+SELECT @@GLOBAL.innodb_lra_n_node_recs_before_sleep;
+SET GLOBAL innodb_lra_n_node_recs_before_sleep = default;
+SELECT @@GLOBAL.innodb_lra_n_node_recs_before_sleep;
=== added file 'mysql-test/suite/sys_vars/t/innodb_lra_size_basic-master.opt'
--- mysql-test/suite/sys_vars/t/innodb_lra_size_basic-master.opt 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/t/innodb_lra_size_basic-master.opt 2014-04-23 10:58:56 +0000
@@ -0,0 +1,1 @@
+--innodb-use-native-aio=1
=== added file 'mysql-test/suite/sys_vars/t/innodb_lra_size_basic.test'
--- mysql-test/suite/sys_vars/t/innodb_lra_size_basic.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/t/innodb_lra_size_basic.test 2014-04-23 10:58:56 +0000
@@ -0,0 +1,14 @@
+--source include/have_innodb.inc
+
+SET GLOBAL innodb_lra_size = 128;
+SELECT @@GLOBAL.innodb_lra_size;
+SET SESSION innodb_lra_size=1000000;
+SELECT @@SESSION.innodb_lra_size;
+SET SESSION innodb_lra_size=0;
+SELECT @@SESSION.innodb_lra_size;
+SET SESSION innodb_lra_size=16384;
+SELECT @@SESSION.innodb_lra_size;
+SET GLOBAL innodb_lra_size=-1;
+SELECT @@GLOBAL.innodb_lra_size;
+SET GLOBAL innodb_lra_size = default;
+SELECT @@GLOBAL.innodb_lra_size;
=== added file 'mysql-test/suite/sys_vars/t/innodb_lra_sleep_basic.test'
--- mysql-test/suite/sys_vars/t/innodb_lra_sleep_basic.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/t/innodb_lra_sleep_basic.test 2014-04-23 10:58:56 +0000
@@ -0,0 +1,14 @@
+--source include/have_innodb.inc
+
+SET GLOBAL innodb_lra_sleep = 128;
+SELECT @@GLOBAL.innodb_lra_sleep;
+SET SESSION innodb_lra_sleep=1000000;
+SELECT @@SESSION.innodb_lra_sleep;
+SET SESSION innodb_lra_sleep=0;
+SELECT @@SESSION.innodb_lra_sleep;
+SET SESSION innodb_lra_sleep=16384;
+SELECT @@SESSION.innodb_lra_sleep;
+SET GLOBAL innodb_lra_sleep=-1;
+SELECT @@GLOBAL.innodb_lra_sleep;
+SET GLOBAL innodb_lra_sleep = default;
+SELECT @@GLOBAL.innodb_lra_sleep;
=== added file 'mysql-test/suite/sys_vars/t/innodb_lra_test_basic-master.opt'
--- mysql-test/suite/sys_vars/t/innodb_lra_test_basic-master.opt 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/t/innodb_lra_test_basic-master.opt 2014-04-23 10:58:56 +0000
@@ -0,0 +1,1 @@
+--innodb-use-native-aio=1
=== added file 'mysql-test/suite/sys_vars/t/innodb_lra_test_basic.test'
--- mysql-test/suite/sys_vars/t/innodb_lra_test_basic.test 1970-01-01 00:00:00 +0000
+++ mysql-test/suite/sys_vars/t/innodb_lra_test_basic.test 2014-04-23 10:58:56 +0000
@@ -0,0 +1,8 @@
+--source include/have_debug.inc
+--source include/have_innodb.inc
+--source include/have_native_aio.inc
+
+set global innodb_lra_test=1;
+select @@global.innodb_lra_test;
+set global innodb_lra_test=default;
+select @@global.innodb_lra_test;
\ No newline at end of file
=== modified file 'storage/innobase/btr/btr0cur.cc'
--- storage/innobase/btr/btr0cur.cc 2014-03-03 17:51:33 +0000
+++ storage/innobase/btr/btr0cur.cc 2014-04-23 10:58:56 +0000
@@ -548,6 +548,7 @@
 	   btr_search_enabled below, and btr_search_guess_on_hash()
 	   will have to check it again. */
 	    && UNIV_LIKELY(btr_search_enabled)
+	    && !level
 	    && btr_search_guess_on_hash(index, info, tuple, mode,
 					latch_mode, cursor,
 					has_search_latch, mtr)) {
=== modified file 'storage/innobase/btr/btr0pcur.cc'
--- storage/innobase/btr/btr0pcur.cc 2014-03-03 17:51:33 +0000
+++ storage/innobase/btr/btr0pcur.cc 2014-04-23 10:58:56 +0000
@@ -227,6 +227,7 @@
 /*===========================*/
 	ulint		latch_mode,	/*!< in: BTR_SEARCH_LEAF, ... */
 	btr_pcur_t*	cursor,		/*!< in: detached persistent cursor */
+	ulint		level,
 	const char*	file,		/*!< in: file name */
 	ulint		line,		/*!< in: line where called */
 	mtr_t*		mtr)		/*!< in: mtr */
@@ -255,7 +256,7 @@
 		btr_cur_open_at_index_side(
 			cursor->rel_pos == BTR_PCUR_BEFORE_FIRST_IN_TREE,
 			index, latch_mode,
-			btr_pcur_get_btr_cur(cursor), 0, mtr);
+			btr_pcur_get_btr_cur(cursor), level, mtr);
 
 		cursor->latch_mode = latch_mode;
 		cursor->pos_state = BTR_PCUR_IS_POSITIONED;
@@ -267,8 +268,12 @@
 	ut_a(cursor->old_rec);
 	ut_a(cursor->old_n_fields);
 
-	if (UNIV_LIKELY(latch_mode == BTR_SEARCH_LEAF)
-	    || UNIV_LIKELY(latch_mode == BTR_MODIFY_LEAF)) {
+	if (true
+#ifdef UNIV_DEBUG
+	    && !level
+#endif
+	    && (UNIV_LIKELY(latch_mode == BTR_SEARCH_LEAF)
+		|| UNIV_LIKELY(latch_mode == BTR_MODIFY_LEAF))) {
 		/* Try optimistic restoration. */
 
 		if (buf_page_optimistic_get(latch_mode,
@@ -325,24 +330,27 @@
 
 	/* Save the old search mode of the cursor */
 	old_mode = cursor->search_mode;
-
-	switch (cursor->rel_pos) {
-	case BTR_PCUR_ON:
+	if (level > 0) {
 		mode = PAGE_CUR_LE;
-		break;
-	case BTR_PCUR_AFTER:
-		mode = PAGE_CUR_G;
-		break;
-	case BTR_PCUR_BEFORE:
-		mode = PAGE_CUR_L;
-		break;
-	default:
-		ut_error;
-		mode = 0;
+	} else {
+		switch (cursor->rel_pos) {
+		case BTR_PCUR_ON:
+			mode = PAGE_CUR_LE;
+			break;
+		case BTR_PCUR_AFTER:
+			mode = PAGE_CUR_G;
+			break;
+		case BTR_PCUR_BEFORE:
+			mode = PAGE_CUR_L;
+			break;
+		default:
+			ut_error;
+			mode = 0;
+		}
 	}
 
-	btr_pcur_open_with_no_init_func(index, tuple, mode, latch_mode,
-					cursor, 0, file, line, mtr);
+	btr_pcur_open_with_no_init_func_low(index, tuple, mode, latch_mode,
+					cursor, level, 0, file, line, mtr);
 
 	/* Restore the old search mode */
 	cursor->search_mode = old_mode;
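The last btr0pcur.cc hunk changes how the search mode is picked when a persistent cursor is restored: on a non-leaf level the mode is always PAGE_CUR_LE, while on the leaf level it still depends on the stored relative position. The selection rule can be sketched in isolation as below. The enum types and the function name `restore_search_mode` are hypothetical stand-ins for the InnoDB constants, introduced only so the sketch compiles on its own.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-ins for BTR_PCUR_ON / BTR_PCUR_AFTER / BTR_PCUR_BEFORE.
enum class RelPos { On, After, Before };

// Hypothetical stand-ins for PAGE_CUR_G / PAGE_CUR_L / PAGE_CUR_LE.
enum class PageCurMode { G, L, LE };

// Mirrors the branch structure added by the patch: a non-zero tree level
// forces "less than or equal"; level 0 keeps the original mapping.
inline PageCurMode restore_search_mode(unsigned long level, RelPos rel_pos) {
    if (level > 0) {
        return PageCurMode::LE;
    }
    switch (rel_pos) {
    case RelPos::On:
        return PageCurMode::LE;
    case RelPos::After:
        return PageCurMode::G;
    case RelPos::Before:
        return PageCurMode::L;
    }
    std::abort();  // the patch calls ut_error for any other value
}
```

Existing callers go through the `btr_pcur_restore_position` macro, which the patch changes to pass level 0, so they keep the original leaf-level mapping.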
=== modified file 'storage/innobase/buf/buf0rea.cc'
--- storage/innobase/buf/buf0rea.cc 2013-10-23 08:48:28 +0000
+++ storage/innobase/buf/buf0rea.cc 2014-04-23 10:58:56 +0000
@@ -123,7 +123,12 @@
 			use to stop dangling page reads from a tablespace
 			which we have DISCARDed + IMPORTed back */
 	ulint	offset,	/*!< in: page number */
-	trx_t*	trx)
+	trx_t*	trx,	/*!< in: transaction object */
+	ibool	should_buffer)	/*!< in: whether to buffer an aio request.
+			AIO read ahead uses this. If you plan to
+			use this parameter, make sure you remember
+			to call os_aio_linux_dispatch_read_array_submit
+			when you are read to commit all your requests.*/
 {
 	buf_page_t*	bpage;
 	ulint		wake_later;
@@ -229,14 +234,16 @@
 		*err = _fil_io(OS_FILE_READ | wake_later
 			       | ignore_nonexistent_pages,
 			       sync, space, zip_size, offset, 0, zip_size,
-			       bpage->zip.data, bpage, trx);
+			       bpage->zip.data, bpage, trx,
+			       should_buffer);
 	} else {
 		ut_a(buf_page_get_state(bpage) == BUF_BLOCK_FILE_PAGE);
 
 		*err = _fil_io(OS_FILE_READ | wake_later
 			       | ignore_nonexistent_pages,
 			       sync, space, 0, offset, 0, UNIV_PAGE_SIZE,
-			       ((buf_block_t*) bpage)->frame, bpage, trx);
+			       ((buf_block_t*) bpage)->frame, bpage, trx,
+			       should_buffer);
 	}
 
 	if (sync) {
@@ -395,7 +402,7 @@
 				&err, false,
 				ibuf_mode | OS_AIO_SIMULATED_WAKE_LATER,
 				space, zip_size, FALSE,
-				tablespace_version, i, trx);
+				tablespace_version, i, trx, FALSE);
 			if (err == DB_TABLESPACE_DELETED) {
 				ut_print_timestamp(stderr);
 				fprintf(stderr,
@@ -459,7 +466,7 @@
 
 	count = buf_read_page_low(&err, true, BUF_READ_ANY_PAGE, space,
 				  zip_size, FALSE,
-				  tablespace_version, offset, trx);
+				  tablespace_version, offset, trx, FALSE);
 	srv_stats.buf_pool_reads.add(count);
 	if (err == DB_TABLESPACE_DELETED) {
 		ut_print_timestamp(stderr);
@@ -507,7 +514,7 @@
 				  | OS_AIO_SIMULATED_WAKE_LATER
 				  | BUF_READ_IGNORE_NONEXISTENT_PAGES,
 				  space, zip_size, FALSE,
-				  tablespace_version, offset, NULL);
+				  tablespace_version, offset, NULL, FALSE);
 	srv_stats.buf_pool_reads.add(count);
 
 	/* We do not increment number of I/O operations used for LRU policy
@@ -584,6 +591,12 @@
 		return(0);
 	}
 
+	/* linear read ahead is disabled if user requested logical read ahead.
+	*/
+	if (trx && trx->lra_size) {
+		return(0);
+	}
+
 	low  = (offset / buf_read_ahead_linear_area)
 		* buf_read_ahead_linear_area;
 	high = (offset / buf_read_ahead_linear_area + 1)
@@ -773,7 +786,8 @@
 			count += buf_read_page_low(
 				&err, false,
 				ibuf_mode,
-				space, zip_size, FALSE, tablespace_version, i, trx);
+				space, zip_size, FALSE, tablespace_version, i, trx,
+				TRUE);
 			if (err == DB_TABLESPACE_DELETED) {
 				ut_print_timestamp(stderr);
 				fprintf(stderr,
@@ -786,6 +800,10 @@
 			}
 		}
 	}
+#if defined(LINUX_NATIVE_AIO)
+	/* Tell aio to submit all buffered requests. */
+	ut_a(os_aio_linux_dispatch_read_array_submit());
+#endif
 
 	/* In simulated aio we wake the aio handler threads only after
 	queuing all aio requests, in native aio the following call does
@@ -863,7 +881,7 @@
 		buf_read_page_low(&err, sync && (i + 1 == n_stored),
 				  BUF_READ_ANY_PAGE, space_ids[i],
 				  zip_size, TRUE, space_versions[i],
-				  page_nos[i], NULL);
+				  page_nos[i], NULL, FALSE);
 
 		if (UNIV_UNLIKELY(err == DB_TABLESPACE_DELETED)) {
 tablespace_deleted:
@@ -1003,15 +1021,20 @@
 		if ((i + 1 == n_stored) && sync) {
 			buf_read_page_low(&err, true, BUF_READ_ANY_PAGE, space,
 					  zip_size, TRUE, tablespace_version,
-					  page_nos[i], NULL);
+					  page_nos[i], NULL, FALSE);
 		} else {
 			buf_read_page_low(&err, false, BUF_READ_ANY_PAGE
 					  | OS_AIO_SIMULATED_WAKE_LATER,
 					  space, zip_size, TRUE,
-					  tablespace_version, page_nos[i], NULL);
+					  tablespace_version, page_nos[i], NULL,
+					  FALSE);
 		}
 	}
 
+#ifdef LINUX_NATIVE_AIO
+	ut_a(os_aio_linux_dispatch_read_array_submit());
+#endif
+
 	os_aio_simulated_wake_handler_threads();
 
 #ifdef UNIV_DEBUG
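The buf0rea.cc changes thread a `should_buffer` flag down to the AIO layer: requests issued with the flag set are held back, and the caller must later call os_aio_linux_dispatch_read_array_submit to send them in one batch. The contract can be sketched with a small in-memory queue, shown below. This is only an illustration of the buffer-then-submit pattern; `ReadBatch` and its members are hypothetical, and the internals of the real submit function are not part of this section of the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a read-request queue with optional buffering.
class ReadBatch {
public:
    // Issue a read for page_no. With should_buffer set, the request is only
    // queued; otherwise it is dispatched immediately.
    // Returns the number of requests dispatched by this call (0 or 1).
    std::size_t request(unsigned long page_no, bool should_buffer) {
        if (should_buffer) {
            pending_.push_back(page_no);
            return 0;
        }
        dispatch(page_no);
        return 1;
    }

    // Dispatch everything that was buffered, in arrival order.
    // Forgetting this call leaves buffered requests unissued, which is the
    // pitfall the should_buffer comment in the patch warns about.
    std::size_t submit() {
        const std::size_t n = pending_.size();
        for (unsigned long page_no : pending_) {
            dispatch(page_no);
        }
        pending_.clear();
        return n;
    }

    std::size_t pending() const { return pending_.size(); }
    const std::vector<unsigned long>& dispatched() const { return dispatched_; }

private:
    // Stand-in for handing the request to the operating system.
    void dispatch(unsigned long page_no) { dispatched_.push_back(page_no); }

    std::vector<unsigned long> pending_;
    std::vector<unsigned long> dispatched_;
};
```

In the patch, the read-ahead loop around new line 786 passes TRUE and is followed by the submit call, while callers that pass FALSE keep their previous one-request-at-a-time behaviour.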
=== modified file 'storage/innobase/fil/fil0fil.cc'
--- storage/innobase/fil/fil0fil.cc 2014-03-05 11:54:14 +0000
+++ storage/innobase/fil/fil0fil.cc 2014-04-23 10:58:56 +0000
@@ -5168,7 +5168,7 @@
 		success = os_aio(OS_FILE_WRITE, OS_AIO_SYNC,
 				 node->name, node->handle, buf,
 				 offset, page_size * n_pages,
-				 NULL, NULL, space_id, NULL);
+				 NULL, NULL, space_id, NULL, TRUE);
 #endif /* UNIV_HOTBACKUP */
 		if (success) {
 			os_has_said_disk_full = FALSE;
@@ -5545,7 +5545,12 @@
 				appropriately aligned */
 	void*	message,	/*!< in: message for aio handler if non-sync
 				aio used, else ignored */
-	trx_t*	trx)
+	trx_t*	trx,
+	ibool	should_buffer)	/*!< in: whether to buffer an aio request.
+				AIO read ahead uses this. If you plan to
+				use this parameter, make sure you remember
+				to call os_aio_linux_dispatch_read_array_submit
+				when you are read to commit all your requests.*/
 {
 	ulint		mode;
 	fil_space_t*	space;
@@ -5762,7 +5767,7 @@
 
 	/* Queue the aio request */
 	ret = os_aio(type, mode | wake_later, node->name, node->handle, buf,
-		     offset, len, node, message, space_id, trx);
+		     offset, len, node, message, space_id, trx, should_buffer);
 
 #else
 	/* In ibbackup do normal i/o, not aio */
=== modified file 'storage/innobase/handler/ha_innodb.cc'
--- storage/innobase/handler/ha_innodb.cc 2014-03-03 17:51:33 +0000
+++ storage/innobase/handler/ha_innodb.cc 2014-04-23 10:58:56 +0000
@@ -106,6 +106,11 @@
 #include "i_s.h"
 #include "xtradb_i_s.h"
 
+#ifdef TARGET_OS_LINUX
+#include <sys/syscall.h>
+#include <sys/ioctl.h>
+#endif /* TARGET_OS_LINUX */
+
 # ifndef MYSQL_PLUGIN_IMPORT
 #  define MYSQL_PLUGIN_IMPORT /* nothing */
 # endif /* MYSQL_PLUGIN_IMPORT */
@@ -634,6 +639,30 @@
   "Timeout in seconds an InnoDB transaction may wait for a lock before being rolled back. Values above 100000000 disable the timeout.",
   NULL, NULL, 50, 1, 1024 * 1024 * 1024, 0);
 
+static MYSQL_THDVAR_ULONG(lra_size, PLUGIN_VAR_OPCMDARG,
+  "The size (in MBs) of the total size of the pages that innodb will prefetch "
+  "while scanning a table during this session. This is meant to be used only "
+  "for table scans. The upper limit of this variable is 16384 which "
+  "corresponds to prefetching 16GB of data. When set to max, this algorithm "
+  "may use 100M memory.", NULL, NULL, 0, 0, 16384, 0);
+
+static MYSQL_THDVAR_ULONG(lra_n_node_recs_before_sleep, PLUGIN_VAR_OPCMDARG,
+  "innodb_lra_n_node_recs_before_sleep is the number of node pointer records "
+  "traversed while holding the index lock before releasing the index lock "
+  "and sleeping for a short period of time so that the other threads get a "
+  "chance to x-latch the index lock. innodb_lra_sleep is the sleep time in "
+  "milliseconds.",
+  NULL, NULL, 1024, 128, ULINT_MAX, 0);
+
+static MYSQL_THDVAR_ULONG(lra_sleep, PLUGIN_VAR_OPCMDARG,
+  "innodb_lra_n_node_recs_before_sleep is the number of node pointer records "
+  "traversed while holding the index lock before releasing the index lock "
+  "and sleeping for a short period of time so that the other threads get a "
+  "chance to x-latch the index lock. innodb_lra_sleep is the sleep time in "
+  "milliseconds.",
+  NULL, NULL, 50, 0, 1000, 0);
+
+
 static MYSQL_THDVAR_STR(ft_user_stopword_table,
   PLUGIN_VAR_OPCMDARG|PLUGIN_VAR_MEMALLOC,
   "User supplied stopword table name, effective in the session level.",
@@ -851,6 +880,14 @@
   (char*) &export_vars.innodb_x_lock_spin_rounds,	  SHOW_LONGLONG},
   {"x_lock_spin_waits",
   (char*) &export_vars.innodb_x_lock_spin_waits,	  SHOW_LONGLONG},
+  {"buffered_aio_submitted",
+  (char*) &export_vars.innodb_buffered_aio_submitted,	  SHOW_LONG},
+  {"logical_read_ahead_misses",
+  (char*) &export_vars.innodb_logical_read_ahead_misses,	  SHOW_LONG},
+  {"logical_read_ahead_prefetched",
+  (char*) &export_vars.innodb_logical_read_ahead_prefetched,	  SHOW_LONG},
+  {"logical_read_ahead_in_buf_pool",
+  (char*) &export_vars.innodb_logical_read_ahead_in_buf_pool,	  SHOW_LONG},
   {NullS, NullS, SHOW_LONG}
 };
 
@@ -2294,6 +2331,10 @@
 		thd, OPTION_RELAXED_UNIQUE_CHECKS);
 
 	trx->fake_changes = THDVAR(thd, fake_changes);
+	trx_lra_reset(trx,
+		      THDVAR(thd, lra_size),
+		      THDVAR(thd, lra_n_node_recs_before_sleep),
+		      THDVAR(thd, lra_sleep));
 
 #ifdef EXTENDED_SLOWLOG
 	if (thd_log_slow_verbosity(thd) & (1ULL << SLOG_V_INNODB)) {
@@ -2326,6 +2367,10 @@
 	trx = trx_allocate_for_mysql();
 
 	trx->mysql_thd = thd;
+	trx_lra_reset(trx,
+		      THDVAR(thd, lra_size),
+		      THDVAR(thd, lra_n_node_recs_before_sleep),
+		      THDVAR(thd, lra_sleep));
 
 	innobase_trx_init(thd, trx);
 
@@ -3860,6 +3905,7 @@
 /*================*/
 	trx_t*	trx)	/*!< in: transaction handle */
 {
+	trx_lra_reset(trx, 0, 0, 0);
 	if (trx_is_started(trx)) {
 
 		trx_commit_for_mysql(trx);
@@ -17573,6 +17619,17 @@
   "It is to create artificially the situation the purge view have been updated "
   "but the each purges were not done yet.",
   NULL, NULL, FALSE);
+
+#ifdef UNIV_DEBUG
+extern my_bool row_lra_test;
+#endif
+
+static MYSQL_SYSVAR_BOOL(lra_test, row_lra_test,
+  PLUGIN_VAR_NOCMDARG,
+  "When set to true, the purge thread stops until the logical read ahead "
+  "sets this variable to TRUE. Used for testing edge cases regarding the "
+  "purge thread and logical read ahead.",
+  NULL, NULL, FALSE);
 #endif /* UNIV_DEBUG */
 
 const char *corrupt_table_action_names[]=
@@ -17789,10 +17846,14 @@
   MYSQL_SYSVAR(trx_rseg_n_slots_debug),
   MYSQL_SYSVAR(limit_optimistic_insert_debug),
   MYSQL_SYSVAR(trx_purge_view_update_only_debug),
+  MYSQL_SYSVAR(lra_test),
 #endif /* UNIV_DEBUG */
   MYSQL_SYSVAR(corrupt_table_action),
   MYSQL_SYSVAR(fake_changes),
   MYSQL_SYSVAR(locking_fake_changes),
+  MYSQL_SYSVAR(lra_size),
+  MYSQL_SYSVAR(lra_n_node_recs_before_sleep),
+  MYSQL_SYSVAR(lra_sleep),
   NULL
 };
 
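The three session variables registered in ha_innodb.cc describe a prefetch budget in megabytes (innodb_lra_size) and a throttle (innodb_lra_n_node_recs_before_sleep, innodb_lra_sleep). The arithmetic they imply can be sketched as below. Both helper names are hypothetical, the 16 KiB page size is an assumption (the InnoDB default), and whether the patch pauses after every full batch exactly as modelled here is not shown in this section.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many pages of page_size bytes fit into a
// prefetch budget given in megabytes.
inline std::uint64_t lra_size_to_pages(std::uint64_t size_mb,
                                       std::uint64_t page_size) {
    assert(page_size > 0);
    return size_mb * 1024 * 1024 / page_size;
}

// Hypothetical throttle plan for a scan over node pointer records.
struct PausePlan {
    std::uint64_t pauses;          // times the index lock would be released
    std::uint64_t total_sleep_ms;  // accumulated sleep time
};

// Assumption: one pause after each full batch of recs_before_sleep records.
// A value of 0 (as passed by trx_lra_reset(trx, 0, 0, 0)) means no throttle.
inline PausePlan plan_pauses(std::uint64_t total_recs,
                             std::uint64_t recs_before_sleep,
                             std::uint64_t sleep_ms) {
    if (recs_before_sleep == 0) {
        return {0, 0};
    }
    const std::uint64_t pauses = total_recs / recs_before_sleep;
    return {pauses, pauses * sleep_ms};
}
```

Under these assumptions the maximum innodb_lra_size of 16384 MB corresponds to 1048576 pages of 16 KiB, which is the 16GB mentioned in the variable description, and a scan over 4096 node pointer records with the defaults (1024 records, 50 ms) would pause four times for 200 ms in total.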
=== modified file 'storage/innobase/include/btr0pcur.h'
--- storage/innobase/include/btr0pcur.h 2014-02-17 11:12:40 +0000
+++ storage/innobase/include/btr0pcur.h 2014-04-23 10:58:56 +0000
@@ -262,11 +262,12 @@
 /*===========================*/
 	ulint		latch_mode,	/*!< in: BTR_SEARCH_LEAF, ... */
 	btr_pcur_t*	cursor,		/*!< in: detached persistent cursor */
+	ulint		level,
 	const char*	file,		/*!< in: file name */
 	ulint		line,		/*!< in: line where called */
 	mtr_t*		mtr);		/*!< in: mtr */
-#define btr_pcur_restore_position(l,cur,mtr)				\
-	btr_pcur_restore_position_func(l,cur,__FILE__,__LINE__,mtr)
+#define btr_pcur_restore_position(l, cur, mtr)				\
+	btr_pcur_restore_position_func(l, cur, 0, __FILE__, __LINE__, mtr)
 /*********************************************************//**
 Gets the rel_pos field for a cursor whose position has been stored.
 @return	BTR_PCUR_ON, ... */
=== modified file 'storage/innobase/include/btr0pcur.ic'
--- storage/innobase/include/btr0pcur.ic 2014-02-17 11:12:40 +0000
+++ storage/innobase/include/btr0pcur.ic 2014-04-23 10:58:56 +0000
@@ -448,6 +448,54 @@
 cursor. */
 UNIV_INLINE
 void
+btr_pcur_open_with_no_init_func_low(
+/*============================*/
+	dict_index_t*	index,	/*!< in: index */
+	const dtuple_t*	tuple,	/*!< in: tuple on which search done */
+	ulint		mode,	/*!< in: PAGE_CUR_L, ...;
+				NOTE that if the search is made using a unique
+				prefix of a record, mode should be
+				PAGE_CUR_LE, not PAGE_CUR_GE, as the latter
+				may end up on the previous page of the
+				record! */
+	ulint		latch_mode,/*!< in: BTR_SEARCH_LEAF, ...;
+				NOTE that if has_search_latch != 0 then
+				we maybe do not acquire a latch on the cursor
+				page, but assume that the caller uses his
+				btr search latch to protect the record! */
+	btr_pcur_t*	cursor,	/*!< in: memory buffer for persistent cursor */
+	ulint		level,
+	ulint		has_search_latch,/*!< in: latch mode the caller
+				currently has on btr_search_latch:
+				RW_S_LATCH, or 0 */
+	const char*	file,	/*!< in: file name */
+	ulint		line,	/*!< in: line where called */
+	mtr_t*		mtr)	/*!< in: mtr */
+{
+	btr_cur_t*	btr_cursor;
+
+	cursor->latch_mode = latch_mode;
+	cursor->search_mode = mode;
+
+	/* Search with the tree cursor */
+
+	btr_cursor = btr_pcur_get_btr_cur(cursor);
+
+	btr_cur_search_to_nth_level(index, level, tuple, mode, latch_mode,
+				    btr_cursor, has_search_latch,
+				    file, line, mtr);
+	cursor->pos_state = BTR_PCUR_IS_POSITIONED;
+
+	cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
+
+	cursor->trx_if_known = NULL;
+}
+
+/**************************************************************//**
+Opens an persistent cursor to an index tree without initializing the
+cursor. */
+UNIV_INLINE
+void
 btr_pcur_open_with_no_init_func(
 /*============================*/
 	dict_index_t*	index,	/*!< in: index */
@@ -471,23 +519,9 @@
 	ulint		line,	/*!< in: line where called */
 	mtr_t*		mtr)	/*!< in: mtr */
 {
-	btr_cur_t*	btr_cursor;
-
-	cursor->latch_mode = latch_mode;
-	cursor->search_mode = mode;
-
-	/* Search with the tree cursor */
-
-	btr_cursor = btr_pcur_get_btr_cur(cursor);
-
-	btr_cur_search_to_nth_level(index, 0, tuple, mode, latch_mode,
-				    btr_cursor, has_search_latch,
-				    file, line, mtr);
-	cursor->pos_state = BTR_PCUR_IS_POSITIONED;
-
-	cursor->old_stored = BTR_PCUR_OLD_NOT_STORED;
-
-	cursor->trx_if_known = NULL;
+	return btr_pcur_open_with_no_init_func_low(
+		index, tuple, mode, latch_mode, cursor,
+		0, has_search_latch, file, line, mtr);
 }
 
 /*****************************************************************//**
=== modified file 'storage/innobase/include/buf0rea.h'
--- storage/innobase/include/buf0rea.h 2013-10-23 08:48:28 +0000
+++ storage/innobase/include/buf0rea.h 2014-04-23 10:58:56 +0000
@@ -30,6 +30,43 @@
 #include "buf0types.h"
 
 /********************************************************************//**
+Low-level function which reads a page asynchronously from a file to the
+buffer buf_pool if it is not already there, in which case does nothing.
+Sets the io_fix flag and sets an exclusive lock on the buffer frame. The
+flag is cleared and the x-lock released by an i/o-handler thread.
+@return 1 if a read request was queued, 0 if the page already resided
+in buf_pool, or if the page is in the doublewrite buffer blocks in
+which case it is never read into the pool, or if the tablespace does
+not exist or is being dropped
+@return 1 if read request is issued. 0 if it is not */
+UNIV_INTERN
+ulint
+buf_read_page_low(
+/*==============*/
+	dberr_t*	err,	/*!< out: DB_SUCCESS or DB_TABLESPACE_DELETED
+			if we are trying to read from a non-existent
+			tablespace, or a tablespace which is just now being
+			dropped */
+	bool	sync,	/*!< in: TRUE if synchronous aio is desired */
+	ulint	mode,	/*!< in: BUF_READ_IBUF_PAGES_ONLY, ...,
+			ORed to OS_AIO_SIMULATED_WAKE_LATER (see below
+			at read-ahead functions) */
+	ulint	space,	/*!< in: space id */
+	ulint	zip_size,/*!< in: compressed page size, or 0 */
+	ibool	unzip,	/*!< in: TRUE=request uncompressed page */
+	ib_int64_t tablespace_version, /*!< in: if the space memory object has
+			this timestamp different from what we are giving here,
+			treat the tablespace as dropped; this is a timestamp
+			we use to stop dangling page reads from a tablespace
+			which we have DISCARDed + IMPORTed back */
+	ulint	offset,	/*!< in: page number */
+	trx_t*	trx,	/*!< in: transaction object */
+	ibool	should_buffer); /*!< in: whether to buffer an aio request.
+			AIO read ahead uses this. If you plan to
+			use this parameter, make sure you remember
+			to call os_aio_linux_dispatch_read_array_submit
+			when you are read to commit all your requests.*/
+/********************************************************************//**
 High-level function which reads a page asynchronously from a file to the
 buffer buf_pool if it is not already there. Sets the io_fix flag and sets
 an exclusive lock on the buffer frame. The flag is cleared and the x-lock
=== modified file 'storage/innobase/include/fil0fil.h'
--- storage/innobase/include/fil0fil.h 2014-02-17 11:12:40 +0000
+++ storage/innobase/include/fil0fil.h 2014-04-23 10:58:56 +0000
@@ -724,7 +724,7 @@
 @return DB_SUCCESS, or DB_TABLESPACE_DELETED if we are trying to do
 i/o on a tablespace which does not exist */
 #define fil_io(type, sync, space_id, zip_size, block_offset, byte_offset, len, buf, message) \
-	_fil_io(type, sync, space_id, zip_size, block_offset, byte_offset, len, buf, message, NULL)
+	_fil_io(type, sync, space_id, zip_size, block_offset, byte_offset, len, buf, message, NULL, FALSE)
 
 UNIV_INTERN
 dberr_t
@@ -755,7 +755,10 @@
 				appropriately aligned */
 	void*	message,	/*!< in: message for aio handler if non-sync
 				aio used, else ignored */
-	trx_t*	trx)
+	trx_t*	trx,
+	ibool	should_buffer	/*!< in: whether to buffer an aio request.
+				Only used by aio read ahead*/
+)
 	__attribute__((nonnull(8)));
 /**********************************************************************//**
 Waits for an aio operation to complete. This function is used to write the
=== modified file 'storage/innobase/include/os0file.h'
--- storage/innobase/include/os0file.h 2014-02-17 11:12:40 +0000
+++ storage/innobase/include/os0file.h 2014-04-23 10:58:56 +0000
@@ -321,10 +321,11 @@
 	pfs_os_file_close_func(file, __FILE__, __LINE__)
 
 # define os_aio(type, mode, name, file, buf, offset,			\
-	n, message1, message2, space_id, trx)				\
+	n, message1, message2, space_id, trx,				\
+	should_buffer)							\
 	pfs_os_aio_func(type, mode, name, file, buf, offset,		\
 		n, message1, message2, space_id, trx,			\
-		__FILE__, __LINE__)
+		__FILE__, __LINE__, should_buffer)
 
 # define os_file_read(file, buf, offset, n)				\
 	pfs_os_file_read_func(file, buf, offset, n, NULL,		\
@@ -371,9 +372,9 @@
 # define os_file_close(file)	os_file_close_func(file)
 
 # define os_aio(type, mode, name, file, buf, offset, n, message1,	\
-	       message2, space_id, trx)					\
+	       message2, space_id, trx, should_buffer)			\
 	os_aio_func(type, mode, name, file, buf, offset, n,		\
-		message1, message2, space_id, trx)
+		message1, message2, space_id, trx, should_buffer)
 
 # define os_file_read(file, buf, offset, n)				\
 	os_file_read_func(file, buf, offset, n, NULL)
@@ -777,7 +778,13 @@
777 ulint space_id,778 ulint space_id,
778 trx_t* trx,779 trx_t* trx,
779 const char* src_file,/*!< in: file name where func invoked */780 const char* src_file,/*!< in: file name where func invoked */
780 ulint src_line);/*!< in: line where the func invoked */781 ulint src_line,/*!< in: line where the func invoked */
782 ibool should_buffer);
783 /*!< in: Whether to buffer an aio request.
784 AIO read ahead uses this. If you plan to
785 use this parameter, make sure you remember
786 to call os_aio_linux_dispatch_read_array_submit
787 when you are ready to commit all your requests.*/
781/*******************************************************************//**788/*******************************************************************//**
782NOTE! Please use the corresponding macro os_file_write(), not directly789NOTE! Please use the corresponding macro os_file_write(), not directly
783this function!790this function!
@@ -1148,7 +1155,12 @@
1148 aio operation); ignored if mode is1155 aio operation); ignored if mode is
1149 OS_AIO_SYNC */1156 OS_AIO_SYNC */
1150 ulint space_id,1157 ulint space_id,
1151 trx_t* trx);1158 trx_t* trx,
1159 ibool should_buffer); /*!< in: Whether to buffer an aio request.
1160 AIO read ahead uses this. If you plan to
1161 use this parameter, make sure you remember
1162 to call os_aio_linux_dispatch_read_array_submit
1163 when you are ready to commit all your requests.*/
1152/************************************************************************//**1164/************************************************************************//**
1153Wakes up all async i/o threads so that they know to exit themselves in1165Wakes up all async i/o threads so that they know to exit themselves in
1154shutdown. */1166shutdown. */
@@ -1315,6 +1327,12 @@
1315 restart the operation. */1327 restart the operation. */
1316 ulint* type, /*!< out: OS_FILE_WRITE or ..._READ */1328 ulint* type, /*!< out: OS_FILE_WRITE or ..._READ */
1317 ulint* space_id);1329 ulint* space_id);
1330/*******************************************************************//**
1331Submit the buffered AIO read requests on all segments to the kernel.
1332@return TRUE on success. */
1333UNIV_INTERN
1334ibool
1335os_aio_linux_dispatch_read_array_submit();
1318#endif /* LINUX_NATIVE_AIO */1336#endif /* LINUX_NATIVE_AIO */
13191337
1320#ifndef UNIV_NONINL1338#ifndef UNIV_NONINL
13211339
=== modified file 'storage/innobase/include/os0file.ic'
--- storage/innobase/include/os0file.ic 2013-10-23 08:48:28 +0000
+++ storage/innobase/include/os0file.ic 2014-04-23 10:58:56 +0000
@@ -213,7 +213,10 @@
213 ulint space_id,213 ulint space_id,
214 trx_t* trx,214 trx_t* trx,
215 const char* src_file,/*!< in: file name where func invoked */215 const char* src_file,/*!< in: file name where func invoked */
216 ulint src_line)/*!< in: line where the func invoked */216 ulint src_line,/*!< in: line where the func invoked */
217 ibool should_buffer)
218 /*!< in: whether to buffer an aio request.
219 Only used by aio read ahead*/
217{220{
218 ibool result;221 ibool result;
219 struct PSI_file_locker* locker = NULL;222 struct PSI_file_locker* locker = NULL;
@@ -227,7 +230,8 @@
227 src_file, src_line);230 src_file, src_line);
228231
229 result = os_aio_func(type, mode, name, file, buf, offset,232 result = os_aio_func(type, mode, name, file, buf, offset,
230 n, message1, message2, space_id, trx);233 n, message1, message2, space_id, trx,
234 should_buffer);
231235
232 register_pfs_file_io_end(locker, n);236 register_pfs_file_io_end(locker, n);
233237
234238
=== modified file 'storage/innobase/include/srv0srv.h'
--- storage/innobase/include/srv0srv.h 2014-02-17 11:12:40 +0000
+++ storage/innobase/include/srv0srv.h 2014-04-23 10:58:56 +0000
@@ -129,6 +129,23 @@
129 ulint_ctr_1_t lock_deadlock_count;129 ulint_ctr_1_t lock_deadlock_count;
130130
131 ulint_ctr_1_t n_lock_max_wait_time;131 ulint_ctr_1_t n_lock_max_wait_time;
132
133 /** Number of buffered aio requests submitted */
134 ulint_ctr_64_t n_aio_submitted;
135
136 /** total number of pages that logical-read-ahead missed while doing
137 a table scan. The number is the total for all transactions that used a
138 non-zero innodb_lra_size. */
139 ulint_ctr_64_t n_logical_read_ahead_misses;
140 /** total number of pages that logical-read-ahead prefetched. The
141 number is the total for all transactions that used a non-zero
142 innodb_lra_size. */
143 ulint_ctr_64_t n_logical_read_ahead_prefetched;
144 /** total number of pages that logical-read-ahead did not need to
145 prefetch because these pages were already in the buffer pool. The
146 number is the total for all transactions that used a non-zero
147 innodb_lra_size. */
148 ulint_ctr_64_t n_logical_read_ahead_in_buf_pool;
132};149};
133150
134extern const char* srv_main_thread_op_info;151extern const char* srv_main_thread_op_info;
@@ -1060,6 +1077,31 @@
1060 ulint innodb_purge_view_trx_id_age; /*!< rw_max_trx_id1077 ulint innodb_purge_view_trx_id_age; /*!< rw_max_trx_id
1061 - purged view's min trx_id */1078 - purged view's min trx_id */
1062#endif /* UNIV_DEBUG */1079#endif /* UNIV_DEBUG */
1080 ulint innodb_buffered_aio_submitted;
1081 ulint innodb_logical_read_ahead_misses; /*!< total number of pages that
1082 logical-read-ahead missed
1083 during a table scan.
1084 The number is the total for all
1085 the transactions that used a
1086 non-zero
1087 innodb_lra_size.
1088 */
1089 ulint innodb_logical_read_ahead_prefetched; /*!< total number of pages
1090 that logical-read-ahead
1091 prefetched. The number is the
1092 total for all the transactions
1093 that used a non-zero
1094 innodb_lra_size.
1095 */
1096 ulint innodb_logical_read_ahead_in_buf_pool; /*!< total number of pages
1097 that logical-read-ahead did not
1098 need to prefetch because these
1099 pages were already in the
1100 buffer pool. The number is the
1101 total for all transactions that
1102 used a non-zero
1103 innodb_lra_size.
1104 */
1063};1105};
10641106
1065/** Thread slot in the thread table. */1107/** Thread slot in the thread table. */
10661108
=== modified file 'storage/innobase/include/trx0trx.h'
--- storage/innobase/include/trx0trx.h 2014-02-17 11:12:40 +0000
+++ storage/innobase/include/trx0trx.h 2014-04-23 10:58:56 +0000
@@ -39,6 +39,13 @@
39#include "trx0xa.h"39#include "trx0xa.h"
40#include "ut0vec.h"40#include "ut0vec.h"
41#include "fts0fts.h"41#include "fts0fts.h"
42#include "btr0types.h"
43
44#ifdef TARGET_OS_LINUX
45#include <sys/syscall.h>
46#include <sys/ioctl.h>
47#endif /* TARGET_OS_LINUX */
48
4249
43/** Dummy session used currently in MySQL interface */50/** Dummy session used currently in MySQL interface */
44extern sess_t* trx_dummy_sess;51extern sess_t* trx_dummy_sess;
@@ -135,7 +142,27 @@
135#define trx_start_if_not_started_xa(t) \142#define trx_start_if_not_started_xa(t) \
136 trx_start_if_not_started_xa_low((t))143 trx_start_if_not_started_xa_low((t))
137#endif /* UNIV_DEBUG */144#endif /* UNIV_DEBUG */
138145/*************************************************************//**
146Creates or frees data structures related to logical-read-ahead,
147based on the value of lra_size. */
148UNIV_INTERN
149void
150trx_lra_reset(
151 trx_t* trx, /*!< in: transaction */
152 ulint lra_size, /*!< in: lra_size in MB.
153 If 0, the fields that are related
154 to logical-read-ahead will be free'd
155 if they were initialized. */
156 ulint lra_n_node_recs_before_sleep,
157 /*!< in: lra_n_node_recs_before_sleep
158 is the number of node pointer records
159 traversed while holding the index lock
160 before releasing the index lock and
161 sleeping for a short period of time so
162 that the other threads get a chance to
163 x-latch the index lock. */
164 ulint lra_sleep); /* lra_sleep is the sleep time in
165 milliseconds. */
139/*************************************************************//**166/*************************************************************//**
140Starts the transaction if it is not yet started. */167Starts the transaction if it is not yet started. */
141UNIV_INTERN168UNIV_INTERN
@@ -650,6 +677,15 @@
650677
651#define TRX_MAGIC_N 91118598678#define TRX_MAGIC_N 91118598
652679
680/*******************************************************************//**
681Helper data structure to store page numbers in an internally-linked hash
682table. */
683typedef struct page_no_holder_struct page_no_holder_t;
684struct page_no_holder_struct {
685 ulint page_no;
686 page_no_holder_t* hash;
687};
688
653/** The transaction handle689/** The transaction handle
654690
655Normally, there is a 1:1 relationship between a transaction handle691Normally, there is a 1:1 relationship between a transaction handle
@@ -804,6 +840,65 @@
804 150 bytes in the undo log size as then840 150 bytes in the undo log size as then
805 we skip XA steps */841 we skip XA steps */
806 ulint fake_changes;842 ulint fake_changes;
843 ulint lra_size; /* Total size (in MBs) of the
844 pages that will be prefetched by
845 logical read ahead. */
846 ulint lra_n_pages; /* Number of pages that lra prefetches
847 every time. This is computed using
848 lra_size and the currently scanned
849 table's block size */
850 ulint lra_space_id; /* The last space id that the scanning
851 transaction accessed. If the scanning
852 trx accesses multiple tables, we need
853 to reset the data structures that lra
854 uses. */
855 ulint lra_page_no; /* The last page that was visited
856 by the trx. Used by the
857 logical-read-ahead algorithm to
858 determine if a new prefetch should be
859 performed. */
860 hash_table_t* lra_ht1;
861 hash_table_t* lra_ht2; /* Hash tables store the leaf page
862 numbers for the already prefetched
863 pages. Each hash table will typically
864 have lra_n_pages pages and when the
865 scanning trx visits all lra_n_pages
866 pages in one of them, we will empty
867 that one and prefetch another batch of
868 lra_n_pages pages. */
869 hash_table_t* lra_ht; /* lra_ht points to lra_ht1 and lra_ht2
870 alternately. */
871 ulint lra_n_pages_since;/* number of leaf pages visited since
872 the last prefetch operation. We require
873 that no prefetch be done until the
874 scanning trx scans lra_n_pages pages.
875 */
876 ulint* lra_sort_arr; /* Array used for sorting the page
877 numbers before issuing the read
878 requests */
879 page_no_holder_t* lra_arr1; /* Pre-allocated array of
880 page_no_holder objects which are used
881 by the logical-read-ahead algorithm for
882 lra_ht1. */
883 page_no_holder_t* lra_arr2; /* Pre-allocated array of
884 page_no_holder objects which are used
885 by the logical-read-ahead algorithm for
886 lra_ht2. */
887 btr_pcur_t* lra_cur; /* The persistent cursor that points
888 to the first node pointer record for
889 which the associated leaf page is not
890 prefetched by LRA. */
891 ulint lra_n_node_recs_before_sleep;
892 /* lra_n_node_recs_before_sleep
893 is the number of node pointer records
894 traversed while holding the index lock
895 before releasing the index lock and
896 sleeping for a short period of time so
897 that the other threads get a chance to
898 x-latch the index lock. */
899 ulint lra_sleep; /* lra_sleep is the sleep time in
900 milliseconds. */
901 ulint lra_tree_height;
807 ulint flush_log_later;/* In 2PC, we hold the902 ulint flush_log_later;/* In 2PC, we hold the
808 prepare_commit mutex across903 prepare_commit mutex across
809 both phases. In that case, we904 both phases. In that case, we
810905
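Review note: the lra_ht1/lra_ht2 pair described in the trx_t comments above can be pictured with the sketch below. It is a simplified model, not the patch's code: LraTracker and its members are hypothetical names, and std::unordered_set stands in for InnoDB's hash_table_t. It assumes, as the comments state, that each refill empties the current table, loads the next batch of leaf page numbers, and then switches to the other table, while a lookup probes both tables.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical model of the two alternating page-number tables.
struct LraTracker {
    std::unordered_set<unsigned long> ht1;
    std::unordered_set<unsigned long> ht2;
    bool current_is_first = true;  // models trx->lra_ht pointing at lra_ht1

    // Empty the current table, fill it with the next batch, then switch
    // to the other table so the older batch is the next one replaced.
    void refill(const std::vector<unsigned long>& batch) {
        std::unordered_set<unsigned long>& cur = current_is_first ? ht1 : ht2;
        cur.clear();
        cur.insert(batch.begin(), batch.end());
        current_is_first = !current_is_first;
    }

    // A page counts as prefetched if either table still holds it.
    bool is_prefetched(unsigned long page_no) const {
        return ht1.count(page_no) > 0 || ht2.count(page_no) > 0;
    }
};
```

Keeping two batches alive means the pages of the previous batch are still recognised as prefetched while the scan works through them, and only the older batch is discarded on each refill.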
=== modified file 'storage/innobase/os/os0file.cc'
--- storage/innobase/os/os0file.cc 2014-03-03 17:51:33 +0000
+++ storage/innobase/os/os0file.cc 2014-04-23 10:58:56 +0000
@@ -245,6 +245,16 @@
245 There is one such event for each245 There is one such event for each
246 possible pending IO. The size of the246 possible pending IO. The size of the
247 array is equal to n_slots. */247 array is equal to n_slots. */
248 struct iocb** pending;
249 /* Array to buffer the not-submitted aio
250 requests. The array length is n_slots.
251 It is divided into n_segments segments.
252 pending requests on each segment are buffered
253 separately.*/
254 ulint* count;
255 /* Array of length n_segments. Each element
256 counts the number of not-submitted aio requests
257 on that segment.*/
248#endif /* LINUX_NATIV_AIO */258#endif /* LINUX_NATIV_AIO */
249};259};
250260
@@ -3926,6 +3936,13 @@
3926 memset(io_event, 0x0, sizeof(*io_event) * n);3936 memset(io_event, 0x0, sizeof(*io_event) * n);
3927 array->aio_events = io_event;3937 array->aio_events = io_event;
39283938
3939 array->pending = static_cast<struct iocb**>(
3940 ut_malloc(n * sizeof(struct iocb*)));
3941 memset(array->pending, 0x0, sizeof(struct iocb*) * n);
3942 array->count = static_cast<ulint*>(
3943 ut_malloc(n_segments * sizeof(ulint)));
3944 memset(array->count, 0x0, sizeof(ulint) * n_segments);
3945
3929skip_native_aio:3946skip_native_aio:
3930#endif /* LINUX_NATIVE_AIO */3947#endif /* LINUX_NATIVE_AIO */
3931 for (ulint i = 0; i < n; i++) {3948 for (ulint i = 0; i < n; i++) {
@@ -3982,6 +3999,8 @@
3982 if (srv_use_native_aio) {3999 if (srv_use_native_aio) {
3983 ut_free(array->aio_events);4000 ut_free(array->aio_events);
3984 ut_free(array->aio_ctx);4001 ut_free(array->aio_ctx);
4002 ut_free(array->pending);
4003 ut_free(array->count);
3985 }4004 }
3986#endif /* LINUX_NATIVE_AIO */4005#endif /* LINUX_NATIVE_AIO */
39874006
@@ -4605,6 +4624,49 @@
46054624
4606#if defined(LINUX_NATIVE_AIO)4625#if defined(LINUX_NATIVE_AIO)
4607/*******************************************************************//**4626/*******************************************************************//**
4627Submit the buffered AIO read requests on all segments to the kernel.
4628@return TRUE on success. */
4629UNIV_INTERN
4630ibool
4631os_aio_linux_dispatch_read_array_submit()
4632{
4633 os_aio_array_t* array = os_aio_read_array;
4634 ulint total_submitted = 0;
4635 ulint total_count = 0;
4636 if (!srv_use_native_aio) {
4637 return TRUE;
4638 }
4639 os_mutex_enter(array->mutex);
4640 /* Submit aio requests buffered on all segments. */
4641 for (ulint i = 0; i < array->n_segments; i++) {
4642 ulint count = array->count[i];
4643 if (count > 0) {
4644 ulint iocb_index = i * array->n_slots
4645 / array->n_segments;
4646 total_count += count;
4647 total_submitted += io_submit(array->aio_ctx[i], count,
4648 &(array->pending[iocb_index]));
4649 }
4650 }
4651 /* Reset the aio request buffer. */
4652 memset(array->pending, 0x0,
4653 sizeof(struct iocb*) * array->n_slots);
4654 memset(array->count, 0x0, sizeof(ulint) * array->n_segments);
4655 os_mutex_exit(array->mutex);
4656
4657 srv_stats.n_aio_submitted.add(total_count);
4658
4659 /* io_submit returns number of successfully
4660 queued requests or -errno. */
4661 if (UNIV_UNLIKELY(total_count != total_submitted)) {
4662 errno = -total_submitted;
4663 return(FALSE);
4664 }
4665
4666 return(TRUE);
4667}
4668
4669/*******************************************************************//**
4608Dispatch an AIO request to the kernel.4670Dispatch an AIO request to the kernel.
4609@return TRUE on success. */4671@return TRUE on success. */
4610static4672static
@@ -4612,24 +4674,46 @@
4612os_aio_linux_dispatch(4674os_aio_linux_dispatch(
4613/*==================*/4675/*==================*/
4614 os_aio_array_t* array, /*!< in: io request array. */4676 os_aio_array_t* array, /*!< in: io request array. */
4615 os_aio_slot_t* slot) /*!< in: an already reserved slot. */4677 os_aio_slot_t* slot, /*!< in: an already reserved slot. */
4678 ibool should_buffer) /*!< in: should buffer the request
4679 rather than submit. */
4616{4680{
4617 int ret;4681 int ret;
4618 ulint io_ctx_index;4682 ulint io_ctx_index = 0;
4619 struct iocb* iocb;4683 struct iocb* iocb;
4684 ulint slots_per_segment;
46204685
4621 ut_ad(slot != NULL);4686 ut_ad(slot);
4622 ut_ad(array);4687 ut_ad(array);
4623
4624 ut_a(slot->reserved);4688 ut_a(slot->reserved);
46254689
4626 /* Find out what we are going to work with.4690 /* Find out what we are going to work with.
4627 The iocb struct is directly in the slot.4691 The iocb struct is directly in the slot.
4628 The io_context is one per segment. */4692 The io_context is one per segment. */
46294693
4694 slots_per_segment = array->n_slots / array->n_segments;
4630 iocb = &slot->control;4695 iocb = &slot->control;
4631 io_ctx_index = (slot->pos * array->n_segments) / array->n_slots;4696 io_ctx_index = slot->pos / slots_per_segment;
46324697 if (should_buffer) {
4698 ulint n;
4699 os_mutex_enter(array->mutex);
4700 /* There are array->n_slots elements in array->pending,
4701 which is divided into array->n_segments area of equal size.
4702 The iocb of each segment are buffered in its corresponding area
4703 in the pending array consecutively as they come.
4704 array->count[i] records the number of buffered aio requests
4705 in the ith segment.*/
4706 n = io_ctx_index * slots_per_segment
4707 + array->count[io_ctx_index];
4708 array->pending[n] = iocb;
4709 array->count[io_ctx_index] ++;
4710 os_mutex_exit(array->mutex);
4711 if (array->count[io_ctx_index] == slots_per_segment) {
4712 return os_aio_linux_dispatch_read_array_submit();
4713 }
4714 return(TRUE);
4715 }
4716 /* Submit the given request. */
4633 ret = io_submit(array->aio_ctx[io_ctx_index], 1, &iocb);4717 ret = io_submit(array->aio_ctx[io_ctx_index], 1, &iocb);
46344718
4635#if defined(UNIV_AIO_DEBUG)4719#if defined(UNIV_AIO_DEBUG)
@@ -4689,7 +4773,12 @@
4689 aio operation); ignored if mode is4773 aio operation); ignored if mode is
4690 OS_AIO_SYNC */4774 OS_AIO_SYNC */
4691 ulint space_id,4775 ulint space_id,
4692 trx_t* trx)4776 trx_t* trx,
4777 ibool should_buffer) /*!< in: Whether to buffer an aio request.
4778 AIO read ahead uses this. If you plan to
4779 use this parameter, make sure you remember
4780 to call os_aio_linux_dispatch_read_array_submit
4781 when you are ready to commit all your requests.*/
4693{4782{
4694 os_aio_array_t* array;4783 os_aio_array_t* array;
4695 os_aio_slot_t* slot;4784 os_aio_slot_t* slot;
@@ -4802,7 +4891,8 @@
4802 &(slot->control));4891 &(slot->control));
48034892
4804#elif defined(LINUX_NATIVE_AIO)4893#elif defined(LINUX_NATIVE_AIO)
4805 if (!os_aio_linux_dispatch(array, slot)) {4894 if (!os_aio_linux_dispatch(array, slot,
4895 should_buffer)) {
4806 goto err_exit;4896 goto err_exit;
4807 }4897 }
4808#endif /* WIN_ASYNC_IO */4898#endif /* WIN_ASYNC_IO */
@@ -4822,7 +4912,7 @@
4822 &(slot->control));4912 &(slot->control));
48234913
4824#elif defined(LINUX_NATIVE_AIO)4914#elif defined(LINUX_NATIVE_AIO)
4825 if (!os_aio_linux_dispatch(array, slot)) {4915 if (!os_aio_linux_dispatch(array, slot, FALSE)) {
4826 goto err_exit;4916 goto err_exit;
4827 }4917 }
4828#endif /* WIN_ASYNC_IO */4918#endif /* WIN_ASYNC_IO */
48294919
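Review note: the index arithmetic used for the pending buffer in os_aio_linux_dispatch can be checked in isolation with the sketch below. It is a simplified model under stated assumptions: PendingBuffer is a hypothetical type, plain integers stand in for struct iocb pointers, and no mutex or io_submit call is modelled. It only shows how a slot position maps to a segment and to a position inside that segment's area of the pending array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the buffering fields added to os_aio_array_t.
struct PendingBuffer {
    std::size_t n_slots;
    std::size_t n_segments;
    std::vector<long> pending;       // models struct iocb* entries; -1 = empty
    std::vector<std::size_t> count;  // buffered requests per segment

    PendingBuffer(std::size_t slots, std::size_t segments)
        : n_slots(slots), n_segments(segments),
          pending(slots, -1), count(segments, 0) {}

    std::size_t slots_per_segment() const { return n_slots / n_segments; }

    // Buffer the request held in slot `slot_pos`; returns the index in
    // `pending` where it was stored (segment base + current fill level).
    std::size_t buffer(std::size_t slot_pos) {
        const std::size_t seg = slot_pos / slots_per_segment();
        const std::size_t n = seg * slots_per_segment() + count[seg];
        pending[n] = static_cast<long>(slot_pos);
        ++count[seg];
        return n;
    }

    // True once a segment's area is full and must be flushed.
    bool segment_full(std::size_t seg) const {
        return count[seg] == slots_per_segment();
    }

    // Models the submit step: report how many requests were buffered,
    // then reset the buffer for the next batch.
    std::size_t flush() {
        std::size_t total = 0;
        for (std::size_t c : count) total += c;
        pending.assign(n_slots, -1);
        count.assign(n_segments, 0);
        return total;
    }
};
```

With 8 slots and 2 segments, slot 5 belongs to segment 1 and lands at index 4, the first entry of that segment's area.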
=== modified file 'storage/innobase/row/row0purge.cc'
--- storage/innobase/row/row0purge.cc 2013-06-20 15:16:00 +0000
+++ storage/innobase/row/row0purge.cc 2014-04-23 10:58:56 +0000
@@ -187,6 +187,10 @@
187 return(success);187 return(success);
188}188}
189189
190#ifdef UNIV_DEBUG
191extern my_bool row_lra_test;
192#endif
193
190/***********************************************************//**194/***********************************************************//**
191Removes a clustered index record if it has not been modified after the delete195Removes a clustered index record if it has not been modified after the delete
192marking.196marking.
@@ -203,6 +207,11 @@
203 return(true);207 return(true);
204 }208 }
205209
210#ifdef UNIV_DEBUG
211 while (row_lra_test) {
212 os_thread_sleep(300000);
213 }
214#endif
206 for (ulint n_tries = 0;215 for (ulint n_tries = 0;
207 n_tries < BTR_CUR_RETRY_DELETE_N_TIMES;216 n_tries < BTR_CUR_RETRY_DELETE_N_TIMES;
208 n_tries++) {217 n_tries++) {
209218
=== modified file 'storage/innobase/row/row0sel.cc'
--- storage/innobase/row/row0sel.cc 2014-03-03 17:51:33 +0000
+++ storage/innobase/row/row0sel.cc 2014-04-23 10:58:56 +0000
@@ -60,6 +60,8 @@
60#include "srv0start.h"60#include "srv0start.h"
61#include "m_string.h" /* for my_sys.h */61#include "m_string.h" /* for my_sys.h */
62#include "my_sys.h" /* DEBUG_SYNC_C */62#include "my_sys.h" /* DEBUG_SYNC_C */
63#include "ut0sort.h"
64#include <algorithm>
6365
64#include "my_compare.h" /* enum icp_result */66#include "my_compare.h" /* enum icp_result */
6567
@@ -3632,6 +3634,318 @@
3632 return(result);3634 return(result);
3633}3635}
36343636
3637/**********************************************************************//**
3638Determines the page numbers for the next batch of pages that will be
3639prefetched for logical read ahead and stores them in the hash_table and
3640page_no_array. Does not issue read requests. */
3641static
3642void
3643row_read_ahead_logical_low(
3644 hash_table_t* hash_table, /* in/out: This hash table is emptied and
3645 then filled with the next batch of page
3646 numbers that should be prefetched. */
3647 ulint* n_prefetched_ptr, /* in/out: the number that's pointed by
3648 this pointer is incremented by the number
3649 of pages that are added to the hash_table */
3650 ulint* page_no_array, /* out: the page numbers that will be
3651 prefetched are stored in this array */
3652 dict_index_t* index, /* in: index object for the table */
3653 mtr_t* mtr, /* in: mini transaction object used for
3654 acquiring and releasing the necessary locks */
3655 ulint* offsets, /* in/out: temporary storage for offsets */
3656 mem_heap_t* heap, /* in: temporary memory heap */
3657 trx_t *trx)
3658{
3659 page_no_holder_t* page_no_holder;
3660 page_no_holder_t* lra_arr;
3661 ulint page_no;
3662 ulint n_prefetched = 0;
3663 rec_t* rec;
3664 /* empty the hash table because we don't want it to grow to hold
3665 all leaf page numbers of the table. The concern is not memory, but
3666 the lookup time. */
3667 hash_table_clear(hash_table);
3668 if (hash_table == trx->lra_ht1) {
3669 lra_arr = trx->lra_arr1;
3670 } else {
3671 lra_arr = trx->lra_arr2;
3672 }
3673 while (!btr_pcur_is_after_last_in_tree(trx->lra_cur, mtr)
3674 && n_prefetched < trx->lra_n_pages) {
3675 if (UNIV_UNLIKELY(trx_is_interrupted(trx))) {
3676 return;
3677 }
3678 rec = btr_pcur_get_rec(trx->lra_cur);
3679 if (page_rec_is_supremum(rec) || page_rec_is_infimum(rec)) {
3680 btr_pcur_move_to_next(trx->lra_cur, mtr);
3681 continue;
3682 }
3683 offsets = rec_get_offsets(rec, index, offsets,
3684 ULINT_UNDEFINED, &heap);
3685 page_no = btr_node_ptr_get_child_page_no(rec, offsets);
3686 page_no_holder = &lra_arr[n_prefetched];
3687 page_no_holder->page_no = page_no;
3688 page_no_holder->hash = NULL;
3689 HASH_INSERT(page_no_holder_t, hash, hash_table,
3690 page_no, page_no_holder);
3691 btr_pcur_move_to_next(trx->lra_cur, mtr);
3692 page_no_array[n_prefetched] = page_no;
3693 ++n_prefetched;
3694 if (trx->lra_n_node_recs_before_sleep
3695 && trx->lra_sleep
3696 &&
3697 ((n_prefetched % trx->lra_n_node_recs_before_sleep) == 0))
3698 {
3699 btr_pcur_store_position(trx->lra_cur, mtr);
3700 mtr_commit(mtr);
3701 os_thread_sleep(trx->lra_sleep * 1000);
3702 mtr_start(mtr);
3703 btr_pcur_restore_position_func(
3704 BTR_SEARCH_LEAF, trx->lra_cur, 1,
3705 __FILE__, __LINE__, mtr);
3706 }
3707 }
3708 *n_prefetched_ptr += n_prefetched;
3709}
3710
3711/*********************************************************************//**
3712Returns TRUE if the page specified by page_no was prefetched.
3713@return: TRUE if the page was prefetched before. */
3714UNIV_INLINE
3715ibool
3716row_lra_is_prefetched(
3717 const trx_t* trx, /* in: trx->lra_ht1 and trx->lra_ht2 are
3718 probed to see if page page_no was prefetched */
3719 ulint page_no) /* in: page no for the page that is being checked */
3720{
3721 page_no_holder_t* page_no_holder = NULL;
3722 hash_table_t* other_table;
3723 HASH_SEARCH(hash, trx->lra_ht, page_no, page_no_holder_t*,
3724 page_no_holder, ut_a(1),
3725 page_no_holder->page_no == page_no);
3726 if (page_no_holder) {
3727 return TRUE;
3728 }
3729 other_table = trx->lra_ht == trx->lra_ht1 ? trx->lra_ht2
3730 : trx->lra_ht1;
3731 HASH_SEARCH(hash, other_table, page_no, page_no_holder_t*,
3732 page_no_holder, ut_a(1),
3733 page_no_holder->page_no == page_no);
3734 if (page_no_holder) {
3735 return TRUE;
3736 }
3737 return FALSE;
3738}
3739
3740#ifdef UNIV_DEBUG
3741my_bool row_lra_test = FALSE;
3742#endif
3743
3744/*********************************************************************//**
3745This function submits io requests for pages that are logical successors to
3746the page pointed to by pcur. It is meant to be called during sequential
3747scan to prefetch pages and speed up the scan. The number of pages that are
3748prefetched is determined by the session variable innodb_lra_size
3749divided by the block size of the table. It is ok to call this function
3750successively even if pcur did not move to the next page because this function
3751keeps track of the page numbers it prefetched and won't duplicate io requests.
3752This function may temporarily release the block latches held by pcur and
3753re-acquire them.
3754@return: TRUE if the function released the block latch and re-acquired it and
3755now the cursor pcur points to a new record that must be processed by the
3756caller. */
3757static
3758ibool
3759row_read_ahead_logical(
3760 btr_pcur_t* pcur, /* in/out: Cursor from which the current page
3761 number is obtained. Cursor's position may change
3762 and this is indicated in the return value. */
3763 dict_index_t* index, /* in: index object for the table */
3764 mtr_t* mtr, /* in: mini-transaction. May be committed and
3765 restarted */
3766 ulint* offsets, /* in: temporary storage for offsets */
3767 mem_heap_t** heap_ptr, /* in/out: If *heap_ptr is not NULL then this
3768 heap is used for memory allocations, otherwise
3769 a new heap is created and stored in *heap_ptr.
3770 The caller is responsible for freeing the heap */
3771 trx_t *trx)
3772{
3773 buf_block_t* block = btr_cur_get_block(&pcur->btr_cur);
3774 ibool same_user_rec;
3775 mem_heap_t* heap;
3776 ib_int64_t tablespace_version;
3777 ulint page_no = buf_block_get_page_no(block);
3778 ulint space = buf_block_get_space(block);
3779 ulint zip_size = buf_block_get_zip_size(block);
3780 rec_t* rec;
3781 dtuple_t* tuple;
3782 dberr_t err;
3783 ulint num_prefetched = 0;
3784 ulint num_read_requests = 0;
3785 ulint i;
3786 ulint root_page_no;
3787 buf_block_t* root_block;
3788
3789 if (!trx->lra_size) {
3790 return FALSE;
3791 }
3792 if (trx->lra_space_id == space && trx->lra_tree_height <= 1) {
3793 return FALSE;
3794 }
3795 if (trx->lra_space_id == space && trx->lra_page_no == page_no) {
3796 /* the cursor is on the same page as the last time this
3797 function was called. */
3798 return FALSE;
3799 }
3800
3801 /* Set the last page number to page_no only if we are scanning the
3802 same table. */
3803 if (trx->lra_space_id == space) {
3804 trx->lra_page_no = page_no;
3805 }
3806
3807 /* In order not to prefetch extraneously, we do not issue prefetches
3808 until the scan processes lra_n_pages pages. This may cause misses if
3809 there are too many splits/merges going on but in such a case it
3810 would be hard to guess which pages to prefetch anyway
3811 because it is equivalent to guessing which pages would split
3812 (or merge). */
3813 if (trx->lra_space_id == space
3814 && ++trx->lra_n_pages_since <= trx->lra_n_pages) {
3815 if (!row_lra_is_prefetched(trx, page_no)) {
3816 srv_stats.n_logical_read_ahead_misses.add(1);
3817 }
3818 return FALSE;
3819 }
3820 rec = page_rec_get_next(page_get_infimum_rec(
3821 buf_block_get_frame(block)));
3822 if (!rec || page_rec_is_supremum(rec)) {
3823 /* Do not start prefetching because we can not get the
3824 parent page of an empty page */
3825 trx->lra_page_no = 0;
3826 return FALSE;
3827 }
3828 tablespace_version = fil_space_get_version(space);
3829
3830 if (!*heap_ptr) {
3831 *heap_ptr = mem_heap_create(100);
3832 }
3833 heap = *heap_ptr;
3834 tuple = dict_index_build_node_ptr(index, rec, 0, heap, 1);
3835 trx->lra_n_pages_since = 0;
3836
3837 btr_pcur_store_position(pcur, mtr);
3838 mtr_commit(mtr);
3839#ifdef UNIV_DEBUG
3840 if (row_lra_test && trx->lra_space_id == space) {
3841 row_lra_test = FALSE;
3842 os_thread_sleep(1000000);
3843 }
3844#endif
3845 mtr_start(mtr);
3846
3847 if (UNIV_LIKELY(trx->lra_space_id == space)) {
3848#ifdef UNIV_DEBUG
3849 memset(trx->lra_sort_arr, 0,
3850 2 * trx->lra_n_pages * sizeof(ulint));
3851#endif
3852 btr_pcur_restore_position_func(
3853 BTR_SEARCH_LEAF, trx->lra_cur, 1,
3854 __FILE__, __LINE__, mtr);
3855 row_read_ahead_logical_low(
3856 trx->lra_ht, &num_prefetched,
3857 trx->lra_sort_arr, index, mtr,
3858 offsets, heap, trx);
3859 if (trx->lra_ht == trx->lra_ht1) {
3860 trx->lra_ht = trx->lra_ht2;
3861 } else {
3862 trx->lra_ht = trx->lra_ht1;
3863 }
3864 } else {
3865 /* The transaction started to scan a new table, set
3866 the values for lra_space_id and lra_n_pages based on the
3867 new table */
3868 trx_lra_reset(trx,
3869 trx->lra_size,
3870 trx->lra_n_node_recs_before_sleep,
3871 trx->lra_sleep);
3872 trx->lra_space_id = space;
3873 trx->lra_n_pages = (trx->lra_size << 20L)
3874 / (zip_size ? zip_size : UNIV_PAGE_SIZE);
3875 trx->lra_page_no = page_no;
3876 mtr_s_lock(dict_index_get_lock(index), mtr);
3877 /* Get root page to get the B-tree depth */
3878 root_page_no = dict_index_get_page(index);
3879 root_block = buf_page_get_gen(space, zip_size, root_page_no,
3880 RW_NO_LATCH, NULL, BUF_GET,
3881 __FILE__, __LINE__, mtr);
3882 trx->lra_tree_height = btr_page_get_level(
3883 buf_block_get_frame(root_block),
3884 mtr) + 1;
3885 if (trx->lra_tree_height > 1) {
3886#ifdef UNIV_DEBUG
3887 memset(trx->lra_sort_arr, 0,
3888 2 * trx->lra_n_pages * sizeof(ulint));
3889#endif
3890 mtr_commit(mtr);
3891 mtr_start(mtr);
3892 btr_pcur_open_low(index, 1, tuple, PAGE_CUR_LE,
3893 BTR_SEARCH_LEAF, trx->lra_cur,
3894 __FILE__, __LINE__, mtr);
3895 row_read_ahead_logical_low(
3896 trx->lra_ht1, &num_prefetched,
3897 trx->lra_sort_arr, index, mtr,
3898 offsets, heap, trx);
3899 row_read_ahead_logical_low(
3900 trx->lra_ht2, &num_prefetched,
3901 &trx->lra_sort_arr[num_prefetched], index, mtr,
3902 offsets, heap, trx);
3903 }
3904 trx->lra_ht = trx->lra_ht1;
3905 }
3906 if (trx->lra_tree_height > 1)
3907 btr_pcur_store_position(trx->lra_cur, mtr);
3908 mtr_commit(mtr);
3909 if (num_prefetched) {
3910 /* We sort the page numbers before issuing read requests for two
3911 reasons:
3912 1- The block layer in linux kernel currently sorts the read
3913 requests and merges them but there is a possibility that
3914 this algorithm does not detect the sequential read and
3915 coalesce the iops.
3916 2- Even if the block layer algorithm is perfect, the
3917 asynchrounous read array size may be small in which case we
3918 the read requests will have a lower chance of being
3919 coalesced by the block layer.
3920
3921 Sorting is cheap in comparison to the iops that are about to
3922 be done so we always sort. */
3923 std::sort(trx->lra_sort_arr, trx->lra_sort_arr + num_prefetched);
3924 /* TODO(nizamordulu): Here we call buf_read_page_low() which
3925 acquires the related buffer pool shard lock and checks if the
3926 page is in that shard for each page. This could be made more
3927 efficient if we checked for all pages at once or batched the
3928 check for multiple pages after acquiring the related latch. */
3929 for (i = 0; i < num_prefetched; ++i) {
3930 num_read_requests += buf_read_page_low(
3931 &err, FALSE,
3932 BUF_READ_ANY_PAGE | OS_AIO_SIMULATED_WAKE_LATER,
3933 space, zip_size, FALSE, tablespace_version,
3934 trx->lra_sort_arr[i], trx, TRUE);
3935 }
3936#ifdef LINUX_NATIVE_AIO
3937 os_aio_linux_dispatch_read_array_submit();
3938#endif
3939 srv_stats.n_logical_read_ahead_prefetched.add(
3940 num_read_requests);
3941 srv_stats.n_logical_read_ahead_in_buf_pool.add(
3942 num_prefetched - num_read_requests);
3943 }
3944 mtr_start(mtr);
3945 return sel_restore_position_for_mysql(&same_user_rec, BTR_SEARCH_LEAF,
3946 pcur, TRUE, mtr);
3947}
3948
3635/********************************************************************//**3949/********************************************************************//**
3636Searches for rows in the database. This is used in the interface to3950Searches for rows in the database. This is used in the interface to
3637MySQL. This function opens a cursor, and also implements fetch next3951MySQL. This function opens a cursor, and also implements fetch next
@@ -5014,6 +5328,11 @@
5014 }5328 }
50155329
5016 if (moves_up) {5330 if (moves_up) {
5331 if (trx
5332 && row_read_ahead_logical(
5333 pcur, index, &mtr, offsets, &heap, trx)) {
5334 goto rec_loop;
5335 }
5017 if (UNIV_UNLIKELY(!btr_pcur_move_to_next(pcur, &mtr))) {5336 if (UNIV_UNLIKELY(!btr_pcur_move_to_next(pcur, &mtr))) {
5018not_moved:5337not_moved:
5019 btr_pcur_store_position(pcur, &mtr);5338 btr_pcur_store_position(pcur, &mtr);
50205339
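
Review note on the sorting comment in row_read_ahead_logical above: a minimal, standalone sketch of why sorting the collected page numbers gives the block layer fewer, longer runs to merge. The helper names count_runs_in_order and count_runs_after_sort are hypothetical and are not part of the patch. The sketch assumes the batch holds no duplicate page numbers, and it only counts runs of consecutive page numbers rather than modelling the kernel's actual merging.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of the patch: counts runs of strictly
// consecutive page numbers in submission order. Each run is a candidate
// for one merged read request.
std::size_t count_runs_in_order(const std::vector<unsigned long>& pages)
{
    std::size_t runs = pages.empty() ? 0 : 1;
    for (std::size_t i = 1; i < pages.size(); ++i) {
        // Any page that is not the direct successor of the previous
        // one starts a new run.
        if (pages[i] != pages[i - 1] + 1) {
            ++runs;
        }
    }
    return runs;
}

// Hypothetical helper: the same count after sorting the batch, which is
// what the std::sort() call on lra_sort_arr is meant to achieve.
std::size_t count_runs_after_sort(std::vector<unsigned long> pages)
{
    std::sort(pages.begin(), pages.end());
    // Assumption of this sketch: the batch contains no duplicates.
    assert(std::adjacent_find(pages.begin(), pages.end()) == pages.end());
    return count_runs_in_order(pages);
}
```

For a batch such as {7, 3, 5, 4, 6, 12, 11}, every page starts its own run in submission order, while the sorted batch collapses into two runs (3..7 and 11..12).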
=== modified file 'storage/innobase/srv/srv0srv.cc'
--- storage/innobase/srv/srv0srv.cc 2014-03-03 17:51:33 +0000
+++ storage/innobase/srv/srv0srv.cc 2014-04-23 10:58:56 +0000
@@ -1863,6 +1863,15 @@
1863 }1863 }
1864#endif /* UNIV_DEBUG */1864#endif /* UNIV_DEBUG */
18651865
1866 export_vars.innodb_buffered_aio_submitted =
1867 srv_stats.n_aio_submitted;
1868 export_vars.innodb_logical_read_ahead_misses =
1869 srv_stats.n_logical_read_ahead_misses;
1870 export_vars.innodb_logical_read_ahead_prefetched =
1871 srv_stats.n_logical_read_ahead_prefetched;
1872 export_vars.innodb_logical_read_ahead_in_buf_pool =
1873 srv_stats.n_logical_read_ahead_in_buf_pool;
1874
1866 mutex_exit(&srv_innodb_monitor_mutex);1875 mutex_exit(&srv_innodb_monitor_mutex);
1867}1876}
18681877
18691878
=== modified file 'storage/innobase/trx/trx0trx.cc'
--- storage/innobase/trx/trx0trx.cc 2014-02-17 11:12:40 +0000
+++ storage/innobase/trx/trx0trx.cc 2014-04-23 10:58:56 +0000
@@ -48,6 +48,7 @@
48#include "ha_prototypes.h"48#include "ha_prototypes.h"
49#include "srv0mon.h"49#include "srv0mon.h"
50#include "ut0vec.h"50#include "ut0vec.h"
51#include "btr0pcur.h"
5152
52#include<set>53#include<set>
5354
@@ -212,6 +213,121 @@
212 trx_sys->descr_n_used--;213 trx_sys->descr_n_used--;
213}214}
214215
216/*************************************************************//**
217Creates or frees data structures related to logical-read-ahead
218based on the value of lra_size. */
219UNIV_INTERN
220void
221trx_lra_reset(
222 trx_t* trx, /*!< in: transaction */
223 ulint lra_size, /*!< in: lra_size in MB.
224 If 0, the fields that are related
225 to logical-read-ahead will be freed
226 if they were initialized. */
227 ulint lra_n_node_recs_before_sleep,
228 /*!< in: lra_n_node_recs_before_sleep
229 is the number of node pointer records
230 traversed while holding the index lock
231 before releasing the index lock and
232 sleeping for a short period of time so
233 that the other threads get a chance to
234 x-latch the index lock. */
235 ulint lra_sleep) /*!< in: lra_sleep is the sleep time
236 in milliseconds. */
237{
238#ifndef TARGET_OS_LINUX
239 if (lra_size) {
240 ib_logf(IB_LOG_LEVEL_WARN,
241 "Logical read ahead is supported only on linux.");
242 lra_size = 0;
243 }
244#else /* TARGET_OS_LINUX */
245 if (!srv_use_native_aio && lra_size) {
246 ib_logf(IB_LOG_LEVEL_WARN,
247 "In order to use logical read ahead please enable "
248 "native aio by setting innodb_use_native_aio=1 in "
249 "my.cnf and restarting the server.");
250 lra_size = 0;
251 }
252#endif /* TARGET_OS_LINUX */
253 trx->lra_size = lra_size;
254 trx->lra_space_id = 0;
255 trx->lra_n_pages = 0;
256 trx->lra_n_pages_since = 0;
257 trx->lra_page_no = 0;
258 trx->lra_n_node_recs_before_sleep = lra_n_node_recs_before_sleep;
259 trx->lra_sleep = lra_sleep;
260 trx->lra_tree_height = 0;
261 if (lra_size) {
262 ulint n_pages_max =
263 (lra_size << 20L) / UNIV_ZIP_SIZE_MIN;
264 ulint mem = n_pages_max * (2 * sizeof(ulint)
265 + 2 * sizeof(page_no_holder_t))
266 + sizeof(btr_pcur_t);
267 if (trx->lra_ht) {
268 ut_a(trx->lra_ht1);
269 ut_a(trx->lra_ht2);
270 ut_a(trx->lra_sort_arr);
271 ut_a(trx->lra_cur);
272 hash_table_clear(trx->lra_ht1);
273 hash_table_clear(trx->lra_ht2);
274 trx->lra_ht = trx->lra_ht1;
275#ifdef UNIV_DEBUG
276 /* The following resets lra_sort_arr,
277 * lra_arr1, lra_arr2, and lra_cur.
278 */
279 memset(trx->lra_sort_arr, 0, mem);
280#endif
281 btr_pcur_init(trx->lra_cur);
282 } else {
283 byte* alloc;
284 ut_a(!trx->lra_ht1);
285 ut_a(!trx->lra_ht2);
286 ut_a(!trx->lra_sort_arr);
287 trx->lra_ht1 = hash_create(16384);
288 trx->lra_ht2 = hash_create(16384);
289 trx->lra_ht = trx->lra_ht1;
290 alloc = (byte*)ut_malloc(mem);
291#ifdef UNIV_DEBUG
292 memset(alloc, 0, mem);
293#endif
294 trx->lra_sort_arr = (ulint*)alloc;
295 alloc += 2 * sizeof(ulint) * n_pages_max;
296 trx->lra_arr1 = (page_no_holder_t*) alloc;
297 alloc += sizeof(page_no_holder_t) * n_pages_max;
298 trx->lra_arr2 = (page_no_holder_t*) alloc;
299 alloc += sizeof(page_no_holder_t) * n_pages_max;
300 trx->lra_cur = (btr_pcur_t*) alloc;
301 btr_pcur_init(trx->lra_cur);
302
303 }
304 } else {
305 if (trx->lra_ht) {
306 ut_a(trx->lra_ht1);
307 ut_a(trx->lra_ht2);
308 ut_a(trx->lra_sort_arr);
309 hash_table_free(trx->lra_ht1);
310 hash_table_free(trx->lra_ht2);
311 btr_pcur_close(trx->lra_cur);
312 ut_free(trx->lra_sort_arr);
313 trx->lra_sort_arr = NULL;
314 trx->lra_ht = NULL;
315 trx->lra_ht1 = NULL;
316 trx->lra_ht2 = NULL;
317 trx->lra_arr1 = NULL;
318 trx->lra_arr2 = NULL;
319 trx->lra_cur = NULL;
320 } else {
321 ut_a(!trx->lra_ht1);
322 ut_a(!trx->lra_ht2);
323 ut_a(!trx->lra_sort_arr);
324 ut_a(!trx->lra_cur);
325 ut_a(!trx->lra_arr1);
326 ut_a(!trx->lra_arr2);
327 }
328 }
329}
330
215/****************************************************************//**331/****************************************************************//**
216Creates and initializes a transaction object. It must be explicitly332Creates and initializes a transaction object. It must be explicitly
217started with trx_start_if_not_started() before using it. The default333started with trx_start_if_not_started() before using it. The default
@@ -294,6 +410,12 @@
294 trx->lock.table_locks = ib_vector_create(410 trx->lock.table_locks = ib_vector_create(
295 heap_alloc, sizeof(void**), 32);411 heap_alloc, sizeof(void**), 32);
296412
413 trx->lra_ht = NULL;
414 trx->lra_cur = NULL;
415 trx->lra_ht1 = NULL;
416 trx->lra_ht2 = NULL;
417 trx_lra_reset(trx, 0, 0, 0);
418
297 return(trx);419 return(trx);
298}420}
299421
@@ -388,6 +510,7 @@
388 }510 }
389511
390 mutex_free(&trx->mutex);512 mutex_free(&trx->mutex);
513 trx_lra_reset(trx, 0, 0, 0);
391514
392 read_view_free(trx->prebuilt_view);515 read_view_free(trx->prebuilt_view);
393516
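
Review note on trx_lra_reset above: the function makes a single ut_malloc() call and carves it into the sort array, the two page_no_holder_t arrays, and the cursor. The sketch below reproduces only that offset arithmetic so the layout can be checked in isolation. The LraLayout struct and the lra_layout function are hypothetical names. The element sizes are passed in as parameters because the real sizes of ulint, page_no_holder_t and btr_pcur_t are not shown in this part of the diff, and the sketch does not address alignment of the cursor within the buffer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of the single allocation made by
// trx_lra_reset(); all offsets are in bytes from the start of the
// buffer.
struct LraLayout {
    std::size_t n_pages_max;  // capacity derived from lra_size
    std::size_t sort_arr_off; // lra_sort_arr: 2 * n_pages_max words
    std::size_t arr1_off;     // lra_arr1: n_pages_max holders
    std::size_t arr2_off;     // lra_arr2: n_pages_max holders
    std::size_t cur_off;      // lra_cur: one cursor object
    std::size_t total;        // value passed to the allocator
};

// Mirrors the arithmetic in the patch:
//   n_pages_max = (lra_size << 20) / UNIV_ZIP_SIZE_MIN
//   mem = n_pages_max * (2 * word + 2 * holder) + cursor
// The sizes are parameters because they are assumptions of this sketch.
LraLayout lra_layout(std::size_t lra_size_mb,
                     std::size_t min_page_size,
                     std::size_t word_size,
                     std::size_t holder_size,
                     std::size_t cursor_size)
{
    assert(min_page_size > 0);

    LraLayout l;
    l.n_pages_max = (lra_size_mb << 20) / min_page_size;
    l.sort_arr_off = 0;
    l.arr1_off = 2 * word_size * l.n_pages_max;
    l.arr2_off = l.arr1_off + holder_size * l.n_pages_max;
    l.cur_off = l.arr2_off + holder_size * l.n_pages_max;
    l.total = l.cur_off + cursor_size;
    return l;
}
```

With illustrative sizes of 1 MB, a 1024-byte minimum page size, 8-byte words, 16-byte holders and a 64-byte cursor, the capacity is 1024 pages and the cursor lands at offset 49152 of a 49216-byte buffer. Because the capacity is computed from the smallest page size, the arrays are sized for the worst case regardless of the tablespace actually scanned.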
