Merge lp:~zephyrleaves/percona-server/row_cache into lp:percona-server/5.5

Proposed by Kevin.Huang on 2012-01-12
Status: Work in progress
Proposed branch: lp:~zephyrleaves/percona-server/row_cache
Merge into: lp:percona-server/5.5
Diff against target: 3321 lines (+3317/-0)
1 file modified
patches/innodb_row_cache.diff (+3317/-0)
To merge this branch: bzr merge lp:~zephyrleaves/percona-server/row_cache
Reviewer          Review Type   Date Requested   Status
Oleg Tsarev                     2012-03-02       Pending
Vadim Tkachenko                 2012-01-12       Pending
Review via email: mp+88310@code.launchpad.net

Description of the change

Row Cache for InnoDB is designed to improve memory utilization for key-value-like queries.

It addresses the case where the hot data is scattered across many pages: each cached page then holds only a few hot rows, so the page-level memory cache is poorly utilized.
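
The effect can be put in rough numbers. The sketch below is only an illustration: it assumes the default 16 KB InnoDB page size and the worst case where every hot row sits on a different page, and the function names are hypothetical, not part of the patch.

```c
#include <stdint.h>

/* Assumption: default InnoDB page size of 16 KB. */
#define PAGE_SIZE_BYTES 16384ULL

/* Memory needed to keep n_hot_rows cached when each hot row lives on its
   own page (worst case for scattered hot data): one full page per row. */
uint64_t page_cache_bytes(uint64_t n_hot_rows)
{
    return n_hot_rows * PAGE_SIZE_BYTES;
}

/* Memory needed to keep the same rows in a row-level cache, ignoring
   per-entry bookkeeping overhead. */
uint64_t row_cache_bytes(uint64_t n_hot_rows, uint64_t avg_row_bytes)
{
    return n_hot_rows * avg_row_bytes;
}
```

Under these assumptions, one million hot rows of 200 bytes each need roughly 16 GB of cached pages but only about 200 MB of cached rows.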

More details: http://code.google.com/p/row-cache-for-innodb/

For benchmarks, use this sysbench build, which I modified for the row cache scenario:
http://code.google.com/p/row-cache-for-innodb/downloads/detail?name=sysbench-0.4.8.tar.gz&can=2&q=#makechanges

Kevin.Huang (zephyrleaves) wrote :

fix bug for memory leak on shutdown
fix bug for crash on rollback

206. By on 2012-03-02

fix bug for memory leak on shutdown
fix bug for crash on rollback

207. By Kevin.Huang on 2012-03-22

fix bug for row cache is not evict when truncate

Vadim Tkachenko (vadim-tk) wrote :

Kevin,

Can you please provide a merge proposal against the current lp:percona-server?
We no longer keep patches; it is a full bzr branch now.

Vadim Tkachenko (vadim-tk) wrote :

OK,
it was easier than I thought; I ported it here:
lp:~vadim-tk/percona-server/5.5.22-rowcache

Vadim Tkachenko (vadim-tk) wrote :

I have a problem compiling the current patch:
"(byte*) key_val_buff" is not defined.

Kevin.Huang (zephyrleaves) wrote :

> I have a problem with compiling current patch
> "(byte*) key_val_buff" is not defined.

I will fix it with 5.5.22.

Kevin.Huang (zephyrleaves) wrote :

> I have a problem with compiling current patch
> "(byte*) key_val_buff" is not defined.

I fixed the compatibility problems with Percona Server 5.5.22
and ported it here:
lp:~zephyrleaves/percona-server/5.5.22-rowcache

Vadim Tkachenko (vadim-tk) wrote :

I can't get good performance numbers with row cache.

I am running sysbench oltp, ZipF distribution, 100GB database, 16 threads

mysqld --innodb-buffer-pool-size=10GB --innodb-log-file-size=4G --innodb_flush_log_at_trx_commit=1 --innodb_row_cache_mem_pool_size=10GB --innodb_row_cache_on=ON

With innodb_row_cache_mem_pool_size=10GB the results are actually worse than with innodb_row_cache_mem_pool_size=5GB,

and in any case they are worse than with no row cache at all.

I suspect it may be a mutex contention issue.

Kevin.Huang (zephyrleaves) wrote :

> I can't get good performance numbers with row cache.
>
> I am running sysbench oltp, ZipF distribution, 100GB database, 16 threads
>
> mysqld --innodb-buffer-pool-size=10GB --innodb-log-file-size=4G
> --innodb_flush_log_at_trx_commit=1 --innodb_row_cache_mem_pool_size=10GB
> --innodb_row_cache_on=ON
>
> with innodb_row_cache_mem_pool_size=10GB the results is actually worse then
> with innodb_row_cache_mem_pool_size=5GB
>
> and in any case it is worse then with no row cache at all.
>
> I suspect it maybe a some mutex contention issue.

You need to use this sysbench for the test:
http://code.google.com/p/row-cache-for-innodb/downloads/detail?name=sysbench-0.4.8.tar.gz
I modified it for the row cache scenario.
You can run the test with a command like this:
sysbench --test=oltp --oltp-test-mode=simple --oltp-skip-trx=on --oltp-table-size=80000000 --oltp-range-size=1 --mysql-host=localhost --mysql-user=xx --mysql-password=xx --oltp-read-only=on --init-rng=on --num-threads=70 --oltp-dist-type=special --oltp-dist-pct=1 --oltp-dist-res=80 --max-requests=0 --max-time=1800 run

The reason is that the oltp-dist-type of the stock sysbench is not random enough.

Vadim Tkachenko (vadim-tk) wrote :

Kevin,

I am not sure what you mean by "not random enough".

I am using sysbench 0.5 with the new zipf distribution.

In any case, I do not expect performance to degrade when I use a bigger cache, but
that is what I see.

Kevin.Huang (zephyrleaves) wrote :

> > I can't get good performance numbers with row cache.
> >
> > I am running sysbench oltp, ZipF distribution, 100GB database, 16 threads
> >
> > mysqld --innodb-buffer-pool-size=10GB --innodb-log-file-size=4G
> > --innodb_flush_log_at_trx_commit=1 --innodb_row_cache_mem_pool_size=10GB
> > --innodb_row_cache_on=ON
> >
> > with innodb_row_cache_mem_pool_size=10GB the results is actually worse then
> > with innodb_row_cache_mem_pool_size=5GB
> >
> > and in any case it is worse then with no row cache at all.
> >
> > I suspect it maybe a some mutex contention issue.
>
> You need use this sysbench
> http://code.google.com/p/row-cache-for-
> innodb/downloads/detail?name=sysbench-0.4.8.tar.gz
> for test.
> This sysbench which i modified for row cache's scene.
> You can use command like this to run test:
> sysbench --test=oltp --oltp-test-mode=simple --oltp-skip-trx=on --oltp-table-
> size=80000000 --oltp-range-size=1 --mysql-host=localhost --mysql-user=xx
> --mysql-password=xx --oltp-read-only=on --init-rng=on --num-threads=70 --oltp-
> dist-type=special --oltp-dist-pct=1 --oltp-dist-res=80 --max-requests=0 --max-
> time=1800 run
>
> Because of the common sysbench's oltp-dist-type is not random enough.

The row cache works best when the hot data is distributed randomly rather than contiguously.

Vadim Tkachenko (vadim-tk) wrote :

Kevin,

How does that explain that with
--innodb-buffer-pool-size=10GB --innodb_row_cache_mem_pool_size=10GB

I am getting worse throughput than with
--innodb-buffer-pool-size=5GB --innodb_row_cache_mem_pool_size=5GB

Again, whatever the distribution is, I do not expect to see worse results with a bigger cache.

Kevin.Huang (zephyrleaves) wrote :

> Kevin,
>
> I am not sure what do you mean by "not random" enough.
>
> I am using sysbench 0.5 with new zipf distribution.
>
> In anycase I do not expect a performance degradation when I use bigger cache,
> but
> this is what I see.

Vadim,

I see.
I will test with sysbench 0.5 and the zipf distribution,
and improve the row cache.

Thanks.

Kevin.Huang (zephyrleaves) wrote :

> > Kevin,
> >
> > I am not sure what do you mean by "not random" enough.
> >
> > I am using sysbench 0.5 with new zipf distribution.
> >
> > In anycase I do not expect a performance degradation when I use bigger
> cache,
> > but
> > this is what I see.
>
> Vadim,
>
> I see.
> I will test in sysbench 0.5 with zipf.
> And improve the row cache.
>
> thanks.

Hi Vadim,

Could you provide the sysbench command you use to test the row cache?

Thanks.

Kevin.Huang (zephyrleaves) wrote :

> Kevin,
>
> How does that explain that with
> --innodb-buffer-pool-size=10GB --innodb_row_cache_mem_pool_size=10GB
>
> I am getting worse throughput then with
> --innodb-buffer-pool-size=5GB --innodb_row_cache_mem_pool_size=5GB
>
> Again, whatever distribution is, I do not expect to see worse results with
> bigger cache.

Hi Vadim,

I think I found the reason why a bigger cache causes worse performance.
If you set a bigger innodb-buffer-pool-size, you need to set a correspondingly bigger innodb_row_cache_cell_num too.
I use a hash table to store the records, so a bigger innodb-buffer-pool-size causes more records to be stored in the hash table; its chains get deeper, which hurts performance.

So setting a bigger innodb_row_cache_cell_num makes the hash table shallower and improves performance.
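
To make the depth argument concrete, here is a minimal sketch that estimates the average chain length of a chained hash table. It assumes records are spread evenly over the cells and ignores per-entry overhead; the function names and the 256-byte average record size used in the note below are hypothetical, and the real hash table may adjust the requested cell count.

```c
#include <stdint.h>

/* Rough number of records that fit in the row cache memory pool,
   ignoring per-entry bookkeeping overhead. */
uint64_t estimated_cached_rows(uint64_t mem_pool_bytes, uint64_t avg_row_bytes)
{
    return avg_row_bytes ? mem_pool_bytes / avg_row_bytes : 0;
}

/* Average chain length (load factor) of a chained hash table, assuming
   records are distributed evenly over the cells. */
double average_chain_length(uint64_t n_rows, uint64_t n_cells)
{
    return n_cells ? (double) n_rows / (double) n_cells : 0.0;
}
```

With a 5 GB pool and 256-byte records this gives about 21 million records; at 10000 cells the average chain is about 2100 entries long. Doubling the pool to 10 GB at the same cell count doubles the chain length, while doubling the cell count to 20000 brings it back to the 5 GB figure.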

Vadim Tkachenko (vadim-tk) wrote :

Kevin,

command is
sysbench --test=tests/db/oltp.lua --oltp_tables_count=16 --oltp-table-size=$ROWS --rand-init=on --num-threads=$thread --oltp-read-only=off --report-interval=10 --rand-type=zipf --rand-zipf-t=0.9 --mysql-socket=/var/lib/mysql/mysql.sock --max-time=$RT --max-requests=0 --mysql-user=root --percentile=99 run

what value should I use for innodb_row_cache_cell_num ?

Kevin.Huang (zephyrleaves) wrote :

> Kevin,
>
> command is
> sysbench --test=tests/db/oltp.lua --oltp_tables_count=16 --oltp-table-
> size=$ROWS --rand-init=on --num-threads=$thread --oltp-read-only=off
> --report-interval=10 --rand-type=zipf --rand-zipf-t=0.9 --mysql-
> socket=/var/lib/mysql/mysql.sock --max-time=$RT --max-requests=0 --mysql-
> user=root --percentile=99 run
>
> what value should I use for innodb_row_cache_cell_num ?

Vadim,

The default value of innodb_row_cache_cell_num is 10000.
For the test, you can set --innodb-buffer-pool-size=5GB --innodb_row_cache_mem_pool_size=5GB --innodb_row_cache_cell_num=10000
and --innodb-buffer-pool-size=10GB --innodb_row_cache_mem_pool_size=10GB --innodb_row_cache_cell_num=20000.

Stewart Smith (stewart) wrote :

On Mon, 14 May 2012 01:24:17 -0000, "Kevin.Huang" <email address hidden> wrote:
> The default value of innodb_row_cache_cell_num is 10000.
> You can set --innodb-buffer-pool-size=5GB --innodb_row_cache_mem_pool_size=5GB --innodb_row_cache_cell_num=10000
> and set --innodb-buffer-pool-size=10GB --innodb_row_cache_mem_pool_size=10GB --innodb_row_cache_cell_num=20000
> for test.

I suggest having a sensible default calculation for it, maybe
1000/GB of row cache? Does that seem sensible?

--
Stewart Smith
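
A default along these lines could be derived from the row cache size. The sketch below is only an illustration of the 1000-cells-per-GB idea; `default_cell_num`, the round-up behavior, and the `MIN_CELL_NUM` floor are assumptions, not code from the patch.

```c
#include <stdint.h>

#define BYTES_PER_GB (1024ULL * 1024ULL * 1024ULL)
#define CELLS_PER_GB 1000ULL  /* ratio suggested in the comment above */
#define MIN_CELL_NUM 1000ULL  /* hypothetical floor for very small caches */

/* Derive a default cell count from the row cache memory pool size. */
uint64_t default_cell_num(uint64_t mem_pool_bytes)
{
    /* Round up to whole gigabytes so a partial GB still gets cells. */
    uint64_t gb = (mem_pool_bytes + BYTES_PER_GB - 1) / BYTES_PER_GB;
    uint64_t cells = gb * CELLS_PER_GB;

    return cells < MIN_CELL_NUM ? MIN_CELL_NUM : cells;
}
```

Note that the values proposed earlier in the thread (10000 cells for 5 GB, 20000 for 10 GB) correspond to 2000 cells per GB, so the constant would need tuning.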

Kevin.Huang (zephyrleaves) wrote :

> On Mon, 14 May 2012 01:24:17 -0000, "Kevin.Huang" <email address hidden>
> wrote:
> > The default value of innodb_row_cache_cell_num is 10000.
> > You can set --innodb-buffer-pool-size=5GB
> --innodb_row_cache_mem_pool_size=5GB --innodb_row_cache_cell_num=10000
> > and set --innodb-buffer-pool-size=10GB --innodb_row_cache_mem_pool_size=10GB
> --innodb_row_cache_cell_num=20000
> > for test.
>
> I suggest perhaps having a sensible default calculation for it, maybe
> 1000/GB of row cache? does that seems sensible?
>
> --
> Stewart Smith

Hi Stewart,

The right value of innodb_row_cache_cell_num depends on the size of the records cached by the row cache.
Smaller records need a larger innodb_row_cache_cell_num for good performance, so I made it configurable.

Stewart Smith (stewart) wrote :

On Wed, 16 May 2012 02:15:23 -0000, "Kevin.Huang" <email address hidden> wrote:
> Innodb_row_cache_cell_num 's value depended on the size of the record which be cached by row cache.
> The smaller record need more innodb_row_cache_cell_num for high performance, so i make it can be configured.

I completely understand, we do however see a lot of people who don't
know the intricacies of server configuration and will go with defaults
or something close to the default. Making it as easy for these people as
possible is a good feature, even if it may not be optimal for everyone
(people can change the variable by hand of course).

--
Stewart Smith

Kevin.Huang (zephyrleaves) wrote :

> On Wed, 16 May 2012 02:15:23 -0000, "Kevin.Huang" <email address hidden>
> wrote:
> > Innodb_row_cache_cell_num 's value depended on the size of the record which
> be cached by row cache.
> > The smaller record need more innodb_row_cache_cell_num for high performance,
> so i make it can be configured.
>
> I completely understand, we do however see a lot of people who don't
> know the intricacies of server configuration and will go with defaults
> or something close to the default. Making it as easy for these people as
> possible is a good feature, even if it may not be optimal for everyone
> (people can change the variable by hand of course).
>
> --
> Stewart Smith

Yes, I will improve it: the default value will be self-adapting, and it can still be configured manually.
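
One possible shape for such a self-adapting default, as a sketch only: an explicit setting wins, otherwise the cell count is derived from the pool size and an assumed average record size. The function name, the "0 means auto" convention, and the target chain length parameter are all hypothetical; only the fallback of 10000 is the patch's current default.

```c
#include <stdint.h>

/* Hypothetical convention: 0 means "not set by the user, choose automatically". */
#define CELL_NUM_AUTO 0ULL
/* Current default of innodb_row_cache_cell_num in the patch. */
#define CELL_NUM_FALLBACK 10000ULL

uint64_t effective_cell_num(uint64_t configured_cell_num,
                            uint64_t mem_pool_bytes,
                            uint64_t avg_row_bytes,
                            uint64_t target_chain_len)
{
    uint64_t rows;
    uint64_t cells;

    if (configured_cell_num != CELL_NUM_AUTO) {
        return configured_cell_num;  /* an explicit setting always wins */
    }
    if (avg_row_bytes == 0 || target_chain_len == 0) {
        return CELL_NUM_FALLBACK;    /* not enough information to adapt */
    }
    rows = mem_pool_bytes / avg_row_bytes;  /* records that fit in the pool */
    cells = rows / target_chain_len;        /* cells for the target depth */

    return cells ? cells : 1;
}
```

Smaller records yield more cached records for the same pool, so the derived cell count grows as the record size shrinks, which matches the point made earlier in the thread.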

Unmerged revisions

207. By Kevin.Huang on 2012-03-22

fix bug for row cache is not evict when truncate

206. By on 2012-03-02

fix bug for memory leak on shutdown
fix bug for crash on rollback

205. By on 2012-03-02

fix bug for memory leak on shutdown
fix bug for crash on rollback

204. By Kevin.Huang on 2012-01-12

add row cache for innodb

Preview Diff

=== added file 'patches/innodb_row_cache.diff'
--- patches/innodb_row_cache.diff 1970-01-01 00:00:00 +0000
+++ patches/innodb_row_cache.diff 2012-03-22 02:13:18 +0000
@@ -0,0 +1,3317 @@
Index: storage/innobase/mtr/mtr0mtr.c
===================================================================
--- storage/innobase/mtr/mtr0mtr.c (revision 498)
+++ storage/innobase/mtr/mtr0mtr.c (revision 724)
@@ -279,6 +279,11 @@
 }
 #endif /* UNIV_DEBUG_VALGRIND */
 ut_d(mtr->state = MTR_COMMITTED);
+
+ if (mtr->row_cache_value_queue_base.count > 0) {
+ //release row cache value
+ release_row_cache_value_in_mtr(mtr);
+ }
 }

 #ifndef UNIV_HOTBACKUP
Index: storage/innobase/cache/row0cache0lru.c
===================================================================
--- storage/innobase/cache/row0cache0lru.c (revision 0)
+++ storage/innobase/cache/row0cache0lru.c (revision 724)
@@ -0,0 +1,155 @@
+/********************************************************************
+created: 2011/03/23
+created: 23:3:2011 15:15
+file base: row0cache0lru
+file ext: c
+author: wentong@taobao.com
+
+purpose:
+*********************************************************************/
+#include "row0cache0lru.h"
+#include "row0cache0hash.h"
+#include "row0cache0mempool.h"
+#include "ut0rbt.h"
+#include "ut0lst.h"
+
+#define ROW_CACHE_FREE_DISANCE 100
+
+static ROW_CACHE_VALUE_LIST_BASE *innodb_row_cache_lru;
+
+static row_cache_lru_stat_t _row_cache_lru_stat;
+
+UNIV_INTERN row_cache_lru_stat_t* row_cache_lru_stat = &_row_cache_lru_stat;
+
+UNIV_INTERN my_bool innodb_row_cache_clean_cache = FALSE;
+
+void init_innodb_row_cache_lru(){
+ if (innodb_row_cache_mutex_num > 0){
+ ulint i;
+ innodb_row_cache_lru = (ROW_CACHE_VALUE_LIST_BASE*) ut_malloc(innodb_row_cache_mutex_num * sizeof(ROW_CACHE_VALUE_LIST_BASE));
+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
+ {
+ UT_LIST_INIT(*(innodb_row_cache_lru+i));
+ }
+ }
+ memset(row_cache_lru_stat , 0 , sizeof(row_cache_lru_stat_t));
+}
+
+void deinit_innodb_row_cache_lru(){
+ if (innodb_row_cache_mutex_num > 0){
+ ulint i;
+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
+ {
+ row_cache_value_t* value;
+ ROW_CACHE_VALUE_LIST_BASE* free_lru = innodb_row_cache_lru + i;
+ row_cache_enter_mutex_by_no(i);
+ for (value = UT_LIST_GET_LAST(*free_lru);
+ value!=NULL;){
+ row_cache_value_t* free_value;
+ ulint fold;
+ free_value = value;
+ fold = free_value->fold;
+ value=UT_LIST_GET_PREV(list,value);
+ UT_LIST_REMOVE(list,*free_lru,free_value);
+ delete_row_cache_value(free_value);
+ ca_free(free_value->buf ,free_value->buf_size ,fold);
+ ca_free_for_value(free_value , fold);
+ }
+ row_cache_exit_mutex_by_no(i);
+ }
+ ut_free(innodb_row_cache_lru);
+ }
+}
+
+void clean_row_cache(){
+ if (innodb_row_cache_mutex_num > 0){
+ ulint i;
+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
+ {
+ row_cache_value_t* value;
+ ROW_CACHE_VALUE_LIST_BASE* free_lru = innodb_row_cache_lru + i;
+ row_cache_enter_mutex_by_no(i);
+ for (value = UT_LIST_GET_LAST(*free_lru);
+ value!=NULL;){
+ row_cache_value_t* free_value;
+ ulint fold;
+ free_value = value;
+ fold = free_value->fold;
+ value=UT_LIST_GET_PREV(list,value);
+ if(free_value->ref_num==0){
+ UT_LIST_REMOVE(list,*free_lru,free_value);
+ delete_row_cache_value(free_value);
+ ca_free(free_value->buf ,free_value->buf_size, fold);
+ ca_free_for_value(free_value , fold);
+ }
+ }
+ row_cache_exit_mutex_by_no(i);
+ }
+ }
+}
+
+static ROW_CACHE_VALUE_LIST_BASE* get_current_lru_base(const ulint fold){
+ return innodb_row_cache_lru + row_cache_get_mutex_no(fold);
+}
+
+void add_row_cache_value_to_lru( row_cache_value_t* value ) {
+ UT_LIST_ADD_FIRST(list,*get_current_lru_base(value->fold),value);
+ row_cache_lru_stat->n_add++;
+}
+
+void make_row_cache_value_first_from_lru( row_cache_value_t* value )
+{
+ ROW_CACHE_VALUE_LIST_BASE* lru_base = get_current_lru_base(value->fold);
+ UT_LIST_REMOVE(list,*lru_base,value);
+ UT_LIST_ADD_FIRST(list,*lru_base,value);
+ row_cache_lru_stat->n_make_first++;
+}
+
+ulint free_from_lru(const ulint size , const ulint used_fold)
+{
+ ulint iteration_size = 0;
+ int free_distance = 0;
+ ulint free_size = 0;
+ row_cache_value_t* value;
+ ROW_CACHE_VALUE_LIST_BASE* lru_base = get_current_lru_base(used_fold);
+ for (value = UT_LIST_GET_LAST(*lru_base);
+ free_distance < ROW_CACHE_FREE_DISANCE && value!=NULL;)
+ {
+ row_cache_value_t* free_value;
+ free_value = value;
+ value=UT_LIST_GET_PREV(list,value);
+ iteration_size += free_value->buf_size;
+ if(free_value->ref_num==0 && free_value->fold!=used_fold){
+ //the value can't be used or can't be locked by the upper function
+ UT_LIST_REMOVE(list,*lru_base,free_value);
+ row_cache_lru_stat->n_evict++;
+ free_size+=free_value->buf_size;
+ delete_row_cache_value(free_value);
+ ca_free(free_value->buf ,free_value->buf_size , used_fold);
+ ca_free_for_value(free_value , used_fold);
+ }
+ if(iteration_size > size){
+ //when may free enough mem and calc free distance
+ free_distance++;
+ }
+ }
+ return free_size;
+}
+
+void remove_row_cache_value_from_lru( row_cache_value_t* value )
+{
+ UT_LIST_REMOVE(list,*get_current_lru_base(value->fold),value);
+ row_cache_lru_stat->n_evict++;
+}
+
+ulint get_row_cache_lru_count()
+{
+ ulint i;
+ ulint ret = 0;
+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
+ {
+ ret += UT_LIST_GET_LEN(*(innodb_row_cache_lru+i));
+ }
+ return ret;
+}
+

Property changes on: storage/innobase/cache/row0cache0lru.c
___________________________________________________________________
Added: svn:mime-type
 + text/plain

Index: storage/innobase/cache/row0cache0hash.c
===================================================================
--- storage/innobase/cache/row0cache0hash.c (revision 0)
+++ storage/innobase/cache/row0cache0hash.c (revision 724)
@@ -0,0 +1,196 @@
+/********************************************************************
+created: 2011/03/24
+created: 24:3:2011 8:49
+file base: row0cache0hash
+file ext: c
+author: wentong@taobao.com
+
+purpose:
+*********************************************************************/
+
+#include "row0cache0hash.h"
+#include "ha0ha.h"
+#include "hash0hash.h"
+#include "rem0rec.h"
+#include "rem0cmp.h"
+#include "ut0byte.h"
+
+
+
+UNIV_INTERN unsigned long innodb_row_cache_cell_num = 10000L;
+
+UNIV_INTERN unsigned int innodb_row_cache_mutex_num_shift = 6;
+
+UNIV_INTERN ulint innodb_row_cache_mutex_num = 0;
+
+static row_cache_t _innodb_row_cache;
+
+UNIV_INTERN row_cache_t* innodb_row_cache = &_innodb_row_cache;
+
+int init_row_cache_hash(my_bool innodb_row_cache_on)
+{
+ memset(innodb_row_cache, 0 , sizeof(row_cache_t));
+ if (innodb_row_cache_on)
+ {
+ innodb_row_cache_mutex_num = (1<<innodb_row_cache_mutex_num_shift);
+ innodb_row_cache->row_cache = ha_create(innodb_row_cache_cell_num,innodb_row_cache_mutex_num,0);
+
+ }else{
+ innodb_row_cache->row_cache = NULL;
+ }
+ return 0;
+}
+
+static void free_hash_table_mutex(hash_table_t* table){
+ //mutex was freed by sync_close();
+ ulint i;
+ for (i = 0; i < table->n_mutexes; i++) {
+ mem_heap_free(table->heaps[i]);
+ }
+ mem_free(table->mutexes);
+ table->mutexes = NULL;
+ table->n_mutexes = 0;
+ mem_free(table->heaps);
+ table->heaps = NULL;
+}
+
+void deinit_row_cache_hash(my_bool innodb_row_cache_on){
+ if (innodb_row_cache_on)
+ {
+// ha_clear(innodb_row_cache->row_cache);
+ free_hash_table_mutex(innodb_row_cache->row_cache);
+ hash_table_free(innodb_row_cache->row_cache);
+ }
+}
+
+row_cache_value_t* search_row_cache_value(const dtuple_t* tuple, const dict_index_t* index, const ulint fold)
+{
+ ulint offsets_[REC_OFFS_NORMAL_SIZE];
+ ulint* offsets = offsets_;
+ mem_heap_t* heap = NULL;
+ row_cache_chain_t* chain = NULL;
+ rec_offs_init(offsets_);
+
+ HASH_SEARCH(
+ /* hash_chain->"next" */
+ next,
+ /* the hash table */
+ innodb_row_cache->row_cache,
+ /* fold */
+ fold,
+ /* the type of the next variable */
+ row_cache_chain_t*,
+ /* auxiliary variable */
+ chain,
+ /* assertion on every traversed item */
+ ,
+ /* this determines if we have found the lock */
+ (chain->value!=NULL && chain->value->tree_id == index->id && chain->value->table_id == index->table->id &&
+ cmp_dtuple_rec(tuple,chain->value->rec,rec_get_offsets(chain->value->rec, index, offsets, ULINT_UNDEFINED, &heap)) == 0));
+ if (UNIV_LIKELY_NULL(heap)) {
+ mem_heap_free(heap);
+ }
+ if (chain == NULL) {
+ return(NULL);
+ }
+ /* else */
+ return(chain->value);
+}
+
+
+row_cache_value_t* search_row_cache_value_with_rec( const rec_t* rec, const ulint* rec_offsets, dict_index_t* index, const ulint fold )
+{
+ ulint offsets_[REC_OFFS_NORMAL_SIZE];
+ ulint* offsets = offsets_;
+ mem_heap_t* heap = NULL;
+ row_cache_chain_t* chain = NULL;
+ rec_offs_init(offsets_);
+
+ HASH_SEARCH(
+ /* hash_chain->"next" */
+ next,
+ /* the hash table */
+ innodb_row_cache->row_cache,
+ /* fold */
+ fold,
+ /* the type of the next variable */
+ row_cache_chain_t*,
+ /* auxiliary variable */
+ chain,
+ /* assertion on every traversed item */
+ ,
+ /* this determines if we have found the lock */
+ (chain->value!=NULL && chain->value->tree_id == index->id && chain->value->table_id == index->table->id &&
+ cmp_rec_rec(rec,chain->value->rec,rec_offsets,rec_get_offsets(chain->value->rec, index, offsets, ULINT_UNDEFINED, &heap),index)==0));
+
+ if (UNIV_LIKELY_NULL(heap)) {
+ mem_heap_free(heap);
+ }
+ if (chain == NULL) {
+ return(NULL);
+ }
+ /* else */
+ return(chain->value);
+
+}
+
+
+
+
+row_cache_value_t* insert_row_cache_value(const ulint fold , row_cache_value_t* value){
+ HASH_INSERT(
+ /* the type used in the hash chain */
+ row_cache_chain_t,
+ /* hash_chain->"next" */
+ next,
+ /* the hash table */
+ innodb_row_cache->row_cache,
+ /* fold */
+ fold,
+ /* add this data to the hash */
+ &value->chain);
+ return value;
+}
+
+
+void delete_row_cache_value(row_cache_value_t* value){
+ HASH_DELETE(
+ row_cache_chain_t,
+ next,
+ innodb_row_cache->row_cache,
+ value->fold,
+ (&value->chain));
+}
+
+void row_cache_enter_mutex_by_no(const ulint no){
+ mutex_enter(hash_get_nth_mutex(innodb_row_cache->row_cache, no));
+}
+
+void row_cache_exit_mutex_by_no(const ulint no){
+ mutex_exit(hash_get_nth_mutex(innodb_row_cache->row_cache, no));
+}
+
+void row_cache_enter_mutex( const ulint fold )
+{
+ hash_mutex_enter(innodb_row_cache->row_cache , fold);
+}
+
+ulint row_cache_enter_mutex_nowait( const ulint fold )
+{
+ //hash_mutex_enter(innodb_row_cache->row_cache , fold);
+ return mutex_enter_nowait(hash_get_mutex(innodb_row_cache->row_cache, fold));
+}
+
+void row_cache_exit_mutex( const ulint fold )
+{
+ hash_mutex_exit(innodb_row_cache->row_cache , fold);
+}
+
+int row_cache_own_mutex(const ulint fold1 , const ulint fold2){
+ return hash_get_mutex_no(innodb_row_cache->row_cache , fold1) == hash_get_mutex_no(innodb_row_cache->row_cache , fold2);
+}
+
+ulint row_cache_get_mutex_no( const ulint fold )
+{
+ return hash_get_mutex_no(innodb_row_cache->row_cache , fold);
+}

Property changes on: storage/innobase/cache/row0cache0hash.c
___________________________________________________________________
Added: svn:mime-type
 + text/plain

390Index: storage/innobase/cache/row0cache0filter.c
391===================================================================
392--- storage/innobase/cache/row0cache0filter.c (revision 0)
393+++ storage/innobase/cache/row0cache0filter.c (revision 724)
394@@ -0,0 +1,295 @@
395+/********************************************************************
396+ created: 2011/05/31
397+ created: 31:5:2011 11:40
398+ file base: row0cache0filter
399+ file ext: c
400+ author: wentong@taobao.com
401+
402+ purpose:
403+ *********************************************************************/
404+
405+#include "row0cache0filter.h"
406+#include "ut0rbt.h"
407+#include "sync0rw.h"
408+#include "ut0lst.h"
409+#include "hash0hash.h"
410+#include "ut0rnd.h"
411+#include "dict0mem.h"
412+
413+#define FILTER_HASH_CELL_NUM 1000
414+
415+typedef struct row_cache_filter_index_chain row_cache_filter_index_chain_t;
416+typedef struct row_cache_filter_index_value row_cache_filter_index_value_t;
417+
418+struct row_cache_filter_index_chain {
419+ row_cache_filter_index_value_t* value;
420+ row_cache_filter_index_chain_t* next;
421+};
422+
423+struct row_cache_filter_index_value {
424+ index_id_t tree_id;
425+ row_cache_filter_index_chain_t chain;
426+};
427+
428+UNIV_INTERN char* innodb_row_cache_index = NULL;
429+
430+static char innodb_row_cache_index_r[INDEX_CONFIG_LEN + 1];
431+
432+static ibool has_filter = FALSE;
433+
434+static hash_table_t* filtered_in_index;
435+
436+static hash_table_t* filtered_out_index;
437+
438+#ifdef UNIV_PFS_RWLOCK
439+/* Key to register btr_search_sys with performance schema */
440+
441+UNIV_INTERN mysql_pfs_key_t row_cache_filter_lock_key;
442+#endif /* UNIV_PFS_RWLOCK */
443+
444+static rw_lock_t filter_lock;
445+
446+void reset_filter();
447+static int wild_case_compare_wl(const char *str, const char *wildstr,
448+ const size_t length);
449+static void free_hash_table_elem(hash_table_t* table);
450+
451+void init_row_cache_filter(my_bool innodb_row_cache_on) {
452+ if (innodb_row_cache_on) {
453+ rw_lock_create(row_cache_filter_lock_key, &filter_lock, SYNC_MEM_HASH);
454+ filtered_in_index = hash_create(FILTER_HASH_CELL_NUM);
455+ filtered_out_index = hash_create(FILTER_HASH_CELL_NUM);
456+ reset_filter();
457+ }
458+}
459+
460+void deinit_row_cache_filter(my_bool innodb_row_cache_on) {
461+ if (innodb_row_cache_on) {
462+ reset_filter();
463+ hash_table_free(filtered_in_index);
464+ hash_table_free(filtered_out_index);
465+ rw_lock_free(&filter_lock);
466+ }
467+}
468+
469+void reset_filter() {
470+ size_t len = 0;
471+ if (innodb_row_cache_index) {
472+ len = strlen(innodb_row_cache_index);
473+ }
474+ len = len > INDEX_CONFIG_LEN ? INDEX_CONFIG_LEN : len;
475+ rw_lock_x_lock(&filter_lock);
476+ memset(innodb_row_cache_index_r, 0, INDEX_CONFIG_LEN + 1);
477+ if (len) {
478+ ut_memcpy(innodb_row_cache_index_r, innodb_row_cache_index, len);
479+ }
480+ innodb_row_cache_index_r[len] = '\0';
481+ innodb_row_cache_index = innodb_row_cache_index_r;
482+
483+ free_hash_table_elem(filtered_in_index);
484+ free_hash_table_elem(filtered_out_index);
485+ if (!len) {
486+ has_filter = FALSE;
487+ } else {
488+ has_filter = TRUE;
489+ }
490+ rw_lock_x_unlock(&filter_lock);
491+}
492+
493+static void add_result_to_index_hash(ulint fold, index_id_t tree_id,
494+ my_bool is_in_filter) {
495+ hash_table_t* table;
496+ row_cache_filter_index_value_t* value;
497+ rw_lock_x_lock(&filter_lock);
498+ if (is_in_filter) {
499+ table = filtered_in_index;
500+ } else {
501+ table = filtered_out_index;
502+ }
503+ value =
504+ (row_cache_filter_index_value_t*) mem_alloc(sizeof(row_cache_filter_index_value_t));
505+ if (value) {
506+ value->tree_id = tree_id;
507+ value->chain.value = value;
508+ HASH_INSERT(
509+ /* the type used in the hash chain */
510+ row_cache_filter_index_chain_t,
511+ /* hash_chain->"next" */
512+ next,
513+ /* the hash table */
514+ table,
515+ /* fold */
516+ fold,
517+ /* add this data to the hash */
518+ &value->chain);
519+ }
520+ rw_lock_x_unlock(&filter_lock);
521+}
522+
523+static my_bool is_in_filter_index_hash(ulint fold, index_id_t tree_id,
524+ my_bool is_in_filter) {
525+ hash_table_t* table;
526+ row_cache_filter_index_chain_t* chain = NULL;
527+ if (is_in_filter) {
528+ table = filtered_in_index;
529+ } else {
530+ table = filtered_out_index;
531+ }HASH_SEARCH(
532+ /* hash_chain->"next" */
533+ next,
534+ /* the hash table */
535+ table,
536+ /* fold */
537+ fold,
538+ /* the type of the next variable */
539+ row_cache_filter_index_chain_t*,
540+ /* auxiliary variable */
541+ chain,
542+ /* assertion on every traversed item */
543+ ,
544+ /* this determines if we have found the lock */
545+ (chain->value!=NULL && chain->value->tree_id == tree_id));
546+ if (chain == NULL) {
547+ return (FALSE);
548+ }
549+ return (TRUE);
550+}
551+
552+static int wild_case_compare_wl(const char *str, const char *wildstr,
553+ const size_t length) {
554+ const char* start = wildstr;
555+ char wild_many = '*';
556+ char wild_one = '?';
557+ char wild_prefix = 0; /* QQ this can potentially cause a SIGSEGV */
558+ int flag;
559+ while ((size_t) (wildstr - start) < length) {
560+ while ((size_t) (wildstr - start) < length && *wildstr != wild_many
561+ && *wildstr != wild_one) {
562+ if (*wildstr == wild_prefix
563+ && (size_t) (wildstr + 1 - start) < length)
564+ wildstr++;
565+ if (toupper(*wildstr++) != toupper(*str++))
566+ return (1);
567+ }
568+ if ((size_t) (wildstr - start) >= length)
569+ return (*str != 0);
570+ if (*wildstr++ == wild_one) {
571+ if (!*str++)
572+ return (1); /* One char; skip */
573+ } else { /* Found '*' */
574+ if ((size_t) (wildstr - start) >= length)
575+ return (0); /* '*' as last char: OK */
576+ flag = (*wildstr != wild_many && *wildstr != wild_one);
577+ do {
578+ if (flag) {
579+ char cmp;
580+ if ((cmp = *wildstr) == wild_prefix
581+ && (size_t) (wildstr + 1 - start) < length)
582+ cmp = wildstr[1];
583+ cmp = toupper(cmp);
584+ while (*str && toupper(*str) != cmp)
585+ str++;
586+ if (!*str)
587+ return (1);
588+ }
589+ if (wild_case_compare_wl(str, wildstr,
590+ length - (wildstr - start)) == 0)
591+ return (0);
592+ } while (*str++);
593+ return (1);
594+ }
595+ }
596+ return (*str != '\0');
597+}
598+
599+static my_bool compare_index_config(const char* table_name,
600+ const char* index_name, const char* config) {
601+ char index_sp = ';';
602+ char table_sp = ':';
603+ my_bool is_index_now = FALSE;
604+ my_bool is_table_matching = FALSE;
605+ my_bool is_index_matching = FALSE;
606+ while (*config) {
607+ const char* o_config = config;
608+ size_t length = 0;
609+ while (*config && *config != index_sp && *config != table_sp) {
610+ config++;
611+ length++;
612+ }
613+ if (*config == table_sp) {
614+ if (!is_index_now) {
615+ is_table_matching = !wild_case_compare_wl(table_name, o_config,
616+ length);
617+ is_index_now = TRUE;
618+ } else if (is_table_matching) {
619+ is_index_matching = !wild_case_compare_wl(index_name, o_config,
620+ length);
621+ if (is_index_matching) {
622+ return TRUE;
623+ }
624+ }
625+ } else if (!*config || *config == index_sp) {
626+ if (is_index_now && is_table_matching) {
627+ is_index_matching = !wild_case_compare_wl(index_name, o_config,
628+ length);
629+ if (is_index_matching) {
630+ return TRUE;
631+ }
632+ }
633+ is_index_now = FALSE;
634+ is_table_matching = FALSE;
635+ is_index_matching = FALSE;
636+ }
637+ if (*config) {
638+ config++;
639+ }
640+ }
641+ return FALSE;
642+}
643+
644+static void free_hash_table_elem(hash_table_t* table) {
645+ ulint i;
646+ for (i = 0; i < table->n_cells; i++) {
647+ row_cache_filter_index_chain_t* chain =
648+ (row_cache_filter_index_chain_t*) HASH_GET_FIRST(table,i);
649+ while (chain != NULL) {
650+ row_cache_filter_index_value_t* value = chain->value;
651+ chain = chain->next;
652+ mem_free(value);
653+ }
654+ }
655+ hash_table_clear(table);
656+}
657+
658+#define index_id_get_high(id) ((ulint)(((id)>>32)&0xFFFFFFFF))
659+
660+#define index_id_get_low(id) ((ulint)((id)&0xFFFFFFFF))
661+
662+
663+my_bool is_index_need_cache(const dict_index_t* index) {
664+ const char* table_name = index->table_name;
665+ index_id_t index_id = index->id;
666+ const char* index_name = index->name;
667+ ulint fold = ut_fold_ulint_pair(index_id_get_high(index_id), index_id_get_low(index_id));
668+ my_bool is_in_hash = FALSE;
669+ my_bool ret;
670+ rw_lock_s_lock(&filter_lock);
671+ if (!has_filter) {
672+ is_in_hash = TRUE; //no filter is set, so there is no result to cache
673+ ret = TRUE;
674+ } else if (is_in_filter_index_hash(fold, index_id, TRUE)) {
675+ is_in_hash = TRUE;
676+ ret = TRUE;
677+ } else if (is_in_filter_index_hash(fold, index_id, FALSE)) {
678+ is_in_hash = TRUE;
679+ ret = FALSE;
680+ } else {
681+ ret = compare_index_config(table_name, index_name,
682+ innodb_row_cache_index_r);
683+ }rw_lock_s_unlock(&filter_lock);
684+ if (!is_in_hash) {
685+ add_result_to_index_hash(fold, index_id, ret);
686+ }
687+ return ret;
688+}
689+
690
691Property changes on: storage/innobase/cache/row0cache0filter.c
692___________________________________________________________________
693Added: svn:mime-type
694 + text/plain
695
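Reviewer note: the filter string parsed by compare_index_config() above appears to use ':' between a table pattern and each of its index patterns, and ';' between entries (for example "t1:idx1:idx2;t2:idx3"). That grammar is inferred from the code, not from any documentation. The sketch below is a standalone restatement of the same walk, for illustration only. The names token_cmp() and index_in_config() are hypothetical, and token_cmp() is a simplified stand-in for wild_case_compare_wl(): it only does case-insensitive equality plus a lone "*" token.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for wild_case_compare_wl(): compares a
   NUL-terminated name with a length-delimited token, ignoring case.
   A token consisting of a single '*' matches any name.
   Returns 0 on match, 1 otherwise. */
static int token_cmp(const char *name, const char *tok, size_t len)
{
    size_t i;

    if (len == 1 && tok[0] == '*')
        return 0;
    for (i = 0; i < len; i++) {
        if (name[i] == '\0'
            || toupper((unsigned char) name[i])
               != toupper((unsigned char) tok[i]))
            return 1;
    }
    return name[len] != '\0';
}

/* Returns 1 if (table, index) is listed in config, 0 otherwise.
   Assumed grammar: table:index[:index...][;table:index...] */
static int index_in_config(const char *table, const char *index,
                           const char *config)
{
    const char *p = config;

    while (*p) {
        /* The first token of an entry is the table pattern; an entry
           without any ':' names no index and can never match. */
        size_t len = strcspn(p, ":;");
        int table_ok = (p[len] == ':') && token_cmp(table, p, len) == 0;

        p += len;
        /* Every following ':'-prefixed token is an index pattern. */
        while (*p == ':') {
            p++;
            len = strcspn(p, ":;");
            if (table_ok && token_cmp(index, p, len) == 0)
                return 1;
            p += len;
        }
        if (*p == ';')
            p++;
    }
    return 0;
}
```

Splitting the entry walk from the index walk removes the three state flags used in the patch. Whether that is worth a rewrite is a judgment call for the author.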
696Index: storage/innobase/cache/row0cache0mempool.c
697===================================================================
698--- storage/innobase/cache/row0cache0mempool.c (revision 0)
699+++ storage/innobase/cache/row0cache0mempool.c (revision 724)
700@@ -0,0 +1,800 @@
701+/********************************************************************
702+ created: 2011/03/23
703+ created: 23:3:2011 15:15
704+ file base: row0cache0mempool
705+ file ext: c
706+ author: wentong@taobao.com
707+
708+ purpose:
709+*********************************************************************/
710+#include "row0cache0mempool.h"
711+#include "mem0mem.h"
712+#include "sync0sync.h"
713+#include "row0cache0lru.h"
714+#include "mem0pool.h"
715+#include "row0cache.h"
716+#include "ut0lst.h"
717+#include "os0proc.h"
718+#include "srv0srv.h"
719+
720+//copied from mem0pool.c
721+
722+/** The smallest memory area total size */
723+#define MEM_AREA_MIN_SIZE (2 * MEM_AREA_EXTRA_SIZE)
724+
725+/** Mask used to extract the free bit from area->size */
726+#define MEM_AREA_FREE 1
727+
728+/** Data structure for a memory pool. The space is allocated using the buddy
729+algorithm, where free list i contains areas of size 2 to power i. */
730+struct mem_pool_struct{
731+ byte* buf; /*!< memory pool */
732+ ulint size; /*!< memory common pool size */
733+ ulint reserved; /*!< amount of currently allocated
734+ memory */
735+ mutex_t mutex; /*!< mutex protecting this struct */
736+ UT_LIST_BASE_NODE_T(mem_area_t)
737+ free_list[64]; /*!< lists of free memory areas: an
738+ area is put to the list whose number
739+ is the 2-logarithm of the area size */
740+};
741+
742+/********************************************************************//**
743+Sets memory area size. */
744+static
745+void
746+mem_area_set_size_out(
747+/*==============*/
748+ mem_area_t* area, /*!< in: area */
749+ ulint size) /*!< in: size */
750+{
751+ area->size_and_free = (area->size_and_free & MEM_AREA_FREE)
752+ | size;
753+}
754+
755+/********************************************************************//**
756+Sets memory area free bit. */
757+static
758+void
759+mem_area_set_free_out(
760+/*==============*/
761+mem_area_t* area, /*!< in: area */
762+ibool free) /*!< in: free bit value */
763+{
764+#if TRUE != MEM_AREA_FREE
765+# error "TRUE != MEM_AREA_FREE"
766+#endif
767+ area->size_and_free = (area->size_and_free & ~MEM_AREA_FREE)
768+ | free;
769+}
770+
771+/********************************************************************//**
772+Returns memory area free bit.
773+ @return TRUE if free */
774+static
775+ibool
776+mem_area_get_free_out(
777+/*==============*/
778+mem_area_t* area) /*!< in: area */
779+{
780+#if TRUE != MEM_AREA_FREE
781+# error "TRUE != MEM_AREA_FREE"
782+#endif
783+ return(area->size_and_free & MEM_AREA_FREE);
784+}
785+
786+/********************************************************************//**
787+Returns memory area size.
788+@return size */
789+static
790+ulint
791+mem_area_get_size_out(
792+/*==============*/
793+mem_area_t* area) /*!< in: area */
794+{
795+ return(area->size_and_free & ~MEM_AREA_FREE);
796+}
797+
798+
799+/********************************************************************//**
800+Gets the buddy of an area, if it exists in pool.
801+@return the buddy, NULL if no buddy in pool */
802+static
803+mem_area_t*
804+mem_area_get_buddy_out(
805+/*===============*/
806+ mem_area_t* area, /*!< in: memory area */
807+ ulint size, /*!< in: memory area size */
808+ mem_pool_t* pool) /*!< in: memory pool */
809+{
810+ mem_area_t* buddy;
811+
812+ ut_ad(size != 0);
813+
814+ if (((((byte*)area) - pool->buf) % (2 * size)) == 0) {
815+
816+ /* The buddy is in a higher address */
817+
818+ buddy = (mem_area_t*)(((byte*)area) + size);
819+
820+ if ((((byte*)buddy) - pool->buf) + size > pool->size) {
821+
822+ /* The buddy is not wholly contained in the pool:
823+ there is no buddy */
824+
825+ buddy = NULL;
826+ }
827+ } else {
828+ /* The buddy is in a lower address; NOTE that area cannot
829+ be at the pool lower end, because then we would end up to
830+ the upper branch in this if-clause: the remainder would be
831+ 0 */
832+
833+ buddy = (mem_area_t*)(((byte*)area) - size);
834+ }
835+
836+ return(buddy);
837+}
838+/********************************************************************//**
839+Fills the specified free list.
840+@return TRUE if we were able to insert a block to the free list */
841+static
842+ibool
843+mem_pool_fill_free_list(
844+/*====================*/
845+ ulint i, /*!< in: free list index */
846+ mem_pool_t* pool) /*!< in: memory pool */
847+{
848+ mem_area_t* area;
849+ mem_area_t* area2;
850+ ibool ret;
851+
852+ ut_ad(mutex_own(&(pool->mutex)));
853+
854+ if (UNIV_UNLIKELY(i >= 63)) {
855+ /* We come here when we have run out of space in the
856+ memory pool: */
857+
858+ return(FALSE);
859+ }
860+
861+ area = UT_LIST_GET_FIRST(pool->free_list[i + 1]);
862+
863+ if (area == NULL) {
864+ if (UT_LIST_GET_LEN(pool->free_list[i + 1]) > 0) {
865+ ut_print_timestamp(stderr);
866+
867+ fprintf(stderr,
868+ " InnoDB: Error: mem pool free list %lu"
869+ " length is %lu\n"
870+ "InnoDB: though the list is empty!\n",
871+ (ulong) i + 1,
872+ (ulong)
873+ UT_LIST_GET_LEN(pool->free_list[i + 1]));
874+ }
875+
876+ ret = mem_pool_fill_free_list(i + 1, pool);
877+
878+ if (ret == FALSE) {
879+
880+ return(FALSE);
881+ }
882+
883+ area = UT_LIST_GET_FIRST(pool->free_list[i + 1]);
884+ }
885+
886+ if (UNIV_UNLIKELY(UT_LIST_GET_LEN(pool->free_list[i + 1]) == 0)) {
887+ mem_analyze_corruption(area);
888+
889+ ut_error;
890+ }
891+
892+ UT_LIST_REMOVE(free_list, pool->free_list[i + 1], area);
893+
894+ area2 = (mem_area_t*)(((byte*)area) + ut_2_exp(i));
895+ UNIV_MEM_ALLOC(area2, MEM_AREA_EXTRA_SIZE);
896+
897+ mem_area_set_size_out(area2, ut_2_exp(i));
898+ mem_area_set_free_out(area2, TRUE);
899+
900+ UT_LIST_ADD_FIRST(free_list, pool->free_list[i], area2);
901+
902+ mem_area_set_size_out(area, ut_2_exp(i));
903+
904+ UT_LIST_ADD_FIRST(free_list, pool->free_list[i], area);
905+
906+ return(TRUE);
907+}
908+
909+/********************************************************************//**
910+Creates a memory pool.
911+@return memory pool */
912+static
913+mem_pool_t*
914+mem_pool_create_out(
915+/*============*/
916+ ulint size) /*!< in: pool size in bytes */
917+{
918+ mem_pool_t* pool;
919+ mem_area_t* area;
920+ ulint i;
921+ ulint used;
922+
923+ pool = ut_malloc(sizeof(mem_pool_t));
924+
925+ /* We do not set the memory to zero (FALSE) in the pool,
926+ but only when allocated at a higher level in mem0mem.c.
927+ This is to avoid masking useful Purify warnings. */
928+
929+ pool->buf = ut_malloc_low(size, FALSE, TRUE);
930+ /* pool->buf = os_mem_alloc_large(&size);
931+ if (pool->buf == NULL) {
932+ ut_print_timestamp(stderr);
933+
934+ fprintf(stderr,
935+ " InnoDB: We now intentionally"
936+ " generate a seg fault so that\n"
937+ "InnoDB: on Linux we get a stack trace.\n");
938+
939+ if (*ut_mem_null_ptr) ut_mem_null_ptr = 0;
940+ } */
941+ pool->size = size;
942+
943+ mutex_create(PFS_NOT_INSTRUMENTED, &pool->mutex, SYNC_MEM_POOL);
944+
945+ /* Initialize the free lists */
946+
947+ for (i = 0; i < 64; i++) {
948+
949+ UT_LIST_INIT(pool->free_list[i]);
950+ }
951+
952+ used = 0;
953+
954+ while (size - used >= MEM_AREA_MIN_SIZE) {
955+
956+ i = ut_2_log(size - used);
957+
958+ if (ut_2_exp(i) > size - used) {
959+
960+ /* ut_2_log rounds upward */
961+
962+ i--;
963+ }
964+
965+ area = (mem_area_t*)(pool->buf + used);
966+
967+ mem_area_set_size_out(area, ut_2_exp(i));
968+ mem_area_set_free_out(area, TRUE);
969+ UNIV_MEM_FREE(MEM_AREA_EXTRA_SIZE + (byte*) area,
970+ ut_2_exp(i) - MEM_AREA_EXTRA_SIZE);
971+
972+ UT_LIST_ADD_FIRST(free_list, pool->free_list[i], area);
973+
974+ used = used + ut_2_exp(i);
975+ }
976+
977+ ut_ad(size >= used);
978+
979+ pool->reserved = 0;
980+
981+ return(pool);
982+}
983+
984+/********************************************************************//**
985+Frees a memory pool. */
986+static
987+void
988+mem_pool_free_out(
989+/*==========*/
990+ mem_pool_t* pool) /*!< in, own: memory pool */
991+{
992+ //os_mem_free_large(pool->buf , pool->size);
993+ ut_free(pool->buf);
994+ ut_free(pool);
995+}
996+
997+/********************************************************************//**
998+Allocates memory from a pool. NOTE: this copy of mem_area_alloc() does not
999+fall back to malloc; it is called from ca_malloc_low().
1000+@return own: allocated memory buffer, or NULL if the pool is exhausted */
1001+static
1002+void*
1003+mem_area_alloc_out(
1004+/*===========*/
1005+ ulint* psize, /*!< in: requested size in bytes; for optimum
1006+ space usage, the size should be a power of 2
1007+ minus MEM_AREA_EXTRA_SIZE;
1008+ out: allocated size in bytes (greater than
1009+ or equal to the requested size) */
1010+ mem_pool_t* pool) /*!< in: memory pool */
1011+{
1012+ mem_area_t* area;
1013+ ulint size;
1014+ ulint n;
1015+ ibool ret;
1016+
1017+
1018+ size = *psize;
1019+ n = ut_2_log(ut_max(size + MEM_AREA_EXTRA_SIZE, MEM_AREA_MIN_SIZE));
1020+
1021+ mutex_enter(&(pool->mutex));
1022+
1023+
1024+ area = UT_LIST_GET_FIRST(pool->free_list[n]);
1025+
1026+ if (area == NULL) {
1027+ ret = mem_pool_fill_free_list(n, pool);
1028+
1029+ if (ret == FALSE) {
1030+ /* Out of memory in memory pool: unlike mem0pool.c we do
1031+ not fall back to malloc; ca_malloc_low() evicts and retries: */
1032+
1033+ mutex_exit(&(pool->mutex));
1034+
1035+ return(NULL);
1036+ }
1037+
1038+ area = UT_LIST_GET_FIRST(pool->free_list[n]);
1039+ }
1040+
1041+ if (!mem_area_get_free_out(area)) {
1042+ fprintf(stderr,
1043+ "InnoDB: Error: Removing element from mem pool"
1044+ " free list %lu though the\n"
1045+ "InnoDB: element is not marked free!\n",
1046+ (ulong) n);
1047+
1048+ mem_analyze_corruption(area);
1049+
1050+ /* Try to analyze a strange assertion failure reported at
1051+ mysql@lists.mysql.com where the free bit IS 1 in the
1052+ hex dump above */
1053+
1054+ if (mem_area_get_free_out(area)) {
1055+ fprintf(stderr,
1056+ "InnoDB: Probably a race condition"
1057+ " because now the area is marked free!\n");
1058+ }
1059+
1060+ ut_error;
1061+ }
1062+
1063+ if (UT_LIST_GET_LEN(pool->free_list[n]) == 0) {
1064+ fprintf(stderr,
1065+ "InnoDB: Error: Removing element from mem pool"
1066+ " free list %lu\n"
1067+ "InnoDB: though the list length is 0!\n",
1068+ (ulong) n);
1069+ mem_analyze_corruption(area);
1070+
1071+ ut_error;
1072+ }
1073+
1074+ ut_ad(mem_area_get_size_out(area) == ut_2_exp(n));
1075+
1076+ mem_area_set_free_out(area, FALSE);
1077+
1078+ UT_LIST_REMOVE(free_list, pool->free_list[n], area);
1079+
1080+ pool->reserved += mem_area_get_size_out(area);
1081+
1082+ mutex_exit(&(pool->mutex));
1083+
1084+ ut_ad(mem_pool_validate(pool));
1085+
1086+ *psize = ut_2_exp(n) - MEM_AREA_EXTRA_SIZE;
1087+ UNIV_MEM_ALLOC(MEM_AREA_EXTRA_SIZE + (byte*)area, *psize);
1088+
1089+ return((void*)(MEM_AREA_EXTRA_SIZE + ((byte*)area)));
1090+}
1091+
1092+/********************************************************************//**
1093+Frees memory to a pool. */
1094+static
1095+void
1096+mem_area_free_out(
1097+/*==========*/
1098+ void* ptr, /*!< in, own: pointer to allocated memory
1099+ buffer */
1100+ mem_pool_t* pool) /*!< in: memory pool */
1101+{
1102+ mem_area_t* area;
1103+ mem_area_t* buddy;
1104+ void* new_ptr;
1105+ ulint size;
1106+ ulint n;
1107+
1108+
1109+
1110+ /* It may be that the area was really allocated from the OS with
1111+ regular malloc: check if ptr points within our memory pool */
1112+
1113+ if ((byte*)ptr < pool->buf || (byte*)ptr >= pool->buf + pool->size) {
1114+ ut_free(ptr);
1115+
1116+ return;
1117+ }
1118+
1119+ area = (mem_area_t*) (((byte*)ptr) - MEM_AREA_EXTRA_SIZE);
1120+
1121+ if (mem_area_get_free_out(area)) {
1122+ fprintf(stderr,
1123+ "InnoDB: Error: Freeing element to mem pool"
1124+ " free list though the\n"
1125+ "InnoDB: element is marked free!\n");
1126+
1127+ mem_analyze_corruption(area);
1128+ ut_error;
1129+ }
1130+
1131+ size = mem_area_get_size_out(area);
1132+ UNIV_MEM_FREE(ptr, size - MEM_AREA_EXTRA_SIZE);
1133+
1134+ if (size == 0) {
1135+ fprintf(stderr,
1136+ "InnoDB: Error: Mem area size is 0. Possibly a"
1137+ " memory overrun of the\n"
1138+ "InnoDB: previous allocated area!\n");
1139+
1140+ mem_analyze_corruption(area);
1141+ ut_error;
1142+ }
1143+
1144+#ifdef UNIV_LIGHT_MEM_DEBUG
1145+ if (((byte*)area) + size < pool->buf + pool->size) {
1146+
1147+ ulint next_size;
1148+
1149+ next_size = mem_area_get_size_out(
1150+ (mem_area_t*)(((byte*)area) + size));
1151+ if (UNIV_UNLIKELY(!next_size || !ut_is_2pow(next_size))) {
1152+ fprintf(stderr,
1153+ "InnoDB: Error: Memory area size %lu,"
1154+ " next area size %lu not a power of 2!\n"
1155+ "InnoDB: Possibly a memory overrun of"
1156+ " the buffer being freed here.\n",
1157+ (ulong) size, (ulong) next_size);
1158+ mem_analyze_corruption(area);
1159+
1160+ ut_error;
1161+ }
1162+ }
1163+#endif
1164+ buddy = mem_area_get_buddy_out(area, size, pool);
1165+
1166+ n = ut_2_log(size);
1167+
1168+ mutex_enter(&(pool->mutex));
1169+
1170+
1171+ if (buddy && mem_area_get_free_out(buddy)
1172+ && (size == mem_area_get_size_out(buddy))) {
1173+
1174+ /* The buddy is in a free list */
1175+
1176+ if ((byte*)buddy < (byte*)area) {
1177+ new_ptr = ((byte*)buddy) + MEM_AREA_EXTRA_SIZE;
1178+
1179+ mem_area_set_size_out(buddy, 2 * size);
1180+ mem_area_set_free_out(buddy, FALSE);
1181+ } else {
1182+ new_ptr = ptr;
1183+
1184+ mem_area_set_size_out(area, 2 * size);
1185+ }
1186+
1187+ /* Remove the buddy from its free list and merge it to area */
1188+
1189+ UT_LIST_REMOVE(free_list, pool->free_list[n], buddy);
1190+
1191+ pool->reserved += ut_2_exp(n);
1192+
1193+ mutex_exit(&(pool->mutex));
1194+
1195+ mem_area_free_out(new_ptr, pool);
1196+
1197+ return;
1198+ } else {
1199+ UT_LIST_ADD_FIRST(free_list, pool->free_list[n], area);
1200+
1201+ mem_area_set_free_out(area, TRUE);
1202+
1203+ ut_ad(pool->reserved >= size);
1204+
1205+ pool->reserved -= size;
1206+ }
1207+
1208+ mutex_exit(&(pool->mutex));
1209+
1210+ ut_ad(mem_pool_validate(pool));
1211+}
1212+
1213+//end copy
1214+
1215+UNIV_INTERN my_bool innodb_row_cache_use_sys_malloc = FALSE;
1216+
1217+UNIV_INTERN llong innodb_row_cache_mem_pool_size = 1024 * 1024L; //default is 1M
1218+
1219+UNIV_INTERN llong innodb_row_cache_additional_mem_pool_size = 1024 * 1024L;//default is 1M
1220+
1221+//mem pool
1222+static mem_pool_t** row_cache_mem_pool = NULL;
1223+
1224+//system malloc stat
1225+static llong* sys_malloc_mem_size = NULL;
1226+
1227+static llong* sys_malloc_mem_used = NULL;
1228+
1229+static ROW_CACHE_VALUE_LIST_BASE *innodb_row_cache_value_mem_list = NULL;
1230+
1231+static row_cache_value_t *innodb_row_cache_value_mem_pool = NULL;
1232+
1233+static ROW_CACHE_VALUE_QUEUE_LIST_BASE *innodb_row_cache_queue_mem_list = NULL;
1234+
1235+static row_cache_value_queue_t *innodb_row_cache_queue_mem_pool = NULL;
1236+
1237+void init_row_cache_mem_pool(my_bool innodb_row_cache_on)
1238+{
1239+ if(!innodb_row_cache_on){
1240+ innodb_row_cache_mem_pool_size = 1;
1241+ }
1242+ if (sizeof(ulint) == 4) {
1243+ if (innodb_row_cache_mem_pool_size > UINT_MAX32) {
1244+ ut_print_timestamp(stderr);
1245+ fprintf(stderr,
1246+ "[Error]innodb_row_cache_mem_pool_size can't be over 4GB"
1247+ " on 32-bit systems\n");
1248+ }
1249+ }
1250+ if (innodb_row_cache_mutex_num > 0){
1251+ long long mem_pool_size = 0;
1252+ ulint i;
1253+ //init row cache value mem pool
1254+ ulint row_cache_value_mem_list_size = 0;
1255+ ulint row_cache_mem_pool_value_num = innodb_row_cache_additional_mem_pool_size / sizeof(row_cache_value_t);
1256+ ulint row_cache_max_queue_num;
1257+ if (row_cache_mem_pool_value_num>0){
1258+ row_cache_value_mem_list_size = innodb_row_cache_mutex_num * sizeof(ROW_CACHE_VALUE_LIST_BASE);
1259+ //create row cache mem list
1260+ innodb_row_cache_value_mem_list = (ROW_CACHE_VALUE_LIST_BASE*) ut_malloc(row_cache_value_mem_list_size);
1261+ innodb_row_cache_value_mem_pool = (row_cache_value_t*) ut_malloc(row_cache_mem_pool_value_num * sizeof(row_cache_value_t));
1262+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1263+ {
1264+ UT_LIST_INIT(*(innodb_row_cache_value_mem_list+i));
1265+ }
1266+ for (i = 0 ; i < row_cache_mem_pool_value_num ; i++)
1267+ {
1268+ ROW_CACHE_VALUE_LIST_BASE* value_list = innodb_row_cache_value_mem_list + (i % innodb_row_cache_mutex_num);
1269+ row_cache_value_t* value = innodb_row_cache_value_mem_pool + i;
1270+ memset(value , 0 , sizeof(row_cache_value_t));
1271+ UT_LIST_ADD_FIRST(list,*value_list,value);
1272+ }
1273+ }
1274+ //create row cache queue mem list
1275+ innodb_row_cache_queue_mem_list = (ROW_CACHE_VALUE_QUEUE_LIST_BASE*) ut_malloc(innodb_row_cache_mutex_num * sizeof(ROW_CACHE_VALUE_QUEUE_LIST_BASE));
1276+ row_cache_max_queue_num = srv_thread_concurrency * 2 * innodb_row_cache_mutex_num;
1277+ innodb_row_cache_queue_mem_pool = (row_cache_value_queue_t*) ut_malloc(row_cache_max_queue_num * sizeof(row_cache_value_queue_t));
1278+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1279+ {
1280+ UT_LIST_INIT(*(innodb_row_cache_queue_mem_list+i));
1281+ }
1282+ for (i = 0 ; i < row_cache_max_queue_num ; i++)
1283+ {
1284+ ROW_CACHE_VALUE_QUEUE_LIST_BASE* list_base = innodb_row_cache_queue_mem_list + (i % innodb_row_cache_mutex_num);
1285+ row_cache_value_queue_t* value = innodb_row_cache_queue_mem_pool + i;
1286+ memset(value , 0 , sizeof(row_cache_value_queue_t));
1287+ UT_LIST_ADD_FIRST(list,*list_base,value);
1288+ }
1289+
1290+ if(!innodb_row_cache_use_sys_malloc){
1291+ //init general mem pool
1292+ row_cache_mem_pool = (mem_pool_t**) ut_malloc(innodb_row_cache_mutex_num * sizeof(mem_pool_t*));
1293+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1294+ {
1295+ row_cache_mem_pool[i] = mem_pool_create_out(innodb_row_cache_mem_pool_size / innodb_row_cache_mutex_num);
1296+ mem_pool_size += row_cache_mem_pool[i]->size;
1297+ }
1298+ //set to real alloc size!
1299+ innodb_row_cache_mem_pool_size = mem_pool_size;
1300+ }else{
1301+ //init system malloc mem stat
1302+ sys_malloc_mem_size = (llong*) ut_malloc(innodb_row_cache_mutex_num * sizeof(llong));
1303+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1304+ {
1305+ sys_malloc_mem_size[i] = innodb_row_cache_mem_pool_size / innodb_row_cache_mutex_num;
1306+ }
1307+ sys_malloc_mem_used = (llong*) ut_malloc(innodb_row_cache_mutex_num * sizeof(llong));
1308+ memset(sys_malloc_mem_used , 0 , innodb_row_cache_mutex_num * sizeof(llong));
1309+ }
1310+ }
1311+}
1312+
1313+void deinit_row_cache_mem_pool(){
1314+ if (innodb_row_cache_mutex_num > 0){
1315+ if(!innodb_row_cache_use_sys_malloc){
1316+ ulint i;
1317+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++){
1318+ //mutex_free(&row_cache_mem_pool[i]->mutex); //free in sync_close()
1319+ mem_pool_free_out(row_cache_mem_pool[i]);
1320+ }
1321+ ut_free(row_cache_mem_pool);
1322+ row_cache_mem_pool=NULL;
1323+ }else{
1324+ ut_free(sys_malloc_mem_size); sys_malloc_mem_size = NULL;
1325+ ut_free(sys_malloc_mem_used); sys_malloc_mem_used = NULL;
1326+ }
1327+ }
1328+ if(innodb_row_cache_value_mem_pool){
1329+ ut_free(innodb_row_cache_value_mem_pool);
1330+ innodb_row_cache_value_mem_pool=NULL;
1331+ }
1332+ if(innodb_row_cache_value_mem_list){
1333+ ut_free(innodb_row_cache_value_mem_list);
1334+ innodb_row_cache_value_mem_list=NULL;
1335+ }
1336+ if(innodb_row_cache_queue_mem_pool){
1337+ ut_free(innodb_row_cache_queue_mem_pool);
1338+ innodb_row_cache_queue_mem_pool=NULL;
1339+ }
1340+ if(innodb_row_cache_queue_mem_list){
1341+ ut_free(innodb_row_cache_queue_mem_list);
1342+ innodb_row_cache_queue_mem_list=NULL;
1343+ }
1344+}
1345+
1346+static mem_pool_t* get_current_mem_pool(const ulint fold){
1347+ return row_cache_mem_pool[row_cache_get_mutex_no(fold)];
1348+}
1349+
1350+void* ca_malloc_low(ulint n , const ulint used_fold)
1351+{
1352+ void* ret = NULL;
1353+ if(!innodb_row_cache_use_sys_malloc){
1354+ mem_pool_t* mem_pool = get_current_mem_pool(used_fold);
1355+ ret=mem_area_alloc_out(&n,mem_pool);
1356+ if(ret==NULL){
1357+ free_from_lru(n,used_fold);
1358+ ret=mem_area_alloc_out(&n,mem_pool);
1359+ }
1360+ }else{
1361+ ulint no = row_cache_get_mutex_no(used_fold);
1362+ if(sys_malloc_mem_size[no] - sys_malloc_mem_used[no] > (llong) n){
1363+ ret = malloc(n);
1364+ }
1365+ if(ret==NULL){
1366+ free_from_lru(n,used_fold);
1367+ ret=malloc(n);
1368+ }
1369+ if(ret){
1370+ sys_malloc_mem_used[no] += n;
1371+ }
1372+ }
1373+ return ret;
1374+}
1375+
1376+void ca_free_low( void* ptr , const ulint size ,const ulint used_fold)
1377+{
1378+ if(!innodb_row_cache_use_sys_malloc){
1379+ mem_area_free_out(ptr,get_current_mem_pool(used_fold));
1380+ }else{
1381+ ulint no = row_cache_get_mutex_no(used_fold);
1382+ free(ptr);
1383+ sys_malloc_mem_used[no] -= size;
1384+ }
1385+}
1386+
1387+static ROW_CACHE_VALUE_LIST_BASE* get_current_value_list(const ulint fold){
1388+ if(innodb_row_cache_value_mem_list){
1389+ return innodb_row_cache_value_mem_list + row_cache_get_mutex_no(fold);
1390+ }
1391+ return NULL;
1392+}
1393+
1394+row_cache_value_t* ca_malloc_for_value( const ulint used_fold )
1395+{
1396+ row_cache_value_t* value = NULL;
1397+ ROW_CACHE_VALUE_LIST_BASE* value_list = get_current_value_list(used_fold);
1398+ if(value_list && UT_LIST_GET_LEN(*value_list) > 0){
1399+ value = UT_LIST_GET_FIRST(*value_list);
1400+ UT_LIST_REMOVE(list,*value_list,value);
1401+ memset(value , 0 , sizeof(row_cache_value_t));
1402+ //mark the value as coming from the value mem pool
1403+ onBit(value->flag,FLAG_VALUE_IS_FROM_VALUE_POOL);
1404+ }else{
1405+ value = (row_cache_value_t*) ca_malloc(sizeof(row_cache_value_t) , used_fold);
1406+ if(value){
1407+ memset(value , 0 , sizeof(row_cache_value_t));
1408+ }
1409+ }
1410+ return value;
1411+}
1412+
1413+void ca_free_for_value( row_cache_value_t* value, const ulint used_fold )
1414+{
1415+ if(isValueFromValuePool(value->flag)){
1416+ ROW_CACHE_VALUE_LIST_BASE* value_list = get_current_value_list(used_fold);
1417+ UT_LIST_ADD_FIRST(list,*value_list,value);
1418+ }else{
1419+ ca_free(value,sizeof(row_cache_value_t),used_fold);
1420+ }
1421+}
1422+
1423+static ROW_CACHE_VALUE_QUEUE_LIST_BASE* get_current_value_queue_list(const ulint fold){
1424+ if(innodb_row_cache_queue_mem_list){
1425+ return innodb_row_cache_queue_mem_list + row_cache_get_mutex_no(fold);
1426+ }
1427+ return NULL;
1428+}
1429+
1430+
1431+row_cache_value_queue_t* ca_malloc_for_queue( const ulint used_fold )
1432+{
1433+ row_cache_value_queue_t* value = NULL;
1434+ ROW_CACHE_VALUE_QUEUE_LIST_BASE* list_base = get_current_value_queue_list(used_fold);
1435+ if(list_base && UT_LIST_GET_LEN(*list_base) > 0){
1436+ value = UT_LIST_GET_FIRST(*list_base);
1437+ UT_LIST_REMOVE(list,*list_base,value);
1438+ memset(value , 0 , sizeof(row_cache_value_queue_t));
1439+ }else{
1440+ value = (row_cache_value_queue_t*) ca_malloc(sizeof(row_cache_value_queue_t) , used_fold);
1441+ if(value){
1442+ memset(value , 0 , sizeof(row_cache_value_queue_t));
1443+ }
1444+ }
1445+ return value;
1446+
1447+}
1448+
1449+void ca_free_for_queue( row_cache_value_queue_t* value, const ulint used_fold )
1450+{
1451+ ROW_CACHE_VALUE_QUEUE_LIST_BASE* list_base = get_current_value_queue_list(used_fold);
1452+ UT_LIST_ADD_FIRST(list,*list_base,value);
1453+}
1454+
1455+
1456+ulint row_cache_mem_pool_used()
1457+{
1458+ ulint ret = 0;
1459+ ulint i;
1460+ if(!innodb_row_cache_use_sys_malloc){
1461+ if(row_cache_mem_pool){
1462+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1463+ {
1464+ ret += mem_pool_get_reserved(row_cache_mem_pool[i]);
1465+ }
1466+ }
1467+ }else{
1468+ if(sys_malloc_mem_used){
1469+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1470+ {
1471+ ret += sys_malloc_mem_used[i];
1472+ }
1473+ }
1474+ }
1475+ return ret;
1476+}
1477+
1478+ulint row_cache_get_value_free_count(){
1479+ ulint ret = 0;
1480+ ulint i;
1481+ if(innodb_row_cache_value_mem_list){
1482+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1483+ {
1484+ ret += UT_LIST_GET_LEN(innodb_row_cache_value_mem_list[i]);
1485+ }
1486+ }
1487+ return ret;
1488+}
1489+
1490+ulint row_cache_get_queue_free_count(){
1491+ ulint ret = 0;
1492+ ulint i;
1493+ if(innodb_row_cache_queue_mem_list){
1494+ for (i = 0 ; i < innodb_row_cache_mutex_num ; i++)
1495+ {
1496+ ret += UT_LIST_GET_LEN(innodb_row_cache_queue_mem_list[i]);
1497+ }
1498+ }
1499+ return ret;
1500+}
1501
1502Property changes on: storage/innobase/cache/row0cache0mempool.c
1503___________________________________________________________________
1504Added: svn:mime-type
1505 + text/plain
1506
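Reviewer note: the two pieces of arithmetic that the copied buddy allocator in row0cache0mempool.c depends on are the buddy address computation (mem_area_get_buddy_out) and the way mem_pool_create_out carves the buffer into power-of-two areas. The sketch below restates both on plain offsets instead of pointers, as an illustration only. The function names and the min_block parameter are hypothetical; in the patch the minimum is MEM_AREA_MIN_SIZE.

```c
#include <stddef.h>

/* Offset of the buddy of the block at offset `off` with size `size`
   (a power of two), or (size_t) -1 if the buddy would not lie wholly
   inside a pool of pool_size bytes. */
static size_t buddy_offset(size_t off, size_t size, size_t pool_size)
{
    if (off % (2 * size) == 0) {
        /* Block is the lower half of its pair: buddy is just above. */
        size_t buddy = off + size;

        return (buddy + size > pool_size) ? (size_t) -1 : buddy;
    }
    /* Block is the upper half of its pair: buddy is just below. */
    return off - size;
}

/* Largest i such that 2^i <= n, for n > 0. */
static unsigned floor_log2(size_t n)
{
    unsigned i = 0;

    while (n >>= 1)
        i++;
    return i;
}

/* Splits pool_size into power-of-two blocks, largest first, stopping
   when less than min_block bytes remain. Writes each block's exponent
   to out[] and returns the number of blocks written (at most out_cap). */
static size_t carve_pool(size_t pool_size, size_t min_block,
                         unsigned *out, size_t out_cap)
{
    size_t used = 0;
    size_t n = 0;

    while (pool_size - used >= min_block && n < out_cap) {
        unsigned i = floor_log2(pool_size - used);

        out[n++] = i;
        used += (size_t) 1 << i;
    }
    return n;
}
```

For a 1000-byte pool with a 16-byte minimum this yields blocks of 512, 256, 128, 64 and 32 bytes, and the last 8 bytes stay unused. Because each per-mutex pool created in init_row_cache_mem_pool() is carved this way, pool sizes that are not powers of two lose a small tail per pool.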
1507Index: storage/innobase/cache/row0cache.c
1508===================================================================
1509--- storage/innobase/cache/row0cache.c (revision 0)
1510+++ storage/innobase/cache/row0cache.c (revision 724)
1511@@ -0,0 +1,385 @@
1512+/********************************************************************
1513+ created: 2011/03/17
1514+ created: 17:3:2011 16:51
1515+ file base: row0cache
1516+ file ext: c
1517+ author: wentong@taobao.com
1518+
1519+ purpose: for row cache
1520+ *********************************************************************/
1521+#include "row0cache.h"
1522+#include "row0cache0mempool.h"
1523+#include "row0cache0hash.h"
1524+#include "mtr0mtr.h"
1525+#include "rem0rec.h"
1526+#include "row0cache0lru.h"
1527+#include "trx0types.h"
1528+#include "page0page.h"
1529+#include "log0recv.h"
1530+#include "read0read.h"
1531+#include "rem0cmp.h"
1532+#include "row0cache0filter.h"
1533+UNIV_INTERN my_bool innodb_row_cache_on = FALSE;
1534+
1535+static row_cache_stat_t _row_cache_stat;
1536+UNIV_INTERN row_cache_stat_t* row_cache_stat = &_row_cache_stat;
1537+
1538+void init_row_cache() {
1539+ DBUG_ENTER("init_row_cache");
1540+ memset(row_cache_stat, 0, sizeof(row_cache_stat_t));
1541+ init_row_cache_hash(innodb_row_cache_on);
1542+ init_innodb_row_cache_lru();
1543+ init_row_cache_mem_pool(innodb_row_cache_on);
1544+ init_row_cache_filter(innodb_row_cache_on);
1545+ //init_row_cache_lock_pool(1000);//TODO
1546+ DBUG_VOID_RETURN;
1547+}
1548+
1549+void deinit_row_cache() {
1550+ DBUG_ENTER("deinit_row_cache");
1551+ deinit_innodb_row_cache_lru();
1552+ deinit_row_cache_hash(innodb_row_cache_on);
1553+ deinit_row_cache_mem_pool();
1554+ /* deinit_row_cache_filter() is not called here; the filter is deinitialized in innobase_shutdown_for_mysql() */
1555+ DBUG_VOID_RETURN;
1556+}
1557+
1558+static row_cache_value_t* create_row_cache_value(const rec_t* rec,
1559+ const ulint* offsets, ulint fold, const dict_index_t* index,
1560+ ibool is_sec_index) {
1561+ ulint buf_size;
1562+ row_cache_value_t* value = ca_malloc_for_value(fold);
1563+ if (value == NULL) {
1564+ //not enough mem
1565+ ut_print_timestamp(stderr);
1566+ fprintf(
1567+ stderr,
1568+ "[Warning] malloc of row_cache_value_t failed in create_row_cache_value!\n");
1569+ return NULL;
1570+ }
1571+ buf_size = rec_offs_extra_size(offsets) + rec_offs_data_size(offsets);
1572+ if (is_sec_index) {
1573+ buf_size += sizeof(trx_id_t);
1574+ }
1575+ value->buf = (rec_t*) ca_malloc(buf_size , fold);
1576+ if (value->buf == NULL) {
1577+ //not enough mem
1578+ ut_print_timestamp(stderr);
1579+ fprintf(
1580+ stderr,
1581+ "[Warning] malloc of rec copy buf failed in create_row_cache_value!\n");
1582+ ca_free_for_value(value, fold);
1583+ return NULL;
1584+ }
1585+ memset(value->buf, 0, buf_size);
1586+ value->fold = fold;
1587+ value->tree_id = index->id;
1588+ value->table_id = index->table->id;
1589+ value->rec = rec_copy(value->buf, rec, offsets);
1590+ if (is_sec_index) {
1591+ trx_id_t* trx_id_in_rec = (trx_id_t*) (value->buf + buf_size
1592+ - sizeof(trx_id_t));
1593+ *trx_id_in_rec = page_get_max_trx_id(page_align(rec));
1594+ }
1595+ value->buf_size = buf_size;
1596+ value->ref_num = 0;
1597+ value->chain.value = value;
1598+ return value;
1599+}
1600+
1601+static int update_row_cache_value(row_cache_value_t* value, const rec_t* rec,
1602+ const ulint* offsets, ulint fold, const dict_index_t* index,
1603+ ibool is_sec_index) {
1604+ ulint buf_size;
1605+ ca_free(value->buf, value->buf_size, fold);
1606+ buf_size = rec_offs_extra_size(offsets) + rec_offs_data_size(offsets);
1607+ if (is_sec_index) {
1608+ buf_size += sizeof(trx_id_t);
1609+ }
1610+ value->buf = (rec_t*) ca_malloc(buf_size , fold);
1611+ if (value->buf == NULL) {
1612+ //not enough mem
1613+ ut_print_timestamp(stderr);
1614+ fprintf(
1615+ stderr,
1616+ "[Warning] malloc of rec copy buf failed in update_row_cache_value!\n");
1617+ return NOT_ENOUGH_MEM;
1618+ }
1619+ memset(value->buf, 0, buf_size);
1620+ value->fold = fold;
1621+ value->tree_id = index->id;
1622+ value->table_id = index->table->id;
1623+ value->rec = rec_copy(value->buf, rec, offsets);
1624+ if (is_sec_index) {
1625+ trx_id_t* trx_id_in_rec = (trx_id_t*) (value->buf + buf_size
1626+ - sizeof(trx_id_t));
1627+ *trx_id_in_rec = page_get_max_trx_id(page_align(rec));
1628+ }
1629+ value->buf_size = buf_size;
1630+ //turn off is_removed
1631+ offBit(value->flag, FLAG_VALUE_IS_REMOVED);
1632+ value->ref_num = 0;
1633+ return 0;
1634+}
1635+
1636+void put_rec_to_row_cache(const dtuple_t* tuple, const rec_t* rec, /*!< in: physical record */
1637+const ulint* offsets, /*!< in: array returned by rec_get_offsets() */
1638+const dict_index_t* index, ibool is_sec_index) {
1639+ row_cache_value_t* value;
1640+ ulint fold;
1641+ if (!innodb_row_cache_on) {
1642+ return;
1643+ }
1644+ if (!is_index_need_cache(index)) {
1645+ return;
1646+ }
1647+ //a delete-marked rec must not be put into the row cache
1648+ if (rec_get_deleted_flag(rec, dict_table_is_comp(index->table))) {
1649+ return;
1650+ }
1651+ fold = calc_fold_by_rec(rec, offsets, dict_index_get_n_unique(index),
1652+ index->id);
1653+ row_cache_enter_mutex(fold);
1654+ value = search_row_cache_value(tuple, index, fold);
1655+ if (value == NULL) {
1656+ value = create_row_cache_value(rec, offsets, fold, index, is_sec_index);
1657+ if (value != NULL) {
1658+ insert_row_cache_value(fold, value);
1659+ add_row_cache_value_to_lru(value);
1660+ }
1661+ } else {
1662+ if (isValueRemoved(value->flag) && value->ref_num == 0) {
1663+ //can be overwritten
1664+ if (update_row_cache_value(value, rec, offsets, fold, index,
1665+ is_sec_index) == NOT_ENOUGH_MEM) {
1666+ delete_row_cache_value(value);
1667+ remove_row_cache_value_from_lru(value);
1668+ ca_free_for_value(value, fold); //value->buf was already freed in update_row_cache_value()
1669+ } else {
1670+ make_row_cache_value_first_from_lru(value);
1671+ }
1672+ }
1673+ }
1674+ row_cache_exit_mutex(fold);
1675+}
1676+
1677+static int add_row_cache_value_to_mtr(mtr_t* mtr, row_cache_value_t* value) {
1678+ row_cache_value_queue_t* row_cache_value_queue;
1679+ row_cache_value_queue = (row_cache_value_queue_t*) ca_malloc_for_queue(
1680+ value->fold);
1681+ if (NULL == row_cache_value_queue) {
1682+ ut_print_timestamp(stderr);
1683+ fprintf(
1684+ stderr,
1685+ "[Warning] malloc of row_cache_value_queue_t failed in add_row_cache_value_to_mtr!\n");
1686+ return NOT_ENOUGH_MEM;
1687+ }
1688+ row_cache_value_queue->value = value;
1689+ UT_LIST_ADD_LAST(list, mtr->row_cache_value_queue_base,
1690+ row_cache_value_queue);
1691+ return 0;
1692+}
1693+
1694+void release_row_cache_value_in_mtr(mtr_t* mtr) {
1695+ row_cache_value_queue_t* row_cache_value_queue;
1696+ if (!innodb_row_cache_on) {
1697+ return;
1698+ }
1699+ while (mtr->row_cache_value_queue_base.count != 0) {
1700+ ulint fold;
1701+ row_cache_value_queue = mtr->row_cache_value_queue_base.end;
1702+ //release Reference
1703+ fold = row_cache_value_queue->value->fold;
1704+ row_cache_enter_mutex(fold);
1705+ row_cache_value_queue->value->ref_num--;
1706+ UT_LIST_REMOVE(list, mtr->row_cache_value_queue_base,
1707+ row_cache_value_queue);
1708+ ca_free_for_queue(row_cache_value_queue, fold);
1709+ row_cache_exit_mutex(fold);
1710+ }
1711+}
1712+
1713+static void free_row_cache_value(row_cache_value_t* value) {
1714+ ulint fold = value->fold;
1715+ ca_free(value->buf, value->buf_size, fold);
1716+ ca_free_for_value(value, fold);
1717+}
1718+
1719+int contain_row_cache_low(const ulint fold, const dtuple_t* tuple,
1720+ const dict_index_t* index) {
1721+ row_cache_value_t* value;
1722+ int ret = 0;
1723+ if (!innodb_row_cache_on) {
1724+ return ret;
1725+ }
1726+ if (!is_index_need_cache(index)) {
1727+ return ret;
1728+ }
1729+ row_cache_enter_mutex(fold);
1730+ value = search_row_cache_value(tuple, index, fold);
1731+ if (value != NULL && !isValueRemoved(value->flag)) {
1732+ ret = 1;
1733+ }
1734+ row_cache_exit_mutex(fold);
1735+ return ret;
1736+}
1737+
1738+rec_t* get_from_row_cache_low(const ulint fold, const dtuple_t* tuple,
1739+ const dict_index_t* index, mtr_t* mtr) {
1740+ row_cache_value_t* value;
1741+ if (!innodb_row_cache_on) {
1742+ return NULL;
1743+ }
1744+ if (!is_index_need_cache(index)) {
1745+ return NULL;
1746+ }
1747+ row_cache_stat->n_get++;
1748+ row_cache_enter_mutex(fold);
1749+ value = search_row_cache_value(tuple, index, fold);
1750+ if (value != NULL) {
1751+ if (isValueRemoved(value->flag) && value->ref_num == 0) {
1752+ //can be freed
1753+ delete_row_cache_value(value);
1754+ remove_row_cache_value_from_lru(value);
1755+ free_row_cache_value(value);
1756+ value = NULL;
1757+ } else if (!isValueRemoved(value->flag)) {
1758+ //can be read
1759+ if (add_row_cache_value_to_mtr(mtr, value) == 0) {
1760+ make_row_cache_value_first_from_lru(value);
1761+ value->ref_num++;
1762+ } else {
1763+ //not enough memory
1764+ value = NULL;
1765+ }
1766+
1767+ } else {
1768+ // is removed, so it can't be read
1769+ value = NULL;
1770+ }
1771+ }
1772+ row_cache_exit_mutex(fold);
1773+ if (value != NULL) {
1774+ row_cache_stat->geted++;
1775+ return value->rec;
1776+ }
1777+ return NULL;
1778+}
1779+
1780+void remove_from_row_cache_low(const ulint fold, const rec_t* rec,
1781+ const ulint* offsets, const dtuple_t* tuple, dict_index_t* index) {
1782+ row_cache_value_t* value = NULL;
1783+ if (!innodb_row_cache_on) {
1784+ return;
1785+ }
1786+ row_cache_enter_mutex(fold);
1787+ if (rec && offsets) {
1788+ value = search_row_cache_value_with_rec(rec, offsets, index, fold);
1789+ } else if (tuple) {
1790+ value = search_row_cache_value(tuple, index, fold);
1791+ }
1792+ if (value != NULL) {
1793+ if (value->ref_num == 0) {
1794+ //can be freed
1795+ delete_row_cache_value(value);
1796+ remove_row_cache_value_from_lru(value);
1797+ free_row_cache_value(value);
1798+ value = NULL;
1799+ } else {
1800+ //still referenced: just set is_removed = 1
1801+ onBit(value->flag, FLAG_VALUE_IS_REMOVED);
1802+ }
1803+ }
1804+ row_cache_exit_mutex(fold);
1805+}
1806+
1807+ulint calc_fold_by_rec(const rec_t* rec, /*!< in: the physical record */
1808+const ulint* offsets, /*!< in: array returned by rec_get_offsets() */
1809+ulint n_fields, /*!< in: number of complete fields to fold */
1810+index_id_t tree_id) {
1811+ return rec_fold(rec, offsets, n_fields, 0, tree_id);
1812+}
1813+
1814+ulint lock_sec_rec_in_row_cache_cons_read_sees(const rec_t* rec,
1815+ const ulint* offsets, const read_view_t* view) {
1816+ trx_id_t* max_trx_id;
1817+ if (recv_recovery_is_on()) {
1818+
1819+ return (FALSE);
1820+ }
1821+
1822+ max_trx_id = (trx_id_t*) (rec + rec_offs_data_size(offsets));
1823+ ut_ad((*max_trx_id)!=0);
1824+
1825+ return (*max_trx_id) < (view->up_limit_id);
1826+// return(ut_dulint_cmp( *max_trx_id, view->up_limit_id) < 0);
1827+}
1828+
1829+void row_cache_refresh_stats() {
1830+ row_cache_stat->last_printout_time = time(NULL);
1831+ row_cache_stat->old_geted = row_cache_stat->geted;
1832+ row_cache_stat->old_n_get = row_cache_stat->n_get;
1833+ row_cache_lru_stat->old_n_add = row_cache_lru_stat->n_add;
1834+ row_cache_lru_stat->old_n_evict = row_cache_lru_stat->n_evict;
1835+ row_cache_lru_stat->old_n_make_first = row_cache_lru_stat->n_make_first;
1836+}
1837+
1838+void print_row_cache_stats(FILE* file) {
1839+ time_t current_time;
1840+ double time_elapsed;
1841+ ulint n_gets_diff = 0;
1842+ unsigned long long n_geted_diff = 0;
1843+ unsigned long long mem_pool_used = row_cache_mem_pool_used();
1844+
1845+ fputs("----------------------\n"
1846+ "ROW CACHE\n"
1847+ "----------------------\n", file);
1848+ fprintf(file, "Total memory allocated " ULINTPF
1849+ "; used " ULINTPF " (" ULINTPF " / 1000)"
1850+ "; additional pool allocated " ULINTPF
1851+ "; Total LRU count " ULINTPF"\n", (ulong) innodb_row_cache_mem_pool_size,
1852+ (ulong) mem_pool_used,
1853+ (ulong) (1000 * mem_pool_used / innodb_row_cache_mem_pool_size),
1854+ (ulong) innodb_row_cache_additional_mem_pool_size,
1855+ (ulong) get_row_cache_lru_count());
1856+
1857+ fprintf(file, "Free Value Count " ULINTPF
1858+ "; Free Queue Count " ULINTPF "\n",
1859+ (ulong) row_cache_get_value_free_count(),
1860+ (ulong) row_cache_get_queue_free_count());
1861+
1862+ current_time = time(NULL);
1863+ time_elapsed = 0.001
1864+ + difftime(current_time, row_cache_stat->last_printout_time);
1865+
1866+ n_geted_diff = row_cache_stat->geted - row_cache_stat->old_geted;
1867+ n_gets_diff = row_cache_stat->n_get - row_cache_stat->old_n_get;
1868+
1869+ fprintf(
1870+ file,
1871+ "Row total add " ULINTPF " , %.2f add/s \n"
1872+ "Row total make first " ULINTPF " , %.2f mf/s \n"
1873+ "Row total evict " ULINTPF " , %.2f evict/s \n"
1874+ "Row read from cache " ULINTPF ", %.2f read/s \n"
1875+ "Row get from cache " ULINTPF ", %.2f get/s \n",
1876+ (ulong) row_cache_lru_stat->n_add,
1877+ (row_cache_lru_stat->n_add - row_cache_lru_stat->old_n_add)
1878+ / time_elapsed,
1879+ (ulong) row_cache_lru_stat->n_make_first,
1880+ (row_cache_lru_stat->n_make_first
1881+ - row_cache_lru_stat->old_n_make_first) / time_elapsed,
1882+ (ulong) row_cache_lru_stat->n_evict,
1883+ (row_cache_lru_stat->n_evict - row_cache_lru_stat->old_n_evict)
1884+ / time_elapsed, (ulong) row_cache_stat->n_get,
1885+ n_gets_diff / time_elapsed, (ulong) row_cache_stat->geted,
1886+ n_geted_diff / time_elapsed);
1887+
1888+ if (n_gets_diff) {
1889+ fprintf(file, "Row cache hit rate %lu / 1000 \n",
1890+ (ulong) (1000 * n_geted_diff) / n_gets_diff);
1891+ } else {
1892+ fputs("No row cache gets since the last printout\n", file);
1893+ }
1894+
1895+ row_cache_refresh_stats();
1896+}
1897
1898Property changes on: storage/innobase/cache/row0cache.c
1899___________________________________________________________________
1900Added: svn:mime-type
1901 + text/plain
1902
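The read and remove paths in row0cache.c above follow a pin-count protocol: get_from_row_cache_low() bumps ref_num while a mini-transaction holds the record, remove_from_row_cache_low() frees the entry only when ref_num is zero and otherwise just sets FLAG_VALUE_IS_REMOVED, and a later lookup reclaims a removed entry once nobody pins it. Below is a minimal standalone sketch of that protocol. It assumes a single slot instead of the hash table, and it omits the mutexes and the LRU list; the names entry_t, cache_put, cache_get, cache_release, cache_remove and cache_is_empty are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FLAG_REMOVED (1 << 1)

/* Hypothetical stand-in for row_cache_value_t: one cached row. */
typedef struct entry {
    int           payload;  /* stands in for the copied record */
    unsigned long ref_num;  /* number of readers pinning the entry */
    unsigned char flag;
} entry_t;

/* One-slot "cache"; the patch uses a hash table guarded by mutexes. */
static entry_t *slot = NULL;

/* Store a value if the slot is empty. Returns 1 on success, 0 otherwise. */
static int cache_put(int payload) {
    entry_t *e;
    if (slot != NULL) {
        return 0;
    }
    e = malloc(sizeof *e);
    if (e == NULL) {
        return 0;
    }
    e->payload = payload;
    e->ref_num = 0;
    e->flag = 0;
    slot = e;
    return 1;
}

/* Pin and return the entry, or NULL on a miss. A removed entry is never
   handed out; if no reader pins it any more, it is reclaimed here. */
static entry_t *cache_get(void) {
    if (slot == NULL) {
        return NULL;
    }
    if (slot->flag & FLAG_REMOVED) {
        if (slot->ref_num == 0) {
            free(slot);
            slot = NULL;
        }
        return NULL;
    }
    slot->ref_num++;
    return slot;
}

/* Drop one pin taken by cache_get(). */
static void cache_release(entry_t *e) {
    assert(e->ref_num > 0);
    e->ref_num--;
}

/* Free the entry if it is unpinned; otherwise only mark it removed. */
static void cache_remove(void) {
    if (slot == NULL) {
        return;
    }
    if (slot->ref_num == 0) {
        free(slot);
        slot = NULL;
    } else {
        slot->flag |= FLAG_REMOVED;
    }
}

static int cache_is_empty(void) {
    return slot == NULL;
}
```

In this sketch a reader that pinned the entry before cache_remove() keeps a valid pointer until it calls cache_release(); the memory is only returned by the next cache_get() or cache_remove() that finds the entry removed and unpinned.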
1903Index: storage/innobase/srv/srv0srv.c
1904===================================================================
1905--- storage/innobase/srv/srv0srv.c (revision 498)
1906+++ storage/innobase/srv/srv0srv.c (revision 724)
1907@@ -86,6 +86,9 @@
1908 #include "read0read.h"
1909 #include "mysql/plugin.h"
1910 #include "mysql/service_thd_wait.h"
1911+#include "row0cache.h" /* for row cache*/
1912+#include "row0cache0lru.h"
1913+#include "row0cache0mempool.h"
1914
1915 /* prototypes of new functions added to ha_innodb.cc for kill_idle_transaction */
1916 ibool innobase_thd_is_idle(const void* thd);
1917@@ -1139,6 +1142,8 @@
1918
1919 /* Initialize some INFORMATION SCHEMA internal structures */
1920 trx_i_s_cache_init(trx_i_s_cache);
1921+
1922+ init_row_cache();
1923 }
1924
1925 /*********************************************************************//**
1926@@ -1162,6 +1167,8 @@
1927 srv_mysql_table = NULL;
1928
1929 trx_i_s_cache_free(trx_i_s_cache);
1930+
1931+ deinit_row_cache();
1932 }
1933
1934 /*********************************************************************//**
1935@@ -1967,6 +1974,8 @@
1936
1937 buf_refresh_io_stats_all();
1938
1939+ row_cache_refresh_stats();
1940+
1941 srv_n_rows_inserted_old = srv_n_rows_inserted;
1942 srv_n_rows_updated_old = srv_n_rows_updated;
1943 srv_n_rows_deleted_old = srv_n_rows_deleted;
1944@@ -2160,6 +2169,8 @@
1945
1946 buf_print_io(file);
1947
1948+ print_row_cache_stats(file);
1949+
1950 fputs("--------------\n"
1951 "ROW OPERATIONS\n"
1952 "--------------\n", file);
1953@@ -2471,6 +2482,19 @@
1954 export_vars.innodb_rows_deleted = srv_n_rows_deleted;
1955 export_vars.innodb_truncated_status_writes = srv_truncated_status_writes;
1956
1957+
1958+ export_vars.innodb_row_cache_n_get = row_cache_stat->n_get;
1959+ export_vars.innodb_row_cache_geted = row_cache_stat->geted;
1960+
1961+ export_vars.innodb_row_cache_lru_n_add = row_cache_lru_stat->n_add;
1962+ export_vars.innodb_row_cache_lru_n_make_first = row_cache_lru_stat->n_make_first;
1963+ export_vars.innodb_row_cache_lru_n_evict = row_cache_lru_stat->n_evict;
1964+ export_vars.innodb_row_cache_lru_count = get_row_cache_lru_count();
1965+
1966+ export_vars.innodb_row_cache_mem_pool_size = innodb_row_cache_mem_pool_size;
1967+ export_vars.innodb_row_cache_mem_pool_used = row_cache_mem_pool_used();
1968+
1969+
1970 mutex_exit(&srv_innodb_monitor_mutex);
1971 }
1972
1973Index: storage/innobase/srv/srv0start.c
1974===================================================================
1975--- storage/innobase/srv/srv0start.c (revision 498)
1976+++ storage/innobase/srv/srv0start.c (revision 724)
1977@@ -62,6 +62,8 @@
1978 #include "ibuf0ibuf.h"
1979 #include "srv0start.h"
1980 #include "srv0srv.h"
1981+#include "row0cache.h"
1982+#include "row0cache0filter.h"
1983 #ifndef UNIV_HOTBACKUP
1984 # include "os0proc.h"
1985 # include "sync0sync.h"
1986@@ -2390,6 +2392,7 @@
1987 /* 3. Free all InnoDB's own mutexes and the os_fast_mutexes inside
1988 them */
1989 os_aio_free();
1990+ deinit_row_cache_filter(innodb_row_cache_on);
1991 sync_close();
1992 srv_free();
1993 fil_close();
1994Index: storage/innobase/CMakeLists.txt
1995===================================================================
1996--- storage/innobase/CMakeLists.txt (revision 498)
1997+++ storage/innobase/CMakeLists.txt (revision 724)
1998@@ -245,7 +245,9 @@
1999 trx/trx0sys.c trx/trx0trx.c trx/trx0undo.c
2000 usr/usr0sess.c
2001 ut/ut0byte.c ut/ut0dbg.c ut/ut0list.c ut/ut0mem.c ut/ut0rbt.c ut/ut0rnd.c
2002- ut/ut0ut.c ut/ut0vec.c ut/ut0wqueue.c ut/ut0bh.c)
2003+ ut/ut0ut.c ut/ut0vec.c ut/ut0wqueue.c ut/ut0bh.c
2004+ cache/row0cache.c cache/row0cache0hash.c cache/row0cache0lru.c cache/row0cache0mempool.c
2005+ cache/row0cache0filter.c)
2006
2007 IF(WITH_INNODB)
2008 # Legacy option
2009Index: storage/innobase/handler/ha_innodb.cc
2010===================================================================
2011--- storage/innobase/handler/ha_innodb.cc (revision 498)
2012+++ storage/innobase/handler/ha_innodb.cc (revision 724)
2013@@ -97,6 +97,10 @@
2014 #include "ha_prototypes.h"
2015 #include "ut0mem.h"
2016 #include "ibuf0ibuf.h"
2017+#include "row0cache0mempool.h"
2018+#include "row0cache0hash.h"
2019+#include "row0cache0filter.h"
2020+#include "row0cache0lru.h"
2021 }
2022
2023 #include "ha_innodb.h"
2024@@ -342,7 +346,8 @@
2025 {&trx_i_s_cache_lock_key, "trx_i_s_cache_lock", 0},
2026 {&trx_purge_latch_key, "trx_purge_latch", 0},
2027 {&index_tree_rw_lock_key, "index_tree_rw_lock", 0},
2028- {&dict_table_stats_latch_key, "dict_table_stats", 0}
2029+ {&dict_table_stats_latch_key, "dict_table_stats", 0},
2030+ {&row_cache_filter_lock_key, "row_cache_filter_lock", 0}
2031 };
2032 # endif /* UNIV_PFS_RWLOCK */
2033
2034@@ -844,6 +849,22 @@
2035 (char*) &export_vars.innodb_x_lock_spin_rounds, SHOW_LONGLONG},
2036 {"x_lock_spin_waits",
2037 (char*) &export_vars.innodb_x_lock_spin_waits, SHOW_LONGLONG},
2038+ {"row_cache_n_get",
2039+ (char*) &export_vars.innodb_row_cache_n_get, SHOW_LONG},
2040+ {"row_cache_n_geted",
2041+ (char*) &export_vars.innodb_row_cache_geted, SHOW_LONG},
2042+ {"row_cache_lru_count",
2043+ (char*) &export_vars.innodb_row_cache_lru_count, SHOW_LONG},
2044+ {"row_cache_lru_n_add",
2045+ (char*) &export_vars.innodb_row_cache_lru_n_add, SHOW_LONG},
2046+ {"row_cache_lru_n_evict",
2047+ (char*) &export_vars.innodb_row_cache_lru_n_evict, SHOW_LONG},
2048+ {"row_cache_lru_n_make_first",
2049+ (char*) &export_vars.innodb_row_cache_lru_n_make_first, SHOW_LONG},
2050+ {"row_cache_mem_pool_size",
2051+ (char*) &export_vars.innodb_row_cache_mem_pool_size, SHOW_LONGLONG},
2052+ {"row_cache_mem_pool_used",
2053+ (char*) &export_vars.innodb_row_cache_mem_pool_used, SHOW_LONGLONG},
2054 {NullS, NullS, SHOW_LONG}
2055 };
2056
2057@@ -6330,6 +6351,37 @@
2058 start of a new SQL statement. */
2059
2060
2061+UNIV_INTERN
2062+bool ha_innobase::is_in_cache(const uchar * key_ptr, uint key_len) {
2063+ dict_index_t* index;
2064+ bool ret = false;
2065+
2066+ DBUG_ENTER("is_in_cache");
2067+ ut_a(prebuilt->trx == thd_to_trx(user_thd));
2068+ index = prebuilt->index;
2069+
2070+ if (UNIV_UNLIKELY(index == NULL) || dict_index_is_corrupted(index)) {
2071+ prebuilt->index_usable = FALSE;
2072+ DBUG_RETURN(ret);
2073+ }
2074+ if (UNIV_UNLIKELY(!prebuilt->index_usable)) {
2075+ DBUG_RETURN(ret);
2076+ }
2077+
2078+ if (key_ptr) {
2079+ /* Convert the search key value to InnoDB format into
2080+ prebuilt->search_tuple */
2081+
2082+ row_sel_convert_mysql_key_to_innobase(prebuilt->search_tuple,
2083+ (byte*) key_val_buff, (ulint) upd_and_key_val_buff_len, index,
2084+ (byte*) key_ptr, (ulint) key_len, prebuilt->trx);
2085+
2086+ ret = (contain_row_cache(prebuilt->search_tuple,index) == 1);
2087+ }
2088+
2089+ DBUG_RETURN(ret);
2090+}
2091+
2092 /**********************************************************************//**
2093 Positions an index cursor to the index specified in the handle. Fetches the
2094 row if any.
2095@@ -12556,6 +12608,89 @@
2096 "e.g. for http://bugs.mysql.com/51325",
2097 NULL, NULL, 0, 0, 1, 0);
2098
2099+
2100+static
2101+void
2102+innodb_row_cache_index_update(
2103+/*===========================*/
2104+THD* thd, /*!< in: thread handle */
2105+struct st_mysql_sys_var* var, /*!< in: pointer to
2106+ system variable */
2107+void* var_ptr, /*!< out: where the
2108+ formal string goes */
2109+const void* save) /*!< in: immediate result
2110+ from check function */
2111+{
2112+ ut_a(var_ptr != NULL);
2113+ ut_a(save != NULL);
2114+
2115+ *static_cast<const char**>(var_ptr) = *static_cast<const char*const*>(save);
2116+ reset_filter();
2117+}
2118+
2119+static
2120+void
2121+innodb_row_cache_clean_cache_update(
2122+/*===========================*/
2123+THD* thd, /*!< in: thread handle */
2124+struct st_mysql_sys_var* var, /*!< in: pointer to
2125+ system variable */
2126+void* var_ptr, /*!< out: where the
2127+ formal string goes */
2128+const void* save) /*!< in: immediate result
2129+ from check function */
2130+{
2131+ ut_a(var_ptr != NULL);
2132+ ut_a(save != NULL);
2133+
2134+ if(*(my_bool*) save){
2135+ clean_row_cache();
2136+ }
2137+}
2138+
2139+static MYSQL_SYSVAR_LONGLONG(row_cache_mem_pool_size, innodb_row_cache_mem_pool_size,
2140+ PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
2141+ "The size of the memory buffer InnoDB uses to cache rows.",
2142+ NULL, NULL, 1024*1024L, 1024*1024L, LONGLONG_MAX, 0);
2143+
2144+static MYSQL_SYSVAR_LONGLONG(row_cache_additional_mem_pool_size, innodb_row_cache_additional_mem_pool_size,
2145+ PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
2146+ "The size of the additional memory buffer InnoDB uses for the row cache's internal structs.",
2147+ NULL, NULL, 1024*1024L, 1024*1024L, LONGLONG_MAX, 0);
2148+
2149+static MYSQL_SYSVAR_BOOL(row_cache_on, innodb_row_cache_on,
2150+ PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
2151+ "Enable row cache",
2152+ NULL, NULL, FALSE);
2153+
2154+static MYSQL_SYSVAR_ULONG(row_cache_cell_num, innodb_row_cache_cell_num,
2155+ PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
2156+ "Number of cells in the row cache's hash table.",
2157+ NULL, NULL, 10000L, 100L, ~0L, 0);
2158+
2159+static MYSQL_SYSVAR_UINT(row_cache_mutex_num_shift, innodb_row_cache_mutex_num_shift,
2160+ PLUGIN_VAR_RQCMDARG | PLUGIN_VAR_READONLY,
2161+ "Shift value used to derive the number of row cache hash table mutexes.",
2162+ NULL, NULL, 6, 1, 14, 0);
2163+
2164+static MYSQL_SYSVAR_STR(row_cache_index, innodb_row_cache_index,
2165+ PLUGIN_VAR_OPCMDARG,
2166+ "Configuration of the indexes that should be cached.",
2167+ NULL,
2168+ innodb_row_cache_index_update, NULL);
2169+
2170+static MYSQL_SYSVAR_BOOL(row_cache_clean_cache, innodb_row_cache_clean_cache,
2171+ PLUGIN_VAR_NOCMDARG,
2172+ "Set to ON to clean the row cache (for debugging).",
2173+ NULL, innodb_row_cache_clean_cache_update, FALSE);
2174+
2175+static MYSQL_SYSVAR_BOOL(row_cache_use_sys_malloc, innodb_row_cache_use_sys_malloc,
2176+ PLUGIN_VAR_OPCMDARG | PLUGIN_VAR_READONLY,
2177+ "Use the system malloc for the row cache (disabled by default)",
2178+ NULL, NULL, FALSE);
2179+
2180+
2181+
2182 static struct st_mysql_sys_var* innobase_system_variables[]= {
2183 MYSQL_SYSVAR(page_size),
2184 MYSQL_SYSVAR(log_block_size),
2185@@ -12656,6 +12791,14 @@
2186 MYSQL_SYSVAR(corrupt_table_action),
2187 MYSQL_SYSVAR(lazy_drop_table),
2188 MYSQL_SYSVAR(fake_changes),
2189+ MYSQL_SYSVAR(row_cache_mem_pool_size),
2190+ MYSQL_SYSVAR(row_cache_additional_mem_pool_size),
2191+ MYSQL_SYSVAR(row_cache_on),
2192+ MYSQL_SYSVAR(row_cache_cell_num),
2193+ MYSQL_SYSVAR(row_cache_mutex_num_shift),
2194+ MYSQL_SYSVAR(row_cache_index),
2195+ MYSQL_SYSVAR(row_cache_clean_cache),
2196+ MYSQL_SYSVAR(row_cache_use_sys_malloc),
2197 NULL
2198 };
2199
2200Index: storage/innobase/handler/ha_innodb.h
2201===================================================================
2202--- storage/innobase/handler/ha_innodb.h (revision 498)
2203+++ storage/innobase/handler/ha_innodb.h (revision 724)
2204@@ -147,6 +147,8 @@
2205
2206 int index_init(uint index, bool sorted);
2207 int index_end();
2208+ bool is_in_cache(const uchar * key,
2209+ uint key_len);
2210 int index_read(uchar * buf, const uchar * key,
2211 uint key_len, enum ha_rkey_function find_flag);
2212 int index_read_idx(uchar * buf, uint index, const uchar * key,
2213Index: storage/innobase/include/row0cache0mempool.h
2214===================================================================
2215--- storage/innobase/include/row0cache0mempool.h (revision 0)
2216+++ storage/innobase/include/row0cache0mempool.h (revision 724)
2217@@ -0,0 +1,55 @@
2218+/********************************************************************
2219+ created: 2011/03/23
2220+ created: 23:3:2011 14:49
2221+ file base: row0cache0mempool
2222+ file ext: h
2223+ author: wentong@taobao.com
2224+
2225+ purpose:
2226+*********************************************************************/
2227+
2228+#ifndef row0cache0mempool_h__
2229+#define row0cache0mempool_h__
2230+
2231+#include "ut0rbt.h"
2232+#include "row0cache0hash.h"
2233+#include "row0cache.h"
2234+
2235+typedef long long llong;
2236+
2237+extern my_bool innodb_row_cache_use_sys_malloc;
2238+
2239+extern llong innodb_row_cache_mem_pool_size;
2240+
2241+extern llong innodb_row_cache_additional_mem_pool_size;
2242+
2243+#define NOT_ENOUGH_MEM 1
2244+
2245+void init_row_cache_mem_pool(my_bool innodb_row_cache_on);
2246+
2247+void deinit_row_cache_mem_pool();
2248+
2249+void* ca_malloc_low(ulint n , const ulint used_fold);
2250+
2251+void ca_free_low(void* ptr, const ulint size ,const ulint used_fold);
2252+
2253+#define ca_malloc(S,F) ca_malloc_low(S,F)
2254+#define ca_free(P,S,F) ca_free_low(P,S,F)
2255+
2256+//#define ca_malloc_for_value(F) ca_malloc(sizeof(row_cache_value_t),F)
2257+//#define ca_free_for_value(S,F) ca_free(S,F)
2258+
2259+//#define ca_malloc_for_queue(F) ca_malloc(sizeof(row_cache_value_queue_t),F)
2260+//#define ca_free_for_queue(S,F) ca_free(S,F)
2261+
2262+row_cache_value_t* ca_malloc_for_value(const ulint used_fold);
2263+void ca_free_for_value(row_cache_value_t* value, const ulint used_fold);
2264+
2265+row_cache_value_queue_t* ca_malloc_for_queue(const ulint used_fold);
2266+void ca_free_for_queue(row_cache_value_queue_t* value, const ulint used_fold);
2267+
2268+ulint row_cache_mem_pool_used();
2269+ulint row_cache_get_value_free_count();
2270+ulint row_cache_get_queue_free_count();
2271+
2272+#endif // row0cache0mempool_h__
2273
2274Property changes on: storage/innobase/include/row0cache0mempool.h
2275___________________________________________________________________
2276Added: svn:mime-type
2277 + text/plain
2278
2279Index: storage/innobase/include/srv0srv.h
2280===================================================================
2281--- storage/innobase/include/srv0srv.h (revision 498)
2282+++ storage/innobase/include/srv0srv.h (revision 724)
2283@@ -842,6 +842,18 @@
2284 ib_int64_t innodb_x_lock_os_waits;
2285 ib_int64_t innodb_x_lock_spin_rounds;
2286 ib_int64_t innodb_x_lock_spin_waits;
2287+
2288+ ulint innodb_row_cache_n_get;
2289+ ulint innodb_row_cache_geted;
2290+
2291+ ulint innodb_row_cache_lru_n_add;
2292+ ulint innodb_row_cache_lru_n_make_first;
2293+ ulint innodb_row_cache_lru_n_evict;
2294+ ulint innodb_row_cache_lru_count;
2295+
2296+ ib_int64_t innodb_row_cache_mem_pool_size;
2297+ ib_int64_t innodb_row_cache_mem_pool_used;
2298+
2299 };
2300
2301 /** Thread slot in the thread table */
2302Index: storage/innobase/include/row0cache0lru.h
2303===================================================================
2304--- storage/innobase/include/row0cache0lru.h (revision 0)
2305+++ storage/innobase/include/row0cache0lru.h (revision 724)
2306@@ -0,0 +1,46 @@
2307+/********************************************************************
2308+ created: 2011/03/23
2309+ created: 23:3:2011 14:48
2310+ file base: row0cache0lru
2311+ file ext: h
2312+ author: wentong@taobao.com
2313+
2314+ purpose:
2315+*********************************************************************/
2316+
2317+#ifndef row0cache0lru_h__
2318+#define row0cache0lru_h__
2319+
2320+#include "row0cache0hash.h"
2321+#include "rem0types.h"
2322+
2323+typedef struct struct_row_cache_lru_stat{
2324+ ulint n_add;
2325+ ulint n_make_first;
2326+ ulint n_evict;
2327+ ulint old_n_add;
2328+ ulint old_n_make_first;
2329+ ulint old_n_evict;
2330+}row_cache_lru_stat_t;
2331+
2332+extern row_cache_lru_stat_t* row_cache_lru_stat;
2333+
2334+extern my_bool innodb_row_cache_clean_cache;
2335+
2336+void init_innodb_row_cache_lru();
2337+
2338+void deinit_innodb_row_cache_lru();
2339+
2340+void clean_row_cache();
2341+
2342+void add_row_cache_value_to_lru(row_cache_value_t* value);
2343+
2344+void make_row_cache_value_first_from_lru(row_cache_value_t* value);
2345+
2346+ulint free_from_lru(const ulint size , const ulint used_fold);
2347+
2348+void remove_row_cache_value_from_lru(row_cache_value_t* value);
2349+
2350+ulint get_row_cache_lru_count();
2351+
2352+#endif // row0cache0lru_h__
2353
2354Property changes on: storage/innobase/include/row0cache0lru.h
2355___________________________________________________________________
2356Added: svn:mime-type
2357 + text/plain
2358
2359Index: storage/innobase/include/mtr0mtr.h
2360===================================================================
2361--- storage/innobase/include/mtr0mtr.h (revision 498)
2362+++ storage/innobase/include/mtr0mtr.h (revision 724)
2363@@ -34,6 +34,7 @@
2364 #include "ut0byte.h"
2365 #include "mtr0types.h"
2366 #include "page0types.h"
2367+#include "row0cache.h"
2368
2369 /* Logging modes for a mini-transaction */
2370 #define MTR_LOG_ALL 21 /* default mode: log all operations
2371@@ -368,6 +369,7 @@
2372 #endif
2373 dyn_array_t memo; /*!< memo stack for locks etc. */
2374 dyn_array_t log; /*!< mini-transaction log */
2375+ UT_LIST_BASE_NODE_T(row_cache_value_queue_t) row_cache_value_queue_base; /*!< row cache lock queue base*/
2376 ibool inside_ibuf;
2377 /*!< TRUE if inside ibuf changes */
2378 ibool modifications;
2379Index: storage/innobase/include/mtr0mtr.ic
2380===================================================================
2381--- storage/innobase/include/mtr0mtr.ic (revision 498)
2382+++ storage/innobase/include/mtr0mtr.ic (revision 724)
2383@@ -41,6 +41,8 @@
2384
2385 dyn_array_create(&(mtr->memo));
2386 dyn_array_create(&(mtr->log));
2387+
2388+ UT_LIST_INIT(mtr->row_cache_value_queue_base);
2389
2390 mtr->log_mode = MTR_LOG_ALL;
2391 mtr->modifications = FALSE;
2392Index: storage/innobase/include/row0cache0hash.h
2393===================================================================
2394--- storage/innobase/include/row0cache0hash.h (revision 0)
2395+++ storage/innobase/include/row0cache0hash.h (revision 724)
2396@@ -0,0 +1,86 @@
2397+/********************************************************************
2398+ created: 2011/03/24
2399+ created: 24:3:2011 8:48
2400+ file base: row0cache0hash
2401+ file ext: h
2402+ author: wentong@taobao.com
2403+
2404+ purpose:
2405+*********************************************************************/
2406+#ifndef row0cache0hash_h__
2407+#define row0cache0hash_h__
2408+
2409+#include "hash0hash.h"
2410+#include "rem0types.h"
2411+#include "data0types.h"
2412+#include "ut0byte.h"
2413+#include "ut0rbt.h"
2414+#include "dict0types.h"
2415+
2416+//flag and bit handler
2417+#define onBit(flag,bit) ((flag) |= (bit))
2418+#define offBit(flag,bit) ((flag) &= ~(bit))
2419+#define testFlag(flag,bit) (((flag) & (bit)) == (bit))
2420+
2421+
2422+typedef struct row_cache_chain row_cache_chain_t;
2423+typedef struct row_cache_value row_cache_value_t;
2424+typedef UT_LIST_BASE_NODE_T(row_cache_value_t) ROW_CACHE_VALUE_LIST_BASE;
2425+
2426+struct row_cache_chain{
2427+ row_cache_value_t* value;
2428+ row_cache_chain_t* next;
2429+};
2430+
2431+struct row_cache_value{
2432+ ulint fold;
2433+ index_id_t tree_id;
2434+ table_id_t table_id;
2435+ rec_t* buf; /*!< the real mem*/
2436+ rec_t* rec; /*!< the physical record */
2437+ ulint buf_size;
2438+ UT_LIST_NODE_T(row_cache_value_t) list;
2439+ ulint ref_num; /*!< the Reference Number */
2440+ unsigned char flag;
2441+ row_cache_chain_t chain;
2442+};
2443+
2444+#define FLAG_VALUE_IS_FROM_VALUE_POOL 1
2445+#define FLAG_VALUE_IS_REMOVED (1<<1)
2446+
2447+#define isValueFromValuePool(flag) testFlag(flag,FLAG_VALUE_IS_FROM_VALUE_POOL)
2448+#define isValueRemoved(flag) testFlag(flag,FLAG_VALUE_IS_REMOVED)
2449+
2450+typedef struct ha_row_cache{
2451+ hash_table_t* row_cache;
2452+}row_cache_t;
2453+
2454+extern row_cache_t* innodb_row_cache;
2455+
2456+extern unsigned long innodb_row_cache_cell_num;
2457+
2458+extern unsigned int innodb_row_cache_mutex_num_shift;
2459+
2460+extern ulint innodb_row_cache_mutex_num;
2461+
2462+int init_row_cache_hash(my_bool innodb_row_cache_on);
2463+
2464+void deinit_row_cache_hash(my_bool innodb_row_cache_on);
2465+
2466+row_cache_value_t* search_row_cache_value(const dtuple_t* tuple ,const dict_index_t* index, const ulint fold);
2467+row_cache_value_t* search_row_cache_value_with_rec(const rec_t* rec, const ulint* rec_offsets, dict_index_t* index, const ulint fold);
2468+row_cache_value_t* insert_row_cache_value(const ulint fold , row_cache_value_t* value);
2469+void delete_row_cache_value(row_cache_value_t* value);
2470+
2471+ulint row_cache_enter_mutex_nowait(const ulint fold);
2472+void row_cache_enter_mutex(const ulint fold);
2473+void row_cache_exit_mutex(const ulint fold);
2474+
2475+int row_cache_own_mutex(const ulint fold1 , const ulint fold2);
2476+
2477+ulint row_cache_get_mutex_no(const ulint fold);
2478+
2479+void row_cache_enter_mutex_by_no(const ulint no);
2480+void row_cache_exit_mutex_by_no(const ulint no);
2481+
2482+#endif // row0cache0hash_h__
2483
2484Property changes on: storage/innobase/include/row0cache0hash.h
2485___________________________________________________________________
2486Added: svn:mime-type
2487 + text/plain
2488
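The onBit, offBit and testFlag macros in row0cache0hash.h above manipulate the one-byte flag field of row_cache_value. A short sketch of how they combine follows; the macro and constant definitions are copied from the header, while flags_self_check is a hypothetical helper added only for illustration.

```c
#include <assert.h>

/* Copied from row0cache0hash.h in the patch. */
#define onBit(flag,bit)   ((flag) |= (bit))
#define offBit(flag,bit)  ((flag) &= ~(bit))
#define testFlag(flag,bit) (((flag) & (bit)) == (bit))

#define FLAG_VALUE_IS_FROM_VALUE_POOL 1
#define FLAG_VALUE_IS_REMOVED (1<<1)

#define isValueFromValuePool(flag) testFlag(flag,FLAG_VALUE_IS_FROM_VALUE_POOL)
#define isValueRemoved(flag) testFlag(flag,FLAG_VALUE_IS_REMOVED)

/* Hypothetical helper: walks a flag byte through the states used by the
   cache and returns the final value. */
static unsigned char flags_self_check(void) {
    unsigned char flag = 0;

    /* A value taken from the value pool keeps that bit for its lifetime. */
    onBit(flag, FLAG_VALUE_IS_FROM_VALUE_POOL);
    assert(isValueFromValuePool(flag));
    assert(!isValueRemoved(flag));

    /* Marking it removed must not disturb the pool bit. */
    onBit(flag, FLAG_VALUE_IS_REMOVED);
    assert(isValueRemoved(flag));
    assert(isValueFromValuePool(flag));

    /* Overwriting the value clears only the removed bit. */
    offBit(flag, FLAG_VALUE_IS_REMOVED);
    assert(!isValueRemoved(flag));

    return flag;
}
```

Because each flag occupies its own bit, setting or clearing FLAG_VALUE_IS_REMOVED leaves FLAG_VALUE_IS_FROM_VALUE_POOL untouched, which is what update_row_cache_value() relies on when it turns the removed bit off.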
2489Index: storage/innobase/include/row0cache.h
2490===================================================================
2491--- storage/innobase/include/row0cache.h (revision 0)
2492+++ storage/innobase/include/row0cache.h (revision 724)
2493@@ -0,0 +1,89 @@
2494+/********************************************************************
2495+created: 2011/03/08
2496+created: 8:3:2011 10:56
2497+file base: row0cache
2498+file ext: h
2499+author: wentong@taobao.com
2500+
2501+purpose: for row cache
2502+*********************************************************************/
2503+#ifndef row0cache_h_
2504+#define row0cache_h_
2505+
2506+#include "univ.i"
2507+#include "rem0types.h"
2508+#include "data0types.h"
2509+#include "data0data.h"
2510+#include "ut0byte.h"
2511+#include "mtr0types.h"
2512+#include "ut0lst.h"
2513+#include "row0cache0hash.h"
2514+#include "read0types.h"
2515+
2516+extern my_bool innodb_row_cache_on;
2517+
2518+typedef struct struct_row_cache_stat{
2519+ ulint n_get; /* total number of lookups */
2520+ ulint geted; /* number of lookups served from the row cache */
2521+ ulint old_n_get;
2522+ ulint old_geted;
2523+ time_t last_printout_time;
2524+}row_cache_stat_t;
2525+
2526+extern row_cache_stat_t* row_cache_stat;
2527+
2528+
2529+#define calc_fold_by_tuple(tuple , n_fields , tree_id) dtuple_fold((tuple) , (n_fields) , 0 , (tree_id))
2530+ulint calc_fold_by_rec(
2531+/*=====*/
2532+ const rec_t* rec, /*!< in: the physical record */
2533+ const ulint* offsets, /*!< in: array returned by
2534+ rec_get_offsets() */
2535+ ulint n_fields, /*!< in: number of complete
2536+ fields to fold */
2537+ index_id_t tree_id); /*!< in: index tree id */
2538+
2539+typedef struct row_cache_value_queue_struct row_cache_value_queue_t;
2540+
2541+struct row_cache_value_queue_struct{
2542+ row_cache_value_t* value;
2543+ /*!< cached value pinned by the
2544+ mini-transaction */
2545+ UT_LIST_NODE_T(row_cache_value_queue_t) list;
2546+ /*!< list node in the mtr's row cache value queue */
2547+};
2548+
2549+typedef UT_LIST_BASE_NODE_T(row_cache_value_queue_t) ROW_CACHE_VALUE_QUEUE_LIST_BASE;
2550+
2551+void init_row_cache();
2552+
2553+void deinit_row_cache();
2554+
2555+void put_rec_to_row_cache(const dtuple_t* tuple,
2556+ const rec_t* rec, /*!< in: physical record */
2557+ const ulint* offsets, /*!< in: array returned by rec_get_offsets() */
2558+ const dict_index_t* index, /*!< in: index */
2559+ ibool is_sec_index);
2560+
2561+int contain_row_cache_low(const ulint fold, const dtuple_t* tuple,
2562+ const dict_index_t* index);
2563+
2564+rec_t* get_from_row_cache_low(const ulint fold, const dtuple_t* tuple, const dict_index_t* index ,mtr_t* mtr); /*!<in: mtr*/
2565+
2566+void remove_from_row_cache_low(const ulint fold, const rec_t* rec, const ulint* offsets, const dtuple_t* tuple, dict_index_t* index);
2567+
2568+void release_row_cache_value_in_mtr(mtr_t* mtr);
2569+
2570+ulint lock_sec_rec_in_row_cache_cons_read_sees(const rec_t* rec, const ulint* offsets, const read_view_t* view);
2571+
2572+#define contain_row_cache(tuple , index) contain_row_cache_low(calc_fold_by_tuple((tuple) , (dict_index_get_n_unique(index)) , (index->id)), (tuple) ,(index))
2573+#define get_from_row_cache(tuple , index , mtr) get_from_row_cache_low(calc_fold_by_tuple((tuple) , (dict_index_get_n_unique(index)) , (index->id)), (tuple) ,(index) ,(mtr) )
2574+#define remove_from_row_cache(rec, offsets, index) remove_from_row_cache_low(calc_fold_by_rec((rec), (offsets), (dict_index_get_n_unique(index)), (index->id)),(rec),(offsets) ,NULL,index)
2575+
2576+#define remove_from_row_cache_for_tuple(tuple, index) remove_from_row_cache_low(calc_fold_by_tuple((tuple) , (dict_index_get_n_unique(index)) , (index->id)) , NULL,NULL,(tuple),(index))
2577+
2578+void row_cache_refresh_stats();
2579+
2580+void print_row_cache_stats(FILE* file);
2581+
2582+#endif
2583
2584Property changes on: storage/innobase/include/row0cache.h
2585___________________________________________________________________
2586Added: svn:mime-type
2587 + text/plain
2588
2589Index: storage/innobase/include/row0cache0filter.h
2590===================================================================
2591--- storage/innobase/include/row0cache0filter.h (revision 0)
2592+++ storage/innobase/include/row0cache0filter.h (revision 724)
2593@@ -0,0 +1,33 @@
2594+/********************************************************************
2595+ created: 2011/05/31
2596+ created: 31:5:2011 11:39
2597+ file base: row0cache0filter
2598+ file ext: h
2599+ author: wentong@taobao.com
2600+
2601+ purpose:
2602+*********************************************************************/
2603+
2604+#ifndef row0cache0filter_h__
2605+#define row0cache0filter_h__
2606+
2607+#include "hash0hash.h"
2608+#include "dict0types.h"
2609+
2610+#ifdef UNIV_PFS_RWLOCK
2611+/* Key to register the row cache filter lock with performance schema */
2612+extern mysql_pfs_key_t row_cache_filter_lock_key;
2613+#endif /* UNIV_PFS_RWLOCK */
2614+
2615+#define INDEX_CONFIG_LEN 2048
2616+
2617+extern char* innodb_row_cache_index;
2618+
2619+void init_row_cache_filter(my_bool innodb_row_cache_on);
2620+void deinit_row_cache_filter(my_bool innodb_row_cache_on);
2621+
2622+void reset_filter();
2623+
2624+my_bool is_index_need_cache(const dict_index_t* index);
2625+
2626+#endif // row0cache0filter_h__
2627
2628Property changes on: storage/innobase/include/row0cache0filter.h
2629___________________________________________________________________
2630Added: svn:mime-type
2631 + text/plain
2632
2633Index: storage/innobase/row/row0upd.c
2634===================================================================
2635--- storage/innobase/row/row0upd.c (revision 498)
2636+++ storage/innobase/row/row0upd.c (revision 724)
2637@@ -32,6 +32,7 @@
2638 #include "dict0dict.h"
2639 #include "trx0undo.h"
2640 #include "rem0rec.h"
2641+#include "row0cache.h"
2642 #ifndef UNIV_HOTBACKUP
2643 #include "dict0boot.h"
2644 #include "dict0crea.h"
2645@@ -1665,6 +1666,12 @@
2646 }
2647 break;
2648 }
2649+ //TB_HOOK
2650+ if(dict_index_is_unique(index)){
2651+ ulint offsets_[REC_OFFS_NORMAL_SIZE];
2652+ ulint* offsets = rec_get_offsets( rec, index, offsets_, ULINT_UNDEFINED, &heap);
2653+ remove_from_row_cache(rec,offsets,index);
2654+ }
2655
2656 btr_pcur_close(&pcur);
2657 mtr_commit(&mtr);
2658@@ -2199,6 +2206,8 @@
2659 }
2660 }
2661
2662+ remove_from_row_cache(rec , offsets , index);
2663+
2664 /* NOTE: the following function calls will also commit mtr */
2665
2666 if (node->is_delete) {
2667Index: storage/innobase/row/row0uins.c
2668===================================================================
2669--- storage/innobase/row/row0uins.c (revision 498)
2670+++ storage/innobase/row/row0uins.c (revision 724)
2671@@ -45,6 +45,8 @@
2672 #include "que0que.h"
2673 #include "ibuf0ibuf.h"
2674 #include "log0log.h"
2675+#include "row0cache.h"
2676+#include "btr0sea.h"
2677
2678 /*************************************************************************
2679 IMPORTANT NOTE: Any operation that generates redo MUST check that there
2680@@ -66,11 +68,23 @@
2681 /*==========================*/
2682 undo_node_t* node) /*!< in: undo node */
2683 {
2684+ dict_index_t* index;
2685+ btr_pcur_t* pcur;
2686+
2687 btr_cur_t* btr_cur;
2688 ibool success;
2689 ulint err;
2690 ulint n_tries = 0;
2691 mtr_t mtr;
2692+ rec_t* rec;
2693+
2694+ mem_heap_t* heap = NULL;
2695+ ulint offsets_[REC_OFFS_NORMAL_SIZE];
2696+ ulint* offsets;
2697+ rec_offs_init(offsets_);
2698+
2699+ pcur = &(node->pcur);
2700+ index = dict_table_get_first_index(node->table);
2701
2702 mtr_start(&mtr);
2703
2704@@ -78,6 +92,11 @@
2705 &mtr);
2706 ut_a(success);
2707
2708+ //TB_HOOK
2709+ rec = btr_pcur_get_rec(pcur);
2710+ offsets = rec_get_offsets(rec, index, offsets_, ULINT_UNDEFINED, &heap);
2711+ remove_from_row_cache(rec , offsets,index);
2712+
2713 if (node->table->id == DICT_INDEXES_ID) {
2714 ut_ad(node->trx->dict_operation_lock_mode == RW_X_LATCH);
2715
2716@@ -103,7 +122,9 @@
2717
2718 if (success) {
2719 trx_undo_rec_release(node->trx, node->undo_no);
2720-
2721+ if (UNIV_LIKELY_NULL(heap)) {
2722+ mem_heap_free(heap);
2723+ }
2724 return(DB_SUCCESS);
2725 }
2726 retry:
2727@@ -139,6 +160,9 @@
2728
2729 trx_undo_rec_release(node->trx, node->undo_no);
2730
2731+ if (UNIV_LIKELY_NULL(heap)) {
2732+ mem_heap_free(heap);
2733+ }
2734 return(err);
2735 }
2736
2737@@ -348,6 +372,10 @@
2738 transactions. */
2739 ut_a(trx_is_recv(node->trx));
2740 } else {
2741+ //TB_HOOK
2742+ if(dict_index_is_unique(node->index)){
2743+ remove_from_row_cache_for_tuple(entry , node->index);
2744+ }
2745 log_free_check();
2746 err = row_undo_ins_remove_sec(node->index, entry);
2747
2748Index: storage/innobase/row/row0sel.c
2749===================================================================
2750--- storage/innobase/row/row0sel.c (revision 498)
2751+++ storage/innobase/row/row0sel.c (revision 724)
2752@@ -57,6 +57,7 @@
2753 #include "read0read.h"
2754 #include "buf0lru.h"
2755 #include "ha_prototypes.h"
2756+#include "row0cache.h"
2757
2758 /* Maximum number of rows to prefetch; MySQL interface has another parameter */
2759 #define SEL_MAX_N_PREFETCH 16
2760@@ -2904,6 +2905,7 @@
2761 rec_t* old_vers;
2762 enum db_err err;
2763 trx_t* trx;
2764+ ibool is_get_from_row_cache = FALSE;
2765
2766 *out_rec = NULL;
2767 trx = thr_get_trx(thr);
2768@@ -2913,20 +2915,31 @@
2769
2770 clust_index = dict_table_get_first_index(sec_index->table);
2771
2772- btr_pcur_open_with_no_init(clust_index, prebuilt->clust_ref,
2773- PAGE_CUR_LE, BTR_SEARCH_LEAF,
2774- prebuilt->clust_pcur, 0, mtr);
2775+ //TB_HOOK get_from_row_cache
2776+ if(prebuilt->select_lock_type == LOCK_NONE
2777+ && trx->mysql_n_tables_locked == 0){
2778+ clust_rec = get_from_row_cache(prebuilt->clust_ref,clust_index,mtr);
2779+ if (clust_rec!=NULL)
2780+ {
2781+ is_get_from_row_cache=TRUE;
2782+ }
2783+ }
2784+ if(is_get_from_row_cache == FALSE){
2785+ btr_pcur_open_with_no_init(clust_index, prebuilt->clust_ref,
2786+ PAGE_CUR_LE, BTR_SEARCH_LEAF,
2787+ prebuilt->clust_pcur, 0, mtr);
2788
2789- clust_rec = btr_pcur_get_rec(prebuilt->clust_pcur);
2790+ clust_rec = btr_pcur_get_rec(prebuilt->clust_pcur);
2791+ }
2792
2793 prebuilt->clust_pcur->trx_if_known = trx;
2794
2795 /* Note: only if the search ends up on a non-infimum record is the
2796 low_match value the real match to the search tuple */
2797
2798- if (!page_rec_is_user_rec(clust_rec)
2799+ if (is_get_from_row_cache == FALSE && (!page_rec_is_user_rec(clust_rec)
2800 || btr_pcur_get_low_match(prebuilt->clust_pcur)
2801- < dict_index_get_n_unique(clust_index)) {
2802+ < dict_index_get_n_unique(clust_index))) {
2803
2804 /* In a rare case it is possible that no clust rec is found
2805 for a delete-marked secondary index record: if in row0umod.c
2806@@ -3011,6 +3024,13 @@
2807 }
2808
2809 clust_rec = old_vers;
2810+ } else {
2811+ //TB_HOOK put_rec_to_row_cache
2812+ if (is_get_from_row_cache == FALSE && !prebuilt->templ_contains_blob
2813+ && !prebuilt->used_in_HANDLER) {
2814+ put_rec_to_row_cache(prebuilt->clust_ref, clust_rec, *offsets,
2815+ clust_index, FALSE);
2816+ }
2817 }
2818
2819 /* If we had to go to an earlier version of row or the
2820@@ -3356,6 +3376,18 @@
2821 return(SEL_FOUND);
2822 }
2823
2824+static void put_rec_to_row_cache_in_sel(row_prebuilt_t* prebuilt,
2825+ ulint direction, ibool unique_search, const rec_t* rec, ulint* offsets,
2826+ dict_index_t* index, ibool is_sec_index) {
2827+ if (prebuilt->select_lock_type == LOCK_NONE && direction == 0
2828+ && unique_search && !prebuilt->templ_contains_blob
2829+ && !prebuilt->used_in_HANDLER) {
2830+ put_rec_to_row_cache(prebuilt->search_tuple, rec, offsets, index,
2831+ is_sec_index);
2832+ }
2833+}
2834+
2835+
2836 /********************************************************************//**
2837 Searches for rows in the database. This is used in the interface to
2838 MySQL. This function opens a cursor, and also implements fetch next
2839@@ -3417,6 +3449,7 @@
2840 ibool same_user_rec;
2841 mtr_t mtr;
2842 mem_heap_t* heap = NULL;
2843+ ibool is_get_from_row_cache = FALSE;
2844 ulint offsets_[REC_OFFS_NORMAL_SIZE];
2845 ulint* offsets = offsets_;
2846 ibool table_lock_waited = FALSE;
2847@@ -3929,14 +3962,26 @@
2848
2849 } else if (dtuple_get_n_fields(search_tuple) > 0) {
2850
2851- btr_pcur_open_with_no_init(index, search_tuple, mode,
2852- BTR_SEARCH_LEAF,
2853- pcur, 0, &mtr);
2854-
2855+ //TB_HOOK get_from_row_cache
2856+ if(prebuilt->select_lock_type == LOCK_NONE
2857+ && trx->mysql_n_tables_locked == 0
2858+ && direction == 0
2859+ && unique_search
2860+ && !prebuilt->used_in_HANDLER){
2861+ rec = get_from_row_cache(search_tuple,index,&mtr);
2862+ if (rec != NULL)
2863+ {
2864+ is_get_from_row_cache = TRUE;
2865+ }
2866+ }
2867+ if(is_get_from_row_cache == FALSE){
2868+ btr_pcur_open_with_no_init(index, search_tuple, mode,
2869+ BTR_SEARCH_LEAF,
2870+ pcur, 0, &mtr);
2871+ rec = btr_pcur_get_rec(pcur);
2872+ }
2873 pcur->trx_if_known = trx;
2874
2875- rec = btr_pcur_get_rec(pcur);
2876-
2877 if (!moves_up
2878 && !page_rec_is_supremum(rec)
2879 && set_also_gap_locks
2880@@ -3979,9 +4024,10 @@
2881 rec_loop:
2882 /*-------------------------------------------------------------*/
2883 /* PHASE 4: Look for matching records in a loop */
2884-
2885- rec = btr_pcur_get_rec(pcur);
2886-
2887+ if (is_get_from_row_cache == FALSE)
2888+ {
2889+ rec = btr_pcur_get_rec(pcur);
2890+ }
2891 if (srv_pass_corrupt_table && !rec) {
2892 err = DB_CORRUPTION;
2893 goto lock_wait_or_error;
2894@@ -3998,122 +4044,123 @@
2895 rec_print(rec);
2896 */
2897 #endif /* UNIV_SEARCH_DEBUG */
2898-
2899- if (page_rec_is_infimum(rec)) {
2900+ if(is_get_from_row_cache==FALSE){
2901+ if (page_rec_is_infimum(rec)) {
2902
2903- /* The infimum record on a page cannot be in the result set,
2904- and neither can a record lock be placed on it: we skip such
2905- a record. */
2906+ /* The infimum record on a page cannot be in the result set,
2907+ and neither can a record lock be placed on it: we skip such
2908+ a record. */
2909
2910- goto next_rec;
2911- }
2912+ goto next_rec;
2913+ }
2914
2915- if (page_rec_is_supremum(rec)) {
2916+ if (page_rec_is_supremum(rec)) {
2917
2918- if (set_also_gap_locks
2919- && !(srv_locks_unsafe_for_binlog
2920- || trx->isolation_level <= TRX_ISO_READ_COMMITTED)
2921- && prebuilt->select_lock_type != LOCK_NONE) {
2922+ if (set_also_gap_locks
2923+ && !(srv_locks_unsafe_for_binlog
2924+ || trx->isolation_level <= TRX_ISO_READ_COMMITTED)
2925+ && prebuilt->select_lock_type != LOCK_NONE) {
2926
2927- /* Try to place a lock on the index record */
2928+ /* Try to place a lock on the index record */
2929
2930- /* If innodb_locks_unsafe_for_binlog option is used
2931- or this session is using a READ COMMITTED isolation
2932- level we do not lock gaps. Supremum record is really
2933- a gap and therefore we do not set locks there. */
2934+ /* If innodb_locks_unsafe_for_binlog option is used
2935+ or this session is using a READ COMMITTED isolation
2936+ level we do not lock gaps. Supremum record is really
2937+ a gap and therefore we do not set locks there. */
2938
2939- offsets = rec_get_offsets(rec, index, offsets,
2940- ULINT_UNDEFINED, &heap);
2941- err = sel_set_rec_lock(btr_pcur_get_block(pcur),
2942- rec, index, offsets,
2943- prebuilt->select_lock_type,
2944- LOCK_ORDINARY, thr);
2945+ offsets = rec_get_offsets(rec, index, offsets,
2946+ ULINT_UNDEFINED, &heap);
2947+ err = sel_set_rec_lock(btr_pcur_get_block(pcur),
2948+ rec, index, offsets,
2949+ prebuilt->select_lock_type,
2950+ LOCK_ORDINARY, thr);
2951
2952- switch (err) {
2953- case DB_SUCCESS_LOCKED_REC:
2954- err = DB_SUCCESS;
2955- case DB_SUCCESS:
2956- break;
2957- default:
2958- goto lock_wait_or_error;
2959+ switch (err) {
2960+ case DB_SUCCESS_LOCKED_REC:
2961+ err = DB_SUCCESS;
2962+ case DB_SUCCESS:
2963+ break;
2964+ default:
2965+ goto lock_wait_or_error;
2966+ }
2967 }
2968- }
2969- /* A page supremum record cannot be in the result set: skip
2970- it now that we have placed a possible lock on it */
2971+ /* A page supremum record cannot be in the result set: skip
2972+ it now that we have placed a possible lock on it */
2973
2974- goto next_rec;
2975- }
2976+ goto next_rec;
2977+ }
2978
2979- /*-------------------------------------------------------------*/
2980- /* Do sanity checks in case our cursor has bumped into page
2981- corruption */
2982+ /*-------------------------------------------------------------*/
2983+ /* Do sanity checks in case our cursor has bumped into page
2984+ corruption */
2985
2986- if (comp) {
2987- next_offs = rec_get_next_offs(rec, TRUE);
2988- if (UNIV_UNLIKELY(next_offs < PAGE_NEW_SUPREMUM)) {
2989+ if (comp) {
2990+ next_offs = rec_get_next_offs(rec, TRUE);
2991+ if (UNIV_UNLIKELY(next_offs < PAGE_NEW_SUPREMUM)) {
2992
2993- goto wrong_offs;
2994- }
2995- } else {
2996- next_offs = rec_get_next_offs(rec, FALSE);
2997- if (UNIV_UNLIKELY(next_offs < PAGE_OLD_SUPREMUM)) {
2998+ goto wrong_offs;
2999+ }
3000+ } else {
3001+ next_offs = rec_get_next_offs(rec, FALSE);
3002+ if (UNIV_UNLIKELY(next_offs < PAGE_OLD_SUPREMUM)) {
3003
3004- goto wrong_offs;
3005+ goto wrong_offs;
3006+ }
3007 }
3008- }
3009
3010- if (UNIV_UNLIKELY(next_offs >= UNIV_PAGE_SIZE - PAGE_DIR)) {
3011+ if (UNIV_UNLIKELY(next_offs >= UNIV_PAGE_SIZE - PAGE_DIR)) {
3012
3013 wrong_offs:
3014- if (srv_pass_corrupt_table && !trx_sys_sys_space(index->table->space)) {
3015- index->table->is_corrupt = TRUE;
3016- fil_space_set_corrupt(index->table->space);
3017- }
3018+ if (srv_pass_corrupt_table && !trx_sys_sys_space(index->table->space)) {
3019+ index->table->is_corrupt = TRUE;
3020+ fil_space_set_corrupt(index->table->space);
3021+ }
3022
3023- if ((srv_force_recovery == 0 || moves_up == FALSE)
3024- && srv_pass_corrupt_table <= 1) {
3025- ut_print_timestamp(stderr);
3026- buf_page_print(page_align(rec), 0);
3027- fprintf(stderr,
3028- "\nInnoDB: rec address %p,"
3029- " buf block fix count %lu\n",
3030- (void*) rec, (ulong)
3031- btr_cur_get_block(btr_pcur_get_btr_cur(pcur))
3032- ->page.buf_fix_count);
3033- fprintf(stderr,
3034- "InnoDB: Index corruption: rec offs %lu"
3035- " next offs %lu, page no %lu,\n"
3036- "InnoDB: ",
3037- (ulong) page_offset(rec),
3038- (ulong) next_offs,
3039- (ulong) page_get_page_no(page_align(rec)));
3040- dict_index_name_print(stderr, trx, index);
3041- fputs(". Run CHECK TABLE. You may need to\n"
3042- "InnoDB: restore from a backup, or"
3043- " dump + drop + reimport the table.\n",
3044- stderr);
3045+ if ((srv_force_recovery == 0 || moves_up == FALSE)
3046+ && srv_pass_corrupt_table <= 1) {
3047+ ut_print_timestamp(stderr);
3048+ buf_page_print(page_align(rec), 0);
3049+ fprintf(stderr,
3050+ "\nInnoDB: rec address %p,"
3051+ " buf block fix count %lu\n",
3052+ (void*) rec, (ulong)
3053+ btr_cur_get_block(btr_pcur_get_btr_cur(pcur))
3054+ ->page.buf_fix_count);
3055+ fprintf(stderr,
3056+ "InnoDB: Index corruption: rec offs %lu"
3057+ " next offs %lu, page no %lu,\n"
3058+ "InnoDB: ",
3059+ (ulong) page_offset(rec),
3060+ (ulong) next_offs,
3061+ (ulong) page_get_page_no(page_align(rec)));
3062+ dict_index_name_print(stderr, trx, index);
3063+ fputs(". Run CHECK TABLE. You may need to\n"
3064+ "InnoDB: restore from a backup, or"
3065+ " dump + drop + reimport the table.\n",
3066+ stderr);
3067
3068- err = DB_CORRUPTION;
3069+ err = DB_CORRUPTION;
3070
3071- goto lock_wait_or_error;
3072- } else {
3073- /* The user may be dumping a corrupt table. Jump
3074- over the corruption to recover as much as possible. */
3075+ goto lock_wait_or_error;
3076+ } else {
3077+ /* The user may be dumping a corrupt table. Jump
3078+ over the corruption to recover as much as possible. */
3079
3080- fprintf(stderr,
3081- "InnoDB: Index corruption: rec offs %lu"
3082- " next offs %lu, page no %lu,\n"
3083- "InnoDB: ",
3084- (ulong) page_offset(rec),
3085- (ulong) next_offs,
3086- (ulong) page_get_page_no(page_align(rec)));
3087- dict_index_name_print(stderr, trx, index);
3088- fputs(". We try to skip the rest of the page.\n",
3089- stderr);
3090+ fprintf(stderr,
3091+ "InnoDB: Index corruption: rec offs %lu"
3092+ " next offs %lu, page no %lu,\n"
3093+ "InnoDB: ",
3094+ (ulong) page_offset(rec),
3095+ (ulong) next_offs,
3096+ (ulong) page_get_page_no(page_align(rec)));
3097+ dict_index_name_print(stderr, trx, index);
3098+ fputs(". We try to skip the rest of the page.\n",
3099+ stderr);
3100
3101- btr_pcur_move_to_last_on_page(pcur, &mtr);
3102+ btr_pcur_move_to_last_on_page(pcur, &mtr);
3103
3104- goto next_rec;
3105+ goto next_rec;
3106+ }
3107 }
3108 }
3109 /*-------------------------------------------------------------*/
3110@@ -4364,7 +4411,11 @@
3111
3112 /* Do nothing: we let a non-locking SELECT read the
3113 latest version of the record */
3114-
3115+ //TB_HOOK put_rec_to_row_cache
3116+ if (is_get_from_row_cache == FALSE) {
3117+ put_rec_to_row_cache_in_sel(prebuilt, direction, unique_search,
3118+ rec, offsets, index, index != clust_index);
3119+ }
3120 } else if (index == clust_index) {
3121
3122 /* Fetch a previous version of the row if the current
3123@@ -4390,6 +4441,11 @@
3124 }
3125
3126 if (old_vers == NULL) {
3127+ if (is_get_from_row_cache == TRUE) {
3128+ // returning the cached row here would be a phantom read: it is not in this read view
3129+ err = DB_RECORD_NOT_FOUND;
3130+ goto normal_return;
3131+ }
3132 /* The row did not exist yet in
3133 the read view */
3134
3135@@ -4397,6 +4453,13 @@
3136 }
3137
3138 rec = old_vers;
3139+ } else {
3140+ //TB_HOOK put_rec_to_row_cache
3141+ if (is_get_from_row_cache == FALSE) {
3142+ put_rec_to_row_cache_in_sel(prebuilt, direction,
3143+ unique_search, rec, offsets, index,
3144+ index != clust_index);
3145+ }
3146 }
3147 } else {
3148 /* We are looking into a non-clustered index,
3149@@ -4407,9 +4470,19 @@
3150
3151 ut_ad(!dict_index_is_clust(index));
3152
3153- if (!lock_sec_rec_cons_read_sees(
3154- rec, trx->read_view)) {
3155+ if ((is_get_from_row_cache == FALSE
3156+ && !lock_sec_rec_cons_read_sees(rec, trx->read_view))
3157+ || (is_get_from_row_cache == TRUE
3158+ && !lock_sec_rec_in_row_cache_cons_read_sees(rec,
3159+ offsets, trx->read_view))) {
3160 goto requires_clust_rec;
3161+ } else {
3162+ //TB_HOOK put_rec_to_row_cache
3163+ if (is_get_from_row_cache == FALSE) {
3164+ put_rec_to_row_cache_in_sel(prebuilt, direction,
3165+ unique_search, rec, offsets, index,
3166+ index != clust_index);
3167+ }
3168 }
3169 }
3170 }
3171@@ -4483,7 +4556,13 @@
3172 if (clust_rec == NULL) {
3173 /* The record did not exist in the read view */
3174 ut_ad(prebuilt->select_lock_type == LOCK_NONE);
3175+ if(is_get_from_row_cache == TRUE){
3176+ // when the record came from the row cache, clust_rec == NULL means there is no record
3177+ // visible to this trx
3178+ err = DB_RECORD_NOT_FOUND;
3179
3180+ goto normal_return;
3181+ }
3182 goto next_rec;
3183 }
3184 break;
3185@@ -4573,6 +4652,10 @@
3186 goto got_row;
3187 }
3188
3189+ if(is_get_from_row_cache == TRUE){
3190+ goto got_row;
3191+ }
3192+
3193 goto next_rec;
3194 } else {
3195 if (UNIV_UNLIKELY
3196@@ -4634,10 +4717,11 @@
3197 return 'end of file'. Exceptions are locking reads and the MySQL
3198 HANDLER command where the user can move the cursor with PREV or NEXT
3199 even after a unique search. */
3200-
3201- if (!unique_search_from_clust_index
3202- || prebuilt->select_lock_type != LOCK_NONE
3203- || prebuilt->used_in_HANDLER) {
3204+ // a record fetched from the row cache has no next or prev, so no cursor position is stored
3205+ if (is_get_from_row_cache == FALSE
3206+ && (!unique_search_from_clust_index
3207+ || prebuilt->select_lock_type != LOCK_NONE
3208+ || prebuilt->used_in_HANDLER)) {
3209
3210 /* Inside an update always store the cursor position */
3211
3212@@ -4649,6 +4733,7 @@
3213 goto normal_return;
3214
3215 next_rec:
3216+ ut_ad(is_get_from_row_cache == FALSE);
3217 /* Reset the old and new "did semi-consistent read" flags. */
3218 if (UNIV_UNLIKELY(prebuilt->row_read_type
3219 == ROW_READ_DID_SEMI_CONSISTENT)) {
3220Index: storage/innobase/row/row0umod.c
3221===================================================================
3222--- storage/innobase/row/row0umod.c (revision 498)
3223+++ storage/innobase/row/row0umod.c (revision 724)
3224@@ -43,6 +43,8 @@
3225 #include "row0upd.h"
3226 #include "que0que.h"
3227 #include "log0log.h"
3228+#include "row0cache.h"
3229+#include "btr0sea.h"
3230
3231 /* Considerations on undoing a modify operation.
3232 (1) Undoing a delete marking: all index records should be found. Some of
3233@@ -128,6 +130,26 @@
3234
3235 ut_ad(success);
3236
3237+ //TB_HOOK
3238+ {
3239+ dict_index_t* index;
3240+ rec_t* rec;
3241+ mem_heap_t* heap = NULL;
3242+ ulint offsets_[REC_OFFS_NORMAL_SIZE];
3243+ ulint* offsets;
3244+ rec_offs_init(offsets_);
3245+
3246+ index = dict_table_get_first_index(node->table);
3247+ //TB_HOOK
3248+ rec = btr_pcur_get_rec(pcur);
3249+ offsets = rec_get_offsets(rec, index, offsets_, ULINT_UNDEFINED, &heap);
3250+ remove_from_row_cache(rec, offsets, index);
3251+
3252+ if (UNIV_LIKELY_NULL(heap)) {
3253+ mem_heap_free(heap);
3254+ }
3255+ }
3256+
3257 if (mode == BTR_MODIFY_LEAF) {
3258
3259 err = btr_cur_optimistic_update(BTR_NO_LOCKING_FLAG
3260@@ -597,6 +619,10 @@
3261 transactions. */
3262 ut_a(thr_is_recv(thr));
3263 } else {
3264+ //TB_HOOK
3265+ if(dict_index_is_unique(index)){
3266+ remove_from_row_cache_for_tuple(entry , index);
3267+ }
3268 err = row_undo_mod_del_mark_or_remove_sec(
3269 node, thr, index, entry);
3270
3271@@ -646,6 +672,10 @@
3272 entry = row_build_index_entry(node->row, node->ext,
3273 index, heap);
3274 ut_a(entry);
3275+ //TB_HOOK
3276+ if(dict_index_is_unique(index)){
3277+ remove_from_row_cache_for_tuple(entry , index);
3278+ }
3279 err = row_undo_mod_del_unmark_sec_and_undo_update(
3280 BTR_MODIFY_LEAF, thr, index, entry);
3281 if (err == DB_FAIL) {
3282@@ -746,7 +776,10 @@
3283 version of it, if the secondary index record
3284 through which we do the search is
3285 delete-marked. */
3286-
3287+ //TB_HOOK
3288+ if(dict_index_is_unique(index)){
3289+ remove_from_row_cache_for_tuple(entry , index);
3290+ }
3291 err = row_undo_mod_del_mark_or_remove_sec(
3292 node, thr, index, entry);
3293 if (err != DB_SUCCESS) {
3294Index: sql/handler.h
3295===================================================================
3296--- sql/handler.h (revision 498)
3297+++ sql/handler.h (revision 724)
3298@@ -1579,6 +1579,10 @@
3299 DBUG_ASSERT(FALSE);
3300 return HA_ERR_WRONG_COMMAND;
3301 }
3302+ virtual bool ha_is_in_cache(const uchar * key, key_part_map keypart_map) {
3303+ uint key_len = calculate_key_len(table, active_index, key, keypart_map);
3304+ return is_in_cache(key, key_len);
3305+ }
3306 /**
3307 @brief
3308 Positions an index cursor to the index specified in the handle. Fetches the
3309@@ -2069,6 +2073,8 @@
3310 virtual int open(const char *name, int mode, uint test_if_locked)=0;
3311 virtual int index_init(uint idx, bool sorted) { active_index= idx; return 0; }
3312 virtual int index_end() { active_index= MAX_KEY; return 0; }
3313+ virtual bool is_in_cache(const uchar * key,
3314+ uint key_len){return false;}
3315 /**
3316 rnd_init() can be called two times without rnd_end() in between
3317 (it only makes sense if scan=1).
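
Note on the diff above: the row0sel.c hooks consult the row cache only for non-locking unique lookups, fall back to the B-tree cursor on a miss, and store the record afterwards, while the row0upd.c, row0uins.c and row0umod.c hooks evict the entry before the row changes. The sketch below illustrates that read-through/evict pattern in isolation. It is a simplified stand-in under stated assumptions, not the patch's code: slot_t, SLOTS, btree_lookup, row_read and row_invalidate are hypothetical names, and the real conditions (mysql_n_tables_locked, direction, BLOB columns, HANDLER usage, read-view visibility) are reduced to two flags.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SLOTS   8   /* hypothetical fixed-size cache, one entry per slot */
#define VAL_LEN 32

typedef struct {
    int  used;
    long key;
    char val[VAL_LEN];
} slot_t;

static slot_t cache[SLOTS];
static int    btree_reads;  /* counts fallbacks to the "index" */

/* Stand-in for the B-tree lookup (the btr_pcur_open path in the patch). */
static const char* btree_lookup(long key)
{
    btree_reads++;
    return (key % 2 == 0) ? "even-row" : "odd-row";
}

static slot_t* slot_for(long key)
{
    return &cache[(unsigned long) key % SLOTS];
}

/* Read path: use the cache only for non-locking unique lookups. */
const char* row_read(long key, int locking, int unique_search)
{
    slot_t*     s = slot_for(key);
    int         cacheable = !locking && unique_search;
    const char* v;

    if (cacheable && s->used && s->key == key) {
        return s->val;              /* cache hit: no B-tree access */
    }

    v = btree_lookup(key);          /* miss or not cacheable */
    assert(v != NULL);

    if (cacheable) {                /* populate the cache for next time */
        s->used = 1;
        s->key = key;
        strncpy(s->val, v, VAL_LEN - 1);
        s->val[VAL_LEN - 1] = '\0';
    }
    return v;
}

/* Write/undo path: evict the entry before the row is modified. */
void row_invalidate(long key)
{
    slot_t* s = slot_for(key);

    if (s->used && s->key == key) {
        s->used = 0;
    }
}
```

In this sketch a second row_read() of the same key with locking == 0 and unique_search == 1 is served from the slot, and a row_invalidate() in between forces the next read back to btree_lookup(). Locking reads always bypass the cache, mirroring the select_lock_type == LOCK_NONE check in the hooks.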
