Merge lp:~akopytov/percona-server/bug1131189 into lp:percona-server/5.5
Status: | Merged |
---|---|
Approved by: | Laurynas Biveinis |
Approved revision: | no longer in the source branch. |
Merged at revision: | 483 |
Proposed branch: | lp:~akopytov/percona-server/bug1131189 |
Merge into: | lp:percona-server/5.5 |
Prerequisite: | lp:~akopytov/percona-server/bug1131187 |
Diff against target: |
1371 lines (+446/-230) 16 files modified
Percona-Server/storage/innobase/handler/ha_innodb.cc (+3/-3)
Percona-Server/storage/innobase/include/read0read.h (+13/-8)
Percona-Server/storage/innobase/include/read0read.ic (+18/-25)
Percona-Server/storage/innobase/include/trx0sys.h (+38/-0)
Percona-Server/storage/innobase/include/trx0sys.ic (+26/-20)
Percona-Server/storage/innobase/include/trx0trx.h (+35/-4)
Percona-Server/storage/innobase/include/trx0trx.ic (+4/-4)
Percona-Server/storage/innobase/lock/lock0lock.c (+11/-11)
Percona-Server/storage/innobase/read/read0read.c (+72/-101)
Percona-Server/storage/innobase/row/row0sel.c (+7/-5)
Percona-Server/storage/innobase/row/row0vers.c (+3/-3)
Percona-Server/storage/innobase/srv/srv0srv.c (+1/-1)
Percona-Server/storage/innobase/trx/trx0purge.c (+6/-1)
Percona-Server/storage/innobase/trx/trx0roll.c (+4/-4)
Percona-Server/storage/innobase/trx/trx0sys.c (+10/-1)
Percona-Server/storage/innobase/trx/trx0trx.c (+195/-39) |
To merge this branch: | bzr merge lp:~akopytov/percona-server/bug1131189 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Laurynas Biveinis (community) | Approve | ||
Alexey Kopytov (community) | Needs Resubmitting | ||
Review via email: mp+150188@code.launchpad.net |
Commit message
Description of the change
Bug #1131189: Remove trx_list scan from read_view_
The patch introduces the concept of "trx descriptors": a global
ordered array containing the IDs of transactions in either the TRX_ACTIVE or
TRX_PREPARED state. This allows replacing the trx_list scan in
read_view_
binary search on the descriptors array and two memcpy()s.
The purposes of using an array of IDs of transactions in certain states are:
- we remove dependencies on trx_struct size and cache locality of those
structures in memory
- per-node copying is replaced with memcpy()
- we don't have to check trx_struct fields for each trx_list node, such
as ID, state and no.
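The idea of building a read view from the sorted array can be sketched as follows. This is a minimal illustration, not the patch's code: the names descr_cmp, find_descr and snapshot_descr are hypothetical, and trx_id_t is assumed to be a 64-bit integer. It locates the creator's own ID with a binary search and copies the ranges before and after it with two memcpy() calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t trx_id_t;	/* assumption: 8-byte transaction IDs */

/* bsearch() comparator for an ascending array of trx IDs. */
static int descr_cmp(const void *a, const void *b)
{
	trx_id_t x = *(const trx_id_t *) a;
	trx_id_t y = *(const trx_id_t *) b;

	return (x > y) - (x < y);
}

/* Return the slot holding id, or NULL if id is not in the array. */
static const trx_id_t *find_descr(const trx_id_t *descr, size_t n,
				  trx_id_t id)
{
	if (n == 0) {
		return NULL;
	}

	return bsearch(&id, descr, n, sizeof *descr, descr_cmp);
}

/* Copy global[0..n-1] into view, leaving out self if it is present.
view must have room for n elements. Returns the number of IDs copied. */
static size_t snapshot_descr(trx_id_t *view, const trx_id_t *global,
			     size_t n, trx_id_t self)
{
	const trx_id_t *slot = find_descr(global, n, self);
	size_t i = slot ? (size_t) (slot - global) : n;

	assert(slot == NULL || n > 0);

	if (i > 0) {
		/* Copy the [0; i-1] range */
		memcpy(view, global, i * sizeof *view);
	}

	if (i + 1 < n) {
		/* Copy the [i+1; n-1] range, skipping slot i */
		memcpy(view + i, global + i + 1,
		       (n - i - 1) * sizeof *view);
	}

	return slot ? n - 1 : n;
}
```

Because the copy preserves ascending order, a later visibility check on the view can reuse the same binary search instead of walking the IDs one by one.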
Since there are no transaction serialization numbers (i.e. trx->no) in the
descriptors array, this check was replaced by keeping a separate,
trx->no ordered list (trx_sys-
minimum for the current read view is then simply a matter of getting
trx->no of the first element from trx_serial_list.
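As a rough sketch of why the ordered list makes this cheap: if the list is kept sorted by serialization number, the minimum is always at the head. The struct serial_node type and the low_limit_no() helper below are hypothetical stand-ins, not InnoDB's list macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t trx_id_t;	/* assumption: 8-byte IDs and numbers */

/* Hypothetical node of a list kept in ascending trx->no order. */
struct serial_node {
	trx_id_t		no;
	struct serial_node*	next;
};

/* Return the serialization-number limit for a new read view: the head
of the ordered list if it is smaller than max_trx_id, else max_trx_id. */
static trx_id_t low_limit_no(const struct serial_node *head,
			     trx_id_t max_trx_id)
{
	/* Only the first element is inspected, so the ordering invariant
	must hold; check the first pair in debug builds. */
	assert(head == NULL || head->next == NULL
	       || head->no <= head->next->no);

	if (head != NULL && head->no < max_trx_id) {
		return head->no;
	}

	return max_trx_id;
}
```

The lookup is constant time regardless of how many transactions are in the list, which is the property the description relies on.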
Costs for maintaining the descriptors array:
- whenever a transaction is started, we need to insert its ID into the
descriptors array. This in most cases is very cheap, as transactions
are started with increasing IDs, so we can just add it as the last
element in the descriptors array. In the unlikely case when this
invariant does not hold (which is impossible with the current code),
we look for the right slot using a linear search. A binary search
would work better for cases when the right slot is far away from the
array end, but again, this is defensive against future code changes
that could possibly lead to breaking the ascending order in which new
IDs are added, and we hope that it will be "mostly ascending", so a
linear search should be faster.
- whenever a transaction is committed, we need to remove its ID from the
descriptors array, which is done with ut_memmove() on the array. We
could theoretically allow unused array slots but that would: 1)
increase the size of descriptors by the 'used/unused' flag; 2) make
using memcpy() in read_view_
have to filter unused slots and 3) make the operation of adding a new
descriptor scan the array for an unused slot
- when removing a transaction ID from the descriptors array, we first
have to find the corresponding slot with a binary search. We could
avoid this by maintaining the "trx -> descriptors slot" association,
but since the array may be resized or reordered on insertion, keeping
this association is not practically feasible
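The insert and remove costs listed above can be illustrated with a small sketch. It assumes the linear search runs backwards from the end of the array (the description does not say which direction is used), uses plain memmove() and bsearch() instead of InnoDB's wrappers, and the names descr_insert and descr_remove are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t trx_id_t;	/* assumption: 8-byte transaction IDs */

/* bsearch() comparator for an ascending array of trx IDs. */
static int descr_cmp(const void *a, const void *b)
{
	trx_id_t x = *(const trx_id_t *) a;
	trx_id_t y = *(const trx_id_t *) b;

	return (x > y) - (x < y);
}

/* Insert id into the ascending array descr[0..*n-1]. The caller must
guarantee room for one more element. */
static void descr_insert(trx_id_t *descr, size_t *n, trx_id_t id)
{
	size_t pos = *n;

	/* Common case: id is the largest, so the loop body never runs.
	Otherwise walk back from the end to find the right slot. */
	while (pos > 0 && descr[pos - 1] > id) {
		pos--;
	}

	if (pos < *n) {
		memmove(descr + pos + 1, descr + pos,
			(*n - pos) * sizeof *descr);
	}

	descr[pos] = id;
	(*n)++;

	assert(pos == 0 || descr[pos - 1] <= id);
}

/* Remove id from the array. Returns 1 if it was found, 0 otherwise. */
static int descr_remove(trx_id_t *descr, size_t *n, trx_id_t id)
{
	trx_id_t *slot;
	size_t tail;

	if (*n == 0) {
		return 0;
	}

	slot = bsearch(&id, descr, *n, sizeof *descr, descr_cmp);

	if (slot == NULL) {
		return 0;
	}

	/* Close the gap by shifting the elements after the slot. */
	tail = *n - (size_t) (slot - descr) - 1;

	if (tail > 0) {
		memmove(slot, slot + 1, tail * sizeof *descr);
	}

	(*n)--;

	return 1;
}
```

With mostly ascending IDs the insert degenerates to an append, while removal pays for one binary search plus one memmove() of the tail.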
All of the above is performed under the kernel_mutex, but benchmarks
show this overhead is negligible compared to the list scan in
read_view_
The initial size of the descriptors array is 1000 slots (i.e. 8000
bytes). It is resized whenever we need more descriptors.
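A minimal sketch of such a growable array follows. The doubling policy is taken from the review discussion (n_max *= 2), not from the description itself; the struct and function names are hypothetical, and standard malloc()/realloc() stand in for InnoDB's allocation wrappers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint64_t trx_id_t;	/* 8 bytes, so 1000 slots are 8000 bytes */

enum { DESCR_INITIAL_SIZE = 1000 };

/* Hypothetical container for the descriptors array. */
struct descr_array {
	trx_id_t*	slots;
	size_t		n_max;	/* allocated slots */
	size_t		n_used;	/* slots currently in use */
};

/* Allocate the initial array. Returns 1 on success, 0 on failure. */
static int descr_array_init(struct descr_array *a)
{
	a->slots = malloc(DESCR_INITIAL_SIZE * sizeof *a->slots);
	a->n_max = a->slots ? DESCR_INITIAL_SIZE : 0;
	a->n_used = 0;

	return a->slots != NULL;
}

/* Make sure at least one free slot exists, doubling the array when it
is full. Returns 1 on success, 0 on allocation failure. */
static int descr_array_reserve_one(struct descr_array *a)
{
	assert(a->n_used <= a->n_max);

	if (a->n_used == a->n_max) {
		size_t new_max = a->n_max
			? a->n_max * 2 : DESCR_INITIAL_SIZE;
		trx_id_t *p = realloc(a->slots, new_max * sizeof *p);

		if (p == NULL) {
			return 0;
		}

		a->slots = p;
		a->n_max = new_max;
	}

	return 1;
}
```

Since only one slot is requested at a time, a single doubling is always enough, which matches the reviewer's point that a loop around the doubling is unnecessary.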
The patch also renames 'conc_state' to 'state' in trx_struct. The patch
would be much smaller without this change, but the purpose is to make
sure we notice any code changes around transaction states, as it is
critical for correct descriptors array maintenance. I tried implementing
a getter/setter pair of functions for trx state, but the patch got even
messier, because trx state may be changed with kernel_mutex either
locked or unlocked, whereas we can only manipulate the descriptors array
with the mutex locked.
Alexey Kopytov (akopytov) wrote : | # |
Laurynas Biveinis (laurynas-biveinis) wrote : | # |
It's a good patch, just a couple of relatively small questions and comments:
- The only thing I'd look into doing differently would be not
doing linear search at all in trx_reserve_
adding a debug assertion that the array is sorted after the
insert (which would be just the comparison with the n-1th
element). Would save a dozen code bytes in a presumably very hot
code path.
- I must be missing something here, but why is there padding
inserted between descr_n_max and descr_n_used in trx_sys? It
seems that these fields are accessed at close times.
- File read0read.h adds the include of trx0sys.h. But the diff of
read0read.h does not show anything new external used there.
Rather, it is read0read.ic which starts using
trx_
moved to read0read.ic.
- Spurious whitespace changes in diff lines 381, 382, 645,
1155-1157, 1220.
- Why does trx_reserve_
n_max) n_max *= 2 ? A second iteration of this loop can never
happen, it looks maybe a bit too defensive for me,
and this is next to a very hot code path, isn't it? Since there
is if (n_used > n_max) just above, one unconditional n_max *= 2;
should be enough.
- Why does trx_free_prepared() free its view now?
Alexey Kopytov (akopytov) wrote : | # |
Hi Laurynas,
Thanks for the review.
On Mon, 25 Mar 2013 08:25:24 -0000, Laurynas Biveinis wrote:
> Review: Needs Information
>
> It's a good patch, just a couple of relatively small questions and comments:
>
> - The only thing I'd look into doing differently would be not
> doing linear search at all in trx_reserve_
> adding a debug assertion that the array is sorted after the
> insert (which would be just the comparison with the n-1th
> element). Would save a dozen code bytes in a presumably very hot
> code path.
>
I'd like to keep that code. Assuming that invariant will always hold
and enforcing it with a debug-only assertion looks a bit fragile to me.
> - I must be missing something here, but why is there padding
> inserted between descr_n_max and descr_n_used in trx_sys? It
> seems that these fields are accessed at close times.
>
They are accessed at close times, but not always modified at close
times. So by splitting them into separate cache lines we avoid false
sharing in some (actually most) cases.
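The layout argument can be checked with a small sketch. It assumes a 64-byte cache line, matching the 64-byte pads in the diff; struct descr_header and the helper names are hypothetical and only mirror the three fields under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumption: matches the 64-byte pads */

/* Hypothetical cut-down layout: rarely written fields first, then a
pad, then the counter written on every transaction start and commit. */
struct descr_header {
	uint64_t*	descriptors;	/* written only on resize */
	size_t		n_max;		/* written only on resize */
	char		pad[CACHE_LINE_SIZE];
	size_t		n_used;		/* written on start/commit */
};

/* Distance in bytes between n_max and n_used. */
static size_t descr_fields_gap(void)
{
	return offsetof(struct descr_header, n_used)
		- offsetof(struct descr_header, n_max);
}

/* With a gap of at least one cache line, the two fields cannot land on
the same line, whatever the alignment of the struct itself. */
static void descr_check_layout(void)
{
	assert(descr_fields_gap() >= (size_t) CACHE_LINE_SIZE);
}
```

A writer updating n_used therefore does not invalidate the line holding descriptors and n_max for readers on other cores, at the cost of a larger structure.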
> - File read0read.h adds the include of trx0sys.h. But the diff of
> read0read.h does not show anything new external used there.
> Rather, it is read0read.ic which starts using
> trx_find_
> moved to read0read.ic.
>
It was originally in read0read.ic, but that was breaking debug builds
due to some tricky inter-dependency between read0read.h, read0read.ic
and something else. Details elude me now, but my conclusion was that
adding that include to .h was the only clean way to resolve them.
> - Spurious whitespace changes in diff lines 381, 382, 645,
> 1155-1157, 1220.
>
Fixed those and a few other ones.
> - Why does trx_reserve_
> n_max) n_max *= 2 ? A second iteration of this loop can never
> happen, it looks maybe a bit too defensive for me,
> and this is next to a very hot code path, isn't it? Since there
> is if (n_used > n_max) just above, one unconditional n_max *= 2;
> should be enough.
>
Probably remnants from previous implementation iterations. Fixed.
> - Why does trx_free_prepared() free its view now?
>
Because we are going to free trx a few lines later. Technically it's a
follow-up to the fix for bug #1131187 to avoid a memory leak, but they
both were originally in the same tree (with multiple revisions per
each), and it was tricky to figure out what belongs to what. There's
actually another change that logically would belong to that fix.
Alexey Kopytov (akopytov) wrote : | # |
Will rebase on GCA branches and resubmit along with a 5.6 MP, as porting this to 5.6 will be rather non-trivial.
Alexey Kopytov (akopytov) wrote : | # |
Actually, I will not, because this depends on the already merged fix for bug #1131187 which will be missing from the GCA 5.5 branch. Which means waiting for a proper 5.6 branch will be the only way to proceed with porting.
Laurynas Biveinis (laurynas-biveinis) : | # |
Preview Diff
1 | === modified file 'Percona-Server/storage/innobase/handler/ha_innodb.cc' |
2 | --- Percona-Server/storage/innobase/handler/ha_innodb.cc 2013-03-23 01:08:22 +0000 |
3 | +++ Percona-Server/storage/innobase/handler/ha_innodb.cc 2013-03-25 10:05:33 +0000 |
4 | @@ -1910,7 +1910,7 @@ |
5 | /*===========*/ |
6 | trx_t* trx) /* in: transaction */ |
7 | { |
8 | - return(trx->conc_state != TRX_NOT_STARTED); |
9 | + return(trx->state != TRX_NOT_STARTED); |
10 | } |
11 | |
12 | /*********************************************************************//** |
13 | @@ -9046,7 +9046,7 @@ |
14 | |
15 | prebuilt->trx->op_info = "confirming rows of SYS_STATS to store statistics"; |
16 | |
17 | - ut_a(prebuilt->trx->conc_state == TRX_NOT_STARTED); |
18 | + ut_a(!trx_is_started(prebuilt->trx)); |
19 | |
20 | for (index = dict_table_get_first_index(ib_table); |
21 | index != NULL; |
22 | @@ -9059,7 +9059,7 @@ |
23 | innobase_commit_low(prebuilt->trx); |
24 | } |
25 | |
26 | - ut_a(prebuilt->trx->conc_state == TRX_NOT_STARTED); |
27 | + ut_a(!trx_is_started(prebuilt->trx)); |
28 | } |
29 | |
30 | prebuilt->trx->op_info = "updating table statistics"; |
31 | |
32 | === modified file 'Percona-Server/storage/innobase/include/read0read.h' |
33 | --- Percona-Server/storage/innobase/include/read0read.h 2013-03-15 09:55:17 +0000 |
34 | +++ Percona-Server/storage/innobase/include/read0read.h 2013-03-25 10:05:33 +0000 |
35 | @@ -32,6 +32,7 @@ |
36 | #include "ut0byte.h" |
37 | #include "ut0lst.h" |
38 | #include "trx0trx.h" |
39 | +#include "trx0sys.h" |
40 | #include "read0types.h" |
41 | |
42 | /*********************************************************************//** |
43 | @@ -44,8 +45,11 @@ |
44 | /*===============*/ |
45 | trx_id_t cr_trx_id, /*!< in: trx_id of creating |
46 | transaction, or 0 used in purge */ |
47 | - read_view_t* view); /*!< in: pre-allocated view array or |
48 | - NULL if a new one needs to be created */ |
49 | + read_view_t* view, /*!< in: current read view or NULL if it |
50 | + doesn't exist yet */ |
51 | + ibool exclude_self); /*!< in: TRUE, if cr_trx_id should be |
52 | + excluded from the resulting view */ |
53 | + |
54 | /*********************************************************************//** |
55 | Makes a copy of the oldest existing read view, or opens a new. The view |
56 | must be closed with ..._close. |
57 | @@ -150,19 +154,20 @@ |
58 | are strictly smaller (<) than this value. |
59 | In other words, |
60 | this is the "low water mark". */ |
61 | - ulint n_trx_ids; |
62 | + ulint n_descr; |
63 | /*!< Number of cells in the trx_ids array */ |
64 | - ulint max_trx_ids; |
65 | + ulint max_descr; |
66 | /*!< Maximum number of cells in the trx_ids |
67 | array */ |
68 | - trx_id_t* trx_ids;/*!< Additional trx ids which the read should |
69 | - not see: typically, these are the active |
70 | + trx_id_t* descriptors; |
71 | + /*!< Array of trx descriptors which the read |
72 | + should not see: typically, these are the active |
73 | transactions at the time when the read is |
74 | serialized, except the reading transaction |
75 | itself; the trx ids in this array are in a |
76 | descending order. These trx_ids should be |
77 | - between the "low" and "high" water marks, |
78 | - that is, up_limit_id and low_limit_id. */ |
79 | + between the "low" and "high" water marks, that |
80 | + is, up_limit_id and low_limit_id. */ |
81 | trx_id_t creator_trx_id; |
82 | /*!< trx id of creating transaction, or |
83 | 0 used in purge */ |
84 | |
85 | === modified file 'Percona-Server/storage/innobase/include/read0read.ic' |
86 | --- Percona-Server/storage/innobase/include/read0read.ic 2010-07-21 14:22:29 +0000 |
87 | +++ Percona-Server/storage/innobase/include/read0read.ic 2013-03-25 10:05:33 +0000 |
88 | @@ -25,6 +25,11 @@ |
89 | |
90 | /*********************************************************************//** |
91 | Gets the nth trx id in a read view. |
92 | + |
93 | +Upstream code stores array of trx_ids in the descending order. Percona Server |
94 | +keeps it in the ascending order for performance reasons. Let us keep the |
95 | +semantics. |
96 | + |
97 | @return trx id */ |
98 | UNIV_INLINE |
99 | trx_id_t |
100 | @@ -33,13 +38,17 @@ |
101 | const read_view_t* view, /*!< in: read view */ |
102 | ulint n) /*!< in: position */ |
103 | { |
104 | - ut_ad(n < view->n_trx_ids); |
105 | + ut_ad(n < view->n_descr); |
106 | |
107 | - return(*(view->trx_ids + n)); |
108 | + return(view->descriptors[view->n_descr - 1 - n]); |
109 | } |
110 | |
111 | /*********************************************************************//** |
112 | -Sets the nth trx id in a read view. */ |
113 | +Sets the nth trx id in a read view. |
114 | + |
115 | +Upstream code stores array of trx_ids in the descending order. Percona Server |
116 | +keeps it in the ascending order for performance reasons. Let us keep the |
117 | +semantics. */ |
118 | UNIV_INLINE |
119 | void |
120 | read_view_set_nth_trx_id( |
121 | @@ -48,9 +57,9 @@ |
122 | ulint n, /*!< in: position */ |
123 | trx_id_t trx_id) /*!< in: trx id to set */ |
124 | { |
125 | - ut_ad(n < view->n_trx_ids); |
126 | + ut_ad(n < view->n_descr); |
127 | |
128 | - *(view->trx_ids + n) = trx_id; |
129 | + view->descriptors[view->n_descr - 1 - n] = trx_id; |
130 | } |
131 | |
132 | /*********************************************************************//** |
133 | @@ -63,9 +72,6 @@ |
134 | const read_view_t* view, /*!< in: read view */ |
135 | trx_id_t trx_id) /*!< in: trx id */ |
136 | { |
137 | - ulint n_ids; |
138 | - ulint i; |
139 | - |
140 | if (trx_id < view->up_limit_id) { |
141 | |
142 | return(TRUE); |
143 | @@ -76,21 +82,8 @@ |
144 | return(FALSE); |
145 | } |
146 | |
147 | - /* We go through the trx ids in the array smallest first: this order |
148 | - may save CPU time, because if there was a very long running |
149 | - transaction in the trx id array, its trx id is looked at first, and |
150 | - the first two comparisons may well decide the visibility of trx_id. */ |
151 | - |
152 | - n_ids = view->n_trx_ids; |
153 | - |
154 | - for (i = 0; i < n_ids; i++) { |
155 | - trx_id_t view_trx_id |
156 | - = read_view_get_nth_trx_id(view, n_ids - i - 1); |
157 | - |
158 | - if (trx_id <= view_trx_id) { |
159 | - return(trx_id != view_trx_id); |
160 | - } |
161 | - } |
162 | - |
163 | - return(TRUE); |
164 | + /* Do a binary search over this view's descriptors array */ |
165 | + |
166 | + return(trx_find_descriptor(view->descriptors, view->n_descr, |
167 | + trx_id) == NULL); |
168 | } |
169 | |
170 | === modified file 'Percona-Server/storage/innobase/include/trx0sys.h' |
171 | --- Percona-Server/storage/innobase/include/trx0sys.h 2012-10-22 00:07:03 +0000 |
172 | +++ Percona-Server/storage/innobase/include/trx0sys.h 2013-03-25 10:05:33 +0000 |
173 | @@ -249,6 +249,17 @@ |
174 | trx_sys_get_new_trx_id(void); |
175 | /*========================*/ |
176 | |
177 | +/*************************************************************//** |
178 | +Find a slot for a given trx ID in a descriptors array. |
179 | +@return: slot pointer */ |
180 | +UNIV_INLINE |
181 | +trx_id_t* |
182 | +trx_find_descriptor( |
183 | +/*================*/ |
184 | + const trx_id_t* descriptors, /*!< in: descriptors array */ |
185 | + ulint n_descr, /*!< in: array size */ |
186 | + trx_id_t trx_id); /*!< in: trx pointer */ |
187 | + |
188 | #ifdef UNIV_DEBUG |
189 | /* Flag to control TRX_RSEG_N_SLOTS behavior debugging. */ |
190 | extern uint trx_rseg_n_slots_debug; |
191 | @@ -633,6 +644,8 @@ |
192 | | TRX_SYS_FILE_FORMAT_TAG_MAGIC_N_LOW) |
193 | /* @} */ |
194 | |
195 | +#define TRX_DESCR_ARRAY_INITIAL_SIZE 1000 |
196 | + |
197 | #ifndef UNIV_HOTBACKUP |
198 | /** Doublewrite control struct */ |
199 | struct trx_doublewrite_struct{ |
200 | @@ -660,16 +673,41 @@ |
201 | trx_id_t max_trx_id; /*!< The smallest number not yet |
202 | assigned as a transaction id or |
203 | transaction number */ |
204 | + char pad1[64]; /*!< Ensure max_trx_id does not share |
205 | + cache line with other fields. */ |
206 | + trx_id_t* descriptors; /*!< Array of trx descriptors */ |
207 | + ulint descr_n_max; /*!< The current size of the descriptors |
208 | + array. */ |
209 | + char pad2[64]; /*!< Ensure static descriptor fields |
210 | + do not share cache lines with |
211 | + descr_n_used */ |
212 | + ulint descr_n_used; /*!< Number of used elements in the |
213 | + descriptors array. */ |
214 | + char pad3[64]; /*!< Ensure descriptors do not share |
215 | + cache line with other fields */ |
216 | UT_LIST_BASE_NODE_T(trx_t) trx_list; |
217 | /*!< List of active and committed in |
218 | memory transactions, sorted on trx id, |
219 | biggest first */ |
220 | + char pad4[64]; /*!< Ensure list base nodes do not |
221 | + share cache line with other fields */ |
222 | UT_LIST_BASE_NODE_T(trx_t) mysql_trx_list; |
223 | /*!< List of transactions created |
224 | for MySQL */ |
225 | + char pad5[64]; /*!< Ensure list base nodes do not |
226 | + share cache line with other fields */ |
227 | + UT_LIST_BASE_NODE_T(trx_t) trx_serial_list; |
228 | + /*!< trx->no ordered List of |
229 | + transactions in either TRX_PREPARED or |
230 | + TRX_ACTIVE which have already been |
231 | + assigned a serialization number */ |
232 | + char pad6[64]; /*!< Ensure trx_serial_list does not |
233 | + share cache line with other fields */ |
234 | UT_LIST_BASE_NODE_T(trx_rseg_t) rseg_list; |
235 | /*!< List of rollback segment |
236 | objects */ |
237 | + char pad7[64]; /*!< Ensure list base nodes do not |
238 | + share cache line with other fields */ |
239 | trx_rseg_t* latest_rseg; /*!< Latest rollback segment in the |
240 | round-robin assignment of rollback |
241 | segments to transactions */ |
242 | |
243 | === modified file 'Percona-Server/storage/innobase/include/trx0sys.ic' |
244 | --- Percona-Server/storage/innobase/include/trx0sys.ic 2012-05-10 07:49:14 +0000 |
245 | +++ Percona-Server/storage/innobase/include/trx0sys.ic 2013-03-25 10:05:33 +0000 |
246 | @@ -367,28 +367,11 @@ |
247 | /*==========*/ |
248 | trx_id_t trx_id) /*!< in: trx id of the transaction */ |
249 | { |
250 | - trx_t* trx; |
251 | - |
252 | ut_ad(mutex_own(&(kernel_mutex))); |
253 | |
254 | - if (trx_id < trx_list_get_min_trx_id()) { |
255 | - |
256 | - return(FALSE); |
257 | - } |
258 | - |
259 | - if (UNIV_UNLIKELY(trx_id >= trx_sys->max_trx_id)) { |
260 | - |
261 | - /* There must be corruption: we return TRUE because this |
262 | - function is only called by lock_clust_rec_some_has_impl() |
263 | - and row_vers_impl_x_locked_off_kernel() and they have |
264 | - diagnostic prints in this case */ |
265 | - |
266 | - return(TRUE); |
267 | - } |
268 | - |
269 | - trx = trx_get_on_id(trx_id); |
270 | - if (trx && (trx->conc_state == TRX_ACTIVE |
271 | - || trx->conc_state == TRX_PREPARED)) { |
272 | + if (trx_find_descriptor(trx_sys->descriptors, |
273 | + trx_sys->descr_n_used, |
274 | + trx_id)) { |
275 | |
276 | return(TRUE); |
277 | } |
278 | @@ -425,4 +408,27 @@ |
279 | return(id); |
280 | } |
281 | |
282 | + |
283 | +/*************************************************************//** |
284 | +Find a slot for a given trx ID in a descriptors array. |
285 | +@return: slot pointer */ |
286 | +UNIV_INLINE |
287 | +trx_id_t* |
288 | +trx_find_descriptor( |
289 | +/*================*/ |
290 | + const trx_id_t* descriptors, /*!< in: descriptors array */ |
291 | + ulint n_descr, /*!< in: array size */ |
292 | + trx_id_t trx_id) /*!< in: trx pointer */ |
293 | +{ |
294 | + ut_ad(descriptors != trx_sys->descriptors || |
295 | + mutex_own(&kernel_mutex)); |
296 | + |
297 | + if (UNIV_UNLIKELY(n_descr == 0)) { |
298 | + |
299 | + return(NULL); |
300 | + } |
301 | + |
302 | + return((trx_id_t *) bsearch(&trx_id, descriptors, n_descr, |
303 | + sizeof(trx_id_t), trx_descr_cmp)); |
304 | +} |
305 | #endif /* !UNIV_HOTBACKUP */ |
306 | |
307 | === modified file 'Percona-Server/storage/innobase/include/trx0trx.h' |
308 | --- Percona-Server/storage/innobase/include/trx0trx.h 2013-03-15 09:55:17 +0000 |
309 | +++ Percona-Server/storage/innobase/include/trx0trx.h 2013-03-25 10:05:33 +0000 |
310 | @@ -447,6 +447,23 @@ |
311 | /*==================*/ |
312 | const trx_t* trx); /*!< in: transaction */ |
313 | |
314 | +/*************************************************************//** |
315 | +Callback function for trx_find_descriptor() to compare trx IDs. */ |
316 | +UNIV_INTERN |
317 | +int |
318 | +trx_descr_cmp( |
319 | +/*==========*/ |
320 | + const void *a, /*!< in: pointer to first comparison argument */ |
321 | + const void *b); /*!< in: pointer to second comparison argument */ |
322 | + |
323 | +/*************************************************************//** |
324 | +Release a slot for a given trx in the global descriptors array. */ |
325 | +UNIV_INTERN |
326 | +void |
327 | +trx_release_descriptor( |
328 | +/*===================*/ |
329 | + trx_t* trx); /*!< in: trx pointer */ |
330 | + |
331 | /* Signal to a transaction */ |
332 | struct trx_sig_struct{ |
333 | unsigned type:3; /*!< signal type */ |
334 | @@ -477,10 +494,18 @@ |
335 | const char* op_info; /*!< English text describing the |
336 | current operation, or an empty |
337 | string */ |
338 | - ulint conc_state; /*!< state of the trx from the point |
339 | - of view of concurrency control: |
340 | - TRX_ACTIVE, TRX_COMMITTED_IN_MEMORY, |
341 | - ... */ |
342 | + ulint state; /*!< state of the trx from the point of |
343 | + view of concurrency control: TRX_ACTIVE, |
344 | + TRX_COMMITTED_IN_MEMORY, ... This was |
345 | + called 'conc_state' in the upstream and |
346 | + has been renamed in Percona Server, |
347 | + because changing it's value to/from |
348 | + either TRX_ACTIVE or TRX_PREPARED |
349 | + requires calling |
350 | + trx_reserve_descriptor() / |
351 | + trx_release_descriptor(). Different name |
352 | + ensures we notice any new code changing |
353 | + the state. */ |
354 | /*------------------------------*/ |
355 | /* MySQL has a transaction coordinator to coordinate two phase |
356 | commit between multiple storage engines and the binary log. When |
357 | @@ -495,6 +520,9 @@ |
358 | also be set to 1. This is used in the |
359 | XA code */ |
360 | unsigned called_commit_ordered:1;/* 1 if innobase_commit_ordered has run. */ |
361 | + unsigned is_in_trx_serial_list:1; |
362 | + /* Set when transaction is in the |
363 | + trx_serial_list */ |
364 | /*------------------------------*/ |
365 | ulint isolation_level;/* TRX_ISO_REPEATABLE_READ, ... */ |
366 | ulint check_foreigns; /* normally TRUE, but if the user |
367 | @@ -628,6 +656,9 @@ |
368 | UT_LIST_NODE_T(trx_t) |
369 | mysql_trx_list; /*!< list of transactions created for |
370 | MySQL */ |
371 | + UT_LIST_NODE_T(trx_t) |
372 | + trx_serial_list;/*!< list node for |
373 | + trx_sys->trx_serial_list */ |
374 | /*------------------------------*/ |
375 | ulint error_state; /*!< 0 if no error, otherwise error |
376 | number; NOTE That ONLY the thread |
377 | |
378 | === modified file 'Percona-Server/storage/innobase/include/trx0trx.ic' |
379 | --- Percona-Server/storage/innobase/include/trx0trx.ic 2010-07-21 14:22:29 +0000 |
380 | +++ Percona-Server/storage/innobase/include/trx0trx.ic 2013-03-25 10:05:33 +0000 |
381 | @@ -31,9 +31,9 @@ |
382 | /*=====================*/ |
383 | trx_t* trx) /*!< in: transaction */ |
384 | { |
385 | - ut_ad(trx->conc_state != TRX_COMMITTED_IN_MEMORY); |
386 | + ut_ad(trx->state != TRX_COMMITTED_IN_MEMORY); |
387 | |
388 | - if (trx->conc_state == TRX_NOT_STARTED) { |
389 | + if (trx->state == TRX_NOT_STARTED) { |
390 | |
391 | trx_start(trx, ULINT_UNDEFINED); |
392 | } |
393 | @@ -48,9 +48,9 @@ |
394 | /*=========================*/ |
395 | trx_t* trx) /*!< in: transaction */ |
396 | { |
397 | - ut_ad(trx->conc_state != TRX_COMMITTED_IN_MEMORY); |
398 | + ut_ad(trx->state != TRX_COMMITTED_IN_MEMORY); |
399 | |
400 | - if (trx->conc_state == TRX_NOT_STARTED) { |
401 | + if (trx->state == TRX_NOT_STARTED) { |
402 | |
403 | trx_start_low(trx, ULINT_UNDEFINED); |
404 | } |
405 | |
406 | === modified file 'Percona-Server/storage/innobase/lock/lock0lock.c' |
407 | --- Percona-Server/storage/innobase/lock/lock0lock.c 2013-03-05 12:16:18 +0000 |
408 | +++ Percona-Server/storage/innobase/lock/lock0lock.c 2013-03-25 10:05:33 +0000 |
409 | @@ -4652,7 +4652,7 @@ |
410 | trx = UT_LIST_GET_FIRST(trx_sys->mysql_trx_list); |
411 | |
412 | while (trx) { |
413 | - if (trx->conc_state == TRX_NOT_STARTED) { |
414 | + if (trx->state == TRX_NOT_STARTED) { |
415 | fputs("---", file); |
416 | trx_print(file, trx, 600); |
417 | } |
418 | @@ -4820,9 +4820,9 @@ |
419 | lock = UT_LIST_GET_FIRST(table->locks); |
420 | |
421 | while (lock) { |
422 | - ut_a(((lock->trx)->conc_state == TRX_ACTIVE) |
423 | - || ((lock->trx)->conc_state == TRX_PREPARED) |
424 | - || ((lock->trx)->conc_state == TRX_COMMITTED_IN_MEMORY)); |
425 | + ut_a(((lock->trx)->state == TRX_ACTIVE) |
426 | + || ((lock->trx)->state == TRX_PREPARED) |
427 | + || ((lock->trx)->state == TRX_COMMITTED_IN_MEMORY)); |
428 | |
429 | if (!lock_get_wait(lock)) { |
430 | |
431 | @@ -4870,7 +4870,7 @@ |
432 | lock = lock_rec_get_first(block, heap_no); |
433 | |
434 | while (lock) { |
435 | - switch(lock->trx->conc_state) { |
436 | + switch(lock->trx->state) { |
437 | case TRX_ACTIVE: |
438 | case TRX_PREPARED: |
439 | case TRX_COMMITTED_IN_MEMORY: |
440 | @@ -4957,9 +4957,9 @@ |
441 | lock = lock_rec_get_first(block, heap_no); |
442 | |
443 | while (lock) { |
444 | - ut_a(lock->trx->conc_state == TRX_ACTIVE |
445 | - || lock->trx->conc_state == TRX_PREPARED |
446 | - || lock->trx->conc_state == TRX_COMMITTED_IN_MEMORY); |
447 | + ut_a(lock->trx->state == TRX_ACTIVE |
448 | + || lock->trx->state == TRX_PREPARED |
449 | + || lock->trx->state == TRX_COMMITTED_IN_MEMORY); |
450 | ut_a(trx_in_trx_list(lock->trx)); |
451 | |
452 | if (index) { |
453 | @@ -5036,9 +5036,9 @@ |
454 | } |
455 | |
456 | ut_a(trx_in_trx_list(lock->trx)); |
457 | - ut_a(lock->trx->conc_state == TRX_ACTIVE |
458 | - || lock->trx->conc_state == TRX_PREPARED |
459 | - || lock->trx->conc_state == TRX_COMMITTED_IN_MEMORY); |
460 | + ut_a(lock->trx->state == TRX_ACTIVE |
461 | + || lock->trx->state == TRX_PREPARED |
462 | + || lock->trx->state == TRX_COMMITTED_IN_MEMORY); |
463 | |
464 | # ifdef UNIV_SYNC_DEBUG |
465 | /* Only validate the record queues when this thread is not |
466 | |
467 | === modified file 'Percona-Server/storage/innobase/read/read0read.c' |
468 | --- Percona-Server/storage/innobase/read/read0read.c 2013-03-15 09:55:17 +0000 |
469 | +++ Percona-Server/storage/innobase/read/read0read.c 2013-03-25 10:05:33 +0000 |
470 | @@ -150,22 +150,22 @@ |
471 | { |
472 | if (view == NULL) { |
473 | view = ut_malloc(sizeof(read_view_t)); |
474 | - view->max_trx_ids = 0; |
475 | - view->trx_ids = NULL; |
476 | + view->max_descr = 0; |
477 | + view->descriptors = NULL; |
478 | } |
479 | |
480 | - if (UNIV_UNLIKELY(view->max_trx_ids < n)) { |
481 | + if (UNIV_UNLIKELY(view->max_descr < n)) { |
482 | |
483 | - /* avoid frequent reallocations by extending the array to the |
484 | + /* avoid frequent re-allocations by extending the array to the |
485 | desired size + 10% */ |
486 | |
487 | - view->max_trx_ids = n + n / 10; |
488 | - view->trx_ids = ut_realloc(view->trx_ids, |
489 | - view->max_trx_ids * |
490 | - sizeof *view->trx_ids); |
491 | + view->max_descr = n + n / 10; |
492 | + view->descriptors = ut_realloc(view->descriptors, |
493 | + view->max_descr * |
494 | + sizeof(trx_id_t)); |
495 | } |
496 | |
497 | - view->n_trx_ids = n; |
498 | + view->n_descr = n; |
499 | |
500 | return(view); |
501 | } |
502 | @@ -198,10 +198,10 @@ |
503 | |
504 | if (old_view == NULL) { |
505 | |
506 | - return(read_view_open_now(cr_trx_id, view)); |
507 | + return(read_view_open_now(cr_trx_id, view, TRUE)); |
508 | } |
509 | |
510 | - n = old_view->n_trx_ids; |
511 | + n = old_view->n_descr; |
512 | |
513 | if (old_view->creator_trx_id) { |
514 | n++; |
515 | @@ -217,7 +217,7 @@ |
516 | i = 0; |
517 | while (i < n) { |
518 | if (needs_insert |
519 | - && (i >= old_view->n_trx_ids |
520 | + && (i >= old_view->n_descr |
521 | || old_view->creator_trx_id |
522 | > read_view_get_nth_trx_id(old_view, i))) { |
523 | |
524 | @@ -264,15 +264,17 @@ |
525 | /*===============*/ |
526 | trx_id_t cr_trx_id, /*!< in: trx_id of creating |
527 | transaction, or 0 used in purge */ |
528 | - read_view_t* view) /*!< in: current read view or NULL if it |
529 | + read_view_t* view, /*!< in: current read view or NULL if it |
530 | doesn't exist yet */ |
531 | + ibool exclude_self) /*!< in: TRUE, if cr_trx_id should be |
532 | + excluded from the resulting view */ |
533 | { |
534 | - trx_t* trx; |
535 | - ulint n; |
536 | + trx_id_t* descr; |
537 | + ulint i; |
538 | |
539 | ut_ad(mutex_own(&kernel_mutex)); |
540 | |
541 | - view = read_view_create_low(UT_LIST_GET_LEN(trx_sys->trx_list), view); |
542 | + view = read_view_create_low(trx_sys->descr_n_used, view); |
543 | |
544 | view->creator_trx_id = cr_trx_id; |
545 | view->type = VIEW_NORMAL; |
546 | @@ -283,40 +285,58 @@ |
547 | view->low_limit_no = trx_sys->max_trx_id; |
548 | view->low_limit_id = view->low_limit_no; |
549 | |
550 | - n = 0; |
551 | - trx = UT_LIST_GET_FIRST(trx_sys->trx_list); |
552 | - |
553 | - /* No active transaction should be visible, except cr_trx */ |
554 | - |
555 | - while (trx) { |
556 | - if (trx->id != cr_trx_id |
557 | - && (trx->conc_state == TRX_ACTIVE |
558 | - || trx->conc_state == TRX_PREPARED)) { |
559 | - |
560 | - read_view_set_nth_trx_id(view, n, trx->id); |
561 | - |
562 | - n++; |
563 | - |
564 | - /* NOTE that a transaction whose trx number is < |
565 | - trx_sys->max_trx_id can still be active, if it is |
566 | - in the middle of its commit! Note that when a |
567 | - transaction starts, we initialize trx->no to |
568 | - IB_ULONGLONG_MAX. */ |
569 | - |
570 | - if (view->low_limit_no > trx->no) { |
571 | - |
572 | - view->low_limit_no = trx->no; |
573 | - } |
574 | + /* No active transaction should be visible */ |
575 | + |
576 | + descr = trx_find_descriptor(trx_sys->descriptors, |
577 | + trx_sys->descr_n_used, |
578 | + cr_trx_id); |
579 | + |
580 | + if (UNIV_LIKELY(exclude_self && descr != NULL)) { |
581 | + |
582 | + ut_ad(trx_sys->descr_n_used > 0); |
583 | + ut_ad(view->n_descr > 0); |
584 | + |
585 | + view->n_descr--; |
586 | + |
587 | + i = descr - trx_sys->descriptors; |
588 | + } else { |
589 | + i = trx_sys->descr_n_used; |
590 | + } |
591 | + |
592 | + if (UNIV_LIKELY(i > 0)) { |
593 | + |
594 | + /* Copy the [0; i-1] range */ |
595 | + memcpy(view->descriptors, trx_sys->descriptors, |
596 | + i * sizeof(trx_id_t)); |
597 | + } |
598 | + |
599 | + if (UNIV_UNLIKELY(i + 1 < trx_sys->descr_n_used)) { |
600 | + |
601 | + /* Copy the [i+1; descr_n_used-1] range */ |
602 | + memcpy(view->descriptors + i, |
603 | + trx_sys->descriptors + i + 1, |
604 | + (trx_sys->descr_n_used - i - 1) * |
605 | + sizeof(trx_id_t)); |
606 | + } |
607 | + |
608 | + /* NOTE that a transaction whose trx number is < trx_sys->max_trx_id can |
609 | + still be active, if it is in the middle of its commit! Note that when a |
610 | + transaction starts, we initialize trx->no to IB_ULONGLONG_MAX. */ |
611 | + |
612 | + if (UT_LIST_GET_LEN(trx_sys->trx_serial_list) > 0) { |
613 | + |
614 | + trx_id_t trx_no; |
615 | + |
616 | + trx_no = UT_LIST_GET_FIRST(trx_sys->trx_serial_list)->no; |
617 | + |
618 | + if (trx_no < view->low_limit_no) { |
619 | + view->low_limit_no = trx_no; |
620 | } |
621 | - |
622 | - trx = UT_LIST_GET_NEXT(trx_list, trx); |
623 | } |
624 | |
625 | - view->n_trx_ids = n; |
626 | - |
627 | - if (n > 0) { |
628 | + if (UNIV_LIKELY(view->n_descr > 0)) { |
629 | /* The last active transaction has the smallest id: */ |
630 | - view->up_limit_id = read_view_get_nth_trx_id(view, n - 1); |
631 | + view->up_limit_id = view->descriptors[0]; |
632 | } else { |
633 | view->up_limit_id = view->low_limit_id; |
634 | } |
635 | @@ -350,8 +370,8 @@ |
636 | { |
637 | ut_ad(mutex_own(&kernel_mutex)); |
638 | |
639 | - if (view->trx_ids != NULL) { |
640 | - ut_free(view->trx_ids); |
641 | + if (view->descriptors != NULL) { |
642 | + ut_free(view->descriptors); |
643 | } |
644 | |
645 | ut_free(view); |
646 | @@ -409,7 +429,7 @@ |
647 | |
648 | fprintf(file, "Read view individually stored trx ids:\n"); |
649 | |
650 | - n_ids = view->n_trx_ids; |
651 | + n_ids = view->n_descr; |
652 | |
653 | for (i = 0; i < n_ids; i++) { |
654 | fprintf(file, "Read view trx id " TRX_ID_FMT "\n", |
655 | @@ -431,8 +451,6 @@ |
656 | cursor_view_t* curview; |
657 | read_view_t* view; |
658 | mem_heap_t* heap; |
659 | - trx_t* trx; |
660 | - ulint n; |
661 | |
662 | ut_a(cr_trx); |
663 | |
664 | @@ -451,61 +469,14 @@ |
665 | |
666 | mutex_enter(&kernel_mutex); |
667 | |
668 | - curview->read_view = read_view_create_low( |
669 | - UT_LIST_GET_LEN(trx_sys->trx_list), NULL); |
670 | + curview->read_view = read_view_open_now(cr_trx->id, NULL, FALSE); |
671 | + |
672 | + mutex_exit(&kernel_mutex); |
673 | |
674 | view = curview->read_view; |
675 | - view->creator_trx_id = cr_trx->id; |
676 | view->type = VIEW_HIGH_GRANULARITY; |
677 | view->undo_no = cr_trx->undo_no; |
678 | |
679 | - /* No future transactions should be visible in the view */ |
680 | - |
681 | - view->low_limit_no = trx_sys->max_trx_id; |
682 | - view->low_limit_id = view->low_limit_no; |
683 | - |
684 | - n = 0; |
685 | - trx = UT_LIST_GET_FIRST(trx_sys->trx_list); |
686 | - |
687 | - /* No active transaction should be visible */ |
688 | - |
689 | - while (trx) { |
690 | - |
691 | - if (trx->conc_state == TRX_ACTIVE |
692 | - || trx->conc_state == TRX_PREPARED) { |
693 | - |
694 | - read_view_set_nth_trx_id(view, n, trx->id); |
695 | - |
696 | - n++; |
697 | - |
698 | - /* NOTE that a transaction whose trx number is < |
699 | - trx_sys->max_trx_id can still be active, if it is |
700 | - in the middle of its commit! Note that when a |
701 | - transaction starts, we initialize trx->no to |
702 | - IB_ULONGLONG_MAX. */ |
703 | - |
704 | - if (view->low_limit_no > trx->no) { |
705 | - |
706 | - view->low_limit_no = trx->no; |
707 | - } |
708 | - } |
709 | - |
710 | - trx = UT_LIST_GET_NEXT(trx_list, trx); |
711 | - } |
712 | - |
713 | - view->n_trx_ids = n; |
714 | - |
715 | - if (n > 0) { |
716 | - /* The last active transaction has the smallest id: */ |
717 | - view->up_limit_id = read_view_get_nth_trx_id(view, n - 1); |
718 | - } else { |
719 | - view->up_limit_id = view->low_limit_id; |
720 | - } |
721 | - |
722 | - UT_LIST_ADD_FIRST(view_list, trx_sys->view_list, view); |
723 | - |
724 | - mutex_exit(&kernel_mutex); |
725 | - |
726 | return(curview); |
727 | } |
728 | |
729 | |
730 | === modified file 'Percona-Server/storage/innobase/row/row0sel.c' |
731 | --- Percona-Server/storage/innobase/row/row0sel.c 2013-03-23 04:47:09 +0000 |
732 | +++ Percona-Server/storage/innobase/row/row0sel.c 2013-03-25 10:05:33 +0000 |
733 | @@ -3813,9 +3813,9 @@ |
734 | trx->has_search_latch = FALSE; |
735 | } |
736 | |
737 | - ut_ad(prebuilt->sql_stat_start || trx->conc_state == TRX_ACTIVE); |
738 | - ut_ad(trx->conc_state == TRX_NOT_STARTED |
739 | - || trx->conc_state == TRX_ACTIVE); |
740 | + ut_ad(prebuilt->sql_stat_start || trx->state == TRX_ACTIVE); |
741 | + ut_ad(trx->state == TRX_NOT_STARTED |
742 | + || trx->state == TRX_ACTIVE); |
743 | ut_ad(prebuilt->sql_stat_start |
744 | || prebuilt->select_lock_type != LOCK_NONE |
745 | || trx->read_view); |
746 | @@ -4897,8 +4897,10 @@ |
747 | if (trx->isolation_level >= TRX_ISO_REPEATABLE_READ |
748 | && !trx->read_view) { |
749 | |
750 | - trx->read_view = read_view_open_now( |
751 | - trx->id, NULL); |
752 | + trx->read_view = |
753 | + read_view_open_now(trx->id, |
754 | + NULL, TRUE); |
755 | + |
756 | trx->global_read_view = trx->read_view; |
757 | } |
758 | } |
759 | |
760 | === modified file 'Percona-Server/storage/innobase/row/row0vers.c' |
761 | --- Percona-Server/storage/innobase/row/row0vers.c 2012-09-17 13:08:32 +0000 |
762 | +++ Percona-Server/storage/innobase/row/row0vers.c 2013-03-25 10:05:33 +0000 |
763 | @@ -667,9 +667,9 @@ |
764 | |
765 | mutex_enter(&kernel_mutex); |
766 | version_trx = trx_get_on_id(version_trx_id); |
767 | - if (version_trx |
768 | - && (version_trx->conc_state == TRX_COMMITTED_IN_MEMORY |
769 | - || version_trx->conc_state == TRX_NOT_STARTED)) { |
770 | + if (version_trx && |
771 | + (version_trx->state == TRX_COMMITTED_IN_MEMORY |
772 | + || version_trx->state == TRX_NOT_STARTED)) { |
773 | |
774 | version_trx = NULL; |
775 | } |
776 | |
777 | === modified file 'Percona-Server/storage/innobase/srv/srv0srv.c' |
778 | --- Percona-Server/storage/innobase/srv/srv0srv.c 2013-03-05 12:16:18 +0000 |
779 | +++ Percona-Server/storage/innobase/srv/srv0srv.c 2013-03-25 10:05:33 +0000 |
780 | @@ -2925,7 +2925,7 @@ |
781 | mutex_enter(&kernel_mutex); |
782 | trx = UT_LIST_GET_FIRST(trx_sys->mysql_trx_list); |
783 | while (trx) { |
784 | - if (trx->conc_state == TRX_ACTIVE |
785 | + if (trx->state == TRX_ACTIVE |
786 | && trx->mysql_thd |
787 | && innobase_thd_is_idle(trx->mysql_thd)) { |
788 | ib_int64_t start_time = innobase_thd_get_start_time(trx->mysql_thd); |
789 | |
790 | === modified file 'Percona-Server/storage/innobase/trx/trx0purge.c' |
791 | --- Percona-Server/storage/innobase/trx/trx0purge.c 2013-03-23 04:47:09 +0000 |
792 | +++ Percona-Server/storage/innobase/trx/trx0purge.c 2013-03-25 10:05:33 +0000 |
793 | @@ -280,7 +280,12 @@ |
794 | que_graph_free(purge_sys->query); |
795 | |
796 | ut_a(purge_sys->sess->trx->is_purge); |
797 | - purge_sys->sess->trx->conc_state = TRX_NOT_STARTED; |
798 | + purge_sys->sess->trx->state = TRX_NOT_STARTED; |
799 | + |
800 | + mutex_enter(&kernel_mutex); |
801 | + trx_release_descriptor(purge_sys->sess->trx); |
802 | + mutex_exit(&kernel_mutex); |
803 | + |
804 | sess_close(purge_sys->sess); |
805 | purge_sys->sess = NULL; |
806 | |
807 | |
808 | === modified file 'Percona-Server/storage/innobase/trx/trx0roll.c' |
809 | --- Percona-Server/storage/innobase/trx/trx0roll.c 2011-04-05 07:18:43 +0000 |
810 | +++ Percona-Server/storage/innobase/trx/trx0roll.c 2013-03-25 10:05:33 +0000 |
811 | @@ -132,7 +132,7 @@ |
812 | { |
813 | int err; |
814 | |
815 | - if (trx->conc_state == TRX_NOT_STARTED) { |
816 | + if (trx->state == TRX_NOT_STARTED) { |
817 | |
818 | return(DB_SUCCESS); |
819 | } |
820 | @@ -161,7 +161,7 @@ |
821 | { |
822 | int err; |
823 | |
824 | - if (trx->conc_state == TRX_NOT_STARTED) { |
825 | + if (trx->state == TRX_NOT_STARTED) { |
826 | |
827 | return(DB_SUCCESS); |
828 | } |
829 | @@ -263,7 +263,7 @@ |
830 | return(DB_NO_SAVEPOINT); |
831 | } |
832 | |
833 | - if (trx->conc_state == TRX_NOT_STARTED) { |
834 | + if (trx->state == TRX_NOT_STARTED) { |
835 | ut_print_timestamp(stderr); |
836 | fputs(" InnoDB: Error: transaction has a savepoint ", stderr); |
837 | ut_print_name(stderr, trx, FALSE, savep->name); |
838 | @@ -560,7 +560,7 @@ |
839 | continue; |
840 | } |
841 | |
842 | - switch (trx->conc_state) { |
843 | + switch (trx->state) { |
844 | case TRX_NOT_STARTED: |
845 | case TRX_PREPARED: |
846 | continue; |
847 | |
848 | === modified file 'Percona-Server/storage/innobase/trx/trx0sys.c' |
849 | --- Percona-Server/storage/innobase/trx/trx0sys.c 2012-10-22 00:07:03 +0000 |
850 | +++ Percona-Server/storage/innobase/trx/trx0sys.c 2013-03-25 10:05:33 +0000 |
851 | @@ -1319,6 +1319,12 @@ |
852 | |
853 | trx_sys = mem_zalloc(sizeof(*trx_sys)); |
854 | |
855 | + /* Allocate the trx descriptors array */ |
856 | + trx_sys->descriptors = ut_malloc(sizeof(trx_id_t) * |
857 | + TRX_DESCR_ARRAY_INITIAL_SIZE); |
858 | + trx_sys->descr_n_max = TRX_DESCR_ARRAY_INITIAL_SIZE; |
859 | + trx_sys->descr_n_used = 0; |
860 | + |
861 | sys_header = trx_sysf_get(&mtr); |
862 | |
863 | trx_rseg_list_and_array_init(sys_header, ib_bh, &mtr); |
864 | @@ -1346,7 +1352,7 @@ |
865 | |
866 | for (;;) { |
867 | |
868 | - if (trx->conc_state != TRX_PREPARED) { |
869 | + if (trx->state != TRX_PREPARED) { |
870 | rows_to_undo += trx->undo_no; |
871 | } |
872 | |
873 | @@ -2028,6 +2034,9 @@ |
874 | ut_a(UT_LIST_GET_LEN(trx_sys->view_list) == 0); |
875 | ut_a(UT_LIST_GET_LEN(trx_sys->mysql_trx_list) == 0); |
876 | |
877 | + ut_ad(trx_sys->descr_n_used == 0); |
878 | + ut_free(trx_sys->descriptors); |
879 | + |
880 | mem_free(trx_sys); |
881 | |
882 | trx_sys = NULL; |
883 | |
884 | === modified file 'Percona-Server/storage/innobase/trx/trx0trx.c' |
885 | --- Percona-Server/storage/innobase/trx/trx0trx.c 2013-03-23 04:47:09 +0000 |
886 | +++ Percona-Server/storage/innobase/trx/trx0trx.c 2013-03-25 10:05:33 +0000 |
887 | @@ -85,6 +85,126 @@ |
888 | sizeof(trx->detailed_error)); |
889 | } |
890 | |
891 | +/*************************************************************//** |
892 | +Callback function for trx_find_descriptor() to compare trx IDs. */ |
893 | +UNIV_INTERN |
894 | +int |
895 | +trx_descr_cmp( |
896 | +/*==========*/ |
897 | + const void *a, /*!< in: pointer to first comparison argument */ |
898 | + const void *b) /*!< in: pointer to second comparison argument */ |
899 | +{ |
900 | + const trx_id_t* da = (const trx_id_t*) a; |
901 | + const trx_id_t* db = (const trx_id_t*) b; |
902 | + |
903 | + if (*da < *db) { |
904 | + return -1; |
905 | + } else if (*da > *db) { |
906 | + return 1; |
907 | + } |
908 | + |
909 | + return 0; |
910 | +} |
911 | + |
912 | +/*************************************************************//** |
913 | +Reserve a slot for a given trx in the global descriptors array. */ |
914 | +UNIV_INLINE |
915 | +void |
916 | +trx_reserve_descriptor( |
917 | +/*===================*/ |
918 | + const trx_t* trx) /*!< in: trx pointer */ |
919 | +{ |
920 | + ulint n_used; |
921 | + ulint n_max; |
922 | + trx_id_t* descr; |
923 | + |
924 | + ut_ad(mutex_own(&kernel_mutex)); |
925 | + ut_ad(!trx_find_descriptor(trx_sys->descriptors, |
926 | + trx_sys->descr_n_used, |
927 | + trx->id)); |
928 | + |
929 | + n_used = trx_sys->descr_n_used + 1; |
930 | + n_max = trx_sys->descr_n_max; |
931 | + |
932 | + if (UNIV_UNLIKELY(n_used > n_max)) { |
933 | + |
934 | + n_max = n_max * 2; |
935 | + |
936 | + trx_sys->descriptors = |
937 | + ut_realloc(trx_sys->descriptors, |
938 | + n_max * sizeof(trx_id_t)); |
939 | + |
940 | + trx_sys->descr_n_max = n_max; |
941 | + } |
942 | + |
943 | + descr = trx_sys->descriptors + n_used - 1; |
944 | + |
945 | + if (UNIV_UNLIKELY(n_used > 1 && trx->id < descr[-1])) { |
946 | + |
947 | + /* Find the slot where it should be inserted. We could use a |
948 | + binary search, but in reality linear search should be faster, |
949 | + because the slot we are looking for is near the array end. */ |
950 | + |
951 | + trx_id_t* tdescr; |
952 | + |
953 | + for (tdescr = descr - 1; |
954 | + tdescr >= trx_sys->descriptors && *tdescr > trx->id; |
955 | + tdescr--) { |
956 | + } |
957 | + |
958 | + tdescr++; |
959 | + |
960 | + ut_memmove(tdescr + 1, tdescr, (descr - tdescr) * |
961 | + sizeof(trx_id_t)); |
962 | + |
963 | + descr = tdescr; |
964 | + } |
965 | + |
966 | + *descr = trx->id; |
967 | + |
968 | + trx_sys->descr_n_used = n_used; |
969 | +} |
970 | + |
971 | +/*************************************************************//** |
972 | +Release a slot for a given trx in the global descriptors array. */ |
973 | +UNIV_INTERN |
974 | +void |
975 | +trx_release_descriptor( |
976 | +/*===================*/ |
977 | + trx_t* trx) /*!< in: trx pointer */ |
978 | +{ |
979 | + ulint size; |
980 | + trx_id_t* descr; |
981 | + |
982 | + ut_ad(mutex_own(&kernel_mutex)); |
983 | + |
984 | + if (UNIV_LIKELY(trx->is_in_trx_serial_list)) { |
985 | + |
986 | + UT_LIST_REMOVE(trx_serial_list, trx_sys->trx_serial_list, |
987 | + trx); |
988 | + trx->is_in_trx_serial_list = 0; |
989 | + } |
990 | + |
991 | + descr = trx_find_descriptor(trx_sys->descriptors, |
992 | + trx_sys->descr_n_used, |
993 | + trx->id); |
994 | + |
995 | + if (UNIV_UNLIKELY(descr == NULL)) { |
996 | + |
997 | + return; |
998 | + } |
999 | + |
1000 | + size = (trx_sys->descriptors + trx_sys->descr_n_used - 1 - descr) * |
1001 | + sizeof(trx_id_t); |
1002 | + |
1003 | + if (UNIV_LIKELY(size > 0)) { |
1004 | + |
1005 | + ut_memmove(descr, descr + 1, size); |
1006 | + } |
1007 | + |
1008 | + trx_sys->descr_n_used--; |
1009 | +} |
1010 | + |
1011 | /****************************************************************//** |
1012 | Creates and initializes a transaction object. |
1013 | @return own: the transaction */ |
1014 | @@ -107,7 +227,7 @@ |
1015 | |
1016 | trx->is_purge = 0; |
1017 | trx->is_recovered = 0; |
1018 | - trx->conc_state = TRX_NOT_STARTED; |
1019 | + trx->state = TRX_NOT_STARTED; |
1020 | |
1021 | trx->is_registered = 0; |
1022 | trx->owns_prepare_mutex = 0; |
1023 | @@ -119,6 +239,7 @@ |
1024 | |
1025 | trx->id = 0; |
1026 | trx->no = IB_ULONGLONG_MAX; |
1027 | + trx->is_in_trx_serial_list = 0; |
1028 | |
1029 | trx->support_xa = TRUE; |
1030 | |
1031 | @@ -328,7 +449,7 @@ |
1032 | |
1033 | trx->magic_n = 11112222; |
1034 | |
1035 | - ut_a(trx->conc_state == TRX_NOT_STARTED); |
1036 | + ut_a(trx->state == TRX_NOT_STARTED); |
1037 | |
1038 | mutex_free(&(trx->undo_mutex)); |
1039 | |
1040 | @@ -359,15 +480,14 @@ |
1041 | read_view_free(trx->prebuilt_view); |
1042 | } |
1043 | |
1044 | - trx->global_read_view = NULL; |
1045 | - trx->prebuilt_view = NULL; |
1046 | - |
1047 | ut_a(trx->read_view == NULL); |
1048 | |
1049 | ut_a(ib_vector_is_empty(trx->autoinc_locks)); |
1050 | /* We allocated a dedicated heap for the vector. */ |
1051 | ib_vector_free(trx->autoinc_locks); |
1052 | |
1053 | + trx_release_descriptor(trx); |
1054 | + |
1055 | mem_free(trx); |
1056 | } |
1057 | |
1058 | @@ -380,7 +500,7 @@ |
1059 | trx_t* trx) /*!< in, own: trx object */ |
1060 | { |
1061 | ut_ad(mutex_own(&kernel_mutex)); |
1062 | - ut_a(trx->conc_state == TRX_PREPARED); |
1063 | + ut_a(trx->state == TRX_PREPARED); |
1064 | ut_a(trx->magic_n == TRX_MAGIC_N); |
1065 | |
1066 | /* Prepared transactions are sort of active; they allow |
1067 | @@ -416,8 +536,16 @@ |
1068 | ut_a(ib_vector_is_empty(trx->autoinc_locks)); |
1069 | ib_vector_free(trx->autoinc_locks); |
1070 | |
1071 | + trx_release_descriptor(trx); |
1072 | + |
1073 | + if (trx->prebuilt_view != NULL) { |
1074 | + read_view_free(trx->prebuilt_view); |
1075 | + } |
1076 | + |
1077 | UT_LIST_REMOVE(trx_list, trx_sys->trx_list, trx); |
1078 | |
1079 | + ut_ad(trx_sys->descr_n_used <= UT_LIST_GET_LEN(trx_sys->trx_list)); |
1080 | + |
1081 | mem_free(trx); |
1082 | } |
1083 | |
1084 | @@ -526,6 +654,7 @@ |
1085 | |
1086 | ut_ad(mutex_own(&kernel_mutex)); |
1087 | UT_LIST_INIT(trx_sys->trx_list); |
1088 | + UT_LIST_INIT(trx_sys->trx_serial_list); |
1089 | |
1090 | /* Look from the rollback segments if there exist undo logs for |
1091 | transactions */ |
1092 | @@ -562,7 +691,7 @@ |
1093 | |
1094 | if (srv_force_recovery == 0) { |
1095 | |
1096 | - trx->conc_state = TRX_PREPARED; |
1097 | + trx->state = TRX_PREPARED; |
1098 | trx_n_prepared++; |
1099 | } else { |
1100 | fprintf(stderr, |
1101 | @@ -572,11 +701,12 @@ |
1102 | " rollback it" |
1103 | " anyway.\n"); |
1104 | |
1105 | - trx->conc_state = TRX_ACTIVE; |
1106 | + trx->state = TRX_ACTIVE; |
1107 | } |
1108 | + |
1109 | + trx_reserve_descriptor(trx); |
1110 | } else { |
1111 | - trx->conc_state |
1112 | - = TRX_COMMITTED_IN_MEMORY; |
1113 | + trx->state = TRX_COMMITTED_IN_MEMORY; |
1114 | } |
1115 | |
1116 | /* We give a dummy value for the trx no; |
1117 | @@ -588,12 +718,15 @@ |
1118 | |
1119 | trx->no = trx->id; |
1120 | } else { |
1121 | - trx->conc_state = TRX_ACTIVE; |
1122 | + trx->state = TRX_ACTIVE; |
1123 | |
1124 | /* A running transaction always has the number |
1125 | field inited to IB_ULONGLONG_MAX */ |
1126 | |
1127 | trx->no = IB_ULONGLONG_MAX; |
1128 | + |
1129 | + trx_reserve_descriptor(trx); |
1130 | + |
1131 | } |
1132 | |
1133 | if (undo->dict_operation) { |
1134 | @@ -638,7 +771,7 @@ |
1135 | |
1136 | if (srv_force_recovery == 0) { |
1137 | |
1138 | - trx->conc_state |
1139 | + trx->state |
1140 | = TRX_PREPARED; |
1141 | trx_n_prepared++; |
1142 | } else { |
1143 | @@ -649,11 +782,12 @@ |
1144 | " rollback it" |
1145 | " anyway.\n"); |
1146 | |
1147 | - trx->conc_state |
1148 | - = TRX_ACTIVE; |
1149 | + trx->state = TRX_ACTIVE; |
1150 | + trx_reserve_descriptor( |
1151 | + trx); |
1152 | } |
1153 | } else { |
1154 | - trx->conc_state |
1155 | + trx->state |
1156 | = TRX_COMMITTED_IN_MEMORY; |
1157 | } |
1158 | |
1159 | @@ -662,13 +796,14 @@ |
1160 | |
1161 | trx->no = trx->id; |
1162 | } else { |
1163 | - trx->conc_state = TRX_ACTIVE; |
1164 | - |
1165 | + trx->state = TRX_ACTIVE; |
1166 | /* A running transaction always has |
1167 | the number field inited to |
1168 | IB_ULONGLONG_MAX */ |
1169 | |
1170 | trx->no = IB_ULONGLONG_MAX; |
1171 | + |
1172 | + trx_reserve_descriptor(trx); |
1173 | } |
1174 | |
1175 | trx->rseg = rseg; |
1176 | @@ -739,13 +874,15 @@ |
1177 | |
1178 | if (trx->is_purge) { |
1179 | trx->id = 0; |
1180 | - trx->conc_state = TRX_ACTIVE; |
1181 | + /* Don't reserve a descriptor, since this trx is not added to |
1182 | + trx_list. */ |
1183 | + trx->state = TRX_ACTIVE; |
1184 | trx->start_time = time(NULL); |
1185 | |
1186 | return(TRUE); |
1187 | } |
1188 | |
1189 | - ut_ad(trx->conc_state != TRX_ACTIVE); |
1190 | + ut_ad(trx->state != TRX_ACTIVE); |
1191 | |
1192 | ut_a(rseg_id == ULINT_UNDEFINED); |
1193 | |
1194 | @@ -760,7 +897,10 @@ |
1195 | |
1196 | trx->rseg = rseg; |
1197 | |
1198 | - trx->conc_state = TRX_ACTIVE; |
1199 | + trx->state = TRX_ACTIVE; |
1200 | + |
1201 | + trx_reserve_descriptor(trx); |
1202 | + |
1203 | trx->start_time = time(NULL); |
1204 | |
1205 | UT_LIST_ADD_FIRST(trx_list, trx_sys->trx_list, trx); |
1206 | @@ -817,6 +957,14 @@ |
1207 | |
1208 | trx->no = trx_sys_get_new_trx_id(); |
1209 | |
1210 | + if (UNIV_LIKELY(trx->is_in_trx_serial_list == 0)) { |
1211 | + |
1212 | + UT_LIST_ADD_LAST(trx_serial_list, trx_sys->trx_serial_list, |
1213 | + trx); |
1214 | + |
1215 | + trx->is_in_trx_serial_list = 1; |
1216 | + } |
1217 | + |
1218 | /* If the rollack segment is not empty then the |
1219 | new trx_t::no can't be less than any trx_t::no |
1220 | already in the rollback segment. User threads only |
1221 | @@ -998,10 +1146,10 @@ |
1222 | lsn = 0; |
1223 | } |
1224 | |
1225 | - ut_ad(trx->conc_state == TRX_ACTIVE || trx->conc_state == TRX_PREPARED); |
1226 | + ut_ad(trx->state == TRX_ACTIVE || trx->state == TRX_PREPARED); |
1227 | ut_ad(mutex_own(&kernel_mutex)); |
1228 | |
1229 | - if (UNIV_UNLIKELY(trx->conc_state == TRX_PREPARED)) { |
1230 | + if (UNIV_UNLIKELY(trx->state == TRX_PREPARED)) { |
1231 | ut_a(trx_n_prepared > 0); |
1232 | trx_n_prepared--; |
1233 | } |
1234 | @@ -1021,7 +1169,9 @@ |
1235 | committed. */ |
1236 | |
1237 | /*--------------------------------------*/ |
1238 | - trx->conc_state = TRX_COMMITTED_IN_MEMORY; |
1239 | + trx->state = TRX_COMMITTED_IN_MEMORY; |
1240 | + /* The following also removes trx from trx_serial_list */ |
1241 | + trx_release_descriptor(trx); |
1242 | /*--------------------------------------*/ |
1243 | |
1244 | /* If we release kernel_mutex below and we are still doing |
1245 | @@ -1127,7 +1277,7 @@ |
1246 | /* Free all savepoints */ |
1247 | trx_roll_free_all_savepoints(trx); |
1248 | |
1249 | - trx->conc_state = TRX_NOT_STARTED; |
1250 | + trx->state = TRX_NOT_STARTED; |
1251 | trx->rseg = NULL; |
1252 | trx->undo_no = 0; |
1253 | trx->last_sql_stat_start.least_undo_no = 0; |
1254 | @@ -1137,6 +1287,8 @@ |
1255 | |
1256 | UT_LIST_REMOVE(trx_list, trx_sys->trx_list, trx); |
1257 | |
1258 | + ut_ad(trx_sys->descr_n_used <= UT_LIST_GET_LEN(trx_sys->trx_list)); |
1259 | + |
1260 | trx->error_state = DB_SUCCESS; |
1261 | } |
1262 | |
1263 | @@ -1155,12 +1307,15 @@ |
1264 | trx_undo_insert_cleanup(trx); |
1265 | } |
1266 | |
1267 | - trx->conc_state = TRX_NOT_STARTED; |
1268 | + trx->state = TRX_NOT_STARTED; |
1269 | + trx_release_descriptor(trx); |
1270 | trx->rseg = NULL; |
1271 | trx->undo_no = 0; |
1272 | trx->last_sql_stat_start.least_undo_no = 0; |
1273 | |
1274 | UT_LIST_REMOVE(trx_list, trx_sys->trx_list, trx); |
1275 | + |
1276 | + ut_ad(trx_sys->descr_n_used <= UT_LIST_GET_LEN(trx_sys->trx_list)); |
1277 | } |
1278 | |
1279 | /********************************************************************//** |
1280 | @@ -1174,7 +1329,7 @@ |
1281 | /*=================*/ |
1282 | trx_t* trx) /*!< in: active transaction */ |
1283 | { |
1284 | - ut_ad(trx->conc_state == TRX_ACTIVE); |
1285 | + ut_ad(trx->state == TRX_ACTIVE); |
1286 | |
1287 | if (trx->read_view) { |
1288 | return(trx->read_view); |
1289 | @@ -1182,12 +1337,9 @@ |
1290 | |
1291 | mutex_enter(&kernel_mutex); |
1292 | |
1293 | - if (!trx->read_view) { |
1294 | - trx->read_view = read_view_open_now(trx->id, |
1295 | - trx->prebuilt_view); |
1296 | - trx->prebuilt_view = trx->read_view; |
1297 | - trx->global_read_view = trx->read_view; |
1298 | - } |
1299 | + trx->read_view = read_view_open_now(trx->id, trx->prebuilt_view, TRUE); |
1300 | + trx->prebuilt_view = trx->read_view; |
1301 | + trx->global_read_view = trx->read_view; |
1302 | |
1303 | mutex_exit(&kernel_mutex); |
1304 | |
1305 | @@ -1552,7 +1704,7 @@ |
1306 | return; |
1307 | } |
1308 | |
1309 | - if (trx->conc_state == TRX_NOT_STARTED) { |
1310 | + if (trx->state == TRX_NOT_STARTED) { |
1311 | |
1312 | trx_start_low(trx, ULINT_UNDEFINED); |
1313 | } |
1314 | @@ -1844,7 +1996,7 @@ |
1315 | { |
1316 | ut_a(trx); |
1317 | |
1318 | - if (trx->conc_state == TRX_NOT_STARTED) { |
1319 | + if (trx->state == TRX_NOT_STARTED) { |
1320 | trx->undo_no = 0; |
1321 | } |
1322 | |
1323 | @@ -1867,7 +2019,7 @@ |
1324 | |
1325 | fprintf(f, "TRANSACTION " TRX_ID_FMT, (ullint) trx->id); |
1326 | |
1327 | - switch (trx->conc_state) { |
1328 | + switch (trx->state) { |
1329 | case TRX_NOT_STARTED: |
1330 | fputs(", not started", f); |
1331 | break; |
1332 | @@ -1883,7 +2035,7 @@ |
1333 | fputs(", COMMITTED IN MEMORY", f); |
1334 | break; |
1335 | default: |
1336 | - fprintf(f, " state %lu", (ulong) trx->conc_state); |
1337 | + fprintf(f, " state %lu", (ulong) trx->state); |
1338 | } |
1339 | |
1340 | if (*trx->op_info) { |
1341 | @@ -2078,7 +2230,11 @@ |
1342 | ut_ad(mutex_own(&kernel_mutex)); |
1343 | |
1344 | /*--------------------------------------*/ |
1345 | - trx->conc_state = TRX_PREPARED; |
1346 | + if (UNIV_UNLIKELY(trx->state != TRX_ACTIVE)) { |
1347 | + |
1348 | + trx_reserve_descriptor(trx); |
1349 | + } |
1350 | + trx->state = TRX_PREPARED; |
1351 | trx_n_prepared++; |
1352 | /*--------------------------------------*/ |
1353 | |
1354 | @@ -2192,7 +2348,7 @@ |
1355 | trx = UT_LIST_GET_FIRST(trx_sys->trx_list); |
1356 | |
1357 | while (trx) { |
1358 | - if (trx->conc_state == TRX_PREPARED) { |
1359 | + if (trx->state == TRX_PREPARED) { |
1360 | xid_list[count] = trx->xid; |
1361 | |
1362 | if (count == 0) { |
1363 | @@ -2265,7 +2421,7 @@ |
1364 | the same */ |
1365 | |
1366 | if (trx->is_recovered |
1367 | - && trx->conc_state == TRX_PREPARED |
1368 | + && trx->state == TRX_PREPARED |
1369 | && xid->gtrid_length == trx->xid.gtrid_length |
1370 | && xid->bqual_length == trx->xid.bqual_length |
1371 | && memcmp(xid->data, trx->xid.data, |
http://jenkins.percona.com/view/PS%205.5/job/percona-server-5.5-param/685/
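
For readers following the diff, here is a minimal standalone sketch of the sorted descriptor array idea: reserve inserts an ID by scanning back from the array end, release closes the gap with a memmove, and a read-view style snapshot copies the array in two ranges to skip the creator's own ID. It is an illustration only, not the patch code. The names descr_array_t, descr_init, descr_find, descr_reserve, descr_release and descr_snapshot are hypothetical, plain malloc/realloc stand in for ut_malloc/ut_realloc, and there is no kernel_mutex, so the caller is assumed to serialize access.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef uint64_t trx_id_t;

/* Hypothetical stand-in for the descriptor fields of trx_sys. */
typedef struct {
	trx_id_t*	ids;	/* active trx IDs, sorted ascending */
	size_t		n_used;	/* number of slots in use */
	size_t		n_max;	/* allocated capacity, in slots */
} descr_array_t;

/* Allocate the array; returns 0 on success, -1 on allocation failure. */
static int descr_init(descr_array_t* a, size_t initial)
{
	assert(initial > 0);	/* capacity doubling needs a non-zero start */
	a->ids = malloc(initial * sizeof(trx_id_t));
	a->n_used = 0;
	a->n_max = a->ids ? initial : 0;
	return a->ids ? 0 : -1;
}

/* bsearch() comparator for trx IDs. */
static int id_cmp(const void* a, const void* b)
{
	trx_id_t x = *(const trx_id_t*) a;
	trx_id_t y = *(const trx_id_t*) b;
	return (x > y) - (x < y);
}

/* Binary search; returns a pointer to the slot, or NULL if absent. */
static trx_id_t* descr_find(const descr_array_t* a, trx_id_t id)
{
	if (a->n_used == 0) {
		return NULL;
	}
	return bsearch(&id, a->ids, a->n_used, sizeof(trx_id_t), id_cmp);
}

/* Insert id, keeping the array sorted. New IDs are usually the largest,
so scan back from the end instead of doing a binary search. */
static int descr_reserve(descr_array_t* a, trx_id_t id)
{
	size_t pos;

	if (a->n_used == a->n_max) {
		size_t n_max = a->n_max * 2;
		trx_id_t* p = realloc(a->ids, n_max * sizeof(trx_id_t));
		if (p == NULL) {
			return -1;
		}
		a->ids = p;
		a->n_max = n_max;
	}

	pos = a->n_used;
	while (pos > 0 && a->ids[pos - 1] > id) {
		pos--;
	}

	/* Shift the tail up by one slot to make room at pos. */
	memmove(a->ids + pos + 1, a->ids + pos,
		(a->n_used - pos) * sizeof(trx_id_t));
	a->ids[pos] = id;
	a->n_used++;
	return 0;
}

/* Remove id if present, closing the gap. */
static void descr_release(descr_array_t* a, trx_id_t id)
{
	trx_id_t* d = descr_find(a, id);

	if (d == NULL) {
		return;
	}
	memmove(d, d + 1,
		(size_t) (a->ids + a->n_used - 1 - d) * sizeof(trx_id_t));
	a->n_used--;
}

/* Copy all IDs except self into out (capacity >= n_used).
Returns the number of IDs copied. */
static size_t descr_snapshot(const descr_array_t* a, trx_id_t self,
			     trx_id_t* out)
{
	trx_id_t* d = descr_find(a, self);
	size_t i = d ? (size_t) (d - a->ids) : a->n_used;

	if (i > 0) {
		/* Copy the [0; i-1] range */
		memcpy(out, a->ids, i * sizeof(trx_id_t));
	}
	if (i + 1 < a->n_used) {
		/* Copy the [i+1; n_used-1] range */
		memcpy(out + i, a->ids + i + 1,
		       (a->n_used - i - 1) * sizeof(trx_id_t));
	}
	return d ? a->n_used - 1 : a->n_used;
}
```

Because the array stays sorted, the smallest active ID is always the first element, which is why the patch can take up_limit_id directly from view->descriptors[0] instead of walking trx_list.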