Merge lp:~laurynas-biveinis/percona-server/bug1236696 into lp:percona-server/5.6

Proposed by Laurynas Biveinis
Status: Merged
Approved by: Alexey Kopytov
Approved revision: no longer in the source branch.
Merged at revision: 483
Proposed branch: lp:~laurynas-biveinis/percona-server/bug1236696
Merge into: lp:percona-server/5.6
Diff against target: 345 lines (+82/-49)
7 files modified
Percona-Server/storage/innobase/include/sync0rw.h (+4/-4)
Percona-Server/storage/innobase/include/sync0rw.ic (+7/-25)
Percona-Server/storage/innobase/include/sync0sync.h (+5/-6)
Percona-Server/storage/innobase/include/sync0sync.ic (+0/-1)
Percona-Server/storage/innobase/sync/sync0arr.cc (+3/-3)
Percona-Server/storage/innobase/sync/sync0rw.cc (+45/-8)
Percona-Server/storage/innobase/sync/sync0sync.cc (+18/-2)
To merge this branch: bzr merge lp:~laurynas-biveinis/percona-server/bug1236696
Reviewer Review Type Date Requested Status
Alexey Kopytov (community) Approve
Laurynas Biveinis Pending
Vlad Lesin g2 Pending
Review via email: mp+191207@code.launchpad.net

This proposal supersedes a proposal from 2013-10-09.

Description of the change

3rd MP:
- replaced the resetting of the waiter counts to 0 by the unlocking thread with decrements by the woken-up waiters themselves.
- fixed bug 1240044.

http://jenkins.percona.com/job/percona-server-5.6-param/377/

2nd MP:

fixed (spurious) compiler warnings
http://jenkins.percona.com/job/percona-server-5.6-param/369/

1st MP:

http://jenkins.percona.com/job/percona-server-5.6-param/368/

Fix bug 1236696.

The issue is a set of race conditions between unlocks and priority mutex
or RW lock acquisitions that succeed after the waiters flag has been set
(i.e. on the last locking attempt before going to wait).

If a lock (either a priority mutex or a priority RW lock) is successfully
acquired this way, the corresponding waiters flag remains set even though
other waiters for this flag do not necessarily exist. On the subsequent
unlock, the stale flag causes the corresponding event to be set (possibly
with nobody waiting on it) and prevents lower-priority waiters from being
woken up, leaving the lock unlocked but its waiters sleeping.
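
For illustration only, here is a minimal sketch of the racy 0/1-flag
protocol described above. It is not the actual InnoDB code: the
prio_lock_model type, the prio_lock_slow()/prio_unlock() functions, and
the use of std::atomic in place of the real lock word, waiters flag and
wait-array events are all invented for this example.

#include <atomic>

/* Simplified model of a priority lock (hypothetical names). */
struct prio_lock_model {
        std::atomic<long>     lock_word{1};    /* 1 = free, 0 = locked */
        std::atomic<unsigned> prio_waiters{0}; /* 0/1 flag before the fix */
        /* os_event_t prio_event;    high-priority wait-array event */
        /* os_event_t regular_event; regular-priority wait-array event */
};

/* Slow path of a high-priority lock attempt, pre-fix behaviour. */
bool prio_lock_slow(prio_lock_model* l)
{
        l->prio_waiters.store(1);       /* announce "a waiter may exist" */

        long expected = 1;
        if (l->lock_word.compare_exchange_strong(expected, 0)) {
                /* Last-chance attempt succeeded: this thread never waits,
                but the flag stays at 1 although no high-priority waiter
                may exist. */
                return(true);
        }
        /* ... otherwise reserve a wait-array cell and sleep on
        prio_event ... */
        return(false);
}

/* Unlock, pre-fix behaviour. */
void prio_unlock(prio_lock_model* l)
{
        l->lock_word.store(1);

        if (l->prio_waiters.load()) {
                l->prio_waiters.store(0);
                /* os_event_set(prio_event): may wake nobody at all */
                return;                 /* regular waiters are skipped */
        }
        /* os_event_set(regular_event): not reached while the stale flag
        is set, so lower-priority waiters keep sleeping */
}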

Fix by making ib_prio_mutex_t::high_priority_waiters,
prio_rw_lock_t::high_priority_x_waiters, and
prio_rw_lock_t::high_priority_s_waiters actual waiting-thread counts
instead of 0/1 flags. The counts are atomically incremented by 1 where
the flag used to be set to 1, and atomically decremented (where
previously nothing was done) when locking succeeds after the count has
been bumped. Also fix the prio_rw_lock_t::high_priority_wait_ex_waiter
flag handling. Since there can be at most one wait_ex waiter, there is no
need for atomic increment and decrement operations; it is enough to
simply reset the flag on a successful lock. Assert that no high-priority
wait_ex waiter exists before an X unlock.
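
Under the same assumptions, and reusing the hypothetical prio_lock_model
from the sketch above, the fixed protocol turns the flag into a count
maintained solely by the threads that wait:

/* The same model after the fix: prio_waiters is now a real count. */
bool prio_lock_slow(prio_lock_model* l)
{
        l->prio_waiters.fetch_add(1);   /* atomic increment, was "= 1" */

        long expected = 1;
        if (l->lock_word.compare_exchange_strong(expected, 0)) {
                l->prio_waiters.fetch_sub(1);   /* undo: we never wait */
                return(true);
        }
        /* ... sleep on prio_event ... */
        l->prio_waiters.fetch_sub(1);   /* 3rd MP: each woken waiter
                                        undoes its own increment */
        return(false);                  /* caller retries the lock loop */
}

void prio_unlock(prio_lock_model* l)
{
        l->lock_word.store(1);

        if (l->prio_waiters.load()) {   /* the unlocker no longer resets
                                        the count */
                /* os_event_set(prio_event); */
                return;
        }
        /* os_event_set(regular_event): reached once the count drops
        back to zero */
}

In the actual patch the increments and decrements are
os_atomic_increment_ulint()/os_atomic_decrement_ulint() calls on
high_priority_waiters, high_priority_s_waiters and
high_priority_x_waiters, as can be seen in the preview diff below.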

Also adjust rw_lock_get_waiters() to use || instead of | when checking
for the presence of waiters, as this function should return only 0 or 1.

Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote : Posted in a previous version of this proposal

[ 78%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/trx/trx0sys.cc.o
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0rw.cc: In function ‘void rw_lock_s_lock_spin(void*, ulint, bool, bool, const char*, ulint)’:
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0rw.cc:492: warning: ‘prio_rw_lock’ may be used uninitialized in this function
[ 78%] [ 78%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/trx/trx0trx.cc.o
Building CXX object storage/perfschema/unittest/CMakeFiles/pfs-t.dir/pfs-t.cc.o
[ 78%] Building CXX object storage/perfschema/unittest/CMakeFiles/pfs_account-oom-t.dir/pfs_account-oom-t.cc.o
[ 78%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/trx/trx0undo.cc.o
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0rw.cc: In function ‘void rw_lock_x_lock_func(prio_rw_lock_t*, ulint, const char*, ulint)’:
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0rw.cc:756: warning: ‘prio_lock’ may be used uninitialized in this function
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0rw.cc:756: note: ‘prio_lock’ was declared here
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0sync.cc: In function ‘void mutex_spin_wait(void*, bool, const char*, ulint)’:
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0sync.cc:554: warning: ‘prio_mutex’ may be used uninitialized in this function
[ 78%] Building CXX object storage/innobase/CMakeFiles/innobase.dir/usr/usr0sess.cc.o
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0rw.cc: In function ‘void rw_lock_x_lock_func(rw_lock_t*, ulint, const char*, ulint, bool, bool)’:
/data/bench/alexey.s/repo/trunk-free-backoff.clean/Percona-Server/storage/innobase/sync/sync0rw.cc:756: warning: ‘prio_lock’ may be used uninitialized in this function

review: Needs Fixing
Revision history for this message
Vlad Lesin (vlad-lesin) : Posted in a previous version of this proposal
review: Approve (g2)
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote : Posted in a previous version of this proposal

Thanks for holding this MP, Alexey.

I kept on thinking about Vlad's question on IRC, and realized that waiters sets/resets are indeed unbalanced: if one thread has bumped the waiters flag, and then an unlock happens, and then the first thread wins the lock word race, the resulting waiters value will be one less than it should be. This should be fixable by replacing the waiters count reset in the unlockers with a waiters count decrement upon wakeup.

review: Needs Fixing
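
One possible reading of the scenario in the comment above, as an
illustrative interleaving only (T1 is a high-priority locker, T2 the
unlocking thread, no other waiters, count initially 0, and the 2nd-MP
behaviour of decrement-on-successful-lock plus reset-by-unlocker assumed):

        T1: atomically increments the waiter count       (count: 0 -> 1)
        T2: unlocks, sees a non-zero count, resets it
            to 0 and sets the high-priority event        (count: 1 -> 0)
        T1: wins the lock word race, acquires the lock
            and decrements the count on success          (count: one below
                                                          its correct value)

T1's increment ends up undone twice, once by T2's reset and once by T1's
own decrement. Decrementing only in the thread that performed the
increment, as done in the 3rd MP, keeps every increment paired with
exactly one decrement.
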
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote : Posted in a previous version of this proposal

While fixing, discovered bug 1240044, will fix it too.

review: Needs Fixing
Revision history for this message
Alexey Kopytov (akopytov) :
review: Approve

Preview Diff

1=== modified file 'Percona-Server/storage/innobase/include/sync0rw.h'
2--- Percona-Server/storage/innobase/include/sync0rw.h 2013-09-26 14:58:37 +0000
3+++ Percona-Server/storage/innobase/include/sync0rw.h 2013-10-15 14:32:10 +0000
4@@ -780,13 +780,13 @@
5 provides the lock word etc. for
6 the priority rw lock */
7 volatile ulint high_priority_s_waiters;
8- /* If 1, high priority S
9- waiters exist */
10+ /* Number of high priority S
11+ waiters */
12 os_event_t high_priority_s_event; /* High priority wait
13 array event for S waiters */
14 volatile ulint high_priority_x_waiters;
15- /* If 1, high priority X
16- waiters exist */
17+ /* Number of high priority X
18+ waiters */
19 os_event_t high_priority_x_event;
20 /* High priority wait arraay
21 event for X waiters */
22
23=== modified file 'Percona-Server/storage/innobase/include/sync0rw.ic'
24--- Percona-Server/storage/innobase/include/sync0rw.ic 2013-10-04 14:50:45 +0000
25+++ Percona-Server/storage/innobase/include/sync0rw.ic 2013-10-15 14:32:10 +0000
26@@ -94,8 +94,8 @@
27 const prio_rw_lock_t* lock) /*!< in: rw-lock */
28 {
29 return rw_lock_get_waiters(&lock->base_lock)
30- | lock->high_priority_s_waiters
31- | lock->high_priority_x_waiters;
32+ || lock->high_priority_s_waiters
33+ || lock->high_priority_x_waiters;
34 }
35
36 /********************************************************************//**
37@@ -619,32 +619,14 @@
38 if (lock_word == 0) {
39
40 /* A waiting next-writer exists, either high priority or
41- regular. Wake up the first waiter in this order: 1) high
42- priority next-writer; 2) high priority X waiters; 3) high
43- priority S waiters; 4) regular priority next-waiter. This
44- allows high priority requests to overtake an already-waiting
45- regular priority next-waiter. */
46+ regular, sharing the same wait event. */
47 if (lock->high_priority_wait_ex_waiter) {
48
49 lock->high_priority_wait_ex_waiter = 0;
50- /* Note that we do not have a separate high priority
51- next-waiter event. There can be only one such waiter,
52- here we already know it's high priority, no
53- regular-priority wakeup may happen. */
54- os_event_set(lock->base_lock.wait_ex_event);
55- } else if (lock->high_priority_x_waiters) {
56-
57- lock->high_priority_x_waiters = 0;
58- os_event_set(lock->high_priority_x_event);
59- } else if (lock->high_priority_s_waiters) {
60-
61- lock->high_priority_s_waiters = 0;
62- os_event_set(lock->high_priority_s_event);
63- } else {
64-
65- os_event_set(lock->base_lock.wait_ex_event);
66 }
67+ os_event_set(lock->base_lock.wait_ex_event);
68 sync_array_object_signalled();
69+
70 } else if (lock_word == X_LOCK_DECR) {
71
72 /* S-waiters may exist during an S unlock if a high-priority
73@@ -765,6 +747,8 @@
74 #endif
75 &lock->base_lock);
76
77+ ut_ad(lock->high_priority_wait_ex_waiter == 0);
78+
79 if (rw_lock_lock_word_incr(&lock->base_lock, x_lock_incr)
80 == X_LOCK_DECR) {
81
82@@ -776,12 +760,10 @@
83
84 if (lock->high_priority_x_waiters) {
85
86- lock->high_priority_x_waiters = 0;
87 os_event_set(lock->high_priority_x_event);
88 sync_array_object_signalled();
89 } else if (lock->high_priority_s_waiters) {
90
91- lock->high_priority_s_waiters = 0;
92 os_event_set(lock->high_priority_s_event);
93 sync_array_object_signalled();
94 } else if (lock->base_lock.waiters) {
95
96=== modified file 'Percona-Server/storage/innobase/include/sync0sync.h'
97--- Percona-Server/storage/innobase/include/sync0sync.h 2013-09-26 14:58:37 +0000
98+++ Percona-Server/storage/innobase/include/sync0sync.h 2013-10-15 14:32:10 +0000
99@@ -977,12 +977,11 @@
100 word etc. for the priority mutex */
101 os_event_t high_priority_event; /* High priority wait array
102 event */
103- volatile ulint high_priority_waiters; /* Set to 1 if there are (or
104- may be) threads that asked for this
105- mutex to be acquired with high priority
106- in the global wait array for this mutex
107- to be released. Otherwise, this is
108- 0. */
109+ volatile ulint high_priority_waiters; /* Number of threads that asked
110+ for this mutex to be acquired with high
111+ priority in the global wait array
112+ waiting for this mutex to be
113+ released. */
114 };
115
116 /** Constant determining how long spin wait is continued before suspending
117
118=== modified file 'Percona-Server/storage/innobase/include/sync0sync.ic'
119--- Percona-Server/storage/innobase/include/sync0sync.ic 2013-09-23 13:53:38 +0000
120+++ Percona-Server/storage/innobase/include/sync0sync.ic 2013-10-15 14:32:10 +0000
121@@ -227,7 +227,6 @@
122 /* Wake up any high priority waiters first. */
123 if (mutex->high_priority_waiters != 0) {
124
125- mutex->high_priority_waiters = 0;
126 os_event_set(mutex->high_priority_event);
127 sync_array_object_signalled();
128
129
130=== modified file 'Percona-Server/storage/innobase/sync/sync0arr.cc'
131--- Percona-Server/storage/innobase/sync/sync0arr.cc 2013-10-04 04:54:33 +0000
132+++ Percona-Server/storage/innobase/sync/sync0arr.cc 2013-10-15 14:32:10 +0000
133@@ -490,7 +490,7 @@
134 if (type == SYNC_PRIO_MUTEX) {
135
136 fprintf(file,
137- "high-priority waiters flag %lu\n",
138+ "high-priority waiters count %lu\n",
139 (ulong) prio_mutex->high_priority_waiters);
140 }
141
142@@ -547,8 +547,8 @@
143 rwlock->last_x_file_name,
144 (ulong) rwlock->last_x_line);
145 if (prio_rwlock) {
146- fprintf(file, "high priority S waiters flag %lu, "
147- "high priority X waiters flag %lu, "
148+ fprintf(file, "high priority S waiters count %lu, "
149+ "high priority X waiters count %lu, "
150 "wait-exclusive waiter is "
151 "high priority if exists: %lu\n",
152 prio_rwlock->high_priority_s_waiters,
153
154=== modified file 'Percona-Server/storage/innobase/sync/sync0rw.cc'
155--- Percona-Server/storage/innobase/sync/sync0rw.cc 2013-10-04 14:50:45 +0000
156+++ Percona-Server/storage/innobase/sync/sync0rw.cc 2013-10-15 14:32:10 +0000
157@@ -408,8 +408,6 @@
158 /*=============*/
159 prio_rw_lock_t* lock) /*!< in: rw-lock */
160 {
161- ut_ad(lock->high_priority_s_waiters < 2);
162- ut_ad(lock->high_priority_x_waiters < 2);
163 return(rw_lock_validate(&lock->base_lock));
164 }
165
166@@ -491,6 +489,8 @@
167 return; /* Success */
168 } else {
169
170+ prio_rw_lock_t* prio_rw_lock = NULL;
171+
172 if (i > 0 && i < SYNC_SPIN_ROUNDS) {
173 goto lock_loop;
174 }
175@@ -507,10 +507,14 @@
176 /* Set waiters before checking lock_word to ensure wake-up
177 signal is sent. This may lead to some unnecessary signals. */
178 if (high_priority) {
179- prio_rw_lock_t* prio_rw_lock
180- = (prio_rw_lock_t *) _lock;
181- prio_rw_lock->high_priority_s_waiters = 1;
182+
183+ prio_rw_lock = reinterpret_cast<prio_rw_lock_t *>
184+ (_lock);
185+ os_atomic_increment_ulint(
186+ &prio_rw_lock->high_priority_s_waiters,
187+ 1);
188 } else {
189+
190 rw_lock_set_waiter_flag(lock);
191 }
192
193@@ -519,6 +523,12 @@
194 && (TRUE == rw_lock_s_lock_low(lock, pass,
195 file_name, line))) {
196 sync_array_free_cell(sync_arr, index);
197+ if (prio_rw_lock) {
198+
199+ os_atomic_decrement_ulint(
200+ &prio_rw_lock->high_priority_s_waiters,
201+ 1);
202+ }
203 return; /* Success */
204 }
205
206@@ -536,6 +546,13 @@
207
208 sync_array_wait_event(sync_arr, index);
209
210+ if (prio_rw_lock) {
211+
212+ os_atomic_decrement_ulint(
213+ &prio_rw_lock->high_priority_s_waiters,
214+ 1);
215+ }
216+
217 i = 0;
218 goto lock_loop;
219 }
220@@ -584,6 +601,7 @@
221 ulint i = 0;
222 sync_array_t* sync_arr;
223 size_t counter_index;
224+ prio_rw_lock_t* prio_rw_lock = NULL;
225
226 /* We reuse the thread id to index into the counter, cache
227 it here for efficiency. */
228@@ -612,7 +630,7 @@
229
230 if (high_priority) {
231
232- prio_rw_lock_t* prio_rw_lock
233+ prio_rw_lock
234 = reinterpret_cast<prio_rw_lock_t *>(lock);
235 prio_rw_lock->high_priority_wait_ex_waiter = 1;
236 }
237@@ -643,6 +661,10 @@
238 We must pass the while-loop check to proceed.*/
239 } else {
240 sync_array_free_cell(sync_arr, index);
241+ if (prio_rw_lock) {
242+
243+ prio_rw_lock->high_priority_wait_ex_waiter = 0;
244+ }
245 }
246 }
247 rw_lock_stats.rw_x_spin_round_count.add(counter_index, i);
248@@ -739,6 +761,7 @@
249 sync_array_t* sync_arr;
250 ibool spinning = FALSE;
251 size_t counter_index;
252+ prio_rw_lock_t* prio_lock = NULL;
253
254 /* We reuse the thread id to index into the counter, cache
255 it here for efficiency. */
256@@ -820,14 +843,22 @@
257 /* Waiters must be set before checking lock_word, to ensure signal
258 is sent. This could lead to a few unnecessary wake-up signals. */
259 if (high_priority) {
260- prio_rw_lock_t* prio_lock = (prio_rw_lock_t *)lock;
261- prio_lock->high_priority_x_waiters = 1;
262+
263+ prio_lock = reinterpret_cast<prio_rw_lock_t *>(lock);
264+ os_atomic_increment_ulint(&prio_lock->high_priority_x_waiters,
265+ 1);
266 } else {
267 rw_lock_set_waiter_flag(lock);
268 }
269
270 if (rw_lock_x_lock_low(lock, high_priority, pass, file_name, line)) {
271 sync_array_free_cell(sync_arr, index);
272+ if (prio_lock) {
273+
274+ os_atomic_decrement_ulint(
275+ &prio_lock->high_priority_x_waiters,
276+ 1);
277+ }
278 return; /* Locking succeeded */
279 }
280
281@@ -845,6 +876,12 @@
282
283 sync_array_wait_event(sync_arr, index);
284
285+ if (prio_lock) {
286+
287+ os_atomic_decrement_ulint(&prio_lock->high_priority_x_waiters,
288+ 1);
289+ }
290+
291 i = 0;
292 goto lock_loop;
293 }
294
295=== modified file 'Percona-Server/storage/innobase/sync/sync0sync.cc'
296--- Percona-Server/storage/innobase/sync/sync0sync.cc 2013-09-23 13:53:38 +0000
297+++ Percona-Server/storage/innobase/sync/sync0sync.cc 2013-10-15 14:32:10 +0000
298@@ -550,7 +550,8 @@
299 /* The typecast below is performed for some of the priority mutexes
300 too, when !high_priority. This exploits the fact that regular mutex is
301 a prefix of the priority mutex in memory. */
302- ib_mutex_t* mutex = (ib_mutex_t *) _mutex;
303+ ib_mutex_t* mutex = (ib_mutex_t *) _mutex;
304+ ib_prio_mutex_t* prio_mutex = NULL;
305
306 counter_index = (size_t) os_thread_get_curr_id();
307
308@@ -624,8 +625,12 @@
309 then the event is set to the signaled state. */
310
311 if (high_priority) {
312- ((ib_prio_mutex_t *)_mutex)->high_priority_waiters = 1;
313+
314+ prio_mutex = reinterpret_cast<ib_prio_mutex_t *>(_mutex);
315+ os_atomic_increment_ulint(&prio_mutex->high_priority_waiters,
316+ 1);
317 } else {
318+
319 mutex_set_waiters(mutex, 1);
320 }
321
322@@ -641,6 +646,11 @@
323 mutex_set_debug_info(mutex, file_name, line);
324 #endif
325
326+ if (prio_mutex) {
327+ os_atomic_decrement_ulint(
328+ &prio_mutex->high_priority_waiters,
329+ 1);
330+ }
331 return;
332
333 /* Note that in this case we leave the waiters field
334@@ -659,6 +669,12 @@
335
336 sync_array_wait_event(sync_arr, index);
337
338+ if (prio_mutex) {
339+
340+ os_atomic_decrement_ulint(&prio_mutex->high_priority_waiters,
341+ 1);
342+ }
343+
344 goto mutex_loop;
345 }
346
