Merge lp:~hrvojem/percona-server/page_cleaner_docs-5.6 into lp:percona-server/5.6
- page_cleaner_docs-5.6
- Merge into 5.6
Status: | Merged |
---|---|
Approved by: | Laurynas Biveinis |
Approved revision: | no longer in the source branch. |
Merged at revision: | 463 |
Proposed branch: | lp:~hrvojem/percona-server/page_cleaner_docs-5.6 |
Merge into: | lp:percona-server/5.6 |
Diff against target: |
293 lines (+261/-1) 4 files modified
doc/source/conf.py (+1/-1) doc/source/index.rst (+2/-0) doc/source/performance/page_cleaner_tuning.rst (+88/-0) doc/source/performance/xtradb_performance_improvements.rst (+170/-0) |
To merge this branch: | bzr merge lp:~hrvojem/percona-server/page_cleaner_docs-5.6 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Laurynas Biveinis (community) | Approve | ||
Review via email: mp+189575@code.launchpad.net |
Commit message
Description of the change
Laurynas Biveinis (laurynas-biveinis) wrote : | # |
Laurynas Biveinis (laurynas-biveinis) wrote : | # |
- 39: "Pseudo-parallel flushing"
- 42: s/the 2nd chunk for the 2nd instance/the 2nd chunk for
the 2nd instance, etc.
- 42: (currently defined as <10% of (var);, might be changed in
the future), "not depleted anymore", "or the flushing limit"
- 47: "Percona Server has implemented time limits for LRU and
flush list flushes in a single page cleaner thread iteration."
- Remove "For LRU flushes, the timeout is checked before each LRU
chunk flush.", the next sentence replaces it.
- 49: new section "Tuning Variables", move it below "Adaptive
Flushing Tuning"
- 49, s/experimental, their/experimental, thus their/
- 82: s/10%/10, add allowed range
- 98: s/becomes too aggressive and maintains/may become too
aggressive and maintain
- Both pages are somewhat related to each other. Let's
crosslink?
- 123: s/number/a number. Add "
- 125: "Thread Priority Locks". Move it to the end (it's
UNIV_PERF_DEBUG-only):
... must acquire the shared resources with priority. To this
end, a priority mutex and a priority RW lock locking primitives
have been implemented, that use the existing sync array code to
wake up any high-priority waiting threads before any
low-priority waiting threads, as well as reduce any low-priority
thread spinning if any high-priority waiters are already present
for a given sync object. The following mutexes have been
converted to be priority mutexes: dict_sys, LRU list, free
list, rseg, log_sys, and internal hash table sync object array
mutexes. The following RW locks have been converted to
priority RW locks: fsp, page_hash, AHI, index, and purge. To
specify which threads are high-priority for shared resource
acquisition, |Percona Server| has introduced ... ", and remove
lines 166--184.
- Before line 125, add a note that this is experimental and
applies only for UNIV_PERF_DEBUG build.
- 191: s/(i.e. purge), /(i.e. purge) threads stall,
- 192 "increasesr"
- 193 s/hundreds/in up to hundreds
- 195: s/Priority/
thread priority lock framework (link?)"
- 209: remove 1st sentence.
- 214: s/tuning in some cases/tuning, in some cases
- 234: Remove "nothing will happen" sentence, adjust the
beginning of the next sentence.
- Describe how the option values match nice values, i.e. 0 is 19
(lowest priority) and 39 is -20 (highest priority).
- Default is 19, not 20
Laurynas Biveinis (laurynas-biveinis) wrote : | # |
- 42: missing closing parenthesis after "future"
- 143: s/they are/the flush list mutex is
- 160: s/It is better/In such cases it is better
- 243: no need to say EXPERIMENTAL, because the parts of this
feature are used for free list refill and enabled by default.
Laurynas Biveinis (laurynas-biveinis) : | # |
Preview Diff
1 | === modified file 'doc/source/conf.py' |
2 | --- doc/source/conf.py 2013-09-20 13:13:37 +0000 |
3 | +++ doc/source/conf.py 2013-10-07 16:02:09 +0000 |
4 | @@ -54,7 +54,7 @@ |
5 | # The short X.Y version. |
6 | version = '5.6' |
7 | # The full version, including alpha/beta/rc tags. |
8 | -release = '5.6.13-60.6' |
9 | +release = '5.6.13-61.0' |
10 | |
11 | # The language for content autogenerated by Sphinx. Refer to documentation |
12 | # for a list of supported languages. |
13 | |
14 | === modified file 'doc/source/index.rst' |
15 | --- doc/source/index.rst 2013-10-01 15:11:56 +0000 |
16 | +++ doc/source/index.rst 2013-10-07 16:02:09 +0000 |
17 | @@ -75,6 +75,8 @@ |
18 | performance/innodb_numa_support |
19 | performance/innodb_opt_lru_count |
20 | performance/threadpool |
21 | + performance/page_cleaner_tuning |
22 | + performance/xtradb_performance_improvements |
23 | |
24 | Flexibility Improvements |
25 | ======================== |
26 | |
27 | === added file 'doc/source/performance/page_cleaner_tuning.rst' |
28 | --- doc/source/performance/page_cleaner_tuning.rst 1970-01-01 00:00:00 +0000 |
29 | +++ doc/source/performance/page_cleaner_tuning.rst 2013-10-07 16:02:09 +0000 |
30 | @@ -0,0 +1,88 @@ |
31 | +.. _page_cleaner_tuning: |
32 | + |
33 | +============================ |
34 | + Page cleaner thread tuning |
35 | +============================ |
36 | + |
37 | +|Percona Server| has implemented page cleaner thread improvements in :rn:`5.6.13-61.0` release. |
38 | + |
39 | +Pseudo-parallel flushing |
40 | +======================== |
41 | + |
42 | +Usage of multiple buffer pool instances is `not uniform <http://mikaelronstrom.blogspot.com/2010/09/multiple-buffer-pools-in-mysql-55.html>`_. Non-uniform buffer pool instance use means non-uniform free list depletion levels, which in turn causes transient stalls or single page LRU flushes for the query threads, increasing latency, and LRU/free list mutex contention. |Percona Server| has added heuristics that reduce the occurrence of depleted free lists and the associated mutex contention. Instead of issuing all the chunk-size LRU flush requests for the 1st instance, then for the 2nd instance, etc., the requests are issued to all instances in a pseudo-parallel manner: the 1st chunk for the 1st instance, the 1st chunk for the 2nd instance, etc., then the 2nd chunk for the 1st instance, the 2nd chunk for the 2nd instance, etc. Moreover, if a particular instance has a nearly depleted free list (currently defined as <10% of :variable:`innodb_lru_scan_depth`; this might be changed in the future), the server keeps issuing requests for that instance until its free list is not depleted anymore, or the flushing limit for it has been reached. |
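The pseudo-parallel issue order described above can be sketched as follows (an editorial illustration with hypothetical names, not Percona Server code):

```python
# Hypothetical sketch of the pseudo-parallel dispatch order: chunk-size
# LRU flush requests are interleaved across buffer pool instances rather
# than finishing one instance before starting the next.
def dispatch_order(n_instances, n_chunks):
    """Return (chunk, instance) pairs in pseudo-parallel issue order."""
    return [(chunk, inst)
            for chunk in range(n_chunks)
            for inst in range(n_instances)]

# With 3 instances and 2 chunks, every instance receives its 1st chunk
# before any instance receives its 2nd chunk.
print(dispatch_order(3, 2))
# → [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (1, 2)]
```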
43 | + |
44 | +Timeouts |
45 | +======== |
46 | + |
47 | +|Percona Server| has implemented time limits for LRU and flush list flushes in a single page cleaner thread iteration. The thread assumes that one iteration of its main loop (LRU and flush list flushes) completes under 1 second, but under heavy load we have observed iterations taking up to 17 seconds. Such situations confuse the heuristics, and an LRU or a flush list flush taking a long time prevents the other kind of flush from running, which in turn may cause query threads to perform sync preflushes or single page LRU flushes, depending on the starved flush type. If an LRU flush timeout happens, the current flushing pass over all buffer pool instances is still completed, in order to ensure that all the instances have received at least a bit of flushing. In order to implement this for flush list flushes, the flush requests for each buffer pool instance were broken up into chunks too. |
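The timeout behaviour (the current pass over all instances still completes even after the deadline) can be modelled like this; function and parameter names are hypothetical, not server code:

```python
def flush_lru_with_timeout(instances, n_chunks, flush_chunk, now, deadline):
    """Issue LRU flush chunks pass by pass. The timeout is only checked
    between passes, so the pass in progress always completes and every
    instance receives at least some flushing (assumed behaviour)."""
    for chunk in range(n_chunks):
        for inst in instances:
            flush_chunk(inst, chunk)
        if now() > deadline:   # timed out: stop, but only after the
            return chunk + 1   # full pass; return completed pass count
    return n_chunks
```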
48 | + |
49 | +Adaptive Flushing Tuning |
50 | +======================== |
51 | + |
52 | +With the tuned page cleaner heuristics, adaptive flushing may become too aggressive and maintain a consistently lower checkpoint age than a similarly-configured stock server. This results in a performance loss due to reduced write combining. To address this, |Percona Server| has implemented a new LSN age factor formula for page cleaner adaptive flushing, which can be controlled with the :variable:`innodb_cleaner_lsn_age_factor` variable. |
53 | + |
54 | +.. variable:: innodb_cleaner_lsn_age_factor |
55 | + |
56 | + :version 5.6.13-61.0: Introduced. |
57 | + :cli: Yes |
58 | + :conf: Yes |
59 | + :scope: Global |
60 | + :dyn: Yes |
61 | + :values: legacy, high_checkpoint |
62 | + :default: high_checkpoint |
63 | + |
64 | +This variable is used to specify which algorithm should be used for page cleaner adaptive flushing. When the ``legacy`` option is set, the server uses the upstream algorithm; when ``high_checkpoint`` is selected, the |Percona| implementation is used. |
65 | + |
66 | +Tuning Variables |
67 | +================ |
68 | + |
69 | +|Percona Server| has introduced several tuning options, only available in builds compiled with the ``UNIV_PERF_DEBUG`` or ``UNIV_DEBUG`` C preprocessor define. These options are experimental, thus their names, allowed values, semantics, and ``UNIV_PERF_DEBUG`` presence may change in any future release. Their default values are used for the corresponding variables in regular (that is, no ``UNIV_PERF_DEBUG`` defined) builds. |
70 | + |
71 | +.. variable:: innodb_cleaner_max_lru_time |
72 | + |
73 | + :version 5.6.13-61.0: Introduced. |
74 | + :default: 1000 (milliseconds) |
75 | + |
76 | +This variable is used to specify the timeout for the LRU flush of one page cleaner thread iteration. |
77 | + |
78 | +.. variable:: innodb_cleaner_max_flush_time |
79 | + |
80 | + :version 5.6.13-61.0: Introduced. |
81 | + :default: 1000 (milliseconds) |
82 | + |
83 | +This variable is used to specify the timeout for the flush list flush. |
84 | + |
85 | +.. variable:: innodb_cleaner_lru_chunk_size |
86 | + |
87 | + :version 5.6.13-61.0: Introduced. |
88 | + :default: 100 |
89 | + |
90 | +This variable replaces the previously hardcoded constant of 100 as the chunk size for LRU flushes. |
91 | + |
92 | +.. variable:: innodb_cleaner_flush_chunk_size |
93 | + |
94 | + :version 5.6.13-61.0: Introduced. |
95 | + :default: 100 |
96 | + |
97 | +This variable is used for specifying the chunk size for the flush list flushes. |
98 | + |
99 | +.. variable:: innodb_cleaner_free_list_lwm |
100 | + |
101 | + :version 5.6.13-61.0: Introduced. |
102 | + :default: 10 |
103 | + :values: 0-100 |
104 | + |
105 | +This variable is used to specify the percentage of the free list length below which LRU flushing will keep iterating on the same buffer pool instance, to prevent an empty free list. |
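A sketch of the low-water-mark check this variable controls, using the 10% default described earlier (illustrative only, not server code):

```python
def below_free_list_lwm(free_len, lru_scan_depth, lwm_pct=10):
    """True when the instance's free list length is below the low-water
    mark (innodb_cleaner_free_list_lwm percent of innodb_lru_scan_depth),
    so LRU flushing keeps iterating on this buffer pool instance."""
    # integer comparison avoids floating point: free/depth < pct/100
    return free_len * 100 < lwm_pct * lru_scan_depth

# With innodb_lru_scan_depth = 1024, a free list of 50 pages (<10%)
# keeps the cleaner on this instance; 200 pages does not.
```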
106 | + |
107 | +.. variable:: innodb_cleaner_eviction_factor |
108 | + |
109 | + :version 5.6.13-61.0: Introduced. |
110 | + :vartype: Boolean |
111 | + :values: ON/OFF |
112 | + :default: OFF |
113 | + |
114 | +This variable chooses between flushed and evicted page counts for the LRU flushing heuristics. If enabled, LRU tail flushing uses evicted instead of flushed page counts for its heuristics. |
115 | + |
116 | +Other reading |
117 | +============= |
118 | +* :ref:`xtradb_performance_improvements` |
119 | |
120 | === added file 'doc/source/performance/xtradb_performance_improvements.rst' |
121 | --- doc/source/performance/xtradb_performance_improvements.rst 1970-01-01 00:00:00 +0000 |
122 | +++ doc/source/performance/xtradb_performance_improvements.rst 2013-10-07 16:02:09 +0000 |
123 | @@ -0,0 +1,170 @@ |
124 | +.. _xtradb_performance_improvements: |
125 | + |
126 | +================================= |
127 | + XtraDB Performance Improvements |
128 | +================================= |
129 | + |
130 | +In |Percona Server| :rn:`5.6.13-61.0` a number of |XtraDB| performance improvements have been implemented for high-concurrency scenarios. |
131 | + |
132 | +Priority refill for the buffer pool free list |
133 | +============================================= |
134 | + |
135 | +In highly-concurrent I/O-bound workloads the following situation may happen: |
136 | + 1) Buffer pool free lists are used faster than they are refilled by the LRU cleaner thread. |
137 | + 2) Buffer pool free lists become empty and more and more query and utility (i.e. purge) threads stall, checking whether a free list has become non-empty, sleeping, and performing single-page LRU flushes. |
138 | + 3) The number of free list mutex waiters increases. |
139 | + 4) When the page cleaner thread (or a single page LRU flush by a query thread) finally produces a free page, it is starved from putting it on the free list as it must acquire the free list mutex too. However, being one thread in up to hundreds, the chances of a prompt acquisition are low. |
140 | + |
141 | +To avoid this, |Percona Server| has implemented priority refill for the buffer pool free list in :rn:`5.6.13-61.0`. This implementation adjusts the free list producer to always acquire the mutex with high priority, and the free list consumers to always acquire it with low priority. The implementation makes use of the :ref:`thread priority lock framework <thread_priority>`. |
142 | + |
143 | +Even the above implementation does not fully resolve the mutex contention, as the free list mutex is still acquired needlessly whenever the free list is empty. This was addressed by delegating all the LRU flushes to the page cleaner thread, never attempting to evict a page or to perform an LRU single page flush in a query thread, and by introducing a backoff algorithm to reduce free list mutex pressure on empty free lists. This behavior is controlled through the new system variable :variable:`innodb_empty_free_list_algorithm`. |
144 | + |
145 | +.. variable:: innodb_empty_free_list_algorithm |
146 | + |
147 | + :version 5.6.13-61.0: Introduced. |
148 | + :cli: Yes |
149 | + :conf: Yes |
150 | + :scope: Global |
151 | + :dyn: Yes |
152 | + :values: legacy, backoff |
153 | + :default: backoff |
154 | + |
155 | +When the ``legacy`` option is set, the server uses the upstream algorithm; when ``backoff`` is selected, the |Percona| implementation is used. |
156 | + |
157 | +Backoff for sync preflushes |
158 | +=========================== |
159 | + |
160 | +Currently, if a log-writing thread finds that the checkpoint age is in the sync preflush zone, it will attempt to advance the checkpoint itself by issuing a flush list flush batch, unless one is already running. After the page cleaner tuning, in some cases this feature hinders more than it helps: the cleaner thread knows that the system is in a sync preflush state and will perform furious flushing itself. A query thread doing its own flushes only contributes to mutex pressure and uses CPU. In such cases it is better for the query threads to wait for any required flushes to complete instead. Whenever a query thread needs to perform a sync preflush to proceed, two options are now available: |
161 | + |
162 | + 1) the query thread may issue a flush list batch itself and wait for it to complete. This is also used whenever the page cleaner thread is not running. |
163 | + 2) alternatively the query thread may wait until the flush list flush is performed by the page cleaner thread. The wait is implemented using a tweaked exponential backoff: the thread sleeps for a random progressively-increasing time waiting for the flush list flush to happen. The sleep time counter is periodically reset to avoid runaway sleeps. This algorithm may change in the future. |
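The tweaked exponential backoff in option 2 can be sketched as below; the shift cap and reset period are illustrative assumptions, not the server's actual constants:

```python
import random

def backoff_ceiling(attempt, max_shift=10, reset_every=64):
    """Upper bound on the random sleep for a given attempt. The cap grows
    progressively and resets periodically to avoid runaway sleeps; the
    constants here are assumptions, not Percona Server's real values."""
    step = attempt % reset_every          # periodic counter reset
    return 1 << min(step, max_shift)      # progressively-increasing cap

def preflush_sleep(attempt, rng=random.random):
    """Random sleep duration (arbitrary units) for one backoff round."""
    return rng() * backoff_ceiling(attempt)
```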
164 | + |
165 | +The behavior is controlled by a new system variable :variable:`innodb_foreground_preflush`. |
166 | + |
167 | +.. variable:: innodb_foreground_preflush |
168 | + |
169 | + :version 5.6.13-61.0: Introduced. |
170 | + :cli: Yes |
171 | + :conf: Yes |
172 | + :scope: Global |
173 | + :dyn: Yes |
174 | + :values: sync_preflush, exponential_backoff |
175 | + :default: exponential_backoff |
176 | + |
177 | +Relative Thread Scheduling Priorities for XtraDB |
178 | +================================================ |
179 | + |
180 | +|Percona Server| has implemented Relative Thread Scheduling Priorities for |XtraDB| in :rn:`5.6.13-61.0`. This feature was implemented because whenever a high number of query threads is running on the server, the cleaner thread and other utility threads must receive more CPU time than fair scheduling would allocate. A new :variable:`innodb_sched_priority_cleaner` option has been introduced, whose values correspond to Linux ``nice`` values of ``-20..19``, where 0 is 19 (lowest priority) and 39 is -20 (highest priority). When a new value is set, the server attempts to set the thread nice priority for the specified thread type, and returns a warning with the actual priority if the attempt fails. |
181 | + |
182 | +.. note:: |
183 | + |
184 | + This feature implementation is Linux-specific. |
185 | + |
186 | +.. variable:: innodb_sched_priority_cleaner |
187 | + |
188 | + :version 5.6.13-61.0: Introduced. |
189 | + :cli: Yes |
190 | + :conf: Yes |
191 | + :scope: Global |
192 | + :dyn: Yes |
193 | + :values: 0-39 |
194 | + :default: 19 |
195 | + |
196 | +This variable is used to set a thread scheduling priority. Values correspond to Linux ``nice`` values of ``-20..19``, where 0 is 19 (lowest priority) and 39 is -20 (highest priority). |
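The value-to-nice mapping described above can be expressed as a one-liner (editorial sketch, hypothetical function name):

```python
def sched_priority_to_nice(value):
    """Map an innodb_sched_priority_* option value (0..39) to a Linux
    nice value: 0 -> 19 (lowest priority), 39 -> -20 (highest)."""
    if not 0 <= value <= 39:
        raise ValueError("priority value must be in 0..39")
    return 19 - value

# The default of 19 corresponds to nice 0, i.e. normal scheduling.
```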
197 | + |
198 | +|Percona Server| has introduced several options, only available in builds compiled with ``UNIV_PERF_DEBUG`` C preprocessor define. |
199 | + |
200 | +.. variable:: innodb_sched_priority_purge |
201 | + |
202 | + :version 5.6.13-61.0: Introduced. |
203 | + :cli: Yes |
204 | + :conf: Yes |
205 | + :scope: Global |
206 | + :dyn: Yes |
207 | + :vartype: Boolean |
208 | + |
209 | +.. variable:: innodb_sched_priority_io |
210 | + |
211 | + :version 5.6.13-61.0: Introduced. |
212 | + :cli: Yes |
213 | + :conf: Yes |
214 | + :scope: Global |
215 | + :dyn: Yes |
216 | + :vartype: Boolean |
217 | + |
218 | +.. variable:: innodb_sched_priority_cleaner |
219 | + |
220 | + :version 5.6.13-61.0: Introduced. |
221 | + :cli: Yes |
222 | + :conf: Yes |
223 | + :scope: Global |
224 | + :dyn: Yes |
225 | + :vartype: Boolean |
226 | + |
227 | +.. variable:: innodb_sched_priority_master |
228 | + |
229 | + :version 5.6.13-61.0: Introduced. |
230 | + :cli: Yes |
231 | + :conf: Yes |
232 | + :scope: Global |
233 | + :dyn: Yes |
234 | + :vartype: Boolean |
235 | + |
236 | +.. _thread_priority: |
237 | + |
238 | +Thread Priority Locks |
239 | +===================== |
240 | + |
241 | +The |InnoDB| worker threads compete with the query threads for shared resource access. Performance experiments show that under high concurrency the worker threads must acquire the shared resources with priority. To this end, priority mutex and priority RW lock locking primitives have been implemented. They use the existing sync array code to wake up any high-priority waiting threads before any low-priority waiting threads, and to reduce low-priority thread spinning if any high-priority waiters are already present for a given sync object. The following mutexes have been converted to priority mutexes: dict_sys, LRU list, free list, rseg, log_sys, and the internal hash table sync object array mutexes. The following RW locks have been converted to priority RW locks: fsp, page_hash, AHI, index, and purge. To specify which threads are high-priority for shared resource acquisition, |Percona Server| has introduced several tuning options, only available in builds compiled with the ``UNIV_PERF_DEBUG`` C preprocessor define. |
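A toy model of the wakeup policy (high-priority waiters woken before any low-priority waiters); this is an editorial sketch, not the actual InnoDB sync-array implementation:

```python
from collections import deque

class PriorityMutexModel:
    """Toy model of the priority mutex wakeup policy: on release, any
    high-priority waiters are handed the mutex before low-priority ones."""
    def __init__(self):
        self.held = False
        self.high = deque()   # high-priority waiters (e.g. worker threads)
        self.low = deque()    # low-priority waiters (e.g. query threads)

    def acquire(self, thread, high_priority=False):
        """Return the thread if it got the mutex immediately, else queue it."""
        if not self.held:
            self.held = True
            return thread
        (self.high if high_priority else self.low).append(thread)
        return None

    def release(self):
        """Hand the mutex to the next waiter, high-priority queue first."""
        if self.high:
            return self.high.popleft()
        if self.low:
            return self.low.popleft()
        self.held = False
        return None
```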
242 | + |
243 | +.. variable:: innodb_priority_purge |
244 | + |
245 | + :version 5.6.13-61.0: Introduced. |
246 | + :cli: Yes |
247 | + :conf: Yes |
248 | + :scope: Global |
249 | + :dyn: Yes |
250 | + :vartype: Boolean |
251 | + |
252 | +When this option is enabled, the purge coordinator and worker threads acquire shared resources with priority. |
253 | + |
254 | +.. variable:: innodb_priority_io |
255 | + |
256 | + :version 5.6.13-61.0: Introduced. |
257 | + :cli: Yes |
258 | + :conf: Yes |
259 | + :scope: Global |
260 | + :dyn: Yes |
261 | + :vartype: Boolean |
262 | + |
263 | +When this option is enabled, the I/O threads acquire shared resources with priority. |
264 | + |
265 | +.. variable:: innodb_priority_cleaner |
266 | + |
267 | + :version 5.6.13-61.0: Introduced. |
268 | + :cli: Yes |
269 | + :conf: Yes |
270 | + :scope: Global |
271 | + :dyn: Yes |
272 | + :vartype: Boolean |
273 | + |
274 | +When this option is enabled, the buffer pool cleaner thread acquires shared resources with priority. |
275 | + |
276 | +.. variable:: innodb_priority_master |
277 | + |
278 | + :version 5.6.13-61.0: Introduced. |
279 | + :cli: Yes |
280 | + :conf: Yes |
281 | + :scope: Global |
282 | + :dyn: Yes |
283 | + :vartype: Boolean |
284 | + |
285 | +When this option is enabled, the master thread acquires shared resources with priority. |
286 | + |
287 | +.. note:: |
288 | + |
289 | + These variables are intended for performance experimenting and not regular user tuning. |
290 | + |
291 | +Other Reading |
292 | +============= |
293 | +* :ref:`page_cleaner_tuning` |
- 37: s/Page/page
- Move timeouts after dispatching section (timeouts will be
explained using dispatches, at which point do they apply).
- 42: s/hard//
- 42: "The thread assumes that one iteration of its main loop
(LRU and flush list flushes) complete under 1 second, but under
... "
- "and a LRU or a flush list flush taking a long time ... "
- if a LRU flush timeout happens, the current flushing pass over
all buffer pool instances is still completed in order to ensure
that all the instances have received at least a bit of
flushing.
- 47: feel free to link to Ronstrom's blog post from the
blueprint
- 47: s/for LRU flushes/, which in turn causes transient stalls
or single page LRU flushes for the query threads, increasing
latency, and LRU/free list mutex contention. Percona
Server has added heuristics that reduce the occurrence of
depleted free lists and associated mutex contentions. Instead
of issuing all the chunk-size LRU flush requests for the 1st
instance, then for the 2nd instance, etc, the requests are
issued to all instances in a pseudo-parallel manner: the 1st chunk
for the 1st instance, the 1st chunk for the 2nd instance, etc.,
the 2nd chunk for the 1st instance, the 2nd chunk for the 2nd
instance. Moreover, if a particular instance has a nearly
depleted ...
- Let's remove "To support this ... " for now. Need to think more
how to describe it better.
- Line 49: UNIV_PERF_DEBUG or UNIV_DEBUG.
- Line 49: these options are experimental, their name, allowed
values, their semantics, and UNIV_PERF_DEBUG presence may
change at any future release. Their default values are used
for the corresponding variables in regular (that is, no
UNIV_PERF_DEBUG defined) builds.
- The options need their default values. Then line 90 is not
needed.
- Line 92: its own section "Adaptive Flushing Tuning"
- Line 104: s/used/use
- Line 106+ (innodb_foreground_preflush) go to
xtradb_performance_improvements.rst.
- "this feature hinders more than helps"/"in some cases, this
feature ... "
- "in a sync preflush state"
- "the wait is implemented using a tweaked exponential backoff:
the thread sleeps for a random progressively-increasing time
waiting for the flush list flush to happen. The sleep time
counter is periodically reset to avoid runaway sleeps. This
algorithm may change in the future."
- line 123: redundant?
- Line 135: " ... for high-concurrency scenarios".
- XtraDB thread priority flag / priority mutex / priority rwlock
should be merged to a single section.
- Line 140: s/Some of the/The/
- Line 190: "Another ... " moves after the line 192 list.
- Line 205: "In highly-concurrent I/O-bound workloads the
following situation may happen:"
- Line 213: "Even the above implementation does not fully resolve
the mutex contentions, as they are still being acquired
needlessly whenever the free list is empty".
- "This was addressed by delegating all the LRU flushes to t...