Merge lp:~hrvojem/percona-server/page_cleaner_docs-5.6 into lp:percona-server/5.6

Proposed by Hrvoje Matijakovic
Status: Merged
Approved by: Laurynas Biveinis
Approved revision: no longer in the source branch.
Merged at revision: 463
Proposed branch: lp:~hrvojem/percona-server/page_cleaner_docs-5.6
Merge into: lp:percona-server/5.6
Diff against target: 293 lines (+261/-1)
4 files modified
doc/source/conf.py (+1/-1)
doc/source/index.rst (+2/-0)
doc/source/performance/page_cleaner_tuning.rst (+88/-0)
doc/source/performance/xtradb_performance_improvements.rst (+170/-0)
To merge this branch: bzr merge lp:~hrvojem/percona-server/page_cleaner_docs-5.6
Reviewer Review Type Date Requested Status
Laurynas Biveinis (community) Approve
Review via email: mp+189575@code.launchpad.net
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

    - 37: s/Page/page
    - Move timeouts after dispatching section (timeouts will be
      explained using dispatches, at which point do they apply).
    - 42: s/hard//
    - 42: "The thread assumes that one iteration of its main loop
      (LRU and flush list flushes) complete under 1 second, but under
      ... "
    - "and a LRU or a flush list flush taking a long time ... "
    - if a LRU flush timeout happens, the current flushing pass over
      all buffer pool instances is still completed in order to ensure
      that all the instances have received at least a bit of
      flushing.
    - 47: feel free to link to Ronstrom's blog post from the
      blueprint
    - 47: s/for LRU flushes/, which in turn causes transient stalls
      or single page LRU flushes for the query threads, increasing
      latency, and LRU/free list mutex contention. Percona
      Server has added heuristics that reduce the occurrence of
      depleted free lists and associated mutex contentions. Instead
      of issuing all the chunk-size LRU flush requests for the 1st
      instance, then for the 2nd instance, etc, the requests are
      issued to all instances in a pseudo-parallel manner: the 1st chunk
      for the 1st instance, the 1st chunk for the 2nd instance, etc.,
      the 2nd chunk for the 1st instance, the 2nd chunk for the 2nd
      instance. Moreover, if a particular instance has a nearly
      depleted ...
    - Let's remove "To support this ... " for now. Need to think more
      how to describe it better.
    - Line 49: UNIV_PERF_DEBUG or UNIV_DEBUG.
    - Line 49: these options are experimental, their name, allowed
      values, their semantics, and UNIV_PERF_DEBUG presence may
      change at any future release. Their default values are used
      for the corresponding variables in regular (that is, no
      UNIV_PERF_DEBUG defined) builds. Then line 90 is not needed.
    - The options need their default values.
    - Line 92: its own section "Adaptive Flushing Tuning"
    - Line 104: s/used/use
    - Line 106+ (innodb_foreground_preflush) go to
      xtradb_performance_improvements.rst.
    - "this feature hinders more than helps"/"in some cases, this
      feature ... "
    - "in a sync preflush state"
    - "the wait is implemented using a tweaked exponential backoff:
      the thread sleeps for a random progressively-increasing time
      waiting for the flush list flush to happen. The sleep time
      counter is periodically reset to avoid runaway sleeps. This
      algorithm may change in the future."
    - line 123: redundant?
    - Line 135: " ... for high-concurrency scenarios".
    - XtraDB thread priority flag / priority mutex / priority rwlock
      should be merged to a single section.
    - Line 140: s/Some of the/The/
    - Line 190: "Another ... " moves after the line 192 list.
    - Line 205: "In highly-concurrent I/O-bound workloads the
      following situation may happen:"
    - Line 213: "Even the above implementation does not fully resolve
      the mutex contentions, as they are still being acquired
      needlessly whenever the free list is empty".
    - "This was addressed by delegating all the LRU flushes to t...


review: Needs Fixing
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

    - 39: "Pseudo-parallel flushing"
    - 42: s/the 2nd chunk for the 2nd instance/the 2nd chunk for
      the 2nd instance, etc.
    - 42: (currently defined as <10% of (var);, might be changed in
      the future), "not depleted anymore", "or the flushing limit"
    - 47: "Percona Server has implemented time limits for LRU and
      flush list flushes in a single page cleaner thread iteration."
    - Remove "For LRU flushes, the timeout is checked before each LRU
      chunk flush.", the next sentence replaces it.
    - 49: new section "Tuning Variables", move it below "Adaptive
      Flushing Tuning"
    - 49, s/experimental, their/experimental, thus their/
    - 82: s/10%/10, add allowed range
    - 98: s/becomes too aggressive and maintains/may become too
      aggressive and maintain
    - Both pages are somehow related to each other. Let's
      crosslink?
    - 123: s/number/a number. Add "
    - 125: "Thread Priority Locks". Move it to the end (it's
      UNIV_PERF_DEBUG only for now). Reorganize the section: "
      ... must acquire the shared resources with priority. To this
      end, a priority mutex and a priority RW lock locking primitives
      have been implemented, that use the existing sync array code to
      wake up any high-priority waiting threads before any
      low-priority waiting threads, as well as reduce any low-priority
      thread spinning if any high-priority waiters are already present
      for a given sync object. The following mutexes have been
      converted to be priority mutexes: dict_sys, LRU list, free
      list, rseg, log_sys, and internal hash table sync object array
      mutexes. The following RW locks have been converted to
      priority RW locks: fsp, page_hash, AHI, index, and purge. To
      specify which threads are high-priority for shared resource
      acquisition, |Percona Server| has introduced ... ", and remove
      lines 166--184.
    - Before line 125, add a note that this is experimental and
      applies only for UNIV_PERF_DEBUG build.
    - 191: s/(i.e. purge), /(i.e. purge) threads stall,
    - 192 "increasesr"
    - 193 s/hundreds/in up to hundreds
    - 195: s/Priority/priority. Add "The implementation makes use of
      thread priority lock framework (link?)"
    - 209: remove 1st sentence.
    - 214: s/tuning in some cases/tuning, in some cases
    - 234: Remove "nothing will happen" sentence, adjust the
      beginning of the next sentence.
    - Describe how the option values match nice values, i.e. 0 is 19
      (lowest priority) and 39 is -20 (highest priority).
    - Default is 19, not 20

review: Needs Fixing
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) wrote :

    - 42: missing closing parenthesis after "future"
    - 143: s/they are/the flush list mutex is
    - 160: s/It is better/In such cases it is better
    - 243: no need to say EXPERIMENTAL, because the parts of this
      feature are used for free list refill and enabled by default.

review: Needs Fixing
Revision history for this message
Laurynas Biveinis (laurynas-biveinis) :
review: Approve

Preview Diff

1=== modified file 'doc/source/conf.py'
2--- doc/source/conf.py 2013-09-20 13:13:37 +0000
3+++ doc/source/conf.py 2013-10-07 16:02:09 +0000
4@@ -54,7 +54,7 @@
5 # The short X.Y version.
6 version = '5.6'
7 # The full version, including alpha/beta/rc tags.
8-release = '5.6.13-60.6'
9+release = '5.6.13-61.0'
10
11 # The language for content autogenerated by Sphinx. Refer to documentation
12 # for a list of supported languages.
13
14=== modified file 'doc/source/index.rst'
15--- doc/source/index.rst 2013-10-01 15:11:56 +0000
16+++ doc/source/index.rst 2013-10-07 16:02:09 +0000
17@@ -75,6 +75,8 @@
18 performance/innodb_numa_support
19 performance/innodb_opt_lru_count
20 performance/threadpool
21+ performance/page_cleaner_tuning
22+ performance/xtradb_performance_improvements
23
24 Flexibility Improvements
25 ========================
26
27=== added file 'doc/source/performance/page_cleaner_tuning.rst'
28--- doc/source/performance/page_cleaner_tuning.rst 1970-01-01 00:00:00 +0000
29+++ doc/source/performance/page_cleaner_tuning.rst 2013-10-07 16:02:09 +0000
30@@ -0,0 +1,88 @@
31+.. _page_cleaner_tuning:
32+
33+============================
34+ Page cleaner thread tuning
35+============================
36+
37+|Percona Server| has implemented page cleaner thread improvements in the :rn:`5.6.13-61.0` release.
38+
39+Pseudo-parallel flushing
40+========================
41+
42+Usage of the multiple buffer pool instances is `not uniform <http://mikaelronstrom.blogspot.com/2010/09/multiple-buffer-pools-in-mysql-55.html>`_. The non-uniform buffer pool instance use means non-uniform free list depletion levels, which in turn causes transient stalls or single page LRU flushes for the query threads, increasing latency, and LRU/free list mutex contention. |Percona Server| has added heuristics that reduce the occurrence of depleted free lists and associated mutex contentions. Instead of issuing all the chunk-size LRU flush requests for the 1st instance, then for the 2nd instance, etc., the requests are issued to all instances in a pseudo-parallel manner: the 1st chunk for the 1st instance, the 1st chunk for the 2nd instance, etc., the 2nd chunk for the 1st instance, the 2nd chunk for the 2nd instance, etc. Moreover, if a particular instance has a nearly depleted free list (currently defined as <10% of :variable:`innodb_lru_scan_depth`; might be changed in the future), then the server keeps on issuing requests for that instance until it's not depleted anymore, or the flushing limit for it has been reached.
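The dispatch order can be sketched as follows (a minimal illustration only; all identifiers are hypothetical placeholders and do not correspond to the actual InnoDB source):

.. code-block:: C

   #include <stdbool.h>

   /* Hypothetical placeholders, not actual InnoDB identifiers. */
   typedef struct buf_pool_struct buf_pool_t;
   extern buf_pool_t *buf_pools[];
   extern int n_instances;
   extern unsigned long chunk_size;      /* innodb_cleaner_lru_chunk_size */
   extern unsigned long lru_scan_depth;  /* innodb_lru_scan_depth */
   extern unsigned long free_list_lwm;   /* innodb_cleaner_free_list_lwm, in % */
   extern void lru_flush_chunk(buf_pool_t *pool, unsigned long n_pages);
   extern unsigned long free_list_len(const buf_pool_t *pool);
   extern bool flush_limit_reached(const buf_pool_t *pool);

   void lru_flush_pseudo_parallel(int n_chunks)
   {
       /* Issue chunk requests round-robin across all instances instead
          of finishing one instance before starting the next. */
       for (int chunk = 0; chunk < n_chunks; chunk++) {
           for (int i = 0; i < n_instances; i++) {
               buf_pool_t *pool = buf_pools[i];
               lru_flush_chunk(pool, chunk_size);
               /* Keep flushing a nearly depleted instance until its free
                  list recovers above the low-water mark or the flushing
                  limit for it is reached. */
               while (free_list_len(pool) * 100 < free_list_lwm * lru_scan_depth
                      && !flush_limit_reached(pool)) {
                   lru_flush_chunk(pool, chunk_size);
               }
           }
       }
   }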
43+
44+Timeouts
45+========
46+
47+|Percona Server| has implemented time limits for LRU and flush list flushes in a single page cleaner thread iteration. The thread assumes that one iteration of its main loop (LRU and flush list flushes) completes under 1 second, but under heavy load we have observed iterations taking up to 17 seconds. Such situations confuse the heuristics, and an LRU or a flush list flush taking a long time prevents the other kind of flush from running, which in turn may cause query threads to perform sync preflushes or single page LRU flushes depending on the starved flush type. If an LRU flush timeout happens, the current flushing pass over all buffer pool instances is still completed in order to ensure that all the instances have received at least a bit of flushing. In order to implement this for flush list flushes, the flush requests for each buffer pool instance were broken up into chunks too.
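A rough sketch of the timeout handling (illustrative only; the helper names are hypothetical):

.. code-block:: C

   #include <stdbool.h>

   /* Hypothetical placeholders, not actual InnoDB identifiers. */
   extern long now_ms(void);
   extern long max_lru_time_ms;   /* innodb_cleaner_max_lru_time */
   extern bool lru_work_remains(void);
   extern void lru_flush_one_chunk_per_instance(void);

   void page_cleaner_lru_phase(void)
   {
       long start = now_ms();
       bool timed_out = false;

       while (!timed_out && lru_work_remains()) {
           /* One pass issues a chunk to every buffer pool instance.
              Even if the timeout fires mid-pass, the pass is completed
              so that every instance receives at least some flushing. */
           lru_flush_one_chunk_per_instance();
           timed_out = now_ms() - start > max_lru_time_ms;
       }
   }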
48+
49+Adaptive Flushing Tuning
50+========================
51+
52+With the tuned page cleaner heuristics, adaptive flushing may become too aggressive and maintain a consistently lower checkpoint age than a similarly-configured stock server. This results in a performance loss due to reduced write combining. To address this, |Percona Server| has implemented a new LSN age factor formula for page cleaner adaptive flushing, which can be controlled with the :variable:`innodb_cleaner_lsn_age_factor` variable.
53+
54+.. variable:: innodb_cleaner_lsn_age_factor
55+
56+ :version 5.6.13-61.0: Introduced.
57+ :cli: Yes
58+ :conf: Yes
59+ :scope: Global
60+ :dyn: Yes
61+ :values: legacy, high_checkpoint
62+ :default: high_checkpoint
63+
64+This variable is used to specify which algorithm should be used for page cleaner adaptive flushing. When the ``legacy`` option is set, the server will use the upstream algorithm; when ``high_checkpoint`` is selected, the |Percona| implementation will be used.
65+
66+Tuning Variables
67+================
68+
69+|Percona Server| has introduced several tuning options, only available in builds compiled with the ``UNIV_PERF_DEBUG`` or ``UNIV_DEBUG`` C preprocessor defines. These options are experimental, thus their names, allowed values, their semantics, and ``UNIV_PERF_DEBUG`` presence may change in any future release. Their default values are used for the corresponding variables in regular (that is, no ``UNIV_PERF_DEBUG`` defined) builds.
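A sketch of how such a build-gated knob might look at the source level (hypothetical; the variable name is illustrative, not an actual server identifier):

.. code-block:: C

   /* In UNIV_PERF_DEBUG/UNIV_DEBUG builds the value is settable through
      the server option; regular builds compile in the default. */
   #if defined(UNIV_PERF_DEBUG) || defined(UNIV_DEBUG)
   unsigned long srv_cleaner_lru_chunk_size = 100;              /* tunable */
   #else
   static const unsigned long srv_cleaner_lru_chunk_size = 100; /* fixed */
   #endif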
70+
71+.. variable:: innodb_cleaner_max_lru_time
72+
73+ :version 5.6.13-61.0: Introduced.
74+ :default: 1000 (milliseconds)
75+
76+This variable is used to specify the timeout for the LRU flush of one page cleaner thread iteration.
77+
78+.. variable:: innodb_cleaner_max_flush_time
79+
80+ :version 5.6.13-61.0: Introduced.
81+ :default: 1000 (milliseconds)
82+
83+This variable is used to specify the timeout for the flush list flush.
84+
85+.. variable:: innodb_cleaner_lru_chunk_size
86+
87+ :version 5.6.13-61.0: Introduced.
88+ :default: 100
89+
90+This variable replaces the hardcoded constant of 100 as the chunk size for the LRU flushes.
91+
92+.. variable:: innodb_cleaner_flush_chunk_size
93+
94+ :version 5.6.13-61.0: Introduced.
95+ :default: 100
96+
97+This variable is used for specifying the chunk size for the flush list flushes.
98+
99+.. variable:: innodb_cleaner_free_list_lwm
100+
101+ :version 5.6.13-61.0: Introduced.
102+ :default: 10
103+ :values: 0-100
104+
105+This variable is used to specify the percentage of the free list length below which LRU flushing will keep iterating on the same buffer pool instance to prevent an empty free list. For example, with :variable:`innodb_lru_scan_depth` set to 1024 and this variable at its default value of 10, LRU flushing keeps iterating on an instance while its free list is shorter than about 102 pages.
106+
107+.. variable:: innodb_cleaner_eviction_factor
108+
109+ :version 5.6.13-61.0: Introduced.
110+ :vartype: Boolean
111+ :values: ON/OFF
112+ :default: OFF
113+
114+This variable is used to choose between flushed and evicted page counts for the LRU flushing heuristics. If enabled, LRU tail flushing uses evicted instead of flushed page counts for its heuristics.
115+
116+Other Reading
117+=============
118+* :ref:`xtradb_performance_improvements`
119
120=== added file 'doc/source/performance/xtradb_performance_improvements.rst'
121--- doc/source/performance/xtradb_performance_improvements.rst 1970-01-01 00:00:00 +0000
122+++ doc/source/performance/xtradb_performance_improvements.rst 2013-10-07 16:02:09 +0000
123@@ -0,0 +1,170 @@
124+.. _xtradb_performance_improvements:
125+
126+=================================
127+ XtraDB Performance Improvements
128+=================================
129+
130+In |Percona Server| :rn:`5.6.13-61.0` a number of |XtraDB| performance improvements have been implemented for high-concurrency scenarios.
131+
132+Priority refill for the buffer pool free list
133+=============================================
134+
135+In highly-concurrent I/O-bound workloads the following situation may happen:
136+ 1) Buffer pool free lists are used faster than they are refilled by the LRU cleaner thread.
137+ 2) Buffer pool free lists become empty and more and more query and utility (i.e. purge) threads stall, checking whether a free list has become non-empty, sleeping, performing single-page LRU flushes.
138+ 3) The number of free list mutex waiters increases.
139+ 4) When the page cleaner thread (or a single page LRU flush by a query thread) finally produces a free page, it is starved from putting it on the free list as it must acquire the free list mutex too. However, being one thread in up to hundreds, the chances of a prompt acquisition are low.
140+
141+To avoid this, |Percona Server| has implemented priority refill for the buffer pool free list in :rn:`5.6.13-61.0`. This implementation adjusts the free list producer to always acquire the mutex with high priority and the free list consumer to always acquire it with low priority. The implementation makes use of the :ref:`thread priority lock framework <thread_priority>`.
142+
143+Even the above implementation does not fully resolve the mutex contentions, as the free list mutex is still being acquired needlessly whenever the free list is empty. This was addressed by delegating all the LRU flushes to the page cleaner thread, never attempting to evict a page or perform an LRU single page flush from a query thread, and introducing a backoff algorithm to reduce the free list mutex pressure on empty free lists. This is controlled through a new system variable :variable:`innodb_empty_free_list_algorithm`.
144+
145+.. variable:: innodb_empty_free_list_algorithm
146+
147+ :version 5.6.13-61.0: Introduced.
148+ :cli: Yes
149+ :conf: Yes
150+ :scope: Global
151+ :dyn: Yes
152+ :values: legacy, backoff
153+ :default: backoff
154+
155+When the ``legacy`` option is set, the server will use the upstream algorithm; when ``backoff`` is selected, the |Percona| implementation will be used.
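The ``backoff`` algorithm can be pictured roughly as follows (a sketch under assumed helper names, not the actual server code):

.. code-block:: C

   #include <stddef.h>
   #include <unistd.h>

   /* Hypothetical placeholders, not actual InnoDB identifiers. */
   typedef struct buf_pool_struct buf_pool_t;
   typedef struct buf_page_struct buf_page_t;
   extern buf_page_t *try_take_from_free_list(buf_pool_t *pool);

   buf_page_t *get_free_page(buf_pool_t *pool)
   {
       unsigned delay_us = 1;

       for (;;) {
           buf_page_t *page = try_take_from_free_list(pool);
           if (page != NULL)
               return page;
           /* The list is empty: back off instead of hammering the free
              list mutex; the page cleaner thread performs all the LRU
              flushes that will refill the list. */
           usleep(delay_us);
           if (delay_us < 1024)
               delay_us *= 2;          /* capped exponential backoff */
       }
   }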
156+
157+Backoff for sync preflushes
158+===========================
159+
160+Currently, if a log-writing thread finds that the checkpoint age is in the sync preflush zone, it will attempt to advance the checkpoint itself by issuing a flush list flush batch unless one is running already. After the page cleaner tuning, in some cases, this feature hinders more than helps: the cleaner thread knows that the system is in a sync preflush state and will perform furious flushing itself. The thread doing its own flushes will only contribute to mutex pressure and use CPU. In such cases it is better for the query threads to wait for any required flushes to complete instead. Whenever a query thread needs to perform a sync preflush to proceed, two options are now available:
161+
162+ 1) the query thread may issue a flush list batch itself and wait for it to complete. This is also used whenever the page cleaner thread is not running.
163+ 2) alternatively, the query thread may wait until the flush list flush is performed by the page cleaner thread, as sketched below. The wait is implemented using a tweaked exponential backoff: the thread sleeps for a random progressively-increasing time waiting for the flush list flush to happen. The sleep time counter is periodically reset to avoid runaway sleeps. This algorithm may change in the future.
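A sketch of the backoff in option 2 (the constants and helper name are illustrative assumptions, not the actual server code):

.. code-block:: C

   #include <stdlib.h>
   #include <unistd.h>

   /* Hypothetical placeholder, not an actual InnoDB function. */
   extern int checkpoint_in_sync_preflush_zone(void);

   void wait_for_cleaner_preflush(void)
   {
       const unsigned base_sleep_us = 100;   /* illustrative value */
       unsigned iteration = 0;

       while (checkpoint_in_sync_preflush_zone()) {
           /* Random sleep whose upper bound grows with each iteration. */
           unsigned cap = base_sleep_us << iteration;
           usleep((unsigned)(rand() % cap) + 1);
           /* Periodically reset the counter to avoid runaway sleeps. */
           if (++iteration > 10)
               iteration = 0;
       }
   }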
164+
165+The behavior is controlled by a new system variable :variable:`innodb_foreground_preflush`.
166+
167+.. variable:: innodb_foreground_preflush
168+
169+ :version 5.6.13-61.0: Introduced.
170+ :cli: Yes
171+ :conf: Yes
172+ :scope: Global
173+ :dyn: Yes
174+ :values: sync_preflush, exponential_backoff
175+ :default: exponential_backoff
176+
177+Relative Thread Scheduling Priorities for XtraDB
178+================================================
179+
180+|Percona Server| has implemented Relative Thread Scheduling Priorities for |XtraDB| in :rn:`5.6.13-61.0`. This feature was implemented because whenever a high number of query threads is running on the server, the cleaner thread and other utility threads must receive more CPU time than fair scheduling would allocate. A new :variable:`innodb_sched_priority_cleaner` option has been introduced, whose values correspond to Linux ``nice`` values of ``-20..19``, where 0 is 19 (lowest priority) and 39 is -20 (highest priority). When a new value is set, the server will attempt to set the thread nice priority for the specified thread type and return a warning with the actual priority if the attempt fails.
181+
182+.. note::
183+
184+ This feature implementation is Linux-specific.
185+
186+.. variable:: innodb_sched_priority_cleaner
187+
188+ :version 5.6.13-61.0: Introduced.
189+ :cli: Yes
190+ :conf: Yes
191+ :scope: Global
192+ :dyn: Yes
193+ :values: 0-39
194+ :default: 19
195+
196+This variable is used to set the thread scheduling priority of the page cleaner thread. Values correspond to Linux ``nice`` values of ``-20..19``, where 0 is 19 (lowest priority) and 39 is -20 (highest priority).
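The mapping is thus ``nice = 19 - value``. A minimal Linux sketch of applying such a priority to the calling thread (illustrative only; this is not the server's actual code path):

.. code-block:: C

   #include <stdio.h>
   #include <sys/resource.h>
   #include <sys/syscall.h>
   #include <unistd.h>

   /* Map an option value in 0..39 to a nice value in 19..-20. */
   static int option_to_nice(int value)
   {
       return 19 - value;   /* 0 -> 19 (lowest), 39 -> -20 (highest) */
   }

   static void set_thread_priority(int value)
   {
       pid_t tid = (pid_t) syscall(SYS_gettid);   /* Linux-specific */

       if (setpriority(PRIO_PROCESS, (id_t) tid, option_to_nice(value)) != 0)
           perror("setpriority");   /* e.g. raising priority needs CAP_SYS_NICE */
   }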
197+
198+|Percona Server| has introduced several options, only available in builds compiled with the ``UNIV_PERF_DEBUG`` C preprocessor define.
199+
200+.. variable:: innodb_sched_priority_purge
201+
202+ :version 5.6.13-61.0: Introduced.
203+ :cli: Yes
204+ :conf: Yes
205+ :scope: Global
206+ :dyn: Yes
207+ :vartype: Boolean
208+
209+.. variable:: innodb_sched_priority_io
210+
211+ :version 5.6.13-61.0: Introduced.
212+ :cli: Yes
213+ :conf: Yes
214+ :scope: Global
215+ :dyn: Yes
216+ :vartype: Boolean
217+
227+.. variable:: innodb_sched_priority_master
228+
229+ :version 5.6.13-61.0: Introduced.
230+ :cli: Yes
231+ :conf: Yes
232+ :scope: Global
233+ :dyn: Yes
234+ :vartype: Boolean
235+
236+.. _thread_priority:
237+
238+Thread Priority Locks
239+=====================
240+
241+The |InnoDB| worker threads compete with the query threads for shared resource access. Performance experiments show that under high concurrency the worker threads must acquire the shared resources with priority. To this end, priority mutex and priority RW lock locking primitives have been implemented. They use the existing sync array code to wake up any high-priority waiting threads before any low-priority waiting threads, as well as to reduce any low-priority thread spinning if high-priority waiters are already present for a given sync object. The following mutexes have been converted to priority mutexes: dict_sys, LRU list, free list, rseg, log_sys, and internal hash table sync object array mutexes. The following RW locks have been converted to priority RW locks: fsp, page_hash, AHI, index, and purge. To specify which threads are high-priority for shared resource acquisition, |Percona Server| has introduced several tuning options, only available in builds compiled with the ``UNIV_PERF_DEBUG`` C preprocessor define.
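A simplified pthreads-based sketch of a priority mutex with the described wakeup policy (the actual implementation reuses the InnoDB sync array rather than pthreads; this is an illustration only):

.. code-block:: C

   #include <pthread.h>
   #include <stdbool.h>

   typedef struct {
       pthread_mutex_t lock;      /* protects the state below */
       pthread_cond_t  high_cv;   /* high-priority waiters */
       pthread_cond_t  low_cv;    /* low-priority waiters */
       bool            held;
       int             high_waiters;
   } prio_mutex_t;

   void prio_mutex_enter(prio_mutex_t *m, bool high_priority)
   {
       pthread_mutex_lock(&m->lock);
       if (high_priority) {
           m->high_waiters++;
           while (m->held)
               pthread_cond_wait(&m->high_cv, &m->lock);
           m->high_waiters--;
       } else {
           /* Low-priority threads stand down (here: wait on a separate
              queue) while any high-priority waiters are present. */
           while (m->held || m->high_waiters > 0)
               pthread_cond_wait(&m->low_cv, &m->lock);
       }
       m->held = true;
       pthread_mutex_unlock(&m->lock);
   }

   void prio_mutex_exit(prio_mutex_t *m)
   {
       pthread_mutex_lock(&m->lock);
       m->held = false;
       /* Wake high-priority waiters before any low-priority ones. */
       if (m->high_waiters > 0)
           pthread_cond_signal(&m->high_cv);
       else
           pthread_cond_signal(&m->low_cv);
       pthread_mutex_unlock(&m->lock);
   }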
242+
243+.. variable:: innodb_priority_purge
244+
245+ :version 5.6.13-61.0: Introduced.
246+ :cli: Yes
247+ :conf: Yes
248+ :scope: Global
249+ :dyn: Yes
250+ :vartype: Boolean
251+
252+When this option is enabled, the purge coordinator and worker threads acquire shared resources with priority.
253+
254+.. variable:: innodb_priority_io
255+
256+ :version 5.6.13-61.0: Introduced.
257+ :cli: Yes
258+ :conf: Yes
259+ :scope: Global
260+ :dyn: Yes
261+ :vartype: Boolean
262+
263+When this option is enabled, the I/O threads acquire shared resources with priority.
264+
265+.. variable:: innodb_priority_cleaner
266+
267+ :version 5.6.13-61.0: Introduced.
268+ :cli: Yes
269+ :conf: Yes
270+ :scope: Global
271+ :dyn: Yes
272+ :vartype: Boolean
273+
274+When this option is enabled, the buffer pool cleaner thread acquires shared resources with priority.
275+
276+.. variable:: innodb_priority_master
277+
278+ :version 5.6.13-61.0: Introduced.
279+ :cli: Yes
280+ :conf: Yes
281+ :scope: Global
282+ :dyn: Yes
283+ :vartype: Boolean
284+
285+When this option is enabled, the master thread acquires shared resources with priority.
286+
287+.. note::
288+
289+ These variables are intended for performance experimenting and not regular user tuning.
290+
291+Other Reading
292+=============
293+* :ref:`page_cleaner_tuning`
