Merge lp:~lifeless/lp-dev-utils/ppr into lp:lp-dev-utils
Proposed by: Robert Collins
Status: Merged
Approved by: Robert Collins
Approved revision: 124
Merged at revision: 124
Proposed branch: lp:~lifeless/lp-dev-utils/ppr
Merge into: lp:lp-dev-utils
Diff against target: 2543 lines (+2463/-2), 13 files modified
  .bzrignore (+7/-0), .testr.conf (+3/-2), Makefile (+13/-0), README (+7/-0),
  bootstrap.py (+259/-0), buildout.cfg (+38/-0),
  page-performance-report-daily.sh (+115/-0),
  page-performance-report.ini (+79/-0), page-performance-report.py (+18/-0),
  pageperformancereport.py (+1277/-0), setup.py (+50/-0),
  test_pageperformancereport.py (+486/-0), versions.cfg (+111/-0)
To merge this branch: bzr merge lp:~lifeless/lp-dev-utils/ppr
Related bugs:
Reviewer: William Grant (code), status: Approve
Review via email: mp+118870@code.launchpad.net
Commit message
Description of the change
This branch:
- updates .testr.conf to support parallel test runs.
- adds buildout to let us use zc packages (but keeps it optional).
- migrates the page performance report.
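The .testr.conf change is what enables the parallel runs: testr first asks the test command to enumerate test ids (test_list_option=--list), partitions that list, and then starts one worker per partition, feeding each worker its subset via --load-list $IDFILE. A minimal sketch of the partitioning step (illustrative only; the function name is hypothetical, and testrepository's real scheduler also balances partitions using stored timing data):

```python
def partition_tests(test_ids, workers):
    """Round-robin test ids across workers.

    A simple stand-in for testrepository's scheduler; each returned
    sublist corresponds to one $IDFILE passed via --load-list.
    """
    partitions = [[] for _ in range(workers)]
    for i, test_id in enumerate(test_ids):
        partitions[i % workers].append(test_id)
    return partitions


ids = ['test_a', 'test_b', 'test_c', 'test_d', 'test_e']
for n, part in enumerate(partition_tests(ids, 2)):
    print(n, part)
```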
Revision history for this message
Robert Collins (lifeless) wrote:
(For clarity, william doesn't condone it anywhere :P)
Preview Diff
1 | === modified file '.bzrignore' |
2 | --- .bzrignore 2012-04-13 15:33:03 +0000 |
3 | +++ .bzrignore 2012-08-09 04:56:19 +0000 |
4 | @@ -1,3 +1,10 @@ |
5 | .launchpadlib |
6 | _trial_temp |
7 | .testrepository |
8 | +.installed.cfg |
9 | +eggs |
10 | +download-cache |
11 | +lp_dev_utils.egg-info |
12 | +parts |
13 | +bin |
14 | +develop-eggs |
15 | |
16 | === modified file '.testr.conf' |
17 | --- .testr.conf 2012-04-13 15:08:57 +0000 |
18 | +++ .testr.conf 2012-08-09 04:56:19 +0000 |
19 | @@ -1,3 +1,4 @@ |
20 | [DEFAULT] |
21 | -test_command=PYTHONPATH=.:$PYTHONPATH python -m subunit.run discover $IDLIST |
22 | -test_id_list_default=ec2test |
23 | +test_command=${PYTHON:-python} -m subunit.run discover $LISTOPT $IDOPTION . |
24 | +test_id_option=--load-list $IDFILE |
25 | +test_list_option=--list |
26 | |
27 | === added file 'Makefile' |
28 | --- Makefile 1970-01-01 00:00:00 +0000 |
29 | +++ Makefile 2012-08-09 04:56:19 +0000 |
30 | @@ -0,0 +1,13 @@ |
31 | +all: |
32 | + |
33 | +bin/buildout: buildout.cfg versions.cfg setup.py download-cache eggs |
34 | + ./bootstrap.py \ |
35 | + --setup-source=download-cache/ez_setup.py \ |
36 | + --download-base=download-cache/dist --eggs=eggs |
37 | + |
38 | + |
39 | +download-cache: |
40 | + bzr checkout --lightweight lp:lp-source-dependencies download-cache |
41 | + |
42 | +eggs: |
43 | + mkdir eggs |
44 | |
45 | === modified file 'README' |
46 | --- README 2012-04-13 15:08:57 +0000 |
47 | +++ README 2012-08-09 04:56:19 +0000 |
48 | @@ -1,6 +1,7 @@ |
49 | ============== |
50 | lp-dev-utils |
51 | ============== |
52 | + |
53 | Tools for hacking on Launchpad |
54 | ============================== |
55 | |
56 | @@ -40,3 +41,9 @@ |
57 | Ran 84 (+84) tests in 51.723s (+51.651s) |
58 | FAILED (id=1) |
59 | |
60 | +To run the pageperformancereport tests, zc.zservertracelog is needed; this is |
61 | +best obtained via buildout:: |
62 | + |
63 | + $ make bin/buildout |
64 | + $ bin/buildout |
65 | + $ PYTHON=bin/py testr run |
66 | |
67 | === added file 'bootstrap.py' |
68 | --- bootstrap.py 1970-01-01 00:00:00 +0000 |
69 | +++ bootstrap.py 2012-08-09 04:56:19 +0000 |
70 | @@ -0,0 +1,259 @@ |
71 | +#!/usr/bin/env python |
72 | +############################################################################## |
73 | +# |
74 | +# Copyright (c) 2006 Zope Foundation and Contributors. |
75 | +# All Rights Reserved. |
76 | +# |
77 | +# This software is subject to the provisions of the Zope Public License, |
78 | +# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. |
79 | +# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED |
80 | +# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED |
81 | +# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS |
82 | +# FOR A PARTICULAR PURPOSE. |
83 | +# |
84 | +############################################################################## |
85 | +"""Bootstrap a buildout-based project |
86 | + |
87 | +Simply run this script in a directory containing a buildout.cfg. |
88 | +The script accepts buildout command-line options, so you can |
89 | +use the -c option to specify an alternate configuration file. |
90 | +""" |
91 | + |
92 | +import os, shutil, sys, tempfile, textwrap, urllib, urllib2, subprocess |
93 | +from optparse import OptionParser |
94 | + |
95 | +if sys.platform == 'win32': |
96 | + def quote(c): |
97 | + if ' ' in c: |
98 | + return '"%s"' % c # work around spawn lamosity on windows |
99 | + else: |
100 | + return c |
101 | +else: |
102 | + quote = str |
103 | + |
104 | +# See zc.buildout.easy_install._has_broken_dash_S for motivation and comments. |
105 | +stdout, stderr = subprocess.Popen( |
106 | + [sys.executable, '-Sc', |
107 | + 'try:\n' |
108 | + ' import ConfigParser\n' |
109 | + 'except ImportError:\n' |
110 | + ' print 1\n' |
111 | + 'else:\n' |
112 | + ' print 0\n'], |
113 | + stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate() |
114 | +has_broken_dash_S = bool(int(stdout.strip())) |
115 | + |
116 | +# In order to be more robust in the face of system Pythons, we want to |
117 | +# run without site-packages loaded. This is somewhat tricky, in |
118 | +# particular because Python 2.6's distutils imports site, so starting |
119 | +# with the -S flag is not sufficient. However, we'll start with that: |
120 | +if not has_broken_dash_S and 'site' in sys.modules: |
121 | + # We will restart with python -S. |
122 | + args = sys.argv[:] |
123 | + args[0:0] = [sys.executable, '-S'] |
124 | + args = map(quote, args) |
125 | + os.execv(sys.executable, args) |
126 | +# Now we are running with -S. We'll get the clean sys.path, import site |
127 | +# because distutils will do it later, and then reset the path and clean |
128 | +# out any namespace packages from site-packages that might have been |
129 | +# loaded by .pth files. |
130 | +clean_path = sys.path[:] |
131 | +import site |
132 | +sys.path[:] = clean_path |
133 | +for k, v in sys.modules.items(): |
134 | + if (hasattr(v, '__path__') and |
135 | + len(v.__path__)==1 and |
136 | + not os.path.exists(os.path.join(v.__path__[0],'__init__.py'))): |
137 | + # This is a namespace package. Remove it. |
138 | + sys.modules.pop(k) |
139 | + |
140 | +is_jython = sys.platform.startswith('java') |
141 | + |
142 | +setuptools_source = 'http://peak.telecommunity.com/dist/ez_setup.py' |
143 | +distribute_source = 'http://python-distribute.org/distribute_setup.py' |
144 | + |
145 | +# parsing arguments |
146 | +def normalize_to_url(option, opt_str, value, parser): |
147 | + if value: |
148 | + if '://' not in value: # It doesn't smell like a URL. |
149 | + value = 'file://%s' % ( |
150 | + urllib.pathname2url( |
151 | + os.path.abspath(os.path.expanduser(value))),) |
152 | + if opt_str == '--download-base' and not value.endswith('/'): |
153 | + # Download base needs a trailing slash to make the world happy. |
154 | + value += '/' |
155 | + else: |
156 | + value = None |
157 | + name = opt_str[2:].replace('-', '_') |
158 | + setattr(parser.values, name, value) |
159 | + |
160 | +usage = '''\ |
161 | +[DESIRED PYTHON FOR BUILDOUT] bootstrap.py [options] |
162 | + |
163 | +Bootstraps a buildout-based project. |
164 | + |
165 | +Simply run this script in a directory containing a buildout.cfg, using the |
166 | +Python that you want bin/buildout to use. |
167 | + |
168 | +Note that by using --setup-source and --download-base to point to |
169 | +local resources, you can keep this script from going over the network. |
170 | +''' |
171 | + |
172 | +parser = OptionParser(usage=usage) |
173 | +parser.add_option("-v", "--version", dest="version", |
174 | + help="use a specific zc.buildout version") |
175 | +parser.add_option("-d", "--distribute", |
176 | + action="store_true", dest="use_distribute", default=False, |
177 | + help="Use Distribute rather than Setuptools.") |
178 | +parser.add_option("--setup-source", action="callback", dest="setup_source", |
179 | + callback=normalize_to_url, nargs=1, type="string", |
180 | + help=("Specify a URL or file location for the setup file. " |
181 | + "If you use Setuptools, this will default to " + |
182 | + setuptools_source + "; if you use Distribute, this " |
183 | + "will default to " + distribute_source +".")) |
184 | +parser.add_option("--download-base", action="callback", dest="download_base", |
185 | + callback=normalize_to_url, nargs=1, type="string", |
186 | + help=("Specify a URL or directory for downloading " |
187 | + "zc.buildout and either Setuptools or Distribute. " |
188 | + "Defaults to PyPI.")) |
189 | +parser.add_option("--eggs", |
190 | + help=("Specify a directory for storing eggs. Defaults to " |
191 | + "a temporary directory that is deleted when the " |
192 | + "bootstrap script completes.")) |
193 | +parser.add_option("-t", "--accept-buildout-test-releases", |
194 | + dest='accept_buildout_test_releases', |
195 | + action="store_true", default=False, |
196 | + help=("Normally, if you do not specify a --version, the " |
197 | + "bootstrap script and buildout gets the newest " |
198 | + "*final* versions of zc.buildout and its recipes and " |
199 | + "extensions for you. If you use this flag, " |
200 | + "bootstrap and buildout will get the newest releases " |
201 | + "even if they are alphas or betas.")) |
202 | +parser.add_option("-c", None, action="store", dest="config_file", |
203 | + help=("Specify the path to the buildout configuration " |
204 | + "file to be used.")) |
205 | + |
206 | +options, args = parser.parse_args() |
207 | + |
208 | +# if -c was provided, we push it back into args for buildout's main function |
209 | +if options.config_file is not None: |
210 | + args += ['-c', options.config_file] |
211 | + |
212 | +if options.eggs: |
213 | + eggs_dir = os.path.abspath(os.path.expanduser(options.eggs)) |
214 | +else: |
215 | + eggs_dir = tempfile.mkdtemp() |
216 | + |
217 | +if options.setup_source is None: |
218 | + if options.use_distribute: |
219 | + options.setup_source = distribute_source |
220 | + else: |
221 | + options.setup_source = setuptools_source |
222 | + |
223 | +if options.accept_buildout_test_releases: |
224 | + args.append('buildout:accept-buildout-test-releases=true') |
225 | +args.append('bootstrap') |
226 | + |
227 | +try: |
228 | + import pkg_resources |
229 | + import setuptools # A flag. Sometimes pkg_resources is installed alone. |
230 | + if not hasattr(pkg_resources, '_distribute'): |
231 | + raise ImportError |
232 | +except ImportError: |
233 | + ez_code = urllib2.urlopen( |
234 | + options.setup_source).read().replace('\r\n', '\n') |
235 | + ez = {} |
236 | + exec ez_code in ez |
237 | + setup_args = dict(to_dir=eggs_dir, download_delay=0) |
238 | + if options.download_base: |
239 | + setup_args['download_base'] = options.download_base |
240 | + if options.use_distribute: |
241 | + setup_args['no_fake'] = True |
242 | + ez['use_setuptools'](**setup_args) |
243 | + reload(sys.modules['pkg_resources']) |
244 | + import pkg_resources |
245 | + # This does not (always?) update the default working set. We will |
246 | + # do it. |
247 | + for path in sys.path: |
248 | + if path not in pkg_resources.working_set.entries: |
249 | + pkg_resources.working_set.add_entry(path) |
250 | + |
251 | +cmd = [quote(sys.executable), |
252 | + '-c', |
253 | + quote('from setuptools.command.easy_install import main; main()'), |
254 | + '-mqNxd', |
255 | + quote(eggs_dir)] |
256 | + |
257 | +if not has_broken_dash_S: |
258 | + cmd.insert(1, '-S') |
259 | + |
260 | +find_links = options.download_base |
261 | +if not find_links: |
262 | + find_links = os.environ.get('bootstrap-testing-find-links') |
263 | +if find_links: |
264 | + cmd.extend(['-f', quote(find_links)]) |
265 | + |
266 | +if options.use_distribute: |
267 | + setup_requirement = 'distribute' |
268 | +else: |
269 | + setup_requirement = 'setuptools' |
270 | +ws = pkg_resources.working_set |
271 | +setup_requirement_path = ws.find( |
272 | + pkg_resources.Requirement.parse(setup_requirement)).location |
273 | +env = dict( |
274 | + os.environ, |
275 | + PYTHONPATH=setup_requirement_path) |
276 | + |
277 | +requirement = 'zc.buildout' |
278 | +version = options.version |
279 | +if version is None and not options.accept_buildout_test_releases: |
280 | + # Figure out the most recent final version of zc.buildout. |
281 | + import setuptools.package_index |
282 | + _final_parts = '*final-', '*final' |
283 | + def _final_version(parsed_version): |
284 | + for part in parsed_version: |
285 | + if (part[:1] == '*') and (part not in _final_parts): |
286 | + return False |
287 | + return True |
288 | + index = setuptools.package_index.PackageIndex( |
289 | + search_path=[setup_requirement_path]) |
290 | + if find_links: |
291 | + index.add_find_links((find_links,)) |
292 | + req = pkg_resources.Requirement.parse(requirement) |
293 | + if index.obtain(req) is not None: |
294 | + best = [] |
295 | + bestv = None |
296 | + for dist in index[req.project_name]: |
297 | + distv = dist.parsed_version |
298 | + if _final_version(distv): |
299 | + if bestv is None or distv > bestv: |
300 | + best = [dist] |
301 | + bestv = distv |
302 | + elif distv == bestv: |
303 | + best.append(dist) |
304 | + if best: |
305 | + best.sort() |
306 | + version = best[-1].version |
307 | +if version: |
308 | + requirement = '=='.join((requirement, version)) |
309 | +cmd.append(requirement) |
310 | + |
311 | +if is_jython: |
312 | + import subprocess |
313 | + exitcode = subprocess.Popen(cmd, env=env).wait() |
314 | +else: # Windows prefers this, apparently; otherwise we would prefer subprocess |
315 | + exitcode = os.spawnle(*([os.P_WAIT, sys.executable] + cmd + [env])) |
316 | +if exitcode != 0: |
317 | + sys.stdout.flush() |
318 | + sys.stderr.flush() |
319 | + print ("An error occurred when trying to install zc.buildout. " |
320 | + "Look above this message for any errors that " |
321 | + "were output by easy_install.") |
322 | + sys.exit(exitcode) |
323 | + |
324 | +ws.add_entry(eggs_dir) |
325 | +ws.require(requirement) |
326 | +import zc.buildout.buildout |
327 | +zc.buildout.buildout.main(args) |
328 | +if not options.eggs: # clean up temporary egg directory |
329 | + shutil.rmtree(eggs_dir) |
330 | |
331 | === added file 'buildout.cfg' |
332 | --- buildout.cfg 1970-01-01 00:00:00 +0000 |
333 | +++ buildout.cfg 2012-08-09 04:56:19 +0000 |
334 | @@ -0,0 +1,38 @@ |
335 | +# Copyright 2011 Canonical Ltd. This software is licensed under the |
336 | +# GNU Lesser General Public License version 3 (see the file LICENSE). |
337 | + |
338 | +[buildout] |
339 | +parts = |
340 | + scripts |
341 | +unzip = true |
342 | +eggs-directory = eggs |
343 | +download-cache = download-cache |
344 | +relative-paths = true |
345 | + |
346 | +# Disable this option temporarily if you want buildout to find software |
347 | +# dependencies *other* than those in our download-cache. Once you have the |
348 | +# desired software, reenable this option (and check in the new software to |
349 | +# lp:lp-source-dependencies if this is going to be reviewed/merged/deployed.) |
350 | +install-from-cache = true |
351 | + |
352 | +# This also will need to be temporarily disabled or changed for package |
353 | +# upgrades. Newly-added packages should also add their desired version number |
354 | +# to versions.cfg. |
355 | +extends = versions.cfg |
356 | + |
357 | +allow-picked-versions = false |
358 | + |
359 | +prefer-final = true |
360 | + |
361 | +develop = . |
362 | + |
363 | +# [configuration] |
364 | +# instance_name = development |
365 | + |
366 | +[scripts] |
367 | +recipe = z3c.recipe.scripts |
368 | +eggs = lp-dev-utils [test] |
369 | +include-site-packages = true |
370 | +allowed-eggs-from-site-packages = |
371 | + subunit |
372 | +interpreter = py |
373 | |
374 | === added file 'page-performance-report-daily.sh' |
375 | --- page-performance-report-daily.sh 1970-01-01 00:00:00 +0000 |
376 | +++ page-performance-report-daily.sh 2012-08-09 04:56:19 +0000 |
377 | @@ -0,0 +1,115 @@ |
378 | +#!/bin/sh |
379 | + |
380 | +#TZ=UTC # trace logs are still BST - blech |
381 | + |
382 | +CATEGORY=lpnet |
383 | +LOGS_ROOTS="/srv/launchpad.net-logs/production /srv/launchpad.net-logs/edge" |
384 | +OUTPUT_ROOT=${HOME}/public_html/ppr/lpnet |
385 | +DAY_FMT="+%Y-%m-%d" |
386 | + |
387 | +find_logs() { |
388 | + from=$1 |
389 | + until=$2 |
390 | + |
391 | + end_mtime_switch= |
392 | + days_to_end="$(expr `date +%j` - `date -d $until +%j` - 1)" |
393 | + if [ $days_to_end -gt 0 ]; then |
394 | + end_mtime_switch="-daystart -mtime +$days_to_end" |
395 | + fi |
396 | + |
397 | + find ${LOGS_ROOTS} \ |
398 | + -maxdepth 2 -type f -newermt "$from - 1 day" $end_mtime_switch \ |
399 | + -name launchpad-trace\* \ |
400 | + | sort | xargs -x |
401 | +} |
402 | + |
403 | +# Find all the daily stats.pck.bz2 $from $until |
404 | +find_stats() { |
405 | + from=$1 |
406 | + until=$2 |
407 | + |
408 | + # Build a string of all the days within range. |
409 | + local dates |
410 | + local day |
411 | + day=$from |
412 | + while [ $day != $until ]; do |
413 | + dates="$dates $day" |
414 | + day=`date $DAY_FMT -d "$day + 1 day"` |
415 | + done |
416 | + |
417 | + # Use that to build a regex that will be used to select |
418 | + # the files to use. |
419 | + local regex |
420 | + regex="daily_(`echo $dates |sed -e 's/ /|/g'`)" |
421 | + |
422 | + find ${OUTPUT_ROOT} -name 'stats.pck.bz2' | egrep $regex |
423 | +} |
424 | + |
425 | +report() { |
426 | + type=$1 |
427 | + from=$2 |
428 | + until=$3 |
429 | + link=$4 |
430 | + |
431 | + local files |
432 | + local options |
433 | + if [ "$type" = "daily" ]; then |
434 | + files=`find_logs $from $until` |
435 | + options="--from=$from --until=$until" |
436 | + else |
437 | + files=`find_stats $from $until` |
438 | + options="--merge" |
439 | + fi |
440 | + |
441 | + local dir |
442 | + dir=${OUTPUT_ROOT}/`date -d $from +%Y-%m`/${type}_${from}_${until} |
443 | + mkdir -p ${dir} |
444 | + |
445 | + echo Generating report from $from until $until into $dir `date` |
446 | + |
447 | + ./page-performance-report.py -v --top-urls=200 --directory=${dir} \ |
448 | + $options $files |
449 | + |
450 | + # Only do the linking if requested. |
451 | + if [ "$link" = "link" ]; then |
452 | + ln -sf ${dir}/partition.html \ |
453 | + ${OUTPUT_ROOT}/latest-${type}-partition.html |
454 | + ln -sf ${dir}/categories.html \ |
455 | + ${OUTPUT_ROOT}/latest-${type}-categories.html |
456 | + ln -sf ${dir}/pageids.html \ |
457 | + ${OUTPUT_ROOT}/latest-${type}-pageids.html |
458 | + ln -sf ${dir}/combined.html \ |
459 | + ${OUTPUT_ROOT}/latest-${type}-combined.html |
460 | + ln -sf ${dir}/metrics.dat ${OUTPUT_ROOT}/latest-${type}-metrics.dat |
461 | + ln -sf ${dir}/top200.html ${OUTPUT_ROOT}/latest-${type}-top200.html |
462 | + ln -sf ${dir}/timeout-candidates.html \ |
463 | + ${OUTPUT_ROOT}/latest-${type}-timeout-candidates.html |
464 | + fi |
465 | + |
466 | + return 0 |
467 | +} |
468 | + |
469 | +link= |
470 | +if [ "$3" = "-l" ]; then |
471 | + link="link" |
472 | +fi |
473 | + |
474 | +if [ "$1" = '-d' ]; then |
475 | + report daily `date -d $2 $DAY_FMT` `date -d "$2 + 1 day" $DAY_FMT` $link |
476 | +elif [ "$1" = '-w' ]; then |
477 | + report weekly `date -d $2 $DAY_FMT` `date -d "$2 + 1 week" $DAY_FMT` $link |
478 | +elif [ "$1" = '-m' ]; then |
479 | + report monthly `date -d $2 $DAY_FMT` `date -d "$2 + 1 month" $DAY_FMT` $link |
480 | +else |
481 | + # Default invocation used from cron to generate latest one. |
482 | + now=`date $DAY_FMT` |
483 | + report daily `date -d yesterday $DAY_FMT` $now link |
484 | + |
485 | + if [ `date +%a` = 'Sun' ]; then |
486 | + report weekly `date -d 'last week' $DAY_FMT` $now link |
487 | + fi |
488 | + |
489 | + if [ `date +%d` = '01' ]; then |
490 | + report monthly `date -d 'last month' $DAY_FMT` $now link |
491 | + fi |
492 | +fi |
493 | |
494 | === added file 'page-performance-report.ini' |
495 | --- page-performance-report.ini 1970-01-01 00:00:00 +0000 |
496 | +++ page-performance-report.ini 2012-08-09 04:56:19 +0000 |
497 | @@ -0,0 +1,79 @@ |
498 | +[categories] |
499 | +# Category -> Python regular expression. |
3 | +# Remember to quote ?, ., + & ? characters to match literally. |
501 | +# 'kodos' is useful for interactively testing regular expressions. |
502 | +All Launchpad=. |
503 | +All Launchpad except operational pages=(?<!\+opstats|\+haproxy)$ |
504 | + |
505 | +API=(^https?://api\.|/\+access-token$) |
506 | +Operational=(\+opstats|\+haproxy)$ |
507 | +Web (Non API/non operational/non XML-RPC)=^https?://(?!api\.) |
508 | + [^/]+($|/ |
509 | + (?!\+haproxy|\+opstats|\+access-token |
510 | + |((authserver|bugs|bazaar|codehosting| |
511 | + codeimportscheduler|mailinglists|softwarecenteragent| |
512 | + featureflags)/\w+$))) |
513 | +Other=^/ |
514 | + |
515 | +Launchpad Frontpage=^https?://launchpad\.[^/]+(/index\.html)?$ |
516 | + |
517 | +# Note that the bug text dump is served on the main launchpad domain |
18 | +# and we need to exclude it from the registry stats. |
519 | +Registry=^https?://launchpad\..*(?<!/\+text)(?<!/\+access-token)$ |
520 | +Registry - Person Index=^https?://launchpad\.[^/]+/%7E[^/]+(/\+index)?$ |
521 | +Registry - Pillar Index=^https?://launchpad\.[^/]+/\w[^/]*(/\+index)?$ |
522 | + |
523 | +Answers=^https?://answers\. |
524 | +Answers - Front page=^https?://answers\.[^/]+(/questions/\+index)?$ |
525 | + |
526 | +Blueprints=^https?://blueprints\. |
527 | +Blueprints - Front page=^https?://blueprints\.[^/]+(/specs/\+index)?$ |
528 | + |
529 | +# Note that the bug text dump is not served on the bugs domain, |
530 | +# probably for hysterical reasons. This is why the bugs regexp is |
531 | +# confusing. |
532 | +Bugs=^https?://(bugs\.|.+/bugs/\d+/\+text$) |
533 | +Bugs - Front page=^https?://bugs\.[^/]+(/bugs/\+index)?$ |
534 | +Bugs - Bug Page=^https?://bugs\.[^/]+/.+/\+bug/\d+(/\+index)?$ |
535 | +Bugs - Pillar Index=^https?://bugs\.[^/]+/\w[^/]*(/\+bugs-index)?$ |
536 | +Bugs - Search=^https?://bugs\.[^/]+/.+/\+bugs$ |
537 | +Bugs - Text Dump=^https?://launchpad\..+/\+text$ |
538 | + |
539 | +Code=^https?://code\. |
540 | +Code - Front page=^https?://code\.[^/]+(/\+code/\+index)?$ |
541 | +Code - Pillar Branches=^https?://code\.[^/]+/\w[^/]*(/\+code-index)?$ |
542 | +Code - Branch Page=^https?://code\.[^/]+/%7E[^/]+/[^/]+/[^/]+(/\+index)?$ |
543 | +Code - Merge Proposal=^https?://code\.[^/]+/.+/\+merge/\d+(/\+index)$ |
544 | + |
545 | +Soyuz - PPA Index=^https?://launchpad\.[^/]+/.+/\+archive/[^/]+(/\+index)?$ |
546 | + |
547 | +Translations=^https?://translations\. |
548 | +Translations - Front page=^https?://translations\.[^/]+/translations/\+index$ |
549 | +Translations - Overview=^https?://translations\..*/\+lang/\w+(/\+index)?$ |
550 | + |
551 | +Public XML-RPC=^https://(launchpad|xmlrpc)[^/]+/bazaar/\w+$ |
552 | +Private XML-RPC=^https://(launchpad|xmlrpc)[^/]+/ |
553 | + (authserver|bugs|codehosting| |
554 | + codeimportscheduler|mailinglists| |
555 | + softwarecenteragent|featureflags)/\w+$ |
556 | + |
557 | +[metrics] |
558 | +ppr_all=All Launchpad except operational pages |
559 | +ppr_web=Web (Non API/non operational/non XML-RPC) |
560 | +ppr_operational=Operational |
561 | +ppr_bugs=Bugs |
562 | +ppr_api=API |
563 | +ppr_code=Code |
564 | +ppr_public_xmlrpc=Public XML-RPC |
565 | +ppr_private_xmlrpc=Private XML-RPC |
566 | +ppr_translations=Translations |
567 | +ppr_registry=Registry |
568 | +ppr_other=Other |
569 | + |
570 | +[partition] |
571 | +API= |
572 | +Operational= |
573 | +Private XML-RPC= |
574 | +Public XML-RPC= |
575 | +Web (Non API/non operational/non XML-RPC)= |
576 | +Other= |
577 | |
578 | === added file 'page-performance-report.py' |
579 | --- page-performance-report.py 1970-01-01 00:00:00 +0000 |
580 | +++ page-performance-report.py 2012-08-09 04:56:19 +0000 |
581 | @@ -0,0 +1,18 @@ |
582 | +#!/usr/bin/python -S |
583 | +# |
584 | +# Copyright 2010 Canonical Ltd. This software is licensed under the |
585 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
586 | + |
587 | +"""Page performance report generated from zserver tracelogs.""" |
588 | + |
589 | +__metaclass__ = type |
590 | + |
591 | +import _pythonpath |
592 | + |
593 | +import sys |
594 | + |
595 | +from lp.scripts.utilities.pageperformancereport import main |
596 | + |
597 | + |
598 | +if __name__ == '__main__': |
599 | + sys.exit(main()) |
600 | |
601 | === added file 'pageperformancereport.py' |
602 | --- pageperformancereport.py 1970-01-01 00:00:00 +0000 |
603 | +++ pageperformancereport.py 2012-08-09 04:56:19 +0000 |
604 | @@ -0,0 +1,1277 @@ |
605 | +# Copyright 2010 Canonical Ltd. This software is licensed under the |
606 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
607 | + |
608 | +"""Page performance report generated from zserver trace logs.""" |
609 | + |
610 | +__metaclass__ = type |
611 | +__all__ = ['main'] |
612 | + |
613 | +import bz2 |
614 | +from cgi import escape as html_quote |
615 | +from ConfigParser import RawConfigParser |
616 | +import copy |
617 | +import cPickle |
618 | +import csv |
619 | +from datetime import datetime |
620 | +import gzip |
621 | +import logging |
622 | +import math |
623 | +import optparse |
624 | +import os.path |
625 | +import re |
626 | +import textwrap |
627 | +from textwrap import dedent |
628 | +import time |
629 | + |
630 | +import simplejson as json |
631 | +import sre_constants |
632 | +import zc.zservertracelog.tracereport |
633 | + |
634 | +logging.basicConfig() |
635 | +log = logging |
636 | + |
637 | + |
638 | +def _check_datetime(option, opt, value): |
639 | + "Type checker for optparse datetime option type." |
640 | + # We support 5 valid ISO8601 formats. |
641 | + formats = [ |
642 | + '%Y-%m-%dT%H:%M:%S', |
643 | + '%Y-%m-%dT%H:%M', |
644 | + '%Y-%m-%d %H:%M:%S', |
645 | + '%Y-%m-%d %H:%M', |
646 | + '%Y-%m-%d', |
647 | + ] |
648 | + for format in formats: |
649 | + try: |
650 | + return datetime.strptime(value, format) |
651 | + except ValueError: |
652 | + pass |
653 | + raise optparse.OptionValueError( |
654 | + "option %s: invalid datetime value: %r" % (opt, value)) |
655 | + |
656 | + |
657 | +class Option(optparse.Option): |
658 | + """Extended optparse Option class. |
659 | + |
660 | + Adds a 'datetime' option type. |
661 | + """ |
662 | + TYPES = optparse.Option.TYPES + ("datetime", datetime) |
663 | + TYPE_CHECKER = copy.copy(optparse.Option.TYPE_CHECKER) |
664 | + TYPE_CHECKER["datetime"] = _check_datetime |
665 | + TYPE_CHECKER[datetime] = _check_datetime |
666 | + |
667 | + |
668 | +class OptionParser(optparse.OptionParser): |
669 | + """Extended optparse OptionParser. |
670 | + |
671 | + Adds a 'datetime' option type. |
672 | + """ |
673 | + |
674 | + def __init__(self, *args, **kw): |
675 | + kw.setdefault('option_class', Option) |
676 | + optparse.OptionParser.__init__(self, *args, **kw) |
677 | + |
678 | + |
679 | +class Request(zc.zservertracelog.tracereport.Request): |
680 | + url = None |
681 | + pageid = None |
682 | + ticks = None |
683 | + sql_statements = None |
684 | + sql_seconds = None |
685 | + |
686 | + # Override the broken version in our superclass that always |
687 | + # returns an integer. |
688 | + @property |
689 | + def app_seconds(self): |
690 | + interval = self.app_time - self.start_app_time |
691 | + return interval.seconds + interval.microseconds / 1000000.0 |
692 | + |
693 | + # Override the broken version in our superclass that always |
694 | + # returns an integer. |
695 | + @property |
696 | + def total_seconds(self): |
697 | + interval = self.end - self.start |
698 | + return interval.seconds + interval.microseconds / 1000000.0 |
699 | + |
700 | + |
701 | +class Category: |
702 | + """A Category in our report. |
703 | + |
704 | + Requests belong to a Category if the URL matches a regular expression. |
705 | + """ |
706 | + |
707 | + def __init__(self, title, regexp): |
708 | + self.title = title |
709 | + self.regexp = regexp |
710 | + self._compiled_regexp = re.compile(regexp, re.I | re.X) |
711 | + self.partition = False |
712 | + |
713 | + def match(self, request): |
14 | + """Return true when the request matches this category.""" |
715 | + return self._compiled_regexp.search(request.url) is not None |
716 | + |
717 | + def __cmp__(self, other): |
718 | + return cmp(self.title.lower(), other.title.lower()) |
719 | + |
720 | + def __deepcopy__(self, memo): |
721 | + # We provide __deepcopy__ because the module doesn't handle |
722 | + # compiled regular expression by default. |
723 | + return Category(self.title, self.regexp) |
724 | + |
725 | + |
726 | +class OnlineStatsCalculator: |
727 | + """Object that can compute count, sum, mean, variance and median. |
728 | + |
729 | + It computes these value incrementally and using minimal storage |
730 | + using the Welford / Knuth algorithm described at |
731 | + http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm |
732 | + """ |
733 | + |
734 | + def __init__(self): |
735 | + self.count = 0 |
736 | + self.sum = 0 |
737 | + self.M2 = 0.0 # Sum of square difference |
738 | + self.mean = 0.0 |
739 | + |
740 | + def update(self, x): |
741 | + """Incrementally update the stats when adding x to the set. |
742 | + |
743 | + None values are ignored. |
744 | + """ |
745 | + if x is None: |
746 | + return |
747 | + self.count += 1 |
748 | + self.sum += x |
749 | + delta = x - self.mean |
750 | + self.mean = float(self.sum)/self.count |
751 | + self.M2 += delta*(x - self.mean) |
752 | + |
753 | + @property |
754 | + def variance(self): |
755 | + """Return the population variance.""" |
756 | + if self.count == 0: |
757 | + return 0 |
758 | + else: |
759 | + return self.M2/self.count |
760 | + |
761 | + @property |
762 | + def std(self): |
763 | + """Return the standard deviation.""" |
764 | + if self.count == 0: |
765 | + return 0 |
766 | + else: |
767 | + return math.sqrt(self.variance) |
768 | + |
769 | + def __add__(self, other): |
770 | + """Adds this and another OnlineStatsCalculator. |
771 | + |
772 | + The result combines the stats of the two objects. |
773 | + """ |
774 | + results = OnlineStatsCalculator() |
775 | + results.count = self.count + other.count |
776 | + results.sum = self.sum + other.sum |
777 | + if self.count > 0 and other.count > 0: |
778 | + # This is 2.1b in Chan, Tony F.; Golub, Gene H.; LeVeque, |
779 | + # Randall J. (1979), "Updating Formulae and a Pairwise Algorithm |
780 | + # for Computing Sample Variances.", |
781 | + # Technical Report STAN-CS-79-773, |
782 | + # Department of Computer Science, Stanford University, |
783 | + # ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf . |
784 | + results.M2 = self.M2 + other.M2 + ( |
785 | + (float(self.count) / (other.count * results.count)) * |
786 | + ((float(other.count) / self.count) * self.sum - other.sum)**2) |
787 | + else: |
788 | + results.M2 = self.M2 + other.M2 # One of them is 0. |
789 | + if results.count > 0: |
790 | + results.mean = float(results.sum) / results.count |
791 | + return results |
792 | + |
793 | + |
794 | +class OnlineApproximateMedian: |
795 | + """Approximate the median of a set of elements. |
796 | + |
797 | + This implements a space-efficient algorithm which only sees each value |
798 | + once. (It will hold in memory log bucket_size of n elements.) |
799 | + |
800 | + It was described and analysed in |
801 | + D. Cantone and M.Hofri, |
802 | + "Analysis of An Approximate Median Selection Algorithm" |
803 | + ftp://ftp.cs.wpi.edu/pub/techreports/pdf/06-17.pdf |
804 | + |
805 | + This algorithm is similar to Tukey's median-of-medians technique: it |
806 | + computes the median among each group of bucket_size values, and then |
807 | + the median among those medians, recursively. |
808 | + """ |
809 | + |
810 | + def __init__(self, bucket_size=9): |
811 | + """Creates a new estimator. |
812 | + |
813 | + It approximates the median by finding the median among each |
814 | + successive group of bucket_size elements, and then using these |
815 | + medians for further rounds of selection. |
816 | + |
817 | + The bucket size should be a low odd-integer. |
818 | + """ |
819 | + self.bucket_size = bucket_size |
820 | + # Index of the median in a completed bucket. |
821 | + self.median_idx = (bucket_size-1)//2 |
822 | + self.buckets = [] |
823 | + |
824 | + def update(self, x, order=0): |
825 | + """Update with x.""" |
826 | + if x is None: |
827 | + return |
828 | + |
829 | + i = order |
830 | + while True: |
831 | + # Create bucket on demand. |
832 | + if i >= len(self.buckets): |
833 | + for n in range((i+1)-len(self.buckets)): |
834 | + self.buckets.append([]) |
835 | + bucket = self.buckets[i] |
836 | + bucket.append(x) |
837 | + if len(bucket) == self.bucket_size: |
838 | + # Select the median in this bucket, and promote it. |
839 | + x = sorted(bucket)[self.median_idx] |
840 | + # Free the bucket for the next round. |
841 | + del bucket[:] |
842 | + i += 1 |
843 | + continue |
844 | + else: |
845 | + break |
846 | + |
847 | + @property |
848 | + def median(self): |
849 | + """Return the median.""" |
850 | + # Find the 'weighted' median by assigning a weight to each |
851 | + # element proportional to how far they have been selected. |
852 | + candidates = [] |
853 | + total_weight = 0 |
854 | + for i, bucket in enumerate(self.buckets): |
855 | + weight = self.bucket_size ** i |
856 | + for x in bucket: |
857 | + total_weight += weight |
858 | + candidates.append([x, weight]) |
859 | + if len(candidates) == 0: |
860 | + return 0 |
861 | + |
862 | + # Each weight is the equivalent of having the candidates appear |
863 | + # that number of times in the array. |
864 | + # So buckets like [[1, 2], [2, 3], [4, 2]] would be expanded to |
865 | + # [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, |
866 | + # 4, 4, 4, 4, 4] and we find the median of that list (2). |
867 | + # We don't expand the items to conserve memory. |
868 | + median = (total_weight-1) / 2 |
869 | + weighted_idx = 0 |
870 | + for x, weight in sorted(candidates): |
871 | + weighted_idx += weight |
872 | + if weighted_idx > median: |
873 | + return x |
874 | + |
875 | + def __add__(self, other): |
876 | + """Merge two approximators together. |
877 | + |
878 | + All candidates from the other are merged through the standard |
879 | + algorithm, starting at the same level. So an item that went through |
880 | + two rounds of selection, will be compared with other items having |
881 | + gone through the same number of rounds. |
882 | + """ |
883 | + results = OnlineApproximateMedian(self.bucket_size) |
884 | + results.buckets = copy.deepcopy(self.buckets) |
885 | + for i, bucket in enumerate(other.buckets): |
886 | + for x in bucket: |
887 | + results.update(x, i) |
888 | + return results |
889 | + |
890 | + |
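The bucketing scheme above can be condensed into a short standalone sketch (illustrative only, not part of the diff; bucket_size of 3 rather than the default 9):

```python
# Median-of-medians bucketing: every full bucket promotes its median to
# the next level; leftovers are weighted by bucket_size ** level.
def approx_median(values, bucket_size=3):
    mid = (bucket_size - 1) // 2
    buckets = []
    for x in values:
        i = 0
        while True:
            if i >= len(buckets):
                buckets.append([])
            bucket = buckets[i]
            bucket.append(x)
            if len(bucket) < bucket_size:
                break
            x = sorted(bucket)[mid]  # promote this bucket's median
            del bucket[:]
            i += 1
    # Weighted median over the leftovers, as in the `median` property.
    candidates, total = [], 0
    for i, bucket in enumerate(buckets):
        for x in bucket:
            candidates.append((x, bucket_size ** i))
            total += bucket_size ** i
    half, acc = (total - 1) / 2.0, 0
    for x, weight in sorted(candidates):
        acc += weight
        if acc > half:
            return x

print(approx_median(range(1, 28)))  # 14 -- exactly the true median here
```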
891 | +class Stats: |
892 | + """Bag to hold and compute request statistics. |
893 | + |
894 | + All times are in seconds. |
895 | + """ |
896 | + total_hits = 0 # Total hits. |
897 | + |
898 | + total_time = 0 # Total time spent rendering. |
899 | + mean = 0 # Mean time per hit. |
900 | + median = 0 # Median time per hit. |
901 | + std = 0 # Standard deviation per hit. |
902 | + histogram = None # Request times histogram. |
903 | + |
904 | + total_sqltime = 0 # Total time spent waiting for SQL to process. |
905 | + mean_sqltime = 0 # Mean time spent waiting for SQL to process. |
906 | + median_sqltime = 0 # Median time spent waiting for SQL to process. |
907 | + std_sqltime = 0 # Standard deviation of SQL time. |
908 | + |
909 | + total_sqlstatements = 0 # Total number of SQL statements issued. |
910 | + mean_sqlstatements = 0 |
911 | + median_sqlstatements = 0 |
912 | + std_sqlstatements = 0 |
913 | + |
914 | + @property |
915 | + def ninetyninth_percentile_time(self): |
916 | + """Time under which 99% of requests are rendered. |
917 | + |
918 | + This is estimated as 3 std deviations from the mean. Given that |
919 | + in a daily report, many URLs or PageIds won't have 100 requests, it's |
920 | + more useful to use this estimator. |
921 | + """ |
922 | + return self.mean + 3*self.std |
923 | + |
924 | + @property |
925 | + def ninetyninth_percentile_sqltime(self): |
926 | + """SQL time under which 99% of requests are rendered. |
927 | + |
928 | + This is estimated as 3 std deviations from the mean. |
929 | + """ |
930 | + return self.mean_sqltime + 3*self.std_sqltime |
931 | + |
932 | + @property |
933 | + def ninetyninth_percentile_sqlstatements(self): |
934 | + """Number of SQL statements under which 99% of requests are rendered. |
935 | + |
936 | + This is estimated as 3 std deviations from the mean. |
937 | + """ |
938 | + return self.mean_sqlstatements + 3*self.std_sqlstatements |
939 | + |
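The mean + 3*std rule used by these properties is a normal-distribution approximation (about 99.7% of mass lies below it). A tiny illustration, with made-up timings (not from any real report):

```python
# Invented request times, in seconds.
times = [0.2, 0.3, 0.25, 0.9, 0.4]
n = len(times)
mean = sum(times) / n
variance = sum((t - mean) ** 2 for t in times) / n  # population variance
estimate = mean + 3 * variance ** 0.5               # "99th percentile"
print(round(estimate, 3))  # 1.171
```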
940 | + def text(self): |
941 | + """Return a textual version of the stats.""" |
942 | + return textwrap.dedent(""" |
943 | + <Stats for %d requests: |
944 | + Time: total=%.2f; mean=%.2f; median=%.2f; std=%.2f |
945 | + SQL time: total=%.2f; mean=%.2f; median=%.2f; std=%.2f |
946 | + SQL stmt: total=%.f; mean=%.2f; median=%.f; std=%.2f |
947 | + >""" % ( |
948 | + self.total_hits, self.total_time, self.mean, self.median, |
949 | + self.std, self.total_sqltime, self.mean_sqltime, |
950 | + self.median_sqltime, self.std_sqltime, |
951 | + self.total_sqlstatements, self.mean_sqlstatements, |
952 | + self.median_sqlstatements, self.std_sqlstatements)) |
953 | + |
954 | + |
955 | +class OnlineStats(Stats): |
956 | + """Implementation of stats that can be computed online. |
957 | + |
958 | + You call update() for each request and the stats are updated incrementally |
959 | + with minimum storage space. |
960 | + """ |
961 | + |
962 | + def __init__(self, histogram_width, histogram_resolution): |
963 | + self.time_stats = OnlineStatsCalculator() |
964 | + self.time_median_approximate = OnlineApproximateMedian() |
965 | + self.sql_time_stats = OnlineStatsCalculator() |
966 | + self.sql_time_median_approximate = OnlineApproximateMedian() |
967 | + self.sql_statements_stats = OnlineStatsCalculator() |
968 | + self.sql_statements_median_approximate = OnlineApproximateMedian() |
969 | + self.histogram = Histogram(histogram_width, histogram_resolution) |
970 | + |
971 | + @property |
972 | + def total_hits(self): |
973 | + return self.time_stats.count |
974 | + |
975 | + @property |
976 | + def total_time(self): |
977 | + return self.time_stats.sum |
978 | + |
979 | + @property |
980 | + def mean(self): |
981 | + return self.time_stats.mean |
982 | + |
983 | + @property |
984 | + def median(self): |
985 | + return self.time_median_approximate.median |
986 | + |
987 | + @property |
988 | + def std(self): |
989 | + return self.time_stats.std |
990 | + |
991 | + @property |
992 | + def total_sqltime(self): |
993 | + return self.sql_time_stats.sum |
994 | + |
995 | + @property |
996 | + def mean_sqltime(self): |
997 | + return self.sql_time_stats.mean |
998 | + |
999 | + @property |
1000 | + def median_sqltime(self): |
1001 | + return self.sql_time_median_approximate.median |
1002 | + |
1003 | + @property |
1004 | + def std_sqltime(self): |
1005 | + return self.sql_time_stats.std |
1006 | + |
1007 | + @property |
1008 | + def total_sqlstatements(self): |
1009 | + return self.sql_statements_stats.sum |
1010 | + |
1011 | + @property |
1012 | + def mean_sqlstatements(self): |
1013 | + return self.sql_statements_stats.mean |
1014 | + |
1015 | + @property |
1016 | + def median_sqlstatements(self): |
1017 | + return self.sql_statements_median_approximate.median |
1018 | + |
1019 | + @property |
1020 | + def std_sqlstatements(self): |
1021 | + return self.sql_statements_stats.std |
1022 | + |
1023 | + def update(self, request): |
1024 | + """Update the stats based on request.""" |
1025 | + self.time_stats.update(request.app_seconds) |
1026 | + self.time_median_approximate.update(request.app_seconds) |
1027 | + self.sql_time_stats.update(request.sql_seconds) |
1028 | + self.sql_time_median_approximate.update(request.sql_seconds) |
1029 | + self.sql_statements_stats.update(request.sql_statements) |
1030 | + self.sql_statements_median_approximate.update(request.sql_statements) |
1031 | + self.histogram.update(request.app_seconds) |
1032 | + |
1033 | + def __add__(self, other): |
1034 | + """Merge another OnlineStats with this one.""" |
1035 | + results = copy.deepcopy(self) |
1036 | + results.time_stats += other.time_stats |
1037 | + results.time_median_approximate += other.time_median_approximate |
1038 | + results.sql_time_stats += other.sql_time_stats |
1039 | + results.sql_time_median_approximate += ( |
1040 | + other.sql_time_median_approximate) |
1041 | + results.sql_statements_stats += other.sql_statements_stats |
1042 | + results.sql_statements_median_approximate += ( |
1043 | + other.sql_statements_median_approximate) |
1044 | + results.histogram = self.histogram + other.histogram |
1045 | + return results |
1046 | + |
1047 | + |
1048 | +class Histogram: |
1049 | + """A simple object to compute histogram of a value.""" |
1050 | + |
1051 | + @staticmethod |
1052 | + def from_bins_data(data): |
1053 | + """Create an histogram from existing bins data.""" |
1054 | + assert data[0][0] == 0, "First bin should start at zero." |
1055 | + |
1056 | + hist = Histogram(len(data), data[1][0]) |
1057 | + for idx, bin in enumerate(data): |
1058 | + hist.count += bin[1] |
1059 | + hist.bins[idx][1] = bin[1] |
1060 | + |
1061 | + return hist |
1062 | + |
1063 | + def __init__(self, bins_count, bins_size): |
1064 | + """Create a new histogram. |
1065 | + |
1066 | + The histogram will count the frequency of values in bins_count bins |
1067 | + of bins_size each. |
1068 | + """ |
1069 | + self.count = 0 |
1070 | + self.bins_count = bins_count |
1071 | + self.bins_size = bins_size |
1072 | + self.bins = [] |
1073 | + for x in range(bins_count): |
1074 | + self.bins.append([x*bins_size, 0]) |
1075 | + |
1076 | + @property |
1077 | + def bins_relative(self): |
1078 | + """Return the bins with the frequency expressed as a ratio.""" |
1079 | + return [[x, float(f)/self.count] for x, f in self.bins] |
1080 | + |
1081 | + def update(self, value): |
1082 | + """Update the histogram for this value. |
1083 | + |
1084 | + All values higher than the last bin minimum are counted in that last |
1085 | + bin. |
1086 | + """ |
1087 | + self.count += 1 |
1088 | + idx = int(min(self.bins_count-1, value / self.bins_size)) |
1089 | + self.bins[idx][1] += 1 |
1090 | + |
1091 | + def __repr__(self): |
1092 | + """A string representation of this histogram.""" |
1093 | + return "<Histogram %s>" % self.bins |
1094 | + |
1095 | + def __eq__(self, other): |
1096 | + """Two histogram are equals if they have the same bins content.""" |
1097 | + if not isinstance(other, Histogram): |
1098 | + return False |
1099 | + |
1100 | + if self.bins_count != other.bins_count: |
1101 | + return False |
1102 | + |
1103 | + if self.bins_size != other.bins_size: |
1104 | + return False |
1105 | + |
1106 | + for idx, other_bin in enumerate(other.bins): |
1107 | + if self.bins[idx][1] != other_bin[1]: |
1108 | + return False |
1109 | + |
1110 | + return True |
1111 | + |
1112 | + def __add__(self, other): |
1113 | + """Add the frequency of the other histogram to this one. |
1114 | + |
1115 | + The resulting histogram has the same bins_size as this one. |
1116 | + If the other one has a bigger bins_size, we'll assume an even |
1117 | + distribution and distribute the frequency across the smaller bins. If |
1118 | + it has a smaller bins_size, we'll aggregate its bins into the larger |
1119 | + ones. We only support different bins_size values when their ratio can |
1120 | + be expressed as the ratio between 1 and an integer. |
1121 | + |
1122 | + The resulting histogram is as wide as the widest one. |
1123 | + """ |
1124 | + ratio = float(other.bins_size) / self.bins_size |
1125 | + bins_count = max(self.bins_count, math.ceil(other.bins_count * ratio)) |
1126 | + total = Histogram(int(bins_count), self.bins_size) |
1127 | + total.count = self.count + other.count |
1128 | + |
1129 | + # Copy our bins into the total |
1130 | + for idx, bin in enumerate(self.bins): |
1131 | + total.bins[idx][1] = bin[1] |
1132 | + |
1133 | + assert int(ratio) == ratio or int(1/ratio) == 1/ratio, ( |
1134 | + "We only support different bins sizes when the ratio is an " |
1135 | + "integer to 1: %s" |
1136 | + % ratio) |
1137 | + |
1138 | + if ratio >= 1: |
1139 | + # We distribute the frequency across the bins. |
1140 | + # For example. if the ratio is 3:1, we'll add a third |
1141 | + # of the lower resolution bin to 3 of the higher one. |
1142 | + for other_idx, bin in enumerate(other.bins): |
1143 | + f = bin[1] / ratio |
1144 | + start = int(math.floor(other_idx * ratio)) |
1145 | + end = int(start + ratio) |
1146 | + for idx in range(start, end): |
1147 | + total.bins[idx][1] += f |
1148 | + else: |
1149 | + # We need to collect the higher resolution bins into the |
1150 | + # corresponding lower one. |
1151 | + for other_idx, bin in enumerate(other.bins): |
1152 | + idx = int(other_idx * ratio) |
1153 | + total.bins[idx][1] += bin[1] |
1154 | + |
1155 | + return total |
1156 | + |
1157 | + |
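The different-bins_size branch above can be illustrated standalone (illustrative only, not part of the diff; assumes a 3:1 width ratio):

```python
# A coarse histogram (bin width 3) folded into a fine one (bin width 1):
# each coarse bin's frequency is split evenly across 3 fine bins,
# mirroring the ratio >= 1 branch of Histogram.__add__.
coarse = [[0, 9], [3, 3]]            # [bin start, frequency]
ratio = 3                            # coarse width / fine width
merged = [[i, 0] for i in range(len(coarse) * ratio)]
for idx, (start, freq) in enumerate(coarse):
    share = freq / ratio             # assume an even distribution
    for j in range(idx * ratio, (idx + 1) * ratio):
        merged[j][1] += share
print(merged)  # [[0, 3.0], [1, 3.0], [2, 3.0], [3, 1.0], [4, 1.0], [5, 1.0]]
```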
1158 | +class RequestTimes: |
1159 | + """Collect statistics from requests. |
1160 | + |
1161 | + Statistics are updated by calling the add_request() method. |
1162 | + |
1163 | + Statistics for mean/stddev/total/median for request times, SQL times and |
1164 | + number of SQL statements are collected. |
1165 | + |
1166 | + They are grouped by Category, URL or PageID. |
1167 | + """ |
1168 | + |
1169 | + def __init__(self, categories, options): |
1170 | + self.by_pageids = options.pageids |
1171 | + self.top_urls = options.top_urls |
1172 | + # We only keep in memory 50 times the number of URLs we want to |
1173 | + # return. The number of URLs can go pretty high (because of the |
1174 | + # distinct query parameters). |
1175 | + # |
1176 | + # Keeping all in memory at once is prohibitive. On a small but |
1177 | + # representative sample, keeping 50 times the possible number of |
1178 | + # candidates and culling to 90% on overflow generated a report |
1179 | + # identical to keeping all the candidates in memory. |
1180 | + # |
1181 | + # Keeping 10 times or culling at 90% generated a near-identical report |
1182 | + # (it differed a little in the tail.) |
1183 | + # |
1184 | + # The size/cull parameters might need to change if the request |
1185 | + # distribution becomes very different from what it currently is. |
1186 | + self.top_urls_cache_size = self.top_urls * 50 |
1187 | + |
1188 | + # Histogram has a bin per resolution up to our timeout |
1189 | + # (and an extra bin). |
1190 | + self.histogram_resolution = float(options.resolution) |
1191 | + self.histogram_width = int( |
1192 | + options.timeout / self.histogram_resolution) + 1 |
1193 | + self.category_times = [ |
1194 | + (category, OnlineStats( |
1195 | + self.histogram_width, self.histogram_resolution)) |
1196 | + for category in categories] |
1197 | + self.url_times = {} |
1198 | + self.pageid_times = {} |
1199 | + |
1200 | + def add_request(self, request): |
1201 | + """Add request to the set of requests we collect stats for.""" |
1202 | + matched = [] |
1203 | + for category, stats in self.category_times: |
1204 | + if category.match(request): |
1205 | + stats.update(request) |
1206 | + if category.partition: |
1207 | + matched.append(category.title) |
1208 | + |
1209 | + if len(matched) > 1: |
1210 | + log.warning( |
1211 | + "Multiple partition categories matched by %s (%s)", |
1212 | + request.url, ", ".join(matched)) |
1213 | + elif not matched: |
1214 | + log.warning("%s isn't part of the partition", request.url) |
1215 | + |
1216 | + if self.by_pageids: |
1217 | + pageid = request.pageid or 'Unknown' |
1218 | + stats = self.pageid_times.setdefault( |
1219 | + pageid, OnlineStats( |
1220 | + self.histogram_width, self.histogram_resolution)) |
1221 | + stats.update(request) |
1222 | + |
1223 | + if self.top_urls: |
1224 | + stats = self.url_times.setdefault( |
1225 | + request.url, OnlineStats( |
1226 | + self.histogram_width, self.histogram_resolution)) |
1227 | + stats.update(request) |
1228 | + # Whenever we have more URLs than we need, discard the 10% |
1229 | + # least likely to end up in the top. |
1230 | + if len(self.url_times) > self.top_urls_cache_size: |
1231 | + cutoff = int(self.top_urls_cache_size*0.90) |
1232 | + self.url_times = dict( |
1233 | + sorted(self.url_times.items(), |
1234 | + key=lambda (url, stats): stats.total_time, |
1235 | + reverse=True)[:cutoff]) |
1236 | + |
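The cull step above can be shown in isolation (illustrative only, not part of the diff; a toy cache size of 5, with plain floats standing in for stats.total_time):

```python
# Keep only the top 90% of the cache, ranked by total time, once the
# cache overflows -- a toy version of the culling in add_request().
url_times = {'/a': 10.0, '/b': 1.0, '/c': 7.0, '/d': 3.0,
             '/e': 5.0, '/f': 2.0}
cache_size = 5
if len(url_times) > cache_size:
    cutoff = int(cache_size * 0.90)  # 4 entries survive
    url_times = dict(sorted(
        url_times.items(), key=lambda item: item[1],
        reverse=True)[:cutoff])
print(sorted(url_times))  # ['/a', '/c', '/d', '/e']
```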
1237 | + def get_category_times(self): |
1238 | + """Return the times for each category.""" |
1239 | + return self.category_times |
1240 | + |
1241 | + def get_top_urls_times(self): |
1242 | + """Return the times for the Top URL by total time""" |
1243 | + # Sort the result by total time |
1244 | + return sorted( |
1245 | + self.url_times.items(), |
1246 | + key=lambda (url, stats): stats.total_time, |
1247 | + reverse=True)[:self.top_urls] |
1248 | + |
1249 | + def get_pageid_times(self): |
1250 | + """Return the times for the pageids.""" |
1251 | + # Sort the result by pageid |
1252 | + return sorted(self.pageid_times.items()) |
1253 | + |
1254 | + def __add__(self, other): |
1255 | + """Merge two RequestTimes together.""" |
1256 | + results = copy.deepcopy(self) |
1257 | + for other_category, other_stats in other.category_times: |
1258 | + for i, (category, stats) in enumerate(self.category_times): |
1259 | + if category.title == other_category.title: |
1260 | + results.category_times[i] = ( |
1261 | + category, stats + other_stats) |
1262 | + break |
1263 | + else: |
1264 | + results.category_times.append( |
1265 | + (other_category, copy.deepcopy(other_stats))) |
1266 | + |
1267 | + url_times = results.url_times |
1268 | + for url, stats in other.url_times.items(): |
1269 | + if url in url_times: |
1270 | + url_times[url] += stats |
1271 | + else: |
1272 | + url_times[url] = copy.deepcopy(stats) |
1273 | + # Only keep top_urls_cache_size entries. |
1274 | + if len(self.url_times) > self.top_urls_cache_size: |
1275 | + self.url_times = dict( |
1276 | + sorted( |
1277 | + url_times.items(), |
1278 | + key=lambda (url, stats): stats.total_time, |
1279 | + reverse=True)[:self.top_urls_cache_size]) |
1280 | + |
1281 | + pageid_times = results.pageid_times |
1282 | + for pageid, stats in other.pageid_times.items(): |
1283 | + if pageid in pageid_times: |
1284 | + pageid_times[pageid] += stats |
1285 | + else: |
1286 | + pageid_times[pageid] = copy.deepcopy(stats) |
1287 | + |
1288 | + return results |
1289 | + |
1290 | + |
1291 | +def main(): |
1292 | + parser = ExtendedOptionParser("%prog [args] tracelog [...]") |
1293 | + |
1294 | + parser.add_option( |
1295 | + "-c", "--config", dest="config", |
1296 | + default="page-performance-report.ini", |
1297 | + metavar="FILE", help="Load configuration from FILE") |
1298 | + parser.add_option( |
1299 | + "--from", dest="from_ts", type="datetime", |
1300 | + default=None, metavar="TIMESTAMP", |
1301 | + help="Ignore log entries before TIMESTAMP") |
1302 | + parser.add_option( |
1303 | + "--until", dest="until_ts", type="datetime", |
1304 | + default=None, metavar="TIMESTAMP", |
1305 | + help="Ignore log entries after TIMESTAMP") |
1306 | + parser.add_option( |
1307 | + "--no-partition", dest="partition", |
1308 | + action="store_false", default=True, |
1309 | + help="Do not produce partition report") |
1310 | + parser.add_option( |
1311 | + "--no-categories", dest="categories", |
1312 | + action="store_false", default=True, |
1313 | + help="Do not produce categories report") |
1314 | + parser.add_option( |
1315 | + "--no-pageids", dest="pageids", |
1316 | + action="store_false", default=True, |
1317 | + help="Do not produce pageids report") |
1318 | + parser.add_option( |
1319 | + "--top-urls", dest="top_urls", type=int, metavar="N", |
1320 | + default=50, help="Generate report for top N urls by hitcount.") |
1321 | + parser.add_option( |
1322 | + "--directory", dest="directory", |
1323 | + default=os.getcwd(), metavar="DIR", |
1324 | + help="Output reports in DIR directory") |
1325 | + parser.add_option( |
1326 | + "--timeout", dest="timeout", |
1327 | + # Default to 9: our production timeout. |
1328 | + default=9, type="int", metavar="SECONDS", |
1329 | + help="The configured timeout value: used to determine high risk " + |
1330 | + "page ids. That would be pages which 99% under render time is " |
1331 | + "greater than timeoout - 2s. Default is %defaults.") |
1332 | + parser.add_option( |
1333 | + "--histogram-resolution", dest="resolution", |
1334 | + # Default to 0.5s |
1335 | + default=0.5, type="float", metavar="SECONDS", |
1336 | + help="The resolution of the histogram bin width. Detault to " |
1337 | + "%defaults.") |
1338 | + parser.add_option( |
1339 | + "--merge", dest="merge", |
1340 | + default=False, action='store_true', |
1341 | + help="Files are interpreted as pickled stats and are aggregated " + |
1342 | + "for the report.") |
1343 | + |
1344 | + options, args = parser.parse_args() |
1345 | + |
1346 | + if not os.path.isdir(options.directory): |
1347 | + parser.error("Directory %s does not exist" % options.directory) |
1348 | + |
1349 | + if len(args) == 0: |
1350 | + parser.error("At least one zserver tracelog file must be provided") |
1351 | + |
1352 | + if options.from_ts is not None and options.until_ts is not None: |
1353 | + if options.from_ts > options.until_ts: |
1354 | + parser.error( |
1355 | + "--from timestamp %s is before --until timestamp %s" |
1356 | + % (options.from_ts, options.until_ts)) |
1357 | + if options.from_ts is not None or options.until_ts is not None: |
1358 | + if options.merge: |
1359 | + parser.error('--from and --until cannot be used with --merge') |
1360 | + |
1361 | + for filename in args: |
1362 | + if not os.path.exists(filename): |
1363 | + parser.error("Tracelog file %s not found." % filename) |
1364 | + |
1365 | + if not os.path.exists(options.config): |
1366 | + parser.error("Config file %s not found." % options.config) |
1367 | + |
1368 | + # Need a better config mechanism as ConfigParser doesn't preserve order. |
1369 | + script_config = RawConfigParser() |
1370 | + script_config.optionxform = str # Make keys case sensitive. |
1371 | + script_config.readfp(open(options.config)) |
1372 | + |
1373 | + categories = [] # A list of Category, in report order. |
1374 | + for option in script_config.options('categories'): |
1375 | + regexp = script_config.get('categories', option) |
1376 | + try: |
1377 | + categories.append(Category(option, regexp)) |
1378 | + except sre_constants.error as x: |
1379 | + log.fatal("Unable to compile regexp %r (%s)" % (regexp, x)) |
1380 | + return 1 |
1381 | + categories.sort() |
1382 | + |
1383 | + if len(categories) == 0: |
1384 | + parser.error("No data in [categories] section of configuration.") |
1385 | + |
1386 | + # Determine the categories making a partition of the requests |
1387 | + for option in script_config.options('partition'): |
1388 | + for category in categories: |
1389 | + if category.title == option: |
1390 | + category.partition = True |
1391 | + break |
1392 | + else: |
1393 | + log.warning( |
1394 | + "In partition definition: %s isn't a defined category", |
1395 | + option) |
1396 | + |
1397 | + times = RequestTimes(categories, options) |
1398 | + |
1399 | + if options.merge: |
1400 | + for filename in args: |
1401 | + log.info('Merging %s...' % filename) |
1402 | + f = bz2.BZ2File(filename, 'r') |
1403 | + times += cPickle.load(f) |
1404 | + f.close() |
1405 | + else: |
1406 | + parse(args, times, options) |
1407 | + |
1408 | + category_times = times.get_category_times() |
1409 | + |
1410 | + pageid_times = [] |
1411 | + url_times = [] |
1412 | + if options.top_urls: |
1413 | + url_times = times.get_top_urls_times() |
1414 | + if options.pageids: |
1415 | + pageid_times = times.get_pageid_times() |
1416 | + |
1417 | + def _report_filename(filename): |
1418 | + return os.path.join(options.directory, filename) |
1419 | + |
1420 | + # Partition report |
1421 | + if options.partition: |
1422 | + report_filename = _report_filename('partition.html') |
1423 | + log.info("Generating %s", report_filename) |
1424 | + partition_times = [ |
1425 | + category_time |
1426 | + for category_time in category_times |
1427 | + if category_time[0].partition] |
1428 | + html_report( |
1429 | + open(report_filename, 'w'), partition_times, None, None, |
1430 | + histogram_resolution=options.resolution, |
1431 | + category_name='Partition') |
1432 | + |
1433 | + # Category only report. |
1434 | + if options.categories: |
1435 | + report_filename = _report_filename('categories.html') |
1436 | + log.info("Generating %s", report_filename) |
1437 | + html_report( |
1438 | + open(report_filename, 'w'), category_times, None, None, |
1439 | + histogram_resolution=options.resolution) |
1440 | + |
1441 | + # Pageid only report. |
1442 | + if options.pageids: |
1443 | + report_filename = _report_filename('pageids.html') |
1444 | + log.info("Generating %s", report_filename) |
1445 | + html_report( |
1446 | + open(report_filename, 'w'), None, pageid_times, None, |
1447 | + histogram_resolution=options.resolution) |
1448 | + |
1449 | + # Top URL only report. |
1450 | + if options.top_urls: |
1451 | + report_filename = _report_filename('top%d.html' % options.top_urls) |
1452 | + log.info("Generating %s", report_filename) |
1453 | + html_report( |
1454 | + open(report_filename, 'w'), None, None, url_times, |
1455 | + histogram_resolution=options.resolution) |
1456 | + |
1457 | + # Combined report. |
1458 | + if options.categories and options.pageids: |
1459 | + report_filename = _report_filename('combined.html') |
1460 | + html_report( |
1461 | + open(report_filename, 'w'), |
1462 | + category_times, pageid_times, url_times, |
1463 | + histogram_resolution=options.resolution) |
1464 | + |
1465 | + # Report of likely timeout candidates |
1466 | + report_filename = _report_filename('timeout-candidates.html') |
1467 | + log.info("Generating %s", report_filename) |
1468 | + html_report( |
1469 | + open(report_filename, 'w'), None, pageid_times, None, |
1470 | + options.timeout - 2, |
1471 | + histogram_resolution=options.resolution) |
1472 | + |
1473 | + # Save the times cache for later merging. |
1474 | + report_filename = _report_filename('stats.pck.bz2') |
1475 | + log.info("Saving times database in %s", report_filename) |
1476 | + stats_file = bz2.BZ2File(report_filename, 'w') |
1477 | + cPickle.dump(times, stats_file, protocol=cPickle.HIGHEST_PROTOCOL) |
1478 | + stats_file.close() |
1479 | + |
1480 | + # Output metrics for selected categories. |
1481 | + report_filename = _report_filename('metrics.dat') |
1482 | + log.info('Saving category_metrics %s', report_filename) |
1483 | + metrics_file = open(report_filename, 'w') |
1484 | + writer = csv.writer(metrics_file, delimiter=':') |
1485 | + date = options.until_ts or options.from_ts or datetime.utcnow() |
1486 | + date = time.mktime(date.timetuple()) |
1487 | + |
1488 | + for option in script_config.options('metrics'): |
1489 | + name = script_config.get('metrics', option) |
1490 | + for category, stats in category_times: |
1491 | + if category.title == name: |
1492 | + writer.writerows([ |
1493 | + ("%s_99" % option, "%f@%d" % ( |
1494 | + stats.ninetyninth_percentile_time, date)), |
1495 | + ("%s_hits" % option, "%d@%d" % (stats.total_hits, date))]) |
1496 | + break |
1497 | + else: |
1498 | + log.warning("Can't find category %s for metric %s" % ( |
1499 | + option, name)) |
1500 | + metrics_file.close() |
1501 | + |
1502 | + return 0 |
1503 | + |
1504 | + |
1505 | +def smart_open(filename, mode='r'): |
1506 | + """Open a file, transparently handling compressed files. |
1507 | + |
1508 | + Compressed files are detected by file extension. |
1509 | + """ |
1510 | + ext = os.path.splitext(filename)[1] |
1511 | + if ext == '.bz2': |
1512 | + return bz2.BZ2File(filename, 'r') |
1513 | + elif ext == '.gz': |
1514 | + return gzip.GzipFile(filename, 'r') |
1515 | + else: |
1516 | + return open(filename, mode) |
1517 | + |
1518 | + |
1519 | +class MalformedLine(Exception): |
1520 | + """A malformed line was found in the trace log.""" |
1521 | + |
1522 | + |
1523 | +_ts_re = re.compile( |
1524 | + r'^(\d{4})-(\d\d)-(\d\d)\s(\d\d):(\d\d):(\d\d)(?:\.(\d{6}))?$') |
1525 | + |
1526 | + |
1527 | +def parse_timestamp(ts_string): |
1528 | + match = _ts_re.search(ts_string) |
1529 | + if match is None: |
1530 | + raise ValueError("Invalid timestamp") |
1531 | + return datetime( |
1532 | + *(int(elem) for elem in match.groups() if elem is not None)) |
1533 | + |
1534 | + |
1535 | +def parse(tracefiles, times, options): |
1536 | + requests = {} |
1537 | + total_requests = 0 |
1538 | + for tracefile in tracefiles: |
1539 | + log.info('Processing %s', tracefile) |
1540 | + for line in smart_open(tracefile): |
1541 | + line = line.rstrip() |
1542 | + try: |
1543 | + record = line.split(' ', 7) |
1544 | + try: |
1545 | + record_type, request_id, date, time_ = record[:4] |
1546 | + except ValueError: |
1547 | + raise MalformedLine() |
1548 | + |
1549 | + if record_type == 'S': |
1550 | + # Short circuit - we don't care about these entries. |
1551 | + continue |
1552 | + |
1553 | + # Parse the timestamp. |
1554 | + ts_string = '%s %s' % (date, time_) |
1555 | + try: |
1556 | + dt = parse_timestamp(ts_string) |
1557 | + except ValueError: |
1558 | + raise MalformedLine( |
1559 | + 'Invalid timestamp %s' % repr(ts_string)) |
1560 | + |
1561 | + # Filter entries by command line date range. |
1562 | + if options.from_ts is not None and dt < options.from_ts: |
1563 | + continue # Skip to next line. |
1564 | + if options.until_ts is not None and dt > options.until_ts: |
1565 | + break # Skip to next log file. |
1566 | + |
1567 | + args = record[4:] |
1568 | + |
1569 | + def require_args(count): |
1570 | + if len(args) < count: |
1571 | + raise MalformedLine() |
1572 | + |
1573 | + if record_type == 'B': # Request begins. |
1574 | + require_args(2) |
1575 | + requests[request_id] = Request(dt, args[0], args[1]) |
1576 | + continue |
1577 | + |
1578 | + request = requests.get(request_id, None) |
1579 | + if request is None: # Just ignore partial records. |
1580 | + continue |
1581 | + |
1582 | + # Old style extension record from Launchpad. It just |
1583 | + # contains the URL. |
1584 | + if (record_type == '-' and len(args) == 1 |
1585 | + and args[0].startswith('http')): |
1586 | + request.url = args[0] |
1587 | + |
1588 | + # New style extension record with a prefix. |
1589 | + elif record_type == '-': |
1590 | + # Launchpad outputs several things as tracelog |
1591 | + # extension records. We include a prefix to tell |
1592 | + # them apart. |
1593 | + require_args(1) |
1594 | + |
1595 | + parse_extension_record(request, args) |
1596 | + |
1597 | + elif record_type == 'I': # Got request input. |
1598 | + require_args(1) |
1599 | + request.I(dt, args[0]) |
1600 | + |
1601 | + elif record_type == 'C': # Entered application thread. |
1602 | + request.C(dt) |
1603 | + |
1604 | + elif record_type == 'A': # Application done. |
1605 | + require_args(2) |
1606 | + request.A(dt, args[0], args[1]) |
1607 | + |
1608 | + elif record_type == 'E': # Request done. |
1609 | + del requests[request_id] |
1610 | + request.E(dt) |
1611 | + total_requests += 1 |
1612 | + if total_requests % 10000 == 0: |
1613 | + log.debug("Parsed %d requests", total_requests) |
1614 | + |
1615 | + # Add the request to any matching categories. |
1616 | + times.add_request(request) |
1617 | + else: |
1618 | +                raise MalformedLine('Unknown record type %s' % record_type) |
1619 | + except MalformedLine as x: |
1620 | + log.error( |
1621 | + "Malformed line %s (%s)" % (repr(line), x)) |
1622 | + |
1623 | + |
1624 | +def parse_extension_record(request, args): |
1625 | +    """Decode a ZServer extension record and annotate the request.""" |
1626 | + prefix = args[0] |
1627 | + |
1628 | + if prefix == 'u': |
1629 | + request.url = ' '.join(args[1:]) or None |
1630 | + elif prefix == 'p': |
1631 | + request.pageid = ' '.join(args[1:]) or None |
1632 | + elif prefix == 't': |
1633 | + if len(args) != 4: |
1634 | + raise MalformedLine("Wrong number of arguments %s" % (args,)) |
1635 | + request.sql_statements = int(args[2]) |
1636 | + request.sql_seconds = float(args[3]) / 1000 |
1637 | + else: |
1638 | + raise MalformedLine( |
1639 | + "Unknown extension prefix %s" % prefix) |
1640 | + |
1641 | + |
1642 | +def html_report( |
1643 | + outf, category_times, pageid_times, url_times, |
1644 | + ninetyninth_percentile_threshold=None, histogram_resolution=0.5, |
1645 | + category_name='Category'): |
1646 | + """Write an html report to outf. |
1647 | + |
1648 | + :param outf: A file object to write the report to. |
1649 | + :param category_times: The time statistics for categories. |
1650 | + :param pageid_times: The time statistics for pageids. |
1651 | + :param url_times: The time statistics for the top XXX urls. |
1652 | + :param ninetyninth_percentile_threshold: Lower threshold for inclusion of |
1653 | + pages in the pageid section; pages where 99 percent of the requests are |
1654 | + served under this threshold will not be included. |
1655 | + :param histogram_resolution: used as the histogram bar width |
1656 | + :param category_name: The name to use for category report. Defaults to |
1657 | + 'Category'. |
1658 | + """ |
1659 | + |
1660 | + print >> outf, dedent('''\ |
1661 | + <!DOCTYPE html> |
1662 | + <html> |
1663 | + <head> |
1664 | + <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
1665 | + <title>Launchpad Page Performance Report %(date)s</title> |
1666 | + <script language="javascript" type="text/javascript" |
1667 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/flot/jquery.min.js" |
1668 | + ></script> |
1669 | + <script language="javascript" type="text/javascript" |
1670 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/jquery.appear-1.1.1.min.js" |
1671 | + ></script> |
1672 | + <script language="javascript" type="text/javascript" |
1673 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/flot/jquery.flot.min.js" |
1674 | + ></script> |
1675 | + <script language="javascript" type="text/javascript" |
1676 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/sorttable.js"></script> |
1677 | + <style type="text/css"> |
1678 | + h3 { font-weight: normal; font-size: 1em; } |
1679 | + thead th { padding-left: 1em; padding-right: 1em; } |
1680 | + .category-title { text-align: right; padding-right: 2em; |
1681 | + max-width: 25em; } |
1682 | + .regexp { font-size: x-small; font-weight: normal; } |
1683 | + .mean { text-align: right; padding-right: 1em; } |
1684 | + .median { text-align: right; padding-right: 1em; } |
1685 | + .standard-deviation { text-align: right; padding-right: 1em; } |
1686 | + .histogram { padding: 0.5em 1em; width:400px; height:250px; } |
1687 | + .odd-row { background-color: #eeeeff; } |
1688 | + .even-row { background-color: #ffffee; } |
1689 | + table.sortable thead { |
1690 | + background-color:#eee; |
1691 | + color:#666666; |
1692 | + font-weight: bold; |
1693 | + cursor: default; |
1694 | + } |
1695 | + td.numeric { |
1696 | + font-family: monospace; |
1697 | + text-align: right; |
1698 | + padding: 1em; |
1699 | + } |
1700 | + .clickable { cursor: hand; } |
1701 | + .total-hits, .histogram, .median-sqltime, |
1702 | + .median-sqlstatements { border-right: 1px dashed #000000; } |
1703 | + </style> |
1704 | + </head> |
1705 | + <body> |
1706 | + <h1>Launchpad Page Performance Report</h1> |
1707 | + <h3>%(date)s</h3> |
1708 | + ''' % {'date': time.ctime()}) |
1709 | + |
1710 | + table_header = dedent('''\ |
1711 | + <table class="sortable page-performance-report"> |
1712 | + <caption align="top">Click on column headings to sort.</caption> |
1713 | + <thead> |
1714 | + <tr> |
1715 | + <th class="clickable">Name</th> |
1716 | + |
1717 | + <th class="clickable">Total Hits</th> |
1718 | + |
1719 | + <th class="clickable">99% Under Time (secs)</th> |
1720 | + |
1721 | + <th class="clickable">Mean Time (secs)</th> |
1722 | + <th class="clickable">Time Standard Deviation</th> |
1723 | + <th class="clickable">Median Time (secs)</th> |
1724 | + <th class="sorttable_nosort">Time Distribution</th> |
1725 | + |
1726 | + <th class="clickable">99% Under SQL Time (secs)</th> |
1727 | + <th class="clickable">Mean SQL Time (secs)</th> |
1728 | + <th class="clickable">SQL Time Standard Deviation</th> |
1729 | + <th class="clickable">Median SQL Time (secs)</th> |
1730 | + |
1731 | + <th class="clickable">99% Under SQL Statements</th> |
1732 | + <th class="clickable">Mean SQL Statements</th> |
1733 | + <th class="clickable">SQL Statement Standard Deviation</th> |
1734 | + <th class="clickable">Median SQL Statements</th> |
1735 | + |
1736 | + <th class="clickable">Hits * 99% Under SQL Statement</th> |
1737 | + </tr> |
1738 | + </thead> |
1739 | + <tbody> |
1740 | + ''') |
1741 | + table_footer = "</tbody></table>" |
1742 | + |
1743 | + # Store our generated histograms to output Javascript later. |
1744 | + histograms = [] |
1745 | + |
1746 | + def handle_times(html_title, stats): |
1747 | + histograms.append(stats.histogram) |
1748 | + print >> outf, dedent("""\ |
1749 | + <tr> |
1750 | + <th class="category-title">%s</th> |
1751 | + <td class="numeric total-hits">%d</td> |
1752 | + <td class="numeric 99pc-under-time">%.2f</td> |
1753 | + <td class="numeric mean-time">%.2f</td> |
1754 | + <td class="numeric std-time">%.2f</td> |
1755 | + <td class="numeric median-time">%.2f</td> |
1756 | + <td> |
1757 | + <div class="histogram" id="histogram%d"></div> |
1758 | + </td> |
1759 | + <td class="numeric 99pc-under-sqltime">%.2f</td> |
1760 | + <td class="numeric mean-sqltime">%.2f</td> |
1761 | + <td class="numeric std-sqltime">%.2f</td> |
1762 | + <td class="numeric median-sqltime">%.2f</td> |
1763 | + |
1764 | + <td class="numeric 99pc-under-sqlstatement">%.f</td> |
1765 | + <td class="numeric mean-sqlstatements">%.2f</td> |
1766 | + <td class="numeric std-sqlstatements">%.2f</td> |
1767 | + <td class="numeric median-sqlstatements">%.2f</td> |
1768 | + |
1769 | + <td class="numeric high-db-usage">%.f</td> |
1770 | + </tr> |
1771 | + """ % ( |
1772 | + html_title, |
1773 | + stats.total_hits, stats.ninetyninth_percentile_time, |
1774 | + stats.mean, stats.std, stats.median, |
1775 | + len(histograms) - 1, |
1776 | + stats.ninetyninth_percentile_sqltime, stats.mean_sqltime, |
1777 | + stats.std_sqltime, stats.median_sqltime, |
1778 | + stats.ninetyninth_percentile_sqlstatements, |
1779 | + stats.mean_sqlstatements, |
1780 | + stats.std_sqlstatements, stats.median_sqlstatements, |
1781 | +            stats.ninetyninth_percentile_sqlstatements * stats.total_hits, |
1782 | + )) |
1783 | + |
1784 | + # Table of contents |
1785 | + print >> outf, '<ol>' |
1786 | + if category_times: |
1787 | + print >> outf, '<li><a href="#catrep">%s Report</a></li>' % ( |
1788 | + category_name) |
1789 | + if pageid_times: |
1790 | + print >> outf, '<li><a href="#pageidrep">Pageid Report</a></li>' |
1791 | + if url_times: |
1792 | + print >> outf, '<li><a href="#topurlrep">Top URL Report</a></li>' |
1793 | + print >> outf, '</ol>' |
1794 | + |
1795 | + if category_times: |
1796 | + print >> outf, '<h2 id="catrep">%s Report</h2>' % ( |
1797 | + category_name) |
1798 | + print >> outf, table_header |
1799 | + for category, times in category_times: |
1800 | + html_title = '%s<br/><span class="regexp">%s</span>' % ( |
1801 | + html_quote(category.title), html_quote(category.regexp)) |
1802 | + handle_times(html_title, times) |
1803 | + print >> outf, table_footer |
1804 | + |
1805 | + if pageid_times: |
1806 | + print >> outf, '<h2 id="pageidrep">Pageid Report</h2>' |
1807 | + print >> outf, table_header |
1808 | + for pageid, times in pageid_times: |
1809 | + if (ninetyninth_percentile_threshold is not None and |
1810 | + (times.ninetyninth_percentile_time < |
1811 | + ninetyninth_percentile_threshold)): |
1812 | + continue |
1813 | + handle_times(html_quote(pageid), times) |
1814 | + print >> outf, table_footer |
1815 | + |
1816 | + if url_times: |
1817 | + print >> outf, '<h2 id="topurlrep">Top URL Report</h2>' |
1818 | + print >> outf, table_header |
1819 | + for url, times in url_times: |
1820 | + handle_times(html_quote(url), times) |
1821 | + print >> outf, table_footer |
1822 | + |
1823 | +    # Output the JavaScript to render our histograms nicely, replacing |
1824 | + # the placeholder <div> tags output earlier. |
1825 | + print >> outf, dedent("""\ |
1826 | + <script language="javascript" type="text/javascript"> |
1827 | + $(function () { |
1828 | + var options = { |
1829 | + series: { |
1830 | + bars: {show: true, barWidth: %s} |
1831 | + }, |
1832 | + xaxis: { |
1833 | + tickFormatter: function (val, axis) { |
1834 | + return val.toFixed(axis.tickDecimals) + "s"; |
1835 | + } |
1836 | + }, |
1837 | + yaxis: { |
1838 | + min: 0, |
1839 | + max: 1, |
1840 | + transform: function (v) { |
1841 | + return Math.pow(Math.log(v*100+1)/Math.LN2, 0.5); |
1842 | + }, |
1843 | + inverseTransform: function (v) { |
1844 | + return Math.pow(Math.exp(v*100+1)/Math.LN2, 2); |
1845 | + }, |
1846 | + tickDecimals: 1, |
1847 | + tickFormatter: function (val, axis) { |
1848 | + return (val * 100).toFixed(axis.tickDecimals) + "%%"; |
1849 | + }, |
1850 | + ticks: [0.001,0.01,0.10,0.50,1.0] |
1851 | + }, |
1852 | + grid: { |
1853 | + aboveData: true, |
1854 | + labelMargin: 15 |
1855 | + } |
1856 | + }; |
1857 | + """ % histogram_resolution) |
1858 | + |
1859 | + for i, histogram in enumerate(histograms): |
1860 | + if histogram.count == 0: |
1861 | + continue |
1862 | + print >> outf, dedent("""\ |
1863 | + function plot_histogram_%(id)d() { |
1864 | + var d = %(data)s; |
1865 | + |
1866 | + $.plot( |
1867 | + $("#histogram%(id)d"), |
1868 | + [{data: d}], options); |
1869 | + } |
1870 | + $('#histogram%(id)d').appear(function() { |
1871 | + plot_histogram_%(id)d(); |
1872 | + }); |
1873 | + |
1874 | + """ % {'id': i, 'data': json.dumps(histogram.bins_relative)}) |
1875 | + |
1876 | + print >> outf, dedent("""\ |
1877 | + }); |
1878 | + </script> |
1879 | + </body> |
1880 | + </html> |
1881 | + """) |
1882 | |
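The record-dispatch loop above keeps one in-flight request per tracelog request id: created by a 'B' record, annotated by extension records, and retired by 'E'. As a minimal standalone sketch of that shape — with a simplified `Request`, string timestamps, and only the 'B', old-style '-', and 'E' records handled, so the names and fields here are illustrative stand-ins, not the diff's exact code:

```python
class Request(object):
    """Stripped-down stand-in for the diff's Request class."""

    def __init__(self, start, method, channel):
        self.start = start
        self.method = method
        self.channel = channel
        self.url = None
        self.end = None


def parse_lines(lines):
    """Dispatch tracelog records of the form 'TYPE REQUEST_ID TIMESTAMP args...'."""
    requests = {}    # in-flight requests, keyed by request id
    completed = []
    for line in lines:
        record = line.split()
        if len(record) < 3:
            continue                     # too short to classify; skip
        record_type, request_id, ts = record[:3]
        args = record[3:]
        if record_type == 'B':           # request begins
            requests[request_id] = Request(ts, args[0], args[1])
        elif request_id not in requests:
            continue                     # ignore partial records
        elif record_type == '-' and args and args[0].startswith('http'):
            requests[request_id].url = args[0]   # old-style URL record
        elif record_type == 'E':         # request done: retire it
            request = requests.pop(request_id)
            request.end = ts
            completed.append(request)
    return completed
```

Records for request ids that were never begun fall through the `request_id not in requests` guard, mirroring the loop's "just ignore partial records" behaviour.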
1883 | === added file 'setup.py' |
1884 | --- setup.py 1970-01-01 00:00:00 +0000 |
1885 | +++ setup.py 2012-08-09 04:56:19 +0000 |
1886 | @@ -0,0 +1,50 @@ |
1887 | +#!/usr/bin/env python |
1888 | +# |
1889 | +# Copyright (c) 2012, Canonical Ltd |
1890 | +# |
1891 | +# This program is free software: you can redistribute it and/or modify |
1892 | +# it under the terms of the GNU Lesser General Public License as published by |
1893 | +# the Free Software Foundation, version 3 only. |
1894 | +# |
1895 | +# This program is distributed in the hope that it will be useful, |
1896 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
1897 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
1898 | +# GNU Lesser General Public License for more details. |
1899 | +# |
1900 | +# You should have received a copy of the GNU Lesser General Public License |
1901 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
1902 | +# GNU Lesser General Public License version 3 (see the file LICENSE). |
1903 | + |
1904 | +from distutils.core import setup |
1905 | +import os.path |
1906 | + |
1907 | +description = file( |
1908 | + os.path.join(os.path.dirname(__file__), 'README'), 'rb').read() |
1909 | + |
1910 | +setup(name="lp-dev-utils", |
1911 | + version="0.0.0", |
1912 | + description=\ |
1913 | + "Tools for working on or with Launchpad.", |
1914 | + long_description=description, |
1915 | + maintainer="Launchpad Developers", |
1916 | + maintainer_email="launchpad-dev@lists.launchpad.net", |
1917 | + url="https://launchpad.net/lp-dev-utils", |
1918 | + packages=['ec2test'], |
1919 | + package_dir = {'':'.'}, |
1920 | + classifiers = [ |
1921 | + 'Development Status :: 2 - Pre-Alpha', |
1922 | + 'Intended Audience :: Developers', |
1923 | + 'License :: OSI Approved :: GNU General Public License v3 (GPLv3)', |
1924 | + 'Operating System :: OS Independent', |
1925 | + 'Programming Language :: Python', |
1926 | + ], |
1927 | + install_requires = [ |
1928 | + 'zc.zservertracelog', |
1929 | + ], |
1930 | + extras_require = dict( |
1931 | + test=[ |
1932 | + 'fixtures', |
1933 | + 'testtools', |
1934 | + ] |
1935 | + ), |
1936 | + ) |
1937 | |
1938 | === added file 'test_pageperformancereport.py' |
1939 | --- test_pageperformancereport.py 1970-01-01 00:00:00 +0000 |
1940 | +++ test_pageperformancereport.py 2012-08-09 04:56:19 +0000 |
1941 | @@ -0,0 +1,486 @@ |
1942 | +# Copyright 2010 Canonical Ltd. This software is licensed under the |
1943 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
1944 | + |
1945 | +"""Test the pageperformancereport script.""" |
1946 | + |
1947 | +__metaclass__ = type |
1948 | + |
1949 | +import fixtures |
1950 | +from testtools import TestCase |
1951 | + |
1952 | +from pageperformancereport import ( |
1953 | + Category, |
1954 | + Histogram, |
1955 | + OnlineApproximateMedian, |
1956 | + OnlineStats, |
1957 | + OnlineStatsCalculator, |
1958 | + RequestTimes, |
1959 | + Stats, |
1960 | + ) |
1961 | + |
1962 | + |
1963 | +class FakeOptions: |
1964 | + timeout = 5 |
1965 | + db_file = None |
1966 | + pageids = True |
1967 | + top_urls = 3 |
1968 | + resolution = 1 |
1969 | + |
1970 | + def __init__(self, **kwargs): |
1971 | + """Assign all arguments as attributes.""" |
1972 | + self.__dict__.update(kwargs) |
1973 | + |
1974 | + |
1975 | +class FakeRequest: |
1976 | + |
1977 | + def __init__(self, url, app_seconds, sql_statements=None, |
1978 | + sql_seconds=None, pageid=None): |
1979 | + self.url = url |
1980 | + self.pageid = pageid |
1981 | + self.app_seconds = app_seconds |
1982 | + self.sql_statements = sql_statements |
1983 | + self.sql_seconds = sql_seconds |
1984 | + |
1985 | + |
1986 | +class FakeStats(Stats): |
1987 | + |
1988 | + def __init__(self, **kwargs): |
1989 | + # Override the constructor to just store the values. |
1990 | + self.__dict__.update(kwargs) |
1991 | + |
1992 | + |
1993 | +FAKE_REQUESTS = [ |
1994 | + FakeRequest('/', 0.5, pageid='+root'), |
1995 | + FakeRequest('/bugs', 4.5, 56, 3.0, pageid='+bugs'), |
1996 | + FakeRequest('/bugs', 4.2, 56, 2.2, pageid='+bugs'), |
1997 | + FakeRequest('/bugs', 5.5, 76, 4.0, pageid='+bugs'), |
1998 | + FakeRequest('/ubuntu', 2.5, 6, 2.0, pageid='+distribution'), |
1999 | + FakeRequest('/launchpad', 3.5, 3, 3.0, pageid='+project'), |
2000 | + FakeRequest('/bzr', 2.5, 4, 2.0, pageid='+project'), |
2001 | + FakeRequest('/bugs/1', 20.5, 567, 14.0, pageid='+bug'), |
2002 | + FakeRequest('/bugs/1', 15.5, 567, 9.0, pageid='+bug'), |
2003 | + FakeRequest('/bugs/5', 1.5, 30, 1.2, pageid='+bug'), |
2004 | + FakeRequest('/lazr', 1.0, 16, 0.3, pageid='+project'), |
2005 | + FakeRequest('/drizzle', 0.9, 11, 1.3, pageid='+project'), |
2006 | + ] |
2007 | + |
2008 | + |
2009 | +# The category stats computed for the above 12 requests. |
2010 | +CATEGORY_STATS = [ |
2011 | + # Median is an approximation. |
2012 | + # Real values are: 2.50, 2.20, 30 |
2013 | + (Category('All', ''), FakeStats( |
2014 | + total_hits=12, total_time=62.60, mean=5.22, median=4.20, std=5.99, |
2015 | + total_sqltime=42, mean_sqltime=3.82, median_sqltime=3.0, |
2016 | + std_sqltime=3.89, |
2017 | + total_sqlstatements=1392, mean_sqlstatements=126.55, |
2018 | + median_sqlstatements=56, std_sqlstatements=208.94, |
2019 | + histogram=[[0, 2], [1, 2], [2, 2], [3, 1], [4, 2], [5, 3]], |
2020 | + )), |
2021 | + (Category('Test', ''), FakeStats( |
2022 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [5, 0]])), |
2023 | + (Category('Bugs', ''), FakeStats( |
2024 | + total_hits=6, total_time=51.70, mean=8.62, median=4.5, std=6.90, |
2025 | + total_sqltime=33.40, mean_sqltime=5.57, median_sqltime=3, |
2026 | + std_sqltime=4.52, |
2027 | + total_sqlstatements=1352, mean_sqlstatements=225.33, |
2028 | + median_sqlstatements=56, std_sqlstatements=241.96, |
2029 | + histogram=[[0, 0], [1, 1], [2, 0], [3, 0], [4, 2], [5, 3]], |
2030 | + )), |
2031 | + ] |
2032 | + |
2033 | + |
2034 | +# The top 3 URL stats computed for the above 12 requests. |
2035 | +TOP_3_URL_STATS = [ |
2036 | + ('/bugs/1', FakeStats( |
2037 | + total_hits=2, total_time=36.0, mean=18.0, median=15.5, std=2.50, |
2038 | + total_sqltime=23.0, mean_sqltime=11.5, median_sqltime=9.0, |
2039 | + std_sqltime=2.50, |
2040 | + total_sqlstatements=1134, mean_sqlstatements=567.0, |
2041 | +        median_sqlstatements=567, std_sqlstatements=0, |
2042 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [5, 2]], |
2043 | + )), |
2044 | + ('/bugs', FakeStats( |
2045 | + total_hits=3, total_time=14.2, mean=4.73, median=4.5, std=0.56, |
2046 | + total_sqltime=9.2, mean_sqltime=3.07, median_sqltime=3, |
2047 | + std_sqltime=0.74, |
2048 | + total_sqlstatements=188, mean_sqlstatements=62.67, |
2049 | + median_sqlstatements=56, std_sqlstatements=9.43, |
2050 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 2], [5, 1]], |
2051 | + )), |
2052 | + ('/launchpad', FakeStats( |
2053 | + total_hits=1, total_time=3.5, mean=3.5, median=3.5, std=0, |
2054 | + total_sqltime=3.0, mean_sqltime=3, median_sqltime=3, std_sqltime=0, |
2055 | + total_sqlstatements=3, mean_sqlstatements=3, |
2056 | + median_sqlstatements=3, std_sqlstatements=0, |
2057 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 1], [4, 0], [5, 0]], |
2058 | + )), |
2059 | + ] |
2060 | + |
2061 | + |
2062 | +# The pageid stats computed for the above 12 requests. |
2063 | +PAGEID_STATS = [ |
2064 | + ('+bug', FakeStats( |
2065 | + total_hits=3, total_time=37.5, mean=12.5, median=15.5, std=8.04, |
2066 | + total_sqltime=24.2, mean_sqltime=8.07, median_sqltime=9, |
2067 | + std_sqltime=5.27, |
2068 | + total_sqlstatements=1164, mean_sqlstatements=388, |
2069 | + median_sqlstatements=567, std_sqlstatements=253.14, |
2070 | + histogram=[[0, 0], [1, 1], [2, 0], [3, 0], [4, 0], [5, 2]], |
2071 | + )), |
2072 | + ('+bugs', FakeStats( |
2073 | + total_hits=3, total_time=14.2, mean=4.73, median=4.5, std=0.56, |
2074 | + total_sqltime=9.2, mean_sqltime=3.07, median_sqltime=3, |
2075 | + std_sqltime=0.74, |
2076 | + total_sqlstatements=188, mean_sqlstatements=62.67, |
2077 | + median_sqlstatements=56, std_sqlstatements=9.43, |
2078 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 2], [5, 1]], |
2079 | + )), |
2080 | + ('+distribution', FakeStats( |
2081 | + total_hits=1, total_time=2.5, mean=2.5, median=2.5, std=0, |
2082 | + total_sqltime=2.0, mean_sqltime=2, median_sqltime=2, std_sqltime=0, |
2083 | + total_sqlstatements=6, mean_sqlstatements=6, |
2084 | + median_sqlstatements=6, std_sqlstatements=0, |
2085 | + histogram=[[0, 0], [1, 0], [2, 1], [3, 0], [4, 0], [5, 0]], |
2086 | + )), |
2087 | + ('+project', FakeStats( |
2088 | + total_hits=4, total_time=7.9, mean=1.98, median=1, std=1.08, |
2089 | + total_sqltime=6.6, mean_sqltime=1.65, median_sqltime=1.3, |
2090 | + std_sqltime=0.99, |
2091 | + total_sqlstatements=34, mean_sqlstatements=8.5, |
2092 | + median_sqlstatements=4, std_sqlstatements=5.32, |
2093 | + histogram=[[0, 1], [1, 1], [2, 1], [3, 1], [4, 0], [5, 0]], |
2094 | + )), |
2095 | + ('+root', FakeStats( |
2096 | + total_hits=1, total_time=0.5, mean=0.5, median=0.5, std=0, |
2097 | + histogram=[[0, 1], [1, 0], [2, 0], [3, 0], [4, 0], [5, 0]], |
2098 | + )), |
2099 | + ] |
2100 | + |
2101 | + |
2102 | +class TestRequestTimes(TestCase): |
2103 | + """Tests the RequestTimes backend.""" |
2104 | + |
2105 | + def setUp(self): |
2106 | + super(TestRequestTimes, self).setUp() |
2107 | + self.categories = [ |
2108 | + Category('All', '.*'), Category('Test', '.*test.*'), |
2109 | + Category('Bugs', '.*bugs.*')] |
2110 | + self.db = RequestTimes(self.categories, FakeOptions()) |
2111 | + self.useFixture(fixtures.LoggerFixture()) |
2112 | + |
2113 | + def setUpRequests(self): |
2114 | + """Insert some requests into the db.""" |
2115 | + for r in FAKE_REQUESTS: |
2116 | + self.db.add_request(r) |
2117 | + |
2118 | + def assertStatsAreEquals(self, expected, results): |
2119 | + self.assertEquals( |
2120 | + len(expected), len(results), 'Wrong number of results') |
2121 | + for idx in range(len(results)): |
2122 | + self.assertEquals(expected[idx][0], results[idx][0], |
2123 | + "Wrong key for results %d" % idx) |
2124 | + key = results[idx][0] |
2125 | + self.assertEquals(expected[idx][1].text(), results[idx][1].text(), |
2126 | + "Wrong stats for results %d (%s)" % (idx, key)) |
2127 | + self.assertEquals( |
2128 | + Histogram.from_bins_data(expected[idx][1].histogram), |
2129 | + results[idx][1].histogram, |
2130 | + "Wrong histogram for results %d (%s)" % (idx, key)) |
2131 | + |
2132 | + def test_get_category_times(self): |
2133 | + self.setUpRequests() |
2134 | + category_times = self.db.get_category_times() |
2135 | + self.assertStatsAreEquals(CATEGORY_STATS, category_times) |
2136 | + |
2137 | + def test_get_url_times(self): |
2138 | + self.setUpRequests() |
2139 | + url_times = self.db.get_top_urls_times() |
2140 | + self.assertStatsAreEquals(TOP_3_URL_STATS, url_times) |
2141 | + |
2142 | + def test_get_pageid_times(self): |
2143 | + self.setUpRequests() |
2144 | + pageid_times = self.db.get_pageid_times() |
2145 | + self.assertStatsAreEquals(PAGEID_STATS, pageid_times) |
2146 | + |
2147 | + def test___add__(self): |
2148 | +        # Ensure that adding two RequestTimes together results in |
2149 | +        # a merge of their constituent data. |
2150 | + db1 = self.db |
2151 | + db2 = RequestTimes(self.categories, FakeOptions()) |
2152 | + db1.add_request(FakeRequest('/', 1.5, 5, 1.0, '+root')) |
2153 | + db1.add_request(FakeRequest('/bugs', 3.5, 15, 1.0, '+bugs')) |
2154 | + db2.add_request(FakeRequest('/bugs/1', 5.0, 30, 4.0, '+bug')) |
2155 | + results = db1 + db2 |
2156 | + self.assertEquals(3, results.category_times[0][1].total_hits) |
2157 | + self.assertEquals(0, results.category_times[1][1].total_hits) |
2158 | + self.assertEquals(2, results.category_times[2][1].total_hits) |
2159 | + self.assertEquals(1, results.pageid_times['+root'].total_hits) |
2160 | + self.assertEquals(1, results.pageid_times['+bugs'].total_hits) |
2161 | + self.assertEquals(1, results.pageid_times['+bug'].total_hits) |
2162 | + self.assertEquals(1, results.url_times['/'].total_hits) |
2163 | + self.assertEquals(1, results.url_times['/bugs'].total_hits) |
2164 | + self.assertEquals(1, results.url_times['/bugs/1'].total_hits) |
2165 | + |
2166 | + def test_histogram_init_with_resolution(self): |
2167 | +        # Test that the resolution parameter increases the number of bins |
2168 | + db = RequestTimes( |
2169 | + self.categories, FakeOptions(timeout=4, resolution=1)) |
2170 | + self.assertEquals(5, db.histogram_width) |
2171 | + self.assertEquals(1, db.histogram_resolution) |
2172 | + db = RequestTimes( |
2173 | + self.categories, FakeOptions(timeout=4, resolution=0.5)) |
2174 | + self.assertEquals(9, db.histogram_width) |
2175 | + self.assertEquals(0.5, db.histogram_resolution) |
2176 | + db = RequestTimes( |
2177 | + self.categories, FakeOptions(timeout=4, resolution=2)) |
2178 | + self.assertEquals(3, db.histogram_width) |
2179 | + self.assertEquals(2, db.histogram_resolution) |
2180 | + |
2181 | + |
2182 | +class TestOnlineStats(TestCase): |
2183 | + """Tests for the OnlineStats class.""" |
2184 | + |
2185 | + def test___add__(self): |
2186 | +        # Ensure that adding two OnlineStats merges all their constituent data. |
2187 | + stats1 = OnlineStats(4, 1) |
2188 | + stats1.update(FakeRequest('/', 2.0, 5, 1.5)) |
2189 | + stats2 = OnlineStats(4, 1) |
2190 | + stats2.update(FakeRequest('/', 1.5, 2, 3.0)) |
2191 | + stats2.update(FakeRequest('/', 5.0, 2, 2.0)) |
2192 | + results = stats1 + stats2 |
2193 | + self.assertEquals(3, results.total_hits) |
2194 | + self.assertEquals(2, results.median) |
2195 | + self.assertEquals(9, results.total_sqlstatements) |
2196 | + self.assertEquals(2, results.median_sqlstatements) |
2197 | + self.assertEquals(6.5, results.total_sqltime) |
2198 | + self.assertEquals(2.0, results.median_sqltime) |
2199 | + self.assertEquals( |
2200 | + Histogram.from_bins_data([[0, 0], [1, 1], [2, 1], [3, 1]]), |
2201 | + results.histogram) |
2202 | + |
2203 | + |
2204 | +class TestOnlineStatsCalculator(TestCase): |
2205 | + """Tests for the online stats calculator.""" |
2206 | + |
2207 | + def setUp(self): |
2208 | + TestCase.setUp(self) |
2209 | + self.stats = OnlineStatsCalculator() |
2210 | + |
2211 | + def test_stats_for_empty_set(self): |
2212 | + # Test the stats when there is no input. |
2213 | + self.assertEquals(0, self.stats.count) |
2214 | + self.assertEquals(0, self.stats.sum) |
2215 | + self.assertEquals(0, self.stats.mean) |
2216 | + self.assertEquals(0, self.stats.variance) |
2217 | + self.assertEquals(0, self.stats.std) |
2218 | + |
2219 | + def test_stats_for_one_value(self): |
2220 | + # Test the stats when adding one element. |
2221 | + self.stats.update(5) |
2222 | + self.assertEquals(1, self.stats.count) |
2223 | + self.assertEquals(5, self.stats.sum) |
2224 | + self.assertEquals(5, self.stats.mean) |
2225 | + self.assertEquals(0, self.stats.variance) |
2226 | + self.assertEquals(0, self.stats.std) |
2227 | + |
2228 | + def test_None_are_ignored(self): |
2229 | + self.stats.update(None) |
2230 | + self.assertEquals(0, self.stats.count) |
2231 | + |
2232 | + def test_stats_for_3_values(self): |
2233 | + for x in [3, 6, 9]: |
2234 | + self.stats.update(x) |
2235 | + self.assertEquals(3, self.stats.count) |
2236 | + self.assertEquals(18, self.stats.sum) |
2237 | + self.assertEquals(6, self.stats.mean) |
2238 | + self.assertEquals(6, self.stats.variance) |
2239 | + self.assertEquals("2.45", "%.2f" % self.stats.std) |
2240 | + |
2241 | + def test___add___two_empty_together(self): |
2242 | + stats2 = OnlineStatsCalculator() |
2243 | + results = self.stats + stats2 |
2244 | + self.assertEquals(0, results.count) |
2245 | + self.assertEquals(0, results.sum) |
2246 | + self.assertEquals(0, results.mean) |
2247 | + self.assertEquals(0, results.variance) |
2248 | + |
2249 | + def test___add___one_empty(self): |
2250 | + stats2 = OnlineStatsCalculator() |
2251 | + for x in [1, 2, 3]: |
2252 | + self.stats.update(x) |
2253 | + results = self.stats + stats2 |
2254 | + self.assertEquals(3, results.count) |
2255 | + self.assertEquals(6, results.sum) |
2256 | + self.assertEquals(2, results.mean) |
2257 | + self.assertEquals(2, results.M2) |
2258 | + |
2259 | + def test___add__(self): |
2260 | + stats2 = OnlineStatsCalculator() |
2261 | + for x in [3, 6, 9]: |
2262 | + self.stats.update(x) |
2263 | + for x in [1, 2, 3]: |
2264 | + stats2.update(x) |
2265 | + results = self.stats + stats2 |
2266 | + self.assertEquals(6, results.count) |
2267 | + self.assertEquals(24, results.sum) |
2268 | + self.assertEquals(4, results.mean) |
2269 | + self.assertEquals(44, results.M2) |
2270 | + |
2271 | + |
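The `test___add__` figures above — count 6, sum 24, mean 4, M2 44 for [3, 6, 9] merged with [1, 2, 3] — are exactly what the standard parallel-variance merge produces. A minimal sketch, assuming Welford-style accumulators (the class and attribute names here are stand-ins, not the diff's `OnlineStatsCalculator`):

```python
class RunningStats(object):
    """Welford-style online mean/variance with a parallel merge."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0   # running sum of squared deviations from the mean

    def update(self, value):
        if value is None:
            return                       # None inputs are skipped
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def merge(self, other):
        """Combine two independently accumulated streams."""
        merged = RunningStats()
        merged.count = self.count + other.count
        if merged.count == 0:
            return merged
        delta = other.mean - self.mean
        merged.mean = (
            self.count * self.mean + other.count * other.mean) / merged.count
        # Parallel form: M2 = M2a + M2b + delta^2 * na*nb / (na+nb)
        merged.m2 = (
            self.m2 + other.m2
            + delta * delta * self.count * other.count / merged.count)
        return merged

    @property
    def variance(self):
        return self.m2 / self.count if self.count else 0.0
```

Feeding [3, 6, 9] into one accumulator and [1, 2, 3] into another, then merging, reproduces the M2 of 44 asserted in the test above.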
2272 | +SHUFFLE_RANGE_100 = [ |
2273 | + 25, 79, 99, 76, 60, 63, 87, 77, 51, 82, 42, 96, 93, 58, 32, 66, 75, |
2274 | + 2, 26, 22, 11, 73, 61, 83, 65, 68, 44, 81, 64, 3, 33, 34, 15, 1, |
2275 | + 92, 27, 90, 74, 46, 57, 59, 31, 13, 19, 89, 29, 56, 94, 50, 49, 62, |
2276 | + 37, 21, 35, 5, 84, 88, 16, 8, 23, 40, 6, 48, 10, 97, 0, 53, 17, 30, |
2277 | + 18, 43, 86, 12, 71, 38, 78, 36, 7, 45, 47, 80, 54, 39, 91, 98, 24, |
2278 | + 55, 14, 52, 20, 69, 85, 95, 28, 4, 9, 67, 70, 41, 72, |
2279 | + ] |
2280 | + |
2281 | + |
2282 | +class TestOnlineApproximateMedian(TestCase): |
2283 | + """Tests for the approximate median computation.""" |
2284 | + |
2285 | + def setUp(self): |
2286 | + TestCase.setUp(self) |
2287 | + self.estimator = OnlineApproximateMedian() |
2288 | + |
2289 | + def test_median_is_0_when_no_input(self): |
2290 | + self.assertEquals(0, self.estimator.median) |
2291 | + |
2292 | + def test_median_is_true_median_for_n_lower_than_bucket_size(self): |
2293 | + for x in range(9): |
2294 | + self.estimator.update(x) |
2295 | + self.assertEquals(4, self.estimator.median) |
2296 | + |
2297 | + def test_None_input_is_ignored(self): |
2298 | + self.estimator.update(1) |
2299 | + self.estimator.update(None) |
2300 | + self.assertEquals(1, self.estimator.median) |
2301 | + |
2302 | + def test_approximate_median_is_good_enough(self): |
2303 | + for x in SHUFFLE_RANGE_100: |
2304 | + self.estimator.update(x) |
2305 | + # True median is 50, 49 is good enough :-) |
2306 | + self.assertIn(self.estimator.median, range(49,52)) |
2307 | + |
2308 | + def test___add__(self): |
2309 | + median1 = OnlineApproximateMedian(3) |
2310 | + median1.buckets = [[1, 3], [4, 5], [6, 3]] |
2311 | + median2 = OnlineApproximateMedian(3) |
2312 | + median2.buckets = [[], [3, 6], [3, 7]] |
2313 | + results = median1 + median2 |
2314 | + self.assertEquals([[1, 3], [6], [3, 7], [4]], results.buckets) |
2315 | + |
2316 | + |
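The approximate-median tests above fit a remedian-style estimator: values fill fixed-size buckets, a full bucket is replaced by its exact median promoted one level up, and the reported value is the median of whatever is still held. A minimal sketch under that assumption — the bucket size and promotion rule here are illustrative, not necessarily the diff's exact `OnlineApproximateMedian`:

```python
def exact_median(values):
    """Exact median: midpoint of the sorted values."""
    s = sorted(values)
    n = len(s)
    if n % 2:
        return s[n // 2]
    return (s[n // 2 - 1] + s[n // 2]) / 2.0


class ApproxMedian(object):
    """Remedian-style online median estimate over cascading buckets."""

    def __init__(self, bucket_size=9):
        self.bucket_size = bucket_size
        self.buckets = [[]]

    def update(self, value, level=0):
        if value is None:
            return                       # None input is ignored
        if level == len(self.buckets):
            self.buckets.append([])      # grow a new cascade level
        self.buckets[level].append(value)
        if len(self.buckets[level]) == self.bucket_size:
            full = self.buckets[level]
            self.buckets[level] = []     # promote the bucket's median
            self.update(exact_median(full), level + 1)

    @property
    def median(self):
        held = [v for bucket in self.buckets for v in bucket]
        if not held:
            return 0                     # matches "0 when no input"
        return exact_median(held)
```

For fewer values than one bucket holds, the estimate degenerates to the true median, which is why the test above can assert an exact 4 for `range(9)`.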
2317 | +class TestHistogram(TestCase): |
2318 | + """Test the histogram computation.""" |
2319 | + |
2320 | + def test__init__(self): |
2321 | + hist = Histogram(4, 1) |
2322 | + self.assertEquals(4, hist.bins_count) |
2323 | + self.assertEquals(1, hist.bins_size) |
2324 | + self.assertEquals([[0, 0], [1, 0], [2, 0], [3, 0]], hist.bins) |
2325 | + |
2326 | + def test__init__bins_size_float(self): |
2327 | + hist = Histogram(9, 0.5) |
2328 | + self.assertEquals(9, hist.bins_count) |
2329 | + self.assertEquals(0.5, hist.bins_size) |
2330 | + self.assertEquals( |
2331 | + [[0, 0], [0.5, 0], [1.0, 0], [1.5, 0], |
2332 | + [2.0, 0], [2.5, 0], [3.0, 0], [3.5, 0], [4.0, 0]], hist.bins) |
2333 | + |
2334 | + def test_update(self): |
2335 | + hist = Histogram(4, 1) |
2336 | + hist.update(1) |
2337 | + self.assertEquals(1, hist.count) |
2338 | + self.assertEquals([[0, 0], [1, 1], [2, 0], [3, 0]], hist.bins) |
2339 | + |
2340 | + hist.update(1.3) |
2341 | + self.assertEquals(2, hist.count) |
2342 | + self.assertEquals([[0, 0], [1, 2], [2, 0], [3, 0]], hist.bins) |
2343 | + |
2344 | + def test_update_float_bin_size(self): |
2345 | + hist = Histogram(4, 0.5) |
2346 | + hist.update(1.3) |
2347 | + self.assertEquals([[0, 0], [0.5, 0], [1.0, 1], [1.5, 0]], hist.bins) |
2348 | + hist.update(0.5) |
2349 | + self.assertEquals([[0, 0], [0.5, 1], [1.0, 1], [1.5, 0]], hist.bins) |
2350 | + hist.update(0.6) |
2351 | + self.assertEquals([[0, 0], [0.5, 2], [1.0, 1], [1.5, 0]], hist.bins) |
2352 | + |
2353 | + def test_update_max_goes_in_last_bin(self): |
2354 | + hist = Histogram(4, 1) |
2355 | + hist.update(9) |
2356 | + self.assertEquals([[0, 0], [1, 0], [2, 0], [3, 1]], hist.bins) |
2357 | + |
2358 | + def test_bins_relative(self): |
2359 | + hist = Histogram(4, 1) |
2360 | + for x in range(4): |
2361 | + hist.update(x) |
2362 | + self.assertEquals( |
2363 | + [[0, 0.25], [1, 0.25], [2, 0.25], [3, 0.25]], hist.bins_relative) |
2364 | + |
2365 | + def test_from_bins_data(self): |
2366 | + hist = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
2367 | + self.assertEquals(4, hist.bins_count) |
2368 | + self.assertEquals(1, hist.bins_size) |
2369 | + self.assertEquals(6, hist.count) |
2370 | + self.assertEquals([[0, 1], [1, 3], [2, 1], [3, 1]], hist.bins) |
2371 | + |
2372 | + def test___repr__(self): |
2373 | + hist = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
2374 | + self.assertEquals( |
2375 | + "<Histogram [[0, 1], [1, 3], [2, 1], [3, 1]]>", repr(hist)) |
2376 | + |
2377 | + def test___eq__(self): |
2378 | + hist1 = Histogram(4, 1) |
2379 | + hist2 = Histogram(4, 1) |
2380 | + self.assertEquals(hist1, hist2) |
2381 | + |
2382 | + def test__eq___with_data(self): |
2383 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
2384 | + hist2 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
2385 | + self.assertEquals(hist1, hist2) |
2386 | + |
2387 | + def test___add__(self): |
2388 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
2389 | + hist2 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
2390 | + hist3 = Histogram.from_bins_data([[0, 2], [1, 6], [2, 2], [3, 2]]) |
2391 | + total = hist1 + hist2 |
2392 | + self.assertEquals(hist3, total) |
2393 | + self.assertEquals(12, total.count) |
2394 | + |
2395 | + def test___add___uses_widest(self): |
2396 | + # Make sure that the resulting histogram is as wide as the widest one. |
2397 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
2398 | + hist2 = Histogram.from_bins_data( |
2399 | + [[0, 1], [1, 3], [2, 1], [3, 1], [4, 2], [5, 3]]) |
2400 | + hist3 = Histogram.from_bins_data( |
2401 | + [[0, 2], [1, 6], [2, 2], [3, 2], [4, 2], [5, 3]]) |
2402 | + self.assertEquals(hist3, hist1 + hist2) |
2403 | + |
2404 | + def test___add___interpolate_lower_resolution(self): |
2405 | + # Make sure that when the other histogram has a bigger bin_size |
2406 | + # the frequency is correctly split across the different bins. |
2407 | + hist1 = Histogram.from_bins_data( |
2408 | + [[0, 1], [0.5, 3], [1.0, 1], [1.5, 1]]) |
2409 | + hist2 = Histogram.from_bins_data( |
2410 | + [[0, 1], [1, 2], [2, 3], [3, 1], [4, 1]]) |
2411 | + |
2412 | + hist3 = Histogram.from_bins_data( |
2413 | + [[0, 1.5], [0.5, 3.5], [1.0, 2], [1.5, 2], |
2414 | + [2.0, 1.5], [2.5, 1.5], [3.0, 0.5], [3.5, 0.5], |
2415 | + [4.0, 0.5], [4.5, 0.5]]) |
2416 | + self.assertEquals(hist3, hist1 + hist2) |
2417 | + |
2418 | + def test___add___higher_resolution(self): |
2419 | + # Make sure that when the other histogram has a smaller bin_size |
2420 | + # the frequency is correctly added. |
2421 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 2], [2, 3]]) |
2422 | + hist2 = Histogram.from_bins_data( |
2423 | + [[0, 1], [0.5, 3], [1.0, 1], [1.5, 1], [2.0, 3], [2.5, 1], |
2424 | + [3, 4], [3.5, 2]]) |
2425 | + |
2426 | + hist3 = Histogram.from_bins_data([[0, 5], [1, 4], [2, 7], [3, 6]]) |
2427 | + self.assertEquals(hist3, hist1 + hist2) |
2428 | |
2429 | === added file 'versions.cfg' |
2430 | --- versions.cfg 1970-01-01 00:00:00 +0000 |
2431 | +++ versions.cfg 2012-08-09 04:56:19 +0000 |
2432 | @@ -0,0 +1,111 @@ |
2433 | +[buildout] |
2434 | +versions = versions |
2435 | + |
2436 | +[versions] |
2437 | +# Alphabetical, case-insensitive, please! :-) |
2438 | +fixtures = 0.3.9 |
2439 | +pytz = 2012c |
2440 | +RestrictedPython = 3.5.1 |
2441 | +setuptools = 0.6c11 |
2442 | +testtools = 0.9.14 |
2443 | +transaction = 1.0.0 |
2444 | +# Also upgrade the zc.buildout version in the Makefile's bin/buildout section. |
2445 | +zc.buildout = 1.5.1 |
2446 | +zc.lockfile = 1.0.0 |
2447 | +zc.recipe.egg = 1.3.2 |
2448 | +z3c.recipe.scripts = 1.0.1 |
2449 | +zc.zservertracelog = 1.1.5 |
2450 | +ZConfig = 2.9.1dev-20110728 |
2451 | +zdaemon = 2.0.4 |
2452 | +ZODB3 = 3.9.2 |
2453 | +zope.annotation = 3.5.0 |
2454 | +zope.app.applicationcontrol = 3.5.1 |
2455 | +zope.app.appsetup = 3.12.0 |
2456 | +zope.app.authentication = 3.6.1 |
2457 | +zope.app.basicskin = 3.4.1 |
2458 | +zope.app.component = 3.8.3 |
2459 | +zope.app.container = 3.8.0 |
2460 | +zope.app.form = 3.8.1 |
2461 | +zope.app.pagetemplate = 3.7.1 |
2462 | +zope.app.publication = 3.9.0 |
2463 | +zope.app.publisher = 3.10.0 |
2464 | +zope.app.server = 3.4.2 |
2465 | +zope.app.wsgi = 3.6.0 |
2466 | +zope.authentication = 3.7.0 |
2467 | +zope.broken = 3.5.0 |
2468 | +zope.browser = 1.2 |
2469 | +zope.browsermenu = 3.9.0 |
2470 | +zope.browserpage = 3.9.0 |
2471 | +zope.browserresource = 3.9.0 |
2472 | +zope.cachedescriptors = 3.5.0 |
2473 | +zope.component = 3.9.3 |
2474 | +zope.componentvocabulary = 1.0 |
2475 | +zope.configuration = 3.6.0 |
2476 | +zope.container = 3.9.0 |
2477 | +zope.contenttype = 3.5.0 |
2478 | +zope.copy = 3.5.0 |
2479 | +zope.copypastemove = 3.5.2 |
2480 | +zope.datetime = 3.4.0 |
2481 | +zope.deferredimport = 3.5.0 |
2482 | +zope.deprecation = 3.4.0 |
2483 | +zope.dottedname = 3.4.6 |
2484 | +zope.dublincore = 3.5.0 |
2485 | +zope.error = 3.7.0 |
2486 | +zope.event = 3.4.1 |
2487 | +zope.exceptions = 3.5.2 |
2488 | +zope.filerepresentation = 3.5.0 |
2489 | +zope.formlib = 3.6.0 |
2490 | +zope.hookable = 3.4.1 |
2491 | +zope.i18n = 3.7.1 |
2492 | +zope.i18nmessageid = 3.5.0 |
2493 | +zope.interface = 3.5.2 |
2494 | +zope.lifecycleevent = 3.5.2 |
2495 | +zope.location = 3.7.0 |
2496 | +zope.minmax = 1.1.1 |
2497 | +# Build of lp:~wallyworld/zope.pagetemplate/fix-isinstance |
2498 | +# This version adds a small change to the traversal logic so that the |
2499 | +# optimisation which applies if the object is a dict also works for subclasses |
2500 | +# of dict. The change has been approved for merge into the official zope code |
2501 | +# base. This patch is a temporary fix until the next official release. |
2502 | +zope.pagetemplate = 3.5.0-p1 |
2503 | +zope.password = 3.5.1 |
2504 | +zope.processlifetime = 1.0 |
2505 | +zope.proxy = 3.5.0 |
2506 | +zope.ptresource = 3.9.0 |
2507 | +zope.publisher = 3.12.0 |
2508 | +zope.schema = 3.5.4 |
2509 | +zope.security = 3.7.1 |
2510 | +zope.server = 3.6.1 |
2511 | +zope.session = 3.9.1 |
2512 | +zope.site = 3.7.0 |
2513 | +zope.size = 3.4.1 |
2514 | +zope.tal = 3.5.1 |
2515 | +zope.tales = 3.4.0 |
2516 | +# p1 Build of lp:~mars/zope.testing/3.9.4-p1. Fixes bugs 570380 and 587886. |
2517 | +# p2 With patch for thread leaks to make them skips, fixes windmill errors |
2518 | +# with 'new threads' in hudson/ec2 builds. |
2519 | +# p3 And always tear down layers, because that's the Right Thing To Do. |
2520 | +# p4 fixes --subunit --list to really just list the tests. |
2521 | +# p5 Build of lp:~launchpad/zope.testing/3.9.4-p5. Fixes bug #609986. |
2522 | +# p6 reinstates fix from p4. Build of lp:~launchpad/zope.testing/3.9.4-fork |
2523 | +# revision 26. |
2524 | +# p7 was unused |
2525 | +# p8 redirects stdout and stderr to a black hole device when --subunit is used |
2526 | +# p9 adds the redirection of __stderr__ to a black hole device |
2527 | +# p10 changed the test reporting to use test.id() rather than |
2528 | +# str(test) since only the id is unique. |
2529 | +# p11 reverts p9. |
2530 | +# p12 reverts p11, restoring p9. |
2531 | +# p13 Add a new --require-unique flag to the testrunner. When set, |
2532 | +# this will cause the testrunner to check all tests IDs to ensure they |
2533 | +# haven't been loaded before. If it encounters a duplicate, it will |
2534 | +# raise an error and quit. |
2535 | +# p14 Adds test data written to stderr and stdout into the subunit output. |
2536 | +# p15 Fixed internal tests. |
2537 | +# p16 Adds support for skips in Python 2.7. |
2538 | +# p17 Fixes skip support for Python 2.6. |
2539 | +# To build (use Python 2.6) run "python bootstrap.py; ./bin/buildout". Then to |
2540 | +# build the distribution run "bin/buildout setup . sdist" |
2541 | +# Make sure you have subunit installed. |
2542 | +zope.testing = 3.9.4-p17 |
2543 | +zope.traversing = 3.8.0 |
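The added test suite pins down the Histogram semantics fairly precisely: fixed-width bins, values at or beyond the range clamped into the last bin, and an addition operator that keeps the left operand's bin size while re-binning the other histogram (splitting coarser bins evenly, aggregating finer ones). A minimal sketch satisfying those expectations might look like the following. This is a hypothetical reconstruction for illustration only — the real class lives in pageperformancereport.py, which is not shown in this hunk, and any detail beyond what the tests assert is an assumption.

```python
class Histogram:
    """Hypothetical reconstruction of the Histogram the new tests exercise.

    Not the actual pageperformancereport.py implementation; behavior is
    inferred solely from the assertions in test_pageperformancereport.py.
    """

    def __init__(self, bins_count, bins_size):
        self.bins_count = bins_count
        self.bins_size = bins_size
        self.count = 0
        # Each bin is [lower_edge, frequency].
        self.bins = [[i * bins_size, 0] for i in range(bins_count)]

    @classmethod
    def from_bins_data(cls, data):
        # Infer the bin size from the first two edges; assumes >= 2 bins.
        hist = cls(len(data), data[1][0] - data[0][0])
        hist.count = sum(freq for _, freq in data)
        hist.bins = [list(pair) for pair in data]
        return hist

    def update(self, value):
        # Values past the top edge are clamped into the last bin.
        self.count += 1
        index = min(int(value / self.bins_size), self.bins_count - 1)
        self.bins[index][1] += 1

    @property
    def bins_relative(self):
        return [[edge, freq / self.count] for edge, freq in self.bins]

    def __eq__(self, other):
        return self.bins == other.bins

    def __repr__(self):
        return "<Histogram %r>" % self.bins

    def __add__(self, other):
        # The result keeps self's bin size and spans the wider histogram.
        span = max(self.bins_count * self.bins_size,
                   other.bins_count * other.bins_size)
        total = Histogram(int(round(span / self.bins_size)), self.bins_size)
        total.count = self.count + other.count
        for edge, freq in self.bins:
            total.bins[int(round(edge / self.bins_size))][1] += freq
        if other.bins_size >= self.bins_size:
            # Coarser (or equal) bins: split each frequency evenly
            # across the finer bins it covers.
            n = int(round(other.bins_size / self.bins_size))
            for edge, freq in other.bins:
                start = int(round(edge / self.bins_size))
                for j in range(n):
                    total.bins[start + j][1] += freq / n
        else:
            # Finer bins: aggregate into the enclosing coarse bin.
            for edge, freq in other.bins:
                total.bins[int(edge / self.bins_size)][1] += freq
        return total
```

The asymmetry in `__add__` (the left operand's resolution wins) matches both interpolation tests: a coarser right operand is interpolated down, a finer one is summed up.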
I don't condone the buildout approach, but OK.