Merge lp:~lifeless/lp-dev-utils/ppr into lp:lp-dev-utils
Proposed by Robert Collins on 2012-08-09
| Status: | Merged |
|---|---|
| Approved by: | Robert Collins on 2012-08-09 |
| Approved revision: | 124 |
| Merged at revision: | 124 |
| Proposed branch: | lp:~lifeless/lp-dev-utils/ppr |
| Merge into: | lp:lp-dev-utils |
| Diff against target: | 2543 lines (+2463/-2), 13 files modified: .bzrignore (+7/-0), .testr.conf (+3/-2), Makefile (+13/-0), README (+7/-0), bootstrap.py (+259/-0), buildout.cfg (+38/-0), page-performance-report-daily.sh (+115/-0), page-performance-report.ini (+79/-0), page-performance-report.py (+18/-0), pageperformancereport.py (+1277/-0), setup.py (+50/-0), test_pageperformancereport.py (+486/-0), versions.cfg (+111/-0) |
| To merge this branch: | bzr merge lp:~lifeless/lp-dev-utils/ppr |
| Related bugs: | |
| Reviewer | Review Type | Date Requested | Status |
|---|---|---|---|
| William Grant | code | 2012-08-09 | Approve on 2012-08-09 |
Review via email:
Description of the Change
This branch:
- updates the .testr.conf to support parallel tests.
- adds buildout so we can use zc packages such as zc.zservertracelog (but keeps it optional).
- migrates the page performance report tooling (pageperformancereport.py and its driver scripts) into this tree.
Robert Collins (lifeless) wrote:
(For clarity, william doesn't condone it anywhere :P)
Preview Diff
| 1 | === modified file '.bzrignore' |
| 2 | --- .bzrignore 2012-04-13 15:33:03 +0000 |
| 3 | +++ .bzrignore 2012-08-09 04:56:19 +0000 |
| 4 | @@ -1,3 +1,10 @@ |
| 5 | .launchpadlib |
| 6 | _trial_temp |
| 7 | .testrepository |
| 8 | +.installed.cfg |
| 9 | +eggs |
| 10 | +download-cache |
| 11 | +lp_dev_utils.egg-info |
| 12 | +parts |
| 13 | +bin |
| 14 | +develop-eggs |
| 15 | |
| 16 | === modified file '.testr.conf' |
| 17 | --- .testr.conf 2012-04-13 15:08:57 +0000 |
| 18 | +++ .testr.conf 2012-08-09 04:56:19 +0000 |
| 19 | @@ -1,3 +1,4 @@ |
| 20 | [DEFAULT] |
| 21 | -test_command=PYTHONPATH=.:$PYTHONPATH python -m subunit.run discover $IDLIST |
| 22 | -test_id_list_default=ec2test |
| 23 | +test_command=${PYTHON:-python} -m subunit.run discover $LISTOPT $IDOPTION . |
| 24 | +test_id_option=--load-list $IDFILE |
| 25 | +test_list_option=--list |
| 26 | |
| 27 | === added file 'Makefile' |
| 28 | --- Makefile 1970-01-01 00:00:00 +0000 |
| 29 | +++ Makefile 2012-08-09 04:56:19 +0000 |
| 30 | @@ -0,0 +1,13 @@ |
| 31 | +all: |
| 32 | + |
| 33 | +bin/buildout: buildout.cfg versions.cfg setup.py download-cache eggs |
| 34 | + ./bootstrap.py \ |
| 35 | + --setup-source=download-cache/ez_setup.py \ |
| 36 | + --download-base=download-cache/dist --eggs=eggs |
| 37 | + |
| 38 | + |
| 39 | +download-cache: |
| 40 | + bzr checkout --lightweight lp:lp-source-dependencies download-cache |
| 41 | + |
| 42 | +eggs: |
| 43 | + mkdir eggs |
| 44 | |
| 45 | === modified file 'README' |
| 46 | --- README 2012-04-13 15:08:57 +0000 |
| 47 | +++ README 2012-08-09 04:56:19 +0000 |
| 48 | @@ -1,6 +1,7 @@ |
| 49 | ============== |
| 50 | lp-dev-utils |
| 51 | ============== |
| 52 | + |
| 53 | Tools for hacking on Launchpad |
| 54 | ============================== |
| 55 | |
| 56 | @@ -40,3 +41,9 @@ |
| 57 | Ran 84 (+84) tests in 51.723s (+51.651s) |
| 58 | FAILED (id=1) |
| 59 | |
| 60 | +To run the pageperformancereport tests, zc.zservertracelog is needed; it is |
| 61 | +best obtained via buildout:: |
| 62 | + |
| 63 | + $ make bin/buildout |
| 64 | + $ bin/buildout |
| 65 | + $ PYTHON=bin/py testr run |
| 66 | |
| 67 | === added file 'bootstrap.py' |
| 68 | --- bootstrap.py 1970-01-01 00:00:00 +0000 |
| 69 | +++ bootstrap.py 2012-08-09 04:56:19 +0000 |
| 70 | @@ -0,0 +1,259 @@ |
| 71 | +#!/usr/bin/env python |
| 72 | +############################################################################## |
| 73 | +# |
| 74 | +# Copyright (c) 2006 Zope Foundation and Contributors. |
| 75 | +# All Rights Reserved. |
| 76 | +# |
| 77 | +# This software is subject to the provisions of the Zope Public License, |
| 78 | +# Version 2.1 (ZPL). A copy of the ZPL should accompany this distribution. |
| 79 | +# THIS SOFTWARE IS PROVIDED "AS IS" AND ANY AND ALL EXPRESS OR IMPLIED |
| 80 | +# WARRANTIES ARE DISCLAIMED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED |
| 81 | +# WARRANTIES OF TITLE, MERCHANTABILITY, AGAINST INFRINGEMENT, AND FITNESS |
| 82 | +# FOR A PARTICULAR PURPOSE. |
| 83 | +# |
| 84 | +############################################################################## |
| 85 | +"""Bootstrap a buildout-based project |
| 86 | + |
| 87 | +Simply run this script in a directory containing a buildout.cfg. |
| 88 | +The script accepts buildout command-line options, so you can |
| 89 | +use the -c option to specify an alternate configuration file. |
| 90 | +""" |
| 91 | + |
| 92 | +import os, shutil, sys, tempfile, textwrap, urllib, urllib2, subprocess |
| 93 | +from optparse import OptionParser |
| 94 | + |
| 95 | +if sys.platform == 'win32': |
| 96 | + def quote(c): |
| 97 | + if ' ' in c: |
| 98 | + return '"%s"' % c # work around spawn lamosity on windows |
| 99 | + else: |
| 100 | + return c |
| 101 | +else: |
| 102 | + quote = str |
| 103 | + |
| 104 | +# See zc.buildout.easy_install._has_broken_dash_S for motivation and comments. |
| 105 | +stdout, stderr = subprocess.Popen( |
| 106 | + [sys.executable, '-Sc', |
| 107 | + 'try:\n' |
| 108 | + ' import ConfigParser\n' |
| 109 | + 'except ImportError:\n' |
| 110 | + ' print 1\n' |
| 111 | + 'else:\n' |
| 112 | + ' print 0\n'], |
| 113 | + stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate() |
| 114 | +has_broken_dash_S = bool(int(stdout.strip())) |
| 115 | + |
| 116 | +# In order to be more robust in the face of system Pythons, we want to |
| 117 | +# run without site-packages loaded. This is somewhat tricky, in |
| 118 | +# particular because Python 2.6's distutils imports site, so starting |
| 119 | +# with the -S flag is not sufficient. However, we'll start with that: |
| 120 | +if not has_broken_dash_S and 'site' in sys.modules: |
| 121 | + # We will restart with python -S. |
| 122 | + args = sys.argv[:] |
| 123 | + args[0:0] = [sys.executable, '-S'] |
| 124 | + args = map(quote, args) |
| 125 | + os.execv(sys.executable, args) |
| 126 | +# Now we are running with -S. We'll get the clean sys.path, import site |
| 127 | +# because distutils will do it later, and then reset the path and clean |
| 128 | +# out any namespace packages from site-packages that might have been |
| 129 | +# loaded by .pth files. |
| 130 | +clean_path = sys.path[:] |
| 131 | +import site |
| 132 | +sys.path[:] = clean_path |
| 133 | +for k, v in sys.modules.items(): |
| 134 | + if (hasattr(v, '__path__') and |
| 135 | + len(v.__path__)==1 and |
| 136 | + not os.path.exists(os.path.join(v.__path__[0],'__init__.py'))): |
| 137 | + # This is a namespace package. Remove it. |
| 138 | + sys.modules.pop(k) |
| 139 | + |
| 140 | +is_jython = sys.platform.startswith('java') |
| 141 | + |
| 142 | +setuptools_source = 'http://peak.telecommunity.com/dist/ez_setup.py' |
| 143 | +distribute_source = 'http://python-distribute.org/distribute_setup.py' |
| 144 | + |
| 145 | +# parsing arguments |
| 146 | +def normalize_to_url(option, opt_str, value, parser): |
| 147 | + if value: |
| 148 | + if '://' not in value: # It doesn't smell like a URL. |
| 149 | + value = 'file://%s' % ( |
| 150 | + urllib.pathname2url( |
| 151 | + os.path.abspath(os.path.expanduser(value))),) |
| 152 | + if opt_str == '--download-base' and not value.endswith('/'): |
| 153 | + # Download base needs a trailing slash to make the world happy. |
| 154 | + value += '/' |
| 155 | + else: |
| 156 | + value = None |
| 157 | + name = opt_str[2:].replace('-', '_') |
| 158 | + setattr(parser.values, name, value) |
| 159 | + |
| 160 | +usage = '''\ |
| 161 | +[DESIRED PYTHON FOR BUILDOUT] bootstrap.py [options] |
| 162 | + |
| 163 | +Bootstraps a buildout-based project. |
| 164 | + |
| 165 | +Simply run this script in a directory containing a buildout.cfg, using the |
| 166 | +Python that you want bin/buildout to use. |
| 167 | + |
| 168 | +Note that by using --setup-source and --download-base to point to |
| 169 | +local resources, you can keep this script from going over the network. |
| 170 | +''' |
| 171 | + |
| 172 | +parser = OptionParser(usage=usage) |
| 173 | +parser.add_option("-v", "--version", dest="version", |
| 174 | + help="use a specific zc.buildout version") |
| 175 | +parser.add_option("-d", "--distribute", |
| 176 | + action="store_true", dest="use_distribute", default=False, |
| 177 | + help="Use Distribute rather than Setuptools.") |
| 178 | +parser.add_option("--setup-source", action="callback", dest="setup_source", |
| 179 | + callback=normalize_to_url, nargs=1, type="string", |
| 180 | + help=("Specify a URL or file location for the setup file. " |
| 181 | + "If you use Setuptools, this will default to " + |
| 182 | + setuptools_source + "; if you use Distribute, this " |
| 183 | + "will default to " + distribute_source +".")) |
| 184 | +parser.add_option("--download-base", action="callback", dest="download_base", |
| 185 | + callback=normalize_to_url, nargs=1, type="string", |
| 186 | + help=("Specify a URL or directory for downloading " |
| 187 | + "zc.buildout and either Setuptools or Distribute. " |
| 188 | + "Defaults to PyPI.")) |
| 189 | +parser.add_option("--eggs", |
| 190 | + help=("Specify a directory for storing eggs. Defaults to " |
| 191 | + "a temporary directory that is deleted when the " |
| 192 | + "bootstrap script completes.")) |
| 193 | +parser.add_option("-t", "--accept-buildout-test-releases", |
| 194 | + dest='accept_buildout_test_releases', |
| 195 | + action="store_true", default=False, |
| 196 | + help=("Normally, if you do not specify a --version, the " |
| 197 | + "bootstrap script and buildout gets the newest " |
| 198 | + "*final* versions of zc.buildout and its recipes and " |
| 199 | + "extensions for you. If you use this flag, " |
| 200 | + "bootstrap and buildout will get the newest releases " |
| 201 | + "even if they are alphas or betas.")) |
| 202 | +parser.add_option("-c", None, action="store", dest="config_file", |
| 203 | + help=("Specify the path to the buildout configuration " |
| 204 | + "file to be used.")) |
| 205 | + |
| 206 | +options, args = parser.parse_args() |
| 207 | + |
| 208 | +# if -c was provided, we push it back into args for buildout's main function |
| 209 | +if options.config_file is not None: |
| 210 | + args += ['-c', options.config_file] |
| 211 | + |
| 212 | +if options.eggs: |
| 213 | + eggs_dir = os.path.abspath(os.path.expanduser(options.eggs)) |
| 214 | +else: |
| 215 | + eggs_dir = tempfile.mkdtemp() |
| 216 | + |
| 217 | +if options.setup_source is None: |
| 218 | + if options.use_distribute: |
| 219 | + options.setup_source = distribute_source |
| 220 | + else: |
| 221 | + options.setup_source = setuptools_source |
| 222 | + |
| 223 | +if options.accept_buildout_test_releases: |
| 224 | + args.append('buildout:accept-buildout-test-releases=true') |
| 225 | +args.append('bootstrap') |
| 226 | + |
| 227 | +try: |
| 228 | + import pkg_resources |
| 229 | + import setuptools # A flag. Sometimes pkg_resources is installed alone. |
| 230 | + if not hasattr(pkg_resources, '_distribute'): |
| 231 | + raise ImportError |
| 232 | +except ImportError: |
| 233 | + ez_code = urllib2.urlopen( |
| 234 | + options.setup_source).read().replace('\r\n', '\n') |
| 235 | + ez = {} |
| 236 | + exec ez_code in ez |
| 237 | + setup_args = dict(to_dir=eggs_dir, download_delay=0) |
| 238 | + if options.download_base: |
| 239 | + setup_args['download_base'] = options.download_base |
| 240 | + if options.use_distribute: |
| 241 | + setup_args['no_fake'] = True |
| 242 | + ez['use_setuptools'](**setup_args) |
| 243 | + reload(sys.modules['pkg_resources']) |
| 244 | + import pkg_resources |
| 245 | + # This does not (always?) update the default working set. We will |
| 246 | + # do it. |
| 247 | + for path in sys.path: |
| 248 | + if path not in pkg_resources.working_set.entries: |
| 249 | + pkg_resources.working_set.add_entry(path) |
| 250 | + |
| 251 | +cmd = [quote(sys.executable), |
| 252 | + '-c', |
| 253 | + quote('from setuptools.command.easy_install import main; main()'), |
| 254 | + '-mqNxd', |
| 255 | + quote(eggs_dir)] |
| 256 | + |
| 257 | +if not has_broken_dash_S: |
| 258 | + cmd.insert(1, '-S') |
| 259 | + |
| 260 | +find_links = options.download_base |
| 261 | +if not find_links: |
| 262 | + find_links = os.environ.get('bootstrap-testing-find-links') |
| 263 | +if find_links: |
| 264 | + cmd.extend(['-f', quote(find_links)]) |
| 265 | + |
| 266 | +if options.use_distribute: |
| 267 | + setup_requirement = 'distribute' |
| 268 | +else: |
| 269 | + setup_requirement = 'setuptools' |
| 270 | +ws = pkg_resources.working_set |
| 271 | +setup_requirement_path = ws.find( |
| 272 | + pkg_resources.Requirement.parse(setup_requirement)).location |
| 273 | +env = dict( |
| 274 | + os.environ, |
| 275 | + PYTHONPATH=setup_requirement_path) |
| 276 | + |
| 277 | +requirement = 'zc.buildout' |
| 278 | +version = options.version |
| 279 | +if version is None and not options.accept_buildout_test_releases: |
| 280 | + # Figure out the most recent final version of zc.buildout. |
| 281 | + import setuptools.package_index |
| 282 | + _final_parts = '*final-', '*final' |
| 283 | + def _final_version(parsed_version): |
| 284 | + for part in parsed_version: |
| 285 | + if (part[:1] == '*') and (part not in _final_parts): |
| 286 | + return False |
| 287 | + return True |
| 288 | + index = setuptools.package_index.PackageIndex( |
| 289 | + search_path=[setup_requirement_path]) |
| 290 | + if find_links: |
| 291 | + index.add_find_links((find_links,)) |
| 292 | + req = pkg_resources.Requirement.parse(requirement) |
| 293 | + if index.obtain(req) is not None: |
| 294 | + best = [] |
| 295 | + bestv = None |
| 296 | + for dist in index[req.project_name]: |
| 297 | + distv = dist.parsed_version |
| 298 | + if _final_version(distv): |
| 299 | + if bestv is None or distv > bestv: |
| 300 | + best = [dist] |
| 301 | + bestv = distv |
| 302 | + elif distv == bestv: |
| 303 | + best.append(dist) |
| 304 | + if best: |
| 305 | + best.sort() |
| 306 | + version = best[-1].version |
| 307 | +if version: |
| 308 | + requirement = '=='.join((requirement, version)) |
| 309 | +cmd.append(requirement) |
| 310 | + |
| 311 | +if is_jython: |
| 312 | + import subprocess |
| 313 | + exitcode = subprocess.Popen(cmd, env=env).wait() |
| 314 | +else: # Windows prefers this, apparently; otherwise we would prefer subprocess |
| 315 | + exitcode = os.spawnle(*([os.P_WAIT, sys.executable] + cmd + [env])) |
| 316 | +if exitcode != 0: |
| 317 | + sys.stdout.flush() |
| 318 | + sys.stderr.flush() |
| 319 | + print ("An error occurred when trying to install zc.buildout. " |
| 320 | + "Look above this message for any errors that " |
| 321 | + "were output by easy_install.") |
| 322 | + sys.exit(exitcode) |
| 323 | + |
| 324 | +ws.add_entry(eggs_dir) |
| 325 | +ws.require(requirement) |
| 326 | +import zc.buildout.buildout |
| 327 | +zc.buildout.buildout.main(args) |
| 328 | +if not options.eggs: # clean up temporary egg directory |
| 329 | + shutil.rmtree(eggs_dir) |
| 330 | |
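The namespace-package cleanup near the top of bootstrap.py reduces to a simple predicate: a module whose `__path__` holds exactly one directory that lacks an `__init__.py`. A minimal standalone sketch of that test (the helper name is mine, not from the diff):

```python
import os

def is_namespace_package(module):
    # Mirrors the check in bootstrap.py: one __path__ entry and
    # no __init__.py in that directory.
    path = getattr(module, '__path__', None)
    return (path is not None and len(path) == 1 and
            not os.path.exists(os.path.join(path[0], '__init__.py')))
```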
| 331 | === added file 'buildout.cfg' |
| 332 | --- buildout.cfg 1970-01-01 00:00:00 +0000 |
| 333 | +++ buildout.cfg 2012-08-09 04:56:19 +0000 |
| 334 | @@ -0,0 +1,38 @@ |
| 335 | +# Copyright 2011 Canonical Ltd. This software is licensed under the |
| 336 | +# GNU Lesser General Public License version 3 (see the file LICENSE). |
| 337 | + |
| 338 | +[buildout] |
| 339 | +parts = |
| 340 | + scripts |
| 341 | +unzip = true |
| 342 | +eggs-directory = eggs |
| 343 | +download-cache = download-cache |
| 344 | +relative-paths = true |
| 345 | + |
| 346 | +# Disable this option temporarily if you want buildout to find software |
| 347 | +# dependencies *other* than those in our download-cache. Once you have the |
| 348 | +# desired software, reenable this option (and check in the new software to |
| 349 | +# lp:lp-source-dependencies if this is going to be reviewed/merged/deployed.) |
| 350 | +install-from-cache = true |
| 351 | + |
| 352 | +# This also will need to be temporarily disabled or changed for package |
| 353 | +# upgrades. Newly-added packages should also add their desired version number |
| 354 | +# to versions.cfg. |
| 355 | +extends = versions.cfg |
| 356 | + |
| 357 | +allow-picked-versions = false |
| 358 | + |
| 359 | +prefer-final = true |
| 360 | + |
| 361 | +develop = . |
| 362 | + |
| 363 | +# [configuration] |
| 364 | +# instance_name = development |
| 365 | + |
| 366 | +[scripts] |
| 367 | +recipe = z3c.recipe.scripts |
| 368 | +eggs = lp-dev-utils [test] |
| 369 | +include-site-packages = true |
| 370 | +allowed-eggs-from-site-packages = |
| 371 | + subunit |
| 372 | +interpreter = py |
| 373 | |
| 374 | === added file 'page-performance-report-daily.sh' |
| 375 | --- page-performance-report-daily.sh 1970-01-01 00:00:00 +0000 |
| 376 | +++ page-performance-report-daily.sh 2012-08-09 04:56:19 +0000 |
| 377 | @@ -0,0 +1,115 @@ |
| 378 | +#!/bin/sh |
| 379 | + |
| 380 | +#TZ=UTC # trace logs are still BST - blech |
| 381 | + |
| 382 | +CATEGORY=lpnet |
| 383 | +LOGS_ROOTS="/srv/launchpad.net-logs/production /srv/launchpad.net-logs/edge" |
| 384 | +OUTPUT_ROOT=${HOME}/public_html/ppr/lpnet |
| 385 | +DAY_FMT="+%Y-%m-%d" |
| 386 | + |
| 387 | +find_logs() { |
| 388 | + from=$1 |
| 389 | + until=$2 |
| 390 | + |
| 391 | + end_mtime_switch= |
| 392 | + days_to_end="$(expr `date +%j` - `date -d $until +%j` - 1)" |
| 393 | + if [ $days_to_end -gt 0 ]; then |
| 394 | + end_mtime_switch="-daystart -mtime +$days_to_end" |
| 395 | + fi |
| 396 | + |
| 397 | + find ${LOGS_ROOTS} \ |
| 398 | + -maxdepth 2 -type f -newermt "$from - 1 day" $end_mtime_switch \ |
| 399 | + -name launchpad-trace\* \ |
| 400 | + | sort | xargs -x |
| 401 | +} |
| 402 | + |
| 403 | +# Find all the daily stats.pck.bz2 $from $until |
| 404 | +find_stats() { |
| 405 | + from=$1 |
| 406 | + until=$2 |
| 407 | + |
| 408 | + # Build a string of all the days within range. |
| 409 | + local dates |
| 410 | + local day |
| 411 | + day=$from |
| 412 | + while [ $day != $until ]; do |
| 413 | + dates="$dates $day" |
| 414 | + day=`date $DAY_FMT -d "$day + 1 day"` |
| 415 | + done |
| 416 | + |
| 417 | + # Use that to build a regex that will be used to select |
| 418 | + # the files to use. |
| 419 | + local regex |
| 420 | + regex="daily_(`echo $dates |sed -e 's/ /|/g'`)" |
| 421 | + |
| 422 | + find ${OUTPUT_ROOT} -name 'stats.pck.bz2' | egrep $regex |
| 423 | +} |
| 424 | + |
| 425 | +report() { |
| 426 | + type=$1 |
| 427 | + from=$2 |
| 428 | + until=$3 |
| 429 | + link=$4 |
| 430 | + |
| 431 | + local files |
| 432 | + local options |
| 433 | + if [ "$type" = "daily" ]; then |
| 434 | + files=`find_logs $from $until` |
| 435 | + options="--from=$from --until=$until" |
| 436 | + else |
| 437 | + files=`find_stats $from $until` |
| 438 | + options="--merge" |
| 439 | + fi |
| 440 | + |
| 441 | + local dir |
| 442 | + dir=${OUTPUT_ROOT}/`date -d $from +%Y-%m`/${type}_${from}_${until} |
| 443 | + mkdir -p ${dir} |
| 444 | + |
| 445 | + echo Generating report from $from until $until into $dir `date` |
| 446 | + |
| 447 | + ./page-performance-report.py -v --top-urls=200 --directory=${dir} \ |
| 448 | + $options $files |
| 449 | + |
| 450 | + # Only do the linking if requested. |
| 451 | + if [ "$link" = "link" ]; then |
| 452 | + ln -sf ${dir}/partition.html \ |
| 453 | + ${OUTPUT_ROOT}/latest-${type}-partition.html |
| 454 | + ln -sf ${dir}/categories.html \ |
| 455 | + ${OUTPUT_ROOT}/latest-${type}-categories.html |
| 456 | + ln -sf ${dir}/pageids.html \ |
| 457 | + ${OUTPUT_ROOT}/latest-${type}-pageids.html |
| 458 | + ln -sf ${dir}/combined.html \ |
| 459 | + ${OUTPUT_ROOT}/latest-${type}-combined.html |
| 460 | + ln -sf ${dir}/metrics.dat ${OUTPUT_ROOT}/latest-${type}-metrics.dat |
| 461 | + ln -sf ${dir}/top200.html ${OUTPUT_ROOT}/latest-${type}-top200.html |
| 462 | + ln -sf ${dir}/timeout-candidates.html \ |
| 463 | + ${OUTPUT_ROOT}/latest-${type}-timeout-candidates.html |
| 464 | + fi |
| 465 | + |
| 466 | + return 0 |
| 467 | +} |
| 468 | + |
| 469 | +link= |
| 470 | +if [ "$3" = "-l" ]; then |
| 471 | + link="link" |
| 472 | +fi |
| 473 | + |
| 474 | +if [ "$1" = '-d' ]; then |
| 475 | + report daily `date -d $2 $DAY_FMT` `date -d "$2 + 1 day" $DAY_FMT` $link |
| 476 | +elif [ "$1" = '-w' ]; then |
| 477 | + report weekly `date -d $2 $DAY_FMT` `date -d "$2 + 1 week" $DAY_FMT` $link |
| 478 | +elif [ "$1" = '-m' ]; then |
| 479 | + report monthly `date -d $2 $DAY_FMT` `date -d "$2 + 1 month" $DAY_FMT` $link |
| 480 | +else |
| 481 | + # Default invocation used from cron to generate latest one. |
| 482 | + now=`date $DAY_FMT` |
| 483 | + report daily `date -d yesterday $DAY_FMT` $now link |
| 484 | + |
| 485 | + if [ `date +%a` = 'Sun' ]; then |
| 486 | + report weekly `date -d 'last week' $DAY_FMT` $now link |
| 487 | + fi |
| 488 | + |
| 489 | + if [ `date +%d` = '01' ]; then |
| 490 | + report monthly `date -d 'last month' $DAY_FMT` $now link |
| 491 | + fi |
| 492 | +fi |
| 493 | |
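find_stats above builds a regex alternation of every day in the requested range and greps the stats filenames with it. A rough Python equivalent of that construction, for illustration only (the function is not part of the branch):

```python
from datetime import date, timedelta

def daily_regex(start, end):
    # One alternative per day in [start, end), matching the
    # daily_<date> components of the stats paths.
    days = []
    day = start
    while day < end:
        days.append(day.isoformat())
        day += timedelta(days=1)
    return "daily_(%s)" % "|".join(days)

print(daily_regex(date(2012, 8, 1), date(2012, 8, 4)))
# daily_(2012-08-01|2012-08-02|2012-08-03)
```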
| 494 | === added file 'page-performance-report.ini' |
| 495 | --- page-performance-report.ini 1970-01-01 00:00:00 +0000 |
| 496 | +++ page-performance-report.ini 2012-08-09 04:56:19 +0000 |
| 497 | @@ -0,0 +1,79 @@ |
| 498 | +[categories] |
| 499 | +# Category -> Python regular expression. |
| 500 | +# Remember to quote ?, ., + & ? characters to match literally. |
| 501 | +# 'kodos' is useful for interactively testing regular expressions. |
| 502 | +All Launchpad=. |
| 503 | +All Launchpad except operational pages=(?<!\+opstats|\+haproxy)$ |
| 504 | + |
| 505 | +API=(^https?://api\.|/\+access-token$) |
| 506 | +Operational=(\+opstats|\+haproxy)$ |
| 507 | +Web (Non API/non operational/non XML-RPC)=^https?://(?!api\.) |
| 508 | + [^/]+($|/ |
| 509 | + (?!\+haproxy|\+opstats|\+access-token |
| 510 | + |((authserver|bugs|bazaar|codehosting| |
| 511 | + codeimportscheduler|mailinglists|softwarecenteragent| |
| 512 | + featureflags)/\w+$))) |
| 513 | +Other=^/ |
| 514 | + |
| 515 | +Launchpad Frontpage=^https?://launchpad\.[^/]+(/index\.html)?$ |
| 516 | + |
| 517 | +# Note that the bug text dump is served on the main launchpad domain |
| 518 | +# and we need to exclude it from the registry stats. |
| 519 | +Registry=^https?://launchpad\..*(?<!/\+text)(?<!/\+access-token)$ |
| 520 | +Registry - Person Index=^https?://launchpad\.[^/]+/%7E[^/]+(/\+index)?$ |
| 521 | +Registry - Pillar Index=^https?://launchpad\.[^/]+/\w[^/]*(/\+index)?$ |
| 522 | + |
| 523 | +Answers=^https?://answers\. |
| 524 | +Answers - Front page=^https?://answers\.[^/]+(/questions/\+index)?$ |
| 525 | + |
| 526 | +Blueprints=^https?://blueprints\. |
| 527 | +Blueprints - Front page=^https?://blueprints\.[^/]+(/specs/\+index)?$ |
| 528 | + |
| 529 | +# Note that the bug text dump is not served on the bugs domain, |
| 530 | +# probably for hysterical reasons. This is why the bugs regexp is |
| 531 | +# confusing. |
| 532 | +Bugs=^https?://(bugs\.|.+/bugs/\d+/\+text$) |
| 533 | +Bugs - Front page=^https?://bugs\.[^/]+(/bugs/\+index)?$ |
| 534 | +Bugs - Bug Page=^https?://bugs\.[^/]+/.+/\+bug/\d+(/\+index)?$ |
| 535 | +Bugs - Pillar Index=^https?://bugs\.[^/]+/\w[^/]*(/\+bugs-index)?$ |
| 536 | +Bugs - Search=^https?://bugs\.[^/]+/.+/\+bugs$ |
| 537 | +Bugs - Text Dump=^https?://launchpad\..+/\+text$ |
| 538 | + |
| 539 | +Code=^https?://code\. |
| 540 | +Code - Front page=^https?://code\.[^/]+(/\+code/\+index)?$ |
| 541 | +Code - Pillar Branches=^https?://code\.[^/]+/\w[^/]*(/\+code-index)?$ |
| 542 | +Code - Branch Page=^https?://code\.[^/]+/%7E[^/]+/[^/]+/[^/]+(/\+index)?$ |
| 543 | +Code - Merge Proposal=^https?://code\.[^/]+/.+/\+merge/\d+(/\+index)$ |
| 544 | + |
| 545 | +Soyuz - PPA Index=^https?://launchpad\.[^/]+/.+/\+archive/[^/]+(/\+index)?$ |
| 546 | + |
| 547 | +Translations=^https?://translations\. |
| 548 | +Translations - Front page=^https?://translations\.[^/]+/translations/\+index$ |
| 549 | +Translations - Overview=^https?://translations\..*/\+lang/\w+(/\+index)?$ |
| 550 | + |
| 551 | +Public XML-RPC=^https://(launchpad|xmlrpc)[^/]+/bazaar/\w+$ |
| 552 | +Private XML-RPC=^https://(launchpad|xmlrpc)[^/]+/ |
| 553 | + (authserver|bugs|codehosting| |
| 554 | + codeimportscheduler|mailinglists| |
| 555 | + softwarecenteragent|featureflags)/\w+$ |
| 556 | + |
| 557 | +[metrics] |
| 558 | +ppr_all=All Launchpad except operational pages |
| 559 | +ppr_web=Web (Non API/non operational/non XML-RPC) |
| 560 | +ppr_operational=Operational |
| 561 | +ppr_bugs=Bugs |
| 562 | +ppr_api=API |
| 563 | +ppr_code=Code |
| 564 | +ppr_public_xmlrpc=Public XML-RPC |
| 565 | +ppr_private_xmlrpc=Private XML-RPC |
| 566 | +ppr_translations=Translations |
| 567 | +ppr_registry=Registry |
| 568 | +ppr_other=Other |
| 569 | + |
| 570 | +[partition] |
| 571 | +API= |
| 572 | +Operational= |
| 573 | +Private XML-RPC= |
| 574 | +Public XML-RPC= |
| 575 | +Web (Non API/non operational/non XML-RPC)= |
| 576 | +Other= |
| 577 | |
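The multi-line patterns above only work because Category (later in this diff) compiles them with re.X (verbose mode) together with re.I, so embedded whitespace and line breaks are ignored. A quick check against the 'Bugs - Front page' pattern; the test URLs are made up:

```python
import re

pattern = re.compile(r"""^https?://bugs\.[^/]+
                         (/bugs/\+index)?$""", re.I | re.X)
assert pattern.search("https://bugs.launchpad.net/bugs/+index") is not None
assert pattern.search("https://bugs.launchpad.net/x/y") is None
```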
| 578 | === added file 'page-performance-report.py' |
| 579 | --- page-performance-report.py 1970-01-01 00:00:00 +0000 |
| 580 | +++ page-performance-report.py 2012-08-09 04:56:19 +0000 |
| 581 | @@ -0,0 +1,18 @@ |
| 582 | +#!/usr/bin/python -S |
| 583 | +# |
| 584 | +# Copyright 2010 Canonical Ltd. This software is licensed under the |
| 585 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
| 586 | + |
| 587 | +"""Page performance report generated from zserver tracelogs.""" |
| 588 | + |
| 589 | +__metaclass__ = type |
| 590 | + |
| 591 | +import _pythonpath |
| 592 | + |
| 593 | +import sys |
| 594 | + |
| 595 | +from lp.scripts.utilities.pageperformancereport import main |
| 596 | + |
| 597 | + |
| 598 | +if __name__ == '__main__': |
| 599 | + sys.exit(main()) |
| 600 | |
| 601 | === added file 'pageperformancereport.py' |
| 602 | --- pageperformancereport.py 1970-01-01 00:00:00 +0000 |
| 603 | +++ pageperformancereport.py 2012-08-09 04:56:19 +0000 |
| 604 | @@ -0,0 +1,1277 @@ |
| 605 | +# Copyright 2010 Canonical Ltd. This software is licensed under the |
| 606 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
| 607 | + |
| 608 | +"""Page performance report generated from zserver trace logs.""" |
| 609 | + |
| 610 | +__metaclass__ = type |
| 611 | +__all__ = ['main'] |
| 612 | + |
| 613 | +import bz2 |
| 614 | +from cgi import escape as html_quote |
| 615 | +from ConfigParser import RawConfigParser |
| 616 | +import copy |
| 617 | +import cPickle |
| 618 | +import csv |
| 619 | +from datetime import datetime |
| 620 | +import gzip |
| 621 | +import logging |
| 622 | +import math |
| 623 | +import optparse |
| 624 | +import os.path |
| 625 | +import re |
| 626 | +import textwrap |
| 627 | +from textwrap import dedent |
| 628 | +import time |
| 629 | + |
| 630 | +import simplejson as json |
| 631 | +import sre_constants |
| 632 | +import zc.zservertracelog.tracereport |
| 633 | + |
| 634 | +logging.basicConfig() |
| 635 | +log = logging |
| 636 | + |
| 637 | + |
| 638 | +def _check_datetime(option, opt, value): |
| 639 | + "Type checker for optparse datetime option type." |
| 640 | + # We support 5 valid ISO8601 formats. |
| 641 | + formats = [ |
| 642 | + '%Y-%m-%dT%H:%M:%S', |
| 643 | + '%Y-%m-%dT%H:%M', |
| 644 | + '%Y-%m-%d %H:%M:%S', |
| 645 | + '%Y-%m-%d %H:%M', |
| 646 | + '%Y-%m-%d', |
| 647 | + ] |
| 648 | + for format in formats: |
| 649 | + try: |
| 650 | + return datetime.strptime(value, format) |
| 651 | + except ValueError: |
| 652 | + pass |
| 653 | + raise optparse.OptionValueError( |
| 654 | + "option %s: invalid datetime value: %r" % (opt, value)) |
| 655 | + |
| 656 | + |
| 657 | +class Option(optparse.Option): |
| 658 | + """Extended optparse Option class. |
| 659 | + |
| 660 | + Adds a 'datetime' option type. |
| 661 | + """ |
| 662 | + TYPES = optparse.Option.TYPES + ("datetime", datetime) |
| 663 | + TYPE_CHECKER = copy.copy(optparse.Option.TYPE_CHECKER) |
| 664 | + TYPE_CHECKER["datetime"] = _check_datetime |
| 665 | + TYPE_CHECKER[datetime] = _check_datetime |
| 666 | + |
| 667 | + |
| 668 | +class OptionParser(optparse.OptionParser): |
| 669 | + """Extended optparse OptionParser. |
| 670 | + |
| 671 | + Adds a 'datetime' option type. |
| 672 | + """ |
| 673 | + |
| 674 | + def __init__(self, *args, **kw): |
| 675 | + kw.setdefault('option_class', Option) |
| 676 | + optparse.OptionParser.__init__(self, *args, **kw) |
| 677 | + |
| 678 | + |
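A short sketch of the extended parser in use; the option name and value are illustrative, not from the diff:

```python
parser = OptionParser()
parser.add_option("--from", dest="from_ts", type="datetime")
options, _ = parser.parse_args(["--from", "2012-08-09 04:56"])
print(options.from_ts)  # 2012-08-09 04:56:00
```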
| 679 | +class Request(zc.zservertracelog.tracereport.Request): |
| 680 | + url = None |
| 681 | + pageid = None |
| 682 | + ticks = None |
| 683 | + sql_statements = None |
| 684 | + sql_seconds = None |
| 685 | + |
| 686 | + # Override the broken version in our superclass that always |
| 687 | + # returns an integer. |
| 688 | + @property |
| 689 | + def app_seconds(self): |
| 690 | + interval = self.app_time - self.start_app_time |
| 691 | + return interval.seconds + interval.microseconds / 1000000.0 |
| 692 | + |
| 693 | + # Override the broken version in our superclass that always |
| 694 | + # returns an integer. |
| 695 | + @property |
| 696 | + def total_seconds(self): |
| 697 | + interval = self.end - self.start |
| 698 | + return interval.seconds + interval.microseconds / 1000000.0 |
| 699 | + |
| 700 | + |
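The two overrides matter because timedelta carries sub-second precision that the superclass's integer-only versions discard. A standalone illustration (not from the diff):

```python
from datetime import timedelta

interval = timedelta(seconds=1, microseconds=250000)
print(interval.seconds + interval.microseconds / 1000000.0)  # 1.25
```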
| 701 | +class Category: |
| 702 | + """A Category in our report. |
| 703 | + |
| 704 | + Requests belong to a Category if the URL matches a regular expression. |
| 705 | + """ |
| 706 | + |
| 707 | + def __init__(self, title, regexp): |
| 708 | + self.title = title |
| 709 | + self.regexp = regexp |
| 710 | + self._compiled_regexp = re.compile(regexp, re.I | re.X) |
| 711 | + self.partition = False |
| 712 | + |
| 713 | + def match(self, request): |
| 714 | + """Return true when the request matches this category.""" |
| 715 | + return self._compiled_regexp.search(request.url) is not None |
| 716 | + |
| 717 | + def __cmp__(self, other): |
| 718 | + return cmp(self.title.lower(), other.title.lower()) |
| 719 | + |
| 720 | + def __deepcopy__(self, memo): |
| 721 | + # We provide __deepcopy__ because the module doesn't handle |
| 722 | + # compiled regular expression by default. |
| 723 | + return Category(self.title, self.regexp) |
| 724 | + |
| 725 | + |
| 726 | +class OnlineStatsCalculator: |
| 727 | + """Object that can compute count, sum, mean, variance and std deviation. |
| 728 | + |
| 729 | + It computes these values incrementally, using minimal storage, via |
| 730 | + the Welford / Knuth algorithm described at |
| 731 | + http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm |
| 732 | + """ |
| 733 | + |
| 734 | + def __init__(self): |
| 735 | + self.count = 0 |
| 736 | + self.sum = 0 |
| 737 | + self.M2 = 0.0 # Sum of square difference |
| 738 | + self.mean = 0.0 |
| 739 | + |
| 740 | + def update(self, x): |
| 741 | + """Incrementally update the stats when adding x to the set. |
| 742 | + |
| 743 | + None values are ignored. |
| 744 | + """ |
| 745 | + if x is None: |
| 746 | + return |
| 747 | + self.count += 1 |
| 748 | + self.sum += x |
| 749 | + delta = x - self.mean |
| 750 | + self.mean = float(self.sum)/self.count |
| 751 | + self.M2 += delta*(x - self.mean) |
| 752 | + |
| 753 | + @property |
| 754 | + def variance(self): |
| 755 | + """Return the population variance.""" |
| 756 | + if self.count == 0: |
| 757 | + return 0 |
| 758 | + else: |
| 759 | + return self.M2/self.count |
| 760 | + |
| 761 | + @property |
| 762 | + def std(self): |
| 763 | + """Return the standard deviation.""" |
| 764 | + if self.count == 0: |
| 765 | + return 0 |
| 766 | + else: |
| 767 | + return math.sqrt(self.variance) |
| 768 | + |
| 769 | + def __add__(self, other): |
| 770 | + """Adds this and another OnlineStatsCalculator. |
| 771 | + |
| 772 | + The result combines the stats of the two objects. |
| 773 | + """ |
| 774 | + results = OnlineStatsCalculator() |
| 775 | + results.count = self.count + other.count |
| 776 | + results.sum = self.sum + other.sum |
| 777 | + if self.count > 0 and other.count > 0: |
| 778 | + # This is 2.1b in Chan, Tony F.; Golub, Gene H.; LeVeque, |
| 779 | + # Randall J. (1979), "Updating Formulae and a Pairwise Algorithm |
| 780 | + # for Computing Sample Variances.", |
| 781 | + # Technical Report STAN-CS-79-773, |
| 782 | + # Department of Computer Science, Stanford University, |
| 783 | + # ftp://reports.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf . |
| 784 | + results.M2 = self.M2 + other.M2 + ( |
| 785 | + (float(self.count) / (other.count * results.count)) * |
| 786 | + ((float(other.count) / self.count) * self.sum - other.sum)**2) |
| 787 | + else: |
| 788 | + results.M2 = self.M2 + other.M2 # One of them is 0. |
| 789 | + if results.count > 0: |
| 790 | + results.mean = float(results.sum) / results.count |
| 791 | + return results |
| 792 | + |
| 793 | + |
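A small sanity check of the incremental calculator and its merge operator, with hand-computed values (the numbers are mine, not from the branch):

```python
a, b = OnlineStatsCalculator(), OnlineStatsCalculator()
for x in [1.0, 2.0]:
    a.update(x)
b.update(4.0)

merged = a + b
# Population stats of [1, 2, 4]: mean 7/3, variance 14/9.
assert merged.count == 3 and merged.sum == 7.0
assert abs(merged.variance - 14.0 / 9) < 1e-9
```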
| 794 | +class OnlineApproximateMedian: |
| 795 | + """Approximate the median of a set of elements. |
| 796 | + |
| 797 | + This implements a space-efficient algorithm which only sees each value |
| 798 | + once. (It holds in memory only about log_bucket_size(n) elements.) |
| 799 | + |
| 800 | + It was described and analysed in |
| 801 | + D. Cantone and M.Hofri, |
| 802 | + "Analysis of An Approximate Median Selection Algorithm" |
| 803 | + ftp://ftp.cs.wpi.edu/pub/techreports/pdf/06-17.pdf |
| 804 | + |
| 805 | + This algorithm is similar to Tukey's median of medians technique. |
| 806 | + It computes the median among bucket_size values, and then the median |
| 807 | + among those. |
| 808 | + """ |
| 809 | + |
| 810 | + def __init__(self, bucket_size=9): |
| 811 | + """Creates a new estimator. |
| 812 | + |
| 813 | + It approximates the median by finding the median among each |
| 814 | + successive bucket_size elements, then using these medians for further |
| 815 | + rounds of selection. |
| 816 | + |
| 817 | + The bucket size should be a small odd integer. |
| 818 | + """ |
| 819 | + self.bucket_size = bucket_size |
| 820 | + # Index of the median in a completed bucket. |
| 821 | + self.median_idx = (bucket_size-1)//2 |
| 822 | + self.buckets = [] |
| 823 | + |
| 824 | + def update(self, x, order=0): |
| 825 | + """Update with x.""" |
| 826 | + if x is None: |
| 827 | + return |
| 828 | + |
| 829 | + i = order |
| 830 | + while True: |
| 831 | + # Create bucket on demand. |
| 832 | + if i >= len(self.buckets): |
| 833 | + for n in range((i+1)-len(self.buckets)): |
| 834 | + self.buckets.append([]) |
| 835 | + bucket = self.buckets[i] |
| 836 | + bucket.append(x) |
| 837 | + if len(bucket) == self.bucket_size: |
| 838 | + # Select the median in this bucket, and promote it. |
| 839 | + x = sorted(bucket)[self.median_idx] |
| 840 | + # Free the bucket for the next round. |
| 841 | + del bucket[:] |
| 842 | + i += 1 |
| 843 | + continue |
| 844 | + else: |
| 845 | + break |
| 846 | + |
| 847 | + @property |
| 848 | + def median(self): |
| 849 | + """Return the median.""" |
| 850 | + # Find the 'weighted' median by assigning a weight to each |
| 851 | + # element proportional to how far they have been selected. |
| 852 | + candidates = [] |
| 853 | + total_weight = 0 |
| 854 | + for i, bucket in enumerate(self.buckets): |
| 855 | + weight = self.bucket_size ** i |
| 856 | + for x in bucket: |
| 857 | + total_weight += weight |
| 858 | + candidates.append([x, weight]) |
| 859 | + if len(candidates) == 0: |
| 860 | + return 0 |
| 861 | + |
| 862 | + # Each weight is the equivalent of having the candidates appear |
| 863 | + # that number of times in the array. |
| 864 | + # So buckets like [[1, 2], [2, 3], [4, 2]] would be expanded to |
| 865 | + # [1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, |
| 866 | + # 4, 4, 4, 4, 4] and we find the median of that list (2). |
| 867 | + # We don't expand the items to conserve memory. |
| 868 | + median = (total_weight-1) / 2 |
| 869 | + weighted_idx = 0 |
| 870 | + for x, weight in sorted(candidates): |
| 871 | + weighted_idx += weight |
| 872 | + if weighted_idx > median: |
| 873 | + return x |
| 874 | + |
| 875 | + def __add__(self, other): |
| 876 | + """Merge two approximators together. |
| 877 | + |
| 878 | + All candidates from the other are merged through the standard |
| 879 | + algorithm, starting at the same level. So an item that went through |
| 880 | + two rounds of selection, will be compared with other items having |
| 881 | + gone through the same number of rounds. |
| 882 | + """ |
| 883 | + results = OnlineApproximateMedian(self.bucket_size) |
| 884 | + results.buckets = copy.deepcopy(self.buckets) |
| 885 | + for i, bucket in enumerate(other.buckets): |
| 886 | + for x in bucket: |
| 887 | + results.update(x, i) |
| 888 | + return results |
| 889 | + |
| 890 | + |
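A toy run of the estimator with bucket size 3 (small enough to trace by hand). The answer is approximate by design: this run yields 4, where the exact median of 1..9 is 5:

```python
med = OnlineApproximateMedian(bucket_size=3)
for x in [5, 1, 3, 9, 7, 8, 2, 4, 6]:
    med.update(x)
print(med.median)  # 4 -- the exact median is 5
```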
| 891 | +class Stats: |
| 892 | + """Bag to hold and compute request statistics. |
| 893 | + |
| 894 | + All times are in seconds. |
| 895 | + """ |
| 896 | + total_hits = 0 # Total hits. |
| 897 | + |
| 898 | + total_time = 0 # Total time spent rendering. |
| 899 | + mean = 0 # Mean time per hit. |
| 900 | + median = 0 # Median time per hit. |
| 901 | + std = 0 # Standard deviation per hit. |
| 902 | + histogram = None # Request times histogram. |
| 903 | + |
| 904 | + total_sqltime = 0 # Total time spent waiting for SQL to process. |
| 905 | + mean_sqltime = 0 # Mean time spent waiting for SQL to process. |
| 906 | + median_sqltime = 0 # Median time spent waiting for SQL to process. |
| 907 | + std_sqltime = 0 # Standard deviation of SQL time. |
| 908 | + |
| 909 | + total_sqlstatements = 0 # Total number of SQL statements issued. |
| 910 | + mean_sqlstatements = 0 |
| 911 | + median_sqlstatements = 0 |
| 912 | + std_sqlstatements = 0 |
| 913 | + |
| 914 | + @property |
| 915 | + def ninetyninth_percentile_time(self): |
| 916 | + """Time under which 99% of requests are rendered. |
| 917 | + |
| 918 | + This is estimated as 3 std deviations from the mean. Given that |
| 919 | + in a daily report, many URLs or PageIds won't have 100 requests, it's |
| 920 | + more useful to use this estimator. |
| 921 | + """ |
| 922 | + return self.mean + 3*self.std |
| 923 | + |
| 924 | + @property |
| 925 | + def ninetyninth_percentile_sqltime(self): |
| 926 | + """SQL time under which 99% of requests are rendered. |
| 927 | + |
| 928 | + This is estimated as 3 std deviations from the mean. |
| 929 | + """ |
| 930 | + return self.mean_sqltime + 3*self.std_sqltime |
| 931 | + |
| 932 | + @property |
| 933 | + def ninetyninth_percentile_sqlstatements(self): |
| 934 | + """Number of SQL statements under which 99% of requests are rendered. |
| 935 | + |
| 936 | + This is estimated as 3 std deviations from the mean. |
| 937 | + """ |
| 938 | + return self.mean_sqlstatements + 3*self.std_sqlstatements |
| 939 | + |
| 940 | + def text(self): |
| 941 | + """Return a textual version of the stats.""" |
| 942 | + return textwrap.dedent(""" |
| 943 | + <Stats for %d requests: |
| 944 | + Time: total=%.2f; mean=%.2f; median=%.2f; std=%.2f |
| 945 | + SQL time: total=%.2f; mean=%.2f; median=%.2f; std=%.2f |
| 946 | + SQL stmt: total=%.f; mean=%.2f; median=%.f; std=%.2f |
| 947 | + >""" % ( |
| 948 | + self.total_hits, self.total_time, self.mean, self.median, |
| 949 | + self.std, self.total_sqltime, self.mean_sqltime, |
| 950 | + self.median_sqltime, self.std_sqltime, |
| 951 | + self.total_sqlstatements, self.mean_sqlstatements, |
| 952 | + self.median_sqlstatements, self.std_sqlstatements)) |
| 953 | + |
| 954 | + |
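For scale, the 3-sigma estimate behind the three ninetyninth_percentile_* properties works out as follows (my numbers; for a normal distribution, mean + 3*std actually covers about 99.87% of requests):

```python
# With mean render time 0.5s and standard deviation 0.3s, the
# estimated "99th percentile" render time is:
mean, std = 0.5, 0.3
print(mean + 3 * std)  # 1.4 seconds
```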
| 955 | +class OnlineStats(Stats): |
| 956 | + """Implementation of stats that can be computed online. |
| 957 | + |
| 958 | + You call update() for each request and the stats are updated incrementally |
| 959 | + with minimum storage space. |
| 960 | + """ |
| 961 | + |
| 962 | + def __init__(self, histogram_width, histogram_resolution): |
| 963 | + self.time_stats = OnlineStatsCalculator() |
| 964 | + self.time_median_approximate = OnlineApproximateMedian() |
| 965 | + self.sql_time_stats = OnlineStatsCalculator() |
| 966 | + self.sql_time_median_approximate = OnlineApproximateMedian() |
| 967 | + self.sql_statements_stats = OnlineStatsCalculator() |
| 968 | + self.sql_statements_median_approximate = OnlineApproximateMedian() |
| 969 | + self.histogram = Histogram(histogram_width, histogram_resolution) |
| 970 | + |
| 971 | + @property |
| 972 | + def total_hits(self): |
| 973 | + return self.time_stats.count |
| 974 | + |
| 975 | + @property |
| 976 | + def total_time(self): |
| 977 | + return self.time_stats.sum |
| 978 | + |
| 979 | + @property |
| 980 | + def mean(self): |
| 981 | + return self.time_stats.mean |
| 982 | + |
| 983 | + @property |
| 984 | + def median(self): |
| 985 | + return self.time_median_approximate.median |
| 986 | + |
| 987 | + @property |
| 988 | + def std(self): |
| 989 | + return self.time_stats.std |
| 990 | + |
| 991 | + @property |
| 992 | + def total_sqltime(self): |
| 993 | + return self.sql_time_stats.sum |
| 994 | + |
| 995 | + @property |
| 996 | + def mean_sqltime(self): |
| 997 | + return self.sql_time_stats.mean |
| 998 | + |
| 999 | + @property |
| 1000 | + def median_sqltime(self): |
| 1001 | + return self.sql_time_median_approximate.median |
| 1002 | + |
| 1003 | + @property |
| 1004 | + def std_sqltime(self): |
| 1005 | + return self.sql_time_stats.std |
| 1006 | + |
| 1007 | + @property |
| 1008 | + def total_sqlstatements(self): |
| 1009 | + return self.sql_statements_stats.sum |
| 1010 | + |
| 1011 | + @property |
| 1012 | + def mean_sqlstatements(self): |
| 1013 | + return self.sql_statements_stats.mean |
| 1014 | + |
| 1015 | + @property |
| 1016 | + def median_sqlstatements(self): |
| 1017 | + return self.sql_statements_median_approximate.median |
| 1018 | + |
| 1019 | + @property |
| 1020 | + def std_sqlstatements(self): |
| 1021 | + return self.sql_statements_stats.std |
| 1022 | + |
| 1023 | + def update(self, request): |
| 1024 | + """Update the stats based on request.""" |
| 1025 | + self.time_stats.update(request.app_seconds) |
| 1026 | + self.time_median_approximate.update(request.app_seconds) |
| 1027 | + self.sql_time_stats.update(request.sql_seconds) |
| 1028 | + self.sql_time_median_approximate.update(request.sql_seconds) |
| 1029 | + self.sql_statements_stats.update(request.sql_statements) |
| 1030 | + self.sql_statements_median_approximate.update(request.sql_statements) |
| 1031 | + self.histogram.update(request.app_seconds) |
| 1032 | + |
| 1033 | + def __add__(self, other): |
| 1034 | + """Merge another OnlineStats with this one.""" |
| 1035 | + results = copy.deepcopy(self) |
| 1036 | + results.time_stats += other.time_stats |
| 1037 | + results.time_median_approximate += other.time_median_approximate |
| 1038 | + results.sql_time_stats += other.sql_time_stats |
| 1039 | + results.sql_time_median_approximate += ( |
| 1040 | + other.sql_time_median_approximate) |
| 1041 | + results.sql_statements_stats += other.sql_statements_stats |
| 1042 | + results.sql_statements_median_approximate += ( |
| 1043 | + other.sql_statements_median_approximate) |
| 1044 | + results.histogram = self.histogram + other.histogram |
| 1045 | + return results |
| 1046 | + |
| 1047 | + |
| 1048 | +class Histogram: |
| 1049 | + """A simple object to compute histogram of a value.""" |
| 1050 | + |
| 1051 | + @staticmethod |
| 1052 | + def from_bins_data(data): |
| 1053 | + """Create a histogram from existing bins data.""" |
| 1054 | + assert data[0][0] == 0, "First bin should start at zero." |
| 1055 | + |
| 1056 | + hist = Histogram(len(data), data[1][0]) |
| 1057 | + for idx, bin in enumerate(data): |
| 1058 | + hist.count += bin[1] |
| 1059 | + hist.bins[idx][1] = bin[1] |
| 1060 | + |
| 1061 | + return hist |
| 1062 | + |
| 1063 | + def __init__(self, bins_count, bins_size): |
| 1064 | + """Create a new histogram. |
| 1065 | + |
| 1066 | + The histogram will count the frequency of values in bins_count bins |
| 1067 | + of bins_size each. |
| 1068 | + """ |
| 1069 | + self.count = 0 |
| 1070 | + self.bins_count = bins_count |
| 1071 | + self.bins_size = bins_size |
| 1072 | + self.bins = [] |
| 1073 | + for x in range(bins_count): |
| 1074 | + self.bins.append([x*bins_size, 0]) |
| 1075 | + |
| 1076 | + @property |
| 1077 | + def bins_relative(self): |
| 1078 | + """Return the bins with the frequency expressed as a ratio.""" |
| 1079 | + return [[x, float(f)/self.count] for x, f in self.bins] |
| 1080 | + |
| 1081 | + def update(self, value): |
| 1082 | + """Update the histogram for this value. |
| 1083 | + |
| 1084 | + All values higher than the last bin minimum are counted in that last |
| 1085 | + bin. |
| 1086 | + """ |
| 1087 | + self.count += 1 |
| 1088 | + idx = int(min(self.bins_count-1, value / self.bins_size)) |
| 1089 | + self.bins[idx][1] += 1 |
| 1090 | + |
| 1091 | + def __repr__(self): |
| 1092 | + """A string representation of this histogram.""" |
| 1093 | + return "<Histogram %s>" % self.bins |
| 1094 | + |
| 1095 | + def __eq__(self, other): |
| 1096 | + """Two histograms are equal if they have the same bin contents.""" |
| 1097 | + if not isinstance(other, Histogram): |
| 1098 | + return False |
| 1099 | + |
| 1100 | + if self.bins_count != other.bins_count: |
| 1101 | + return False |
| 1102 | + |
| 1103 | + if self.bins_size != other.bins_size: |
| 1104 | + return False |
| 1105 | + |
| 1106 | + for idx, other_bin in enumerate(other.bins): |
| 1107 | + if self.bins[idx][1] != other_bin[1]: |
| 1108 | + return False |
| 1109 | + |
| 1110 | + return True |
| 1111 | + |
| 1112 | + def __add__(self, other): |
| 1113 | + """Add the frequency of the other histogram to this one. |
| 1114 | + |
| 1115 | + The resulting histogram has the same bins_size as this one. |
| 1116 | + If the other one has a bigger bins_size, we'll assume an even |
| 1117 | + distribution and distribute the frequency across the smaller bins. If |
| 1118 | + it has a lower bin_size, we'll aggregate its bins into the larger |
| 1119 | + ones. We only support different bins_size if the ratio can be |
| 1120 | + expressed as the ratio between 1 and an integer. |
| 1121 | + |
| 1122 | + The resulting histogram is as wide as the widest one. |
| 1123 | + """ |
| 1124 | + ratio = float(other.bins_size) / self.bins_size |
| 1125 | + bins_count = max(self.bins_count, math.ceil(other.bins_count * ratio)) |
| 1126 | + total = Histogram(int(bins_count), self.bins_size) |
| 1127 | + total.count = self.count + other.count |
| 1128 | + |
| 1129 | + # Copy our bins into the total |
| 1130 | + for idx, bin in enumerate(self.bins): |
| 1131 | + total.bins[idx][1] = bin[1] |
| 1132 | + |
| 1133 | + assert int(ratio) == ratio or int(1/ratio) == 1/ratio, ( |
| 1134 | + "We only support different bins size when the ratio is an " |
| 1135 | + "integer to 1: %s" |
| 1136 | + % ratio) |
| 1137 | + |
| 1138 | + if ratio >= 1: |
| 1139 | + # We distribute the frequency across the bins. |
| 1140 | + # For example. if the ratio is 3:1, we'll add a third |
| 1141 | + # of the lower resolution bin to 3 of the higher one. |
| 1142 | + for other_idx, bin in enumerate(other.bins): |
| 1143 | + f = bin[1] / ratio |
| 1144 | + start = int(math.floor(other_idx * ratio)) |
| 1145 | + end = int(start + ratio) |
| 1146 | + for idx in range(start, end): |
| 1147 | + total.bins[idx][1] += f |
| 1148 | + else: |
| 1149 | + # We need to collect the higher resolution bins into the |
| 1150 | + # corresponding lower one. |
| 1151 | + for other_idx, bin in enumerate(other.bins): |
| 1152 | + idx = int(other_idx * ratio) |
| 1153 | + total.bins[idx][1] += bin[1] |
| 1154 | + |
| 1155 | + return total |
| 1156 | + |
| 1157 | + |
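A worked example of merging two of these histograms when the bin sizes differ by a 2:1 ratio; the coarse bin's single count is split evenly across the two finer bins it spans (values are illustrative):

```python
fine = Histogram(4, 0.5)    # bins at 0.0, 0.5, 1.0, 1.5 seconds
coarse = Histogram(2, 1.0)  # bins at 0.0 and 1.0 seconds
fine.update(0.2)
coarse.update(1.2)

total = fine + coarse       # result keeps the finer 0.5s bins
print(total.bins)           # [[0.0, 1.0], [0.5, 0.0], [1.0, 0.5], [1.5, 0.5]]
```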
| 1158 | +class RequestTimes: |
| 1159 | + """Collect statistics from requests. |
| 1160 | + |
| 1161 | + Statistics are updated by calling the add_request() method. |
| 1162 | + |
| 1163 | + Statistics for mean/stddev/total/median for request times, SQL times and |
| 1164 | + number of SQL statements are collected. |
| 1165 | + |
| 1166 | + They are grouped by Category, URL or PageID. |
| 1167 | + """ |
| 1168 | + |
| 1169 | + def __init__(self, categories, options): |
| 1170 | + self.by_pageids = options.pageids |
| 1171 | + self.top_urls = options.top_urls |
| 1172 | + # We only keep in memory 50 times the number of URLs we want to |
| 1173 | + # return. The number of URLs can go pretty high (because of the |
| 1174 | + # distinct query parameters). |
| 1175 | + # |
| 1176 | + # Keeping all in memory at once is prohibitive. On a small but |
| 1177 | + # representative sample, keeping 50 times the possible number of |
| 1178 | + # candidates and culling to 90% on overflow, generated an identical |
| 1179 | + # report than keeping all the candidates in-memory. |
| 1180 | + # |
| 1181 | + # Keeping 10 times or culling at 90% generated a near-identical report |
| 1182 | + # (it differed a little in the tail.) |
| 1183 | + # |
| 1184 | + # The size/cull parameters might need to change if the requests |
| 1185 | + # distribution become very different than what it currently is. |
| 1186 | + self.top_urls_cache_size = self.top_urls * 50 |
| 1187 | + |
| 1188 | + # Histogram has a bin per resolution up to our timeout |
| 1189 | + # (and an extra bin). |
| 1190 | + self.histogram_resolution = float(options.resolution) |
| 1191 | + self.histogram_width = int( |
| 1192 | + options.timeout / self.histogram_resolution) + 1 |
| 1193 | + self.category_times = [ |
| 1194 | + (category, OnlineStats( |
| 1195 | + self.histogram_width, self.histogram_resolution)) |
| 1196 | + for category in categories] |
| 1197 | + self.url_times = {} |
| 1198 | + self.pageid_times = {} |
| 1199 | + |
| 1200 | + def add_request(self, request): |
| 1201 | + """Add request to the set of requests we collect stats for.""" |
| 1202 | + matched = [] |
| 1203 | + for category, stats in self.category_times: |
| 1204 | + if category.match(request): |
| 1205 | + stats.update(request) |
| 1206 | + if category.partition: |
| 1207 | + matched.append(category.title) |
| 1208 | + |
| 1209 | + if len(matched) > 1: |
| 1210 | + log.warning( |
| 1211 | + "Multiple partition categories matched by %s (%s)", |
| 1212 | + request.url, ", ".join(matched)) |
| 1213 | + elif not matched: |
| 1214 | + log.warning("%s isn't part of the partition", request.url) |
| 1215 | + |
| 1216 | + if self.by_pageids: |
| 1217 | + pageid = request.pageid or 'Unknown' |
| 1218 | + stats = self.pageid_times.setdefault( |
| 1219 | + pageid, OnlineStats( |
| 1220 | + self.histogram_width, self.histogram_resolution)) |
| 1221 | + stats.update(request) |
| 1222 | + |
| 1223 | + if self.top_urls: |
| 1224 | + stats = self.url_times.setdefault( |
| 1225 | + request.url, OnlineStats( |
| 1226 | + self.histogram_width, self.histogram_resolution)) |
| 1227 | + stats.update(request) |
| 1228 | + # Whenever we have more URLs than we need to, discard 10% |
| 1229 | + # that is less likely to end up in the top. |
| 1230 | + if len(self.url_times) > self.top_urls_cache_size: |
| 1231 | + cutoff = int(self.top_urls_cache_size*0.90) |
| 1232 | + self.url_times = dict( |
| 1233 | + sorted(self.url_times.items(), |
| 1234 | + key=lambda (url, stats): stats.total_time, |
| 1235 | + reverse=True)[:cutoff]) |
| 1236 | + |
| 1237 | + def get_category_times(self): |
| 1238 | + """Return the times for each category.""" |
| 1239 | + return self.category_times |
| 1240 | + |
| 1241 | + def get_top_urls_times(self): |
| 1242 | + """Return the times for the Top URL by total time""" |
| 1243 | + # Sort the result by total time |
| 1244 | + return sorted( |
| 1245 | + self.url_times.items(), |
| 1246 | + key=lambda (url, stats): stats.total_time, |
| 1247 | + reverse=True)[:self.top_urls] |
| 1248 | + |
| 1249 | + def get_pageid_times(self): |
| 1250 | + """Return the times for the pageids.""" |
| 1251 | + # Sort the result by pageid |
| 1252 | + return sorted(self.pageid_times.items()) |
| 1253 | + |
| 1254 | + def __add__(self, other): |
| 1255 | + """Merge two RequestTimes together.""" |
| 1256 | + results = copy.deepcopy(self) |
| 1257 | + for other_category, other_stats in other.category_times: |
| 1258 | + for i, (category, stats) in enumerate(self.category_times): |
| 1259 | + if category.title == other_category.title: |
| 1260 | + results.category_times[i] = ( |
| 1261 | + category, stats + other_stats) |
| 1262 | + break |
| 1263 | + else: |
| 1264 | + results.category_times.append( |
| 1265 | + (other_category, copy.deepcopy(other_stats))) |
| 1266 | + |
| 1267 | + url_times = results.url_times |
| 1268 | + for url, stats in other.url_times.items(): |
| 1269 | + if url in url_times: |
| 1270 | + url_times[url] += stats |
| 1271 | + else: |
| 1272 | + url_times[url] = copy.deepcopy(stats) |
| 1273 | + # Only keep top_urls_cache_size entries. |
| 1274 | + if len(self.url_times) > self.top_urls_cache_size: |
| 1275 | + self.url_times = dict( |
| 1276 | + sorted( |
| 1277 | + url_times.items(), |
| 1278 | + key=lambda (url, stats): stats.total_time, |
| 1279 | + reverse=True)[:self.top_urls_cache_size]) |
| 1280 | + |
| 1281 | + pageid_times = results.pageid_times |
| 1282 | + for pageid, stats in other.pageid_times.items(): |
| 1283 | + if pageid in pageid_times: |
| 1284 | + pageid_times[pageid] += stats |
| 1285 | + else: |
| 1286 | + pageid_times[pageid] = copy.deepcopy(stats) |
| 1287 | + |
| 1288 | + return results |
| 1289 | + |
| 1290 | + |
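The top-URL cache culling described in RequestTimes.__init__ is worth stating on its own; a hedged standalone sketch (function name and shape are mine, not from the diff):

```python
def cull_url_times(url_times, cache_size):
    # Once the cache outgrows its bound, keep only the 90% of URLs with
    # the highest total render time; the discarded tail is unlikely to
    # reach the top-N report.
    if len(url_times) <= cache_size:
        return url_times
    cutoff = int(cache_size * 0.90)
    return dict(sorted(url_times.items(),
                       key=lambda item: item[1].total_time,
                       reverse=True)[:cutoff])
```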
| 1291 | +def main(): |
| 1292 | + parser = ExtendedOptionParser("%prog [args] tracelog [...]") |
| 1293 | + |
| 1294 | + parser.add_option( |
| 1295 | + "-c", "--config", dest="config", |
| 1296 | + default="page-performance-report.ini", |
| 1297 | + metavar="FILE", help="Load configuration from FILE") |
| 1298 | + parser.add_option( |
| 1299 | + "--from", dest="from_ts", type="datetime", |
| 1300 | + default=None, metavar="TIMESTAMP", |
| 1301 | + help="Ignore log entries before TIMESTAMP") |
| 1302 | + parser.add_option( |
| 1303 | + "--until", dest="until_ts", type="datetime", |
| 1304 | + default=None, metavar="TIMESTAMP", |
| 1305 | + help="Ignore log entries after TIMESTAMP") |
| 1306 | + parser.add_option( |
| 1307 | + "--no-partition", dest="partition", |
| 1308 | + action="store_false", default=True, |
| 1309 | + help="Do not produce partition report") |
| 1310 | + parser.add_option( |
| 1311 | + "--no-categories", dest="categories", |
| 1312 | + action="store_false", default=True, |
| 1313 | + help="Do not produce categories report") |
| 1314 | + parser.add_option( |
| 1315 | + "--no-pageids", dest="pageids", |
| 1316 | + action="store_false", default=True, |
| 1317 | + help="Do not produce pageids report") |
| 1318 | + parser.add_option( |
| 1319 | + "--top-urls", dest="top_urls", type=int, metavar="N", |
| 1320 | + default=50, help="Generate report for top N urls by hitcount.") |
| 1321 | + parser.add_option( |
| 1322 | + "--directory", dest="directory", |
| 1323 | + default=os.getcwd(), metavar="DIR", |
| 1324 | + help="Output reports in DIR directory") |
| 1325 | + parser.add_option( |
| 1326 | + "--timeout", dest="timeout", |
| 1327 | + # Default to 9: our production timeout. |
| 1328 | + default=9, type="int", metavar="SECONDS", |
| 1329 | + help="The configured timeout value: used to determine high risk " + |
| 1330 | + "page ids: pages whose 99th percentile render time is " |
| 1331 | + "greater than timeout - 2s. Default is %defaults.") |
| 1332 | + parser.add_option( |
| 1333 | + "--histogram-resolution", dest="resolution", |
| 1334 | + # Default to 0.5s |
| 1335 | + default=0.5, type="float", metavar="SECONDS", |
| 1336 | + help="The resolution of the histogram bin width. Default is " |
| 1337 | + "%defaults.") |
| 1338 | + parser.add_option( |
| 1339 | + "--merge", dest="merge", |
| 1340 | + default=False, action='store_true', |
| 1341 | + help="Files are interpreted as pickled stats and are aggregated " + |
| 1342 | + "for the report.") |
| 1343 | + |
| 1344 | + options, args = parser.parse_args() |
| 1345 | + |
| 1346 | + if not os.path.isdir(options.directory): |
| 1347 | + parser.error("Directory %s does not exist" % options.directory) |
| 1348 | + |
| 1349 | + if len(args) == 0: |
| 1350 | + parser.error("At least one zserver tracelog file must be provided") |
| 1351 | + |
| 1352 | + if options.from_ts is not None and options.until_ts is not None: |
| 1353 | + if options.from_ts > options.until_ts: |
| 1354 | + parser.error( |
| 1355 | + "--from timestamp %s is before --until timestamp %s" |
| 1356 | + % (options.from_ts, options.until_ts)) |
| 1357 | + if options.from_ts is not None or options.until_ts is not None: |
| 1358 | + if options.merge: |
| 1359 | + parser.error('--from and --until cannot be used with --merge') |
| 1360 | + |
| 1361 | + for filename in args: |
| 1362 | + if not os.path.exists(filename): |
| 1363 | + parser.error("Tracelog file %s not found." % filename) |
| 1364 | + |
| 1365 | + if not os.path.exists(options.config): |
| 1366 | + parser.error("Config file %s not found." % options.config) |
| 1367 | + |
| 1368 | + # Need a better config mechanism as ConfigParser doesn't preserve order. |
| 1369 | + script_config = RawConfigParser() |
| 1370 | + script_config.optionxform = str # Make keys case sensitive. |
| 1371 | + script_config.readfp(open(options.config)) |
| 1372 | + |
| 1373 | + categories = [] # A list of Category, in report order. |
| 1374 | + for option in script_config.options('categories'): |
| 1375 | + regexp = script_config.get('categories', option) |
| 1376 | + try: |
| 1377 | + categories.append(Category(option, regexp)) |
| 1378 | + except sre_constants.error as x: |
| 1379 | + log.fatal("Unable to compile regexp %r (%s)" % (regexp, x)) |
| 1380 | + return 1 |
| 1381 | + categories.sort() |
| 1382 | + |
| 1383 | + if len(categories) == 0: |
| 1384 | + parser.error("No data in [categories] section of configuration.") |
| 1385 | + |
| 1386 | +    # Determine which categories make up a partition of the requests.
| 1387 | + for option in script_config.options('partition'): |
| 1388 | + for category in categories: |
| 1389 | + if category.title == option: |
| 1390 | + category.partition = True |
| 1391 | + break |
| 1392 | + else: |
| 1393 | + log.warning( |
| 1394 | + "In partition definition: %s isn't a defined category", |
| 1395 | + option) |
| 1396 | + |
| 1397 | + times = RequestTimes(categories, options) |
| 1398 | + |
| 1399 | + if options.merge: |
| 1400 | + for filename in args: |
| 1401 | + log.info('Merging %s...' % filename) |
| 1402 | + f = bz2.BZ2File(filename, 'r') |
| 1403 | + times += cPickle.load(f) |
| 1404 | + f.close() |
| 1405 | + else: |
| 1406 | + parse(args, times, options) |
| 1407 | + |
| 1408 | + category_times = times.get_category_times() |
| 1409 | + |
| 1410 | + pageid_times = [] |
| 1411 | +    url_times = []
| 1412 | + if options.top_urls: |
| 1413 | + url_times = times.get_top_urls_times() |
| 1414 | + if options.pageids: |
| 1415 | + pageid_times = times.get_pageid_times() |
| 1416 | + |
| 1417 | + def _report_filename(filename): |
| 1418 | + return os.path.join(options.directory, filename) |
| 1419 | + |
| 1420 | + # Partition report |
| 1421 | + if options.partition: |
| 1422 | + report_filename = _report_filename('partition.html') |
| 1423 | + log.info("Generating %s", report_filename) |
| 1424 | + partition_times = [ |
| 1425 | + category_time |
| 1426 | + for category_time in category_times |
| 1427 | + if category_time[0].partition] |
| 1428 | + html_report( |
| 1429 | + open(report_filename, 'w'), partition_times, None, None, |
| 1430 | + histogram_resolution=options.resolution, |
| 1431 | + category_name='Partition') |
| 1432 | + |
| 1433 | + # Category only report. |
| 1434 | + if options.categories: |
| 1435 | + report_filename = _report_filename('categories.html') |
| 1436 | + log.info("Generating %s", report_filename) |
| 1437 | + html_report( |
| 1438 | + open(report_filename, 'w'), category_times, None, None, |
| 1439 | + histogram_resolution=options.resolution) |
| 1440 | + |
| 1441 | + # Pageid only report. |
| 1442 | + if options.pageids: |
| 1443 | + report_filename = _report_filename('pageids.html') |
| 1444 | + log.info("Generating %s", report_filename) |
| 1445 | + html_report( |
| 1446 | + open(report_filename, 'w'), None, pageid_times, None, |
| 1447 | + histogram_resolution=options.resolution) |
| 1448 | + |
| 1449 | + # Top URL only report. |
| 1450 | + if options.top_urls: |
| 1451 | + report_filename = _report_filename('top%d.html' % options.top_urls) |
| 1452 | + log.info("Generating %s", report_filename) |
| 1453 | + html_report( |
| 1454 | + open(report_filename, 'w'), None, None, url_times, |
| 1455 | + histogram_resolution=options.resolution) |
| 1456 | + |
| 1457 | + # Combined report. |
| 1458 | + if options.categories and options.pageids: |
| 1459 | + report_filename = _report_filename('combined.html') |
| 1460 | + html_report( |
| 1461 | + open(report_filename, 'w'), |
| 1462 | + category_times, pageid_times, url_times, |
| 1463 | + histogram_resolution=options.resolution) |
| 1464 | + |
| 1465 | + # Report of likely timeout candidates |
| 1466 | + report_filename = _report_filename('timeout-candidates.html') |
| 1467 | + log.info("Generating %s", report_filename) |
| 1468 | + html_report( |
| 1469 | + open(report_filename, 'w'), None, pageid_times, None, |
| 1470 | + options.timeout - 2, |
| 1471 | + histogram_resolution=options.resolution) |
| 1472 | + |
| 1473 | + # Save the times cache for later merging. |
| 1474 | + report_filename = _report_filename('stats.pck.bz2') |
| 1475 | + log.info("Saving times database in %s", report_filename) |
| 1476 | + stats_file = bz2.BZ2File(report_filename, 'w') |
| 1477 | + cPickle.dump(times, stats_file, protocol=cPickle.HIGHEST_PROTOCOL) |
| 1478 | + stats_file.close() |
| 1479 | + |
| 1480 | + # Output metrics for selected categories. |
| 1481 | + report_filename = _report_filename('metrics.dat') |
| 1482 | + log.info('Saving category_metrics %s', report_filename) |
| 1483 | + metrics_file = open(report_filename, 'w') |
| 1484 | + writer = csv.writer(metrics_file, delimiter=':') |
| 1485 | + date = options.until_ts or options.from_ts or datetime.utcnow() |
| 1486 | + date = time.mktime(date.timetuple()) |
| 1487 | + |
| 1488 | + for option in script_config.options('metrics'): |
| 1489 | + name = script_config.get('metrics', option) |
| 1490 | + for category, stats in category_times: |
| 1491 | + if category.title == name: |
| 1492 | + writer.writerows([ |
| 1493 | + ("%s_99" % option, "%f@%d" % ( |
| 1494 | + stats.ninetyninth_percentile_time, date)), |
| 1495 | + ("%s_hits" % option, "%d@%d" % (stats.total_hits, date))]) |
| 1496 | + break |
| 1497 | + else: |
| 1498 | + log.warning("Can't find category %s for metric %s" % ( |
| 1499 | +            name, option))
| 1500 | + metrics_file.close() |
| 1501 | + |
| 1502 | + return 0 |
| 1503 | + |
| 1504 | + |
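For reference, the metrics.dat file written at the end of main() is
colon-delimited, one name:value@epoch row per metric. A sketch of the output
for a hypothetical [metrics] entry named "ppr" (all values invented for
illustration)::

    import csv
    import sys

    writer = csv.writer(sys.stdout, delimiter=':')
    # A category with a 1.5s 99th-percentile time and 1234 hits,
    # stamped with an arbitrary epoch, as main() formats them.
    writer.writerows([
        ('ppr_99', '%f@%d' % (1.5, 1344484800)),
        ('ppr_hits', '%d@%d' % (1234, 1344484800)),
    ])
    # ppr_99:1.500000@1344484800
    # ppr_hits:1234@1344484800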
| 1505 | +def smart_open(filename, mode='r'): |
| 1506 | + """Open a file, transparently handling compressed files. |
| 1507 | + |
| 1508 | + Compressed files are detected by file extension. |
| 1509 | + """ |
| 1510 | + ext = os.path.splitext(filename)[1] |
| 1511 | +    if ext == '.bz2':
| 1512 | +        return bz2.BZ2File(filename, mode)
| 1513 | +    elif ext == '.gz':
| 1514 | +        return gzip.GzipFile(filename, mode)
| 1515 | + else: |
| 1516 | + return open(filename, mode) |
| 1517 | + |
| 1518 | + |
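smart_open dispatches purely on the file extension, so plain and compressed
tracelogs can be mixed freely on the command line. A quick sketch (file
names hypothetical)::

    for name in ['trace.log', 'trace.log.gz', 'trace.log.bz2']:
        f = smart_open(name)  # plain file, GzipFile or BZ2File
        for line in f:        # line iteration works the same for all three
            pass
        f.close()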
| 1519 | +class MalformedLine(Exception): |
| 1520 | + """A malformed line was found in the trace log.""" |
| 1521 | + |
| 1522 | + |
| 1523 | +_ts_re = re.compile( |
| 1524 | +    r'^(\d{4})-(\d\d)-(\d\d)\s(\d\d):(\d\d):(\d\d)(?:\.(\d{6}))?$')
| 1525 | + |
| 1526 | + |
| 1527 | +def parse_timestamp(ts_string): |
| 1528 | + match = _ts_re.search(ts_string) |
| 1529 | + if match is None: |
| 1530 | + raise ValueError("Invalid timestamp") |
| 1531 | + return datetime( |
| 1532 | + *(int(elem) for elem in match.groups() if elem is not None)) |
| 1533 | + |
| 1534 | + |
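The final regexp group makes the microsecond part optional, so both
second- and microsecond-resolution stamps parse::

    parse_timestamp('2012-08-09 04:56:19')
    # datetime.datetime(2012, 8, 9, 4, 56, 19)
    parse_timestamp('2012-08-09 04:56:19.123456')
    # datetime.datetime(2012, 8, 9, 4, 56, 19, 123456)
    parse_timestamp('not a timestamp')  # raises ValueError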
| 1535 | +def parse(tracefiles, times, options): |
| 1536 | + requests = {} |
| 1537 | + total_requests = 0 |
| 1538 | + for tracefile in tracefiles: |
| 1539 | + log.info('Processing %s', tracefile) |
| 1540 | + for line in smart_open(tracefile): |
| 1541 | + line = line.rstrip() |
| 1542 | + try: |
| 1543 | + record = line.split(' ', 7) |
| 1544 | + try: |
| 1545 | + record_type, request_id, date, time_ = record[:4] |
| 1546 | + except ValueError: |
| 1547 | + raise MalformedLine() |
| 1548 | + |
| 1549 | + if record_type == 'S': |
| 1550 | + # Short circuit - we don't care about these entries. |
| 1551 | + continue |
| 1552 | + |
| 1553 | + # Parse the timestamp. |
| 1554 | + ts_string = '%s %s' % (date, time_) |
| 1555 | + try: |
| 1556 | + dt = parse_timestamp(ts_string) |
| 1557 | + except ValueError: |
| 1558 | + raise MalformedLine( |
| 1559 | + 'Invalid timestamp %s' % repr(ts_string)) |
| 1560 | + |
| 1561 | + # Filter entries by command line date range. |
| 1562 | + if options.from_ts is not None and dt < options.from_ts: |
| 1563 | + continue # Skip to next line. |
| 1564 | + if options.until_ts is not None and dt > options.until_ts: |
| 1565 | + break # Skip to next log file. |
| 1566 | + |
| 1567 | + args = record[4:] |
| 1568 | + |
| 1569 | + def require_args(count): |
| 1570 | + if len(args) < count: |
| 1571 | + raise MalformedLine() |
| 1572 | + |
| 1573 | + if record_type == 'B': # Request begins. |
| 1574 | + require_args(2) |
| 1575 | + requests[request_id] = Request(dt, args[0], args[1]) |
| 1576 | + continue |
| 1577 | + |
| 1578 | + request = requests.get(request_id, None) |
| 1579 | + if request is None: # Just ignore partial records. |
| 1580 | + continue |
| 1581 | + |
| 1582 | +                # Old style extension record from Launchpad. Just
| 1583 | + # contains the URL. |
| 1584 | + if (record_type == '-' and len(args) == 1 |
| 1585 | + and args[0].startswith('http')): |
| 1586 | + request.url = args[0] |
| 1587 | + |
| 1588 | + # New style extension record with a prefix. |
| 1589 | + elif record_type == '-': |
| 1590 | + # Launchpad outputs several things as tracelog |
| 1591 | + # extension records. We include a prefix to tell |
| 1592 | + # them apart. |
| 1593 | + require_args(1) |
| 1594 | + |
| 1595 | + parse_extension_record(request, args) |
| 1596 | + |
| 1597 | + elif record_type == 'I': # Got request input. |
| 1598 | + require_args(1) |
| 1599 | + request.I(dt, args[0]) |
| 1600 | + |
| 1601 | + elif record_type == 'C': # Entered application thread. |
| 1602 | + request.C(dt) |
| 1603 | + |
| 1604 | + elif record_type == 'A': # Application done. |
| 1605 | + require_args(2) |
| 1606 | + request.A(dt, args[0], args[1]) |
| 1607 | + |
| 1608 | + elif record_type == 'E': # Request done. |
| 1609 | + del requests[request_id] |
| 1610 | + request.E(dt) |
| 1611 | + total_requests += 1 |
| 1612 | + if total_requests % 10000 == 0: |
| 1613 | + log.debug("Parsed %d requests", total_requests) |
| 1614 | + |
| 1615 | + # Add the request to any matching categories. |
| 1616 | + times.add_request(request) |
| 1617 | + else: |
| 1618 | +                    raise MalformedLine('Unknown record type %s' % record_type)
| 1619 | + except MalformedLine as x: |
| 1620 | + log.error( |
| 1621 | + "Malformed line %s (%s)" % (repr(line), x)) |
| 1622 | + |
| 1623 | + |
| 1624 | +def parse_extension_record(request, args): |
| 1625 | +    """Decode a ZServer extension record and annotate the request."""
| 1626 | + prefix = args[0] |
| 1627 | + |
| 1628 | + if prefix == 'u': |
| 1629 | + request.url = ' '.join(args[1:]) or None |
| 1630 | + elif prefix == 'p': |
| 1631 | + request.pageid = ' '.join(args[1:]) or None |
| 1632 | + elif prefix == 't': |
| 1633 | + if len(args) != 4: |
| 1634 | + raise MalformedLine("Wrong number of arguments %s" % (args,)) |
| 1635 | + request.sql_statements = int(args[2]) |
| 1636 | + request.sql_seconds = float(args[3]) / 1000 |
| 1637 | + else: |
| 1638 | + raise MalformedLine( |
| 1639 | + "Unknown extension prefix %s" % prefix) |
| 1640 | + |
| 1641 | + |
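A sketch of the three extension prefixes in action; _Req stands in for the
Request class defined earlier in the file, and the second field of the 't'
record (which the parser ignores) is shown as '-'::

    class _Req(object):
        url = pageid = sql_statements = sql_seconds = None

    r = _Req()
    parse_extension_record(r, ['u', 'https://launchpad.net/bugs'])
    # r.url == 'https://launchpad.net/bugs'
    parse_extension_record(r, ['p', '+bugs'])
    # r.pageid == '+bugs'
    parse_extension_record(r, ['t', '-', '10', '1500'])
    # r.sql_statements == 10; r.sql_seconds == 1.5 (milliseconds / 1000)
    parse_extension_record(r, ['x', 'y'])  # raises MalformedLine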
| 1642 | +def html_report( |
| 1643 | + outf, category_times, pageid_times, url_times, |
| 1644 | + ninetyninth_percentile_threshold=None, histogram_resolution=0.5, |
| 1645 | + category_name='Category'): |
| 1646 | + """Write an html report to outf. |
| 1647 | + |
| 1648 | + :param outf: A file object to write the report to. |
| 1649 | + :param category_times: The time statistics for categories. |
| 1650 | + :param pageid_times: The time statistics for pageids. |
| 1651 | +    :param url_times: The time statistics for the top N URLs.
| 1652 | + :param ninetyninth_percentile_threshold: Lower threshold for inclusion of |
| 1653 | + pages in the pageid section; pages where 99 percent of the requests are |
| 1654 | + served under this threshold will not be included. |
| 1655 | + :param histogram_resolution: used as the histogram bar width |
| 1656 | + :param category_name: The name to use for category report. Defaults to |
| 1657 | + 'Category'. |
| 1658 | + """ |
| 1659 | + |
| 1660 | + print >> outf, dedent('''\ |
| 1661 | + <!DOCTYPE html> |
| 1662 | + <html> |
| 1663 | + <head> |
| 1664 | + <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| 1665 | + <title>Launchpad Page Performance Report %(date)s</title> |
| 1666 | + <script language="javascript" type="text/javascript" |
| 1667 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/flot/jquery.min.js" |
| 1668 | + ></script> |
| 1669 | + <script language="javascript" type="text/javascript" |
| 1670 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/jquery.appear-1.1.1.min.js" |
| 1671 | + ></script> |
| 1672 | + <script language="javascript" type="text/javascript" |
| 1673 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/flot/jquery.flot.min.js" |
| 1674 | + ></script> |
| 1675 | + <script language="javascript" type="text/javascript" |
| 1676 | + src="https://devpad.canonical.com/~lpqateam/ppr/js/sorttable.js"></script> |
| 1677 | + <style type="text/css"> |
| 1678 | + h3 { font-weight: normal; font-size: 1em; } |
| 1679 | + thead th { padding-left: 1em; padding-right: 1em; } |
| 1680 | + .category-title { text-align: right; padding-right: 2em; |
| 1681 | + max-width: 25em; } |
| 1682 | + .regexp { font-size: x-small; font-weight: normal; } |
| 1683 | + .mean { text-align: right; padding-right: 1em; } |
| 1684 | + .median { text-align: right; padding-right: 1em; } |
| 1685 | + .standard-deviation { text-align: right; padding-right: 1em; } |
| 1686 | + .histogram { padding: 0.5em 1em; width:400px; height:250px; } |
| 1687 | + .odd-row { background-color: #eeeeff; } |
| 1688 | + .even-row { background-color: #ffffee; } |
| 1689 | + table.sortable thead { |
| 1690 | + background-color:#eee; |
| 1691 | + color:#666666; |
| 1692 | + font-weight: bold; |
| 1693 | + cursor: default; |
| 1694 | + } |
| 1695 | + td.numeric { |
| 1696 | + font-family: monospace; |
| 1697 | + text-align: right; |
| 1698 | + padding: 1em; |
| 1699 | + } |
| 1700 | + .clickable { cursor: hand; } |
| 1701 | + .total-hits, .histogram, .median-sqltime, |
| 1702 | + .median-sqlstatements { border-right: 1px dashed #000000; } |
| 1703 | + </style> |
| 1704 | + </head> |
| 1705 | + <body> |
| 1706 | + <h1>Launchpad Page Performance Report</h1> |
| 1707 | + <h3>%(date)s</h3> |
| 1708 | + ''' % {'date': time.ctime()}) |
| 1709 | + |
| 1710 | + table_header = dedent('''\ |
| 1711 | + <table class="sortable page-performance-report"> |
| 1712 | + <caption align="top">Click on column headings to sort.</caption> |
| 1713 | + <thead> |
| 1714 | + <tr> |
| 1715 | + <th class="clickable">Name</th> |
| 1716 | + |
| 1717 | + <th class="clickable">Total Hits</th> |
| 1718 | + |
| 1719 | + <th class="clickable">99% Under Time (secs)</th> |
| 1720 | + |
| 1721 | + <th class="clickable">Mean Time (secs)</th> |
| 1722 | + <th class="clickable">Time Standard Deviation</th> |
| 1723 | + <th class="clickable">Median Time (secs)</th> |
| 1724 | + <th class="sorttable_nosort">Time Distribution</th> |
| 1725 | + |
| 1726 | + <th class="clickable">99% Under SQL Time (secs)</th> |
| 1727 | + <th class="clickable">Mean SQL Time (secs)</th> |
| 1728 | + <th class="clickable">SQL Time Standard Deviation</th> |
| 1729 | + <th class="clickable">Median SQL Time (secs)</th> |
| 1730 | + |
| 1731 | + <th class="clickable">99% Under SQL Statements</th> |
| 1732 | + <th class="clickable">Mean SQL Statements</th> |
| 1733 | + <th class="clickable">SQL Statement Standard Deviation</th> |
| 1734 | + <th class="clickable">Median SQL Statements</th> |
| 1735 | + |
| 1736 | + <th class="clickable">Hits * 99% Under SQL Statement</th> |
| 1737 | + </tr> |
| 1738 | + </thead> |
| 1739 | + <tbody> |
| 1740 | + ''') |
| 1741 | + table_footer = "</tbody></table>" |
| 1742 | + |
| 1743 | + # Store our generated histograms to output Javascript later. |
| 1744 | + histograms = [] |
| 1745 | + |
| 1746 | + def handle_times(html_title, stats): |
| 1747 | + histograms.append(stats.histogram) |
| 1748 | + print >> outf, dedent("""\ |
| 1749 | + <tr> |
| 1750 | + <th class="category-title">%s</th> |
| 1751 | + <td class="numeric total-hits">%d</td> |
| 1752 | + <td class="numeric 99pc-under-time">%.2f</td> |
| 1753 | + <td class="numeric mean-time">%.2f</td> |
| 1754 | + <td class="numeric std-time">%.2f</td> |
| 1755 | + <td class="numeric median-time">%.2f</td> |
| 1756 | + <td> |
| 1757 | + <div class="histogram" id="histogram%d"></div> |
| 1758 | + </td> |
| 1759 | + <td class="numeric 99pc-under-sqltime">%.2f</td> |
| 1760 | + <td class="numeric mean-sqltime">%.2f</td> |
| 1761 | + <td class="numeric std-sqltime">%.2f</td> |
| 1762 | + <td class="numeric median-sqltime">%.2f</td> |
| 1763 | + |
| 1764 | + <td class="numeric 99pc-under-sqlstatement">%.f</td> |
| 1765 | + <td class="numeric mean-sqlstatements">%.2f</td> |
| 1766 | + <td class="numeric std-sqlstatements">%.2f</td> |
| 1767 | + <td class="numeric median-sqlstatements">%.2f</td> |
| 1768 | + |
| 1769 | + <td class="numeric high-db-usage">%.f</td> |
| 1770 | + </tr> |
| 1771 | + """ % ( |
| 1772 | + html_title, |
| 1773 | + stats.total_hits, stats.ninetyninth_percentile_time, |
| 1774 | + stats.mean, stats.std, stats.median, |
| 1775 | + len(histograms) - 1, |
| 1776 | + stats.ninetyninth_percentile_sqltime, stats.mean_sqltime, |
| 1777 | + stats.std_sqltime, stats.median_sqltime, |
| 1778 | + stats.ninetyninth_percentile_sqlstatements, |
| 1779 | + stats.mean_sqlstatements, |
| 1780 | + stats.std_sqlstatements, stats.median_sqlstatements, |
| 1781 | +            stats.ninetyninth_percentile_sqlstatements * stats.total_hits,
| 1782 | + )) |
| 1783 | + |
| 1784 | + # Table of contents |
| 1785 | + print >> outf, '<ol>' |
| 1786 | + if category_times: |
| 1787 | + print >> outf, '<li><a href="#catrep">%s Report</a></li>' % ( |
| 1788 | + category_name) |
| 1789 | + if pageid_times: |
| 1790 | + print >> outf, '<li><a href="#pageidrep">Pageid Report</a></li>' |
| 1791 | + if url_times: |
| 1792 | + print >> outf, '<li><a href="#topurlrep">Top URL Report</a></li>' |
| 1793 | + print >> outf, '</ol>' |
| 1794 | + |
| 1795 | + if category_times: |
| 1796 | + print >> outf, '<h2 id="catrep">%s Report</h2>' % ( |
| 1797 | + category_name) |
| 1798 | + print >> outf, table_header |
| 1799 | + for category, times in category_times: |
| 1800 | + html_title = '%s<br/><span class="regexp">%s</span>' % ( |
| 1801 | + html_quote(category.title), html_quote(category.regexp)) |
| 1802 | + handle_times(html_title, times) |
| 1803 | + print >> outf, table_footer |
| 1804 | + |
| 1805 | + if pageid_times: |
| 1806 | + print >> outf, '<h2 id="pageidrep">Pageid Report</h2>' |
| 1807 | + print >> outf, table_header |
| 1808 | + for pageid, times in pageid_times: |
| 1809 | + if (ninetyninth_percentile_threshold is not None and |
| 1810 | + (times.ninetyninth_percentile_time < |
| 1811 | + ninetyninth_percentile_threshold)): |
| 1812 | + continue |
| 1813 | + handle_times(html_quote(pageid), times) |
| 1814 | + print >> outf, table_footer |
| 1815 | + |
| 1816 | + if url_times: |
| 1817 | + print >> outf, '<h2 id="topurlrep">Top URL Report</h2>' |
| 1818 | + print >> outf, table_header |
| 1819 | + for url, times in url_times: |
| 1820 | + handle_times(html_quote(url), times) |
| 1821 | + print >> outf, table_footer |
| 1822 | + |
| 1823 | +    # Output the JavaScript to render our histograms nicely, replacing
| 1824 | + # the placeholder <div> tags output earlier. |
| 1825 | + print >> outf, dedent("""\ |
| 1826 | + <script language="javascript" type="text/javascript"> |
| 1827 | + $(function () { |
| 1828 | + var options = { |
| 1829 | + series: { |
| 1830 | + bars: {show: true, barWidth: %s} |
| 1831 | + }, |
| 1832 | + xaxis: { |
| 1833 | + tickFormatter: function (val, axis) { |
| 1834 | + return val.toFixed(axis.tickDecimals) + "s"; |
| 1835 | + } |
| 1836 | + }, |
| 1837 | + yaxis: { |
| 1838 | + min: 0, |
| 1839 | + max: 1, |
| 1840 | + transform: function (v) { |
| 1841 | + return Math.pow(Math.log(v*100+1)/Math.LN2, 0.5); |
| 1842 | + }, |
| 1843 | + inverseTransform: function (v) { |
| 1844 | +                        return (Math.exp(v*v*Math.LN2) - 1)/100;
| 1845 | + }, |
| 1846 | + tickDecimals: 1, |
| 1847 | + tickFormatter: function (val, axis) { |
| 1848 | + return (val * 100).toFixed(axis.tickDecimals) + "%%"; |
| 1849 | + }, |
| 1850 | + ticks: [0.001,0.01,0.10,0.50,1.0] |
| 1851 | + }, |
| 1852 | + grid: { |
| 1853 | + aboveData: true, |
| 1854 | + labelMargin: 15 |
| 1855 | + } |
| 1856 | + }; |
| 1857 | + """ % histogram_resolution) |
| 1858 | + |
| 1859 | + for i, histogram in enumerate(histograms): |
| 1860 | + if histogram.count == 0: |
| 1861 | + continue |
| 1862 | + print >> outf, dedent("""\ |
| 1863 | + function plot_histogram_%(id)d() { |
| 1864 | + var d = %(data)s; |
| 1865 | + |
| 1866 | + $.plot( |
| 1867 | + $("#histogram%(id)d"), |
| 1868 | + [{data: d}], options); |
| 1869 | + } |
| 1870 | + $('#histogram%(id)d').appear(function() { |
| 1871 | + plot_histogram_%(id)d(); |
| 1872 | + }); |
| 1873 | + |
| 1874 | + """ % {'id': i, 'data': json.dumps(histogram.bins_relative)}) |
| 1875 | + |
| 1876 | + print >> outf, dedent("""\ |
| 1877 | + }); |
| 1878 | + </script> |
| 1879 | + </body> |
| 1880 | + </html> |
| 1881 | + """) |
| 1882 | |
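The y axis of the histograms above is warped by the flot transform
v -> sqrt(log2(100v + 1)) to spread out the low-percentage bins, and
inverseTransform is its algebraic inverse, v -> (2**(v*v) - 1) / 100. A
round-trip check in Python::

    import math

    def transform(v):
        return math.sqrt(math.log(v * 100 + 1) / math.log(2))

    def inverse(v):
        return (math.exp(v ** 2 * math.log(2)) - 1) / 100

    all(abs(inverse(transform(v)) - v) < 1e-9
        for v in [0.001, 0.01, 0.1, 0.5, 1.0])
    # True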
| 1883 | === added file 'setup.py' |
| 1884 | --- setup.py 1970-01-01 00:00:00 +0000 |
| 1885 | +++ setup.py 2012-08-09 04:56:19 +0000 |
| 1886 | @@ -0,0 +1,50 @@ |
| 1887 | +#!/usr/bin/env python |
| 1888 | +# |
| 1889 | +# Copyright (c) 2012, Canonical Ltd |
| 1890 | +# |
| 1891 | +# This program is free software: you can redistribute it and/or modify |
| 1892 | +# it under the terms of the GNU Lesser General Public License as published by |
| 1893 | +# the Free Software Foundation, version 3 only. |
| 1894 | +# |
| 1895 | +# This program is distributed in the hope that it will be useful, |
| 1896 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
| 1897 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
| 1898 | +# GNU Lesser General Public License for more details. |
| 1899 | +# |
| 1900 | +# You should have received a copy of the GNU Lesser General Public License |
| 1901 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
| 1902 | +# GNU Lesser General Public License version 3 (see the file LICENSE). |
| 1903 | + |
| 1904 | +from distutils.core import setup |
| 1905 | +import os.path |
| 1906 | + |
| 1907 | +description = open(
| 1908 | + os.path.join(os.path.dirname(__file__), 'README'), 'rb').read() |
| 1909 | + |
| 1910 | +setup(name="lp-dev-utils", |
| 1911 | + version="0.0.0", |
| 1912 | +      description=(
| 1913 | +          "Tools for working on or with Launchpad."),
| 1914 | + long_description=description, |
| 1915 | + maintainer="Launchpad Developers", |
| 1916 | + maintainer_email="launchpad-dev@lists.launchpad.net", |
| 1917 | + url="https://launchpad.net/lp-dev-utils", |
| 1918 | + packages=['ec2test'], |
| 1919 | +      package_dir={'': '.'},
| 1920 | +      classifiers=[
| 1921 | + 'Development Status :: 2 - Pre-Alpha', |
| 1922 | + 'Intended Audience :: Developers', |
| 1923 | + 'License :: OSI Approved :: GNU General Public License v3 (GPLv3)', |
| 1924 | + 'Operating System :: OS Independent', |
| 1925 | + 'Programming Language :: Python', |
| 1926 | + ], |
| 1927 | +      install_requires=[
| 1928 | + 'zc.zservertracelog', |
| 1929 | + ], |
| 1930 | +      extras_require=dict(
| 1931 | + test=[ |
| 1932 | + 'fixtures', |
| 1933 | + 'testtools', |
| 1934 | + ] |
| 1935 | + ), |
| 1936 | + ) |
| 1937 | |
| 1938 | === added file 'test_pageperformancereport.py' |
| 1939 | --- test_pageperformancereport.py 1970-01-01 00:00:00 +0000 |
| 1940 | +++ test_pageperformancereport.py 2012-08-09 04:56:19 +0000 |
| 1941 | @@ -0,0 +1,486 @@ |
| 1942 | +# Copyright 2010 Canonical Ltd. This software is licensed under the |
| 1943 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
| 1944 | + |
| 1945 | +"""Test the pageperformancereport script.""" |
| 1946 | + |
| 1947 | +__metaclass__ = type |
| 1948 | + |
| 1949 | +import fixtures |
| 1950 | +from testtools import TestCase |
| 1951 | + |
| 1952 | +from pageperformancereport import ( |
| 1953 | + Category, |
| 1954 | + Histogram, |
| 1955 | + OnlineApproximateMedian, |
| 1956 | + OnlineStats, |
| 1957 | + OnlineStatsCalculator, |
| 1958 | + RequestTimes, |
| 1959 | + Stats, |
| 1960 | + ) |
| 1961 | + |
| 1962 | + |
| 1963 | +class FakeOptions: |
| 1964 | + timeout = 5 |
| 1965 | + db_file = None |
| 1966 | + pageids = True |
| 1967 | + top_urls = 3 |
| 1968 | + resolution = 1 |
| 1969 | + |
| 1970 | + def __init__(self, **kwargs): |
| 1971 | + """Assign all arguments as attributes.""" |
| 1972 | + self.__dict__.update(kwargs) |
| 1973 | + |
| 1974 | + |
| 1975 | +class FakeRequest: |
| 1976 | + |
| 1977 | + def __init__(self, url, app_seconds, sql_statements=None, |
| 1978 | + sql_seconds=None, pageid=None): |
| 1979 | + self.url = url |
| 1980 | + self.pageid = pageid |
| 1981 | + self.app_seconds = app_seconds |
| 1982 | + self.sql_statements = sql_statements |
| 1983 | + self.sql_seconds = sql_seconds |
| 1984 | + |
| 1985 | + |
| 1986 | +class FakeStats(Stats): |
| 1987 | + |
| 1988 | + def __init__(self, **kwargs): |
| 1989 | + # Override the constructor to just store the values. |
| 1990 | + self.__dict__.update(kwargs) |
| 1991 | + |
| 1992 | + |
| 1993 | +FAKE_REQUESTS = [ |
| 1994 | + FakeRequest('/', 0.5, pageid='+root'), |
| 1995 | + FakeRequest('/bugs', 4.5, 56, 3.0, pageid='+bugs'), |
| 1996 | + FakeRequest('/bugs', 4.2, 56, 2.2, pageid='+bugs'), |
| 1997 | + FakeRequest('/bugs', 5.5, 76, 4.0, pageid='+bugs'), |
| 1998 | + FakeRequest('/ubuntu', 2.5, 6, 2.0, pageid='+distribution'), |
| 1999 | + FakeRequest('/launchpad', 3.5, 3, 3.0, pageid='+project'), |
| 2000 | + FakeRequest('/bzr', 2.5, 4, 2.0, pageid='+project'), |
| 2001 | + FakeRequest('/bugs/1', 20.5, 567, 14.0, pageid='+bug'), |
| 2002 | + FakeRequest('/bugs/1', 15.5, 567, 9.0, pageid='+bug'), |
| 2003 | + FakeRequest('/bugs/5', 1.5, 30, 1.2, pageid='+bug'), |
| 2004 | + FakeRequest('/lazr', 1.0, 16, 0.3, pageid='+project'), |
| 2005 | + FakeRequest('/drizzle', 0.9, 11, 1.3, pageid='+project'), |
| 2006 | + ] |
| 2007 | + |
| 2008 | + |
| 2009 | +# The category stats computed for the above 12 requests. |
| 2010 | +CATEGORY_STATS = [ |
| 2011 | + # Median is an approximation. |
| 2012 | + # Real values are: 2.50, 2.20, 30 |
| 2013 | + (Category('All', ''), FakeStats( |
| 2014 | + total_hits=12, total_time=62.60, mean=5.22, median=4.20, std=5.99, |
| 2015 | + total_sqltime=42, mean_sqltime=3.82, median_sqltime=3.0, |
| 2016 | + std_sqltime=3.89, |
| 2017 | + total_sqlstatements=1392, mean_sqlstatements=126.55, |
| 2018 | + median_sqlstatements=56, std_sqlstatements=208.94, |
| 2019 | + histogram=[[0, 2], [1, 2], [2, 2], [3, 1], [4, 2], [5, 3]], |
| 2020 | + )), |
| 2021 | + (Category('Test', ''), FakeStats( |
| 2022 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [5, 0]])), |
| 2023 | + (Category('Bugs', ''), FakeStats( |
| 2024 | + total_hits=6, total_time=51.70, mean=8.62, median=4.5, std=6.90, |
| 2025 | + total_sqltime=33.40, mean_sqltime=5.57, median_sqltime=3, |
| 2026 | + std_sqltime=4.52, |
| 2027 | + total_sqlstatements=1352, mean_sqlstatements=225.33, |
| 2028 | + median_sqlstatements=56, std_sqlstatements=241.96, |
| 2029 | + histogram=[[0, 0], [1, 1], [2, 0], [3, 0], [4, 2], [5, 3]], |
| 2030 | + )), |
| 2031 | + ] |
| 2032 | + |
| 2033 | + |
| 2034 | +# The top 3 URL stats computed for the above 12 requests. |
| 2035 | +TOP_3_URL_STATS = [ |
| 2036 | + ('/bugs/1', FakeStats( |
| 2037 | + total_hits=2, total_time=36.0, mean=18.0, median=15.5, std=2.50, |
| 2038 | + total_sqltime=23.0, mean_sqltime=11.5, median_sqltime=9.0, |
| 2039 | + std_sqltime=2.50, |
| 2040 | + total_sqlstatements=1134, mean_sqlstatements=567.0, |
| 2041 | +        median_sqlstatements=567, std_sqlstatements=0,
| 2042 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 0], [5, 2]], |
| 2043 | + )), |
| 2044 | + ('/bugs', FakeStats( |
| 2045 | + total_hits=3, total_time=14.2, mean=4.73, median=4.5, std=0.56, |
| 2046 | + total_sqltime=9.2, mean_sqltime=3.07, median_sqltime=3, |
| 2047 | + std_sqltime=0.74, |
| 2048 | + total_sqlstatements=188, mean_sqlstatements=62.67, |
| 2049 | + median_sqlstatements=56, std_sqlstatements=9.43, |
| 2050 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 2], [5, 1]], |
| 2051 | + )), |
| 2052 | + ('/launchpad', FakeStats( |
| 2053 | + total_hits=1, total_time=3.5, mean=3.5, median=3.5, std=0, |
| 2054 | + total_sqltime=3.0, mean_sqltime=3, median_sqltime=3, std_sqltime=0, |
| 2055 | + total_sqlstatements=3, mean_sqlstatements=3, |
| 2056 | + median_sqlstatements=3, std_sqlstatements=0, |
| 2057 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 1], [4, 0], [5, 0]], |
| 2058 | + )), |
| 2059 | + ] |
| 2060 | + |
| 2061 | + |
| 2062 | +# The pageid stats computed for the above 12 requests. |
| 2063 | +PAGEID_STATS = [ |
| 2064 | + ('+bug', FakeStats( |
| 2065 | + total_hits=3, total_time=37.5, mean=12.5, median=15.5, std=8.04, |
| 2066 | + total_sqltime=24.2, mean_sqltime=8.07, median_sqltime=9, |
| 2067 | + std_sqltime=5.27, |
| 2068 | + total_sqlstatements=1164, mean_sqlstatements=388, |
| 2069 | + median_sqlstatements=567, std_sqlstatements=253.14, |
| 2070 | + histogram=[[0, 0], [1, 1], [2, 0], [3, 0], [4, 0], [5, 2]], |
| 2071 | + )), |
| 2072 | + ('+bugs', FakeStats( |
| 2073 | + total_hits=3, total_time=14.2, mean=4.73, median=4.5, std=0.56, |
| 2074 | + total_sqltime=9.2, mean_sqltime=3.07, median_sqltime=3, |
| 2075 | + std_sqltime=0.74, |
| 2076 | + total_sqlstatements=188, mean_sqlstatements=62.67, |
| 2077 | + median_sqlstatements=56, std_sqlstatements=9.43, |
| 2078 | + histogram=[[0, 0], [1, 0], [2, 0], [3, 0], [4, 2], [5, 1]], |
| 2079 | + )), |
| 2080 | + ('+distribution', FakeStats( |
| 2081 | + total_hits=1, total_time=2.5, mean=2.5, median=2.5, std=0, |
| 2082 | + total_sqltime=2.0, mean_sqltime=2, median_sqltime=2, std_sqltime=0, |
| 2083 | + total_sqlstatements=6, mean_sqlstatements=6, |
| 2084 | + median_sqlstatements=6, std_sqlstatements=0, |
| 2085 | + histogram=[[0, 0], [1, 0], [2, 1], [3, 0], [4, 0], [5, 0]], |
| 2086 | + )), |
| 2087 | + ('+project', FakeStats( |
| 2088 | + total_hits=4, total_time=7.9, mean=1.98, median=1, std=1.08, |
| 2089 | + total_sqltime=6.6, mean_sqltime=1.65, median_sqltime=1.3, |
| 2090 | + std_sqltime=0.99, |
| 2091 | + total_sqlstatements=34, mean_sqlstatements=8.5, |
| 2092 | + median_sqlstatements=4, std_sqlstatements=5.32, |
| 2093 | + histogram=[[0, 1], [1, 1], [2, 1], [3, 1], [4, 0], [5, 0]], |
| 2094 | + )), |
| 2095 | + ('+root', FakeStats( |
| 2096 | + total_hits=1, total_time=0.5, mean=0.5, median=0.5, std=0, |
| 2097 | + histogram=[[0, 1], [1, 0], [2, 0], [3, 0], [4, 0], [5, 0]], |
| 2098 | + )), |
| 2099 | + ] |
| 2100 | + |
| 2101 | + |
| 2102 | +class TestRequestTimes(TestCase): |
| 2103 | + """Tests the RequestTimes backend.""" |
| 2104 | + |
| 2105 | + def setUp(self): |
| 2106 | + super(TestRequestTimes, self).setUp() |
| 2107 | + self.categories = [ |
| 2108 | + Category('All', '.*'), Category('Test', '.*test.*'), |
| 2109 | + Category('Bugs', '.*bugs.*')] |
| 2110 | + self.db = RequestTimes(self.categories, FakeOptions()) |
| 2111 | + self.useFixture(fixtures.LoggerFixture()) |
| 2112 | + |
| 2113 | + def setUpRequests(self): |
| 2114 | + """Insert some requests into the db.""" |
| 2115 | + for r in FAKE_REQUESTS: |
| 2116 | + self.db.add_request(r) |
| 2117 | + |
| 2118 | + def assertStatsAreEquals(self, expected, results): |
| 2119 | + self.assertEquals( |
| 2120 | + len(expected), len(results), 'Wrong number of results') |
| 2121 | + for idx in range(len(results)): |
| 2122 | + self.assertEquals(expected[idx][0], results[idx][0], |
| 2123 | + "Wrong key for results %d" % idx) |
| 2124 | + key = results[idx][0] |
| 2125 | + self.assertEquals(expected[idx][1].text(), results[idx][1].text(), |
| 2126 | + "Wrong stats for results %d (%s)" % (idx, key)) |
| 2127 | + self.assertEquals( |
| 2128 | + Histogram.from_bins_data(expected[idx][1].histogram), |
| 2129 | + results[idx][1].histogram, |
| 2130 | + "Wrong histogram for results %d (%s)" % (idx, key)) |
| 2131 | + |
| 2132 | + def test_get_category_times(self): |
| 2133 | + self.setUpRequests() |
| 2134 | + category_times = self.db.get_category_times() |
| 2135 | + self.assertStatsAreEquals(CATEGORY_STATS, category_times) |
| 2136 | + |
| 2137 | + def test_get_url_times(self): |
| 2138 | + self.setUpRequests() |
| 2139 | + url_times = self.db.get_top_urls_times() |
| 2140 | + self.assertStatsAreEquals(TOP_3_URL_STATS, url_times) |
| 2141 | + |
| 2142 | + def test_get_pageid_times(self): |
| 2143 | + self.setUpRequests() |
| 2144 | + pageid_times = self.db.get_pageid_times() |
| 2145 | + self.assertStatsAreEquals(PAGEID_STATS, pageid_times) |
| 2146 | + |
| 2147 | + def test___add__(self): |
| 2148 | +        # Ensure that adding two RequestTimes together results in
| 2149 | +        # a merge of their constituents.
| 2150 | + db1 = self.db |
| 2151 | + db2 = RequestTimes(self.categories, FakeOptions()) |
| 2152 | + db1.add_request(FakeRequest('/', 1.5, 5, 1.0, '+root')) |
| 2153 | + db1.add_request(FakeRequest('/bugs', 3.5, 15, 1.0, '+bugs')) |
| 2154 | + db2.add_request(FakeRequest('/bugs/1', 5.0, 30, 4.0, '+bug')) |
| 2155 | + results = db1 + db2 |
| 2156 | + self.assertEquals(3, results.category_times[0][1].total_hits) |
| 2157 | + self.assertEquals(0, results.category_times[1][1].total_hits) |
| 2158 | + self.assertEquals(2, results.category_times[2][1].total_hits) |
| 2159 | + self.assertEquals(1, results.pageid_times['+root'].total_hits) |
| 2160 | + self.assertEquals(1, results.pageid_times['+bugs'].total_hits) |
| 2161 | + self.assertEquals(1, results.pageid_times['+bug'].total_hits) |
| 2162 | + self.assertEquals(1, results.url_times['/'].total_hits) |
| 2163 | + self.assertEquals(1, results.url_times['/bugs'].total_hits) |
| 2164 | + self.assertEquals(1, results.url_times['/bugs/1'].total_hits) |
| 2165 | + |
| 2166 | + def test_histogram_init_with_resolution(self): |
| 2167 | +        # Test that the resolution parameter increases the number of bins.
| 2168 | + db = RequestTimes( |
| 2169 | + self.categories, FakeOptions(timeout=4, resolution=1)) |
| 2170 | + self.assertEquals(5, db.histogram_width) |
| 2171 | + self.assertEquals(1, db.histogram_resolution) |
| 2172 | + db = RequestTimes( |
| 2173 | + self.categories, FakeOptions(timeout=4, resolution=0.5)) |
| 2174 | + self.assertEquals(9, db.histogram_width) |
| 2175 | + self.assertEquals(0.5, db.histogram_resolution) |
| 2176 | + db = RequestTimes( |
| 2177 | + self.categories, FakeOptions(timeout=4, resolution=2)) |
| 2178 | + self.assertEquals(3, db.histogram_width) |
| 2179 | + self.assertEquals(2, db.histogram_resolution) |
| 2180 | + |
| 2181 | + |
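The assertions above pin down how RequestTimes sizes its histogram: one bin
per resolution step up to the timeout, plus one extra bin. A sketch of the
rule as inferred from the test data (the real constructor may compute it
differently)::

    import math

    def histogram_width(timeout, resolution):
        return int(math.ceil(timeout / float(resolution))) + 1

    histogram_width(4, 1)    # 5
    histogram_width(4, 0.5)  # 9
    histogram_width(4, 2)    # 3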
| 2182 | +class TestOnlineStats(TestCase): |
| 2183 | + """Tests for the OnlineStats class.""" |
| 2184 | + |
| 2185 | + def test___add__(self): |
| 2186 | +        # Ensure that adding two OnlineStats merges all their constituents.
| 2187 | + stats1 = OnlineStats(4, 1) |
| 2188 | + stats1.update(FakeRequest('/', 2.0, 5, 1.5)) |
| 2189 | + stats2 = OnlineStats(4, 1) |
| 2190 | + stats2.update(FakeRequest('/', 1.5, 2, 3.0)) |
| 2191 | + stats2.update(FakeRequest('/', 5.0, 2, 2.0)) |
| 2192 | + results = stats1 + stats2 |
| 2193 | + self.assertEquals(3, results.total_hits) |
| 2194 | + self.assertEquals(2, results.median) |
| 2195 | + self.assertEquals(9, results.total_sqlstatements) |
| 2196 | + self.assertEquals(2, results.median_sqlstatements) |
| 2197 | + self.assertEquals(6.5, results.total_sqltime) |
| 2198 | + self.assertEquals(2.0, results.median_sqltime) |
| 2199 | + self.assertEquals( |
| 2200 | + Histogram.from_bins_data([[0, 0], [1, 1], [2, 1], [3, 1]]), |
| 2201 | + results.histogram) |
| 2202 | + |
| 2203 | + |
| 2204 | +class TestOnlineStatsCalculator(TestCase): |
| 2205 | + """Tests for the online stats calculator.""" |
| 2206 | + |
| 2207 | + def setUp(self): |
| 2208 | + TestCase.setUp(self) |
| 2209 | + self.stats = OnlineStatsCalculator() |
| 2210 | + |
| 2211 | + def test_stats_for_empty_set(self): |
| 2212 | + # Test the stats when there is no input. |
| 2213 | + self.assertEquals(0, self.stats.count) |
| 2214 | + self.assertEquals(0, self.stats.sum) |
| 2215 | + self.assertEquals(0, self.stats.mean) |
| 2216 | + self.assertEquals(0, self.stats.variance) |
| 2217 | + self.assertEquals(0, self.stats.std) |
| 2218 | + |
| 2219 | + def test_stats_for_one_value(self): |
| 2220 | + # Test the stats when adding one element. |
| 2221 | + self.stats.update(5) |
| 2222 | + self.assertEquals(1, self.stats.count) |
| 2223 | + self.assertEquals(5, self.stats.sum) |
| 2224 | + self.assertEquals(5, self.stats.mean) |
| 2225 | + self.assertEquals(0, self.stats.variance) |
| 2226 | + self.assertEquals(0, self.stats.std) |
| 2227 | + |
| 2228 | + def test_None_are_ignored(self): |
| 2229 | + self.stats.update(None) |
| 2230 | + self.assertEquals(0, self.stats.count) |
| 2231 | + |
| 2232 | + def test_stats_for_3_values(self): |
| 2233 | + for x in [3, 6, 9]: |
| 2234 | + self.stats.update(x) |
| 2235 | + self.assertEquals(3, self.stats.count) |
| 2236 | + self.assertEquals(18, self.stats.sum) |
| 2237 | + self.assertEquals(6, self.stats.mean) |
| 2238 | + self.assertEquals(6, self.stats.variance) |
| 2239 | + self.assertEquals("2.45", "%.2f" % self.stats.std) |
| 2240 | + |
| 2241 | + def test___add___two_empty_together(self): |
| 2242 | + stats2 = OnlineStatsCalculator() |
| 2243 | + results = self.stats + stats2 |
| 2244 | + self.assertEquals(0, results.count) |
| 2245 | + self.assertEquals(0, results.sum) |
| 2246 | + self.assertEquals(0, results.mean) |
| 2247 | + self.assertEquals(0, results.variance) |
| 2248 | + |
| 2249 | + def test___add___one_empty(self): |
| 2250 | + stats2 = OnlineStatsCalculator() |
| 2251 | + for x in [1, 2, 3]: |
| 2252 | + self.stats.update(x) |
| 2253 | + results = self.stats + stats2 |
| 2254 | + self.assertEquals(3, results.count) |
| 2255 | + self.assertEquals(6, results.sum) |
| 2256 | + self.assertEquals(2, results.mean) |
| 2257 | + self.assertEquals(2, results.M2) |
| 2258 | + |
| 2259 | + def test___add__(self): |
| 2260 | + stats2 = OnlineStatsCalculator() |
| 2261 | + for x in [3, 6, 9]: |
| 2262 | + self.stats.update(x) |
| 2263 | + for x in [1, 2, 3]: |
| 2264 | + stats2.update(x) |
| 2265 | + results = self.stats + stats2 |
| 2266 | + self.assertEquals(6, results.count) |
| 2267 | + self.assertEquals(24, results.sum) |
| 2268 | + self.assertEquals(4, results.mean) |
| 2269 | + self.assertEquals(44, results.M2) |
| 2270 | + |
| 2271 | + |
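M2 here is the running sum of squared deviations from the mean, the quantity
a Welford-style online calculator carries so that variance = M2 / count
(consistent with test_stats_for_3_values above, where M2 would be 18 and the
variance 6). The expected 44 checks out by hand::

    xs = [3, 6, 9, 1, 2, 3]
    mean = sum(xs) / float(len(xs))        # 4.0
    M2 = sum((x - mean) ** 2 for x in xs)  # 1+4+25+9+4+1 == 44.0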
| 2272 | +SHUFFLE_RANGE_100 = [ |
| 2273 | + 25, 79, 99, 76, 60, 63, 87, 77, 51, 82, 42, 96, 93, 58, 32, 66, 75, |
| 2274 | + 2, 26, 22, 11, 73, 61, 83, 65, 68, 44, 81, 64, 3, 33, 34, 15, 1, |
| 2275 | + 92, 27, 90, 74, 46, 57, 59, 31, 13, 19, 89, 29, 56, 94, 50, 49, 62, |
| 2276 | + 37, 21, 35, 5, 84, 88, 16, 8, 23, 40, 6, 48, 10, 97, 0, 53, 17, 30, |
| 2277 | + 18, 43, 86, 12, 71, 38, 78, 36, 7, 45, 47, 80, 54, 39, 91, 98, 24, |
| 2278 | + 55, 14, 52, 20, 69, 85, 95, 28, 4, 9, 67, 70, 41, 72, |
| 2279 | + ] |
| 2280 | + |
| 2281 | + |
| 2282 | +class TestOnlineApproximateMedian(TestCase): |
| 2283 | + """Tests for the approximate median computation.""" |
| 2284 | + |
| 2285 | + def setUp(self): |
| 2286 | + TestCase.setUp(self) |
| 2287 | + self.estimator = OnlineApproximateMedian() |
| 2288 | + |
| 2289 | + def test_median_is_0_when_no_input(self): |
| 2290 | + self.assertEquals(0, self.estimator.median) |
| 2291 | + |
| 2292 | + def test_median_is_true_median_for_n_lower_than_bucket_size(self): |
| 2293 | + for x in range(9): |
| 2294 | + self.estimator.update(x) |
| 2295 | + self.assertEquals(4, self.estimator.median) |
| 2296 | + |
| 2297 | + def test_None_input_is_ignored(self): |
| 2298 | + self.estimator.update(1) |
| 2299 | + self.estimator.update(None) |
| 2300 | + self.assertEquals(1, self.estimator.median) |
| 2301 | + |
| 2302 | + def test_approximate_median_is_good_enough(self): |
| 2303 | + for x in SHUFFLE_RANGE_100: |
| 2304 | + self.estimator.update(x) |
| 2305 | +        # True median is 49.5; anything in 49-51 is good enough :-)
| 2306 | +        self.assertIn(self.estimator.median, range(49, 52))
| 2307 | + |
| 2308 | + def test___add__(self): |
| 2309 | + median1 = OnlineApproximateMedian(3) |
| 2310 | + median1.buckets = [[1, 3], [4, 5], [6, 3]] |
| 2311 | + median2 = OnlineApproximateMedian(3) |
| 2312 | + median2.buckets = [[], [3, 6], [3, 7]] |
| 2313 | + results = median1 + median2 |
| 2314 | + self.assertEquals([[1, 3], [6], [3, 7], [4]], results.buckets) |
| 2315 | + |
| 2316 | + |
| 2317 | +class TestHistogram(TestCase): |
| 2318 | + """Test the histogram computation.""" |
| 2319 | + |
| 2320 | + def test__init__(self): |
| 2321 | + hist = Histogram(4, 1) |
| 2322 | + self.assertEquals(4, hist.bins_count) |
| 2323 | + self.assertEquals(1, hist.bins_size) |
| 2324 | + self.assertEquals([[0, 0], [1, 0], [2, 0], [3, 0]], hist.bins) |
| 2325 | + |
| 2326 | + def test__init__bins_size_float(self): |
| 2327 | + hist = Histogram(9, 0.5) |
| 2328 | + self.assertEquals(9, hist.bins_count) |
| 2329 | + self.assertEquals(0.5, hist.bins_size) |
| 2330 | + self.assertEquals( |
| 2331 | + [[0, 0], [0.5, 0], [1.0, 0], [1.5, 0], |
| 2332 | + [2.0, 0], [2.5, 0], [3.0, 0], [3.5, 0], [4.0, 0]], hist.bins) |
| 2333 | + |
| 2334 | + def test_update(self): |
| 2335 | + hist = Histogram(4, 1) |
| 2336 | + hist.update(1) |
| 2337 | + self.assertEquals(1, hist.count) |
| 2338 | + self.assertEquals([[0, 0], [1, 1], [2, 0], [3, 0]], hist.bins) |
| 2339 | + |
| 2340 | + hist.update(1.3) |
| 2341 | + self.assertEquals(2, hist.count) |
| 2342 | + self.assertEquals([[0, 0], [1, 2], [2, 0], [3, 0]], hist.bins) |
| 2343 | + |
| 2344 | + def test_update_float_bin_size(self): |
| 2345 | + hist = Histogram(4, 0.5) |
| 2346 | + hist.update(1.3) |
| 2347 | + self.assertEquals([[0, 0], [0.5, 0], [1.0, 1], [1.5, 0]], hist.bins) |
| 2348 | + hist.update(0.5) |
| 2349 | + self.assertEquals([[0, 0], [0.5, 1], [1.0, 1], [1.5, 0]], hist.bins) |
| 2350 | + hist.update(0.6) |
| 2351 | + self.assertEquals([[0, 0], [0.5, 2], [1.0, 1], [1.5, 0]], hist.bins) |
| 2352 | + |
| 2353 | + def test_update_max_goes_in_last_bin(self): |
| 2354 | + hist = Histogram(4, 1) |
| 2355 | + hist.update(9) |
| 2356 | + self.assertEquals([[0, 0], [1, 0], [2, 0], [3, 1]], hist.bins) |
| 2357 | + |
| 2358 | + def test_bins_relative(self): |
| 2359 | + hist = Histogram(4, 1) |
| 2360 | + for x in range(4): |
| 2361 | + hist.update(x) |
| 2362 | + self.assertEquals( |
| 2363 | + [[0, 0.25], [1, 0.25], [2, 0.25], [3, 0.25]], hist.bins_relative) |
| 2364 | + |
| 2365 | + def test_from_bins_data(self): |
| 2366 | + hist = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
| 2367 | + self.assertEquals(4, hist.bins_count) |
| 2368 | + self.assertEquals(1, hist.bins_size) |
| 2369 | + self.assertEquals(6, hist.count) |
| 2370 | + self.assertEquals([[0, 1], [1, 3], [2, 1], [3, 1]], hist.bins) |
| 2371 | + |
| 2372 | + def test___repr__(self): |
| 2373 | + hist = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
| 2374 | + self.assertEquals( |
| 2375 | + "<Histogram [[0, 1], [1, 3], [2, 1], [3, 1]]>", repr(hist)) |
| 2376 | + |
| 2377 | + def test___eq__(self): |
| 2378 | + hist1 = Histogram(4, 1) |
| 2379 | + hist2 = Histogram(4, 1) |
| 2380 | + self.assertEquals(hist1, hist2) |
| 2381 | + |
| 2382 | + def test__eq___with_data(self): |
| 2383 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
| 2384 | + hist2 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
| 2385 | + self.assertEquals(hist1, hist2) |
| 2386 | + |
| 2387 | + def test___add__(self): |
| 2388 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
| 2389 | + hist2 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
| 2390 | + hist3 = Histogram.from_bins_data([[0, 2], [1, 6], [2, 2], [3, 2]]) |
| 2391 | + total = hist1 + hist2 |
| 2392 | + self.assertEquals(hist3, total) |
| 2393 | + self.assertEquals(12, total.count) |
| 2394 | + |
| 2395 | + def test___add___uses_widest(self): |
| 2396 | + # Make sure that the resulting histogram is as wide as the widest one. |
| 2397 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 3], [2, 1], [3, 1]]) |
| 2398 | + hist2 = Histogram.from_bins_data( |
| 2399 | + [[0, 1], [1, 3], [2, 1], [3, 1], [4, 2], [5, 3]]) |
| 2400 | + hist3 = Histogram.from_bins_data( |
| 2401 | + [[0, 2], [1, 6], [2, 2], [3, 2], [4, 2], [5, 3]]) |
| 2402 | + self.assertEquals(hist3, hist1 + hist2) |
| 2403 | + |
| 2404 | + def test___add___interpolate_lower_resolution(self): |
| 2405 | + # Make sure that when the other histogram has a bigger bin_size |
| 2406 | + # the frequency is correctly split across the different bins. |
| 2407 | + hist1 = Histogram.from_bins_data( |
| 2408 | + [[0, 1], [0.5, 3], [1.0, 1], [1.5, 1]]) |
| 2409 | + hist2 = Histogram.from_bins_data( |
| 2410 | + [[0, 1], [1, 2], [2, 3], [3, 1], [4, 1]]) |
| 2411 | + |
| 2412 | + hist3 = Histogram.from_bins_data( |
| 2413 | + [[0, 1.5], [0.5, 3.5], [1.0, 2], [1.5, 2], |
| 2414 | + [2.0, 1.5], [2.5, 1.5], [3.0, 0.5], [3.5, 0.5], |
| 2415 | + [4.0, 0.5], [4.5, 0.5]]) |
| 2416 | + self.assertEquals(hist3, hist1 + hist2) |
| 2417 | + |
| 2418 | + def test___add___higher_resolution(self): |
| 2419 | + # Make sure that when the other histogram has a smaller bin_size |
| 2420 | + # the frequency is correctly added. |
| 2421 | + hist1 = Histogram.from_bins_data([[0, 1], [1, 2], [2, 3]]) |
| 2422 | + hist2 = Histogram.from_bins_data( |
| 2423 | + [[0, 1], [0.5, 3], [1.0, 1], [1.5, 1], [2.0, 3], [2.5, 1], |
| 2424 | + [3, 4], [3.5, 2]]) |
| 2425 | + |
| 2426 | + hist3 = Histogram.from_bins_data([[0, 5], [1, 4], [2, 7], [3, 6]]) |
| 2427 | + self.assertEquals(hist3, hist1 + hist2) |
| 2428 | |
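The interpolation test above implies a simple rule for folding a coarser
histogram into a finer one: each coarse bin's count is split evenly across
the finer bins it spans. A sketch of that rule (the real Histogram.__add__
may implement it differently)::

    def split_bins(coarse, coarse_size, fine_size):
        """Spread each coarse bin's count evenly over finer bins."""
        n = int(coarse_size / fine_size)
        return [[start + i * fine_size, count / float(n)]
                for start, count in coarse
                for i in range(n)]

    split_bins([[0, 1], [1, 2]], 1, 0.5)
    # [[0.0, 0.5], [0.5, 0.5], [1.0, 1.0], [1.5, 1.0]]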
| 2429 | === added file 'versions.cfg' |
| 2430 | --- versions.cfg 1970-01-01 00:00:00 +0000 |
| 2431 | +++ versions.cfg 2012-08-09 04:56:19 +0000 |
| 2432 | @@ -0,0 +1,111 @@ |
| 2433 | +[buildout] |
| 2434 | +versions = versions |
| 2435 | + |
| 2436 | +[versions] |
| 2437 | +# Alphabetical, case-insensitive, please! :-) |
| 2438 | +fixtures = 0.3.9 |
| 2439 | +pytz = 2012c |
| 2440 | +RestrictedPython = 3.5.1 |
| 2441 | +setuptools = 0.6c11 |
| 2442 | +testtools = 0.9.14 |
| 2443 | +transaction = 1.0.0 |
| 2444 | +# Also upgrade the zc.buildout version in the Makefile's bin/buildout section. |
| 2445 | +zc.buildout = 1.5.1 |
| 2446 | +zc.lockfile = 1.0.0 |
| 2447 | +zc.recipe.egg = 1.3.2 |
| 2448 | +z3c.recipe.scripts = 1.0.1 |
| 2449 | +zc.zservertracelog = 1.1.5 |
| 2450 | +ZConfig = 2.9.1dev-20110728 |
| 2451 | +zdaemon = 2.0.4 |
| 2452 | +ZODB3 = 3.9.2 |
| 2453 | +zope.annotation = 3.5.0 |
| 2454 | +zope.app.applicationcontrol = 3.5.1 |
| 2455 | +zope.app.appsetup = 3.12.0 |
| 2456 | +zope.app.authentication = 3.6.1 |
| 2457 | +zope.app.basicskin = 3.4.1 |
| 2458 | +zope.app.component = 3.8.3 |
| 2459 | +zope.app.container = 3.8.0 |
| 2460 | +zope.app.form = 3.8.1 |
| 2461 | +zope.app.pagetemplate = 3.7.1 |
| 2462 | +zope.app.publication = 3.9.0 |
| 2463 | +zope.app.publisher = 3.10.0 |
| 2464 | +zope.app.server = 3.4.2 |
| 2465 | +zope.app.wsgi = 3.6.0 |
| 2466 | +zope.authentication = 3.7.0 |
| 2467 | +zope.broken = 3.5.0 |
| 2468 | +zope.browser = 1.2 |
| 2469 | +zope.browsermenu = 3.9.0 |
| 2470 | +zope.browserpage = 3.9.0 |
| 2471 | +zope.browserresource = 3.9.0 |
| 2472 | +zope.cachedescriptors = 3.5.0 |
| 2473 | +zope.component = 3.9.3 |
| 2474 | +zope.componentvocabulary = 1.0 |
| 2475 | +zope.configuration = 3.6.0 |
| 2476 | +zope.container = 3.9.0 |
| 2477 | +zope.contenttype = 3.5.0 |
| 2478 | +zope.copy = 3.5.0 |
| 2479 | +zope.copypastemove = 3.5.2 |
| 2480 | +zope.datetime = 3.4.0 |
| 2481 | +zope.deferredimport = 3.5.0 |
| 2482 | +zope.deprecation = 3.4.0 |
| 2483 | +zope.dottedname = 3.4.6 |
| 2484 | +zope.dublincore = 3.5.0 |
| 2485 | +zope.error = 3.7.0 |
| 2486 | +zope.event = 3.4.1 |
| 2487 | +zope.exceptions = 3.5.2 |
| 2488 | +zope.filerepresentation = 3.5.0 |
| 2489 | +zope.formlib = 3.6.0 |
| 2490 | +zope.hookable = 3.4.1 |
| 2491 | +zope.i18n = 3.7.1 |
| 2492 | +zope.i18nmessageid = 3.5.0 |
| 2493 | +zope.interface = 3.5.2 |
| 2494 | +zope.lifecycleevent = 3.5.2 |
| 2495 | +zope.location = 3.7.0 |
| 2496 | +zope.minmax = 1.1.1 |
| 2497 | +# Build of lp:~wallyworld/zope.pagetemplate/fix-isinstance |
| 2498 | +# This version adds a small change to the traversal logic so that the |
| 2499 | +# optimisation which applies if the object is a dict also works for subclasses |
| 2500 | +# of dict. The change has been approved for merge into the official zope code |
| 2501 | +# base. This patch is a temporary fix until the next official release. |
| 2502 | +zope.pagetemplate = 3.5.0-p1 |
| 2503 | +zope.password = 3.5.1 |
| 2504 | +zope.processlifetime = 1.0 |
| 2505 | +zope.proxy = 3.5.0 |
| 2506 | +zope.ptresource = 3.9.0 |
| 2507 | +zope.publisher = 3.12.0 |
| 2508 | +zope.schema = 3.5.4 |
| 2509 | +zope.security = 3.7.1 |
| 2510 | +zope.server = 3.6.1 |
| 2511 | +zope.session = 3.9.1 |
| 2512 | +zope.site = 3.7.0 |
| 2513 | +zope.size = 3.4.1 |
| 2514 | +zope.tal = 3.5.1 |
| 2515 | +zope.tales = 3.4.0 |
| 2516 | +# p1 Build of lp:~mars/zope.testing/3.9.4-p1. Fixes bugs 570380 and 587886. |
| 2517 | +# p2 With patch for thread leaks to make them skips, fixes windmill errors |
| 2518 | +# with 'new threads' in hudson/ec2 builds. |
| 2519 | +# p3 And always tear down layers, because that's the Right Thing To Do.
| 2520 | +# p4 fixes --subunit --list to really just list the tests. |
| 2521 | +# p5 Build of lp:~launchpad/zope.testing/3.9.4-p5. Fixes bug #609986. |
| 2522 | +# p6 reinstates fix from p4. Build of lp:~launchpad/zope.testing/3.9.4-fork |
| 2523 | +# revision 26. |
| 2524 | +# p7 was unused |
| 2525 | +# p8 redirects stdout and stderr to a black hole device when --subunit is used |
| 2526 | +# p9 adds the redirection of __stderr__ to a black hole device |
| 2527 | +# p10 changed the test reporting to use test.id() rather than |
| 2528 | +# str(test) since only the id is unique. |
| 2529 | +# p11 reverts p9. |
| 2530 | +# p12 reverts p11, restoring p9. |
| 2531 | +# p13 Add a new --require-unique flag to the testrunner. When set, |
| 2532 | +# this will cause the testrunner to check all tests IDs to ensure they |
| 2533 | +# haven't been loaded before. If it encounters a duplicate, it will |
| 2534 | +# raise an error and quit. |
| 2535 | +# p14 Adds test data written to stderr and stdout into the subunit output. |
| 2536 | +# p15 Fixed internal tests. |
| 2537 | +# p16 Adds support for skips in Python 2.7. |
| 2538 | +# p17 Fixes skip support for Python 2.6. |
| 2539 | +# To build (use Python 2.6) run "python bootstrap.py; ./bin/buildout". Then to |
| 2540 | +# build the distribution run "bin/buildout setup . sdist" |
| 2541 | +# Make sure you have subunit installed. |
| 2542 | +zope.testing = 3.9.4-p17 |
| 2543 | +zope.traversing = 3.8.0 |

I don't condone the buildout approach, but OK.