Merge lp:~allenap/maas/faster-import-checks into lp:~maas-committers/maas/trunk

Proposed by Gavin Panella
Status: Merged
Approved by: Gavin Panella
Approved revision: no longer in the source branch.
Merged at revision: 5705
Proposed branch: lp:~allenap/maas/faster-import-checks
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 83 lines (+48/-10)
1 file modified
utilities/check-imports (+48/-10)
To merge this branch: bzr merge lp:~allenap/maas/faster-import-checks
Reviewer Review Type Date Requested Status
Blake Rouse (community) Approve
Review via email: mp+316824@code.launchpad.net

Commit message

Use a multiprocessing pool to make import checks faster.

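As background, the change follows the standard multiprocessing-pool pattern: farm the per-file work out to worker processes and collect the results in the parent. Below is a minimal, self-contained sketch of that pattern; `check_file` and the filenames are hypothetical stand-ins, not names from this branch:

    import multiprocessing

    def check_file(filename):
        # Hypothetical per-file check; the real script parses each file
        # and tests its imports against a rule.
        return filename, filename.endswith(".py")

    if __name__ == "__main__":
        filenames = ["a.py", "b.txt", "c.py"]
        # Worker processes spread the per-file work across CPU cores,
        # trading extra total CPU time for less wall-clock time.
        with multiprocessing.Pool() as pool:
            for filename, ok in pool.map(check_file, filenames):
                print(filename, "ok" if ok else "FAIL")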
Revision history for this message
Gavin Panella (allenap) wrote :

For me this reduces wall clock time from ~6s to ~2.5s, at the expense of more CPU time.

Revision history for this message
Gavin Panella (allenap) wrote :

I realised an optimisation: it's now down to ~1.5s for me, with overall CPU time reduced from my earlier attempt (though still higher than with no multiprocessing at all).

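The optimisation, visible in the `_batch`/`_expand` helpers in the diff below, is to hand each pooled call a batch of filenames rather than a single file, amortising the pickling overhead of each inter-process call. `_batch` itself relies on the two-argument form of `iter()`, which calls a callable repeatedly until it returns the sentinel. A standalone sketch of that idiom (`batched` is a renamed illustration of the same logic):

    from itertools import islice

    def batched(objects, size):
        # Yield lists of up to `size` items from any iterable.
        objects = iter(objects)
        take = lambda: list(islice(objects, size))
        # iter(callable, sentinel) calls `take` until it returns the
        # sentinel, here the empty list, i.e. until `objects` is exhausted.
        return iter(take, [])

    print(list(batched(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]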
Revision history for this message
Blake Rouse (blake-rouse) wrote :

Nice improvement. Looks good.

review: Approve

Preview Diff

=== modified file 'utilities/check-imports'
--- utilities/check-imports	2017-02-09 11:34:44 +0000
+++ utilities/check-imports	2017-02-09 14:17:28 +0000
@@ -4,7 +4,11 @@
 import ast
 from collections import Iterable
 from glob import iglob
-from itertools import chain
+from itertools import (
+    chain,
+    islice,
+)
+import multiprocessing
 from pathlib import Path
 import re
 import sys
@@ -432,6 +436,7 @@
 
 
 def extract(node):
+    """Extract all imports from the given AST node."""
     for node in ast.walk(node):
         if isinstance(node, ast.Import):
             for alias in node.names:
@@ -446,16 +451,49 @@
         pass  # Not an import.
 
 
+def _batch(objects, size):
+    """Generate batches of `size` elements from `objects`.
+
+    Each batch is a list of `size` elements exactly, except for the last which
+    may contain fewer than `size` elements.
+    """
+    objects = iter(objects)
+    batch = lambda: list(islice(objects, size))
+    return iter(batch, [])
+
+
+def _expand(checks):
+    """Generate `(rule, batch-of-filenames)` for the given checks.
+
+    It batches filenames to reduce the serialise/unserialise overhead when
+    calling out to a pooled process.
+    """
+    for filenames, rule in checks:
+        rule.compile()  # Compile or it's slow.
+        for filenames in _batch(filenames, 100):
+            yield rule, filenames
+
+
+def _scan1(rule, filename):
+    """Scan one file and check against the given rule."""
+    with tokenize.open(filename) as fd:
+        module = ast.parse(fd.read())
+    imports = set(extract(module))
+    allowed = set(filter(rule.check, imports))
+    denied = imports.difference(allowed)
+    return filename, allowed, denied
+
+
+def _scan(rule, filenames):
+    """Scan the files and check against the given rule."""
+    return [_scan1(rule, filename) for filename in filenames]
+
+
 def scan(checks):
-    for files, rule in checks:
-        rule.compile()
-        for filename in sorted(files):
-            with tokenize.open(filename) as fd:
-                module = ast.parse(fd.read())
-            imports = set(extract(module))
-            allowed = set(filter(rule.check, imports))
-            denied = imports.difference(allowed)
-            yield filename, allowed, denied
+    """Scan many files and check against the given rules."""
+    with multiprocessing.Pool() as pool:
+        for results in pool.starmap(_scan, _expand(checks)):
+            yield from results
 
 
 if sys.stdout.isatty():
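For illustration, `scan` can be driven as below. `SimpleRule` is a hypothetical stand-in exposing the `compile()`/`check()` interface the script expects; the real rule objects are defined elsewhere in check-imports:

    import re

    class SimpleRule:
        # Hypothetical rule: allow imports whose names match a regex.
        def __init__(self, pattern):
            self.pattern = pattern
            self._compiled = None

        def compile(self):
            self._compiled = re.compile(self.pattern)

        def check(self, name):
            return self._compiled.match(name) is not None

    checks = [(["utilities/check-imports"], SimpleRule(r"(ast|re|sys)\b"))]
    for filename, allowed, denied in scan(checks):
        if denied:
            print(filename, "imports outside the allowed set:", sorted(denied))

Note that the rule objects are pickled when sent to the pooled workers, which is why `_expand` compiles each rule once in the parent rather than per file in the children.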