testtools

Merge lp:~jml/testtools/better-doctest-output-checker into lp:~testtools-committers/testtools/trunk

Proposed by Jonathan Lange on 2011-09-09

Status:	Work in progress
Proposed branch:	lp:~jml/testtools/better-doctest-output-checker
Merge into:	lp:~testtools-committers/testtools/trunk
Diff against target:	356 lines (+216/-40), 5 files modified:
	NEWS (+7/-0)
	doc/for-test-authors.rst (+10/-0)
	testtools/helpers.py (+118/-0)
	testtools/matchers.py (+2/-40)
	testtools/tests/test_helpers.py (+79/-0)
To merge this branch:	bzr merge lp:~jml/testtools/better-doctest-output-checker

Reviewer	Review Type	Date Requested	Status
testtools committers		2011-09-09	Pending
Review via email: mp+74842@code.launchpad.net

Description of the change

An attempt to make a better OutputChecker for doctests.
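
To illustrate the problem this branch targets, here is a minimal standalone sketch that uses only the standard library. ``WhitespaceAwareChecker`` is a hypothetical stand-in, not the ``SmartOutputChecker`` from the diff. Its ``normalize_whitespace`` follows the helper proposed in ``testtools/helpers.py``, and, as an assumption of this sketch, it builds a fresh ``doctest.Example`` instead of modifying the one passed in.

```python
import doctest


def normalize_whitespace(output):
    """Collapse whitespace runs within lines and runs of blank lines."""
    lines = []
    for line in output.splitlines():
        normalized = ' '.join(line.split())
        # Keep a blank line only if the previous kept line was not blank.
        if normalized or not (lines and lines[-1] == ''):
            lines.append(normalized)
    return '\n'.join(lines)


class WhitespaceAwareChecker(doctest.OutputChecker):
    """Hypothetical checker: normalize both sides before diffing."""

    def output_difference(self, example, got, optionflags):
        if optionflags & doctest.NORMALIZE_WHITESPACE:
            # Assumption of this sketch: leave the caller's Example alone.
            example = doctest.Example(
                example.source, normalize_whitespace(example.want))
            got = normalize_whitespace(got)
        return doctest.OutputChecker.output_difference(
            self, example, got, optionflags)


flags = doctest.NORMALIZE_WHITESPACE | doctest.REPORT_NDIFF
example = doctest.Example('f()', 'a b\nc\n')

# Stock checker: the whitespace-only change on line 1 shows up in the diff.
plain = doctest.OutputChecker().output_difference(example, 'a  b\nd\n', flags)

# Sketch checker: only the genuinely different line (c vs d) is reported.
smart = WhitespaceAwareChecker().output_difference(example, 'a  b\nd\n', flags)

print(plain)
print(smart)
```

With the stock checker the first line is reported as changed even though NORMALIZE_WHITESPACE would treat it as matching. With the sketch, only the c/d lines are marked as different.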

Unmerged revisions

239. By Jonathan Lange on 2011-09-09: Documentation.
238. By Jonathan Lange on 2011-09-09: Use it for DocTestMatches
237. By Jonathan Lange on 2011-09-09: NEWS update.
236. By Jonathan Lange on 2011-09-09: Whitespace.
235. By Jonathan Lange on 2011-09-09: Get the abstraction right.
234. By Jonathan Lange on 2011-09-09: Merge IntelligentOutputChecker with _NonManglingOutputChecker.
233. By Jonathan Lange on 2011-09-09: Collapse multiple blank lines.
232. By Jonathan Lange on 2011-09-09: Move ellipsize into the smart output checker
231. By Jonathan Lange on 2011-09-09: Refactor so we don't have to copy and paste.
230. By Jonathan Lange on 2011-09-09: Limited line-based ellipsis normalization.

Preview Diff

 === modified file 'NEWS'
 --- NEWS	2011-08-14 12:13:52 +0000
 +++ NEWS	2011-09-09 17:39:51 +0000
@@ -14,6 +14,9 @@
    now deprecated.  Please stop using it.
    (Jonathan Lange, #813460)
++* ``DocTestMatches`` now uses ``SmartOutputChecker``, leading to slightly
++  different error output.  (Jonathan Lange)
++
  * ``gather_details`` takes two dicts, rather than two detailed objects.
    (Jonathan Lange, #801027)
@@ -76,6 +79,10 @@
  * New convenience assertions, ``assertIsNone`` and ``assertIsNotNone``.
    (Christian Kampka)
++* New doctest helper, ``SmartOutputChecker``, added.  It normalizes whitespace
++  and ellipses in diff output if the checker has those options set.
++  (Jonathan Lange)
++
  * New matchers:
    * ``AllMatch`` matches many values against a single matcher.
 === modified file 'doc/for-test-authors.rst'
 --- doc/for-test-authors.rst	2011-08-15 16:14:42 +0000
 +++ doc/for-test-authors.rst	2011-09-09 17:39:51 +0000
@@ -1179,6 +1179,16 @@
  particular attribute.
++Better doctest output
++---------------------
++
++If you have doctests, then you might want to use ``SmartOutputChecker``
++instead of the default.  It has correct unicode behaviour across all of the
++Pythons supported by testtools, and also will reformat the test output
++slightly if NORMALIZE_WHITESPACE or ELLIPSIS options have been specified, in
++order to reduce the amount of noise in the error messages.
++
++
  .. _testrepository: https://launchpad.net/testrepository
  .. _Trial: http://twistedmatrix.com/documents/current/core/howto/testing.html
  .. _nose: http://somethingaboutorange.com/mrl/projects/nose/
 === modified file 'testtools/helpers.py'
 --- testtools/helpers.py	2011-07-20 20:34:29 +0000
 +++ testtools/helpers.py	2011-09-09 17:39:51 +0000
@@ -2,12 +2,17 @@
  __all__ = [
      'safe_hasattr',
++    'SmartOutputChecker',
      'try_import',
      'try_imports',
+     ]
++import doctest
++import re
  import sys
++from testtools.monkey import MonkeyPatcher
++
  def try_import(name, alternative=None, error_callback=None):
      """Attempt to import ``name``.  If it fails, return ``alternative``.
@@ -85,3 +90,116 @@
      properties.
      """
      return getattr(obj, attr, _marker) is not _marker
++
++
++class _NonManglingOutputChecker(doctest.OutputChecker):
++    """Doctest checker that works with unicode rather than mangling strings
++
++    This is needed because current Python versions have tried to fix string
++    encoding related problems, but regressed the default behaviour with
++    unicode inputs in the process.
++
++    In Python 2.6 and 2.7 `OutputChecker.output_difference` was changed to
++    return a bytestring encoded as per `sys.stdout.encoding`, or utf-8 if that
++    can't be determined. Worse, that encoding process happens in the innocent
++    looking `_indent` global function. Because the `DocTestMismatch.describe`
++    result may well not be destined for printing to stdout, this is no good
++    for us. To get a unicode return as before, the method is monkey patched if
++    `doctest._encoding` exists.
++
++    Python 3 has a different problem. For some reason both inputs are encoded
++    to ascii with 'backslashreplace', making an escaped string match its
++    unescaped form. Overriding the offending `OutputChecker._toAscii` method
++    is sufficient to revert this.
++    """
++
++    def _toAscii(self, s):
++        """Return `s` unchanged rather than mangling it to ascii"""
++        return s
++
++    def output_difference(self, example, got, optionflags):
++        """Describe the difference between 'example' and 'got'.
++
++        On versions of Python that have a broken doctest._indent function,
++        replace the behaviour of that function.
++        """
++        if getattr(doctest, "_encoding", None) is None:
++            return doctest.OutputChecker.output_difference(
++                self, example, got, optionflags)
++        else:
++            def _indent(s, indent=4, _pattern=re.compile("^(?!$)", re.MULTILINE)):
++                """Prepend non-empty lines in `s` with `indent` number of spaces"""
++                return _pattern.sub(indent*" ", s)
++            # Only do this overriding hackery if doctest has a broken _indent
++            # function
++            patcher = MonkeyPatcher((doctest, '_indent', _indent))
++            return patcher.run_with_patches(
++                doctest.OutputChecker.output_difference,
++                self, example, got, optionflags)
++
++
++class SmartOutputChecker(_NonManglingOutputChecker):
++    """OutputChecker that mangles the 'got' based on matching options.
++
++    One problem with doctests is that when you use, say, NORMALIZE_WHITESPACE,
++    and an example fails, then the diff will display as if the differences in
++    whitespace matter.
++
++    Likewise, the diff will treat lines that match correctly with the ELLIPSIS
++    option as differing.
++
++    This makes it difficult to determine which lines are the *actual* cause of
++    the difference.
++
++    This ``OutputChecker`` mangles the output of the doctest example so that
++    it normalizes whitespace and ellipses if needed.
++    """
++
++    def ellipsize(self, want, got):
++        """Return 'got' with text matched by ellipses in 'want' elided."""
++        if doctest.ELLIPSIS_MARKER not in want:
++            return got
++
++        wants = want.split(doctest.ELLIPSIS_MARKER)
++        gots = []
++
++        startpos = 0
++        for w in wants:
++            index = got.find(w, startpos)
++            if index < 0:
++                if gots:
++                    gots[-1] += got[startpos:]
++                else:
++                    gots = [got]
++                break
++            else:
++                gots.append(w)
++                startpos = index + len(w)
++        return doctest.ELLIPSIS_MARKER.join(gots)
++
++    def normalize_whitespace(self, output):
++        """Take 'output', preserve its lines, but normalize other whitespace.
++        """
++        lines = []
++        for line in output.splitlines():
++            normalized = ' '.join(line.split())
++            if normalized or not (lines and lines[-1] == ''):
++                lines.append(normalized)
++        return '\n'.join(lines)
++
++    def output_difference(self, example, got, optionflags):
++        """Describe the difference between 'example' and 'got'.
++
++        Unlike the default ``OutputChecker``, this one applies the same
++        transformations to the 'got' string that ``check_output`` does. This
++        makes the resulting diff easier to read.
++        """
++        if optionflags & doctest.ELLIPSIS:
++            got = self.ellipsize(example.want, got)
++
++        if optionflags & doctest.NORMALIZE_WHITESPACE:
++            example.want = self.normalize_whitespace(example.want)
++            got = self.normalize_whitespace(got)
++
++        return _NonManglingOutputChecker.output_difference(
++            self, example, got, optionflags)
 === modified file 'testtools/matchers.py'
 --- testtools/matchers.py	2011-08-15 13:22:29 +0000
 +++ testtools/matchers.py	2011-09-09 17:39:51 +0000
@@ -38,7 +38,6 @@
      'StartsWith',
+     ]
--import doctest
  import operator
  from pprint import pformat
  import re
@@ -51,6 +50,7 @@
      isbaseexception,
      istext,
+     )
++from testtools.helpers import SmartOutputChecker
  class Matcher(object):
@@ -156,44 +156,6 @@
          return self.original.get_details()
--class _NonManglingOutputChecker(doctest.OutputChecker):
--    """Doctest checker that works with unicode rather than mangling strings
--
--    This is needed because current Python versions have tried to fix string
--    encoding related problems, but regressed the default behaviour with unicode
--    inputs in the process.
--
--    In Python 2.6 and 2.7 `OutputChecker.output_difference` is was changed to
--    return a bytestring encoded as per `sys.stdout.encoding`, or utf-8 if that
--    can't be determined. Worse, that encoding process happens in the innocent
--    looking `_indent` global function. Because the `DocTestMismatch.describe`
--    result may well not be destined for printing to stdout, this is no good
--    for us. To get a unicode return as before, the method is monkey patched if
--    `doctest._encoding` exists.
--
--    Python 3 has a different problem. For some reason both inputs are encoded
--    to ascii with 'backslashreplace', making an escaped string matches its
--    unescaped form. Overriding the offending `OutputChecker._toAscii` method
--    is sufficient to revert this.
--    """
--
--    def _toAscii(self, s):
--        """Return `s` unchanged rather than mangling it to ascii"""
--        return s
--
--    # Only do this overriding hackery if doctest has a broken _input function
--    if getattr(doctest, "_encoding", None) is not None:
--        from types import FunctionType as __F
--        __f = doctest.OutputChecker.output_difference.im_func
--        __g = dict(__f.func_globals)
--        def _indent(s, indent=4, _pattern=re.compile("^(?!$)", re.MULTILINE)):
--            """Prepend non-empty lines in `s` with `indent` number of spaces"""
--            return _pattern.sub(indent*" ", s)
--        __g["_indent"] = _indent
--        output_difference = __F(__f.func_code, __g, "output_difference")
--        del __F, __f, __g, _indent
--
--
  class DocTestMatches(object):
      """See if a string matches a doctest example."""
@@ -208,7 +170,7 @@
              example += '\n'
          self.want = example # required variable name by doctest.
          self.flags = flags
--        self._checker = _NonManglingOutputChecker()
++        self._checker = SmartOutputChecker()
      def __str__(self):
          if self.flags:
 === modified file 'testtools/tests/test_helpers.py'
 --- testtools/tests/test_helpers.py	2011-08-15 13:48:10 +0000
 +++ testtools/tests/test_helpers.py	2011-09-09 17:39:51 +0000
@@ -1,7 +1,10 @@
  # Copyright (c) 2010-2011 testtools developers. See LICENSE for details.
++import doctest
++
  from testtools import TestCase
  from testtools.helpers import (
++    SmartOutputChecker,
      try_import,
      try_imports,
+     )
@@ -235,6 +238,82 @@
          self.assertThat(self.modules, StackHidden(False))
++class TestSmartOutputChecker(TestCase):
++
++    def test_ellipsis_matches(self):
++        checker = SmartOutputChecker()
++        self.assertTrue(checker.check_output('x...', 'xyz', doctest.ELLIPSIS))
++
++    def test_normalize_whitespace_multiple_blank_lines(self):
++        checker = SmartOutputChecker()
++        output = checker.normalize_whitespace('a\n\n\n\nb')
++        self.assertEqual('a\n\nb', output)
++
++    def test_diff_with_normalize_whitespace(self):
++        # If we're checking with normalized whitespace, then normalize the
++        # whitespace in the diff.
++        checker = SmartOutputChecker()
++        example = doctest.Example('f()', 'a b\nc\n')
++        diff = checker.output_difference(
++            example, 'a  b\nd\n',
++            doctest.NORMALIZE_WHITESPACE | doctest.REPORT_NDIFF)
++        self.assertEqual(
++            ('Differences (ndiff with -expected +actual):\n'
++             '      a b\n'
++             '    - c\n'
++             '    + d\n'), diff)
++
++    def test_ellipsis(self):
++        # If we're checking with ellipsis, then lines that match based on
++        # ellipsis matching should not appear as different in the diff.
++        checker = SmartOutputChecker()
++        example = doctest.Example('f()', 'a...\nb...\nc...\n')
++        diff = checker.output_difference(
++            example, 'apple\nbobble\ndog\n',
++            doctest.ELLIPSIS | doctest.REPORT_NDIFF)
++        self.assertEqual(
++            ('Differences (ndiff with -expected +actual):\n'
++             '      a...\n'
++             '    - b...\n'
++             '    - c...\n'
++             '    + bobble\n'
++             '    + dog\n'), diff)
++
++    def test_no_ellipsis(self):
++        got = SmartOutputChecker().ellipsize('xxx', 'apple\ncat\n')
++        self.assertEqual('apple\ncat\n', got)
++
++    def test_starts_with_literal(self):
++        got = SmartOutputChecker().ellipsize('app...', 'apple\ncat\n')
++        self.assertEqual('app...', got)
++
++    def test_starts_with_literal_mismatch(self):
++        got = SmartOutputChecker().ellipsize('app...', 'cattle\n')
++        self.assertEqual('cattle\n', got)
++
++    def test_ends_with_literal(self):
++        got = SmartOutputChecker().ellipsize('...cat\n', 'apple\ncat\n')
++        self.assertEqual('...cat\n', got)
++
++    def test_ends_with_literal_mismatch(self):
++        got = SmartOutputChecker().ellipsize('...cat\n', 'apple\ndog\n')
++        self.assertEqual('apple\ndog\n', got)
++
++    def test_surrounded_by_literals(self):
++        got = SmartOutputChecker().ellipsize(
++            'app...cat\n', 'apple\ncat\n')
++        self.assertEqual('app...cat\n', got)
++
++    def test_partial_match(self):
++        got = SmartOutputChecker().ellipsize(
++            'app...\nc...\nd...\n', 'apple\ncat\negg\n')
++        self.assertEqual('app...\ncat\negg\n', got)
++
++    def test_too_many_literals(self):
++        got = SmartOutputChecker().ellipsize('aa...aa', 'aaa')
++        self.assertEqual('aaa', got)
++
++
  def test_suite():
      from unittest import TestLoader
      return TestLoader().loadTestsFromName(__name__)
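
The ellipsis handling can be tried outside testtools. The following standalone sketch mirrors the ``ellipsize`` method added in ``testtools/helpers.py`` above, rewritten as a plain function for illustration. The calls reuse inputs from ``TestSmartOutputChecker``.

```python
import doctest


def ellipsize(want, got):
    """Return 'got' with text matched by ellipses in 'want' elided."""
    if doctest.ELLIPSIS_MARKER not in want:
        return got

    # Literal pieces of 'want' that must appear, in order, in 'got'.
    wants = want.split(doctest.ELLIPSIS_MARKER)
    gots = []

    startpos = 0
    for w in wants:
        index = got.find(w, startpos)
        if index < 0:
            # No match from here on: keep the rest of 'got' verbatim.
            if gots:
                gots[-1] += got[startpos:]
            else:
                gots = [got]
            break
        gots.append(w)
        startpos = index + len(w)
    return doctest.ELLIPSIS_MARKER.join(gots)


matched = ellipsize('app...', 'apple\ncat\n')
mismatched = ellipsize('app...', 'cattle\n')
partial = ellipsize('app...\nc...\nd...\n', 'apple\ncat\negg\n')

print(repr(matched))     # 'app...'
print(repr(mismatched))  # 'cattle\n'
print(repr(partial))     # 'app...\ncat\negg\n'
```

In the partial case, everything after the last literal that could be found is kept as it was, so the unmatched tail stays visible in the diff.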