Merge lp:~saschpe/beautifulsoup/beautifulsoup into lp:beautifulsoup

Proposed by Sascha Peilicke
Status: Needs review
Proposed branch: lp:~saschpe/beautifulsoup/beautifulsoup
Merge into: lp:beautifulsoup
Diff against target: 29 lines (+6/-1) (has conflicts)
2 files modified
bs4/testing.py (+1/-1)
bs4/tests/test_lxml.py (+5/-0)
Text conflict in bs4/tests/test_lxml.py
To merge this branch: bzr merge lp:~saschpe/beautifulsoup/beautifulsoup
Reviewer            Review Type   Date Requested   Status
Leonard Richardson                                 Pending
Review via email: mp+200849@code.launchpad.net

Description of the change

LXML fixes


Unmerged revisions

324. By tar_scm test suite <root@localhost>

Fix lxml deprecation warning tests.

The class BeautifulStoneSoup is instantiated first in
bs4.tests.test_soup:LXMLTreeBuilderSmokeTest.test_beautifulstonesoup_is_xml_parser,
which causes the UserWarning to be emitted once. The deprecation test that runs
afterwards then records no warning and fails:

ERROR: test_beautifulstonesoup (bs4.tests.test_soup.TestDeprecatedConstructorArguments)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/Projects/python/beautifulsoup/bs4/tests/test_soup.py", line 69, in test_beautifulstonesoup
    self.assertTrue("BeautifulStoneSoup class is deprecated" in str(w[0].message))
IndexError: list index out of range

Instead, don't eat the UserWarning in the first test so that the second one can
catch it properly.
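
The order dependence described above can be avoided by resetting the warning filter inside the recording context. The following is a minimal sketch using only the standard library, not the code from this branch; `make_legacy_soup` and `record_deprecation` are hypothetical names, and the warning text is a stand-in for the real deprecation message.

```python
import warnings


def make_legacy_soup():
    # Hypothetical stand-in for a deprecated constructor such as
    # BeautifulStoneSoup; the message text is illustrative only.
    warnings.warn("LegacySoup class is deprecated", UserWarning)
    return "<b/>"


def record_deprecation():
    # record=True collects warnings into a list instead of printing them.
    with warnings.catch_warnings(record=True) as caught:
        # "always" turns off once-per-location suppression, so the
        # warning is recorded even if it was already issued earlier.
        warnings.simplefilter("always")
        make_legacy_soup()
    return caught


# Two consecutive recordings, mimicking two tests that hit the same warning.
first = record_deprecation()
second = record_deprecation()
print(len(first), len(second))
```

With the filter set to "always" in each context, both recordings should contain the warning regardless of which one runs first.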

323. By tar_scm test suite <root@localhost>

Don't provide encoding for XML unicode tests with LXML.

ERROR: test_can_parse_unicode_document (bs4.tests.test_lxml.LXMLXMLTreeBuilderSmokeTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/root/Projects/python/beautifulsoup/bs4/testing.py", line 498, in test_can_parse_unicode_document
    self.assertEqual(u'Sacr\xe9 bleu!', soup.root.string)
AttributeError: 'NoneType' object has no attribute 'string'

As described at http://lxml.de/parsing.html#python-unicode-strings, lxml rejects
Python unicode strings whose XML declaration specifies a conflicting encoding.
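
For markup that is already decoded text, one way to follow that guidance is to drop the encoding pseudo-attribute before parsing. This is a rough sketch under stated assumptions, not what the branch does (the branch simply edits the test markup): `strip_encoding_declaration` is a hypothetical helper that is not part of bs4 or lxml, and the standard library's ElementTree is used here only as a stand-in parser so the example runs without lxml installed.

```python
import re
import xml.etree.ElementTree as ET

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_ENCODING_ATTR = re.compile(r'^(<\?xml[^>]*?)\s+encoding=(["\'])[^"\']*\2')


def strip_encoding_declaration(markup):
    # Hypothetical helper: remove encoding="..." from the XML declaration
    # of an already-decoded string; other markup is returned unchanged.
    return _ENCODING_ATTR.sub(r"\1", markup, count=1)


markup = (
    '<?xml version="1.0" encoding="euc-jp"?>'
    "<root>Sacr\N{LATIN SMALL LETTER E WITH ACUTE} bleu!</root>"
)
cleaned = strip_encoding_declaration(markup)

# ElementTree stands in for lxml here; it parses the cleaned text.
root = ET.fromstring(cleaned)
print(cleaned)
print(root.text)
```

The alternative is to hand the parser bytes that really are encoded as declared, which keeps the declaration truthful instead of removing it.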

Preview Diff

=== modified file 'bs4/testing.py'
--- bs4/testing.py 2013-10-18 17:03:06 +0000
+++ bs4/testing.py 2014-01-08 15:03:12 +0000
@@ -503,7 +503,7 @@
         self.assertTrue(b"&lt; &lt; hey &gt; &gt;" in encoded)

     def test_can_parse_unicode_document(self):
-        markup = u'<?xml version="1.0" encoding="euc-jp"><root>Sacr\N{LATIN SMALL LETTER E WITH ACUTE} bleu!</root>'
+        markup = u'<?xml version="1.0"><root>Sacr\N{LATIN SMALL LETTER E WITH ACUTE} bleu!</root>'
         soup = self.soup(markup)
         self.assertEqual(u'Sacr\xe9 bleu!', soup.root.string)


=== modified file 'bs4/tests/test_lxml.py'
--- bs4/tests/test_lxml.py 2013-08-19 14:31:36 +0000
+++ bs4/tests/test_lxml.py 2014-01-08 15:03:12 +0000
@@ -60,7 +60,12 @@
     def test_beautifulstonesoup_is_xml_parser(self):
         # Make sure that the deprecated BSS class uses an xml builder
         # if one is installed.
+<<<<<<< TREE
         with warnings.catch_warnings(record=True) as w:
+=======
+        with warnings.catch_warnings():
+            warnings.simplefilter("always")
+>>>>>>> MERGE-SOURCE
             soup = BeautifulStoneSoup("<b />")
         self.assertEqual(u"<b/>", unicode(soup.b))
         self.assertTrue("BeautifulStoneSoup class is deprecated" in str(w[0].message))
