soupmatchers

Merge lp:~cjwatson/soupmatchers/python3 into lp:soupmatchers

Proposed by Colin Watson on 2017-10-20

Status:	Merged
Merged at revision:	61
Proposed branch:	lp:~cjwatson/soupmatchers/python3
Merge into:	lp:soupmatchers
Diff against target:	730 lines (+124/-100), 5 files modified
	README (+38/-37)
	setup.py (+11/-1)
	soupmatchers/__init__.py (+13/-8)
	soupmatchers/tests/__init__.py (+13/-10)
	soupmatchers/tests/test_matchers.py (+49/-44)
To merge this branch:	bzr merge lp:~cjwatson/soupmatchers/python3

Reviewer	Review Type	Date Requested	Status
James Westby		2017-10-20	Approve on 2018-05-14
Review via email: mp+332595@code.launchpad.net

Commit message

Port to beautifulsoup4 and Python 3.

Description of the change

Since the interface exposed by this package is mostly about searching in existing text, just continuing to use native strings everywhere seems to work fine; we just need the usual porting stuff, a BeautifulSoup upgrade, and some care around testtools.Content (whose second argument is an iterator over bytes objects).
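As a rough illustration of the bytes concern mentioned above, the sketch below turns a native string into the kind of bytes chunks that testtools.Content consumes. It deliberately avoids importing testtools so that it stays standard-library only; the helper name as_byte_chunks, the chunk size, and the UTF-8 default are assumptions made for this example and are not taken from the branch. As far as I understand the testtools API, the second argument to Content is a callable returning the iterable of bytes, which is why the helper returns a callable rather than the chunks themselves.

```python
def as_byte_chunks(text, encoding="utf-8", chunk_size=4096):
    """Return a zero-argument callable yielding `text` as bytes chunks.

    Hypothetical helper for illustration only; not part of soupmatchers.
    """
    def get_bytes():
        # Encode once, then hand the data out in fixed-size bytes slices.
        data = text.encode(encoding)
        for start in range(0, len(data), chunk_size):
            yield data[start:start + chunk_size]
    return get_bytes


html = ('<a href="https://launchpad.net/testtools" class="awesome">'
        'testtools <b>rocks</b></a>')
get_bytes = as_byte_chunks(html)
chunks = list(get_bytes())
```

On Python 3 every chunk is a bytes object even though the matcher itself keeps working with native str; on Python 2 the same code yields str, which is the bytes type there.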

James Westby (james-w) on 2018-05-14:

review: Approve

Preview Diff

=== modified file 'README'
--- README 2010-07-12 16:43:33 +0000
+++ README 2017-10-20 22:33:14 +0000
@@ -20,25 +20,25 @@
BeautifulSoup
-------------

- >>> import BeautifulSoup
- >>> root = BeautifulSoup.BeautifulSoup(html)
+ >>> import bs4
+ >>> root = bs4.BeautifulSoup(html, "html.parser")

It is an HTML parsing library that includes
a way to search the document for matching tags. If you had a parsed
representation of your document you could find the above part by doing

>>> import re
- >>> anchor_tags = root.findAll(
+ >>> anchor_tags = root.find_all(
... "a", attrs={"href": "https://launchpad.net/testtools",
... "class": "awesome"})
- >>> print anchor_tags
- [<a href="https://launchpad.net/testtools" class="awesome">testtools <b>rocks</b></a>]
+ >>> print(anchor_tags)
+ [<a class="awesome" href="https://launchpad.net/testtools">testtools <b>rocks</b></a>]

-which would return you a list with (lets assume) a single entry, the
-BeautifulSoup.Tag for the <a>. You can locate the nested tag with:
+which would return you a list with (let's assume) a single entry, the
+bs4.Tag for the <a>. You can locate the nested tag with:

>>> anchor_tag = anchor_tags[0]
- >>> anchor_tag.findAll("b")
+ >>> anchor_tag.find_all("b")
[<b>rocks</b>]

which will again return a single item list.
@@ -65,10 +65,10 @@
has a certain css class, and mentions testtools in the anchor text.
