Merge ~chrispitude/beautifulsoup:more-modular-soupstrainers-doc into beautifulsoup:more-modular-soupstrainers

Proposed by Chris Papademetrious
Status: Needs review
Proposed branch: ~chrispitude/beautifulsoup:more-modular-soupstrainers-doc
Merge into: beautifulsoup:more-modular-soupstrainers
Diff against target: 2065 lines (+492/-327)
22 files modified
CHANGELOG (+14/-8)
bs4/__init__.py (+13/-10)
bs4/_deprecation.py (+6/-6)
bs4/_typing.py (+19/-8)
bs4/builder/__init__.py (+76/-49)
bs4/builder/_html5lib.py (+104/-97)
bs4/builder/_htmlparser.py (+15/-14)
bs4/builder/_lxml.py (+44/-27)
bs4/dammit.py (+3/-3)
bs4/element.py (+65/-39)
bs4/filter.py (+6/-7)
bs4/tests/__init__.py (+33/-14)
bs4/tests/test_builder_registry.py (+3/-1)
bs4/tests/test_css.py (+8/-2)
bs4/tests/test_filter.py (+16/-4)
bs4/tests/test_fuzz.py (+2/-2)
bs4/tests/test_html5lib.py (+1/-2)
bs4/tests/test_htmlparser.py (+4/-2)
bs4/tests/test_lxml.py (+2/-2)
bs4/tests/test_soup.py (+4/-2)
bs4/tests/test_tree.py (+8/-8)
doc/index.rst (+46/-20)
Reviewer: Leonard Richardson (status: Pending)
Review via email: mp+459970@code.launchpad.net

Commit message

add an example for the new ElementFilter feature
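As a rough standalone sketch of the idea behind that example (a filter object wrapping a user-supplied match function that is consulted for every candidate element), all names below (SimpleFilter, Element, find_all) are invented for illustration and are not the Beautiful Soup API:

```python
# Hypothetical standalone sketch of a match-function-based element
# filter. SimpleFilter, Element, and find_all are illustrative names
# only, not Beautiful Soup's real classes.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Element:
    name: str
    text: str = ""

class SimpleFilter:
    """Wraps a single predicate and answers yes/no for each element."""
    def __init__(self, match_function: Callable[[Element], bool]):
        self.match_function = match_function

    def matches(self, element: Element) -> bool:
        return self.match_function(element)

def find_all(tree: List[Element], f: SimpleFilter) -> List[Element]:
    # Return every element the filter's predicate accepts.
    return [e for e in tree if f.matches(e)]

headers = SimpleFilter(lambda e: e.name in {"h1", "h2", "h3"})
tree = [Element("h1", "Title"), Element("p", "body"), Element("h2", "Section")]
print([e.name for e in find_all(tree, headers)])  # ['h1', 'h2']
```

The real feature plugs the same kind of predicate into Beautiful Soup's tree-matching methods; see the doc/index.rst hunk in the diff for the documented usage.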

Unmerged commits

b17956d... by Chris Papademetrious

add documentation example for ElementFilter

Signed-off-by: Chris Papademetrious <email address hidden>

4e37196... by Leonard Richardson

Added some basic typing to test methods that are called by the actual tests.

fe88554... by Leonard Richardson

Clarified the match protocol thing.

2618f24... by Leonard Richardson

Decided putting limit in ResultSet wasn't a good idea.

cc290e9... by Leonard Richardson

Added note that Protocol is in typing_extensions so it could be used now.

4c0618d... by Leonard Richardson

Made some typing system changes necessary to get the code to run under Python 3.8.

d5db19b... by Leonard Richardson

I feel like it's more consistent to use modified_attrs throughout.

792192c... by Leonard Richardson

Sorted out the _RawAttributeValue/_AttributeValue typing issue.

814f1a6... by Leonard Richardson

All the easy stuff is resolved now, and most of the medium-sized stuff except an issue with _RawAttributeValues versus _AttributeValues.

3898972... by Leonard Richardson

Got the type definitions in _deprecation.py a _little_ better.

Preview Diff

1diff --git a/CHANGELOG b/CHANGELOG
2index 162e3dc..41b1467 100644
3--- a/CHANGELOG
4+++ b/CHANGELOG
5@@ -1,7 +1,5 @@
6 = 4.13.0 (Unreleased)
7
8-TODO: we could stand to put limit inside ResultSet
9-
10 * This version drops support for Python 3.6. The minimum supported
11 major Python version for Beautiful Soup is now Python 3.7.
12
13@@ -33,12 +31,15 @@ TODO: we could stand to put limit inside ResultSet
14 you, since you probably use HTMLParserTreeBuilder, not
15 BeautifulSoupHTMLParser directly.
16
17-* The TreeBuilderForHtml5lib methods fragmentClass and getFragment
18- now raise NotImplementedError. These methods are called only by
19- html5lib's HTMLParser.parseFragment() method, which Beautiful Soup
20- doesn't use, so they were untested and should have never been called.
21- The getFragment() implementation was also slightly incorrect in a way
22- that should have caused obvious problems for anyone using it.
23+* The TreeBuilderForHtml5lib methods fragmentClass(), getFragment(),
24+ and testSerializer() now raise NotImplementedError. These methods
25+ are called only by html5lib's test suite, and Beautiful Soup isn't
26+ integrated into that test suite, so this code was long since unused and
27+ untested.
28+
29+ These methods are _not_ deprecated, since they are methods defined by
30+ html5lib. They may one day have real implementations, as part of a future
31+ effort to integrate Beautiful Soup into html5lib's test suite.
32
33 * If Tag.get_attribute_list() is used to access an attribute that's not set,
34 the return value is now an empty list rather than [None].
35@@ -73,6 +74,11 @@ TODO: we could stand to put limit inside ResultSet
36 * A SoupStrainer can now filter tag creation based on a tag's
37 namespaced name. Previously only the unqualified name could be used.
38
39+* Some of the arguments in the methods of LXMLTreeBuilderForXML
40+ have been renamed for consistency with the names lxml uses for those
41+ arguments in the superclass. This won't affect you unless you were
42+ calling methods like LXMLTreeBuilderForXML.start() directly.
43+
44 * All TreeBuilder constructors now take the empty_element_tags
45 argument. The sets of tags found in HTMLTreeBuilder.empty_element_tags and
46 HTMLTreeBuilder.block_elements are now in
47diff --git a/bs4/__init__.py b/bs4/__init__.py
48index 95bd48d..6f01a75 100644
49--- a/bs4/__init__.py
50+++ b/bs4/__init__.py
51@@ -72,7 +72,7 @@ from typing import (
52 cast,
53 Counter as CounterType,
54 Dict,
55- Iterable,
56+ Iterator,
57 List,
58 Sequence,
59 Optional,
60@@ -84,10 +84,11 @@ from typing import (
61
62 from bs4._typing import (
63 _AttributeValue,
64- _AttributeValues,
65 _Encoding,
66 _Encodings,
67 _IncomingMarkup,
68+ _RawAttributeValue,
69+ _RawAttributeValues,
70 _RawMarkup,
71 )
72
73@@ -451,7 +452,7 @@ class BeautifulSoup(Tag):
74 clone.original_encoding = self.original_encoding
75 return clone
76
77- def __getstate__(self) -> dict[str, Any]:
78+ def __getstate__(self) -> Dict[str, Any]:
79 # Frequently a tree builder can't be pickled.
80 d = dict(self.__dict__)
81 if 'builder' in d and d['builder'] is not None and not self.builder.picklable:
82@@ -467,7 +468,7 @@ class BeautifulSoup(Tag):
83 del d['_most_recent_element']
84 return d
85
86- def __setstate__(self, state: dict[str, Any]) -> None:
87+ def __setstate__(self, state: Dict[str, Any]) -> None:
88 # If necessary, restore the TreeBuilder by looking it up.
89 self.__dict__ = state
90 if isinstance(self.builder, type):
91@@ -613,11 +614,11 @@ class BeautifulSoup(Tag):
92 name:str,
93 namespace:Optional[str]=None,
94 nsprefix:Optional[str]=None,
95- attrs:_AttributeValues={},
96+ attrs:_RawAttributeValues={},
97 sourceline:Optional[int]=None,
98 sourcepos:Optional[int]=None,
99 string:Optional[str]=None,
100- **kwattrs:_AttributeValue,
101+ **kwattrs:_RawAttributeValue,
102 ) -> Tag:
103 """Create a new Tag associated with this BeautifulSoup object.
104
105@@ -664,7 +665,7 @@ class BeautifulSoup(Tag):
106 # The user may want us to use some other class (hopefully a
107 # custom subclass) instead of the one we'd use normally.
108 container = cast(
109- type[NavigableString],
110+ Type[NavigableString],
111 self.element_classes.get(container, container)
112 )
113
114@@ -894,14 +895,16 @@ class BeautifulSoup(Tag):
115
116 def handle_starttag(
117 self, name:str, namespace:Optional[str],
118- nsprefix:Optional[str], attrs:_AttributeValues,
119+ nsprefix:Optional[str], attrs:_RawAttributeValues,
120 sourceline:Optional[int]=None, sourcepos:Optional[int]=None,
121 namespaces:Optional[Dict[str, str]]=None) -> Optional[Tag]:
122 """Called by the tree builder when a new tag is encountered.
123
124 :param name: Name of the tag.
125 :param nsprefix: Namespace prefix for the tag.
126- :param attrs: A dictionary of attribute values.
127+ :param attrs: A dictionary of attribute values. Note that
128+ attribute values are expected to be simple strings; processing
129+ of multi-valued attributes such as "class" comes later.
130 :param sourceline: The line number where this tag was found in its
131 source document.
132 :param sourcepos: The character position within `sourceline` where this
133@@ -964,7 +967,7 @@ class BeautifulSoup(Tag):
134 def decode(self, indent_level:Optional[int]=None,
135 eventual_encoding:_Encoding=DEFAULT_OUTPUT_ENCODING,
136 formatter:Union[Formatter,str]="minimal",
137- iterator:Optional[Iterable[PageElement]]=None,
138+ iterator:Optional[Iterator[PageElement]]=None,
139 **kwargs:Any) -> str:
140 """Returns a string representation of the parse tree
141 as a full HTML or XML document.
142diff --git a/bs4/_deprecation.py b/bs4/_deprecation.py
143index febc1b3..86b22a2 100644
144--- a/bs4/_deprecation.py
145+++ b/bs4/_deprecation.py
146@@ -17,7 +17,7 @@ from typing import (
147 Callable,
148 )
149
150-def _deprecated_alias(old_name, new_name, version):
151+def _deprecated_alias(old_name:str, new_name:str, version:str):
152 """Alias one attribute name to another for backward compatibility
153
154 :meta private:
155@@ -29,23 +29,23 @@ def _deprecated_alias(old_name, new_name, version):
156 return getattr(self, new_name)
157
158 @alias.setter
159- def alias(self, value:str)->Any:
160+ def alias(self, value:str) -> None:
161 ":meta private:"
162 warnings.warn(f"Write to deprecated property {old_name}. (Replaced by {new_name}) -- Deprecated since version {version}.", DeprecationWarning, stacklevel=2)
163 return setattr(self, new_name, value)
164 return alias
165
166-def _deprecated_function_alias(old_name:str, new_name:str, version:str) -> Callable:
167- def alias(self, *args, **kwargs):
168+def _deprecated_function_alias(old_name:str, new_name:str, version:str) -> Callable[[Any], Any]:
169+ def alias(self, *args:Any, **kwargs:Any) -> Any:
170 ":meta private:"
171 warnings.warn(f"Call to deprecated method {old_name}. (Replaced by {new_name}) -- Deprecated since version {version}.", DeprecationWarning, stacklevel=2)
172 return getattr(self, new_name)(*args, **kwargs)
173 return alias
174
175 def _deprecated(replaced_by:str, version:str) -> Callable:
176- def deprecate(func):
177+ def deprecate(func:Callable) -> Callable:
178 @functools.wraps(func)
179- def with_warning(*args, **kwargs):
180+ def with_warning(*args:Any, **kwargs:Any) -> Any:
181 ":meta private:"
182 warnings.warn(
183 f"Call to deprecated method {func.__name__}. (Replaced by {replaced_by}) -- Deprecated since version {version}.",
184diff --git a/bs4/_typing.py b/bs4/_typing.py
185index ab8f7a0..7fff292 100644
186--- a/bs4/_typing.py
187+++ b/bs4/_typing.py
188@@ -8,7 +8,12 @@
189 # * In 3.10, TypeAlias gains capabilities that can be used to
190 # improve the tree matching types (I don't remember what, exactly).
191 # * 3.8 defines the Protocol type, which can be used to do duck typing
192-# in a statically checkable way.
193+# in a statically checkable way. (Protocols are also in typing_extensions,
194+# so I could add this now in the couple places it's needed.)
195+# * In 3.9 it's possible to specialize the re.Match type,
196+# e.g. re.Match[str]. In 3.8 there's a typing.re namespace for this,
197+# but it's removed in 3.12, so to support the widest possible set of
198+# versions I'm not using it.
199
200 import re
201 from typing_extensions import TypeAlias
202@@ -48,16 +53,22 @@ _InvertedNamespaceMapping:TypeAlias = Dict[_NamespaceURL, _NamespacePrefix]
203
204 # Aliases for the attribute values associated with HTML/XML tags.
205 #
206-# Note that these are attribute values in their final form, as stored
207-# in the `Tag` class. Different parsers present attributes to the
208-# `TreeBuilder` subclasses in different formats, which are not defined
209-# here.
210+# These are the relatively unprocessed values Beautiful Soup expects
211+# to come from a `TreeBuilder`.
212+_RawAttributeValue: TypeAlias = str
213+_RawAttributeValues: TypeAlias = Dict[str, _RawAttributeValue]
214+
215+# These are attribute values in their final form, as stored in the
216+# `Tag` class, after they have been processed and (in some cases)
217+# split into lists of strings.
218 _AttributeValue: TypeAlias = Union[str, Iterable[str]]
219 _AttributeValues: TypeAlias = Dict[str, _AttributeValue]
220
221-# The most common form in which attribute values are passed in from a
222-# parser.
223-_RawAttributeValues: TypeAlias = dict[str, str]
224+# The methods that deal with turning _RawAttributeValues into
225+# _AttributeValues may be called several times, even after the values
226+# are already processed (e.g. when cloning a tag), so they need to
227+# be able to accommodate both possibilities.
228+_RawOrProcessedAttributeValues:TypeAlias = Union[_RawAttributeValues, _AttributeValues]
229
230 # Aliases to represent the many possibilities for matching bits of a
231 # parse tree.
232diff --git a/bs4/builder/__init__.py b/bs4/builder/__init__.py
233index b59513e..29726c3 100644
234--- a/bs4/builder/__init__.py
235+++ b/bs4/builder/__init__.py
236@@ -33,15 +33,20 @@ from bs4.element import (
237 nonwhitespace_re
238 )
239
240+from bs4._typing import (
241+ _AttributeValues,
242+ _RawAttributeValue,
243+)
244 if TYPE_CHECKING:
245 from bs4 import BeautifulSoup
246 from bs4.element import (
247 NavigableString, Tag,
248- _AttributeValues, _AttributeValue,
249 )
250 from bs4._typing import (
251+ _AttributeValue,
252 _Encoding,
253 _Encodings,
254+ _RawOrProcessedAttributeValues,
255 _RawMarkup,
256 )
257
258@@ -75,7 +80,7 @@ class TreeBuilderRegistry(object):
259 builders_for_feature: Dict[str, List[Type[TreeBuilder]]]
260 builders: List[Type[TreeBuilder]]
261
262- def __init__(self):
263+ def __init__(self) -> None:
264 self.builders_for_feature = defaultdict(list)
265 self.builders = []
266
267@@ -233,7 +238,7 @@ class TreeBuilder(object):
268 #: no contents--that is, using XML rules. HTMLTreeBuilder
269 #: defines a different set of DEFAULT_EMPTY_ELEMENT_TAGS based on the
270 #: HTML 4 and HTML5 standards.
271- DEFAULT_EMPTY_ELEMENT_TAGS: Optional[Set] = None
272+ DEFAULT_EMPTY_ELEMENT_TAGS: Optional[Set[str]] = None
273
274 #: Most parsers don't keep track of line numbers.
275 TRACKS_LINE_NUMBERS: bool = False
276@@ -347,7 +352,7 @@ class TreeBuilder(object):
277 """
278 return False
279
280- def _replace_cdata_list_attribute_values(self, tag_name:str, attrs:_AttributeValues):
281+ def _replace_cdata_list_attribute_values(self, tag_name:str, attrs:_RawOrProcessedAttributeValues) -> _AttributeValues:
282 """When an attribute value is associated with a tag that can
283 have multiple values for that attribute, convert the string
284 value to a list of strings.
285@@ -359,88 +364,106 @@ class TreeBuilder(object):
286 :param tag_name: The name of a tag.
287 :param attrs: A dictionary containing the tag's attributes.
288 Any appropriate attribute values will be modified in place.
289+ :return: The modified dictionary that was originally passed in.
290 """
291- if not attrs:
292- return attrs
293- if self.cdata_list_attributes:
294- universal: Set[str] = self.cdata_list_attributes.get('*', set())
295- tag_specific = self.cdata_list_attributes.get(
296- tag_name.lower(), None)
297- for attr in list(attrs.keys()):
298- values: _AttributeValue
299- if attr in universal or (tag_specific and attr in tag_specific):
300- # We have a "class"-type attribute whose string
301- # value is a whitespace-separated list of
302- # values. Split it into a list.
303- value = attrs[attr]
304- if isinstance(value, str):
305- values = nonwhitespace_re.findall(value)
306- else:
307- # html5lib sometimes calls setAttributes twice
308- # for the same tag when rearranging the parse
309- # tree. On the second call the attribute value
310- # here is already a list. If this happens,
311- # leave the value alone rather than trying to
312- # split it again.
313- values = value
314- attrs[attr] = values
315- return attrs
316+
317+ # First, cast the attrs dict to _AttributeValues. This might
318+ # not be accurate yet, but it will be by the time this method
319+ # returns.
320+ modified_attrs = cast(_AttributeValues, attrs)
321+ if not modified_attrs or not self.cdata_list_attributes:
322+ # Nothing to do.
323+ return modified_attrs
324+
325+ # There is at least a possibility that we need to modify one of
326+ # the attribute values.
327+ universal: Set[str] = self.cdata_list_attributes.get('*', set())
328+ tag_specific = self.cdata_list_attributes.get(
329+ tag_name.lower(), None)
330+ for attr in list(modified_attrs.keys()):
331+ modified_value:_AttributeValue
332+ if attr in universal or (tag_specific and attr in tag_specific):
333+ # We have a "class"-type attribute whose string
334+ # value is a whitespace-separated list of
335+ # values. Split it into a list.
336+ original_value:_AttributeValue = modified_attrs[attr]
337+ if isinstance(original_value, _RawAttributeValue):
338+ # This is a _RawAttributeValue (a string) that
339+ # needs to be split into a list so it can be an
340+ # _AttributeValue.
341+ modified_value = nonwhitespace_re.findall(original_value)
342+ else:
343+ # html5lib calls setAttributes twice for the
344+ # same tag when rearranging the parse tree. On
345+ # the second call the attribute value here is
346+ # already a list. This can also happen when a
347+ # Tag object is cloned. If this happens, leave
348+ # the value alone rather than trying to split
349+ # it again.
350+ modified_value = original_value
351+ modified_attrs[attr] = modified_value
352+ return modified_attrs
353
354 class SAXTreeBuilder(TreeBuilder):
355 """A Beautiful Soup treebuilder that listens for SAX events.
356
357- This is not currently used for anything, but it demonstrates
358- how a simple TreeBuilder would work.
359+ This is not currently used for anything, and it will be removed
360+ soon. It was a good idea, but it wasn't properly integrated into the
361+ rest of Beautiful Soup, so there have been long stretches where it
362+ hasn't worked properly.
363 """
364-
365- def __init__(self, *args, **kwargs):
366+ def __init__(self, *args:Any, **kwargs:Any) -> None:
367 warnings.warn(
368- f"The SAXTreeBuilder class was deprecated in 4.13.0. It is completely untested and probably doesn't work; use at your own risk.",
369+ f"The SAXTreeBuilder class was deprecated in 4.13.0 and will be removed soon thereafter. It is completely untested and probably doesn't work; do not use it.",
370 DeprecationWarning,
371 stacklevel=2
372 )
373 super(SAXTreeBuilder, self).__init__(*args, **kwargs)
374
375- def feed(self, markup:_RawMarkup):
376+ def feed(self, markup:_RawMarkup) -> None:
377 raise NotImplementedError()
378
379- def close(self):
380+ def close(self) -> None:
381 pass
382
383- def startElement(self, name, attrs):
384+ def startElement(self, name:str, attrs:Dict[str,str]) -> None:
385 attrs = dict((key[1], value) for key, value in list(attrs.items()))
386 #print("Start %s, %r" % (name, attrs))
387- self.soup.handle_starttag(name, attrs)
388+ assert self.soup is not None
389+ self.soup.handle_starttag(name, None, None, attrs)
390
391- def endElement(self, name):
392+ def endElement(self, name:str) -> None:
393 #print("End %s" % name)
394+ assert self.soup is not None
395 self.soup.handle_endtag(name)
396
397- def startElementNS(self, nsTuple, nodeName, attrs):
398+ def startElementNS(self, nsTuple:Tuple[str,str],
399+ nodeName:str, attrs:Dict[str,str]) -> None:
400 # Throw away (ns, nodeName) for now.
401 self.startElement(nodeName, attrs)
402
403- def endElementNS(self, nsTuple, nodeName):
404+ def endElementNS(self, nsTuple:Tuple[str,str], nodeName:str) -> None:
405 # Throw away (ns, nodeName) for now.
406 self.endElement(nodeName)
407 #handler.endElementNS((ns, node.nodeName), node.nodeName)
408
409- def startPrefixMapping(self, prefix, nodeValue):
410+ def startPrefixMapping(self, prefix:str, nodeValue:str) -> None:
411 # Ignore the prefix for now.
412 pass
413
414- def endPrefixMapping(self, prefix):
415+ def endPrefixMapping(self, prefix:str) -> None:
416 # Ignore the prefix for now.
417 # handler.endPrefixMapping(prefix)
418 pass
419
420- def characters(self, content):
421+ def characters(self, content:str) -> None:
422+ assert self.soup is not None
423 self.soup.handle_data(content)
424
425- def startDocument(self):
426+ def startDocument(self) -> None:
427 pass
428
429- def endDocument(self):
430+ def endDocument(self) -> None:
431 pass
432
433
434@@ -620,13 +643,13 @@ class DetectsXMLParsedAsHTML(object):
435 return False
436 markup = markup[:500]
437 if isinstance(markup, bytes):
438- markup_b = cast(bytes, markup)
439+ markup_b:bytes = markup
440 looks_like_xml = (
441 markup_b.startswith(cls.XML_PREFIX_B)
442 and not cls.LOOKS_LIKE_HTML_B.search(markup)
443 )
444 else:
445- markup_s = cast(str, markup)
446+ markup_s:str = markup
447 looks_like_xml = (
448 markup_s.startswith(cls.XML_PREFIX)
449 and not cls.LOOKS_LIKE_HTML.search(markup)
450@@ -650,9 +673,13 @@ class DetectsXMLParsedAsHTML(object):
451 self._first_processing_instruction = None
452 self._root_tag_name = None
453
454- def _document_might_be_xml(self, processing_instruction:str):
455+ def _document_might_be_xml(self, processing_instruction:str) -> None:
456 """Call this method when encountering an XML declaration, or a
457 "processing instruction" that might be an XML declaration.
458+
459+ This helps Beautiful Soup detect potential issues later, if
460+ the XML document turns out to be a non-XHTML document that's
461+ being parsed as XML.
462 """
463 if (self._first_processing_instruction is not None
464 or self._root_tag_name is not None):
465diff --git a/bs4/builder/_html5lib.py b/bs4/builder/_html5lib.py
466index 2ea556c..51e3c97 100644
467--- a/bs4/builder/_html5lib.py
468+++ b/bs4/builder/_html5lib.py
469@@ -12,6 +12,7 @@ from typing import (
470 Iterable,
471 List,
472 Optional,
473+ TypeAlias,
474 TYPE_CHECKING,
475 Tuple,
476 Union,
477@@ -54,6 +55,7 @@ if TYPE_CHECKING:
478 from bs4 import BeautifulSoup
479
480 from html5lib.treebuilders import base as treebuilder_base
481+from html5lib.treewalkers import base as treewalker_base
482
483
484 class HTML5TreeBuilder(HTMLTreeBuilder):
485@@ -138,6 +140,14 @@ class HTML5TreeBuilder(HTMLTreeBuilder):
486 # HTMLBinaryInputStream.__init__.
487 extra_kwargs['override_encoding'] = self.user_specified_encoding
488
489+ # TODO-TYPING: typeshed stub says the second argument to
490+ # HTMLParser.parse is scripting:bool, but the implementation
491+ # treats scripting as one of the kwargs. scripting:bool isn't
492+ # called out separately until we get down into _parse(), and
493+ # there it's the fourth argument, not the second. I'm not
494+ # sure what the stub ought to look like, but I'm confident
495+ # enough that it's better to leave this alone, rather than
496+ # change this call to get rid of the warning.
497 doc = parser.parse(markup, **extra_kwargs)
498
499 # Set the character encoding detected by the tokenizer.
500@@ -146,6 +156,10 @@ class HTML5TreeBuilder(HTMLTreeBuilder):
501 # charEncoding to UTF-8 if it gets Unicode input.
502 doc.original_encoding = None
503 else:
504+ # TODO-TYPING HTMLParser.tokenizer is set by
505+ # HTMLParser._parse(), so it's definitely set by this
506+ # point, but it's not defined as an instance variable, so
507+ # this line gives a warning.
508 original_encoding = parser.tokenizer.stream.charEncoding[0]
509 # The encoding is an html5lib Encoding object. We want to
510 # use a string for compatibility with other tree builders.
511@@ -231,72 +245,35 @@ class TreeBuilderForHtml5lib(treebuilder_base.TreeBuilder):
512
513 def fragmentClass(self) -> 'Element':
514 """This is only used by html5lib HTMLParser.parseFragment(),
515- which is never used by Beautiful Soup."""
516+ which is never used by Beautiful Soup, only by the html5lib
517+ unit tests. Since we don't currently hook into those tests,
518+ the implementation is left blank.
519+ """
520 raise NotImplementedError()
521
522 def getFragment(self) -> 'Element':
523- """This is only used by html5lib HTMLParser.parseFragment,
524- which is never used by Beautiful Soup."""
525+ """This is only used by the html5lib unit tests. Since we
526+ don't currently hook into those tests, the implementation is
527+ left blank.
528+ """
529 raise NotImplementedError()
530
531 def appendChild(self, node:'Element') -> None:
532- # TODO: This code is not covered by the BS4 tests.
533+ # TODO: This code is not covered by the BS4 tests, and
534+ # apparently not triggered by the html5lib test suite either.
535 self.soup.append(node.element)
536
537 def getDocument(self) -> 'BeautifulSoup':
538 return self.soup
539
540 # TODO-TYPING: typeshed stubs are incorrect about this;
541- # cloneNode returns a str, not None.
542+ # testSerializer returns a str, not None.
543 def testSerializer(self, element:'Element') -> str:
544- from bs4 import BeautifulSoup
545- rv = []
546- doctype_re = re.compile(r'^(.*?)(?: PUBLIC "(.*?)"(?: "(.*?)")?| SYSTEM "(.*?)")?$')
547-
548- def serializeElement(element:Union['Element', PageElement], indent=0) -> None:
549- if isinstance(element, BeautifulSoup):
550- pass
551- if isinstance(element, Doctype):
552- m = doctype_re.match(element)
553- if m is not None:
554- name = m.group(1)
555- if m.lastindex is not None and m.lastindex > 1:
556- publicId = m.group(2) or ""
557- systemId = m.group(3) or m.group(4) or ""
558- rv.append("""|%s<!DOCTYPE %s "%s" "%s">""" %
559- (' ' * indent, name, publicId, systemId))
560- else:
561- rv.append("|%s<!DOCTYPE %s>" % (' ' * indent, name))
562- else:
563- rv.append("|%s<!DOCTYPE >" % (' ' * indent,))
564- elif isinstance(element, Comment):
565- rv.append("|%s<!-- %s -->" % (' ' * indent, element))
566- elif isinstance(element, NavigableString):
567- rv.append("|%s\"%s\"" % (' ' * indent, element))
568- elif isinstance(element, Element):
569- if element.namespace:
570- name = "%s %s" % (prefixes[element.namespace],
571- element.name)
572- else:
573- name = element.name
574- rv.append("|%s<%s>" % (' ' * indent, name))
575- if element.attrs:
576- attributes = []
577- for name, value in list(element.attrs.items()):
578- if isinstance(name, NamespacedAttribute):
579- name = "%s %s" % (prefixes[name.namespace], name.name)
580- if isinstance(value, list):
581- value = " ".join(value)
582- attributes.append((name, value))
583-
584- for name, value in sorted(attributes):
585- rv.append('|%s%s="%s"' % (' ' * (indent + 2), name, value))
586- indent += 2
587- for child in element.children:
588- serializeElement(child, indent)
589- serializeElement(element, 0)
590-
591- return "\n".join(rv)
592+ """This is only used by the html5lib unit tests. Since we
593+ don't currently hook into those tests, the implementation is
594+ left blank.
595+ """
596+ raise NotImplementedError()
597
598 class AttrList(object):
599 """Represents a Tag's attributes in a way compatible with html5lib."""
600@@ -340,11 +317,28 @@ class AttrList(object):
601 def __contains__(self, name:str) -> bool:
602 return name in list(self.attrs.keys())
603
604+class BeautifulSoupNode(treebuilder_base.Node):
605+ element:PageElement
606+ soup:'BeautifulSoup'
607+ namespace:Optional[_NamespaceURL]
608+
609+ @property
610+ def nodeType(self) -> int:
611+ """Return the html5lib constant corresponding to the type of
612+ the underlying DOM object.
613
614-class Element(treebuilder_base.Node):
615+ NOTE: This property is only accessed by the html5lib test
616+ suite, not by Beautiful Soup proper.
617+ """
618+ raise NotImplementedError()
619
620+ # TODO-TYPING: typeshed stubs are incorrect about this;
621+ # cloneNode returns a new Node, not None.
622+ def cloneNode(self) -> treebuilder_base.Node:
623+ raise NotImplementedError()
624+
625+class Element(BeautifulSoupNode):
626 element:Tag
627- soup:'BeautifulSoup'
628 namespace:Optional[_NamespaceURL]
629
630 def __init__(self, element:Tag, soup:'BeautifulSoup',
631@@ -354,30 +348,20 @@ class Element(treebuilder_base.Node):
632 self.soup = soup
633 self.namespace = namespace
634
635- def appendChild(self, node:'Element') -> None:
636- string_child = child = None
637- if isinstance(node, str):
638- # Some other piece of code decided to pass in a string
639- # instead of creating a TextElement object to contain the
640- # string. This should not ever happen.
641- string_child = child = node
642- elif isinstance(node, Tag):
643- # Some other piece of code decided to pass in a Tag
644- # instead of creating an Element object to contain the
645- # Tag. This should not ever happen.
646- child = node
647- elif node.element.__class__ == NavigableString:
648+ def appendChild(self, node:'BeautifulSoupNode') -> None:
649+ string_child:Optional[NavigableString] = None
650+ child:PageElement
651+ if type(node.element) == NavigableString:
652 string_child = child = node.element
653- node.parent = self
654 else:
655 child = node.element
656- node.parent = self
657+ node.parent = self
658
659- if not isinstance(child, str) and child is not None and child.parent is not None:
660+ if child is not None and child.parent is not None and not isinstance(child, str):
661 node.element.extract()
662
663 if (string_child is not None and self.element.contents
664- and self.element.contents[-1].__class__ == NavigableString):
665+ and type(self.element.contents[-1]) == NavigableString):
666 # We are appending a string onto another string.
667 # TODO This has O(n^2) performance, for input like
668 # "a</a>a</a>a</a>..."
669@@ -413,18 +397,36 @@ class Element(treebuilder_base.Node):
670 return {}
671 return AttrList(self.element)
672
673- def setAttributes(self, attributes:Optional[Dict]) -> None:
674+ # An HTML5lib attribute name may either be a single string,
675+ # or a tuple (namespace, name).
676+ _Html5libAttributeName: TypeAlias = Union[str, Tuple[str, str]]
677+ # Now we can define the type this method accepts as a dictionary
678+ # mapping those attribute names to single string values.
679+ _Html5libAttributes: TypeAlias = Dict[_Html5libAttributeName, str]
680+ def setAttributes(self, attributes:Optional[_Html5libAttributes]) -> None:
681 if attributes is not None and len(attributes) > 0:
682+
683+ # Replace any namespaced attributes with
684+ # NamespacedAttribute objects.
685 for name, value in list(attributes.items()):
686 if isinstance(name, tuple):
687 new_name = NamespacedAttribute(*name)
688 del attributes[name]
689 attributes[new_name] = value
690
691+ # We can now cast attributes to the type of Dict
692+ # used by Beautiful Soup.
693+ normalized_attributes = cast(_AttributeValues, attributes)
694+
695+ # Values for tags like 'class' came in as single strings;
696+ # replace them with lists of strings as appropriate.
697 self.soup.builder._replace_cdata_list_attribute_values(
698- self.name, attributes)
699- for name, value in list(attributes.items()):
700- self.element[name] = value
701+ self.name, normalized_attributes)
702+
703+ # Then set the attributes on the Tag associated with this
704+ # BeautifulSoupNode.
705+ for name, value_or_values in list(normalized_attributes.items()):
706+ self.element[name] = value_or_values
707
708 # The attributes may contain variables that need substitution.
709 # Call set_up_substitutions manually.
710@@ -434,19 +436,20 @@ class Element(treebuilder_base.Node):
711 self.soup.builder.set_up_substitutions(self.element)
712 attributes = property(getAttributes, setAttributes)
713
714- def insertText(self, data:str, insertBefore:Optional['Element']=None) -> None:
715+ def insertText(self, data:str, insertBefore:Optional['BeautifulSoupNode']=None) -> None:
716 text = TextNode(self.soup.new_string(data), self.soup)
717 if insertBefore:
718 self.insertBefore(text, insertBefore)
719 else:
720 self.appendChild(text)
721
722- def insertBefore(self, node:'Element', refNode:'Element') -> None:
723+ def insertBefore(self, node:'BeautifulSoupNode', refNode:'BeautifulSoupNode') -> None:
724 index = self.element.index(refNode.element)
725- if (node.element.__class__ == NavigableString and self.element.contents
726- and self.element.contents[index-1].__class__ == NavigableString):
727+ if (type(node.element) == NavigableString and self.element.contents
728+ and type(self.element.contents[index-1]) == NavigableString):
729 # (See comments in appendChild)
730 old_node = self.element.contents[index-1]
731+ assert type(old_node) == NavigableString
732 new_str = self.soup.new_string(old_node + node.element)
733 old_node.replace_with(new_str)
734 else:
735@@ -504,13 +507,19 @@ class Element(treebuilder_base.Node):
736 # parent's last descendant. It has no .next_sibling and
737 # its .next_element is whatever the previous last
738 # descendant had.
739- last_childs_last_descendant = to_append[-1]._last_descendant(False, True)
740+ last_childs_last_descendant = to_append[-1]._last_descendant(
741+ is_initialized=False, accept_self=True
742+ )
743
744+ # Since we passed accept_self=True into _last_descendant,
745+ # there's no possibility that the result is None.
746+ assert last_childs_last_descendant is not None
747 last_childs_last_descendant.next_element = new_parents_last_descendant_next_element
748 if new_parents_last_descendant_next_element is not None:
749- # TODO: This code has no test coverage and I'm not sure
750- # how to get html5lib to go through this path, but it's
751- # just the other side of the previous line.
752+ # TODO-COVERAGE: This code has no test coverage and
753+ # I'm not sure how to get html5lib to go through this
754+ # path, but it's just the other side of the previous
755+ # line.
756 new_parents_last_descendant_next_element.previous_element = last_childs_last_descendant
757 last_childs_last_descendant.next_sibling = None
758
759@@ -526,7 +535,12 @@ class Element(treebuilder_base.Node):
760 # print("FROM", self.element)
761 # print("TO", new_parent_element)
762
763- # TODO: typeshed stubs are incorrect about this;
764+ # TODO-TYPING: typeshed stubs are incorrect about this;
765+ # hasContent returns a boolean, not None.
766+ def hasContent(self) -> bool:
767+ return len(self.element.contents) > 0
768+
769+ # TODO-TYPING: typeshed stubs are incorrect about this;
770 # cloneNode returns a new Node, not None.
771 def cloneNode(self) -> treebuilder_base.Node:
772 tag = self.soup.new_tag(self.element.name, self.namespace)
773@@ -535,24 +549,17 @@ class Element(treebuilder_base.Node):
774 node.attributes[key] = value
775 return node
776
777- # TODO-TYPING: typeshed stubs are incorrect about this;
778- # cloneNode returns a boolean, not None.
779- def hasContent(self) -> bool:
780- return len(self.element.contents) > 0
781-
782- def getNameTuple(self) -> Tuple[str, str]:
783+ def getNameTuple(self) -> Tuple[Optional[_NamespaceURL], str]:
784 if self.namespace == None:
785 return namespaces["html"], self.name
786 else:
787 return self.namespace, self.name
788-
789 nameTuple = property(getNameTuple)
790
791-class TextNode(Element):
792- def __init__(self, element:PageElement, soup:'BeautifulSoup'):
793+class TextNode(BeautifulSoupNode):
794+ element:NavigableString
795+
796+ def __init__(self, element:NavigableString, soup:'BeautifulSoup'):
797 treebuilder_base.Node.__init__(self, None)
798 self.element = element
799 self.soup = soup
800-
801- def cloneNode(self) -> treebuilder_base.Node:
802- raise NotImplementedError()
803diff --git a/bs4/builder/_htmlparser.py b/bs4/builder/_htmlparser.py
804index 91cecf7..d8e21d1 100644
805--- a/bs4/builder/_htmlparser.py
806+++ b/bs4/builder/_htmlparser.py
807@@ -49,7 +49,6 @@ if TYPE_CHECKING:
808 from bs4 import BeautifulSoup
809 from bs4.element import NavigableString
810 from bs4._typing import (
811- _AttributeValues,
812 _Encoding,
813 _Encodings,
814 _RawMarkup,
815@@ -60,6 +59,14 @@ HTMLPARSER = 'html.parser'
816 _DuplicateAttributeHandler = Callable[[Dict[str, str], str, str], None]
817
818 class BeautifulSoupHTMLParser(HTMLParser, DetectsXMLParsedAsHTML):
819+ #: Constant to handle duplicate attributes by replacing earlier values
820+ #: with later ones.
821+ REPLACE:str = 'replace'
822+
823+ #: Constant to handle duplicate attributes by ignoring later values
824+ #: and keeping the earlier ones.
825+ IGNORE:str = 'ignore'
826+
827 """A subclass of the Python standard library's HTMLParser class, which
828 listens for HTMLParser events and translates them into calls
829 to Beautiful Soup's tree construction API.
830@@ -73,11 +80,13 @@ class BeautifulSoupHTMLParser(HTMLParser, DetectsXMLParsedAsHTML):
831 the name of the duplicate attribute, and the most recent value
832 encountered.
833 """
834- def __init__(self, soup:BeautifulSoup, *args, **kwargs):
835+ def __init__(
836+ self, soup:BeautifulSoup, *args:Any,
837+ on_duplicate_attribute:Union[str, _DuplicateAttributeHandler]=REPLACE,
838+ **kwargs:Any
839+ ):
840 self.soup = soup
841- self.on_duplicate_attribute = kwargs.pop(
842- 'on_duplicate_attribute', self.REPLACE
843- )
844+ self.on_duplicate_attribute = on_duplicate_attribute
845 HTMLParser.__init__(self, *args, **kwargs)
846
847 # Keep a list of empty-element tags that were encountered
848@@ -90,14 +99,6 @@ class BeautifulSoupHTMLParser(HTMLParser, DetectsXMLParsedAsHTML):
849 self.already_closed_empty_element = []
850
851 self._initialize_xml_detector()
852-
853- #: Constant to handle duplicate attributes by replacing earlier values
854- #: with later ones.
855- IGNORE:str = 'ignore'
856-
857- #: Constant to handle duplicate attributes by ignoring later values
858- #: and keeping the earlier ones.
859- REPLACE:str = 'replace'
860
861 on_duplicate_attribute:Union[str, _DuplicateAttributeHandler]
862 already_closed_empty_element: List[str]
863@@ -145,7 +146,7 @@ class BeautifulSoupHTMLParser(HTMLParser, DetectsXMLParsedAsHTML):
864 closing tag).
865 """
866 # TODO: handle namespaces here?
867- attr_dict: Dict[str, str] = {}
868+ attr_dict:Dict[str, str] = {}
869 for key, value in attrs:
870 # Change None attribute values to the empty string
871 # for consistency with the other tree builders.
872diff --git a/bs4/builder/_lxml.py b/bs4/builder/_lxml.py
873index 3dfe88a..380164e 100644
874--- a/bs4/builder/_lxml.py
875+++ b/bs4/builder/_lxml.py
876@@ -13,6 +13,7 @@ from collections.abc import Callable
877
878 from typing import (
879 Any,
880+ cast,
881 Dict,
882 IO,
883 Iterable,
884@@ -21,6 +22,7 @@ from typing import (
885 Set,
886 Tuple,
887 Type,
888+ TypeAlias,
889 TYPE_CHECKING,
890 Union,
891 )
892@@ -28,7 +30,6 @@ from typing import (
893 from io import BytesIO
894 from io import StringIO
895 from lxml import etree
896-from bs4.dammit import (_Encoding)
897 from bs4.element import (
898 Comment,
899 Doctype,
900@@ -60,13 +61,16 @@ if TYPE_CHECKING:
901
902 LXML:str = 'lxml'
903
904-def _invert(d):
905+def _invert(d:dict[Any, Any]) -> dict[Any, Any]:
906 "Invert a dictionary."
907 return dict((v,k) for k, v in list(d.items()))
908
909+_LXMLParser:TypeAlias = Union[etree.XMLParser, etree.HTMLParser]
910+_ParserOrParserClass:TypeAlias = Union[_LXMLParser, Type[etree.XMLParser], Type[etree.HTMLParser]]
911+
912 class LXMLTreeBuilderForXML(TreeBuilder):
913
914- DEFAULT_PARSER_CLASS:Type[Any] = etree.XMLParser
915+ DEFAULT_PARSER_CLASS:Type[etree.XMLParser] = etree.XMLParser
916
917 is_xml:bool = True
918
919@@ -93,6 +97,7 @@ class LXMLTreeBuilderForXML(TreeBuilder):
920 nsmaps: List[Optional[_InvertedNamespaceMapping]]
921 empty_element_tags: Set[str]
922 parser: Any
923+ _default_parser: Optional[etree.XMLParser]
924
925 # NOTE: If we parsed Element objects and looked at .sourceline,
926 # we'd be able to see the line numbers from the original document.
927@@ -137,7 +142,7 @@ class LXMLTreeBuilderForXML(TreeBuilder):
928 # prefix, the first one in the document takes precedence.
929 self.soup._namespaces[key] = value
930
931- def default_parser(self, encoding:Optional[_Encoding]) -> Type:
932+ def default_parser(self, encoding:Optional[_Encoding]) -> _ParserOrParserClass:
933 """Find the default parser for the given encoding.
934
935 :return: Either a parser object or a class, which
936@@ -148,7 +153,7 @@ class LXMLTreeBuilderForXML(TreeBuilder):
937 return self.DEFAULT_PARSER_CLASS(
938 target=self, strip_cdata=False, recover=True, encoding=encoding)
939
940- def parser_for(self, encoding: Optional[_Encoding]) -> Any:
941+ def parser_for(self, encoding: Optional[_Encoding]) -> _LXMLParser:
942 """Instantiate an appropriate parser for the given encoding.
943
944 :param encoding: A string.
945@@ -164,8 +169,8 @@ class LXMLTreeBuilderForXML(TreeBuilder):
946 )
947 return parser
948
949- def __init__(self, parser:Optional[Any]=None,
950- empty_element_tags:Optional[Set[str]]=None, **kwargs):
951+ def __init__(self, parser:Optional[etree.XMLParser]=None,
952+ empty_element_tags:Optional[Set[str]]=None, **kwargs:Any):
953 # TODO: Issue a warning if parser is present but not a
954 # callable, since that means there's no way to create new
955 # parsers for different encodings.
956@@ -270,7 +275,7 @@ class LXMLTreeBuilderForXML(TreeBuilder):
957 yield (detector.markup, encoding, document_declared_encoding, False)
958
959 def feed(self, markup:_RawMarkup) -> None:
960- io: IO
961+ io: Union[BytesIO, StringIO]
962 if isinstance(markup, bytes):
963 io = BytesIO(markup)
964 elif isinstance(markup, str):
965@@ -298,14 +303,25 @@ class LXMLTreeBuilderForXML(TreeBuilder):
966 def close(self) -> None:
967 self.nsmaps = [self.DEFAULT_NSMAPS_INVERTED]
968
969- def start(self, name:str, attrs:Dict[str, str], nsmap:_NamespaceMapping={}):
970+ def start(self, tag:str|bytes, attrs:Dict[str|bytes, str|bytes], nsmap:_NamespaceMapping={}) -> None:
971 # This is called by lxml code as a result of calling
972 # BeautifulSoup.feed(), and we know self.soup is set by the time feed()
973 # is called.
974 assert self.soup is not None
975-
976- # Make sure attrs is a mutable dict--lxml may send an immutable dictproxy.
977- attrs = dict(attrs)
978+ assert isinstance(tag, str)
979+
980+ # We need to recreate the attribute dict for three
981+ # reasons. First, for type checking, so we can assert there
982+ # are no bytestrings in the keys or values. Second, because we
983+ # need a mutable dict--lxml might send us an immutable
984+ # dictproxy. Third, so we can handle namespaced attribute
985+ # names by converting the keys to NamespacedAttributes.
986+ new_attrs:Dict[Union[str,NamespacedAttribute], str] = {}
987+ for k, v in attrs.items():
988+ assert isinstance(k, str)
989+ assert isinstance(v, str)
990+ new_attrs[k] = v
991+
992 nsprefix: Optional[_NamespacePrefix] = None
993 namespace: Optional[_NamespaceURL] = None
994 # Invert each namespace map as it comes in.
995@@ -340,30 +356,28 @@ class LXMLTreeBuilderForXML(TreeBuilder):
996
997 # Also treat the namespace mapping as a set of attributes on the
998 # tag, so we can recreate it later.
999- attrs = attrs.copy()
1000 for prefix, namespace in list(nsmap.items()):
1001 attribute = NamespacedAttribute(
1002 "xmlns", prefix, "http://www.w3.org/2000/xmlns/")
1003- attrs[attribute] = namespace
1004+ new_attrs[attribute] = namespace
1005
1006 # Namespaces are in play. Find any attributes that came in
1007 # from lxml with namespaces attached to their names, and
1008 # turn then into NamespacedAttribute objects.
1009- new_attrs:Dict[Union[str,NamespacedAttribute], str] = {}
1010- for attr, value in list(attrs.items()):
1011+ final_attrs:Dict[Union[str,NamespacedAttribute], str] = {}
1012+ for attr, value in list(new_attrs.items()):
1013 namespace, attr = self._getNsTag(attr)
1014 if namespace is None:
1015- new_attrs[attr] = value
1016+ final_attrs[attr] = value
1017 else:
1018 nsprefix = self._prefix_for_namespace(namespace)
1019 attr = NamespacedAttribute(nsprefix, attr, namespace)
1020- new_attrs[attr] = value
1021- attrs = new_attrs
1022+ final_attrs[attr] = value
1023
1024- namespace, name = self._getNsTag(name)
1025+ namespace, tag = self._getNsTag(tag)
1026 nsprefix = self._prefix_for_namespace(namespace)
1027 self.soup.handle_starttag(
1028- name, namespace, nsprefix, attrs,
1029+ tag, namespace, nsprefix, final_attrs,
1030 namespaces=self.active_namespace_prefixes[-1]
1031 )
1032
1033@@ -376,8 +390,9 @@ class LXMLTreeBuilderForXML(TreeBuilder):
1034 return inverted_nsmap[namespace]
1035 return None
1036
1037- def end(self, name:str) -> None:
1038+ def end(self, name:str|bytes) -> None:
1039 assert self.soup is not None
1040+ assert isinstance(name, str)
1041 self.soup.endData()
1042 completed_tag = self.soup.tagStack[-1]
1043 namespace, name = self._getNsTag(name)
1044@@ -406,9 +421,10 @@ class LXMLTreeBuilderForXML(TreeBuilder):
1045 self.soup.handle_data(data)
1046 self.soup.endData(self.processing_instruction_class)
1047
1048- def data(self, content:str) -> None:
1049+ def data(self, data:str|bytes) -> None:
1050 assert self.soup is not None
1051- self.soup.handle_data(content)
1052+ assert isinstance(data, str)
1053+ self.soup.handle_data(data)
1054
1055 def doctype(self, name:str, pubid:str, system:str) -> None:
1056 assert self.soup is not None
1057@@ -416,11 +432,12 @@ class LXMLTreeBuilderForXML(TreeBuilder):
1058 doctype = Doctype.for_name_and_ids(name, pubid, system)
1059 self.soup.object_was_parsed(doctype)
1060
1061- def comment(self, content:str) -> None:
1062+ def comment(self, text:str|bytes) -> None:
1063 "Handle comments as Comment objects."
1064 assert self.soup is not None
1065+ assert isinstance(text, str)
1066 self.soup.endData()
1067- self.soup.handle_data(content)
1068+ self.soup.handle_data(text)
1069 self.soup.endData(Comment)
1070
1071 def test_fragment_to_document(self, fragment:str) -> str:
1072@@ -436,7 +453,7 @@ class LXMLTreeBuilder(HTMLTreeBuilder, LXMLTreeBuilderForXML):
1073 features: Iterable[str] = list(ALTERNATE_NAMES) + [NAME, HTML, FAST, PERMISSIVE]
1074 is_xml: bool = False
1075
1076- def default_parser(self, encoding:Optional[_Encoding]) -> Type[Any]:
1077+ def default_parser(self, encoding:Optional[_Encoding]) -> _ParserOrParserClass:
1078 return etree.HTMLParser
1079
1080 def feed(self, markup:_RawMarkup) -> None:
1081diff --git a/bs4/dammit.py b/bs4/dammit.py
1082index 8c1b631..4e950a2 100644
1083--- a/bs4/dammit.py
1084+++ b/bs4/dammit.py
1085@@ -260,14 +260,14 @@ class EntitySubstitution(object):
1086 AMPERSAND_OR_BRACKET: Pattern[str] = re.compile("([<>&])")
1087
1088 @classmethod
1089- def _substitute_html_entity(cls, matchobj:re.Match[str]) -> str:
1090+ def _substitute_html_entity(cls, matchobj:re.Match) -> str:
1091 """Used with a regular expression to substitute the
1092 appropriate HTML entity for a special character string."""
1093 entity = cls.CHARACTER_TO_HTML_ENTITY.get(matchobj.group(0))
1094 return "&%s;" % entity
1095
1096 @classmethod
1097- def _substitute_xml_entity(cls, matchobj:re.Match[str]) -> str:
1098+ def _substitute_xml_entity(cls, matchobj:re.Match) -> str:
1099 """Used with a regular expression to substitute the
1100 appropriate XML entity for a special character string."""
1101 entity = cls.CHARACTER_TO_XML_ENTITY[matchobj.group(0)]
1102@@ -752,7 +752,7 @@ class UnicodeDammit:
1103
1104 log: Logger #: :meta private:
1105
1106- def _sub_ms_char(self, match:re.Match[bytes]) -> bytes:
1107+ def _sub_ms_char(self, match:re.Match) -> bytes:
1108 """Changes a MS smart quote character to an XML or HTML
1109 entity, or an ASCII character.
1110
1111diff --git a/bs4/element.py b/bs4/element.py
1112index f4ab89c..22115a2 100644
1113--- a/bs4/element.py
1114+++ b/bs4/element.py
1115@@ -39,11 +39,13 @@ from typing import (
1116 Union,
1117 cast,
1118 )
1119-from typing_extensions import Self
1120+from typing_extensions import (
1121+ Self,
1122+ TypeAlias,
1123+)
1124 if TYPE_CHECKING:
1125 from bs4 import BeautifulSoup
1126 from bs4.builder import TreeBuilder
1127- from bs4.dammit import _Encoding
1128 from bs4.filter import ElementFilter
1129 from bs4.formatter import (
1130 _EntitySubstitutionFunction,
1131@@ -52,12 +54,16 @@ if TYPE_CHECKING:
1132 from bs4._typing import (
1133 _AttributeValue,
1134 _AttributeValues,
1135+ _Encoding,
1136+ _RawOrProcessedAttributeValues,
1137 _StrainableElement,
1138 _StrainableAttribute,
1139 _StrainableAttributes,
1140 _StrainableString,
1141 )
1142
1143+_OneOrMoreStringTypes:TypeAlias = Union[Type['NavigableString'], Iterable[Type['NavigableString']]]
1144+
1145 # Deprecated module-level attributes.
1146 # See https://peps.python.org/pep-0562/
1147 _deprecated_names = dict(
1148@@ -66,7 +72,7 @@ _deprecated_names = dict(
1149 #: :meta private:
1150 _deprecated_whitespace_re: Pattern[str] = re.compile(r"\s+")
1151
1152-def __getattr__(name):
1153+def __getattr__(name:str) -> Any:
1154 if name in _deprecated_names:
1155 message = _deprecated_names[name]
1156 warnings.warn(
1157@@ -124,7 +130,8 @@ class NamespacedAttribute(str):
1158 namespace: Optional[str]
1159
1160 def __new__(cls, prefix:Optional[str],
1161- name:Optional[str]=None, namespace:Optional[str]=None):
1162+ name:Optional[str]=None,
1163+ namespace:Optional[str]=None) -> Self:
1164 if not name:
1165 # This is the default namespace. Its name "has no value"
1166 # per https://www.w3.org/TR/xml-names/#defaulting
1167@@ -223,7 +230,7 @@ class ContentMetaAttributeValue(AttributeValueWithCharsetSubstitution):
1168 """
1169 if eventual_encoding in PYTHON_SPECIFIC_ENCODINGS:
1170 return self.CHARSET_RE.sub('', self.original_value)
1171- def rewrite(match):
1172+ def rewrite(match:re.Match[str]) -> str:
1173 return match.group(1) + eventual_encoding
1174 return self.CHARSET_RE.sub(rewrite, self.original_value)
1175
1176@@ -370,7 +377,7 @@ class PageElement(object):
1177 "previousSibling", "previous_sibling", "4.0.0"
1178 )
1179
1180- def __deepcopy__(self, memo:Dict, recursive:bool=False) -> Self:
1181+ def __deepcopy__(self, memo:Dict[Any,Any], recursive:bool=False) -> Self:
1182 raise NotImplementedError()
1183
1184 def __copy__(self) -> Self:
1185@@ -528,6 +535,13 @@ class PageElement(object):
1186 ) -> Optional[PageElement]:
1187 """Finds the last element beneath this object to be parsed.
1188
1189+ Special note to help you figure things out if your type
1190+ checking is tripped up by the fact that this method returns
1191+ Optional[PageElement] instead of PageElement: the only time
1192+ this method returns None is if `accept_self` is False and the
1193+ `PageElement` has no children--either it's a NavigableString
1194+ or an empty Tag.
1195+
1196 :param is_initialized: Has `PageElement.setup` been called on
1197 this `PageElement` yet?
1198
1199@@ -878,9 +892,10 @@ class PageElement(object):
1200
1201 def _find_one(
1202 self,
1203- # TODO: "There is no syntax to indicate optional or keyword
1204- # arguments; such function types are rarely used as
1205- # callback types." - So, not sure how to get more specific here.
1206+ # TODO-TYPING: "There is no syntax to indicate optional or
1207+ # keyword arguments; such function types are rarely used
1208+ # as callback types." - So, not sure how to get more
1209+ # specific here.
1210 method:Callable,
1211 name:Optional[_StrainableElement],
1212 attrs:_StrainableAttributes,
1213@@ -955,7 +970,7 @@ class PageElement(object):
1214 You can pass in your own technique for iterating over the tree, and your own
1215 technique for matching items.
1216 """
1217- results:ResultSet = ResultSet(matcher)
1218+ results:ResultSet[PageElement] = ResultSet(matcher)
1219 while True:
1220 try:
1221 i = next(generator)
1222@@ -1029,27 +1044,27 @@ class PageElement(object):
1223 return getattr(self, '_decomposed', False) or False
1224
1225 @_deprecated("next_elements", "4.0.0")
1226- def nextGenerator(self):
1227+ def nextGenerator(self) -> Iterator[PageElement]:
1228 ":meta private:"
1229 return self.next_elements
1230
1231 @_deprecated("next_siblings", "4.0.0")
1232- def nextSiblingGenerator(self):
1233+ def nextSiblingGenerator(self) -> Iterator[PageElement]:
1234 ":meta private:"
1235 return self.next_siblings
1236
1237 @_deprecated("previous_elements", "4.0.0")
1238- def previousGenerator(self):
1239+ def previousGenerator(self) -> Iterator[PageElement]:
1240 ":meta private:"
1241 return self.previous_elements
1242
1243 @_deprecated("previous_siblings", "4.0.0")
1244- def previousSiblingGenerator(self):
1245+ def previousSiblingGenerator(self) -> Iterator[PageElement]:
1246 ":meta private:"
1247 return self.previous_siblings
1248
1249 @_deprecated("parents", "4.0.0")
1250- def parentGenerator(self):
1251+ def parentGenerator(self) -> Iterator[PageElement]:
1252 ":meta private:"
1253 return self.parents
1254
1255@@ -1087,7 +1102,7 @@ class NavigableString(str, PageElement):
1256 u.setup()
1257 return u
1258
1259- def __deepcopy__(self, memo:Dict, recursive:bool=False) -> Self:
1260+ def __deepcopy__(self, memo:Dict[Any, Any], recursive:bool=False) -> Self:
1261 """A copy of a NavigableString has the same contents and class
1262 as the original, but it is not connected to the parse tree.
1263
1264@@ -1097,7 +1112,7 @@ class NavigableString(str, PageElement):
1265 """
1266 return type(self)(self)
1267
1268- def __getnewargs__(self):
1269+ def __getnewargs__(self) -> Tuple[str]:
1270 return (str(self),)
1271
1272 @property
1273@@ -1134,14 +1149,14 @@ class NavigableString(str, PageElement):
1274 return None
1275
1276 @name.setter
1277- def name(self, name:str):
1278+ def name(self, name:str) -> None:
1279 """Prevent NavigableString.name from ever being set.
1280
1281 :meta private:
1282 """
1283 raise AttributeError("A NavigableString cannot be given a name.")
1284
1285- def _all_strings(self, strip=False, types:Iterable[Type[NavigableString]]=PageElement.default) -> Iterator[str]:
1286+ def _all_strings(self, strip:bool=False, types:_OneOrMoreStringTypes=PageElement.default) -> Iterator[str]:
1287 """Yield all strings of certain classes, possibly stripping them.
1288
1289 This makes it easy for NavigableString to implement methods
1290@@ -1382,7 +1397,7 @@ class Tag(PageElement):
1291 name:Optional[str]=None,
1292 namespace:Optional[str]=None,
1293 prefix:Optional[str]=None,
1294- attrs:Optional[_AttributeValues]=None,
1295+ attrs:Optional[_RawOrProcessedAttributeValues]=None,
1296 parent:Optional[Union[BeautifulSoup, Tag]]=None,
1297 previous:Optional[PageElement]=None,
1298 is_xml:Optional[bool]=None,
1299@@ -1485,7 +1500,7 @@ class Tag(PageElement):
1300 #: :meta private:
1301 parserClass = _deprecated_alias("parserClass", "parser_class", "4.0.0")
1302
1303- def __deepcopy__(self, memo:dict, recursive:bool=True) -> Self:
1304+ def __deepcopy__(self, memo:Dict[Any, Any], recursive:bool=True) -> Self:
1305 """A deepcopy of a Tag is a new Tag, unconnected to the parse tree.
1306 Its contents are a copy of the old Tag's contents.
1307 """
1308@@ -1552,9 +1567,9 @@ class Tag(PageElement):
1309 return len(self.contents) == 0 and self.can_be_empty_element is True
1310
1311 @_deprecated("is_empty_element", "4.0.0")
1312- def isSelfClosing(self):
1313+ def isSelfClosing(self) -> bool:
1314 ": :meta private:"
1315- return is_empty_element()
1316+ return self.is_empty_element
1317
1318 @property
1319 def string(self) -> Optional[str]:
1320@@ -1592,7 +1607,7 @@ class Tag(PageElement):
1321
1322 #: :meta private:
1323 MAIN_CONTENT_STRING_TYPES = {NavigableString, CData}
1324- def _all_strings(self, strip:bool=False, types:Iterable[Type[NavigableString]]=PageElement.default) -> Iterator[str]:
1325+ def _all_strings(self, strip:bool=False, types:_OneOrMoreStringTypes=PageElement.default) -> Iterator[str]:
1326 """Yield all strings of certain classes, possibly stripping them.
1327
1328 :param strip: If True, all strings will be stripped before being
1329@@ -1739,7 +1754,7 @@ class Tag(PageElement):
1330 replace_with_children = unwrap
1331
1332 @_deprecated("unwrap", "4.0.0")
1333- def replaceWithChildren(self):
1334+ def replaceWithChildren(self) -> PageElement:
1335 ": :meta private:"
1336 return self.unwrap()
1337
1338@@ -1914,11 +1929,21 @@ class Tag(PageElement):
1339 "Deleting tag[key] deletes all 'key' attributes for the tag."
1340 self.attrs.pop(key, None)
1341
1342- def __call__(self, *args, **kwargs) -> ResultSet[PageElement]:
1343+ def __call__(self,
1344+ name:Optional[_StrainableElement]=None,
1345+ attrs:_StrainableAttributes={},
1346+ recursive:bool=True,
1347+ string:Optional[_StrainableString]=None,
1348+ limit:Optional[int]=None,
1349+ _stacklevel:int=2,
1350+ **kwargs:_StrainableAttribute
1351+ ) -> ResultSet[PageElement]:
1352 """Calling a Tag like a function is the same as calling its
1353 find_all() method. Eg. tag('a') returns a list of all the A tags
1354 found within this tag."""
1355- return self.find_all(*args, **kwargs)
1356+ return self.find_all(
1357+ name, attrs, recursive, string, limit, _stacklevel, **kwargs
1358+ )
1359
1360 def __getattr__(self, subtag:str) -> Optional[Tag]:
1361 """Calling tag.subtag is the same as calling tag.find(name="subtag")"""
1362@@ -2002,7 +2027,7 @@ class Tag(PageElement):
1363 def decode(self, indent_level:Optional[int]=None,
1364 eventual_encoding:_Encoding=DEFAULT_OUTPUT_ENCODING,
1365 formatter:_FormatterOrName="minimal",
1366- iterator:Optional[Iterable]=None) -> str:
1367+ iterator:Optional[Iterator[PageElement]]=None) -> str:
1368 """Render this `Tag` and its contents as a Unicode string.
1369
1370 :param indent_level: Each line of the rendering will be
1371@@ -2122,9 +2147,10 @@ class Tag(PageElement):
1372 EMPTY_ELEMENT_EVENT = _TreeTraversalEvent() #: :meta private:
1373 STRING_ELEMENT_EVENT = _TreeTraversalEvent() #: :meta private:
1374
1375- def _event_stream(self, iterator=None) -> Iterator[
1376- Tuple[_TreeTraversalEvent, PageElement]
1377- ]:
1378+ def _event_stream(
1379+ self,
1380+ iterator:Optional[Iterator[PageElement]]=None
1381+ ) -> Iterator[Tuple[_TreeTraversalEvent, PageElement]]:
1382 """Yield a sequence of events that can be used to reconstruct the DOM
1383 for this element.
1384
1385@@ -2316,8 +2342,8 @@ class Tag(PageElement):
1386 return contents.encode(encoding)
1387
1388 @_deprecated("encode_contents", "4.0.0")
1389- def renderContents(self, encoding=DEFAULT_OUTPUT_ENCODING,
1390- prettyPrint=False, indentLevel=0):
1391+ def renderContents(self, encoding:str=DEFAULT_OUTPUT_ENCODING,
1392+ prettyPrint:bool=False, indentLevel:Optional[int]=0) -> bytes:
1393 """Deprecated method for BS3 compatibility.
1394
1395 :meta private:
1396@@ -2436,7 +2462,7 @@ class Tag(PageElement):
1397 def select_one(self,
1398 selector:str,
1399 namespaces:Optional[Dict[str, str]]=None,
1400- **kwargs) -> Optional[Tag]:
1401+ **kwargs:Any) -> Optional[Tag]:
1402 """Perform a CSS selection operation on the current element.
1403
1404 :param selector: A CSS selector.
1405@@ -2452,7 +2478,7 @@ class Tag(PageElement):
1406 return self.css.select_one(selector, namespaces, **kwargs)
1407
1408 def select(self, selector:str, namespaces:Optional[Dict[str, str]]=None,
1409- limit:int=0, **kwargs) -> ResultSet[Tag]:
1410+ limit:int=0, **kwargs:Any) -> ResultSet[Tag]:
1411 """Perform a CSS selection operation on the current element.
1412
1413 This uses the SoupSieve library.
1414@@ -2478,7 +2504,7 @@ class Tag(PageElement):
1415
1416 # Old names for backwards compatibility
1417 @_deprecated("children", "4.0.0")
1418- def childGenerator(self):
1419+ def childGenerator(self) -> Iterator[PageElement]:
1420 """Deprecated generator.
1421
1422 :meta private:
1423@@ -2486,7 +2512,7 @@ class Tag(PageElement):
1424 return self.children
1425
1426 @_deprecated("descendants", "4.0.0")
1427- def recursiveChildGenerator(self):
1428+ def recursiveChildGenerator(self) -> Iterator[PageElement]:
1429 """Deprecated generator.
1430
1431 :meta private:
1432@@ -2494,7 +2520,7 @@ class Tag(PageElement):
1433 return self.descendants
1434
1435 @_deprecated("has_attr", "4.0.0")
1436- def has_key(self, key):
1437+ def has_key(self, key:str) -> bool:
1438 """Deprecated method. This was kind of misleading because has_key()
1439 (attributes) was different from __in__ (contents).
1440
1441@@ -2516,7 +2542,7 @@ class ResultSet(List[_PageElementT], Generic[_PageElementT]):
1442 super(ResultSet, self).__init__(result)
1443 self.source = source
1444
1445- def __getattr__(self, key:str):
1446+ def __getattr__(self, key:str) -> None:
1447 """Raise a helpful exception to explain a common code fix."""
1448 raise AttributeError(
1449 f"""ResultSet object has no attribute "{key}". You're probably treating a list of elements like a single element. Did you call find_all() when you meant to call find()?"""
1450diff --git a/bs4/filter.py b/bs4/filter.py
1451index 74e26d9..7632639 100644
1452--- a/bs4/filter.py
1453+++ b/bs4/filter.py
1454@@ -25,10 +25,10 @@ from bs4._deprecation import _deprecated
1455 from bs4.element import NavigableString, PageElement, Tag
1456 from bs4._typing import (
1457 _AttributeValue,
1458- _AttributeValues,
1459 _AllowStringCreationFunction,
1460 _AllowTagCreationFunction,
1461 _PageElementMatchFunction,
1462+ _RawAttributeValues,
1463 _TagMatchFunction,
1464 _StringMatchFunction,
1465 _StrainableElement,
1466@@ -98,7 +98,7 @@ class ElementFilter(object):
1467
1468 def allow_tag_creation(
1469 self, nsprefix:Optional[str], name:str,
1470- attrs:Optional[_AttributeValues]
1471+ attrs:Optional[_RawAttributeValues]
1472 ) -> bool:
1473 """Based on the name and attributes of a tag, see whether this
1474 ElementFilter will allow a Tag object to even be created.
1475@@ -372,9 +372,8 @@ class SoupStrainer(ElementFilter):
1476 # third-party regex library, whose pattern objects doesn't
1477 # derive from re.Pattern.
1478 #
1479- # TODO-TYPING: Once we drop support for Python 3.7, we
1480- # might be able to address this by defining an appropriate
1481- # Protocol.
1482+ # TODO-TYPING: We should be able to bring in a Protocol
1483+ # from typing_extensions to handle this.
1484 yield rule_class(pattern=obj)
1485 elif hasattr(obj, '__iter__'):
1486 for o in obj:
1487@@ -497,7 +496,7 @@ class SoupStrainer(ElementFilter):
1488 )
1489 return this_attr_match
1490
1491- def allow_tag_creation(self, nsprefix:Optional[str], name:str, attrs:Optional[_AttributeValues]) -> bool:
1492+ def allow_tag_creation(self, nsprefix:Optional[str], name:str, attrs:Optional[_RawAttributeValues]) -> bool:
1493 """Based on the name and attributes of a tag, see whether this
1494 SoupStrainer will allow a Tag object to even be created.
1495
1496@@ -586,7 +585,7 @@ class SoupStrainer(ElementFilter):
1497 return False
1498
1499 @_deprecated("allow_tag_creation", "4.13.0")
1500- def search_tag(self, name:str, attrs:Optional[_AttributeValues]) -> bool:
1501+ def search_tag(self, name:str, attrs:Optional[_RawAttributeValues]) -> bool:
1502 """A less elegant version of allow_tag_creation()."""
1503 ":meta private:"
1504 return self.allow_tag_creation(None, name, attrs)
1505diff --git a/bs4/tests/__init__.py b/bs4/tests/__init__.py
1506index 3ef999d..8415645 100644
1507--- a/bs4/tests/__init__.py
1508+++ b/bs4/tests/__init__.py
1509@@ -15,6 +15,7 @@ from bs4.element import (
1510 Comment,
1511 ContentMetaAttributeValue,
1512 Doctype,
1513+ PageElement,
1514 PYTHON_SPECIFIC_ENCODINGS,
1515 Script,
1516 Stylesheet,
1517@@ -25,8 +26,21 @@ from bs4.builder import (
1518 DetectsXMLParsedAsHTML,
1519 XMLParsedAsHTMLWarning,
1520 )
1521+from bs4._typing import (
1522+ _IncomingMarkup
1523+)
1524+
1525+from bs4.builder import TreeBuilder
1526 from bs4.builder._htmlparser import HTMLParserTreeBuilder
1527-default_builder = HTMLParserTreeBuilder
1528+
1529+from typing import (
1530+ Any,
1531+ Iterable,
1532+ List,
1533+ Optional,
1534+ Tuple,
1535+ Type,
1536+)
1537
1538 # Some tests depend on specific third-party libraries. We use
1539 # @pytest.mark.skipIf on the following conditionals to skip them
1540@@ -51,7 +65,9 @@ except ImportError:
1541 LXML_PRESENT = False
1542 LXML_VERSION = (0,)
1543
1544-BAD_DOCUMENT = """A bare string
1545+default_builder:Type[TreeBuilder] = HTMLParserTreeBuilder
1546+
1547+BAD_DOCUMENT:str = """A bare string
1548 <!DOCTYPE xsl:stylesheet SYSTEM "htmlent.dtd">
1549 <!DOCTYPE xsl:stylesheet PUBLIC "htmlent.dtd">
1550 <div><![CDATA[A CDATA section where it doesn't belong]]></div>
1551@@ -91,28 +107,30 @@ BAD_DOCUMENT = """A bare string
1552 class SoupTest(object):
1553
1554 @property
1555- def default_builder(self):
1556+ def default_builder(self) -> Type[TreeBuilder]:
1557 return default_builder
1558
1559- def soup(self, markup, **kwargs):
1560+ def soup(self, markup:_IncomingMarkup, **kwargs:Any) -> BeautifulSoup:
1561 """Build a Beautiful Soup object from markup."""
1562 builder = kwargs.pop('builder', self.default_builder)
1563 return BeautifulSoup(markup, builder=builder, **kwargs)
1564
1565- def document_for(self, markup, **kwargs):
1566+ def document_for(self, markup:str, **kwargs:Any) -> str:
1567 """Turn an HTML fragment into a document.
1568
1569 The details depend on the builder.
1570 """
1571 return self.default_builder(**kwargs).test_fragment_to_document(markup)
1572
1573- def assert_soup(self, to_parse, compare_parsed_to=None):
1574+ def assert_soup(self, to_parse:_IncomingMarkup,
1575+ compare_parsed_to:Optional[str]=None) -> None:
1576 """Parse some markup using Beautiful Soup and verify that
1577 the output markup is as expected.
1578 """
1579 builder = self.default_builder
1580 obj = BeautifulSoup(to_parse, builder=builder)
1581 if compare_parsed_to is None:
1582+ assert isinstance(to_parse, str)
1583 compare_parsed_to = to_parse
1584
1585 # Verify that the documents come out the same.
1586@@ -131,7 +149,7 @@ class SoupTest(object):
1587
1588 assertSoupEquals = assert_soup
1589
1590- def assertConnectedness(self, element):
1591+ def assertConnectedness(self, element:Tag) -> None:
1592 """Ensure that next_element and previous_element are properly
1593 set for all descendants of the given element.
1594 """
1595@@ -142,7 +160,7 @@ class SoupTest(object):
1596 assert earlier == e.previous_element
1597 earlier = e
1598
1599- def linkage_validator(self, el, _recursive_call=False):
1600+ def linkage_validator(self, el:Tag, _recursive_call:bool=False) -> Optional[PageElement]:
1601 """Ensure proper linkage throughout the document."""
1602 descendant = None
1603 # Document element should have no previous element or previous sibling.
1604@@ -209,6 +227,7 @@ class SoupTest(object):
1605
1606 if isinstance(child, Tag) and child.contents:
1607 descendant = self.linkage_validator(child, True)
1608+ assert descendant is not None
1609 # A bubbled up descendant should have no next siblings
1610 assert descendant.next_sibling is None,\
1611 "Bad next_sibling\nNODE: {}\nNEXT {}\nEXPECTED {}".format(
1612@@ -234,7 +253,7 @@ class SoupTest(object):
1613 child = el
1614
1615 if not _recursive_call and child is not None:
1616- target = el
1617+ target:Optional[Tag] = el
1618 while True:
1619 if target is None:
1620 assert child.next_element is None, \
1621@@ -256,7 +275,7 @@ class SoupTest(object):
1622 # Return the child to the recursive caller
1623 return child
1624
1625- def assert_selects(self, tags, should_match):
1626+ def assert_selects(self, tags:Iterable[Tag], should_match:Iterable[str]) -> None:
1627 """Make sure that the given tags have the correct text.
1628
1629 This is used in tests that define a bunch of tags, each
1630@@ -265,7 +284,7 @@ class SoupTest(object):
1631 """
1632 assert [tag.string for tag in tags] == should_match
1633
1634- def assert_selects_ids(self, tags, should_match):
1635+ def assert_selects_ids(self, tags:Iterable[Tag], should_match:Iterable[str]) -> None:
1636 """Make sure that the given tags have the correct IDs.
1637
1638 This is used in tests that define a bunch of tags, each
1639@@ -275,7 +294,7 @@ class SoupTest(object):
1640 assert [tag['id'] for tag in tags] == should_match
1641
1642
1643-class TreeBuilderSmokeTest(object):
1644+class TreeBuilderSmokeTest(SoupTest):
1645 # Tests that are common to HTML and XML tree builders.
1646
1647 @pytest.mark.parametrize(
1648@@ -352,7 +371,7 @@ class HTMLTreeBuilderSmokeTest(TreeBuilderSmokeTest):
1649 assert loaded.__class__ == BeautifulSoup
1650 assert loaded.decode() == tree.decode()
1651
1652- def assertDoctypeHandled(self, doctype_fragment):
1653+ def assertDoctypeHandled(self, doctype_fragment:str) -> None:
1654 """Assert that a given doctype string is handled correctly."""
1655 doctype_str, soup = self._document_with_doctype(doctype_fragment)
1656
1657@@ -366,7 +385,7 @@ class HTMLTreeBuilderSmokeTest(TreeBuilderSmokeTest):
1658 # parse tree and that the rest of the document parsed.
1659 assert soup.p.contents[0] == 'foo'
1660
1661- def _document_with_doctype(self, doctype_fragment, doctype_string="DOCTYPE"):
1662+ def _document_with_doctype(self, doctype_fragment:str, doctype_string:str="DOCTYPE") -> Tuple[bytes, BeautifulSoup]:
1663 """Generate and parse a document with the given doctype."""
1664 doctype = '<!%s %s>' % (doctype_string, doctype_fragment)
1665 markup = doctype + '\n<p>foo</p>'
1666diff --git a/bs4/tests/test_builder_registry.py b/bs4/tests/test_builder_registry.py
1667index 9a9ce1f..da10e5f 100644
1668--- a/bs4/tests/test_builder_registry.py
1669+++ b/bs4/tests/test_builder_registry.py
1670@@ -2,10 +2,12 @@
1671
1672 import pytest
1673 import warnings
1674+from typing import Type
1675
1676 from bs4 import BeautifulSoup
1677 from bs4.builder import (
1678 builder_registry as registry,
1679+ TreeBuilder,
1680 TreeBuilderRegistry,
1681 )
1682 from bs4.builder._htmlparser import HTMLParserTreeBuilder
1683@@ -81,7 +83,7 @@ class TestRegistry(object):
1684 def setup_method(self):
1685 self.registry = TreeBuilderRegistry()
1686
1687- def builder_for_features(self, *feature_list):
1688+ def builder_for_features(self, *feature_list:str) -> Type[TreeBuilder]:
1689 cls = type('Builder_' + '_'.join(feature_list),
1690 (object,), {'features' : feature_list})
1691
1692diff --git a/bs4/tests/test_css.py b/bs4/tests/test_css.py
1693index 359dbcd..2e2baba 100644
1694--- a/bs4/tests/test_css.py
1695+++ b/bs4/tests/test_css.py
1696@@ -8,6 +8,12 @@ from bs4 import (
1697 ResultSet,
1698 )
1699
1700+from typing import (
1701+ Any,
1702+ Iterable,
1703+ Tuple,
1704+)
1705+
1706 from . import (
1707 SoupTest,
1708 SOUP_SIEVE_PRESENT,
1709@@ -78,7 +84,7 @@ class TestCSSSelectors(SoupTest):
1710 def setup_method(self):
1711 self.soup = BeautifulSoup(self.HTML, 'html.parser')
1712
1713- def assert_selects(self, selector, expected_ids, **kwargs):
1714+ def assert_selects(self, selector:str, expected_ids:Iterable[str], **kwargs:Any) -> None:
1715 results = self.soup.select(selector, **kwargs)
1716 assert isinstance(results, ResultSet)
1717 el_ids = [el['id'] for el in results]
1718@@ -90,7 +96,7 @@ class TestCSSSelectors(SoupTest):
1719
1720 assertSelect = assert_selects
1721
1722- def assert_select_multiple(self, *tests):
1723+ def assert_select_multiple(self, *tests:Tuple[str, Iterable[str]]):
1724 for selector, expected_ids in tests:
1725 self.assert_selects(selector, expected_ids)
1726
1727diff --git a/bs4/tests/test_filter.py b/bs4/tests/test_filter.py
1728index 8d5da70..dfd6f18 100644
1729--- a/bs4/tests/test_filter.py
1730+++ b/bs4/tests/test_filter.py
1731@@ -5,6 +5,12 @@ import warnings
1732 from . import (
1733 SoupTest,
1734 )
1735+from typing import (
1736+ Callable,
1737+ Optional,
1738+ Pattern,
1739+ Tuple,
1740+)
1741 from bs4.element import Tag
1742 from bs4.filter import (
1743 AttributeValueMatchRule,
1744@@ -14,6 +20,8 @@ from bs4.filter import (
1745 StringMatchRule,
1746 TagNameMatchRule,
1747 )
1748+from bs4._typing import _RawOrProcessedAttributeValues
1749+
1750
1751 class TestElementFilter(SoupTest):
1752
1753@@ -107,7 +115,10 @@ class TestElementFilter(SoupTest):
1754
1755 class TestMatchRule(SoupTest):
1756
1757- def _tuple(self, rule):
1758+ def _tuple(self, rule:MatchRule) -> Tuple[Optional[str],
1759+ Optional[Pattern[str]],
1760+ Optional[Callable],
1761+ Optional[bool]]:
1762 return (
1763 rule.string,
1764 rule.pattern.pattern if rule.pattern else None,
1765@@ -395,9 +406,10 @@ class TestSoupStrainer(SoupTest):
1766 assert msg == "Ignoring nested list [[...]] to avoid the possibility of infinite recursion."
1767
1768 def tag_matches(
1769- self, strainer, name, attrs=None, string=None, prefix=None,
1770- match_valence=True
1771- ):
1772+ self, strainer:SoupStrainer, name:str,
1773+ attrs:Optional[_RawOrProcessedAttributeValues]=None,
1774+ string:Optional[str]=None, prefix:Optional[str]=None,
1775+ ) -> bool:
1776 # Create a Tag with the given prefix, name and attributes,
1777 # then make sure that strainer.matches_tag and allow_tag_creation
1778 # both approve it.
1779diff --git a/bs4/tests/test_fuzz.py b/bs4/tests/test_fuzz.py
1780index f29802d..579686d 100644
1781--- a/bs4/tests/test_fuzz.py
1782+++ b/bs4/tests/test_fuzz.py
1783@@ -38,7 +38,7 @@ class TestFuzz(object):
1784 # multiple copies of the code must be kept around to run against
1785 # older tests. I'm not sure what to do about this, but I may
1786 # retire old tests after a time.
1787- def fuzz_test_with_css(self, filename):
1788+ def fuzz_test_with_css(self, filename:str) -> None:
1789 data = self.__markup(filename)
1790 parsers = ['lxml-xml', 'html5lib', 'html.parser', 'lxml']
1791 try:
1792@@ -168,7 +168,7 @@ class TestFuzz(object):
1793 def test_html5lib_parse_errors(self, filename):
1794 self.fuzz_test_with_css(filename)
1795
1796- def __markup(self, filename):
1797+ def __markup(self, filename:str) -> bytes:
1798 if not filename.endswith(self.TESTCASE_SUFFIX):
1799 filename += self.TESTCASE_SUFFIX
1800 this_dir = os.path.split(__file__)[0]
1801diff --git a/bs4/tests/test_html5lib.py b/bs4/tests/test_html5lib.py
1802index 9f6dfa1..3c34403 100644
1803--- a/bs4/tests/test_html5lib.py
1804+++ b/bs4/tests/test_html5lib.py
1805@@ -8,14 +8,13 @@ from bs4.filter import SoupStrainer
1806 from . import (
1807 HTML5LIB_PRESENT,
1808 HTML5TreeBuilderSmokeTest,
1809- SoupTest,
1810 )
1811
1812 @pytest.mark.skipif(
1813 not HTML5LIB_PRESENT,
1814 reason="html5lib seems not to be present, not testing its tree builder."
1815 )
1816-class TestHTML5LibBuilder(SoupTest, HTML5TreeBuilderSmokeTest):
1817+class TestHTML5LibBuilder(HTML5TreeBuilderSmokeTest):
1818 """See ``HTML5TreeBuilderSmokeTest``."""
1819
1820 @property
1821diff --git a/bs4/tests/test_htmlparser.py b/bs4/tests/test_htmlparser.py
1822index ff0f305..2a13c99 100644
1823--- a/bs4/tests/test_htmlparser.py
1824+++ b/bs4/tests/test_htmlparser.py
1825@@ -10,12 +10,14 @@ from bs4.builder import (
1826 XMLParsedAsHTMLWarning,
1827 )
1828 from bs4.builder._htmlparser import (
1829+ _DuplicateAttributeHandler,
1830 BeautifulSoupHTMLParser,
1831 HTMLParserTreeBuilder,
1832 )
1833+from typing import Any
1834 from . import SoupTest, HTMLTreeBuilderSmokeTest
1835
1836-class TestHTMLParserTreeBuilder(SoupTest, HTMLTreeBuilderSmokeTest):
1837+class TestHTMLParserTreeBuilder(HTMLTreeBuilderSmokeTest):
1838
1839 default_builder = HTMLParserTreeBuilder
1840
1841@@ -95,7 +97,7 @@ class TestHTMLParserTreeBuilder(SoupTest, HTMLTreeBuilderSmokeTest):
1842 assert "id" == soup.a['id']
1843
1844 # You can also get this behavior explicitly.
1845- def assert_attribute(on_duplicate_attribute, expected):
1846+ def assert_attribute(on_duplicate_attribute:_DuplicateAttributeHandler, expected:Any) -> None:
1847 soup = self.soup(
1848 markup, on_duplicate_attribute=on_duplicate_attribute
1849 )
1850diff --git a/bs4/tests/test_lxml.py b/bs4/tests/test_lxml.py
1851index 9fc04e0..dc20501 100644
1852--- a/bs4/tests/test_lxml.py
1853+++ b/bs4/tests/test_lxml.py
1854@@ -26,7 +26,7 @@ from . import (
1855 not LXML_PRESENT,
1856 reason="lxml seems not to be present, not testing its tree builder."
1857 )
1858-class TestLXMLTreeBuilder(SoupTest, HTMLTreeBuilderSmokeTest):
1859+class TestLXMLTreeBuilder(HTMLTreeBuilderSmokeTest):
1860 """See ``HTMLTreeBuilderSmokeTest``."""
1861
1862 @property
1863@@ -88,7 +88,7 @@ class TestLXMLTreeBuilder(SoupTest, HTMLTreeBuilderSmokeTest):
1864 not LXML_PRESENT,
1865 reason="lxml seems not to be present, not testing its XML tree builder."
1866 )
1867-class TestLXMLXMLTreeBuilder(SoupTest, XMLTreeBuilderSmokeTest):
1868+class TestLXMLXMLTreeBuilder(XMLTreeBuilderSmokeTest):
1869 """See ``HTMLTreeBuilderSmokeTest``."""
1870
1871 @property
1872diff --git a/bs4/tests/test_soup.py b/bs4/tests/test_soup.py
1873index c95f380..61d0235 100644
1874--- a/bs4/tests/test_soup.py
1875+++ b/bs4/tests/test_soup.py
1876@@ -8,6 +8,7 @@ import pickle
1877 import pytest
1878 import sys
1879 import tempfile
1880+from typing import Iterable
1881
1882 from bs4 import (
1883 BeautifulSoup,
1884@@ -260,14 +261,15 @@ class TestWarnings(SoupTest):
1885 # that the code that triggered the warning is in the same
1886 # file as the test.
1887
1888- def _assert_warning(self, warnings, cls):
1889+ def _assert_warning(
1890+ self, warnings:Iterable[Warning], cls:type[Warning]) -> Warning:
1891 for w in warnings:
1892 if isinstance(w.message, cls):
1893 assert w.filename == __file__
1894 return w
1895 raise Exception("%s warning not found in %r" % (cls, warnings))
1896
1897- def _assert_no_parser_specified(self, w):
1898+ def _assert_no_parser_specified(self, w:Warning) -> None:
1899 warning = self._assert_warning(w, GuessedAtParserWarning)
1900 message = str(warning.message)
1901 assert message.startswith(BeautifulSoup.NO_PARSER_SPECIFIED_WARNING[:60])
1902diff --git a/bs4/tests/test_tree.py b/bs4/tests/test_tree.py
1903index 43afb29..fe56e6b 100644
1904--- a/bs4/tests/test_tree.py
1905+++ b/bs4/tests/test_tree.py
1906@@ -135,7 +135,7 @@ class TestFindAllBasicNamespaces(SoupTest):
1907 class TestFindAllByName(SoupTest):
1908 """Test ways of finding tags by tag name."""
1909
1910- def setup_method(self):
1911+ def setup_method(self) -> None:
1912 self.tree = self.soup("""<a>First tag.</a>
1913 <b>Second tag.</b>
1914 <c>Third <a>Nested tag.</a> tag.</c>""")
1915@@ -459,7 +459,7 @@ class TestIndex(SoupTest):
1916 class TestParentOperations(SoupTest):
1917 """Test navigation and searching through an element's parents."""
1918
1919- def setup_method(self):
1920+ def setup_method(self) -> None:
1921 self.tree = self.soup('''<ul id="empty"></ul>
1922 <ul id="top">
1923 <ul id="middle">
1924@@ -508,14 +508,14 @@ class TestParentOperations(SoupTest):
1925
1926 class ProximityTest(SoupTest):
1927
1928- def setup_method(self):
1929+ def setup_method(self) -> None:
1930 self.tree = self.soup(
1931 '<html id="start"><head></head><body><b id="1">One</b><b id="2">Two</b><b id="3">Three</b></body></html>')
1932
1933
1934 class TestNextOperations(ProximityTest):
1935
1936- def setup_method(self):
1937+ def setup_method(self) -> None:
1938 super(TestNextOperations, self).setup_method()
1939 self.start = self.tree.b
1940
1941@@ -555,7 +555,7 @@ class TestNextOperations(ProximityTest):
1942
1943 class TestPreviousOperations(ProximityTest):
1944
1945- def setup_method(self):
1946+ def setup_method(self) -> None:
1947 super(TestPreviousOperations, self).setup_method()
1948 self.end = self.tree.find(string="Three")
1949
1950@@ -604,7 +604,7 @@ class TestPreviousOperations(ProximityTest):
1951
1952 class SiblingTest(SoupTest):
1953
1954- def setup_method(self):
1955+ def setup_method(self) -> None:
1956 markup = '''<html>
1957 <span id="1">
1958 <span id="1.1"></span>
1959@@ -625,7 +625,7 @@ class SiblingTest(SoupTest):
1960
1961 class TestNextSibling(SiblingTest):
1962
1963- def setup_method(self):
1964+ def setup_method(self) -> None:
1965 super(TestNextSibling, self).setup_method()
1966 self.start = self.tree.find(id="1")
1967
1968@@ -670,7 +670,7 @@ class TestNextSibling(SiblingTest):
1969
1970 class TestPreviousSibling(SiblingTest):
1971
1972- def setup_method(self):
1973+ def setup_method(self) -> None:
1974 super(TestPreviousSibling, self).setup_method()
1975 self.end = self.tree.find(id="4")
1976
1977diff --git a/doc/index.rst b/doc/index.rst
1978index a414830..0ff1fb2 100755
1979--- a/doc/index.rst
1980+++ b/doc/index.rst
1981@@ -3029,38 +3029,64 @@ Advanced search techniques
1982
1983 Almost everyone who uses Beautiful Soup to extract information from a
1984 document can get what they need using the methods described in
1985-`Searching the tree`_. However, there's a lower-level interface--the
1986-:py:class:`ElementSelector` class-- which lets you define any matching
1987+`Searching the tree`_. However, there is a lower-level interface--the
1988+:py:class:`ElementFilter` class--that lets you define any matching
1989 behavior whatsoever.
1990
1991-To use :py:class:`ElementSelector`, define a function that takes a
1992-:py:class:`PageElement` object (that is, it might be either a
1993-:py:class:`Tag` or a :py:class`NavigableString`) and returns ``True``
1994+To use :py:class:`ElementFilter`, define a function that takes a
1995+:py:class:`PageElement` object (which can be either a :py:class:`Tag`
1996+*or* :py:class:`NavigableString` object) and returns ``True``
1997 (if the element matches your custom criteria) or ``False`` (if it
1998-doesn't)::
1999+does not)::
2000
2001- [example goes here]
2002+ def _match_non_whitespace(pe):
2003+ """
2004+        Return True for:
2005+ * all Tag objects
2006+ * NavigableString objects that contain non-whitespace text
2007+ """
2008+ return (
2009+ isinstance(pe, Tag) or
2010+ (isinstance(pe, NavigableString) and
2011+ pe.text and not pe.text.isspace()))
2012
2013-Then, pass the function into an :py:class:`ElementSelector`::
2014+Then, construct an :py:class:`ElementFilter` that uses your function::
2015
2016- from bs4.select import ElementSelector
2017- selector = ElementSelector(f)
2018+ from bs4.filter import ElementFilter
2019+ skip_whitespace = ElementFilter(match_function=_match_non_whitespace)
2020
2021-You can then pass the :py:class:`ElementSelector` object as the first
2022+You can now pass this :py:class:`ElementFilter` object as the first
2023 argument to any of the `Searching the tree`_ methods::
2024
2025- [examples go here]
2026+ from bs4 import BeautifulSoup
2027+ html_doc = """
2028+ <p>
2029+ <b>bold</b>
2030+ <i>italic</i>
2031+ and
2032+ <u>underline</u>
2033+ </p>
2034+ """
2035+ soup = BeautifulSoup(html_doc, 'lxml')
2036
2037-Every potential match will be run through your function, and the only
2038-:py:class:`PageElement` objects returned will be the one where your
2039+ soup.find('p').find_all(skip_whitespace, recursive=False)
2040+ # [<b>bold</b>, <i>italic</i>, '\n and\n ', <u>underline</u>]
2041+
2042+Every :py:class:`PageElement` encountered will be evaluated by your
2043+function, and the objects returned will be only the ones where your
2044 function returned ``True``.
2045
2046-Note that this is different from simply passing `a function`_ as the
2047-first argument to one of the search methods. That's an easy way to
2048-find a tag, but _only_ tags will be considered. With an
2049-:py:class:`ElementSelector` you can write a single function that makes
2050-decisions about both tags and strings.
2051-
2052+To summarize the function-based matching behaviors:
2053+
2054+* A function passed as the first argument to a search method
2055+ (or equivalently, using the ``name`` argument) considers only
2056+ :py:class:`Tag` objects.
2057+* A function passed to a search method using the ``string`` argument
2058+ considers only :py:class:`NavigableString` objects.
2059+* A function wrapped in an :py:class:`ElementFilter` object and passed to
2060+  a search method considers both :py:class:`Tag` and
2061+  :py:class:`NavigableString` objects.
2062+
2063
2064 Advanced parser customization
2065 =============================
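The doc changes above summarize three function-based matching behaviors. The first two already exist in released Beautiful Soup and can be sketched as follows (the HTML fragment and lambda names are illustrative; the third behavior, passing an :py:class:`ElementFilter`, is the new API added by this branch and is shown in the doc example itself):

```python
from bs4 import BeautifulSoup

html = "<p><b>bold</b> and <i>italic</i></p>"
soup = BeautifulSoup(html, "html.parser")

# 1. A function passed as the first (name) argument is called
#    only on Tag objects; strings are never considered.
tags = soup.find_all(lambda tag: tag.name in ("b", "i"))
print([t.name for t in tags])  # ['b', 'i']

# 2. A function passed via the string= argument is called
#    only on NavigableString objects; tags are never considered.
strings = soup.find_all(string=lambda s: "and" in s)
print(strings)  # [' and ']
```

An `ElementFilter` built from a single match function unifies these two cases, letting one function decide about both tags and strings in a single traversal.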

Subscribers

People subscribed via source and target branches