_findAll() ignores "text" keyword argument

Bug #493722 reported by Darcy Parks
This bug affects 1 person
Affects: Beautiful Soup
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Version: 3.0.8

_findAll has two special cases that avoid creating a SoupStrainer: when name is True, as in findAll(True), and when name is a plain string, as in findAll('tag-name'). Both cases check that "limit", "attrs" and "kwargs" are unspecified, but they don't check "text". So if "limit", "attrs" and "kwargs" are unspecified, "text" is silently ignored and one of the two shortcuts runs instead.
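The behaviour described above can be modelled without the library. This is only a sketch: `picks_path_buggy` is a hypothetical stand-in that copies the two conditions quoted in this report, not the actual body of _findAll, and it uses `str` where the 3.0.8 code uses `basestring`.

```python
def picks_path_buggy(name, attrs=None, text=None, limit=None, **kwargs):
    """Hypothetical model of the two shortcut checks in 3.0.8.

    Returns a label for the branch that would run. Note that neither
    shortcut condition looks at `text`.
    """
    if not limit and name is True and not attrs and not kwargs:
        return "all-tags shortcut"
    elif not limit and isinstance(name, str) and not attrs and not kwargs:
        return "tag-name shortcut"
    return "SoupStrainer"


# text is supplied, but a shortcut is still chosen, so text is ignored.
print(picks_path_buggy("b", text="hello"))
print(picks_path_buggy(True, text="hello"))

# Supplying limit disables both shortcuts, so text would be honoured.
print(picks_path_buggy("b", text="hello", limit=1))
```

In this model the first two calls take a shortcut even though text was passed, which is the symptom reported here.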

To fix:

Line 340 of BeautifulSoup.py:

Change
    elif not limit and name is True and not attrs and not kwargs:
To
    elif not limit and name is True and not attrs and not kwargs \
        and not text:

Similarly, on line 345:

Change
    elif not limit and isinstance(name, basestring) and not attrs \
        and not kwargs:
To
    elif not limit and isinstance(name, basestring) and not attrs \
        and not kwargs and not text:

Aaron DeVore (aaron-devore) wrote :

Good catch! This is a case where the user has made a mistake by using findAll('tag-name', text=...) or findAll(True, text=...). According to the API, text=... should override the tag searching.

I put a fix in a private branch. I also combined some of the checks in the two elif tests.

There was one possible issue with using 'not text': it matches for an empty string (""). That is fixed by using 'text is None' instead.
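The difference between the two guards comes down to how Python treats an empty string in a boolean test. A minimal sketch follows; the helper names `guard_not_text` and `guard_is_none` are made up for illustration and do not appear in BeautifulSoup.py.

```python
def guard_not_text(text):
    # True means "text counts as unspecified, the shortcut may run".
    return not text


def guard_is_none(text):
    # Only a missing argument (None) counts as unspecified.
    return text is None


for value in (None, "", "hello"):
    print(repr(value), guard_not_text(value), guard_is_none(value))
```

For `""` the first guard reports the argument as unspecified while the second does not, so only `text is None` lets an explicit empty string reach the normal text-matching path.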

Leonard Richardson (leonardr) wrote :

Thanks to Aaron's branch, fixed in 3.0.8.1.

Changed in beautifulsoup:
status: New → Fix Released