Beautiful Soup

Bug #2034451
Comment #4

Leonard Richardson (leonardr) wrote on 2024-02-04:

Follow-up email from Marc:

I just saw that you released version 4.12.3 a week ago with a fix for the issue.
Unfortunately, it seems it doesn’t quite do it. There is one more `self._warn()` call that
lacks the `stacklevel` argument it needs; in this case the value should be `10`.

```
diff --git a/bs4/builder/__init__.py b/bs4/builder/__init__.py
index ffb31fc..30d7ca1 100644
--- a/bs4/builder/__init__.py
+++ b/bs4/builder/__init__.py
@@ -588,7 +588,7 @@ class DetectsXMLParsedAsHTML(object):
             # We encountered an XML declaration and then a tag other
             # than 'html'. This is a reliable indicator that a
             # non-XHTML document is being parsed as XML.
-            self._warn()
+            self._warn(stacklevel=10)

def register_treebuilders_from(module):
```
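
For anyone reading along, here is a minimal sketch of what `stacklevel` changes, using only the standard `warnings` module. The function names are hypothetical stand-ins, not Beautiful Soup's actual code, and the three-frame depth is only for illustration; the real call chain is presumably much deeper, which would explain a value as large as `10`.

```python
import warnings


def _warn(stacklevel):
    # Hypothetical stand-in for a library-internal helper that emits a warning.
    warnings.warn("document looks like XML", UserWarning, stacklevel=stacklevel)


def library_entry_point(stacklevel):
    # Stand-in for library code sitting between the user and _warn().
    _warn(stacklevel)


def user_code(stacklevel):
    # Stand-in for the application; returns the recorded warning.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        library_entry_point(stacklevel)
    return caught[0]


def blamed_function(stacklevel):
    """Return the name of the function whose line the warning points at."""
    lineno = user_code(stacklevel).lineno
    candidates = (_warn, library_entry_point, user_code)
    # The blamed function is the last one that starts at or before lineno.
    started = [f for f in candidates if f.__code__.co_firstlineno <= lineno]
    return max(started, key=lambda f: f.__code__.co_firstlineno).__name__


for level in (1, 2, 3):
    print(level, blamed_function(level))
```

With `stacklevel=1` the warning is attributed to `_warn` itself, with `2` to `library_entry_point`, and with `3` to `user_code`. Each increment moves the reported location one frame closer to the caller, so a call that omits `stacklevel` points at library internals instead of the user's own line.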

If you want to test it, the library that causes the issue tries to read an XML file with `html.parser`.
Unfortunately, it’s unmaintained so I can’t fix it there, but for the time being it still works.

```py
BeautifulSoup(xml_file.read(), "html.parser")
```

The start of the XML file:
```
<?xml version="1.0" encoding="utf-8"?>
…
```