Add a clone method to Tag and NavigableString
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | Fix Released | Undecided | Unassigned |
Bug Description
BeautifulSoup does not support cloning an element (with all contained child page elements); use cases include creating multiple copies of a sub-tree to generate a larger document quickly.
You cannot use `copy.deepcopy()` for this task, because elements maintain references to their tree parent and siblings; a deep copy would needlessly copy those elements as well. NavigableString also implements an incorrect `__copy__` method that returns `self` without taking the mutable parent and sibling references into account, which makes this approach unworkable.
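The `copy.deepcopy()` problem can be reproduced without Beautiful Soup. The following is a minimal sketch, assuming a hypothetical `Node` class that stands in for a parsed element; it is not Beautiful Soup's actual implementation, only an illustration of what happens when a node holds references to its parent and siblings.

```python
import copy


class Node:
    """Hypothetical stand-in for a tree element; not a Beautiful Soup class."""

    def __init__(self, name):
        self.name = name
        self.parent = None
        self.next_sibling = None
        self.previous_sibling = None
        self.children = []

    def append(self, child):
        # Link the child to its parent and to the previous last child.
        child.parent = self
        if self.children:
            last = self.children[-1]
            last.next_sibling = child
            child.previous_sibling = last
        self.children.append(child)
        return child


root = Node("html")
body = root.append(Node("body"))
first = body.append(Node("p"))
second = body.append(Node("p"))

# deepcopy follows .parent and .next_sibling, so copying one <p>
# also duplicates <body>, <html>, and the sibling <p>.
dup = copy.deepcopy(first)
copied_parent = dup.parent
copied_root = dup.parent.parent
copied_sibling = dup.next_sibling
```

Here `dup` is still attached to a tree, but to a second, duplicated tree rather than to the original one, which is the needless copying described above.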
I've attached a patch that adds explicit cloning support; new `Tag.clone()` and `NavigableStrin
This patch assumes that the patch in bug #1307471 has been applied; in my testing (with lxml as the parser), a freshly created BeautifulSoup tree does not have `.builder` set on the created elements.
This code was initially written up as an answer to a question on Stack Overflow, see http://
Changed in beautifulsoup:
status: Fix Committed → Fix Released
I've adapted this patch to change the behavior of __copy__.
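The adapted patch itself is not shown here. As a rough sketch of the idea, assuming the changed `__copy__` returns a copy of the subtree that is detached from the original tree, the behavior could look like the following. The `Node` class and its attributes are hypothetical and are not Beautiful Soup's real classes.

```python
import copy


class Node:
    """Hypothetical stand-in for a tree element; not a Beautiful Soup class."""

    def __init__(self, name, text=""):
        self.name = name
        self.text = text
        self.parent = None
        self.children = []

    def append(self, child):
        child.parent = self
        self.children.append(child)
        return child

    def __copy__(self):
        # Build a fresh node with no parent, then copy the children
        # recursively so the whole subtree is duplicated and detached.
        clone = type(self)(self.name, self.text)
        for child in self.children:
            clone.append(copy.copy(child))
        return clone


root = Node("body")
div = root.append(Node("div"))
div.append(Node("p", "hello"))

# copy.copy() now yields an independent subtree with no parent.
dup = copy.copy(div)
```

With this approach the copy carries its own children but no link back to the original parent, so it can be inserted elsewhere without disturbing the source tree.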