Merge ~rodrigo-zaiden/ubuntu-security-tools:backup-wiki into ubuntu-security-tools:master

Proposed by Rodrigo Figueiredo Zaiden
Status: Needs review
Proposed branch: ~rodrigo-zaiden/ubuntu-security-tools:backup-wiki
Merge into: ubuntu-security-tools:master
Diff against target: 28 lines (+7/-2)
1 file modified
utilities/backup_ubuntu_wiki.py (+7/-2)
Reviewer         Review Type    Date Requested    Status
Steve Beattie                                     Pending
Review via email: mp+413631@code.launchpad.net

Commit message

backupwiki: Add Include Macro pattern to search for included pages

When a wiki page uses the Include macro, it embeds another page
in its content. We would like to check whether the included page
should be added to the backup list.

Description of the change

Add the pattern '<<Include\((\w+(\/\w+){0,}).*\)>>', which matches the Include macro and captures the included page name as a candidate to be backed up (example: SecurityTeam/Page1/Page2/.../PageN).
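
For illustration, here is a minimal sketch of how the pattern behaves on its own. The helper name `included_pages` and the sample wiki text are hypothetical and not part of the branch; only the regular expression is taken from the description above.

```python
import re

# Pattern quoted from the description (Include macro alternative only).
INCLUDE_RE = re.compile(r'<<Include\((\w+(\/\w+){0,}).*\)>>')


def included_pages(text):
    """Return the page names captured from Include macros in text.

    Hypothetical helper, used only to exercise the pattern.
    """
    return [m.group(1) for m in INCLUDE_RE.finditer(text)]


# Hypothetical wiki source: a bare include, an include with extra
# arguments, and two includes sharing one line.
sample = (
    "<<Include(SecurityTeam/FAQ)>>\n"
    '<<Include(SecurityTeam/Roadmap/Archive, "Archive", 2)>>\n'
    "<<Include(A/B)>> and <<Include(C/D)>>\n"
)

print(included_pages(sample))
# -> ['SecurityTeam/FAQ', 'SecurityTeam/Roadmap/Archive', 'A/B']
```

Note that `.*` is greedy, so when two Include macros share a line only the first page name is captured (`C/D` is skipped in the sample). If that case matters for the backup, a non-greedy `.*?` would be one option to consider.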

Unmerged commits

f101b55... by Rodrigo Figueiredo Zaiden

backupwiki: Add Include Macro pattern to search for included pages

When a wiki page uses the Include macro, it embeds another page
in its content. We would like to check whether the included page
should be added to the backup list.

Preview Diff

diff --git a/utilities/backup_ubuntu_wiki.py b/utilities/backup_ubuntu_wiki.py
index 558c173..c5c8403 100755
--- a/utilities/backup_ubuntu_wiki.py
+++ b/utilities/backup_ubuntu_wiki.py
@@ -17,7 +17,7 @@ from urllib.parse import urlparse, urljoin
 import requests
 
 host = 'wiki.ubuntu.com'
-backup_dir = "/home/steve/backups/" + host
+backup_dir = os.getenv("HOME") + "/backups/" + host
 
 subtrees = [
     'Security', 'SecurityTeam', 'AppArmor', 'DebuggingSecurity',
@@ -208,8 +208,13 @@ def main():
 
         urls_saved.add(next_page)
 
-        for m in re.finditer('\[\[\s*([\w/.:]+)\s*(\|\s*[^\]]*\s*)?\]\]', contents):
+        for m in re.finditer('\[\[\s*([\w/.:]+)\s*(\|\s*[^\]]*\s*)?\]\]|<<Include\((\w+(\/\w+){0,}).*\)>>', contents):
             link = m.group(1)
+            if link == None:
+                link = m.group(3)
+                if config.debug:
+                    print(f'Found Include macro with {link} (referrer: {next_page})', file=sys.stderr)
+
             # Ugh, mediawiki converts everything to have the first
             # letter be uppercase; need to convert references
             #if link[0] in string.lowercase:
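
The group fallback in the second hunk can be sketched in isolation as follows. The combined pattern is copied from the diff, written here as raw strings; the function name `candidate_links`, the `via_include` flag, and the sample page text are assumptions made for this example only. In the combined pattern, group 1 is the `[[...]]` link target and group 3 is the page named in the Include macro, so group 1 is `None` whenever the Include alternative matched.

```python
import re

# Combined pattern from the diff: [[link]] or <<Include(page ...)>>.
LINK_OR_INCLUDE = re.compile(
    r'\[\[\s*([\w/.:]+)\s*(\|\s*[^\]]*\s*)?\]\]'
    r'|<<Include\((\w+(\/\w+){0,}).*\)>>'
)


def candidate_links(contents):
    """Yield (link, via_include) for each link or Include macro found.

    Hypothetical helper mirroring the loop body in the diff.
    """
    for m in LINK_OR_INCLUDE.finditer(contents):
        link = m.group(1)
        via_include = link is None
        if via_include:
            # The Include alternative matched, so the page is in group 3.
            link = m.group(3)
        yield link, via_include


# Hypothetical page source with one wiki link and one Include macro.
page = (
    "See [[SecurityTeam/KnowledgeBase | the KB]].\n"
    "<<Include(SecurityTeam/Header)>>\n"
)

print(list(candidate_links(page)))
# -> [('SecurityTeam/KnowledgeBase', False), ('SecurityTeam/Header', True)]
```

The sketch uses `link is None` rather than `link == None`, which is the usual Python idiom for this check; the behavior is the same for the string-or-`None` values returned by `m.group()`.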
