Merge ~cjwatson/launchpad-buildd:tighten-url-sanitization into launchpad-buildd:master

Proposed by Colin Watson
Status: Merged
Approved by: Colin Watson
Approved revision: 311be87080b813545c9bdffb8aef4d4d2d451092
Merge reported by: Otto Co-Pilot
Merged at revision: not available
Proposed branch: ~cjwatson/launchpad-buildd:tighten-url-sanitization
Merge into: launchpad-buildd:master
Diff against target: 86 lines (+48/-1)
3 files modified
debian/changelog (+1/-0)
lpbuildd/builder.py (+1/-1)
lpbuildd/tests/test_builder.py (+46/-0)
Reviewer Review Type Date Requested Status
Andrey Fedoseev (community) Approve
Review via email: mp+427515@code.launchpad.net

Commit message

Make URL sanitization a little less greedy

Description of the change

It could previously be confused by multiple URLs on the same line, resulting in very confusing output in build logs. This change is safe because RFC 1738 says:

  Within the user and password field, any ":", "@", or "/" must be encoded.

To post a comment you must log in.
Revision history for this message
Andrey Fedoseev (andrey-fedoseev) wrote :

Could you add a test for the case when username/password include the special characters (":", "@", or "/")?

review: Approve
Revision history for this message
Colin Watson (cjwatson) wrote :

Can you elaborate on why? I'm not sure it makes sense to test that as you describe, because such cases would be out of spec. I'm sure the regex parsing will do something with them, but I'm not sure that what it does will be very interesting.

Revision history for this message
Andrey Fedoseev (andrey-fedoseev) wrote :

I might have confused "encoded" with "escaped". I imagined they would become something like "\@" which may break the regex potentially, but it looks like this won't be the case, so never mind.

Revision history for this message
Colin Watson (cjwatson) wrote :

Ah yeah, it means URL-encoding. Cool then.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/debian/changelog b/debian/changelog
2index 7824911..c3e6e11 100644
3--- a/debian/changelog
4+++ b/debian/changelog
5@@ -1,6 +1,7 @@
6 launchpad-buildd (217) UNRELEASED; urgency=medium
7
8 * Improve deployment documentation.
9+ * Make URL sanitization a little less greedy.
10
11 -- Colin Watson <cjwatson@ubuntu.com> Mon, 18 Jul 2022 15:07:23 +0100
12
13diff --git a/lpbuildd/builder.py b/lpbuildd/builder.py
14index 61357cf..336594c 100644
15--- a/lpbuildd/builder.py
16+++ b/lpbuildd/builder.py
17@@ -47,7 +47,7 @@ def _sanitizeURLs(bytes_seq):
18 """
19 # This regular expression will be used to remove authentication
20 # credentials from URLs.
21- password_re = re.compile(br'://([^:]*:[^@]+@)(\S+)')
22+ password_re = re.compile(br'://([^:@/]*:[^:@/]+@)(\S+)')
23 # Builder proxy passwords are UUIDs.
24 proxy_auth_re = re.compile(br',proxyauth=[^:]+:[A-Za-z0-9-]+')
25
26diff --git a/lpbuildd/tests/test_builder.py b/lpbuildd/tests/test_builder.py
27index db13dd8..1f08126 100644
28--- a/lpbuildd/tests/test_builder.py
29+++ b/lpbuildd/tests/test_builder.py
30@@ -23,10 +23,56 @@ from twisted.logger import (
31 from lpbuildd.builder import (
32 Builder,
33 BuildManager,
34+ _sanitizeURLs,
35 )
36 from lpbuildd.tests.fakebuilder import FakeConfig
37
38
39+class TestSanitizeURLs(TestCase):
40+ """Unit-test URL sanitization.
41+
42+ `lpbuildd.tests.test_buildd.LaunchpadBuilddTests` also covers some of
43+ this, but at a higher level.
44+ """
45+
46+ def test_non_urls(self):
47+ lines = [b"not a URL", b"still not a URL"]
48+ self.assertEqual(lines, list(_sanitizeURLs(lines)))
49+
50+ def test_url_without_credentials(self):
51+ lines = [b"Get:1 http://ftpmaster.internal focal InRelease"]
52+ self.assertEqual(lines, list(_sanitizeURLs(lines)))
53+
54+ def test_url_with_credentials(self):
55+ lines = [
56+ b"Get:1 http://buildd:secret@ftpmaster.internal focal InRelease",
57+ ]
58+ expected_lines = [b"Get:1 http://ftpmaster.internal focal InRelease"]
59+ self.assertEqual(expected_lines, list(_sanitizeURLs(lines)))
60+
61+ def test_multiple_urls(self):
62+ lines = [
63+ b"http_proxy=http://squid.internal:3128/ "
64+ b"GOPROXY=http://user:password@example.com/goproxy",
65+ ]
66+ expected_lines = [
67+ b"http_proxy=http://squid.internal:3128/ "
68+ b"GOPROXY=http://example.com/goproxy",
69+ ]
70+ self.assertEqual(expected_lines, list(_sanitizeURLs(lines)))
71+
72+ def test_proxyauth(self):
73+ lines = [
74+ b"socat STDIO PROXY:builder-proxy.launchpad.dev:github.com:443,"
75+ b"proxyport=3128,proxyauth=user:blah",
76+ ]
77+ expected_lines = [
78+ b"socat STDIO PROXY:builder-proxy.launchpad.dev:github.com:443,"
79+ b"proxyport=3128",
80+ ]
81+ self.assertEqual(expected_lines, list(_sanitizeURLs(lines)))
82+
83+
84 class TestBuildManager(TestCase):
85
86 run_tests_with = AsynchronousDeferredRunTest.make_factory(timeout=5)

Subscribers

People subscribed via source and target branches