Merge lp:~jelmer/bzr/safe-open into lp:bzr

Proposed by Jelmer Vernooij
Status: Superseded
Proposed branch: lp:~jelmer/bzr/safe-open
Merge into: lp:bzr
Prerequisite: lp:~jelmer/bzr/simplify-server-probing
Diff against target: 709 lines (+675/-0)
4 files modified
bzrlib/tests/__init__.py (+1/-0)
bzrlib/tests/test_url_policy_open.py (+357/-0)
bzrlib/url_policy_open.py (+314/-0)
doc/en/release-notes/bzr-2.5.txt (+3/-0)
To merge this branch: bzr merge lp:~jelmer/bzr/safe-open
Reviewer Review Type Date Requested Status
Martin Packman (community) Needs Fixing
Review via email: mp+86645@code.launchpad.net

This proposal supersedes a proposal from 2011-12-22.

This proposal has been superseded by a proposal from 2012-01-19.

Description of the change

Import the safe_open code from launchpad.

This code allows the user to open branches while checking all URLs that are
opened with a policy object. This code is generic, and could be useful for
other users than Launchpad too.

To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) wrote : Posted in a previous version of this proposal

Since this is a fairly isolated piece of code I've put the errors in bzrlib.safe_open. bzrlib.errors is certainly also an option, but that module is already fairly large, and it doesn't seem likely other code will use them.

Revision history for this message
Martin Packman (gz) wrote :

Moving this up to the bzrlib level seems reasonable, but it really needs a better name. The problem with 'safe' is you need context to understand what it actually means and the term is too widely used for completely different things.

+"""Safe branch opening."""

Okay, something to do with branches.

+ def check_one_url(self, url):
+ """Check the safety of the source URL.

Something like same domain restrictions?

+class SafeBranchOpener(object):
+ """Safe branch opener.

Hm.

+def safe_open(allowed_scheme, url):
+ """Open the branch at `url`, only accessing URLs on `allowed_scheme`.

Aha! A useful docstring on this function... which actually does something much more specific than the name or module content implies.

So this is really about opening branches with certain URL based checks added. The naming should really reflect that limited scope.

+class BlacklistPolicy(BranchOpenPolicy):
+ """Branch policy that forbids certain URLs."""
+
+ def __init__(self, should_follow_references, unsafe_urls=None):
+ if unsafe_urls is None:
+ unsafe_urls = set()
+ self._unsafe_urls = unsafe_urls

This is unsafe, bzrlib doesn't normalise URLs enough to ensure there's only one canonical form. Also, forgetting a param meaning "no restrictions" isn't a great interface. I'd just add an underscore to the class name for now to discourage use.

There's some more to nitpick over with the code, but it doesn't seem that important.

review: Needs Fixing
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

Am 24/12/11 13:26, schrieb Martin Packman:
> Review: Needs Fixing
>
> Moving this up to the bzrlib level seems reasonable, but it really needs a better name. The problem with 'safe' is you need context to understand what it actually means and the term is too widely used for completely different things.
Hmm, interesting point. I took the name from launchpad, but I guess I've
just gotten used to it meaning this... What about "bzrlib.hookable_open"
or "bzrlib.policy_open" ?

> +def safe_open(allowed_scheme, url):
> + """Open the branch at `url`, only accessing URLs on `allowed_scheme`.
>
> Aha! A useful docstring on this function... which actually does something much more specific than the name or module content implies.
>
>
> So this is really about opening branches with certain URL based checks added. The naming should really reflect that limited scope.

I'm not sure I follow. Do you mean e.g. renaming it to e.g.
open_with_specific_scheme ?
>
> +class BlacklistPolicy(BranchOpenPolicy):
> + """Branch policy that forbids certain URLs."""
> +
> + def __init__(self, should_follow_references, unsafe_urls=None):
> + if unsafe_urls is None:
> + unsafe_urls = set()
> + self._unsafe_urls = unsafe_urls
>
> This is unsafe, bzrlib doesn't normalise URLs enough to ensure there's only one canonical form. Also, forgetting a param meaning "no restrictions" isn't a great interface. I'd just add an underscore to the class name for now to discourage use.
>
> There's some more to nitpick over with the code, but it doesn't seem that important.
This is only really used in the Launchpad testsuite. I guess I could
just leave it over in the Launchpad code.

Cheers,

Jelmer

Revision history for this message
Martin Packman (gz) wrote :

> Hmm, interesting point. I took the name from launchpad, but I guess I've
> just gotten used to it meaning this... What about "bzrlib.hookable_open"
> or "bzrlib.policy_open" ?

Something like that, but unless there's any thought to extending the api beyond just looking at urls, I think that should be reflected in the name. I'm not bursting with ideas though... limited_location_open? restrict_url_open? That this is only branches not repos or trees is also relevant. Perhaps we can bikeshed on IRC?

With a module rename, I'd change SafeBranchOpener to just BranchOpener and fix all the docstrings with 'safe' in them to say something meaningful as well.

> I'm not sure I follow. Do you mean e.g. renaming it to e.g.
> open_with_specific_scheme ?

Yup. Or open_only_scheme, open_branch_with_scheme, something like that.

> This is only really used in the Launchpad testsuite. I guess I could
> just leave it over in the Launchpad code.

Yeah, I thought it probably wasn't in actual use. I don't mind bringing it across anyway, with either an underscore or just a docstring stating that limitation.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/tests/__init__.py'
2--- bzrlib/tests/__init__.py 2011-12-24 10:10:59 +0000
3+++ bzrlib/tests/__init__.py 2012-01-19 01:06:23 +0000
4@@ -4056,6 +4056,7 @@
5 'bzrlib.tests.test_revisiontree',
6 'bzrlib.tests.test_rio',
7 'bzrlib.tests.test_rules',
8+ 'bzrlib.tests.test_url_policy_open',
9 'bzrlib.tests.test_sampler',
10 'bzrlib.tests.test_scenarios',
11 'bzrlib.tests.test_script',
12
13=== added file 'bzrlib/tests/test_url_policy_open.py'
14--- bzrlib/tests/test_url_policy_open.py 1970-01-01 00:00:00 +0000
15+++ bzrlib/tests/test_url_policy_open.py 2012-01-19 01:06:23 +0000
16@@ -0,0 +1,357 @@
17+# Copyright (C) 2011 Canonical Ltd
18+#
19+# This program is free software; you can redistribute it and/or modify
20+# it under the terms of the GNU General Public License as published by
21+# the Free Software Foundation; either version 2 of the License, or
22+# (at your option) any later version.
23+#
24+# This program is distributed in the hope that it will be useful,
25+# but WITHOUT ANY WARRANTY; without even the implied warranty of
26+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
27+# GNU General Public License for more details.
28+#
29+# You should have received a copy of the GNU General Public License
30+# along with this program; if not, write to the Free Software
31+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
32+
33+"""Tests for the branch open with specific URL policy code."""
34+
35+from bzrlib import urlutils
36+from bzrlib.branch import (
37+ Branch,
38+ BranchReferenceFormat,
39+ )
40+from bzrlib.bzrdir import (
41+ BzrDir,
42+ BzrProber,
43+ )
44+from bzrlib.controldir import ControlDirFormat
45+from bzrlib.errors import NotBranchError
46+from bzrlib.url_policy_open import (
47+ BadUrl,
48+ _BlacklistPolicy,
49+ BranchLoopError,
50+ BranchReferenceForbidden,
51+ open_only_scheme,
52+ BranchOpener,
53+ WhitelistPolicy,
54+ )
55+from bzrlib.tests import (
56+ TestCase,
57+ TestCaseWithTransport,
58+ )
59+from bzrlib.transport import chroot
60+
61+
62+class TestBranchOpenerCheckAndFollowBranchReference(TestCase):
63+ """Unit tests for `BranchOpener.check_and_follow_branch_reference`."""
64+
65+ def setUp(self):
66+ super(TestBranchOpenerCheckAndFollowBranchReference, self).setUp()
67+ BranchOpener.install_hook()
68+
69+ class StubbedBranchOpener(BranchOpener):
70+ """BranchOpener that provides canned answers.
71+
72+ We implement the methods we need to to be able to control all the
73+ inputs to the `follow_reference` method, which is what is
74+ being tested in this class.
75+ """
76+
77+ def __init__(self, references, policy):
78+ parent_cls = TestBranchOpenerCheckAndFollowBranchReference
79+ super(parent_cls.StubbedBranchOpener, self).__init__(policy)
80+ self._reference_values = {}
81+ for i in range(len(references) - 1):
82+ self._reference_values[references[i]] = references[i + 1]
83+ self.follow_reference_calls = []
84+
85+ def follow_reference(self, url):
86+ self.follow_reference_calls.append(url)
87+ return self._reference_values[url]
88+
89+ def make_branch_opener(self, should_follow_references, references,
90+ unsafe_urls=None):
91+ policy = _BlacklistPolicy(should_follow_references, unsafe_urls)
92+ opener = self.StubbedBranchOpener(references, policy)
93+ return opener
94+
95+ def test_check_initial_url(self):
96+ # check_and_follow_branch_reference rejects all URLs that are not
97+ # allowed.
98+ opener = self.make_branch_opener(None, [], set(['a']))
99+ self.assertRaises(
100+ BadUrl, opener.check_and_follow_branch_reference, 'a')
101+
102+ def test_not_reference(self):
103+ # When branch references are forbidden, check_and_follow_branch_reference
104+ # does not raise on non-references.
105+ opener = self.make_branch_opener(False, ['a', None])
106+ self.assertEquals(
107+ 'a', opener.check_and_follow_branch_reference('a'))
108+ self.assertEquals(['a'], opener.follow_reference_calls)
109+
110+ def test_branch_reference_forbidden(self):
111+ # check_and_follow_branch_reference raises BranchReferenceForbidden if
112+ # branch references are forbidden and the source URL points to a
113+ # branch reference.
114+ opener = self.make_branch_opener(False, ['a', 'b'])
115+ self.assertRaises(
116+ BranchReferenceForbidden,
117+ opener.check_and_follow_branch_reference, 'a')
118+ self.assertEquals(['a'], opener.follow_reference_calls)
119+
120+ def test_allowed_reference(self):
121+ # check_and_follow_branch_reference does not raise if following references
122+ # is allowed and the source URL points to a branch reference to a
123+ # permitted location.
124+ opener = self.make_branch_opener(True, ['a', 'b', None])
125+ self.assertEquals(
126+ 'b', opener.check_and_follow_branch_reference('a'))
127+ self.assertEquals(['a', 'b'], opener.follow_reference_calls)
128+
129+ def test_check_referenced_urls(self):
130+ # check_and_follow_branch_reference checks if the URL a reference points
131+ # to is safe.
132+ opener = self.make_branch_opener(
133+ True, ['a', 'b', None], unsafe_urls=set('b'))
134+ self.assertRaises(
135+ BadUrl, opener.check_and_follow_branch_reference, 'a')
136+ self.assertEquals(['a'], opener.follow_reference_calls)
137+
138+ def test_self_referencing_branch(self):
139+ # check_and_follow_branch_reference raises BranchReferenceLoopError if
140+ # following references is allowed and the source url points to a
141+ # self-referencing branch reference.
142+ opener = self.make_branch_opener(True, ['a', 'a'])
143+ self.assertRaises(
144+ BranchLoopError, opener.check_and_follow_branch_reference, 'a')
145+ self.assertEquals(['a'], opener.follow_reference_calls)
146+
147+ def test_branch_reference_loop(self):
148+ # check_and_follow_branch_reference raises BranchReferenceLoopError if
149+ # following references is allowed and the source url points to a loop
150+ # of branch references.
151+ references = ['a', 'b', 'a']
152+ opener = self.make_branch_opener(True, references)
153+ self.assertRaises(
154+ BranchLoopError, opener.check_and_follow_branch_reference, 'a')
155+ self.assertEquals(['a', 'b'], opener.follow_reference_calls)
156+
157+
158+class TrackingProber(BzrProber):
159+ """Subclass of BzrProber which tracks URLs it has been asked to open."""
160+
161+ seen_urls = []
162+
163+ @classmethod
164+ def probe_transport(klass, transport):
165+ klass.seen_urls.append(transport.base)
166+ return BzrProber.probe_transport(transport)
167+
168+
169+class TestBranchOpenerStacking(TestCaseWithTransport):
170+
171+ def setUp(self):
172+ super(TestBranchOpenerStacking, self).setUp()
173+ BranchOpener.install_hook()
174+
175+ def make_branch_opener(self, allowed_urls, probers=None):
176+ policy = WhitelistPolicy(True, allowed_urls, True)
177+ return BranchOpener(policy, probers)
178+
179+ def test_probers(self):
180+ # Only the specified probers should be used
181+ b = self.make_branch('branch')
182+ opener = self.make_branch_opener([b.base], probers=[])
183+ self.assertRaises(NotBranchError, opener.open, b.base)
184+ opener = self.make_branch_opener([b.base], probers=[BzrProber])
185+ self.assertEquals(b.base, opener.open(b.base).base)
186+
187+ def test_default_probers(self):
188+ # If no probers are specified to the constructor
189+ # of BranchOpener, then a safe set will be used,
190+ # rather than all probers registered in bzr.
191+ self.addCleanup(ControlDirFormat.unregister_prober, TrackingProber)
192+ ControlDirFormat.register_prober(TrackingProber)
193+ # Open a location without any branches, so that all probers are
194+ # tried.
195+ # First, check that the TrackingProber tracks correctly.
196+ TrackingProber.seen_urls = []
197+ opener = self.make_branch_opener(["."], probers=[TrackingProber])
198+ self.assertRaises(NotBranchError, opener.open, ".")
199+ self.assertEquals(1, len(TrackingProber.seen_urls))
200+ TrackingProber.seen_urls = []
201+ # And make sure it's registered in such a way that BzrDir.open would
202+ # use it.
203+ self.assertRaises(NotBranchError, BzrDir.open, ".")
204+ self.assertEquals(1, len(TrackingProber.seen_urls))
205+
206+ def test_allowed_url(self):
207+ # the opener does not raise an exception for branches stacked on
208+ # branches with allowed URLs.
209+ stacked_on_branch = self.make_branch('base-branch', format='1.6')
210+ stacked_branch = self.make_branch('stacked-branch', format='1.6')
211+ stacked_branch.set_stacked_on_url(stacked_on_branch.base)
212+ opener = self.make_branch_opener(
213+ [stacked_branch.base, stacked_on_branch.base])
214+ # This doesn't raise an exception.
215+ opener.open(stacked_branch.base)
216+
217+ def test_nstackable_repository(self):
218+ # treats branches with UnstackableRepositoryFormats as
219+ # being not stacked.
220+ branch = self.make_branch('unstacked', format='knit')
221+ opener = self.make_branch_opener([branch.base])
222+ # This doesn't raise an exception.
223+ opener.open(branch.base)
224+
225+ def test_allowed_relative_url(self):
226+ # passes on absolute urls to check_one_url, even if the
227+ # value of stacked_on_location in the config is set to a relative URL.
228+ stacked_on_branch = self.make_branch('base-branch', format='1.6')
229+ stacked_branch = self.make_branch('stacked-branch', format='1.6')
230+ stacked_branch.set_stacked_on_url('../base-branch')
231+ opener = self.make_branch_opener(
232+ [stacked_branch.base, stacked_on_branch.base])
233+ # Note that stacked_on_branch.base is not '../base-branch', it's an
234+ # absolute URL.
235+ self.assertNotEqual('../base-branch', stacked_on_branch.base)
236+ # This doesn't raise an exception.
237+ opener.open(stacked_branch.base)
238+
239+ def test_allowed_relative_nested(self):
240+ # Relative URLs are resolved relative to the stacked branch.
241+ self.get_transport().mkdir('subdir')
242+ a = self.make_branch('subdir/a', format='1.6')
243+ b = self.make_branch('b', format='1.6')
244+ b.set_stacked_on_url('../subdir/a')
245+ c = self.make_branch('subdir/c', format='1.6')
246+ c.set_stacked_on_url('../../b')
247+ opener = self.make_branch_opener([c.base, b.base, a.base])
248+ # This doesn't raise an exception.
249+ opener.open(c.base)
250+
251+ def test_forbidden_url(self):
252+ # raises a BadUrl exception if a branch is stacked on a
253+ # branch with a forbidden URL.
254+ stacked_on_branch = self.make_branch('base-branch', format='1.6')
255+ stacked_branch = self.make_branch('stacked-branch', format='1.6')
256+ stacked_branch.set_stacked_on_url(stacked_on_branch.base)
257+ opener = self.make_branch_opener([stacked_branch.base])
258+ self.assertRaises(BadUrl, opener.open, stacked_branch.base)
259+
260+ def test_forbidden_url_nested(self):
261+ # raises a BadUrl exception if a branch is stacked on a
262+ # branch that is in turn stacked on a branch with a forbidden URL.
263+ a = self.make_branch('a', format='1.6')
264+ b = self.make_branch('b', format='1.6')
265+ b.set_stacked_on_url(a.base)
266+ c = self.make_branch('c', format='1.6')
267+ c.set_stacked_on_url(b.base)
268+ opener = self.make_branch_opener([c.base, b.base])
269+ self.assertRaises(BadUrl, opener.open, c.base)
270+
271+ def test_self_stacked_branch(self):
272+ # raises StackingLoopError if a branch is stacked on
273+ # itself. This avoids infinite recursion errors.
274+ a = self.make_branch('a', format='1.6')
275+ # Bazaar 1.17 and up make it harder to create branches like this.
276+ # It's still worth testing that we don't blow up in the face of them,
277+ # so we grovel around a bit to create one anyway.
278+ a.get_config().set_user_option('stacked_on_location', a.base)
279+ opener = self.make_branch_opener([a.base])
280+ self.assertRaises(BranchLoopError, opener.open, a.base)
281+
282+ def test_loop_stacked_branch(self):
283+ # raises StackingLoopError if a branch is stacked in such
284+ # a way so that it is ultimately stacked on itself. e.g. a stacked on
285+ # b stacked on a.
286+ a = self.make_branch('a', format='1.6')
287+ b = self.make_branch('b', format='1.6')
288+ a.set_stacked_on_url(b.base)
289+ b.set_stacked_on_url(a.base)
290+ opener = self.make_branch_opener([a.base, b.base])
291+ self.assertRaises(BranchLoopError, opener.open, a.base)
292+ self.assertRaises(BranchLoopError, opener.open, b.base)
293+
294+ def test_custom_opener(self):
295+ # A custom function for opening a control dir can be specified.
296+ a = self.make_branch('a', format='2a')
297+ b = self.make_branch('b', format='2a')
298+ b.set_stacked_on_url(a.base)
299+
300+ TrackingProber.seen_urls = []
301+ opener = self.make_branch_opener(
302+ [a.base, b.base], probers=[TrackingProber])
303+ opener.open(b.base)
304+ self.assertEquals(
305+ set(TrackingProber.seen_urls), set([b.base, a.base]))
306+
307+ def test_custom_opener_with_branch_reference(self):
308+ # A custom function for opening a control dir can be specified.
309+ a = self.make_branch('a', format='2a')
310+ b_dir = self.make_bzrdir('b')
311+ b = BranchReferenceFormat().initialize(b_dir, target_branch=a)
312+ TrackingProber.seen_urls = []
313+ opener = self.make_branch_opener(
314+ [a.base, b.base], probers=[TrackingProber])
315+ opener.open(b.base)
316+ self.assertEquals(
317+ set(TrackingProber.seen_urls), set([b.base, a.base]))
318+
319+
320+class TestOpenOnlyScheme(TestCaseWithTransport):
321+ """Tests for `open_only_scheme`."""
322+
323+ def setUp(self):
324+ super(TestOpenOnlyScheme, self).setUp()
325+ BranchOpener.install_hook()
326+
327+ def test_hook_does_not_interfere(self):
328+ # The transform_fallback_location hook does not interfere with regular
329+ # stacked branch access outside of open_only_scheme.
330+ self.make_branch('stacked')
331+ self.make_branch('stacked-on')
332+ Branch.open('stacked').set_stacked_on_url('../stacked-on')
333+ Branch.open('stacked')
334+
335+ def get_chrooted_scheme(self, relpath):
336+ """Create a server that is chrooted to `relpath`.
337+
338+ :return: ``(scheme, get_url)`` where ``scheme`` is the scheme of the
339+ chroot server and ``get_url`` returns URLs on said server.
340+ """
341+ transport = self.get_transport(relpath)
342+ chroot_server = chroot.ChrootServer(transport)
343+ chroot_server.start_server()
344+ self.addCleanup(chroot_server.stop_server)
345+
346+ def get_url(relpath):
347+ return chroot_server.get_url() + relpath
348+
349+ return urlutils.URL.from_string(chroot_server.get_url()).scheme, get_url
350+
351+ def test_stacked_within_scheme(self):
352+ # A branch that is stacked on a URL of the same scheme is safe to
353+ # open.
354+ self.get_transport().mkdir('inside')
355+ self.make_branch('inside/stacked')
356+ self.make_branch('inside/stacked-on')
357+ scheme, get_chrooted_url = self.get_chrooted_scheme('inside')
358+ Branch.open(get_chrooted_url('stacked')).set_stacked_on_url(
359+ get_chrooted_url('stacked-on'))
360+ open_only_scheme(scheme, get_chrooted_url('stacked'))
361+
362+ def test_stacked_outside_scheme(self):
363+ # A branch that is stacked on a URL that is not of the same scheme is
364+ # not safe to open.
365+ self.get_transport().mkdir('inside')
366+ self.get_transport().mkdir('outside')
367+ self.make_branch('inside/stacked')
368+ self.make_branch('outside/stacked-on')
369+ scheme, get_chrooted_url = self.get_chrooted_scheme('inside')
370+ Branch.open(get_chrooted_url('stacked')).set_stacked_on_url(
371+ self.get_url('outside/stacked-on'))
372+ self.assertRaises(
373+ BadUrl, open_only_scheme, scheme, get_chrooted_url('stacked'))
374
375=== added file 'bzrlib/url_policy_open.py'
376--- bzrlib/url_policy_open.py 1970-01-01 00:00:00 +0000
377+++ bzrlib/url_policy_open.py 2012-01-19 01:06:23 +0000
378@@ -0,0 +1,314 @@
379+# Copyright (C) 2011 Canonical Ltd
380+#
381+# This program is free software; you can redistribute it and/or modify
382+# it under the terms of the GNU General Public License as published by
383+# the Free Software Foundation; either version 2 of the License, or
384+# (at your option) any later version.
385+#
386+# This program is distributed in the hope that it will be useful,
387+# but WITHOUT ANY WARRANTY; without even the implied warranty of
388+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
389+# GNU General Public License for more details.
390+#
391+# You should have received a copy of the GNU General Public License
392+# along with this program; if not, write to the Free Software
393+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
394+
395+"""Branch opening with URL-based restrictions."""
396+
397+from __future__ import absolute_import
398+
399+import threading
400+
401+from bzrlib import (
402+ errors,
403+ urlutils,
404+ )
405+from bzrlib.branch import Branch
406+from bzrlib.controldir import (
407+ ControlDir,
408+ )
409+
410+
411+class BadUrl(errors.BzrError):
412+
413+ _fmt = "Tried to access a branch from bad URL %(url)s."
414+
415+
416+class BranchReferenceForbidden(errors.BzrError):
417+
418+ _fmt = ("Trying to mirror a branch reference and the branch type "
419+ "does not allow references.")
420+
421+
422+class BranchLoopError(errors.BzrError):
423+ """Encountered a branch cycle.
424+
425+ A URL may point to a branch reference or it may point to a stacked branch.
426+ In either case, it's possible for there to be a cycle in these references,
427+ and this exception is raised when we detect such a cycle.
428+ """
429+
430+ _fmt = "Encountered a branch cycle"""
431+
432+
433+class BranchOpenPolicy(object):
434+ """Policy on how to open branches.
435+
436+ In particular, a policy determines which branches are okay to open by
437+ checking their URLs and deciding whether or not to follow branch
438+ references.
439+ """
440+
441+ def should_follow_references(self):
442+ """Whether we traverse references when mirroring.
443+
444+ Subclasses must override this method.
445+
446+ If we encounter a branch reference and this returns false, an error is
447+ raised.
448+
449+ :returns: A boolean to indicate whether to follow a branch reference.
450+ """
451+ raise NotImplementedError(self.should_follow_references)
452+
453+ def transform_fallback_location(self, branch, url):
454+ """Validate, maybe modify, 'url' to be used as a stacked-on location.
455+
456+ :param branch: The branch that is being opened.
457+ :param url: The URL that the branch provides for its stacked-on
458+ location.
459+ :return: (new_url, check) where 'new_url' is the URL of the branch to
460+ actually open and 'check' is true if 'new_url' needs to be
461+ validated by check_and_follow_branch_reference.
462+ """
463+ raise NotImplementedError(self.transform_fallback_location)
464+
465+ def check_one_url(self, url):
466+ """Check a URL.
467+
468+ Subclasses must override this method.
469+
470+ :param url: The source URL to check.
471+ :raise BadUrl: subclasses are expected to raise this or a subclass
472+ when it finds a URL it deems to be unacceptable.
473+ """
474+ raise NotImplementedError(self.check_one_url)
475+
476+
477+class _BlacklistPolicy(BranchOpenPolicy):
478+ """Branch policy that forbids certain URLs.
479+
480+ This doesn't cope with various alternative spellings of URLs,
481+ with e.g. url encoding. It's mostly useful for tests.
482+ """
483+
484+ def __init__(self, should_follow_references, bad_urls=None):
485+ if bad_urls is None:
486+ bad_urls = set()
487+ self._bad_urls = bad_urls
488+ self._should_follow_references = should_follow_references
489+
490+ def should_follow_references(self):
491+ return self._should_follow_references
492+
493+ def check_one_url(self, url):
494+ if url in self._bad_urls:
495+ raise BadUrl(url)
496+
497+ def transform_fallback_location(self, branch, url):
498+ """See `BranchOpenPolicy.transform_fallback_location`.
499+
500+ This class is not used for testing our smarter stacking features so we
501+ just do the simplest thing: return the URL that would be used anyway
502+ and don't check it.
503+ """
504+ return urlutils.join(branch.base, url), False
505+
506+
507+class AcceptAnythingPolicy(_BlacklistPolicy):
508+ """Accept anything, to make testing easier."""
509+
510+ def __init__(self):
511+ super(AcceptAnythingPolicy, self).__init__(True, set())
512+
513+
514+class WhitelistPolicy(BranchOpenPolicy):
515+ """Branch policy that only allows certain URLs."""
516+
517+ def __init__(self, should_follow_references, allowed_urls=None,
518+ check=False):
519+ if allowed_urls is None:
520+ allowed_urls = []
521+ self.allowed_urls = set(url.rstrip('/') for url in allowed_urls)
522+ self.check = check
523+
524+ def should_follow_references(self):
525+ return self._should_follow_references
526+
527+ def check_one_url(self, url):
528+ if url.rstrip('/') not in self.allowed_urls:
529+ raise BadUrl(url)
530+
531+ def transform_fallback_location(self, branch, url):
532+ """See `BranchOpenPolicy.transform_fallback_location`.
533+
534+ Here we return the URL that would be used anyway and optionally check
535+ it.
536+ """
537+ return urlutils.join(branch.base, url), self.check
538+
539+
540+class SingleSchemePolicy(BranchOpenPolicy):
541+ """Branch open policy that rejects URLs not on the given scheme."""
542+
543+ def __init__(self, allowed_scheme):
544+ self.allowed_scheme = allowed_scheme
545+
546+ def should_follow_references(self):
547+ return True
548+
549+ def transform_fallback_location(self, branch, url):
550+ return urlutils.join(branch.base, url), True
551+
552+ def check_one_url(self, url):
553+ """Check that `url` is okay to open."""
554+ if urlutils.URL.from_string(str(url)).scheme != self.allowed_scheme:
555+ raise BadUrl(url)
556+
557+
558+class BranchOpener(object):
559+ """Branch opener which uses a URL policy.
560+
561+ All locations that are opened (stacked-on branches, references) are
562+ checked against a policy object.
563+
564+ The policy object is expected to have the following methods:
565+ * check_one_url
566+ * should_follow_references
567+ * transform_fallback_location
568+ """
569+
570+ _threading_data = threading.local()
571+
572+ def __init__(self, policy, probers=None):
573+ """Create a new BranchOpener.
574+
575+ :param policy: The opener policy to use.
576+ :param probers: Optional list of probers to allow.
577+ Defaults to local and remote bzr probers.
578+ """
579+ self.policy = policy
580+ self._seen_urls = set()
581+ self.probers = probers
582+
583+ @classmethod
584+ def install_hook(cls):
585+ """Install the ``transform_fallback_location`` hook.
586+
587+ This is done at module import time, but transform_fallback_locationHook
588+ doesn't do anything unless the `_active_openers` threading.Local
589+ object has a 'opener' attribute in this thread.
590+
591+ This is in a module-level function rather than performed at module
592+ level so that it can be called in setUp for testing `BranchOpener`
593+ as bzrlib.tests.TestCase.setUp clears hooks.
594+ """
595+ Branch.hooks.install_named_hook(
596+ 'transform_fallback_location',
597+ cls.transform_fallback_locationHook,
598+ 'BranchOpener.transform_fallback_locationHook')
599+
600+ def check_and_follow_branch_reference(self, url):
601+ """Check URL (and possibly the referenced URL).
602+
603+ This method checks that `url` passes the policy's `check_one_url`
604+ method, and if `url` refers to a branch reference, it checks whether
605+ references are allowed and whether the reference's URL passes muster
606+ also -- recursively, until a real branch is found.
607+
608+ :param url: URL to check
609+ :raise BranchLoopError: If the branch references form a loop.
610+ :raise BranchReferenceForbidden: If this opener forbids branch
611+ references.
612+ """
613+ while True:
614+ if url in self._seen_urls:
615+ raise BranchLoopError()
616+ self._seen_urls.add(url)
617+ self.policy.check_one_url(url)
618+ next_url = self.follow_reference(url)
619+ if next_url is None:
620+ return url
621+ url = next_url
622+ if not self.policy.should_follow_references():
623+ raise BranchReferenceForbidden(url)
624+
625+ @classmethod
626+ def transform_fallback_locationHook(cls, branch, url):
627+ """Installed as the 'transform_fallback_location' Branch hook.
628+
629+ This method calls `transform_fallback_location` on the policy object and
630+ either returns the url it provides or passes it back to
631+ check_and_follow_branch_reference.
632+ """
633+ try:
634+ opener = getattr(cls._threading_data, "opener")
635+ except AttributeError:
636+ return url
637+ new_url, check = opener.policy.transform_fallback_location(branch, url)
638+ if check:
639+ return opener.check_and_follow_branch_reference(new_url)
640+ else:
641+ return new_url
642+
643+ def run_with_transform_fallback_location_hook_installed(
644+ self, callable, *args, **kw):
645+ if (self.transform_fallback_locationHook not in
646+ Branch.hooks['transform_fallback_location']):
647+ raise AssertionError("hook not installed")
648+ self._threading_data.opener = self
649+ try:
650+ return callable(*args, **kw)
651+ finally:
652+ del self._threading_data.opener
653+ # We reset _seen_urls here to avoid multiple calls to open giving
654+ # spurious loop exceptions.
655+ self._seen_urls = set()
656+
657+ def follow_reference(self, url):
658+ """Get the branch-reference value at the specified url.
659+
660+ This exists as a separate method only to be overriden in unit tests.
661+ """
662+ bzrdir = ControlDir.open(url, probers=self.probers)
663+ return bzrdir.get_branch_reference()
664+
665+ def open(self, url):
666+ """Open the Bazaar branch at url, first checking it.
667+
668+ What is acceptable means is defined by the policy's `follow_reference` and
669+ `check_one_url` methods.
670+ """
671+ if type(url) != str:
672+ raise TypeError
673+
674+ url = self.check_and_follow_branch_reference(url)
675+
676+ def open_branch(url):
677+ dir = ControlDir.open(url, probers=self.probers)
678+ return dir.open_branch()
679+ return self.run_with_transform_fallback_location_hook_installed(
680+ open_branch, url)
681+
682+
683+def open_only_scheme(allowed_scheme, url):
684+ """Open the branch at `url`, only accessing URLs on `allowed_scheme`.
685+
686+ :raises BadUrl: An attempt was made to open a URL that was not on
687+ `allowed_scheme`.
688+ """
689+ return BranchOpener(SingleSchemePolicy(allowed_scheme)).open(url)
690+
691+
692+BranchOpener.install_hook()
693
694=== modified file 'doc/en/release-notes/bzr-2.5.txt'
695--- doc/en/release-notes/bzr-2.5.txt 2012-01-18 19:40:10 +0000
696+++ doc/en/release-notes/bzr-2.5.txt 2012-01-19 01:06:23 +0000
697@@ -56,10 +56,13 @@
698 .. Major internal changes, unlikely to be visible to users or plugin
699 developers, but interesting for bzr developers.
700
701+* Add new module ``bzrlib.url_policy_open``. (Jelmer Vernooij, #850843)
702+
703 * ``MutableTree`` has two new hooks ``pre_transform`` and
704 ``post_transform`` that are called for tree transform operations.
705 (Jelmer Vernooij, #912084)
706
707+
708 Testing
709 *******
710