Code review comment for lp:~mwhudson/launchpad/puller-import-hack

Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

On 15/04/10 03:41, Aaron Bentley wrote:
> Review: Needs Information
> What is the impact of removing the testrunner bzr_imports_root_url? Does this cause it to default to a more suitable value?

Yes, the value in the development config is a file:/// url.

> Instead of remote_bzr_dir.transport.clone('..'), you should be able to use remote_bzr_dir.root_transport.

Ah, cool, thanks.

> Some of the test changes look like drive-bys, e.g. not wanting it to leave directories behind. Is that right?

I don't think I've really changed anything here, just folded the
addCleanup that followed every invocation of makeCleanDirectory into the
method itself.

> Is there a reason to join config.launchpad.bzr_imports_root_url and the branch id with path.join instead of urlutils.join?

No, none. Fixed.

> Did you consider using more specific name than _is_import? e.g. _vfs_initial_clone ? Or what about providing an initial_clone method on the BranchPolicy itself?

I thought about that name for about a microsecond, I have to admit.

Making it a method on the policy makes sense, partly because then it can
be tested separately...

> What about using sets to represent files_before and files_after? That's closer to the abstract representation, because order isn't important, and entries can only appear once. I don't really understand the purpose of this loop, either.

Yes, good point.

The point of the loop is to cope with the not-quite-entirely-theoretical
possibility that the branch being copied might be being written to as we
grab it. I think it can be written a bit more clearly now you mention
it, take a look.

> It might be nice to turn _runWithTransformFallbackLocationHookInstalled into a ContextManager, but that's not necessary for this branch.

Please no :-)

I think I have a strategy for getting rid of using the transform
fallback hook entirely now bzr has an ignore_fallbacks argument for
branch opening, but I'm more likely to do that in my no-hosted-area work
than here :)

Interdiff attached, although just looking at the preview diff again is
likely just as clear.

Cheers,
mwh

1=== modified file 'lib/lp/codehosting/puller/tests/test_acceptance.py'
2--- lib/lp/codehosting/puller/tests/test_acceptance.py 2010-04-14 03:07:42 +0000
3+++ lib/lp/codehosting/puller/tests/test_acceptance.py 2010-04-14 20:57:52 +0000
4@@ -20,7 +20,7 @@
5 from bzrlib.tests import HttpServer
6 from bzrlib.transport import get_transport
7 from bzrlib.upgrade import upgrade
8-from bzrlib.urlutils import local_path_from_url
9+from bzrlib.urlutils import join as urljoin, local_path_from_url
10
11 from zope.component import getUtility
12 from zope.security.proxy import removeSecurityProxy
13@@ -397,7 +397,7 @@
14 transaction.commit()
15
16 # Create the Bazaar branch in the expected location.
17- branch_url = os.path.join(
18+ branch_url = urljoin(
19 config.launchpad.bzr_imports_root_url, '%08x' % db_branch.id)
20 branch = BzrDir.create_branch_convenience(branch_url)
21 tree = branch.bzrdir.open_workingtree()
22
23=== modified file 'lib/lp/codehosting/puller/worker.py'
24--- lib/lp/codehosting/puller/worker.py 2010-04-14 03:18:02 +0000
25+++ lib/lp/codehosting/puller/worker.py 2010-04-15 01:52:27 +0000
26@@ -11,9 +11,6 @@
27 from bzrlib.branch import Branch
28 from bzrlib.bzrdir import BzrDir
29 from bzrlib import errors
30-from bzrlib.plugins.loom.branch import LoomSupport
31-from bzrlib.transport import get_transport
32-from bzrlib import urlutils
33 from bzrlib.ui import SilentUIFactory
34 import bzrlib.ui
35
36@@ -227,51 +224,14 @@
37 'source_branch'. Any content already at 'destination_url' will be
38 deleted.
39
40- If 'source_branch' is stacked, then the destination branch will be
41- stacked on the same URL, relative to 'destination_url'.
42-
43 :param source_branch: The Bazaar branch that will be mirrored.
44 :param destination_url: The place to make the destination branch. This
45 URL must point to a writable location.
46 :return: The destination branch.
47 """
48- dest_transport = get_transport(destination_url)
49- if dest_transport.has('.'):
50- dest_transport.delete_tree('.')
51- bzrdir = source_branch.bzrdir
52- if self.policy._is_import:
53- source_transport = source_branch.bzrdir.transport.clone('..')
54- while True:
55- files_before = sorted(source_transport.iter_files_recursive())
56- source_transport.copy_tree_to_transport(dest_transport)
57- files_after = sorted(source_transport.iter_files_recursive())
58- if files_before == files_after:
59- break
60- return Branch.open_from_transport(dest_transport)
61- else:
62- # We check to see if the stacked on branch exists in the mirrored
63- # area so that we can nicely signal to the scheduler that the
64- # pulling of this branch should be deferred before we even create
65- # the branch in the mirrored area.
66- stacked_on_url = (
67- self.policy.getStackedOnURLForDestinationBranch(
68- source_branch, destination_url))
69- if stacked_on_url is not None:
70- stacked_on_url = urlutils.join(destination_url, stacked_on_url)
71- try:
72- Branch.open(stacked_on_url)
73- except errors.NotBranchError:
74- raise StackedOnBranchNotFound()
75- if isinstance(source_branch, LoomSupport):
76- # Looms suck.
77- revision_id = None
78- else:
79- revision_id = 'null:'
80- self._runWithTransformFallbackLocationHookInstalled(
81- bzrdir.clone_on_transport, dest_transport,
82- revision_id=revision_id)
83- branch = Branch.open(destination_url)
84- return branch
85+ return self._runWithTransformFallbackLocationHookInstalled(
86+ self.policy.createDestinationBranch, source_branch,
87+ destination_url)
88
89 def openDestinationBranch(self, source_branch, destination_url):
90 """Open or create the destination branch at 'destination_url'.
91@@ -279,9 +239,7 @@
92 :param source_branch: The Bazaar branch that will be mirrored.
93 :param destination_url: The place to make the destination branch. This
94 URL must point to a writable location.
95- :return: (branch, up_to_date), where 'branch' is the destination
96- branch, and 'up_to_date' is a boolean saying whether the returned
97- branch is up-to-date with the source branch.
98+ :return: The opened or created branch.
99 """
100 try:
101 branch = Branch.open(destination_url)
102
103=== modified file 'lib/lp/codehosting/vfs/branchfs.py'
104--- lib/lp/codehosting/vfs/branchfs.py 2010-04-13 01:40:03 +0000
105+++ lib/lp/codehosting/vfs/branchfs.py 2010-04-15 01:49:32 +0000
106@@ -66,10 +66,12 @@
107
108 import xmlrpclib
109
110+from bzrlib.branch import Branch
111 from bzrlib.bzrdir import BzrDir, BzrDirFormat
112 from bzrlib.errors import (
113 NoSuchFile, NotBranchError, NotStacked, PermissionDenied,
114 TransportNotPossible, UnstackableBranchFormat)
115+from bzrlib.plugins.loom.branch import LoomSupport
116 from bzrlib.transport import get_transport
117 from bzrlib.transport.memory import MemoryServer
118 from bzrlib import urlutils
119@@ -697,7 +699,45 @@
120 stacked.
121 """
122
123- _is_import = False
124+ def createDestinationBranch(self, source_branch, destination_url):
125+ """Create a destination branch for 'source_branch'.
126+
127+ Creates a branch at 'destination_url' that is has the same format as
128+ 'source_branch'. Any content already at 'destination_url' will be
129+ deleted. Generally the new branch will have no revisions, but they
130+ will be copied for import branches, because this can be done safely
131+ and efficiently with a vfs-level copy (see `ImportedBranchPolicy`,
132+ below).
133+
134+ :param source_branch: The Bazaar branch that will be mirrored.
135+ :param destination_url: The place to make the destination branch. This
136+ URL must point to a writable location.
137+ :return: The destination branch.
138+ :raises StackedOnBranchNotFound: if the branch that the destination
139+ branch will be stacked on does not yet exist in the mirrored area.
140+ """
141+ dest_transport = get_transport(destination_url)
142+ if dest_transport.has('.'):
143+ dest_transport.delete_tree('.')
144+ stacked_on_url = (
145+ self.getStackedOnURLForDestinationBranch(
146+ source_branch, destination_url))
147+ if stacked_on_url is not None:
148+ stacked_on_url = urlutils.join(destination_url, stacked_on_url)
149+ try:
150+ Branch.open(stacked_on_url)
151+ except NotBranchError:
152+ from lp.codehosting.codeimport.worker import (
153+ StackedOnBranchNotFound)
154+ raise StackedOnBranchNotFound()
155+ if isinstance(source_branch, LoomSupport):
156+ # Looms suck.
157+ revision_id = None
158+ else:
159+ revision_id = 'null:'
160+ source_branch.bzrdir.clone_on_transport(
161+ dest_transport, revision_id=revision_id)
162+ return Branch.open(destination_url)
163
164 def getStackedOnURLForDestinationBranch(self, source_branch,
165 destination_url):
166@@ -768,15 +808,6 @@
167 - assert we're pulling from a lp-hosted:/// URL.
168 """
169
170- def shouldFollowReferences(self):
171- """See `BranchPolicy.shouldFollowReferences`.
172-
173- We do not traverse references for HOSTED branches because that may
174- cause us to connect to remote locations, which we do not allow because
175- we want hosted branches to be mirrored quickly.
176- """
177- return False
178-
179 def _bzrdirExists(self, url):
180 """Return whether a BzrDir exists at `url`."""
181 try:
182@@ -926,7 +957,26 @@
183 - assert the URLs start with the prefix we expect for imported branches.
184 """
185
186- _is_import = True
187+ def createDestinationBranch(self, source_branch, destination_url):
188+ """See `BranchPolicy.createDestinationBranch`.
189+
190+ Because we control the process that creates import branches, a
191+ vfs-level copy is safe and more efficient than a bzr fetch.
192+ """
193+ source_transport = source_branch.bzrdir.root_transport
194+ dest_transport = get_transport(destination_url)
195+ while True:
196+ # We loop until the remote file list before and after the copy is
197+ # the same to catch the case where the remote side is being
198+ # mutated as we copy it.
199+ if dest_transport.has('.'):
200+ dest_transport.delete_tree('.')
201+ files_before = set(source_transport.iter_files_recursive())
202+ source_transport.copy_tree_to_transport(dest_transport)
203+ files_after = set(source_transport.iter_files_recursive())
204+ if files_before == files_after:
205+ break
206+ return Branch.open_from_transport(dest_transport)
207
208 def shouldFollowReferences(self):
209 """See `BranchPolicy.shouldFollowReferences`.

« Back to merge proposal