Merge lp:~mwhudson/launchpad/puller-import-hack into lp:launchpad

Proposed by Michael Hudson-Doyle
Status: Merged
Approved by: Aaron Bentley
Approved revision: no longer in the source branch.
Merged at revision: not available
Proposed branch: lp:~mwhudson/launchpad/puller-import-hack
Merge into: lp:launchpad
Diff against target: 324 lines (+90/-80)
6 files modified
configs/testrunner/launchpad-lazr.conf (+0/-1)
lib/lp/codehosting/codeimport/worker.py (+12/-8)
lib/lp/codehosting/puller/tests/__init__.py (+1/-0)
lib/lp/codehosting/puller/tests/test_acceptance.py (+10/-26)
lib/lp/codehosting/puller/worker.py (+4/-36)
lib/lp/codehosting/vfs/branchfs.py (+63/-9)
To merge this branch: bzr merge lp:~mwhudson/launchpad/puller-import-hack
Reviewer Review Type Date Requested Status
Aaron Bentley (community) Approve
Review via email: mp+23370@code.launchpad.net

Commit message

Use vfs-level copies when moving import branches around -- faster than 2a fetch, and safe because we control how the branches are made.

Description of the change

Hi,

This branch adds a few hacks to the handling of code imports to avoid the problems of cpu intensive slow fetch.

In particular, when getting the branch from the central area to the slave and when pulling an imported branch to the codehost for the first time, just copy the branch at the vfs level.

Deploying this on production depends on https://code.edge.launchpad.net/~mwhudson/lp-production-configs/sftp-imports/+merge/23369 being deployed.

Cheers,
mwh

To post a comment you must log in.
Revision history for this message
Aaron Bentley (abentley) wrote :

What is the impact of removing the testrunner bzr_imports_root_url? Does this cause it to default to a more suitable value?

Instead of remote_bzr_dir.transport.clone('..'), you should be able to use remote_bzr_dir.root_transport.

Some of the test changes look like drive-bys, e.g. not wanting it to leave directories behind. Is that right?

Is there a reason to join config.launchpad.bzr_imports_root_url and the branch id with path.join instead of urlutils.join?

Did you consider using more specific name than _is_import? e.g. _vfs_initial_clone ? Or what about providing an initial_clone method on the BranchPolicy itself?

What about using sets to represent files_before and files_after? That's closer to the abstract representation, because order isn't important, and entries can only appear once. I don't really understand the purpose of this loop, either.

It might be nice to turn _runWithTransformFallbackLocationHookInstalled into a ContextManager, but that's not necessary for this branch.

review: Needs Information
Revision history for this message
Michael Hudson-Doyle (mwhudson) wrote :

On 15/04/10 03:41, Aaron Bentley wrote:
> Review: Needs Information
> What is the impact of removing the testrunner bzr_imports_root_url? Does this cause it to default to a more suitable value?

Yes, the value in the development config is a file:/// url.

> Instead of remote_bzr_dir.transport.clone('..'), you should be able to use remote_bzr_dir.root_transport.

Ah, cool, thanks.

> Some of the test changes look like drive-bys, e.g. not wanting it to leave directories behind. Is that right?

I don't think I've really changed anything here, just folded the
addCleanup that followed every invocation of makeCleanDirectory into the
method itself.

> Is there a reason to join config.launchpad.bzr_imports_root_url and the branch id with path.join instead of urlutils.join?

No, none. Fixed.

> Did you consider using more specific name than _is_import? e.g. _vfs_initial_clone ? Or what about providing an initial_clone method on the BranchPolicy itself?

I thought about that name for about a microsecond, I have to admit.

Making it a method on the policy makes sense, partly because then it can
be tested separately...

> What about using sets to represent files_before and files_after? That's closer to the abstract representation, because order isn't important, and entries can only appear once. I don't really understand the purpose of this loop, either.

Yes, good point.

The point of the loop is to cope with the not-quite-entirely-theoretical
possibility that the branch being copied might be being written to as we
grab it. I think it can be written a bit more clearly now you mention
it, take a look.

> It might be nice to turn _runWithTransformFallbackLocationHookInstalled into a ContextManager, but that's not necessary for this branch.

Please no :-)

I think I have a strategy for getting rid of using the transform
fallback hook entirely now bzr has an ignore_fallbacks argument for
branch opening, but I'm more likely to do that in my no-hosted-area work
than here :)

Interdiff attached, although just looking at the preview diff again is
likely just as clear.

Cheers,
mwh

=== modified file 'lib/lp/codehosting/puller/tests/test_acceptance.py'
--- lib/lp/codehosting/puller/tests/test_acceptance.py 2010-04-14 03:07:42 +0000
+++ lib/lp/codehosting/puller/tests/test_acceptance.py 2010-04-14 20:57:52 +0000
@@ -20,7 +20,7 @@
20from bzrlib.tests import HttpServer20from bzrlib.tests import HttpServer
21from bzrlib.transport import get_transport21from bzrlib.transport import get_transport
22from bzrlib.upgrade import upgrade22from bzrlib.upgrade import upgrade
23from bzrlib.urlutils import local_path_from_url23from bzrlib.urlutils import join as urljoin, local_path_from_url
2424
25from zope.component import getUtility25from zope.component import getUtility
26from zope.security.proxy import removeSecurityProxy26from zope.security.proxy import removeSecurityProxy
@@ -397,7 +397,7 @@
397 transaction.commit()397 transaction.commit()
398398
399 # Create the Bazaar branch in the expected location.399 # Create the Bazaar branch in the expected location.
400 branch_url = os.path.join(400 branch_url = urljoin(
401 config.launchpad.bzr_imports_root_url, '%08x' % db_branch.id)401 config.launchpad.bzr_imports_root_url, '%08x' % db_branch.id)
402 branch = BzrDir.create_branch_convenience(branch_url)402 branch = BzrDir.create_branch_convenience(branch_url)
403 tree = branch.bzrdir.open_workingtree()403 tree = branch.bzrdir.open_workingtree()
404404
=== modified file 'lib/lp/codehosting/puller/worker.py'
--- lib/lp/codehosting/puller/worker.py 2010-04-14 03:18:02 +0000
+++ lib/lp/codehosting/puller/worker.py 2010-04-15 01:52:27 +0000
@@ -11,9 +11,6 @@
11from bzrlib.branch import Branch11from bzrlib.branch import Branch
12from bzrlib.bzrdir import BzrDir12from bzrlib.bzrdir import BzrDir
13from bzrlib import errors13from bzrlib import errors
14from bzrlib.plugins.loom.branch import LoomSupport
15from bzrlib.transport import get_transport
16from bzrlib import urlutils
17from bzrlib.ui import SilentUIFactory14from bzrlib.ui import SilentUIFactory
18import bzrlib.ui15import bzrlib.ui
1916
@@ -227,51 +224,14 @@
227 'source_branch'. Any content already at 'destination_url' will be224 'source_branch'. Any content already at 'destination_url' will be
228 deleted.225 deleted.
229226
230 If 'source_branch' is stacked, then the destination branch will be
231 stacked on the same URL, relative to 'destination_url'.
232
233 :param source_branch: The Bazaar branch that will be mirrored.227 :param source_branch: The Bazaar branch that will be mirrored.
234 :param destination_url: The place to make the destination branch. This228 :param destination_url: The place to make the destination branch. This
235 URL must point to a writable location.229 URL must point to a writable location.
236 :return: The destination branch.230 :return: The destination branch.
237 """231 """
238 dest_transport = get_transport(destination_url)232 return self._runWithTransformFallbackLocationHookInstalled(
239 if dest_transport.has('.'):233 self.policy.createDestinationBranch, source_branch,
240 dest_transport.delete_tree('.')234 destination_url)
241 bzrdir = source_branch.bzrdir
242 if self.policy._is_import:
243 source_transport = source_branch.bzrdir.transport.clone('..')
244 while True:
245 files_before = sorted(source_transport.iter_files_recursive())
246 source_transport.copy_tree_to_transport(dest_transport)
247 files_after = sorted(source_transport.iter_files_recursive())
248 if files_before == files_after:
249 break
250 return Branch.open_from_transport(dest_transport)
251 else:
252 # We check to see if the stacked on branch exists in the mirrored
253 # area so that we can nicely signal to the scheduler that the
254 # pulling of this branch should be deferred before we even create
255 # the branch in the mirrored area.
256 stacked_on_url = (
257 self.policy.getStackedOnURLForDestinationBranch(
258 source_branch, destination_url))
259 if stacked_on_url is not None:
260 stacked_on_url = urlutils.join(destination_url, stacked_on_url)
261 try:
262 Branch.open(stacked_on_url)
263 except errors.NotBranchError:
264 raise StackedOnBranchNotFound()
265 if isinstance(source_branch, LoomSupport):
266 # Looms suck.
267 revision_id = None
268 else:
269 revision_id = 'null:'
270 self._runWithTransformFallbackLocationHookInstalled(
271 bzrdir.clone_on_transport, dest_transport,
272 revision_id=revision_id)
273 branch = Branch.open(destination_url)
274 return branch
275235
276 def openDestinationBranch(self, source_branch, destination_url):236 def openDestinationBranch(self, source_branch, destination_url):
277 """Open or create the destination branch at 'destination_url'.237 """Open or create the destination branch at 'destination_url'.
@@ -279,9 +239,7 @@
279 :param source_branch: The Bazaar branch that will be mirrored.239 :param source_branch: The Bazaar branch that will be mirrored.
280 :param destination_url: The place to make the destination branch. This240 :param destination_url: The place to make the destination branch. This
281 URL must point to a writable location.241 URL must point to a writable location.
282 :return: (branch, up_to_date), where 'branch' is the destination242 :return: The opened or created branch.
283 branch, and 'up_to_date' is a boolean saying whether the returned
284 branch is up-to-date with the source branch.
285 """243 """
286 try:244 try:
287 branch = Branch.open(destination_url)245 branch = Branch.open(destination_url)
288246
=== modified file 'lib/lp/codehosting/vfs/branchfs.py'
--- lib/lp/codehosting/vfs/branchfs.py 2010-04-13 01:40:03 +0000
+++ lib/lp/codehosting/vfs/branchfs.py 2010-04-15 01:49:32 +0000
@@ -66,10 +66,12 @@
6666
67import xmlrpclib67import xmlrpclib
6868
69from bzrlib.branch import Branch
69from bzrlib.bzrdir import BzrDir, BzrDirFormat70from bzrlib.bzrdir import BzrDir, BzrDirFormat
70from bzrlib.errors import (71from bzrlib.errors import (
71 NoSuchFile, NotBranchError, NotStacked, PermissionDenied,72 NoSuchFile, NotBranchError, NotStacked, PermissionDenied,
72 TransportNotPossible, UnstackableBranchFormat)73 TransportNotPossible, UnstackableBranchFormat)
74from bzrlib.plugins.loom.branch import LoomSupport
73from bzrlib.transport import get_transport75from bzrlib.transport import get_transport
74from bzrlib.transport.memory import MemoryServer76from bzrlib.transport.memory import MemoryServer
75from bzrlib import urlutils77from bzrlib import urlutils
@@ -697,7 +699,45 @@
697 stacked.699 stacked.
698 """700 """
699701
700 _is_import = False702 def createDestinationBranch(self, source_branch, destination_url):
703 """Create a destination branch for 'source_branch'.
704
705 Creates a branch at 'destination_url' that is has the same format as
706 'source_branch'. Any content already at 'destination_url' will be
707 deleted. Generally the new branch will have no revisions, but they
708 will be copied for import branches, because this can be done safely
709 and efficiently with a vfs-level copy (see `ImportedBranchPolicy`,
710 below).
711
712 :param source_branch: The Bazaar branch that will be mirrored.
713 :param destination_url: The place to make the destination branch. This
714 URL must point to a writable location.
715 :return: The destination branch.
716 :raises StackedOnBranchNotFound: if the branch that the destination
717 branch will be stacked on does not yet exist in the mirrored area.
718 """
719 dest_transport = get_transport(destination_url)
720 if dest_transport.has('.'):
721 dest_transport.delete_tree('.')
722 stacked_on_url = (
723 self.getStackedOnURLForDestinationBranch(
724 source_branch, destination_url))
725 if stacked_on_url is not None:
726 stacked_on_url = urlutils.join(destination_url, stacked_on_url)
727 try:
728 Branch.open(stacked_on_url)
729 except NotBranchError:
730 from lp.codehosting.codeimport.worker import (
731 StackedOnBranchNotFound)
732 raise StackedOnBranchNotFound()
733 if isinstance(source_branch, LoomSupport):
734 # Looms suck.
735 revision_id = None
736 else:
737 revision_id = 'null:'
738 source_branch.bzrdir.clone_on_transport(
739 dest_transport, revision_id=revision_id)
740 return Branch.open(destination_url)
701741
702 def getStackedOnURLForDestinationBranch(self, source_branch,742 def getStackedOnURLForDestinationBranch(self, source_branch,
703 destination_url):743 destination_url):
@@ -768,15 +808,6 @@
768 - assert we're pulling from a lp-hosted:/// URL.808 - assert we're pulling from a lp-hosted:/// URL.
769 """809 """
770810
771 def shouldFollowReferences(self):
772 """See `BranchPolicy.shouldFollowReferences`.
773
774 We do not traverse references for HOSTED branches because that may
775 cause us to connect to remote locations, which we do not allow because
776 we want hosted branches to be mirrored quickly.
777 """
778 return False
779
780 def _bzrdirExists(self, url):811 def _bzrdirExists(self, url):
781 """Return whether a BzrDir exists at `url`."""812 """Return whether a BzrDir exists at `url`."""
782 try:813 try:
@@ -926,7 +957,26 @@
926 - assert the URLs start with the prefix we expect for imported branches.957 - assert the URLs start with the prefix we expect for imported branches.
927 """958 """
928959
929 _is_import = True960 def createDestinationBranch(self, source_branch, destination_url):
961 """See `BranchPolicy.createDestinationBranch`.
962
963 Because we control the process that creates import branches, a
964 vfs-level copy is safe and more efficient than a bzr fetch.
965 """
966 source_transport = source_branch.bzrdir.root_transport
967 dest_transport = get_transport(destination_url)
968 while True:
969 # We loop until the remote file list before and after the copy is
970 # the same to catch the case where the remote side is being
971 # mutated as we copy it.
972 if dest_transport.has('.'):
973 dest_transport.delete_tree('.')
974 files_before = set(source_transport.iter_files_recursive())
975 source_transport.copy_tree_to_transport(dest_transport)
976 files_after = set(source_transport.iter_files_recursive())
977 if files_before == files_after:
978 break
979 return Branch.open_from_transport(dest_transport)
930980
931 def shouldFollowReferences(self):981 def shouldFollowReferences(self):
932 """See `BranchPolicy.shouldFollowReferences`.982 """See `BranchPolicy.shouldFollowReferences`.
Revision history for this message
Aaron Bentley (abentley) wrote :

There was still another spot where you could use root_transport in lib/lp/codehosting/codeimport/worker.py, but otherwise, looks good.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'configs/testrunner/launchpad-lazr.conf'
--- configs/testrunner/launchpad-lazr.conf 2010-03-30 17:25:52 +0000
+++ configs/testrunner/launchpad-lazr.conf 2010-04-15 06:06:18 +0000
@@ -127,7 +127,6 @@
127127
128[launchpad]128[launchpad]
129max_attachment_size: 1024129max_attachment_size: 1024
130bzr_imports_root_url: http://localhost:10899
131geoip_database: /usr/share/GeoIP/GeoLiteCity.dat130geoip_database: /usr/share/GeoIP/GeoLiteCity.dat
132131
133[launchpad_session]132[launchpad_session]
134133
=== modified file 'lib/lp/codehosting/codeimport/worker.py'
--- lib/lp/codehosting/codeimport/worker.py 2010-04-12 01:42:34 +0000
+++ lib/lp/codehosting/codeimport/worker.py 2010-04-15 06:06:18 +0000
@@ -89,14 +89,18 @@
89 except NoSuchFile:89 except NoSuchFile:
90 pass90 pass
91 upgrade(remote_url, required_format)91 upgrade(remote_url, required_format)
92 local_branch = remote_bzr_dir.sprout(92 # The proper thing to do here would be to call
93 target_path, create_tree_if_local=needs_tree).open_branch()93 # "remote_bzr_dir.sprout()". But 2a fetch slowly checks which
94 # Because of the way we do incremental imports, there may be revisions94 # revisions are in the ancestry of the tip of the remote branch, which
95 # in the branch's repo that are not in the ancestry of the branch tip.95 # we strictly don't care about, so we just copy the whole thing down
96 # We need to transfer them too.96 # at the vfs level.
97 local_branch.repository.fetch(97 target = get_transport(target_path)
98 remote_bzr_dir.open_repository())98 target.ensure_base()
99 return local_branch99 remote_bzr_dir.root_transport.copy_tree_to_transport(target)
100 local_bzr_dir = BzrDir.open_from_transport(target)
101 if needs_tree:
102 local_bzr_dir.create_workingtree()
103 return local_bzr_dir.open_branch()
100104
101 def push(self, db_branch_id, bzr_branch, required_format):105 def push(self, db_branch_id, bzr_branch, required_format):
102 """Push up `bzr_branch` as the Bazaar branch for `code_import`.106 """Push up `bzr_branch` as the Bazaar branch for `code_import`.
103107
=== modified file 'lib/lp/codehosting/puller/tests/__init__.py'
--- lib/lp/codehosting/puller/tests/__init__.py 2009-11-21 00:28:10 +0000
+++ lib/lp/codehosting/puller/tests/__init__.py 2010-04-15 06:06:18 +0000
@@ -135,6 +135,7 @@
135 if os.path.exists(path):135 if os.path.exists(path):
136 shutil.rmtree(path)136 shutil.rmtree(path)
137 os.makedirs(path)137 os.makedirs(path)
138 self.addCleanup(shutil.rmtree, path)
138139
139 def pushToBranch(self, branch, tree):140 def pushToBranch(self, branch, tree):
140 """Push a Bazaar branch to a given Launchpad branch's hosted area.141 """Push a Bazaar branch to a given Launchpad branch's hosted area.
141142
=== modified file 'lib/lp/codehosting/puller/tests/test_acceptance.py'
--- lib/lp/codehosting/puller/tests/test_acceptance.py 2010-02-24 04:24:01 +0000
+++ lib/lp/codehosting/puller/tests/test_acceptance.py 2010-04-15 06:06:18 +0000
@@ -8,10 +8,8 @@
88
99
10import os10import os
11import shutil
12from subprocess import PIPE, Popen11from subprocess import PIPE, Popen
13import unittest12import unittest
14from urlparse import urlparse
1513
16import transaction14import transaction
1715
@@ -22,6 +20,7 @@
22from bzrlib.tests import HttpServer20from bzrlib.tests import HttpServer
23from bzrlib.transport import get_transport21from bzrlib.transport import get_transport
24from bzrlib.upgrade import upgrade22from bzrlib.upgrade import upgrade
23from bzrlib.urlutils import join as urljoin, local_path_from_url
2524
26from zope.component import getUtility25from zope.component import getUtility
27from zope.security.proxy import removeSecurityProxy26from zope.security.proxy import removeSecurityProxy
@@ -49,11 +48,9 @@
49 self._puller_script = os.path.join(48 self._puller_script = os.path.join(
50 config.root, 'cronscripts', 'supermirror-pull.py')49 config.root, 'cronscripts', 'supermirror-pull.py')
51 self.makeCleanDirectory(config.codehosting.hosted_branches_root)50 self.makeCleanDirectory(config.codehosting.hosted_branches_root)
52 self.addCleanup(
53 shutil.rmtree, config.codehosting.hosted_branches_root)
54 self.makeCleanDirectory(config.codehosting.mirrored_branches_root)51 self.makeCleanDirectory(config.codehosting.mirrored_branches_root)
55 self.addCleanup(52 self.makeCleanDirectory(
56 shutil.rmtree, config.codehosting.mirrored_branches_root)53 local_path_from_url(config.launchpad.bzr_imports_root_url))
5754
58 def assertMirrored(self, db_branch, source_branch=None,55 def assertMirrored(self, db_branch, source_branch=None,
59 accessing_user=None):56 accessing_user=None):
@@ -136,10 +133,9 @@
136 retcode, output, error = self.runSubprocess(command)133 retcode, output, error = self.runSubprocess(command)
137 return command, retcode, output, error134 return command, retcode, output, error
138135
139 def serveOverHTTP(self, port=0):136 def serveOverHTTP(self):
140 """Serve the current directory over HTTP, returning the server URL."""137 """Serve the current directory over HTTP, returning the server URL."""
141 http_server = HttpServer()138 http_server = HttpServer()
142 http_server.port = port
143 http_server.start_server()139 http_server.start_server()
144 # Join cleanup added before the tearDown so the tearDown is executed140 # Join cleanup added before the tearDown so the tearDown is executed
145 # first as this tells the thread to die. We then join explicitly as141 # first as this tells the thread to die. We then join explicitly as
@@ -392,18 +388,6 @@
392 self.assertRaises(388 self.assertRaises(
393 errors.NotStacked, mirrored_branch.get_stacked_on_url)389 errors.NotStacked, mirrored_branch.get_stacked_on_url)
394390
395 def _getImportMirrorPort(self):
396 """Return the port used to serve imported branches, as specified in
397 config.launchpad.bzr_imports_root_url.
398 """
399 address = urlparse(config.launchpad.bzr_imports_root_url)[1]
400 host, port = address.split(':')
401 self.assertEqual(
402 'localhost', host,
403 'bzr_imports_root_url must be configured on localhost: %s'
404 % (config.launchpad.bzr_imports_root_url,))
405 return int(port)
406
407 def test_mirror_imported_branch(self):391 def test_mirror_imported_branch(self):
408 # Run the puller on a populated imported branch pull queue.392 # Run the puller on a populated imported branch pull queue.
409 # Create the branch in the database.393 # Create the branch in the database.
@@ -412,18 +396,18 @@
412 db_branch.requestMirror()396 db_branch.requestMirror()
413 transaction.commit()397 transaction.commit()
414398
415 # Create the Bazaar branch and serve it in the expected location.399 # Create the Bazaar branch in the expected location.
416 branch_path = '%08x' % db_branch.id400 branch_url = urljoin(
417 os.mkdir(branch_path)401 config.launchpad.bzr_imports_root_url, '%08x' % db_branch.id)
418 tree = self.make_branch_and_tree(branch_path)402 branch = BzrDir.create_branch_convenience(branch_url)
403 tree = branch.bzrdir.open_workingtree()
419 tree.commit('rev1')404 tree.commit('rev1')
420 self.serveOverHTTP(self._getImportMirrorPort())
421405
422 # Run the puller.406 # Run the puller.
423 command, retcode, output, error = self.runPuller()407 command, retcode, output, error = self.runPuller()
424 self.assertRanSuccessfully(command, retcode, output, error)408 self.assertRanSuccessfully(command, retcode, output, error)
425409
426 self.assertMirrored(db_branch, source_branch=tree.branch)410 self.assertMirrored(db_branch, source_branch=branch)
427411
428 def test_mirror_empty(self):412 def test_mirror_empty(self):
429 # Run the puller on an empty pull queue.413 # Run the puller on an empty pull queue.
430414
=== modified file 'lib/lp/codehosting/puller/worker.py'
--- lib/lp/codehosting/puller/worker.py 2010-01-20 20:56:29 +0000
+++ lib/lp/codehosting/puller/worker.py 2010-04-15 06:06:18 +0000
@@ -11,9 +11,6 @@
11from bzrlib.branch import Branch11from bzrlib.branch import Branch
12from bzrlib.bzrdir import BzrDir12from bzrlib.bzrdir import BzrDir
13from bzrlib import errors13from bzrlib import errors
14from bzrlib.plugins.loom.branch import LoomSupport
15from bzrlib.transport import get_transport
16from bzrlib import urlutils
17from bzrlib.ui import SilentUIFactory14from bzrlib.ui import SilentUIFactory
18import bzrlib.ui15import bzrlib.ui
1916
@@ -227,41 +224,14 @@
227 'source_branch'. Any content already at 'destination_url' will be224 'source_branch'. Any content already at 'destination_url' will be
228 deleted.225 deleted.
229226
230 If 'source_branch' is stacked, then the destination branch will be
231 stacked on the same URL, relative to 'destination_url'.
232
233 :param source_branch: The Bazaar branch that will be mirrored.227 :param source_branch: The Bazaar branch that will be mirrored.
234 :param destination_url: The place to make the destination branch. This228 :param destination_url: The place to make the destination branch. This
235 URL must point to a writable location.229 URL must point to a writable location.
236 :return: The destination branch.230 :return: The destination branch.
237 """231 """
238 dest_transport = get_transport(destination_url)232 return self._runWithTransformFallbackLocationHookInstalled(
239 if dest_transport.has('.'):233 self.policy.createDestinationBranch, source_branch,
240 dest_transport.delete_tree('.')234 destination_url)
241 bzrdir = source_branch.bzrdir
242 # We check to see if the stacked on branch exists in the mirrored area
243 # so that we can nicely signal to the scheduler that the pulling of
244 # this branch should be deferred before we even create the branch in
245 # the mirrored area.
246 stacked_on_url = (
247 self.policy.getStackedOnURLForDestinationBranch(
248 source_branch, destination_url))
249 if stacked_on_url is not None:
250 stacked_on_url = urlutils.join(destination_url, stacked_on_url)
251 try:
252 Branch.open(stacked_on_url)
253 except errors.NotBranchError:
254 raise StackedOnBranchNotFound()
255 if isinstance(source_branch, LoomSupport):
256 # Looms suck.
257 revision_id = None
258 else:
259 revision_id = 'null:'
260 self._runWithTransformFallbackLocationHookInstalled(
261 bzrdir.clone_on_transport, dest_transport,
262 revision_id=revision_id)
263 branch = Branch.open(destination_url)
264 return branch
265235
266 def openDestinationBranch(self, source_branch, destination_url):236 def openDestinationBranch(self, source_branch, destination_url):
267 """Open or create the destination branch at 'destination_url'.237 """Open or create the destination branch at 'destination_url'.
@@ -269,9 +239,7 @@
269 :param source_branch: The Bazaar branch that will be mirrored.239 :param source_branch: The Bazaar branch that will be mirrored.
270 :param destination_url: The place to make the destination branch. This240 :param destination_url: The place to make the destination branch. This
271 URL must point to a writable location.241 URL must point to a writable location.
272 :return: (branch, up_to_date), where 'branch' is the destination242 :return: The opened or created branch.
273 branch, and 'up_to_date' is a boolean saying whether the returned
274 branch is up-to-date with the source branch.
275 """243 """
276 try:244 try:
277 branch = Branch.open(destination_url)245 branch = Branch.open(destination_url)
278246
=== modified file 'lib/lp/codehosting/vfs/branchfs.py'
--- lib/lp/codehosting/vfs/branchfs.py 2010-04-09 12:39:23 +0000
+++ lib/lp/codehosting/vfs/branchfs.py 2010-04-15 06:06:18 +0000
@@ -66,10 +66,12 @@
6666
67import xmlrpclib67import xmlrpclib
6868
69from bzrlib.branch import Branch
69from bzrlib.bzrdir import BzrDir, BzrDirFormat70from bzrlib.bzrdir import BzrDir, BzrDirFormat
70from bzrlib.errors import (71from bzrlib.errors import (
71 NoSuchFile, NotBranchError, NotStacked, PermissionDenied,72 NoSuchFile, NotBranchError, NotStacked, PermissionDenied,
72 TransportNotPossible, UnstackableBranchFormat)73 TransportNotPossible, UnstackableBranchFormat)
74from bzrlib.plugins.loom.branch import LoomSupport
73from bzrlib.transport import get_transport75from bzrlib.transport import get_transport
74from bzrlib.transport.memory import MemoryServer76from bzrlib.transport.memory import MemoryServer
75from bzrlib import urlutils77from bzrlib import urlutils
@@ -697,6 +699,46 @@
697 stacked.699 stacked.
698 """700 """
699701
702 def createDestinationBranch(self, source_branch, destination_url):
703 """Create a destination branch for 'source_branch'.
704
705 Creates a branch at 'destination_url' that is has the same format as
706 'source_branch'. Any content already at 'destination_url' will be
707 deleted. Generally the new branch will have no revisions, but they
708 will be copied for import branches, because this can be done safely
709 and efficiently with a vfs-level copy (see `ImportedBranchPolicy`,
710 below).
711
712 :param source_branch: The Bazaar branch that will be mirrored.
713 :param destination_url: The place to make the destination branch. This
714 URL must point to a writable location.
715 :return: The destination branch.
716 :raises StackedOnBranchNotFound: if the branch that the destination
717 branch will be stacked on does not yet exist in the mirrored area.
718 """
719 dest_transport = get_transport(destination_url)
720 if dest_transport.has('.'):
721 dest_transport.delete_tree('.')
722 stacked_on_url = (
723 self.getStackedOnURLForDestinationBranch(
724 source_branch, destination_url))
725 if stacked_on_url is not None:
726 stacked_on_url = urlutils.join(destination_url, stacked_on_url)
727 try:
728 Branch.open(stacked_on_url)
729 except NotBranchError:
730 from lp.codehosting.puller.worker import (
731 StackedOnBranchNotFound)
732 raise StackedOnBranchNotFound()
733 if isinstance(source_branch, LoomSupport):
734 # Looms suck.
735 revision_id = None
736 else:
737 revision_id = 'null:'
738 source_branch.bzrdir.clone_on_transport(
739 dest_transport, revision_id=revision_id)
740 return Branch.open(destination_url)
741
700 def getStackedOnURLForDestinationBranch(self, source_branch,742 def getStackedOnURLForDestinationBranch(self, source_branch,
701 destination_url):743 destination_url):
702 """Return the URL of the branch to stack the mirrored copy on.744 """Return the URL of the branch to stack the mirrored copy on.
@@ -766,15 +808,6 @@
766 - assert we're pulling from a lp-hosted:/// URL.808 - assert we're pulling from a lp-hosted:/// URL.
767 """809 """
768810
769 def shouldFollowReferences(self):
770 """See `BranchPolicy.shouldFollowReferences`.
771
772 We do not traverse references for HOSTED branches because that may
773 cause us to connect to remote locations, which we do not allow because
774 we want hosted branches to be mirrored quickly.
775 """
776 return False
777
778 def _bzrdirExists(self, url):811 def _bzrdirExists(self, url):
779 """Return whether a BzrDir exists at `url`."""812 """Return whether a BzrDir exists at `url`."""
780 try:813 try:
@@ -924,6 +957,27 @@
924 - assert the URLs start with the prefix we expect for imported branches.957 - assert the URLs start with the prefix we expect for imported branches.
925 """958 """
926959
960 def createDestinationBranch(self, source_branch, destination_url):
961 """See `BranchPolicy.createDestinationBranch`.
962
963 Because we control the process that creates import branches, a
964 vfs-level copy is safe and more efficient than a bzr fetch.
965 """
966 source_transport = source_branch.bzrdir.root_transport
967 dest_transport = get_transport(destination_url)
968 while True:
969 # We loop until the remote file list before and after the copy is
970 # the same to catch the case where the remote side is being
971 # mutated as we copy it.
972 if dest_transport.has('.'):
973 dest_transport.delete_tree('.')
974 files_before = set(source_transport.iter_files_recursive())
975 source_transport.copy_tree_to_transport(dest_transport)
976 files_after = set(source_transport.iter_files_recursive())
977 if files_before == files_after:
978 break
979 return Branch.open_from_transport(dest_transport)
980
927 def shouldFollowReferences(self):981 def shouldFollowReferences(self):
928 """See `BranchPolicy.shouldFollowReferences`.982 """See `BranchPolicy.shouldFollowReferences`.
929983