Merge lp:~vila/bzr/323111-backup-names into lp:bzr

Proposed by Vincent Ladeuil
Status: Merged
Merged at revision: 5504
Proposed branch: lp:~vila/bzr/323111-backup-names
Merge into: lp:bzr
Diff against target: 301 lines (+109/-34)
10 files modified
bzrlib/bzrdir.py (+16/-13)
bzrlib/osutils.py (+20/-0)
bzrlib/tests/per_workingtree/__init__.py (+8/-0)
bzrlib/tests/per_workingtree/test_merge_from_branch.py (+0/-8)
bzrlib/tests/test_bzrdir.py (+11/-2)
bzrlib/tests/test_osutils.py (+31/-0)
bzrlib/tests/test_transform.py (+15/-8)
bzrlib/transform.py (+1/-1)
bzrlib/workingtree.py (+3/-2)
doc/en/release-notes/bzr-2.3.txt (+4/-0)
To merge this branch: bzr merge lp:~vila/bzr/323111-backup-names
Reviewer Review Type Date Requested Status
Martin Pool Approve
Andrew Bennetts Needs Fixing
Review via email: mp+35109@code.launchpad.net

Description of the change

This introduces a new ``osutils.available_backup_name`` that can be used
to produce <file>.~#~ paths where we need them.

With my next submissions there will be 3 such use cases, enough
to triangulate: they are all protected by a lock so we don't have
to care about race conditions.

The case at hand is ``BzrDir.generate_backup_name``. In the
future we may try to avoid LBYL, but that's out-of-scope for this
submissions and the following ones (and I don't think we really
care here and I'm sure we don't for the 2 other cases (1 in
memory, the other on local fs).

I changed the name from 'generate' to 'available' as I felt it
was better describing: nothing is created but a check is
performed (and this fit well for the other cases too).

To post a comment you must log in.
Revision history for this message
Vincent Ladeuil (vila) wrote :

>>>>> John Arbash Meinel <email address hidden> writes:

    > On 9/10/2010 9:40 AM, Vincent Ladeuil wrote:
    >> + def test_resolve_conflicts_missing_parent(self):

    > The available_backup_name seems ok. I'm not a big fan of 'transport.has'
    > or the look-before-you-leap in general, but we were already doing it.

    > I don't really understand what this is doing here, though:
    > + def test_resolve_conflicts_missing_parent(self):

    > It seems unrelated. Also why remove:
    > - def test_resolve_no_parent(self):

I've replaced it by test_resolve_conflicts_missing_parent clarifying it
while I was there.

    > === modified file 'bzrlib/workingtree.py'
    > --- bzrlib/workingtree.py 2010-08-20 19:07:17 +0000
    > +++ bzrlib/workingtree.py 2010-09-10 14:39:52 +0000
    > @@ -2078,9 +2078,10 @@
    > files_to_backup.append(path[1])

    > def backup(file_to_backup):
    > - backup_name =
    > self.bzrdir.generate_backup_name(file_to_backup)
    > + backup_name =
    > self.bzrdir._available_backup_name(file_to_backup)

    > ^- IMO: WorkingTree shouldn't be calling an '_' method on BzrDir,

Wow, we do that a lot already, what's the problem ?

The public methods or attributes are for bzrlib users while the private
ones are for bzrlib internal needs, we never say this applies to inter
class calls.

    > so we should probably be making it public.

IMHO, methods should be public if there is a need for them outside of
bzrlib and we guarantee some stability and backward compatibility.

If there is no need, the maintenance cost is lowered.

    > Then again, why is *BzrDir* finding the available name, and not
    > working tree itself?

Well, I just touched this method lightly, I didn't research the
rationale in the corresponding merge proposal.

Revision history for this message
Andrew Bennetts (spiv) wrote :

Regarding jam's comments: we aren't consistent about whether leading underscores indicates private to bzrlib, private to that module, or even private to that class. So you have to look at stuff case-by-case in my experience (unless the code somehow makes it clear via comments/docstrings).

In this case workingtree.py already calls self._bzrdir._get_file_mode a fair bit, so there's some precedent... but it seems odd to me to make it private, because presumably if it's a useful method on bzrdir (and it must be, you're using it), then couldn't 3rd-party code like plugins have a use for it too? After all the method you are deprecating, BzrDir.generate_backup_name, is public, perhaps there was a reason for that (perhaps not, but it's worth asking)? Given that the behaviour appears to be exactly the same what's the purpose of changing the method name here? Is it that you consider generate_backup_name to be a poor API that shouldn't be used? If so I don't follow why it's ok for bzrlib to keep using it internally but not 3rd-party code. If it's so that it's easy to find callers to fix later when we stop doing LBYL for this, then it should say so in the docstring!

A related issue is that _available_backup_name really ought to have a comment saying that it isn't race-free because it's not guarded by a lock.

in test_transform:

240 + # Since the directory doesn't exist it's seen as missing to resolve
241 + # create a conflict asking for it to be created.

That comment doesn't parse for me, and I don't understand how the test_transform changes are related to the rest of the patch.

So:

 * +1 on refactoring to create the function in osutils
 * I'm unconvinced on the change to BzrDir's API. I'm open to be convinced, but first I need to understand the intent of changing it :)
 * I don't understand the comment in the test_transform change. It's small enough that I don't mind too much that it appears to be mixed into an unrelated patch, but the comment does need to make sense.

review: Needs Fixing
Revision history for this message
Vincent Ladeuil (vila) wrote :
Download full text (5.6 KiB)

>>>>> Andrew Bennetts <email address hidden> writes:

    > Review: Needs Fixing

    > Regarding jam's comments: we aren't consistent about whether
    > leading underscores indicates private to bzrlib, private to that
    > module, or even private to that class. So you have to look at
    > stuff case-by-case in my experience (unless the code somehow makes
    > it clear via comments/docstrings).

    > In this case workingtree.py already calls
    > self._bzrdir._get_file_mode a fair bit, so there's some
    > precedent...

    > but it seems odd to me to make it private, because presumably if
    > it's a useful method on bzrdir (and it must be, you're using it),
    > then couldn't 3rd-party code like plugins have a use for it too?

I don't know.

That's why I make private :)

Th rationale is that maintaining compatibility is guaranteed for the
public methods. Despite this, plugin authors use private methods.

It seems more adequate then to drive the maintenance effort based on
real use cases rather than 'potential' use cases.

Since I couldn't see *obvious* cases where bzrdir._available_backup_name
can be used, I made it private (moreover, as jam pointed out, it may be
better located on wt instead but I wanted to limit my cleanups to the
code duplication and added the deprecations as I felt things weren't
that clear in this area so going private will help in the future if we
need to change).

The only bits I thought were worth publicizing were the backup name
policy, so I put that in osutils for lack of a better place (osutils
(as a module) must die ! (and become a package ;)).

    > After all the method you are deprecating,
    > BzrDir.generate_backup_name, is public, perhaps there was a reason
    > for that (perhaps not, but it's worth asking)?

It has been recently introduced and duplicated the existing code in
transform, see
https://code.edge.launchpad.net/~vila/bzr/323111-deprecate-get-backup-name/+merge/35111
which use 'get_backup_name' instead of 'generate_backup_name'.

    > Given that the behaviour appears to be exactly the same what's the
    > purpose of changing the method name here?

Hinting at the race condition which wasn't clear with 'generate' (or
get_backup_name in transform, see following submissions) and make it a
method.

I thought I explained that in the cover letter:

,----
| With my next submissions there will be 3 such use cases, enough
| to triangulate: they are all protected by a lock so we don't have
| to care about race conditions.
|
| The case at hand is ``BzrDir.generate_backup_name``. In the
| future we may try to avoid LBYL, but that's out-of-scope for this
| submissions and the following ones (and I don't think we really
| care here and I'm sure we don't for the 2 other cases (1 in
| memory, the other on local fs).
|
| I changed the name from 'generate' to 'available' as I felt it
| was better describing: nothing is created but a check is
| performed (and this fit well for the other cases too).
`----

    > Is it that you consider generate_backup_name to be a poor API that
    > shouldn't be used?

Well 'generate_backup_name' and 'get_backup_name' were racy and
misleading, hopefully '...

Read more...

Revision history for this message
Vincent Ladeuil (vila) wrote :

Updated, please review.

Revision history for this message
Martin Pool (mbp) wrote :

It looks basically OK to me; I'll let Andrew check if you've sufficiently addressed his points.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'bzrlib/bzrdir.py'
2--- bzrlib/bzrdir.py 2010-10-15 11:20:45 +0000
3+++ bzrlib/bzrdir.py 2010-10-15 14:16:33 +0000
4@@ -87,6 +87,10 @@
5 registry,
6 symbol_versioning,
7 )
8+from bzrlib.symbol_versioning import (
9+ deprecated_in,
10+ deprecated_method,
11+ )
12
13
14 class BzrDir(controldir.ControlDir):
15@@ -497,14 +501,16 @@
16 format=format).bzrdir
17 return bzrdir.create_workingtree()
18
19+ @deprecated_method(deprecated_in((2, 3, 0)))
20 def generate_backup_name(self, base):
21- """Generate a non-existing backup file name based on base."""
22- counter = 1
23- name = "%s.~%d~" % (base, counter)
24- while self.root_transport.has(name):
25- counter += 1
26- name = "%s.~%d~" % (base, counter)
27- return name
28+ return self._available_backup_name(base)
29+
30+ def _available_backup_name(self, base):
31+ """Find a non-existing backup file name based on base.
32+
33+ See bzrlib.osutils.available_backup_name about race conditions.
34+ """
35+ return osutils.available_backup_name(base, self.root_transport.has)
36
37 def backup_bzrdir(self):
38 """Backup this bzr control directory.
39@@ -512,16 +518,13 @@
40 :return: Tuple with old path name and new path name
41 """
42
43- backup_dir=self.generate_backup_name('backup.bzr')
44 pb = ui.ui_factory.nested_progress_bar()
45 try:
46- # FIXME: bug 300001 -- the backup fails if the backup directory
47- # already exists, but it should instead either remove it or make
48- # a new backup directory.
49- #
50 old_path = self.root_transport.abspath('.bzr')
51+ backup_dir = self._available_backup_name('backup.bzr')
52 new_path = self.root_transport.abspath(backup_dir)
53- ui.ui_factory.note('making backup of %s\n to %s' % (old_path, new_path,))
54+ ui.ui_factory.note('making backup of %s\n to %s'
55+ % (old_path, new_path,))
56 self.root_transport.copy_tree('.bzr', backup_dir)
57 return (old_path, new_path)
58 finally:
59
60=== modified file 'bzrlib/osutils.py'
61--- bzrlib/osutils.py 2010-08-24 12:36:53 +0000
62+++ bzrlib/osutils.py 2010-10-15 14:16:33 +0000
63@@ -2354,3 +2354,23 @@
64 raise errors.BzrError("Can't decode username as %s." % \
65 user_encoding)
66 return username
67+
68+
69+def available_backup_name(base, exists):
70+ """Find a non-existing backup file name.
71+
72+ This will *not* create anything, this only return a 'free' entry. This
73+ should be used for checking names in a directory below a locked
74+ tree/branch/repo to avoid race conditions. This is LBYL (Look Before You
75+ Leap) and generally discouraged.
76+
77+ :param base: The base name.
78+
79+ :param exists: A callable returning True if the path parameter exists.
80+ """
81+ counter = 1
82+ name = "%s.~%d~" % (base, counter)
83+ while exists(name):
84+ counter += 1
85+ name = "%s.~%d~" % (base, counter)
86+ return name
87
88=== modified file 'bzrlib/tests/per_workingtree/__init__.py'
89--- bzrlib/tests/per_workingtree/__init__.py 2010-08-21 16:06:24 +0000
90+++ bzrlib/tests/per_workingtree/__init__.py 2010-10-15 14:16:33 +0000
91@@ -23,6 +23,7 @@
92 """
93
94 from bzrlib import (
95+ branchbuilder,
96 errors,
97 tests,
98 workingtree,
99@@ -58,6 +59,13 @@
100 made_control.create_branch()
101 return self.workingtree_format.initialize(made_control)
102
103+ def make_branch_builder(self, relpath, format=None):
104+ if format is None:
105+ format = self.bzrdir_format
106+ builder = branchbuilder.BranchBuilder(self.get_transport(),
107+ format=format)
108+ return builder
109+
110
111 def workingtree_formats():
112 """The known working tree formats."""
113
114=== modified file 'bzrlib/tests/per_workingtree/test_merge_from_branch.py'
115--- bzrlib/tests/per_workingtree/test_merge_from_branch.py 2010-04-01 14:06:48 +0000
116+++ bzrlib/tests/per_workingtree/test_merge_from_branch.py 2010-10-15 14:16:33 +0000
117@@ -20,7 +20,6 @@
118 import os
119
120 from bzrlib import (
121- branchbuilder,
122 errors,
123 merge
124 )
125@@ -119,13 +118,6 @@
126
127 class TestMergedBranch(per_workingtree.TestCaseWithWorkingTree):
128
129- def make_branch_builder(self, relpath, format=None):
130- if format is None:
131- format = self.bzrdir_format
132- builder = branchbuilder.BranchBuilder(self.get_transport(),
133- format=format)
134- return builder
135-
136 def make_inner_branch(self):
137 bld_inner = self.make_branch_builder('inner')
138 bld_inner.start_series()
139
140=== modified file 'bzrlib/tests/test_bzrdir.py'
141--- bzrlib/tests/test_bzrdir.py 2010-08-20 19:07:17 +0000
142+++ bzrlib/tests/test_bzrdir.py 2010-10-15 14:16:33 +0000
143@@ -32,6 +32,7 @@
144 repository,
145 osutils,
146 remote,
147+ symbol_versioning,
148 urlutils,
149 win32utils,
150 workingtree,
151@@ -1417,6 +1418,9 @@
152
153
154 class TestGenerateBackupName(TestCaseWithMemoryTransport):
155+ # FIXME: This may need to be unified with test_osutils.TestBackupNames or
156+ # moved to per_bzrdir or per_transport for better coverage ?
157+ # -- vila 20100909
158
159 def setUp(self):
160 super(TestGenerateBackupName, self).setUp()
161@@ -1425,9 +1429,14 @@
162 possible_transports=[self._transport])
163 self._bzrdir = bzrdir.BzrDir.open_from_transport(self._transport)
164
165+ def test_deprecated_generate_backup_name(self):
166+ res = self.applyDeprecated(
167+ symbol_versioning.deprecated_in((2, 3, 0)),
168+ self._bzrdir.generate_backup_name, 'whatever')
169+
170 def test_new(self):
171- self.assertEqual("a.~1~", self._bzrdir.generate_backup_name("a"))
172+ self.assertEqual("a.~1~", self._bzrdir._available_backup_name("a"))
173
174 def test_exiting(self):
175 self._transport.put_bytes("a.~1~", "some content")
176- self.assertEqual("a.~2~", self._bzrdir.generate_backup_name("a"))
177+ self.assertEqual("a.~2~", self._bzrdir._available_backup_name("a"))
178
179=== modified file 'bzrlib/tests/test_osutils.py'
180--- bzrlib/tests/test_osutils.py 2010-09-27 07:24:54 +0000
181+++ bzrlib/tests/test_osutils.py 2010-10-15 14:16:33 +0000
182@@ -2085,3 +2085,34 @@
183 encoded_username = uni_username.encode(ue)
184 osutils.set_or_unset_env('LOGNAME', encoded_username)
185 self.assertEqual(uni_username, osutils.getuser_unicode())
186+ osutils.set_or_unset_env('LOGNAME', u'jrandom\xb6'.encode(ue))
187+ self.assertEqual(u'jrandom\xb6', osutils.getuser_unicode())
188+
189+class TestBackupNames(tests.TestCase):
190+
191+ def setUp(self):
192+ super(TestBackupNames, self).setUp()
193+ self.backups = []
194+
195+ def backup_exists(self, name):
196+ return name in self.backups
197+
198+ def available_backup_name(self, name):
199+ backup_name = osutils.available_backup_name(name, self.backup_exists)
200+ self.backups.append(backup_name)
201+ return backup_name
202+
203+ def assertBackupName(self, expected, name):
204+ self.assertEqual(expected, self.available_backup_name(name))
205+
206+ def test_empty(self):
207+ self.assertBackupName('file.~1~', 'file')
208+
209+ def test_existing(self):
210+ self.available_backup_name('file')
211+ self.available_backup_name('file')
212+ self.assertBackupName('file.~3~', 'file')
213+ # Empty slots are found, this is not a strict requirement and may be
214+ # revisited if we test against all implementations.
215+ self.backups.remove('file.~2~')
216+ self.assertBackupName('file.~2~', 'file')
217
218=== modified file 'bzrlib/tests/test_transform.py'
219--- bzrlib/tests/test_transform.py 2010-10-08 10:50:51 +0000
220+++ bzrlib/tests/test_transform.py 2010-10-15 14:16:33 +0000
221@@ -815,6 +815,21 @@
222 self.assertIs(None, self.wt.path2id('parent'))
223 self.assertIs(None, self.wt.path2id('parent.new'))
224
225+ def test_resolve_conflicts_missing_parent(self):
226+ wt = self.make_branch_and_tree('.')
227+ tt = TreeTransform(wt)
228+ self.addCleanup(tt.finalize)
229+ parent = tt.trans_id_file_id('parent-id')
230+ tt.new_file('file', parent, 'Contents')
231+ raw_conflicts = resolve_conflicts(tt)
232+ # Since the directory doesn't exist it's seen as 'missing'. So
233+ # 'resolve_conflicts' create a conflict asking for it to be created.
234+ self.assertLength(1, raw_conflicts)
235+ self.assertEqual(('missing parent', 'Created directory', 'new-1'),
236+ raw_conflicts.pop())
237+ # apply fail since the missing directory doesn't exist
238+ self.assertRaises(errors.NoFinalPath, tt.apply)
239+
240 def test_moving_versioned_directories(self):
241 create, root = self.get_transform()
242 kansas = create.new_directory('kansas', root, 'kansas-id')
243@@ -2331,14 +2346,6 @@
244 self.failUnlessExists('a')
245 self.failUnlessExists('a/b')
246
247- def test_resolve_no_parent(self):
248- wt = self.make_branch_and_tree('.')
249- tt = TreeTransform(wt)
250- self.addCleanup(tt.finalize)
251- parent = tt.trans_id_file_id('parent-id')
252- tt.new_file('file', parent, 'Contents')
253- resolve_conflicts(tt)
254-
255
256 A_ENTRY = ('a-id', ('a', 'a'), True, (True, True),
257 ('TREE_ROOT', 'TREE_ROOT'), ('a', 'a'), ('file', 'file'),
258
259=== modified file 'bzrlib/transform.py'
260--- bzrlib/transform.py 2010-08-27 07:54:21 +0000
261+++ bzrlib/transform.py 2010-10-15 14:16:33 +0000
262@@ -2917,7 +2917,7 @@
263 try:
264 os.rename(to, from_)
265 except OSError, e:
266- raise errors.TransformRenameFailed(to, from_, str(e), e.errno)
267+ raise errors.TransformRenameFailed(to, from_, str(e), e.errno)
268 # after rollback, don't reuse _FileMover
269 past_renames = None
270 pending_deletions = None
271
272=== modified file 'bzrlib/workingtree.py'
273--- bzrlib/workingtree.py 2010-09-24 16:16:13 +0000
274+++ bzrlib/workingtree.py 2010-10-15 14:16:33 +0000
275@@ -2080,9 +2080,10 @@
276 files_to_backup.append(path[1])
277
278 def backup(file_to_backup):
279- backup_name = self.bzrdir.generate_backup_name(file_to_backup)
280+ backup_name = self.bzrdir._available_backup_name(file_to_backup)
281 osutils.rename(abs_path, self.abspath(backup_name))
282- return "removed %s (but kept a copy: %s)" % (file_to_backup, backup_name)
283+ return "removed %s (but kept a copy: %s)" % (file_to_backup,
284+ backup_name)
285
286 # Build inv_delta and delete files where applicable,
287 # do this before any modifications to inventory.
288
289=== modified file 'doc/en/release-notes/bzr-2.3.txt'
290--- doc/en/release-notes/bzr-2.3.txt 2010-10-15 11:33:33 +0000
291+++ doc/en/release-notes/bzr-2.3.txt 2010-10-15 14:16:33 +0000
292@@ -103,6 +103,10 @@
293 strictly "asciibetical". That behavior is still available as ``bzr tags
294 --sort=alpha``. (Neil Martinsen-Burrell, #640760)
295
296+* ``BzrDir.generate_backup_name`` has been deprecated and replaced by a
297+ private method. ``osutils.available_backup_name`` provides an extensible
298+ replacement. (Vincent Ladeuil)
299+
300 New Features
301 ************
302