Merge lp:~parthm/bzr/250451-better-url-for-break-lock into lp:bzr

Proposed by Parth Malwankar
Status: Merged
Approved by: Parth Malwankar
Approved revision: no longer in the source branch.
Merged at revision: 5303
Proposed branch: lp:~parthm/bzr/250451-better-url-for-break-lock
Merge into: lp:bzr
Diff against target: 214 lines (+66/-52)
5 files modified
NEWS (+5/-0)
bzrlib/errors.py (+0/-2)
bzrlib/lockdir.py (+32/-24)
bzrlib/remote.py (+10/-3)
bzrlib/tests/test_lockdir.py (+19/-23)
To merge this branch: bzr merge lp:~parthm/bzr/250451-better-url-for-break-lock
Reviewer Review Type Date Requested Status
Vincent Ladeuil Needs Fixing
Martin Packman (community) Needs Fixing
Review via email: mp+27187@code.launchpad.net

Commit message

Lock URL shown in case of failure to acquire lock (for smart server access) is now valid. Default timeout for lock contention is now 30s.

Description of the change

=== Fixes Bug #250451 ===

This patch gives a more accurate url in case where bzr fails to obtain lock from smart server.
In the current implementation, the server passes details such as lock_url (inaccurate), user, time_ago, pid to client. These are displayed in bzrlib.lockdir.wait_lock. Unfortunately at this point we don't have a clear lock_url and a url like "lp-69989648:///~parthm/%2Bjunk/lock-test/.bzr/branch/lock" is displayed to the user.

This patch, avoids displaying the inaccurate lock_url in the wait_lock. If the wait_fails, bzr raises a LockContention error in bzrlib.remote.RemoteBranch._remote_lock_write. Here, we add the correct lock_url to LockContention msg which is displayed to the user.

Does this approach look reasonable? Any other areas of code that I should check? I would appreciate any pointers on what would be a good place to put tests for this.

One thing remaining would be to handle ^C in bzrlib.lockdir.wait_lock to raise LockContention error so that the correct lock_url is displayed to the user.

Running the test locally, following is the interaction:
========================================================
[foo]% bzr push bzr+ssh://parthm@localhost/home/parthm/tmp/lock-path/trunk
Unable to obtain lock held by Parth Malwankar <email address hidden> at parthm-laptop [process #9606], acquired 7 hours, 52 minutes ago.
Will continue to try until 22:40:59, unless you press Ctrl-C.
See "bzr help break-lock" for more.
bzr: ERROR: Could not acquire lock "(remote lock)": bzr+ssh://parthm@localhost/home/parthm/tmp/lock-path/trunk
[foo]%

Interacting with launchpad (which is running the un-patched server):
====================================================================
[trunk]% bzr push lp:~parthm/+junk/lock-test
Unable to obtain lock lp-69989648:///~parthm/%2Bjunk/lock-test/.bzr/branch/lock
held by <email address hidden> on host crowberry [process #26780]
locked 12 hours, 3 minutes ago
Will continue to try until 18:03:34, unless you press Ctrl-C.
See "bzr help break-lock" for more.
bzr: ERROR: Could not acquire lock "(remote lock)": bzr+ssh://bazaar.launchpad.net/~parthm/%2Bjunk/lock-test
[trunk]%

Current interaction (without patch)
===================================
[trunk]% bzr push lp:~parthm/+junk/lock-test
Unable to obtain lock lp-69989648:///~parthm/%2Bjunk/lock-test/.bzr/branch/lock
held by <email address hidden> on host crowberry [process #26780]
locked 12 hours, 2 minutes ago
Will continue to try until 18:03:15, unless you press Ctrl-C.
See "bzr help break-lock" for more.
bzr: ERROR: Could not acquire lock "(remote lock)":

To post a comment you must log in.
Revision history for this message
Martin Packman (gz) wrote :

+ e.msg = self.repository.base.rstrip('.bzr/')

That doesn't do what you want it too. Use `.split(".bzr")[0]` instead.

Would also prefer a less hacky change to the LockContention instance, there's a todo note on the class related to this.

review: Needs Fixing
Revision history for this message
Parth Malwankar (parthm) wrote :

Some discussion on ^C and timeout:
https://bugs.launchpad.net/bzr/+bug/592148

Revision history for this message
Parth Malwankar (parthm) wrote :

> + e.msg = self.repository.base.rstrip('.bzr/')
>
> That doesn't do what you want it too. Use `.split(".bzr")[0]` instead.

Wow. That was stupid :P Fixed.

> Would also prefer a less hacky change to the LockContention instance, there's
> a todo note on the class related to this.

Agreed. I have added a comment indicating that the lock_url is not know to
the server so the initial LockContention exception doesn't have this
info. A new LockContention is now raised and we avoid directly modifying
instance variables of the old exception.

This update also fixes tests and also add a (timeout > 0) check before
showing "Will continue to try .." message.

Thanks for the review.

This is what the current interaction looks like:

[foo]% bzr push bzr+ssh://parthm@localhost/home/parthm/tmp/lock-path/trunk
parthm@localhost's password:
Unable to obtain lock held by Parth Malwankar <email address hidden>
at parthm-laptop [process #9606], acquired 46 hours, 42 minutes ago.
See "bzr help break-lock" for more.
bzr: ERROR: Could not acquire lock "(remote lock)": bzr+ssh://parthm@localhost/home/parthm/tmp/lock-path/trunk/
[foo]%

[foo]% bzr push bzr://localhost
bzr: ERROR: Could not acquire lock "(remote lock)": bzr://localhost/
[foo]%

Revision history for this message
Parth Malwankar (parthm) wrote :

Some further improvements to this patch:
 - As local lock urls are valid, these are displayed immediately. This way, even if the user chooses to ^C before the timeout, she would still know the url even though ^C keeps LockContention error from being raised.
 - Default timeout for lock contention retry reduced from 300s to 10s (discussion: comments 3 and 4 for bug #592148)
 - Cleanup display of local lock url. i.e.
Display:
    bzr: ERROR: Could not acquire lock "(local)": file:///home/parthm/tmp/lock-path/trunk/
instead of:
    LockContention: Could not acquire lock "LockDir(file:///home/parthm/tmp/lock-path/trunk/.bzr/branch/lock)":

Local error is now:
===================
[foo]% bzr push ../trunk
Unable to obtain lock file:///home/parthm/tmp/lock-path/trunk/
held by Parth Malwankar <email address hidden>
at parthm-laptop [process #9606], acquired 91 hours, 3 minutes ago.
Will continue to try until 09:51:21, unless you press Ctrl-C.
See "bzr help break-lock" for more.
bzr: ERROR: Could not acquire lock "(local)": file:///home/parthm/tmp/lock-path/trunk/

Remote error is:
===================
[foo]% bzr push bzr+ssh://parthm@localhost/home/parthm/tmp/lock-path/trunk
Unable to obtain lock held by Parth Malwankar <email address hidden>
at parthm-laptop [process #9606], acquired 91 hours, 22 minutes ago.
See "bzr help break-lock" for more.
bzr: ERROR: Could not acquire lock "(remote lock)": bzr+ssh://parthm@localhost/home/parthm/tmp/lock-path/trunk/
[foo]%

Revision history for this message
Parth Malwankar (parthm) wrote :

Timeout updated to 30s based on discussion on bug #592148.

Revision history for this message
Vincent Ladeuil (vila) wrote :

Excellent !

One nit though so my vote is really Tweak (needs fixing but Approve if addressed and doesn't need another review):

66 + if lock_url.startswith('file://'):
67 + lock_url = lock_url.split('.bzr/')[0] + '\n'

96 + raise LockContention('(local)', lock_url.rstrip())

is a bit ugly, there should be a way to not add this '\n' instead so you don't have to remove it later. This '\n' is not part of the url anyway.

Not a big deal but I find it harder to read.

review: Needs Fixing
Revision history for this message
Vincent Ladeuil (vila) wrote :

Hmm, just re-read the bug comments, make sure you have agreement from John and Robert about the timeout value 30 or 5.

Revision history for this message
John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vincent Ladeuil wrote:
> Hmm, just re-read the bug comments, make sure you have agreement from John and Robert about the timeout value 30 or 5.

So I wrote a fairly trivial plugin to time how much time is spent in
'save_pack_names'. It is available here:

 lp:~jameinel/+junk/bzr_lock_test

(install as 'lock_test'). This is what I got:

local
  $ bzr lock-test foo --count=100
  commit: 100 0.033s avg, 0.203s max
  save: 100 0.017s avg, 0.090s max

LAN sftp
  $ bzr lock-test sftp://local-server/...
  commit: 100 0.537s avg, 4.100s max
  save: 100 0.149s avg, 0.812s max

LAN smart VFS
  $ bzr lock-test bzr+ssh://local-server/...
  commit: 100 0.410s avg, 3.203s max
  save: 100 0.213s avg, 1.740s max

WAN sftp (my house in Chicago to chinstrap in London)
  $ bzr lock-test sftp://chinstrap..../xxx-test --count=100
  commit: 100 15.635s avg, 89.538s max
  save: 100 5.965s avg, 29.668s max

WAN + smart VFS
  $ bzr lock-test bzr+ssh://chinstrap.../xxx-test --count=100
  commit: 100 7.683s avg, 104.901s max
  save: 100 4.407s avg, 66.544s max

This code is testing VFS operations, rather than 'insert_stream()'
(mostly because I can't isolate the _save_pack_names time, and it should
be equivalent to 'local' time).

Note that 'commit' == 'commit_write_group' which can involve an
autopack, but 'save' is just the _save_pack_names() time. Which includes
a write-lock, reading the remote pack-names, writing a new one, and
unlocking.

Note that it can take up to *30s* just to do that via sftp, and even 66s
w/ VFS.

The averages are in the ~5s range. So I think 10s is possibly ok, 1min
would be being generous (but still possible that 2 people could push to
different branches in a shared repository and have it fail when it could
have succeeded.)

I'm willing to say that the peaks are noise, but I still think a minimum
of 10s, and 30s is probably a decent safety margin. (And yes, if we
always used RPC, then probably <5s could be ok.)

I would be interested to see what people in AU see.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAkwX5rUACgkQJdeBCYSNAAOwLwCgnmGR5FS2O5mLLnrZ4vNGNoiI
pPgAoMgPoonzfnZOtR/G1hN7vIH0wXqV
=ShnQ
-----END PGP SIGNATURE-----

Revision history for this message
Parth Malwankar (parthm) wrote :

> Excellent !
>
> One nit though so my vote is really Tweak (needs fixing but Approve if
> addressed and doesn't need another review):
>
> 66 + if lock_url.startswith('file://'):
> 67 + lock_url = lock_url.split('.bzr/')[0] + '\n'
>
> 96 + raise LockContention('(local)', lock_url.rstrip())
>
> is a bit ugly, there should be a way to not add this '\n' instead so you don't
> have to remove it later. This '\n' is not part of the url anyway.
>

Yes. That was a little ugly. I was not too happy with it myself.
I had set it up that way to allow a line-break in case the url is printed. I have removed that now and rely on the terminal to wrap the text.

> Not a big deal but I find it harder to read.

Revision history for this message
Parth Malwankar (parthm) wrote :

> Hmm, just re-read the bug comments, make sure you have agreement from John and
> Robert about the timeout value 30 or 5.

John has posted his detailed comment and I synced up with Robert on IRC. Robert was ok with both 10 and 30, so I am keeping this the timeout as 30. As mentioned by John, that seems to have a decent margin.

Revision history for this message
Parth Malwankar (parthm) wrote :

sent to pqm by email

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-06-15 07:44:29 +0000
3+++ NEWS 2010-06-15 09:55:53 +0000
4@@ -63,6 +63,11 @@
5 * Relative imports in plugins are now handled correctly when using
6 BZR_PLUGINS_AT. (Vincent Ladeuil, #588959)
7
8+* URL displayed for use with ``break-lock`` when smart server sees lock
9+ contention are now valid. Default timeout for lock contention retry is
10+ now 30 seconds instead of 300 seconds.
11+ (Parth Malwankar, #250451)
12+
13 Improvements
14 ************
15
16
17=== modified file 'bzrlib/errors.py'
18--- bzrlib/errors.py 2010-05-28 06:04:57 +0000
19+++ bzrlib/errors.py 2010-06-15 09:55:53 +0000
20@@ -1041,8 +1041,6 @@
21 class LockContention(LockError):
22
23 _fmt = 'Could not acquire lock "%(lock)s": %(msg)s'
24- # TODO: show full url for lock, combining the transport and relative
25- # bits?
26
27 internal_error = False
28
29
30=== modified file 'bzrlib/lockdir.py'
31--- bzrlib/lockdir.py 2010-05-25 17:27:52 +0000
32+++ bzrlib/lockdir.py 2010-06-15 09:55:53 +0000
33@@ -151,7 +151,7 @@
34 # files/dirs created.
35
36
37-_DEFAULT_TIMEOUT_SECONDS = 300
38+_DEFAULT_TIMEOUT_SECONDS = 30
39 _DEFAULT_POLL_SECONDS = 1.0
40
41
42@@ -539,24 +539,27 @@
43 if deadline_str is None:
44 deadline_str = time.strftime('%H:%M:%S',
45 time.localtime(deadline))
46+ # As local lock urls are correct we display them.
47+ # We avoid displaying remote lock urls.
48 lock_url = self.transport.abspath(self.path)
49- # See <https://bugs.launchpad.net/bzr/+bug/250451>
50- # the URL here is sometimes not one that is useful to the
51- # user, perhaps being wrapped in a lp-%d or chroot decorator,
52- # especially if this error is issued from the server.
53- self._report_function('%s %s\n'
54- '%s\n' # held by
55- '%s\n' # locked ... ago
56- 'Will continue to try until %s, unless '
57- 'you press Ctrl-C.\n'
58- 'See "bzr help break-lock" for more.',
59- start,
60- formatted_info[0],
61- formatted_info[1],
62- formatted_info[2],
63- deadline_str,
64- )
65-
66+ if lock_url.startswith('file://'):
67+ lock_url = lock_url.split('.bzr/')[0]
68+ else:
69+ lock_url = ''
70+ user, hostname, pid, time_ago = formatted_info
71+ msg = ('%s lock %s ' # lock_url
72+ 'held by ' # start
73+ '%s\n' # user
74+ 'at %s ' # hostname
75+ '[process #%s], ' # pid
76+ 'acquired %s.') # time ago
77+ msg_args = [start, lock_url, user, hostname, pid, time_ago]
78+ if timeout > 0:
79+ msg += ('\nWill continue to try until %s, unless '
80+ 'you press Ctrl-C.')
81+ msg_args.append(deadline_str)
82+ msg += '\nSee "bzr help break-lock" for more.'
83+ self._report_function(msg, *msg_args)
84 if (max_attempts is not None) and (attempt_count >= max_attempts):
85 self._trace("exceeded %d attempts")
86 raise LockContention(self)
87@@ -564,8 +567,11 @@
88 self._trace("waiting %ss", poll)
89 time.sleep(poll)
90 else:
91+ # As timeout is always 0 for remote locks
92+ # this block is applicable only for local
93+ # lock contention
94 self._trace("timeout after waiting %ss", timeout)
95- raise LockContention(self)
96+ raise LockContention('(local)', lock_url)
97
98 def leave_in_place(self):
99 self._locked_via_token = True
100@@ -616,17 +622,19 @@
101
102 def _format_lock_info(self, info):
103 """Turn the contents of peek() into something for the user"""
104- lock_url = self.transport.abspath(self.path)
105 start_time = info.get('start_time')
106 if start_time is None:
107 time_ago = '(unknown)'
108 else:
109 time_ago = format_delta(time.time() - int(info['start_time']))
110+ user = info.get('user', '<unknown>')
111+ hostname = info.get('hostname', '<unknown>')
112+ pid = info.get('pid', '<unknown>')
113 return [
114- 'lock %s' % (lock_url,),
115- 'held by %s on host %s [process #%s]' %
116- tuple([info.get(x, '<unknown>') for x in ['user', 'hostname', 'pid']]),
117- 'locked %s' % (time_ago,),
118+ user,
119+ hostname,
120+ pid,
121+ time_ago,
122 ]
123
124 def validate_token(self, token):
125
126=== modified file 'bzrlib/remote.py'
127--- bzrlib/remote.py 2010-05-20 18:23:17 +0000
128+++ bzrlib/remote.py 2010-06-15 09:55:53 +0000
129@@ -2420,9 +2420,16 @@
130 repo_token = self.repository.lock_write().repository_token
131 self.repository.unlock()
132 err_context = {'token': token}
133- response = self._call(
134- 'Branch.lock_write', self._remote_path(), branch_token,
135- repo_token or '', **err_context)
136+ try:
137+ response = self._call(
138+ 'Branch.lock_write', self._remote_path(), branch_token,
139+ repo_token or '', **err_context)
140+ except errors.LockContention, e:
141+ # The LockContention from the server doesn't have any
142+ # information about the lock_url. We re-raise LockContention
143+ # with valid lock_url.
144+ raise errors.LockContention('(remote lock)',
145+ self.repository.base.split('.bzr/')[0])
146 if response[0] != 'ok':
147 raise errors.UnexpectedSmartServerResponse(response)
148 ok, branch_token, repo_token = response
149
150=== modified file 'bzrlib/tests/test_lockdir.py'
151--- bzrlib/tests/test_lockdir.py 2010-05-20 18:23:17 +0000
152+++ bzrlib/tests/test_lockdir.py 2010-06-15 09:55:53 +0000
153@@ -191,22 +191,21 @@
154 "took %f seconds to detect lock contention" % (after - before))
155 finally:
156 lf1.unlock()
157- lock_base = lf2.transport.abspath(lf2.path)
158 self.assertEqual(1, len(self._logged_reports))
159- lock_url = lf2.transport.abspath(lf2.path)
160- self.assertEqual('%s %s\n'
161- '%s\n%s\n'
162- 'Will continue to try until %s, unless '
163- 'you press Ctrl-C.\n'
164- 'See "bzr help break-lock" for more.',
165- self._logged_reports[0][0])
166- args = self._logged_reports[0][1]
167- self.assertEqual('Unable to obtain', args[0])
168- self.assertEqual('lock %s' % (lock_base,), args[1])
169- self.assertStartsWith(args[2], 'held by ')
170- self.assertStartsWith(args[3], 'locked ')
171- self.assertEndsWith(args[3], ' ago')
172- self.assertContainsRe(args[4], r'\d\d:\d\d:\d\d')
173+ self.assertEqual(self._logged_reports[0][0],
174+ '%s lock %s held by %s\n'
175+ 'at %s [process #%s], acquired %s.\n'
176+ 'Will continue to try until %s, unless '
177+ 'you press Ctrl-C.\n'
178+ 'See "bzr help break-lock" for more.')
179+ start, lock_url, user, hostname, pid, time_ago, deadline_str = \
180+ self._logged_reports[0][1]
181+ self.assertEqual(start, u'Unable to obtain')
182+ self.assertEqual(user, u'jrandom@example.com')
183+ # skip hostname
184+ self.assertContainsRe(pid, r'\d+')
185+ self.assertContainsRe(time_ago, r'.* ago')
186+ self.assertContainsRe(deadline_str, r'\d{2}:\d{2}:\d{2}')
187
188 def test_31_lock_wait_easy(self):
189 """Succeed when waiting on a lock with no contention.
190@@ -597,11 +596,10 @@
191 info_list = ld1._format_lock_info(ld1.peek())
192 finally:
193 ld1.unlock()
194- self.assertEqual('lock %s' % (ld1.transport.abspath(ld1.path),),
195- info_list[0])
196- self.assertContainsRe(info_list[1],
197- r'^held by .* on host .* \[process #\d*\]$')
198- self.assertContainsRe(info_list[2], r'locked \d+ seconds? ago$')
199+ self.assertEqual(info_list[0], u'jrandom@example.com')
200+ # info_list[1] is hostname. we skip this.
201+ self.assertContainsRe(info_list[2], '^\d+$') # pid
202+ self.assertContainsRe(info_list[3], r'^\d+ seconds? ago$') # time_ago
203
204 def test_lock_without_email(self):
205 global_config = config.GlobalConfig()
206@@ -680,9 +678,7 @@
207 info = lf.peek()
208 formatted_info = lf._format_lock_info(info)
209 self.assertEquals(
210- ['lock %s' % t.abspath('test_lock'),
211- 'held by <unknown> on host <unknown> [process #<unknown>]',
212- 'locked (unknown)'],
213+ ['<unknown>', '<unknown>', '<unknown>', '(unknown)'],
214 formatted_info)
215
216