Merge lp:~spiv/bzr-builddeb/use-dpkg-mergechangelogs into lp:bzr-builddeb

Proposed by Andrew Bennetts
Status: Merged
Approved by: Jelmer Vernooij
Approved revision: 584
Merged at revision: 596
Proposed branch: lp:~spiv/bzr-builddeb/use-dpkg-mergechangelogs
Merge into: lp:bzr-builddeb
Diff against target: 438 lines (+167/-161)
4 files modified
README (+5/-0)
debian/changelog (+10/-1)
merge_changelog.py (+60/-116)
tests/test_merge_changelog.py (+92/-44)
To merge this branch: bzr merge lp:~spiv/bzr-builddeb/use-dpkg-mergechangelogs
Reviewer Review Type Date Requested Status
Jelmer Vernooij code Approve
Review via email: mp+68797@code.launchpad.net

Commit message

Use dpkg-mergechangelogs rather than custom logic to merge debian/changelog files.

Description of the change

This switches merge_changelog to simply call out to dpkg-mergechangelogs rather than use its own NIH logic.

As discussed on the bug and UDD list this required updating some tests for the new behaviour, but everyone seems confident that the new behaviour is actually better.

Fixing the tests revealed two very minor drawbacks to dpkg-mergechangelogs:

 1) when merging invalid changelogs it seems to lose the final line of text
 2) the conflict markers aren't quite standard: “<<<<<<” rather than “<<<<<<<” (6 vs. 7 chars).

1) is mitigated greatly by the fact that dpkg-mergechangelogs emits warnings when dealing with invalid changelogs, so developers will be encouraged to check and fix their data anyway. And I imagine invalid changelogs aren't too common, mostly people use tools to help them write them, and the format isn't *that* onerous.

2) is largely cosmetic. I can imagine it might impede using external tools to help deal with conflicts (like meld?) but with some luck those tools are tolerant of the difference, and that drawback is greatly outweighed by the benefit of doing better merges in the first place.

I've updated the README to list installing dpkg-dev as a requirement. For the intended audience of bzr-builddeb that seems pretty reasonable :)

To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

I think (1) is an acceptable regression given how many other regressions this mp fixes. We should probably file a bug against dpkg-dev about it.

For (2), would it be possible to do postprocessing on the file to fix up the conflict markers to be more standard? It seems like a minor thing though, so perhaps a bug report would also be nice for this.

Either way, I think this is a huge improvement over the previous situation.

[tweak]
Can you add the right entries for these bugs to debian/changelog? The dch command should make this easy ("dch 'Fix the foobar. LP: #...'").

review: Approve (code)
Revision history for this message
Andrew Bennetts (spiv) wrote :

Jelmer Vernooij wrote:
> I think (1) is an acceptable regression given how many other regressions this
> mp fixes. We should probably file a bug against dpkg-dev about it.

Yeah, I think so.

> For (2), would it be possible to do postprocessing on the file to fix up the
> conflict markers to be more standard? It seems like a minor thing though, so
> perhaps a bug report would also be nice for this.

Hmm! I guess we could! We know when dpkg-mergechangelogs thinks there's a
conflict, and I doubt those lines could appear in a valid changelog file. I'll
do that!

> [tweak]
> Can you add the right entries for these bugs to debian/changelog? The dch command should make this easy ("dch 'Fix the foobar. LP: #...'").

Ah, I wondered where the news file was, silly me :)

585. By Andrew Bennetts

Workaround bug 815700 in dpkg-mergechangelogs (it emits slightly non-standard conflict markers), and link to bug 815704 in an XXX comment in a test.

586. By Andrew Bennetts

Update debian/changelog.

Revision history for this message
Andrew Bennetts (spiv) wrote :

It turns out I'm not in the bzr-builddeb-hackers team so can't push directly to lp:bzr-builddeb.

Jelmer, could you push lp:~spiv/bzr-builddeb/trunk-merge-of-use-dpkg-mergechangelogs to lp:bzr-builddeb? (Unless someone else gets there first, of course!) It's just trunk plus a revision merging my changes.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'README'
2--- README 2011-01-10 21:59:19 +0000
3+++ README 2011-07-25 08:06:14 +0000
4@@ -24,6 +24,11 @@
5
6 .. _python-debian: http://bzr.debian.org/pkg-python-debian/trunk/
7
8+It also requires the ``dpkg-dev`` package to be installed (for the
9+``dpkg-mergechangelogs`` tool)::
10+
11+ apt-get install dpkg-dev
12+
13 This plugin can be installed in two ways. As you are probably using a Debian
14 system you can probably just use the Debian packages. The other way is to
15 branch it in to ``~/.bazaar/plugins/builddeb``, i.e::
16
17=== modified file 'debian/changelog'
18--- debian/changelog 2011-07-20 19:28:00 +0000
19+++ debian/changelog 2011-07-25 08:06:14 +0000
20@@ -1,9 +1,18 @@
21 bzr-builddeb (2.7.7) UNRELEASED; urgency=low
22
23+ [ Jelmer Vernooij ]
24 * Build type now defaults to normal mode when used in an empty tree.
25 LP: #776528
26
27- -- Jelmer Vernooij <jelmer@debian.org> Wed, 20 Jul 2011 21:27:48 +0200
28+ [ Andrew Bennetts ]
29+ * Use dpkg-mergechangelogs(1) to merge changelog files. Fixes LP:
30+ #718944 (preserve the existing ordering of changelog entries when
31+ merging); LP: #552950 (Additional changelog entries are modified
32+ when merging another branch); LP: #517093 (3-way changelog file
33+ merge doesn't do textual merging on sections); and LP: #517090 (3-
34+ way changelog file merge doesn't support deletions).
35+
36+ -- Andrew Bennetts <andrew@bemusement.org> Mon, 25 Jul 2011 16:38:09 +1000
37
38 bzr-builddeb (2.7.6) unstable; urgency=low
39
40
41=== modified file 'merge_changelog.py'
42--- merge_changelog.py 2010-05-28 20:47:13 +0000
43+++ merge_changelog.py 2011-07-25 08:06:14 +0000
44@@ -16,13 +16,24 @@
45 # You should have received a copy of the GNU General Public License
46 # along with this program. If not, see <http://www.gnu.org/licenses/>.
47
48+import logging
49+import os.path
50 import re
51+import shutil
52+import subprocess
53+import tempfile
54
55 from bzrlib import (
56 merge,
57 )
58
59
60+# A logger in the 'bzr' hierarchy. By default messages will be propagated to
61+# the standard bzr logger, but tests can easily intercept just this logger if
62+# they wish.
63+_logger = logging.getLogger('bzr.plugins.builddeb.merge_changelog')
64+
65+
66 class ChangeLogFileMerge(merge.ConfigurableFileMerger):
67
68 name_prefix = 'deb_changelog'
69@@ -30,123 +41,56 @@
70
71 def merge_text(self, params):
72 return merge_changelog(params.this_lines, params.other_lines,
73- params.base_lines)
74-
75-
76-# Regular expression for top of debian/changelog
77-CL_RE = re.compile(r'^(\w[-+0-9a-z.]*) \(([^\(\) \t]+)\)((\s+[-0-9a-z]+)+)\;',
78- re.IGNORECASE)
79+ params.base_lines)
80+
81
82 def merge_changelog(this_lines, other_lines, base_lines=[]):
83 """Merge a changelog file."""
84- try:
85- from debian import changelog
86- except ImportError:
87- # Prior to 0.1.15 the debian module was called debian_bundle
88- from debian_bundle import changelog
89-
90- try:
91- left_cl = read_changelog(this_lines)
92- right_cl = read_changelog(other_lines)
93- # BASE lines don't end up in the output, so we allow strict=False
94- base_cl = read_changelog(base_lines, strict=False)
95- except changelog.ChangelogParseError:
96- return ('not_applicable', None)
97-
98- content = []
99- def step(iterator):
100- try:
101- return iterator.next()
102- except StopIteration:
103- return None
104- left_blocks = dict((b.version, b) for b in left_cl._blocks)
105- right_blocks = dict((b.version, b) for b in right_cl._blocks)
106- # Unfortunately, while version objects implement __eq__ they *don't*
107- # implement __hash__, which means we can't do dict lookups properly, so
108- # instead, we fall back on the version string instead of the object.
109- # Make sure never to try to use right_version in left_blocks because of
110- # this.
111- # We lazily parse the base data, in case we never need it
112- base_blocks = dict((b.version.full_version, b) for b in base_cl._blocks)
113- left_order = iter(sorted(left_blocks.keys(), reverse=True))
114- right_order = iter(sorted(right_blocks.keys(), reverse=True))
115- left_version = step(left_order)
116- right_version = step(right_order)
117-
118- # TODO: Do we want to support the ability to delete a section? We could do
119- # a first-pass algorithm that checks the versions in base versus the
120- # versions in this and other, to determine what versions should be in
121- # the output. For now, we just assume that if a version is present in
122- # any of this or other, then we want it in the output.
123- conflict_status = 'success'
124-
125- while left_version is not None or right_version is not None:
126- if (left_version is None or
127- (right_version is not None and right_version > left_version)):
128- next_content = str(right_blocks[right_version])
129- right_version = step(right_order)
130- elif (right_version is None or
131- (left_version is not None and left_version > right_version)):
132- next_content = str(left_blocks[left_version])
133- left_version = step(left_order)
134+ # Write the BASE, THIS and OTHER versions to files in a temporary
135+ # directory, and use dpkg-mergechangelogs to merge them.
136+ tmpdir = tempfile.mkdtemp('deb_changelog_merge')
137+ try:
138+ def writelines(filename, lines):
139+ with open(filename, 'w') as f:
140+ for line in lines:
141+ f.write(line)
142+ base_filename = os.path.join(tmpdir, 'changelog.base')
143+ this_filename = os.path.join(tmpdir, 'changelog.this')
144+ other_filename = os.path.join(tmpdir, 'changelog.other')
145+ writelines(base_filename, base_lines)
146+ writelines(this_filename, this_lines)
147+ writelines(other_filename, other_lines)
148+ proc = subprocess.Popen(['dpkg-mergechangelogs', base_filename,
149+ this_filename, other_filename], stdout=subprocess.PIPE,
150+ stderr=subprocess.PIPE)
151+ stdout, stderr = proc.communicate()
152+ retcode = proc.returncode
153+ if stderr:
154+ # Relay the warning from dpkg-mergechangelogs to the user. We
155+ # don't decorate the messages at all, as dpkg-mergechangelogs
156+ # warnings are already prefixed with "dpkg-mergechangelogs:
157+ # warning:" which makes the origin of the messages quite clear.
158+ _logger.warning('%s', stderr)
159+ if retcode == 1:
160+ # dpkg-mergechangelogs reports a conflict. Unfortunately it uses
161+ # slightly non-standard conflict markers (<http://pad.lv/815700>:
162+ # "<<<<<<" rather than "<<<<<<<", i.e. 6 chars instead of 7), so we
163+ # correct that here to make the results of this plugin as
164+ # consistent with regular bzr usage as possible. Note that
165+ # conflict markers are never valid lines in a changelog file, so
166+ # it's reasonable for us to assume that any line that looks like a
167+ # conflict marker is a conflict marker (rather than valid content).
168+ # At worst a conflicted merge of an invalid changelog file that
169+ # already contained a non-standard conflict marker will have that
170+ # conflict marker made standard, which is more like a feature than
171+ # a bug!
172+ def replace_func(match_obj):
173+ match_text = match_obj.group(0)
174+ return match_text[0] * 7
175+ stdout = re.sub(
176+ '^[<=>]{6}$', replace_func, stdout, flags=re.MULTILINE)
177+ return 'conflicted', stdout
178 else:
179- assert left_version == right_version
180- # Same version, step both
181- # TODO: Conflict if left_version != right
182- # Note: See above comment why we can't use
183- # right_blocks[left_version] even though they *should* be
184- # equivalent
185- left_content = str(left_blocks[left_version])
186- right_content = str(right_blocks[right_version])
187- if left_content == right_content:
188- # Identical content
189- next_content = left_content
190- else:
191- # Sides disagree, compare with base
192- base_content = str(base_blocks.get(left_version.full_version,
193- ''))
194- if left_content == base_content:
195- next_content = right_content
196- elif right_content == base_content:
197- next_content = left_content
198- else:
199- # TODO: We could use merge3.Merge3 to try a line-based
200- # textual merge on the content. However, for now I'm
201- # just going to conflict on the whole region
202- # Conflict names taken from merge.py
203- next_content = ('<<<<<<< TREE\n'
204- + left_content
205- + '=======\n'
206- + right_content
207- + '>>>>>>> MERGE-SOURCE\n'
208- )
209- conflict_status = 'conflicted'
210- next_block = left_blocks[left_version]
211- left_version = step(left_order)
212- right_version = step(right_order)
213- content.append(next_content)
214-
215- return conflict_status, content
216-
217-
218-def read_changelog(lines, strict=True):
219- """Return a parsed changelog file."""
220- try:
221- from debian import changelog
222- except ImportError:
223- # Prior to 0.1.15 the debian module was called debian_bundle
224- from debian_bundle import changelog
225- # Note: There appears to be a bug in Changelog if you pass it an iterable
226- # of lines (like a file obj, or a list of lines). Specifically, it
227- # does not strip trailing newlines, and it adds ones back in, so you
228- # get doubled blank lines... :(
229- # So we just ''.join() the lines and don't worry about it
230- # Note: There is also a bug that the Changelog constructor suppresses parse
231- # errors, so we want to always call parse_changelog separately
232- content = ''.join(lines)
233- cl = changelog.Changelog()
234- if content:
235- # We get a warning if we try to parse an empty changelog file, which in
236- # strict mode is an error, so only parse when we have content
237- cl.parse_changelog(content, strict=strict)
238- return cl
239+ return 'success', stdout
240+ finally:
241+ shutil.rmtree(tmpdir)
242
243=== modified file 'tests/test_merge_changelog.py'
244--- tests/test_merge_changelog.py 2010-05-23 15:29:38 +0000
245+++ tests/test_merge_changelog.py 2011-07-25 08:06:14 +0000
246@@ -19,7 +19,7 @@
247
248 """Tests for the merge_changelog code."""
249
250-import warnings
251+import logging
252
253 try:
254 from debian import changelog
255@@ -27,6 +27,9 @@
256 # Prior to 0.1.15 the debian module was called debian_bundle
257 from debian_bundle import changelog
258
259+from testtools.content_type import ContentType
260+from testtools.content import Content
261+
262 from bzrlib import (
263 memorytree,
264 merge,
265@@ -70,6 +73,38 @@
266 """.splitlines(True)
267
268
269+# Merge of 2b and 2c using 2 as the base (b adds a line, c adds a line and
270+# deletes a line).
271+v_111_2bc = """\
272+psuedo-prog (1.1.1-2) unstable; urgency=low
273+
274+ * New upstream release.
275+ * Yet another content for 1.1.1-2
276+ * But more is better
277+
278+ -- Joe Foo <joe@example.com> Thu, 28 Jan 2010 10:45:44 +0000
279+
280+""".splitlines(True)
281+
282+
283+# Merge of 2b and 2c using an empty base. (As calculated by
284+# dpkg-mergechangelogs.)
285+v_111_2bc_empty_base = """\
286+psuedo-prog (1.1.1-2) unstable; urgency=low
287+
288+ * New upstream release.
289+<<<<<<<
290+ * Awesome bug fixes.
291+=======
292+ * Yet another content for 1.1.1-2
293+>>>>>>>
294+ * But more is better
295+
296+ -- Joe Foo <joe@example.com> Thu, 28 Jan 2010 10:45:44 +0000
297+
298+""".splitlines(True)
299+
300+
301 v_112_1 = """\
302 psuedo-prog (1.1.2-1) unstable; urgency=low
303
304@@ -92,15 +127,37 @@
305 """.splitlines(True)
306
307
308-class TestReadChangelog(tests.TestCase):
309+# Backports from current testtools so that we remain compatible with testtools
310+# 0.9.2 (the version in lucid).
311+UTF8_TEXT = ContentType('text', 'plain', {'charset': 'utf8'})
312+def text_content(text):
313+ """Create a `Content` object from some text.
314
315- def test_read_changelog(self):
316- cl = merge_changelog.read_changelog(v_112_1)
317- self.assertEqual(1, len(cl._blocks))
318+ This is useful for adding details which are short strings.
319+ """
320+ return Content(UTF8_TEXT, lambda: [text.encode('utf8')])
321
322
323 class TestMergeChangelog(tests.TestCase):
324
325+ def setUp(self):
326+ super(tests.TestCase, self).setUp()
327+ # Intercept warnings from merge_changelog's logger: this is where
328+ self.logged_warnings = self.make_utf8_encoded_stringio()
329+ self.addCleanup(self.addMergeChangelogWarningsDetail)
330+ handler = logging.StreamHandler(self.logged_warnings)
331+ handler.setLevel(logging.WARNING)
332+ logger = logging.getLogger('bzr.plugins.builddeb.merge_changelog')
333+ logger.addHandler(handler)
334+ self.addCleanup(logger.removeHandler, handler)
335+ self.overrideAttr(logger, 'propagate', False)
336+
337+ def addMergeChangelogWarningsDetail(self):
338+ warnings_log = self.logged_warnings.getvalue()
339+ if warnings_log:
340+ self.addDetail(
341+ 'merge_changelog warnings', text_content(warnings_log))
342+
343 def assertMergeChangelog(self, expected_lines, this_lines, other_lines,
344 base_lines=[], conflicted=False):
345 status, merged_lines = merge_changelog.merge_changelog(
346@@ -139,10 +196,10 @@
347 base_lines=[])
348
349 def test_unsorted(self):
350- # Passing in an improperly sorted text should result in a properly
351- # sorted one
352+ # The order of entries being merged is unchanged, even if they are not
353+ # properly sorted. (This is a merge tool, not a reformatting tool.)
354 self.assertMergeChangelog(v_111_2 + v_001_1,
355- this_lines = v_001_1 + v_111_2,
356+ this_lines = v_111_2 + v_001_1,
357 other_lines = [],
358 base_lines = [])
359
360@@ -158,20 +215,11 @@
361
362 def test_3way_conflicted(self):
363 self.assertMergeChangelog(
364- expected_lines=['<<<<<<< TREE\n']
365- + v_111_2b
366- + ['=======\n']
367- + v_111_2c
368- + ['>>>>>>> MERGE-SOURCE\n'],
369+ expected_lines=v_111_2bc,
370 this_lines=v_111_2b, other_lines=v_111_2c,
371- base_lines=v_111_2,
372- conflicted=True)
373+ base_lines=v_111_2)
374 self.assertMergeChangelog(
375- expected_lines=['<<<<<<< TREE\n']
376- + v_111_2b
377- + ['=======\n']
378- + v_111_2c
379- + ['>>>>>>> MERGE-SOURCE\n'],
380+ expected_lines=v_111_2bc_empty_base,
381 this_lines=v_111_2b, other_lines=v_111_2c,
382 base_lines=[],
383 conflicted=True)
384@@ -186,30 +234,30 @@
385 -- Thu, 28 Jan 2010 10:45:44 +0000
386
387 """.splitlines(True)
388- # Missing the author and we don't have allow_missing_author set
389- cl = changelog.Changelog()
390- self.assertRaises(changelog.ChangelogParseError,
391- cl.parse_changelog, ''.join(invalid_changelog), strict=True)
392- # If strict parsing fails, don't try to do special merging
393- self.assertEqual(('not_applicable', None),
394- merge_changelog.merge_changelog(invalid_changelog, v_111_2,
395- v_111_2))
396- self.assertEqual(('not_applicable', None),
397- merge_changelog.merge_changelog(v_111_2, invalid_changelog,
398- v_111_2))
399- # We are non-strict about parsing BASE, because its contents are not
400- # included in the output.
401- # This triggers a warning, but we don't want to clutter the test run
402- cur_filters = warnings.filters[:]
403- warnings.simplefilter('ignore', UserWarning)
404- try:
405- self.assertMergeChangelog(v_112_1 + v_111_2,
406- this_lines=v_111_2,
407- other_lines=v_112_1,
408- base_lines=invalid_changelog,
409- )
410- finally:
411- warnings.filters = cur_filters[:]
412+ # invalid_changelog is missing the author, but dpkg-mergechangelogs
413+ # copes gracefully with invalid input.
414+ status, lines = merge_changelog.merge_changelog(
415+ invalid_changelog, v_111_2, v_111_2)
416+ self.assertEqual('success', status)
417+ # XXX: ideally we'd expect ''.join(lines) ==
418+ # ''.join(invalid_changelog), but dpkg-mergechangelogs appears to lose
419+ # the final line in these examples.
420+ # <https://bugs.launchpad.net/ubuntu/+source/dpkg/+bug/815704>
421+ # - Andrew Bennetts, 25 July 2011.
422+ #self.assertEqual(''.join(invalid_changelog), ''.join(lines))
423+ status, lines = merge_changelog.merge_changelog(
424+ invalid_changelog, v_111_2, v_111_2)
425+ self.assertEqual('success', status)
426+ #self.assertEqual(''.join(invalid_changelog), ''.join(lines))
427+ self.assertMergeChangelog(v_112_1 +
428+ ['<<<<<<<\n'] +
429+ v_111_2 +
430+ ['=======\n>>>>>>>\n'],
431+ this_lines=v_111_2,
432+ other_lines=v_112_1,
433+ base_lines=invalid_changelog,
434+ conflicted=True
435+ )
436
437
438 class TestChangelogHook(tests.TestCaseWithMemoryTransport):

Subscribers

People subscribed via source and target branches