Merge lp:~michael.nelson/launchpad/436182-newlines-in-sources-fix-without-workaround into lp:launchpad
- 436182-newlines-in-sources-fix-without-workaround
- Merge into devel
Status: | Merged | ||||
---|---|---|---|---|---|
Approved by: | Gavin Panella | ||||
Approved revision: | no longer in the source branch. | ||||
Merged at revision: | not available | ||||
Proposed branch: | lp:~michael.nelson/launchpad/436182-newlines-in-sources-fix-without-workaround | ||||
Merge into: | lp:launchpad | ||||
Diff against target: |
537 lines 9 files modified
lib/canonical/launchpad/emailtemplates/upload-accepted.txt (+1/-0) lib/canonical/launchpad/emailtemplates/upload-announcement.txt (+1/-0) lib/canonical/launchpad/emailtemplates/upload-new.txt (+1/-0) lib/canonical/launchpad/emailtemplates/upload-rejection.txt (+1/-0) lib/lp/archiveuploader/changesfile.py (+15/-1) lib/lp/archiveuploader/tagfiles.py (+57/-22) lib/lp/archiveuploader/tests/data/test436182_0.1_source.changes (+23/-0) lib/lp/archiveuploader/tests/test_tagfiles.py (+101/-37) lib/lp/soyuz/model/queue.py (+15/-7) |
||||
To merge this branch: | bzr merge lp:~michael.nelson/launchpad/436182-newlines-in-sources-fix-without-workaround | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Colin Watson (community) | Approve | ||
Gavin Panella (community) | Approve | ||
Review via email: mp+12988@code.launchpad.net |
Commit message
Description of the change
Michael Nelson (michael.nelson) wrote : | # |
Gavin Panella (allenap) wrote : | # |
Hi Michael,
I looked at the diff (revisions 9647 and 9648) and there's nothing to
comment on there, so the diff I've reviewed doesn't contain those
revisions.
My only comments below are almost trivial, so r=me. It was a very good
idea to compare to apt_pkg. Also, parse_tagfile_lines is a ugly pig of
a function!
Gavin.
> === modified file 'lib/lp/
> --- lib/lp/
> +++ lib/lp/
> @@ -64,8 +64,8 @@
> """This exception is raised if parse_changes encounters nastiness"""
> pass
>
> -re_single_
> -re_multi_
> +re_single_
> +re_multi_
>
>
> def parse_tagfile_
> @@ -103,7 +103,8 @@
>
> num_of_lines = len(indexed_
> index = 0
> - first = -1
> + first_value_
> + more_values_
> while index < num_of_lines:
> index += 1
> line = indexed_
> @@ -145,21 +146,47 @@
> if slf:
> field = slf.groups(
> changes[field] = slf.groups()[1]
> - first = 1
> +
> + # If there is no value on this line, we assume this is
> + # the first line of a multiline field, such as the 'files'
> + # field.
> + if changes[field] == '':
> + first_value_
> +
> + # Either way, more values for this field could follow
> + # on the next line.
> + more_values_
> continue
> if line.rstrip() == " .":
> - changes[field] += '\n'
> + changes[field] += '\n' + line
> continue
> mlf = re_multi_
> if mlf:
> - if first == -1:
> + if more_values_
> raise TagFileParseErr
> "line %d: '%s'\n [Multi-line field continuing on from nothing?]"
> % (filename, index,line))
> - if first == 1 and changes[field] != "":
> - changes[field] += '\n'
> - first = 0
> - changes[field] += mlf.groups()[0] + '\n'
> +
> + # XXX Michael Nelson 20091001 bug=440014
> + # Is there any reason why we're not simply using
> + # apt_pkg.
> + # If we can get rid of this code that is trying to mimic
> + # what ParseTagFile does out of the box, it would be a good
> + # thing.
> +
> + # The first value for a newline delimited field, such as
> + # the 'files' field, has its leading spaces stripped. Other
> + # fields (such as a 'binary' field spanning multiple lines)
> + # should *not* be l-stripped of their leading spaces otherwise
> + ...
Michael Nelson (michael.nelson) wrote : | # |
On Wed, 2009-10-07 at 14:51 +0000, Gavin Panella wrote:
> Review: Approve
> Hi Michael,
>
> I looked at the diff (revisions 9647 and 9648) and there's nothing to
> comment on there, so the diff I've reviewed doesn't contain those
> revisions.
>
> My only comments below are almost trivial, so r=me. It was a very good
> idea to compare to apt_pkg. Also, parse_tagfile_lines is a ugly pig of
> a function!
Thanks Gavin - yeah, I'm keen to get rid of parse_tagfile_lines in the
future if possible.
>
> Gavin.
>
[snip]
> > === modified file 'lib/lp/
> > --- lib/lp/
> > +++ lib/lp/
> > @@ -5,6 +5,7 @@
> >
> > # arch-tag: 52e0c871-
> >
> > +import apt_pkg
> > import unittest
> > import sys
> > import shutil
> > @@ -106,19 +107,110 @@
> > tf = parse_tagfile(
> > self.assertRais
> >
> > +
> > +class TestTagFileDebi
> > +
> > + def setUp(self):
> > + """Parse the test file using apt_pkg for comparison."""
> > +
> > + tagfile_path = datadir(
> > + tagfile = open(tagfile_path)
> > + self.apt_
> > + self.apt_
> > +
> > + # Is it just because this is a very old file that test-related
> > + # things are imported locally?
> > + from lp.archiveuploa
> > + self.parse_
> > + tagfile_path, allow_unsigned = True)
> > +
> > + def test_parse_
> > + """parse_tagfile should not leave trailing '\n' on multiline values.
> > +
> > + This is a regression test for bug 436182.
> > + Previously we,
> > + 1. Stripped leading space/tab from subsequent lines of multiline
> > + values, and
> > + 2. appended a trailing '\n' to the end of the value.
> > + """
> > +
> > + expected_text = (
> > + 'test75874, anotherbinary,\n'
> > + ' andanother, andonemore,\n'
> > + '\tlastone')
> > +
> > + self.assertEqual(
> > + expected_text,
> > + self.apt_
> > +
> > + self.assertEqual(
> > + expected_text,
> > + self.parse_
> > +
> > + def test_parse_
> > + """parse_tagfile should not leave leading or tailing '\n' when
> > + parsing newline delimited fields.
> > +
> > + Newline-delimited fields should be parsed to match
> > + apt_pkg.
> > +
> > + Note: in the past, our parse_tagfile function left the leading
> > + '\n' in the parsed value, whereas it should not have.
> > +
> > + XXX check where this value is used and ensure it won't be affected
> > + (ie. skipp...
Michael Nelson (michael.nelson) wrote : | # |
Just adding a incremental that fixes test breakages - ensuring that we
always format the changes before displaying it in emails (so it appears
exactly as it used to in emails - without the '\n .' blank line token or
initial spaces.
Julian has already reviewed the incremental and I've put it through ec2
test so it's ready for testing on dogfood.
--
Michael
1 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-accepted.txt' |
2 | --- lib/canonical/launchpad/emailtemplates/upload-accepted.txt 2008-09-18 11:08:16 +0000 |
3 | +++ lib/canonical/launchpad/emailtemplates/upload-accepted.txt 2009-10-12 09:21:07 +0000 |
4 | @@ -1,4 +1,5 @@ |
5 | %(CHANGESFILE)s |
6 | + |
7 | %(DATE)s%(CHANGEDBY)s%(MAINTAINER)s%(SIGNER)s%(ORIGIN)s |
8 | %(SPR_URL)s |
9 | |
10 | |
11 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-announcement.txt' |
12 | --- lib/canonical/launchpad/emailtemplates/upload-announcement.txt 2008-09-18 11:08:16 +0000 |
13 | +++ lib/canonical/launchpad/emailtemplates/upload-announcement.txt 2009-10-12 09:27:30 +0000 |
14 | @@ -1,3 +1,4 @@ |
15 | %(CHANGESFILE)s |
16 | + |
17 | %(DATE)s%(CHANGEDBY)s%(MAINTAINER)s%(SIGNER)s%(ORIGIN)s |
18 | %(SPR_URL)s |
19 | |
20 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-new.txt' |
21 | --- lib/canonical/launchpad/emailtemplates/upload-new.txt 2007-07-03 11:15:10 +0000 |
22 | +++ lib/canonical/launchpad/emailtemplates/upload-new.txt 2009-10-12 09:17:07 +0000 |
23 | @@ -2,6 +2,7 @@ |
24 | |
25 | %(CHANGESFILE)s |
26 | |
27 | + |
28 | Your package contains new components which requires manual editing of |
29 | the override file. It is ok otherwise, so please be patient. New |
30 | packages are usually added to the overrides about once a week. |
31 | |
32 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-rejection.txt' |
33 | --- lib/canonical/launchpad/emailtemplates/upload-rejection.txt 2008-09-26 07:34:37 +0000 |
34 | +++ lib/canonical/launchpad/emailtemplates/upload-rejection.txt 2009-10-12 09:34:50 +0000 |
35 | @@ -2,6 +2,7 @@ |
36 | %(SUMMARY)s |
37 | |
38 | %(CHANGESFILE)s |
39 | + |
40 | %(DATE)s%(CHANGEDBY)s%(MAINTAINER)s%(SIGNER)s%(ORIGIN)s |
41 | |
42 | === |
43 | |
44 | === modified file 'lib/lp/archiveuploader/changesfile.py' |
45 | --- lib/lp/archiveuploader/changesfile.py 2009-06-24 23:33:29 +0000 |
46 | +++ lib/lp/archiveuploader/changesfile.py 2009-10-12 08:33:56 +0000 |
47 | @@ -307,10 +307,24 @@ |
48 | """Return changesfile claimed version.""" |
49 | return self._dict['version'] |
50 | |
51 | + @classmethod |
52 | + def formatChangesComment(cls, comment): |
53 | + """A class utility method for formatting changes for display.""" |
54 | + |
55 | + # Return the display version of the comment using the |
56 | + # debian policy rules. First replacing the blank line |
57 | + # indicator '\n .' and then stripping one space from each |
58 | + # successive line. |
59 | + comment = comment.replace('\n .', '\n') |
60 | + comment = comment.replace('\n ', '\n') |
61 | + return comment |
62 | + |
63 | @property |
64 | def changes_comment(self): |
65 | """Return changesfile 'change' comment.""" |
66 | - return self._dict['changes'] |
67 | + comment = self._dict['changes'] |
68 | + |
69 | + return self.formatChangesComment(comment) |
70 | |
71 | @property |
72 | def date(self): |
73 | |
74 | === modified file 'lib/lp/soyuz/model/queue.py' |
75 | --- lib/lp/soyuz/model/queue.py 2009-10-01 07:05:22 +0000 |
76 | +++ lib/lp/soyuz/model/queue.py 2009-10-12 07:53:03 +0000 |
77 | @@ -33,6 +33,7 @@ |
78 | from lp.archivepublisher.config import getPubConfig |
79 | from lp.archivepublisher.customupload import CustomUploadError |
80 | from lp.archivepublisher.utils import get_ppa_reference |
81 | +from lp.archiveuploader.changesfile import ChangesFile |
82 | from lp.archiveuploader.tagfiles import parse_tagfile_lines |
83 | from lp.archiveuploader.utils import safe_fix_maintainer |
84 | from lp.buildmaster.pas import BuildDaemonPackagesArchSpecific |
85 | @@ -737,14 +738,16 @@ |
86 | """PPA rejected message.""" |
87 | template = get_email_template('ppa-upload-rejection.txt') |
88 | SUMMARY = sanitize_string(summary_text) |
89 | - CHANGESFILE = sanitize_string("".join(changes_lines)) |
90 | + CHANGESFILE = sanitize_string( |
91 | + ChangesFile.formatChangesComment("".join(changes_lines))) |
92 | USERS_ADDRESS = config.launchpad.users_address |
93 | |
94 | class RejectedMessage: |
95 | """Rejected message.""" |
96 | template = get_email_template('upload-rejection.txt') |
97 | SUMMARY = sanitize_string(summary_text) |
98 | - CHANGESFILE = sanitize_string(changes['changes']) |
99 | + CHANGESFILE = sanitize_string( |
100 | + ChangesFile.formatChangesComment(changes['changes'])) |
101 | CHANGEDBY = '' |
102 | ORIGIN = '' |
103 | SIGNER = '' |
104 | @@ -821,7 +824,8 @@ |
105 | |
106 | STATUS = "New" |
107 | SUMMARY = summarystring |
108 | - CHANGESFILE = sanitize_string(changes['changes']) |
109 | + CHANGESFILE = sanitize_string( |
110 | + ChangesFile.formatChangesComment(changes['changes'])) |
111 | DISTRO = self.distroseries.distribution.title |
112 | if announce_list: |
113 | ANNOUNCE = 'Announcing to %s' % announce_list |
114 | @@ -835,7 +839,8 @@ |
115 | STATUS = "Waiting for approval" |
116 | SUMMARY = summarystring + ( |
117 | "\nThis upload awaits approval by a distro manager\n") |
118 | - CHANGESFILE = sanitize_string(changes['changes']) |
119 | + CHANGESFILE = sanitize_string( |
120 | + ChangesFile.formatChangesComment(changes['changes'])) |
121 | DISTRO = self.distroseries.distribution.title |
122 | if announce_list: |
123 | ANNOUNCE = 'Announcing to %s' % announce_list |
124 | @@ -853,7 +858,8 @@ |
125 | |
126 | STATUS = "Accepted" |
127 | SUMMARY = summarystring |
128 | - CHANGESFILE = sanitize_string(changes['changes']) |
129 | + CHANGESFILE = sanitize_string( |
130 | + ChangesFile.formatChangesComment(changes['changes'])) |
131 | DISTRO = self.distroseries.distribution.title |
132 | if announce_list: |
133 | ANNOUNCE = 'Announcing to %s' % announce_list |
134 | @@ -871,14 +877,16 @@ |
135 | |
136 | STATUS = "Accepted" |
137 | SUMMARY = summarystring |
138 | - CHANGESFILE = guess_encoding("".join(changes_lines)) |
139 | + CHANGESFILE = guess_encoding( |
140 | + ChangesFile.formatChangesComment("".join(changes_lines))) |
141 | |
142 | class AnnouncementMessage: |
143 | template = get_email_template('upload-announcement.txt') |
144 | |
145 | STATUS = "Accepted" |
146 | SUMMARY = summarystring |
147 | - CHANGESFILE = sanitize_string(changes['changes']) |
148 | + CHANGESFILE = sanitize_string( |
149 | + ChangesFile.formatChangesComment(changes['changes'])) |
150 | CHANGEDBY = '' |
151 | ORIGIN = '' |
152 | SIGNER = '' |
Michael Nelson (michael.nelson) wrote : | # |
Hi Colin,
I'm adding you as a reviewer as apparently you've been around long enough to have a better understanding of the code in question here (our custom parse_tagfile_lines function), or at least know who else might :).
There is already a work-around that's been CP'd onto Germanium that ensures that the Source index creation of PPAs copes with the incorrectly parsed data.
This branch fixes the actual parsing error so that we won't get any more incorrectly-parsed dsc_binaries etc., ensuring that our custom parser does exactly what apt_pkg.
The problem was not restricted to Binary fields in dsc files, but all values that can span multiple lines in both dsc and changes files. From my manual searching, as well as from the current test-suite coverage, this affected only:
(1) Indexes generation for PPAs (per the bug) as we do not use apt-ftparchive,
(2) formatting of text in Queue notification emails.
but there could be places that both the test coverage and I missed.
I've created bug 440014 to completely get rid of our custom parse_tagfile_lines function (replacing it with apt_pkg.
I've also tested this branch on dogfood, details here: http://
So if you approve of this branch landing, when would you prefer to be landed? We could
1. get it cherry-pick it now and so have some time to be condident that there are no unforseen problems, or
2. land it in devel without a CP, but then it will be rolled out in the beginning of Nov - probably not ideal, or
3. wait until LP 3.1.10 has been rolled out, and Karmic is released, and commit it to devel then (perhaps CPing it). Safer - but then more corrupt data in the database.
The dsc files with Binary values spanning multiple lines have approximately doubled each week, - mainly from soures that produce many binaries (linux, mono, oo.org) being added to people's ppas:
24-09-2009 - 12
06-10-2009 - 24
13-10-2009 - 48
Thanks for your time!
Colin Watson (cjwatson) wrote : | # |
> I'm adding you as a reviewer as apparently you've been around long enough to
> have a better understanding of the code in question here (our custom
> parse_tagfile_lines function), or at least know who else might :).
Thanks for asking me to review. The algorithmic changes look OK to me. I'm not sure I know any better than you why we have our own version of this function, though. It's been around since the very first version of Soyuz was committed (see r542, although it was called parse_changes back then). Even then, apt_pkg.
Reasons I can think of why it might have needed to be a separate function:
* no other way to enforce dsc_whitespace_
* problems with case-insensitivity at the time? (http://
* apt_pkg.
> So if you approve of this branch landing, when would you prefer to be landed?
> We could
> 1. get it cherry-pick it now and so have some time to be condident that there
> are no unforseen problems, or
> 2. land it in devel without a CP, but then it will be rolled out in the
> beginning of Nov - probably not ideal, or
> 3. wait until LP 3.1.10 has been rolled out, and Karmic is released, and
> commit it to devel then (perhaps CPing it). Safer - but then more corrupt data
> in the database.
Given that it's much more likely to affect PPAs than the main archive, and as you say the amount of corrupt data is increasing, I'd be happy for this to be cherry-picked ASAP.
Preview Diff
1 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-accepted.txt' |
2 | --- lib/canonical/launchpad/emailtemplates/upload-accepted.txt 2008-09-18 11:08:16 +0000 |
3 | +++ lib/canonical/launchpad/emailtemplates/upload-accepted.txt 2009-10-15 13:21:16 +0000 |
4 | @@ -1,4 +1,5 @@ |
5 | %(CHANGESFILE)s |
6 | + |
7 | %(DATE)s%(CHANGEDBY)s%(MAINTAINER)s%(SIGNER)s%(ORIGIN)s |
8 | %(SPR_URL)s |
9 | |
10 | |
11 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-announcement.txt' |
12 | --- lib/canonical/launchpad/emailtemplates/upload-announcement.txt 2008-09-18 11:08:16 +0000 |
13 | +++ lib/canonical/launchpad/emailtemplates/upload-announcement.txt 2009-10-15 13:21:16 +0000 |
14 | @@ -1,3 +1,4 @@ |
15 | %(CHANGESFILE)s |
16 | + |
17 | %(DATE)s%(CHANGEDBY)s%(MAINTAINER)s%(SIGNER)s%(ORIGIN)s |
18 | %(SPR_URL)s |
19 | |
20 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-new.txt' |
21 | --- lib/canonical/launchpad/emailtemplates/upload-new.txt 2007-07-03 11:15:10 +0000 |
22 | +++ lib/canonical/launchpad/emailtemplates/upload-new.txt 2009-10-15 13:21:16 +0000 |
23 | @@ -2,6 +2,7 @@ |
24 | |
25 | %(CHANGESFILE)s |
26 | |
27 | + |
28 | Your package contains new components which requires manual editing of |
29 | the override file. It is ok otherwise, so please be patient. New |
30 | packages are usually added to the overrides about once a week. |
31 | |
32 | === modified file 'lib/canonical/launchpad/emailtemplates/upload-rejection.txt' |
33 | --- lib/canonical/launchpad/emailtemplates/upload-rejection.txt 2008-09-26 07:34:37 +0000 |
34 | +++ lib/canonical/launchpad/emailtemplates/upload-rejection.txt 2009-10-15 13:21:16 +0000 |
35 | @@ -2,6 +2,7 @@ |
36 | %(SUMMARY)s |
37 | |
38 | %(CHANGESFILE)s |
39 | + |
40 | %(DATE)s%(CHANGEDBY)s%(MAINTAINER)s%(SIGNER)s%(ORIGIN)s |
41 | |
42 | === |
43 | |
44 | === modified file 'lib/lp/archiveuploader/changesfile.py' |
45 | --- lib/lp/archiveuploader/changesfile.py 2009-06-24 23:33:29 +0000 |
46 | +++ lib/lp/archiveuploader/changesfile.py 2009-10-15 13:21:16 +0000 |
47 | @@ -307,10 +307,24 @@ |
48 | """Return changesfile claimed version.""" |
49 | return self._dict['version'] |
50 | |
51 | + @classmethod |
52 | + def formatChangesComment(cls, comment): |
53 | + """A class utility method for formatting changes for display.""" |
54 | + |
55 | + # Return the display version of the comment using the |
56 | + # debian policy rules. First replacing the blank line |
57 | + # indicator '\n .' and then stripping one space from each |
58 | + # successive line. |
59 | + comment = comment.replace('\n .', '\n') |
60 | + comment = comment.replace('\n ', '\n') |
61 | + return comment |
62 | + |
63 | @property |
64 | def changes_comment(self): |
65 | """Return changesfile 'change' comment.""" |
66 | - return self._dict['changes'] |
67 | + comment = self._dict['changes'] |
68 | + |
69 | + return self.formatChangesComment(comment) |
70 | |
71 | @property |
72 | def date(self): |
73 | |
74 | === modified file 'lib/lp/archiveuploader/tagfiles.py' |
75 | --- lib/lp/archiveuploader/tagfiles.py 2009-06-24 23:33:29 +0000 |
76 | +++ lib/lp/archiveuploader/tagfiles.py 2009-10-15 13:21:16 +0000 |
77 | @@ -53,19 +53,19 @@ |
78 | """Expose a dicty has_key""" |
79 | return key in self.stanza.keys() |
80 | |
81 | - """Enables (foo in bar) functionality""" |
82 | + # Enables (foo in bar) functionality. |
83 | __contains__ = has_key |
84 | |
85 | def items(self): |
86 | """Allows for k,v in foo.items()""" |
87 | - return [ (k,self.stanza[k]) for k in self.stanza.keys() ] |
88 | + return [ (k, self.stanza[k]) for k in self.stanza.keys() ] |
89 | |
90 | class TagFileParseError(Exception): |
91 | """This exception is raised if parse_changes encounters nastiness""" |
92 | pass |
93 | |
94 | -re_single_line_field = re.compile(r"^(\S*)\s*:\s*(.*)"); |
95 | -re_multi_line_field = re.compile(r"^\s(.*)"); |
96 | +re_single_line_field = re.compile(r"^(\S*)\s*:\s*(.*)") |
97 | +re_multi_line_field = re.compile(r"^(\s.*)") |
98 | |
99 | |
100 | def parse_tagfile_lines(lines, dsc_whitespace_rules=0, allow_unsigned=False, |
101 | @@ -103,7 +103,8 @@ |
102 | |
103 | num_of_lines = len(indexed_lines.keys()) |
104 | index = 0 |
105 | - first = -1 |
106 | + first_value_for_newline_delimited_field = False |
107 | + more_values_can_follow = False |
108 | while index < num_of_lines: |
109 | index += 1 |
110 | line = indexed_lines[index] |
111 | @@ -116,11 +117,15 @@ |
112 | if dsc_whitespace_rules: |
113 | index += 1 |
114 | if index > num_of_lines: |
115 | - raise TagFileParseError("%s: invalid .dsc file at line %d" % (filename, index)) |
116 | + raise TagFileParseError( |
117 | + "%s: invalid .dsc file at line %d" % ( |
118 | + filename, index)) |
119 | line = indexed_lines[index] |
120 | if not line.startswith("-----BEGIN PGP SIGNATURE"): |
121 | - raise TagFileParseError("%s: invalid .dsc file at line %d -- " |
122 | - "expected PGP signature; got '%s'" % (filename, index,line)) |
123 | + raise TagFileParseError( |
124 | + "%s: invalid .dsc file at line %d -- " |
125 | + "expected PGP signature; got '%s'" % ( |
126 | + filename, index,line)) |
127 | inside_signature = 0 |
128 | break |
129 | else: |
130 | @@ -128,8 +133,9 @@ |
131 | if line.startswith("-----BEGIN PGP SIGNATURE"): |
132 | break |
133 | |
134 | - # If we're at the start of a signed section, then consume the signature |
135 | - # information, and remember that we're inside the signed data. |
136 | + # If we're at the start of a signed section, then consume the |
137 | + # signature information, and remember that we're inside the signed |
138 | + # data. |
139 | if line.startswith("-----BEGIN PGP SIGNED MESSAGE"): |
140 | inside_signature = 1 |
141 | if dsc_whitespace_rules: |
142 | @@ -145,31 +151,60 @@ |
143 | if slf: |
144 | field = slf.groups()[0].lower() |
145 | changes[field] = slf.groups()[1] |
146 | - first = 1 |
147 | + |
148 | + # If there is no value on this line, we assume this is |
149 | + # the first line of a multiline field, such as the 'files' |
150 | + # field. |
151 | + if changes[field] == '': |
152 | + first_value_for_newline_delimited_field = True |
153 | + |
154 | + # Either way, more values for this field could follow |
155 | + # on the next line. |
156 | + more_values_can_follow = True |
157 | continue |
158 | if line.rstrip() == " .": |
159 | - changes[field] += '\n' |
160 | + changes[field] += '\n' + line |
161 | continue |
162 | mlf = re_multi_line_field.match(line) |
163 | if mlf: |
164 | - if first == -1: |
165 | - raise TagFileParseError("%s: could not parse .changes file " |
166 | - "line %d: '%s'\n [Multi-line field continuing on from nothing?]" |
167 | - % (filename, index,line)) |
168 | - if first == 1 and changes[field] != "": |
169 | - changes[field] += '\n' |
170 | - first = 0 |
171 | - changes[field] += mlf.groups()[0] + '\n' |
172 | + if more_values_can_follow is False: |
173 | + raise TagFileParseError( |
174 | + "%s: could not parse .changes file line %d: '%s'\n" |
175 | + " [Multi-line field continuing on from nothing?]" % ( |
176 | + filename, index,line)) |
177 | + |
178 | + # XXX Michael Nelson 20091001 bug=440014 |
179 | + # Is there any reason why we're not simply using |
180 | + # apt_pkg.ParseTagFile instead of this looong function. |
181 | + # If we can get rid of this code that is trying to mimic |
182 | + # what ParseTagFile does out of the box, it would be a good |
183 | + # thing. |
184 | + |
185 | + # The first value for a newline delimited field, such as |
186 | + # the 'files' field, has its leading spaces stripped. Other |
187 | + # fields (such as a 'binary' field spanning multiple lines) |
188 | + # should *not* be l-stripped of their leading spaces otherwise |
189 | + # they will be re-parsed incorrectly by apt_get.ParseTagFiles() |
190 | + # (for example, from a Source index). |
191 | + value = mlf.groups()[0] |
192 | + if first_value_for_newline_delimited_field: |
193 | + changes[field] = value.lstrip() |
194 | + else: |
195 | + changes[field] += '\n' + value |
196 | + |
197 | + first_value_for_newline_delimited_field = False |
198 | continue |
199 | error += line |
200 | |
201 | if dsc_whitespace_rules and inside_signature: |
202 | - raise TagFileParseError("%s: invalid .dsc format at line %d" % (filename, index)) |
203 | + raise TagFileParseError( |
204 | + "%s: invalid .dsc format at line %d" % (filename, index)) |
205 | |
206 | changes["filecontents"] = "".join(lines) |
207 | |
208 | if error: |
209 | - raise TagFileParseError("%s: unable to parse .changes file: %s" % (filename, error)) |
210 | + raise TagFileParseError( |
211 | + "%s: unable to parse .changes file: %s" % (filename, error)) |
212 | |
213 | return changes |
214 | |
215 | |
216 | === added file 'lib/lp/archiveuploader/tests/data/test436182_0.1_source.changes' |
217 | --- lib/lp/archiveuploader/tests/data/test436182_0.1_source.changes 1970-01-01 00:00:00 +0000 |
218 | +++ lib/lp/archiveuploader/tests/data/test436182_0.1_source.changes 2009-10-15 13:21:16 +0000 |
219 | @@ -0,0 +1,23 @@ |
220 | +Format: 1.0 |
221 | +Date: Fri, 15 Dec 2006 12:02:42 +0000 |
222 | +Source: test75874 |
223 | +Binary: test75874 anotherbinary |
224 | + andanother andonemore |
225 | + lastone |
226 | +Architecture: all |
227 | +Version: 0.1 |
228 | +Maintainer: Colin Watson <cjwatson@ubuntu.com> |
229 | +Description: Here's the single-line synopsis. |
230 | + Then there is the extended description which can |
231 | + span multiple lines, and even include blank-lines like this: |
232 | + . |
233 | + There you go. If a line starts with two or more spaces, |
234 | + it will be displayed verbatim. Like this one: |
235 | + Example verbatim line. |
236 | + Another verbatim line. |
237 | + OK, back to normal. |
238 | +Files: |
239 | + f26bb9b29b1108e53139da3584a4dc92 1511 test75874_0.1.tar.gz |
240 | + 29c955ff520cea32ab3e0316306d0ac1 393742 pmount_0.9.7.orig.tar.gz |
241 | + 91a8f46d372c406fadcb57c6ff7016f3 5302 pmount_0.9.7-2ubuntu2.diff.gz |
242 | + |
243 | |
244 | === modified file 'lib/lp/archiveuploader/tests/test_tagfiles.py' |
245 | --- lib/lp/archiveuploader/tests/test_tagfiles.py 2009-06-24 23:33:29 +0000 |
246 | +++ lib/lp/archiveuploader/tests/test_tagfiles.py 2009-10-15 13:21:16 +0000 |
247 | @@ -5,23 +5,18 @@ |
248 | |
249 | # arch-tag: 52e0c871-49a3-4186-beb8-9817d02d5465 |
250 | |
251 | +import apt_pkg |
252 | import unittest |
253 | -import sys |
254 | -import shutil |
255 | from lp.archiveuploader.tests import datadir |
256 | |
257 | +from lp.archiveuploader.tagfiles import ( |
258 | + parse_tagfile, TagFile, TagFileParseError) |
259 | + |
260 | class Testtagfiles(unittest.TestCase): |
261 | |
262 | - def testImport(self): |
263 | - """lp.archiveuploader.tagfiles should be importable""" |
264 | - from lp.archiveuploader.tagfiles import TagFile |
265 | - from lp.archiveuploader.tagfiles import TagFileParseError |
266 | - from lp.archiveuploader.tagfiles import parse_tagfile |
267 | - |
268 | def testTagFileOnSingular(self): |
269 | """lp.archiveuploader.tagfiles.TagFile should parse a singular stanza |
270 | """ |
271 | - from lp.archiveuploader.tagfiles import TagFile |
272 | f = TagFile(file(datadir("singular-stanza"), "r")) |
273 | seenone = False |
274 | for stanza in f: |
275 | @@ -32,8 +27,7 @@ |
276 | self.assertEquals("FooBar" in stanza, False) |
277 | |
278 | def testTagFileOnSeveral(self): |
279 | - """lp.archiveuploader.tagfiles.TagFile should parse multiple stanzas""" |
280 | - from lp.archiveuploader.tagfiles import TagFile |
281 | + """TagFile should parse multiple stanzas.""" |
282 | f = TagFile(file(datadir("multiple-stanzas"), "r")) |
283 | seen = 0 |
284 | for stanza in f: |
285 | @@ -47,15 +41,12 @@ |
286 | """lp.archiveuploader.tagfiles.parse_tagfile should work on a good |
287 | changes file |
288 | """ |
289 | - from lp.archiveuploader.tagfiles import parse_tagfile |
290 | p = parse_tagfile(datadir("good-signed-changes")) |
291 | |
292 | def testCheckParseBadChangesRaises(self): |
293 | """lp.archiveuploader.tagfiles.parse_chantges should raise |
294 | TagFileParseError on failure |
295 | """ |
296 | - from lp.archiveuploader.tagfiles import parse_tagfile |
297 | - from lp.archiveuploader.tagfiles import TagFileParseError |
298 | self.assertRaises(TagFileParseError, |
299 | parse_tagfile, datadir("badformat-changes"), 1) |
300 | |
301 | @@ -63,8 +54,6 @@ |
302 | """lp.archiveuploader.tagfiles.parse_chantges should raise |
303 | TagFileParseError on empty |
304 | """ |
305 | - from lp.archiveuploader.tagfiles import parse_tagfile |
306 | - from lp.archiveuploader.tagfiles import TagFileParseError |
307 | self.assertRaises(TagFileParseError, |
308 | parse_tagfile, datadir("empty-file"), 1) |
309 | |
310 | @@ -72,16 +61,12 @@ |
311 | """lp.archiveuploader.tagfiles.parse_chantges should raise |
312 | TagFileParseError on malformed signatures |
313 | """ |
314 | - from lp.archiveuploader.tagfiles import parse_tagfile |
315 | - from lp.archiveuploader.tagfiles import TagFileParseError |
316 | self.assertRaises(TagFileParseError, |
317 | parse_tagfile, datadir("malformed-sig-changes"), 1) |
318 | |
319 | def testCheckParseMalformedMultilineRaises(self): |
320 | """lp.archiveuploader.tagfiles.parse_chantges should raise |
321 | TagFileParseError on malformed continuation lines""" |
322 | - from lp.archiveuploader.tagfiles import parse_tagfile |
323 | - from lp.archiveuploader.tagfiles import TagFileParseError |
324 | self.assertRaises(TagFileParseError, |
325 | parse_tagfile, datadir("bad-multiline-changes"), 1) |
326 | |
327 | @@ -89,8 +74,6 @@ |
328 | """lp.archiveuploader.tagfiles.parse_chantges should raise |
329 | TagFileParseError on unterminated signatures |
330 | """ |
331 | - from lp.archiveuploader.tagfiles import parse_tagfile |
332 | - from lp.archiveuploader.tagfiles import TagFileParseError |
333 | self.assertRaises(TagFileParseError, |
334 | parse_tagfile, |
335 | datadir("unterminated-sig-changes"), |
336 | @@ -100,25 +83,106 @@ |
337 | """lp.archiveuploader.tagfiles.parse_tagfile should not be vulnerable |
338 | to tags outside of the signed portion |
339 | """ |
340 | - from lp.archiveuploader.tagfiles import parse_tagfile |
341 | tf = parse_tagfile(datadir("changes-with-exploit-top")) |
342 | self.assertRaises(KeyError, tf.__getitem__, "you") |
343 | tf = parse_tagfile(datadir("changes-with-exploit-bottom")) |
344 | self.assertRaises(KeyError, tf.__getitem__, "you") |
345 | |
346 | + |
347 | +class TestTagFileDebianPolicyCompat(unittest.TestCase): |
348 | + |
349 | + def setUp(self): |
350 | + """Parse the test file using apt_pkg for comparison.""" |
351 | + |
352 | + tagfile_path = datadir("test436182_0.1_source.changes") |
353 | + tagfile = open(tagfile_path) |
354 | + self.apt_pkg_parsed_version = apt_pkg.ParseTagFile(tagfile) |
355 | + self.apt_pkg_parsed_version.Step() |
356 | + |
357 | + self.parse_tagfile_version = parse_tagfile( |
358 | + tagfile_path, allow_unsigned = True) |
359 | + |
360 | + def test_parse_tagfile_with_multiline_values(self): |
361 | + """parse_tagfile should not leave trailing '\n' on multiline values. |
362 | + |
363 | + This is a regression test for bug 436182. |
364 | + Previously we, |
365 | + 1. Stripped leading space/tab from subsequent lines of multiline |
366 | + values, and |
367 | + 2. appended a trailing '\n' to the end of the value. |
368 | + """ |
369 | + |
370 | + expected_text = ( |
371 | + 'test75874 anotherbinary\n' |
372 | + ' andanother andonemore\n' |
373 | + '\tlastone') |
374 | + |
375 | + self.assertEqual( |
376 | + expected_text, |
377 | + self.apt_pkg_parsed_version.Section['Binary']) |
378 | + |
379 | + self.assertEqual( |
380 | + expected_text, |
381 | + self.parse_tagfile_version['binary']) |
382 | + |
383 | + def test_parse_tagfile_with_newline_delimited_field(self): |
384 | + """parse_tagfile should not leave leading or tailing '\n' when |
385 | + parsing newline delimited fields. |
386 | + |
387 | + Newline-delimited fields should be parsed to match |
388 | + apt_pkg.ParseTageFile. |
389 | + |
390 | + Note: in the past, our parse_tagfile function left the leading |
391 | + '\n' in the parsed value, whereas it should not have. |
392 | + |
393 | + For an example, |
394 | + see http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Files |
395 | + """ |
396 | + |
397 | + expected_text = ( |
398 | + 'f26bb9b29b1108e53139da3584a4dc92 1511 test75874_0.1.tar.gz\n ' |
399 | + '29c955ff520cea32ab3e0316306d0ac1 393742 ' |
400 | + 'pmount_0.9.7.orig.tar.gz\n' |
401 | + ' 91a8f46d372c406fadcb57c6ff7016f3 5302 ' |
402 | + 'pmount_0.9.7-2ubuntu2.diff.gz') |
403 | + |
404 | + self.assertEqual( |
405 | + expected_text, |
406 | + self.apt_pkg_parsed_version.Section['Files']) |
407 | + |
408 | + self.assertEqual( |
409 | + expected_text, |
410 | + self.parse_tagfile_version['files']) |
411 | + |
412 | + def test_parse_description_field(self): |
413 | + """Apt-pkg preserves the blank-line indicator and does not strip |
414 | + leading spaces. |
415 | + |
416 | + See http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description |
417 | + """ |
418 | + expected_text = ( |
419 | + "Here's the single-line synopsis.\n" |
420 | + " Then there is the extended description which can\n" |
421 | + " span multiple lines, and even include blank-lines like this:\n" |
422 | + " .\n" |
423 | + " There you go. If a line starts with two or more spaces,\n" |
424 | + " it will be displayed verbatim. Like this one:\n" |
425 | + " Example verbatim line.\n" |
426 | + " Another verbatim line.\n" |
427 | + " OK, back to normal.") |
428 | + |
429 | + self.assertEqual( |
430 | + expected_text, |
431 | + self.apt_pkg_parsed_version.Section['Description']) |
432 | + |
433 | + # In the past our parse_tagfile function replaced blank-line |
434 | + # indicators in the description (' .\n') with new lines ('\n'), |
435 | + # but it is now compatible with ParseTagFiles (and ready to be |
436 | + # replaced by ParseTagFiles). |
437 | + self.assertEqual( |
438 | + expected_text, |
439 | + self.parse_tagfile_version['description']) |
440 | + |
441 | def test_suite(): |
442 | - suite = unittest.TestSuite() |
443 | - loader = unittest.TestLoader() |
444 | - suite.addTest(loader.loadTestsFromTestCase(Testtagfiles)) |
445 | - return suite |
446 | - |
447 | -def main(argv): |
448 | - suite = test_suite() |
449 | - runner = unittest.TextTestRunner(verbosity = 2) |
450 | - if not runner.run(suite).wasSuccessful(): |
451 | - return 1 |
452 | - return 0 |
453 | - |
454 | -if __name__ == '__main__': |
455 | - sys.exit(main(sys.argv)) |
456 | + return unittest.TestLoader().loadTestsFromName(__name__) |
457 | |
458 | |
459 | === modified file 'lib/lp/soyuz/model/queue.py' |
460 | --- lib/lp/soyuz/model/queue.py 2009-10-09 11:13:30 +0000 |
461 | +++ lib/lp/soyuz/model/queue.py 2009-10-15 13:21:16 +0000 |
462 | @@ -33,6 +33,7 @@ |
463 | from lp.archivepublisher.config import getPubConfig |
464 | from lp.archivepublisher.customupload import CustomUploadError |
465 | from lp.archivepublisher.utils import get_ppa_reference |
466 | +from lp.archiveuploader.changesfile import ChangesFile |
467 | from lp.archiveuploader.tagfiles import parse_tagfile_lines |
468 | from lp.archiveuploader.utils import safe_fix_maintainer |
469 | from lp.buildmaster.pas import BuildDaemonPackagesArchSpecific |
470 | @@ -747,14 +748,16 @@ |
471 | """PPA rejected message.""" |
472 | template = get_email_template('ppa-upload-rejection.txt') |
473 | SUMMARY = sanitize_string(summary_text) |
474 | - CHANGESFILE = sanitize_string("".join(changes_lines)) |
475 | + CHANGESFILE = sanitize_string( |
476 | + ChangesFile.formatChangesComment("".join(changes_lines))) |
477 | USERS_ADDRESS = config.launchpad.users_address |
478 | |
479 | class RejectedMessage: |
480 | """Rejected message.""" |
481 | template = get_email_template('upload-rejection.txt') |
482 | SUMMARY = sanitize_string(summary_text) |
483 | - CHANGESFILE = sanitize_string(changes['changes']) |
484 | + CHANGESFILE = sanitize_string( |
485 | + ChangesFile.formatChangesComment(changes['changes'])) |
486 | CHANGEDBY = '' |
487 | ORIGIN = '' |
488 | SIGNER = '' |
489 | @@ -831,7 +834,8 @@ |
490 | |
491 | STATUS = "New" |
492 | SUMMARY = summarystring |
493 | - CHANGESFILE = sanitize_string(changes['changes']) |
494 | + CHANGESFILE = sanitize_string( |
495 | + ChangesFile.formatChangesComment(changes['changes'])) |
496 | DISTRO = self.distroseries.distribution.title |
497 | if announce_list: |
498 | ANNOUNCE = 'Announcing to %s' % announce_list |
499 | @@ -845,7 +849,8 @@ |
500 | STATUS = "Waiting for approval" |
501 | SUMMARY = summarystring + ( |
502 | "\nThis upload awaits approval by a distro manager\n") |
503 | - CHANGESFILE = sanitize_string(changes['changes']) |
504 | + CHANGESFILE = sanitize_string( |
505 | + ChangesFile.formatChangesComment(changes['changes'])) |
506 | DISTRO = self.distroseries.distribution.title |
507 | if announce_list: |
508 | ANNOUNCE = 'Announcing to %s' % announce_list |
509 | @@ -863,7 +868,8 @@ |
510 | |
511 | STATUS = "Accepted" |
512 | SUMMARY = summarystring |
513 | - CHANGESFILE = sanitize_string(changes['changes']) |
514 | + CHANGESFILE = sanitize_string( |
515 | + ChangesFile.formatChangesComment(changes['changes'])) |
516 | DISTRO = self.distroseries.distribution.title |
517 | if announce_list: |
518 | ANNOUNCE = 'Announcing to %s' % announce_list |
519 | @@ -881,14 +887,16 @@ |
520 | |
521 | STATUS = "Accepted" |
522 | SUMMARY = summarystring |
523 | - CHANGESFILE = guess_encoding("".join(changes_lines)) |
524 | + CHANGESFILE = guess_encoding( |
525 | + ChangesFile.formatChangesComment("".join(changes_lines))) |
526 | |
527 | class AnnouncementMessage: |
528 | template = get_email_template('upload-announcement.txt') |
529 | |
530 | STATUS = "Accepted" |
531 | SUMMARY = summarystring |
532 | - CHANGESFILE = sanitize_string(changes['changes']) |
533 | + CHANGESFILE = sanitize_string( |
534 | + ChangesFile.formatChangesComment(changes['changes'])) |
535 | CHANGEDBY = '' |
536 | ORIGIN = '' |
537 | SIGNER = '' |
= Summary =
This branch fixes a soyuz parsing error when a debian source control
file is uploaded.
The error is described in bug 436182 (although, somewhat confusedly...
as I was discovering the issues as I went).
To re-cap, soyuz has a function for parsing tag files (such as debian
source control files - dsc, or changes files), which should adhere to
http:// www.debian. org/doc/ debian- policy/ ch-controlfield s.html
but it doesn't in a number of cases, in particular one case which
started occurring recently - fields containing values that span multiple
lines.
The original issue that is noted in bug 436182 is that, when a value for ParseTagFile) to fail.
a field spans multiple lines, our parser was leaving a trailing '\n' on
the parsed value - which caused the Source index to be created with a
*blank* line (as shown in the original snippet in the bug description).
This is completely incorrect and causes any reparsing of this file by
standard libraries (such as apt_pkg.
While creating a workaround for this a second issue was discovered: our
parser removes the required space/tab from subsequent lines of a value
that spans multiple lines. According to the policy:
"Many fields' values may span several lines; in this case each
continuation line must start with a space or a tab."
This issue can also be seen in the example snippet in the bug ParseTagFile which will ignore
description. The effect that a Source generated from this incorrectly
parsed data will be reparsed by apt_pgk.
the subsequent lines that are not preceded with space or tab.
== Proposed fix ==
I've already landed a work-around that will handle the corrupt data in bazaar. launchpad. net/~launchpad- pqm/launchpad/ devel/revision/ 9606 bazaar. launchpad. net/~launchpad- pqm/launchpad/ devel/revision/ 9654
the database (of which there are 27 as of 2009-10-06) when generating
the index files (see
http://
and
http://
) and will get this cherry-picked.
This branch itself ensures that our parse_tagfiles function parses the ParseTagFile( ). ParseTagFile( ) - if that's an
three different types of multi-line values available in tag files (a
new-line-delimited value, a value spanning multiple lines and a
description) with exactly the same result as apt_pkg.
Part of the reason for this is to make it easier to later remove the
function and replace it with apt_pkg.
option (I need to talk with Colin or someone who knows the history
regarding why we've created our own parsing function).
== Pre-implementation notes ==
== Implementation details ==
== Tests ==
bin/test -vv -t Testtagfiles -t TestTagFileDebi anPolicyCompat
== Demo and Q/A ==
I'll be QA'ing this on DF.
= Launchpad lint =
Checking for conflicts. and issues in doctests and templates.
Running jslint, xmllint, pyflakes, and pylint.
Using normal rules.
Wow - ok, so I guess this file hasn't been touched since our python
style guide came into effect ;).
I've fixed all the lint issues below.
Linting changed files: archiveuploader /tagfiles. py archiveuploader /tests/ test_tagfiles. py archiveuploader /tests/ data/test436182 _0.1_source. changes
lib/lp/
lib/lp/
lib/lp/
== Pyflakes notices ==
lib/lp/ archiveuploader /tests/ test_tagfiles. py
10: 'sys'...