Merge lp:~jml/pkgme/format-description into lp:pkgme

Proposed by Jonathan Lange
Status: Merged
Approved by: James Westby
Approved revision: 103
Merged at revision: 94
Proposed branch: lp:~jml/pkgme/format-description
Merge into: lp:pkgme
Diff against target: 195 lines (+164/-0)
2 files modified
pkgme/info_elements.py (+49/-0)
pkgme/tests/test_info_elements.py (+115/-0)
To merge this branch: bzr merge lp:~jml/pkgme/format-description
Reviewer Review Type Date Requested Status
James Westby Approve
Review via email: mp+86558@code.launchpad.net

Commit message

Auto-format description when it doesn't appear to be correctly formatted.

Description of the change

Gives Description powers of automatically formatting incoming descriptions so they fit into the Debian control file.

Implementation has been guided by http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description, which I assume is the relevant standard.

In distinguishing between auto-formatted descriptions and ones that we leave alone, I felt it appropriate to add error checking for the ones we leave as-is. This means that callers of Description.clean() can rely on getting a proper description.

I'm raising user-friendly errors when the description is invalid, but I'm not 100% sure whether there would be a more appropriate way of raising them.

As an alternative strategy, when we find errors we could instead go back and treat the whole string as one that needs auto-formatting.

To post a comment you must log in.
Revision history for this message
James Westby (james-w) wrote :

Hi,

I think this test shows undesirable behaviour:

162 + def test_non_indented_further_lines(self):

In the binary backend we're going to want to take the description from MyApps
and hand it over to this code.

If the developer enters a space at the start of the first line then it will
error, and there's nothing we can do except lstrip() the description.

I think that a better check for whether to act on it is whether there is
a space at the start of every line. You would still get in to trouble if
the developer put a space at the start of every line but left blank lines,
but that will be rarer.

I think this is what you meant by

  As an alternative strategy, when we find errors we could instead go back and treat the whole string as one that needs auto-formatting.

but I'm of course willing to be convinced otherwise.

Thanks,

James

review: Needs Fixing
lp:~jml/pkgme/format-description updated
102. By Jonathan Lange

Don't raise an error on unindented lines.

103. By Jonathan Lange

Just fix up wrong blank lines too.

Revision history for this message
Jonathan Lange (jml) wrote :

On Wed, Dec 21, 2011 at 4:23 PM, James Westby <email address hidden> wrote:
> Review: Needs Fixing
>
> Hi,
>
> I think this test shows undesirable behaviour:
>
> 162     + def test_non_indented_further_lines(self):
>
> In the binary backend we're going to want to take the description from MyApps
> and hand it over to this code.
>
> If the developer enters a space at the start of the first line then it will
> error, and there's nothing we can do except lstrip() the description.
>
> I think that a better check for whether to act on it is whether there is
> a space at the start of every line. You would still get in to trouble if
> the developer put a space at the start of every line but left blank lines,
> but that will be rarer.
>

I've changed this to not do a check at all, but to clean up the
description regardless. In this case, "clean" means to change any
blank lines to " ." and to indent any lines without indentation.

jml

Revision history for this message
James Westby (james-w) wrote :

Looks good to me, thanks for the changes.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'pkgme/info_elements.py'
2--- pkgme/info_elements.py 2011-08-10 14:33:16 +0000
3+++ pkgme/info_elements.py 2012-01-09 13:11:24 +0000
4@@ -164,12 +164,61 @@
5 description = """The dependencies of the binary package."""
6
7
8+def chomp(text):
9+ """Remove the trailing newline if present.
10+
11+ Named after the Perl built-in that does the same thing.
12+ """
13+ if text and text[-1] == '\n':
14+ return text[:-1]
15+ return text
16+
17+
18 class Description(InfoElement):
19
20 name = "description"
21 description = "The description of the package."
22 default = "a package"
23
24+ @classmethod
25+ def _format_lines(cls, lines):
26+ first_line, lines = lines[0], lines[1:]
27+ yield first_line
28+ for i, line in enumerate(lines):
29+ is_blank = bool(not line.strip())
30+ if is_blank:
31+ yield ' .'
32+ elif line[0] == ' ':
33+ yield line
34+ else:
35+ yield ' ' + line
36+
37+ @classmethod
38+ def clean(cls, value):
39+ """Prepare a string for the Description field in the control file.
40+
41+ If ``value`` looks to already be formatted according to Debian policy,
42+ then we pass it through. If it does not, then we do our best to clean
43+ it up.
44+
45+ We determine whether ``value`` is already Debian-formatted by checking
46+ whether the second line is indented by a space. If so, then we assume
47+ it's Debian-formatted. If later lines break Debian formatting, then
48+ we raise a ValueError.
49+
50+ See http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description
51+ for the rules governing the Description field.
52+
53+ :param value: A string to go into the Description field of a control
54+ file.
55+ :raise ValueError: When the string initially looks to be Debian-
56+ formatted but actually is not.
57+ :return: A correctly-formatted version of that string.
58+ """
59+ if not value.strip():
60+ return ''
61+ return '\n'.join(cls._format_lines(value.strip().splitlines()))
62+
63
64 class Version(InfoElement):
65
66
67=== modified file 'pkgme/tests/test_info_elements.py'
68--- pkgme/tests/test_info_elements.py 2011-08-10 14:33:16 +0000
69+++ pkgme/tests/test_info_elements.py 2012-01-09 13:11:24 +0000
70@@ -4,6 +4,7 @@
71
72 from pkgme.info_elements import (
73 BuildDepends,
74+ Description,
75 Distribution,
76 ExtraFiles,
77 InfoElement,
78@@ -132,3 +133,117 @@
79 line.strip().split('=') for line in fp.readlines())
80 self.assertEqual(
81 Distribution.get_default(), release_keys['DISTRIB_CODENAME'])
82+
83+
84+class DescriptionTestCase(TestCase):
85+
86+ # http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-f-Description
87+
88+ def test_single_line(self):
89+ # If the description is just a single line with no line breaks, then
90+ # that counts as the synopsis and no special formatting is needed.
91+ description = "single line description"
92+ cleaned = Description.clean(description)
93+ self.assertEqual(description, cleaned)
94+
95+ def test_single_line_trailing_newline(self):
96+ # If the single line ends with a newline then we remove that newline,
97+ # as it's not needed for the Description in the template.
98+ description = "single line description\n"
99+ cleaned = Description.clean(description)
100+ self.assertEqual(description.rstrip(), cleaned)
101+
102+ def test_indented_further_lines(self):
103+ # If the incoming description is already in the format needed by the
104+ # control file, then we do nothing to it.
105+ description = """initial synopsis
106+ follow-up line
107+ preformatted text
108+ more preformatted text
109+ more information
110+"""
111+ cleaned = Description.clean(description)
112+ self.assertEqual(description.rstrip(), cleaned)
113+
114+ def test_non_indented_further_lines(self):
115+ # If the incoming description *looks* like it's in correct Debian
116+ # format (i.e. has a leading space in the second line), then we assume
117+ # that they don't know what they are doing and reformat the lines
118+ # properly.
119+ bad_description = """\
120+initial synopsis
121+ follow-up line
122+more information
123+"""
124+ description = Description.clean(bad_description)
125+ self.assertEqual("""\
126+initial synopsis
127+ follow-up line
128+ more information""", description)
129+
130+ def test_bad_literal_description(self):
131+ # If the incoming description *looks* like it's in correct Debian
132+ # format (i.e. has a leading space in the second line), but then
133+ # contains actual blank lines rather than lines with '.', then we
134+ # raise an error.
135+ bad_description = """initial synopsis
136+ follow-up line
137+
138+ more information
139+"""
140+ description = Description.clean(bad_description)
141+ self.assertEqual("""initial synopsis
142+ follow-up line
143+ .
144+ more information""", description)
145+
146+ def test_multiple_lines(self):
147+ # If the description appears to be out of the Debian format and spans
148+ # multiple lines, then we indent those lines by a single space,
149+ # treating them as paragraph text that will be word-wrapped.
150+ description = """initial synopsis
151+follow-up line
152+more information
153+"""
154+ cleaned = Description.clean(description)
155+ self.assertEqual("""initial synopsis
156+ follow-up line
157+ more information""", cleaned)
158+
159+ def test_blank_lines(self):
160+ # If the description appears to be out of the Debian format and
161+ # contains blank lines, we mark those blank lines with a single '.'.
162+ description = """initial synopsis
163+follow-up line
164+
165+more information
166+
167+that previous line has spaces in it, but needs to be marked
168+up as a blank line all the same
169+"""
170+ cleaned = Description.clean(description)
171+ self.assertEqual("""initial synopsis
172+ follow-up line
173+ .
174+ more information
175+ .
176+ that previous line has spaces in it, but needs to be marked
177+ up as a blank line all the same""", cleaned)
178+
179+ def test_empty_description(self):
180+ # If the description is empty, then we return an empty description.
181+ self.assertEqual('', Description.clean(''))
182+ self.assertEqual('', Description.clean('\n'))
183+ self.assertEqual('', Description.clean(' '))
184+
185+ def test_initial_blank_lines(self):
186+ description = """
187+
188+initial synopsis
189+follow-up line
190+more information
191+"""
192+ cleaned = Description.clean(description)
193+ self.assertEqual("""initial synopsis
194+ follow-up line
195+ more information""", cleaned)

Subscribers

People subscribed via source and target branches