Merge lp:~mbp/launchpad/391780-markdown into lp:launchpad
- 391780-markdown
- Merge into devel
Status: | Merged |
---|---|
Approved by: | Martin Pool |
Approved revision: | no longer in the source branch. |
Merged at revision: | 14384 |
Proposed branch: | lp:~mbp/launchpad/391780-markdown |
Merge into: | lp:launchpad |
Diff against target: |
225 lines (+90/-4) 7 files modified
lib/lp/app/browser/stringformatter.py (+26/-1) lib/lp/app/browser/tests/test_stringformatter.py (+50/-0) lib/lp/registry/browser/person.py (+3/-1) lib/lp/registry/templates/product-index.pt (+3/-2) lib/lp/services/features/flags.py (+6/-0) setup.py (+1/-0) versions.cfg (+1/-0) |
To merge this branch: | bzr merge lp:~mbp/launchpad/391780-markdown |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Raphaël Badin (community) | Approve | ||
Review via email: mp+82832@code.launchpad.net |
Commit message
[r=rvb][bug=391780][incr] Markdown markup in project and user home pages
Description of the change
Support Markdown markup in project and user home pages, per bug 391780.
Martin Pool (mbp) wrote : | # |
thanks for all these reviews.
this is not quite ready to land, with some issues still discussed on the bug.
Thank you for timing this - of course we have to do that first.
That's a good point that constructing the instance seems expensive. I
was worried by your timings until I realized you're actually measuring
1000 iterations. On my machine:
"a *b* a" is 0.246ms
10 plain text lines - 0.34ms
100 indented code lines - 0.4ms
100 lines with inline emphasis and links - 29ms
1000 lines with markup - 306ms
So for really large amounts of text it is starting to eat up a fair
amount of our render budget, but for more realistic amounts it will be
ok. Perhaps in a future release we can render on the client (if the
client is smart) and then actually take net load off the server.
Alexander Regueiro (alexreg) wrote : | # |
I'm not sure the efficiency here is a big problem, though it does intrigue me. Which Python library are you using for Markdown rendering? There exist efficient C libraries, if it would make a difference.
Huw Wilkins (huwshimi) wrote : | # |
@mbp one thing we'll need to do is style this content (for all the different types of elements Markdown generates). The consideration here is that this is a user controlled content block. We'll want to style the elements here slightly differently to how we style the same elements elsewhere (or at least have the ability to do so).
I guess to make that happen it would be easiest if the element wrapping the markdown content had the class "markdown".
Martin Pool (mbp) wrote : | # |
alexreg python-markdown 2.0.3.
i think we can safely keep the option of a C rewrite in our pocket until later.
Martin Pool (mbp) wrote : | # |
I think what still needs to be done before the first tranche can land is:
* make links in the output be rel=nofollow
and good to in the first landing or soon after
* put some kind of css marker on markdown formatted content
* update to Markdown 2.1beta so we can have nl2br
* add patterns for things that are currently linkified by launchpad (bug N etc)
Alexander Regueiro (alexreg) wrote : | # |
Fair enough. Let's wait until efficiency proves itself as a problem in that case.
All four points seem quite sensible, though I think the last one won't prove too useful until we get wiki support in Launchpad (which I hope to have a go at soon). On project summary pages -- the only current application afaik -- there's no rush probably.
Martin Pool (mbp) wrote : | # |
Linkifying text like 'bug 391780' is important otherwise this will be a regression compared to the way text is currently displayed. Though, it is probably fairly unlikely that kind of text occurs in project or person home pages, so perhaps it's not crucial. It would be important for bug comments.
Martin Pool (mbp) wrote : | # |
When I tested again with Markdown 2.1.0 (now in lp-sourcedeps), I see a bit of difference in creating a formatter object every time vs reusing it, but not such a drastic difference that I think we have to worry about it right now::
mbp@joy% ./bin/py -m timeit -r10 -s 'import markdown' 'markdown.
1000 loops, best of 10: 1.31 msec per loop
mbp@joy% ./bin/py -m timeit -r10 -s 'import markdown;md =markdown.
1000 loops, best of 10: 1.07 msec per loop
so about .24ms overhead per markdown block that we format.
Martin Pool (mbp) wrote : | # |
It looks like <a rel=nofollow> is going to require development of a new extension, which could probably usefully be sent upstream. So I think it's ok to deploy this as a limited beta, without that - it only makes a difference to search engines, they won't be able to see it while it's hidden, and we can see if there are any other issues.
Raphaël Badin (rvb) wrote : | # |
> It looks like <a rel=nofollow> is going to require development of a new
> extension, which could probably usefully be sent upstream. So I think it's ok
> to deploy this as a limited beta, without that - it only makes a difference to
> search engines, they won't be able to see it while it's hidden, and we can see
> if there are any other issues.
I understood your goal was to test this on real data… since it's all behind a FF I don't see why you should not land this and see what happens.
Raphaël Badin (rvb) wrote : | # |
> When I tested again with Markdown 2.1.0 (now in lp-sourcedeps), I see a bit of
> so about .24ms overhead per markdown block that we format.
Right, the absolute number is not so great but it is still a 30% difference. On the other hand, maybe the difference will be even smaller when rendering more complex markup.
Martin Pool (mbp) wrote : | # |
2011/11/25 Raphaël Badin <email address hidden>:
>> When I tested again with Markdown 2.1.0 (now in lp-sourcedeps), I see a bit of
>> so about .24ms overhead per markdown block that we format.
>
> Right, the absolute number is not so great but it is still a 30% difference. On the other hand, maybe the difference will be even smaller when rendering more complex markup.
I think the relevant calculation is not to divide by the render time,
but rather to multiply by the number of blocks on the page. With this
branch, it's at most one, and 0.24ms is negligible.
I think the most we would forseeably get to is a heavily commented bug
or mp with say 100 comments, all using md, and then it would be 24ms
which is probably still not too bad, considering such a page probably
has a >2000ms render time today.
If eventually we want md even for very fine grained items we might
have to work out something else.
Anyhow, thanks for not blocking initial landing.
--
Martin
Raphaël Badin (rvb) wrote : | # |
> I think the relevant calculation is not to divide by the render time,
> but rather to multiply by the number of blocks on the page. With this
> branch, it's at most one, and 0.24ms is negligible.
>
> I think the most we would forseeably get to is a heavily commented bug
> or mp with say 100 comments, all using md, and then it would be 24ms
> which is probably still not too bad, considering such a page probably
> has a >2000ms render time today.
Good point. I've been working on optimizing queries and stuff for the past 3 weeks and this is fu***** with my brain a little I must say :).
> Anyhow, thanks for not blocking initial landing.
That's what FF are for: landing semi finished stuff to see how it works in production ;)
Anyway, thanks for pushing this forward Martin.
Preview Diff
1 | === modified file 'lib/lp/app/browser/stringformatter.py' |
2 | --- lib/lp/app/browser/stringformatter.py 2011-11-22 06:54:05 +0000 |
3 | +++ lib/lp/app/browser/stringformatter.py 2011-11-25 08:27:29 +0000 |
4 | @@ -22,6 +22,7 @@ |
5 | import re |
6 | from lxml import html |
7 | from xml.sax.saxutils import unescape as xml_unescape |
8 | +import markdown |
9 | |
10 | from zope.component import getUtility |
11 | from zope.interface import implements |
12 | @@ -39,6 +40,9 @@ |
13 | re_email_address, |
14 | obfuscate_email, |
15 | ) |
16 | +from lp.services.features import ( |
17 | + getFeatureFlag, |
18 | + ) |
19 | |
20 | |
21 | def escape(text, quote=True): |
22 | @@ -556,7 +560,8 @@ |
23 | 0*(?P<bugnum>\d+) |
24 | ) | |
25 | (?P<faq> |
26 | - \bfaq(?:[\s=-]|<br\s*/>)*(?:\#|item|number?|num\.?|no\.?)?(?:[\s=-]|<br\s*/>)* |
27 | + \bfaq(?:[\s=-]|<br\s*/>)*(?:\#|item|number?|num\.?|no\.?)? |
28 | + (?:[\s=-]|<br\s*/>)* |
29 | 0*(?P<faqnum>\d+) |
30 | ) | |
31 | (?P<oops> |
32 | @@ -982,6 +987,12 @@ |
33 | url = root_url + self._stringtoformat |
34 | return '<a href="%s">%s</a>' % (url, self._stringtoformat) |
35 | |
36 | + def markdown(self): |
37 | + if getFeatureFlag('markdown.enabled'): |
38 | + return format_markdown(self._stringtoformat) |
39 | + else: |
40 | + return self.text_to_html() |
41 | + |
42 | def traverse(self, name, furtherPath): |
43 | if name == 'nl_to_br': |
44 | return self.nl_to_br() |
45 | @@ -991,6 +1002,8 @@ |
46 | return self.lower() |
47 | elif name == 'break-long-words': |
48 | return self.break_long_words() |
49 | + elif name == 'markdown': |
50 | + return self.markdown() |
51 | elif name == 'text-to-html': |
52 | return self.text_to_html() |
53 | elif name == 'text-to-html-with-target': |
54 | @@ -1033,3 +1046,15 @@ |
55 | return self.oops_id() |
56 | else: |
57 | raise TraversalError(name) |
58 | + |
59 | + |
60 | +def format_markdown(text): |
61 | + """Return html form of marked-up text.""" |
62 | + # This returns whole paragraphs (in p tags), similarly to text_to_html. |
63 | + md = markdown.Markdown( |
64 | + safe_mode='escape', |
65 | + extensions=[ |
66 | + 'tables', |
67 | + 'nl2br', |
68 | + ]) |
69 | + return md.convert(text) # How easy was that? |
70 | |
71 | === modified file 'lib/lp/app/browser/tests/test_stringformatter.py' |
72 | --- lib/lp/app/browser/tests/test_stringformatter.py 2011-08-28 07:29:11 +0000 |
73 | +++ lib/lp/app/browser/tests/test_stringformatter.py 2011-11-25 08:27:29 +0000 |
74 | @@ -9,6 +9,11 @@ |
75 | from textwrap import dedent |
76 | import unittest |
77 | |
78 | +from testtools.matchers import ( |
79 | + Equals, |
80 | + Matcher, |
81 | + ) |
82 | + |
83 | from zope.component import getUtility |
84 | |
85 | from canonical.config import config |
86 | @@ -20,6 +25,7 @@ |
87 | linkify_bug_numbers, |
88 | ) |
89 | from lp.testing import TestCase |
90 | +from lp.services.features.testing import FeatureFixture |
91 | |
92 | |
93 | def test_split_paragraphs(): |
94 | @@ -401,6 +407,50 @@ |
95 | expected_string, formatted_string)) |
96 | |
97 | |
98 | +class MarksDownAs(Matcher): |
99 | + |
100 | + def __init__(self, expected_html): |
101 | + self.expected_html = expected_html |
102 | + |
103 | + def match(self, input_string): |
104 | + return Equals(self.expected_html).match( |
105 | + FormattersAPI(input_string).markdown()) |
106 | + |
107 | + |
108 | +class TestMarkdownDisabled(TestCase): |
109 | + """Feature flag can turn Markdown stuff off. |
110 | + """ |
111 | + |
112 | + layer = DatabaseFunctionalLayer # Fixtures need the database for now |
113 | + |
114 | + def setUp(self): |
115 | + super(TestMarkdownDisabled, self).setUp() |
116 | + self.useFixture(FeatureFixture({'markdown.enabled': None})) |
117 | + |
118 | + def test_plain_text(self): |
119 | + self.assertThat( |
120 | + 'hello **simple** world', |
121 | + MarksDownAs('<p>hello **simple** world</p>')) |
122 | + |
123 | + |
124 | +class TestMarkdown(TestCase): |
125 | + """Test for Markdown integration within Launchpad. |
126 | + |
127 | + Not an exhaustive test, more of a check for our integration and configuration. |
128 | + """ |
129 | + |
130 | + layer = DatabaseFunctionalLayer # Fixtures need the database for now |
131 | + |
132 | + def setUp(self): |
133 | + super(TestMarkdown, self).setUp() |
134 | + self.useFixture(FeatureFixture({'markdown.enabled': 'on'})) |
135 | + |
136 | + def test_plain_text(self): |
137 | + self.assertThat( |
138 | + 'hello world', |
139 | + MarksDownAs('<p>hello world</p>')) |
140 | + |
141 | + |
142 | def test_suite(): |
143 | suite = unittest.TestSuite() |
144 | suite.addTests(DocTestSuite()) |
145 | |
146 | === modified file 'lib/lp/registry/browser/person.py' |
147 | --- lib/lp/registry/browser/person.py 2011-11-20 07:23:45 +0000 |
148 | +++ lib/lp/registry/browser/person.py 2011-11-25 08:27:29 +0000 |
149 | @@ -2381,10 +2381,12 @@ |
150 | if content is None: |
151 | return None |
152 | elif self.is_probationary_or_invalid_user: |
153 | + # XXX: Is this really useful? They can post links in many other |
154 | + # places. -- mbp 2011-11-20. |
155 | return cgi.escape(content) |
156 | else: |
157 | formatter = FormattersAPI |
158 | - return formatter(content).text_to_html() |
159 | + return formatter(content).markdown() |
160 | |
161 | @cachedproperty |
162 | def recently_approved_members(self): |
163 | |
164 | === modified file 'lib/lp/registry/templates/product-index.pt' |
165 | --- lib/lp/registry/templates/product-index.pt 2011-06-16 13:50:58 +0000 |
166 | +++ lib/lp/registry/templates/product-index.pt 2011-11-25 08:27:29 +0000 |
167 | @@ -55,14 +55,15 @@ |
168 | <a tal:replace="structure overview_menu/review_license/fmt:icon"/> |
169 | </p> |
170 | |
171 | - <div class="summary" tal:content="context/summary"> |
172 | + <div class="summary" |
173 | + tal:content="structure context/summary/fmt:markdown"> |
174 | $Product.summary goes here. This should be quite short, |
175 | just a single paragraph of text really, giving the project |
176 | highlights. |
177 | </div> |
178 | |
179 | <div class="description" |
180 | - tal:content="structure context/description/fmt:text-to-html" |
181 | + tal:content="structure context/description/fmt:markdown" |
182 | tal:condition="context/description"> |
183 | $Product.description goes here. This should be a longer piece of |
184 | text, up to three paragraphs in length, which gives much more |
185 | |
186 | === modified file 'lib/lp/services/features/flags.py' |
187 | --- lib/lp/services/features/flags.py 2011-11-24 15:58:51 +0000 |
188 | +++ lib/lp/services/features/flags.py 2011-11-25 08:27:29 +0000 |
189 | @@ -113,6 +113,12 @@ |
190 | '', |
191 | '', |
192 | ''), |
193 | + ('markdown.enabled', |
194 | + 'boolean', |
195 | + 'Interpret selected user content as Markdown.', |
196 | + 'disabled', |
197 | + 'Markdown', |
198 | + 'https://launchpad.net/bugs/391780'), |
199 | ('memcache', |
200 | 'boolean', |
201 | 'Enables use of memcached where it is supported.', |
202 | |
203 | === modified file 'setup.py' |
204 | --- setup.py 2011-11-07 12:47:36 +0000 |
205 | +++ setup.py 2011-11-25 08:27:29 +0000 |
206 | @@ -52,6 +52,7 @@ |
207 | # Required for launchpadlib |
208 | 'keyring', |
209 | 'manuel', |
210 | + 'Markdown', |
211 | 'mechanize', |
212 | 'meliae', |
213 | 'mercurial', |
214 | |
215 | === modified file 'versions.cfg' |
216 | --- versions.cfg 2011-11-21 15:52:36 +0000 |
217 | +++ versions.cfg 2011-11-25 08:27:29 +0000 |
218 | @@ -43,6 +43,7 @@ |
219 | lazr.testing = 0.1.1 |
220 | lazr.uri = 1.0.2 |
221 | manuel = 1.1.1 |
222 | +Markdown = 2.1.0 |
223 | martian = 0.11 |
224 | mechanize = 0.1.11 |
225 | meliae = 0.2.0.final.0 |
Looks good.
A few remarks:
[0]
50 +def format_ markdown( text):
51 + """Return html form of marked-up text."""
52 + # This returns whole paragraphs (in p tags), similarly to text_to_html.
53 + md = markdown.Markdown(
54 + safe_mode='escape',
55 + extensions=[
56 + 'tables',
57 + ])
58 + return md.convert(text) # How easy was that?
Looks like there is a big overhead in creating the Mardown class each time we have a string to convert:
python -m timeit -s 'import markdown; md = markdown. Markdown( safe_mode= "escape" ,extensions= [ "tables",])' 'for x in range(1000): md.convert("a *b* a");'
10 loops, best of 3: 246 msec per loop
python -m timeit -s 'import markdown;' 'for x in range(1000): markdown. Markdown( safe_mode= "escape" ,extensions= [ "tables" ,]).convert( "a *b* a");'
10 loops, best of 3: 471 msec per loop
You might want to create and reuse instances stored in threading.local.
[1]
125 + def test_plain_ text(self) : '<p>hello world</p>'))
126 + self.assertThat(
127 + 'hello world',
128 + MarksDownAs(
Don't you want to include 'real' markdown markup here, just as an illustration for the reader?
[2]
160 - <div class="summary" tal:content= "context/ summary" > "structure context/ summary/ fmt:markdown" >
161 + <div class="summary"
162 + tal:content=
163 $Product.summary goes here. This should be quite short,
Small indentation inconsistency here ;)