Merge lp:~sinzui/launchpad/hold-message into lp:launchpad

Proposed by Curtis Hovey
Status: Merged
Merged at revision: 11685
Proposed branch: lp:~sinzui/launchpad/hold-message
Merge into: lp:launchpad
Diff against target: 334 lines (+175/-87)
5 files modified
lib/lp/services/mailman/doc/postings.txt (+0/-76)
lib/lp/services/mailman/monkeypatches/lpsize.py (+34/-3)
lib/lp/services/mailman/testing/__init__.py (+7/-3)
lib/lp/services/mailman/tests/test_lpmoderate.py (+5/-5)
lib/lp/services/mailman/tests/test_lpsize.py (+129/-0)
To merge this branch: bzr merge lp:~sinzui/launchpad/hold-message
Reviewer Review Type Date Requested Status
Edwin Grubbs (community) code Approve
Review via email: mp+37750@code.launchpad.net

Description of the change

This is my branch to forward small messages to the moderation queue.

    lp:~sinzui/launchpad/hold-message
    Diff size: 311
    Launchpad bug:
          https://bugs.launchpad.net/bugs/645702
    Test command: ./bin/test -vv \
          -t test_lpsize
    Pre-implementation: EdwinGrubbs, jcsacket
    Target release: 10.10

Forward small messages to the moderation queue
-----------------------------------------------

OOPS-1726XMLP285 shows that mailman forwarded a message for moderation that
exceeds the db storage limit.

There are two views on this issue. The first is that this is a copy of a
message mailman wants an admin to decide if the real message should be sent to
the list or discarded--the person only needs to see enough to make a decision.
the second issue is that mailman has rules for handling large messages and
they do not appear to agree with Lp. The soft limit enters moderation (which
is this case), and the hard limit is discarded immediately. I recall the soft
limit was raised at the request of statik.

Rules
-----

    * Update the lpsize tests to be unittests and delete the tests from
      postings.txt (often fails because of timeouts)
    * Add a rule to truncate the message so that it is small enough to
      be moderated.
      * Delete non text/plain parts because Lp only stores text/plain.
      * Truncate large text/parts. 10k will be enough for the moderator
        to make a decision to discard or forward. This message is shown
        in a list of messages; we want to limit the text to ensure the page
        is usable.

QA
--

    * Send an email with a large attachment to a list on staging.
      the attachment must be larger than 11m and not over 20m
    * Verify the message appears in the moderation queue

Lint
----

Linting changed files:
  lib/lp/services/mailman/doc/postings.txt
  lib/lp/services/mailman/monkeypatches/lpsize.py
  lib/lp/services/mailman/testing/__init__.py
  lib/lp/services/mailman/tests/test_lpsize.py

Test
----

    * lib/lp/services/mailman/doc/postings.txt
      * Removed redundant test that were prone to fail
    * lib/lp/services/mailman/testing/__init__.py
      * Fixed the test harness to make a proper email with an attachment.
        These tests were the first to really exercise it.
    * lib/lp/services/mailman/tests/test_lpsize.py
      * Converted the doctests to unittests
      * Added tests for the truncated_message function.

Implementation
--------------

    * lib/lp/services/mailman/monkeypatches/lpsize.py
      * Added truncated_message to create a small message and used it when
        calling hold().

To post a comment you must log in.
Revision history for this message
Edwin Grubbs (edwin-grubbs) wrote :
Download full text (6.1 KiB)

Hi Curtis,

This is a nice improvement. I have some comments below.

-Edwin

>=== modified file 'lib/lp/services/mailman/monkeypatches/lpsize.py'
>--- lib/lp/services/mailman/monkeypatches/lpsize.py 2010-08-20 20:31:18 +0000
>+++ lib/lp/services/mailman/monkeypatches/lpsize.py 2010-10-06 15:36:24 +0000
>@@ -3,15 +3,43 @@
>
> """A pipeline handler for checking message sizes."""
>
>-# pylint: disable-msg=F0401
>+import email
>+
> from Mailman import (
> Errors,
>+ Message,
> mm_cfg,
> )
> from Mailman.Handlers.LPModerate import hold
> from Mailman.Logging.Syslog import syslog
>
>
>+def truncated_message(original_message, limit=10000):
>+ """Create a smaller version of the message for moderation."""
>+ message = email.message_from_string(
>+ original_message.as_string(), Message.Message)
>+ for part in email.iterators.typed_subpart_iterator(message, 'multipart'):
>+ subparts = part.get_payload()
>+ removeable = []
>+ for subpart in subparts:
>+ if subpart.get_content_maintype() == 'multipart':
>+ # This part is handled in the outer loop.
>+ continue
>+ elif subpart.get_content_type() == 'text/plain':

Shouldn't it truncate text/html or any other "text/" content-type,
since the user's mail client might create an empty text/plain section?

>+ # Truncate the message at 10kb so that there is enough
>+ # information for the moderator to make a decision.
>+ content = subpart.get_payload().strip()
>+ if len(content) > limit:
>+ subpart.set_payload(
>+ content[:limit] + ' [truncated for moderation]',

It seems like [truncated for moderation] will be easier to see if
it is on a new line.

>+ subpart.get_charset())
>+ else:
>+ removeable.append(subpart)
>+ for subpart in removeable:
>+ subparts.remove(subpart)
>+ return message
>+
>+
> def process(mlist, msg, msgdata):
> """Check the message size (approximately) against hard and soft limits.
>
>=== added file 'lib/lp/services/mailman/tests/test_lpsize.py'
>--- lib/lp/services/mailman/tests/test_lpsize.py 1970-01-01 00:00:00 +0000
>+++ lib/lp/services/mailman/tests/test_lpsize.py 2010-10-06 15:36:24 +0000
>@@ -0,0 +1,129 @@
>+# Copyright 20010 Canonical Ltd. This software is licensed under the

I think the copyright will expire by the time Bender and Leela can use
it.

>+# GNU Affero General Public License version 3 (see the file LICENSE).
>+"""Test the lpsize monekypatches"""
>+
>+from __future__ import with_statement
>+
>+__metaclass__ = type
>+__all__ = []
>+
>+from email.mime.application import MIMEApplication
>+
>+from Mailman import Errors
>+from Mailman.Handlers import LPSize
>+
>+from canonical.config import config
>+from canonical.testing import (
>+ DatabaseFunctionalLayer,
>+ LaunchpadFunctionalLayer,
>+ )
>+from lp.services.mailman.testing import MailmanTestCase
>+
>+
>+class TestLPSizeTestCase(MailmanTestCase):
>+ """Test LPSize.
>+
>+ Mailman process() methods quietly return. They may...

Read more...

review: Approve (code)
Revision history for this message
Curtis Hovey (sinzui) wrote :
Download full text (5.1 KiB)

On Wed, 2010-10-06 at 17:23 +0000, Edwin Grubbs wrote:
> Review: Approve code
> Hi Curtis,
>
> This is a nice improvement. I have some comments below.
>
> -Edwin
>
>
> >=== mo>+# GNU Affero General Public License version 3 (see the file LICENSE).
> >+"""Test the lpsize monekypatches"""
> >+
> >+from __future__ import with_statement
> >+
> >+__metaclass__ = type
> >+__all__ = []
> >+
> >+from email.mime.application import MIMEApplication
> >+
> >+from Mailman import Errors
> >+from Mailman.Handlers import LPSize
> >+
> >+from canonical.config import config
> >+from canonical.testing import (
> >+ DatabaseFunctionalLayer,
> >+ LaunchpadFunctionalLayer,
> >+ )
> >+from lp.services.mailman.testing import MailmanTestCase
> >+
> >+
> dified file 'lib/lp/services/mailman/monkeypatches/lpsize.py'
> >--- lib/lp/services/mailman/monkeypatches/lpsize.py 2010-08-20 20:31:18 +0000
> >+++ lib/lp/services/mailman/monkeypatches/lpsize.py 2010-10-06 15:36:24 +0000
> >...
> >+def truncated_message(original_message, limit=10000):
> >+ """Create a smaller version of the message for moderation."""
> >+ message = email.message_from_string(
> >+ original_message.as_string(), Message.Message)
> >+ for part in email.iterators.typed_subpart_iterator(message, 'multipart'):
> >+ subparts = part.get_payload()
> >+ removeable = []
> >+ for subpart in subparts:
> >+ if subpart.get_content_maintype() == 'multipart':
> >+ # This part is handled in the outer loop.
> >+ continue
> >+ elif subpart.get_content_type() == 'text/plain':
>
>
>
> Shouldn't it truncate text/html or any other "text/" content-type,
> since the user's mail client might create an empty text/plain section?

Launchpad does not support text/html messages. Messages without
text/plain parts are discarded in the hold process. There is an old bug
about empty messages in the moderation queue. It was discovered that all
cases are from spammers. We did not see a need to support text/html
since list subscribers are not using it and getting messages in the
moderation queue.

> >+ # Truncate the message at 10kb so that there is enough
> >+ # information for the moderator to make a decision.
> >+ content = subpart.get_payload().strip()
> >+ if len(content) > limit:
> >+ subpart.set_payload(
> >+ content[:limit] + ' [truncated for moderation]',
>
>
>
> It seems like [truncated for moderation] will be easier to see if
> it is on a new line.

Agreed

> >=== added file 'lib/lp/services/mailman/tests/test_lpsize.py'
> >--- lib/lp/services/mailman/tests/test_lpsize.py 1970-01-01 00:00:00 +0000
> >+++ lib/lp/services/mailman/tests/test_lpsize.py 2010-10-06 15:36:24 +0000
> >@@ -0,0 +1,129 @@
> >+# Copyright 20010 Canonical Ltd. This software is licensed under the
>
>
>
> I think the copyright will expire by the time Bender and Leela can use
> it.

:)

...

> >+class TestLPSizeTestCase(MailmanTestCase):
> >+ """Test LPSize.
> >+
> >+ Mailman process() methods quietly return. They may set msg_data key-values
> ...

Read more...

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'lib/lp/services/mailman/doc/postings.txt'
2--- lib/lp/services/mailman/doc/postings.txt 2010-10-03 15:30:06 +0000
3+++ lib/lp/services/mailman/doc/postings.txt 2010-10-06 18:06:45 +0000
4@@ -832,79 +832,3 @@
5 >>> vette_watcher.wait_for_discard('narwhale')
6 >>> len(list(smtpd))
7 0
8-
9-
10-Large messages
11-==============
12-
13-Only messages which are less than about 40k in size are allowed straight
14-through on the mailing list. A message bigger than that will be held for
15-moderator approval.
16-
17- >>> from canonical.config import config
18- >>> config.mailman.soft_max_size
19- 40000
20-
21- >>> from email.mime.multipart import MIMEMultipart
22- >>> from email.mime.application import MIMEApplication
23- >>> from email.mime.text import MIMEText
24-
25- >>> big_message = MIMEMultipart()
26- >>> big_message['From'] = 'anne.person@example.com'
27- >>> big_message['To'] = 'itest-one@lists.launchpad.dev'
28- >>> big_message['Subject'] = 'A huge message'
29- >>> big_message['Message-ID'] = '<puma>'
30- >>> big_message.attach(
31- ... MIMEText('look at this pdf.', 'plain'))
32- >>> big_message.attach(
33- ... MIMEApplication('\n'.join(['x' * 50] * 1000), 'octet-stream'))
34- >>> inject('itest-one', big_message.as_string())
35-
36- >>> vette_watcher.wait_for_hold('itest-one', 'puma')
37- >>> print_message_summaries()
38- Number of messages: 1
39- bounces@canonical.com
40- ...
41- Itest One <noreply@launchpad.net>
42- New mailing list message requiring approval for Itest One
43-
44-Once this message is approved, it is posted through to the mailing list.
45-
46- >>> browser.open(
47- ... 'http://launchpad.dev:8085/~itest-one/+mailinglist-moderate')
48- >>> browser.getControl(name='field.%3Cpuma%3E').value = ['approve']
49- >>> browser.getControl('Moderate').click()
50- >>> smtpd_watcher.wait_for_mbox_delivery('puma')
51- >>> smtpd_watcher.wait_for_mbox_delivery('puma')
52-
53- >>> print_message_summaries()
54- Number of messages: 2
55- itest-one-bounces+anne.person=example.com@lists.launchpad.dev
56- <puma>
57- anne.person@example.com
58- [Itest-one] A huge message
59- itest-one-bounces+archive=mail-archive.dev@lists.launchpad.dev
60- <puma>
61- anne.person@example.com
62- [Itest-one] A huge message
63-
64-There is a hard limit of 1MB, over which the message is summarily logged and
65-discarded.
66-
67- >>> config.mailman.hard_max_size
68- 1000000
69-
70- >>> huge_message = MIMEMultipart()
71- >>> huge_message['From'] = 'anne.person@example.com'
72- >>> huge_message['To'] = 'itest-one@lists.launchpad.dev'
73- >>> huge_message['Subject'] = 'A huge message'
74- >>> huge_message['Message-ID'] = '<quahog>'
75- >>> huge_message.attach(
76- ... MIMEText('look at this huge pdf.', 'plain'))
77- >>> huge_message.attach(
78- ... MIMEApplication('\n'.join(['x' * 50] * 20000), 'octet-stream'))
79- >>> inject('itest-one', huge_message.as_string())
80-
81- >>> vette_watcher.wait_for_discard('quahog')
82- >>> print_message_summaries()
83- Number of messages: 0
84
85=== modified file 'lib/lp/services/mailman/monkeypatches/lpsize.py'
86--- lib/lp/services/mailman/monkeypatches/lpsize.py 2010-08-20 20:31:18 +0000
87+++ lib/lp/services/mailman/monkeypatches/lpsize.py 2010-10-06 18:06:45 +0000
88@@ -1,17 +1,45 @@
89-# Copyright 2009 Canonical Ltd. This software is licensed under the
90+# Copyright 2009-2010 Canonical Ltd. This software is licensed under the
91 # GNU Affero General Public License version 3 (see the file LICENSE).
92
93 """A pipeline handler for checking message sizes."""
94
95-# pylint: disable-msg=F0401
96+import email
97+
98 from Mailman import (
99 Errors,
100+ Message,
101 mm_cfg,
102 )
103 from Mailman.Handlers.LPModerate import hold
104 from Mailman.Logging.Syslog import syslog
105
106
107+def truncated_message(original_message, limit=10000):
108+ """Create a smaller version of the message for moderation."""
109+ message = email.message_from_string(
110+ original_message.as_string(), Message.Message)
111+ for part in email.iterators.typed_subpart_iterator(message, 'multipart'):
112+ subparts = part.get_payload()
113+ removeable = []
114+ for subpart in subparts:
115+ if subpart.get_content_maintype() == 'multipart':
116+ # This part is handled in the outer loop.
117+ continue
118+ elif subpart.get_content_type() == 'text/plain':
119+ # Truncate the message at 10kb so that there is enough
120+ # information for the moderator to make a decision.
121+ content = subpart.get_payload().strip()
122+ if len(content) > limit:
123+ subpart.set_payload(
124+ content[:limit] + '\n[truncated for moderation]',
125+ subpart.get_charset())
126+ else:
127+ removeable.append(subpart)
128+ for subpart in removeable:
129+ subparts.remove(subpart)
130+ return message
131+
132+
133 def process(mlist, msg, msgdata):
134 """Check the message size (approximately) against hard and soft limits.
135
136@@ -42,7 +70,10 @@
137 if message_size < mm_cfg.LAUNCHPAD_HARD_MAX_SIZE:
138 # Hold the message in Mailman. See lpmoderate.py for similar
139 # algorithm.
140- hold(mlist, msg, msgdata, 'Too big')
141+ # There is a limit to the size that can be stored in Lp. Send
142+ # a trucated copy of the message that has enough information for
143+ # the moderator to make a decision.
144+ hold(mlist, truncated_message(msg), msgdata, 'Too big')
145 # The message is larger than the hard limit, so log and discard.
146 syslog('vette', 'Discarding message w/size > hard limit: %s',
147 msg.get('message-id', 'n/a'))
148
149=== modified file 'lib/lp/services/mailman/testing/__init__.py'
150--- lib/lp/services/mailman/testing/__init__.py 2010-10-04 19:50:45 +0000
151+++ lib/lp/services/mailman/testing/__init__.py 2010-10-06 18:06:45 +0000
152@@ -7,6 +7,7 @@
153
154 from contextlib import contextmanager
155 import email
156+from email.mime.multipart import MIMEMultipart
157 from email.mime.text import MIMEText
158 import os
159 import shutil
160@@ -87,13 +88,16 @@
161 # Make a Mailman Message.Message.
162 if isinstance(sender, (list, tuple)):
163 sender = ', '.join(sender)
164- message = MIMEText(content, mime_type)
165+ message = MIMEMultipart()
166 message['from'] = sender
167 message['to'] = mm_list.getListAddress()
168 message['subject'] = subject
169 message['message-id'] = self.getUniqueString()
170+ message.attach(MIMEText(content, mime_type))
171+ if attachment is not None:
172+ # Rewrap the text message in a multipart message and add the
173+ # attachment.
174+ message.attach(attachment)
175 mm_message = email.message_from_string(
176 message.as_string(), Message.Message)
177- if attachment is not None:
178- mm_message.attach(attachment, 'octet-stream')
179 return mm_message
180
181=== modified file 'lib/lp/services/mailman/tests/test_lpmoderate.py'
182--- lib/lp/services/mailman/tests/test_lpmoderate.py 2010-10-04 19:50:45 +0000
183+++ lib/lp/services/mailman/tests/test_lpmoderate.py 2010-10-06 18:06:45 +0000
184@@ -18,11 +18,11 @@
185 """Test lpmoderate.
186
187 Mailman process() methods quietly return. They may set msg_data key-values
188- or raise an error to end processing. This group of tests tests often check
189- for errors, but that does not mean there is an error condition, it only
190- means message processing has reached a final decision. Messages that do
191- not cause a final decision pass-through and the process() methods ends
192- without a return.
193+ or raise an error to end processing. These tests often check for errors,
194+ but that does not mean there is an error condition, it only means message
195+ processing has reached a final decision. Messages that do not cause a
196+ final decision pass through, and the process() methods ends without a
197+ return.
198 """
199
200 layer = LaunchpadFunctionalLayer
201
202=== added file 'lib/lp/services/mailman/tests/test_lpsize.py'
203--- lib/lp/services/mailman/tests/test_lpsize.py 1970-01-01 00:00:00 +0000
204+++ lib/lp/services/mailman/tests/test_lpsize.py 2010-10-06 18:06:45 +0000
205@@ -0,0 +1,129 @@
206+# Copyright 2010 Canonical Ltd. This software is licensed under the
207+# GNU Affero General Public License version 3 (see the file LICENSE).
208+"""Test the lpsize monekypatches"""
209+
210+from __future__ import with_statement
211+
212+__metaclass__ = type
213+__all__ = []
214+
215+from email.mime.application import MIMEApplication
216+
217+from Mailman import Errors
218+from Mailman.Handlers import LPSize
219+
220+from canonical.config import config
221+from canonical.testing import (
222+ DatabaseFunctionalLayer,
223+ LaunchpadFunctionalLayer,
224+ )
225+from lp.services.mailman.testing import MailmanTestCase
226+
227+
228+class TestLPSizeTestCase(MailmanTestCase):
229+ """Test LPSize.
230+
231+ Mailman process() methods quietly return. They may set msg_data key-values
232+ or raise an error to end processing. These tests often check for errors,
233+ but that does not mean there is an error condition, it only means message
234+ processing has reached a final decision. Messages that do not cause a
235+ final decision pass through, and the process() methods ends without a
236+ return.
237+ """
238+
239+ layer = LaunchpadFunctionalLayer
240+
241+ def setUp(self):
242+ super(TestLPSizeTestCase, self).setUp()
243+ self.team, self.mailing_list = self.factory.makeTeamAndMailingList(
244+ 'team-1', 'team-1-owner')
245+ self.mm_list = self.makeMailmanList(self.mailing_list)
246+ self.subscriber_email = self.team.teamowner.preferredemail.email
247+
248+ def tearDown(self):
249+ super(TestLPSizeTestCase, self).tearDown()
250+ self.cleanMailmanList(self.mm_list)
251+
252+ def test_process_size_under_soft_limit(self):
253+ # Any message under 40kb is sent to the list.
254+ attachment = MIMEApplication(
255+ '\n'.join(['x' * 20] * 1000), 'octet-stream')
256+ message = self.makeMailmanMessage(
257+ self.mm_list, self.subscriber_email, 'subject', 'content',
258+ attachment=attachment)
259+ msg_data = {}
260+ silence = LPSize.process(self.mm_list, message, msg_data)
261+ self.assertEqual(None, silence)
262+
263+ def test_process_size_over_soft_limit_held(self):
264+ # Messages over 40kb held for moderation.
265+ self.assertEqual(40000, config.mailman.soft_max_size)
266+ attachment = MIMEApplication(
267+ '\n'.join(['x' * 40] * 1000), 'octet-stream')
268+ message = self.makeMailmanMessage(
269+ self.mm_list, self.subscriber_email, 'subject', 'content',
270+ attachment=attachment)
271+ msg_data = {}
272+ args = (self.mm_list, message, msg_data)
273+ self.assertRaises(
274+ Errors.HoldMessage, LPSize.process, *args)
275+ self.assertEqual(1, self.mailing_list.getReviewableMessages().count())
276+
277+ def test_process_size_over_hard_limit_discarded(self):
278+ # Messages over 1MB are discarded.
279+ self.assertEqual(1000000, config.mailman.hard_max_size)
280+ attachment = MIMEApplication(
281+ '\n'.join(['x' * 1000] * 1000), 'octet-stream')
282+ message = self.makeMailmanMessage(
283+ self.mm_list, self.subscriber_email, 'subject', 'content',
284+ attachment=attachment)
285+ msg_data = {}
286+ args = (self.mm_list, message, msg_data)
287+ self.assertRaises(
288+ Errors.DiscardMessage, LPSize.process, *args)
289+ self.assertEqual(0, self.mailing_list.getReviewableMessages().count())
290+
291+
292+class TestTruncatedMessage(MailmanTestCase):
293+ """Test truncated_message helper."""
294+
295+ layer = DatabaseFunctionalLayer
296+
297+ def setUp(self):
298+ super(TestTruncatedMessage, self).setUp()
299+ self.team, self.mailing_list = self.factory.makeTeamAndMailingList(
300+ 'team-1', 'team-1-owner')
301+ self.mm_list = self.makeMailmanList(self.mailing_list)
302+ self.subscriber_email = self.team.teamowner.preferredemail.email
303+
304+ def test_attchments_are_removed(self):
305+ # Plain-text and multipart are preserved, everything else is removed.
306+ attachment = MIMEApplication('binary gibberish', 'octet-stream')
307+ message = self.makeMailmanMessage(
308+ self.mm_list, self.subscriber_email, 'subject', 'content',
309+ attachment=attachment)
310+ moderated_message = LPSize.truncated_message(message)
311+ parts = [part for part in moderated_message.walk()]
312+ types = [part.get_content_type() for part in parts]
313+ self.assertEqual(['multipart/mixed', 'text/plain'], types)
314+
315+ def test_small_text_is_preserved(self):
316+ # Text parts below the limit are unchanged.
317+ message = self.makeMailmanMessage(
318+ self.mm_list, self.subscriber_email, 'subject', 'content')
319+ moderated_message = LPSize.truncated_message(message, limit=1000)
320+ parts = [part for part in moderated_message.walk()]
321+ types = [part.get_content_type() for part in parts]
322+ self.assertEqual(['multipart/mixed', 'text/plain'], types)
323+ self.assertEqual('content', parts[1].get_payload())
324+
325+ def test_large_text_is_truncated(self):
326+ # Text parts above the limit are truncated.
327+ message = self.makeMailmanMessage(
328+ self.mm_list, self.subscriber_email, 'subject', 'content excess')
329+ moderated_message = LPSize.truncated_message(message, limit=7)
330+ parts = [part for part in moderated_message.walk()]
331+ types = [part.get_content_type() for part in parts]
332+ self.assertEqual(['multipart/mixed', 'text/plain'], types)
333+ self.assertEqual(
334+ 'content\n[truncated for moderation]', parts[1].get_payload())