Merge lp:~thumper/launchpad/mail-header-oops into lp:launchpad

Proposed by Tim Penhey
Status: Merged
Approved by: Robert Collins
Approved revision: no longer in the source branch.
Merged at revision: 12566
Proposed branch: lp:~thumper/launchpad/mail-header-oops
Merge into: lp:launchpad
Diff against target: 241 lines (+91/-30)
4 files modified
lib/lp/services/mail/incoming.py (+36/-20)
lib/lp/services/mail/tests/incomingmail.txt (+6/-6)
lib/lp/services/mail/tests/test_incoming.py (+44/-1)
lib/lp/testing/factory.py (+5/-3)
To merge this branch: bzr merge lp:~thumper/launchpad/mail-header-oops
Reviewer Review Type Date Requested Status
Robert Collins (community) Approve
Ian Booth (community) *code Approve
Review via email: mp+52156@code.launchpad.net

Commit message

[r=lifeless,wallyworld][bug=701976,728827] Lower some log.warning to log.info as warnings generate oopses.

Description of the change

Warnings that are logged are turned into OOPSes by the
OOPS log handler added by the base CronScript class.

I've updated the behaviour of the header checking based
on the comments in bug 701976.

To post a comment you must log in.
Revision history for this message
Martin Pool (mbp) wrote :

+def extract_addresses(mail, raw_mail, file_alias_url, log):
+ # Extract the domain the mail was sent to. Mails sent to
+ # Launchpad should have an X-Launchpad-Original-To header.

Why not turn that into a docstring? And, maybe add things just saying the types of the parameters and the return code. It's the kind of thing that easily has bugs about passing a string vs a list of strings vs a file etc. (Or, you could use a statically type-checked language. :-P)

+ if ORIGINAL_TO_HEADER in mail:
+ return [mail[ORIGINAL_TO_HEADER]]
+
+ if ORIGINAL_TO_HEADER in raw_mail:
+ # Almost certainly a spam email with a blank line in the email headers
+ log.info('Suspected spam: %s' % file_alias_url)

I really do not see how the comment follows from the condition. How can you even have a blank line in the headers? You can have a blank line that causes things that ought to be headers to be part of the body, but no reasonable mail agent will see them as headers then.

Perhaps this should say something like:

 # Doesn't have an X-Launchpad-Original-To in the headers, but does have one in the body, because of a forwarding loop or attempted spam. See <https://bugs.launchpad.net/launchpad/+bug/701976>

Revision history for this message
Martin Pool (mbp) wrote :

I'm +1 on the dkim changes, and -1 (pending more information) on the x-l-original-to changes. Having that substring within a message doesn't mean it's spam.

Revision history for this message
Tim Penhey (thumper) wrote :

I agree that having that substring within the message doesn't mean it is spam, but having it in the message but not in the headers means that the headers have been messed with somehow, and that is more indicative of spam. I've updated the code to match your suggested wording.

Revision history for this message
Martin Pool (mbp) wrote :

I don't understand at the moment how incoming spam can cause the
headers to be split and x-l-original-to not to be attached. That
seems to me to be more likely to be a bug or misconfiguration in our
software.

Martin

Revision history for this message
Ian Booth (wallyworld) wrote :

I like the wording change suggested by Martin.
The scenarios tested for seem to match the observed data referred to in the bug report so to me it looks ok.

review: Approve (*code)
Revision history for this message
Martin Pool (mbp) wrote :

On 4 March 2011 14:54, Tim Penhey <email address hidden> wrote:
> I agree that having that substring within the message doesn't mean it is spam, but having it in the message but not in the headers means that the headers have been messed with somehow, and that is more indicative of spam.

I think we ought to identify why spam is causing the messages to be
corrupted. Checking for this string is just superstition.

Revision history for this message
Martin Pool (mbp) wrote :

Note that we're not checking for a body that looks like it contains headers, which might be a somewhat-reliable indicator of either spam or something broken. We're just looking for that string anywhere in the body. It's reasonably likely to match on messages that do not contain broken headers.

Revision history for this message
Robert Collins (lifeless) wrote :

@Martin - spam often has invalid headers (attempting smuggling) and our x-original-to mangler adds it after the invalid header, but our lp parser stops at the invalid point. So this is pretty reliable and won't reject valid mail AFAWCT.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'lib/lp/services/mail/incoming.py'
--- lib/lp/services/mail/incoming.py 2011-03-03 02:29:54 +0000
+++ lib/lp/services/mail/incoming.py 2011-03-09 20:52:00 +0000
@@ -143,19 +143,19 @@
143 # in addition to the dkim signature being valid, we have to check that it143 # in addition to the dkim signature being valid, we have to check that it
144 # was actually signed by the user's domain.144 # was actually signed by the user's domain.
145 if len(signing_details) != 1:145 if len(signing_details) != 1:
146 log.errors(146 log.info(
147 'expected exactly one DKIM details record: %r'147 'expected exactly one DKIM details record: %r'
148 % (signing_details,))148 % (signing_details,))
149 return False149 return False
150 signing_domain = signing_details[0]['d']150 signing_domain = signing_details[0]['d']
151 from_domain = extract_address_domain(signed_message['From'])151 from_domain = extract_address_domain(signed_message['From'])
152 if signing_domain != from_domain:152 if signing_domain != from_domain:
153 log.warning("DKIM signing domain %s doesn't match From address %s; "153 log.info("DKIM signing domain %s doesn't match From address %s; "
154 "disregarding signature"154 "disregarding signature"
155 % (signing_domain, from_domain))155 % (signing_domain, from_domain))
156 return False156 return False
157 if not _isDkimDomainTrusted(signing_domain):157 if not _isDkimDomainTrusted(signing_domain):
158 log.warning("valid DKIM signature from untrusted domain %s"158 log.info("valid DKIM signature from untrusted domain %s"
159 % (signing_domain,))159 % (signing_domain,))
160 return False160 return False
161 return True161 return True
@@ -262,6 +262,36 @@
262 self.setOopsToken('EMAIL')262 self.setOopsToken('EMAIL')
263263
264264
265ORIGINAL_TO_HEADER = 'X-Launchpad-Original-To'
266
267
268def extract_addresses(mail, raw_mail, file_alias_url, log):
269 """Extract the domain the mail was sent to.
270
271 Mails sent to Launchpad should have an X-Launchpad-Original-To header.
272 This is added by the MTA before it ends up the mailbox for Launchpad.
273 """
274 if ORIGINAL_TO_HEADER in mail:
275 return [mail[ORIGINAL_TO_HEADER]]
276
277 if ORIGINAL_TO_HEADER in raw_mail:
278 # Doesn't have an X-Launchpad-Original-To in the headers, but does
279 # have one in the body, because of a forwarding loop or attempted
280 # spam. See <https://bugs.launchpad.net/launchpad/+bug/701976>
281 log.info('Suspected spam: %s' % file_alias_url)
282 else:
283 # This most likely means a email configuration problem, and it should
284 # log an oops.
285 log.warn(
286 "No X-Launchpad-Original-To header was present "
287 "in email: %s" % file_alias_url)
288 # Process all addresses found as a fall back.
289 cc = mail.get_all('cc') or []
290 to = mail.get_all('to') or []
291 names_addresses = getaddresses(to + cc)
292 return [addr for name, addr in names_addresses]
293
294
265def report_oops(file_alias_url=None, error_msg=None):295def report_oops(file_alias_url=None, error_msg=None):
266 """Record an OOPS for the current exception and return the OOPS ID."""296 """Record an OOPS for the current exception and return the OOPS ID."""
267 info = sys.exc_info()297 info = sys.exc_info()
@@ -280,7 +310,7 @@
280310
281311
282def handleMail(trans=transaction,312def handleMail(trans=transaction,
283 signature_timestamp_checker=None):313 signature_timestamp_checker=None):
284 # First we define an error handler. We define it as a local314 # First we define an error handler. We define it as a local
285 # function, to avoid having to pass a lot of parameters.315 # function, to avoid having to pass a lot of parameters.
286 # pylint: disable-msg=W0631316 # pylint: disable-msg=W0631
@@ -394,21 +424,8 @@
394 file_alias_url, notify=False)424 file_alias_url, notify=False)
395 continue425 continue
396426
397 # Extract the domain the mail was sent to. Mails sent to427 addresses = extract_addresses(
398 # Launchpad should have an X-Launchpad-Original-To header.428 mail, raw_mail, file_alias_url, log)
399 if 'X-Launchpad-Original-To' in mail:
400 addresses = [mail['X-Launchpad-Original-To']]
401 else:
402 log = logging.getLogger('lp.services.mail')
403 log.warn(
404 "No X-Launchpad-Original-To header was present "
405 "in email: %s" %
406 file_alias_url)
407 # Process all addresses found as a fall back.
408 cc = mail.get_all('cc') or []
409 to = mail.get_all('to') or []
410 names_addresses = getaddresses(to + cc)
411 addresses = [addr for name, addr in names_addresses]
412429
413 try:430 try:
414 do_paranoid_envelope_to_validation(addresses)431 do_paranoid_envelope_to_validation(addresses)
@@ -464,7 +481,6 @@
464 # from being processed.481 # from being processed.
465 _handle_error(482 _handle_error(
466 "Unhandled exception", file_alias_url)483 "Unhandled exception", file_alias_url)
467 log = logging.getLogger('canonical.launchpad.mail')
468 if file_alias_url is not None:484 if file_alias_url is not None:
469 email_info = file_alias_url485 email_info = file_alias_url
470 else:486 else:
471487
=== modified file 'lib/lp/services/mail/tests/incomingmail.txt'
--- lib/lp/services/mail/tests/incomingmail.txt 2010-12-13 17:34:34 +0000
+++ lib/lp/services/mail/tests/incomingmail.txt 2011-03-09 20:52:00 +0000
@@ -101,10 +101,10 @@
101header is missing.)101header is missing.)
102102
103 >>> handleMailForTest()103 >>> handleMailForTest()
104 WARNING:lp.services.mail:No X-Launchpad-Original-To header was present ...104 WARNING:process-mail:No X-Launchpad-Original-To header was present ...
105 WARNING:lp.services.mail:No X-Launchpad-Original-To header was present ...105 WARNING:process-mail:No X-Launchpad-Original-To header was present ...
106 WARNING:lp.services.mail:No X-Launchpad-Original-To header was present ...106 WARNING:process-mail:No X-Launchpad-Original-To header was present ...
107 WARNING:lp.services.mail:No X-Launchpad-Original-To header was present ...107 WARNING:process-mail:No X-Launchpad-Original-To header was present ...
108108
109Now we can see that each handler handled the emails sent to its domain:109Now we can see that each handler handled the emails sent to its domain:
110110
@@ -252,7 +252,7 @@
252 ... """)252 ... """)
253 >>> msgid = sendmail(msg, ['edit@malone-domain'])253 >>> msgid = sendmail(msg, ['edit@malone-domain'])
254 >>> handleMailForTest()254 >>> handleMailForTest()
255 ERROR:canonical.launchpad.mail:An exception was raised inside the handler:255 ERROR:process-mail:An exception was raised inside the handler:
256 ...256 ...
257 TestOopsException257 TestOopsException
258258
@@ -300,7 +300,7 @@
300 ... """)300 ... """)
301 >>> msgid = sendmail(msg, ['edit@malone-domain'])301 >>> msgid = sendmail(msg, ['edit@malone-domain'])
302 >>> handleMailForTest()302 >>> handleMailForTest()
303 ERROR:canonical.launchpad.mail:An exception was raised inside the handler:303 ERROR:process-mail:An exception was raised inside the handler:
304 ...304 ...
305 Unauthorized305 Unauthorized
306306
307307
=== modified file 'lib/lp/services/mail/tests/test_incoming.py'
--- lib/lp/services/mail/tests/test_incoming.py 2011-03-03 02:29:54 +0000
+++ lib/lp/services/mail/tests/test_incoming.py 2011-03-09 20:52:00 +0000
@@ -6,7 +6,7 @@
6import os6import os
7import unittest7import unittest
88
9from testtools.matchers import Is9from testtools.matchers import Equals, Is
10import transaction10import transaction
11from zope.security.management import setSecurityPolicy11from zope.security.management import setSecurityPolicy
1212
@@ -19,10 +19,13 @@
19 helpers,19 helpers,
20 )20 )
21from canonical.testing.layers import LaunchpadZopelessLayer21from canonical.testing.layers import LaunchpadZopelessLayer
22from lp.services.log.logger import BufferLogger
22from lp.services.mail.incoming import (23from lp.services.mail.incoming import (
23 authenticateEmail,24 authenticateEmail,
25 extract_addresses,
24 handleMail,26 handleMail,
25 MailErrorUtility,27 MailErrorUtility,
28 ORIGINAL_TO_HEADER,
26 )29 )
27from lp.services.mail.sendmail import MailController30from lp.services.mail.sendmail import MailController
28from lp.services.mail.stub import TestMailer31from lp.services.mail.stub import TestMailer
@@ -119,6 +122,46 @@
119 self.assertThat(authenticateEmail(mail), Is(None))122 self.assertThat(authenticateEmail(mail), Is(None))
120123
121124
125class TestExtractAddresses(TestCaseWithFactory):
126
127 layer = LaunchpadZopelessLayer
128
129 def test_original_to(self):
130 mail = self.factory.makeSignedMessage()
131 original_to = 'eric@vikings.example.com'
132 mail[ORIGINAL_TO_HEADER] = original_to
133 self.assertThat(
134 extract_addresses(mail, None, None, None),
135 Equals([original_to]))
136
137 def test_original_to_in_body(self):
138 header_to = 'eric@vikings-r-us.example.com'
139 original_to = 'eric@vikings.example.com'
140 alias = 'librarian-somewhere'
141 body = '%s: %s\n\nsome body stuff' % (
142 ORIGINAL_TO_HEADER, original_to)
143 log = BufferLogger()
144 mail = self.factory.makeSignedMessage(
145 body=body, to_address=header_to)
146 addresses = extract_addresses(mail, mail.as_string(), alias, log)
147 self.assertThat(addresses, Equals([header_to]))
148 self.assertThat(
149 log.getLogBuffer(),
150 Equals('INFO Suspected spam: librarian-somewhere\n'))
151
152 def test_original_to_missing(self):
153 header_to = 'eric@vikings-r-us.example.com'
154 alias = 'librarian-somewhere'
155 log = BufferLogger()
156 mail = self.factory.makeSignedMessage(to_address=header_to)
157 addresses = extract_addresses(mail, mail.as_string(), alias, log)
158 self.assertThat(addresses, Equals([header_to]))
159 self.assertThat(
160 log.getLogBuffer(),
161 Equals('WARNING No X-Launchpad-Original-To header was present '
162 'in email: librarian-somewhere\n'))
163
164
122def setUp(test):165def setUp(test):
123 test._old_policy = setSecurityPolicy(LaunchpadSecurityPolicy)166 test._old_policy = setSecurityPolicy(LaunchpadSecurityPolicy)
124 LaunchpadZopelessLayer.switchDbUser(config.processmail.dbuser)167 LaunchpadZopelessLayer.switchDbUser(config.processmail.dbuser)
125168
=== modified file 'lib/lp/testing/factory.py'
--- lib/lp/testing/factory.py 2011-03-07 19:06:47 +0000
+++ lib/lp/testing/factory.py 2011-03-09 20:52:00 +0000
@@ -1829,7 +1829,7 @@
18291829
1830 def makeSignedMessage(self, msgid=None, body=None, subject=None,1830 def makeSignedMessage(self, msgid=None, body=None, subject=None,
1831 attachment_contents=None, force_transfer_encoding=False,1831 attachment_contents=None, force_transfer_encoding=False,
1832 email_address=None, signing_context=None):1832 email_address=None, signing_context=None, to_address=None):
1833 """Return an ISignedMessage.1833 """Return an ISignedMessage.
18341834
1835 :param msgid: An rfc2822 message-id.1835 :param msgid: An rfc2822 message-id.
@@ -1847,8 +1847,10 @@
1847 person = self.makePerson()1847 person = self.makePerson()
1848 email_address = removeSecurityProxy(person).preferredemail.email1848 email_address = removeSecurityProxy(person).preferredemail.email
1849 mail['From'] = email_address1849 mail['From'] = email_address
1850 mail['To'] = removeSecurityProxy(1850 if to_address is None:
1851 self.makePerson()).preferredemail.email1851 to_address = removeSecurityProxy(
1852 self.makePerson()).preferredemail.email
1853 mail['To'] = to_address
1852 if subject is None:1854 if subject is None:
1853 subject = self.getUniqueString('subject')1855 subject = self.getUniqueString('subject')
1854 mail['Subject'] = subject1856 mail['Subject'] = subject