OpenERP Community Backports (Addons)

Merge lp:~hirt/ocb-addons/6.1_fix_mailaddress-parsing into lp:ocb-addons/6.1

6.1_fix_mailaddress-parsing
Merge into 6.1

Proposed by Etienne Hirt on 2013-11-05

Status:

Merged

Merged at revision:

6824

Proposed branch:

lp:~hirt/ocb-addons/6.1_fix_mailaddress-parsing

Merge into:

lp:ocb-addons/6.1

Diff against target:

33 lines (+9/-3)

1 file modified

mail/mail_message.py (+9/-3)

To merge this branch:

bzr merge lp:~hirt/ocb-addons/6.1_fix_mailaddress-parsing

Low

In Progress

Link a bug report

Reviewer	Review Type	Date Requested	Status
Holger Brunn (Therp)	code review		Approve on 2014-01-06
Stefan Rijnhart (Opener)			Approve on 2013-12-11
Pedro Manuel Baeza		2013-11-05	Approve on 2013-11-29
Review via email: mp+194030@code.launchpad.net

Description of the change

While using 6.1 mail_message to_email function I found that it does not
cope with all the possible names in the emaillist.

example: 'bla@2.ch' <bla@2.ch>, (bla@2.ch) <bla@2.ch>, "'bla@2.ch'" <bla@2.ch>, etc

The regexp in the function to_email of mail/mail_message.py is enhanced that it also removes ", () and
even if the allowed '.

Revision history for this message

Pedro Manuel Baeza (pedro.baeza) wrote on 2013-11-29:

Thanks for the MP, Etienne.

Regards.

review: Approve

Revision history for this message

Stefan Rijnhart (Opener) (stefan-opener) wrote on 2013-11-29:

In the case of email address parsing, which is notoriously difficult, I would have more faith in a standard function than a regular expression, at least without a large test set. Did you look at getaddressess from email.utils? See http://docs.python.org/2/library/email.util.html#email.utils.getaddresses for docs, and the example at the top of http://pastebin.com/zbzXQYS9

review: Needs Information

Revision history for this message

Etienne Hirt (hirt) wrote on 2013-11-29:

Dear Stefan,
Thanks for the comments and hints. I fully agree with your doubts in regular expressions. I have the same fear. That's why I had one of my employees that often uses regular expressions in Perl etc to help me.

We use this RE now since 15.3.13 daily to import emails from our imap server into Openerp to be assigned to projects, purchase orders, opportunities etc. without any fails.

The getaddresses function as well as the self built addrparser in the pastebin can not cope with troubles like "'bla@2.ch'" <bla@2.ch>.

I'm therefore convinced from the proposed approach.

Revision history for this message

Stefan Rijnhart (Opener) (stefan-opener) wrote on 2013-11-30:

On 11/29/2013 11:17 PM, Etienne Hirt wrote:
> We use this RE now since 15.3.13 daily to import emails from our imap server into Openerp to be assigned to projects, purchase orders, opportunities etc. without any fails.
Hi Etienne, thank you for your response. I still think that testing a
regexp for email addresses on a single OpenERP server counts as
anecdotal evidence.

> The getaddresses function as well as the self built addrparser in the pastebin can not cope with troubles like "'bla@2.ch'" <bla@2.ch>.

Am I missing something? I get

from email.utils import getaddresses
>>> getaddresses(["\"'bla@2.ch'\" <bla@2.ch>"])
[("'bla@2.ch'", 'bla@2.ch')]

This what I would expect.

Revision history for this message

Etienne Hirt (hirt) wrote on 2013-12-10:

Dear Stefan,

Sorry for the delayed continuation. Still the approach with getaddresses is not perfect. See below where the name 'bla@21.ch' is converted into a funny emailaddress. However, it works sufficiently to replace the regexp. Will upload the update for your review.
You might have a better proposal to get the second column of the matrix.

Thanks

Etienne

text = "'bla@21.ch' <bla@22.ch>, (bla@31.ch) <bla@32.ch>, \"'bla@41.ch'\" <bla@42.ch>,"
getaddresses([text])
[('', "'bla@21.ch'"), ('', 'bla@22.ch'), (' (bla@31.ch)', 'bla@32.ch'), ("'bla@41.ch'", 'bla@42.ch')]

re.findall(r'([^ ,(\'"<@]+@[^> ,)\'"]+)',text)
['bla@21.ch', 'bla@22.ch', 'bla@31.ch', 'bla@32.ch', 'bla@41.ch', 'bla@42.ch']

Revision history for this message

Stefan Rijnhart (Opener) (stefan-opener) wrote on 2013-12-11:

Hi Etienne,

thanks for the follow up! I believe the interpretation of 'bla@21.ch' as a separate emailaddress is correct as RFC822 mentions that strings (like the display name) containing the character '@' should be enclosed in the ascii quote character '"', which is not the case here.

You may like to compress the code using a list comprehension like

return [address[1] for address in getaddresses([text])]

but if you don't that is fine too. Approved.

review: Approve

Revision history for this message

Etienne Hirt (hirt) wrote on 2013-12-20:

Dear Stefan,

I finalized the code now using list comprehension. Hope this does not delay the merging.

Revision history for this message

Holger Brunn (Therp) (hbrunn) on 2014-01-06:

review: Approve (code review)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk

Subscribers

People subscribed via source and target branches

to all changes:

Etienne Hirt

OpenERP Community Backports

1	=== modified file 'mail/mail_message.py'
2	--- mail/mail_message.py 2013-11-20 01:17:39 +0000
3	+++ mail/mail_message.py 2013-12-20 22:41:55 +0000
4	@@ -26,10 +26,10 @@
5	import email
6	import logging
7	import pytz
8	-import re
9	import time
10	from email.header import decode_header
11	from email.message import Message
12	+from email.utils import getaddresses
13
14	import tools
15	from osv import osv
16	@@ -60,9 +60,15 @@
17	return ''.join([tools.ustr(x[0], x[1]) for x in text])
18
19	def to_email(text):
20	- """Return a list of the email addresses found in ``text``"""
21	+ """Returns a list of the email addresses found in ``text``
22	+ """
23	if not text: return []
24	- return re.findall(r'([^ ,<@]+@[^> ,]+)', text)
25	+
26	+ people = getaddresses([text])
27	+ addresses = [person[1] for person in people]
28	+
29	+ return addresses
30	+
31
32	class mail_message_common(osv.osv_memory):
33	"""Common abstract class for holding the main attributes of a

OpenERP Community Backports (Addons)

Merge lp:~hirt/ocb-addons/6.1_fix_mailaddress-parsing into lp:ocb-addons/6.1

Commit message

Description of the change

Preview Diff

Subscribers