Merge lp:~toshio/mailman/maildir into lp:mailman

Proposed by Toshio Kuratomi
Status: Merged
Approved by: Barry Warsaw
Approved revision: 7113
Merged at revision: 7114
Proposed branch: lp:~toshio/mailman/maildir
Merge into: lp:mailman
Diff against target: 279 lines (+211/-6)
5 files modified
src/mailman/archiving/docs/common.rst (+8/-4)
src/mailman/archiving/prototype.py (+48/-2)
src/mailman/archiving/tests/test_prototype.py (+151/-0)
src/mailman/config/config.py (+1/-0)
src/mailman/config/schema.cfg (+3/-0)
To merge this branch: bzr merge lp:~toshio/mailman/maildir
Reviewer Review Type Date Requested Status
Barry Warsaw Approve
Review via email: mp+97142@code.launchpad.net

Description of the change

Change the prototype archiver to save to maildir.

Add an archive_dir config path

To post a comment you must log in.
Revision history for this message
Barry Warsaw (barry) wrote :

We did a face-to-face review explaining my strange import line sorting rules. Toshio is going to take a crack at adding a unittest (and the src/mailman/archiving/docs/common.rst test will need fixing). Otherwise, the branch looks good and I'll merge it when the updates are pushed. Thanks!

review: Needs Fixing
lp:~toshio/mailman/maildir updated
7108. By toshio <email address hidden>

Style cleanups for the prototype patch

7109. By toshio <email address hidden>

Update documentation for the prototype archiver

7110. By toshio <email address hidden>

Use 2.6+ syntax for assigning an exception

7111. By toshio <email address hidden>

Unittests for the prototype archiver

7112. By toshio <email address hidden>

Revert to using a try: except: finally: for locking as the context manager
fails when the lock is stolen out from under it (as from a thread)

7113. By toshio <email address hidden>

Merge with upstream

Revision history for this message
Barry Warsaw (barry) wrote :
Download full text (3.7 KiB)

Great branch, thanks Toshio. There are a few minor things I'd like to clean up on the branch, but I'm happy to do that and land this for you. I'll mention them here just for future reference.

In common.rst:
    >>> len (os.listdir(archivepath)) >= 1
    True

it's better to write it like the following (aside from the PEP 8 violation of spaces after the 'len' :) so that it's easier to debug:

    >>> len(os.listdir(archive_path))
    1

Also, there's no need to re-import os (since it's imported earlier in the file) or to import config, since the doctest framework injects that into the doctest globals.

Minor nit: two blank lines between major sections in rst and .py files.

In prototype.py:

hashlib and base64 modules are unused (pyflakes ftw! :). Very minor nit: I like using just `log` when there's only one logger in a module.

I prefer full sentences (e.g. capitalized and period at the end) for docstrings and comments. I generally prefer `error` as targets for except clauses. Abbreviations in general make for less readable code, though there are occasional exceptions <wink>.

I wonder if config.py's ensure_directories_exist() should create the maildir for the prototype archiver? This would avoid having the archiver do this every time a message is added to the archive, where it will almost always fail with EEXIST. It would have to be part of the IArchive interface though to keep things separated, possibly a `prepare()` method that gets called on startup. OTOH, it's possible that an archiver will have to do additional preparation specifically for the mailing list, and it's probably not a good idea to have to call this every time a mailing list is created. So I guess for now, having archive_message() create the directory is the best choice. We can't add an __init__() to the archiver though because it never gets instantiated (maybe that should change too). Just thinking out loud here.

I think I'll pull the os.path.join() out of the Lock() instantiation.

I'm kind of favoring .format() rather than % interpolation these days, but of course we have to use '{0}' instead of '{}' until we switch to Python 3 :). Python 2.6 doesn't support '{}'.

Do you think the lock timeout needs to be configurable?

I think I'm going to rewrite the comment above mailbox.add() and not store it in message_key. It makes pyflakes (since message_key) is unused, but the comment will preserve the fact that the return value could be useful in the future.

test_prototype:

Generally, I like sticking to PEP 8 for test class names, e.g. TestPrototypeArchiver. I've started to put these into the __all__ of the module. I'm not sure it matters, but it looks a bit better to me these days. ;)

I almost always prefer double-triple-quoted strings for multiline string delimiters.

For consistency, I generally use <email address hidden>, bart@, cris@, dave@, and so on. I'm a bit more schizo on message-ids. I think I'm converging on short alphabetical animal names, e.g. <ant>, <bee>, <cat>, <dog>, <eel>, and so on.

A better way to set tempdirs for tests, especially when they correspond to ini file variables, is to use the config.push() interface, which allows you to layer on...

Read more...

review: Approve
lp:~toshio/mailman/maildir updated
7114. By toshio <email address hidden>

Python-2.6 compat fixes

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/mailman/archiving/docs/common.rst'
2--- src/mailman/archiving/docs/common.rst 2012-03-14 04:03:01 +0000
3+++ src/mailman/archiving/docs/common.rst 2012-03-14 05:34:19 +0000
4@@ -62,13 +62,17 @@
5 >>> os.path.exists(pckpath)
6 True
7
8-Note however that the prototype archiver can't archive messages.
9+The prototype archiver is available to simplistically archive messages.
10+::
11
12 >>> archivers['prototype'].archive_message(mlist, msg)
13- Traceback (most recent call last):
14- ...
15- NotImplementedError
16
17+ >>> import os
18+ >>> from mailman import config
19+ >>> archivepath = os.path.join(config.ARCHIVE_DIR, 'prototype',
20+ ... mlist.fqdn_listname, 'new')
21+ >>> len (os.listdir(archivepath)) >= 1
22+ True
23
24 The Mail-Archive.com
25 ====================
26
27=== modified file 'src/mailman/archiving/prototype.py'
28--- src/mailman/archiving/prototype.py 2012-03-14 04:03:01 +0000
29+++ src/mailman/archiving/prototype.py 2012-03-14 05:34:19 +0000
30@@ -25,11 +25,23 @@
31 ]
32
33
34+import os
35+import errno
36+import hashlib
37+import logging
38+
39+from base64 import b32encode
40+from datetime import timedelta
41+from mailbox import Maildir
42 from urlparse import urljoin
43+
44+from flufl.lock import Lock, TimeOutError
45 from zope.interface import implements
46
47+from mailman.config import config
48 from mailman.interfaces.archiver import IArchiver
49
50+elog = logging.getLogger('mailman.error')
51
52
53
54 class Prototype:
55@@ -61,5 +73,39 @@
56
57 @staticmethod
58 def archive_message(mlist, message):
59- """See `IArchiver`."""
60- raise NotImplementedError
61+ """See `IArchiver`.
62+
63+ This sample archiver saves nmessages into a maildir
64+ """
65+ archive_dir = os.path.join(config.ARCHIVE_DIR, 'prototype')
66+ try:
67+ os.makedirs(archive_dir, 0775)
68+ except OSError as e:
69+ # If this already exists, then we're fine
70+ if e.errno != errno.EEXIST:
71+ raise
72+
73+ # Maildir will throw an error if the directories are partially created
74+ # (for instance the toplevel exists but cur, new, or tmp do not)
75+ # therefore we don't create the toplevel as we did above
76+ list_dir = os.path.join(archive_dir, mlist.fqdn_listname)
77+ mail_box = Maildir(list_dir, create=True, factory=None)
78+
79+ # Lock the maildir as Maildir.add() is not threadsafe
80+ lock = Lock(os.path.join(config.LOCK_DIR, '%s-maildir.lock'
81+ % mlist.fqdn_listname))
82+ try:
83+ lock.lock(timeout=timedelta(seconds=1))
84+ # Add the message to the Maildir
85+ # Message_key could be used to construct the file path if
86+ # necessary::
87+ # os.path.join(archive_dir, mlist.fqdn_listname, 'new',
88+ # message_key)
89+ message_key = mail_box.add(message)
90+ except TimeOutError:
91+ # log the error and go on
92+ elog.error('Unable to lock archive for %s, discarded'
93+ ' message: %s' % (mlist.fqdn_listname,
94+ message.get('message-id', '<unknown>')))
95+ finally:
96+ lock.unlock(unconditionally=True)
97
98=== added directory 'src/mailman/archiving/tests'
99=== added file 'src/mailman/archiving/tests/__init__.py'
100=== added file 'src/mailman/archiving/tests/test_prototype.py'
101--- src/mailman/archiving/tests/test_prototype.py 1970-01-01 00:00:00 +0000
102+++ src/mailman/archiving/tests/test_prototype.py 2012-03-14 05:34:19 +0000
103@@ -0,0 +1,151 @@
104+# Copyright (C) 2012 by the Free Software Foundation, Inc.
105+#
106+# This file is part of GNU Mailman.
107+#
108+# GNU Mailman is free software: you can redistribute it and/or modify it under
109+# the terms of the GNU General Public License as published by the Free
110+# Software Foundation, either version 3 of the License, or (at your option)
111+# any later version.
112+#
113+# GNU Mailman is distributed in the hope that it will be useful, but WITHOUT
114+# ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
115+# FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
116+# more details.
117+#
118+# You should have received a copy of the GNU General Public License along with
119+# GNU Mailman. If not, see <http://www.gnu.org/licenses/>.
120+
121+"""Test the prototype archiver."""
122+
123+from __future__ import absolute_import, print_function, unicode_literals
124+
125+__metaclass__ = type
126+__all__ = [
127+ ]
128+
129+import os
130+import shutil
131+import tempfile
132+import unittest
133+import threading
134+
135+from flufl.lock import Lock
136+
137+from mailman.app.lifecycle import create_list
138+from mailman.archiving import prototype
139+from mailman.config import config
140+from mailman.testing.helpers import LogFileMark
141+from mailman.testing.helpers import specialized_message_from_string as smfs
142+from mailman.testing.layers import ConfigLayer
143+
144+
145+class test_PrototypeArchiveMethod(unittest.TestCase):
146+ layer = ConfigLayer
147+
148+ def setUp(self):
149+ # Create a fake mailing list and message object
150+ self.message = smfs('''\
151+To: test@example.com
152+From: admin@example.com
153+Subject: Testing the test list
154+Message-ID: <DEADBEEF@example.com>
155+
156+Tests are better than not tests
157+but the water deserves to be swum
158+ ''')
159+
160+ self.mlist = create_list('test@example.com')
161+ config.db.commit()
162+
163+ # Set some config directories where we won't bash real data
164+ config.ARCHIVE_DIR = '%s%s' % (tempfile.mkdtemp(), os.path.sep)
165+ config.LOCK_DIR = tempfile.mkdtemp()
166+
167+ # Structure of a maildir
168+ self.expected_dir_structure = frozenset(
169+ (os.path.join(config.ARCHIVE_DIR, f) for f in (
170+ '',
171+ 'prototype',
172+ os.path.join('prototype', self.mlist.fqdn_listname),
173+ os.path.join('prototype', self.mlist.fqdn_listname, 'cur'),
174+ os.path.join('prototype', self.mlist.fqdn_listname, 'new'),
175+ os.path.join('prototype', self.mlist.fqdn_listname, 'tmp'),
176+ )
177+ )
178+ )
179+
180+ def tearDown(self):
181+ shutil.rmtree(config.ARCHIVE_DIR)
182+ shutil.rmtree(config.LOCK_DIR)
183+
184+ def _find(self, path):
185+ all_filenames = set()
186+ for dirs in os.walk(path):
187+ directory = dirs[0]
188+ if not isinstance(directory, unicode):
189+ directory = unicode(directory)
190+ all_filenames.add(directory)
191+ if dirs[2]:
192+ for filename in dirs[2]:
193+ new_filename = os.path.join(dirs[0], filename)
194+ if not isinstance(new_filename, unicode):
195+ new_filename = unicode(new_filename)
196+ all_filenames.add(new_filename)
197+ return all_filenames
198+
199+ def test_archive_maildir_created(self):
200+ prototype.Prototype.archive_message(self.mlist, self.message)
201+ all_filenames = self._find(config.ARCHIVE_DIR)
202+ # Check that the directory structure has been created and we have one
203+ # more file (the archived message) than expected directories
204+ self.assertTrue(self.expected_dir_structure.issubset(all_filenames))
205+ self.assertEqual(len(all_filenames), len(self.expected_dir_structure) + 1)
206+
207+ def test_archive_maildir_existance_does_not_raise(self):
208+ os.makedirs(os.path.join(config.ARCHIVE_DIR, 'prototype',
209+ self.mlist.fqdn_listname, 'cur'))
210+ os.mkdir(os.path.join(config.ARCHIVE_DIR, 'prototype',
211+ self.mlist.fqdn_listname, 'new'))
212+ os.mkdir(os.path.join(config.ARCHIVE_DIR, 'prototype',
213+ self.mlist.fqdn_listname, 'tmp'))
214+
215+ # Checking that no exception is raised in this circumstance because it
216+ # will be the common case (adding a new message to an archive whose
217+ # directories have alreay been created)
218+ try:
219+ prototype.Prototype.archive_message(self.mlist, self.message)
220+ except:
221+ self.assertTrue(False, 'Exception raised when the archive'
222+ ' directory structure already in place')
223+
224+ def test_archive_lock_used(self):
225+ # Test that locking the maildir when adding works as a failure here
226+ # could mean we lose mail
227+ lock = Lock(os.path.join(config.LOCK_DIR, '%s-maildir.lock'
228+ % self.mlist.fqdn_listname))
229+ with lock:
230+ # Take this lock. Then make sure the archiver fails while that's
231+ # working.
232+ archive_thread = threading.Thread(
233+ target=prototype.Prototype.archive_message,
234+ args=(self.mlist, self.message))
235+ mark = LogFileMark('mailman.error')
236+ archive_thread.run()
237+ # Test that the archiver output the correct error
238+ line = mark.readline()
239+ self.assertTrue(line.endswith('Unable to lock archive for %s,'
240+ ' discarded message: %s\n' % (self.mlist.fqdn_listname,
241+ self.message.get('message-id'))))
242+
243+ # Check that the file didn't get created
244+ created_files = self._find(config.ARCHIVE_DIR)
245+ self.assertEqual(self.expected_dir_structure, created_files)
246+
247+ def test_mail_added(self):
248+ prototype.Prototype.archive_message(self.mlist, self.message)
249+ for filename in os.listdir(os.path.join(config.ARCHIVE_DIR,
250+ 'prototype', self.mlist.fqdn_listname, 'new')):
251+ # Check that the email has been added
252+ email = open(os.path.join(config.ARCHIVE_DIR, 'prototype',
253+ self.mlist.fqdn_listname, 'new', filename))
254+ self.assertTrue((repr(self.message)).endswith(email.read()))
255
256=== modified file 'src/mailman/config/config.py'
257--- src/mailman/config/config.py 2012-03-03 18:40:16 +0000
258+++ src/mailman/config/config.py 2012-03-14 05:34:19 +0000
259@@ -173,6 +173,7 @@
260 lock_dir = category.lock_dir,
261 log_dir = category.log_dir,
262 messages_dir = category.messages_dir,
263+ archive_dir = category.archive_dir,
264 pipermail_private_dir = category.pipermail_private_dir,
265 pipermail_public_dir = category.pipermail_public_dir,
266 queue_dir = category.queue_dir,
267
268=== modified file 'src/mailman/config/schema.cfg'
269--- src/mailman/config/schema.cfg 2012-03-13 05:01:58 +0000
270+++ src/mailman/config/schema.cfg 2012-03-14 05:34:19 +0000
271@@ -113,6 +113,9 @@
272 ext_dir: $var_dir/ext
273 # Directory where the default IMessageStore puts its messages.
274 messages_dir: $var_dir/messages
275+# Directory for archive backends to store their archives in.
276+# Archivers should create a subdirectory in here to store their files
277+archive_dir: $var_dir/archives
278 # Directory for public Pipermail archiver artifacts.
279 pipermail_public_dir: $var_dir/archives/public
280 # Directory for private Pipermail archiver artifacts.