Merge lp:~elachuni/ubuntu-webcatalog/cleanup into lp:ubuntu-webcatalog
Status: Merged
Approved by: Anthony Lenton
Approved revision: 101
Merged at revision: 102
Proposed branch: lp:~elachuni/ubuntu-webcatalog/cleanup
Merge into: lp:ubuntu-webcatalog
Diff against target: 501 lines (+318/-28), 5 files modified
  setup.py (+1/-1)
  src/webcatalog/fixtures/initial_data.json (+9/-9)
  src/webcatalog/management/commands/cleanup.py (+163/-0)
  src/webcatalog/tests/factory.py (+27/-1)
  src/webcatalog/tests/test_commands.py (+118/-17)
To merge this branch: bzr merge lp:~elachuni/ubuntu-webcatalog/cleanup
Related bugs: (none)
Reviewer: Canonical Consumer Applications Hackers (Pending)
Review via email: mp+102237@code.launchpad.net
Commit message
Added a command to clean up nonces and sessions.
Description of the change
This branch adds a command to clean up the oauth nonces and sessions table.
It's based largely on the script used by SSO to do the same thing, which has proved useful for keeping up with the large amounts of data that need to be deleted from a high(ish)-traffic db.
The replication bits have been stripped out (as webcatalog isn't deployed with slony), and it's been converted into a Django management command, so that it's simpler to run from the appservers, on a regular basis from a cronjob.
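The command paces its deletions by adjusting the batch size against a target transaction time, growing or shrinking it by roughly 10% per batch. A standalone sketch of that pacing logic (function name and clamping are hypothetical, not the command's actual code):

```python
def adaptive_batch_sizes(batch_times, target_secs=5, initial=1):
    """Illustrative only: given the observed duration of each delete
    batch, return the batch size that would have been used for it,
    loosely mirroring the cleanup command's +/-10% pacing."""
    size = initial
    sizes = []
    for elapsed in batch_times:
        sizes.append(size)
        # Grow or shrink by 10%, always at least 1 (as in the command).
        wobble = int(size * 0.1 + 1)
        if elapsed > target_secs * 1.1:
            size = max(1, size - wobble)   # too slow: shrink
        elif elapsed < target_secs * 0.9:
            size += wobble                 # fast enough: grow
    return sizes
```

With fast batches the size ramps up gradually, which is what keeps each delete transaction short on a busy database.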
Michael Nelson (michael.nelson) wrote:
Anthony Lenton (elachuni) wrote:
> One broader question though, given that, by definition, a nonce should
> only be used once, shouldn't we just be deleting them as they are
> used... why are they kept around? (I mean, there will be a small
> number that are not used which would need to be cleaned, but I don't
> understand why we're keeping all the used ones).
Nonces are generated by the client, and they're stored on the server to avoid replay attacks. The server's plan is to simply refuse any request that includes a nonce it has already seen. As this would mean that the server needs to store all nonces indefinitely, the server *also* sets an interval after which a request with an old timestamp is rejected. So, stored nonces that are older than this interval (5 hours currently) can be deleted.
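That two-part defence (reject replays, reject anything older than the skew window so old nonces can be purged) can be sketched as a hypothetical helper; this is not webcatalog code, just the idea:

```python
from datetime import datetime, timedelta

# 5-hour skew window, matching the interval mentioned above.
SKEW = timedelta(hours=5)

def check_nonce(seen, nonce, timestamp, now=None):
    """Accept a request only if its nonce is fresh and unseen.

    Because requests older than SKEW are rejected outright, the server
    never needs to remember nonces older than SKEW -- which is exactly
    why the cleanup command can safely delete them.
    """
    now = now or datetime.utcnow()
    if now - timestamp > SKEW:
        return False   # too old: its nonce may already have been purged
    if nonce in seen:
        return False   # replay attempt
    seen.add(nonce)
    return True
```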
> > +try:
> > + import psycopg2
> > + from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
> > + psycopg2_available = True
> > +except ImportError:
> > + psycopg2_available = False
>
> I don't really get why this is done - why not just import it and the
> command will error if it's not available. As it is, this var seems to
> be used just to raise a different error below, in which case we could
> do:
>
> try:
> import psycopg2
> from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
> except ImportError:
> raise CommandError('This command requires psycopg2')
>
> or maybe it's just left over from the SSO version which used the
> variable differently?
>
Right, we could do that import locally within the command's handle method and avoid the variable; I just preferred to keep the import global. The issue with raising an exception outside of the handle() method is that it won't be caught and rendered as a pretty warning message.
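The "pretty warning message" behaviour comes from the framework wrapping handle(): only exceptions raised inside it are caught. A toy analogue (not Django's actual implementation) of why CommandError must be raised from handle() rather than at import time:

```python
class CommandError(Exception):
    pass

class BaseCommand:
    """Toy stand-in: the real Django BaseCommand.execute() behaves
    similarly, converting CommandError into a short error message."""
    def execute(self):
        try:
            return self.handle()
        except CommandError as e:
            # Caught and rendered tidily -- no traceback for the user.
            return "Error: %s" % e

class Cleanup(BaseCommand):
    def handle(self):
        # Raised here, it becomes a friendly message; raised at module
        # import time, it would escape as a raw traceback instead.
        raise CommandError("This command requires psycopg2")
```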
> > +class OAuthNonceCleaner(object):
> > + @staticmethod
> > + def declare_expired_keys_cursor(cur):
> > + print "Opening cursor"
>
> Left over from debug? Or should it be a log?
Commands print their output directly, instead of logging. Checking other commands like import_app_install_data.
> > +class CleanupTestCase(TestCaseWithFactory):
> > + @patch('webcatalog.management.commands.cleanup.connection')
> > + @patch('sys.stdout')
>
> Ah - if you do switch to logging instead of print, you can use the
> LogLevelTestCase (or whatever the helper is called).
Right, or set the instance's stdout before calling?
> > + def run_and_check_output_and_sql(self, func, output, sql, mock_stdout,
> > + mock_connection):
> > + func()
> > + actual_output = ''.join(str(call[0][0])
> > + for call in mock_stdout.write.call_args_list)
> > + self.assertEqual(output, actual_output.split('\n'))
> > + call_list = mock_connection.cursor.return_value.execute.call_args_list
> > + for expected, call in zip(sql, call_list):
> > + actual = (' '.join(call[0][0].split()),) + call[0][1:]
> > + self.assertEqual(expected, actual)
>
> What's the reason for testing with a mock here? Why not pop...
Michael Nelson (michael.nelson) wrote:
Approve
On Tue, Apr 17, 2012 at 2:37 PM, Anthony Lenton
<email address hidden> wrote:
>> One broader question though, given that, by definition, a nonce should
>> only be used once, shouldn't we just be deleting them as they are
>> used... why are they kept around? (I mean, there will be a small
>> number that are not used which would need to be cleaned, but I don't
>> understand why we're keeping all the used ones).
>
> Nonces are generated by the client, and they're stored on the server to avoid replay attacks. The server's plan is to simply refuse any request that includes a nonce it has already seen. As this would mean that the server needs to store all nonces indefinitely, the server *also* sets an interval after which a request with an old timestamp is rejected. So, stored nonces that are older than this interval (5 hours currently) can be deleted.
>
Ah - sorry - I was thinking of something we created and then let
clients use once, of course. *facepalm*.
>> > +try:
>> > + import psycopg2
>> > + from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
>> > + psycopg2_available = True
>> > +except ImportError:
>> > + psycopg2_available = False
>>
>> I don't really get why this is done - why not just import it and the
>> command will error if it's not available. As it is, this var seems to
>> be used just to raise a different error below, in which case we could
>> do:
>>
>> try:
>> import psycopg2
>> from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
>> except ImportError:
>> raise CommandError('This command requires psycopg2')
>>
>> or maybe it's just left over from the SSO version which used the
>> variable differently?
>>
>
> Right, we could do that import locally within the command's handle method and avoid the variable; I just preferred to keep the import global. The issue with raising an exception outside of the handle() method is that it won't be caught and rendered as a pretty warning message.
>
Ah - nice. I'd not read about:
https:/
Great.
>> > +class OAuthNonceCleaner(object):
>> > + @staticmethod
>> > + def declare_expired_keys_cursor(cur):
>> > + print "Opening cursor"
>>
>> Left over from debug? Or should it be a log?
>
> Commands print their output directly, instead of logging. Checking other commands like import_app_install_data.
>
Probably not worth it - I just wasn't sure whether it was intentional.
>> > +class CleanupTestCase(TestCaseWithFactory):
>> > + @patch('webcatalog.management.commands.cleanup.connection')
>> > + @patch('sys.stdout')
>>
>> Ah - if you do switch to logging instead of print, you can use the
>> LogLevelTestCase (or whatever the helper is called).
>
> Right, or set the instance's stdout before calling?
>
>> > + def run_and_check_output_and_sql(self, func, output, sql, mock_stdout,
>> > + mock_connection):
>> > + func()
>> > + actual_output = ''.join(str(call[0][0])
>> >...
101. By Anthony Lenton: Added tests for cleanup commands with data.
Preview Diff
1 | === modified file 'setup.py' |
2 | --- setup.py 2012-03-01 14:53:20 +0000 |
3 | +++ setup.py 2012-04-17 17:00:26 +0000 |
4 | @@ -35,7 +35,7 @@ |
5 | install_requires = [ |
6 | 'django', |
7 | 'setuptools', |
8 | - 'south', |
9 | + 'south==0.7.3', |
10 | 'configglue==0.10', |
11 | 'django-configglue==0.4', |
12 | 'django-openid-auth==0.2', |
13 | |
14 | === modified file 'src/webcatalog/fixtures/initial_data.json' |
15 | --- src/webcatalog/fixtures/initial_data.json 2012-03-09 15:11:21 +0000 |
16 | +++ src/webcatalog/fixtures/initial_data.json 2012-04-17 17:00:26 +0000 |
17 | @@ -90,6 +90,15 @@ |
18 | } |
19 | }, |
20 | { |
21 | + "pk": 13, |
22 | + "model": "webcatalog.department", |
23 | + "fields": { |
24 | + "name": "Graphics", |
25 | + "parent": null, |
26 | + "slug": "graphics" |
27 | + } |
28 | + }, |
29 | + { |
30 | "pk": 12, |
31 | "model": "webcatalog.department", |
32 | "fields": { |
33 | @@ -99,15 +108,6 @@ |
34 | } |
35 | }, |
36 | { |
37 | - "pk": 13, |
38 | - "model": "webcatalog.department", |
39 | - "fields": { |
40 | - "name": "Graphics", |
41 | - "parent": null, |
42 | - "slug": "graphics" |
43 | - } |
44 | - }, |
45 | - { |
46 | "pk": 14, |
47 | "model": "webcatalog.department", |
48 | "fields": { |
49 | |
50 | === added file 'src/webcatalog/management/commands/cleanup.py' |
51 | --- src/webcatalog/management/commands/cleanup.py 1970-01-01 00:00:00 +0000 |
52 | +++ src/webcatalog/management/commands/cleanup.py 2012-04-17 17:00:26 +0000 |
53 | @@ -0,0 +1,163 @@ |
54 | +# -*- coding: utf-8 -*- |
55 | +# This file is part of the Apps Directory |
56 | +# Copyright (C) 2011 Canonical Ltd. |
57 | +# |
58 | +# This program is free software: you can redistribute it and/or modify |
59 | +# it under the terms of the GNU Affero General Public License as |
60 | +# published by the Free Software Foundation, either version 3 of the |
61 | +# License, or (at your option) any later version. |
62 | +# |
63 | +# This program is distributed in the hope that it will be useful, |
64 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
65 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
66 | +# GNU Affero General Public License for more details. |
67 | +# |
68 | +# You should have received a copy of the GNU Affero General Public License |
69 | +# along with this program. If not, see <http://www.gnu.org/licenses/>. |
70 | + |
71 | +"""Management command to clean up sessions and OAuth nonce tables.""" |
72 | + |
73 | +import time |
74 | +from django.core.management.base import ( |
75 | + CommandError, |
76 | + LabelCommand, |
77 | + make_option, |
78 | +) |
79 | +from django.db import connection |
80 | + |
81 | +try: |
82 | + import psycopg2 |
83 | + from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT |
84 | + psycopg2_available = True |
85 | +except ImportError: |
86 | + psycopg2_available = False |
87 | + |
88 | + |
89 | +class DjangoSessionCleaner(object): |
90 | + @staticmethod |
91 | + def declare_expired_keys_cursor(cur): |
92 | + print "Opening cursor" |
93 | + # We order by expire_date to force the index to be used. |
94 | + cur.execute("CLOSE ALL") |
95 | + cur.execute(""" |
96 | + DECLARE _django_session_clean NO SCROLL CURSOR WITH HOLD FOR |
97 | + SELECT session_key FROM django_session |
98 | + WHERE expire_date < CURRENT_TIMESTAMP |
99 | + ORDER BY expire_date |
100 | + """) |
101 | + |
102 | + @staticmethod |
103 | + def remove_batch(cur, batch_size): |
104 | + cur.execute("FETCH %s FROM _django_session_clean", [batch_size]) |
105 | + session_keys = [session_key for session_key, in cur.fetchall()] |
106 | + if not session_keys: |
107 | + return 0 |
108 | + cur.execute(""" |
109 | + DELETE FROM django_session |
110 | + WHERE |
111 | + expire_date < CURRENT_TIMESTAMP |
112 | + AND session_key = ANY(%s) |
113 | + """, (session_keys,)) |
114 | + # len() is better here than querying the cursor for rows actually |
115 | + # removed, as we don't want to terminate the script early in the |
116 | + # unlikely case all the sessions in our batch were reused between |
117 | + # now and when the expired keys cursor was opened. |
118 | + return len(session_keys) |
119 | + |
120 | + |
121 | +class OAuthNonceCleaner(object): |
122 | + @staticmethod |
123 | + def declare_expired_keys_cursor(cur): |
124 | + print "Opening cursor" |
125 | + # We order by created_at to force the index to be used. |
126 | + # Use a 5 hour clock skew as defined in openid.store.nonce |
127 | + cur.execute("CLOSE ALL") |
128 | + cur.execute(""" |
129 | + DECLARE _oauth_nonce_clean NO SCROLL CURSOR WITH HOLD FOR |
130 | + SELECT id FROM webcatalog_nonce |
131 | + WHERE created_at < (CURRENT_TIMESTAMP - INTERVAL '5' HOUR) |
132 | + ORDER BY created_at |
133 | + """) |
134 | + |
135 | + @staticmethod |
136 | + def remove_batch(cur, batch_size): |
137 | + cur.execute("FETCH %s FROM _oauth_nonce_clean", [batch_size]) |
138 | + nonce_ids = [nonce_id for nonce_id, in cur.fetchall()] |
139 | + if not nonce_ids: |
140 | + return 0 |
141 | + cur.execute(""" |
142 | + DELETE FROM webcatalog_nonce |
143 | + WHERE id = ANY(%s) |
144 | + """, (nonce_ids,)) |
145 | + # len() is better here than querying the cursor for rows actually |
146 | + # removed, as we don't want to terminate the script early in the |
147 | + # unlikely case all the nonces in our batch were reused between |
148 | + # now and when the expired keys cursor was opened. |
149 | + return len(nonce_ids) |
150 | + |
151 | + |
152 | +CLEANERS = { |
153 | + 'django_session': DjangoSessionCleaner, |
154 | + 'webcatalog_nonce': OAuthNonceCleaner, |
155 | +} |
156 | + |
157 | + |
158 | +class Command(LabelCommand): |
159 | + help = "Cleans the Django session and OAuth nonce tables." |
160 | + |
161 | + option_list = LabelCommand.option_list + ( |
162 | + make_option('--cursor-life', default=3600, dest="cursor_secs", |
163 | + action='store', |
164 | + help="Hold a cursor open a maximum of SECS seconds"), |
165 | + make_option('--batch-time', default=5, dest="batch_secs", |
166 | + action="store", |
167 | + help="Try to keep batch removal transaction time to SECS seconds"), |
168 | + ) |
169 | + |
170 | + def handle_label(self, table, **options): |
171 | + if not psycopg2_available: |
172 | + raise CommandError('This command requires psycopg2') |
173 | + |
174 | + cursor_secs = int(options.get('cursor_secs', 3600)) |
175 | + target_batch_time = int(options.get('batch_secs', 5)) |
176 | + |
177 | + cleaner_cls = CLEANERS.get(table, None) |
178 | + if cleaner_cls is None: |
179 | + msg = ("Invalid cleaner.\nNo cleaner found for table: %s\n" |
180 | + "Supported tables are: %s" % (table, CLEANERS.keys())) |
181 | + raise CommandError(msg) |
182 | + cleaner = cleaner_cls() |
183 | + connection.isolation_level = ISOLATION_LEVEL_AUTOCOMMIT |
184 | + cur = connection.cursor() |
185 | + |
186 | + removed = -1 |
187 | + total_removed = 0 |
188 | + batch_size = 1 |
189 | + cursor_opened_time = 0 |
190 | + |
191 | + while True: |
192 | + # We don't want to hold the cursor open too long to allow |
193 | + # vacuum to reclaim rows. 1 hour default before it is reopened. |
194 | + if time.time() - cursor_opened_time > cursor_secs: |
195 | + cleaner.declare_expired_keys_cursor(cur) |
196 | + cursor_opened_time = time.time() |
197 | + |
198 | + # Remove a batch of session rows. |
199 | + batch_start = time.time() |
200 | + removed = cleaner.remove_batch(cur, batch_size) |
201 | + actual_batch_time = time.time() - batch_start |
202 | + total_removed += removed |
203 | + print "Removed %d rows (%d total removed). Batch size %d" % ( |
204 | + removed, total_removed, batch_size) |
205 | + |
206 | + # Done |
207 | + if removed == 0: |
208 | + print "All done." |
209 | + break |
210 | + |
211 | + # Increase or decrease the batch size by 10%, minimum 1. |
212 | + batch_size_wobble = int(batch_size * 0.1 + 1) |
213 | + if actual_batch_time > target_batch_time * 1.1: |
214 | + batch_size -= batch_size_wobble |
215 | + elif actual_batch_time < target_batch_time * 0.9: |
216 | + batch_size += batch_size_wobble |
217 | |
218 | === modified file 'src/webcatalog/tests/factory.py' |
219 | --- src/webcatalog/tests/factory.py 2012-04-13 20:45:49 +0000 |
220 | +++ src/webcatalog/tests/factory.py 2012-04-17 17:00:26 +0000 |
221 | @@ -22,9 +22,13 @@ |
222 | with_statement, |
223 | ) |
224 | import os |
225 | +from datetime import ( |
226 | + datetime, |
227 | + timedelta, |
228 | +) |
229 | from itertools import count |
230 | - |
231 | from django.contrib.auth.models import User |
232 | +from django.contrib.sessions.models import Session |
233 | from django.test import TestCase |
234 | from django_openid_auth.models import UserOpenID |
235 | |
236 | @@ -36,6 +40,7 @@ |
237 | DistroSeries, |
238 | Exhibit, |
239 | Machine, |
240 | + Nonce, |
241 | Token, |
242 | ) |
243 | from webcatalog.utilities import full_claimed_id |
244 | @@ -194,6 +199,27 @@ |
245 | token.save() |
246 | return token, consumer |
247 | |
248 | + def make_nonce(self, token=None, consumer=None, created_at=None): |
249 | + if token is None or consumer is None: |
250 | + assert token is None and consumer is None |
251 | + token, consumer = self.make_oauth_token_and_consumer() |
252 | + nonce = Nonce.objects.create(token=token, consumer=consumer, |
253 | + nonce=self.get_unique_string(prefix='nonce-')) |
254 | + if created_at: |
255 | + nonce.created_at = created_at |
256 | + nonce.save() |
257 | + return nonce |
258 | + |
259 | + def make_session(self, expire_date=None): |
260 | + if expire_date is None: |
261 | + expire_date = datetime.now() + timedelta( |
262 | + seconds=self.get_unique_integer()) |
263 | + return Session.objects.create( |
264 | + session_key=self.get_unique_string(prefix='key-'), |
265 | + session_data=self.get_unique_string(prefix='session-data-'), |
266 | + expire_date=expire_date |
267 | + ) |
268 | + |
269 | |
270 | class TestCaseWithFactory(TestCase): |
271 | |
272 | |
273 | === modified file 'src/webcatalog/tests/test_commands.py' |
274 | --- src/webcatalog/tests/test_commands.py 2012-04-13 20:45:49 +0000 |
275 | +++ src/webcatalog/tests/test_commands.py 2012-04-17 17:00:26 +0000 |
276 | @@ -33,7 +33,12 @@ |
277 | from decimal import Decimal |
278 | |
279 | from django.conf import settings |
280 | +from django.contrib.sessions.models import Session |
281 | from django.core.management import call_command |
282 | +from django.db import connection |
283 | +from django.utils.unittest import skipUnless |
284 | +from django.test.testcases import TransactionTestCase |
285 | + |
286 | from mock import ( |
287 | patch, |
288 | MagicMock, |
289 | @@ -45,23 +50,27 @@ |
290 | Application, |
291 | DistroSeries, |
292 | Exhibit, |
293 | + Nonce, |
294 | ReviewStatsImport, |
295 | ) |
296 | -from webcatalog.management.commands.import_app_install_data import ( |
297 | - Command as ImportAppInstallCommand, |
298 | - ) |
299 | -from webcatalog.management.commands.import_ratings_stats import ( |
300 | - Command as ImportRatingsStatsCommand, |
301 | - ) |
302 | -from webcatalog.tests.factory import TestCaseWithFactory |
303 | +from webcatalog.management.commands import ( |
304 | + import_app_install_data, |
305 | + import_ratings_stats, |
306 | +) |
307 | +from webcatalog.tests.factory import ( |
308 | + TestCaseWithFactory, |
309 | + WebCatalogObjectFactory, |
310 | +) |
311 | |
312 | __metaclass__ = type |
313 | __all__ = [ |
314 | + 'CheckAllLatestTestCase', |
315 | + 'CleanupTestCase', |
316 | + 'CleanupDataTestCase', |
317 | 'ImportAppInstallTestCase', |
318 | 'ImportExhibitsTestCase', |
319 | 'ImportForPurchaseAppsTestCase', |
320 | 'ImportRatingsTestCase', |
321 | - 'CheckAllLatestTestCase', |
322 | ] |
323 | |
324 | |
325 | @@ -188,7 +197,7 @@ |
326 | with patch(get_uri_fn) as mock_get_uri: |
327 | mock_get_uri.return_value = 'http://example.com/my.deb' |
328 | with patch('urllib.urlretrieve') as mock_urlretrieve: |
329 | - ImportAppInstallCommand().get_package_data_for_series( |
330 | + import_app_install_data.Command().get_package_data_for_series( |
331 | 'app-install-data', 'natty', tmp_dir) |
332 | shutil.rmtree(tmp_dir) |
333 | |
334 | @@ -305,7 +314,7 @@ |
335 | self.assertEqual(scribus.screenshots, ["http://example.com/1.png"]) |
336 | |
337 | def get_package_uri_for_series(self): |
338 | - uri = ImportAppInstallCommand().get_package_uri_for_series( |
339 | + uri = import_app_install_data.Command().get_package_uri_for_series( |
340 | 'app-install-data', 'natty') |
341 | |
342 | self.assertEqual('http://example.com/app-install-1.01.deb', uri) |
343 | @@ -397,7 +406,7 @@ |
344 | self.assertEqual(0, DistroSeries.objects.filter( |
345 | code_name='natty').count()) |
346 | |
347 | - ImportAppInstallCommand().update_from_cache('natty') |
348 | + import_app_install_data.Command().update_from_cache('natty') |
349 | |
350 | self.assertEqual(1, DistroSeries.objects.filter( |
351 | code_name='natty').count()) |
352 | @@ -405,7 +414,7 @@ |
353 | def test_apt_cache_apps_used_also(self): |
354 | self.assertEqual(0, Application.objects.count()) |
355 | |
356 | - ImportAppInstallCommand().update_from_cache('natty') |
357 | + import_app_install_data.Command().update_from_cache('natty') |
358 | |
359 | # There are 4 packages in our mock_cache which is patched in the |
360 | # setUp. |
361 | @@ -531,7 +540,7 @@ |
362 | candidate = Mock() |
363 | candidate.uri = 'http://extras.ubuntu.com/pool/f/foobar.deb' |
364 | candidate.record = {'Icon': 'foo.png'} |
365 | - command = ImportAppInstallCommand() |
366 | + command = import_app_install_data.Command() |
367 | |
368 | command.fetch_icon_from_extras(app, candidate) |
369 | |
370 | @@ -546,7 +555,7 @@ |
371 | candidate = Mock() |
372 | candidate.uri = 'http://extras.ubuntu.com/pool/f/foobar.deb' |
373 | candidate.record = {} |
374 | - command = ImportAppInstallCommand() |
375 | + command = import_app_install_data.Command() |
376 | |
377 | command.fetch_icon_from_extras(app, candidate) |
378 | |
379 | @@ -559,7 +568,7 @@ |
380 | candidate = Mock() |
381 | candidate.uri = 'http://archive.ubuntu.com/pool/f/foobar.deb' |
382 | candidate.record = {'Icon': 'foobar.png'} |
383 | - command = ImportAppInstallCommand() |
384 | + command = import_app_install_data.Command() |
385 | |
386 | command.fetch_icon_from_extras(app, candidate) |
387 | |
388 | @@ -655,7 +664,7 @@ |
389 | package_name=self.factory.get_unique_string(prefix='package-'), |
390 | ratings_average='5.00', ratings_total=1, |
391 | histogram='[0, 0, 0, 0, 1]')) for x in range(3000)] |
392 | - command = ImportRatingsStatsCommand() |
393 | + command = import_ratings_stats.Command() |
394 | |
395 | # update_apps_with_stats returns None on success: |
396 | self.assertIsNone(command.update_apps_with_stats(natty, stats)) |
397 | @@ -665,7 +674,7 @@ |
398 | app = self.factory.make_application() |
399 | stats = [ReviewsStats.from_dict(dict(package_name=app.package_name, |
400 | ratings_average='5.00', ratings_total=1, histogram=None))] |
401 | - command = ImportRatingsStatsCommand() |
402 | + command = import_ratings_stats.Command() |
403 | |
404 | self.assertIsNone(command.update_apps_with_stats(app.distroseries, |
405 | stats)) |
406 | @@ -788,3 +797,95 @@ |
407 | self.assertEqual([True, False], [app.is_latest for app in retrieved]) |
408 | self.assertTrue(Application.objects.get(package_name='bar').is_latest) |
409 | self.assertTrue(Application.objects.get(package_name='baz').is_latest) |
410 | + |
411 | + |
412 | +class CleanupTestCase(TestCaseWithFactory): |
413 | + @patch('webcatalog.management.commands.cleanup.connection') |
414 | + @patch('sys.stdout') |
415 | + def run_and_check_output_and_sql(self, func, output, sql, mock_stdout, |
416 | + mock_connection): |
417 | + func() |
418 | + actual_output = ''.join(str(call[0][0]) |
419 | + for call in mock_stdout.write.call_args_list) |
420 | + self.assertEqual(output, actual_output.split('\n')) |
421 | + call_list = mock_connection.cursor.return_value.execute.call_args_list |
422 | + for expected, call in zip(sql, call_list): |
423 | + actual = (' '.join(call[0][0].split()),) + call[0][1:] |
424 | + self.assertEqual(expected, actual) |
425 | + |
426 | + @skipUnless(connection.vendor == 'postgresql', "Requires postgresql") |
427 | + def test_clean_nonces_no_nonces(self): |
428 | + def cleanup(): |
429 | + call_command('cleanup', 'webcatalog_nonce') |
430 | + output = ['Opening cursor', |
431 | + 'Removed 0 rows (0 total removed). Batch size 1', |
432 | + 'All done.', ''] |
433 | + sql = [ |
434 | + ('CLOSE ALL',), |
435 | + ('DECLARE _oauth_nonce_clean NO SCROLL CURSOR WITH HOLD FOR ' |
436 | + 'SELECT id FROM webcatalog_nonce ' |
437 | + "WHERE created_at < (CURRENT_TIMESTAMP - INTERVAL '5' HOUR) " |
438 | + 'ORDER BY created_at',), |
439 | + ('FETCH %s FROM _oauth_nonce_clean', [1])] |
440 | + self.run_and_check_output_and_sql(cleanup, output, sql) |
441 | + |
442 | + @skipUnless(connection.vendor == 'postgresql', "Requires postgresql") |
443 | + def test_clean_session_no_sessions(self): |
444 | + def cleanup(): |
445 | + call_command('cleanup', 'django_session') |
446 | + output = ['Opening cursor', |
447 | + 'Removed 0 rows (0 total removed). Batch size 1', |
448 | + 'All done.', ''] |
449 | + sql = [ |
450 | + ('CLOSE ALL',), |
451 | + ('DECLARE _django_session_clean NO SCROLL CURSOR WITH HOLD FOR ' |
452 | + 'SELECT session_key FROM django_session WHERE expire_date < ' |
453 | + 'CURRENT_TIMESTAMP ORDER BY expire_date',), |
454 | + ('FETCH %s FROM _django_session_clean', [1])] |
455 | + self.run_and_check_output_and_sql(cleanup, output, sql) |
456 | + |
457 | + |
458 | +class CleanupDataTestCase(TransactionTestCase): |
459 | + def setUp(self): |
460 | + self.factory = WebCatalogObjectFactory() |
461 | + self.patch_stdout = patch('sys.stdout') |
462 | + self.mock_stdout = self.patch_stdout.start() |
463 | + self.addCleanup(self.patch_stdout.stop) |
464 | + |
465 | + def check_expected_output(self): |
466 | + expected_output = '''Opening cursor |
467 | +Removed 1 rows (1 total removed). Batch size 1 |
468 | +Removed 2 rows (3 total removed). Batch size 2 |
469 | +Removed 1 rows (4 total removed). Batch size 3 |
470 | +Removed 0 rows (4 total removed). Batch size 4 |
471 | +All done. |
472 | +''' |
473 | + actual_output = ''.join(str(call[0][0]) |
474 | + for call in self.mock_stdout.write.call_args_list) |
475 | + self.assertEqual(expected_output, actual_output) |
476 | + |
477 | + @skipUnless(connection.vendor == 'postgresql', "Requires postgresql") |
478 | + def test_cleanup_nonces(self): |
479 | + now = datetime.now() |
480 | + for i in range(8): |
481 | + created_at = now - timedelta(seconds=5000 * i) |
482 | + self.factory.make_nonce(created_at=created_at) |
483 | + self.assertEqual(8, Nonce.objects.count()) |
484 | + |
485 | + call_command('cleanup', 'webcatalog_nonce') |
486 | + |
487 | + self.assertEqual(4, Nonce.objects.count()) |
488 | + self.check_expected_output() |
489 | + |
490 | + @skipUnless(connection.vendor == 'postgresql', "Requires postgresql") |
491 | + def test_cleanup_sessions(self): |
492 | + now = datetime.now() |
493 | + for i in range(8): |
494 | + expire_date = now + timedelta(seconds=5000 * (4 - i)) |
495 | + self.factory.make_session(expire_date=expire_date) |
496 | + self.assertEqual(8, Session.objects.count()) |
497 | + |
498 | + call_command('cleanup', 'django_session') |
499 | + |
500 | + self.assertEqual(4, Session.objects.count()) |
501 | + self.check_expected_output() |
On Tue, Apr 17, 2012 at 6:54 AM, Anthony Lenton
<email address hidden> wrote:
> Anthony Lenton has proposed merging lp:~elachuni/ubuntu-webcatalog/cleanup into lp:ubuntu-webcatalog.
>
> Requested reviews:
> Canonical Consumer Applications Hackers (canonical-ca-hackers)
>
> For more details, see:
> https://code.launchpad.net/~elachuni/ubuntu-webcatalog/cleanup/+merge/102237
>
> This branch adds a command to clean up the oauth nonces and sessions table.
>
> It's based largely on the script used by SSO to do the same thing, that has proved useful to keep up and cope with large amounts of data that need to be deleted from a high(ish) traffic db.
Cool - a few comments below about different assumptions (or things I
haven't understood).
One broader question though, given that, by definition, a nonce should
only be used once, shouldn't we just be deleting them as they are
used... why are they kept around? (I mean, there will be a small
number that are not used which would need to be cleaned, but I don't
understand why we're keeping all the used ones).
> The replication bits have been stripped out (as webcatalog isn't deployed with slony), and it's been converted into a Django management command, so that it's simpler to run from the appservers, on a regular basis from a cronjob.
> --
> https://code.launchpad.net/~elachuni/ubuntu-webcatalog/cleanup/+merge/102237
> You are subscribed to branch lp:ubuntu-webcatalog.
>
> === modified file 'setup.py'
> === added file 'src/webcatalog/management/commands/cleanup.py'
> --- src/webcatalog/management/commands/cleanup.py 1970-01-01 00:00:00 +0000
> +++ src/webcatalog/management/commands/cleanup.py 2012-04-17 04:53:18 +0000
> @@ -0,0 +1,165 @@
> +# -*- coding: utf-8 -*-
> +# This file is part of the Apps Directory
> +# Copyright (C) 2011 Canonical Ltd.
> +#
> +# This program is free software: you can redistribute it and/or modify
> +# it under the terms of the GNU Affero General Public License as
> +# published by the Free Software Foundation, either version 3 of the
> +# License, or (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> +# GNU Affero General Public License for more details.
> +#
> +# You should have received a copy of the GNU Affero General Public License
> +# along with this program. If not, see <http://www.gnu.org/licenses/>.
> +
> +"""Management command to clean up sessions and OAuth nonce tables."""
> +
> +import time
> +from django.core.management.base import (
> + CommandError,
> + LabelCommand,
> + make_option,
> +)
> +from django.db import connection
> +
> +try:
> + import psycopg2
> > + from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
> + psycopg2_available = True
> +except ImportError:
> + psycopg2_available = False
I don't really get why this is done - why not just import it and the
command will error if it's not available. As it is, this var seems to
be used just to raise a different error below, in which case we ...