Merge lp:~widelands-dev/widelands-website/anti_spam_app into lp:widelands-website
- anti_spam_app
- Merge into trunk
Status: | Merged |
---|---|
Merged at revision: | 477 |
Proposed branch: | lp:~widelands-dev/widelands-website/anti_spam_app |
Merge into: | lp:widelands-website |
Diff against target: |
434 lines (+207/-64) 14 files modified
check_input/__init__.py (+1/-0) check_input/admin.py (+18/-0) check_input/apps.py (+5/-0) check_input/management/commands/send_suspicious_mail.py (+15/-13) check_input/migrations/0001_initial.py (+30/-0) check_input/models.py (+50/-0) check_input/urls.py (+6/-0) check_input/views.py (+36/-0) local_settings.py.sample (+1/-2) pybb/forms.py (+1/-30) pybb/views.py (+34/-16) settings.py (+1/-0) templates/check_input/moderate_info.html (+8/-3) urls.py (+1/-0) |
To merge this branch: | bzr merge lp:~widelands-dev/widelands-website/anti_spam_app |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
SirVer | Approve | ||
Review via email: mp+334232@code.launchpad.net |
Commit message
Description of the change
Decouple check for spam (suspicious user content) from pybb. Main goal is to add the possibility to add such checks to different apps, e.g. wiki or comments.
For pybb we have already the hidden property, so i used it. For other apps it is maybe possible to add a template filter or a middleware to hide suspicious content. This is work for the future...
- Implement a new app 'check_input' containing:
- A new model SuspiciousInput which is responsible for checking users input and collecting data about suspicious content. Beside the username and suspicious text, the model stores the applications model where this suspicious text comes from. E.g. pybb/Post or pybb/Topic with the appropriate ID
- Moved management command for sending hidden_posts_mail into the new app and renamed it
- Added an admin page for the new model
- Removed the http 403 response, instead reworked the informational view of moderating users input which is prompted to the user who wrote suspicious content.
- Make pybb use of the new app, including a check when editing posts
Deleting a user removes also the entries in model SuspiciousInput.
An entry in SuspiciousInput get not deleted when a post get unhidden. So an admin is responsible for removing the entry in this case.
kaputtnik (franku) wrote : | # |
Thanks for the review :-)
Will merge it at Thursday. I want to make backup of the database before merging this. Making a backup of the database prevent saving anything. So i am thinking about having an additional noticeable row below the navigation showing something like
"The website is currently set to read only!"
So there is no need to set the website into maintenance mode when making a backup, and visitors are informed. What do you think?
GunChleoc (gunchleoc) wrote : | # |
I think that's a good idea - maybe still switch the forums off though, since navigating them also needs changes to the database.
kaputtnik (franku) wrote : | # |
After reading your post and rethinking my suggestion it is much problematic as it seems. Beside the forums, there are also other things which write to the database while using the website, e.g. Downloading a map (Increasing the download number).
It is also not clear how to handle active sessions. Say a user with an active session visits the website when it is read only and then he tries to log out.
Better set the website status to maintenance when making a backup. Currently this seems saver to me :-)
SirVer (sirver) wrote : | # |
Yes, setting it to maintance is surely saver - but you could run a backup once on the live system just to see how long it takes and if it is feasible to set the site to maintance for this.
kaputtnik (franku) wrote : | # |
> but you could run a backup once on the live system just to see how long it takes
I remember the last time i did this it was not clear to me how long it takes. Say if the terminal lost the ssh connection it stays in a frozen way (no response from the server). That time it takes incredible long time... but i am not sure if the time was really used by mysqldump because i cant see if the connection was lost. Locally making a mysqldump is very fast, faster than with mixed InnoDB and MyIsam tables.
SirVer (sirver) wrote : | # |
You can check the duration by using /usr/bin/time <command>. You can avoid terminal connection drop by running tmux.
Basically:
$ tmux
$ /usr/bin/time mysqldump <more options>
Ups connection to the server dropped here at some point.
ssh again, now
$ tmux attach
here you are, back in your old session.
kaputtnik (franku) wrote : | # |
Merged and deployed.
Making a backup of the database was done in a few seconds... I have to admit that i only made a backup of the productive database.
User only having staff status (and are not superusers), may not see the entry "Check Input/Suspicious Input" on the main admin page. I gave GunCheloc the appropriate rights so she can see and enter the sub page. Don't know if some one else should also have the rights.
Hopefully all works as intended and testet locally ...
I use screen as a terminal multiplexer, i think your approach with tmux works also with screen?
SirVer (sirver) wrote : | # |
> I use screen as a terminal multiplexer, i think your approach with tmux works also with screen?
Yes, it does.
Thanks for deploying!
GunChleoc (gunchleoc) wrote : | # |
I think the rights are sufficient - I don't think that anybody else will want to change this at the moment.
Preview Diff
1 | === added directory 'check_input' |
2 | === added file 'check_input/__init__.py' |
3 | --- check_input/__init__.py 1970-01-01 00:00:00 +0000 |
4 | +++ check_input/__init__.py 2017-11-29 22:46:31 +0000 |
5 | @@ -0,0 +1,1 @@ |
6 | +default_app_config = 'check_input.apps.CheckInput' |
7 | |
8 | === added file 'check_input/admin.py' |
9 | --- check_input/admin.py 1970-01-01 00:00:00 +0000 |
10 | +++ check_input/admin.py 2017-11-29 22:46:31 +0000 |
11 | @@ -0,0 +1,18 @@ |
12 | +from check_input.models import SuspiciousInput |
13 | +from django.contrib import admin |
14 | +from django.contrib.contenttypes.models import ContentType |
15 | + |
16 | + |
17 | +class SuspiciousInputAdmin(admin.ModelAdmin): |
18 | + list_display = ('text', 'user', 'get_app') |
19 | + readonly_fields = ('text', 'user', 'get_app',) |
20 | + exclude = ('content_type', 'object_id', ) |
21 | + |
22 | + def get_app(self, obj): |
23 | + app = ContentType.objects.get_for_id( |
24 | + obj.content_type_id) |
25 | + |
26 | + return '%s/%s' % (app.app_label, app.name) |
27 | + get_app.short_description = 'Found in App/Model' |
28 | + |
29 | +admin.site.register(SuspiciousInput, SuspiciousInputAdmin) |
30 | |
31 | === added file 'check_input/apps.py' |
32 | --- check_input/apps.py 1970-01-01 00:00:00 +0000 |
33 | +++ check_input/apps.py 2017-11-29 22:46:31 +0000 |
34 | @@ -0,0 +1,5 @@ |
35 | +from django.apps import AppConfig |
36 | + |
37 | +class CheckInput(AppConfig): |
38 | + name = 'check_input' |
39 | + verbose_name = "Check Input" |
40 | |
41 | === added directory 'check_input/management' |
42 | === added file 'check_input/management/__init__.py' |
43 | === added directory 'check_input/management/commands' |
44 | === added file 'check_input/management/commands/__init__.py' |
45 | === renamed file 'pybb/management/commands/send_hidden_post_mail.py' => 'check_input/management/commands/send_suspicious_mail.py' |
46 | --- pybb/management/commands/send_hidden_post_mail.py 2016-12-13 18:28:51 +0000 |
47 | +++ check_input/management/commands/send_suspicious_mail.py 2017-11-29 22:46:31 +0000 |
48 | @@ -1,24 +1,26 @@ |
49 | from django.core.management.base import BaseCommand |
50 | -from pybb.models import Post |
51 | +from check_input.models import SuspiciousInput |
52 | from django.core.mail import send_mail |
53 | from django.conf import settings |
54 | from django.contrib.sites.models import Site |
55 | +from django.contrib.contenttypes.models import ContentType |
56 | |
57 | |
58 | class Command(BaseCommand): |
59 | - help = 'Send emails if hidden posts are found' |
60 | + help = 'Send email if suspicious content is found' |
61 | |
62 | def handle(self, *args, **options): |
63 | - hidden_posts = Post.objects.filter(hidden=True) |
64 | - |
65 | - if hidden_posts: |
66 | - message = 'There were %d hidden posts found:' % len(hidden_posts) |
67 | - for post in hidden_posts: |
68 | - message += '\n' + post.user.username + \ |
69 | - ': ' + post.body_text[:70] |
70 | - |
71 | - message += '\n\nAdmin page: ' + Site.objects.get_current().domain + \ |
72 | - '/admin/pybb/post/' |
73 | + spams = SuspiciousInput.objects.all() |
74 | + if spams: |
75 | + message = 'There were %d hidden posts found:' % len(spams) |
76 | + |
77 | + for spam in spams: |
78 | + app = ContentType.objects.get_for_id( |
79 | + spam.content_type_id) |
80 | + message += '\nIn %s/%s: ' % (app.app_label, app.model) |
81 | + message += '\n User \'%s\' wrote: %s' % (spam.user, spam.text) |
82 | + |
83 | + message += '\n\nAdmin page: https://%s/admin/pybb/post/' % Site.objects.get_current().domain |
84 | recipients = [addr[1] for addr in settings.ADMINS] |
85 | - send_mail('Hidden posts were found', message, 'pybb@widelands.org', |
86 | + send_mail('Hidden posts were found', message, 'admins@widelands.org', |
87 | recipients, fail_silently=False) |
88 | |
89 | === added directory 'check_input/migrations' |
90 | === added file 'check_input/migrations/0001_initial.py' |
91 | --- check_input/migrations/0001_initial.py 1970-01-01 00:00:00 +0000 |
92 | +++ check_input/migrations/0001_initial.py 2017-11-29 22:46:31 +0000 |
93 | @@ -0,0 +1,30 @@ |
94 | +# -*- coding: utf-8 -*- |
95 | +from __future__ import unicode_literals |
96 | + |
97 | +from django.db import models, migrations |
98 | +from django.conf import settings |
99 | + |
100 | + |
101 | +class Migration(migrations.Migration): |
102 | + |
103 | + dependencies = [ |
104 | + ('contenttypes', '0002_remove_content_type_name'), |
105 | + migrations.swappable_dependency(settings.AUTH_USER_MODEL), |
106 | + ] |
107 | + |
108 | + operations = [ |
109 | + migrations.CreateModel( |
110 | + name='SuspiciousInput', |
111 | + fields=[ |
112 | + ('id', models.AutoField(verbose_name='ID', serialize=False, auto_created=True, primary_key=True)), |
113 | + ('text', models.CharField(max_length=200, verbose_name=b'suspicious user input')), |
114 | + ('object_id', models.PositiveIntegerField()), |
115 | + ('content_type', models.ForeignKey(verbose_name=b'related model', to='contenttypes.ContentType')), |
116 | + ('user', models.ForeignKey(verbose_name=b'related user', to=settings.AUTH_USER_MODEL)), |
117 | + ], |
118 | + options={ |
119 | + 'ordering': ['content_type_id'], |
120 | + 'default_permissions': ('change', 'delete'), |
121 | + }, |
122 | + ), |
123 | + ] |
124 | |
125 | === added file 'check_input/migrations/__init__.py' |
126 | === added file 'check_input/models.py' |
127 | --- check_input/models.py 1970-01-01 00:00:00 +0000 |
128 | +++ check_input/models.py 2017-11-29 22:46:31 +0000 |
129 | @@ -0,0 +1,50 @@ |
130 | +from django.db import models |
131 | +from django.contrib.auth.models import User |
132 | +from django.contrib.contenttypes.models import ContentType |
133 | +from django.contrib.contenttypes.fields import GenericForeignKey |
134 | +from django.conf import settings |
135 | +import re |
136 | + |
137 | + |
138 | +class SuspiciousInput(models.Model): |
139 | + """Model for collecting suspicios user input. |
140 | + |
141 | + Call the check_input method with this attributes: |
142 | + content_object = Model instance of a saved(!) object |
143 | + user = user |
144 | + text = text to check for suspicious content |
145 | + |
146 | + Example: |
147 | + is_suspicous = SuspiciousInput.check_input(content_object=post, |
148 | +user=post.user, text=post.body) |
149 | + |
150 | + """ |
151 | + |
152 | + text = models.CharField( |
153 | + max_length=200, verbose_name='suspicious user input') |
154 | + user = models.ForeignKey(User, verbose_name='related user') |
155 | + content_type = models.ForeignKey(ContentType, verbose_name='related model') |
156 | + object_id = models.PositiveIntegerField() |
157 | + content_object = GenericForeignKey('content_type', 'object_id') |
158 | + |
159 | + class Meta: |
160 | + ordering = ['content_type_id'] |
161 | + default_permissions = ('change', 'delete',) |
162 | + |
163 | + def __unicode__(self): |
164 | + return self.text |
165 | + |
166 | + def is_suspicious(self): |
167 | + if any(x in self.text.lower() for x in settings.ANTI_SPAM_KWRDS): |
168 | + return True |
169 | + if re.search(settings.ANTI_SPAM_PHONE_NR, self.text): |
170 | + return True |
171 | + return False |
172 | + |
173 | + @classmethod |
174 | + def check_input(cls, *args, **kwargs): |
175 | + user_input = cls(*args, **kwargs) |
176 | + is_spam = user_input.is_suspicious() |
177 | + if is_spam: |
178 | + user_input.save() |
179 | + return is_spam |
180 | |
181 | === added file 'check_input/urls.py' |
182 | --- check_input/urls.py 1970-01-01 00:00:00 +0000 |
183 | +++ check_input/urls.py 2017-11-29 22:46:31 +0000 |
184 | @@ -0,0 +1,6 @@ |
185 | +from django.conf.urls import * |
186 | +from check_input import views |
187 | + |
188 | +urlpatterns = [ |
189 | + url(r'^$', views.moderate_info, name='found_spam'), |
190 | + ] |
191 | |
192 | === added file 'check_input/views.py' |
193 | --- check_input/views.py 1970-01-01 00:00:00 +0000 |
194 | +++ check_input/views.py 2017-11-29 22:46:31 +0000 |
195 | @@ -0,0 +1,36 @@ |
196 | +from django.shortcuts import render |
197 | +from django.shortcuts import get_object_or_404 |
198 | +from django.http import HttpResponseRedirect |
199 | +from django.conf import settings |
200 | +from django.contrib.auth import logout |
201 | +from django.contrib.auth.models import User |
202 | +from check_input.models import SuspiciousInput |
203 | + |
204 | + |
205 | +def moderate_info(request): |
206 | + """Redirect to the moderate comments info page.""" |
207 | + |
208 | + # We need the try to catch logged out users |
209 | + try: |
210 | + hidden_posts_count = SuspiciousInput.objects.filter( |
211 | + user=request.user).count() |
212 | + except TypeError: |
213 | + return HttpResponseRedirect('/') |
214 | + |
215 | + # Don't make the page accesible through browsers addressbar |
216 | + if hidden_posts_count == 0: |
217 | + return HttpResponseRedirect('/') |
218 | + |
219 | + if hidden_posts_count >= settings.MAX_HIDDEN_POSTS: |
220 | + user = get_object_or_404(User, username=request.user) |
221 | + # Set the user inactive so he can't login |
222 | + user.is_active = False |
223 | + user.save() |
224 | + # Log the user out |
225 | + logout(request) |
226 | + |
227 | + context={ |
228 | + 'max_count': settings.MAX_HIDDEN_POSTS, |
229 | + 'act_count': hidden_posts_count, |
230 | + } |
231 | + return render(request, 'check_input/moderate_info.html', context=context) |
232 | |
233 | === modified file 'local_settings.py.sample' |
234 | --- local_settings.py.sample 2017-11-01 16:29:00 +0000 |
235 | +++ local_settings.py.sample 2017-11-29 22:46:31 +0000 |
236 | @@ -48,8 +48,7 @@ |
237 | |
238 | # Anti spam keywords |
239 | # If these are found, the posts/topics in forum get hidden |
240 | -ANTI_SPAM_BODY = ['spam'] |
241 | -ANTI_SPAM_TOPIC = ['spam'] |
242 | +ANTI_SPAM_KWRDS = ['spam'] |
243 | ANTI_SPAM_PHONE_NR = re.compile('\d{8,16}') |
244 | MAX_HIDDEN_POSTS = 5 |
245 | |
246 | |
247 | === modified file 'pybb/forms.py' |
248 | --- pybb/forms.py 2017-04-23 21:00:01 +0000 |
249 | +++ pybb/forms.py 2017-11-29 22:46:31 +0000 |
250 | @@ -10,7 +10,6 @@ |
251 | from pybb.models import Topic, Post, PrivateMessage, Attachment |
252 | from pybb import settings as pybb_settings |
253 | from django.conf import settings |
254 | -from notification.models import send, get_observers_for |
255 | |
256 | |
257 | class AddPostForm(forms.ModelForm): |
258 | @@ -62,42 +61,14 @@ |
259 | topic_is_new = False |
260 | topic = self.topic |
261 | |
262 | - # Check for spam and hide the post |
263 | - # TODO(Franku): This is currently a simple keyword search. Maybe add akismet check here |
264 | - # could be improved... |
265 | - # The admins get informed of hidden post(s) over |
266 | - # a Django command. See pybb/management/commands |
267 | - hidden = False |
268 | - text = self.cleaned_data['body'] |
269 | - if any(x in text.lower() for x in settings.ANTI_SPAM_BODY): |
270 | - hidden = True |
271 | - |
272 | - if re.search(settings.ANTI_SPAM_PHONE_NR, text): |
273 | - hidden = True |
274 | - |
275 | - if topic_is_new: |
276 | - text = self.cleaned_data['name'] |
277 | - if any(x in text.lower() for x in settings.ANTI_SPAM_TOPIC): |
278 | - hidden = True |
279 | - if re.search(settings.ANTI_SPAM_PHONE_NR, text): |
280 | - hidden = True |
281 | - |
282 | post = Post(topic=topic, user=self.user, user_ip=self.ip, |
283 | markup=self.cleaned_data['markup'], |
284 | - body=self.cleaned_data['body'], hidden=hidden) |
285 | + body=self.cleaned_data['body']) |
286 | post.save(*args, **kwargs) |
287 | |
288 | if pybb_settings.ATTACHMENT_ENABLE: |
289 | self.save_attachment(post, self.cleaned_data['attachment']) |
290 | |
291 | - if not hidden: |
292 | - if topic_is_new: |
293 | - send(get_observers_for('forum_new_topic'), 'forum_new_topic', |
294 | - {'topic': topic, 'post': post, 'user': topic.user}, queue = True) |
295 | - else: |
296 | - send(self.topic.subscribers.all(), 'forum_new_post', |
297 | - {'post': post, 'topic': topic, 'user': post.user}, queue = True) |
298 | - |
299 | return post |
300 | |
301 | def save_attachment(self, post, memfile): |
302 | |
303 | === modified file 'pybb/views.py' |
304 | --- pybb/views.py 2017-10-27 17:30:08 +0000 |
305 | +++ pybb/views.py 2017-11-29 22:46:31 +0000 |
306 | @@ -11,7 +11,7 @@ |
307 | from django.db import connection |
308 | from django.utils import translation |
309 | from django.shortcuts import render |
310 | -from django.contrib.auth import logout |
311 | + |
312 | |
313 | from pybb.util import render_to, paged, build_form, quote_text, ajax, urlize |
314 | from pybb.models import Category, Forum, Topic, Post, PrivateMessage, Attachment,\ |
315 | @@ -21,6 +21,7 @@ |
316 | from pybb.orm import load_related |
317 | |
318 | from wl_utils import get_real_ip |
319 | +from check_input.models import SuspiciousInput |
320 | |
321 | try: |
322 | from notification import models as notification |
323 | @@ -158,22 +159,34 @@ |
324 | |
325 | if form.is_valid(): |
326 | post = form.save() |
327 | + |
328 | + is_spam = False |
329 | + # Check for spam in topics name for new topics |
330 | if not topic: |
331 | - post.topic.subscribers.add(request.user) |
332 | - |
333 | - if post.hidden: |
334 | - hidden_posts_count = Post.objects.filter( |
335 | - user=request.user, hidden=True).count() |
336 | - |
337 | - if hidden_posts_count >= settings.MAX_HIDDEN_POSTS: |
338 | - user = get_object_or_404(User, username=request.user) |
339 | - # Set the user inactive so he can't login |
340 | - user.is_active = False |
341 | - user.save() |
342 | - # Log the user out |
343 | - logout(request) |
344 | - return HttpResponse(status=403) |
345 | - return HttpResponseRedirect('pybb_moderate_info') |
346 | + is_spam = SuspiciousInput.check_input( |
347 | + content_object=post.topic, user=post.topic.user, text=post.topic.name) |
348 | + # Check for spam in Post |
349 | + if not is_spam: |
350 | + is_spam = SuspiciousInput.check_input( |
351 | + content_object=post, user=post.user, text=post.body) |
352 | + |
353 | + if is_spam: |
354 | + post.hidden = is_spam |
355 | + post.save(update_fields=['hidden']) |
356 | + return HttpResponseRedirect('/moderated/') |
357 | + |
358 | + if notification: |
359 | + if not topic: |
360 | + # Inform subscribers of a new topic |
361 | + notification.send(notification.get_observers_for('forum_new_topic'), 'forum_new_topic', |
362 | + {'topic': post.topic, 'post': post, 'user': post.topic.user}, queue = True) |
363 | + # Topics author is subscriber for all new posts in his topic |
364 | + post.topic.subscribers.add(request.user) |
365 | + |
366 | + else: |
367 | + # Send mails about a new post to topic subscribers |
368 | + notification.send(post.topic.subscribers.all(), 'forum_new_post', |
369 | + {'post': post, 'topic': topic, 'user': post.user}, queue = True) |
370 | |
371 | return HttpResponseRedirect(post.get_absolute_url()) |
372 | |
373 | @@ -220,6 +233,11 @@ |
374 | |
375 | if form.is_valid(): |
376 | post = form.save() |
377 | + is_spam = SuspiciousInput.check_input(content_object=post, user=post.user, text=post.body) |
378 | + if is_spam: |
379 | + post.hidden = is_spam |
380 | + post.save() |
381 | + return HttpResponseRedirect('/moderated/') |
382 | return HttpResponseRedirect(post.get_absolute_url()) |
383 | |
384 | return {'form': form, |
385 | |
386 | === modified file 'settings.py' |
387 | --- settings.py 2017-09-29 21:24:37 +0000 |
388 | +++ settings.py 2017-11-29 22:46:31 +0000 |
389 | @@ -94,6 +94,7 @@ |
390 | 'wlmaps', |
391 | 'wlscreens', |
392 | 'wlggz', |
393 | + 'check_input', |
394 | 'haystack', # search engine; see option HAYSTACK_CONNECTIONS |
395 | |
396 | # Modified 3rd party apps |
397 | |
398 | === added directory 'templates/check_input' |
399 | === renamed file 'templates/pybb/pybb_moderate_info.html' => 'templates/check_input/moderate_info.html' |
400 | --- templates/pybb/pybb_moderate_info.html 2016-10-08 09:30:34 +0000 |
401 | +++ templates/check_input/moderate_info.html 2017-11-29 22:46:31 +0000 |
402 | @@ -1,12 +1,17 @@ |
403 | -{% extends 'pybb/base.html' %} |
404 | +{% extends 'base.html' %} |
405 | |
406 | {% block content %} |
407 | |
408 | <h1>All comments have to be moderated</h1> |
409 | |
410 | <div class="blogEntry"> |
411 | - <p>Your comment has been saved but hidden to normal users. A moderator |
412 | - will take a look at it and review it as soon as possible.</p> |
413 | + <p>Your comment has to be reviewd by a moderator.</p> |
414 | + {% if act_count == max_count|add:"-1" %} |
415 | + <p><strong>Attention! The next time you will get logged out and deactivated!</strong></p> |
416 | + {% endif %} |
417 | + {% if act_count == max_count %} |
418 | + <p><strong>You can't login anymore...</strong></p> |
419 | + {% endif %} |
420 | </div> |
421 | |
422 | {% endblock %} |
423 | |
424 | === modified file 'urls.py' |
425 | --- urls.py 2017-08-24 16:19:01 +0000 |
426 | +++ urls.py 2017-11-29 22:46:31 +0000 |
427 | @@ -65,6 +65,7 @@ |
428 | url(r'^maps/', include('wlmaps.urls')), |
429 | url(r'^screenshots/', include('wlscreens.urls')), |
430 | url(r'^ggz/', include('wlggz.urls')), |
431 | + url(r'^moderated/', include('check_input.urls')), |
432 | ] |
433 | |
434 | try: |
lgtm!