Merge lp:~ralfjung-e/mailman/2.1 into lp:mailman/2.1

Proposed by Ralf Jung
Status: Merged
Merged at revision: 1815
Proposed branch: lp:~ralfjung-e/mailman/2.1
Merge into: lp:mailman/2.1
Diff against target: 167 lines (+76/-4)
7 files modified
Mailman/Cgi/listinfo.py (+17/-2)
Mailman/Cgi/subscribe.py (+8/-2)
Mailman/Defaults.py.in (+19/-0)
Mailman/Utils.py (+29/-0)
templates/de/listinfo.html (+1/-0)
templates/en/listinfo.html (+1/-0)
templates/fr/listinfo.html (+1/-0)
To merge this branch: bzr merge lp:~ralfjung-e/mailman/2.1
Reviewer Review Type Date Requested Status
Mark Sapiro Approve
Review via email: mp+368614@code.launchpad.net

Commit message

Implement a simple CAPTCHA scheme based on questions and answers configured by the site admin.

Fixes https://bugs.launchpad.net/mailman/+bug/1774826.

I have manually edited listinfo.html for a few languages for testing; is there a way to automate that or do I have to manually do that for all languages?

To post a comment you must log in.
Revision history for this message
Jim Popovitch (jimpop) wrote :

I like this solution, it's simple and easily customizable, doesn't rely on 3rd-party integrations, and the CAPTCHA questions can be easily updated. Nice work Ralf.

One thing that I would like to see in the future is storing the CAPTCHAs in pck or even sql, etc.

Revision history for this message
Mark Sapiro (msapiro) wrote :

Thanks for this. I am currently traveling and I will review it in detail as I have time.

Revision history for this message
Jim Popovitch (jimpop) wrote :

If line 20 (in the patch below) sets "captcha_idx = 0" python throws an error:

   TypeError: cannot concatenate 'str' and 'int' objects

I think a better solution would be to remove the captcha_idx from that section of code if it is None, (and set it to None at line 20).

lp:~ralfjung-e/mailman/2.1 updated
1816. By Ralf Jung <email address hidden>

fix computing the form hash when there is no CAPTCHA

Revision history for this message
Ralf Jung (ralfjung-e) wrote :

Ah good catch, I forgot to test this without CAPTCHAS set.

However, the variable is used later to compute the hash for the form secret. Removing it conditionally will mean we have to mirror the same logic on the other side in subscribe.py. So for now I changed it to use the empty string instead. I don't know if this will automatically get included in this MR?

lp:~ralfjung-e/mailman/2.1 updated
1817. By Ralf Jung <email address hidden>

Mention in the docs that 'en' is used as the default key

1818. By Ralf Jung <email address hidden>

Don't enable CAPTCHA if 'en' key is not set

Revision history for this message
Jim Popovitch (jimpop) wrote :

The recent change will be included in the MR, you can confirm this by looking at the diff down below these comments on webform for the MR.

Revision history for this message
Mark Sapiro (msapiro) wrote :

Thanks very much Ralf. I made a few minor style changes and added a NEWS item, but this is committed at https://bazaar.launchpad.net/~mailman-coders/mailman/2.1/revision/1815

review: Approve
Revision history for this message
Ralf Jung (ralfjung-e) wrote :

Great. :) I see you added <mm-captcha-ui> to all the templates, thanks!

Oh and could you please correct my name in NEWS? It's Ralf with an "f". :)

Revision history for this message
Mark Sapiro (msapiro) wrote :

Sorry about the misspelling of your name (I did get it right in the commit message). Fixed now.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'Mailman/Cgi/listinfo.py'
2--- Mailman/Cgi/listinfo.py 2018-06-21 16:23:09 +0000
3+++ Mailman/Cgi/listinfo.py 2019-06-10 20:07:09 +0000
4@@ -216,10 +216,25 @@
5 # drop one : resulting in an invalid format, but it's only
6 # for our hash so it doesn't matter.
7 remote = remote.rsplit(':', 1)[0]
8+ # render CAPTCHA, if configured
9+ if isinstance(mm_cfg.CAPTCHAS, dict) and 'en' in mm_cfg.CAPTCHAS:
10+ (captcha_question, captcha_box, captcha_idx) = \
11+ Utils.captcha_display(mlist, lang, mm_cfg.CAPTCHAS)
12+ pre_question = _(
13+ '''Please answer the following question to prove that
14+ you are not a bot:'''
15+ )
16+ replacements['<mm-captcha-ui>'] = (
17+ """<tr><td BGCOLOR="#dddddd">%s<br>%s</td><td>%s</td></tr>"""
18+ % (pre_question, captcha_question, captcha_box))
19+ else:
20+ captcha_idx = "" # just to have something to include in the hash below
21+ # fill form
22 replacements['<mm-subscribe-form-start>'] += (
23- '<input type="hidden" name="sub_form_token" value="%s:%s">\n'
24- % (now, Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET + ":" +
25+ '<input type="hidden" name="sub_form_token" value="%s:%s:%s">\n'
26+ % (now, captcha_idx, Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET + ":" +
27 now + ":" +
28+ captcha_idx + ":" +
29 mlist.internal_name() + ":" +
30 remote
31 ).hexdigest()
32
33=== modified file 'Mailman/Cgi/subscribe.py'
34--- Mailman/Cgi/subscribe.py 2018-06-17 23:47:34 +0000
35+++ Mailman/Cgi/subscribe.py 2019-06-10 20:07:09 +0000
36@@ -168,13 +168,14 @@
37 # for our hash so it doesn't matter.
38 remote1 = remote.rsplit(':', 1)[0]
39 try:
40- ftime, fhash = cgidata.getfirst('sub_form_token', '').split(':')
41+ ftime, fcaptcha_idx, fhash = cgidata.getfirst('sub_form_token', '').split(':')
42 then = int(ftime)
43 except ValueError:
44- ftime = fhash = ''
45+ ftime = fcaptcha_idx = fhash = ''
46 then = 0
47 token = Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET + ":" +
48 ftime + ":" +
49+ fcaptcha_idx + ":" +
50 mlist.internal_name() + ":" +
51 remote1).hexdigest()
52 if ftime and now - then > mm_cfg.FORM_LIFETIME:
53@@ -189,6 +190,11 @@
54 results.append(
55 _('There was no hidden token in your submission or it was corrupted.'))
56 results.append(_('You must GET the form before submitting it.'))
57+ # Check captcha
58+ if isinstance(mm_cfg.CAPTCHAS, dict):
59+ captcha_answer = cgidata.getvalue('captcha_answer', '')
60+ if not Utils.captcha_verify(fcaptcha_idx, captcha_answer, mm_cfg.CAPTCHAS):
61+ results.append(_('This was not the right answer to the CAPTCHA question.'))
62 # Was an attempt made to subscribe the list to itself?
63 if email == mlist.GetListEmail():
64 syslog('mischief', 'Attempt to self subscribe %s: %s', email, remote)
65
66=== modified file 'Mailman/Defaults.py.in'
67--- Mailman/Defaults.py.in 2019-05-22 00:10:40 +0000
68+++ Mailman/Defaults.py.in 2019-06-10 20:07:09 +0000
69@@ -131,6 +131,25 @@
70 # test.
71 SUBSCRIBE_FORM_MIN_TIME = seconds(5)
72
73+# Use a custom question-answer CAPTCHA to protect against subscription spam.
74+# Has no effect unless SUBSCRIBE_FORM_SECRET is set.
75+# Should be set to a dict mapping language keys to a list of pairs
76+# of questions and regexes for the answers, e.g.
77+# CAPTCHAS = {
78+# 'en': [
79+# ('What is two times six?', '(12|twelve)'),
80+# ('What is this mailing list software called?', '[Mm]ailman'),
81+# ],
82+# 'de': [
83+# ('Was ist 3 mal 6?', '(18|achtzehn)'),
84+# ],
85+# }
86+# The regular expression must match the full string, i.e., it is implicitly
87+# acting as if it had "^" in the beginning and "$" at the end.
88+# An 'en' key must be present and is used as fall-back if there are no questions
89+# for the currently set language.
90+CAPTCHAS = None
91+
92 # Use Google reCAPTCHA to protect the subscription form from spam bots. The
93 # following must be set to a pair of keys issued by the reCAPTCHA service at
94 # https://www.google.com/recaptcha/admin
95
96=== modified file 'Mailman/Utils.py'
97--- Mailman/Utils.py 2019-03-02 02:24:14 +0000
98+++ Mailman/Utils.py 2019-06-10 20:07:09 +0000
99@@ -1576,3 +1576,32 @@
100 if not re.search(r'127\.0\.1\.255$', text, re.MULTILINE):
101 return True
102 return False
103+
104+
105+def captcha_display(mlist, lang, captchas):
106+ """Returns a CAPTCHA question, the HTML for the answer box, and
107+ the data to be put into the CSRF token"""
108+ if not lang in captchas:
109+ lang = 'en'
110+ captchas = captchas[lang]
111+ idx = random.randrange(len(captchas))
112+ question = captchas[idx][0]
113+ box_html = mlist.FormatBox('captcha_answer', size=30)
114+ # Remember to encode the language in the index so that we can get it out again!
115+ return (websafe(question), box_html, lang + "-" + str(idx))
116+
117+def captcha_verify(idx, given_answer, captchas):
118+ try:
119+ (lang, idx) = idx.split("-")
120+ idx = int(idx)
121+ except ValueError:
122+ return False
123+ if not lang in captchas:
124+ return False
125+ captchas = captchas[lang]
126+ if not idx in range(len(captchas)):
127+ return False
128+ # Check the given answer.
129+ # We append a `$` to emulate `re.fullmatch`.
130+ correct_answer_pattern = captchas[idx][1] + "$"
131+ return re.match(correct_answer_pattern, given_answer)
132
133=== modified file 'templates/de/listinfo.html'
134--- templates/de/listinfo.html 2018-01-29 12:58:42 +0000
135+++ templates/de/listinfo.html 2019-06-10 20:07:09 +0000
136@@ -115,6 +115,7 @@
137 </tr>
138 <mm-digest-question-end>
139 <mm-recaptcha-ui>
140+ <mm-captcha-ui>
141 <tr>
142 <td colspan="3">
143 <center><MM-Subscribe-Button></center>
144
145=== modified file 'templates/en/listinfo.html'
146--- templates/en/listinfo.html 2018-01-29 12:58:42 +0000
147+++ templates/en/listinfo.html 2019-06-10 20:07:09 +0000
148@@ -116,6 +116,7 @@
149 </tr>
150 <mm-digest-question-end>
151 <mm-recaptcha-ui>
152+ <mm-captcha-ui>
153 <tr>
154 <td colspan="3">
155 <center><MM-Subscribe-Button></center>
156
157=== modified file 'templates/fr/listinfo.html'
158--- templates/fr/listinfo.html 2018-01-29 12:58:42 +0000
159+++ templates/fr/listinfo.html 2019-06-10 20:07:09 +0000
160@@ -119,6 +119,7 @@
161 </tr>
162 <mm-digest-question-end>
163 <mm-recaptcha-ui>
164+ <mm-captcha-ui>
165 <tr>
166 <td colspan="3">
167 <center><MM-Subscribe-Button></center>