Merge lp:~milo/linaro-patchmetrics/update-patches-fix into lp:linaro-patchmetrics
- update-patches-fix
- Merge into trunk
Status: | Merged |
---|---|
Merged at revision: | 400 |
Proposed branch: | lp:~milo/linaro-patchmetrics/update-patches-fix |
Merge into: | lp:linaro-patchmetrics |
Prerequisite: | lp:~milo/linaro-patchmetrics/crowd-teams |
Diff against target: |
517 lines (+309/-28) 7 files modified
apps/patchmetrics/bin/get-linaro-membership.py (+0/-1) apps/patchmetrics/crowd.py (+22/-2) apps/patchwork/bin/update-committed-patches.py (+28/-7) apps/patchwork/db.py (+120/-0) apps/patchwork/parser.py (+17/-18) apps/patchwork/utils.py (+119/-0) apps/settings.py (+3/-0) |
To merge this branch: | bzr merge lp:~milo/linaro-patchmetrics/update-patches-fix |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Paul Sokolovsky | Approve | ||
Review via email: mp+186357@code.launchpad.net |
Commit message
Description of the change
This MP aims at fixing small issues experienced when running the update-
For general information about the changes, please refer to the commit message.
What was not captured in that message is:
- Some PEP8 fixes
- New settings introduced used to define a file path where to store the SQLite DB
- 383. By Milo Casagrande
-
Fixed output collection for rev-list git command.
- 384. By Milo Casagrande
-
Fixed missing exception.
- 385. By Milo Casagrande
-
Fixed formatting.
- 386. By Milo Casagrande
-
Merged from trunk.
- 387. By Milo Casagrande
-
Removed unused imports, added comment.
- 388. By Milo Casagrande
-
Merged from trunk.
- 389. By Milo Casagrande
-
Updated to use new Crowd code.
- 390. By Milo Casagrande
-
Merged from trunk.
- 391. By Milo Casagrande
-
Merged from trunk.
Paul Sokolovsky (pfalcon) wrote : | # |
Paul Sokolovsky (pfalcon) wrote : | # |
84 + if settings.
85 + print >> sys.stderr, ("CROWD_
86 + "settings.py.\n")
apparently cut&paste error in msg.
Paul Sokolovsky (pfalcon) wrote : | # |
Otherwise, logic looks good, but there really should be expiration of old cached users.
Milo Casagrande (milo) wrote : | # |
On Wed, Dec 4, 2013 at 6:18 PM, Paul Sokolovsky
<email address hidden> wrote:
> Review: Approve
>
> Otherwise, logic looks good, but there really should be expiration of old cached users.
Thanks Paul,
I'll work on that and propose another merge.
--
Milo Casagrande | Automation Engineer
Linaro.org <www.linaro.org> │ Open source software for ARM SoCsThanks Paul,
- 392. By Milo Casagrande
-
Fixed print statement.
Preview Diff
1 | === modified file 'apps/patchmetrics/bin/get-linaro-membership.py' | |||
2 | --- apps/patchmetrics/bin/get-linaro-membership.py 2013-10-17 15:53:51 +0000 | |||
3 | +++ apps/patchmetrics/bin/get-linaro-membership.py 2013-12-05 10:08:30 +0000 | |||
4 | @@ -48,7 +48,6 @@ | |||
5 | 48 | linaro = lp.people['linaro'] | 48 | linaro = lp.people['linaro'] |
6 | 49 | 49 | ||
7 | 50 | if settings.AUTH_CROWD_APPLICATION_USER: | 50 | if settings.AUTH_CROWD_APPLICATION_USER: |
8 | 51 | print "Using Crowd groups" | ||
9 | 52 | whitelisted_groups = [] | 51 | whitelisted_groups = [] |
10 | 53 | if settings.CROWD_GROUPS_WHITELIST is None: | 52 | if settings.CROWD_GROUPS_WHITELIST is None: |
11 | 54 | print "No CROWD_GROUPS_WHITELIST defined, all groups will be used." | 53 | print "No CROWD_GROUPS_WHITELIST defined, all groups will be used." |
12 | 55 | 54 | ||
13 | === modified file 'apps/patchmetrics/crowd.py' | |||
14 | --- apps/patchmetrics/crowd.py 2013-09-27 16:32:30 +0000 | |||
15 | +++ apps/patchmetrics/crowd.py 2013-12-05 10:08:30 +0000 | |||
16 | @@ -20,7 +20,6 @@ | |||
17 | 20 | import base64 | 20 | import base64 |
18 | 21 | import httplib | 21 | import httplib |
19 | 22 | import urllib | 22 | import urllib |
20 | 23 | import ConfigParser | ||
21 | 24 | import json | 23 | import json |
22 | 25 | import types | 24 | import types |
23 | 26 | 25 | ||
24 | @@ -182,6 +181,26 @@ | |||
25 | 182 | 181 | ||
26 | 183 | return user | 182 | return user |
27 | 184 | 183 | ||
28 | 184 | def is_valid_user(self, email): | ||
29 | 185 | """Handy function to check if a user exists or not. | ||
30 | 186 | |||
31 | 187 | :param email: The user email. | ||
32 | 188 | :return True or False. | ||
33 | 189 | """ | ||
34 | 190 | params = {"username": email} | ||
35 | 191 | |||
36 | 192 | resource = "/user?{0}".format(urllib.urlencode(params)) | ||
37 | 193 | api_url = "/crowd/rest/usermanagement/1{0}".format(resource) | ||
38 | 194 | |||
39 | 195 | valid = True | ||
40 | 196 | try: | ||
41 | 197 | self._get(api_url) | ||
42 | 198 | except CrowdNotFoundException: | ||
43 | 199 | # In case of other exceptions, raise them. | ||
44 | 200 | valid = False | ||
45 | 201 | |||
46 | 202 | return valid | ||
47 | 203 | |||
48 | 185 | def _get_rest_usermanagement(self, resource): | 204 | def _get_rest_usermanagement(self, resource): |
49 | 186 | api_url = "/{0}{1}".format(self._uri, resource) | 205 | api_url = "/{0}{1}".format(self._uri, resource) |
50 | 187 | return self._get(api_url) | 206 | return self._get(api_url) |
51 | @@ -211,4 +230,5 @@ | |||
52 | 211 | raise CrowdForbiddenException( | 230 | raise CrowdForbiddenException( |
53 | 212 | "Access forbidden to fulfill the request") | 231 | "Access forbidden to fulfill the request") |
54 | 213 | else: | 232 | else: |
56 | 214 | raise CrowdException("Unknown Crowd status {0}".format(resp.status)) | 233 | raise CrowdException( |
57 | 234 | "Unknown Crowd status {0}".format(resp.status)) | ||
58 | 215 | 235 | ||
59 | === modified file 'apps/patchwork/bin/update-committed-patches.py' | |||
60 | --- apps/patchwork/bin/update-committed-patches.py 2013-10-17 12:22:58 +0000 | |||
61 | +++ apps/patchwork/bin/update-committed-patches.py 2013-12-05 10:08:30 +0000 | |||
62 | @@ -10,26 +10,47 @@ | |||
63 | 10 | # accepted once they reach the master branch. | 10 | # accepted once they reach the master branch. |
64 | 11 | 11 | ||
65 | 12 | import _pythonpath | 12 | import _pythonpath |
66 | 13 | import atexit | ||
67 | 13 | import sys | 14 | import sys |
68 | 15 | |||
69 | 14 | from datetime import datetime | 16 | from datetime import datetime |
74 | 15 | from operator import attrgetter | 17 | from django.conf import settings |
71 | 16 | |||
72 | 17 | from lockfile import FileLock, LockFailed, LockTimeout | ||
73 | 18 | |||
75 | 19 | from django.db.models import Q | 18 | from django.db.models import Q |
76 | 20 | from dulwich.repo import Repo | 19 | from dulwich.repo import Repo |
77 | 20 | from lockfile import FileLock, LockFailed, LockTimeout | ||
78 | 21 | from operator import attrgetter | ||
79 | 21 | 22 | ||
80 | 23 | from patchmetrics.crowd import Crowd | ||
81 | 24 | from patchwork.db import PatchworkDB | ||
82 | 22 | from patchwork.models import Project, State | 25 | from patchwork.models import Project, State |
83 | 23 | from patchwork.utils import ( | 26 | from patchwork.utils import ( |
84 | 24 | ensure_source_checkout_for_project, | 27 | ensure_source_checkout_for_project, |
85 | 25 | get_commits_to_parse, | 28 | get_commits_to_parse, |
87 | 26 | get_diff_for_commit, | 29 | get_diff_for_linaro_commit, |
88 | 27 | get_patches_matching_commit, | 30 | get_patches_matching_commit, |
89 | 28 | GitException | 31 | GitException |
91 | 29 | ) | 32 | ) |
92 | 30 | 33 | ||
93 | 31 | 34 | ||
94 | 32 | def main(): | 35 | def main(): |
95 | 36 | |||
96 | 37 | if settings.AUTH_CROWD_APPLICATION_USER is None: | ||
97 | 38 | print >> sys.stderr, ("AUTH_CROWD_APPLICATION_USER not defined in " | ||
98 | 39 | "settings.py.\n") | ||
99 | 40 | sys.exit(1) | ||
100 | 41 | else: | ||
101 | 42 | crwd = Crowd(settings.AUTH_CROWD_APPLICATION_USER, | ||
102 | 43 | settings.AUTH_CROWD_APPLICATION_PASSWORD, | ||
103 | 44 | settings.AUTH_CROWD_SERVER_REST_URI) | ||
104 | 45 | |||
105 | 46 | if settings.CROWD_USERS_DB_FILE is None: | ||
106 | 47 | print >> sys.stderr, ("CROWD_USERS_DB_FILE not defined in " | ||
107 | 48 | "settings.py.\n") | ||
108 | 49 | sys.exit(1) | ||
109 | 50 | |||
110 | 51 | cache_db = PatchworkDB(settings.CROWD_USERS_DB_FILE) | ||
111 | 52 | atexit.register(cache_db.close) | ||
112 | 53 | |||
113 | 33 | projects = Project.objects.exclude( | 54 | projects = Project.objects.exclude( |
114 | 34 | Q(source_tree__isnull=True) | Q(source_tree='')) | 55 | Q(source_tree__isnull=True) | Q(source_tree='')) |
115 | 35 | 56 | ||
116 | @@ -66,7 +87,7 @@ | |||
117 | 66 | for commit_id in commit_ids: | 87 | for commit_id in commit_ids: |
118 | 67 | project.last_seen_commit_ref = commit_id | 88 | project.last_seen_commit_ref = commit_id |
119 | 68 | project.save() | 89 | project.save() |
121 | 69 | diff = get_diff_for_commit(root, commit_id) | 90 | diff = get_diff_for_linaro_commit(root, commit_id, crwd, cache_db) |
122 | 70 | if diff is None: | 91 | if diff is None: |
123 | 71 | # Some commits result in empty diffs (e.g. only changing file | 92 | # Some commits result in empty diffs (e.g. only changing file |
124 | 72 | # permissions); in these cases we just skip them. | 93 | # permissions); in these cases we just skip them. |
125 | 73 | 94 | ||
126 | === added file 'apps/patchwork/db.py' | |||
127 | --- apps/patchwork/db.py 1970-01-01 00:00:00 +0000 | |||
128 | +++ apps/patchwork/db.py 2013-12-05 10:08:30 +0000 | |||
129 | @@ -0,0 +1,120 @@ | |||
130 | 1 | # Copyright (C) 2013 Linaro | ||
131 | 2 | # | ||
132 | 3 | # Author: Milo Casagrande <milo.casagrande@linaro.org> | ||
133 | 4 | # This file is part of the Patchmetrics package. | ||
134 | 5 | # | ||
135 | 6 | # Patchmetrics is free software; you can redistribute it and/or modify | ||
136 | 7 | # it under the terms of the GNU General Public License as published by | ||
137 | 8 | # the Free Software Foundation; either version 2 of the License, or | ||
138 | 9 | # (at your option) any later version. | ||
139 | 10 | # | ||
140 | 11 | # Patchmetrics is distributed in the hope that it will be useful, | ||
141 | 12 | # but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
142 | 13 | # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
143 | 14 | # GNU General Public License for more details. | ||
144 | 15 | # | ||
145 | 16 | # You should have received a copy of the GNU General Public License | ||
146 | 17 | # along with Patchwork; if not, write to the Free Software | ||
147 | 18 | # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA | ||
148 | 19 | |||
149 | 20 | import sqlite3 | ||
150 | 21 | |||
151 | 22 | |||
152 | 23 | class PatchworkDB(object): | ||
153 | 24 | """A cache backed by a SQLite DB to store users. | ||
154 | 25 | |||
155 | 26 | This is used to provide a users cache in order to speed up the process | ||
156 | 27 | of analyzing patches from a git repository. | ||
157 | 28 | |||
158 | 29 | Users are stored with their email address, and a boolean value indicating | ||
159 | 30 | if the user is a valid Linaro user. A valid Linaro user is a user that | ||
160 | 31 | exists in Linaro Login (Crowd). | ||
161 | 32 | |||
162 | 33 | There is no user logic here, it is just used as a cache backend. | ||
163 | 34 | """ | ||
164 | 35 | |||
165 | 36 | def __init__(self, db_file): | ||
166 | 37 | self.db_file = db_file | ||
167 | 38 | self._connection = None | ||
168 | 39 | self._cursor = None | ||
169 | 40 | |||
170 | 41 | self._init_db() | ||
171 | 42 | |||
172 | 43 | def _init_db(self): | ||
173 | 44 | if not self.table_exists('users'): | ||
174 | 45 | self.cursor.execute('''CREATE TABLE users | ||
175 | 46 | (id TEXT PRIMARY KEY, valid INTEGER, ts TIMESTAMP)''') | ||
176 | 47 | |||
177 | 48 | def close(self): | ||
178 | 49 | """Closes the DB connection.""" | ||
179 | 50 | # Force a commit, even if we are in autocommit mode. | ||
180 | 51 | self.save() | ||
181 | 52 | self.connection.close() | ||
182 | 53 | |||
183 | 54 | def save(self): | ||
184 | 55 | """Performs a commit on the DB.""" | ||
185 | 56 | self.connection.commit() | ||
186 | 57 | |||
187 | 58 | @property | ||
188 | 59 | def connection(self): | ||
189 | 60 | """The connection to the DB.""" | ||
190 | 61 | if self._connection is None: | ||
191 | 62 | self._connection = sqlite3.connect(self.db_file) | ||
192 | 63 | return self._connection | ||
193 | 64 | |||
194 | 65 | @property | ||
195 | 66 | def cursor(self): | ||
196 | 67 | """The connection cursor.""" | ||
197 | 68 | if self._cursor is None: | ||
198 | 69 | self._cursor = self.connection.cursor() | ||
199 | 70 | return self._cursor | ||
200 | 71 | |||
201 | 72 | def table_exists(self, table): | ||
202 | 73 | """Checks if a table exists in the DB. | ||
203 | 74 | |||
204 | 75 | :param table: The name of the table to check. | ||
205 | 76 | """ | ||
206 | 77 | self.cursor.execute('''SELECT name FROM sqlite_master WHERE | ||
207 | 78 | type='table' AND name=?''', [table]) | ||
208 | 79 | |||
209 | 80 | exists = False | ||
210 | 81 | if self.cursor.fetchone(): | ||
211 | 82 | exists = True | ||
212 | 83 | return exists | ||
213 | 84 | |||
214 | 85 | def insert_user(self, email, valid, timestamp): | ||
215 | 86 | """Inserts a user in the DB. | ||
216 | 87 | |||
217 | 88 | :param email: The user email, this is the unique key in the DB. | ||
218 | 89 | :param valid: If the user is a valid Linaro user. | ||
219 | 90 | :param timestamp: When the user was added in the cache. | ||
220 | 91 | """ | ||
221 | 92 | self.cursor.execute('''INSERT INTO users(id, valid, ts) | ||
222 | 93 | VALUES (?, ?, ?)''', (email, valid, timestamp)) | ||
223 | 94 | |||
224 | 95 | def update_user(self, email, valid, timestamp): | ||
225 | 96 | """Updates a user in the DB.""" | ||
226 | 97 | self.cursor.execute('''UPDATE users SET valid=?, ts=? WHERE id=?''', | ||
227 | 98 | (valid, timestamp, email)) | ||
228 | 99 | |||
229 | 100 | def get_user(self, email): | ||
230 | 101 | """Retrieves a user from the DB. | ||
231 | 102 | |||
232 | 103 | :param email: The user email. | ||
233 | 104 | :return A tuple with the data found, None otherwise. | ||
234 | 105 | """ | ||
235 | 106 | self.cursor.execute('''SELECT * FROM users where id=?''', [email]) | ||
236 | 107 | return self.cursor.fetchone() | ||
237 | 108 | |||
238 | 109 | def user_exists(self, email): | ||
239 | 110 | """Verifies if a user exists in the DB. | ||
240 | 111 | |||
241 | 112 | :param email: The user email. | ||
242 | 113 | :return True or False. | ||
243 | 114 | """ | ||
244 | 115 | result = self.get_user(email) | ||
245 | 116 | |||
246 | 117 | exists = False | ||
247 | 118 | if result: | ||
248 | 119 | exists = True | ||
249 | 120 | return exists | ||
250 | 0 | 121 | ||
251 | === modified file 'apps/patchwork/parser.py' | |||
252 | --- apps/patchwork/parser.py 2012-10-18 09:28:17 +0000 | |||
253 | +++ apps/patchwork/parser.py 2013-12-05 10:08:30 +0000 | |||
254 | @@ -20,7 +20,9 @@ | |||
255 | 20 | # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA | 20 | # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
256 | 21 | 21 | ||
257 | 22 | 22 | ||
258 | 23 | import StringIO | ||
259 | 23 | import re | 24 | import re |
260 | 25 | import types | ||
261 | 24 | 26 | ||
262 | 25 | try: | 27 | try: |
263 | 26 | import hashlib | 28 | import hashlib |
264 | @@ -32,7 +34,14 @@ | |||
265 | 32 | _hunk_re = re.compile('^\@\@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? \@\@') | 34 | _hunk_re = re.compile('^\@\@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? \@\@') |
266 | 33 | _filename_re = re.compile('^(---|\+\+\+) (\S+)') | 35 | _filename_re = re.compile('^(---|\+\+\+) (\S+)') |
267 | 34 | 36 | ||
268 | 37 | |||
269 | 35 | def parse_patch(text): | 38 | def parse_patch(text): |
270 | 39 | text_file = None | ||
271 | 40 | if isinstance(text, types.StringTypes): | ||
272 | 41 | text_file = StringIO.StringIO(text) | ||
273 | 42 | else: | ||
274 | 43 | text_file = text | ||
275 | 44 | |||
276 | 36 | patchbuf = '' | 45 | patchbuf = '' |
277 | 37 | commentbuf = '' | 46 | commentbuf = '' |
278 | 38 | buf = '' | 47 | buf = '' |
279 | @@ -63,20 +72,15 @@ | |||
280 | 63 | lc = (0, 0) | 72 | lc = (0, 0) |
281 | 64 | hunk = 0 | 73 | hunk = 0 |
282 | 65 | 74 | ||
287 | 66 | 75 | for line in text_file: | |
284 | 67 | for line in text.split('\n'): | ||
285 | 68 | line += '\n' | ||
286 | 69 | |||
288 | 70 | if state == 0: | 76 | if state == 0: |
289 | 71 | if line.startswith('diff ') or line.startswith('===') \ | 77 | if line.startswith('diff ') or line.startswith('===') \ |
290 | 72 | or line.startswith('Index: '): | 78 | or line.startswith('Index: '): |
291 | 73 | state = 1 | 79 | state = 1 |
292 | 74 | buf += line | 80 | buf += line |
293 | 75 | |||
294 | 76 | elif line.startswith('--- '): | 81 | elif line.startswith('--- '): |
295 | 77 | state = 2 | 82 | state = 2 |
296 | 78 | buf += line | 83 | buf += line |
297 | 79 | |||
298 | 80 | else: | 84 | else: |
299 | 81 | commentbuf += line | 85 | commentbuf += line |
300 | 82 | 86 | ||
301 | @@ -89,11 +93,9 @@ | |||
302 | 89 | if line.startswith('+++ '): | 93 | if line.startswith('+++ '): |
303 | 90 | state = 3 | 94 | state = 3 |
304 | 91 | buf += line | 95 | buf += line |
305 | 92 | |||
306 | 93 | elif hunk: | 96 | elif hunk: |
307 | 94 | state = 1 | 97 | state = 1 |
308 | 95 | buf += line | 98 | buf += line |
309 | 96 | |||
310 | 97 | else: | 99 | else: |
311 | 98 | state = 0 | 100 | state = 0 |
312 | 99 | commentbuf += buf + line | 101 | commentbuf += buf + line |
313 | @@ -102,7 +104,6 @@ | |||
314 | 102 | elif state == 3: | 104 | elif state == 3: |
315 | 103 | match = _hunk_re.match(line) | 105 | match = _hunk_re.match(line) |
316 | 104 | if match: | 106 | if match: |
317 | 105 | |||
318 | 106 | def fn(x): | 107 | def fn(x): |
319 | 107 | if not x: | 108 | if not x: |
320 | 108 | return 1 | 109 | return 1 |
321 | @@ -113,16 +114,13 @@ | |||
322 | 113 | state = 4 | 114 | state = 4 |
323 | 114 | patchbuf += buf + line | 115 | patchbuf += buf + line |
324 | 115 | buf = '' | 116 | buf = '' |
325 | 116 | |||
326 | 117 | elif line.startswith('--- '): | 117 | elif line.startswith('--- '): |
327 | 118 | patchbuf += buf + line | 118 | patchbuf += buf + line |
328 | 119 | buf = '' | 119 | buf = '' |
329 | 120 | state = 2 | 120 | state = 2 |
330 | 121 | |||
331 | 122 | elif hunk: | 121 | elif hunk: |
332 | 123 | state = 1 | 122 | state = 1 |
333 | 124 | buf += line | 123 | buf += line |
334 | 125 | |||
335 | 126 | else: | 124 | else: |
336 | 127 | state = 0 | 125 | state = 0 |
337 | 128 | commentbuf += buf + line | 126 | commentbuf += buf + line |
338 | @@ -161,6 +159,7 @@ | |||
339 | 161 | 159 | ||
340 | 162 | return (patchbuf, commentbuf) | 160 | return (patchbuf, commentbuf) |
341 | 163 | 161 | ||
342 | 162 | |||
343 | 164 | def hash_patch(str): | 163 | def hash_patch(str): |
344 | 165 | # normalise spaces | 164 | # normalise spaces |
345 | 166 | str = str.replace('\r', '') | 165 | str = str.replace('\r', '') |
346 | @@ -213,12 +212,12 @@ | |||
347 | 213 | from optparse import OptionParser | 212 | from optparse import OptionParser |
348 | 214 | 213 | ||
349 | 215 | parser = OptionParser() | 214 | parser = OptionParser() |
356 | 216 | parser.add_option('-p', '--patch', action = 'store_true', | 215 | parser.add_option('-p', '--patch', action='store_true', |
357 | 217 | dest = 'print_patch', help = 'print parsed patch') | 216 | dest='print_patch', help='print parsed patch') |
358 | 218 | parser.add_option('-c', '--comment', action = 'store_true', | 217 | parser.add_option('-c', '--comment', action='store_true', |
359 | 219 | dest = 'print_comment', help = 'print parsed comment') | 218 | dest='print_comment', help='print parsed comment') |
360 | 220 | parser.add_option('-#', '--hash', action = 'store_true', | 219 | parser.add_option('-#', '--hash', action='store_true', |
361 | 221 | dest = 'print_hash', help = 'print patch hash') | 220 | dest='print_hash', help='print patch hash') |
362 | 222 | 221 | ||
363 | 223 | (options, args) = parser.parse_args() | 222 | (options, args) = parser.parse_args() |
364 | 224 | 223 | ||
365 | 225 | 224 | ||
366 | === modified file 'apps/patchwork/utils.py' | |||
367 | --- apps/patchwork/utils.py 2013-10-16 12:29:19 +0000 | |||
368 | +++ apps/patchwork/utils.py 2013-12-05 10:08:30 +0000 | |||
369 | @@ -17,8 +17,10 @@ | |||
370 | 17 | # along with Patchwork; if not, write to the Free Software | 17 | # along with Patchwork; if not, write to the Free Software |
371 | 18 | # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA | 18 | # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA |
372 | 19 | 19 | ||
373 | 20 | import StringIO | ||
374 | 20 | import atexit | 21 | import atexit |
375 | 21 | import chardet | 22 | import chardet |
376 | 23 | import datetime | ||
377 | 22 | import os | 24 | import os |
378 | 23 | import re | 25 | import re |
379 | 24 | import subprocess | 26 | import subprocess |
380 | @@ -395,6 +397,123 @@ | |||
381 | 395 | return patch | 397 | return patch |
382 | 396 | 398 | ||
383 | 397 | 399 | ||
384 | 400 | def get_diff_for_linaro_commit(root, commit_id, crowd, db): | ||
385 | 401 | """Gets a diff from a commit id for a Linaro made change. | ||
386 | 402 | |||
387 | 403 | If the author or the committer of the commit is not a Linaro user, then | ||
388 | 404 | the commit will not be considered. | ||
389 | 405 | |||
390 | 406 | This is a slightly different implementation than `get_diff_for_commit`, | ||
391 | 407 | where here it checks if the user is a valid Linaro account before taking | ||
392 | 408 | in account the patch. | ||
393 | 409 | |||
394 | 410 | :param root: The path on the file system where the repository is stored. | ||
395 | 411 | :param commit_id: The commmit ID to parse. | ||
396 | 412 | :param crowd: A Crowd object used to determine if the user is from Linaro. | ||
397 | 413 | :param db: A PatchworkDB object used as a cache for users. | ||
398 | 414 | """ | ||
399 | 415 | stdout = "" | ||
400 | 416 | patch = None | ||
401 | 417 | author = None | ||
402 | 418 | committer = None | ||
403 | 419 | |||
404 | 420 | # We create a custom view on the commit id diff. We extract the author | ||
405 | 421 | # and committer emails, and check if they are from Linaro people, otherwise | ||
406 | 422 | # we do not even consider the commit and move one. | ||
407 | 423 | # Most of the time patchwork/patchmetrics spends a lot of time calculating | ||
408 | 424 | # diffs for patches that might not even be from Linaro people. | ||
409 | 425 | # The first format prints on stdout the author and commiter emails. | ||
410 | 426 | # The rest of the format string is exactly as what is shown with a simple | ||
411 | 427 | # `git show COMMIT_ID`. | ||
412 | 428 | format_string = "%ae%n%ce%n" | ||
413 | 429 | format_string += "commit %H%n" | ||
414 | 430 | format_string += "Author: %aN <%aE>%n" | ||
415 | 431 | format_string += "Date: %ad%n" | ||
416 | 432 | |||
417 | 433 | cmd_args = ['git', 'show', '--format=format:{0}'.format(format_string), | ||
418 | 434 | commit_id] | ||
419 | 435 | proc = subprocess.Popen(cmd_args, cwd=root, bufsize=4096, | ||
420 | 436 | stdout=subprocess.PIPE, stderr=DEVNULL) | ||
421 | 437 | |||
422 | 438 | while True: | ||
423 | 439 | for line in iter(proc.stdout.readline, b''): | ||
424 | 440 | stdout += line | ||
425 | 441 | if proc.poll() is not None: | ||
426 | 442 | break | ||
427 | 443 | |||
428 | 444 | if proc.returncode != 0: | ||
429 | 445 | print ("Error retrieving diff for commit {0}".format(commit_id)) | ||
430 | 446 | else: | ||
431 | 447 | # Since with big diffs a lot of time is spent decoding the output from | ||
432 | 448 | # the git command, we first read two lines from the output (author and | ||
433 | 449 | # commiter emails), perform validations on those, and in case one of | ||
434 | 450 | # them is a valid Linaro user we continue with output decoding and | ||
435 | 451 | # diff parsing. | ||
436 | 452 | string_file = StringIO.StringIO(stdout) | ||
437 | 453 | |||
438 | 454 | author = string_file.readline().strip().decode('utf-8') | ||
439 | 455 | committer = string_file.readline().strip().decode('utf-8') | ||
440 | 456 | |||
441 | 457 | valid_patch = False | ||
442 | 458 | if author == committer: | ||
443 | 459 | valid_patch |= is_linaro_user(author, db, crowd) | ||
444 | 460 | else: | ||
445 | 461 | valid_patch |= is_linaro_user(author, db, crowd) | ||
446 | 462 | valid_patch |= is_linaro_user(committer, db, crowd) | ||
447 | 463 | |||
448 | 464 | if valid_patch: | ||
449 | 465 | try: | ||
450 | 466 | diff = stdout.decode('utf-8') | ||
451 | 467 | except UnicodeDecodeError: | ||
452 | 468 | try: | ||
453 | 469 | # chardet.detect is rather slow so we only use it when we | ||
454 | 470 | # fail # to decode from utf-8. | ||
455 | 471 | encoding = chardet.detect(stdout)['encoding'] | ||
456 | 472 | diff = stdout.decode(encoding) | ||
457 | 473 | except UnicodeDecodeError: | ||
458 | 474 | print ("Skipping commit {0} as we can't find the " | ||
459 | 475 | "appropriate charset to decode " | ||
460 | 476 | "it".format(commit_id)) | ||
461 | 477 | return None | ||
462 | 478 | |||
463 | 479 | # This is a double action, but it is still faster then decoding | ||
464 | 480 | # everytime the output even when the patch does not come from a | ||
465 | 481 | # Linaro user. | ||
466 | 482 | string_file = StringIO.StringIO(diff) | ||
467 | 483 | # We need to pass the StringIO object pointing to the 3rd line | ||
468 | 484 | # since the first two have already been read, and they contains | ||
469 | 485 | # information not needed for the patch. | ||
470 | 486 | for i in range(0, 2): | ||
471 | 487 | string_file.readline() | ||
472 | 488 | patch, _ = parse_patch(string_file) | ||
473 | 489 | |||
474 | 490 | return patch | ||
475 | 491 | |||
476 | 492 | |||
477 | 493 | def is_linaro_user(user, db, crowd): | ||
478 | 494 | """Verifies if a user is a valid Linaro account. | ||
479 | 495 | |||
480 | 496 | :param user: The email address to verify. | ||
481 | 497 | :param db: A PatchworkDB instance for the cache backend. | ||
482 | 498 | :param crowd: A Crowd instance to retrieve user data if necessary. | ||
483 | 499 | """ | ||
484 | 500 | cached_user = db.get_user(user) | ||
485 | 501 | |||
486 | 502 | is_valid = False | ||
487 | 503 | if cached_user: | ||
488 | 504 | # TODO: we need to set a timeout and check again even if the user | ||
489 | 505 | # is cached, it might have been removed from Crowd. | ||
490 | 506 | is_valid |= cached_user[1] | ||
491 | 507 | else: | ||
492 | 508 | # XXX: The first time this runs, it might hit the Crowd server hard, | ||
493 | 509 | # since none of the users will be cached. | ||
494 | 510 | valid_user = crowd.is_valid_user(user) | ||
495 | 511 | is_valid |= valid_user | ||
496 | 512 | db.insert_user(user, valid_user, datetime.datetime.now()) | ||
497 | 513 | |||
498 | 514 | return is_valid | ||
499 | 515 | |||
500 | 516 | |||
501 | 398 | def get_patches_matching_commit(project, commit_diff, commit_msg, | 517 | def get_patches_matching_commit(project, commit_diff, commit_msg, |
502 | 399 | commit_author): | 518 | commit_author): |
503 | 400 | """Return the Patch objects that are likely matches to the given commit. | 519 | """Return the Patch objects that are likely matches to the given commit. |
504 | 401 | 520 | ||
505 | === modified file 'apps/settings.py' | |||
506 | --- apps/settings.py 2013-10-23 19:18:06 +0000 | |||
507 | +++ apps/settings.py 2013-12-05 10:08:30 +0000 | |||
508 | @@ -147,6 +147,9 @@ | |||
509 | 147 | # File with a list of groups that should be created in the DB. If other groups | 147 | # File with a list of groups that should be created in the DB. If other groups |
510 | 148 | # are taken out from Crowd, and they are not here, they will be skipped. | 148 | # are taken out from Crowd, and they are not here, they will be skipped. |
511 | 149 | CROWD_GROUPS_WHITELIST = None | 149 | CROWD_GROUPS_WHITELIST = None |
512 | 150 | # Full path to a DB file. | ||
513 | 151 | # This is used to have a cache of which users exists in the Crowd database. | ||
514 | 152 | CROWD_USERS_DB_FILE = None | ||
515 | 150 | 153 | ||
516 | 151 | # Where to store the git repos that we scan to identify which patches have | 154 | # Where to store the git repos that we scan to identify which patches have |
517 | 152 | # been committed. | 155 | # been committed. |
r383: It says it "Fixed output collection for rev-list git command.", but it's unclear what exactly is fixed. Replacement code doesn't look to be better then original, it actually gotta have stdout/stderr read deadlock bug. But it's definitely more verbose and magic-ridden (like use of 2-arg iter, "b" string prefix etc.) Please consider these rhetoric remarks - if such changes are done, it's nice to give more details, though this specific case is minor.