Merge ~ilasc/launchpad:git-repack-cron-script into launchpad:master
- Git
- lp:~ilasc/launchpad
- git-repack-cron-script
- Merge into master
Status: | Merged |
---|---|
Approved by: | Ioana Lasc |
Approved revision: | a078850beec0a6bee402dfc649f259481bb58708 |
Merge reported by: | Otto Co-Pilot |
Merged at revision: | not available |
Proposed branch: | ~ilasc/launchpad:git-repack-cron-script |
Merge into: | launchpad:master |
Diff against target: |
340 lines (+289/-0) 6 files modified
cronscripts/repack_git_repositories.py (+33/-0) database/schema/security.cfg (+1/-0) lib/lp/code/model/gitrepository.py (+1/-0) lib/lp/code/scripts/repackgitrepository.py (+97/-0) lib/lp/code/scripts/tests/test_repack_git_repositories.py (+154/-0) lib/lp/services/config/schema-lazr.conf (+3/-0) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Colin Watson (community) | Approve | ||
Review via email: mp+398963@code.launchpad.net |
Commit message
Add cron script for git repack
Description of the change
- 8e67df7... by Ioana Lasc
-
Organize imports
Ioana Lasc (ilasc) wrote : | # |
Colin Watson (cjwatson) : | # |
- 7ce4087... by Ioana Lasc
-
Move repackgitrepository to lp/code/scripts
- 51ca1f4... by Ioana Lasc
-
Cleanup unnecessary git-repack user
Ioana Lasc (ilasc) wrote : | # |
Thanks Colin, MP is now ready for review.
Colin Watson (cjwatson) wrote : | # |
Looks like a good first attempt, thanks!
This is potentially quite operationally expensive, so in this review I've paid special attention to things that I think could make it generally kinder and gentler to our infrastructure; the "Needs Fixing" is mainly for those.
Colin Watson (cjwatson) : | # |
- 3020f70... by Ioana Lasc
-
Add TunableLoop to git repack automation
Ioana Lasc (ilasc) wrote : | # |
Thanks Colin, excellent suggestions as usual!
This is now working with a TunableLoop but needs more work: the TunableLoop doesn't hook into the fake turnip thread for the calls to gitHostingClient and it is oops-ing with: https:/
I'm working on getting the LaunchpadCronScript and TunableLoop to interact with Fake Turnip Thread in unit test instead of trying to contact the real application.
In terms of the permissions added to the branchscanner user, this is how it crashed if SELECT rights are not granted to the user:https:/
And it appears I've gone overboard with permissions as UPDATE is not necessary, so will be taking that out.
I didn't miss your last comment about adding at least another test for the case where we have more than single repository qualifying for repack. As soon as I get the Unit Tests to play nicely with the FakeTurnipAppli
- 82b632e... by Ioana Lasc
-
Set default timeout function for RepackTunableLoop
- 5f84608... by Ioana Lasc
-
Add unit tests for RepackTunableLoop
Ioana Lasc (ilasc) wrote : | # |
MP now ready for review.
Colin Watson (cjwatson) wrote : | # |
Thanks for the updates. Various details follow; at least the index needs to be sorted out before landing this.
- a078850... by Ioana Lasc
-
Address review comments for repack cron job
Preview Diff
1 | diff --git a/cronscripts/repack_git_repositories.py b/cronscripts/repack_git_repositories.py | |||
2 | 0 | new file mode 100755 | 0 | new file mode 100755 |
3 | index 0000000..e553616 | |||
4 | --- /dev/null | |||
5 | +++ b/cronscripts/repack_git_repositories.py | |||
6 | @@ -0,0 +1,33 @@ | |||
7 | 1 | #!/usr/bin/python2 -S | ||
8 | 2 | # | ||
9 | 3 | # Copyright 2021 Canonical Ltd. This software is licensed under the | ||
10 | 4 | # GNU Affero General Public License version 3 (see the file LICENSE). | ||
11 | 5 | |||
12 | 6 | import _pythonpath | ||
13 | 7 | |||
14 | 8 | from lp.code.scripts.repackgitrepository import RepackTunableLoop | ||
15 | 9 | from lp.services.config import config | ||
16 | 10 | from lp.services.scripts.base import LaunchpadCronScript | ||
17 | 11 | from lp.services.timeout import set_default_timeout_function | ||
18 | 12 | |||
19 | 13 | |||
20 | 14 | class RepackGitRepositories(LaunchpadCronScript): | ||
21 | 15 | |||
22 | 16 | def add_my_options(self): | ||
23 | 17 | self.parser.add_option( | ||
24 | 18 | "--dry-run", action="store_true", | ||
25 | 19 | dest="dry_run", default=False, | ||
26 | 20 | help="Reports which repositories would be repacked without " | ||
27 | 21 | "actually repacking the repositories.") | ||
28 | 22 | |||
29 | 23 | def main(self): | ||
30 | 24 | set_default_timeout_function( | ||
31 | 25 | lambda: config.repack_git_repositories.timeout) | ||
32 | 26 | updater = RepackTunableLoop(self.logger, self.options.dry_run) | ||
33 | 27 | updater.run() | ||
34 | 28 | |||
35 | 29 | |||
36 | 30 | if __name__ == '__main__': | ||
37 | 31 | script = RepackGitRepositories( | ||
38 | 32 | 'repack_git_repositories', dbuser='branchscanner') | ||
39 | 33 | script.lock_and_run() | ||
40 | diff --git a/database/schema/security.cfg b/database/schema/security.cfg | |||
41 | index 2f4ffdf..2911da2 100644 | |||
42 | --- a/database/schema/security.cfg | |||
43 | +++ b/database/schema/security.cfg | |||
44 | @@ -699,6 +699,7 @@ public.bugsubscriptionfilterimportance = SELECT | |||
45 | 699 | public.bugsubscriptionfilterstatus = SELECT | 699 | public.bugsubscriptionfilterstatus = SELECT |
46 | 700 | public.bugsubscriptionfiltertag = SELECT | 700 | public.bugsubscriptionfiltertag = SELECT |
47 | 701 | public.bugtag = SELECT | 701 | public.bugtag = SELECT |
48 | 702 | public.codeimport = SELECT | ||
49 | 702 | public.codereviewmessage = SELECT | 703 | public.codereviewmessage = SELECT |
50 | 703 | public.codereviewvote = SELECT | 704 | public.codereviewvote = SELECT |
51 | 704 | public.diff = SELECT, INSERT, DELETE | 705 | public.diff = SELECT, INSERT, DELETE |
52 | diff --git a/lib/lp/code/model/gitrepository.py b/lib/lp/code/model/gitrepository.py | |||
53 | index 589dcbf..429b502 100644 | |||
54 | --- a/lib/lp/code/model/gitrepository.py | |||
55 | +++ b/lib/lp/code/model/gitrepository.py | |||
56 | @@ -489,6 +489,7 @@ class GitRepository(StormBase, WebhookTargetMixin, GitIdentityMixin): | |||
57 | 489 | 489 | ||
58 | 490 | def repackRepository(self): | 490 | def repackRepository(self): |
59 | 491 | getUtility(IGitHostingClient).repackRepository(self.getInternalPath()) | 491 | getUtility(IGitHostingClient).repackRepository(self.getInternalPath()) |
60 | 492 | self.date_last_repacked = UTC_NOW | ||
61 | 492 | 493 | ||
62 | 493 | @property | 494 | @property |
63 | 494 | def namespace(self): | 495 | def namespace(self): |
64 | diff --git a/lib/lp/code/scripts/repackgitrepository.py b/lib/lp/code/scripts/repackgitrepository.py | |||
65 | 495 | new file mode 100644 | 496 | new file mode 100644 |
66 | index 0000000..0c25ff6 | |||
67 | --- /dev/null | |||
68 | +++ b/lib/lp/code/scripts/repackgitrepository.py | |||
69 | @@ -0,0 +1,97 @@ | |||
70 | 1 | # Copyright 2021 Canonical Ltd. This software is licensed under the | ||
71 | 2 | # GNU Affero General Public License version 3 (see the file LICENSE). | ||
72 | 3 | |||
73 | 4 | """Functions used with the repack Git repositories script.""" | ||
74 | 5 | |||
75 | 6 | __metaclass__ = type | ||
76 | 7 | |||
77 | 8 | from psycopg2.extensions import TransactionRollbackError | ||
78 | 9 | import six | ||
79 | 10 | from storm.expr import ( | ||
80 | 11 | And, | ||
81 | 12 | Or, | ||
82 | 13 | ) | ||
83 | 14 | import transaction | ||
84 | 15 | |||
85 | 16 | from lp.code.enums import GitRepositoryStatus | ||
86 | 17 | from lp.code.errors import CannotRepackRepository | ||
87 | 18 | from lp.code.model.gitrepository import GitRepository | ||
88 | 19 | from lp.services.config import config | ||
89 | 20 | from lp.services.database.interfaces import IStore | ||
90 | 21 | from lp.services.looptuner import ( | ||
91 | 22 | LoopTuner, | ||
92 | 23 | TunableLoop, | ||
93 | 24 | ) | ||
94 | 25 | |||
95 | 26 | |||
96 | 27 | class RepackTunableLoop(TunableLoop): | ||
97 | 28 | tuner_class = LoopTuner | ||
98 | 29 | maximum_chunk_size = 5 | ||
99 | 30 | |||
100 | 31 | def __init__(self, log, dry_run, abort_time=None): | ||
101 | 32 | super(RepackTunableLoop, self).__init__(log, abort_time) | ||
102 | 33 | self.dry_run = dry_run | ||
103 | 34 | self.start_at = 0 | ||
104 | 35 | self.logger = log | ||
105 | 36 | self.store = IStore(GitRepository) | ||
106 | 37 | |||
107 | 38 | def findRepackCandidates(self): | ||
108 | 39 | repos = self.store.find( | ||
109 | 40 | GitRepository, | ||
110 | 41 | (Or( | ||
111 | 42 | GitRepository.loose_object_count >= | ||
112 | 43 | config.codehosting.loose_objects_threshold, | ||
113 | 44 | GitRepository.status == GitRepositoryStatus.AVAILABLE, | ||
114 | 45 | GitRepository.pack_count >= | ||
115 | 46 | config.codehosting.packs_threshold | ||
116 | 47 | ), | ||
117 | 48 | And(GitRepository.id > self.start_at))).order_by(GitRepository.id) | ||
118 | 49 | return repos | ||
119 | 50 | |||
120 | 51 | def isDone(self): | ||
121 | 52 | return self.findRepackCandidates().is_empty() | ||
122 | 53 | |||
123 | 54 | def __call__(self, chunk_size): | ||
124 | 55 | repackable_repos = list(self.findRepackCandidates()[:chunk_size]) | ||
125 | 56 | counter = 0 | ||
126 | 57 | for repo in repackable_repos: | ||
127 | 58 | try: | ||
128 | 59 | if self.dry_run: | ||
129 | 60 | print ('Would repack %s' % repo.identity) | ||
130 | 61 | else: | ||
131 | 62 | self.logger.info( | ||
132 | 63 | 'Requesting automatic git repository repack for %s.' | ||
133 | 64 | % repo.identity) | ||
134 | 65 | repo.repackRepository() | ||
135 | 66 | counter += 1 | ||
136 | 67 | except CannotRepackRepository as e: | ||
137 | 68 | self.logger.error( | ||
138 | 69 | 'An error occurred while requesting repository repack %s' | ||
139 | 70 | % e.message) | ||
140 | 71 | continue | ||
141 | 72 | except TransactionRollbackError as error: | ||
142 | 73 | self.logger.error( | ||
143 | 74 | 'An error occurred while requesting repository repack %s' | ||
144 | 75 | % six.text_type(error)) | ||
145 | 76 | if transaction is not None: | ||
146 | 77 | transaction.abort() | ||
147 | 78 | continue | ||
148 | 79 | |||
149 | 80 | if self.dry_run: | ||
150 | 81 | print( | ||
151 | 82 | 'Reporting %d automatic git repository repacks ' | ||
152 | 83 | 'would have been requested as part of this run ' | ||
153 | 84 | 'out of the %d qualifying for repack.' | ||
154 | 85 | % (counter, len(repackable_repos))) | ||
155 | 86 | else: | ||
156 | 87 | self.logger.info( | ||
157 | 88 | 'Requested %d automatic git repository repacks ' | ||
158 | 89 | 'out of the %d qualifying for repack.' | ||
159 | 90 | % (counter, len(repackable_repos))) | ||
160 | 91 | |||
161 | 92 | self.start_at = repackable_repos[-1].id | ||
162 | 93 | |||
163 | 94 | if not self.dry_run: | ||
164 | 95 | transaction.commit() | ||
165 | 96 | else: | ||
166 | 97 | transaction.abort() | ||
167 | diff --git a/lib/lp/code/scripts/tests/test_repack_git_repositories.py b/lib/lp/code/scripts/tests/test_repack_git_repositories.py | |||
168 | 0 | new file mode 100644 | 98 | new file mode 100644 |
169 | index 0000000..59585cd | |||
170 | --- /dev/null | |||
171 | +++ b/lib/lp/code/scripts/tests/test_repack_git_repositories.py | |||
172 | @@ -0,0 +1,154 @@ | |||
173 | 1 | # Copyright 2021 Canonical Ltd. This software is licensed under the | ||
174 | 2 | # GNU Affero General Public License version 3 (see the file LICENSE). | ||
175 | 3 | |||
176 | 4 | """Test the repack_git_repositories script.""" | ||
177 | 5 | import threading | ||
178 | 6 | from wsgiref.simple_server import ( | ||
179 | 7 | make_server, | ||
180 | 8 | WSGIRequestHandler, | ||
181 | 9 | ) | ||
182 | 10 | |||
183 | 11 | import transaction | ||
184 | 12 | from zope.security.proxy import removeSecurityProxy | ||
185 | 13 | |||
186 | 14 | from lp.services.config import config | ||
187 | 15 | from lp.services.config.fixture import ( | ||
188 | 16 | ConfigFixture, | ||
189 | 17 | ConfigUseFixture, | ||
190 | 18 | ) | ||
191 | 19 | from lp.testing import TestCaseWithFactory | ||
192 | 20 | from lp.testing.layers import ZopelessAppServerLayer | ||
193 | 21 | from lp.testing.script import run_script | ||
194 | 22 | |||
195 | 23 | |||
196 | 24 | class SilentWSGIRequestHandler(WSGIRequestHandler): | ||
197 | 25 | """A request handler that doesn't log requests.""" | ||
198 | 26 | |||
199 | 27 | def log_message(self, fmt, *args): | ||
200 | 28 | pass | ||
201 | 29 | |||
202 | 30 | |||
203 | 31 | class FakeTurnipApplication: | ||
204 | 32 | """A WSGI application that provides a fake turnip endpoint.""" | ||
205 | 33 | |||
206 | 34 | def __init__(self): | ||
207 | 35 | self.contents = [] | ||
208 | 36 | |||
209 | 37 | def __call__(self, environ, start_response): | ||
210 | 38 | self.contents.append(environ['PATH_INFO']) | ||
211 | 39 | start_response('200 OK', [('Content-Type', 'text/plain')]) | ||
212 | 40 | return [b''] | ||
213 | 41 | |||
214 | 42 | |||
215 | 43 | class FakeTurnipServer(threading.Thread): | ||
216 | 44 | """Thread that runs a fake turnip server.""" | ||
217 | 45 | |||
218 | 46 | def __init__(self): | ||
219 | 47 | super(FakeTurnipServer, self).__init__() | ||
220 | 48 | self.name = 'FakeTurnipServer' | ||
221 | 49 | self.app = FakeTurnipApplication() | ||
222 | 50 | self.server = make_server( | ||
223 | 51 | 'localhost', 0, self.app, handler_class=SilentWSGIRequestHandler) | ||
224 | 52 | |||
225 | 53 | def run(self): | ||
226 | 54 | self.server.serve_forever() | ||
227 | 55 | |||
228 | 56 | def getURL(self): | ||
229 | 57 | host, port = self.server.server_address | ||
230 | 58 | return 'http://%s:%d/' % (host, port) | ||
231 | 59 | |||
232 | 60 | def stop(self): | ||
233 | 61 | self.server.shutdown() | ||
234 | 62 | |||
235 | 63 | |||
236 | 64 | class TestRequestGitRepack(TestCaseWithFactory): | ||
237 | 65 | |||
238 | 66 | layer = ZopelessAppServerLayer | ||
239 | 67 | |||
240 | 68 | def setUp(self): | ||
241 | 69 | super(TestRequestGitRepack, self).setUp() | ||
242 | 70 | |||
243 | 71 | def runScript_no_Turnip(self): | ||
244 | 72 | transaction.commit() | ||
245 | 73 | |||
246 | 74 | (ret, out, err) = run_script('cronscripts/repack_git_repositories.py') | ||
247 | 75 | self.assertIn( | ||
248 | 76 | 'An error occurred while requesting repository repack', | ||
249 | 77 | err) | ||
250 | 78 | self.assertIn( | ||
251 | 79 | 'Failed to repack Git repository 1', err) | ||
252 | 80 | self.assertIn( | ||
253 | 81 | 'Requested 0 automatic git repository ' | ||
254 | 82 | 'repacks out of the 1 qualifying for repack.', err) | ||
255 | 83 | transaction.commit() | ||
256 | 84 | |||
257 | 85 | def runScript_with_Turnip(self): | ||
258 | 86 | transaction.commit() | ||
259 | 87 | (ret, out, err) = run_script('cronscripts/repack_git_repositories.py') | ||
260 | 88 | self.assertIn( | ||
261 | 89 | 'Requested 1 automatic git repository repacks ' | ||
262 | 90 | 'out of the 1 qualifying for repack.', err) | ||
263 | 91 | transaction.commit() | ||
264 | 92 | |||
265 | 93 | def makeTurnipServer(self): | ||
266 | 94 | self.turnip_server = FakeTurnipServer() | ||
267 | 95 | config_name = self.factory.getUniqueString() | ||
268 | 96 | config_fixture = self.useFixture(ConfigFixture( | ||
269 | 97 | config_name, config.instance_name)) | ||
270 | 98 | setting_lines = [ | ||
271 | 99 | '[codehosting]', | ||
272 | 100 | 'internal_git_api_endpoint: %s' % self.turnip_server.getURL(), | ||
273 | 101 | ] | ||
274 | 102 | config_fixture.add_section('\n' + '\n'.join(setting_lines)) | ||
275 | 103 | self.useFixture(ConfigUseFixture(config_name)) | ||
276 | 104 | self.turnip_server.start() | ||
277 | 105 | self.addCleanup(self.turnip_server.stop) | ||
278 | 106 | return self.turnip_server | ||
279 | 107 | |||
280 | 108 | def test_auto_repack_without_Turnip(self): | ||
281 | 109 | repo = self.factory.makeGitRepository() | ||
282 | 110 | repo = removeSecurityProxy(repo) | ||
283 | 111 | repo.loose_object_count = 7000 | ||
284 | 112 | repo.pack_count = 43 | ||
285 | 113 | |||
286 | 114 | # Do not start the fake turnip server here | ||
287 | 115 | # to test if the RequestGitRepack will catch and | ||
288 | 116 | # log correctly the failure to establish | ||
289 | 117 | # a connection to Turnip | ||
290 | 118 | self.runScript_no_Turnip() | ||
291 | 119 | self.assertIsNone(repo.date_last_repacked) | ||
292 | 120 | |||
293 | 121 | def test_auto_repack_with_Turnip_one_repo(self): | ||
294 | 122 | # Test repack works when only one repository | ||
295 | 123 | # qualifies for a repack | ||
296 | 124 | repo = self.factory.makeGitRepository() | ||
297 | 125 | repo = removeSecurityProxy(repo) | ||
298 | 126 | repo.loose_object_count = 7000 | ||
299 | 127 | repo.pack_count = 43 | ||
300 | 128 | transaction.commit() | ||
301 | 129 | |||
302 | 130 | self.makeTurnipServer() | ||
303 | 131 | |||
304 | 132 | self.runScript_with_Turnip() | ||
305 | 133 | |||
306 | 134 | self.assertIsNotNone(repo.date_last_repacked) | ||
307 | 135 | |||
308 | 136 | def test_auto_repack_with_Turnip_multiple_repos(self): | ||
309 | 137 | # Test repack works when 10 repositories | ||
310 | 138 | # qualifies for a repack | ||
311 | 139 | repo = [] | ||
312 | 140 | for i in range(10): | ||
313 | 141 | repo.append(self.factory.makeGitRepository()) | ||
314 | 142 | repo[i] = removeSecurityProxy(repo[i]) | ||
315 | 143 | repo[i].loose_object_count = 7000 | ||
316 | 144 | repo[i].pack_count = 43 | ||
317 | 145 | transaction.commit() | ||
318 | 146 | |||
319 | 147 | self.makeTurnipServer() | ||
320 | 148 | |||
321 | 149 | self.runScript_with_Turnip() | ||
322 | 150 | |||
323 | 151 | for i in range(10): | ||
324 | 152 | self.assertIsNotNone(repo[i].date_last_repacked) | ||
325 | 153 | self.assertEqual("/repo/%s/repack" % repo[i].getInternalPath(), | ||
326 | 154 | self.turnip_server.app.contents[i]) | ||
327 | diff --git a/lib/lp/services/config/schema-lazr.conf b/lib/lp/services/config/schema-lazr.conf | |||
328 | index fe28a8c..c676504 100644 | |||
329 | --- a/lib/lp/services/config/schema-lazr.conf | |||
330 | +++ b/lib/lp/services/config/schema-lazr.conf | |||
331 | @@ -1495,6 +1495,9 @@ dbuser: request-daily-builds | |||
332 | 1495 | # datatype: integer | 1495 | # datatype: integer |
333 | 1496 | timeout: 60 | 1496 | timeout: 60 |
334 | 1497 | 1497 | ||
335 | 1498 | [repack_git_repositories] | ||
336 | 1499 | # datatype: integer | ||
337 | 1500 | timeout: 15 | ||
338 | 1498 | 1501 | ||
339 | 1499 | [revisionkarma] | 1502 | [revisionkarma] |
340 | 1500 | # The database user which will be used by this process. | 1503 | # The database user which will be used by this process. |
The unit test (test_request_ git_repack) doesn't seem to trigger the script, this is really a copy/edit/debug and attempt to understand how we trigger daily builds, but not getting far on this.