Merge ubuntu-debuginfod:getter into ubuntu-debuginfod:master
Status: | Merged |
---|---|
Merged at revision: | 0116435f33b6e26b4ae229e7492bddf7a92fe7cc |
Proposed branch: | ubuntu-debuginfod:getter |
Merge into: | ubuntu-debuginfod:master |
Diff against target: | 598 lines (+558/-0), 5 files modified: README.md (+2/-0), ddebgetter.py (+383/-0), ddebpoller.py (+1/-0), debuggetter.py (+92/-0), debuginfod.py (+80/-0) |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Athos Ribeiro (community) | | | Approve |
Lena Voytek (community) | | | Approve |
Canonical Server Core Reviewers | | | Pending |
Bryce Harrington | | | Pending |
Canonical Server Reporter | | | Pending |

Review via email: mp+434017@code.launchpad.net
Commit message
Description of the change
This MP implements the "getter" classes, along with a Celery application that makes use of those classes and transforms them into tasks.
The idea here is to follow what I did for the poller: have a basic class that implements primitives, and extend it in order to handle a specific scenario. In our case, there are currently two objects that will need to be fetched: a list of ddebs, and the source code for the package (with patches applied). The former is a simple thing to do, but the latter is more involved.
Before we even start downloading the source code, we need to determine whether it is suitable to be consumed by debuginfod. We do that by inspecting the DWARF of each ddeb we've downloaded and looking for a specific pathname that should be embedded in the DW_AT_comp_dir attribute. If we find it, we can proceed with downloading the source code. And then, a bit more processing needs to be done...
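To make the check concrete, here is a minimal sketch of the prefix test on its own. It assumes the DW_AT_comp_dir values have already been extracted into a list of strings; the function name `has_expected_comp_dir` and the sample package/version values are hypothetical and are not part of this branch.

```python
def has_expected_comp_dir(comp_dirs, source_package, version):
    """Return True if any DW_AT_comp_dir value starts with the expected prefix.

    comp_dirs is assumed to be a list of already-decoded strings.
    """
    # The prefix that debuginfod-friendly builds are expected to embed.
    expected_path = f"/usr/src/{source_package}-{version}/"
    return any(comp_dir.startswith(expected_path) for comp_dir in comp_dirs)


# Hypothetical values, for illustration only.
print(has_expected_comp_dir(["/usr/src/hello-2.10-2ubuntu4/build"], "hello", "2.10-2ubuntu4"))
print(has_expected_comp_dir(["/build/hello-abc123/hello-2.10"], "hello", "2.10-2ubuntu4"))
```

A package whose debug files only carry build-directory paths (the second call) would be skipped, since debuginfod would have no matching path to serve the sources from.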
There are two basic ways of grabbing the source code: I can either download the source artifacts from LP and use dpkg-source to extract them (a la "dget"), or I can use git-ubuntu's "applied" tags. You will notice that both methods have been implemented, and will be tried in order.
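The "tried in order" behaviour can be sketched roughly as below. This is a simplified illustration and not the code in the diff: `fetch_source`, `RetryLater`, and the two stub strategies are hypothetical names, and each strategy is assumed to return True on success and False on failure.

```python
class RetryLater(Exception):
    """Hypothetical stand-in for an exception asking the task queue to retry."""


def fetch_source(strategies, source_package, version, dest):
    """Try each (name, callable) pair in order and return the first name that succeeds."""
    for name, strategy in strategies:
        if strategy(source_package, version, dest):
            return name
    # Every strategy failed; signal that the whole task should be retried later.
    raise RetryLater(f"could not fetch source for {source_package}-{version}")


# Stub strategies standing in for the dsc-based and git-based downloads.
def from_dsc(source_package, version, dest):
    return False  # pretend the .dsc download failed


def from_git(source_package, version, dest):
    return True  # pretend the git clone worked


used = fetch_source([("dsc", from_dsc), ("git", from_git)], "hello", "1.0", "/tmp/hello-1.0.tar.xz")
print(used)
```

Keeping the strategies in an ordered list makes the preference explicit and leaves a single place to raise the retry signal once every option has failed.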
The getters are idempotent, which is almost a requirement due to the retry mechanism that Celery offers. Please let me know if you need more details about the Celery parts and I'll be happy to provide them.
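As an illustration of why idempotency matters when a task may be retried, here is a minimal sketch of a "write once" helper. The name `save_once` is hypothetical. The step that writes to a temporary name and then renames is an addition of this sketch to avoid leaving a partial file behind; it is not something this branch is claimed to do.

```python
import os


def save_once(filepath, produce_bytes):
    """Write produce_bytes() to filepath unless the file already exists.

    Returns True if the file was written, False if the call was a no-op.
    """
    if os.path.exists(filepath):
        # A previous (possibly retried) run already produced this file.
        return False
    # Write under a temporary name first, then rename, so that an
    # interrupted write never leaves a partial file at filepath.
    tmp_path = filepath + ".part"
    with open(tmp_path, "wb") as f:
        f.write(produce_bytes())
    os.replace(tmp_path, filepath)
    return True
```

Calling the helper twice with the same path performs the work only once, so a retried task can safely run the whole sequence again from the top.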
You can run this application by invoking:
celery -A debuginfod worker -c 4 --loglevel=INFO
I will update the README file soon to cover that.
Lena Voytek (lvoytek) wrote:
Added some additional comments alongside Athos'
Thanks!
Sergio Durigan Junior (sergiodj) wrote:
Thanks, Athos and Lena.
I think I've addressed everything, except for Athos' comment about using lazy strings instead of f-strings. Please let me know what you think.
Lena Voytek (lvoytek) wrote:
All changes brought up look good now, thanks!
Bryce Harrington (bryce) wrote:
I also wondered about suggesting %s-style strings for the logger in an earlier review. Logger also supports f-string style for lazy substitution; there's a config variable you set when you're creating the logger object.
However, I didn't mention it because I don't have much experience with logger myself (I should start using it more), and couldn't express the advantage of doing it that way vs. passing static strings. I think it's fine the way Sergio's coded it, but am curious if there's a strong rationale for using logger's string formatting instead?
Athos Ribeiro (athos-ribeiro) wrote:
Thanks Sergio.
@Bryce, for the logger lazy evaluation, the advantage is that python will only generate the strings (i.e., perform the substitutions) in case the string is actually going to be logged (instead of always generating the string).
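The difference can be observed with the standard `logging` module alone. The sketch below uses a hypothetical `Expensive` class that counts how often it is converted to a string, with a logger whose level disables DEBUG messages; the logger name `lazy_demo` is made up for the example.

```python
import logging


class Expensive:
    """Counts how many times it is converted to a string."""

    def __init__(self):
        self.calls = 0

    def __str__(self):
        self.calls += 1
        return "expensive value"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)  # DEBUG messages are disabled
logger.addHandler(logging.NullHandler())
logger.propagate = False

eager = Expensive()
lazy = Expensive()

# The f-string is built before debug() is even called.
logger.debug(f"Processing request: {eager}")
# The substitution is deferred, and skipped because DEBUG is disabled.
logger.debug("Processing request: %s", lazy)

print(eager.calls, lazy.calls)
```

With DEBUG disabled, the f-string version still pays for the string conversion, while the %s version does not. For cheap values such as short strings the saving is small, which is consistent with the view that either style is acceptable here.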
Sergio Durigan Junior (sergiodj) wrote:
Thanks for the review, Athos, and thanks for the further clarification, Bryce.
I merged this branch after replacing [] by .get, as suggested by Athos.
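For readers unfamiliar with the distinction, the sketch below shows why `.get` is the safer lookup when a key may be absent. It presumably relates to the request dictionaries handled by `process_request`, but that is an assumption; the helper name `wants_ddebs` and the sample request are hypothetical.

```python
def wants_ddebs(request):
    """Return True only if the request exists and has a non-empty 'ddebs' entry."""
    # request.get("ddebs") returns None when the key is missing,
    # whereas request["ddebs"] would raise KeyError.
    return bool(request) and bool(request.get("ddebs"))


request = {"source_package": "hello", "component": "main"}  # no "ddebs" key

try:
    request["ddebs"]
    subscript_failed = False
except KeyError:
    subscript_failed = True

print(subscript_failed, wants_ddebs(request))
```

With `.get`, a malformed or partial message is simply skipped instead of raising inside the worker.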
Bryce Harrington (bryce) wrote:
Lena and Athos already are on top of the review, so leaving approval to them, but here's a few more review comments. Mostly style suggestions and nitpicks, nothing that I can see that should block landing.
Sergio Durigan Junior (sergiodj) wrote:
Thanks for the review, Bryce.
As you might have noticed, I merged the branch already, but your comments are valuable as ever and I intend to address them in a subsequent commit.
Preview Diff
1 | diff --git a/README.md b/README.md |
2 | index 995ba36..8fe7f00 100644 |
3 | --- a/README.md |
4 | +++ b/README.md |
5 | @@ -13,6 +13,8 @@ python3-debian |
6 | python3-git |
7 | python3-celery |
8 | python3-launchpadlib |
9 | +python3-requests |
10 | +python3-sdnotify |
11 | ``` |
12 | |
13 | * Applications |
14 | diff --git a/ddebgetter.py b/ddebgetter.py |
15 | new file mode 100644 |
16 | index 0000000..62eebc6 |
17 | --- /dev/null |
18 | +++ b/ddebgetter.py |
19 | @@ -0,0 +1,383 @@ |
20 | +#!/usr/bin/python3 |
21 | + |
22 | +# Copyright (C) 2022 Canonical Ltd. |
23 | + |
24 | +# This program is free software: you can redistribute it and/or modify |
25 | +# it under the terms of the GNU General Public License as published by |
26 | +# the Free Software Foundation, either version 3 of the License, or |
27 | +# (at your option) any later version. |
28 | + |
29 | +# This program is distributed in the hope that it will be useful, |
30 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
31 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
32 | +# GNU General Public License for more details. |
33 | + |
34 | +# You should have received a copy of the GNU General Public License |
35 | +# along with this program. If not, see <https://www.gnu.org/licenses/>. |
36 | + |
37 | +# Authors: Sergio Durigan Junior <sergio.durigan@canonical.com> |
38 | + |
39 | +import os |
40 | +from urllib import parse |
41 | +import lzma |
42 | +import tarfile |
43 | +import subprocess |
44 | + |
45 | +from debian import debfile |
46 | +import tempfile |
47 | +from elftools.common.utils import bytelist2string |
48 | +from elftools.common.exceptions import ELFError, ELFParseError, DWARFError |
49 | +from elftools.elf.elffile import ELFFile |
50 | + |
51 | +from git import Git, Repo |
52 | +from git.exc import GitCommandError |
53 | + |
54 | +from debuggetter import DebugGetter, DebugGetterRetry, DEFAULT_MIRROR_DIR |
55 | + |
56 | + |
57 | +class DdebGetter(DebugGetter): |
58 | + """Get (fetch) a ddeb.""" |
59 | + |
60 | + def __init__(self, mirror_dir=DEFAULT_MIRROR_DIR): |
61 | + """Initialize the object. |
62 | + |
63 | + See DebugGetter's __init__ for an explanation of the arguments.""" |
64 | + super().__init__(subdir="ddebs", mirror_dir=mirror_dir) |
65 | + |
66 | + def process_request(self, request): |
67 | + """Process a request, usually coming from Celery. |
68 | + |
69 | + :param request: The dictionary containing the information |
70 | + necessary to fetch this ddeb. |
71 | + :type request: dict(str : str) |
72 | + """ |
73 | + if not request or not request.get("ddebs"): |
74 | + return |
75 | + |
76 | + self._logger.debug(f"Processing request to download ddebs: {request}") |
77 | + self._download_ddebs( |
78 | + request["source_package"], |
79 | + request["component"], |
80 | + request["ddebs"], |
81 | + ) |
82 | + |
83 | + def _download_ddebs(self, source_package, component, urls): |
84 | + """Download the ddebs associated with a package/version. |
85 | + |
86 | + :param str source_package: Source package name. |
87 | + |
88 | + :param str version: Source package version. |
89 | + |
90 | + :param str component: Source package component. |
91 | + |
92 | + :param list urls: List of ddeb URLs.""" |
93 | + savepath = self._make_savepath(source_package, component) |
94 | + os.makedirs(savepath, mode=0o755, exist_ok=True) |
95 | + for url in urls: |
96 | + self._logger.debug(f"Downloading '{url}' into '{savepath}'") |
97 | + self._download_from_lp(url, savepath) |
98 | + |
99 | + |
100 | +# The function below was taken from git-ubuntu. We need to make |
101 | +# sure we perform the same version -> tag transformation as it |
102 | +# does. |
103 | +def _git_dep14_tag(version): |
104 | + """Munge a version string according to http://dep.debian.net/deps/dep14/ |
105 | + |
106 | + :param str version: The version to be adjusted.""" |
107 | + version = version.replace("~", "_") |
108 | + version = version.replace(":", "%") |
109 | + version = version.replace("..", ".#.") |
110 | + if version.endswith("."): |
111 | + version = version + "#" |
112 | + if version.endswith(".lock"): |
113 | + pre, _, _ = version.partition(".lock") |
114 | + version = pre + ".#lock" |
115 | + return version |
116 | + |
117 | + |
118 | +class DdebSourceCodeGetter(DebugGetter): |
119 | + """Get (fetch) the source code associated with a ddeb.""" |
120 | + |
121 | + def __init__(self, mirror_dir="/srv/debug-mirror"): |
122 | + super().__init__(mirror_dir=mirror_dir, subdir="ddebs") |
123 | + |
124 | + def process_request(self, request): |
125 | + """Process a request, usually coming from Celery. |
126 | + |
127 | + :param request: The dictionary containing the information |
128 | + necessary to fetch this source code. |
129 | + :type request: dict(str : str) |
130 | + """ |
131 | + if not request or not request.get("ddebs"): |
132 | + return |
133 | + |
134 | + self._logger.debug( |
135 | + f"Processing request to download source code (for ddebs): {request}" |
136 | + ) |
137 | + self._process_source( |
138 | + request["source_package"], |
139 | + request["version"], |
140 | + request["component"], |
141 | + request["ddebs"], |
142 | + request["sources"], |
143 | + ) |
144 | + |
145 | + def _get_comp_dirs(self, debug_file, filename): |
146 | + """Get the list of available DW_AT_comp_dir declarations from the |
147 | + DWARF file. |
148 | + |
149 | + :param TextIOWrapper debug_file: The .debug file. |
150 | + """ |
151 | + elf_file = ELFFile(debug_file) |
152 | + |
153 | + if not elf_file.has_dwarf_info(): |
154 | + self._logger.debug(f"'{filename}' doesn't have DWARF") |
155 | + return [] |
156 | + |
157 | + result = [] |
158 | + dwarf_file = elf_file.get_dwarf_info() |
159 | + for cu in dwarf_file.iter_CUs(): |
160 | + attr = cu.get_top_DIE().attributes |
161 | + if "DW_AT_comp_dir" in attr.keys(): |
162 | + value = attr["DW_AT_comp_dir"].value |
163 | + if isinstance(value, bytes): |
164 | + result.append(bytelist2string(value).decode("UTF-8")) |
165 | + self._logger.debug( |
166 | + f"Found the following DW_AT_comp_dir for '{filename}':\n{result}" |
167 | + ) |
168 | + return result |
169 | + |
170 | + def _should_fetch_source_for_debugfile(self, source_package, version, filepath): |
171 | + """Whether we should fetch the source code for the specified source |
172 | + package/deb. |
173 | + |
174 | + We return True if the debug file associated with the source |
175 | + package has any DW_AT_comp_dir entity whose path points to |
176 | + "/usr/src/source_package-version/". |
177 | + |
178 | + :param str source_package: Source package name. |
179 | + |
180 | + :param str version: Source package version. |
181 | + |
182 | + :param str filepath: Full path to the .ddeb file.""" |
183 | + deb_file = debfile.DebFile(filepath) |
184 | + # The path we expect to find in any of the DW_AT_comp_dirs. |
185 | + expected_path = f"/usr/src/{source_package}-{version}/" |
186 | + for deb_internal_file in list(deb_file.data): |
187 | + if not deb_internal_file.endswith(".debug"): |
188 | + continue |
189 | + with tempfile.TemporaryFile() as debug_file: |
190 | + debug_file.write(deb_file.data.get_content(deb_internal_file)) |
191 | + try: |
192 | + comp_dirs = self._get_comp_dirs(debug_file, deb_internal_file) |
193 | + except (ELFError, ELFParseError, DWARFError) as e: |
194 | + self._logger.warning( |
195 | + f"Exception while looking for DW_AT_comp_dirs: {e}" |
196 | + ) |
197 | + # There's no point in retrying anything here, so |
198 | + # we just generate a warning and keep going. |
199 | + continue |
200 | + |
201 | + for comp_dir in comp_dirs: |
202 | + if comp_dir.startswith(expected_path): |
203 | + self._logger.debug(f"Found a good DW_AT_comp_dir: {comp_dir}") |
204 | + return True |
205 | + return False |
206 | + |
207 | + def _should_fetch_source(self, source_package, version, component, filenames): |
208 | + """Return True if we should fetch the source code for a specific |
209 | + package, False otherwise. |
210 | + |
211 | + Whether or not we should fetch the source code is determined |
212 | + by the presence of a specific path prefix (/usr/src/...) in |
213 | + the DW_AT_comp_dir declarations of a DWARF file. |
214 | + |
215 | + :param str source_package: Source package name. |
216 | + |
217 | + :param str version: Source package version. |
218 | + |
219 | + :param str component: Source package component (main, |
220 | + universe, etc.) |
221 | + |
222 | + :param list filenames: The list of ddeb filenames (already downloaded) |
223 | + to inspect for the package. |
224 | + """ |
225 | + savepath = self._make_savepath(source_package, component) |
226 | + for fname in filenames: |
227 | + filepath = os.path.join(savepath, fname) |
228 | + if not os.path.isfile(filepath): |
229 | + continue |
230 | + if self._should_fetch_source_for_debugfile( |
231 | + source_package, version, filepath |
232 | + ): |
233 | + return True |
234 | + return False |
235 | + |
236 | + def _download_source_code_from_git(self, source_package, version, filepath): |
237 | + """Download the source code using Launchpad's git repository. |
238 | + |
239 | + :param str source_package: Source package name. |
240 | + |
241 | + :param str version: Source package version. |
242 | + |
243 | + :param str filepath: The full pathname where the resulting |
244 | + source code tarball should be saved. |
245 | + """ |
246 | + with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as source_dir: |
247 | + g = Git() |
248 | + git_dir = os.path.join(source_dir, f"{source_package}") |
249 | + tag = "applied/" + _git_dep14_tag(f"{version}") |
250 | + self._logger.debug( |
251 | + f"Cloning '{source_package}' git repo into '{git_dir}' (tag: '{tag}')" |
252 | + ) |
253 | + try: |
254 | + g.clone( |
255 | + f"https://git.launchpad.net/ubuntu/+source/{source_package}", |
256 | + git_dir, |
257 | + depth=1, |
258 | + branch=tag, |
259 | + ) |
260 | + except GitCommandError as e: |
261 | + # Couldn't perform the download. Let's signal and |
262 | + # bail out. |
263 | + self._logger.warning( |
264 | + f"Could not clone git repo for '{source_package}': {e}" |
265 | + ) |
266 | + return False |
267 | + repo = Repo(git_dir) |
268 | + prefix_path = os.path.join("/usr/src/", f"{source_package}-{version}/") |
269 | + with lzma.open(filepath, "w") as xzfile: |
270 | + self._logger.debug( |
271 | + f"Archiving git repo for '{source_package}-{version}' as '{filepath}'" |
272 | + ) |
273 | + repo.archive(xzfile, prefix=prefix_path, format="tar") |
274 | + |
275 | + return True |
276 | + |
277 | + def _adjust_tar_filepath(self, tarinfo): |
278 | + """Adjust the filepath for a TarInfo file. |
279 | + |
280 | + This function is needed because TarFile.add strips the leading |
281 | + slash from the filenames, so we have to workaround it by |
282 | + re-adding the slash ourselves. |
283 | + |
284 | + This function is intended to be used as a callback provided to |
285 | + TarFile.add. |
286 | + |
287 | + :param TarInfo tarinfo: The tarinfo. |
288 | + """ |
289 | + tarinfo.name = f"/{tarinfo.name}" |
290 | + return tarinfo |
291 | + |
292 | + def _download_source_code_from_dsc( |
293 | + self, source_package, version, filepath, source_urls |
294 | + ): |
295 | + |
296 | + """Download the source code using the .dsc file. |
297 | + |
298 | + :param str source_package: Source package name. |
299 | + |
300 | + :param str version: Source package version. |
301 | + |
302 | + :param str filepath: The full pathname where the resulting |
303 | + source code tarball should be saved. |
304 | + |
305 | + :param list source_urls: List of URLs used to fetch the source |
306 | + package. This is usually the list returned by the |
307 | + sourceFileUrls() Launchpad API call. |
308 | + """ |
309 | + with tempfile.TemporaryDirectory(ignore_cleanup_errors=True) as source_dir: |
310 | + for url in source_urls: |
311 | + self._download_from_lp(url, source_dir) |
312 | + |
313 | + dscfile = os.path.join(source_dir, f"{source_package}_{version}.dsc") |
314 | + if not os.path.isfile(dscfile): |
315 | + self._logger.warning(f"'{dscfile}' doesn't exist, but it should.") |
316 | + return False |
317 | + |
318 | + outdir = os.path.join(source_dir, "outdir") |
319 | + try: |
320 | + subprocess.run( |
321 | + ["/usr/bin/dpkg-source", "-x", dscfile, outdir], |
322 | + cwd=source_dir, |
323 | + check=True, |
324 | + ) |
325 | + except subprocess.CalledProcessError: |
326 | + self._logger.warning("Call to 'dpkg-source -x' failed.") |
327 | + return False |
328 | + |
329 | + if not os.path.isdir(outdir): |
330 | + self._logger.warning( |
331 | + f"'{outdir}' has not been created by 'dpkg-source -x'." |
332 | + ) |
333 | + return False |
334 | + |
335 | + prefix_path = os.path.join("/usr/src/", f"{source_package}-{version}/") |
336 | + |
337 | + with tarfile.open(filepath, "w:xz") as tfile: |
338 | + tfile.add(outdir, arcname=prefix_path, filter=self._adjust_tar_filepath) |
339 | + |
340 | + return True |
341 | + |
342 | + def _download_source_code(self, source_package, version, component, source_urls): |
343 | + """Download the source code for a package. |
344 | + |
345 | + :param str source_package: Source package name. |
346 | + |
347 | + :param str version: Source package version. |
348 | + |
349 | + :param str component: Source package component. |
350 | + |
351 | + :param list source_urls: List of source file URLS.""" |
352 | + savepath = self._make_savepath(source_package, component) |
353 | + txzfilepath = os.path.join(savepath, f"{source_package}-{version}.tar.xz") |
354 | + if os.path.exists(txzfilepath): |
355 | + self._logger.debug(f"'{txzfilepath}' already exists; doing nothing.") |
356 | + return |
357 | + |
358 | + if self._download_source_code_from_dsc( |
359 | + source_package, version, txzfilepath, source_urls |
360 | + ): |
361 | + self._logger.info( |
362 | + f"Downloaded source code from dsc for '{source_package}-{version}' as '{txzfilepath}'" |
363 | + ) |
364 | + return |
365 | + |
366 | + if self._download_source_code_from_git(source_package, version, txzfilepath): |
367 | + self._logger.info( |
368 | + f"Downloaded source code from git for '{source_package}-{version}' as '{txzfilepath}'" |
369 | + ) |
370 | + return |
371 | + |
372 | + # In the (likely?) event that there is a problem with |
373 | + # Launchpad, let's raise an exception signalling that we'd |
374 | + # like to retry the task. |
375 | + raise DebugGetterRetry() |
376 | + |
377 | + def _process_source( |
378 | + self, source_package, version, component, ddeb_urls, source_urls |
379 | + ): |
380 | + """Process the request to fetch the source code. |
381 | + |
382 | + We only download the source code if we know it will be |
383 | + properly indexed by debuginfod. |
384 | + |
385 | + :param str source_package: Source package name. |
386 | + |
387 | + :param str version: Source package version. |
388 | + |
389 | + :param str component: Source package component. |
390 | + |
391 | + :param list urls: List of ddeb URLs.""" |
392 | + # Convert URLs into filenames. |
393 | + filenames = [ |
394 | + os.path.basename(parse.urlparse(fname).path) for fname in ddeb_urls |
395 | + ] |
396 | + if not self._should_fetch_source(source_package, version, component, filenames): |
397 | + self._logger.info( |
398 | + f"Should not fetch source code for '{source_package}-{version}'" |
399 | + ) |
400 | + return |
401 | + |
402 | + self._download_source_code(source_package, version, component, source_urls) |
403 | diff --git a/ddebpoller.py b/ddebpoller.py |
404 | index a19d92c..24ba613 100644 |
405 | --- a/ddebpoller.py |
406 | +++ b/ddebpoller.py |
407 | @@ -65,6 +65,7 @@ class DdebPoller(DebugPoller): |
408 | "component": pkg.component_name, |
409 | "distro_series": distro_series, |
410 | "ddebs": ddeb_urls, |
411 | + "sources": pkg.sourceFileUrls(), |
412 | } |
413 | self._logger.debug( |
414 | f"For source package '{pkgname}-{pkgver}', found {ddebs_len}:\n{ddeb_urls}" |
415 | diff --git a/debuggetter.py b/debuggetter.py |
416 | new file mode 100644 |
417 | index 0000000..9bb6102 |
418 | --- /dev/null |
419 | +++ b/debuggetter.py |
420 | @@ -0,0 +1,92 @@ |
421 | +#!/usr/bin/python3 |
422 | + |
423 | +# Copyright (C) 2022 Canonical Ltd. |
424 | + |
425 | +# This program is free software: you can redistribute it and/or modify |
426 | +# it under the terms of the GNU General Public License as published by |
427 | +# the Free Software Foundation, either version 3 of the License, or |
428 | +# (at your option) any later version. |
429 | + |
430 | +# This program is distributed in the hope that it will be useful, |
431 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
432 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
433 | +# GNU General Public License for more details. |
434 | + |
435 | +# You should have received a copy of the GNU General Public License |
436 | +# along with this program. If not, see <https://www.gnu.org/licenses/>. |
437 | + |
438 | +# Authors: Sergio Durigan Junior <sergio.durigan@canonical.com> |
439 | + |
440 | +import os |
441 | +import requests |
442 | +from urllib import parse |
443 | +import logging |
444 | + |
445 | + |
446 | +class DebugGetterTimeout(Exception): |
447 | + """Default exception to raise when there's a timeout issue.""" |
448 | + |
449 | + |
450 | +class DebugGetterRetry(Exception): |
451 | + """Default exception to raise when we'd like to signal that the task |
452 | + should be retried.""" |
453 | + |
454 | + |
455 | +DEFAULT_MIRROR_DIR = "/srv/debug-mirror" |
456 | + |
457 | +class DebugGetter: |
458 | + """Base class for a Debug Getter.""" |
459 | + |
460 | + def __init__(self, subdir, mirror_dir=DEFAULT_MIRROR_DIR): |
461 | + """Initialize the object. |
462 | + |
463 | + :param str mirror_dir: The directory we use to save the |
464 | + mirrored files from Launchpad. |
465 | + |
466 | + :param str subdir: The subdirectory (inside mirror_dir) where |
467 | + the module will save its files. For example, a ddeb |
468 | + getter module should specify "ddebs" here. |
469 | + """ |
470 | + self._mirror_dir = mirror_dir |
471 | + self._subdir = subdir |
472 | + self._logger = logging.getLogger(__name__) |
473 | + |
474 | + def _make_savepath(self, source_package, component): |
475 | + """Return the full save path for a package. |
476 | + |
477 | + :param str source_package: The package name. |
478 | + :param str component: The component name (main, universe, etc.). |
479 | + """ |
480 | + return os.path.join( |
481 | + self._mirror_dir, |
482 | + self._subdir, |
483 | + component, |
484 | + source_package[:1], |
485 | + source_package, |
486 | + ) |
487 | + |
488 | + def _download_from_lp(self, url, savepath): |
489 | + filepath = os.path.join(savepath, os.path.basename(parse.urlparse(url).path)) |
490 | + if os.path.exists(filepath): |
491 | + self._logger.debug(f"'{filepath}' exists, doing nothing") |
492 | + return |
493 | + |
494 | + try: |
495 | + with requests.Session() as s: |
496 | + with s.get(url, allow_redirects=True, timeout=10, stream=True) as r: |
497 | + r.raise_for_status() |
498 | + with open(filepath, "wb") as f: |
499 | + # 10 MB for chunk_size should be enough... |
500 | + for chunk in r.iter_content(chunk_size=10 * 1024 * 1024): |
501 | + f.write(chunk) |
502 | + except ( |
503 | + requests.ConnectionError, |
504 | + requests.HTTPError, |
505 | + requests.ConnectTimeout, |
506 | + requests.ReadTimeout, |
507 | + requests.Timeout, |
508 | + ) as e: |
509 | + self._logger.warning(f"Timeout while downloading '{url}': '{e}'") |
510 | + raise DebugGetterTimeout() |
511 | + |
512 | + self._logger.debug(f"Saved '{filepath}'") |
513 | diff --git a/debuginfod.py b/debuginfod.py |
514 | new file mode 100644 |
515 | index 0000000..d64efd4 |
516 | --- /dev/null |
517 | +++ b/debuginfod.py |
518 | @@ -0,0 +1,80 @@ |
519 | +#!/usr/bin/python3 |
520 | + |
521 | +# Copyright (C) 2022 Canonical Ltd. |
522 | + |
523 | +# This program is free software: you can redistribute it and/or modify |
524 | +# it under the terms of the GNU General Public License as published by |
525 | +# the Free Software Foundation, either version 3 of the License, or |
526 | +# (at your option) any later version. |
527 | + |
528 | +# This program is distributed in the hope that it will be useful, |
529 | +# but WITHOUT ANY WARRANTY; without even the implied warranty of |
530 | +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
531 | +# GNU General Public License for more details. |
532 | + |
533 | +# You should have received a copy of the GNU General Public License |
534 | +# along with this program. If not, see <https://www.gnu.org/licenses/>. |
535 | + |
536 | +# Authors: Sergio Durigan Junior <sergio.durigan@canonical.com> |
537 | + |
538 | +from celery import Celery |
539 | + |
540 | +from debuggetter import DebugGetterTimeout, DebugGetterRetry |
541 | + |
542 | +from ddebgetter import DdebGetter, DdebSourceCodeGetter |
543 | + |
544 | +app = Celery("debuginfod", broker="pyamqp://guest@localhost//") |
545 | + |
546 | +app.conf.update( |
547 | + task_serializer="json", |
548 | + accept_content=["json"], |
549 | + result_serializer="json", |
550 | + timezone="America/Toronto", |
551 | + enable_utc=True, |
552 | + task_acks_late=True, |
553 | + # Only one task per worker. |
554 | + worker_prefetch_multiplier=1, |
555 | + worker_log_format="[%(asctime)s: %(levelname)s/%(processName)s] %(funcName)s: %(message)s", |
556 | +) |
557 | + |
558 | + |
559 | +@app.task( |
560 | + name="grab_ddebs", |
561 | + autoretry_for=(DebugGetterTimeout, DebugGetterRetry), |
562 | + # 1 day |
563 | + default_retry_delay=24 * 60 * 60, |
564 | + # Launchpad can be problematic, so we retry every day for a |
565 | + # week. |
566 | + retry_kwargs={"max_retries": 7}, |
567 | +) |
568 | +def grab_ddebs(msg): |
569 | + """Dispatch the DdebGetter task. |
570 | + |
571 | + :param dict(str -> str) msg: The dictionary containing the message |
572 | + that will be processed by the getter. |
573 | + """ |
574 | + g = DdebGetter() |
575 | + g.process_request(msg) |
576 | + |
577 | + |
578 | +@app.task( |
579 | + name="grab_ddebs_sources", |
580 | + autoretry_for=(DebugGetterTimeout, DebugGetterRetry), |
581 | + # 1 day |
582 | + default_retry_delay=24 * 60 * 60, |
583 | + # Launchpad can be problematic, so we retry every day for a |
584 | + # week. |
585 | + retry_kwargs={"max_retries": 7}, |
586 | +) |
587 | +def grab_ddebs_sources(msg): |
588 | + """Dispatch the DdebSourceCodeGetter task. |
589 | + |
590 | + :param dict(str -> str) msg: The dictionary containing the message |
591 | + that will be processed by the getter. |
592 | + """ |
593 | + g = DdebSourceCodeGetter() |
594 | + g.process_request(msg) |
595 | + |
596 | + |
597 | +if __name__ == "__main__": |
598 | + app.start() |
Thanks, Sergio!
I added a few comments below and inline:
- The README file needs its dependency list updated, now that you also require requests here.
- You could use the logger's lazy string generation by using %s instead of f-strings.