Merge ubuntu-debuginfod:fix-requests-get into ubuntu-debuginfod:master

Proposed by Sergio Durigan Junior
Status: Merged
Merged at revision: 92f8d6ead978709ec8f2788e8c750e9cbbddd1ef
Proposed branch: ubuntu-debuginfod:fix-requests-get
Merge into: ubuntu-debuginfod:master
Diff against target: 15 lines (+1/-3)
1 file modified
debuggetter.py (+1/-3)
Reviewer Review Type Date Requested Status
Bryce Harrington (community) Approve
Athos Ribeiro Pending
Canonical Server Reporter Pending
Review via email: mp+434754@code.launchpad.net

Description of the change

This MP fixes the code that downloads things from LP.

I've found that requests will happily deflate gzip files because it always sends "Accept-Encoding: gzip deflate". Because some source packages have .diff.gz files, this automatically decompression ends up messing with "dpkg-source -x". I've been experiencing these errors locally.

Upstream says that one should use requests.raw when reading data from the object in this case:

https://github.com/psf/requests/issues/1586
https://requests.readthedocs.io/en/latest/user/quickstart/?highlight=deflate#raw-response-content

so this is what I'm proposing here.

To post a comment you must log in.
Revision history for this message
Bryce Harrington (bryce) wrote :

The rtd article says to use stream=True in the get() call, which you're already doing. The other link explains convincingly that r.raw is the way to go.

I wonder if there might be plain text files you're downloading that do need the deflation, but presumably you've verified that not to be the case.

You might leave yourself these two links in your commit message (or as comments) for future reference.

I also checked the api docs for copyfileobj to see if there's any error conditions / side effects / exceptions / parameters / return codes to note but it has nothing special.

So, LGTM, +1

review: Approve
Revision history for this message
Sergio Durigan Junior (sergiodj) wrote :

On Thursday, December 15 2022, Bryce Harrington wrote:

> Review: Approve
>
> The rtd article says to use stream=True in the get() call, which you're already doing. The other link explains convincingly that r.raw is the way to go.
>
> I wonder if there might be plain text files you're downloading that do need the deflation, but presumably you've verified that not to be the case.
>
> You might leave yourself these two links in your commit message (or as comments) for future reference.
>
> I also checked the api docs for copyfileobj to see if there's any error conditions / side effects / exceptions / parameters / return codes to note but it has nothing special.
>
> So, LGTM, +1

Thanks, Bryce. I'll go ahead and merge the fix.

--
Sergio
GPG key ID: E92F D0B3 6B14 F1F4 D8E0 EB2F 106D A1C8 C3CB BF14

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/debuggetter.py b/debuggetter.py
2index bafa914..e6b1de0 100644
3--- a/debuggetter.py
4+++ b/debuggetter.py
5@@ -113,9 +113,7 @@ class DebugGetter:
6 with s.get(url, allow_redirects=True, timeout=10, stream=True) as r:
7 r.raise_for_status()
8 with tempfile.NamedTemporaryFile(mode="wb") as tmpfile:
9- # 10 MB for chunk_size should be enough...
10- for chunk in r.iter_content(chunk_size=10 * 1024 * 1024):
11- tmpfile.write(chunk)
12+ shutil.copyfileobj(r.raw, tmpfile)
13 tmpfile.flush()
14 self._try_to_move_atomically(tmpfile.name, filepath)
15 except (

Subscribers

People subscribed via source and target branches

to all changes: