Merge ~slingamn/ssh-import-id:norequests.release into ssh-import-id:master

Proposed by Shivaram Lingamneni
Status: Merged
Merged at revision: fb51a7a9150c4c7c7d93f35461169cb12bdae4b6
Proposed branch: ~slingamn/ssh-import-id:norequests.release
Merge into: ssh-import-id:master
Diff against target: 121 lines (+22/-23)
3 files modified
debian/control (+0/-2)
setup.py (+2/-2)
ssh_import_id/__init__.py (+20/-19)
Reviewer Review Type Date Requested Status
Robie Basak Approve
Scott Moser Pending
Review via email: mp+389139@code.launchpad.net

Commit message

Remove dependency on python-requests.

Description of the change

The end goal is to remove python3-requests and its dependencies from the Ubuntu base system. ssh-import-id was using requests because Python didn't provide a sane way to do TLS certificate verification; this is fixed now with ssl.create_default_context().

Thanks for your time!

To post a comment you must log in.
Revision history for this message
Scott Moser (smoser) wrote :

one comment in line that i think needs fixing.

I also added 'ssh-import-id' team, and hopefully someone from sub-team https://launchpad.net/~canonical-server will respond.

I'd also request that you update the comment in setup.py 'will fail due to requests not available'.

Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

Thanks for the review! I've addressed the comments thus far.

I also realized the project uses tox, so I've brought the changes back into compliance with pylint/pycodestyle.

Revision history for this message
Scott Moser (smoser) wrote :

I am happy with this .. but I'm really hoping that someone in ~canonical-server would grab it and merge it...

Revision history for this message
Richard Harding (rharding) wrote :

Sounds good, thanks for the patch. I'll see if we can get it in.

On Mon, Aug 17, 2020 at 12:59 PM Scott Moser <email address hidden>
wrote:

> I am happy with this .. but I'm really hoping that someone in
> ~canonical-server would grab it and merge it...
>
> --
>
> https://code.launchpad.net/~slingamn/ssh-import-id/+git/ssh-import-id/+merge/389139
> Your team ssh-import-id is requested to review the proposed merge of
> ~slingamn/ssh-import-id:norequests.release into ssh-import-id:master.
>

Revision history for this message
Robie Basak (racb) wrote :

Thank you for working on improvements to ssh-import-id!

Note that cloud-init depends on python3-requests. Are you looking to remove the requests dependency from there, too? Without that, I don't see how this change will help you towards your goal.

However, now that the standard library has sufficient support, I have no objection to removing the dependency on requests from ssh-import-id. Unless it makes the code worse, preferring the standard library over an external library is usually a good thing.

I'm not sure about the shim you've added to emulate enough of the requests library here though. What I'd expect to see instead is ssh-import-id adjusted to use the standard library as if it was written to do that in the first place. But instead there's now this extra indirection through your new shim, including some protocol-specific handling that looks like it should be in a dedicated protocol handling library rather than reimplemented inside a high level tool like this one.

What's the reason you chose to use http.client instead of urllib.request? Then you could avoid the URL inspection in http_get() I think? Is there some feature needed that is missing from urllib.request?

If I'm wrong or I've missed something, please could you explain? Otherwise, please drop the new https.py and change the URL fetch code in the two places it is used to natively use urllib.request instead, handling the resulting HTTPResponse object directly rather than emulating the old response object that you want to remove. You can use urllib.request.Request() to include User-Agent to maintain the existing functionality. Feel free to abstract common code out to a function if appropriate.

review: Needs Fixing
Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

Thanks for the mention of cloud-init; I knew about software-properties-common (which I have an open review for) and apport, but I wasn't aware of cloud-init. I am prepared to remove requests from cloud-init as well.

Py3's urllib looks pretty sane (I got accustomed to avoiding packages with names starting in 'urllib' because the Py2 ones were bad). I'll rewrite this branch to use it.

Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

I replaced http.client with urllib, which removed the URL parsing code.

There's still a bit of a shim in there, because urllib's API is in fact still bad. At this point, the shim's purpose is just to read the entire response, close the socket, and then return the body together with the status code and headers.

Revision history for this message
Robie Basak (racb) wrote :

Sorry - while I agree that your indirection will work correctly, I still think that HttpResponse is unnecessary and adds extra complexity.

Instead of indirecting through HttpResponse, why don't you accept a urlopen Response object as native object that you're willing to touch and access its properties directly? For example:

with urlopen(...) as response:
    data = json.loads(response.read().decode('utf-8'))

The reason I'm opposed to your HttpResponse shim is that it seems to wrap something that already exists as an object. I'm not keen on wrapping library APIs to adjust them just because the user of the API doesn't like it. That just means that the next developer now has to read and understand the adjusted API instead of already knowing the standard library API actually being used. And why does HttpResponse.headers exist? I don't see that used anywhere?

I also don't agree with wrapping away standard Python idioms such as file objects and context managers. These are perfectly fine to access directly, and as they are standard concepts, a newcomer to the project will understand exactly what is going on more easily.

For example:

try:
    with urlopen(Request(...)) as response:
        # If needed, handle different response codes here
        return response.read().decode('utf-8')
except urllib.error.HTTPError:
    # Handle the 404 here if necessary

If needed, wrap urlopen(Request(...)) into a function. The difference between this and what you have done is that the wrapper function's API is then automatically clear and concise: "Returns the same thing that urlopen() does, so if you know the standard library then you're in familiar territory again" as opposed to "Returns a special object that I created, so now you must learn my object design before you can understand the top level code".

Further comments inline.

Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

Well, I disagree with you (and so do the maintainers of cloud-init, who've created their own elaborate compatibility layer on top of python-requests), but this is your project and I'll write this however you'd prefer :-)

(python-requests is the wrong solution to the problem, but there is genuinely a problem with the standard library APIs here --- hence the urllib.request documentation continuing to state that "The Requests package is recommended for a higher-level HTTP client interface.")

(The response headers are used for reporting issues with GitHub rate-limiting requests.)

Revision history for this message
Robie Basak (racb) wrote :

Thanks. This looks much better!

> I don't think this has changed significantly. Note that these are already wrapped in `except Exception as e`.

I believe 404 is now raised immediately as an exception now though, so the previous code paths that handle special response.status_code are unreachable now. So I think this still needs fixing.

> Well, I disagree with you (and so do the maintainers of cloud-init, who've created their own elaborate compatibility layer on top of python-requests)

IMHO, whether a standard library API is "acceptable" or not is not an absolute, but a per-project decision. I might well make a different decision for a different project. Here, the need for the code to be quick and simple to grasp, and the cost of extra lines of code, is vastly greater than a project that receives continued sustained development effort. I might well form the opposite opinion on the usability of urllib.request in the context of cloud-init, and that wouldn't be a contradictory opinion.

> ...there is genuinely a problem with the standard library APIs here

I don't see any problem with the API in this branch. Is there something you think is bad that I'm not seeing, or is it that the problems are in unrelated areas that ssh-import-id doesn't currently touch, but you think that should cause ssh-import-id to avoid using urllib.request regardless?

> (The response headers are used for reporting issues with GitHub rate-limiting requests.)

Ah! Sorry I missed that before. The way you handled this is fine.

review: Needs Fixing
Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

>I believe 404 is now raised immediately as an exception now though, so the previous code paths that handle special response.status_code are unreachable now. So I think this still needs fixing.

Thanks, fixed.

>I don't see any problem with the API in this branch. Is there something you think is bad that I'm not seeing, or is it that the problems are in unrelated areas that ssh-import-id doesn't currently touch, but you think that should cause ssh-import-id to avoid using urllib.request regardless?

We probably just disagree about whether exposing the file object is a good API for HTTP requests.

Revision history for this message
Robie Basak (racb) wrote :

Sorry I haven't managed to get back to this yet. The code looks good. The only thing I noticed is that "import ssl" is no longer required, but I can fix that up before merging.

I'd just like to do some testing before merging, which I hope to get to this week. Thank you for your work!

Do you need this in Groovy, or is a subsequent Ubuntu release OK for you? If the former I'll need to merge this and upload it to Groovy before feature freeze on Thursday.

Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

Thanks! It's not important to me that this land in Groovy as opposed to the subsequent release.

Revision history for this message
Scott Moser (smoser) wrote :

@Robie,
Thanks for your time reviewing.
What kind of test do you want to see?

 I'd say the following:

   * ssh-import-id valid-user
   * ssh-import-id lp:valid-user
   * ssh-import-id gh:valid-user
   * ssh-import-id invalid-user
   * ssh-import-id lp:invalid-user
   * ssh-import-id gh:invalid-user
   * URL="http://localhost:9999/%s.pub" ssh-import-id valid-user
   * URL="http://localhost:9999/%s.pub" ssh-import-id invalid-user

In the URL cases, assume that URL has a 'valid-user.pub' and no 'invalid-user.pub'.

Clearly it'd be good to have these tests automated and in c-i.

@Shivaram,
Thanks for your effort with this.
Go ahead and push a removal of the 'import ssl'.

Revision history for this message
Scott Moser (smoser) wrote :

If it was up to me, I'd like to get this into groovy. If we get testing done, I'll be happy to make a release, and upload to ubuntu.

Thanks

Revision history for this message
Robie Basak (racb) wrote :

Hi Scott,

On Wed, Aug 26, 2020 at 02:02:02PM -0000, Scott Moser wrote:
> @Robie,
> Thanks for your time reviewing.
> What kind of test do you want to see?
>
> I'd say the following:
>
> * ssh-import-id valid-user
> * ssh-import-id lp:valid-user
> * ssh-import-id gh:valid-user
> * ssh-import-id invalid-user
> * ssh-import-id lp:invalid-user
> * ssh-import-id gh:invalid-user
> * URL="http://localhost:9999/%s.pub" ssh-import-id valid-user
> * URL="http://localhost:9999/%s.pub" ssh-import-id invalid-user

Yes - I was planning to do these kinds of things manually, commit a drop
of "import ssl" and then merge. I realise that feature freeze is
tomorrow. I'm not sure if I'll get to this in time for that or not. If
you'd like to take this on instead, please do.

> Clearly it'd be good to have these tests automated and in c-i.

Yes, although that has the additional issue of needing a fake server to
test against.

Robie

Revision history for this message
Scott Moser (smoser) wrote :

> Hi Scott,
>
> > Clearly it'd be good to have these tests automated and in c-i.
>
> Yes, although that has the additional issue of needing a fake server to
> test against.

curtin's 'webserv' [1] does this, If you bind to port 0, then kernel
gives you a port, and it prints it stdout, so then you just do:

   [1] https://git.launchpad.net/curtin/tree/tools/webserv

  $ webserv > out &
  $ read host port < out
  $ # now use "http://127.0.0.1:$port/%s"

simplestreams does something similar i think. its easy enough.

Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

Sorry --- missed the feedback about 'import ssl' earlier. Fixed now.

Revision history for this message
Shivaram Lingamneni (slingamn) wrote :

Just wanted to check on the status of this.

Revision history for this message
Robie Basak (racb) wrote :

As it happens this just got to the top of my queue today. Looking at it now.

Revision history for this message
Robie Basak (racb) wrote :

Testing with all of Scott's scenarios above look good. Thanks!

I'll merge this now. It will need an upload to Ubuntu post-Groovy.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/debian/control b/debian/control
2index 39c35b4..99c5b75 100644
3--- a/debian/control
4+++ b/debian/control
5@@ -6,7 +6,6 @@ Uploaders: Andrew Starr-Bochicchio <asb@debian.org>
6 Build-Depends: debhelper (>= 10),
7 dh-python,
8 python3-all,
9- python3-requests (>= 1.1.0),
10 python3-distro,
11 python3-setuptools
12 Standards-Version: 4.1.3
13@@ -18,7 +17,6 @@ Package: ssh-import-id
14 Architecture: all
15 Depends: ca-certificates,
16 openssh-client,
17- python3-requests (>= 1.1.0),
18 python3-distro,
19 wget,
20 ${misc:Depends},
21diff --git a/setup.py b/setup.py
22index a75844e..f00afa6 100755
23--- a/setup.py
24+++ b/setup.py
25@@ -24,7 +24,7 @@ def read_version():
26 # shove 'version' into the path so we can import it without going through
27 # ssh_import_id which has deps that wont be available at setup.py time.
28 # specifically, from 'ssh_import_id import version'
29- # will fail due to requests not available.
30+ # may fail due to dependencies not available.
31 verdir = os.path.abspath(
32 os.path.join(os.path.dirname(__file__), "ssh_import_id"))
33 sys.path.insert(0, verdir)
34@@ -51,7 +51,7 @@ setup(
35 platforms=['any'],
36 packages=['ssh_import_id'],
37 scripts=['usr/bin/ssh-import-id-gh', 'usr/bin/ssh-import-id-lp'],
38- install_requires=["requests>=1.1.0", "distro"],
39+ install_requires=["distro"],
40 entry_points={
41 'console_scripts': [
42 'ssh-import-id = ssh_import_id:main'
43diff --git a/ssh_import_id/__init__.py b/ssh_import_id/__init__.py
44index e438b4e..034adda 100644
45--- a/ssh_import_id/__init__.py
46+++ b/ssh_import_id/__init__.py
47@@ -26,21 +26,19 @@ except ImportError:
48 JSONDecodeError = ValueError
49 import logging
50 import os
51-import stat
52 import subprocess
53 import sys
54 import tempfile
55+import urllib.error
56+from urllib.parse import quote_plus
57+from urllib.request import Request, urlopen
58
59 import distro
60-import requests
61
62-try:
63- from urllib.parse import quote_plus
64-except ImportError:
65- from urllib import quote_plus
66+from .version import VERSION
67
68
69-from .version import VERSION
70+DEFAULT_TIMEOUT = 15.0
71
72
73 DEFAULT_PROTO = "lp"
74@@ -315,14 +313,16 @@ def fetch_keys_lp(lpid, useragent):
75 url = "https://launchpad.net/~%s/+sshkeys" % (quote_plus(lpid))
76 headers = {'User-Agent': user_agent(useragent)}
77
78- response = requests.get(url, verify=True, headers=headers)
79- if response.status_code != 200:
80+ try:
81+ with urlopen(Request(url, headers=headers),
82+ timeout=DEFAULT_TIMEOUT) as response:
83+ keys = response.read().decode('utf-8')
84+ except urllib.error.HTTPError as e:
85 msg = 'Requesting Launchpad keys failed.'
86- if response.status_code == 404:
87+ if e.code == 404:
88 msg = 'Launchpad user not found.'
89- die(msg + " status_code=%d user=%s" % (response.status_code, lpid))
90+ die(msg + " status_code=%d user=%s" % (e.code, lpid))
91
92- keys = str(response.text)
93 # pylint: disable=broad-except
94 except Exception as e:
95 die(str(e))
96@@ -336,17 +336,18 @@ def fetch_keys_gh(ghid, useragent):
97 try:
98 url = "https://api.github.com/users/%s/keys" % (quote_plus(ghid))
99 headers = {'User-Agent': user_agent(useragent)}
100- resp = requests.get(url, headers=headers, verify=True)
101- text = resp.text
102- data = json.loads(text)
103- if resp.status_code != 200:
104+ try:
105+ with urlopen(Request(url, headers=headers),
106+ timeout=DEFAULT_TIMEOUT) as resp:
107+ data = json.load(resp)
108+ except urllib.error.HTTPError as e:
109 msg = 'Requesting GitHub keys failed.'
110- if resp.status_code == 404:
111+ if e.code == 404:
112 msg = 'Username "%s" not found at GitHub API.' % ghid
113- elif resp.headers.get(x_ratelimit_remaining) == "0":
114+ elif e.hdrs.get(x_ratelimit_remaining) == "0":
115 msg = ('GitHub REST API rate-limited this IP address. See %s .'
116 % help_url)
117- die(msg + " status_code=%d user=%s" % (resp.status_code, ghid))
118+ die(msg + " status_code=%d user=%s" % (e.code, ghid))
119 for keyobj in data:
120 keys += "%s %s@github/%s\n" % (keyobj['key'], ghid, keyobj['id'])
121 # pylint: disable=broad-except

Subscribers

People subscribed via source and target branches