Code review comment for lp:~jelmer/bzr/urllib-verifies-ssl-certs

Revision history for this message
Vincent Ladeuil (vila) wrote :

> Add support for verifying SSL certificates when using urllib.

\o/

>
> This branch does actually work, but it's got some open ends:
>
> * The location with the CAs to trust differs per system. Rather than hardcoding
> a list, I think we should have a configuration option that allows the user
> to specify what CA list they want to verify against.

pycurl already handles that for windows but getting it right in the general
case... sounds like a packaging issue :-/

>
> This option could then also be pointed at a user-specific list
> (rather than a system wide list). And packagers can put in the default
> for their distribution.
>
> Where do I obtain this option from, which config stack ?

I'd go with global to start with specifying the default value at
registration time if possible.

Arguably it's a system-wide config file that should be used, making packager
life easier.

>
> * We probably want some way to specify that self-signed certificates are ok.
> Probably also a new option? We should probably still warn the user
> when a self-signed certificate is being trusted.

authentication.conf is the right place for that and the option is even
documented but not implemented ;)

>
> * There are no additional tests. There probably should be some, at least for
> e.g. the match_hostname helper.

Yup. There are already https tests for pycurl that should be activated now
(<broken record> hey, did I mention the 'scenarios' attribute is cool for
that ? </broken record> ;)

>
> * I have the feeling that I'm missing something. This was way too easy.
> What is it?

Hehe.

The last time I worked on this area we were still supporting python-2.4/2.5
where https wasn't supported. Now, you also found the relevant code in
python-3.2 so... that certainly simplify things *a lot* :)

This wasn't a full review but as I understand it you weren't requesting one
either ;)

As for tests, if you look at the pycurl ones you'll certainly find that most
of the needed infrastructure is there to build certificates for special
purposes (self-signed, obsolete, etc) and easily setup an https server.

99 + SSLSocket.getpeercert()) matches the *hostname*. RFC 2818 rules
100 + are mostly followed, but IP addresses are not accepted for *hostname*.

Hmm, IP addresses not accepted ? I bet our tests won't like that much.

pycurl itself was not open enough to test for these weird certificates AFAIR
but it's also a robust implementation so I wasn't concerned, but if we make
urllib the default one, we have to make sure we're making the right checks.

I'll need to dig a bit more to find back the relevant discussions about this
topic.

Also, defaulting to urllib is fine but the documentation probably needs to
be updated (created even ?) to make it clear that we're not defaulting to
pycurl. We had issues in the past (including failures to check lp
certificate) which had led to the actual defaults (urllib for http, pycurl
for https), let's make the switch crystal clear.

141 def connect_to_origin(self):
142 - ssl_sock = _ssl_wrap_socket(self.sock, self.key_file, self.cert_file)
143 + ssl_sock = ssl.wrap_socket(self.sock, self.key_file, self.cert_file,
144 + cert_reqs=ssl.CERT_REQUIRED,
145 + ca_certs="/etc/ssl/certs/ca-certificates.crt")
146 + match_hostname(ssl_sock.getpeercert(), self.host)

Isn't it great that this is enough to make it work for both the targeted
server as well as the proxy :)

But joke aside, getting 'ca_certs' right will be a chore. It's a shame that
python doesn't handle that, really.

We probably want to mention this path in all CertificateError exceptions to
help people debug the related issues.

If nothing else, it means we will have to make sure that all packagers get
it right for the their platform or making urllib the default will not be as
smooth as we want it to be... worth raising the topic on the mailing list
(bzr or bzr-packagers ??) maybe ?

All of that being said, I will probably handle the tow bugs separately.

It's awesome that you added support for cert checks but it seems a bit
premature to make it the default.

« Back to merge proposal