usso

Code review comment for lp:~vds/usso/support_sha1

support_sha1
Merge into trunk

Revision history for this message

Jeroen T. Vermeulen (jtv) wrote on 2013-01-23:

Thanks for doing this! I have some notes. I don't know how urgent they are, since AIUI this code isn't functional yet, so perhaps it's best to land the code anyway, and then address any remaining points in a later branch. Maybe put TODO or XXX markers in the code? Just make sure not to forget. :)

The notes are:

* I don't think signature() would produce the right output in the HMAC-SHA1 case. You use fmt.Sprint("%s&%s%s", a, b, c, d, e, f, g, h, i, j) but Sprint() doesn't format. So it'll just put that format string at the front as-is, with all the other values appended to it: "%s&%s%sAaaBbbCccDdd" etc. You probably want fmt.Sprintf(). But then you'll also need to provide a format string that consumes all the values you pass! The current code has a format string for 3 parameters, followed by about 10 parameters. This kind of thing is annoyingly hard to test for, because you have a large amount of text and you need to fish out of it the things that you expect to see. Things that you _don't_ expect to see are hard to detect. Normally an end-to-end test would reveal those, but not when the real OAuth server isn't part of the test setup. There are ways to deal with this problem, and I'll try to make them gradually clearer over the rest of this review.

* The text composition in GetAuthorizationHeader() and signature() looks very repetitive, both within each function and between the two. Also the repeated code leaves out the query-escaping here and there just because it can. Is there no easy way to regularize that into a standard-library call or loop, so that it's easier to avoid little one-off mistakes? I know it would be very easy in Python, although it does look to be a bit harder in Go. Especially if you want to pass a map and get consistently-sorted output.

* Speaking of which: does signature() need the parameters to be written in a consistent order? The python SSO code sorted them, but I'm not sure net/url's Values.Encode() in Go promises any kind of consistency. I didn't get around to resolving this in my code. :-(

* Is there any particular reason why SSOData.Timestamp and SSOData.Nonce are initialized inside GetAuthorizationHeader? It seems to me that should be SSOData's responsibility.

* You introduce some line-breaking. Do we have a standard for doing that in Go? The Go way of dealing with long lines, apparently, is to let them run for as long as they want to. If you do want to break them up, the Cloud Engineering convention in Python at least is to put the first line break *right after* the opening parenthesis or brace that contains the thing you want to break up. *Except* in function definitions, where you continue the parameter list until the line gets too long and then break the line. You do it that way in some places, but not consistently. My personal instinct is to go with the Go way, and if the line gets too long, take that as a signal that something has become too complicated and needs refactoring.

* signature() returns the empty string in the case of an unknown signature method. Is that meant to indicate an error to the caller, or is it meant as a sensible fallback? If it's an error condition, it should probably be clearer and the caller ought to check for it — maybe the code should even just panic() instead. Or if this is a fallback, for what situations is it really sensible? Maybe there is some specific signature method that should produce an empty signature, and any unknown signature method should be an error again. Being specific about this can ease debugging later.

* It looks like the tests could do with a simple factory function: "initialize an SSOData with test data for me." In Launchpad our lives got massively better once we built a reusable factory where we could generate all the test objects we needed. If type A required another object of type B, the factory method for type A would just call the factory method for type B. (Another important design point was that everything in the generated object was as randomized as possible, to help tests expose bugs. Nothing was ever hard-coded or "just using the same value" as something else.)

* Don't forget to unit-test! You have two big end-to-end tests there, but functions like signature() and GetAuthorizationHeader() also deserve to be tested in isolation. That way your tests can go into much more detail, and catch things like the fmt.Sprint() problem I mentioned at the top. This is another reason why it would be nice to have a single function, if possible, supporting the common text composition between those two functions: you can test it once and know that it works well for both places where you use it. (Or if it doesn't, this also makes it easier to find the problems. :)

* Let's imagine you have a reusable function for composing header text. Call it makeHeaderText() for example (probably a crappy name, sorry!) and both signature() and GetAuthorizationHeader() can make use of it. Then often it's better to make signature() and GetAuthorizationHeader() not call makeHeaderText() itself, but return _the value that you pass to makeHeaderText()_ instead. And then the caller itself can take that return value and call makeHeaderText(). That makes the calling code a tiny bit longer, but you get something back: whatever it is that you pass to makeHeaderText() will also be easier to inspect in your tests. So then you can write better tests for signature() and GetAuthorizationHeader() as well.

Jeroen

Thanks for doing this!  I have some notes.  I don't know how urgent they are, since AIUI this code isn't functional yet, so perhaps it's best to land the code anyway, and then address any remaining points in a later branch.  Maybe put TODO or XXX markers in the code?  Just make sure not to forget.  :)

The notes are:

* I don't think signature() would produce the right output in the HMAC-SHA1 case.  You use fmt.Sprint("%s&%s%s", a, b, c, d, e, f, g, h, i, j) but Sprint() doesn't format.  So it'll just put that format string at the front as-is, with all the other values appended to it: "%s&%s%sAaaBbbCccDdd" etc.  You probably want fmt.Sprintf().  But then you'll also need to provide a format string that consumes all the values you pass!  The current code has a format string for 3 parameters, followed by about 10 parameters.  This kind of thing is annoyingly hard to test for, because you have a large amount of text and you need to fish out of it the things that you expect to see.  Things that you _don't_ expect to see are hard to detect.  Normally an end-to-end test would reveal those, but not when the real OAuth server isn't part of the test setup.  There are ways to deal with this problem, and I'll try to make them gradually clearer over the rest of this review.

* The text composition in GetAuthorizationHeader() and signature() looks very repetitive, both within each function and between the two.  Also the repeated code leaves out the query-escaping here and there just because it can.  Is there no easy way to regularize that into a standard-library call or loop, so that it's easier to avoid little one-off mistakes?  I know it would be very easy in Python, although it does look to be a bit harder in Go.  Especially if you want to pass a map and get consistently-sorted output.

* Speaking of which: does signature() need the parameters to be written in a consistent order?  The python SSO code sorted them, but I'm not sure net/url's Values.Encode() in Go promises any kind of consistency.  I didn't get around to resolving this in my code.  :-(

* Is there any particular reason why SSOData.Timestamp and SSOData.Nonce are initialized inside GetAuthorizationHeader?  It seems to me that should be SSOData's responsibility.

* You introduce some line-breaking.  Do we have a standard for doing that in Go?  The Go way of dealing with long lines, apparently, is to let them run for as long as they want to.  If you do want to break them up, the Cloud Engineering convention in Python at least is to put the first line break *right after* the opening parenthesis or brace that contains the thing you want to break up.  *Except* in function definitions, where you continue the parameter list until the line gets too long and then break the line.  You do it that way in some places, but not consistently.  My personal instinct is to go with the Go way, and if the line gets too long, take that as a signal that something has become too complicated and needs refactoring.

* signature() returns the empty string in the case of an unknown signature method.  Is that meant to indicate an error to the caller, or is it meant as a sensible fallback?  If it's an error condition, it should probably be clearer and the caller ought to check for it — maybe the code should even just panic() instead.  Or if this is a fallback, for what situations is it really sensible?  Maybe there is some specific signature method that should produce an empty signature, and any unknown signature method should be an error again.  Being specific about this can ease debugging later.

* It looks like the tests could do with a simple factory function: "initialize an SSOData with test data for me."  In Launchpad our lives got massively better once we built a reusable factory where we could generate all the test objects we needed.  If type A required another object of type B, the factory method for type A would just call the factory method for type B.  (Another important design point was that everything in the generated object was as randomized as possible, to help tests expose bugs.  Nothing was ever hard-coded or "just using the same value" as something else.)

* Don't forget to unit-test!  You have two big end-to-end tests there, but functions like signature() and GetAuthorizationHeader() also deserve to be tested in isolation.  That way your tests can go into much more detail, and catch things like the fmt.Sprint() problem I mentioned at the top.  This is another reason why it would be nice to have a single function, if possible, supporting the common text composition between those two functions: you can test it once and know that it works well for both places where you use it.  (Or if it doesn't, this also makes it easier to find the problems. :)

* Let's imagine you have a reusable function for composing header text.  Call it makeHeaderText() for example (probably a crappy name, sorry!) and both signature() and GetAuthorizationHeader() can make use of it.  Then often it's better to make signature() and GetAuthorizationHeader() not call makeHeaderText() itself, but return _the value that you pass to makeHeaderText()_ instead.  And then the caller itself can take that return value and call makeHeaderText().  That makes the calling code a tiny bit longer, but you get something back: whatever it is that you pass to makeHeaderText() will also be easier to inspect in your tests.  So then you can write better tests for signature() and GetAuthorizationHeader() as well.

Jeroen

« Back to merge proposal