Merge lp:~marcustomlinson/unity-scopes-api/fix_url_port_extraction into lp:unity-scopes-api/devel

Proposed by Marcus Tomlinson
Status: Merged
Approved by: Marcus Tomlinson
Approved revision: 236
Merged at revision: 228
Proposed branch: lp:~marcustomlinson/unity-scopes-api/fix_url_port_extraction
Merge into: lp:unity-scopes-api/devel
Diff against target: 382 lines (+276/-33)
5 files modified
include/unity/scopes/internal/smartscopes/SmartScopesClient.h (+5/-0)
src/scopes/internal/smartscopes/SmartScopesClient.cpp (+105/-29)
test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp (+148/-0)
test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp (+0/-2)
valgrind-suppress (+18/-2)
To merge this branch: bzr merge lp:~marcustomlinson/unity-scopes-api/fix_url_port_extraction
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Approve
Paweł Stołowski (community) Approve
Michi Henning (community) Needs Information
Review via email: mp+207864@code.launchpad.net

Commit message

Don't assume the port to be at the very end of a url string.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Michi Henning (michihenning) wrote :

Strictly speaking, for strings, you should use string::size_type, not uint64_t. (For example, on a 32-bit platform, uint64_t needlessly forces a 64-bit type when a 32-bit type would be sufficient.)

    // find the last occurrence of ':' in the url in order to extract the port number
    // * ignore the colon after "http"/"https"

    const size_t hier_pos = strlen("https");

    uint64_t port_pos = base_url.find_last_of(':');
    if (port_pos != std::string::npos && port_pos > hier_pos)

What happens if the URL is of the form "http://somewhere.org/some/path"? port_pos will be 4 in that case, which is not greater than hier_pos.

Just checking. I find it almost impossible to understand string parsing code such as this by just reading it. Not your fault, of course. The C++ string class would have to be one of the worst string classes anyone has ever come up with :-(

review: Needs Information
228. By Marcus Tomlinson

Use std::string::size_type for string find positions

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

> Strictly speaking, for strings, you should use string::size_type, not
> uint64_t. (For example, on a 32-bit platform, uint64_t needlessly forces a
> 64-bit type when a 32-bit type would be sufficient.)

ok, I've made that change.

> // find the last occurrence of ':' in the url in order to extract the port
> number
> // * ignore the colon after "http"/"https"
>
> const size_t hier_pos = strlen("https");
>
> uint64_t port_pos = base_url.find_last_of(':');
> if (port_pos != std::string::npos && port_pos > hier_pos)
>
> What happens if the URL is of the form "http://somewhere.org/some/path"?
> port_pos will be 4 in that case, which is not greater than hier_pos.

This is correct. That if statement is determining if there IS a port in the url. As "http://somewhere.org/some/path" does not have a port in the name, the else logic is executed.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
229. By Marcus Tomlinson

Added some more comments

230. By Marcus Tomlinson

Further clarification of string logic

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Paweł Stołowski (stolowski) wrote :

I agree, it's very hard to tell if this parsing is correct and handles all cases; making any adjustments to it without risking breaking something in the future will be tricky . Could you please add some tests that exercise this parser with different uris (including broken ones), with and without SMART_SCOPES_SERVER env var set and with default and non-default port?
Perhaps it would make sense to move this parser out from SmartScopesClient to a utility function to make testing easy (and also in case we need another SmartScopesClient ctor)? But I'm not going to insist on that, you can very well just extend existing SmartScopesClient tests.

review: Needs Fixing
231. By Marcus Tomlinson

Moved url parsing into set_url() method, and added set_port(), port(), and url() methods.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
232. By Marcus Tomlinson

reset_port() and reset_url() better describes what these methods are doing.

233. By Marcus Tomlinson

Added valgrind suppressions for putenv and getenv.
Added tests for url parsing to SmartScopesClientTest.

234. By Marcus Tomlinson

Spelling mistake in comments

235. By Marcus Tomlinson

Removed "TODO" comment

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Paweł Stołowski (stolowski) wrote :

Good stuff, thanks for the tests!

One more minor thing:

> What happens if the URL is of the form "http://somewhere.org/some/path"?
> port_pos will be 4 in that case, which is not greater than hier_pos.
> This is correct. That if statement is determining if there IS a port in the url. As > > >"http://somewhere.org/some/path" does not have a port in the name, the else logic is >executed.

Could you please add a comment in the code explaining it's ok for http as well?

review: Needs Fixing
236. By Marcus Tomlinson

Added comment better explaining how we determine that there is a port in the url.

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

> Good stuff, thanks for the tests!
>
> One more minor thing:
>
> > What happens if the URL is of the form "http://somewhere.org/some/path"?
> > port_pos will be 4 in that case, which is not greater than hier_pos.
> > This is correct. That if statement is determining if there IS a port in the
> url. As > > >"http://somewhere.org/some/path" does not have a port in the
> name, the else logic is >executed.
>
> Could you please add a comment in the code explaining it's ok for http as
> well?

Ok I've added "// if there is a port specified in the url (i.e. a colon occurs after "http:" / "https:")" before the if statement that checks for port.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Paweł Stołowski (stolowski) wrote :

+1

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'include/unity/scopes/internal/smartscopes/SmartScopesClient.h'
--- include/unity/scopes/internal/smartscopes/SmartScopesClient.h 2014-02-20 19:19:25 +0000
+++ include/unity/scopes/internal/smartscopes/SmartScopesClient.h 2014-02-25 09:40:06 +0000
@@ -128,6 +128,11 @@
128128
129 virtual ~SmartScopesClient();129 virtual ~SmartScopesClient();
130130
131 void reset_url(std::string const& url = "");
132 void reset_port(uint port = 0);
133 std::string url();
134 uint port();
135
131 bool get_remote_scopes(std::vector<RemoteScope>& scopes, std::string const& locale = "", bool caching_enabled = true);136 bool get_remote_scopes(std::vector<RemoteScope>& scopes, std::string const& locale = "", bool caching_enabled = true);
132137
133 SearchHandle::UPtr search(std::string const& base_url,138 SearchHandle::UPtr search(std::string const& base_url,
134139
=== modified file 'src/scopes/internal/smartscopes/SmartScopesClient.cpp'
--- src/scopes/internal/smartscopes/SmartScopesClient.cpp 2014-02-20 19:19:25 +0000
+++ src/scopes/internal/smartscopes/SmartScopesClient.cpp 2014-02-25 09:40:06 +0000
@@ -93,38 +93,32 @@
93 uint port)93 uint port)
94 : http_client_(http_client)94 : http_client_(http_client)
95 , json_node_(json_node)95 , json_node_(json_node)
96 , url_(url)96 , port_(0)
97 , port_(port)
98 , have_latest_cache_(false)97 , have_latest_cache_(false)
99 , query_counter_(0)98 , query_counter_(0)
100{99{
101 // initialise url_100 try
102 if (url_.empty())101 {
103 {102 // try to set url from url supplied
104 char* base_url_env = ::getenv("SMART_SCOPES_SERVER");103 reset_url(url);
105 std::string base_url = base_url_env ? base_url_env : "";104 }
106 if (!base_url.empty())105 catch(...)
107 {106 {
108 // find the last occurrence of ':' in the url in order to extract the port number107 try
109 // * ignore the colon after "http"/"https"108 {
110109 // url supplied failed, try to set url automatically
111 const size_t hier_pos = strlen("https");110 reset_url();
112111 }
113 uint64_t found = base_url.find_last_of(':');112 catch(...)
114 if (found != std::string::npos && found > hier_pos)113 {
115 {114 std::cerr << "SmartScopesClient::SmartScopesClient: Failed to initialise SSS url" << std::endl;
116 url_ = base_url.substr(0, found);115 }
117 port_ = std::stoi(base_url.substr(found + 1));116 }
118 }117
119 else118 // force set a port only if one was explicitly provided
120 {119 if (port != 0)
121 url_ = base_url;120 {
122 }121 reset_port(port);
123 }
124 else
125 {
126 url_ = c_base_url;
127 }
128 }122 }
129123
130 // initialise cached_scopes_124 // initialise cached_scopes_
@@ -135,6 +129,88 @@
135{129{
136}130}
137131
132void SmartScopesClient::reset_url(std::string const& url)
133{
134 std::string base_url = url;
135
136 // if a url was not provided, get the environment variable
137 if (base_url.empty())
138 {
139 char* sss_url_env = ::getenv("SMART_SCOPES_SERVER");
140 base_url = sss_url_env ? sss_url_env : "";
141
142 // if the env var was not provided, use the c_base_url constant
143 if (base_url.empty())
144 {
145 base_url = c_base_url;
146 }
147 }
148
149 // find the last occurrence of ':' in the url in order to extract the port number
150 // * ignore the colon after "http" / "https"
151
152 const size_t hier_pos = strlen("https");
153 std::string::size_type port_pos = base_url.find_last_of(':');
154
155 // if there is a port specified in the url (i.e. a colon occurs after "http:" / "https:")
156 if (port_pos != std::string::npos && port_pos > hier_pos)
157 {
158 url_ = base_url.substr(0, port_pos);
159 std::string port_str;
160 std::string::size_type url_cont = base_url.find('/', port_pos);
161
162 // if the address continues after the port
163 if (url_cont != std::string::npos)
164 {
165 // extract port, and add the rest of the address to url_
166 url_ += base_url.substr(url_cont);
167 port_str = base_url.substr(port_pos + 1, url_cont - port_pos - 1);
168 }
169 // else if the remainder of the string is just the port
170 else
171 {
172 // extract port
173 port_str = base_url.substr(port_pos + 1);
174 }
175
176 // check if port_str is actually a number before setting port_
177 if (!port_str.empty() &&
178 std::find_if(begin(port_str), end(port_str),
179 [](char const& c){ return !std::isdigit(c); }) == port_str.end())
180 {
181 port_ = std::stoi(port_str);
182 }
183 else
184 {
185 // the port supplied is not a number, don't trust the url either
186 url_ = "";
187 port_ = 0;
188 throw unity::InvalidArgumentException("Invalid url: " + url);
189 }
190 }
191 // else if there is no port specified in the url
192 else
193 {
194 url_ = base_url;
195 port_ = 0;
196 }
197}
198
199void SmartScopesClient::reset_port(uint port)
200{
201 port_ = port;
202}
203
204std::string SmartScopesClient::url()
205{
206 return url_;
207}
208
209uint SmartScopesClient::port()
210{
211 return port_;
212}
213
138// returns false if cache used214// returns false if cache used
139bool SmartScopesClient::get_remote_scopes(std::vector<RemoteScope>& remote_scopes, std::string const& locale, bool caching_enabled)215bool SmartScopesClient::get_remote_scopes(std::vector<RemoteScope>& remote_scopes, std::string const& locale, bool caching_enabled)
140{216{
141217
=== modified file 'test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp'
--- test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp 2014-02-20 19:19:25 +0000
+++ test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp 2014-02-25 09:40:06 +0000
@@ -190,4 +190,152 @@
190 EXPECT_EQ(2, results.size());190 EXPECT_EQ(2, results.size());
191}191}
192192
193TEST_F(SmartScopesClientTest, url_parsing)
194{
195 // check initial values to be expected test values
196 EXPECT_EQ(server_.port_, ssc_->port());
197 EXPECT_EQ("http://127.0.0.1", ssc_->url());
198
199 // empty the environment var (in case there already is one set)
200 std::string server_url_env = "SMART_SCOPES_SERVER=";
201 ::putenv(const_cast<char*>(server_url_env.c_str()));
202
203 // reset url and check that we now have contant values
204 EXPECT_NO_THROW(ssc_->reset_url());
205 EXPECT_EQ(0, ssc_->port());
206 EXPECT_EQ("https://productsearch.ubuntu.com/smartscopes/v2", ssc_->url());
207
208 // http addr, no port
209 EXPECT_NO_THROW(ssc_->reset_url("http://hello.com/there"));
210 EXPECT_EQ(0, ssc_->port());
211 EXPECT_EQ("http://hello.com/there", ssc_->url());
212
213 // https addr, no port
214 EXPECT_NO_THROW(ssc_->reset_url("https://hello.com/there"));
215 EXPECT_EQ(0, ssc_->port());
216 EXPECT_NO_THROW(ssc_->reset_port(1500));
217 EXPECT_EQ(1500, ssc_->port());
218 EXPECT_EQ("https://hello.com/there", ssc_->url());
219
220 // http addr, with port (end)
221 EXPECT_NO_THROW(ssc_->reset_url("http://hello.com:1500"));
222 EXPECT_EQ(1500, ssc_->port());
223 EXPECT_EQ("http://hello.com", ssc_->url());
224
225 // https addr, with port (end)
226 EXPECT_NO_THROW(ssc_->reset_url("https://hello.com:1500"));
227 EXPECT_EQ(1500, ssc_->port());
228 EXPECT_NO_THROW(ssc_->reset_port(2000));
229 EXPECT_EQ(2000, ssc_->port());
230 EXPECT_EQ("https://hello.com", ssc_->url());
231
232 // http addr, with port (mid)
233 EXPECT_NO_THROW(ssc_->reset_url("http://hello.com:1500/there"));
234 EXPECT_EQ(1500, ssc_->port());
235 EXPECT_EQ("http://hello.com/there", ssc_->url());
236
237 // https addr, with port (mid)
238 EXPECT_NO_THROW(ssc_->reset_url("https://hello.com:1500/there"));
239 EXPECT_EQ(1500, ssc_->port());
240 EXPECT_EQ("https://hello.com/there", ssc_->url());
241
242 // https addr, with broken port
243 EXPECT_THROW(ssc_->reset_url("https://hello.com:15d00/there"), std::exception);
244 EXPECT_EQ(0, ssc_->port());
245 EXPECT_EQ("", ssc_->url());
246
247 // https addr, with floating colon
248 EXPECT_THROW(ssc_->reset_url("https://hello.com:1500/the:re"), std::exception);
249 EXPECT_EQ(0, ssc_->port());
250 EXPECT_NO_THROW(ssc_->reset_port(1500));
251 EXPECT_EQ(1500, ssc_->port());
252 EXPECT_EQ("", ssc_->url());
253}
254
255TEST_F(SmartScopesClientTest, url_parsing_env_var)
256{
257 // check initial values to be expected test values
258 EXPECT_EQ(server_.port_, ssc_->port());
259 EXPECT_EQ("http://127.0.0.1", ssc_->url());
260
261 // empty the environment var (in case there already is one set)
262 std::string server_url_env = "SMART_SCOPES_SERVER=";
263 ::putenv(const_cast<char*>(server_url_env.c_str()));
264
265 // reset url and check that we now have contant values
266 EXPECT_NO_THROW(ssc_->reset_url());
267 EXPECT_EQ(0, ssc_->port());
268 EXPECT_EQ("https://productsearch.ubuntu.com/smartscopes/v2", ssc_->url());
269
270 // env var, http addr, no port
271 server_url_env = "SMART_SCOPES_SERVER=http://hello.com/there";
272 ::putenv(const_cast<char*>(server_url_env.c_str()));
273
274 EXPECT_NO_THROW(ssc_->reset_url());
275 EXPECT_EQ(0, ssc_->port());
276 EXPECT_EQ("http://hello.com/there", ssc_->url());
277
278 // env var, https addr, no port
279 server_url_env = "SMART_SCOPES_SERVER=https://hello.com/there";
280 ::putenv(const_cast<char*>(server_url_env.c_str()));
281
282 EXPECT_NO_THROW(ssc_->reset_url());
283 EXPECT_EQ(0, ssc_->port());
284 EXPECT_NO_THROW(ssc_->reset_port(1500));
285 EXPECT_EQ(1500, ssc_->port());
286 EXPECT_EQ("https://hello.com/there", ssc_->url());
287
288 // env var, http addr, with port (end)
289 server_url_env = "SMART_SCOPES_SERVER=http://hello.com:1500";
290 ::putenv(const_cast<char*>(server_url_env.c_str()));
291
292 EXPECT_NO_THROW(ssc_->reset_url());
293 EXPECT_EQ(1500, ssc_->port());
294 EXPECT_EQ("http://hello.com", ssc_->url());
295
296 // env var, https addr, with port (end)
297 server_url_env = "SMART_SCOPES_SERVER=https://hello.com:1500";
298 ::putenv(const_cast<char*>(server_url_env.c_str()));
299
300 EXPECT_NO_THROW(ssc_->reset_url());
301 EXPECT_EQ(1500, ssc_->port());
302 EXPECT_NO_THROW(ssc_->reset_port(2000));
303 EXPECT_EQ(2000, ssc_->port());
304 EXPECT_EQ("https://hello.com", ssc_->url());
305
306 // env var, http addr, with port (mid)
307 server_url_env = "SMART_SCOPES_SERVER=http://hello.com:1500/there";
308 ::putenv(const_cast<char*>(server_url_env.c_str()));
309
310 EXPECT_NO_THROW(ssc_->reset_url());
311 EXPECT_EQ(1500, ssc_->port());
312 EXPECT_EQ("http://hello.com/there", ssc_->url());
313
314 // env var, https addr, with port (mid)
315 server_url_env = "SMART_SCOPES_SERVER=https://hello.com:1500/there";
316 ::putenv(const_cast<char*>(server_url_env.c_str()));
317
318 EXPECT_NO_THROW(ssc_->reset_url());
319 EXPECT_EQ(1500, ssc_->port());
320 EXPECT_EQ("https://hello.com/there", ssc_->url());
321
322 // env var, https addr, with broken port
323 server_url_env = "SMART_SCOPES_SERVER=https://hello.com:15d00/there";
324 ::putenv(const_cast<char*>(server_url_env.c_str()));
325
326 EXPECT_THROW(ssc_->reset_url(), std::exception);
327 EXPECT_EQ(0, ssc_->port());
328 EXPECT_EQ("", ssc_->url());
329
330 // env var, https addr, with floating colon
331 server_url_env = "SMART_SCOPES_SERVER=https://hello.com:1500/the:re";
332 ::putenv(const_cast<char*>(server_url_env.c_str()));
333
334 EXPECT_THROW(ssc_->reset_url(), std::exception);
335 EXPECT_EQ(0, ssc_->port());
336 EXPECT_NO_THROW(ssc_->reset_port(1500));
337 EXPECT_EQ(1500, ssc_->port());
338 EXPECT_EQ("", ssc_->url());
339}
340
193} // namespace341} // namespace
194342
=== modified file 'test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp'
--- test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp 2014-02-20 19:19:25 +0000
+++ test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp 2014-02-25 09:40:06 +0000
@@ -41,8 +41,6 @@
41using namespace unity::scopes::internal::smartscopes;41using namespace unity::scopes::internal::smartscopes;
42using namespace unity::test::scopes::internal::smartscopes;42using namespace unity::test::scopes::internal::smartscopes;
4343
44///! TODO: more tests
45
46namespace44namespace
47{45{
4846
4947
=== modified file 'valgrind-suppress'
--- valgrind-suppress 2014-02-03 11:29:47 +0000
+++ valgrind-suppress 2014-02-25 09:40:06 +0000
@@ -72,7 +72,6 @@
72 obj:/lib/x86_64-linux-gnu/ld-2.18.so72 obj:/lib/x86_64-linux-gnu/ld-2.18.so
73}73}
7474
75
76# False positives for memory leaks in Qt75# False positives for memory leaks in Qt
7776
78{77{
@@ -99,7 +98,6 @@
99 ...98 ...
100 fun:_ZN21QNetworkAccessManager13createRequestENS_9OperationERK15QNetworkRequestP9QIODevice99 fun:_ZN21QNetworkAccessManager13createRequestENS_9OperationERK15QNetworkRequestP9QIODevice
101 fun:_ZN21QNetworkAccessManager3getERK15QNetworkRequest100 fun:_ZN21QNetworkAccessManager3getERK15QNetworkRequest
102 fun:_ZN18HttpClientQtThread3runEv
103}101}
104102
105{103{
@@ -109,3 +107,21 @@
109 ...107 ...
110 fun:_ZNK14QFactoryLoader8instanceEi108 fun:_ZNK14QFactoryLoader8instanceEi
111}109}
110
111# Bogus "invalid read" reports for ::putenv and ::genenv
112
113{
114 putenv_read
115 Memcheck:Addr1
116 ...
117 fun:putenv
118 ...
119}
120
121{
122 getenv_read
123 Memcheck:Addr2
124 ...
125 fun:getenv
126 ...
127}

Subscribers

People subscribed via source and target branches

to all changes: