Merge lp:~marcustomlinson/unity-scopes-api/fix_url_port_extraction into lp:unity-scopes-api/devel

Proposed by Marcus Tomlinson
Status: Merged
Approved by: Marcus Tomlinson
Approved revision: 236
Merged at revision: 228
Proposed branch: lp:~marcustomlinson/unity-scopes-api/fix_url_port_extraction
Merge into: lp:unity-scopes-api/devel
Diff against target: 382 lines (+276/-33)
5 files modified
include/unity/scopes/internal/smartscopes/SmartScopesClient.h (+5/-0)
src/scopes/internal/smartscopes/SmartScopesClient.cpp (+105/-29)
test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp (+148/-0)
test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp (+0/-2)
valgrind-suppress (+18/-2)
To merge this branch: bzr merge lp:~marcustomlinson/unity-scopes-api/fix_url_port_extraction
Reviewer Review Type Date Requested Status
PS Jenkins bot (community) continuous-integration Approve
Paweł Stołowski (community) Approve
Michi Henning (community) Needs Information
Review via email: mp+207864@code.launchpad.net

Commit message

Don't assume the port to be at the very end of a url string.

To post a comment you must log in.
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Michi Henning (michihenning) wrote :

Strictly speaking, for strings, you should use string::size_type, not uint64_t. (For example, on a 32-bit platform, uint64_t needlessly forces a 64-bit type when a 32-bit type would be sufficient.)

    // find the last occurrence of ':' in the url in order to extract the port number
    // * ignore the colon after "http"/"https"

    const size_t hier_pos = strlen("https");

    uint64_t port_pos = base_url.find_last_of(':');
    if (port_pos != std::string::npos && port_pos > hier_pos)

What happens if the URL is of the form "http://somewhere.org/some/path"? port_pos will be 4 in that case, which is not greater than hier_pos.

Just checking. I find it almost impossible to understand string parsing code such as this by just reading it. Not your fault, of course. The C++ string class would have to be one of the worst string classes anyone has ever come up with :-(

review: Needs Information
228. By Marcus Tomlinson

Use std::string::size_type for string find positions

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

> Strictly speaking, for strings, you should use string::size_type, not
> uint64_t. (For example, on a 32-bit platform, uint64_t needlessly forces a
> 64-bit type when a 32-bit type would be sufficient.)

ok, I've made that change.

> // find the last occurrence of ':' in the url in order to extract the port
> number
> // * ignore the colon after "http"/"https"
>
> const size_t hier_pos = strlen("https");
>
> uint64_t port_pos = base_url.find_last_of(':');
> if (port_pos != std::string::npos && port_pos > hier_pos)
>
> What happens if the URL is of the form "http://somewhere.org/some/path"?
> port_pos will be 4 in that case, which is not greater than hier_pos.

This is correct. That if statement is determining if there IS a port in the url. As "http://somewhere.org/some/path" does not have a port in the name, the else logic is executed.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
229. By Marcus Tomlinson

Added some more comments

230. By Marcus Tomlinson

Further clarification of string logic

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Paweł Stołowski (stolowski) wrote :

I agree, it's very hard to tell if this parsing is correct and handles all cases; making any adjustments to it without risking breaking something in the future will be tricky . Could you please add some tests that exercise this parser with different uris (including broken ones), with and without SMART_SCOPES_SERVER env var set and with default and non-default port?
Perhaps it would make sense to move this parser out from SmartScopesClient to a utility function to make testing easy (and also in case we need another SmartScopesClient ctor)? But I'm not going to insist on that, you can very well just extend existing SmartScopesClient tests.

review: Needs Fixing
231. By Marcus Tomlinson

Moved url parsing into set_url() method, and added set_port(), port(), and url() methods.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
232. By Marcus Tomlinson

reset_port() and reset_url() better describes what these methods are doing.

233. By Marcus Tomlinson

Added valgrind suppressions for putenv and getenv.
Added tests for url parsing to SmartScopesClientTest.

234. By Marcus Tomlinson

Spelling mistake in comments

235. By Marcus Tomlinson

Removed "TODO" comment

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Paweł Stołowski (stolowski) wrote :

Good stuff, thanks for the tests!

One more minor thing:

> What happens if the URL is of the form "http://somewhere.org/some/path"?
> port_pos will be 4 in that case, which is not greater than hier_pos.
> This is correct. That if statement is determining if there IS a port in the url. As > > >"http://somewhere.org/some/path" does not have a port in the name, the else logic is >executed.

Could you please add a comment in the code explaining it's ok for http as well?

review: Needs Fixing
236. By Marcus Tomlinson

Added comment better explaining how we determine that there is a port in the url.

Revision history for this message
Marcus Tomlinson (marcustomlinson) wrote :

> Good stuff, thanks for the tests!
>
> One more minor thing:
>
> > What happens if the URL is of the form "http://somewhere.org/some/path"?
> > port_pos will be 4 in that case, which is not greater than hier_pos.
> > This is correct. That if statement is determining if there IS a port in the
> url. As > > >"http://somewhere.org/some/path" does not have a port in the
> name, the else logic is >executed.
>
> Could you please add a comment in the code explaining it's ok for http as
> well?

Ok I've added "// if there is a port specified in the url (i.e. a colon occurs after "http:" / "https:")" before the if statement that checks for port.

Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)
Revision history for this message
Paweł Stołowski (stolowski) wrote :

+1

review: Approve
Revision history for this message
PS Jenkins bot (ps-jenkins) wrote :
review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'include/unity/scopes/internal/smartscopes/SmartScopesClient.h'
2--- include/unity/scopes/internal/smartscopes/SmartScopesClient.h 2014-02-20 19:19:25 +0000
3+++ include/unity/scopes/internal/smartscopes/SmartScopesClient.h 2014-02-25 09:40:06 +0000
4@@ -128,6 +128,11 @@
5
6 virtual ~SmartScopesClient();
7
8+ void reset_url(std::string const& url = "");
9+ void reset_port(uint port = 0);
10+ std::string url();
11+ uint port();
12+
13 bool get_remote_scopes(std::vector<RemoteScope>& scopes, std::string const& locale = "", bool caching_enabled = true);
14
15 SearchHandle::UPtr search(std::string const& base_url,
16
17=== modified file 'src/scopes/internal/smartscopes/SmartScopesClient.cpp'
18--- src/scopes/internal/smartscopes/SmartScopesClient.cpp 2014-02-20 19:19:25 +0000
19+++ src/scopes/internal/smartscopes/SmartScopesClient.cpp 2014-02-25 09:40:06 +0000
20@@ -93,38 +93,32 @@
21 uint port)
22 : http_client_(http_client)
23 , json_node_(json_node)
24- , url_(url)
25- , port_(port)
26+ , port_(0)
27 , have_latest_cache_(false)
28 , query_counter_(0)
29 {
30- // initialise url_
31- if (url_.empty())
32- {
33- char* base_url_env = ::getenv("SMART_SCOPES_SERVER");
34- std::string base_url = base_url_env ? base_url_env : "";
35- if (!base_url.empty())
36- {
37- // find the last occurrence of ':' in the url in order to extract the port number
38- // * ignore the colon after "http"/"https"
39-
40- const size_t hier_pos = strlen("https");
41-
42- uint64_t found = base_url.find_last_of(':');
43- if (found != std::string::npos && found > hier_pos)
44- {
45- url_ = base_url.substr(0, found);
46- port_ = std::stoi(base_url.substr(found + 1));
47- }
48- else
49- {
50- url_ = base_url;
51- }
52- }
53- else
54- {
55- url_ = c_base_url;
56- }
57+ try
58+ {
59+ // try to set url from url supplied
60+ reset_url(url);
61+ }
62+ catch(...)
63+ {
64+ try
65+ {
66+ // url supplied failed, try to set url automatically
67+ reset_url();
68+ }
69+ catch(...)
70+ {
71+ std::cerr << "SmartScopesClient::SmartScopesClient: Failed to initialise SSS url" << std::endl;
72+ }
73+ }
74+
75+ // force set a port only if one was explicitly provided
76+ if (port != 0)
77+ {
78+ reset_port(port);
79 }
80
81 // initialise cached_scopes_
82@@ -135,6 +129,88 @@
83 {
84 }
85
86+void SmartScopesClient::reset_url(std::string const& url)
87+{
88+ std::string base_url = url;
89+
90+ // if a url was not provided, get the environment variable
91+ if (base_url.empty())
92+ {
93+ char* sss_url_env = ::getenv("SMART_SCOPES_SERVER");
94+ base_url = sss_url_env ? sss_url_env : "";
95+
96+ // if the env var was not provided, use the c_base_url constant
97+ if (base_url.empty())
98+ {
99+ base_url = c_base_url;
100+ }
101+ }
102+
103+ // find the last occurrence of ':' in the url in order to extract the port number
104+ // * ignore the colon after "http" / "https"
105+
106+ const size_t hier_pos = strlen("https");
107+ std::string::size_type port_pos = base_url.find_last_of(':');
108+
109+ // if there is a port specified in the url (i.e. a colon occurs after "http:" / "https:")
110+ if (port_pos != std::string::npos && port_pos > hier_pos)
111+ {
112+ url_ = base_url.substr(0, port_pos);
113+ std::string port_str;
114+ std::string::size_type url_cont = base_url.find('/', port_pos);
115+
116+ // if the address continues after the port
117+ if (url_cont != std::string::npos)
118+ {
119+ // extract port, and add the rest of the address to url_
120+ url_ += base_url.substr(url_cont);
121+ port_str = base_url.substr(port_pos + 1, url_cont - port_pos - 1);
122+ }
123+ // else if the remainder of the string is just the port
124+ else
125+ {
126+ // extract port
127+ port_str = base_url.substr(port_pos + 1);
128+ }
129+
130+ // check if port_str is actually a number before setting port_
131+ if (!port_str.empty() &&
132+ std::find_if(begin(port_str), end(port_str),
133+ [](char const& c){ return !std::isdigit(c); }) == port_str.end())
134+ {
135+ port_ = std::stoi(port_str);
136+ }
137+ else
138+ {
139+ // the port supplied is not a number, don't trust the url either
140+ url_ = "";
141+ port_ = 0;
142+ throw unity::InvalidArgumentException("Invalid url: " + url);
143+ }
144+ }
145+ // else if there is no port specified in the url
146+ else
147+ {
148+ url_ = base_url;
149+ port_ = 0;
150+ }
151+}
152+
153+void SmartScopesClient::reset_port(uint port)
154+{
155+ port_ = port;
156+}
157+
158+std::string SmartScopesClient::url()
159+{
160+ return url_;
161+}
162+
163+uint SmartScopesClient::port()
164+{
165+ return port_;
166+}
167+
168 // returns false if cache used
169 bool SmartScopesClient::get_remote_scopes(std::vector<RemoteScope>& remote_scopes, std::string const& locale, bool caching_enabled)
170 {
171
172=== modified file 'test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp'
173--- test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp 2014-02-20 19:19:25 +0000
174+++ test/gtest/scopes/internal/smartscopes/SmartScopesClient/SmartScopesClient_test.cpp 2014-02-25 09:40:06 +0000
175@@ -190,4 +190,152 @@
176 EXPECT_EQ(2, results.size());
177 }
178
179+TEST_F(SmartScopesClientTest, url_parsing)
180+{
181+ // check initial values to be expected test values
182+ EXPECT_EQ(server_.port_, ssc_->port());
183+ EXPECT_EQ("http://127.0.0.1", ssc_->url());
184+
185+ // empty the environment var (in case there already is one set)
186+ std::string server_url_env = "SMART_SCOPES_SERVER=";
187+ ::putenv(const_cast<char*>(server_url_env.c_str()));
188+
189+ // reset url and check that we now have contant values
190+ EXPECT_NO_THROW(ssc_->reset_url());
191+ EXPECT_EQ(0, ssc_->port());
192+ EXPECT_EQ("https://productsearch.ubuntu.com/smartscopes/v2", ssc_->url());
193+
194+ // http addr, no port
195+ EXPECT_NO_THROW(ssc_->reset_url("http://hello.com/there"));
196+ EXPECT_EQ(0, ssc_->port());
197+ EXPECT_EQ("http://hello.com/there", ssc_->url());
198+
199+ // https addr, no port
200+ EXPECT_NO_THROW(ssc_->reset_url("https://hello.com/there"));
201+ EXPECT_EQ(0, ssc_->port());
202+ EXPECT_NO_THROW(ssc_->reset_port(1500));
203+ EXPECT_EQ(1500, ssc_->port());
204+ EXPECT_EQ("https://hello.com/there", ssc_->url());
205+
206+ // http addr, with port (end)
207+ EXPECT_NO_THROW(ssc_->reset_url("http://hello.com:1500"));
208+ EXPECT_EQ(1500, ssc_->port());
209+ EXPECT_EQ("http://hello.com", ssc_->url());
210+
211+ // https addr, with port (end)
212+ EXPECT_NO_THROW(ssc_->reset_url("https://hello.com:1500"));
213+ EXPECT_EQ(1500, ssc_->port());
214+ EXPECT_NO_THROW(ssc_->reset_port(2000));
215+ EXPECT_EQ(2000, ssc_->port());
216+ EXPECT_EQ("https://hello.com", ssc_->url());
217+
218+ // http addr, with port (mid)
219+ EXPECT_NO_THROW(ssc_->reset_url("http://hello.com:1500/there"));
220+ EXPECT_EQ(1500, ssc_->port());
221+ EXPECT_EQ("http://hello.com/there", ssc_->url());
222+
223+ // https addr, with port (mid)
224+ EXPECT_NO_THROW(ssc_->reset_url("https://hello.com:1500/there"));
225+ EXPECT_EQ(1500, ssc_->port());
226+ EXPECT_EQ("https://hello.com/there", ssc_->url());
227+
228+ // https addr, with broken port
229+ EXPECT_THROW(ssc_->reset_url("https://hello.com:15d00/there"), std::exception);
230+ EXPECT_EQ(0, ssc_->port());
231+ EXPECT_EQ("", ssc_->url());
232+
233+ // https addr, with floating colon
234+ EXPECT_THROW(ssc_->reset_url("https://hello.com:1500/the:re"), std::exception);
235+ EXPECT_EQ(0, ssc_->port());
236+ EXPECT_NO_THROW(ssc_->reset_port(1500));
237+ EXPECT_EQ(1500, ssc_->port());
238+ EXPECT_EQ("", ssc_->url());
239+}
240+
241+TEST_F(SmartScopesClientTest, url_parsing_env_var)
242+{
243+ // check initial values to be expected test values
244+ EXPECT_EQ(server_.port_, ssc_->port());
245+ EXPECT_EQ("http://127.0.0.1", ssc_->url());
246+
247+ // empty the environment var (in case there already is one set)
248+ std::string server_url_env = "SMART_SCOPES_SERVER=";
249+ ::putenv(const_cast<char*>(server_url_env.c_str()));
250+
251+ // reset url and check that we now have contant values
252+ EXPECT_NO_THROW(ssc_->reset_url());
253+ EXPECT_EQ(0, ssc_->port());
254+ EXPECT_EQ("https://productsearch.ubuntu.com/smartscopes/v2", ssc_->url());
255+
256+ // env var, http addr, no port
257+ server_url_env = "SMART_SCOPES_SERVER=http://hello.com/there";
258+ ::putenv(const_cast<char*>(server_url_env.c_str()));
259+
260+ EXPECT_NO_THROW(ssc_->reset_url());
261+ EXPECT_EQ(0, ssc_->port());
262+ EXPECT_EQ("http://hello.com/there", ssc_->url());
263+
264+ // env var, https addr, no port
265+ server_url_env = "SMART_SCOPES_SERVER=https://hello.com/there";
266+ ::putenv(const_cast<char*>(server_url_env.c_str()));
267+
268+ EXPECT_NO_THROW(ssc_->reset_url());
269+ EXPECT_EQ(0, ssc_->port());
270+ EXPECT_NO_THROW(ssc_->reset_port(1500));
271+ EXPECT_EQ(1500, ssc_->port());
272+ EXPECT_EQ("https://hello.com/there", ssc_->url());
273+
274+ // env var, http addr, with port (end)
275+ server_url_env = "SMART_SCOPES_SERVER=http://hello.com:1500";
276+ ::putenv(const_cast<char*>(server_url_env.c_str()));
277+
278+ EXPECT_NO_THROW(ssc_->reset_url());
279+ EXPECT_EQ(1500, ssc_->port());
280+ EXPECT_EQ("http://hello.com", ssc_->url());
281+
282+ // env var, https addr, with port (end)
283+ server_url_env = "SMART_SCOPES_SERVER=https://hello.com:1500";
284+ ::putenv(const_cast<char*>(server_url_env.c_str()));
285+
286+ EXPECT_NO_THROW(ssc_->reset_url());
287+ EXPECT_EQ(1500, ssc_->port());
288+ EXPECT_NO_THROW(ssc_->reset_port(2000));
289+ EXPECT_EQ(2000, ssc_->port());
290+ EXPECT_EQ("https://hello.com", ssc_->url());
291+
292+ // env var, http addr, with port (mid)
293+ server_url_env = "SMART_SCOPES_SERVER=http://hello.com:1500/there";
294+ ::putenv(const_cast<char*>(server_url_env.c_str()));
295+
296+ EXPECT_NO_THROW(ssc_->reset_url());
297+ EXPECT_EQ(1500, ssc_->port());
298+ EXPECT_EQ("http://hello.com/there", ssc_->url());
299+
300+ // env var, https addr, with port (mid)
301+ server_url_env = "SMART_SCOPES_SERVER=https://hello.com:1500/there";
302+ ::putenv(const_cast<char*>(server_url_env.c_str()));
303+
304+ EXPECT_NO_THROW(ssc_->reset_url());
305+ EXPECT_EQ(1500, ssc_->port());
306+ EXPECT_EQ("https://hello.com/there", ssc_->url());
307+
308+ // env var, https addr, with broken port
309+ server_url_env = "SMART_SCOPES_SERVER=https://hello.com:15d00/there";
310+ ::putenv(const_cast<char*>(server_url_env.c_str()));
311+
312+ EXPECT_THROW(ssc_->reset_url(), std::exception);
313+ EXPECT_EQ(0, ssc_->port());
314+ EXPECT_EQ("", ssc_->url());
315+
316+ // env var, https addr, with floating colon
317+ server_url_env = "SMART_SCOPES_SERVER=https://hello.com:1500/the:re";
318+ ::putenv(const_cast<char*>(server_url_env.c_str()));
319+
320+ EXPECT_THROW(ssc_->reset_url(), std::exception);
321+ EXPECT_EQ(0, ssc_->port());
322+ EXPECT_NO_THROW(ssc_->reset_port(1500));
323+ EXPECT_EQ(1500, ssc_->port());
324+ EXPECT_EQ("", ssc_->url());
325+}
326+
327 } // namespace
328
329=== modified file 'test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp'
330--- test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp 2014-02-20 19:19:25 +0000
331+++ test/gtest/scopes/internal/smartscopes/smartscopesproxy/smartscopesproxy_test.cpp 2014-02-25 09:40:06 +0000
332@@ -41,8 +41,6 @@
333 using namespace unity::scopes::internal::smartscopes;
334 using namespace unity::test::scopes::internal::smartscopes;
335
336-///! TODO: more tests
337-
338 namespace
339 {
340
341
342=== modified file 'valgrind-suppress'
343--- valgrind-suppress 2014-02-03 11:29:47 +0000
344+++ valgrind-suppress 2014-02-25 09:40:06 +0000
345@@ -72,7 +72,6 @@
346 obj:/lib/x86_64-linux-gnu/ld-2.18.so
347 }
348
349-
350 # False positives for memory leaks in Qt
351
352 {
353@@ -99,7 +98,6 @@
354 ...
355 fun:_ZN21QNetworkAccessManager13createRequestENS_9OperationERK15QNetworkRequestP9QIODevice
356 fun:_ZN21QNetworkAccessManager3getERK15QNetworkRequest
357- fun:_ZN18HttpClientQtThread3runEv
358 }
359
360 {
361@@ -109,3 +107,21 @@
362 ...
363 fun:_ZNK14QFactoryLoader8instanceEi
364 }
365+
366+# Bogus "invalid read" reports for ::putenv and ::genenv
367+
368+{
369+ putenv_read
370+ Memcheck:Addr1
371+ ...
372+ fun:putenv
373+ ...
374+}
375+
376+{
377+ getenv_read
378+ Memcheck:Addr2
379+ ...
380+ fun:getenv
381+ ...
382+}

Subscribers

People subscribed via source and target branches

to all changes: