Merge lp:~danielturcanu/zorba/mytrunk into lp:zorba
- mytrunk
- Merge into trunk
Status: | Superseded |
---|---|
Proposed branch: | lp:~danielturcanu/zorba/mytrunk |
Merge into: | lp:zorba |
Diff against target: |
176 lines (+38/-11) 13 files modified
ChangeLog (+2/-0) src/diagnostics/dict_XX_cpp.xq (+1/-0) src/diagnostics/pregenerated/dict_en.cpp (+1/-0) src/runtime/strings/strings_impl.cpp (+1/-1) src/util/utf8_util_base.h (+1/-2) src/zorbatypes/URI.cpp (+19/-8) src/zorbatypes/URI.h (+2/-0) test/rbkt/ExpQueryResults/zorba/resolving/path_to_uri.xml.res (+1/-0) test/rbkt/ExpQueryResults/zorba/string/Regex/regex_a4.xml.res (+1/-0) test/rbkt/Queries/zorba/resolving/path_to_uri.spec (+4/-0) test/rbkt/Queries/zorba/resolving/path_to_uri.xq (+3/-0) test/rbkt/Queries/zorba/resolving/test.xml (+1/-0) test/rbkt/Queries/zorba/string/Regex/regex_a4.xq (+1/-0) |
To merge this branch: | bzr merge lp:~danielturcanu/zorba/mytrunk |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Chris Hillery | Disapprove | ||
Matthias Brantner | Pending | ||
Review via email: mp+78253@code.launchpad.net |
This proposal supersedes a proposal from 2011-10-04.
Commit message
Description of the change
Matthias Brantner (matthias-brantner) wrote : Posted in a previous version of this proposal | # |
Matthias Brantner (matthias-brantner) wrote : Posted in a previous version of this proposal | # |
See comment recently made.
Chris Hillery (ceejatec) wrote : | # |
This isn't a good fix. The "path" of a URI isn't a filesystem path; it's a specific part of a URI (see RFC 3986). The path portion of a URI cannot be "absolute" or "relative". So it doesn't make sense to have this functionality in the URI class.
fn:resolve-uri() is not supposed to accept a filesystem path, so the test case path-to-uri.xq is not valid. If it DOES work on Unix as written, that's probably a different bug.
I am commenting specifically on the fix to bug 868329. I have nothing to say about the fix for bug 868325 except that it should probably be in a separate commit.
Daniel Turcanu (danielturcanu) wrote : | # |
For Windows, the absolute path for URI is still in the form C:/, only that it looks like C3A/.
At least for file:// scheme. I guess I should add a check only for that scheme.
About absolute and relative paths in URI, think about what fn:resolve-uri works with. It can receive a relative path, or absolute path or entire URI. When receiving absolute paths, it doesn't work now in Windows for file://.
Chris Hillery (ceejatec) wrote : | # |
The path component of a file: URI on Windows can contain C:/ (or C%3A/), but that's not directly relevant.
fn:resolve-uri() works with exactly URIs. It is not defined to accept filesystem paths. Thus, if the input is "C:/foo", it will interpret that strictly as a URI (and most likely fail since the "C:" scheme probably isn't known).
fn:doc() is an explicit exception in Zorba - it is defined (by us) to accept either a URI *or* a filesystem path. This is technically not W3C-spec compliant, but it was decided that it was necessary for a good customer experience. But fn:resolve-uri() does not have the same "magic". If you want, you could maybe make a case that it should have the same magic, but even if we agreed on that, it should use the same code from inside the implementation of fn:doc() to achieve it.
Daniel Turcanu (danielturcanu) wrote : | # |
Ok, I guess you are right. I reread again the URI spec and it doesn't say anything about Windows paths.
But how to transform Windows filepaths into file URI in a OS independent way?
There is another function fn:encode-for-uri, which is supposed to encode a path to be used as uri, but that isn't prepared for Windows either.
fn:encode-
I would expect /C%3A/path1/
I think the problem is in this function, it should be made to work differently for Windows.
Chris Hillery (ceejatec) wrote : | # |
The W3C doesn't care about filesystems. :) encode-for-uri() just escapes illegal characters in URIs.
Fortunately, EXPath cares: the File module has functions to go back and forth. file:path-to-uri() converts a filesystem path to a file: URI, and file:path-to-uri() (which is a bit oddly named) does the reverse.
Note that those functions don't work "in an OS independent way" as you say - they work differently depending on the platform running the script. So file:path-
Daniel Turcanu (danielturcanu) wrote : | # |
Maybe W3C doesn't care about file systems, but URIs do.
At least http, ftp and file all work with file system paths. We actually work with URL, not URI, we don't have general URIs so far.
I tested now file:path-to-uri and it doesn't do what I want: if I give it an absolute path, then it returns a complete URI, and if I give it a relative path, then it also returns a complete URI, with the path relative to my rbkt dir in the build directory. This seems like a wrong thing. If my rbkt dir is hardcoded in the file module, then it won't work for other users if I deploy the file module.
I still think that this is a job for fn:encode-for-uri. I need a function to encode a file system path into a path for URI. So if it is absolute path, make it absolute path for URI, not a full URI, and if it is relative path, make it relative path for URI.
And then this path I can apply to fn:resolve-uri.
Chris Hillery (ceejatec) wrote : | # |
HTTP and FTP URIs certainly don't work with filesystem paths. File: URIs only do by convention, and there are (somewhat hand-wavy) rules for mapping filesystem paths to file: URIs. But a filesystem path is-not-a file: URI, and so a filesystem path is never an appropriate argument for fn:resolve-uri() or in fact for any other function defined by XQuery F&O.
Also, URLs are no different than URIs except that URLs are expected to directly point to a resource, where a generic URI may simply name the resource. But they follow the same rules. Neither of them have any relationship to filesystem paths. I'm not saying this is good or bad, it's simply true. And Zorba most certainly works with full URIs, not just URLs.
file:path-to-uri() is documented to resolve relative filesystem paths using the current working directory as the base. So you're seeing your RBKT dir because that's the current working directory for testdriver, I suspect. Again, I'm not saying this is good or bad, it's simply true. I would certainly support you if you wanted to say that there should be a two-argument form of this function that allowed you to supply a different base URI or path, but you'll need to run that through the EXPath process (whatever that is).
Finally, regarding fn:encode-
http://
It only escapes *characters* in a string which are illegal in the path segment of a URI. It is not intended in any way to convert a filesystem path to a valid URI (relative or otherwise).
Daniel Turcanu (danielturcanu) wrote : | # |
Please stop giving arguments from the spec. I also know the spec.
I am trying to come with a solution to the problem. This problem cannot be solved by sticking to the spec.
So I want like this: have a variable that contains an absolute or relative file system path, and resolve it to a base absolute URI. There should be no if(is_absolute_
Maybe we shoud add another function in the file module, like file:path-
Chris Hillery (ceejatec) wrote : | # |
Sorry, but we implement a spec. You've suggested at least two changes which would break our compliance with that spec, so I don't have much choice but to quote the spec back at you.
It sounds like what you want would be best served by an extension to the Expath file module. As I said in my previous note, I'd certainly support you if you wanted to propose adding the two-argument form of file:path-to-uri(). It would be easy to implement, since we already have this functionality in the C++ API (zorba:
Daniel Turcanu (danielturcanu) wrote : | # |
So you propose a function like file:resolve-
I guess it's fine too.
- 10473. By Daniel Turcanu
-
Reverted changes in URI
- 10474. By Daniel Turcanu
-
Updated to trunk
- 10475. By Daniel Turcanu
-
Update to trunk
- 10476. By Daniel Turcanu
-
Update to trunk
- 10477. By Daniel Turcanu
-
Reverted change to http_client
- 10478. By Daniel Turcanu
-
Update to trunk
- 10479. By Daniel Turcanu
-
Deleted blank uri test
- 10480. By Daniel Turcanu
-
Update to trunk
- 10481. By Daniel Turcanu
-
Update to trunk
- 10482. By Daniel Turcanu
-
Update to trunk
Unmerged revisions
Preview Diff
1 | === modified file 'ChangeLog' | |||
2 | --- ChangeLog 2011-10-04 18:22:59 +0000 | |||
3 | +++ ChangeLog 2011-10-05 13:12:05 +0000 | |||
4 | @@ -41,6 +41,8 @@ | |||
5 | 41 | * Fixed bug #863730 (static delete-node* functions don't raise ZDDY0012) | 41 | * Fixed bug #863730 (static delete-node* functions don't raise ZDDY0012) |
6 | 42 | * Implemented the probe-index-range-value for general indexes | 42 | * Implemented the probe-index-range-value for general indexes |
7 | 43 | * Fixed bug #867662 ("nullptr" warning) | 43 | * Fixed bug #867662 ("nullptr" warning) |
8 | 44 | * Fixed bug #868329 (URI resolver fails on Windows with absolute paths) | ||
9 | 45 | * Fixed bug #868325 (fn:analyze-string fails with some recursive subgroups) | ||
10 | 44 | 46 | ||
11 | 45 | version 2.0.1 | 47 | version 2.0.1 |
12 | 46 | 48 | ||
13 | 47 | 49 | ||
14 | === modified file 'src/diagnostics/dict_XX_cpp.xq' | |||
15 | --- src/diagnostics/dict_XX_cpp.xq 2011-08-05 02:21:55 +0000 | |||
16 | +++ src/diagnostics/dict_XX_cpp.xq 2011-10-05 13:12:05 +0000 | |||
17 | @@ -64,6 +64,7 @@ | |||
18 | 64 | return string-join( | 64 | return string-join( |
19 | 65 | ( util:copyright(), | 65 | ( util:copyright(), |
20 | 66 | '#include "stdafx.h"', | 66 | '#include "stdafx.h"', |
21 | 67 | '#include "zorba/config.h"', | ||
22 | 67 | '#include "diagnostics/dict_impl.h"', | 68 | '#include "diagnostics/dict_impl.h"', |
23 | 68 | '', | 69 | '', |
24 | 69 | 'namespace zorba {', | 70 | 'namespace zorba {', |
25 | 70 | 71 | ||
26 | === modified file 'src/diagnostics/pregenerated/dict_en.cpp' | |||
27 | --- src/diagnostics/pregenerated/dict_en.cpp 2011-10-04 05:28:07 +0000 | |||
28 | +++ src/diagnostics/pregenerated/dict_en.cpp 2011-10-05 13:12:05 +0000 | |||
29 | @@ -20,6 +20,7 @@ | |||
30 | 20 | */ | 20 | */ |
31 | 21 | 21 | ||
32 | 22 | #include "stdafx.h" | 22 | #include "stdafx.h" |
33 | 23 | #include "zorba/config.h" | ||
34 | 23 | #include "diagnostics/dict_impl.h" | 24 | #include "diagnostics/dict_impl.h" |
35 | 24 | 25 | ||
36 | 25 | namespace zorba { | 26 | namespace zorba { |
37 | 26 | 27 | ||
38 | === modified file 'src/runtime/strings/strings_impl.cpp' | |||
39 | --- src/runtime/strings/strings_impl.cpp 2011-08-10 18:58:11 +0000 | |||
40 | +++ src/runtime/strings/strings_impl.cpp 2011-10-05 13:12:05 +0000 | |||
41 | @@ -1688,7 +1688,7 @@ | |||
42 | 1688 | GENV_ITEMFACTORY->createString(strid_item, zstrid); | 1688 | GENV_ITEMFACTORY->createString(strid_item, zstrid); |
43 | 1689 | store::Item_t id_attrib_item; | 1689 | store::Item_t id_attrib_item; |
44 | 1690 | GENV_ITEMFACTORY->createAttributeNode(id_attrib_item, group_elem.getp(), nr_attrib_name, untyped_type_name, strid_item); | 1690 | GENV_ITEMFACTORY->createAttributeNode(id_attrib_item, group_elem.getp(), nr_attrib_name, untyped_type_name, strid_item); |
46 | 1691 | if(match_startg < 0) | 1691 | if((match_startg < 0) || (match_startg < match_endgood)) |
47 | 1692 | continue; | 1692 | continue; |
48 | 1693 | match_endgood = match_endg; | 1693 | match_endgood = match_endg; |
49 | 1694 | if((i+1)<nr_pattern_groups) | 1694 | if((i+1)<nr_pattern_groups) |
50 | 1695 | 1695 | ||
51 | === modified file 'src/util/utf8_util_base.h' | |||
52 | --- src/util/utf8_util_base.h 2011-07-17 20:05:49 +0000 | |||
53 | +++ src/util/utf8_util_base.h 2011-10-05 13:12:05 +0000 | |||
54 | @@ -148,8 +148,7 @@ | |||
55 | 148 | * @return Returns the Unicode code-point of the next character. | 148 | * @return Returns the Unicode code-point of the next character. |
56 | 149 | */ | 149 | */ |
57 | 150 | template<class OctetIterator> | 150 | template<class OctetIterator> |
60 | 151 | ZORBA_DLL_PUBLIC | 151 | ZORBA_DLL_PUBLIC unicode::code_point next_char( OctetIterator &i ); |
59 | 152 | unicode::code_point next_char( OctetIterator &i ); | ||
61 | 153 | 152 | ||
62 | 154 | /** | 153 | /** |
63 | 155 | * Decodes the previous Unicode character. | 154 | * Decodes the previous Unicode character. |
64 | 156 | 155 | ||
65 | === modified file 'src/zorbatypes/URI.cpp' | |||
66 | --- src/zorbatypes/URI.cpp 2011-06-24 23:00:33 +0000 | |||
67 | +++ src/zorbatypes/URI.cpp 2011-10-05 13:12:05 +0000 | |||
68 | @@ -1191,7 +1191,16 @@ | |||
69 | 1191 | return is_set(Scheme) && !theScheme.empty(); | 1191 | return is_set(Scheme) && !theScheme.empty(); |
70 | 1192 | } | 1192 | } |
71 | 1193 | 1193 | ||
73 | 1194 | 1194 | bool URI::is_absolute_path(zstring &thePath) | |
74 | 1195 | { | ||
75 | 1196 | #ifndef WIN32 | ||
76 | 1197 | return thePath[0] == '/'; | ||
77 | 1198 | #else | ||
78 | 1199 | return (thePath.length() > 2) && | ||
79 | 1200 | ((thePath[1] == ':') || | ||
80 | 1201 | ((thePath[1] == '%') && (thePath[2] == '3') && (thePath[3] == 'A'))); | ||
81 | 1202 | #endif | ||
82 | 1203 | } | ||
83 | 1195 | 1204 | ||
84 | 1196 | /******************************************************************************* | 1205 | /******************************************************************************* |
85 | 1197 | 1206 | ||
86 | @@ -1331,7 +1340,7 @@ | |||
87 | 1331 | 1340 | ||
88 | 1332 | // If the path component begins with a slash character ("/"), then | 1341 | // If the path component begins with a slash character ("/"), then |
89 | 1333 | // the reference is an absolute-path and we skip to step 7. | 1342 | // the reference is an absolute-path and we skip to step 7. |
91 | 1334 | if ( (is_set(Path)) && (thePath[0] =='/') ) | 1343 | if ( is_set(Path) && is_absolute_path(thePath) ) |
92 | 1335 | { | 1344 | { |
93 | 1336 | invalidate_text(); | 1345 | invalidate_text(); |
94 | 1337 | return; | 1346 | return; |
95 | @@ -1342,12 +1351,14 @@ | |||
96 | 1342 | 1351 | ||
97 | 1343 | if ( base_uri->is_set(Path) ) | 1352 | if ( base_uri->is_set(Path) ) |
98 | 1344 | { | 1353 | { |
105 | 1345 | zstring::size_type last_slash = base_path.rfind("/"); | 1354 | if(!is_absolute_path(thePath)) |
106 | 1346 | if ( last_slash != zstring::npos ) | 1355 | { |
107 | 1347 | path = base_path.substr(0, last_slash+1); | 1356 | zstring::size_type last_slash = base_path.rfind("/"); |
108 | 1348 | // else | 1357 | if ( last_slash != zstring::npos ) |
109 | 1349 | // path = "/"; | 1358 | path = base_path.substr(0, last_slash+1); |
110 | 1350 | 1359 | // else | |
111 | 1360 | // path = "/"; | ||
112 | 1361 | } | ||
113 | 1351 | } | 1362 | } |
114 | 1352 | 1363 | ||
115 | 1353 | // 6b - append the relative URI path | 1364 | // 6b - append the relative URI path |
116 | 1354 | 1365 | ||
117 | === modified file 'src/zorbatypes/URI.h' | |||
118 | --- src/zorbatypes/URI.h 2011-06-14 17:26:33 +0000 | |||
119 | +++ src/zorbatypes/URI.h 2011-10-05 13:12:05 +0000 | |||
120 | @@ -190,6 +190,8 @@ | |||
121 | 190 | void unset_state(uint32_t s) const { theState &= ~s; } | 190 | void unset_state(uint32_t s) const { theState &= ~s; } |
122 | 191 | 191 | ||
123 | 192 | void invalidate_text() const; | 192 | void invalidate_text() const; |
124 | 193 | |||
125 | 194 | bool is_absolute_path(zstring &thePath); | ||
126 | 193 | }; | 195 | }; |
127 | 194 | 196 | ||
128 | 195 | 197 | ||
129 | 196 | 198 | ||
130 | === added file 'test/rbkt/ExpQueryResults/zorba/resolving/path_to_uri.xml.res' | |||
131 | --- test/rbkt/ExpQueryResults/zorba/resolving/path_to_uri.xml.res 1970-01-01 00:00:00 +0000 | |||
132 | +++ test/rbkt/ExpQueryResults/zorba/resolving/path_to_uri.xml.res 2011-10-05 13:12:05 +0000 | |||
133 | @@ -0,0 +1,1 @@ | |||
134 | 1 | <a/> | ||
135 | 0 | \ No newline at end of file | 2 | \ No newline at end of file |
136 | 1 | 3 | ||
137 | === added directory 'test/rbkt/ExpQueryResults/zorba/string/Regex' | |||
138 | === added file 'test/rbkt/ExpQueryResults/zorba/string/Regex/regex_a4.xml.res' | |||
139 | --- test/rbkt/ExpQueryResults/zorba/string/Regex/regex_a4.xml.res 1970-01-01 00:00:00 +0000 | |||
140 | +++ test/rbkt/ExpQueryResults/zorba/string/Regex/regex_a4.xml.res 2011-10-05 13:12:05 +0000 | |||
141 | @@ -0,0 +1,1 @@ | |||
142 | 1 | <fn:analyze-string-result xmlns:fn="http://www.w3.org/2005/xpath-functions"><fn:match>a<fn:group nr="1"><fn:group nr="2"/><fn:group nr="3">c</fn:group></fn:group></fn:match></fn:analyze-string-result> | ||
143 | 0 | \ No newline at end of file | 2 | \ No newline at end of file |
144 | 1 | 3 | ||
145 | === added file 'test/rbkt/Queries/zorba/resolving/path_to_uri.spec' | |||
146 | --- test/rbkt/Queries/zorba/resolving/path_to_uri.spec 1970-01-01 00:00:00 +0000 | |||
147 | +++ test/rbkt/Queries/zorba/resolving/path_to_uri.spec 2011-10-05 13:12:05 +0000 | |||
148 | @@ -0,0 +1,4 @@ | |||
149 | 1 | Args: | ||
150 | 2 | -x | ||
151 | 3 | input-context:=xs:string($RBKT_SRC_DIR/Queries/zorba/resolving/test.xml) | ||
152 | 4 | |||
153 | 0 | 5 | ||
154 | === added file 'test/rbkt/Queries/zorba/resolving/path_to_uri.xq' | |||
155 | --- test/rbkt/Queries/zorba/resolving/path_to_uri.xq 1970-01-01 00:00:00 +0000 | |||
156 | +++ test/rbkt/Queries/zorba/resolving/path_to_uri.xq 2011-10-05 13:12:05 +0000 | |||
157 | @@ -0,0 +1,3 @@ | |||
158 | 1 | declare variable $input-context as xs:string external; | ||
159 | 2 | |||
160 | 3 | fn:doc(fn:resolve-uri($input-context)) | ||
161 | 0 | \ No newline at end of file | 4 | \ No newline at end of file |
162 | 1 | 5 | ||
163 | === added file 'test/rbkt/Queries/zorba/resolving/test.xml' | |||
164 | --- test/rbkt/Queries/zorba/resolving/test.xml 1970-01-01 00:00:00 +0000 | |||
165 | +++ test/rbkt/Queries/zorba/resolving/test.xml 2011-10-05 13:12:05 +0000 | |||
166 | @@ -0,0 +1,1 @@ | |||
167 | 1 | <a/> | ||
168 | 0 | \ No newline at end of file | 2 | \ No newline at end of file |
169 | 1 | 3 | ||
170 | === added directory 'test/rbkt/Queries/zorba/string/Regex' | |||
171 | === added file 'test/rbkt/Queries/zorba/string/Regex/regex_a4.xq' | |||
172 | --- test/rbkt/Queries/zorba/string/Regex/regex_a4.xq 1970-01-01 00:00:00 +0000 | |||
173 | +++ test/rbkt/Queries/zorba/string/Regex/regex_a4.xq 2011-10-05 13:12:05 +0000 | |||
174 | @@ -0,0 +1,1 @@ | |||
175 | 1 | fn:analyze-string("ac", "((a)|(c))+") | ||
176 | 0 | \ No newline at end of file | 2 | \ No newline at end of file |
Please open two bugs for this and link the bug numbers from the ChangeLog. Also, please describe (or mention the bugs) in the commit message. We already went back and forth with URI fixes plenty of times and it's very confusing if it's unclear which scenario your commit is supposed to fix.