Merge lp:~vila/bzr/472161-ftp-utf8 into lp:bzr

Proposed by Vincent Ladeuil on 2010-01-07
Status: Merged
Approved by: John A Meinel on 2010-01-07
Approved revision: 4936
Merged at revision: not available
Proposed branch: lp:~vila/bzr/472161-ftp-utf8
Merge into: lp:bzr
Diff against target: 108 lines (+17/-16)
6 files modified
NEWS (+3/-0)
bzrlib/osutils.py (+5/-3)
bzrlib/tests/ftp_server/medusa_based.py (+2/-0)
bzrlib/tests/ftp_server/pyftpdlib_based.py (+3/-2)
bzrlib/tests/per_transport.py (+4/-0)
bzrlib/transport/ftp/__init__.py (+0/-11)
To merge this branch: bzr merge lp:~vila/bzr/472161-ftp-utf8
Reviewer Review Type Date Requested Status
John A Meinel 2010-01-07 Approve on 2010-01-07
Review via email: mp+16967@code.launchpad.net
To post a comment you must log in.
Vincent Ladeuil (vila) wrote :

This fix bug #472161 by... simply deleting _remote_path().

It also makes osutils.fancy_rename() more robust against uses with true unicode paths.

I'm not super proud of the trick to disable the unicode tests against medusa,
but I didn't want to make more involved changes for an ftp server that we
don't support with python >= 2.6.

I finally found RFC3640 which clearly states that Unicode paths are to be supported
as utf8-encoded by default.

The RFC mentions the FEAT command to ensure that the server supports UTF8 but I didn't
implement it to avoid the additional roundtrip (we'll find soon enough if UTF8 is not supported).

Feedback and real-world tests welcome !

John A Meinel (jameinel) wrote :

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vincent Ladeuil wrote:
> Vincent Ladeuil has proposed merging lp:~vila/bzr/472161-ftp-utf8 into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
>
>
> This fix bug #472161 by... simply deleting _remote_path().
>
> It also makes osutils.fancy_rename() more robust against uses with true unicode paths.
>
> I'm not super proud of the trick to disable the unicode tests against medusa,
> but I didn't want to make more involved changes for an ftp server that we
> don't support with python >= 2.6.
>
> I finally found RFC3640 which clearly states that Unicode paths are to be supported
> as utf8-encoded by default.
>
> The RFC mentions the FEAT command to ensure that the server supports UTF8 but I didn't
> implement it to avoid the additional roundtrip (we'll find soon enough if UTF8 is not supported).
>
> Feedback and real-world tests welcome !
>

 review: approve
 merge: approve

I did not test this manually, but the code looks good and your
descriptions seem fine.

I think this is something that really just needs real-world testing. FTP
servers are pretty notorious for handling edge cases differently.

+ no_unicode_support = getattr(self._server,
'_no_unicode_support', False)
+ if no_unicode_support:
+ raise TestSkipped("test server cannot handle unicode paths")
+

^- Arguably this would be better seeing something actually fail and
raising knownFailure. At least, I prefer XFAIL because then when we fix
stuff, we know which tests should be updated.

I probably wouldn't make it an '_' attribute. Instead just do
"no_unicode_support".

But what you have is fine if you just want to land it.

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAktGAL0ACgkQJdeBCYSNAAOBXQCgmykxzpdWWyWBGROuIIiVm3p6
EEQAnA+UwN8IcUmrAiPG4v5SC2Lk8KL7
=yNOA
-----END PGP SIGNATURE-----

review: Approve
lp:~vila/bzr/472161-ftp-utf8 updated on 2010-01-07
4937. By Vincent Ladeuil on 2010-01-07

Fixed as per jam's review.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2010-01-07 06:34:14 +0000
3+++ NEWS 2010-01-07 14:15:28 +0000
4@@ -61,6 +61,9 @@
5 returns ``EINTR`` by calling ``PyErr_CheckSignals``. This affected the
6 optional ``_readdir_pyx`` extension. (Andrew Bennetts, #495023)
7
8+* FTP transports support Unicode paths by encoding/decoding them as utf8.
9+ (Vincent Ladeuil, #472161)
10+
11 * Give a clearer message if the lockdir disappears after being apparently
12 successfully taken. (Martin Pool, #498378)
13
14
15=== modified file 'bzrlib/osutils.py'
16--- bzrlib/osutils.py 2009-12-23 00:15:34 +0000
17+++ bzrlib/osutils.py 2010-01-07 14:15:27 +0000
18@@ -202,13 +202,15 @@
19 :param old: The old path, to rename from
20 :param new: The new path, to rename to
21 :param rename_func: The potentially non-atomic rename function
22- :param unlink_func: A way to delete the target file if the full rename succeeds
23+ :param unlink_func: A way to delete the target file if the full rename
24+ succeeds
25 """
26-
27+ new = safe_unicode(new)
28 # sftp rename doesn't allow overwriting, so play tricks:
29 base = os.path.basename(new)
30 dirname = os.path.dirname(new)
31- tmp_name = u'tmp.%s.%.9f.%d.%s' % (base, time.time(), os.getpid(), rand_chars(10))
32+ tmp_name = u'tmp.%s.%.9f.%d.%s' % (base, time.time(),
33+ os.getpid(), rand_chars(10))
34 tmp_name = pathjoin(dirname, tmp_name)
35
36 # Rename the file out of the way, but keep track if it didn't exist
37
38=== modified file 'bzrlib/tests/ftp_server/medusa_based.py'
39--- bzrlib/tests/ftp_server/medusa_based.py 2009-03-23 14:59:43 +0000
40+++ bzrlib/tests/ftp_server/medusa_based.py 2010-01-07 14:15:27 +0000
41@@ -213,6 +213,8 @@
42 class FTPTestServer(transport.Server):
43 """Common code for FTP server facilities."""
44
45+ _no_unicode_support = True
46+
47 def __init__(self):
48 self._root = None
49 self._ftp_server = None
50
51=== modified file 'bzrlib/tests/ftp_server/pyftpdlib_based.py'
52--- bzrlib/tests/ftp_server/pyftpdlib_based.py 2009-10-06 08:24:14 +0000
53+++ bzrlib/tests/ftp_server/pyftpdlib_based.py 2010-01-07 14:15:27 +0000
54@@ -50,14 +50,15 @@
55
56 def listdir(self, path):
57 """List the content of a directory."""
58- # FIXME: need tests with unicode paths
59 return [osutils.safe_utf8(s) for s in os.listdir(path)]
60
61 def fs2ftp(self, fspath):
62 p = ftpserver.AbstractedFS.fs2ftp(self, fspath)
63- # FIXME: need tests with unicode paths
64 return osutils.safe_utf8(p)
65
66+ def ftp2fs(self, ftppath):
67+ p = osutils.safe_unicode(ftppath)
68+ return ftpserver.AbstractedFS.ftp2fs(self, p)
69
70 class BzrConformingFTPHandler(ftpserver.FTPHandler):
71
72
73=== modified file 'bzrlib/tests/per_transport.py'
74--- bzrlib/tests/per_transport.py 2009-10-05 12:50:34 +0000
75+++ bzrlib/tests/per_transport.py 2010-01-07 14:15:27 +0000
76@@ -1497,6 +1497,10 @@
77 u'\u65e5', # Kanji person
78 ]
79
80+ no_unicode_support = getattr(self._server, '_no_unicode_support', False)
81+ if no_unicode_support:
82+ raise TestSkipped("test server cannot handle unicode paths")
83+
84 try:
85 self.build_tree(files, transport=t, line_endings='binary')
86 except UnicodeError:
87
88=== modified file 'bzrlib/transport/ftp/__init__.py'
89--- bzrlib/transport/ftp/__init__.py 2009-10-06 08:24:14 +0000
90+++ bzrlib/transport/ftp/__init__.py 2010-01-07 14:15:27 +0000
91@@ -205,17 +205,6 @@
92 #raise TransportError(msg='Error for path: %s' % (path,), orig_error=e)
93 raise
94
95- def _remote_path(self, relpath):
96- # XXX: It seems that ftplib does not handle Unicode paths
97- # at the same time, medusa won't handle utf8 paths So if
98- # we .encode(utf8) here (see ConnectedTransport
99- # implementation), then we get a Server failure. while
100- # if we use str(), we get a UnicodeError, and the test
101- # suite just skips testing UnicodePaths.
102- relative = str(urlutils.unescape(relpath))
103- remote_path = self._combine_paths(self._path, relative)
104- return remote_path
105-
106 def has(self, relpath):
107 """Does the target location exist?"""
108 # FIXME jam 20060516 We *do* ask about directories in the test suite