lp:~aaron-whitehouse/duplicity/08-unicode

Created by Aaron Whitehouse

Branch for conversion of strings to unicode/bytes. This should make the transition to Python 3 significantly easier.

Current status: Conducting final clean-up/checks pre-merge -- I believe this branch is all working correctly.

To get the changes merged back into trunk as soon as possible, this branch focuses only on the strings on the 'local' side of duplicity, i.e. those involved in interpreting the selection functions from the command-line arguments or include/exclude lists, and in reading each file from the local filesystem to compare against those selection functions.

Patches/Comments welcome - lists@whitehouse . kiwi . nz

Notes:
* This branch introduces Path.uc_name, which holds the filename as unicode, alongside Path.name, which holds the filename as bytes (i.e. each Path will have both a .name and a .uc_name). Both are created for every Path object, and code throughout duplicity should move to using .uc_name where it needs the unicode name and .name where it needs the bytes name, rather than converting between the two. Valid filenames on disk in Linux can be invalid unicode, so for Python 2 support it is important to keep the original bytes in .name and use them directly rather than re-encoding the unicode version. That said, all internal processing (e.g. string comparison) should use the unicode version. From Python 3.2, it looks as though this has been solved properly: os.fsencode/os.fsdecode use the 'surrogateescape' error handler (https://www.python.org/dev/peps/pep-0383/) to ensure you can always recreate the original bytes filename from the internal unicode representation. As I understand it, that should allow the more standard "decode/encode at the boundaries, use unicode everywhere internally" approach.
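
The round trip described in the note above can be illustrated with a minimal Python 3 sketch. It is not code from this branch: the helper names to_unicode and to_bytes and the sample filename are hypothetical, and the encoding is fixed to UTF-8 so the result does not depend on the local filesystem encoding (os.fsdecode/os.fsencode choose the encoding and error handler themselves).

```python
# A filename that is valid on a Linux filesystem but is not valid UTF-8
# (this is "café.txt" encoded as Latin-1). Hypothetical sample data.
raw_name = b"caf\xe9.txt"


def to_unicode(name_bytes, encoding="utf-8"):
    """Decode filename bytes; undecodable bytes become lone surrogates (PEP 383)."""
    return name_bytes.decode(encoding, "surrogateescape")


def to_bytes(name_unicode, encoding="utf-8"):
    """Encode back to bytes; escaped surrogates turn back into the original bytes."""
    return name_unicode.encode(encoding, "surrogateescape")


# Strict decoding rejects the name, which is why a plain .decode() is not enough.
try:
    raw_name.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# With surrogateescape the name decodes, and the original bytes are recoverable.
uc_name = to_unicode(raw_name)
round_tripped = to_bytes(uc_name)
```

Under Python 2 there is no equivalent lossless round trip by default, which is why the branch keeps the original bytes in .name next to .uc_name instead of deriving one from the other.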

Get this branch:
bzr branch lp:~aaron-whitehouse/duplicity/08-unicode

Branch information

Owner:
Aaron Whitehouse
Project:
Duplicity
Status:
Merged

Recent revisions

1226. By Aaron Whitehouse

Various code tidy-ups pre submitting for merge. None should change behaviour.

1225. By Aaron Whitehouse

* Change fsdecode to use globals.fsencoding
* Add 'ANSI_X3.4-1968' to the list of fsencodings that globals.fsencode treats as probably UTF-8
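
The fallback mentioned in revision 1225 might look roughly like the sketch below. This is an assumption about the general idea, not the code in duplicity's globals module: the function name effective_fsencoding and the set PROBABLY_UTF8 are hypothetical, and only 'ANSI_X3.4-1968' is taken from the commit message ('ascii' is added here purely for illustration).

```python
import sys

# Hypothetical set of reported encodings that usually mean "no locale was
# configured" rather than "the filesystem really is ASCII". Only
# 'ANSI_X3.4-1968' comes from the commit message; 'ascii' is an assumption.
PROBABLY_UTF8 = {"ascii", "ansi_x3.4-1968"}


def effective_fsencoding(reported=None):
    """Return the encoding to use for filenames, preferring UTF-8 when the
    reported filesystem encoding is one of the 'probably UTF-8' names."""
    if reported is None:
        reported = sys.getfilesystemencoding() or "utf-8"
    if reported.lower() in PROBABLY_UTF8:
        return "utf-8"
    return reported
```

For example, effective_fsencoding("ANSI_X3.4-1968") returns "utf-8", while a genuinely different encoding such as "iso-8859-1" is passed through unchanged.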

1223. By Aaron Whitehouse

Sync with trunk.

1222. By Aaron Whitehouse

* Merge with trunk
* Add back in testing/testfiles.tar.gz (accidentally made unversioned in previous commit when doing manual merge of archives)
* PEP8 error (from trunk)

1221. By Aaron Whitehouse

Merge with trunk (including manual merge of testfiles.tar.gz to add select-unicode to the version without the non-UTF8 file)

1220. By Aaron Whitehouse

Merge with trunk (and fix PEP error in backends/pydrivebackend.py)

1219. By Aaron Whitehouse

Remove conditional pexpect in testing/functional/__init__.py -- while the commented-out text is the nicer approach in versions after pexpect 4.0, we need to support earlier versions at this stage and a single code path is simpler.

1218. By Aaron Whitehouse

Merge with trunk

1217. By Aaron Whitehouse

Merge with trunk.

Branch metadata

Branch format:
Branch format 7
Repository format:
Bazaar repository format 2a (needs bzr 1.16 or later)
Stacked on:
lp:~duplicity-team/duplicity/0.8-series