Merge lp:~gz/bzr-builddeb/nonascii_import_dir_508258 into lp:bzr-builddeb
Status: | Merged | ||||
---|---|---|---|---|---|
Approved by: | Jelmer Vernooij | ||||
Approved revision: | 638 | ||||
Merged at revision: | 637 | ||||
Proposed branch: | lp:~gz/bzr-builddeb/nonascii_import_dir_508258 | ||||
Merge into: | lp:bzr-builddeb | ||||
Diff against target: |
108 lines (+28/-22) 3 files modified
bzrtools_import.py (+6/-16) tests/__init__.py (+8/-2) tests/test_bzrtools_import.py (+14/-4) |
||||
To merge this branch: | bzr merge lp:~gz/bzr-builddeb/nonascii_import_dir_508258 | ||||
Related bugs: |
|
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Jelmer Vernooij | code | Approve | |
Review via email: mp+80243@code.launchpad.net |
Description of the change
This branch should allow the package importer to work with non-ascii filenames from upstream sources. There are a few provisos as this depends on the fix from bug 488519. If the name can't be represented in the filesystem encoding, a (much nicer) error giving the filename and encoding is raised, except on bzr 2.1 which will fail in much the same way as currently.
The package importer is already making an effort to say it wants to use UTF-8 filenames, but is being foiled by bad filesystem walking code. Nearly every use of os.listdir is a bug, it's easy to use incorrectly even for code like this that doesn't care about running on non-posix systems. Swapping in osutils.walkdirs instead is not ideal because it's a rather baroque interface and still doesn't handle the edge case of undecodable bytestring names. However, it's in all the bzrlib versions that builddeb needs to support and gets the common case right.
It would be nice to get rid of the bits copied over from bzrtools entirely. The bzrtools_import module is built around the idea of importing from tar, and this bug is in the compat layer designed to make an actual directory look like a tarfile. But bzr-builddeb only wants the import_dir function, so that design is backwards. The only significant chunk is the TreeTransform construction in _import_archive which should really live somewhere else.
+1
Also, I like the idea of getting rid of the bzrtools import code.