Merge ~smoser/cloud-init:bug/1751051-subp-encode-with-utf8 into cloud-init:master

Proposed by Scott Moser on 2018-02-22
Status: Merged
Merged at revision: 46cb6716c27d4496ce3d2bea7684803f522f277d
Proposed branch: ~smoser/cloud-init:bug/1751051-subp-encode-with-utf8
Merge into: cloud-init:master
Diff against target: 72 lines (+40/-1)
2 files modified
cloudinit/ (+6/-1)
tests/unittests/ (+34/-0)
Reviewer Review Type Date Requested Status
Server Team CI bot continuous-integration Approve on 2018-02-23
Ryan Harper 2018-02-22 Approve on 2018-02-23
Review via email:

Commit message

subp: Fix subp usage with non-ascii characters when no system locale.

If python starts up without a locale set, then its default encoding
ends up set as ascii. That is not easily changed with the likes of
setlocale. In order to avoid UnicodeDecodeErrors cloud-init will
encode to bytes a python3 string or python2 basestring so that the
values passed to Popen are already bytes.

LP: #1751051

Description of the change

see commit message

To post a comment you must log in.

FAILED: Continuous integration, rev:927423ed11f3e2695384521a4ffe8c54df153a75
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:

review: Needs Fixing (continuous-integration)

FAILED: Continuous integration, rev:529ac9d19a1d73807efb956ae7de1ece9f51c0f9
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:

review: Needs Fixing (continuous-integration)
Ryan Harper (raharper) wrote :

Nifty test, and good change. The comment doesn't quite match the code;

'cloud-init will decode'

should be 'cloud-init will encode'


Scott Moser (smoser) wrote :

yeah, good catch. updated commit message and the comment also.

Ryan Harper (raharper) :
review: Approve

PASSED: Continuous integration, rev:6f9d258eda583a4250182bb2cb1ab6af93c83383
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:

review: Approve (continuous-integration)

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
diff --git a/cloudinit/ b/cloudinit/
index 338fb97..5a919cf 100644
--- a/cloudinit/
+++ b/cloudinit/
@@ -1865,8 +1865,13 @@ def subp(args, data=None, rcs=None, env=None, capture=True, shell=False,
1865 if not isinstance(data, bytes):1865 if not isinstance(data, bytes):
1866 data = data.encode()1866 data = data.encode()
1868 # Popen converts entries in the arguments array from non-bytes to bytes.
1869 # When locale is unset it may use ascii for that encoding which can
1870 # cause UnicodeDecodeErrors. (LP: #1751051)
1871 bytes_args = [x if isinstance(x, six.binary_type) else x.encode("utf-8")
1872 for x in args]
1868 try:1873 try:
1869 sp = subprocess.Popen(args, stdout=stdout,1874 sp = subprocess.Popen(bytes_args, stdout=stdout,
1870 stderr=stderr, stdin=stdin,1875 stderr=stderr, stdin=stdin,
1871 env=env, shell=shell)1876 env=env, shell=shell)
1872 (out, err) = sp.communicate(data)1877 (out, err) = sp.communicate(data)
diff --git a/tests/unittests/ b/tests/unittests/
index 4a92e74..89ae40f 100644
--- a/tests/unittests/
+++ b/tests/unittests/
@@ -8,7 +8,9 @@ import shutil
8import stat8import stat
9import tempfile9import tempfile
11import json
11import six12import six
13import sys
12import yaml14import yaml
14from cloudinit import importer, util16from cloudinit import importer, util
@@ -733,6 +735,38 @@ class TestSubp(helpers.CiTestCase):
733 self.assertEqual("/target/my/path/",735 self.assertEqual("/target/my/path/",
734 util.target_path("/target/", "///my/path/"))736 util.target_path("/target/", "///my/path/"))
738 def test_c_lang_can_take_utf8_args(self):
739 """Independent of system LC_CTYPE, args can contain utf-8 strings.
741 When python starts up, its default encoding gets set based on
742 the value of LC_CTYPE. If no system locale is set, the default
743 encoding for both python2 and python3 in some paths will end up
744 being ascii.
746 Attempts to use setlocale or patching (or changing) os.environ
747 in the current environment seem to not be effective.
749 This test starts up a python with LC_CTYPE set to C so that
750 the default encoding will be set to ascii. In such an environment
751 Popen(['command', 'non-ascii-arg']) would cause a UnicodeDecodeError.
752 """
753 python_prog = '\n'.join([
754 'import json, sys',
755 'from cloudinit.util import subp',
756 'data =',
757 'cmd = json.loads(data)',
758 'subp(cmd, capture=False)',
759 ''])
760 cmd = [BASH, '-c', 'echo -n "$@"', '--',
761 self.utf8_valid.decode("utf-8")]
762 python_subp = [sys.executable, '-c', python_prog]
764 out, _err = util.subp(
765 python_subp, update_env={'LC_CTYPE': 'C'},
766 data=json.dumps(cmd).encode("utf-8"),
767 decode=False)
768 self.assertEqual(self.utf8_valid, out)
737class TestEncode(helpers.TestCase):771class TestEncode(helpers.TestCase):
738 """Test the encoding functions"""772 """Test the encoding functions"""


People subscribed via source and target branches