Merge ~raharper/curtin:fix/udevadm-info-shlex-quote into curtin:master

Proposed by Ryan Harper
Status: Merged
Approved by: Ryan Harper
Approved revision: 5ceb185c76810d3052190b73451cd3e02479a966
Merge reported by: Server Team CI bot
Merged at revision: not available
Proposed branch: ~raharper/curtin:fix/udevadm-info-shlex-quote
Merge into: curtin:master
Diff against target: 198 lines (+122/-8)
3 files modified
curtin/udev.py (+25/-2)
tests/data/udevadm_info_sandisk_cruzer.txt (+54/-0)
tests/unittests/test_udev.py (+43/-6)
Reviewer Review Type Date Requested Status
Dan Watkins (community) Approve
Server Team CI bot continuous-integration Approve
Shawn Weeks (community) Needs Information
Review via email: mp+382993@code.launchpad.net

Commit message

udev: use shlex.quote when shlex.split errors on shell-escape chars

The udev database may include shell escape characters in the output.
We want to avoid using shlex_quote unless needed as curtin does not
use the quoted value in our code and applying it to all of the values
breaks parsing of the udevadm info content.

If we encounter a ValueError while invoking shlex.split() then we
first use shlex_quote() and try parsing again and if that fails we
will try replacing shell-quote values with '_'. We log warning
whenever we have to quote or replace values.

- Python2.7 does not have shlex.quote, use pipes.quote (which was
  renamed to shlex.quote in Py3:
  https://docs.python.org/2/library/pipes.html#pipes.quote)
- Add unittest with original info dump from bug submitters system

LP: #1875085

To post a comment you must log in.
Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Needs Fixing (continuous-integration)
Revision history for this message
Shawn Weeks (absolutesantaja) wrote :

I wrote a little program to test this and the output may not be what you want. I feel like this has added too many quotes and still has an uneven number of single quotes.

file = open("output.txt","r")
fileString = file.read()
output = "'" + fileString.replace("'", "\'\"\'\"\'") + "'"
print output

Here is a small sample of the output.

SCSI_TPGS='"'"'0'"'"'
SCSI_TYPE='"'"'disk'"'"'
SCSI_VENDOR='"'"'SanDisk'"'"''"'"'
SCSI_VENDOR_ENC='"'"'SanDisk'"'"''"'"'
SCSI_MODEL='"'"'Cruzer_Fit'"'"'
SCSI_MODEL_ENC='"'"'Cruzer\x20Fit\x20\x20\x20\x20\x20\x20'"'"'
SCSI_REVISION='"'"'1.00'"'"'
ID_TYPE='"'"'disk'"'"'

review: Needs Information
Revision history for this message
Ryan Harper (raharper) wrote :

Thanks. shlex.quote() is designed to make the string usable directly in shell, like so:

% lxc exec x1 bash
root@x1:~# lsb_release -rd
Description: Ubuntu 16.04.6 LTS
Release: 16.04
root@x1:~# python3
Python 3.5.2 (default, Apr 16 2020, 17:47:17)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import shlex
>>> def shlex_quote(v):
... return "'" + v.replace("'", "\'\"\'\"\'") + "'"
...
>>> value="SanDisk'"
>>> value
"SanDisk'"
>>> shlex.quote(value)
'\'SanDisk\'"\'"\'\''
>>> shlex_quote(value)
'\'SanDisk\'"\'"\'\''
>>> print(shlex.quote(value))
'SanDisk'"'"''
>>>
root@x1:~# echo 'SanDisk'"'"''
SanDisk'

Revision history for this message
Shawn Weeks (absolutesantaja) wrote :

Thanks for the better test. I didn't fully understand what shlex was aiming for. I see the correct output now.

#!/bin/bash

outputVar=$(
cat << EOF | python
fileString = "'SanDisk''"
output = "'" + fileString.replace("'", "\'\"\'\"\'") + "'"
print output
quit()
EOF
)

sh -c "echo $outputVar"

'SanDisk''

Revision history for this message
Ryan Harper (raharper) wrote :

This unittest is not exposing the path we're seeing. I suspect there's some python string stuff happening here, I'll likely need to dump udevadm info --query=property --export output into a file and provide the decoded file output into the return of the mocked subp to have it trigger what I see a runtime.

First we shouldn't shlex.quote() everything, that leaves shell-escape characters in the values we'd like to use as they are, DEVTYPE="disk" for example, instead of DEVTYPE="'disk'".

>>> shlex.split("'disk'")
['disk']
>>> shlex.split(shlex.quote("'disk'"))
["'disk'"]

Revision history for this message
Shawn Weeks (absolutesantaja) wrote :

> This unittest is not exposing the path we're seeing. I suspect there's some
> python string stuff happening here, I'll likely need to dump udevadm info
> --query=property --export output into a file and provide the decoded file
> output into the return of the mocked subp to have it trigger what I see a
> runtime.
>
> First we shouldn't shlex.quote() everything, that leaves shell-escape
> characters in the values we'd like to use as they are, DEVTYPE="disk" for
> example, instead of DEVTYPE="'disk'".
>
> >>> shlex.split("'disk'")
> ['disk']
> >>> shlex.split(shlex.quote("'disk'"))
> ["'disk'"]

You're more than welcome to use output.txt on the bug report It was made by just piping the output of udevadm to a text file.

Revision history for this message
Ryan Harper (raharper) wrote :

Heh, that's exactly what I did =)

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Dan Watkins (oddbloke) wrote :

Broadly speaking this looks good, thanks for your work! I do have a few comments inline (though none of them are enormous).

review: Needs Fixing
Revision history for this message
Ryan Harper (raharper) wrote :

Thanks

a9c7968... by Ryan Harper

Use pipes.quote for py27

cf07a93... by Ryan Harper

Rework ValueError, update comment, conslidate trimmed value use

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Dan Watkins (oddbloke) wrote :

On Tue, Apr 28, 2020 at 04:53:30PM -0000, Ryan Harper wrote:
> > +try:
> > + shlex_quote = shlex.quote
> > +except AttributeError:
> > + # python2.7 shlex does not have quote, give it a try
> > + def shlex_quote(value):
>
> Yes, that seems *much* better. Thanks for tracking it down.

No problem!

> > + # strip the leading/ending single tick from udev output
>
> I'll rework.

Thanks!

> > + self.assertEqual('SanDisk'"'"'', info['SCSI_VENDOR'])
>
> What's wrong with matching the exact output from shlex_quote() ?
>
> % python3
> Python 3.8.2 (default, Mar 13 2020, 10:14:16)
> [GCC 9.3.0] on linux
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import shlex
> >>> shlex.quote("SanDisk'")
> '\'SanDisk\'"\'"\'\''
> >>> print(shlex.quote("SanDisk'"))
> 'SanDisk'"'"''
> >>>

I don't think that's what you're doing:

    In [4]: 'SanDisk'"'"'' == shlex.quote("SanDisk'")
    Out[4]: False

'SanDisk'"'"'' is equivalent to `'SanDisk' + "'" + ''`, which:

    In [5]: 'SanDisk'"'"'' == "SanDisk'"
    Out[5]: True

effectively undoes the escaping. This is a little easier to see in
triple quotes (where all the double and single quotes do not need
escaping, and I added newlines and a .strip() for further clarity):

    In [13]: """
        ...: 'SanDisk'"'"''
        ...: """.strip() == shlex.quote("SanDisk'")
    Out[13]: True

5ceb185... by Ryan Harper

Rework unittest and document expected result, being careful with quoting

Revision history for this message
Server Team CI bot (server-team-bot) wrote :
review: Approve (continuous-integration)
Revision history for this message
Dan Watkins (oddbloke) wrote :

LGTM, thanks! I've updated the commit message for the pipes change, but will leave it to you to mark Approved.

review: Approve

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
diff --git a/curtin/udev.py b/curtin/udev.py
index e2e3dd0..f83f216 100644
--- a/curtin/udev.py
+++ b/curtin/udev.py
@@ -4,7 +4,14 @@ import shlex
4import os4import os
55
6from curtin import util6from curtin import util
7from curtin.log import logged_call7from curtin.log import logged_call, LOG
8
9try:
10 shlex_quote = shlex.quote
11except AttributeError:
12 # python2.7 uses pipes.quote
13 import pipes
14 shlex_quote = pipes.quote
815
916
10def compose_udev_equality(key, value):17def compose_udev_equality(key, value):
@@ -90,7 +97,23 @@ def udevadm_info(path=None):
90 value = None97 value = None
91 if value:98 if value:
92 # preserve spaces in values to match udev database99 # preserve spaces in values to match udev database
93 parsed = shlex.split(value)100 try:
101 parsed = shlex.split(value)
102 except ValueError:
103 # strip the leading/ending single tick from udev output before
104 # escaping the value to prevent their inclusion in the result.
105 trimmed_value = value[1:-1]
106 try:
107 quoted = shlex_quote(trimmed_value)
108 LOG.debug('udevadm_info: quoting shell-escape chars '
109 'in %s=%s -> %s', key, value, quoted)
110 parsed = shlex.split(quoted)
111 except ValueError:
112 escaped_value = (
113 trimmed_value.replace("'", "_").replace('"', "_"))
114 LOG.debug('udevadm_info: replacing shell-escape chars '
115 'in %s=%s -> %s', key, value, escaped_value)
116 parsed = shlex.split(escaped_value)
94 if ' ' not in value:117 if ' ' not in value:
95 info[key] = parsed[0]118 info[key] = parsed[0]
96 else:119 else:
diff --git a/tests/data/udevadm_info_sandisk_cruzer.txt b/tests/data/udevadm_info_sandisk_cruzer.txt
97new file mode 100644120new file mode 100644
index 0000000..a605afe
--- /dev/null
+++ b/tests/data/udevadm_info_sandisk_cruzer.txt
@@ -0,0 +1,54 @@
1DEVPATH='/devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.5/2-1.5.1/2-1.5.1:1.0/host6/target6:0:0/6:0:0:0/block/sdc/sdc1'
2DEVNAME='/dev/sdc1'
3DEVTYPE='partition'
4PARTN='1'
5MAJOR='8'
6MINOR='33'
7SUBSYSTEM='block'
8USEC_INITIALIZED='5265867'
9SCSI_TPGS='0'
10SCSI_TYPE='disk'
11SCSI_VENDOR='SanDisk''
12SCSI_VENDOR_ENC='SanDisk''
13SCSI_MODEL='Cruzer_Fit'
14SCSI_MODEL_ENC='Cruzer\x20Fit\x20\x20\x20\x20\x20\x20'
15SCSI_REVISION='1.00'
16ID_SCSI='1'
17ID_VENDOR='SanDisk_'
18ID_VENDOR_ENC='SanDisk\x27'
19ID_MODEL='Cruzer_Fit'
20ID_MODEL_ENC='Cruzer\x20Fit\x20\x20\x20\x20\x20\x20'
21ID_REVISION='1.00'
22ID_TYPE='disk'
23ID_SCSI_INQUIRY='1'
24ID_VENDOR_ID='0781'
25ID_MODEL_ID='5571'
26ID_SERIAL='SanDisk__Cruzer_Fit_4C530000140118216265-0:0'
27ID_SERIAL_SHORT='4C530000140118216265'
28ID_INSTANCE='0:0'
29ID_BUS='usb'
30ID_USB_INTERFACES=':080650:'
31ID_USB_INTERFACE_NUM='00'
32ID_USB_DRIVER='usb-storage'
33ID_PATH='pci-0000:00:1d.0-usb-0:1.5.1:1.0-scsi-0:0:0:0'
34ID_PATH_TAG='pci-0000_00_1d_0-usb-0_1_5_1_1_0-scsi-0_0_0_0'
35ID_PART_TABLE_UUID='36b64baf'
36ID_PART_TABLE_TYPE='dos'
37ID_FS_UUID='2020-04-23-08-02-07-00'
38ID_FS_UUID_ENC='2020-04-23-08-02-07-00'
39ID_FS_BOOT_SYSTEM_ID='EL\x20TORITO\x20SPECIFICATION'
40ID_FS_VERSION='Joliet Extension'
41ID_FS_LABEL='Ubuntu-Server_20.04_LTS_amd64'
42ID_FS_LABEL_ENC='Ubuntu-Server\x2020.04\x20LTS\x20amd64'
43ID_FS_TYPE='iso9660'
44ID_FS_USAGE='filesystem'
45ID_PART_ENTRY_SCHEME='dos'
46ID_PART_ENTRY_UUID='36b64baf-01'
47ID_PART_ENTRY_TYPE='0x0'
48ID_PART_ENTRY_FLAGS='0x80'
49ID_PART_ENTRY_NUMBER='1'
50ID_PART_ENTRY_OFFSET='0'
51ID_PART_ENTRY_SIZE='1859584'
52ID_PART_ENTRY_DISK='8:32'
53DEVLINKS='/dev/disk/by-path/pci-0000:00:1d.0-usb-0:1.5.1:1.0-scsi-0:0:0:0-part1 /dev/disk/by-id/usb-SanDisk__Cruzer_Fit_4C530000140118216265-0:0-part1 /dev/disk/by-label/Ubuntu-Server\x2020.04\x20LTS\x20amd64 /dev/disk/by-partuuid/36b64baf-01 /dev/disk/by-uuid/2020-04-23-08-02-07-00'
54TAGS=':systemd:'
diff --git a/tests/unittests/test_udev.py b/tests/unittests/test_udev.py
index 33d5f44..919c7c0 100644
--- a/tests/unittests/test_udev.py
+++ b/tests/unittests/test_udev.py
@@ -1,14 +1,18 @@
1# This file is part of curtin. See LICENSE file for copyright and license info.1# This file is part of curtin. See LICENSE file for copyright and license info.
22
3import mock3import mock
4import shlex
45
5from curtin import udev6from curtin.udev import (
7 udevadm_info,
8 shlex_quote,
9 )
6from curtin import util10from curtin import util
7from .helpers import CiTestCase11from .helpers import CiTestCase
812
913
10UDEVADM_INFO_QUERY = """\14UDEVADM_INFO_QUERY = """\
11DEVLINKS='/dev/disk/by-id/nvme-eui.0025388b710116a1'15DEVLINKS='/dev/disk/by-id/nvme-eui.0025388b710116a1 /dev/disk/by-id/nvme-n1'
12DEVNAME='/dev/nvme0n1'16DEVNAME='/dev/nvme0n1'
13DEVPATH='/devices/pci0000:00/0000:00:1c.4/0000:05:00.0/nvme/nvme0/nvme0n1'17DEVPATH='/devices/pci0000:00/0000:00:1c.4/0000:05:00.0/nvme/nvme0/nvme0n1'
14DEVTYPE='disk'18DEVTYPE='disk'
@@ -24,7 +28,8 @@ USEC_INITIALIZED='2026691'
24"""28"""
2529
26INFO_DICT = {30INFO_DICT = {
27 'DEVLINKS': ['/dev/disk/by-id/nvme-eui.0025388b710116a1'],31 'DEVLINKS': ['/dev/disk/by-id/nvme-eui.0025388b710116a1',
32 '/dev/disk/by-id/nvme-n1'],
28 'DEVNAME': '/dev/nvme0n1',33 'DEVNAME': '/dev/nvme0n1',
29 'DEVPATH':34 'DEVPATH':
30 '/devices/pci0000:00/0000:00:1c.4/0000:05:00.0/nvme/nvme0/nvme0n1',35 '/devices/pci0000:00/0000:00:1c.4/0000:05:00.0/nvme/nvme0/nvme0n1',
@@ -48,17 +53,49 @@ class TestUdevInfo(CiTestCase):
48 """ udevadm_info returns dictionary for specified device """53 """ udevadm_info returns dictionary for specified device """
49 mypath = '/dev/nvme0n1'54 mypath = '/dev/nvme0n1'
50 m_subp.return_value = (UDEVADM_INFO_QUERY, "")55 m_subp.return_value = (UDEVADM_INFO_QUERY, "")
51 info = udev.udevadm_info(mypath)56 info = udevadm_info(mypath)
52 m_subp.assert_called_with(57 m_subp.assert_called_with(
53 ['udevadm', 'info', '--query=property', '--export', mypath],58 ['udevadm', 'info', '--query=property', '--export', mypath],
54 capture=True)59 capture=True)
55 self.assertEqual(sorted(INFO_DICT), sorted(info))60 self.assertEqual(sorted(INFO_DICT), sorted(info))
5661
62 @mock.patch('curtin.util.subp')
63 def test_udevadm_info_escape_quotes(self, m_subp):
64 """verify we escape quotes when we fail to split. """
65 mypath = '/dev/sdz'
66 datafile = 'tests/data/udevadm_info_sandisk_cruzer.txt'
67 m_subp.return_value = (util.load_file(datafile), "")
68 info = udevadm_info(mypath)
69 m_subp.assert_called_with(
70 ['udevadm', 'info', '--query=property', '--export', mypath],
71 capture=True)
72
73 """
74 Replicate what udevadm_info parsing does and use pdb to examine what's
75 happening.
76
77 (Pdb) original_value
78 "SanDisk'"
79 (Pdb) quoted_value
80 '\'SanDisk\'"\'"\'\''
81 (Pdb) split_value
82 ["SanDisk'"]
83 (Pdb) expected_value
84 "SanDisk'"
85 """
86 original_value = "SanDisk'"
87 quoted_value = shlex_quote(original_value)
88 split_value = shlex.split(quoted_value)
89 expected_value = split_value if ' ' in split_value else split_value[0]
90
91 self.assertEqual(expected_value, info['SCSI_VENDOR'])
92 self.assertEqual(expected_value, info['SCSI_VENDOR_ENC'])
93
57 def test_udevadm_info_no_path(self):94 def test_udevadm_info_no_path(self):
58 """ udevadm_info raises ValueError for invalid path value"""95 """ udevadm_info raises ValueError for invalid path value"""
59 mypath = None96 mypath = None
60 with self.assertRaises(ValueError):97 with self.assertRaises(ValueError):
61 udev.udevadm_info(mypath)98 udevadm_info(mypath)
6299
63 @mock.patch('curtin.util.subp')100 @mock.patch('curtin.util.subp')
64 def test_udevadm_info_path_not_exists(self, m_subp):101 def test_udevadm_info_path_not_exists(self, m_subp):
@@ -66,4 +103,4 @@ class TestUdevInfo(CiTestCase):
66 mypath = self.random_string()103 mypath = self.random_string()
67 m_subp.side_effect = util.ProcessExecutionError()104 m_subp.side_effect = util.ProcessExecutionError()
68 with self.assertRaises(util.ProcessExecutionError):105 with self.assertRaises(util.ProcessExecutionError):
69 udev.udevadm_info(mypath)106 udevadm_info(mypath)

Subscribers

People subscribed via source and target branches