Merge ~mgerdts/cloud-init:lp1746605 into cloud-init:master

Proposed by Mike Gerdts
Status: Merged
Approved by: Scott Moser
Approved revision: c24efa09f7a3bc00157eabd5761c307af7229d3a
Merge reported by: Scott Moser
Merged at revision: 4ed164592fe8cb15758cacf3cb3f8c7d5ab7c82e
Proposed branch: ~mgerdts/cloud-init:lp1746605
Merge into: cloud-init:master
Diff against target: 118 lines (+67/-2)
2 files modified
cloudinit/sources/DataSourceSmartOS.py (+2/-0)
tests/unittests/test_datasource/test_smartos.py (+65/-2)
Reviewer Review Type Date Requested Status
Scott Moser Approve
Server Team CI bot continuous-integration Approve
Review via email: mp+343576@code.launchpad.net

Commit message

DataSourceSmartOS: add locking of serial device.

cloud-init and mdata-get each have their own implementation of the
SmartOS metadata protocol. If cloud-init and other services that call
mdata-get are run concurrently, crosstalk on the serial port can cause
them both to become confused.

This change makes it so that cloud-init uses the same cooperative
locking scheme that's used by mdata-get, thus preventing cross-talk
between mdata-get and cloud-init.

For testing, a VM running on a SmartOS host and pyserial are required.
If the tests are run on a platform other than SmartOS, those that use a
real serial port are skipped. pyserial remains commented in
requirements.txt because most testers will not be running atop SmartOS.

LP: #1746605

To post a comment you must log in.
Revision history for this message
Mike Gerdts (mgerdts) wrote :
Download full text (4.8 KiB)

The fix for bug #1667735 is a better fix for the retries that was originally part of this via

https://code.launchpad.net/~mgerdts/cloud-init/+git/cloud-init/+merge/337271

This merge proposal obsoletes merge proposal 337271.

The change to DataSourceSmartOS is now quite small. The new tests will only run on VMs running on a SmartOS host and then only if they have write access to /dev/ttyS1. This test code includes Python 2.6 fixes offered by smoser as part of the previous merge proposal.

Proof that the new test runs under tox:

$ tox tests.unittests.test_datasource.test_smartos.TestSerialConcurrency
GLOB sdist-make: /home/mgerdts/cloud-init/setup.py
py27 inst-nodeps: /home/mgerdts/cloud-init/.tox/dist/cloud-init-18.2.zip
py27 installed: certifi==2018.1.18,chardet==3.0.4,cloud-init==18.2,configobj==5.0.6,contextlib2==0.5.5,coverage==4.5.1,funcsigs==1.0.2,functools32==3.2.3.post2,httpretty==0.8.14,idna==2.6,Jinja2==2.10,jsonpatch==1.23,jsonpointer==2.0,jsonschema==2.6.0,linecache2==1.0.0,MarkupSafe==1.0,mock==2.0.0,nose==1.3.7,oauthlib==2.0.7,pbr==4.0.2,pkg-resources==0.0.0,pyserial==3.4,PyYAML==3.12,requests==2.18.4,six==1.11.0,traceback2==1.4.0,unittest2==1.1.0,urllib3==1.22
py27 runtests: PYTHONHASHSEED='1425861219'
py27 runtests: commands[0] | python -m nose tests.unittests.test_datasource.test_smartos.TestSerialConcurrency

----------------------------------------------------------------------
Ran 0 tests in 0.000s

OK
py3 inst-nodeps: /home/mgerdts/cloud-init/.tox/dist/cloud-init-18.2.zip
py3 installed: certifi==2018.1.18,chardet==3.0.4,cloud-init==18.2,configobj==5.0.6,contextlib2==0.5.5,coverage==4.5.1,httpretty==0.8.14,idna==2.6,Jinja2==2.10,jsonpatch==1.23,jsonpointer==2.0,jsonschema==2.6.0,linecache2==1.0.0,MarkupSafe==1.0,mock==2.0.0,nose==1.3.7,nose-timer==0.7.1,oauthlib==2.0.7,pbr==4.0.2,pkg-resources==0.0.0,pyserial==3.4,PyYAML==3.12,requests==2.18.4,six==1.11.0,termcolor==1.1.0,traceback2==1.4.0,unittest2==1.1.0,urllib3==1.22
py3 runtests: PYTHONHASHSEED='1425861219'
py3 runtests: commands[0] | /home/mgerdts/cloud-init/.tox/py3/bin/python -m nose --with-timer --timer-top-n 10 tests.unittests.test_datasource.test_smartos.TestSerialConcurrency
.
[success] 100.00% tests.unittests.test_datasource.test_smartos.TestSerialConcurrency.test_all_keys: 22.6319s
----------------------------------------------------------------------
Ran 1 test in 22.633s

OK
flake8 inst-nodeps: /home/mgerdts/cloud-init/.tox/dist/cloud-init-18.2.zip
flake8 installed: certifi==2018.1.18,chardet==3.0.4,cloud-init==18.2,configobj==5.0.6,flake8==3.3.0,hacking==0.13.0,idna==2.6,Jinja2==2.10,jsonpatch==1.23,jsonpointer==2.0,jsonschema==2.6.0,MarkupSafe==1.0,mccabe==0.6.1,oauthlib==2.0.7,pbr==4.0.2,pep8==1.5.7,pkg-resources==0.0.0,pycodestyle==2.3.1,pyflakes==1.5.0,PyYAML==3.12,requests==2.18.4,six==1.11.0,urllib3==1.22
flake8 runtests: PYTHONHASHSEED='1425861219'
flake8 runtests: commands[0] | /home/mgerdts/cloud-init/.tox/flake8/bin/python -m flake8 tests.unittests.test_datasource.test_smartos.TestSerialConcurrency
xenial inst-nodeps: /home/mgerdts/cloud-init/.tox/dist/cloud-init-18.2.zip
xenial installed: cloud-init==18.2,configobj==5.0.6,contextlib2==0.5.1,...

Read more...

Revision history for this message
Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1035/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1035/rebuild

review: Approve (continuous-integration)
Revision history for this message
Scott Moser (smoser) wrote :

fcntl.lockf(fp, fcntl.LOCK_EX)

Do we not need to unlock at some point ? (fcntl.lockf(fp, fcntl.LOCK_UN)
Or are you relying on the lock being let go of on process exit?
or am I just missing something entirely?

Revision history for this message
Mike Gerdts (mgerdts) wrote :

The lock is automatically released on close() or process exit.

On Fri, Apr 20, 2018 at 1:33 PM, Scott Moser <email address hidden>
wrote:

> fcntl.lockf(fp, fcntl.LOCK_EX)
>
> Do we not need to unlock at some point ? (fcntl.lockf(fp, fcntl.LOCK_UN)
> Or are you relying on the lock being let go of on process exit?
> or am I just missing something entirely?
>
>
> --
> https://code.launchpad.net/~mgerdts/cloud-init/+git/cloud-
> init/+merge/343576
> You are the owner of ~mgerdts/cloud-init:lp1746605.
>

Revision history for this message
Scott Moser (smoser) :
review: Approve
Revision history for this message
Scott Moser (smoser) wrote :

Mike,

I enabled unused variables warnings in pylint a few commits
back. if you rebase this you'll see the errors.

the warning looks like this:
  tests/unittests/test_datasource/test_smartos.py:1053: [W0612(unused-variable), TestSerialConcurrency.test_all_keys] Unused variable 'it'

The fix is just to name that variable '_it' rather than 'it'. Or
even just '_'.

then it wont' complain.

Easiest thing is for you to commit that and push.
as it is right nwo our tools rebase yo to trunk and run tox and then merge... and if i want to use the tools (I do) then you fixing this branch is the easiest thing.

~mgerdts/cloud-init:lp1746605 updated
2b4bdbb... by Mike Gerdts

feedback from smoser: unused variable

Revision history for this message
Scott Moser (smoser) wrote :

An upstream commit landed for this bug.

To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=4ed16459

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
diff --git a/cloudinit/sources/DataSourceSmartOS.py b/cloudinit/sources/DataSourceSmartOS.py
index 0ef1003..4ea00eb 100644
--- a/cloudinit/sources/DataSourceSmartOS.py
+++ b/cloudinit/sources/DataSourceSmartOS.py
@@ -23,6 +23,7 @@
23import base6423import base64
24import binascii24import binascii
25import errno25import errno
26import fcntl
26import json27import json
27import os28import os
28import random29import random
@@ -526,6 +527,7 @@ class JoyentMetadataSerialClient(JoyentMetadataClient):
526 if not ser.isOpen():527 if not ser.isOpen():
527 raise SystemError("Unable to open %s" % self.device)528 raise SystemError("Unable to open %s" % self.device)
528 self.fp = ser529 self.fp = ser
530 fcntl.lockf(ser, fcntl.LOCK_EX)
529 self._flush()531 self._flush()
530 self._negotiate()532 self._negotiate()
531533
diff --git a/tests/unittests/test_datasource/test_smartos.py b/tests/unittests/test_datasource/test_smartos.py
index b926263..706e8eb 100644
--- a/tests/unittests/test_datasource/test_smartos.py
+++ b/tests/unittests/test_datasource/test_smartos.py
@@ -16,23 +16,27 @@ from __future__ import print_function
1616
17from binascii import crc3217from binascii import crc32
18import json18import json
19import multiprocessing
19import os20import os
20import os.path21import os.path
21import re22import re
22import shutil23import shutil
24import signal
23import stat25import stat
24import tempfile26import tempfile
27import unittest2
25import uuid28import uuid
2629
27from cloudinit import serial30from cloudinit import serial
28from cloudinit.sources import DataSourceSmartOS31from cloudinit.sources import DataSourceSmartOS
29from cloudinit.sources.DataSourceSmartOS import (32from cloudinit.sources.DataSourceSmartOS import (
30 convert_smartos_network_data as convert_net)33 convert_smartos_network_data as convert_net,
34 SMARTOS_ENV_KVM, SERIAL_DEVICE, get_smartos_environ)
3135
32import six36import six
3337
34from cloudinit import helpers as c_helpers38from cloudinit import helpers as c_helpers
35from cloudinit.util import b64e39from cloudinit.util import (b64e, subp)
3640
37from cloudinit.tests.helpers import mock, FilesystemMockingTestCase, TestCase41from cloudinit.tests.helpers import mock, FilesystemMockingTestCase, TestCase
3842
@@ -1023,4 +1027,63 @@ class TestNetworkConversion(TestCase):
1023 found = convert_net(SDC_NICS_SINGLE_GATEWAY)1027 found = convert_net(SDC_NICS_SINGLE_GATEWAY)
1024 self.assertEqual(expected, found)1028 self.assertEqual(expected, found)
10251029
1030
1031@unittest2.skipUnless(get_smartos_environ() == SMARTOS_ENV_KVM,
1032 "Only supported on KVM and bhyve guests under SmartOS")
1033@unittest2.skipUnless(os.access(SERIAL_DEVICE, os.W_OK),
1034 "Requires write access to " + SERIAL_DEVICE)
1035class TestSerialConcurrency(TestCase):
1036 """
1037 This class tests locking on an actual serial port, and as such can only
1038 be run in a kvm or bhyve guest running on a SmartOS host. A test run on
1039 a metadata socket will not be valid because a metadata socket ensures
1040 there is only one session over a connection. In contrast, in the
1041 absence of proper locking multiple processes opening the same serial
1042 port can corrupt each others' exchanges with the metadata server.
1043 """
1044 def setUp(self):
1045 self.mdata_proc = multiprocessing.Process(target=self.start_mdata_loop)
1046 self.mdata_proc.start()
1047 super(TestSerialConcurrency, self).setUp()
1048
1049 def tearDown(self):
1050 # os.kill() rather than mdata_proc.terminate() to avoid console spam.
1051 os.kill(self.mdata_proc.pid, signal.SIGKILL)
1052 self.mdata_proc.join()
1053 super(TestSerialConcurrency, self).tearDown()
1054
1055 def start_mdata_loop(self):
1056 """
1057 The mdata-get command is repeatedly run in a separate process so
1058 that it may try to race with metadata operations performed in the
1059 main test process. Use of mdata-get is better than two processes
1060 using the protocol implementation in DataSourceSmartOS because we
1061 are testing to be sure that cloud-init and mdata-get respect each
1062 others locks.
1063 """
1064 rcs = list(range(0, 256))
1065 while True:
1066 subp(['mdata-get', 'sdc:routes'], rcs=rcs)
1067
1068 def test_all_keys(self):
1069 self.assertIsNotNone(self.mdata_proc.pid)
1070 ds = DataSourceSmartOS
1071 keys = [tup[0] for tup in ds.SMARTOS_ATTRIB_MAP.values()]
1072 keys.extend(ds.SMARTOS_ATTRIB_JSON.values())
1073
1074 client = ds.jmc_client_factory()
1075 self.assertIsNotNone(client)
1076
1077 # The behavior that we are testing for was observed mdata-get running
1078 # 10 times at roughly the same time as cloud-init fetched each key
1079 # once. cloud-init would regularly see failures before making it
1080 # through all keys once.
1081 for _ in range(0, 3):
1082 for key in keys:
1083 # We don't care about the return value, just that it doesn't
1084 # thrown any exceptions.
1085 client.get(key)
1086
1087 self.assertIsNone(self.mdata_proc.exitcode)
1088
1026# vi: ts=4 expandtab1089# vi: ts=4 expandtab

Subscribers

People subscribed via source and target branches