Merge lp:~javier.collado/utah/provisioned_machine into lp:utah

Proposed by Javier Collado
Status: Merged
Approved by: Javier Collado
Approved revision: 821
Merged at revision: 811
Proposed branch: lp:~javier.collado/utah/provisioned_machine
Merge into: lp:utah
Diff against target: 235 lines (+117/-15)
4 files modified
examples/run_utah_tests.py (+27/-1)
utah/provisioning/provisioning.py (+18/-4)
utah/provisioning/ssh.py (+67/-9)
utah/run.py (+5/-1)
To merge this branch: bzr merge lp:~javier.collado/utah/provisioned_machine
Reviewer Review Type Date Requested Status
Max Brustkern (community) Approve
Javier Collado (community) Needs Resubmitting
Review via email: mp+145920@code.launchpad.net

Description of the change

This branch creates a new class named ProvisionedMachine class that takes care
of running tests on a provisioned machine (physical or virtual).

The way to run test cases in such a machine with the implementation in this
branch would be as follows:
PYTHONPATH=. ./examples/run_utah_tests.py --skip-provisioning --name
<machine_name> ./utah/client/examples/pass.run

I don't consider this merge request yet ready to be committed, but I'd like to
get some feedback with regard the following:
- Flag
What do you think about using --skip-provisioning flag? I believe there are too
many flags already in run_utah_tests.py, so if you any suggestion would be
welcomed.

- Inventory
For now, no inventory is used to get the machine object. However, one must be
used to prevent jenkins jobs trying to make use of the same hardware
simultaneously. What kind of inventory do you suggest? I see there's a separate
inventory for cobbler machines and for VMs so probably it makes sense to have
one for provisioned machines, but I think that it can be confusing to have so
many inventories. I know there's a plan to use PostgreSQL, but I'm thinking
about having all inventories in a common database in MongoDB. This way we could
use a different schema for the machines collection depending on how it's
expected to be used.

- Class hierarchy
The workaround I had to implement to skip the initialization code in the
Machine class probably means that some refactoring is needed to have a simpler
generic machine class from which ProvisionedMachine can inherit and another one
used by all the machine classes that need an image to install the system before
running any test. Do you think this refactoring would make sense?

To post a comment you must log in.
Revision history for this message
Max Brustkern (nuclearbob) wrote :

I'm going to comment on your questions before I actually look at the merge.

I think that flag is fine. I also think that script is overloaded with options, but I think at this point we're going to need a more thorough design overhaul to fix that, so adding another flag for now is probably the best thing to do.

I think at this point, an inventory that could handle physical machines, virtual machines, already installed machines, reinstalling existing machines, ARM boards, etc. is becoming a priority. It's something we've wanted for a long time. The PS team, in particular, would like to be able to request machines based on hardware info/tags rather than just always pulling one by name. I think that will help hardware testing scale much better as well. I've used SQL because that's what I know, but we do need different information for machines depending on whether they're virtual, physical, already installed, etc. If you want to put together something in Mongo, I'd enjoy looking at it, and I'm sure it'd be a good learning experience for me.

I think up until we did automatic ISO downloading, the base Machine class could run fine without an image, so I would be in favor of refactoring things so it can work that way again. If you want to prepare something for that, I'd be happy to test it, but if you're working on other things, that's something I can try to tackle at some point as well.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

Now for my thoughts on the code. I think since this class requires SSH, it would make sense to put it in the same file as SSHMixin. Right now that's ssh.py, but I'm not sure if we should have that in a separate file or if it should be moved back into provisioning.py. It was moved out for modularity, but we're still not using any machine that isn't SSH-based, so maybe it should just live in the main provisioning file. Either way, I'd be in favor of ProvisionedMachine living in the same file as SSHMixin.

Have you done any testing without your __del__ method in place? If the cleanup functions are written correctly, then if we haven't defined any cleanup actions for an instance of the class, nothing will happen when we call the cleanup functions, and we shouldn't need to skip them. If they are causing problems when we don't have any items to be cleaned up, I think that's probably something that should be fixed there.

815. By Javier Collado

Moved ProvisionedMachine class to utah.provisioning.ssh

Aside from this, timeout values have been set to more sensible values when
network connectivity fails.

816. By Javier Collado

Updated {ssh,ping}check methods to sleep only after a failure.

In the old implementation the sleep took place before any connection attempt
which was good when failures happened, but wasn't so good on successes because
of all the time wasted waiting for a check that finally succeeded. To fix this,
both methods now run the check right away and sleep only if they fail to
guarantee that the retry attempt doesn't happen immediately.

Besides this, the documentation strings for both methods have been updated to
describe the new behavior.

817. By Javier Collado

Added clean attribute and removed __del__ method implementation

Revision history for this message
Javier Collado (javier.collado) wrote :

Max, I've applied the following changes based on your comments:

- ProvisionedMachine class implementation has been moved to ssh module as
  suggested. provisioning.py is already quite large, so I prefer to do that
  rather than moving ssh mixin back.

- self.clean instance attribute has been added to the ProvisionedMachine
  implementation to fix the problem that I had when calling the cleanup
  methods. Indeed, there was no need to overwrite __del__ implementation.

Besides this, I've changed a little bit the timeout behavior of {ping,ssh}check
methods (have a look at rev. 816 commit message or the updated documentation in
the code).

Revision history for this message
Max Brustkern (nuclearbob) wrote :

This looks reasonable to me. I'm going to check it out and run a test on physical hardware.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

So when I tested this on physical hardware I got this:

Traceback (most recent call last):
  File "./examples/run_utah_tests.py", line 115, in <module>
    run_utah_tests()
  File "./examples/run_utah_tests.py", line 101, in run_utah_tests
    function(args=args)
  File "./examples/run_utah_tests.py", line 90, in run_provisioned_tests
    run_tests(args, machine)
  File "/home/max/provisioned_machine/utah/run.py", line 203, in run_tests
    shutil.copyfile(machine.finalpreseed, preseedfile)
AttributeError: 'ProvisionedMachine' object has no attribute 'finalpreseed'

I made it work with this patch:

=== modified file 'utah/run.py'
--- utah/run.py 2013-01-23 12:26:38 +0000
+++ utah/run.py 2013-02-01 15:43:56 +0000
@@ -193,7 +193,8 @@
             except Exception as err:
                 logging.warning('Failed to download files: ' + str(err))

- if args.outputpreseed or config.outputpreseed:
+ if ((args.outputpreseed or config.outputpreseed) and
+ hasattr(machine, 'finalpreseed')):
         if args.outputpreseed:
             logging.debug('Capturing preseed due to command line option')
         elif config.outputpreseed:

But everywhere I read about hasattr seems to indicate we may as well just try to use it and catch an AttributeError, so maybe that's more pythonic.

Also, it runs, and copies the test log, but it doesn't output the copied test log the way a normal run does.

Finally, it installs the client every time. This is really a design artifact of the main function in run.py assuming a new run is always happening, and it won't really hurt anything (it does make sure we always have the latest version of the client) but I wonder if at some point maybe installing the client should be considered part of provisioning the system. Maybe not, since this method does ensure we always have the latest client even if a machine was installed a while ago.

review: Needs Fixing
Revision history for this message
Javier Collado (javier.collado) wrote :

Max, I haven't been able to reproduce the problem and I guess it's because
we're using different command lines. What I've been using is:

PYTHONPATH=. ./examples/run_utah_tests.py --skip-provisioning --name <name>
./utah/client/examples/pass.run -d

I've tried with a dell mini9 and with a nexus 7 and it worked fine on both.

Regarding the log behavior, could you elaborate on this? I believe what you're
seeing is the behavior we have after the latest logging changes. If we want to
get the old behavior, then we probably need to set the console log level to
logging.INFO in the configuration file.

Regarding the client installation, probably this could to be improved to update
the package only when needed, but I think keeping the client in sync is
important. I also needed to install `gdebi` in my tests, but I think we can
ignore this, since the idea is to reuse systems provisioned by UTAH not
manually as I did in my tests.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

I've got some long logs, so I'm going to send them in an email.

818. By Javier Collado

Added check to writing the preseed to a file for provisioned systems

Revision history for this message
Javier Collado (javier.collado) wrote :

Max, thanks for the logs.

I finally realized that I wasn't seeing the preseed problem because of a
configuration option in the system you're using that isn't enabled by default.
Anyway, using --outputpreseed I got to the same result.

Regarding how to fix the problem, in my opinion we shouldn't catch exceptions
for something that can be checked upfront, but also using hasattr is not a very
good solution because it doesn't commnicate much about the design. The change
I've pushed checks if machine is a ProvisionedMachine object which I feel makes
clearer what kind of machine isn't expected to have a preseed file available.
If you disagree, then we can talk about alternative solutions.

I'm looking now at the problem regarding the log output.

819. By Javier Collado

Added statement to print log file names to stdout

The log file names are needed by external processes (for example a jenkins job)
to gather them in a single location.

Revision history for this message
Javier Collado (javier.collado) wrote :

I've added a try/except/finally block similar to the one in other run_* scripts
and a print statement to make sure log file names are printed as expected.

With this change, both issues should have been addressed properly. Please let
me know if something else is missing.

review: Needs Resubmitting
Revision history for this message
Max Brustkern (nuclearbob) wrote :

This just reminds me that we could use more unit testing for all the options we have at this point. This is what I get now:

jenkins@magners-orchestra:/home/max/provisioned_machine$ PYTHONPATH=. ./examples/run_utah_tests.py --skip-provisioning --name acer-veriton-03 /usr/share/utah/client/examples/pass.run
Running on machine: acer-veriton-03
Traceback (most recent call last):
  File "./examples/run_utah_tests.py", line 126, in <module>
    run_utah_tests()
  File "./examples/run_utah_tests.py", line 112, in run_utah_tests
    function(args=args)
  File "./examples/run_utah_tests.py", line 97, in run_provisioned_tests
    if len(locallogs) != 0:
UnboundLocalError: local variable 'locallogs' referenced before assignment

I think we probably just need to define locallogs (and maybe exitstatus) before we try to set them to the return values on run_test, in case run_test has an exception.

820. By Javier Collado

Added default values for locallogs and exitstatus

Revision history for this message
Javier Collado (javier.collado) wrote :

Thanks for detecting that problem. It should be fixed in the latest commit.
Yes, I'd like to have more unit tests everywhere.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

I got this working using this change:

- args.outputpreseed or config.outputpreseed):
+ (args.outputpreseed or config.outputpreseed)):

Without that, if config.outputpreseed is True, that's good enough for the if statement to be True, and we get the same AttributeError as before.

review: Needs Fixing
821. By Javier Collado

Fixed logic as suggested by Max

Revision history for this message
Javier Collado (javier.collado) wrote :

Thanks again Max. This merge certainly required to be reviewed.

review: Needs Resubmitting
Revision history for this message
Max Brustkern (nuclearbob) wrote :

Could you tell me why line 232 is indented the way it is? It doesn't actually matter, I'm just curious so I can be sure I'm doing the right thing in the future.

Other than that, it looks good. The only reason I found most of these was that I was testing in the lab, so I guess we should continue to test in the lab, since that represents a common configuration that we need to support.

review: Approve
Revision history for this message
Javier Collado (javier.collado) wrote :

To explain the indentation of line 232, let me paste the output of pep8 when
the code is aligned as usual:

if (not isinstance(machine, ProvisionedMachine) and
    (args.outputpreseed or config.outputpreseed)):
    if args.outputpreseed:

----------------------------------
$ pep8 --show-pep8 utah/run.py
utah/run.py:200:9: E125 continuation line does not distinguish itself from next logical line
    Continuation lines should align wrapped elements either vertically using
    Python's implicit line joining inside parentheses, brackets and braces, or
    using a hanging indent.

    When using a hanging indent the following considerations should be applied:

    - there should be no arguments on the first line, and

    - further indentation should be used to clearly distinguish itself as a
      continuation line.
----------------------------------

In particular, the problem the extra indentation in the merge request fixes is
the one explained in the last bullet. Before using pep8, I would have used the
version above, but now I find it makes sense to make visually clear the
separation between the if condition and the body.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

That makes sense, I wasn't looking at the line below. I'll remember that in the future, thanks.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'examples/run_utah_tests.py'
--- examples/run_utah_tests.py 2013-01-23 12:26:38 +0000
+++ examples/run_utah_tests.py 2013-02-05 17:24:22 +0000
@@ -26,10 +26,13 @@
26 file_arguments,26 file_arguments,
27 name_argument,27 name_argument,
28 virtual_arguments,28 virtual_arguments,
29 configure_logging29 configure_logging,
30 run_tests
30)31)
31from utah.timeout import timeout, UTAHTimeout32from utah.timeout import timeout, UTAHTimeout
32from run_install_test import run_install_test33from run_install_test import run_install_test
34from utah.provisioning.ssh import ProvisionedMachine
35from utah.exceptions import UTAHException
3336
3437
35def get_parser():38def get_parser():
@@ -50,6 +53,9 @@
50 help='Type of machine to provision (%(choices)s)')53 help='Type of machine to provision (%(choices)s)')
51 parser.add_argument('-v', '--variant',54 parser.add_argument('-v', '--variant',
52 help='Variant of architecture, i.e., armel, armhf')55 help='Variant of architecture, i.e., armel, armhf')
56 parser.add_argument('--skip-provisioning', action='store_true',
57 help=('Reuse a system that is already provisioned '
58 '(name argument must be passed)'))
53 parser = common_arguments(parser)59 parser = common_arguments(parser)
54 parser = custom_arguments(parser)60 parser = custom_arguments(parser)
55 parser = file_arguments(parser)61 parser = file_arguments(parser)
@@ -76,6 +82,26 @@
76 # Default is now CustomVM82 # Default is now CustomVM
77 function = run_install_test83 function = run_install_test
7884
85 if args.skip_provisioning:
86 def run_provisioned_tests(args):
87 """Run test cases in a provisioned machine."""
88 locallogs = []
89 exitstatus = 0
90 try:
91 # TBD: Inventory should be used to verify machine
92 # is not running other tests
93 machine = ProvisionedMachine(name=args.name)
94 exitstatus, locallogs = run_tests(args, machine)
95 except UTAHException as error:
96 sys.stderr.write('Exception: ' + str(error))
97 exitstatus = 2
98 finally:
99 if len(locallogs) != 0:
100 print('Test logs copied to the following files:')
101 print("\t" + "\n\t".join(locallogs))
102 sys.exit(exitstatus)
103
104 function = run_provisioned_tests
79 if args.arch is not None and 'arm' in args.arch:105 if args.arch is not None and 'arm' in args.arch:
80 # If arch is arm, use BambooFeederMachine106 # If arch is arm, use BambooFeederMachine
81 from run_test_bamboo_feeder import run_test_bamboo_feeder107 from run_test_bamboo_feeder import run_test_bamboo_feeder
82108
=== modified file 'utah/provisioning/provisioning.py'
--- utah/provisioning/provisioning.py 2013-01-24 16:31:58 +0000
+++ utah/provisioning/provisioning.py 2013-02-05 17:24:22 +0000
@@ -302,15 +302,29 @@
302 self._start()302 self._start()
303303
304 def pingcheck(self, timeout=config.checktimeout):304 def pingcheck(self, timeout=config.checktimeout):
305 """Check network connectivity using ping."""305 """Check network connectivity using ping.
306 self.logger.info('Sleeping {timeout} seconds'306
307 .format(timeout=timeout))307 :param timeout: Amount of time in seconds to sleep after a failure
308 time.sleep(timeout)308 :type timeout: int
309 :raises: UTAHProvisioningException
310
311 If there's a network connectivity failure, then sleep ``timeout``
312 seconds and raise a retriable exception.
313
314 .. seealso:: :func:`utah.retry.retry`, :meth:`pingpoll`
315
316 """
309 self.logger.info('Checking network connectivity (ping)')317 self.logger.info('Checking network connectivity (ping)')
310 returncode = \318 returncode = \
311 self._runargs(['ping', '-c1', '-w5', self.name])['returncode']319 self._runargs(['ping', '-c1', '-w5', self.name])['returncode']
312 if returncode != 0:320 if returncode != 0:
313 err = 'Ping returned {0}'.format(returncode)321 err = 'Ping returned {0}'.format(returncode)
322
323 if timeout > 0:
324 self.logger.info('Sleeping {timeout} seconds'
325 .format(timeout=timeout))
326 time.sleep(timeout)
327
314 raise UTAHProvisioningException(err, retry=True)328 raise UTAHProvisioningException(err, retry=True)
315329
316 def pingpoll(self,330 def pingpoll(self,
317331
=== modified file 'utah/provisioning/ssh.py'
--- utah/provisioning/ssh.py 2013-01-23 09:10:16 +0000
+++ utah/provisioning/ssh.py 2013-02-05 17:24:22 +0000
@@ -14,7 +14,8 @@
14# with this program. If not, see <http://www.gnu.org/licenses/>.14# with this program. If not, see <http://www.gnu.org/licenses/>.
1515
16"""16"""
17Provide a mixin class for machines with SSH support.17SSH based machine class for a provisioned system
18and SSHMixin for every machine class that needs SSH support.
18"""19"""
1920
20import logging21import logging
@@ -29,6 +30,7 @@
2930
30from utah import config31from utah import config
31from utah.provisioning.exceptions import UTAHProvisioningException32from utah.provisioning.exceptions import UTAHProvisioningException
33from utah.provisioning.provisioning import Machine
32from utah.retry import retry34from utah.retry import retry
3335
3436
@@ -40,6 +42,15 @@
40 # Note: Since this is a mixin it doesn't expect any argument42 # Note: Since this is a mixin it doesn't expect any argument
41 # However, it calls super to initialize any other mixins in the mro43 # However, it calls super to initialize any other mixins in the mro
42 super(SSHMixin, self).__init__(*args, **kwargs)44 super(SSHMixin, self).__init__(*args, **kwargs)
45 self.initialize()
46
47 def initialize(self):
48 """SSH mixin initialization
49
50 Use this method when it isn't appropriate to follow the MRO as in
51 __init__
52
53 """
43 ssh_client = paramiko.SSHClient()54 ssh_client = paramiko.SSHClient()
44 ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())55 ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
45 self.ssh_client = ssh_client56 self.ssh_client = ssh_client
@@ -223,20 +234,29 @@
223 super(SSHMixin, self).destroy(*args, **kw)234 super(SSHMixin, self).destroy(*args, **kw)
224235
225 def sshcheck(self, timeout=config.checktimeout):236 def sshcheck(self, timeout=config.checktimeout):
226 """237 """Check if the machine is available via ssh.
227 Sleep for a while and check if the machine is available via ssh.238
228 Return a retryable exception if it is not.239 :param timeout: Amount of time in seconds to sleep after a failure
229 Intended for use with retry.240 :type timeout: int
230 """241 :raises: UTAHProvisioningException
231 self.ssh_logger.info('Sleeping {timeout} seconds'242
232 .format(timeout=timeout))243 If there's a network connectivity failure, then sleep ``timeout``
233 time.sleep(timeout)244 seconds and raise a retriable exception.
245
246 .. seealso:: :func:`utah.retry.retry`, :meth:`sshpoll`
247
248 """
234 self.ssh_logger.info('Checking for ssh availability')249 self.ssh_logger.info('Checking for ssh availability')
235 try:250 try:
236 self.ssh_client.connect(self.name,251 self.ssh_client.connect(self.name,
237 username=config.user,252 username=config.user,
238 key_filename=config.sshprivatekey)253 key_filename=config.sshprivatekey)
239 except socket.error as err:254 except socket.error as err:
255 if timeout > 0:
256 self.ssh_logger.info('Sleeping {timeout} seconds'
257 .format(timeout=timeout))
258 time.sleep(timeout)
259
240 raise UTAHProvisioningException(str(err), retry=True)260 raise UTAHProvisioningException(str(err), retry=True)
241261
242 def sshpoll(self, timeout=None,262 def sshpoll(self, timeout=None,
@@ -260,3 +280,41 @@
260 if not self.active:280 if not self.active:
261 self._start()281 self._start()
262 self.sshcheck()282 self.sshcheck()
283
284
285class ProvisionedMachine(SSHMixin, Machine):
286 """A machine that is provisioned and can be accessed through ssh."""
287 def __init__(self, name, installtype=None):
288 SSHMixin.initialize(self)
289 self.name = name
290 self._loggersetup()
291
292 # No cleanup needed for systems that are already provisioned
293 self.clean = False
294
295 # System is expected to be available already, so there's no need to
296 # wait before trying to connect through ssh
297 self.check_timeout = 3
298 self.connectivity_timeout = 60
299
300 # TBD: Figure out install type by getting information through ssh
301 if installtype is None:
302 self.installtype = config.installtype
303
304 def activecheck(self):
305 """Check if machine is active.
306
307 Given that the machine is already provisioned, it's considered to be
308 active as long as it's reachable through ssh
309
310 """
311 try:
312 self.pingpoll(timeout=self.connectivity_timeout,
313 checktimeout=self.check_timeout)
314 except utah.timeout.UTAHTimeout:
315 # Ignore timeout for ping, since depending on the network
316 # configuration ssh might still work despite of the ping failure.
317 self.logger.warning('Network connectivity (ping) failure')
318
319 self.sshpoll(timeout=self.connectivity_timeout,
320 checktimeout=self.check_timeout)
263321
=== modified file 'utah/run.py'
--- utah/run.py 2013-01-23 12:26:38 +0000
+++ utah/run.py 2013-02-05 17:24:22 +0000
@@ -26,6 +26,7 @@
26from utah import config26from utah import config
27from utah.exceptions import UTAHException27from utah.exceptions import UTAHException
28from utah.url import url_argument28from utah.url import url_argument
29from utah.provisioning.ssh import ProvisionedMachine
2930
3031
31def common_arguments(parser):32def common_arguments(parser):
@@ -193,7 +194,10 @@
193 except Exception as err:194 except Exception as err:
194 logging.warning('Failed to download files: ' + str(err))195 logging.warning('Failed to download files: ' + str(err))
195196
196 if args.outputpreseed or config.outputpreseed:197 # Provisioned systems have an image already installed
198 # and the preseed file is no longer available
199 if (not isinstance(machine, ProvisionedMachine) and
200 (args.outputpreseed or config.outputpreseed)):
197 if args.outputpreseed:201 if args.outputpreseed:
198 logging.debug('Capturing preseed due to command line option')202 logging.debug('Capturing preseed due to command line option')
199 elif config.outputpreseed:203 elif config.outputpreseed:

Subscribers

People subscribed via source and target branches