Merge lp:~javier.collado/utah/provisioned_machine into lp:utah

Proposed by Javier Collado
Status: Merged
Approved by: Javier Collado
Approved revision: 821
Merged at revision: 811
Proposed branch: lp:~javier.collado/utah/provisioned_machine
Merge into: lp:utah
Diff against target: 235 lines (+117/-15)
4 files modified
examples/run_utah_tests.py (+27/-1)
utah/provisioning/provisioning.py (+18/-4)
utah/provisioning/ssh.py (+67/-9)
utah/run.py (+5/-1)
To merge this branch: bzr merge lp:~javier.collado/utah/provisioned_machine
Reviewer Review Type Date Requested Status
Max Brustkern (community) Approve
Javier Collado (community) Needs Resubmitting
Review via email: mp+145920@code.launchpad.net

Description of the change

This branch creates a new class named ProvisionedMachine class that takes care
of running tests on a provisioned machine (physical or virtual).

The way to run test cases in such a machine with the implementation in this
branch would be as follows:
PYTHONPATH=. ./examples/run_utah_tests.py --skip-provisioning --name
<machine_name> ./utah/client/examples/pass.run

I don't consider this merge request yet ready to be committed, but I'd like to
get some feedback with regard the following:
- Flag
What do you think about using --skip-provisioning flag? I believe there are too
many flags already in run_utah_tests.py, so if you any suggestion would be
welcomed.

- Inventory
For now, no inventory is used to get the machine object. However, one must be
used to prevent jenkins jobs trying to make use of the same hardware
simultaneously. What kind of inventory do you suggest? I see there's a separate
inventory for cobbler machines and for VMs so probably it makes sense to have
one for provisioned machines, but I think that it can be confusing to have so
many inventories. I know there's a plan to use PostgreSQL, but I'm thinking
about having all inventories in a common database in MongoDB. This way we could
use a different schema for the machines collection depending on how it's
expected to be used.

- Class hierarchy
The workaround I had to implement to skip the initialization code in the
Machine class probably means that some refactoring is needed to have a simpler
generic machine class from which ProvisionedMachine can inherit and another one
used by all the machine classes that need an image to install the system before
running any test. Do you think this refactoring would make sense?

To post a comment you must log in.
Revision history for this message
Max Brustkern (nuclearbob) wrote :

I'm going to comment on your questions before I actually look at the merge.

I think that flag is fine. I also think that script is overloaded with options, but I think at this point we're going to need a more thorough design overhaul to fix that, so adding another flag for now is probably the best thing to do.

I think at this point, an inventory that could handle physical machines, virtual machines, already installed machines, reinstalling existing machines, ARM boards, etc. is becoming a priority. It's something we've wanted for a long time. The PS team, in particular, would like to be able to request machines based on hardware info/tags rather than just always pulling one by name. I think that will help hardware testing scale much better as well. I've used SQL because that's what I know, but we do need different information for machines depending on whether they're virtual, physical, already installed, etc. If you want to put together something in Mongo, I'd enjoy looking at it, and I'm sure it'd be a good learning experience for me.

I think up until we did automatic ISO downloading, the base Machine class could run fine without an image, so I would be in favor of refactoring things so it can work that way again. If you want to prepare something for that, I'd be happy to test it, but if you're working on other things, that's something I can try to tackle at some point as well.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

Now for my thoughts on the code. I think since this class requires SSH, it would make sense to put it in the same file as SSHMixin. Right now that's ssh.py, but I'm not sure if we should have that in a separate file or if it should be moved back into provisioning.py. It was moved out for modularity, but we're still not using any machine that isn't SSH-based, so maybe it should just live in the main provisioning file. Either way, I'd be in favor of ProvisionedMachine living in the same file as SSHMixin.

Have you done any testing without your __del__ method in place? If the cleanup functions are written correctly, then if we haven't defined any cleanup actions for an instance of the class, nothing will happen when we call the cleanup functions, and we shouldn't need to skip them. If they are causing problems when we don't have any items to be cleaned up, I think that's probably something that should be fixed there.

815. By Javier Collado

Moved ProvisionedMachine class to utah.provisioning.ssh

Aside from this, timeout values have been set to more sensible values when
network connectivity fails.

816. By Javier Collado

Updated {ssh,ping}check methods to sleep only after a failure.

In the old implementation the sleep took place before any connection attempt
which was good when failures happened, but wasn't so good on successes because
of all the time wasted waiting for a check that finally succeeded. To fix this,
both methods now run the check right away and sleep only if they fail to
guarantee that the retry attempt doesn't happen immediately.

Besides this, the documentation strings for both methods have been updated to
describe the new behavior.

817. By Javier Collado

Added clean attribute and removed __del__ method implementation

Revision history for this message
Javier Collado (javier.collado) wrote :

Max, I've applied the following changes based on your comments:

- ProvisionedMachine class implementation has been moved to ssh module as
  suggested. provisioning.py is already quite large, so I prefer to do that
  rather than moving ssh mixin back.

- self.clean instance attribute has been added to the ProvisionedMachine
  implementation to fix the problem that I had when calling the cleanup
  methods. Indeed, there was no need to overwrite __del__ implementation.

Besides this, I've changed a little bit the timeout behavior of {ping,ssh}check
methods (have a look at rev. 816 commit message or the updated documentation in
the code).

Revision history for this message
Max Brustkern (nuclearbob) wrote :

This looks reasonable to me. I'm going to check it out and run a test on physical hardware.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

So when I tested this on physical hardware I got this:

Traceback (most recent call last):
  File "./examples/run_utah_tests.py", line 115, in <module>
    run_utah_tests()
  File "./examples/run_utah_tests.py", line 101, in run_utah_tests
    function(args=args)
  File "./examples/run_utah_tests.py", line 90, in run_provisioned_tests
    run_tests(args, machine)
  File "/home/max/provisioned_machine/utah/run.py", line 203, in run_tests
    shutil.copyfile(machine.finalpreseed, preseedfile)
AttributeError: 'ProvisionedMachine' object has no attribute 'finalpreseed'

I made it work with this patch:

=== modified file 'utah/run.py'
--- utah/run.py 2013-01-23 12:26:38 +0000
+++ utah/run.py 2013-02-01 15:43:56 +0000
@@ -193,7 +193,8 @@
             except Exception as err:
                 logging.warning('Failed to download files: ' + str(err))

- if args.outputpreseed or config.outputpreseed:
+ if ((args.outputpreseed or config.outputpreseed) and
+ hasattr(machine, 'finalpreseed')):
         if args.outputpreseed:
             logging.debug('Capturing preseed due to command line option')
         elif config.outputpreseed:

But everywhere I read about hasattr seems to indicate we may as well just try to use it and catch an AttributeError, so maybe that's more pythonic.

Also, it runs, and copies the test log, but it doesn't output the copied test log the way a normal run does.

Finally, it installs the client every time. This is really a design artifact of the main function in run.py assuming a new run is always happening, and it won't really hurt anything (it does make sure we always have the latest version of the client) but I wonder if at some point maybe installing the client should be considered part of provisioning the system. Maybe not, since this method does ensure we always have the latest client even if a machine was installed a while ago.

review: Needs Fixing
Revision history for this message
Javier Collado (javier.collado) wrote :

Max, I haven't been able to reproduce the problem and I guess it's because
we're using different command lines. What I've been using is:

PYTHONPATH=. ./examples/run_utah_tests.py --skip-provisioning --name <name>
./utah/client/examples/pass.run -d

I've tried with a dell mini9 and with a nexus 7 and it worked fine on both.

Regarding the log behavior, could you elaborate on this? I believe what you're
seeing is the behavior we have after the latest logging changes. If we want to
get the old behavior, then we probably need to set the console log level to
logging.INFO in the configuration file.

Regarding the client installation, probably this could to be improved to update
the package only when needed, but I think keeping the client in sync is
important. I also needed to install `gdebi` in my tests, but I think we can
ignore this, since the idea is to reuse systems provisioned by UTAH not
manually as I did in my tests.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

I've got some long logs, so I'm going to send them in an email.

818. By Javier Collado

Added check to writing the preseed to a file for provisioned systems

Revision history for this message
Javier Collado (javier.collado) wrote :

Max, thanks for the logs.

I finally realized that I wasn't seeing the preseed problem because of a
configuration option in the system you're using that isn't enabled by default.
Anyway, using --outputpreseed I got to the same result.

Regarding how to fix the problem, in my opinion we shouldn't catch exceptions
for something that can be checked upfront, but also using hasattr is not a very
good solution because it doesn't commnicate much about the design. The change
I've pushed checks if machine is a ProvisionedMachine object which I feel makes
clearer what kind of machine isn't expected to have a preseed file available.
If you disagree, then we can talk about alternative solutions.

I'm looking now at the problem regarding the log output.

819. By Javier Collado

Added statement to print log file names to stdout

The log file names are needed by external processes (for example a jenkins job)
to gather them in a single location.

Revision history for this message
Javier Collado (javier.collado) wrote :

I've added a try/except/finally block similar to the one in other run_* scripts
and a print statement to make sure log file names are printed as expected.

With this change, both issues should have been addressed properly. Please let
me know if something else is missing.

review: Needs Resubmitting
Revision history for this message
Max Brustkern (nuclearbob) wrote :

This just reminds me that we could use more unit testing for all the options we have at this point. This is what I get now:

jenkins@magners-orchestra:/home/max/provisioned_machine$ PYTHONPATH=. ./examples/run_utah_tests.py --skip-provisioning --name acer-veriton-03 /usr/share/utah/client/examples/pass.run
Running on machine: acer-veriton-03
Traceback (most recent call last):
  File "./examples/run_utah_tests.py", line 126, in <module>
    run_utah_tests()
  File "./examples/run_utah_tests.py", line 112, in run_utah_tests
    function(args=args)
  File "./examples/run_utah_tests.py", line 97, in run_provisioned_tests
    if len(locallogs) != 0:
UnboundLocalError: local variable 'locallogs' referenced before assignment

I think we probably just need to define locallogs (and maybe exitstatus) before we try to set them to the return values on run_test, in case run_test has an exception.

820. By Javier Collado

Added default values for locallogs and exitstatus

Revision history for this message
Javier Collado (javier.collado) wrote :

Thanks for detecting that problem. It should be fixed in the latest commit.
Yes, I'd like to have more unit tests everywhere.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

I got this working using this change:

- args.outputpreseed or config.outputpreseed):
+ (args.outputpreseed or config.outputpreseed)):

Without that, if config.outputpreseed is True, that's good enough for the if statement to be True, and we get the same AttributeError as before.

review: Needs Fixing
821. By Javier Collado

Fixed logic as suggested by Max

Revision history for this message
Javier Collado (javier.collado) wrote :

Thanks again Max. This merge certainly required to be reviewed.

review: Needs Resubmitting
Revision history for this message
Max Brustkern (nuclearbob) wrote :

Could you tell me why line 232 is indented the way it is? It doesn't actually matter, I'm just curious so I can be sure I'm doing the right thing in the future.

Other than that, it looks good. The only reason I found most of these was that I was testing in the lab, so I guess we should continue to test in the lab, since that represents a common configuration that we need to support.

review: Approve
Revision history for this message
Javier Collado (javier.collado) wrote :

To explain the indentation of line 232, let me paste the output of pep8 when
the code is aligned as usual:

if (not isinstance(machine, ProvisionedMachine) and
    (args.outputpreseed or config.outputpreseed)):
    if args.outputpreseed:

----------------------------------
$ pep8 --show-pep8 utah/run.py
utah/run.py:200:9: E125 continuation line does not distinguish itself from next logical line
    Continuation lines should align wrapped elements either vertically using
    Python's implicit line joining inside parentheses, brackets and braces, or
    using a hanging indent.

    When using a hanging indent the following considerations should be applied:

    - there should be no arguments on the first line, and

    - further indentation should be used to clearly distinguish itself as a
      continuation line.
----------------------------------

In particular, the problem the extra indentation in the merge request fixes is
the one explained in the last bullet. Before using pep8, I would have used the
version above, but now I find it makes sense to make visually clear the
separation between the if condition and the body.

Revision history for this message
Max Brustkern (nuclearbob) wrote :

That makes sense, I wasn't looking at the line below. I'll remember that in the future, thanks.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'examples/run_utah_tests.py'
2--- examples/run_utah_tests.py 2013-01-23 12:26:38 +0000
3+++ examples/run_utah_tests.py 2013-02-05 17:24:22 +0000
4@@ -26,10 +26,13 @@
5 file_arguments,
6 name_argument,
7 virtual_arguments,
8- configure_logging
9+ configure_logging,
10+ run_tests
11 )
12 from utah.timeout import timeout, UTAHTimeout
13 from run_install_test import run_install_test
14+from utah.provisioning.ssh import ProvisionedMachine
15+from utah.exceptions import UTAHException
16
17
18 def get_parser():
19@@ -50,6 +53,9 @@
20 help='Type of machine to provision (%(choices)s)')
21 parser.add_argument('-v', '--variant',
22 help='Variant of architecture, i.e., armel, armhf')
23+ parser.add_argument('--skip-provisioning', action='store_true',
24+ help=('Reuse a system that is already provisioned '
25+ '(name argument must be passed)'))
26 parser = common_arguments(parser)
27 parser = custom_arguments(parser)
28 parser = file_arguments(parser)
29@@ -76,6 +82,26 @@
30 # Default is now CustomVM
31 function = run_install_test
32
33+ if args.skip_provisioning:
34+ def run_provisioned_tests(args):
35+ """Run test cases in a provisioned machine."""
36+ locallogs = []
37+ exitstatus = 0
38+ try:
39+ # TBD: Inventory should be used to verify machine
40+ # is not running other tests
41+ machine = ProvisionedMachine(name=args.name)
42+ exitstatus, locallogs = run_tests(args, machine)
43+ except UTAHException as error:
44+ sys.stderr.write('Exception: ' + str(error))
45+ exitstatus = 2
46+ finally:
47+ if len(locallogs) != 0:
48+ print('Test logs copied to the following files:')
49+ print("\t" + "\n\t".join(locallogs))
50+ sys.exit(exitstatus)
51+
52+ function = run_provisioned_tests
53 if args.arch is not None and 'arm' in args.arch:
54 # If arch is arm, use BambooFeederMachine
55 from run_test_bamboo_feeder import run_test_bamboo_feeder
56
57=== modified file 'utah/provisioning/provisioning.py'
58--- utah/provisioning/provisioning.py 2013-01-24 16:31:58 +0000
59+++ utah/provisioning/provisioning.py 2013-02-05 17:24:22 +0000
60@@ -302,15 +302,29 @@
61 self._start()
62
63 def pingcheck(self, timeout=config.checktimeout):
64- """Check network connectivity using ping."""
65- self.logger.info('Sleeping {timeout} seconds'
66- .format(timeout=timeout))
67- time.sleep(timeout)
68+ """Check network connectivity using ping.
69+
70+ :param timeout: Amount of time in seconds to sleep after a failure
71+ :type timeout: int
72+ :raises: UTAHProvisioningException
73+
74+ If there's a network connectivity failure, then sleep ``timeout``
75+ seconds and raise a retriable exception.
76+
77+ .. seealso:: :func:`utah.retry.retry`, :meth:`pingpoll`
78+
79+ """
80 self.logger.info('Checking network connectivity (ping)')
81 returncode = \
82 self._runargs(['ping', '-c1', '-w5', self.name])['returncode']
83 if returncode != 0:
84 err = 'Ping returned {0}'.format(returncode)
85+
86+ if timeout > 0:
87+ self.logger.info('Sleeping {timeout} seconds'
88+ .format(timeout=timeout))
89+ time.sleep(timeout)
90+
91 raise UTAHProvisioningException(err, retry=True)
92
93 def pingpoll(self,
94
95=== modified file 'utah/provisioning/ssh.py'
96--- utah/provisioning/ssh.py 2013-01-23 09:10:16 +0000
97+++ utah/provisioning/ssh.py 2013-02-05 17:24:22 +0000
98@@ -14,7 +14,8 @@
99 # with this program. If not, see <http://www.gnu.org/licenses/>.
100
101 """
102-Provide a mixin class for machines with SSH support.
103+SSH based machine class for a provisioned system
104+and SSHMixin for every machine class that needs SSH support.
105 """
106
107 import logging
108@@ -29,6 +30,7 @@
109
110 from utah import config
111 from utah.provisioning.exceptions import UTAHProvisioningException
112+from utah.provisioning.provisioning import Machine
113 from utah.retry import retry
114
115
116@@ -40,6 +42,15 @@
117 # Note: Since this is a mixin it doesn't expect any argument
118 # However, it calls super to initialize any other mixins in the mro
119 super(SSHMixin, self).__init__(*args, **kwargs)
120+ self.initialize()
121+
122+ def initialize(self):
123+ """SSH mixin initialization
124+
125+ Use this method when it isn't appropriate to follow the MRO as in
126+ __init__
127+
128+ """
129 ssh_client = paramiko.SSHClient()
130 ssh_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
131 self.ssh_client = ssh_client
132@@ -223,20 +234,29 @@
133 super(SSHMixin, self).destroy(*args, **kw)
134
135 def sshcheck(self, timeout=config.checktimeout):
136- """
137- Sleep for a while and check if the machine is available via ssh.
138- Return a retryable exception if it is not.
139- Intended for use with retry.
140- """
141- self.ssh_logger.info('Sleeping {timeout} seconds'
142- .format(timeout=timeout))
143- time.sleep(timeout)
144+ """Check if the machine is available via ssh.
145+
146+ :param timeout: Amount of time in seconds to sleep after a failure
147+ :type timeout: int
148+ :raises: UTAHProvisioningException
149+
150+ If there's a network connectivity failure, then sleep ``timeout``
151+ seconds and raise a retriable exception.
152+
153+ .. seealso:: :func:`utah.retry.retry`, :meth:`sshpoll`
154+
155+ """
156 self.ssh_logger.info('Checking for ssh availability')
157 try:
158 self.ssh_client.connect(self.name,
159 username=config.user,
160 key_filename=config.sshprivatekey)
161 except socket.error as err:
162+ if timeout > 0:
163+ self.ssh_logger.info('Sleeping {timeout} seconds'
164+ .format(timeout=timeout))
165+ time.sleep(timeout)
166+
167 raise UTAHProvisioningException(str(err), retry=True)
168
169 def sshpoll(self, timeout=None,
170@@ -260,3 +280,41 @@
171 if not self.active:
172 self._start()
173 self.sshcheck()
174+
175+
176+class ProvisionedMachine(SSHMixin, Machine):
177+ """A machine that is provisioned and can be accessed through ssh."""
178+ def __init__(self, name, installtype=None):
179+ SSHMixin.initialize(self)
180+ self.name = name
181+ self._loggersetup()
182+
183+ # No cleanup needed for systems that are already provisioned
184+ self.clean = False
185+
186+ # System is expected to be available already, so there's no need to
187+ # wait before trying to connect through ssh
188+ self.check_timeout = 3
189+ self.connectivity_timeout = 60
190+
191+ # TBD: Figure out install type by getting information through ssh
192+ if installtype is None:
193+ self.installtype = config.installtype
194+
195+ def activecheck(self):
196+ """Check if machine is active.
197+
198+ Given that the machine is already provisioned, it's considered to be
199+ active as long as it's reachable through ssh
200+
201+ """
202+ try:
203+ self.pingpoll(timeout=self.connectivity_timeout,
204+ checktimeout=self.check_timeout)
205+ except utah.timeout.UTAHTimeout:
206+ # Ignore timeout for ping, since depending on the network
207+ # configuration ssh might still work despite of the ping failure.
208+ self.logger.warning('Network connectivity (ping) failure')
209+
210+ self.sshpoll(timeout=self.connectivity_timeout,
211+ checktimeout=self.check_timeout)
212
213=== modified file 'utah/run.py'
214--- utah/run.py 2013-01-23 12:26:38 +0000
215+++ utah/run.py 2013-02-05 17:24:22 +0000
216@@ -26,6 +26,7 @@
217 from utah import config
218 from utah.exceptions import UTAHException
219 from utah.url import url_argument
220+from utah.provisioning.ssh import ProvisionedMachine
221
222
223 def common_arguments(parser):
224@@ -193,7 +194,10 @@
225 except Exception as err:
226 logging.warning('Failed to download files: ' + str(err))
227
228- if args.outputpreseed or config.outputpreseed:
229+ # Provisioned systems have an image already installed
230+ # and the preseed file is no longer available
231+ if (not isinstance(machine, ProvisionedMachine) and
232+ (args.outputpreseed or config.outputpreseed)):
233 if args.outputpreseed:
234 logging.debug('Capturing preseed due to command line option')
235 elif config.outputpreseed:

Subscribers

People subscribed via source and target branches