Merge lp:~smoser/ubuntu-server-ec2-testing/fixes into lp:ubuntu-server-ec2-testing

Proposed by Scott Moser
Status: Merged
Merged at revision: 21
Proposed branch: lp:~smoser/ubuntu-server-ec2-testing/fixes
Merge into: lp:ubuntu-server-ec2-testing
Diff against target: 142 lines (+55/-10)
2 files modified
src/ubuntu/ec2/settings.py (+5/-0)
src/ubuntu/ec2/testing.py (+50/-10)
To merge this branch: bzr merge lp:~smoser/ubuntu-server-ec2-testing/fixes
Reviewer Review Type Date Requested Status
James Page Approve
Review via email: mp+79174@code.launchpad.net

Commit message

try harder to get terminated logs, actually terminate with boto 2.0

This adds the terminate_and_wait method to TestCaseExecutor. As seen at
[1], TidyUp would raise an exception and prevent the final retrieval of
logs. Without those final logs, there was no way to debug what went
wrong. We make tidyup use this and basically try harder to get logs
of the terminated instance.

In the "all is well" path, there should be no additional wait, as
terminate_and_wait will recognize that the instance is already terminated.
Also, collect-console will recognize that there already exists console
output, so we won't overwrite it.

This also fixes a bug in the test suite when using boto 2.0. In Natty's
version of boto, boto.ec2.instance.stop() would terminate an instance,
which is clearly wrong. Both 2.0 and Natty's version have a
'terminate' method, so we use that.

--
[1] https://jenkins.qa.ubuntu.com/job/oneiric-server-ec2/10/ARCH=i386,REGION=ap-southeast-1,STORAGE=instance-store,TEST=cloud-config,label=ubuntu-server-ec2-testing/console
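
The retry behaviour described in the commit message can be sketched as
below. This is a minimal, self-contained illustration under stated
assumptions, not the code from the branch: list_states and terminate are
hypothetical callables standing in for the boto connection, and the
defaults of 12 tries and 10 seconds mirror TERMINATE_SLEEP_TRIES and
TERMINATE_SLEEP from settings.py.

```python
import logging
import time


def terminate_and_wait(list_states, terminate, instance_ids,
                       tries=12, naplen=10, sleep=time.sleep):
    """Terminate instances and poll until all report 'terminated'.

    list_states(ids) -> {id: state} and terminate(ids) are hypothetical
    stand-ins for the EC2 connection calls. Returns the ids that never
    reached 'terminated' (an empty list on success).
    """
    pending = list(instance_ids)
    for attempt in range(1, tries + 1):
        states = list_states(pending)
        pending = [i for i in pending if states.get(i) != "terminated"]
        if not pending:
            # "All is well" path: nothing left to do, so no sleep.
            return []
        terminate(pending)
        logging.debug("Waiting on terminate [%i/%i]: %s",
                      attempt, tries, pending)
        sleep(naplen)
    logging.warning("These instances did not terminate: %s", pending)
    return pending
```

If every instance is already terminated on the first check, the function
returns without sleeping, which matches the "no additional wait" claim
above.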

21. By Scott Moser

fix DEFAULT_SLEEP missing name

22. By Scott Moser

make uuid contain a notable prefix and timestamp

since uuid is used in security groups and keypairs and
other places, it would be nice to make it identifiable.

This way, if I'm looking at keypairs and security groups, I
can clean up ones from the testing suite without any harm.
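
A rough sketch of the identifier format this revision describes:
prefix, minute-resolution timestamp, then 14 hex characters of a random
UUID. The helper names make_test_uuid and is_test_resource are
hypothetical and only illustrate how such names could be recognised
during cleanup.

```python
import time
import uuid

UUID_PREFIX = "uec2"


def make_test_uuid(prefix=UUID_PREFIX, now=None):
    """Build an id like 'uec2-20111013-1625-<14 hex chars>'."""
    if now is None:
        now = time.localtime()
    stamp = time.strftime("%Y%m%d-%H%M", now)
    # uuid4().hex is the UUID without dashes; keep the first 14 chars.
    rand = uuid.uuid4().hex[:14]
    return "%s-%s-%s" % (prefix, stamp, rand)


def is_test_resource(name, prefix=UUID_PREFIX):
    """True if a keypair or security group name came from the suite."""
    return name.startswith(prefix + "-")
```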

23. By Scott Moser

fix incorrect variable name in terminate_and_wait

24. By Scott Moser

terminate_and_wait: fix signature to have 'self'

25. By Scott Moser

collect all consoles in tidyup

This also adds a check in collect_console to avoid re-collecting
a console that was already collected. That has been added
because tidyup and the normal path will both call collect_console
with 'terminated'.
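
The "already collected" check amounts to skipping the fetch when the
output file exists. A minimal sketch, assuming a hypothetical
fetch_output callable in place of the instance's console call and a
standalone function rather than the method in testing.py:

```python
import os


def collect_console_once(working_dir, test_uuid, suffix, fetch_output):
    """Write console output unless a file for this suffix already exists.

    fetch_output is a hypothetical callable returning the console text.
    Returns True if a file was written, False if collection was skipped.
    """
    output_path = os.path.join(working_dir,
                               "%s-%s.console.txt" % (test_uuid, suffix))
    if os.path.isfile(output_path):
        # Already collected for this suffix: do not overwrite.
        return False
    with open(output_path, "w") as console_file:
        console_file.write(fetch_output())
    return True
```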

26. By Scott Moser

put terminate retries settings into settings

27. By Scott Moser

improve log message for 'already collected'

28. By Scott Moser

terminate instance with boto.ec2.instance.terminate()

apparently, with the version of boto that was in natty (1.9b-1ubuntu5)
boto.ec2.instance.stop() would terminate an instance.
In oneiric (2.0-0ubuntu1) that does a 'stop'.

Both versions have a 'boto.ec2.instance.terminate()' method, so
this change just uses that.

James Page (james-page) wrote :

Scott

All looks good to me - I've pulled this and the documentation MP in and added one other minor update - when EBS instances were being stopped they were doing weird internal calls on the instance to make that happen - probably related to the fact that stop actually terminated the instance.

I updated that call to use stop() instead.

I'll merge this in and push a new version to the PPA for oneiric.

Cheers

James

review: Approve

Preview Diff

=== modified file 'src/ubuntu/ec2/settings.py'
--- src/ubuntu/ec2/settings.py 2011-04-13 07:18:38 +0000
+++ src/ubuntu/ec2/settings.py 2011-10-13 16:25:20 +0000
@@ -32,6 +32,11 @@
 # Time to sleep before monitoring instances
 START_SLEEP = 30

+# Time to pause after call to TerminateInstances before checking
+# to see if instance has terminated.
+TERMINATE_SLEEP = 10
+TERMINATE_SLEEP_TRIES = 12
+
 # Valid combinations of regions and availability zones
 REGION_PLACEMENT_LOOKUP = {
     'eu-west-1' : [ 'eu-west-1a' , 'eu-west-1b' , 'eu-west-1c', None],

=== modified file 'src/ubuntu/ec2/testing.py'
--- src/ubuntu/ec2/testing.py 2011-04-26 08:57:28 +0000
+++ src/ubuntu/ec2/testing.py 2011-10-13 16:25:20 +0000
@@ -61,6 +61,7 @@
 UEC_IMAGES_ALL_URL = 'http://uec-images.ubuntu.com/query/%s/%s/daily.txt'
 LATEST_RELEASE = 'current'
 DEFAULT_TEST_DIR = os.path.expanduser('~/tests')
+UUID_PREFIX = "uec2"

 CI_SCRIPT = "text/x-shellscript"
 CI_SCRIPT_START = "#!"
@@ -230,7 +231,9 @@
         self.test_case = test_case
         self.access_key_id = aws_access_key_id
         self.secret_access_key = aws_secret_access_key
-        self.uuid = str(uuid.uuid4())
+        self.uuid = "%s-%s-%s" % ( UUID_PREFIX,
+            time.strftime("%Y%m%d-%H%M"),
+            str(uuid.uuid4()).replace("-","")[0:14])
         self.timeout = settings.DEFAULT_TIMEOUT
         # To start with check placement against region
         if (self.test_case.placement not in
@@ -310,7 +313,7 @@
         self.tidyup()
         if (l_timed_out):
             # Collect consoles from all instances
-            self.collect_all_consoles()
+            self.collect_all_consoles('timeout')

         return (success and not self.any_errors())

@@ -441,11 +444,11 @@
         logging.debug("Testing completed %s",test_error)
         return test_error

-    def collect_all_consoles(self):
+    def collect_all_consoles(self, suffix):
         ''' Collects consoles from all Test Case Instances marked with
         timeout '''
         for l_instance in self.test_instances:
-            l_instance.collect_console('timeout')
+            l_instance.collect_console(suffix)

     def collect_timeout_metadata(self):
         ''' Collect instance metadata via SSH if not complete and still
@@ -454,6 +457,35 @@
             if ((not l_instance.complete) and l_instance.sshable):
                 l_instance.collect_metadata('timeout')

+    def terminate_and_wait(self, tries=settings.TERMINATE_SLEEP_TRIES,
+                           naplen=settings.TERMINATE_SLEEP):
+        if not self.reservation:
+            return
+        # turn a result list into an instance list
+        def _resl2instl(result_set):
+            ret = []
+            for res in result_set:
+                for inst in res.instances:
+                    ret.append(inst)
+            return(ret)
+
+        iid_l = [l.instance.id for l in self.test_instances]
+        # check to see if there is work to do, if not, return
+        for tnum in range(1,tries+1):
+            res_l = self.conn.get_all_instances(instance_ids=iid_l)
+            instances = _resl2instl(res_l)
+            to_term = [ i.id for i in instances if i.state != "terminated" ]
+            if len(to_term) == 0:
+                return
+            self.conn.terminate_instances(to_term)
+            iid_l = to_term
+            logging.debug("Waiting on terminate-instances [%i/%i]: %s" %
+                          (tnum, tries, to_term))
+            time.sleep(naplen)
+
+        logging.warning("These instances did not terminate: %s" % to_term)
+        return
+
     def tidyup(self):
         ''' Tidyup any residual stuff left over '''
         # Remove the private key!!!!
@@ -470,8 +502,10 @@
         # terminated
         if self.reservation:
             logging.debug('Terminating all instances associated with test case')
-            self.reservation.stop_all()
+            self.terminate_and_wait()

+        self.collect_all_consoles('terminated')
+
         # Tidyup the keypair
         if self.keypair:
             logging.debug('Removing keypair associated with test case')
@@ -709,13 +743,19 @@
                 ktype, self.instance.id, host)
         self.keyscan_ok = True

-    def collect_console(self,suffix):
+    def collect_console(self,suffix,update=False):
         ''' Grabs the console for the instance in its current state '''
+        output_path = os.path.join(self.working_dir,
+                                   '%s-%s.console.txt'
+                                   % (self.tce.uuid, suffix))
+        if os.path.isfile(output_path):
+            logging.debug("console %s already collected for %s" %
+                          (self.instance.id, suffix))
+            return
+
         l_output = self.instance.get_console_output()
         logging.debug(l_output.output)
-        console_file = open(os.path.join(self.working_dir,
-                                         '%s-%s.console.txt')
-                            % (self.tce.uuid, suffix), 'w')
+        console_file = open(output_path, 'w')
         console_file.write(l_output.output)
         console_file.close()

@@ -813,7 +853,7 @@
         if (self.is_running()):
             self.collect_metadata('restarted')
         logging.debug('Terminating Instance %s', self.instance.id)
-        self.instance.stop()
+        self.instance.terminate()
         # Replace with JUnit Test
         self.validate_host_keys()
         self.sshable = False
