Merge lp:~smoser/ubuntu-server-ec2-testing/fixes into lp:ubuntu-server-ec2-testing

Proposed by Scott Moser
Status: Merged
Merged at revision: 21
Proposed branch: lp:~smoser/ubuntu-server-ec2-testing/fixes
Merge into: lp:ubuntu-server-ec2-testing
Diff against target: 142 lines (+55/-10)
2 files modified
src/ubuntu/ec2/settings.py (+5/-0)
src/ubuntu/ec2/testing.py (+50/-10)
To merge this branch: bzr merge lp:~smoser/ubuntu-server-ec2-testing/fixes
Reviewer: James Page
Review status: Approve
Review via email: mp+79174@code.launchpad.net

Commit message

try harder to get terminated logs, actually terminate with boto 2.0

This adds the terminate_and_wait method to TestCaseExecutor. As seen at
[1], TidyUp would raise an exception and prevent the final retrieval of
logs. Without those final logs, there was no way to debug what went
wrong. We make tidyup use this and basically try harder to get logs
of the terminated instance.

In the "all is well" path, there should be no additional wait, as
terminate_and_wait will recognize that the instance is already terminated.
Also, collect_console will recognize that console output already
exists, so we won't overwrite it.

This also fixes a bug in the test suite when using boto 2.0. In natty's
version of boto, boto.ec2.instance.stop() would terminate an instance,
but that is obviously wrong. Both 2.0 and natty's version have a
'terminate' method, so we use that.

--
[1] https://jenkins.qa.ubuntu.com/job/oneiric-server-ec2/10/ARCH=i386,REGION=ap-southeast-1,STORAGE=instance-store,TEST=cloud-config,label=ubuntu-server-ec2-testing/console
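
The retry loop described in the commit message above can be sketched as a
standalone Python 3 function. This is a minimal sketch, not the
TestCaseExecutor method itself: the FakeInstance, FakeReservation and
FakeConnection classes and the injectable sleep argument are hypothetical
stand-ins so the loop can run without AWS. Only
get_all_instances(instance_ids=...) and terminate_instances(...) mirror the
boto calls used in this branch, and the default tries/naplen values are
copied from the new settings.

```python
import logging
import time


def terminate_and_wait(conn, instance_ids, tries=12, naplen=10, sleep=time.sleep):
    """Ask for termination until every instance reports 'terminated'.

    Returns the ids that were still not terminated when last checked
    (an empty list on success).
    """
    pending = list(instance_ids)
    for attempt in range(1, tries + 1):
        # get_all_instances returns reservations; flatten them to instances.
        reservations = conn.get_all_instances(instance_ids=pending)
        instances = [inst for res in reservations for inst in res.instances]
        pending = [inst.id for inst in instances if inst.state != "terminated"]
        if not pending:
            return []  # nothing left to do, so no sleep in the "all is well" path
        conn.terminate_instances(pending)
        logging.debug("Waiting on terminate-instances [%i/%i]: %s",
                      attempt, tries, pending)
        sleep(naplen)
    logging.warning("These instances did not terminate: %s", pending)
    return pending


# --- Hypothetical stand-ins for the boto objects, for illustration only ---
class FakeInstance:
    def __init__(self, iid, terminates_needed):
        self.id = iid
        self.terminates_needed = terminates_needed  # calls before it "dies"

    @property
    def state(self):
        return "terminated" if self.terminates_needed <= 0 else "running"


class FakeReservation:
    def __init__(self, instances):
        self.instances = instances


class FakeConnection:
    def __init__(self, instances):
        self._by_id = {inst.id: inst for inst in instances}
        self.terminate_calls = []

    def get_all_instances(self, instance_ids=None):
        return [FakeReservation([self._by_id[i] for i in instance_ids])]

    def terminate_instances(self, instance_ids):
        self.terminate_calls.append(list(instance_ids))
        for iid in instance_ids:
            self._by_id[iid].terminates_needed -= 1
```

With an instance that is already terminated, the function returns on the
first check without calling terminate_instances or sleeping, which matches
the "no additional wait" claim above.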

21. By Scott Moser

fix DEFAULT_SLEEP missing name

22. By Scott Moser

make uuid contain a notable prefix and timestamp

since uuid is used in security groups, keypairs and
other places, it would be nice to make it identifiable.

This way, if I'm looking at keypairs and security groups, I
can clean up ones from the testing suite without any harm.
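
A minimal sketch of the identifier format introduced in this revision,
assuming Python 3. The helper name make_test_id and its now argument are
hypothetical and exist only to make the example repeatable; the prefix, the
timestamp format and the 14-character random suffix follow the diff.

```python
import time
import uuid

UUID_PREFIX = "uec2"


def make_test_id(prefix=UUID_PREFIX, now=None):
    """Build '<prefix>-<YYYYmmdd-HHMM>-<14 hex chars>'."""
    if now is None:
        now = time.localtime()
    stamp = time.strftime("%Y%m%d-%H%M", now)
    # uuid4().hex is the UUID without dashes; keep the first 14 characters.
    suffix = uuid.uuid4().hex[:14]
    return "%s-%s-%s" % (prefix, stamp, suffix)


if __name__ == "__main__":
    print(make_test_id())
```

Because every keypair and security group created by the suite then starts
with the same prefix, leftovers can be selected with a simple
name.startswith("uec2-") filter.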

23. By Scott Moser

fix incorrect variable name in terminate_and_wait

24. By Scott Moser

terminate_and_wait: fix signature to have 'self'

25. By Scott Moser

collect all consoles in tidyup

This also adds a check in collect_console to avoid re-collecting
a console that was already collected. That has been added
because both tidyup and the normal path call collect_console
with 'terminated'.
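
The "already collected" check can be sketched as below, assuming Python 3.
This is a standalone illustration rather than the method from testing.py:
the function takes a hypothetical fetch_output callable in place of
instance.get_console_output(), and returns a (path, written) pair so the
caller can see whether anything was fetched.

```python
import os
import tempfile


def collect_console(working_dir, test_id, suffix, fetch_output):
    """Write console output once per (test_id, suffix); skip if present."""
    path = os.path.join(working_dir, "%s-%s.console.txt" % (test_id, suffix))
    if os.path.isfile(path):
        # A previous call already saved this console; do not overwrite it.
        return path, False
    output = fetch_output()
    with open(path, "w") as console_file:
        console_file.write(output)
    return path, True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        fetch = lambda: "console text"
        print(collect_console(workdir, "uec2-demo", "terminated", fetch)[1])
        print(collect_console(workdir, "uec2-demo", "terminated", fetch)[1])
```

Checking for the file before fetching means the second caller neither
contacts EC2 again nor replaces output captured while the instance was
still reachable.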

26. By Scott Moser

put terminate retries settings into settings

27. By Scott Moser

improve log message for 'already collected'

28. By Scott Moser

terminate instance with boto.ec2.instance.terminate()

apparently, with the version of boto that was in natty (1.9b-1ubuntu5)
boto.ec2.instance.stop() would terminate an instance.
In oneiric (2.0-0ubuntu1) that does a 'stop'.

Both versions have a 'boto.ec2.instance.terminate()' method, so
this change just uses that.

James Page (james-page) wrote :

Scott

All looks good to me - I've pulled this and the documentation MP in and added one other minor update - when EBS instances were being stopped they were doing weird internal calls on the instance to make that happen - probably related to the fact that stop actually terminated the instance.

I updated that call to use stop() instead.

I'll merge this in and push a new version to the PPA for oneiric.

Cheers

James

review: Approve

Preview Diff

=== modified file 'src/ubuntu/ec2/settings.py'
--- src/ubuntu/ec2/settings.py 2011-04-13 07:18:38 +0000
+++ src/ubuntu/ec2/settings.py 2011-10-13 16:25:20 +0000
@@ -32,6 +32,11 @@
 # Time to sleep before monitoring instances
 START_SLEEP = 30
 
+# Time to pause after call to TerminateInstances before checking
+# to see if instance has terminated.
+TERMINATE_SLEEP = 10
+TERMINATE_SLEEP_TRIES = 12
+
 # Valid combinations of regions and availability zones
 REGION_PLACEMENT_LOOKUP = {
     'eu-west-1' : [ 'eu-west-1a' , 'eu-west-1b' , 'eu-west-1c', None],

=== modified file 'src/ubuntu/ec2/testing.py'
--- src/ubuntu/ec2/testing.py 2011-04-26 08:57:28 +0000
+++ src/ubuntu/ec2/testing.py 2011-10-13 16:25:20 +0000
@@ -61,6 +61,7 @@
 UEC_IMAGES_ALL_URL = 'http://uec-images.ubuntu.com/query/%s/%s/daily.txt'
 LATEST_RELEASE = 'current'
 DEFAULT_TEST_DIR = os.path.expanduser('~/tests')
+UUID_PREFIX = "uec2"
 
 CI_SCRIPT = "text/x-shellscript"
 CI_SCRIPT_START = "#!"
@@ -230,7 +231,9 @@
         self.test_case = test_case
         self.access_key_id = aws_access_key_id
         self.secret_access_key = aws_secret_access_key
-        self.uuid = str(uuid.uuid4())
+        self.uuid = "%s-%s-%s" % ( UUID_PREFIX,
+            time.strftime("%Y%m%d-%H%M"),
+            str(uuid.uuid4()).replace("-","")[0:14])
         self.timeout = settings.DEFAULT_TIMEOUT
         # To start with check placement against region
         if (self.test_case.placement not in
@@ -310,7 +313,7 @@
         self.tidyup()
         if (l_timed_out):
             # Collect consoles from all instances
-            self.collect_all_consoles()
+            self.collect_all_consoles('timeout')
 
         return (success and not self.any_errors())
 
@@ -441,11 +444,11 @@
         logging.debug("Testing completed %s",test_error)
         return test_error
 
-    def collect_all_consoles(self):
+    def collect_all_consoles(self, suffix):
         ''' Collects consoles from all Test Case Instances marked with
             timeout '''
         for l_instance in self.test_instances:
-            l_instance.collect_console('timeout')
+            l_instance.collect_console(suffix)
 
     def collect_timeout_metadata(self):
         ''' Collect instance metadata via SSH if not complete and still
@@ -454,6 +457,35 @@
             if ((not l_instance.complete) and l_instance.sshable):
                 l_instance.collect_metadata('timeout')
 
+    def terminate_and_wait(self, tries=settings.TERMINATE_SLEEP_TRIES,
+                           naplen=settings.TERMINATE_SLEEP):
+        if not self.reservation:
+            return
+        # turn a result list into an instance list
+        def _resl2instl(result_set):
+            ret = []
+            for res in result_set:
+                for inst in res.instances:
+                    ret.append(inst)
+            return(ret)
+
+        iid_l = [l.instance.id for l in self.test_instances]
+        # check to see if there is work to do, if not, return
+        for tnum in range(1,tries+1):
+            res_l = self.conn.get_all_instances(instance_ids=iid_l)
+            instances = _resl2instl(res_l)
+            to_term = [ i.id for i in instances if i.state != "terminated" ]
+            if len(to_term) == 0:
+                return
+            self.conn.terminate_instances(to_term)
+            iid_l = to_term
+            logging.debug("Waiting on terminate-instances [%i/%i]: %s" %
+                          (tnum, tries, to_term))
+            time.sleep(naplen)
+
+        logging.warning("These instances did not terminate: %s" % to_term)
+        return
+
     def tidyup(self):
         ''' Tidyup any residual stuff left over '''
         # Remove the private key!!!!
@@ -470,8 +502,10 @@
         # terminated
         if self.reservation:
             logging.debug('Terminating all instances associated with test case')
-            self.reservation.stop_all()
+            self.terminate_and_wait()
 
+        self.collect_all_consoles('terminated')
+
         # Tidyup the keypair
         if self.keypair:
             logging.debug('Removing keypair associated with test case')
@@ -709,13 +743,19 @@
                               ktype, self.instance.id, host)
             self.keyscan_ok = True
 
-    def collect_console(self,suffix):
+    def collect_console(self,suffix,update=False):
         ''' Grabs the console for the instance in its current state '''
+        output_path = os.path.join(self.working_dir,
+                                   '%s-%s.console.txt'
+                                   % (self.tce.uuid, suffix))
+        if os.path.isfile(output_path):
+            logging.debug("console %s already collected for %s" %
+                          (self.instance.id, suffix))
+            return
+
         l_output = self.instance.get_console_output()
         logging.debug(l_output.output)
-        console_file = open(os.path.join(self.working_dir,
-                                         '%s-%s.console.txt')
-                            % (self.tce.uuid, suffix), 'w')
+        console_file = open(output_path, 'w')
         console_file.write(l_output.output)
         console_file.close()
 
@@ -813,7 +853,7 @@
         if (self.is_running()):
             self.collect_metadata('restarted')
         logging.debug('Terminating Instance %s', self.instance.id)
-        self.instance.stop()
+        self.instance.terminate()
         # Replace with JUnit Test
         self.validate_host_keys()
         self.sshable = False