Merge ~powersj/cloud-init:cii-restructure-ssh into cloud-init:master

Proposed by Joshua Powers
Status: Merged
Approved by: Chad Smith
Approved revision: bfbc0ea14deffa7239b22d9c43f02c47d878fdfc
Merge reported by: Chad Smith
Merged at revision: 589b542bfb3b6630b931a506ca017635059cef1d
Proposed branch: ~powersj/cloud-init:cii-restructure-ssh
Merge into: cloud-init:master
Diff against target: 95 lines (+30/-11)
2 files modified
tests/cloud_tests/collect.py (+1/-1)
tests/cloud_tests/platforms/instances.py (+29/-10)
Reviewer              Review Type               Date Requested   Status
Chad Smith                                                       Approve
Server Team CI bot    continuous-integration                     Approve
Review via email: mp+342010@code.launchpad.net

Commit message

tests: restructure SSH and initial connections

The SSH function kept retrying and waiting for over an hour
when an SSH connection could not be established. This reduces
the number of retries and the time between retries to prevent
tests from running for hours.
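
As a rough illustration of the bounded-retry idea described above, here is a minimal sketch. The helper name `connect_with_retries` and its parameters are hypothetical and not part of cloud-init; the defaults of 3 attempts and 3 seconds mirror the values in this branch's diff. Unlike the branch's loop, this sketch skips the sleep after the final failed attempt.

```python
import time


def connect_with_retries(connect, retries=3, delay=3.0, sleep=time.sleep):
    """Call connect() up to `retries` times, pausing `delay` seconds
    between failed attempts. (Hypothetical helper, for illustration.)"""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except OSError as err:  # covers refused and reset connections
            last_error = err
            if attempt < retries:
                sleep(delay)
    # All attempts failed: surface the last error as the cause.
    raise ConnectionError('failed after %d attempts' % retries) from last_error
```

With these defaults the time spent sleeping is bounded by 2 x 3 seconds, plus whatever each connection attempt itself takes.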

Also restructures how waiting for the system works: the test
harness now attempts to SSH until the boot timeout is reached,
catching SSH connection failures and retrying. If the timeout
is reached, an exception is raised to abort the test.
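
The deadline-based wait described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in this branch: `wait_until_ready`, `NotReachableError`, and the injectable `clock` and `sleep` parameters are hypothetical names chosen so the loop can be exercised without a real instance.

```python
import time


class NotReachableError(Exception):
    """Hypothetical stand-in for the test suite's platform error."""


def wait_until_ready(check, timeout, interval=3.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True or `timeout` seconds elapse.

    A connection failure (OSError) is treated as "not ready yet".
    """
    deadline = clock() + timeout
    while True:
        try:
            if check():
                return
        except OSError:
            pass  # connection failed; retry until the deadline
        if clock() >= deadline:
            raise NotReachableError(
                'after %ss instance is not reachable' % timeout)
        sleep(interval)
```

The key design point is that a failed connection no longer ends the wait early; only the overall deadline does.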

Drive by - this also fixes printing of the instance name when
collecting the console log, rather than showing a Python object
address.
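
To show why the old log line printed an object address, here is a small sketch. The `Instance` class below is a hypothetical stand-in with only a `name` attribute, and the instance name is made up for the example; the real class lives in tests/cloud_tests/platforms/instances.py.

```python
class Instance:
    """Hypothetical minimal stand-in; defines no __str__ or __repr__."""

    def __init__(self, name):
        self.name = name


instance = Instance('example-instance')

# Formatting the object itself falls back to the default repr,
# which includes the class name and a memory address.
before = 'getting console log for %s to %s' % (instance, 'console.log')

# Formatting the name attribute yields a readable message.
after = 'getting console log for %s to %s' % (instance.name, 'console.log')
```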

Fixes LP: #1758409

Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:7a6f7eaadb29b81a92a005a2652f3b160f64c2b1
https://jenkins.ubuntu.com/server/job/cloud-init-ci/923/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/923/rebuild

review: Approve (continuous-integration)
Scott Moser (smoser) :
Ryan Harper (raharper) :
Joshua Powers (powersj) :
Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:85743a40c7763aa47002e0e9d1c22eff60be8645
https://jenkins.ubuntu.com/server/job/cloud-init-ci/950/
Executed test runs:
    SUCCESS: Checkout
    FAILED: Unit & Style Tests

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/950/rebuild

review: Needs Fixing (continuous-integration)
Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:f67f04b49c0180d344f61a0191ad2e1aeff45738
https://jenkins.ubuntu.com/server/job/cloud-init-ci/952/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/952/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Couple nits on this approach Josh, then I think it looks good.

Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:1fdf14ea023a095f4de928e3662c655509fe3ced
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1126/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    FAILED: Ubuntu LTS: Integration

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1126/rebuild

review: Needs Fixing (continuous-integration)
Server Team CI bot (server-team-bot) wrote :

FAILED: Continuous integration, rev:2825748072db0f1c03dba1db5a2995eeda8d2dc5
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1127/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    FAILED: Ubuntu LTS: Integration

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1127/rebuild

review: Needs Fixing (continuous-integration)
Server Team CI bot (server-team-bot) wrote :

PASSED: Continuous integration, rev:bfbc0ea14deffa7239b22d9c43f02c47d878fdfc
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1130/
Executed test runs:
    SUCCESS: Checkout
    SUCCESS: Unit & Style Tests
    SUCCESS: Ubuntu LTS: Build
    SUCCESS: Ubuntu LTS: Integration
    SUCCESS: MAAS Compatability Testing
    IN_PROGRESS: Declarative: Post Actions

Click here to trigger a rebuild:
https://jenkins.ubuntu.com/server/job/cloud-init-ci/1130/rebuild

review: Approve (continuous-integration)
Chad Smith (chad.smith) wrote :

Ran through a number of tests on our jenkins server, this looks good now. thanks for the work.

review: Approve
Chad Smith (chad.smith) wrote :

An upstream commit landed for this bug.

To view that commit see the following URL:
https://git.launchpad.net/cloud-init/commit/?id=589b542b

Preview Diff

diff --git a/tests/cloud_tests/collect.py b/tests/cloud_tests/collect.py
index 1ba7285..78263bf 100644
--- a/tests/cloud_tests/collect.py
+++ b/tests/cloud_tests/collect.py
@@ -42,7 +42,7 @@ def collect_console(instance, base_dir):
     @param base_dir: directory to write console log to
     """
     logfile = os.path.join(base_dir, 'console.log')
-    LOG.debug('getting console log for %s to %s', instance, logfile)
+    LOG.debug('getting console log for %s to %s', instance.name, logfile)
     try:
         data = instance.console_log()
     except NotImplementedError as e:
diff --git a/tests/cloud_tests/platforms/instances.py b/tests/cloud_tests/platforms/instances.py
index cc439d2..95bc3b1 100644
--- a/tests/cloud_tests/platforms/instances.py
+++ b/tests/cloud_tests/platforms/instances.py
@@ -87,7 +87,12 @@ class Instance(TargetBase):
             self._ssh_client = None
 
     def _ssh_connect(self):
-        """Connect via SSH."""
+        """Connect via SSH.
+
+        Attempt to SSH to the client on the specific IP and port. If it
+        fails in some manner, then retry 2 more times for a total of 3
+        attempts; sleeping a few seconds between attempts.
+        """
         if self._ssh_client:
             return self._ssh_client
 
@@ -98,21 +103,22 @@ class Instance(TargetBase):
         client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
         private_key = paramiko.RSAKey.from_private_key_file(self.ssh_key_file)
 
-        retries = 30
+        retries = 3
         while retries:
             try:
                 client.connect(username=self.ssh_username,
                                hostname=self.ssh_ip, port=self.ssh_port,
-                               pkey=private_key, banner_timeout=30)
+                               pkey=private_key)
                 self._ssh_client = client
                 return client
             except (ConnectionRefusedError, AuthenticationException,
                     BadHostKeyException, ConnectionResetError, SSHException,
                     OSError):
                 retries -= 1
-                time.sleep(10)
+                LOG.debug('Retrying ssh connection on connect failure')
+                time.sleep(3)
 
-        ssh_cmd = 'Failed ssh connection to %s@%s:%s after 300 seconds' % (
+        ssh_cmd = 'Failed ssh connection to %s@%s:%s after 3 retries' % (
             self.ssh_username, self.ssh_ip, self.ssh_port
         )
         raise util.InTargetExecuteError(b'', b'', 1, ssh_cmd, 'ssh')
@@ -128,18 +134,31 @@ class Instance(TargetBase):
             return ' '.join(l for l in test.strip().splitlines()
                             if not l.lstrip().startswith('#'))
 
-        time = self.config['boot_timeout']
+        boot_timeout = self.config['boot_timeout']
         tests = [self.config['system_ready_script']]
         if wait_for_cloud_init:
             tests.append(self.config['cloud_init_ready_script'])
 
         formatted_tests = ' && '.join(clean_test(t) for t in tests)
         cmd = ('i=0; while [ $i -lt {time} ] && i=$(($i+1)); do {test} && '
-               'exit 0; sleep 1; done; exit 1').format(time=time,
+               'exit 0; sleep 1; done; exit 1').format(time=boot_timeout,
                                                        test=formatted_tests)
 
-        if self.execute(cmd, rcs=(0, 1))[-1] != 0:
-            raise OSError('timeout: after {}s system not started'.format(time))
-
+        end_time = time.time() + boot_timeout
+        while True:
+            try:
+                return_code = self.execute(
+                    cmd, rcs=(0, 1), description='wait for instance start'
+                )[-1]
+                if return_code == 0:
+                    break
+            except util.InTargetExecuteError:
+                LOG.warning("failed to connect via SSH")
+
+            if time.time() < end_time:
+                time.sleep(3)
+            else:
+                raise util.PlatformError('ssh', 'after %ss instance is not '
+                                         'reachable' % boot_timeout)
 
 # vi: ts=4 expandtab