SoftwareDeploymentGroupTest fails at times with TimeoutException

Bug #1625921 reported by Rabi Mishra
This bug affects 1 person
Affects: OpenStack Heat
Status: Fix Released
Importance: Medium
Assigned to: Zane Bitter

Bug Description

The tests fail intermittently with the error below.

2016-09-15 21:23:57.046934 | 2016-09-15 21:23:57.046 | Captured traceback:
2016-09-15 21:23:57.049056 | 2016-09-15 21:23:57.048 | ~~~~~~~~~~~~~~~~~~~
2016-09-15 21:23:57.052538 | 2016-09-15 21:23:57.052 | Traceback (most recent call last):
2016-09-15 21:23:57.054612 | 2016-09-15 21:23:57.053 | File "/opt/stack/new/heat/heat_integrationtests/functional/test_software_deployment_group.py", line 120, in test_deployment_crud
2016-09-15 21:23:57.056314 | 2016-09-15 21:23:57.055 | self.deployment_crud(self.sd_template)
2016-09-15 21:23:57.057901 | 2016-09-15 21:23:57.057 | File "/opt/stack/new/heat/heat_integrationtests/functional/test_software_deployment_group.py", line 95, in deployment_crud
2016-09-15 21:23:57.062085 | 2016-09-15 21:23:57.061 | resources_to_signal=group_resources)
2016-09-15 21:23:57.063607 | 2016-09-15 21:23:57.063 | File "/opt/stack/new/heat/heat_integrationtests/common/test.py", line 352, in _wait_for_stack_status
2016-09-15 21:23:57.065265 | 2016-09-15 21:23:57.064 | raise exceptions.TimeoutException(message)
2016-09-15 21:23:57.067128 | 2016-09-15 21:23:57.066 | heat_integrationtests.common.exceptions.TimeoutException: Request timed out
2016-09-15 21:23:57.070020 | 2016-09-15 21:23:57.069 | Details: Stack SoftwareDeploymentGroupTest-1803847381/c86092b8-81d6-440d-827c-f1428bc7a68a failed to reach CREATE_COMPLETE status within the required time (1200 s).

Noticed here:

http://logs.openstack.org/97/371097/1/check/gate-heat-dsvm-functional-orig-mysql-lbaasv2/4c24057/console.html.gz#_2016-09-15_21_23_57_070020

From the logs, it looks like the test keeps signaling one deployment resource, but nothing happens. I'm not sure what is going on.

Tags: gate-failure
Rabi Mishra (rabi)
Changed in heat:
importance: Undecided → Medium
milestone: none → next
Rabi Mishra (rabi) wrote :

There is also another kind of failure for the same tests, shown below. Although it could have been a separate bug, I've added it here.

2016-09-21 09:01:25.107822 | 2016-09-21 09:01:25.107 | Traceback (most recent call last):
2016-09-21 09:01:25.110507 | 2016-09-21 09:01:25.110 | File "/opt/stack/new/heat/heat_integrationtests/functional/test_software_deployment_group.py", line 123, in test_deployment_crud_with_rolling_update
2016-09-21 09:01:25.118818 | 2016-09-21 09:01:25.118 | self.deployment_crud(self.sd_template_with_upd_policy)
2016-09-21 09:01:25.125801 | 2016-09-21 09:01:25.122 | File "/opt/stack/new/heat/heat_integrationtests/functional/test_software_deployment_group.py", line 97, in deployment_crud
2016-09-21 09:01:25.128150 | 2016-09-21 09:01:25.127 | self.check_input_values(group_resources, 'foo', 'foo_input')
2016-09-21 09:01:25.130969 | 2016-09-21 09:01:25.130 | File "/opt/stack/new/heat/heat_integrationtests/common/test.py", line 522, in check_input_values
2016-09-21 09:01:25.135034 | 2016-09-21 09:01:25.134 | r.physical_resource_id)
2016-09-21 09:01:25.153122 | 2016-09-21 09:01:25.152 | File "/usr/local/lib/python2.7/dist-packages/heatclient/v1/software_deployments.py", line 57, in get
2016-09-21 09:01:25.169959 | 2016-09-21 09:01:25.164 | resp = self.client.get('/software_deployments/%s' % deployment_id)
2016-09-21 09:01:25.187146 | 2016-09-21 09:01:25.185 | File "/usr/local/lib/python2.7/dist-packages/heatclient/common/http.py", line 287, in get
2016-09-21 09:01:25.194433 | 2016-09-21 09:01:25.192 | return self.client_request("GET", url, **kwargs)
2016-09-21 09:01:25.201238 | 2016-09-21 09:01:25.198 | File "/usr/local/lib/python2.7/dist-packages/heatclient/common/http.py", line 280, in client_request
2016-09-21 09:01:25.207388 | 2016-09-21 09:01:25.206 | resp, body = self.json_request(method, url, **kwargs)
2016-09-21 09:01:25.222648 | 2016-09-21 09:01:25.222 | File "/usr/local/lib/python2.7/dist-packages/heatclient/common/http.py", line 269, in json_request
2016-09-21 09:01:25.228537 | 2016-09-21 09:01:25.227 | resp = self._http_request(url, method, **kwargs)
2016-09-21 09:01:25.236216 | 2016-09-21 09:01:25.232 | File "/usr/local/lib/python2.7/dist-packages/heatclient/common/http.py", line 232, in _http_request
2016-09-21 09:01:25.239921 | 2016-09-21 09:01:25.237 | raise exc.from_response(resp)
2016-09-21 09:01:25.244276 | 2016-09-21 09:01:25.242 | heatclient.exc.HTTPNotFound: ERROR: None
2016-09-21 09:01:25.297855 | 2016-09-21 09:01:25.281 |

Steve Baker (steve-stevebaker) wrote :

I haven't been able to reproduce this yet with test_deployment_crud running in a loop.

Rabi Mishra (rabi)
tags: added: gate-failure
Rabi Mishra (rabi) wrote :

For the failures mentioned in comment #1, I think there is an issue with the test. We get the list of deployment group resources before the group is CREATE_COMPLETE, so the resource_id may not yet be set for some of the resources. We then use that resource_id to get the deployment [1], and I see API calls like [2] that return 404.

[1] https://github.com/openstack/heat/blob/master/heat_integrationtests/common/test.py#L535

[2] http://logs.openstack.org/67/435667/4/check/gate-heat-dsvm-functional-convg-mysql-lbaasv2-ubuntu-xenial/2de94c8/logs/screen-h-api.txt.gz#_Jun_07_04_43_27_101259

Jun 07 04:43:27.101259 ubuntu-xenial-infracloud-vanilla-9177735 <email address hidden>[21569]: [pid: 21572|app: 0|req: 1992/3976] 15.184.65.210 () {54 vars in 1214 bytes} [Wed Jun 7 04:43:26 2017] GET /heat-api/v1/9c9213874f934756a217bdc684d476b7/software_deployments/ => generated 112 bytes in 103 msecs (HTTP/1.1 404) 4 headers in 164 bytes (1 switches on core 0)
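
The trailing slash in the logged request is consistent with an empty deployment ID being substituted into the URL. A minimal sketch of that effect, assuming the unset resource ID arrives as an empty string (the helper name `deployment_path` and the sample ID are hypothetical; the format string is the one visible in the heatclient traceback in comment #1):

```python
def deployment_path(deployment_id):
    # Same format string as the heatclient line shown in the traceback:
    # '/software_deployments/%s' % deployment_id
    return '/software_deployments/%s' % deployment_id


# A populated physical resource ID gives a path to a single deployment.
populated = deployment_path('dep-0')

# An empty ID (assumed here for a resource that has not been created yet)
# gives the bare collection path with a trailing slash, which is the shape
# of the request that returned 404 in [2].
empty = deployment_path('')

print(populated)
print(empty)
```

If that assumption holds, any resource listed before its deployment exists would produce the same request shape and the same 404.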

OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/471727

Changed in heat:
assignee: nobody → Rabi Mishra (rabi)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/471727
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=4dd67bb1aa2df4f5270f79600ac1f888b0bd9a5f
Submitter: Jenkins
Branch: master

commit 4dd67bb1aa2df4f5270f79600ac1f888b0bd9a5f
Author: rabi <email address hidden>
Date: Wed Jun 7 16:44:24 2017 +0530

    Get the deployment group resources again after CREATE_COMPLETE

    We seem to get the list of group resources for signaling (there
    is possibility that resource_id is not set for some resources)
    and then use the same list to get the deployments. It would be
    good to get the resources again after they are created.

    Change-Id: I908d1d13abe8e59a65308e883591abca2b1c7a9a
    Partial-Bug: #1625921
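
The idea in this commit message can be sketched as follows. This is an illustration only, not the actual test code: `FakeHeat`, its methods, and the resource dictionaries are hypothetical stand-ins for the Heat client used by the integration tests.

```python
class FakeHeat(object):
    """Hypothetical in-memory stand-in for a Heat client (illustration only)."""

    def __init__(self):
        self.status = 'CREATE_IN_PROGRESS'

    def list_resources(self):
        # Once the group is complete, every member has a physical ID.
        if self.status == 'CREATE_COMPLETE':
            return [{'name': '0', 'physical_resource_id': 'dep-0'},
                    {'name': '1', 'physical_resource_id': 'dep-1'}]
        # While creation is in progress, some IDs may still be unset.
        return [{'name': '0', 'physical_resource_id': 'dep-0'},
                {'name': '1', 'physical_resource_id': ''}]

    def finish_create(self):
        self.status = 'CREATE_COMPLETE'


heat = FakeHeat()
early = heat.list_resources()   # list taken for signaling; may be incomplete
heat.finish_create()            # stands in for waiting on CREATE_COMPLETE
late = heat.list_resources()    # list taken again after completion

missing_early = [r['name'] for r in early if not r['physical_resource_id']]
missing_late = [r['name'] for r in late if not r['physical_resource_id']]
print(missing_early, missing_late)
```

Looking up deployments from `late` rather than `early` avoids using an ID that was not yet set when the first list was taken.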

OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/474013

OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/474014

OpenStack Infra (hudson-openstack) wrote : Change abandoned on heat (stable/ocata)

Change abandoned by Rico Lin (<email address hidden>) on branch: stable/ocata
Review: https://review.openstack.org/474013

OpenStack Infra (hudson-openstack) wrote : Change abandoned on heat (stable/newton)

Change abandoned by Rico Lin (<email address hidden>) on branch: stable/newton
Review: https://review.openstack.org/474014

Changed in heat:
assignee: Rabi Mishra (rabi) → Zane Bitter (zaneb)
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/474004
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=41b8e44d1e89440dca994bb927ecb35784d94e34
Submitter: Jenkins
Branch: master

commit 41b8e44d1e89440dca994bb927ecb35784d94e34
Author: Zane Bitter <email address hidden>
Date: Tue Jun 13 19:38:39 2017 -0400

    Fix races in SoftwareDeploymentGroupTest

    Don't assume that we can get the physical IDs of all of the
    SoftwareDeployment resources as soon as the stack becomes
    CREATE_IN_PROGRESS. 4dd67bb1aa2df4f5270f79600ac1f888b0bd9a5f reads them
    again once the stack is COMPLETE; this patch also uses the same physical
    resource IDs to verify the update.

    Also, make sure all of the resources are IN_PROGRESS before trying to
    signal them, because the signal_resources() utility method only signals
    resources that are IN_PROGRESS.

    Change-Id: I9787a5de5e4272a3ab370f653182aa9283ae01c0
    Closes-Bug: #1697794
    Closes-Bug: #1626073
    Closes-Bug: #1625921
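
The second point in this commit message (wait until every resource is IN_PROGRESS before signaling) can be sketched as below. This is a simplified illustration under stated assumptions, not the code from the patch: `wait_until_all_in_progress`, the snapshot data, and this stand-in `signal_resources` are hypothetical and only mimic the behaviour the message describes.

```python
import time


def wait_until_all_in_progress(list_resources, expected_count,
                               timeout=5.0, interval=0.01):
    """Poll until expected_count resources all report CREATE_IN_PROGRESS."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        resources = list_resources()
        if (len(resources) == expected_count and
                all(r['status'] == 'CREATE_IN_PROGRESS' for r in resources)):
            return resources
        time.sleep(interval)
    raise RuntimeError('resources did not all reach CREATE_IN_PROGRESS')


def signal_resources(resources):
    """Stand-in that, like the described utility, skips non-IN_PROGRESS ones."""
    return [r['name'] for r in resources
            if r['status'] == 'CREATE_IN_PROGRESS']


# Hypothetical listings: resource '1' enters IN_PROGRESS later than '0'.
first = [{'name': '0', 'status': 'CREATE_IN_PROGRESS'},
         {'name': '1', 'status': 'INIT_COMPLETE'}]
second = [{'name': '0', 'status': 'CREATE_IN_PROGRESS'},
          {'name': '1', 'status': 'CREATE_IN_PROGRESS'}]

# Signaling straight away misses resource '1', so the stack would hang.
signaled_too_early = signal_resources(first)

# Waiting first means every resource is eligible to be signaled.
snapshots = iter([first, second])
ready = wait_until_all_in_progress(lambda: next(snapshots), expected_count=2)
signaled = signal_resources(ready)
print(signaled_too_early, signaled)
```

A resource that is skipped in the early case never receives its signal, which would match the original symptom of a stack that does not reach CREATE_COMPLETE within the timeout.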

Changed in heat:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/heat 9.0.0.0b3

This issue was fixed in the openstack/heat 9.0.0.0b3 development milestone.
