In my testing, if I provide a URL to web-download that yields an error from urlopen(), I never see the store listed in the os_glance_failed_import list, and the store remains in os_glance_importing_to_stores. The image status does not change, which means there's no way for the API client to know that the import failed.
I found this when debugging a gate issue where occasionally the tempest web-download test fails. It ends up waiting many minutes for the import to complete, even though it failed long before that. In that case, the cirros link we use for testing web-download raised a timeout.
From my log, here's what we log as returning to the user just before we start the import:
Feb 05 20:18:02 guaranine <email address hidden>[1008592]: DEBUG oslo_policy.policy [-] enforce: rule="modify_image" creds={"domain_id": null, "is_admin_project": true, "project_domain_id": "default", "project_id": "59a5997403484e97803cac28b7aa7366", "roles": ["reader", "member"], "service_project_domain_id": null, "service_project_id": null, "service_roles": [], "service_user_domain_id": null, "service_user_id": null, "system_scope": null, "user_domain_id": "default", "user_id": "10e5d60c60e54ab3889bcd57e367fe01"} target={"checksum": null, "container_format": "bare", "created_at": "2021-02-05T20:18:03.000000", "disk_format": "raw", "extra_properties": {}, "image_id": "70917fce-bfc6-4d57-aa54-58235d09cf24", "locations": [], "min_disk": 0, "min_ram": 0, "name": "test", "os_glance_failed_import": "", "os_glance_import_task": "e2cb5441-8c92-45c6-9363-f4b7915401e1", "os_glance_importing_to_stores": "cheap", "os_hash_algo": null, "os_hash_value": null, "os_hidden": false, "owner": "59a5997403484e97803cac28b7aa7366", "protected": false, "size": null, "status": "importing", "tags": [], "updated_at": "2021-02-05T20:18:03.000000", "virtual_size": null, "visibility": "shared"} {{(pid=1008592) enforce /usr/local/lib/python3.8/dist-packages/oslo_policy/policy.py:994}}
Note that os_glance_importing_to_stores="cheap" and os_glance_failed_import="". Shortly after this, the web-download task fails:
Feb 05 20:18:03 guaranine <email address hidden>[1008592]: ERROR glance.async_.flows._internal_plugins.web_download [-] Task e2cb5441-8c92-45c6-9363-f4b7915401e1 failed with exception <urlopen error [Errno -2] Name or service not known>: urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>
Here's where the task is fully reverted:
Feb 05 20:18:03 guaranine <email address hidden>[1008592]: WARNING glance.async_.taskflow_executor [-] Task 'api_image_import-WebDownload-e2cb5441-8c92-45c6-9363-f4b7915401e1' (bc722b5c-ddd4-404b-9c09-8625ed9c5941) transitioned into state 'REVERTED' from state 'REVERTING' with result 'None'
And after that, here's what we're still returning to the user:
Feb 05 20:18:03 guaranine <email address hidden>[1008592]: DEBUG oslo_policy.policy [-] enforce: rule="get_image" creds={"domain_id": null, "is_admin_project": true, "project_domain_id": "default", "project_id": "59a5997403484e97803cac28b7aa7366", "roles": ["reader", "member"], "service_project_domain_id": null, "service_project_id": null, "service_roles": [], "service_user_domain_id": null, "service_user_id": null, "system_scope": null, "user_domain_id": "default", "user_id": "10e5d60c60e54ab3889bcd57e367fe01"} target={"checksum": null, "container_format": "bare", "created_at": "2021-02-05T20:18:03.000000", "disk_format": "raw", "extra_properties": {}, "image_id": "70917fce-bfc6-4d57-aa54-58235d09cf24", "locations": [], "min_disk": 0, "min_ram": 0, "name": "test", "os_glance_failed_import": "", "os_glance_import_task": "e2cb5441-8c92-45c6-9363-f4b7915401e1", "os_glance_importing_to_stores": "cheap", "os_hash_algo": null, "os_hash_value": null, "os_hidden": false, "owner": "59a5997403484e97803cac28b7aa7366", "protected": false, "size": null, "status": "queued", "tags": [], "updated_at": "2021-02-05T20:18:03.000000", "virtual_size": null, "visibility": "shared"} {{(pid=1008592) enforce /usr/local/lib/python3.8/dist-packages/oslo_policy/policy.py:994}}
Note that we still have os_glance_importing_to_stores="cheap" and os_glance_failed_import="". In this case, "cheap" should have moved from os_glance_importing_to_stores to os_glance_failed_import.
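To make the expected transition concrete, here is a small sketch of the property update I would expect to see after the revert. It assumes the two properties are comma-separated strings, as they appear in the log above; the function name mark_store_failed is hypothetical and this is not the code from the proposed fix, just an illustration of the before and after values.

```python
def mark_store_failed(image, store):
    """Return a copy of image with store moved from importing to failed.

    Hypothetical illustration of the expected property transition only.
    """
    importing = [s for s in (image.get("os_glance_importing_to_stores") or "").split(",") if s]
    failed = [s for s in (image.get("os_glance_failed_import") or "").split(",") if s]

    # Drop the store from the in-progress list, if it is there.
    if store in importing:
        importing.remove(store)
    # Record the failure once, even if called repeatedly.
    if store not in failed:
        failed.append(store)

    updated = dict(image)
    updated["os_glance_importing_to_stores"] = ",".join(importing)
    updated["os_glance_failed_import"] = ",".join(failed)
    return updated


# Values taken from the log above, after the task was reverted.
before = {
    "os_glance_importing_to_stores": "cheap",
    "os_glance_failed_import": "",
}
after = mark_store_failed(before, "cheap")
print(after)
```

In other words, after the failure the API should report os_glance_importing_to_stores="" and os_glance_failed_import="cheap" for this image.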
I wrote a tempest negative test for this situation using a totally bogus URL, which is here:
https://review.opendev.org/c/openstack/tempest/+/774303
This stack should fix the problem:
https://review.opendev.org/c/openstack/glance/+/774315
That link is to a DNM (do not merge) patch that pulls in the tempest test to validate the fix. The actual fix is the patch below it in the stack, and below that is a refactor patch that the fix depends on.
I imagine this has been broken forever, so not a regression and thus not critical for a backport.