Merge lp:~rvb/gwacl/delete-disk-fix into lp:gwacl

Proposed by Raphaël Badin
Status: Merged
Approved by: Raphaël Badin
Approved revision: 94
Merged at revision: 91
Proposed branch: lp:~rvb/gwacl/delete-disk-fix
Merge into: lp:gwacl
Diff against target: 318 lines (+195/-15)
8 files modified
.bzrignore (+1/-0)
deletedisk.go (+64/-0)
deletedisk_test.go (+103/-0)
example/live_example_managementapi.go (+0/-4)
managementapi.go (+8/-0)
managementapi_test.go (+5/-0)
poller.go (+7/-7)
poller_test.go (+7/-4)
To merge this branch: bzr merge lp:~rvb/gwacl/delete-disk-fix
Reviewer Review Type Date Requested Status
Gavin Panella Approve
Review via email: mp+159150@code.launchpad.net

Commit message

Use a poller in ManagementAPI.DeleteDisk() to work around Windows Azure's bug.

Description of the change

This branch adds a poller used to delete a disk in order to work around a Windows Azure bug: it takes an indeterminate time for a disk previously attached to a deleted VM to become "not in use" and thus be available for deletion.

I refactored the poller: isDone() now also takes the error returned by poll(). This was done because I needed it for this branch obviously but this simply makes the poller more generic: isDone() can also decide whether or not the polling should stop using the error returned by poll() (and not only using the 'response' returned by poll()).

I've made all I could to keep this workaround contained. The goal was that, once the bug will be fixed in Windows Azure, only minimal changes will be required to revert to the old implementation. That's why the test TestManagementAPIDeleteDiskPolls is in deletedisk_test.go file. All we will have to do when the Azure bug is fixed is:
- remove the existing DeleteDisk() method and rename _DeleteDisk() into DeleteDisk()
- delete the files deletedisk.go and deletedisk_test.go.
- remove the tweak to deleteDiskInterval in managementapi_test.go (the variable deleteDiskInterval is in deletedisk.go so once that file is removed, the need to remove its usage in managementapi_test.go will become clear (i.e. managementapi_test.go won't compile))

To post a comment you must log in.
Revision history for this message
Gavin Panella (allenap) wrote :

Looks good :)

[1]

+func isInUseError(errString string, diskName string) bool {
+    pattern := fmt.Sprintf("BadRequest - A disk with name %s is currently in use by virtual machine.*", regexp.QuoteMeta(diskName))
+    res, err := regexp.MatchString(pattern, errString)

Do you think it's necessary to check the disk name in the error message?

It might be clearer to check that the response status is 400, and that
the body contains the phrases "A disk with name" and "is currently in
use by virtual machine".

[2]

+    res, err := regexp.MatchString(pattern, errString)
+    if err != nil {
+        panic(fmt.Sprintf("can not compile regular expression: %v", pattern))
+    }

Fwiw, there's regexp.MustCompile for these kinds of situations; saves
having to do the 3-line-dance.

review: Approve
lp:~rvb/gwacl/delete-disk-fix updated
94. By Raphaël Badin

Use regexp.MustCompile.

Revision history for this message
Raphaël Badin (rvb) wrote :

Thanks for the review!

> [1]
>
> +func isInUseError(errString string, diskName string) bool {
> +    pattern := fmt.Sprintf("BadRequest - A disk with name %s is currently in
> use by virtual machine.*", regexp.QuoteMeta(diskName))
> +    res, err := regexp.MatchString(pattern, errString)
>
> Do you think it's necessary to check the disk name in the error message?

I think it's better to keep the disk name in the error message just to be 100% certain that the error is about the disk that we're trying to delete. This might look a bit like "belt and braces" but I'd like to be on the safe side with Azure.

>
> [2]
>
> +    res, err := regexp.MatchString(pattern, errString)
> +    if err != nil {
> +        panic(fmt.Sprintf("can not compile regular expression: %v", pattern))
> +    }
>
> Fwiw, there's regexp.MustCompile for these kinds of situations; saves
> having to do the 3-line-dance.

Good point, done.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file '.bzrignore'
--- .bzrignore 2013-04-10 10:40:01 +0000
+++ .bzrignore 2013-04-17 08:28:24 +0000
@@ -1,2 +1,3 @@
1./example/live_example_managementapi1./example/live_example_managementapi
2./example/storage_example2./example/storage_example
3./tags
34
=== added file 'deletedisk.go'
--- deletedisk.go 1970-01-01 00:00:00 +0000
+++ deletedisk.go 2013-04-17 08:28:24 +0000
@@ -0,0 +1,64 @@
1// Copyright 2013 Canonical Ltd. This software is licensed under the
2// GNU Lesser General Public License version 3 (see the file COPYING).
3
4// A poller object used to delete a disk.
5//
6// It takes an indeterminate time for a disk previously attached to a
7// deleted VM to become "not in use" and thus be available for deletion.
8// When we receive the "disk is still attached" error, we try again every
9// 10 seconds until it succeeds, with a timeout of 30 minutes).
10// This bug might be related to:
11// http://social.msdn.microsoft.com/Forums/en-US/WAVirtualMachinesforWindows/thread/4394c75d-59ff-4634-8212-2ad71bf6fbd5/
12//
13// Once this bug is fixed in Windows Azure, this file and the related tests
14// can safely be removed, and ManagementAPI._DeleteDisk() can replace the
15// current implementation of ManagementAPI.DeleteDisk() (which uses this
16// poller).
17
18package gwacl
19
20import (
21 "fmt"
22 "regexp"
23 "time"
24)
25
26var deleteDiskTimeout = 30 * time.Minute
27var deleteDiskInterval = 10 * time.Second
28
29type diskDeletePoller struct {
30 api *ManagementAPI
31 diskName string
32}
33
34var _ poller = &diskDeletePoller{}
35
36func (poller diskDeletePoller) poll() (*x509Response, error) {
37 return nil, poller.api._DeleteDisk(poller.diskName)
38}
39
40// isInUseError returns whether or not the given string is of the "disk in use"
41// type.
42// Here is a real-world example of the error in question:
43// "BadRequest - A disk with name gwacldiske5w7lkj is currently in use
44// by virtual machine gwaclrolemvo1yab running within hosted service
45// gwacl623yosxtppsa9577xy5, deployment gwaclmachinewes4n64f. (http
46// code 400: Bad Request)"
47func isInUseError(errString string, diskName string) bool {
48 pattern := fmt.Sprintf("BadRequest - A disk with name %s is currently in use by virtual machine.*", regexp.QuoteMeta(diskName))
49 reg := regexp.MustCompile(pattern)
50 return reg.MatchString(errString)
51}
52
53func (poller diskDeletePoller) isDone(response *x509Response, pollerErr error) (bool, error) {
54 if pollerErr == nil {
55 return true, nil
56 }
57 if isInUseError(pollerErr.Error(), poller.diskName) {
58 // The error is of the "disk in use" type: continue polling.
59 return false, nil
60 }
61 // The error is *not* of the "disk in use" type: stop polling and return
62 // the error.
63 return true, pollerErr
64}
065
=== added file 'deletedisk_test.go'
--- deletedisk_test.go 1970-01-01 00:00:00 +0000
+++ deletedisk_test.go 2013-04-17 08:28:24 +0000
@@ -0,0 +1,103 @@
1// Copyright 2013 Canonical Ltd. This software is licensed under the
2// GNU Lesser General Public License version 3 (see the file COPYING).
3
4package gwacl
5
6import (
7 "fmt"
8 . "launchpad.net/gocheck"
9 "net/http"
10 "time"
11)
12
13type deleteDiskSuite struct{}
14
15var _ = Suite(&deleteDiskSuite{})
16
17// Real-world error messages and names.
18const (
19 diskInUseErrorTemplate = "BadRequest - A disk with name %s is currently in use by virtual machine gwaclrolemvo1yab running within hosted service gwacl623yosxtppsa9577xy5, deployment gwaclmachinewes4n64f. (http code 400: Bad Request)"
20 diskName = "gwacldiske5w7lkj"
21 diskDoesNotExistError = "DELETE request failed: ResourceNotFound - The disk with the specified name does not exist. (http code 404: Not Found)"
22)
23
24func (suite *deleteDiskSuite) TestIsInUseError(c *C) {
25 var testValues = []struct {
26 errorString string
27 diskName string
28 expectedResult bool
29 }{
30 {fmt.Sprintf(diskInUseErrorTemplate, diskName), diskName, true},
31 {fmt.Sprintf(diskInUseErrorTemplate, diskName), "another-disk", false},
32 {"unknown error", diskName, false},
33 {diskDoesNotExistError, diskName, false},
34 }
35 for _, test := range testValues {
36 c.Check(isInUseError(test.errorString, test.diskName), Equals, test.expectedResult)
37 }
38}
39
40func (suite *deleteDiskSuite) TestIsDoneReturnsTrueIfNilError(c *C) {
41 poller := diskDeletePoller{nil, ""}
42 randomResponse := x509Response{StatusCode: http.StatusAccepted}
43 done, err := poller.isDone(&randomResponse, nil)
44 c.Check(done, Equals, true)
45 c.Check(err, IsNil)
46}
47
48func (suite *deleteDiskSuite) TestIsDoneReturnsFalseIfDiskInUseError(c *C) {
49 diskName := "gwacldiske5w7lkj"
50 diskInUseError := fmt.Errorf(diskInUseErrorTemplate, diskName)
51 poller := diskDeletePoller{nil, diskName}
52 done, err := poller.isDone(nil, diskInUseError)
53 c.Check(done, Equals, false)
54 c.Check(err, IsNil)
55}
56
57func (suite *deleteDiskSuite) TestIsDoneReturnsTrueIfAnotherError(c *C) {
58 anotherError := fmt.Errorf("Unknown error")
59 poller := diskDeletePoller{nil, "disk-name"}
60 done, err := poller.isDone(nil, anotherError)
61 c.Check(done, Equals, true)
62 c.Check(err, Equals, anotherError)
63}
64
65func (suite *deleteDiskSuite) TestPollCallsDeleteDisk(c *C) {
66 api := makeAPI(c)
67 recordedRequests := setUpDispatcher("operationID")
68 diskName := "gwacldiske5w7lkj"
69 poller := diskDeletePoller{api, diskName}
70
71 response, err := poller.poll()
72
73 c.Assert(response, IsNil)
74 c.Assert(err, IsNil)
75 expectedURL := api.session.composeURL("services/disks/" + diskName)
76 checkOneRequest(c, recordedRequests, expectedURL, []byte{}, "DELETE")
77}
78
79func (suite *deleteDiskSuite) TestManagementAPIDeleteDiskPolls(c *C) {
80 firstResponse := DispatcherResponse{
81 response: &x509Response{},
82 errorObject: fmt.Errorf(diskInUseErrorTemplate, diskName)}
83 secondResponse := DispatcherResponse{
84 response: &x509Response{StatusCode: http.StatusOK},
85 errorObject: nil}
86 responses := []DispatcherResponse{firstResponse, secondResponse}
87 rigPreparedResponseDispatcher(responses)
88 recordedRequests := make([]*x509Request, 0)
89 rigRecordingDispatcher(&recordedRequests)
90
91 api := makeAPI(c)
92 diskName := "gwacldiske5w7lkj"
93 poller := diskDeletePoller{api, diskName}
94
95 response, err := performPolling(poller, time.Nanosecond, time.Minute)
96
97 c.Assert(response, IsNil)
98 c.Assert(err, IsNil)
99 expectedURL := api.session.composeURL("services/disks/" + diskName)
100 c.Check(len(recordedRequests), Equals, 2)
101 checkRequest(c, recordedRequests[0], expectedURL, []byte{}, "DELETE")
102 checkRequest(c, recordedRequests[1], expectedURL, []byte{}, "DELETE")
103}
0104
=== modified file 'example/live_example_managementapi.go'
--- example/live_example_managementapi.go 2013-04-02 03:44:55 +0000
+++ example/live_example_managementapi.go 2013-04-17 08:28:24 +0000
@@ -147,10 +147,6 @@
147 checkError(err)147 checkError(err)
148 fmt.Println("Done deleting deployment\n")148 fmt.Println("Done deleting deployment\n")
149149
150 // TODO: it takes some time for a disk previously attached to a deleted VM
151 // to become "not in use" and thus be available for deletion.
152 time.Sleep(3 * time.Minute)
153
154 fmt.Println("Deleting VM disk...")150 fmt.Println("Deleting VM disk...")
155 err = api.DeleteDisk(diskName)151 err = api.DeleteDisk(diskName)
156 checkError(err)152 checkError(err)
157153
=== modified file 'managementapi.go'
--- managementapi.go 2013-04-10 15:14:32 +0000
+++ managementapi.go 2013-04-17 08:28:24 +0000
@@ -225,6 +225,14 @@
225}225}
226226
227func (api *ManagementAPI) DeleteDisk(diskName string) error {227func (api *ManagementAPI) DeleteDisk(diskName string) error {
228 // Use the disk deletion poller to work around a bug in Windows Azure.
229 // See the documentation in the file deletedisk.go for details.
230 poller := diskDeletePoller{api, diskName}
231 _, err := performPolling(poller, deleteDiskInterval, deleteDiskTimeout)
232 return err
233}
234
235func (api *ManagementAPI) _DeleteDisk(diskName string) error {
228 response, err := api.session.delete("services/disks/" + diskName)236 response, err := api.session.delete("services/disks/" + diskName)
229 if err != nil {237 if err != nil {
230 return err238 return err
231239
=== modified file 'managementapi_test.go'
--- managementapi_test.go 2013-04-03 11:20:20 +0000
+++ managementapi_test.go 2013-04-17 08:28:24 +0000
@@ -507,6 +507,11 @@
507}507}
508508
509func (suite *managementAPISuite) TestDeleteDisk(c *C) {509func (suite *managementAPISuite) TestDeleteDisk(c *C) {
510 // The current implementation of DeleteDisk() works around a bug in
511 // Windows Azure by polling the server. See the documentation in the file
512 // deletedisk.go for details.
513 // Change the polling interval to speed up the tests:
514 deleteDiskInterval = time.Nanosecond
510 api := makeAPI(c)515 api := makeAPI(c)
511 recordedRequests := setUpDispatcher("operationID")516 recordedRequests := setUpDispatcher("operationID")
512 diskName := "diskName"517 diskName := "diskName"
513518
=== modified file 'poller.go'
--- poller.go 2013-04-03 11:20:20 +0000
+++ poller.go 2013-04-17 08:28:24 +0000
@@ -14,7 +14,7 @@
14// the response given by the server means that the polling is finished. 14// the response given by the server means that the polling is finished.
15type poller interface {15type poller interface {
16 poll() (*x509Response, error)16 poll() (*x509Response, error)
17 isDone(*x509Response) (bool, error)17 isDone(*x509Response, error) (bool, error)
18}18}
1919
20// performPolling calls the poll() method of the given 'poller' object every20// performPolling calls the poll() method of the given 'poller' object every
@@ -29,11 +29,8 @@
29 case <-timeoutChannel:29 case <-timeoutChannel:
30 return nil, fmt.Errorf("polling timed out waiting for an asynchronous operation")30 return nil, fmt.Errorf("polling timed out waiting for an asynchronous operation")
31 default:31 default:
32 response, err := poller.poll()32 response, pollerErr := poller.poll()
33 if err != nil {33 done, err := poller.isDone(response, pollerErr)
34 return nil, err
35 }
36 done, err := poller.isDone(response)
37 if err != nil {34 if err != nil {
38 return nil, err35 return nil, err
39 }36 }
@@ -89,9 +86,12 @@
89// IsDone returns true if the given response has a status code indicating86// IsDone returns true if the given response has a status code indicating
90// success and if the returned XML response corresponds to a valid Operation87// success and if the returned XML response corresponds to a valid Operation
91// with a status indicating that the operation is completed.88// with a status indicating that the operation is completed.
92func (poller operationPoller) isDone(response *x509Response) (bool, error) {89func (poller operationPoller) isDone(response *x509Response, pollerError error) (bool, error) {
93 // TODO: Add a timeout so that polling won't continue forever if the90 // TODO: Add a timeout so that polling won't continue forever if the
94 // server cannot be reached.91 // server cannot be reached.
92 if pollerError != nil {
93 return true, pollerError
94 }
95 if response.StatusCode >= 200 && response.StatusCode < 300 {95 if response.StatusCode >= 200 && response.StatusCode < 300 {
96 operation := Operation{}96 operation := Operation{}
97 err := operation.Deserialize(response.Body)97 err := operation.Deserialize(response.Body)
9898
=== modified file 'poller_test.go'
--- poller_test.go 2013-04-03 11:20:20 +0000
+++ poller_test.go 2013-04-17 08:28:24 +0000
@@ -51,7 +51,10 @@
51 return &testPollerResponse, nil51 return &testPollerResponse, nil
52}52}
5353
54func (poller *testPoller) isDone(response *x509Response) (bool, error) {54func (poller *testPoller) isDone(response *x509Response, pollerError error) (bool, error) {
55 if pollerError != nil {
56 return true, pollerError
57 }
55 if poller.notDoneCalls == 0 {58 if poller.notDoneCalls == 0 {
56 return true, nil59 return true, nil
57 }60 }
@@ -128,7 +131,7 @@
128 StatusCode: http.StatusOK,131 StatusCode: http.StatusOK,
129 }132 }
130133
131 isDone, err := poller.isDone(&response)134 isDone, err := poller.isDone(&response, nil)
132 c.Assert(err, IsNil)135 c.Assert(err, IsNil)
133 c.Assert(isDone, Equals, true)136 c.Assert(isDone, Equals, true)
134 }137 }
@@ -148,7 +151,7 @@
148 {StatusCode: http.StatusInternalServerError},151 {StatusCode: http.StatusInternalServerError},
149 }152 }
150 for _, response := range notDoneResponses {153 for _, response := range notDoneResponses {
151 isDone, _ := poller.isDone(&response)154 isDone, _ := poller.isDone(&response, nil)
152 c.Assert(isDone, Equals, false)155 c.Assert(isDone, Equals, false)
153 }156 }
154}157}
@@ -160,7 +163,7 @@
160 Body: []byte("><invalid XML"),163 Body: []byte("><invalid XML"),
161 StatusCode: http.StatusOK,164 StatusCode: http.StatusOK,
162 }165 }
163 _, err := poller.isDone(&response)166 _, err := poller.isDone(&response, nil)
164 c.Assert(err, NotNil)167 c.Assert(err, NotNil)
165 c.Check(err, FitsTypeOf, new(xml.SyntaxError))168 c.Check(err, FitsTypeOf, new(xml.SyntaxError))
166}169}

Subscribers

People subscribed via source and target branches

to all changes: