Merge lp:~chad.smith/charms/precise/storage/storage-volume-label-availability-zone into lp:~dpb/charms/precise/storage/trunk

Proposed by Chad Smith
Status: Merged
Merged at revision: 35
Proposed branch: lp:~chad.smith/charms/precise/storage/storage-volume-label-availability-zone
Merge into: lp:~dpb/charms/precise/storage/trunk
Prerequisite: lp:~chad.smith/charms/precise/storage/storage-fix-nfs-relation-ordering
Diff against target: 229 lines (+97/-19)
3 files modified
hooks/common_util.py (+31/-9)
hooks/storage-provider.d/block-storage-broker/block-storage-relation-changed (+2/-1)
hooks/test_common_util.py (+64/-9)
To merge this branch: bzr merge lp:~chad.smith/charms/precise/storage/storage-volume-label-availability-zone
Reviewers:
David Britton: Approve
Fernando Correa Neto (community): Approve
Review via email: mp+212062@code.launchpad.net

Description of the change

Volumes in EC2 can only be attached to instances that live in the same availability zone (AZ). Block-storage-broker takes whatever volume_label the storage subordinate requests and attempts to attach the matching volume to the instance-id that requested it. If the instance and the volume live in different AZs, the attach fails and the block-storage-broker hooks error out.

To avoid attempting euca-attach-volume on a volume in another AZ, we need the requested volume label (volume_label translates to a volume tag in EC2) to include the availability zone in which the volume is created. For openstack/nova the AZ is always "nova", but for EC2 the AZ is determined as the zone in which the requesting instance-id lives (us-east-1b, us-west-1a, etc.).

This way, the block-storage-broker will not attempt to mount a discovered volume from a different availability zone.
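In concrete terms, the block-storage-relation-changed hook now builds the label as in this condensed excerpt from the preview diff below (unit is the related storage unit name):

    availability_zone = common_util.get_availability_zone()
    # Yields e.g. "postgresql/0 us-east-1b volume" on EC2,
    # or "postgresql/0 nova volume" on openstack.
    label = "%s %s volume" % (unit, availability_zone)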

More significant work will be needed in the block-storage-broker charm to detect volumes and instances in incompatible AZs, but this branch prevents a storage-principal unit from requesting an invalid volume_label.

This branch adds a get_availability_zone function that extracts this value from EC2 or openstack metadata.
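Condensed, the lookup works as sketched here; the full implementation in the preview diff below also logs and exits in error when the EC2 curl fails:

    import json
    import subprocess

    OS_METADATA_URL = (
        "http://169.254.169.254/openstack/2012-08-10/meta_data.json")
    EC2_METADATA_URL = "http://169.254.169.254/latest/meta-data"

    def get_availability_zone():
        try:
            # Openstack publishes JSON metadata with an availability_zone key
            metadata = json.loads(subprocess.check_output(
                "curl %s" % OS_METADATA_URL, shell=True))
            return metadata["availability_zone"]
        except ValueError:
            pass  # Not JSON, so we are probably on EC2; try below
        # EC2 serves the zone as plain text from its placement endpoint
        return subprocess.check_output(
            "curl %s/placement/availability-zone" % EC2_METADATA_URL,
            shell=True).strip()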

Also in this branch:
   drop the lsof -b (non-blocking) option; as currently called it was not a valid option, so the call is reduced to the plain form shown below. Non-blocking lsof will be revisited in a separate branch.
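The corrected call from the preview diff, which reports the process holding the mountpoint during umount retries:

    lsof = subprocess.check_output(["lsof", mountpoint])
    process_name = lsof.split("\n")[1].split()[0]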

Revision history for this message
David Britton (dpb) wrote:

[0] When relating to bsb, I received:

2014-03-20 22:59:28 INFO worker.uniter.jujuc server.go:108 running hook tool "juju-log" ["Validated charm configuration credentials have access to block storage service"]
2014-03-20 22:59:28 DEBUG worker.uniter.jujuc server.go:109 hook context id "block-storage-broker-ec2/0:block-storage-relation-changed:3470414816341938807"; dir "/var/lib/juju/agents/unit-block-storage-broker-ec2-0/charm"
2014-03-20 22:59:28 INFO juju juju-log.go:66 block-storage-broker-ec2/0 block-storage:2: Validated charm configuration credentials have access to block storage service
2014-03-20 22:59:29 INFO worker.uniter.jujuc server.go:108 running hook tool "juju-log" ["Creating a 9Gig volume named (postgresql/0 us-west-1c volume) for instance i-53df8d0c"]
2014-03-20 22:59:29 DEBUG worker.uniter.jujuc server.go:109 hook context id "block-storage-broker-ec2/0:block-storage-relation-changed:3470414816341938807"; dir "/var/lib/juju/agents/unit-block-storage-broker-ec2-0/charm"
2014-03-20 22:59:29 INFO juju juju-log.go:66 block-storage-broker-ec2/0 block-storage:2: Creating a 9Gig volume named (postgresql/0 us-west-1c volume) for instance i-53df8d0c
2014-03-20 22:59:29 INFO juju.worker.uniter context.go:255 HOOK 'availability_zone'
2014-03-20 22:59:29 ERROR juju.worker.uniter uniter.go:350 hook failed: exit status 1

40. By Chad Smith

drop unused CHARM_DIR global

Revision history for this message
David Britton (dpb) wrote:

Note: Looks good during testing. Starting code review now.

41. By Chad Smith

drop misused -b option to lsof

42. By Chad Smith

lint

Revision history for this message
Fernando Correa Neto (fcorrea) wrote:

Looks great! +1

[1]
71 + except subprocess.CalledProcessError, e:
72 + log("Error: couldn't discover ec2 availability zone. %s" % str(e),
73 + hookenv.ERROR)

Even though this block is in the ec2 try, reaching it means we failed to grab both the nova and the ec2 metadata, so I think we could log something more generic. Maybe "Error: couldn't discover volume availability zone..."

review: Approve
Revision history for this message
David Britton (dpb) wrote:

+1, let's merge this in as well.

review: Approve

Preview Diff

1=== modified file 'hooks/common_util.py'
2--- hooks/common_util.py 2014-03-21 22:53:33 +0000
3+++ hooks/common_util.py 2014-03-21 22:53:33 +0000
4@@ -7,9 +7,8 @@
5 from stat import S_ISBLK, ST_MODE
6 import json
7
8-CHARM_DIR = os.environ.get("CHARM_DIR")
9 FSTAB_FILE = "/etc/fstab"
10-EC2_METADATA_URL = "http://169.254.169.254/latest/meta-data/instance-id"
11+EC2_METADATA_URL = "http://169.254.169.254/latest/meta-data"
12 OS_METADATA_URL = "http://169.254.169.254/openstack/2012-08-10/meta_data.json"
13 PERSIST_DATA_FILE = "charm-data"
14
15@@ -108,12 +107,12 @@
16 else: # Then device is in use let's report the offending process
17 message = "WARNING: umount %s failed. Retrying." % mountpoint
18 try:
19- lsof = subprocess.check_output(["lsof", mountpoint, "-b"])
20+ lsof = subprocess.check_output(["lsof", mountpoint])
21 except subprocess.CalledProcessError, e:
22 log("WARNING: %s" % str(e), hookenv.WARNING)
23 else:
24 process_name = lsof.split("\n")[1].split()[0]
25- message += " Device in use by (%s)" % process_name
26+ message += " Device in use by (%s)" % process_name
27 log(message, hookenv.WARNING)
28 sleep(5)
29
30@@ -306,16 +305,39 @@
31 try:
32 metadata = json.loads(
33 subprocess.check_output("curl %s" % OS_METADATA_URL, shell=True))
34- except ValueError:
35- pass # We are probably on EC2, try below
36- else:
37 return metadata["uuid"]
38+ except ValueError:
39+ pass # We are probably on EC2, try below
40 try:
41 metadata = subprocess.check_output(
42- "curl %s" % EC2_METADATA_URL, shell=True)
43+ "curl %s/instance-id" % EC2_METADATA_URL, shell=True)
44+ return metadata.strip()
45 except subprocess.CalledProcessError, e:
46 log("Error: couldn't discover ec2 instance id. %s" % str(e),
47 hookenv.ERROR)
48 sys.exit(1)
49- else:
50+
51+
52+def get_availability_zone():
53+ """
54+ Curl metadata for openstack from C{OS_METADATA_URL} to obtain the
55+ C{availability_zone}.
56+ If that fails, we are likely on EC2, so curl the
57+ C{EC2_METADATA_URL/placement/availability-zone} to get the EC2 availability
58+ zone.
59+ """
60+ try:
61+ metadata = json.loads(
62+ subprocess.check_output("curl %s" % OS_METADATA_URL, shell=True))
63+ return metadata["availability_zone"]
64+ except ValueError:
65+ pass # We are probably on EC2, try below
66+ try:
67+ metadata = subprocess.check_output(
68+ "curl %s/placement/availability-zone" % EC2_METADATA_URL,
69+ shell=True)
70 return metadata.strip()
71+ except subprocess.CalledProcessError, e:
72+ log("Error: couldn't discover ec2 availability zone. %s" % str(e),
73+ hookenv.ERROR)
74+ sys.exit(1)
75
76=== modified file 'hooks/storage-provider.d/block-storage-broker/block-storage-relation-changed'
77--- hooks/storage-provider.d/block-storage-broker/block-storage-relation-changed 2014-02-13 00:41:13 +0000
78+++ hooks/storage-provider.d/block-storage-broker/block-storage-relation-changed 2014-03-21 22:53:33 +0000
79@@ -8,6 +8,7 @@
80
81
82 instance_id = common_util.get_instance_id()
83+availability_zone = common_util.get_availability_zone()
84 config_data = hookenv.config()
85 volume_size = config_data.get("volume_size", None)
86 volume_map = config_data.get("volume_map", None)
87@@ -16,7 +17,7 @@
88 relids = hookenv.relation_ids("data")
89 for relid in relids:
90 for unit in hookenv.related_units(relid):
91- label = "%s nova volume" % unit
92+ label = "%s %s volume" % (unit, availability_zone)
93 if volume_map is not None:
94 # Grab the volume_label for our instance if it exists so we can
95 # request that volume-id from the block-storage-broker
96
97=== modified file 'hooks/test_common_util.py'
98--- hooks/test_common_util.py 2014-03-21 22:53:33 +0000
99+++ hooks/test_common_util.py 2014-03-21 22:53:33 +0000
100@@ -164,7 +164,7 @@
101 curl = self.mocker.replace(subprocess.check_output)
102 curl("curl %s" % util.OS_METADATA_URL, shell=True)
103 self.mocker.throw(ValueError("invalid JSON"))
104- curl("curl %s" % util.EC2_METADATA_URL, shell=True)
105+ curl("curl %s/instance-id" % util.EC2_METADATA_URL, shell=True)
106 self.mocker.result(instance_id)
107 self.mocker.replay()
108 self.assertEqual(util.get_instance_id(), instance_id)
109@@ -174,15 +174,14 @@
110 When L{get_instance_id} fails to curl C{EC2_METADATA_URL} it will log
111 the error and exit in error.
112 """
113- ec2_command = "curl %s" % util.EC2_METADATA_URL
114+ ec2_command = "curl %s/instance-id" % util.EC2_METADATA_URL
115
116 curl = self.mocker.replace(subprocess.check_output)
117 curl("curl %s" % util.OS_METADATA_URL, shell=True)
118 self.mocker.throw(ValueError("invalid JSON"))
119 curl(ec2_command, shell=True)
120 self.mocker.throw(
121- subprocess.CalledProcessError(
122- 1, "curl %s" % util.EC2_METADATA_URL))
123+ subprocess.CalledProcessError(1, ec2_command))
124 self.mocker.replay()
125 result = self.assertRaises(SystemExit, util.get_instance_id)
126 self.assertEqual(result.code, 1)
127@@ -192,6 +191,60 @@
128 self.assertTrue(
129 util.hookenv.is_logged(message), "Not logged- %s" % message)
130
131+ def test_get_availability_zone_when_openstack(self):
132+ """
133+ L{get_availability_zone} will attempt a curl for openstack metadata to
134+ extract the C{availability_zone} from openstack JSON metadata.
135+ """
136+ os_metadata = {
137+ "availability_zone": "nova"
138+ }
139+ curl = self.mocker.replace(subprocess.check_output)
140+ curl("curl %s" % util.OS_METADATA_URL, shell=True)
141+ self.mocker.result(json.dumps(os_metadata))
142+ self.mocker.replay()
143+ self.assertEqual(util.get_availability_zone(), "nova")
144+
145+ def test_get_availability_zone_when_ec2(self):
146+ """
147+ L{get_availability_zone} will first attempt a curl for openstack
148+ metadata. When that fails to be valid JSON because we are on an EC2
149+ instance, it will curl the C{EC2_METADATA_URL} to grab the
150+ C{availability-zone}.
151+ """
152+ availability_zone = "us-east-1b"
153+ curl = self.mocker.replace(subprocess.check_output)
154+ curl("curl %s" % util.OS_METADATA_URL, shell=True)
155+ self.mocker.throw(ValueError("invalid JSON"))
156+ curl("curl %s/placement/availability-zone" %
157+ util.EC2_METADATA_URL, shell=True)
158+ self.mocker.result(availability_zone)
159+ self.mocker.replay()
160+ self.assertEqual(util.get_availability_zone(), availability_zone)
161+
162+ def test_get_availability_zone_fails_on_ec2(self):
163+ """
164+ When L{get_availability_zone} fails to curl C{EC2_METADATA_URL} it will
165+ log the error and exit in error.
166+ """
167+ ec2_command = ("curl %s/placement/availability-zone" %
168+ util.EC2_METADATA_URL)
169+
170+ curl = self.mocker.replace(subprocess.check_output)
171+ curl("curl %s" % util.OS_METADATA_URL, shell=True)
172+ self.mocker.throw(ValueError("invalid JSON"))
173+ curl(ec2_command, shell=True)
174+ self.mocker.throw(
175+ subprocess.CalledProcessError(1, ec2_command))
176+ self.mocker.replay()
177+ result = self.assertRaises(SystemExit, util.get_availability_zone)
178+ self.assertEqual(result.code, 1)
179+ message = (
180+ "Error: couldn't discover ec2 availability zone. Command '%s' "
181+ "returned non-zero exit status 1" % ec2_command)
182+ self.assertTrue(
183+ util.hookenv.is_logged(message), "Not logged- %s" % message)
184+
185 def test_is_mounted_when_mountpoint_is_mounted(self):
186 """
187 L{is_mounted} returns C{True} when the mountpath is mounted.
188@@ -262,7 +315,8 @@
189
190 lsof_output = (
191 "COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME\n"
192- "postgresql 30544 csmith 10r CHR 1,9 0t0 2056 /mnt/this")
193+ "postgresql 30544 csmith 10r CHR 1,9 0t0 2056 "
194+ "/mnt/this")
195
196 ismount = self.mocker.replace(os.path.ismount)
197 ismount("/mnt/this")
198@@ -272,7 +326,7 @@
199 self.mocker.result(2) # Failure from the shell cmd
200 self.mocker.count(20)
201 lsof = self.mocker.replace(subprocess.check_output)
202- lsof(["lsof", "/mnt/this", "-b"])
203+ lsof(["lsof", "/mnt/this"])
204 self.mocker.result(lsof_output)
205 self.mocker.count(20)
206 sleep_mock = self.mocker.replace(sleep)
207@@ -297,10 +351,10 @@
208 self.mocker.result(2) # Failure from the umount shell cmd
209 self.mocker.count(20)
210 lsof = self.mocker.replace(subprocess.check_output)
211- lsof(["lsof", "/mnt/this", "-b"])
212+ lsof(["lsof", "/mnt/this"])
213 self.mocker.count(20)
214 self.mocker.throw(
215- subprocess.CalledProcessError(1, "lsof /mnt/this")) # lsof fail
216+ subprocess.CalledProcessError(1, "lsof /mnt/this")) # lsof fail
217 sleep_mock = self.mocker.replace(sleep)
218 sleep_mock(5)
219 self.mocker.count(20)
220@@ -310,7 +364,8 @@
221 messages = [
222 "ERROR: Unmount failed, leaving relation in errored state",
223 "WARNING: umount /mnt/this failed. Retrying.",
224- "WARNING: Command 'lsof /mnt/this' returned non-zero exit status 1"]
225+ "WARNING: Command 'lsof /mnt/this' returned non-zero exit status "
226+ "1"]
227 for message in messages:
228 self.assertTrue(
229 util.hookenv.is_logged(message), "Not logged- %s" % message)
