Merge lp:~jtv/maas/resolve-mac-from-leases into lp:~maas-committers/maas/trunk

Proposed by Jeroen T. Vermeulen
Status: Rejected
Rejected by: Andres Rodriguez
Proposed branch: lp:~jtv/maas/resolve-mac-from-leases
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 368 lines (+190/-38)
6 files modified
src/provisioningserver/dhcp/leases.py (+32/-3)
src/provisioningserver/dhcp/leases_parser.py (+14/-2)
src/provisioningserver/dhcp/tests/test_leases.py (+61/-1)
src/provisioningserver/dhcp/tests/test_leases_parser.py (+1/-1)
src/provisioningserver/tasks.py (+32/-14)
src/provisioningserver/tests/test_tasks.py (+50/-17)
To merge this branch: bzr merge lp:~jtv/maas/resolve-mac-from-leases
Reviewer Review Type Date Requested Status
MAAS Maintainers Pending
Review via email: mp+217914@code.launchpad.net

Commit message

Try harder to resolve a node's power IP address. Look not just at the ARP cache but the given IP address (if any) and parsed DHCP leases as well. This should enable cold boots of AMT nodes on MAAS-managed networks.

Description of the change

This should fix the problem where we can't power up nodes whose power addresses are not in the ARP cache — provided that the MAC got a DHCP lease from the cluster controller.

Jeroen

To post a comment you must log in.
Revision history for this message
Gavin Panella (allenap) wrote :

Fwiw, I think the code is good.

However, I worry that it's going to be looking at a snapshot of leases that might be several minutes old, so we might still want a more immediate solution, e.g. nmap.

Also, we only ever ought to have one lease for a MAC address. This code is complicated because it has to consider multiple MAC addresses, but I'm not sure that's necessary or desirable. IMO, it should try the ARP cache then fall back to checking leases.

(This is somewhat related to something that has come up in review for lp:~allenap/maas/dhcp-leases-parsing. The new parser discards all but the last lease for each IP address then discards that if it has expired. The existing parser does the same operations but in reverse order: it discards all expired leases before choosing the most recent of those remaining. I'm still not 100% sure what the right behaviour should be, but I'm leaning towards the former.)

Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

Thanks for having a spontaneous look. This is still experimental: I really just wanted to see whether it would solve the problem. I'm trying that today.

As for the multiple mappings, I'm trying to cover for scenarios like a NIC getting different DHCP leases for its firmware boot and its OS boot. We don't support multiple MACs for one IP address, but multiple IPs mapping to one MAC seems quite plausible just because it takes an effort to prevent.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/dhcp/leases.py'
2--- src/provisioningserver/dhcp/leases.py 2013-10-07 09:12:40 +0000
3+++ src/provisioningserver/dhcp/leases.py 2014-05-01 12:53:29 +0000
4@@ -1,4 +1,4 @@
5-# Copyright 2012 Canonical Ltd. This software is licensed under the
6+# Copyright 2012-2014 Canonical Ltd. This software is licensed under the
7 # GNU Affero General Public License version 3 (see the file LICENSE).
8
9 """Send lease updates to the server.
10@@ -29,6 +29,7 @@
11
12 __metaclass__ = type
13 __all__ = [
14+ 'resolve_MAC',
15 'upload_leases',
16 'update_leases',
17 ]
18@@ -54,7 +55,11 @@
19 get_recorded_nodegroup_uuid,
20 )
21 from provisioningserver.cluster_config import get_maas_url
22-from provisioningserver.dhcp.leases_parser import parse_leases
23+from provisioningserver.dhcp.leases_parser import (
24+ normalise_MAC,
25+ parse_leases,
26+ )
27+from provisioningserver.utils import find_ip_via_arp
28
29
30 logger = getLogger(__name__)
31@@ -117,7 +122,6 @@
32 # but read them both at once here to reduce the scope for trouble.
33 previous_leases = cache.cache.get(LEASES_CACHE_KEY)
34 previous_leases_time = cache.cache.get(LEASES_TIME_CACHE_KEY)
35-
36 if get_leases_timestamp() == previous_leases_time:
37 return None
38 parse_result = parse_leases_file()
39@@ -209,3 +213,28 @@
40 if updated_lease_info is not None:
41 timestamp, leases = updated_lease_info
42 process_leases(timestamp, leases)
43+
44+
45+def resolve_MAC(mac):
46+ """Look up IP addresses associated with the given MAC address.
47+
48+ This uses the cached leases from the last upload, but also looks at the
49+ system's ARP cache.
50+
51+ :param mac: A string representing a MAC address, in colon-separated format.
52+ :return: A set of matching IP addresses, as strings.
53+ """
54+ mac = normalise_MAC(mac)
55+ leases = cache.cache.get(LEASES_CACHE_KEY)
56+ if leases is None:
57+ ips = set()
58+ else:
59+ ips = {
60+ lease_ip
61+ for lease_ip, lease_mac in leases.items()
62+ if lease_mac == mac
63+ }
64+ arp_ip = find_ip_via_arp(mac)
65+ if arp_ip is not None:
66+ ips.add(arp_ip)
67+ return ips
68
69=== modified file 'src/provisioningserver/dhcp/leases_parser.py'
70--- src/provisioningserver/dhcp/leases_parser.py 2013-10-15 11:35:10 +0000
71+++ src/provisioningserver/dhcp/leases_parser.py 2014-05-01 12:53:29 +0000
72@@ -1,4 +1,4 @@
73-# Copyright 2012 Canonical Ltd. This software is licensed under the
74+# Copyright 2012-2014 Canonical Ltd. This software is licensed under the
75 # GNU Affero General Public License version 3 (see the file LICENSE).
76
77 """Parser for ISC dhcpd leases file.
78@@ -19,6 +19,7 @@
79
80 __metaclass__ = type
81 __all__ = [
82+ 'normalise_MAC',
83 'parse_leases',
84 ]
85
86@@ -199,6 +200,16 @@
87 return leases
88
89
90+def normalise_MAC(mac):
91+ """Represent a MAC address in a normalised way so it can be compared."""
92+ return mac.lower()
93+
94+
95+def normalise_leases(leases):
96+ """Normalise IP/host mapping."""
97+ return {ip: normalise_MAC(mac) for ip, mac in leases.items()}
98+
99+
100 def parse_leases(leases_contents):
101 """Parse contents of a leases file.
102
103@@ -207,4 +218,5 @@
104 address that it is associated with.
105 """
106 entries = lease_parser.searchString(leases_contents)
107- return combine_entries(entries)
108+ leases = combine_entries(entries)
109+ return normalise_leases(leases)
110
111=== modified file 'src/provisioningserver/dhcp/tests/test_leases.py'
112--- src/provisioningserver/dhcp/tests/test_leases.py 2013-10-07 09:12:40 +0000
113+++ src/provisioningserver/dhcp/tests/test_leases.py 2014-05-01 12:53:29 +0000
114@@ -1,4 +1,4 @@
115-# Copyright 2012 Canonical Ltd. This software is licensed under the
116+# Copyright 2012-2014 Canonical Ltd. This software is licensed under the
117 # GNU Affero General Public License version 3 (see the file LICENSE).
118
119 """Tests for the report_leases task."""
120@@ -40,12 +40,14 @@
121 parse_leases_file,
122 process_leases,
123 record_lease_state,
124+ resolve_MAC,
125 send_leases,
126 update_leases,
127 upload_leases,
128 )
129 from provisioningserver.omshell import Omshell
130 from provisioningserver.testing.testcase import PservTestCase
131+from provisioningserver.testing.worker_cache import WorkerCacheFixture
132
133
134 class TestHelpers(PservTestCase):
135@@ -346,3 +348,61 @@
136 leases = factory.make_random_leases()
137 send_leases(leases)
138 self.assertEqual([], MAASClient.post.calls)
139+
140+
141+class TestResolveMAC(PservTestCase):
142+
143+ def patch_arp_lookup(self, mac_to_ip):
144+ """Patch ARP lookup to map the given MAC addresses to the given IPs.
145+
146+ Lookup on any un-mapped addresses will return `None`.
147+ """
148+ self.patch(
149+ leases_module, 'find_ip_via_arp', lambda mac: mac_to_ip.get(mac))
150+
151+ def patch_cached_leases(self, leases):
152+ """Patch the worker cache to contain the given leases."""
153+ self.useFixture(WorkerCacheFixture())
154+ cache.cache.set(LEASES_CACHE_KEY, leases)
155+
156+ def test__returns_empty_if_nothing_cached(self):
157+ self.patch_arp_lookup({})
158+ self.patch_cached_leases(None)
159+ self.assertEqual(set(), resolve_MAC(factory.getRandomMACAddress()))
160+
161+ def test__returns_IP_from_lease(self):
162+ mac = factory.getRandomMACAddress()
163+ ip = factory.getRandomIPAddress()
164+ self.patch_arp_lookup({})
165+ self.patch_cached_leases({ip: mac})
166+ self.assertEqual({ip}, resolve_MAC(mac))
167+
168+ def test__ignores_nonmatching_leases(self):
169+ self.patch_arp_lookup({})
170+ self.patch_cached_leases({
171+ factory.getRandomIPAddress(): factory.getRandomMACAddress(),
172+ })
173+ self.assertEqual(set(), resolve_MAC(factory.getRandomMACAddress()))
174+
175+ def test__filters_for_all_matching_leases(self):
176+ self.patch_arp_lookup({})
177+ mac = factory.getRandomMACAddress()
178+ ips = {factory.getRandomIPAddress() for _ in range(3)}
179+ self.patch_cached_leases({ip: mac for ip in ips})
180+ self.assertEqual(ips, resolve_MAC(mac))
181+
182+ def test__adds_ARP_lookup_if_found(self):
183+ mac = factory.getRandomMACAddress()
184+ ip = factory.getRandomIPAddress()
185+ self.patch_arp_lookup({mac: ip})
186+ self.patch_cached_leases({})
187+ self.assertEqual({ip}, resolve_MAC(mac))
188+
189+ def test__ignores_case(self):
190+ mac = factory.getRandomMACAddress()
191+ lease_ip = factory.getRandomIPAddress()
192+ arp_ip = factory.getRandomIPAddress()
193+ self.patch_arp_lookup({mac: arp_ip})
194+ self.patch_cached_leases({lease_ip: mac})
195+ self.assertEqual({lease_ip, arp_ip}, resolve_MAC(mac.upper()))
196+ self.assertEqual({lease_ip, arp_ip}, resolve_MAC(mac.lower()))
197
198=== modified file 'src/provisioningserver/dhcp/tests/test_leases_parser.py'
199--- src/provisioningserver/dhcp/tests/test_leases_parser.py 2013-10-07 09:12:40 +0000
200+++ src/provisioningserver/dhcp/tests/test_leases_parser.py 2014-05-01 12:53:29 +0000
201@@ -1,4 +1,4 @@
202-# Copyright 2012, 2013 Canonical Ltd. This software is licensed under the
203+# Copyright 2012-2014 Canonical Ltd. This software is licensed under the
204 # GNU Affero General Public License version 3 (see the file LICENSE).
205
206 """Tests for the DHCP leases parser."""
207
208=== modified file 'src/provisioningserver/tasks.py'
209--- src/provisioningserver/tasks.py 2014-04-22 15:10:32 +0000
210+++ src/provisioningserver/tasks.py 2014-05-01 12:53:29 +0000
211@@ -49,7 +49,10 @@
212 config,
213 detect,
214 )
215-from provisioningserver.dhcp.leases import upload_leases
216+from provisioningserver.dhcp.leases import (
217+ resolve_MAC,
218+ upload_leases,
219+ )
220 from provisioningserver.dns.config import (
221 DNSConfig,
222 execute_rndc_command,
223@@ -158,20 +161,35 @@
224 assert power_change in ('on', 'off'), (
225 "Unknown power change keyword: %s" % power_change)
226 kwargs['power_change'] = power_change
227+
228+ # Figure out the IP address(es) where we should send our power command.
229 if 'mac_address' in kwargs:
230- kwargs['ip_address'] = find_ip_via_arp(kwargs['mac_address'])
231- kwargs.setdefault('ip_address', None)
232- try:
233- pa = PowerAction(power_type)
234- pa.execute(**kwargs)
235- except PowerActionFail:
236- # TODO: signal to webapp that it failed
237-
238- # Re-raise, so the job is marked as failed. Only currently
239- # useful for tests.
240- raise
241-
242- # TODO: signal to webapp that it worked.
243+ ips = resolve_MAC(kwargs['mac_address'])
244+ else:
245+ ips = set()
246+ if 'ip_address' in kwargs:
247+ ips.add(kwargs['ip_address'])
248+ if len(ips) == 0:
249+ # No address set. We should run the action at least once though, in
250+ # case it doesn't require an address.
251+ ips.add(None)
252+
253+ # Send our power command to each IP address we have for the node.
254+ failures = {}
255+ for ip in ips:
256+ kwargs['ip_address'] = ip
257+ try:
258+ pa = PowerAction(power_type)
259+ pa.execute(**kwargs)
260+ except PowerActionFail as e:
261+ failures[ip] = e
262+
263+ if len(failures) == len(ips):
264+ # All attempts failed. Propagate an exception so that the job is
265+ # marked as failed.
266+ raise PowerActionFail(pa, "Power action failed: %s" % failures)
267+
268+ # XXX 2014-05-01 jtv: Notify the webapp of success or failure.
269
270
271 @task
272
273=== modified file 'src/provisioningserver/tests/test_tasks.py'
274--- src/provisioningserver/tests/test_tasks.py 2014-04-22 15:10:32 +0000
275+++ src/provisioningserver/tests/test_tasks.py 2014-05-01 12:53:29 +0000
276@@ -41,6 +41,7 @@
277 MockCalledOnceWith,
278 )
279 from mock import (
280+ call,
281 Mock,
282 sentinel,
283 )
284@@ -173,35 +174,67 @@
285 # Patch out PowerAction; we're just trying to demonstrate that
286 # it's invoked with an IP address even if one is not supplied.
287 PowerAction = self.patch(tasks, "PowerAction")
288- # find_ip_via_arp() is tested elsewhere; we just want to know
289+ # resolve_MAC() is tested elsewhere; we just want to know
290 # that it has been used.
291- self.patch(tasks, "find_ip_via_arp").return_value = sentinel.ip_address
292+ self.patch(tasks, "resolve_MAC").return_value = {sentinel.ip_address}
293+ mac = factory.getRandomMACAddress()
294 # PowerAction.execute() is passed an ip_address argument in
295 # addition to the mac_address argument when the latter is
296 # supplied.
297- tasks.issue_power_action(
298- sentinel.power_type, "on", mac_address=sentinel.mac_address)
299+ tasks.issue_power_action(sentinel.power_type, "on", mac_address=mac)
300+ self.assertThat(tasks.resolve_MAC, MockCalledOnceWith(mac))
301 self.assertThat(PowerAction, MockCalledOnceWith(sentinel.power_type))
302- self.assertThat(PowerAction.return_value.execute, MockCalledOnceWith(
303- power_change="on", mac_address=sentinel.mac_address,
304- ip_address=sentinel.ip_address))
305+ self.assertEqual(
306+ [
307+ call(
308+ power_change='on', mac_address=mac,
309+ ip_address=sentinel.ip_address),
310+ ],
311+ PowerAction.return_value.execute.mock_calls)
312
313- def test_ip_address_is_looked_up_when_already_supplied(self):
314+ def test_ip_address_is_looked_up_even_when_already_supplied(self):
315 # Patch out PowerAction; we're just trying to demonstrate that
316 # it's invoked with an IP address even if one is not supplied.
317 PowerAction = self.patch(tasks, "PowerAction")
318- # find_ip_via_arp() is tested elsewhere; we just want to know
319+ # resolve_MAC() is tested elsewhere; we just want to know
320 # that it has been used.
321- self.patch(tasks, "find_ip_via_arp").return_value = sentinel.ip_address
322+ self.patch(tasks, "resolve_MAC").return_value = {
323+ sentinel.resolved_ip_address}
324+ mac = factory.getRandomMACAddress()
325 # The ip_address argument passed to PowerAction.execute() is
326- # looked-up via find_ip_via_arp() even if an ip_address is
327- # passed into issue_power_action().
328+ # looked-up via resolve_MAC() even if an ip_address is
329+ # passed into issue_power_action(). The power action is issued to both
330+ # IP addresses.
331 tasks.issue_power_action(
332- sentinel.power_type, "on", mac_address=sentinel.mac_address,
333- ip_address=sentinel.another_ip_address)
334- self.assertThat(PowerAction.return_value.execute, MockCalledOnceWith(
335- power_change="on", mac_address=sentinel.mac_address,
336- ip_address=sentinel.ip_address))
337+ sentinel.power_type, "on", mac_address=mac,
338+ ip_address=sentinel.given_ip_address)
339+ self.assertItemsEqual(
340+ [
341+ call(
342+ power_change='on', mac_address=mac,
343+ ip_address=sentinel.given_ip_address),
344+ call(
345+ power_change='on', mac_address=mac,
346+ ip_address=sentinel.resolved_ip_address),
347+ ],
348+ PowerAction.return_value.execute.mock_calls)
349+
350+ def test_ip_address_is_None_if_no_address_can_be_found(self):
351+ # Patch out PowerAction; we're just trying to demonstrate that
352+ # it's invoked with an IP address even if one is not supplied.
353+ PowerAction = self.patch(tasks, "PowerAction")
354+ # resolve_MAC() is tested elsewhere; we just want to know
355+ # that it has been used.
356+ self.patch(tasks, "resolve_MAC").return_value = set()
357+ mac = factory.getRandomMACAddress()
358+ # PowerAction.execute() is passed an ip_address argument in
359+ # addition to the mac_address argument when the latter is
360+ # supplied.
361+ tasks.issue_power_action(sentinel.power_type, "on", mac_address=mac)
362+ self.assertThat(PowerAction, MockCalledOnceWith(sentinel.power_type))
363+ self.assertEqual(
364+ [call(power_change='on', mac_address=mac, ip_address=None)],
365+ PowerAction.return_value.execute.mock_calls)
366
367
368 class TestDHCPTasks(PservTestCase):