Merge lp:~allenap/maas/tftp-offload into lp:~maas-committers/maas/trunk

Proposed by Gavin Panella
Status: Merged
Approved by: Gavin Panella
Approved revision: no longer in the source branch.
Merged at revision: 5599
Proposed branch: lp:~allenap/maas/tftp-offload
Merge into: lp:~maas-committers/maas/trunk
Diff against target: 284 lines (+244/-0)
3 files modified
src/provisioningserver/plugin.py (+18/-0)
src/provisioningserver/rackdservices/tftp_offload.py (+207/-0)
src/provisioningserver/tests/test_plugin.py (+19/-0)
To merge this branch: bzr merge lp:~allenap/maas/tftp-offload
Reviewer Review Type Date Requested Status
Mike Pontillo (community) Approve
Review via email: mp+312146@code.launchpad.net

Commit message

Implement *experimental* TFTP offload support.

Description of the change

In my own time I've been writing a TFTP server in Rust. This small change to MAAS allows it to take up TFTP duties for MAAS. I will publish the server's code imminently. Though it's far from finished, the performance, predictably, already far outstrips MAAS's in-process TFTP server. This will be important when putting MAAS on more constrained hardware, and may improve the experience for all: that's what this experiment is intended to discover.

The change in the proposal is as self-contained as possible. The offload support won't even be imported unless it's explicitly requested, and it will be easy to remove this support at a later date. It is not documented in any publicly visible place so we're not making any promises about supporting this.

Once this change has landed and in the PPA I will publish some instructions for getting it working with the external offload process.

To post a comment you must log in.
Revision history for this message
Mike Pontillo (mpontillo) wrote :

This looks nice, and very valuable since we know the Python TFTP code causes memory leaks in MAAS long-term (on the order of months?). I'm tempted to just say land it, since it's effectively guarded from accidental use.

However, I'll say 'Needs Information', because I think this should be documented (in docs/) before it lands. (Preferably with a link to your rust-based reference implementation showing how to use it.)

My other worry is, given that this is an experimental project, it will be prone to bit rot unless we have CI tests for it. Should we commit to ensuring it is properly tested before we land it in trunk? (Maybe just keep it as a private branch until then.)

Also, it might be good if tftp_offload.py had comments with a reference to where the code is enabled in `plugin.py`, plus instructions on how to enable it.

A couple more comments below.

review: Needs Information
Revision history for this message
Gavin Panella (allenap) wrote :

Thanks Mike!

My expectation with this is that we'll either remove it before we release 2.2, or we'll rework it, add tests, and make it official. I think this is a worthwhile experiment but I do suspect the former (removal) will happen because this is a spike for which I want to get feedback but which is absolutely not production ready and is not intended to be. This is merely the easiest way to add support to MAAS for the experiment: the alternatives I considered were building packages myself in a PPA (too much work, too likely to break testers) or providing a patch to apply manually. I don't want to invest more time into documenting this than necessary, beyond providing instructions for how to get it working. Once this is cleared to land I will have the impetus to do that.

This is the server so far:
  https://github.com/allenap/allenap-tftp-offload
which builds on:
  https://github.com/allenap/allenap-libtftp

I'm not set on overriding tftp_port, though I think it's fairly harmless. Perhaps something like "does /etc/maas/experiment.tftp-offload exist" would be a reasonable alternative?

Revision history for this message
Gavin Panella (allenap) :
Revision history for this message
Gavin Panella (allenap) wrote :

I've written the instructions at https://github.com/allenap/allenap-tftp-offload.

Revision history for this message
Mike Pontillo (mpontillo) wrote :

Thanks. Looks good. See one comment below before landing, please.

review: Approve
Revision history for this message
Gavin Panella (allenap) wrote :

Thank you!

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'src/provisioningserver/plugin.py'
2--- src/provisioningserver/plugin.py 2016-10-19 17:06:30 +0000
3+++ src/provisioningserver/plugin.py 2016-12-08 22:02:38 +0000
4@@ -98,6 +98,24 @@
5 resource_root=tftp_root, port=tftp_port,
6 client_service=rpc_service)
7 tftp_service.setName("tftp")
8+
9+ # *** EXPERIMENTAL ***
10+ # https://code.launchpad.net/~allenap/maas/tftp-offload/+merge/312146
11+ # If the TFTP port has been set to zero, use the experimental offload
12+ # service. Otherwise stick to the normal in-process TFTP service.
13+ if tftp_port == 0:
14+ from provisioningserver.path import get_path
15+ from provisioningserver.rackdservices import tftp_offload
16+ from twisted.internet.endpoints import UNIXServerEndpoint
17+ tftp_offload_socket = get_path("/var/lib/maas/tftp-offload.sock")
18+ tftp_offload_endpoint = UNIXServerEndpoint(
19+ reactor, tftp_offload_socket, wantPID=False)
20+ tftp_offload_service = tftp_offload.TFTPOffloadService(
21+ reactor, tftp_offload_endpoint, tftp_service.backend)
22+ tftp_offload_service.setName("tftp-offload")
23+ return tftp_offload_service
24+ # *** /EXPERIMENTAL ***
25+
26 return tftp_service
27
28 def _makeImageDownloadService(self, rpc_service, tftp_root):
29
30=== added file 'src/provisioningserver/rackdservices/tftp_offload.py'
31--- src/provisioningserver/rackdservices/tftp_offload.py 1970-01-01 00:00:00 +0000
32+++ src/provisioningserver/rackdservices/tftp_offload.py 2016-12-08 22:02:38 +0000
33@@ -0,0 +1,207 @@
34+# Copyright 2016 Canonical Ltd. This software is licensed under the
35+# GNU Affero General Public License version 3 (see the file LICENSE).
36+
37+"""TFTP offload server.
38+
39+ ********************
40+ ** **
41+ ** EXPERIMENTAL **
42+ ** **
43+ ********************
44+
45+"""
46+
47+__all__ = [
48+ "TFTPOffloadService",
49+]
50+
51+import os
52+import shutil
53+import tempfile
54+
55+from provisioningserver.logger import LegacyLogger
56+from provisioningserver.utils.twisted import (
57+ call,
58+ callOut,
59+)
60+import tftp.backend
61+from twisted.application.internet import StreamServerEndpointService
62+from twisted.internet import (
63+ interfaces,
64+ protocol,
65+)
66+from twisted.internet.defer import (
67+ inlineCallbacks,
68+ maybeDeferred,
69+)
70+from twisted.python import context
71+from twisted.python.filepath import FilePath
72+from zope.interface import implementer
73+
74+
75+log = LegacyLogger()
76+
77+
78+class TFTPOffloadService(StreamServerEndpointService):
79+ """Service for `TFTPOffloadProtocol` on a given endpoint."""
80+
81+ def __init__(self, reactor, endpoint, backend):
82+ """
83+ :param reactor: A Twisted reactor.
84+ :param endpoint: A Twisted endpoint.
85+ :param backend: An instance of `TFTPBackend`.
86+ """
87+ super(TFTPOffloadService, self).__init__(
88+ endpoint, TFTPOffloadProtocolFactory(backend))
89+
90+
91+@implementer(interfaces.IProtocolFactory)
92+class TFTPOffloadProtocolFactory:
93+ """Factory for `TFTPOffloadProtocol`.
94+
95+ This arranges a "hand-off" store (i.e. a temporary directory) that is used
96+ by the protocol to give ephemeral files to the offload process, which is
97+ then responsible for deleting them. This temporary directory will also be
98+ removed during shutdown of this factory so belt-n-braces.
99+ """
100+
101+ def __init__(self, backend):
102+ super(TFTPOffloadProtocolFactory, self).__init__()
103+ self.backend = backend
104+
105+ def buildProtocol(self, addr):
106+ return TFTPOffloadProtocol(self.backend, self.store)
107+
108+ def doStart(self):
109+ self.store = tempfile.mkdtemp(
110+ prefix="maas.tftp.", suffix=".store")
111+
112+ def doStop(self):
113+ store, self.store = self.store, None
114+ shutil.rmtree(store)
115+
116+
117+class TFTPOffloadProtocol(protocol.Protocol):
118+ """A partially dynamic read-only TFTP **offload** protocol.
119+
120+ The protocol is simple. Where applicable and unless noted otherwise,
121+ values are encoded as ASCII.
122+
123+ - On a new connection, write the following:
124+
125+ - The local IP address
126+ - NULL (a single byte of value zero)
127+ - The remote IP address
128+ - NULL
129+ - The requested file name
130+ - NULL
131+ - A literal dollar sign
132+
133+ - In response the following will be written:
134+
135+ - A hyphen to indicate success
136+ - NULL
137+ - Either a hyphen or "EPH" to indicate the type of response
138+ - NULL
139+ - The file name to serve (in file-system encoding)
140+ - NULL
141+ - A literal dollar sign
142+
143+ The offload process should then:
144+
145+ - Serve the file specified to its client
146+ - Where "EPH" was specified, delete the file
147+
148+ - Or, in the case of failure:
149+
150+ - A decimal in ASCII denoting a TFTP error code
151+ - NULL
152+ - An error message encoded as UTF-8
153+ - NULL
154+ - A literal dollar sign
155+
156+ The offload process should then:
157+
158+ - Send an error packet to its client
159+ - Terminate the transfer
160+
161+ """
162+
163+ def __init__(self, backend, store):
164+ """
165+ :param backend: An instance of `TFTPBackend`.
166+ """
167+ super(TFTPOffloadProtocol, self).__init__()
168+ self.backend = backend
169+ self.store = store
170+
171+ def connectionMade(self):
172+ self.buf = b""
173+
174+ def dataReceived(self, data):
175+ self.buf += data
176+ parts = self.buf.split(b"\x00")
177+ if len(parts) >= 4:
178+ self.prepareResponse(*parts)
179+
180+ def prepareResponse(self, local, remote, file_name, over, *rest):
181+ if over != b"$":
182+ log.error(
183+ "Message not properly terminated: local={local!r} "
184+ "remote={remote!r} file_name={file_name!r} over={over!r} "
185+ "rest={rest!r}", local=local, remote=remote,
186+ file_name=file_name, over=over, rest=rest)
187+ self.transport.loseConnection()
188+ elif len(rest) != 0:
189+ log.error(
190+ "Message had trailing garbage: local={local!r} "
191+ "remote={remote!r} file_name={file_name!r} over={over!r} "
192+ "rest={rest!r}", local=local, remote=remote,
193+ file_name=file_name, over=over, rest=rest)
194+ self.transport.loseConnection()
195+ else:
196+ d = context.call(
197+ {"local": (local.decode(), 0), "remote": (remote.decode(), 0)},
198+ self.backend.get_reader, file_name)
199+ d.addCallbacks(self.prepareWriteResponse, self.writeError)
200+ d.addBoth(call, self.transport.loseConnection)
201+ d.addErrback(log.err, "Failure in TFTP back-end.")
202+
203+ def prepareWriteResponse(self, reader):
204+ if isinstance(reader, tftp.backend.FilesystemReader):
205+ d = maybeDeferred(self.writeFileResponse, reader)
206+ else:
207+ d = maybeDeferred(self.writeStreamedResponse, reader)
208+ return d.addBoth(callOut, reader.finish)
209+
210+ def writeFileResponse(self, reader):
211+ return self.writeResponse(
212+ reader.file_path, ephemeral=False)
213+
214+ def writeStreamedResponse(self, reader):
215+ return self.copyReader(reader).addCallback(
216+ self.writeResponse, ephemeral=True)
217+
218+ @inlineCallbacks
219+ def copyReader(self, reader):
220+ tempfd, tempname = tempfile.mkstemp(dir=self.store)
221+ with os.fdopen(tempfd, "wb") as tempfd:
222+ chunksize = 2 ** 16 # 64kiB
223+ while True:
224+ chunk = yield reader.read(chunksize)
225+ tempfd.write(chunk)
226+ if len(chunk) < chunksize:
227+ return FilePath(tempname)
228+
229+ def writeResponse(self, file_path, *, ephemeral):
230+ self.transport.write(b"-\x00") # Hyphen = okay.
231+ self.transport.write(b"EPH\x00" if ephemeral else b"-\x00")
232+ self.transport.write(file_path.asBytesMode().path)
233+ self.transport.write(b"\x00$") # Terminate. We're done.
234+ self.transport.loseConnection()
235+
236+ def writeError(self, failure):
237+ # Use TFTP error codes where possible.
238+ self.transport.write(b"0\x00") # 0 = See error message.
239+ self.transport.write(failure.getErrorMessage().encode())
240+ self.transport.write(b"\x00$") # Terminate. We're done.
241
242=== modified file 'src/provisioningserver/tests/test_plugin.py'
243--- src/provisioningserver/tests/test_plugin.py 2016-10-19 21:08:54 +0000
244+++ src/provisioningserver/tests/test_plugin.py 2016-12-08 22:02:38 +0000
245@@ -44,13 +44,16 @@
246 TFTPBackend,
247 TFTPService,
248 )
249+from provisioningserver.rackdservices.tftp_offload import TFTPOffloadService
250 from provisioningserver.testing.config import ClusterConfigurationFixture
251 from testtools.matchers import (
252 AfterPreprocessing,
253+ Contains,
254 Equals,
255 IsInstance,
256 MatchesAll,
257 MatchesStructure,
258+ Not,
259 )
260 from twisted.application.service import MultiService
261 from twisted.python.filepath import FilePath
262@@ -112,6 +115,22 @@
263 logger.configure, MockCalledOnceWith(
264 options["verbosity"], logger.LoggingMode.TWISTD))
265
266+ def test_makeService_with_EXPERIMENTAL_tftp_offload_service(self):
267+ """
268+ Only the site service is created when no options are given.
269+ """
270+ # Activate the offload service by setting port to 0.
271+ self.useFixture(ClusterConfigurationFixture(tftp_port=0))
272+
273+ options = Options()
274+ service_maker = ProvisioningServiceMaker("Harry", "Hill")
275+ service = service_maker.makeService(options)
276+ self.assertIsInstance(service, MultiService)
277+ self.assertThat(service.namedServices, Not(Contains("tftp")))
278+ self.assertThat(service.namedServices, Contains("tftp-offload"))
279+ tftp_offload_service = service.getServiceNamed("tftp-offload")
280+ self.assertThat(tftp_offload_service, IsInstance(TFTPOffloadService))
281+
282 def test_makeService_patches_tftp_service(self):
283 mock_tftp_patch = (
284 self.patch(plugin_module, 'add_patches_to_txtftp'))