TFTP file transfer fails when no options are given

Bug #1823105 reported by Lee Trager
This bug affects 2 people
Affects: MAAS
Status: Fix Released
Importance: Critical
Assigned to: Björn Tillenius

Bug Description

When a TFTP client sends no options when requesting a file from MAAS, the client is unable to send an ACK for the first block. As a result, the client and MAAS ping-pong: the client requests the file, receives the first block, and fails to send the ACK, and MAAS resends the first block. Clients that send TFTP options (e.g. tsize, blksize) do not run into this problem, as the option negotiation gives MAAS enough time to open the UDP port to receive the ACK.
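For reference, the two request shapes differ only in the trailing option fields. The sketch below builds both, following the packet layout in RFC 1350 and RFC 2347; the helper names `build_rrq` and `expected_first_reply` are made up for illustration and are not part of MAAS or python-tx-tftp.

```python
import struct

# Opcodes from RFC 1350 (RRQ, DATA, ACK) and RFC 2347 (OACK).
OP_RRQ, OP_DATA, OP_ACK, OP_OACK = 1, 3, 4, 6


def build_rrq(filename, mode="octet", options=None):
    """Build a TFTP read request.

    `options` is an optional dict such as {"blksize": 1468}; when it is
    empty or None the packet is a plain RFC 1350 request.
    """
    packet = struct.pack("!H", OP_RRQ)
    packet += filename.encode("ascii") + b"\x00"
    packet += mode.encode("ascii") + b"\x00"
    for name, value in (options or {}).items():
        packet += name.encode("ascii") + b"\x00"
        packet += str(value).encode("ascii") + b"\x00"
    return packet


def expected_first_reply(options=None):
    """Opcode a server that honours options sends first.

    Without options the server goes straight to DATA block 1; with
    options it sends an OACK, which the client answers with ACK 0
    before any data flows.
    """
    return OP_OACK if options else OP_DATA
```

The option-less request is the case described above: the first packet back is already DATA block 1, so there is no OACK/ACK 0 round trip before the transfer starts.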

Commit 97da24b, which started tracking TFTP file transfer time, surfaced this bug. The following patch fixes MAAS; however, the underlying issue is that the UDP port is not opening quickly enough. A heavily loaded system without TFTP file transfer tracking would most likely run into the same issue.

diff --git a/src/provisioningserver/rackdservices/tftp.py b/src/provisioningserver/rackdservices/tftp.py
index a7f735c27..38b8b01b2 100644
--- a/src/provisioningserver/rackdservices/tftp.py
+++ b/src/provisioningserver/rackdservices/tftp.py
@@ -527,8 +527,7 @@ class TFTPService(MultiService, object):
         for address in addrs_desired - addrs_established:
             if not IPAddress(address).is_link_local():
                 tftp_service = UDPServer(
-                    self.port, TransferTimeTrackingTFTP(self.backend),
-                    interface=address)
+                    self.port, TFTP(self.backend), interface=address)
                 tftp_service.setName(address)
                 tftp_service.setServiceParent(self)

Reproduction:
curl --tftp-no-options --max-time 5 tftp://$RACK_IP/pxelinux.cfg/default
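Where curl is not available, roughly the same check can be done with a few lines of Python. This is a minimal sketch, not a full TFTP client: it sends an option-less RRQ, ACKs block 1, and counts how often block 1 arrives. The function names, the default port 69 and the 5 second timeout are assumptions chosen to mirror the curl command.

```python
import socket
import struct

OP_DATA, OP_ACK = 3, 4  # opcodes from RFC 1350


def parse_data(packet):
    """Return (block_number, payload) for a DATA packet, else None."""
    if len(packet) < 4:
        return None
    opcode, block = struct.unpack("!HH", packet[:4])
    if opcode != OP_DATA:
        return None
    return block, packet[4:]


def build_ack(block):
    """Build an ACK packet for the given block number."""
    return struct.pack("!HH", OP_ACK, block)


def count_block1_deliveries(host, filename, port=69, timeout=5.0):
    """Request `filename` without options and count deliveries of block 1.

    A result of 1 means the ACK was accepted; more than 1 means the
    server kept resending block 1; 0 means no DATA packet arrived.
    """
    rrq = b"\x00\x01" + filename.encode("ascii") + b"\x00octet\x00"
    deliveries = 0
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(rrq, (host, port))
        while True:
            try:
                packet, peer = sock.recvfrom(65536)
            except socket.timeout:
                break
            parsed = parse_data(packet)
            if parsed is None or parsed[0] != 1:
                break  # error packet, or the server moved on to block 2
            deliveries += 1
            # Reply to the port the DATA packet came from (the server's TID).
            sock.sendto(build_ack(1), peer)
    finally:
        sock.close()
    return deliveries
```

Calling `count_block1_deliveries(rack_ip, "pxelinux.cfg/default")` against an affected rack controller would be expected to return a value greater than 1, though that has not been verified here.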

Lee Trager (ltrager)
description: updated
Björn Tillenius (bjornt) wrote :

I took a look at this, since I found it odd that we wouldn't open the UDP port quickly enough. And sure enough, that's not the problem. The problem is how we replace the read session with one that keeps track of the time. We do it too late, so the session has already started when we replace it. The curl command in the description shows this clearly: the contents are actually transferred, but then curl waits 5 seconds and times out.

With this diff, it seems to work correctly:

   http://paste.ubuntu.com/p/m2cfTWmVpk/
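
The ordering problem described in this comment can be pictured with a toy model. This is not the MAAS or python-tx-tftp code, and it is not the contents of the paste above; the classes `ReadSession` and `TimedReadSession` and both helper functions are hypothetical. It only illustrates why swapping in a fresh session object after the original has already sent block 1 could leave the client's ACK unmatched.

```python
import time


class ReadSession:
    """Toy stand-in for a TFTP read session (hypothetical, not tx-tftp)."""

    def __init__(self, blocks):
        self.blocks = blocks
        self.last_sent = 0  # number of the last DATA block sent; 0 = none yet

    def start(self):
        # Pretend DATA block 1 has just gone out on the wire.
        self.last_sent = 1

    def on_ack(self, block):
        """Return True if the ACK matches a block this object sent."""
        return block > 0 and block == self.last_sent


class TimedReadSession(ReadSession):
    """Same session, but it also records when the transfer started."""

    def start(self):
        self.started_at = time.monotonic()
        super().start()


def swap_after_start(blocks):
    """Too late: the original session has already sent block 1."""
    session = ReadSession(blocks)
    session.start()
    session = TimedReadSession(blocks)  # fresh object, knows of no sent block
    return session.on_ack(1)


def swap_before_start(blocks):
    """In time: the tracking session is the one that sends block 1."""
    session = TimedReadSession(blocks)
    session.start()
    return session.on_ack(1)
```

In this model `swap_after_start` returns False (the ACK for block 1 is not recognised) while `swap_before_start` returns True, which matches the symptom of data arriving but the transfer never completing.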

Lee Trager (ltrager) wrote :

Confirmed that Björn's patch fixes the issue, both on S390X and with the curl reproduction.

Andres Rodriguez (andreserl) wrote :

@Bjorn,

Could you please finish this asap and get it reviewed/landed for Monday?

Changed in maas:
assignee: nobody → Björn Tillenius (bjornt)
Changed in maas:
status: Triaged → In Progress
Changed in maas:
status: In Progress → Fix Committed
Changed in maas:
status: Fix Committed → Fix Released