Code review comment for ~dirk.zimoch/epics-base:iocLogClientFixes

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

I notice that the page you link states:

> DISCLAIMER: This is my interpretation of the RFCs (I have read all the TCP-related ones I could find), but I have not attempted to examine implementation source code or trace actual connections in order to verify it. I am satisfied that the logic is correct, though.

which makes me smile.

The situation you describe, with a "firewall" in between is actually fairly complicated. And opaque in that the firewall might be acting in different ways depending on how stateful and helpful it is (eg. whether one or both sides get RST).

What I understand about BSD sockets and TCP is that:

1. If the peer maintains the connection, but doesn't call recv() (at all, or not fast enough) then send() will eventually fill the Tx socket buffer FIFO and block until it becomes not full (by peer ACK on recv() ).

2. calling send() on a socket which has been closed (peer calls close()) while the TX FIFO is empty results in an immediate EPIPE.

3. if a send() is in progress because the Tx FIFO is full (or probably when not empty), a peer close results in ECONNRESET.

4. if a the peer has shutdown(SHUT_RD), then send() will succeed until the Tx FIFO is full, then it will block as with #2. When the peer does close(), the send will return with EPIPE. (or probably ECONNRESET if some data was sent, but not ACK'd before the FIN).

5. if the peer simply stops responding, then send() will succeed until the Tx FIFO is full, then block waiting for an ACK which never comes.

So normal, or abnormal, termination of the TCP connection will break a send() immediately after RST or the FIN+ACK handshake completes. peer shutdown(SHUT_RD) or interruption of the network path will not break or error send().

I've just verified 1,2,3,4 on Linux with short python scripts. (I was surprised that #4 doesn't error) #5 I know from past experience.

So I don't think that calling recv() serves the purpose you intend. I think that SO_SNDTIMEO would be a better fit.

That said, the recv() might serve as one piece of a mechanism for a log server to detect condition #5 faster than SO_KEEPALIVE would (imo. SO_KEEPALIVE should also be set). Some extra protocol would be needed to allow a log server to detect this protocol feature.

« Back to merge proposal