Merge ~dirk.zimoch/epics-base:iocLogClientFixes into ~epics-core/epics-base/+git/epics-base:7.0

Proposed by Dirk Zimoch
Status: Rejected
Rejected by: mdavidsaver
Proposed branch: ~dirk.zimoch/epics-base:iocLogClientFixes
Merge into: ~epics-core/epics-base/+git/epics-base:7.0
Diff against target: 579 lines (+129/-161)
5 files modified
modules/database/src/ioc/misc/dbCore.dbd (+3/-0)
modules/libcom/src/log/iocLog.c (+25/-0)
modules/libcom/src/log/iocLogServer.c (+0/-12)
modules/libcom/src/log/logClient.c (+101/-148)
modules/libcom/src/log/logClient.h (+0/-1)
Reviewer Review Type Date Requested Status
mdavidsaver Disapprove
Review via email: mp+372043@code.launchpad.net

This proposal has been superseded by a proposal from 2019-09-18.

Commit message

fix several problems in logClient

Description of the change

Bugfix: don't send errlog to all logClients, only to iocLog
Improvements to minimize message loss when server disconnects:
* detect lost connection to server early
* do not discard buffer at disconnect
* flush buffer directly after reconnect
Other improvements:
* avoid delays on ioc startup and shutdown
* avoid repeated memmove

Relates to bug report https://bugs.launchpad.net/epics-base/+bug/1841608 and Tech-Talk thread https://epics.anl.gov/tech-talk/2019/msg01329.php.

To post a comment you must log in.
Revision history for this message
mdavidsaver (mdavidsaver) wrote :

What necessitates removing shutdown() and adding select()/recv()? To my knowledge send() should return with an error if the socket is, or becomes, disconnected.

review: Needs Information
Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

Unfortunately send() does not return an error immediately. It simply puts the message into the TCP send queue, even when the peer has already shut down the socket. Only one or more messages later send() returns an error. But by then the messages are lost. The logClient has already "successfully" sent them ti the TCP stack but TCP failed to deliver them and there is no way to recover the messages now.

From the send man page: "No indication of failure to deliver is implicit in a send()."
It only tells you that queuing the message was successful.

When the peer closes the socket, select() will show the socket readable and read() will return 0, but only after everything the peer may have actually send has been read (and discarded).
That is the only reliable way to detect a closed connection.
However this method only works if the receiving side had not been shut down.

I had to google a lot until I found this out. It seems many people get this wrong assuming send() should fail immediately. It doesn't.

I can't find the page that helped me in the end any more, but this one comes close:
https://www.softlab.ntua.gr/facilities/documentation/unix/unix-socket-faq/unix-socket-faq-2.html

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

When say "closed connection" do you mean normal closure (FIN/ACK), abnormal closure (RST), and/or (ideally) timeout?

Lack of an application level timeout is a known issue with the logging protocol. Are you trying to address this too?

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

I don't know the TCP details. I tested with a local server (ncat) which I terminated and also with our two production servers (syslog-ng adnd logstash) where the firewall closes idle connections after one hour. In both cases I can now detect when the connection breaks. Before I always lost messages.

Which timeout scenario do you have in mind? send() hanging? How could I test that?

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

I notice that the page you link states:

> DISCLAIMER: This is my interpretation of the RFCs (I have read all the TCP-related ones I could find), but I have not attempted to examine implementation source code or trace actual connections in order to verify it. I am satisfied that the logic is correct, though.

which makes me smile.

The situation you describe, with a "firewall" in between is actually fairly complicated. And opaque in that the firewall might be acting in different ways depending on how stateful and helpful it is (eg. whether one or both sides get RST).

What I understand about BSD sockets and TCP is that:

1. If the peer maintains the connection, but doesn't call recv() (at all, or not fast enough) then send() will eventually fill the Tx socket buffer FIFO and block until it becomes not full (by peer ACK on recv() ).

2. calling send() on a socket which has been closed (peer calls close()) while the TX FIFO is empty results in an immediate EPIPE.

3. if a send() is in progress because the Tx FIFO is full (or probably when not empty), a peer close results in ECONNRESET.

4. if a the peer has shutdown(SHUT_RD), then send() will succeed until the Tx FIFO is full, then it will block as with #2. When the peer does close(), the send will return with EPIPE. (or probably ECONNRESET if some data was sent, but not ACK'd before the FIN).

5. if the peer simply stops responding, then send() will succeed until the Tx FIFO is full, then block waiting for an ACK which never comes.

So normal, or abnormal, termination of the TCP connection will break a send() immediately after RST or the FIN+ACK handshake completes. peer shutdown(SHUT_RD) or interruption of the network path will not break or error send().

I've just verified 1,2,3,4 on Linux with short python scripts. (I was surprised that #4 doesn't error) #5 I know from past experience.

So I don't think that calling recv() serves the purpose you intend. I think that SO_SNDTIMEO would be a better fit.

That said, the recv() might serve as one piece of a mechanism for a log server to detect condition #5 faster than SO_KEEPALIVE would (imo. SO_KEEPALIVE should also be set). Some extra protocol would be needed to allow a log server to detect this protocol feature.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

4. Was the problem. send does not fail until some time later. But than messages are already lost.
So my fix is not to rely on send failing but on recv returning EOF.
SO_KEEPALIVE has a default timeout of 2 hours and you cannot change it per socket but only system wide. It is totally useless for solving the problems with lost messages. It is there to allow to clean up stale connections and not to run out of file descriptors in server programs that are constantly running. But that's about all. So it may help the log server but not the client.

The problem I solve here is that send succeeds until the FIFO is full, just as you pointed out in 4. and 5.

I cannot confirm 2 though. My tests with ncat as a server have shown that send does not detect immediately when the server has exited (and thus the kernel has closed the socket).

But leaving discussions about how TCP is supposed to work aside, my main argument for this change is: Now it works, before it didn't.

Revision history for this message
Andrew Johnson (anj) wrote :

@Dirk: Does your new version still work with the original iocLogServer?

IIRC at connection time that server explicitly closes the socket in the log-server -> client IOC direction because the server never sends data back to the client. I'm not sure whether that might cause the client to get an EOF from a recv() call sooner than with the other log-servers that don't do that close.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

Indeed this looks like a close to the client and it closes the connection, but re-opens it on the next message. No messages are lost. The disconnect is not nice though. I'll have a look how to fix that.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

wrt. iocLogServer, I think it would be reasonable to accept removal of the shutdown(SHUT_WR) in iocLogServer.c. So long as older IOCs can still log against newer iocLogServer (which I think would be true).

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

It seems that the broken connection is quite reliably detected at the second send() after the connection had been closed. A dummy send(socket, NULL, 0, 0) seems sufficient to detect the broken connection to iocLogServer as well as to ncat and do Michael's server.py example.

Revision history for this message
Andrew Johnson (anj) wrote :

@Dirk: Which architectures have you tested that on?

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

Linux so far. I am about to test on vxWorks and Windows right now.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

Hmm... Not all Linux versions behave the same. On some send(socket, NULL, 0, 0) is a NOOP and does not detect a broken connection.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

The send() trick with 0 bytes does not work on all systems. It seems to depend on the Linux kernel version as well.

I am testing now a different solution. On Linux the call ioctl(sock, TIOCOUTQ, &unsent) gives the number of unsent bytes in the TCP output queue. This allows to keep the unsent bytes in the buffer in case it turns out later that the connection is broken.

On macOS there seems to be a similar functionality with getsockopt(sock, SOL_SOCKET, SO_NWRITE, &unsent), but I cannot test because I have no mac.

Other Unix flavors may or may not have such a feature.

For vxWorks I could not find a way to get the number of unsent bytes, neither for version 5.5 nor for 6.7.

I don't know about RTEMS.

On Windows 10 the number of unsent bytes should be available with WSAIoctl(sock, SIO_TCP_INFO, ..._). I spent most of last week to find out why it is not working before finding out that the Win 10 version I got was too old. I am upgrading now. Older Windows versions seem to have no way to find out the number of unsent bytes.

So I can mitigate the problem of message loss for some clients but not for all. Depending on the OS.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

> The send() trick with 0 bytes does not work on all systems.

Unless poll() is used, there is a race condition involved. A second send() will only fail if a RST has been received. This may or may not be platform dependent.

Have you discarded the idea of removing the 'shutdown(SHUT_WR)' from iocLogServer?

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

I first want to stay as compatible as possible to existing iocLogServer implementations.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

I made a new branch some commits below the head of this one. How can I change the merge request to the new branch?

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

I usually do a force push to replace the existing branch content. I don't think branches can be changed, so the alternative is a new MR.

Revision history for this message
Martin Konrad (info-martin-konrad) wrote :

Beware that force pushing messes up the diff on Launchpad's MR. Dirk, have you considered doing this in smaller steps (separate MRs for each issue you are addressing)? This way some of your modifications might get merged faster.

Revision history for this message
Ralph Lange (ralph-lange) wrote :

iirc, when you properly supersede the MP through the web interface, you have an option to pull/link the existing conversation to the new proposal.

Revision history for this message
Dirk Zimoch (dirk.zimoch) wrote :

So, now I have both variants for discussion.
This one, which requires a change in iocLogServer as well (not to shutdown its write channel).
Or mp+372925 which does not quite work for vxWorks and older Windows and uses an unspecified feature of send() in Linux.

abbd54b... by Dirk Zimoch

fix wrong function name in comment

301a6e0... by Dirk Zimoch

moved logClientSendMessage and made it static

5f0df84... by Dirk Zimoch

increase error message buffer size for long (Windows) error messges

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

The other variant was merged, so this one is no longer relevant.

review: Disapprove

Unmerged commits

5f0df84... by Dirk Zimoch

increase error message buffer size for long (Windows) error messges

301a6e0... by Dirk Zimoch

moved logClientSendMessage and made it static

abbd54b... by Dirk Zimoch

fix wrong function name in comment

bad6295... by Dirk Zimoch

don't shutdown write channel of iocLogServer to allow logClient to detect disconnect

5801ff2... by Dirk Zimoch

use dynamic debug flag for logClient

9dfed16... by Dirk Zimoch

improve logClientShow to show unsent bytes on level 2 (and fix level 1)

b925d3f... by Dirk Zimoch

(Re-)connect only when there is something to send. This avoids constant useless disconnect/connect cycles if the server (or firewall) keeps closing the connection.

2cafe6d... by Dirk Zimoch

detect when server closes connection. can only be done reliably with recv, not with send

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1diff --git a/modules/database/src/ioc/misc/dbCore.dbd b/modules/database/src/ioc/misc/dbCore.dbd
2index 51f9c96..adde24e 100644
3--- a/modules/database/src/ioc/misc/dbCore.dbd
4+++ b/modules/database/src/ioc/misc/dbCore.dbd
5@@ -33,3 +33,6 @@ variable(callbackParallelThreadsDefault,int)
6
7 # Real-time operation
8 variable(dbThreadRealtimeLock,int)
9+
10+# show logClient network activity
11+variable(logClientDebug,int)
12diff --git a/modules/libcom/src/log/iocLog.c b/modules/libcom/src/log/iocLog.c
13index e62da20..ba78041 100644
14--- a/modules/libcom/src/log/iocLog.c
15+++ b/modules/libcom/src/log/iocLog.c
16@@ -18,8 +18,10 @@
17
18 #define epicsExportSharedSymbols
19 #include "envDefs.h"
20+#include "errlog.h"
21 #include "logClient.h"
22 #include "iocLog.h"
23+#include "epicsExit.h"
24
25 int iocLogDisable = 0;
26
27@@ -75,6 +77,24 @@ void epicsShareAPI epicsShareAPI iocLogFlush (void)
28 }
29
30 /*
31+ * logClientSendMessage ()
32+ */
33+static void logClientSendMessage ( logClientId id, const char * message )
34+{
35+ if ( !iocLogDisable ) {
36+ logClientSend (id, message);
37+ }
38+}
39+
40+/*
41+ * iocLogClientDestroy()
42+ */
43+static void iocLogClientDestroy (logClientId id)
44+{
45+ errlogRemoveListeners (logClientSendMessage, id);
46+}
47+
48+/*
49 * iocLogClientInit()
50 */
51 static logClientId iocLogClientInit (void)
52@@ -89,6 +109,10 @@ static logClientId iocLogClientInit (void)
53 return NULL;
54 }
55 id = logClientCreate (addr, port);
56+ if (id != NULL) {
57+ errlogAddListener (logClientSendMessage, id);
58+ epicsAtExit (iocLogClientDestroy, id);
59+ }
60 return id;
61 }
62
63@@ -135,3 +159,4 @@ logClientId epicsShareAPI logClientInit (void)
64 {
65 return iocLogClientInit ();
66 }
67+
68diff --git a/modules/libcom/src/log/iocLogServer.c b/modules/libcom/src/log/iocLogServer.c
69index e707185..4d2f14f 100644
70--- a/modules/libcom/src/log/iocLogServer.c
71+++ b/modules/libcom/src/log/iocLogServer.c
72@@ -486,18 +486,6 @@ static void acceptNewClient ( void *pParam )
73 }
74 }
75
76- status = shutdown(pclient->insock, SHUT_WR);
77- if(status<0){
78- char sockErrBuf[64];
79- epicsSocketConvertErrnoToString ( sockErrBuf, sizeof ( sockErrBuf ) );
80- fprintf (stderr, "%s:%d shutdown err %s\n", __FILE__, __LINE__,
81- sockErrBuf);
82- epicsSocketDestroy ( pclient->insock );
83- free(pclient);
84-
85- return;
86- }
87-
88 status = fdmgr_add_callback(
89 pserver->pfdctx,
90 pclient->insock,
91diff --git a/modules/libcom/src/log/logClient.c b/modules/libcom/src/log/logClient.c
92index 99ee671..986c7b8 100644
93--- a/modules/libcom/src/log/logClient.c
94+++ b/modules/libcom/src/log/logClient.c
95@@ -21,11 +21,9 @@
96 #include <string.h>
97 #include <stdio.h>
98
99-#define epicsExportSharedSymbols
100 #include "dbDefs.h"
101 #include "epicsEvent.h"
102 #include "iocLog.h"
103-#include "errlog.h"
104 #include "epicsMutex.h"
105 #include "epicsThread.h"
106 #include "epicsTime.h"
107@@ -33,9 +31,13 @@
108 #include "epicsAssert.h"
109 #include "epicsExit.h"
110 #include "epicsSignal.h"
111+#include "epicsExport.h"
112
113 #include "logClient.h"
114
115+int logClientDebug = 0;
116+epicsExportAddress (int, logClientDebug);
117+
118 typedef struct {
119 char msgBuf[0x4000];
120 struct sockaddr_in addr;
121@@ -44,6 +46,7 @@ typedef struct {
122 SOCKET sock;
123 epicsThreadId restartThreadId;
124 epicsEventId stateChangeNotify;
125+ epicsEventId shutdownNotify;
126 unsigned connectCount;
127 unsigned nextMsgIndex;
128 unsigned connected;
129@@ -53,7 +56,6 @@ typedef struct {
130 } logClient;
131
132 static const double LOG_RESTART_DELAY = 5.0; /* sec */
133-static const double LOG_SERVER_CREATE_CONNECT_SYNC_TIMEOUT = 5.0; /* sec */
134 static const double LOG_SERVER_SHUTDOWN_TIMEOUT = 30.0; /* sec */
135
136 /*
137@@ -66,10 +68,10 @@ static char* logClientPrefix = NULL;
138 */
139 static void logClientClose ( logClient *pClient )
140 {
141-# ifdef DEBUG
142+ if (logClientDebug) {
143 fprintf (stderr, "log client: lingering for connection close...");
144 fflush (stderr);
145-# endif
146+ }
147
148 /*
149 * mutex on
150@@ -84,8 +86,6 @@ static void logClientClose ( logClient *pClient )
151 pClient->sock = INVALID_SOCKET;
152 }
153
154- pClient->nextMsgIndex = 0u;
155- memset ( pClient->msgBuf, '\0', sizeof ( pClient->msgBuf ) );
156 pClient->connected = 0u;
157
158 /*
159@@ -93,9 +93,8 @@ static void logClientClose ( logClient *pClient )
160 */
161 epicsMutexUnlock (pClient->mutex);
162
163-# ifdef DEBUG
164+ if (logClientDebug)
165 fprintf (stderr, "done\n");
166-# endif
167 }
168
169 /*
170@@ -113,6 +112,7 @@ static void logClientDestroy (logClientId id)
171 epicsMutexMustLock ( pClient->mutex );
172 pClient->shutdown = 1u;
173 epicsMutexUnlock ( pClient->mutex );
174+ epicsEventSignal ( pClient->shutdownNotify );
175
176 /* unblock log client thread blocking in send() or connect() */
177 interruptInfo =
178@@ -154,18 +154,41 @@ static void logClientDestroy (logClientId id)
179 return;
180 }
181
182- errlogRemoveListeners ( logClientSendMessage, (void *) pClient );
183-
184 logClientClose ( pClient );
185
186 epicsMutexDestroy ( pClient->mutex );
187-
188 epicsEventDestroy ( pClient->stateChangeNotify );
189+ epicsEventDestroy ( pClient->shutdownNotify );
190
191 free ( pClient );
192 }
193
194 /*
195+ * logClientCheckConnection
196+ */
197+static void logClientCheckConnection( logClient * pClient )
198+{
199+ epicsMutexMustLock ( pClient->mutex );
200+ while ( pClient->connected ) {
201+ struct timeval timeout = { 0, 0 };
202+ fd_set set;
203+ char buffer[256];
204+
205+ FD_ZERO ( &set );
206+ FD_SET ( pClient->sock, &set );
207+ if ( select ( pClient->sock + 1, &set, NULL, NULL, &timeout ) == 0)
208+ break;
209+ if ( recv ( pClient->sock, buffer, sizeof ( buffer ), 0 ) == 0 ) {
210+ fprintf ( stderr, "log client: connection closed by server \"%s\"\n",
211+ pClient->name );
212+ logClientClose ( pClient );
213+ break;
214+ }
215+ }
216+ epicsMutexUnlock ( pClient->mutex );
217+}
218+
219+/*
220 * This method requires the pClient->mutex be owned already.
221 */
222 static void sendMessageChunk(logClient * pClient, const char * message) {
223@@ -176,57 +199,24 @@ static void sendMessageChunk(logClient * pClient, const char * message) {
224 unsigned msgBufBytesLeft =
225 sizeof ( pClient->msgBuf ) - pClient->nextMsgIndex;
226
227- if ( strSize > msgBufBytesLeft ) {
228- int status;
229-
230- if ( ! pClient->connected ) {
231- break;
232- }
233-
234- if ( msgBufBytesLeft > 0u ) {
235- memcpy ( & pClient->msgBuf[pClient->nextMsgIndex],
236- message, msgBufBytesLeft );
237- pClient->nextMsgIndex += msgBufBytesLeft;
238- strSize -= msgBufBytesLeft;
239- message += msgBufBytesLeft;
240- }
241-
242- status = send ( pClient->sock, pClient->msgBuf,
243- pClient->nextMsgIndex, 0 );
244- if ( status > 0 ) {
245- unsigned nSent = (unsigned) status;
246- if ( nSent < pClient->nextMsgIndex ) {
247- unsigned newNextMsgIndex = pClient->nextMsgIndex - nSent;
248- memmove ( pClient->msgBuf, & pClient->msgBuf[nSent],
249- newNextMsgIndex );
250- pClient->nextMsgIndex = newNextMsgIndex;
251- }
252- else {
253- pClient->nextMsgIndex = 0u;
254- }
255- }
256- else {
257- if ( ! pClient->shutdown ) {
258- char sockErrBuf[64];
259- if ( status ) {
260- epicsSocketConvertErrnoToString ( sockErrBuf, sizeof ( sockErrBuf ) );
261- }
262- else {
263- strcpy ( sockErrBuf, "server initiated disconnect" );
264- }
265- fprintf ( stderr, "log client: lost contact with log server at \"%s\" because \"%s\"\n",
266- pClient->name, sockErrBuf );
267- }
268- logClientClose ( pClient );
269- break;
270- }
271+ if ( msgBufBytesLeft < strSize && pClient->nextMsgIndex != 0u && pClient->connected)
272+ {
273+ /* buffer is full, thus flush it */
274+ logClientCheckConnection( pClient );
275+ logClientFlush ( pClient );
276+ msgBufBytesLeft = sizeof ( pClient->msgBuf ) - pClient->nextMsgIndex;
277 }
278- else {
279- memcpy ( & pClient->msgBuf[pClient->nextMsgIndex],
280- message, strSize );
281- pClient->nextMsgIndex += strSize;
282+ if ( msgBufBytesLeft == 0u ) {
283+ fprintf ( stderr, "log client: messages to \"%s\" are lost\n",
284+ pClient->name );
285 break;
286 }
287+ if ( msgBufBytesLeft > strSize) msgBufBytesLeft = strSize;
288+ memcpy ( & pClient->msgBuf[pClient->nextMsgIndex],
289+ message, msgBufBytesLeft );
290+ pClient->nextMsgIndex += msgBufBytesLeft;
291+ strSize -= msgBufBytesLeft;
292+ message += msgBufBytesLeft;
293 }
294 }
295
296@@ -255,32 +245,25 @@ void epicsShareAPI logClientSend ( logClientId id, const char * message )
297
298 void epicsShareAPI logClientFlush ( logClientId id )
299 {
300+ unsigned nSent = 0u;
301+
302 logClient * pClient = ( logClient * ) id;
303
304- if ( ! pClient ) {
305+ if ( ! pClient || ! pClient->connected ) {
306 return;
307 }
308
309 epicsMutexMustLock ( pClient->mutex );
310
311- while ( pClient->nextMsgIndex && pClient->connected ) {
312- int status = send ( pClient->sock, pClient->msgBuf,
313- pClient->nextMsgIndex, 0 );
314+ while ( nSent < pClient->nextMsgIndex && pClient->connected ) {
315+ int status = send ( pClient->sock, pClient->msgBuf + nSent,
316+ pClient->nextMsgIndex - nSent, 0 );
317 if ( status > 0 ) {
318- unsigned nSent = (unsigned) status;
319- if ( nSent < pClient->nextMsgIndex ) {
320- unsigned newNextMsgIndex = pClient->nextMsgIndex - nSent;
321- memmove ( pClient->msgBuf, & pClient->msgBuf[nSent],
322- newNextMsgIndex );
323- pClient->nextMsgIndex = newNextMsgIndex;
324- }
325- else {
326- pClient->nextMsgIndex = 0u;
327- }
328+ nSent += (unsigned) status;
329 }
330 else {
331 if ( ! pClient->shutdown ) {
332- char sockErrBuf[64];
333+ char sockErrBuf[128];
334 if ( status ) {
335 epicsSocketConvertErrnoToString ( sockErrBuf, sizeof ( sockErrBuf ) );
336 }
337@@ -294,6 +277,11 @@ void epicsShareAPI logClientFlush ( logClientId id )
338 break;
339 }
340 }
341+ pClient->nextMsgIndex -= nSent;
342+ if ( nSent > 0 && pClient->nextMsgIndex > 0 ) {
343+ memmove ( pClient->msgBuf, & pClient->msgBuf[nSent],
344+ pClient->nextMsgIndex );
345+ }
346 epicsMutexUnlock ( pClient->mutex );
347 }
348
349@@ -302,10 +290,10 @@ void epicsShareAPI logClientFlush ( logClientId id )
350 */
351 static void logClientMakeSock (logClient *pClient)
352 {
353-
354-# ifdef DEBUG
355+ if (logClientDebug) {
356 fprintf (stderr, "log client: creating socket...");
357-# endif
358+ fflush (stderr);
359+ }
360
361 epicsMutexMustLock (pClient->mutex);
362
363@@ -314,7 +302,7 @@ static void logClientMakeSock (logClient *pClient)
364 */
365 pClient->sock = epicsSocketCreate ( AF_INET, SOCK_STREAM, 0 );
366 if ( pClient->sock == INVALID_SOCKET ) {
367- char sockErrBuf[64];
368+ char sockErrBuf[128];
369 epicsSocketConvertErrnoToString (
370 sockErrBuf, sizeof ( sockErrBuf ) );
371 fprintf ( stderr, "log client: no socket error %s\n",
372@@ -323,10 +311,8 @@ static void logClientMakeSock (logClient *pClient)
373
374 epicsMutexUnlock (pClient->mutex);
375
376-# ifdef DEBUG
377+ if (logClientDebug)
378 fprintf (stderr, "done\n");
379-# endif
380-
381 }
382
383 /*
384@@ -366,7 +352,7 @@ static void logClientConnect (logClient *pClient)
385 }
386 else {
387 if ( pClient->connFailStatus != errnoCpy && ! pClient->shutdown ) {
388- char sockErrBuf[64];
389+ char sockErrBuf[128];
390 epicsSocketConvertErrnoToString (
391 sockErrBuf, sizeof ( sockErrBuf ) );
392 fprintf (stderr,
393@@ -392,27 +378,13 @@ static void logClientConnect (logClient *pClient)
394 optval = TRUE;
395 status = setsockopt (pClient->sock, SOL_SOCKET, SO_KEEPALIVE, (char *)&optval, sizeof(optval));
396 if (status<0) {
397- char sockErrBuf[64];
398+ char sockErrBuf[128];
399 epicsSocketConvertErrnoToString (
400 sockErrBuf, sizeof ( sockErrBuf ) );
401 fprintf (stderr, "log client: unable to enable keepalive option because \"%s\"\n", sockErrBuf);
402 }
403
404 /*
405- * we don't need full-duplex (clients only write), so we shutdown
406- * the read end of our socket
407- */
408- status = shutdown (pClient->sock, SHUT_RD);
409- if (status < 0) {
410- char sockErrBuf[64];
411- epicsSocketConvertErrnoToString (
412- sockErrBuf, sizeof ( sockErrBuf ) );
413- fprintf (stderr, "%s:%d shutdown(%d,SHUT_RD) error was \"%s\"\n",
414- __FILE__, __LINE__, pClient->sock, sockErrBuf);
415- /* not fatal (although it shouldn't happen) */
416- }
417-
418- /*
419 * set how long we will wait for the TCP state machine
420 * to clean up when we issue a close(). This
421 * guarantees that messages are serialized when we
422@@ -425,7 +397,7 @@ static void logClientConnect (logClient *pClient)
423 lingerval.l_linger = 60*5;
424 status = setsockopt (pClient->sock, SOL_SOCKET, SO_LINGER, (char *) &lingerval, sizeof(lingerval));
425 if (status<0) {
426- char sockErrBuf[64];
427+ char sockErrBuf[128];
428 epicsSocketConvertErrnoToString (
429 sockErrBuf, sizeof ( sockErrBuf ) );
430 fprintf (stderr, "log client: unable to set linger options because \"%s\"\n", sockErrBuf);
431@@ -451,20 +423,21 @@ static void logClientRestart ( logClientId id )
432 /* SMP safe state inspection */
433 epicsMutexMustLock ( pClient->mutex );
434 while ( ! pClient->shutdown ) {
435- unsigned isConn;
436+ unsigned isConn, dataToSend;
437+
438+ logClientCheckConnection( pClient );
439
440 isConn = pClient->connected;
441+ dataToSend = pClient->nextMsgIndex;
442
443 epicsMutexUnlock ( pClient->mutex );
444
445- if ( isConn ) {
446+ if ( dataToSend ) {
447+ if ( ! isConn ) logClientConnect ( pClient );
448 logClientFlush ( pClient );
449 }
450- else {
451- logClientConnect ( pClient );
452- }
453-
454- epicsThreadSleep ( LOG_RESTART_DELAY );
455+
456+ epicsEventWaitWithTimeout ( pClient->shutdownNotify, LOG_RESTART_DELAY);
457
458 epicsMutexMustLock ( pClient->mutex );
459 }
460@@ -480,9 +453,7 @@ static void logClientRestart ( logClientId id )
461 logClientId epicsShareAPI logClientCreate (
462 struct in_addr server_addr, unsigned short server_port)
463 {
464- epicsTimeStamp begin, current;
465 logClient *pClient;
466- double diff;
467
468 pClient = calloc (1, sizeof (*pClient));
469 if (pClient==NULL) {
470@@ -507,14 +478,22 @@ logClientId epicsShareAPI logClientCreate (
471 pClient->shutdownConfirm = 0;
472
473 epicsAtExit (logClientDestroy, (void*) pClient);
474-
475+
476 pClient->stateChangeNotify = epicsEventCreate (epicsEventEmpty);
477 if ( ! pClient->stateChangeNotify ) {
478 epicsMutexDestroy ( pClient->mutex );
479 free ( pClient );
480 return NULL;
481 }
482-
483+
484+ pClient->shutdownNotify = epicsEventCreate (epicsEventEmpty);
485+ if ( ! pClient->shutdownNotify ) {
486+ epicsMutexDestroy ( pClient->mutex );
487+ epicsEventDestroy ( pClient->stateChangeNotify );
488+ free ( pClient );
489+ return NULL;
490+ }
491+
492 pClient->restartThreadId = epicsThreadCreate (
493 "logRestart", epicsThreadPriorityLow,
494 epicsThreadGetStackSize(epicsThreadStackSmall),
495@@ -522,35 +501,12 @@ logClientId epicsShareAPI logClientCreate (
496 if ( pClient->restartThreadId == NULL ) {
497 epicsMutexDestroy ( pClient->mutex );
498 epicsEventDestroy ( pClient->stateChangeNotify );
499+ epicsEventDestroy ( pClient->shutdownNotify );
500 free (pClient);
501 fprintf(stderr, "log client: unable to start log client connection watch dog thread\n");
502 return NULL;
503 }
504
505- /*
506- * attempt to synchronize with circuit connect
507- */
508- epicsTimeGetCurrent ( & begin );
509- epicsMutexMustLock ( pClient->mutex );
510- do {
511- epicsMutexUnlock ( pClient->mutex );
512- epicsEventWaitWithTimeout (
513- pClient->stateChangeNotify,
514- LOG_SERVER_CREATE_CONNECT_SYNC_TIMEOUT / 10.0 );
515- epicsTimeGetCurrent ( & current );
516- diff = epicsTimeDiffInSeconds ( & current, & begin );
517- epicsMutexMustLock ( pClient->mutex );
518- }
519- while ( ! pClient->connected && diff < LOG_SERVER_CREATE_CONNECT_SYNC_TIMEOUT );
520- epicsMutexUnlock ( pClient->mutex );
521-
522- if ( ! pClient->connected ) {
523- fprintf (stderr, "log client create: timed out synchronizing with circuit connect to \"%s\" after %.1f seconds\n",
524- pClient->name, LOG_SERVER_CREATE_CONNECT_SYNC_TIMEOUT );
525- }
526-
527- errlogAddListener ( logClientSendMessage, (void *) pClient );
528-
529 return (void *) pClient;
530 }
531
532@@ -568,24 +524,21 @@ void epicsShareAPI logClientShow (logClientId id, unsigned level)
533 printf ("log client: disconnected from log server at \"%s\"\n", pClient->name);
534 }
535
536- if (level>1) {
537- printf ("log client: sock=%s, connect cycles = %u\n",
538- pClient->sock==INVALID_SOCKET?"INVALID":"OK",
539- pClient->connectCount);
540- }
541-
542 if (logClientPrefix) {
543 printf ("log client: prefix is \"%s\"\n", logClientPrefix);
544 }
545-}
546
547-/*
548- * logClientSendMessage (); deprecated
549- */
550-void logClientSendMessage ( logClientId id, const char * message )
551-{
552- if ( !iocLogDisable ) {
553- logClientSend (id, message);
554+ if (level>0) {
555+ printf ("log client: sock %s, connect cycles = %u\n",
556+ pClient->sock==INVALID_SOCKET?"INVALID":"OK",
557+ pClient->connectCount);
558+ }
559+ if (level>1) {
560+ printf ("log client: %u bytes in buffer\n", pClient->nextMsgIndex);
561+ if (pClient->nextMsgIndex)
562+ printf("-------------------------\n"
563+ "%.*s-------------------------\n",
564+ (int)(pClient->nextMsgIndex), pClient->msgBuf);
565 }
566 }
567
568diff --git a/modules/libcom/src/log/logClient.h b/modules/libcom/src/log/logClient.h
569index 1797bbb..3b3f63a 100644
570--- a/modules/libcom/src/log/logClient.h
571+++ b/modules/libcom/src/log/logClient.h
572@@ -38,7 +38,6 @@ epicsShareFunc void epicsShareAPI iocLogPrefix(const char* prefix);
573 /* deprecated interface; retained for backward compatibility */
574 /* note: implementations are in iocLog.c, not logClient.c */
575 epicsShareFunc logClientId epicsShareAPI logClientInit (void);
576-epicsShareFunc void logClientSendMessage (logClientId id, const char *message);
577
578 #ifdef __cplusplus
579 }

Subscribers

People subscribed via source and target branches