Code review comment for ~epics-core/epics-base/+git/Com:thread-join

Revision history for this message
Andrew Johnson (anj) wrote :

The jenkins "Build Timeout" plugin wouldn't help much, the documentation says:

  Once the timeout is reached, Jenkins behaves as if an invisible hand has
  clicked the "abort build" button.

So as I wrote above Jenkins won't send out any emails as a result, and now nobody knows that the job failed.

The problem with implementing a watchdog timeout in the wrapper script is that the timeout period really should vary with the specific test and the machine it's running on. As Lewis Muir pointed out, the code asking for the join knows best how long to wait before giving up, and also what to do if it fails.

The epicsThread class provides an exitWait(double seconds) method which is equivalent to join with a timeout. Even Python's Thread.join() method has a timeout parameter.

A return status value from epicsThreadJoin() would provide the same kinds of information to the caller that pthread_join() and taskWait() return: "Not a joinable thread", "Cannot wait on self", "Invalid thread ID", and even "Timeout". A fundamental library routine shouldn't dictate that the only way to handle errors is for it to ignore them or call cantProceed(); a unit test program at least should be able to call testAbort() instead. Just step back a moment and look at your 3 implementations of epicsThreadJoin() — a significant proportion of the code in them is just handling error conditions.

« Back to merge proposal