remove all uses of XXXSUPFUN typedefs

Bug #1666924 reported by Ben Franksen
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
EPICS Base
Fix Released
Wishlist
Ben Franksen

Bug Description

The OO-like function tables defined in recSup.h, devSup.h, and drvSup.h disable type checking by using the typedefs RECSUPFUN, DEVSUPFUN, and DRVSUPFUN, respectively, which declare the methods as having "any number and type of arguments". For instance, this is in devSup.h:

typedef long (*DRVSUPFUN) (); /* ptr to driver support function*/

typedef struct drvet { /* driver entry table */
 long number; /*number of support routines*/
 DRVSUPFUN report; /*print report*/
 DRVSUPFUN init; /*init support*/
}drvet;

Supports have to uses casts to XXXSUPFUN when they fill a xxxet structure.

I think listing the well-known reasons why this is bad would be redundant. As an illustration, see the bug recently uncovered by Carsten Winkler (http://www.aps.anl.gov/epics/tech-talk/2017/msg00347.php).

I propose to change the definition of these structs so that all the methods get their proper type.

There remains the question how to handle the (implicit) subtype polymorphism for things like dbCommon. I propose to handle this by forcing support definitions to explicitly cast the argument (e.g. dbCommon *precord) to the appropriate subtype; this adds one line of code to each support function e.g.

long process(dbCommon *pdbCommon)
{
    aoRecord *prec = (aoRecord *)pdbCommon;
    ...use prec from here onward...
}

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

I've certainly been bitten by this before. The problem is how to change this is a backwards compatible way. That is, how not to break existing device support code which uses these definitions. Accomplishing this for both C and C++ seems a less than straight forward task.

You might be interested in looking at, or working on, lp:~epics-core/epics-base/dtypedset which tries to solve this problem.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

On the C++ side I've sometimes made use of

template<typename R>
struct dset6 {
    long N;
    long (*report)(R*);
    long (*init)(int);
    long (*init_record)(dbCommon*);
    long (*get_io_intr_info)(int, dbCommon* prec, IOSCANPVT*);
    long (*readwrite)(R*);
    long (*linconv)(R*);
};

Changed in epics-base:
status: New → Triaged
importance: Undecided → Wishlist
Revision history for this message
Ben Franksen (bfrk) wrote :

> The problem is how to change this is a backwards compatible way

IMO it should *not* be backwards compatible.

Support code written against the old API *should* refuse to compile. It's a simple enough task (just tedious) to fix the compiler errors you get, but to find all the places that need fixing is only easy if you let the type checker help you.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

> Support code written against the old API *should* refuse to compile.
> ... just tedious ...

IMO any solution here needs to handle backward/forward compatibility in some way other than as a flag day.

My thinking is to offer two struct definitions to allow incremental replacement. I suppose another option is to use an #ifdef to select one or the other. I'm sure there are other options as well.

Revision history for this message
Ben Franksen (bfrk) wrote :

I am not sure what "flag day" means exactly but I can make an educated guess ;-)

I do not think this is a problem at all. If we make these changes in 3.16, then people have a long time to adapt their code. Nobody is forced to switch to a 3.16 any time soon. We can make this change in a way that the required code changes are minimal and compatible with old versions of base.

Here is the idea, illustrated for recSup, other support types can be handled similarly.

We change the struct definition so that all methods have the intended type, but replace dbCommon with a macro REC_TYPE. Before the struct we add

  #ifndef REC_TYPE
  #define RECTYPE dbCommon
  #endif

With this, support implementations need the following minimal changes:

* add

    #define REC_TYPE aoRecord

  before they #include "recSup.h"

* remove (RECSUPFUN) type casts, if any.

Code that calls the methods need no changes at all due to the default #define.

All code that is actually type correct should compile cleanly (support implementations as well as method calls).

Support implementations that have been adapted in this way still work with the old struct. The #define REC_TYPE is harmless and the (RECSUPFUN) type casts are not necessary, there are none in base, for instance.

I have checked that this all works as intended, in principle, see the attached patch which is against the HEAD of 3.14. For recSup.h it contains all the needed changes. For illustration purposes I also adapted aoRecord.c with the one-liner #define (see above) which is all that is needed.

You'll get lots of warnings (type mismatches, mostly, e.g. in dbConvert.c) but none for aoRecord.c.

Revision history for this message
Ralph Lange (ralph-lange) wrote :

Given that 3.16 will be marketed as EPICS 7, which is obviously a MAJOR step,

+1 for

- doing this change in 3.16, and
- having a proper EPICS 7 migration guide explaining what to change.

If we do this properly, users might be surprised they only have to change so little.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

https://en.wikipedia.org/wiki/Flag_day_%28computing%29

An expensive change with no way back.

Ben, I hope I don't discourage you as I want to see dset/rset/... become more type safe. However, I don't consider this enough of a benefit to break API compatibility without a transition period where both APIs are available.

For 3.16 I won't accept a change to dset et al. by _default_. I'm open to, for example, allowing user code to define a macro to change these definitions.

When thinking about how such a change could be done, consider what a support module would need to do to remain compatible with with both APIs.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

  #ifndef REC_TYPE
  #define RECTYPE dbCommon
  #endif

This seems to carry the implicit assumption that each source file has device support for a single record type, which is not the case for any driver I've written.

Revision history for this message
Ben Franksen (bfrk) wrote :

> #ifndef REC_TYPE
> #define RECTYPE dbCommon
> #endif
>
> This seems to carry the implicit assumption that each source file has device support for a single
> record type, which is not the case for any driver I've written.

The situation for device support is more complicated than for record support. This is because each record type can (and has to) add methods to the common subset known to base. And these definitions have always been implicit, i.e. not published in a header file, which is the first thing that would have to change when we attack the device support problem.

This is why I tried to focus the discussion to record support, where the situation is more regular. I know of no record support module where a single C file implements more than one record type. Besides, what prompted this bug report was a crash of EPICS base due to type errors in dbConvert.c, when calling record support methods.

The most immediately pressing issue is to fix these bugs ASAP in all branches. This (and lazyness) is why the patch I made is against the 3.14 branch. I did not mean to ask you to apply that patch in any public bzr repo without modification. The patch I sent is there to help you identify these bugs and as proof that the idea is sound in principle.

> An expensive change with no way back.

Adding one line per C file is not expensive. And there is no reason why anyone would want to go back on this.

I already said (but perhaps not with enough emphasis) that the changes needed to adapt record supports are compatible. Why would one additional #define before including recSup.h break anything w.r.t. the old recSup.h? Also note that there are *no* uses of the typedef RECSUPFUN in base. This implies that if record supports that are not in base remove their uses of RECSUPFUN (so they are 3.16 compatible) then they are still compatible to base versions with the old recSup.h.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

> I already said

Apologies if I've mis-understood. I think I've made my criteria clear, so I won't say more until there is code.

Revision history for this message
Ben Franksen (bfrk) wrote :

I have pushed a branch (~bfrk/epics-base/typed-recsup) but it says the branch is empty and displays a permanent

Updating branch...

Launchpad is processing new changes to this branch which will be available in a few minutes. Reload to see the changes.

Strangely, I do see the patch I committed (and all the ones before it, I started from the mainline this time).

Any ideas?

Changed in epics-base:
assignee: nobody → Ben Franksen (bfrk)
status: Triaged → In Progress
Revision history for this message
mdavidsaver (mdavidsaver) wrote :

Looks fine now. Launchpad minutes can be quite long.

Do you want to start a merge proposal for your branch?

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

In a quick pass through https://github.com/search?p=11&q=org%3Aepicsdeb+get_precision&type=Code and https://github.com/search?p=5&q=RECSUPFUN&ref=searchresults&type=Code&utf8=%E2%9C%93 I find only two record types defined in C++, where RECSUPFUN must be used. motorRecord and something called axisRecord. These two would be broken by this change. This isn't as bad as I had feared.

How would you change motorRecord.cc so that it builds against both APIs?

https://github.com/epics-modules/motor/blob/14b1070b7d64fb95e7dcf4be06b9386a088cf258/motorApp/MotorSrc/motorRecord.cc

https://github.com/EPICS-motor-wg/axis/blob/36813c8981a60b4af225e4ad200c5fc94190efee/axisApp/AxisSrc/axisRecord.cc

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

I pulled out the change to dbConvert.c onto the 3.14 branch as I think this is a clear bug fix. http://bazaar.launchpad.net/~epics-core/epics-base/3.14/revision/12700

I didn't apply the change to dbAccess.c as this doesn't seem correct to me. I think dbr_precision_size will always be larger than sizeof(epicsInt32). As the DBRprecision union has 'long dp', the present code seems correct. The comment near DBRprecision in dbAccessDefs.h explains this.

Revision history for this message
Ben Franksen (bfrk) wrote :

Thanks. BTW credit for the fix is due Freddie <email address hidden>.

You are right wrt dbr_precision_size, of course. Didn't look sharp enough at the header file.

Re C++: Good catch. It did not occur to me to consider supports written in C++. Will look into that.

Revision history for this message
Ben Franksen (bfrk) wrote :

> Do you want to start a merge proposal for your branch?

Not sure about the procedure to follow here. Is this done early or late in the process?

Revision history for this message
mdavidsaver (mdavidsaver) wrote : Re: [Bug 1666924] Re: remove all uses of XXXSUPFUN typedefs

On 02/26/2017 08:18 AM, Ben Franksen wrote:
>> Do you want to start a merge proposal for your branch?
>
> Not sure about the procedure to follow here. Is this done early or late
> in the process?

Early please

Revision history for this message
Andrew Johnson (anj) wrote :

One response to this comment:

> there is no reason why anyone would want to go back on this.

It is pretty much essential now for support module releases to be compatible with more than one branch of Base. The guys supporting synApps don't mind having to make minor changes to their code to get it to build against new versions of Base, but they do want a new release to compile against older Base versions too — they *really* don't want to have a separate release branch of a module for each Base branch. They will use conditional compilation #if .. #else .. #endif blocks if necessary to make that work, but doing so does make the code ugly and harder to read.

In this case having to #define REC_TYPE for the new release won't break builds against the old Base, so I that should be fine. Having to make some method arguments const might be OK, but this has more of a chance of causing problems inside the method depending on how the arguments get used and what other routines they might get passed to.

Revision history for this message
mdavidsaver (mdavidsaver) wrote :

This work was release in 3.16.1.

Changed in epics-base:
status: In Progress → Fix Released
Revision history for this message
mdavidsaver (mdavidsaver) wrote :

It looks like we missed the init() function, which seems like it should be "long (*init)(void);". Can we fix w/o a compatibility break?

> struct typed_rset {
> long number; /* number of support routines */
> long (*report)(void *precord);
> long (*init)();

Revision history for this message
Andrew Johnson (anj) wrote :

I don't know of any record types that provide an RSET::init() routine, even the motor record has that routine set to NULL (the axis record is a fork of the motor record). If there are no users, there should be nothing to stop us from fixing the typed_rset declaration. I concur with your argument list, iocInit.c::initRecSup() calls it like this:

        if (prset->init) {
            prset->init();
        }

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.