Code review comment for lp:~epics-core/epics-base/fix-calc-bit-manipulation

Andrew Johnson (anj) wrote:

> Is that a compiler or a size dependency? My Debian 64bit gcc5 machine was testing them fine.

Don't know, I was testing on RHEL 6.7 with both 64 and 32-bit builds. Let me boot my Ubuntu VM to see what that does...
No difference; I'm doing manual checks on an IOC with a single calc record loaded.

This is with an unfixed 3.14 IOC, 64-bit:
tux% caget -lx anj:calc.CALC anj:calc
anj:calc.CALC a:=2863311530.0; a >> 8
anj:calc 0xAAAAAA

This is a 3.15 IOC, 64-bit:
tux% caget -lx anj:calc.CALC anj:calc
anj:calc.CALC a:=2863311530.0; a >> 8
anj:calc 0xFFFFFFFFFF800000

Then with this branch, 64-bit:
tux% caget -lx anj:calc.CALC anj:calc
anj:calc.CALC a:=2863311530.0; a >> 8
anj:calc 0xFFFFFFFFFFAAAAAA

The difference between 3.14 and 3.15 is because the bit-wise expressions were changed from long to epicsInt32, which fixed the shift so it sign-extends properly, but exposed the broken value conversions. The final result above is correct (the sign extension in the results above is being done by caget -lx on 64-bit). I can get the right result from 3.15 using the negative value instead:
tux% caget -lx anj:calc.CALC anj:calc
anj:calc.CALC a:=-1431655766.1; a >> 8
anj:calc 0xFFFFFFFFFFAAAAAA
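
To make the conversion difference concrete, here's a minimal standalone sketch of the three behaviours above. It uses plain int32_t rather than the epicsTypes typedefs, and the bit-pattern-preserving cast is only my guess at the mechanism, not the actual calcPerform() code; caget -lx then sign-extends the 32-bit results to 64 bits for display:

    #include <cstdint>
    #include <cstdio>

    int main() {
        double a = 2863311530.0;    // 0xAAAAAAAA as an unsigned 32-bit value

        // 3.14 on 64-bit Linux: the operands were held in 'long' (64 bits
        // there), so the value stays positive and a >> 8 gives 0x00AAAAAA.
        long r314 = (long) a >> 8;

        // 3.15 unfixed: casting an out-of-range double straight to a 32-bit
        // signed type is undefined; on x86 it typically yields 0x80000000,
        // and the arithmetic shift then gives 0xFF800000.
        int32_t r315 = (int32_t) a >> 8;

        // A conversion that preserves the bit pattern (sketched here as a
        // cast through uint32_t -- this branch may do it differently) gives
        // -1431655766, and the shift sign-extends to 0xFFAAAAAA.
        int32_t rfix = (int32_t)(uint32_t) a >> 8;

        printf("3.14: 0x%lX  3.15: 0x%X  fixed: 0x%X\n",
               (unsigned long) r314, (unsigned) r315, (unsigned) rfix);
        return 0;
    }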

I think I got a bit confused when I was working on this last week; the negative value is evidently not (and maybe shouldn't be) necessary with the proper fix, so maybe I didn't need to change those tests. However, I have just rewritten the double tests to check that we convert the arguments properly for every bit-wise operator, instead of duplicating the earlier integer tests:

    // converting double values (add 0.1 to force as double)
    // 0xaaaaaaaa = -1431655766 or 2863311530u
    testUInt32Calc("-1431655766.1 OR 0", 0xaaaaaaaau);
    testUInt32Calc("2863311530.1 OR 0", 0xaaaaaaaau);
    testUInt32Calc("0 OR -1431655766.1", 0xaaaaaaaau);
    testUInt32Calc("0 OR 2863311530.1", 0xaaaaaaaau);
    testUInt32Calc("-1431655766.1 XOR 0", 0xaaaaaaaau);
    testUInt32Calc("2863311530.1 XOR 0", 0xaaaaaaaau);
    testUInt32Calc("0 XOR -1431655766.1", 0xaaaaaaaau);
    testUInt32Calc("0 XOR 2863311530.1", 0xaaaaaaaau);
    testUInt32Calc("-1431655766.1 AND 0xffffffff", 0xaaaaaaaau);
    testUInt32Calc("2863311530.1 AND 0xffffffff", 0xaaaaaaaau);
    testUInt32Calc("0xffffffff AND -1431655766.1", 0xaaaaaaaau);
    testUInt32Calc("0xffffffff AND 2863311530.1", 0xaaaaaaaau);
    testUInt32Calc("~ -1431655766.1", 0x55555555u);
    testUInt32Calc("~ 2863311530.1", 0x55555555u);
    testUInt32Calc("-1431655766.1 >> 0", 0xaaaaaaaau);
    testUInt32Calc("2863311530.1 >> 0", 0xaaaaaaaau);
    testUInt32Calc("-1431655766.1 >> 0.1", 0xaaaaaaaau);
    testUInt32Calc("2863311530.1 >> 0.1", 0xaaaaaaaau);
    testUInt32Calc("-1431655766.1 << 0", 0xaaaaaaaau);
    testUInt32Calc("2863311530.1 << 0", 0xaaaaaaaau);
    testUInt32Calc("-1431655766.1 << 0.1", 0xaaaaaaaau);
    testUInt32Calc("2863311530.1 << 0.1", 0xaaaaaaaau);

Hmm, I just copied the resulting epicsCalcTest.cpp file into an unmodified 3.14 branch, where it happily runs with no failures at all. In a 3.15 branch, however, the +ve values always convert to 0x80000000, so half of the above tests fail. Based on that evidence it looks like this fix needs to be applied to 3.15 but not to 3.14, so why did Dirk see any errors at all? The last change to the libCom/calc directory on the 3.14 branch was committed before the 3.14.12 release that he said he was using, so I should be running the exact same code as him.

Looking at Dirk's bug report, he's passing in the 0xaaaaaaaa and 0xffff0000 values from constant links in the INPA and INPB fields, not as literals inside the CALC expression. Integer CALC literals get sign-extended in the calcPerform() routine, so they should always be able to be converted back into a 32-bit value. Values from the input fields of a CALC record, however, are not sign-extended, and may fall outside the legal range and thus cause an overflow when cast to a 32-bit value.
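
To illustrate that distinction with a standalone sketch (plain C++ types, not the actual record support or calcPerform() code):

    #include <cstdint>
    #include <cstdio>

    int main() {
        // Literal 0xaaaaaaaa inside a CALC expression: it gets sign-extended,
        // so the stored double is -1431655766.0, comfortably inside the
        // epicsInt32 range, and converting it back cannot overflow.
        double fromLiteral = (double)(int32_t) 0xaaaaaaaau;

        // 0xaaaaaaaa arriving through a constant link in INPA: no sign
        // extension, so calcPerform() is handed the double 2863311530.0,
        // which is above INT32_MAX and overflows when cast straight back
        // to a 32-bit signed value.
        double fromLink = 2863311530.0;

        printf("literal: %.1f  link: %.1f\n", fromLiteral, fromLink);
        return 0;
    }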

Most of Dirk's three different sets of results can be explained by his OS and sizeof(long). On 64-bit Linux, 0xaaaaaaaa is getting properly converted, hence the correct ~ and ~~ results. In most other cases the hex argument is probably overflowing, so the value becomes 0x80000000, except on VxWorks where it's 0x7fffffff.

Is this an acceptable state of affairs? Should we commit this branch as-is, or mask off the higher bits before casting to 32 bits? After looking at this for most of today my brain can't decide, so external input would be welcome.
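
For the record, the masking option could look roughly like this hypothetical helper (doubleToUInt32 is a name I've made up for illustration; nothing like it necessarily exists in this branch):

    #include <cmath>
    #include <cstdint>

    // Hypothetical sketch of the "mask off the higher bits" idea: wrap the
    // value modulo 2^32 before the final cast, so out-of-range inputs keep
    // their low 32 bits instead of hitting an undefined double-to-int32
    // conversion. Not code from this branch.
    static uint32_t doubleToUInt32(double d)
    {
        d = std::trunc(d);                   // drop any fractional part first
        d = std::fmod(d, 4294967296.0);      // wrap into (-2^32, 2^32)
        if (d < 0.0)
            d += 4294967296.0;               // map negatives into [0, 2^32)
        return (uint32_t) d;                 // now in range, the cast is defined
    }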

review: Needs Information
