MDEV-33734 Improve the sequence increment inequality testing
We add an extra condition that makes the inequality testing in
SEQUENCE::increment_value() mathematically watertight, and we cast to
and from unsigned in additions and subtractions that may overflow or
underflow, to avoid undefined behaviour.
Let's start by distinguishing between C++ expressions and mathematical
expressions. By C++ expression I mean an expression whose outcome is
determined by the compiler/runtime. By mathematical expression I mean
an expression whose value is mathematically determined. So the C++
expression -9223372036854775806 - 1000 can, at worst, evaluate to any
value due to underflow, whereas the mathematical expression
-9223372036854775806 - 1000 evaluates to -9223372036854776806.
The problem boils down to how to write a C++ expression equivalent to
a mathematical expression x + y < z where x and z can take any values
of long long int, and y < 0 is also a long long int. Ideally we want
to avoid underflow, but I'm not sure how this can be done.
The correct C++ form should be (x + y < z || x < z - y || x < z).
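As a rough illustration of the form above (a sketch, not the actual patch), the three disjuncts can be written with the additions and subtractions done in unsigned arithmetic, where wrap-around is well defined. The helper names wrapping_add, wrapping_sub and below_after_decrement are hypothetical. Converting an out-of-range unsigned value back to long long is implementation-defined before C++20 and wraps on two's-complement targets.

```cpp
#include <cassert>

// Hypothetical helpers: do the arithmetic in unsigned long long, where
// overflow is well defined (modulo 2^64), then convert back to signed.
inline long long wrapping_add(long long a, long long b)
{
  return static_cast<long long>(static_cast<unsigned long long>(a) +
                                static_cast<unsigned long long>(b));
}

inline long long wrapping_sub(long long a, long long b)
{
  return static_cast<long long>(static_cast<unsigned long long>(a) -
                                static_cast<unsigned long long>(b));
}

// Tests the mathematical inequality x + y < z for a negative increment y.
// The first two disjuncts may see a wrapped value; the third never wraps.
inline bool below_after_decrement(long long x, long long y, long long z)
{
  assert(y < 0);
  return wrapping_add(x, y) < z || x < wrapping_sub(z, y) || x < z;
}
```

For instance, below_after_decrement(-9223372036854775806LL, -1000, 0) is decided by the second disjunct: the sum wraps to a large positive value, but z - y = 1000 does not overflow.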
Let M=9223372036854775808 i.e. LONGLONG_MAX + 1. We have
-M < x < M - 1
-M < y < 0
-M < z < M - 1
Let's consider the case where x + y < z is true as a mathematical
expression.
If the first disjunct does not underflow, it evaluates to true and we
are done. If it underflows, i.e. the mathematical expression
x + y < -M holds, and the arbitrary value resulting from the underflow
happens to make the C++ expression true, then we are also done.
Otherwise we move on to the next disjunct, x < z - y. If there's no
overflow in z - y then we are done. If there's overflow, i.e.
z - y > M - 1, and the C++ expression evaluates to false, then we move
on to x < z. There's no overflow or underflow here, and it will
evaluate to true. To see this, note that
x + y < -M means x < -M - y < -M - (-M) = 0
z - y > M - 1 means z > y + M - 1 > - M + M - 1 = -1
so x < z.
Now let's consider the case where x + y < z is false as a mathematical
expression.
The first disjunct will not underflow in this case, because
x + y >= z > -M, so it evaluates to false and we move to (x < z - y).
This will not overflow. To see this, note that
x + y >= z means z - y <= x < M - 1
So it evaluates to false too. And the third disjunct x < z also
evaluates to false because x >= z - y > z.
I suspect that in either case the expression x < z does not determine
the final value of the disjunction in the vast majority of cases, which
is why we leave it as the final one, for the rare cases where both
an underflow and an overflow happen.
Here's an example where both underflow and overflow happen and the
added inequality x < z saves the day:
  x = - M / 2
  y = - M / 2 - 1
  z = M / 2
x + y evaluates to M - 1, which is > z
z - y evaluates to - M + 1, which is < x
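The numbers in this example can be checked with a small sketch, assuming wrap-around semantics obtained by doing the arithmetic in unsigned long long. RescueCase and evaluate_rescue_case are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <climits>

// Hypothetical record of the intermediate values in the example.
struct RescueCase
{
  long long sum;    // wrapped value of x + y
  long long diff;   // wrapped value of z - y
  bool first;       // x + y < z, evaluated on the wrapped sum
  bool second;      // x < z - y, evaluated on the wrapped difference
  bool third;       // x < z, no arithmetic involved
};

inline RescueCase evaluate_rescue_case()
{
  const long long half= LLONG_MAX / 2 + 1;   // M / 2 = 2^62
  const long long x= -half, y= -half - 1, z= half;
  assert(y < 0);
  // Unsigned arithmetic wraps modulo 2^64 instead of being undefined.
  const long long sum= static_cast<long long>(
      static_cast<unsigned long long>(x) + static_cast<unsigned long long>(y));
  const long long diff= static_cast<long long>(
      static_cast<unsigned long long>(z) - static_cast<unsigned long long>(y));
  return RescueCase{sum, diff, sum < z, x < diff, x < z};
}
```

With these inputs the first two disjuncts come out false on the wrapped values, and only the third one yields the mathematically correct answer.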
By symmetry, we can do the same to test x + y > z where the increment y is positive.
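A mirrored sketch for the positive-increment case, obtained by flipping the inequalities of the negative case. This is an assumption derived by symmetry, not necessarily the exact expression used in the patch, and the name above_after_increment is hypothetical.

```cpp
#include <cassert>

// Tests the mathematical inequality x + y > z for a positive increment y.
// Arithmetic is done in unsigned long long so that wrap-around is well
// defined; the final disjunct x > z involves no arithmetic at all.
inline bool above_after_increment(long long x, long long y, long long z)
{
  assert(y > 0);
  const long long sum= static_cast<long long>(
      static_cast<unsigned long long>(x) + static_cast<unsigned long long>(y));
  const long long diff= static_cast<long long>(
      static_cast<unsigned long long>(z) - static_cast<unsigned long long>(y));
  return sum > z || x > diff || x > z;
}
```

The same argument carries over: if x + y overflows and z - y underflows, then x >= 0 and z < 0, so x > z holds.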
It was wrong to derive Item_func_uuid from Item_func_sys_guid,
because the former is a function returning the UUID data type,
while the latter is a string function returning VARCHAR.
As a result of the wrong hierarchy, Item_func_uuid erroneously inherited
Item_str_func::fix_fields(), which contains this code:
  /*
    In Item_str_func::check_well_formed_result() we may set null_value
    flag on the same condition as in test() below.
  */
  if (thd->is_strict_mode())
    set_maybe_null();
This code is not relevant to UUID() at all.
A simple fix would be to call set_maybe_null(false) in
Item_func_uuid::fix_length_and_dec(). However,
that would fix only this single consequence of the wrong
class hierarchy, and similar bugs could appear again in
the future. Moreover, we're going to add functions UUIDv4()
and UUIDv7() soon (in 11.6). So it's better to fix the class hierarchy
in the right way before adding these new functions.
Fix:
- Adding a new abstract class Item_fbt_func in the template
  in sql_type_fixedbin.h
- Deriving Item_typecast_fbt from Item_fbt_func
- Deriving Item_func_uuid from Item_fbt_func
- Adding a new helper class UUIDv1. It derives from UUID, and additionally
  initializes the value to "UUID version 1" right in the constructor.
  Note that the upcoming SQL functions UUIDv4() and UUIDv7()
  will also have corresponding classes UUIDv4 and UUIDv7.
So now UUID() is a pure "returning UUID" function,
like CAST(expr AS UUID) used to be, without any unintentional
artifacts of functions returning VARCHAR/TEXT.
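The effect of the hierarchy change can be illustrated with heavily simplified stand-ins. All class names below are hypothetical and are not the MariaDB classes; the strict_mode parameter stands in for thd->is_strict_mode().

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins; NOT the MariaDB classes.
struct ItemSketch
{
  bool maybe_null= false;
  virtual ~ItemSketch() = default;
  // Stand-in for fix_fields(); the base version changes nothing.
  virtual void fix_fields(bool /* strict_mode */) {}
};

// Plays the role of a string-function base (VARCHAR result).
struct StrFuncSketch : ItemSketch
{
  void fix_fields(bool strict_mode) override
  {
    if (strict_mode)
      maybe_null= true;          // string-specific side effect
  }
};

// Plays the role of a fixed-binary-type function base (UUID result).
struct FbtFuncSketch : ItemSketch {};

struct SysGuidSketch : StrFuncSketch {};     // a genuine string function
struct UuidBeforeSketch : SysGuidSketch {};  // old: inherits string behaviour
struct UuidAfterSketch : FbtFuncSketch {};   // new: no string behaviour

// Runs the stand-in fix_fields() in strict mode and reports nullability.
inline bool nullable_in_strict_mode(ItemSketch &item)
{
  item.fix_fields(true);
  return item.maybe_null;
}
```

In this sketch nullable_in_strict_mode() returns true for UuidBeforeSketch and false for UuidAfterSketch, which mirrors the unwanted nullability described above and its removal.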
Cleanup:
- Removing the member Item_func_sys_guid::with_dashes,
  as it's no longer needed:
  * Item_func_sys_guid no longer has any descendants
  * Item_func_sys_guid::val_str() itself always displays without dashes