> for example left shift of a negative integer. It's clearly and obviously possi...

gingerBill · on Sept 11, 2022

Name a machine that people still program for that is NOT two's complement.

And the latest version of C now requires two's complement too!

astrange · on Sept 11, 2022

Left shift with a negative shift count still has UB issues unless you generate inefficient code. The way the shift count is read is different on ARM, x86 scalar, and x86 SIMD… so you can't autovectorize it without UB.

gingerBill · on Sept 11, 2022

Odin does not allow for negative shifts to begin with. To copy from the Overview[1]:

> The right operand in a shift expression must have an unsigned integer type or be an untyped constant representable by a typed unsigned integer.

and [2]:

> The shift operators shift the left operand by the shift count specified by the right operand, which must be non-negative. The shift operators implement arithmetic shifts if the left operand a signed integer and logical shifts if the it is an unsigned integer. There is not an upper limit on the shift count. Shifts behave as if the left operand is shifted `n` times by `1` for a shift count of `n`. Therefore, `x<<1` is the same as `x*2` and `x>>1` is the same as `x/2` but truncated towards negative infinity.

[1] https://odin-lang.org/docs/overview/#arithmetic-operators [2] https://odin-lang.org/docs/overview/#integer-operators

raphlinus · on Sept 11, 2022

Requiring the shift amount to be unsigned is very sensible.

There are two other issues. One is when the shift amount is greater than or equal to the machine word size, for example 1u32 << 32. Java and Rust (release mode; debug is allowed to panic similarly to other forms of arithmetic overflow) take the position that the shift amount is masked to be modulo the word size, which happens to match Intel and aarch64 assembly instructions but not 32-bit ARM, so (unless the optimizer can prove the range) requires an additional mask instruction. This is the motivation for not explicitly specifying the behavior in C, though I believe a very strong case can be made that it should be implementation defined rather than undefined. In any case, it sounds like you are making the opposite decision as Java, so there's extra logic needed to correctly implement it on Intel and aarch64 but it should work efficiently on 32-bit ARM.

But I was specifically referring to the fact that -1 << 1 is UB in C. It was fixed in C++20 (so the value is specified to be -2, as arithmetic is now defined to be twos complement), but that fix is not in the C23 draft. I consider this perhaps the canonical example of programmer-hostile UB.

gingerBill · on Sept 11, 2022

My second quote above actually cover your cases in the case of "overshift". In the case of overshift, the value is zeroed. Empirically I have found that the vast majority of shift factors can be known at compile time (or at least the valid range of integer), meaning it will compile to 1 instruction rather 2/3.

In naïve C, the shift would look like this:

    (y < 8*sizeof(x)) > (x << y) : 0

In practice, the "select" instruction is never needed. This approach to shifting also places never in that it feels more like 2's complement in that `x << y` is now equivalent to `x * (2*y)`.

raphlinus · on Sept 12, 2022

That's a reasonable choice and I do agree it has nice properties, for example (a << b) << c is equal to a << (b + c) (unless that latter addition overflows), but it also does put you pretty firmly in the category of "not squeezing out every last cycle like you can in C." Usually that's one of the the things people expect in a tradeoff involving safety.

Regarding "never," I am an adventurous programmer[1], so am fully willing to accept that I am the exception that proves the rule.

On the more general point of UB for arithmetic overflow, I think we're on the same side: this is a pretty broken story in the way C has ended up, and one of the clear motivations for a cleaned up better-than-C language. I'm more interested in your thoughts on data races, where I suspect we have a substantive difference in perspective.

[1]: https://github.com/linebender/kurbo/blob/de52170e084cf658a2a...

jessermeyer · on Sept 11, 2022

Odin forbids negative shifts.

astrange · on Sept 12, 2022

It's not a genuine negative shift, because the CPU doesn't implement negative shifts either.

It's a trick to simplify eg 'x << (32 - y)' to 'x << (-y)' and save an instruction, seen in ffmpeg, because the compilers don't always do it themselves. This works because of the shift masking behavior.

gingerBill · on Sept 13, 2022

Odin's shift behaviour is different to that of C's undefined behaviour.

To copy from the Overview[1]:

> The right operand in a shift expression must have an unsigned integer type or be an untyped constant representable by a typed unsigned integer.

and [2]:

> The shift operators shift the left operand by the shift count specified by the right operand, which must be non-negative. The shift operators implement arithmetic shifts if the left operand a signed integer and logical shifts if the it is an unsigned integer. There is not an upper limit on the shift count. Shifts behave as if the left operand is shifted `n` times by `1` for a shift count of `n`. Therefore, `x<<1` is the same as `x2` and `x>>1` is the same as `x/2` but truncated towards negative infinity.

In the case of overshift, the value is zeroed. Empirically I have found that the vast majority of shift factors can be known at compile time (or at least the valid range of integer), meaning it will compile to 1 instruction rather 2/3. In naïve C, the shift would look like this:

    (y < 8*sizeof(x)) > (x << y) : 0

In practice, the "select" instruction is never needed. This approach to shifting also places never in that it feels more like 2's complement in that `x << y` is now equivalent to `x (2*y)`.

This means that `(a << b) << c` is equivalent to `a << (b + c)`. In your example of `x << (32 - y)`, that logic works as what people intuitively intent it to mean, unlike in C where when `y` is 0, the shift behaviour is platform defined (nothing on x86 (5 bit shift) but zeros on powerpc (6 bit shift)).

[1] https://odin-lang.org/docs/overview/#arithmetic-operators

[2] https://odin-lang.org/docs/overview/#integer-operators

elteto · on Sept 11, 2022

Those machines can stay using whatever compiler/language already works for them today. Enough of holding back the rest of the world because of these fabled non-standard architectures.