To do UB "optimizations", the compiler first needs to figure out that there is an UB it can "optimize" anyway. At this point instead of "optimizing" it could, and in my humble opinion absolutely should, blow up the compilation by generating an UB error, so people can fix their stuff.
What about backwards compatibility with regard to a new compiler version deciding to issue errors on UB? You don't have any guarantees about what happens with UB right now, so if you upgrade to a new compiler version that generates errors instead of "optimizations", everything would still be as before: no guarantees. And it's frankly a lot better to blow up the compilation with errors than to have the compiler accept the UB code and roll the dice on how the final binary will behave later. You can either fix the code to make it compile again, or, as a stopgap measure, fall back to an older "known good" compiler version that you previously used.
I fail to see any reason whatsoever why compilers are still doing all kinds of stupid stuff with UB instead of doing the right thing and issuing errors when they encounter UB.
I also fail to see why the C language designers still insist on keeping so much of the legacy shit around.
> To do UB "optimizations", the compiler first needs to figure out that there is UB it can "optimize" anyway.
The compiler assumes UB will never happen and makes transformations that are valid provided there is no UB. This doesn't require any explicit detection of UB, and in some cases whether UB occurs is simply undecidable at compile time (as in, no compiler could detect it without producing incorrect results).
Without these assumptions the resulting compiled code would be much slower, though different optimizations have different danger-versus-speed profiles, and there is certainly a case to be made that some of them should be eschewed because they're a poor trade-off.
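To make the "assume no UB" point concrete, here's a minimal sketch (the function is made up purely for illustration, not taken from any real codebase):

/* A loop the compiler would like to vectorize / rewrite with a 64-bit
   induction variable on a 64-bit target: */
void add_one(float *a, const float *b, int n)
{
    for (int i = 0; i <= n; i++)    /* note: <=, not < */
        a[i] = b[i] + 1.0f;
}
/* If signed overflow were defined to wrap, n == INT_MAX would make this loop
   infinite (i wraps to INT_MIN and stays <= n), so the compiler would have to
   preserve that behavior. Because signed overflow is UB, it may instead assume
   the loop terminates after exactly n + 1 iterations (for n >= 0) and
   transform it accordingly. */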
There are many cases where current compilers will warn you when you've done something that is UB. They probably don't warn for every detectable case, and it would be reasonable to ask them to warn about more of them.
I think your irritation is just based on a misunderstanding of the situation.
Compiler authors are C(++) programmers too; they also don't like footguns. They're not trying to screw anyone over. They don't waste their time adding optimizations that don't make real performance improvements just to trip up invalid code.
Yes, some UB is not decidable at compile time, but a lot of it could easily be specced to have defined behavior at runtime, such as signed integer overflow.
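As an existence proof that this is implementable today, GCC and Clang already expose checked arithmetic as an extension; a minimal sketch (the helper name is just for illustration):

#include <stdbool.h>
#include <stdio.h>

/* __builtin_add_overflow is a GCC/Clang extension: it stores the (possibly
   wrapped) result in *res and returns true if the addition overflowed.
   Something along these lines could be standardized instead of leaving
   signed overflow undefined. */
static bool checked_add(int a, int b, int *res)
{
    return __builtin_add_overflow(a, b, res);
}

int main(void)
{
    int r;
    if (checked_add(2147483647, 1, &r))
        puts("overflow detected");
    else
        printf("sum = %d\n", r);
    return 0;
}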
The main reason to not spec these things is because people would be arguing "this makes compiled code on my esoteric 9-bit ones'-complement chip slower" or "there was this chip in the 70s that did things differently" or "but a short int on Cray was 64-bit". Great, so now the spec has avoidable unnecessary undefined behavior all over the place, and the code other people wrote still does not run correctly on your 9-bit chip. Brought to you by the same people who decided "NULL is not necessarily (void*)0", and who define those integer types everybody uses (instead of stdint) with an "at least this big".
Yes, a lot of that is legacy stuff and was added to accommodate and model things that already existed (the wrong way to go about it, IMO, but hindsight is 20/20), but that's my argument: fix this stuff once and for all in an upcoming spec iteration.
>Without these assumptions the resulting compiled code would be much slower
In some cases, this is true (for different levels of "much slower"), but the trade-off here is still "running code that works, but a little slower" vs "running code that does not work and will launch a nuclear strike at Switzerland by accident, but really fast".
In a lot of cases, it will not be slower, or at least not much slower.
>I think your irritation is just based on a misunderstanding of the situation.
Frankly, not really. I started writing my first C (and C++) in the early 90s, and I think I do understand the situation pretty well by now. But I should have been more precise in my initial ranting comment, I'll give you that.
Note that (void *)0 is always NULL, as mandated by the standard.
But, to address the content of your comment: defined behavior at runtime is not necessarily good behavior at runtime. Defining signed integer overflow to wrap, for example, is probably a bad idea, because this is rarely the intent of the code. Having all such operations trap might be a good idea, but now you're going to get the same "stop breaking my working programs" people angry at you.
Yes, thankfully at least with NULL they didn't fall into the legacy trap and mess up the standard with the non-zero NULL representations that some earlier machines had been kind of using.
>Defining signed integer overflow to wrap, for example, is probably a bad idea
I wouldn't call it great behavior, but it's at least what most people expect to happen, most people will be able to understand what's going on, and it's fast on most systems that matter. However, it's still undefined behavior. Just codifying overflow as wrapping would therefore be an improvement in my opinion, at least over what we have today.
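(For what it's worth, GCC and Clang already offer that as an opt-in dialect via -fwrapv; a trivial sketch of the difference:)

#include <limits.h>
#include <stdio.h>

int main(void)
{
    int x = INT_MAX;
    /* Compiled with -fwrapv (GCC/Clang), signed overflow is defined to wrap,
       so this prints INT_MIN. Without -fwrapv, the overflow in x + 1 is
       undefined behavior, and the compiler is allowed to assume it never
       happens. */
    printf("%d\n", x + 1);
    return 0;
}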
> These days if you want to catch UB, compile with -fsanitize=undefined. The program will then trap if UB is actually detected at runtime.
So, let me get this straight: someone wants to make sure pointer p is not null (in the wrong way) and writes something like the examples in the posts above, i.e. if (!p) ... and, if that doesn't trigger, calls use(*p), but the compiler decides p can never be null because that would be UB and hence removes the check.
The C coder dumps the code, gets upset because the check was removed, and gets the hint to catch UB by adding -fsanitize ..., which "catches UB" in the above scenario so that the program will "trap if UB is detected".
I think we just came full circle there.
Sure, the -fsanitize flag will catch ALL the detectable bugs and so on, but I still found it a bit funny.
UBSan will report, and can be made to abort, an invalid program when it detects UB at runtime. It doesn't let you handle it. So you shouldn't remove the erroneous check, but fix it so it is no longer erroneous, and UBSan will help you identify these errors.
Also, UBSan adds significant overhead, so unfortunately it is not really appropriate for production builds (hence my wish for a less powerful "UBSan-lite" with lower overhead).
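(Rough sketch of the workflow, with a throwaway test program:)

/* overflow.c -- deliberately triggers UB at runtime */
#include <limits.h>

int main(void)
{
    int x = INT_MAX;
    return x + 1;   /* signed integer overflow: UB */
}

/* Build and run with UBSan enabled (GCC or Clang):
 *
 *   cc -g -fsanitize=undefined overflow.c -o overflow && ./overflow
 *
 * UBSan reports the overflow at runtime; add -fno-sanitize-recover=undefined
 * if you want it to abort instead of printing a diagnostic and continuing. */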
I think you are misunderstanding the situation. Given code like:
if (!p) {
    use(*p);
}
(given no previous knowledge about p) no compiler will remove the "if (!p)" part.
What people are complaining about is the opposite case:
use(*p);
/* The compiler reasons that if p == NULL, the program would have crashed by now,
   so if we got here, p != NULL must hold. */
if (!p) { // the compiler can remove this branch
    report_error();
}