At Mon, 24 Feb 2020 22:15:22 -0500 (EST), Mouse <mouse%Rodents-Montreal.ORG@localhost> wrote:
Subject: Re: NULL pointer arithmetic issues
>
> > Greg A. Woods wrote:
> >
> > NO MORE "undefined behaviour"!!!  Pick something sane and stick to it!
> >
> > The problem with modern "Standard" C is that instead of refining
> > the definition of the abstract machine to match the most common
> > and/or logical behaviour of existing implementations, the standards
> > committee chose to throw the baby out with the bath water and make
> > whole swaths of conditions into so-called "undefined behaviour"
> > conditions.
>
> Unfortunately for your argument, they did this because there are
> "existing implementations" that disagree severely over the points in
> question.

I don't believe that's quite right.  True "undefined behaviour" is not
usually the explanation for differences between implementations; those
are normally what the Language Lawyers call "implementation-defined"
behaviour.

"Undefined behaviour" is used for things like dereferencing a nil
pointer.  There's little disagreement about that being "undefined by
definition", even ignoring the Language Lawyers.  We can hopefully
agree on that even using the original K&R edition's language:

	"C guarantees that no pointer that validly points at data
	will contain zero"

The problem, though, is that C gives you more rope than you might ever
think possible in some situations, for example in how easy it is to
dereference a nil pointer in poorly written code.

The worse problem is when compiler writers, what I'll call
"Optimization Warrior Lawyers", start abusing any and every possible
instance of "undefined behaviour" to their advantage.  This is worse
than ignoring Hoare's advice -- this is the very epitome of premature
optimization -- this is pure evil.  This is breaking otherwise readable
and usable code.  I give you again my example:

> > An excellent example are the data-flow optimizations that are now
> > commonly abused to elide security/safety-sensitive code:
>
> >	int
> >	foo(struct bar *p)
> >	{
> >		char *lp = p->s;
> >
> >		if (p == NULL || lp == NULL) {
> >			return -1;
> >		}
>
> This code is, and always has been, broken; it is accessing p->s before
> it knows that p isn't nil.

How do you know for sure?  How does the compiler know?  Serious
questions.  What if all calls to foo() are written like this?

	if (p)
		foo(p);

I agree this might not be "fail-safe" code, or in any other way
advisable, but it was perfectly fine in the world before the UB
Optimization Warriors.  Today's "Standard C", however, gives compilers
license to replace foo() with a trap or a call to abort(), etc.  I.e.
it takes a real "C Language Lawyer(tm)" to know that, past certain
optimization levels, the sequence points prevent this from happening.

In the past I could equally well assume the optimizer would rewrite the
first bit of foo() as (a fuller sketch is appended below my signature):

	if (!p || !p->s)
		return -1;

In 35 years of C programming I've never before had to pay such close
attention to such minute details.  I now need tools to audit old code
for such things, and my experience to date suggests UBSan is not up to
the task -- i.e. its runtime reports are useless for an audit (perhaps
even with high-code-coverage unit tests; see the second note appended
below).

This is the main point of my original rant:  "undefined behaviour", as
it has been interpreted by the Optimization Warriors, has given us an
unusable language.

--
Greg A. Woods <gwoods%acm.org@localhost>

Kelowna, BC     +1 250 762-7675           RoboHack <woods%robohack.ca@localhost>
Planix, Inc. <woods%planix.com@localhost>     Avoncote Farms <woods%avoncote.ca@localhost>
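P.S.  For concreteness, here is a minimal, self-contained sketch of the
two shapes of foo() discussed above.  The struct definition is assumed
(the quoted fragment only shows that "struct bar" has a "char *s"
member), and the function bodies are hypothetical illustration, not
code from any real tree:

	#include <stddef.h>

	struct bar {
		char *s;	/* assumed member, per the quoted fragment */
	};

	/*
	 * Broken form: p->s is read before p is tested.  If p is NULL
	 * that read is undefined behaviour, so the optimizer is allowed
	 * to assume p != NULL and delete the very check that follows.
	 */
	int
	foo_broken(struct bar *p)
	{
		char *lp = p->s;	/* UB when p == NULL */

		if (p == NULL || lp == NULL)
			return -1;
		return 0;
	}

	/*
	 * Defensive form: test p before touching anything it points at.
	 * The short-circuit || means p->s is only evaluated once p is
	 * known to be non-NULL, so there is no UB for the optimizer to
	 * exploit and the check cannot legitimately be elided.
	 */
	int
	foo(struct bar *p)
	{
		if (p == NULL || p->s == NULL)
			return -1;
		return 0;
	}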
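P.P.S.  For anyone who wants to repeat the UBSan experiment: both GCC
and Clang accept -fsanitize=undefined, but it is a run-time instrument,
so it can only report the undefined operations a test run actually
executes -- which is exactly why I find it unsatisfying as an audit
tool.  A sketch, with a hypothetical test driver name:

	$ cc -O2 -g -fsanitize=undefined -o foo_test foo_test.c
	$ ./foo_test	# diagnoses only the UB paths the tests actually reach

GCC (and, I believe, recent Clang) also accept
-fno-delete-null-pointer-checks, which suppresses exactly the kind of
check-elision complained about above, at the cost of depending on a
non-standard flag.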