C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 

Defining undefined, etc., behavior
Scott Meyers
Guest
Posted: Fri Apr 14, 2006 8:06 am    Post subject: Defining undefined, etc., behavior



What work has been done on eliminating undefined behavior in C or C++? Off the
top of my head, some undefined behavior is due to "stupid user behavior," e.g.,
indexing beyond the end of an array, dereferencing a null pointer, etc. Other
undefined behavior -- or maybe it's unspecified or implementation-defined
behavior, I can never keep those terms straight -- arises from flexibility for
implementers, e.g., order of evaluating arguments for function calls, order of
expression evaluation between sequence points, order of initialization of
non-local statics defined in separate translation units. I'd be interested in
any work that's been done on identifying these kinds of behavior (beyond
grepping the standards for "undefined", "unspecified", etc.) and suggesting ways
to eliminate or avoid them, ideally without any obvious runtime cost. (Checking
array bounds would fail the "no obvious runtime cost" test, while defining
function argument evaluation order would pass, because, well, because it's not
obvious to me that imposing a required order would necessarily incur a runtime
penalty :-})

My motivation for the question is that it's my understanding that
safety-critical applications typically identify language subsets to reduce risk,
and they then typically impose usage rules on the features remaining in the
subsets to further reduce risk. One form of risk is unpredictable program
behavior, and such behavior can arise from the use of language constructs with
undefined/unspecified/implementation-defined behavior, so I'm wondering what
work has been done on identifying such constructs and dealing with them.

Thanks,

Scott

---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ]
Tomás
Guest
Posted: Fri Apr 14, 2006 4:06 pm    Post subject: Re: Defining undefined, etc., behavior



Scott Meyers posted:

Quote:
What work has been done on eliminating undefined behavior in C or C++?
Off the top of my head, some undefined behavior is due to "stupid user
behavior," e.g., indexing beyond the end of an array, dereferencing a
null pointer, etc.


I agree. Stuff like the following is just "stupid":

int *p;

*p = 5; //Undefined Behaviour


Quote:
Other undefined behavior -- or maybe it's
unspecified or implementation-defined behavior, I can never keep those
terms straight -- arises from flexibility for implementers, e.g., order
of evaluating arguments for function calls, order of expression
evaluation between sequence points, order of initialization of
non-local statics defined in separate translation units.

You're either writing:

a) Portable 100% Standard-compliant Code

b) Platform-specific code / Implementation-specific code


If you're writing portable code, then the solution is simple: Don't allow
situations that could, depending on the implementation, result in Undefined
Behaviour. Here's an example:


char p = 120;

char += 100; //Could cause Undefined Behaviour


This has no place in portable code.

On the other hand, if you're programming for a platform where you know that
"char" is unsigned, then you are free to do the above.


Quote:
I'd be
interested in any work that's been done on identifying these kinds of
behavior (beyond grepping the standards for "undefined", "unspecified",
etc.) and suggesting ways to eliminate or avoid them, ideally without
any obvious runtime cost.

Basically you just have to know the C++ language; you've to know things
like:

a) It's undefined behaviour for a signed integral type to overflow.

b) It's undefined behaviour to check an intrinsic variable's value if it was
never initialised.


Quote:
(Checking array bounds would fail the "no
obvious runtime cost" test, while defining function argument evaluation
order would pass, because, well, because it's not obvious to me that
imposing a required order would necessarily incur a runtime penalty
:-})


We already have the C++ programming language. Unless we change it with
future Standards, we can simply work around any problems. Instead of
writing:

j = f() + g();

Simply write:

j = f(); j += g();
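
A minimal sketch of this workaround, assuming hypothetical functions f() and g() that append to a shared log (the log and the function bodies are illustrative and not from the post). Splitting the expression into two statements fixes the call order, because the first statement completes before the second begins.

```cpp
#include <cassert>
#include <string>

// Hypothetical shared state, used only to make the call order observable.
static std::string call_log;

int f() { call_log += 'f'; return 1; }
int g() { call_log += 'g'; return 2; }

// In "f() + g()" the two calls may happen in either order.
// Two separate statements force f() to run before g().
int sum_in_fixed_order() {
    call_log.clear();
    int j = f();
    j += g();
    // The order is now guaranteed, so this check cannot fire.
    assert(call_log == "fg");
    return j;
}
```

The cost is readability rather than speed: the arithmetic is the same, only the sequencing is spelled out.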


Quote:
My motivation for the question is that it's my understanding that
safety-critical applications typically identify language subsets to
reduce risk, and they then typically impose usage rules on the features
remaining in the subsets to further reduce risk. One form of risk is
unpredictable program behavior, and such behavior can arise from the
use of language constructs with
undefined/unspecified/implementation-defined behavior, so I'm wondering
what work has been done on identifying such constructs and dealing with
them.


None that I'm aware of -- because none is needed. Here's an example of how I
write portable code:

I invariably use Windows XP, so all my programs are Win32. On Win32, the
built-in types are as follows:

char = 8 bits
short = 16 bits
int = 32 bits
long = 32 bits

On my platform, I can safely store a 32-bit number in an "int". If I'm
writing portable code however, I won't. Why? Because the Standard says an
"int" can be 16-Bit. Thankfully, it also says that a "long" must be at least
32-Bit, so I use a "long" when I need to store a 32-Bit number.
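
A small sketch of that rule, assuming a C++11 or later compiler for static_assert (which did not exist when this was posted). The helper make_u32 is hypothetical; it uses unsigned long so that the shift stays well defined even where int is only 16 bits wide.

```cpp
#include <cassert>
#include <climits>

// The standard guarantees that long covers at least 32 bits,
// so this check holds on every conforming implementation.
static_assert(LONG_MAX >= 2147483647L, "long holds at least 32 bits");

// Hypothetical helper: combine two 16-bit halves into one 32-bit value.
unsigned long make_u32(unsigned int high16, unsigned int low16) {
    assert(high16 <= 0xFFFFu && low16 <= 0xFFFFu);
    // Widen before shifting: shifting a 16-bit int by 16 would be undefined.
    return (static_cast<unsigned long>(high16) << 16) | low16;
}
```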

Also, there may be some systems where, if you don't initialise a local
variable, its value is 0, as follows:

void Func()
{
int k;

k += 56;

//k's value is now 56 on some platforms
}


However, the Standard doesn't guarantee this, so I don't do it in portable
code (I wouldn't do it in platform-specific code either -- I'd opt to
explicitly set it to zero).
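
For contrast, a sketch of the explicitly initialised version described above. The function name is made up for illustration.

```cpp
#include <cassert>  // assert, for checking the example

// Reading k before it has a value is undefined, so give it one up front.
int add_fifty_six() {
    int k = 0;  // explicit initialisation instead of relying on the platform
    k += 56;
    return k;   // 56 on every implementation
}
```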

So at the end of the day, you're either writing portable code, or platform
specific code. For both, avoid undefined behaviour. For the former, avoid
implementation specific behaviour which may result in undefined behaviour
depending on flexibility given to the platform/implementation.


-Tomás

Tomás
Guest
Posted: Fri Apr 14, 2006 5:06 pm    Post subject: Re: Defining undefined, etc., behavior



Quote:
char p = 120;

char += 100; //Could cause Undefined Behaviour


Meant to write:

char p = 120;

p += 100; //Could cause Undefined Behaviour


-Tomás

ThosRTanner
Guest
Posted: Fri Apr 14, 2006 5:06 pm    Post subject: Re: Defining undefined, etc., behavior

Scott Meyers wrote:
Quote:
What work has been done on eliminating undefined behavior in C or C++? Off the
top of my head, some undefined behavior is due to "stupid user behavior," e.g.,
indexing beyond the end of an array, dereferencing a null pointer, etc. Other
undefined behavior -- or maybe it's unspecified or implementation-defined
behavior, I can never keep those terms straight -- arises from flexibility for
implementers, e.g., order of evaluating arguments for function calls, order of
expression evaluation between sequence points, order of initialization of
non-local statics defined in separate translation units. I'd be interested in
any work that's been done on identifying these kinds of behavior (beyond
grepping the standards for "undefined", "unspecified", etc.) and suggesting ways
to eliminate or avoid them, ideally without any obvious runtime cost. (Checking
array bounds would fail the "no obvious runtime cost" test, while defining
function argument evaluation order would pass, because, well, because it's not
obvious to me that imposing a required order would necessarily incur a runtime
penalty :-})

My motivation for the question is that it's my understanding that
safety-critical applications typically identify language subsets to reduce risk,
and they then typically impose usage rules on the features remaining in the
subsets to further reduce risk. One form of risk is unpredictable program
behavior, and such behavior can arise from the use of language constructs with
undefined/unspecified/implementation-defined behavior, so I'm wondering what
work has been done on identifying such constructs and dealing with them.

Some of that work must have been done for the embedded C++ language
subset, though admittedly that is likely to go more on performance than
safety critical.

One kind of risk in a language like C++ that is more difficult to
identify is the case where the language specifies behaviour that is
very well defined but will cause your program to terminate - for
instance throwing an exception in a destructor which is being invoked
by an exception. I suspect you will find that in any safety critical
system (or indeed any system that is intended to run non-stop),
exceptions are banned because of the ease with which you can
unintentionally terminate your program. Off hand, these are:

1) Throwing an exception in the constructor of an exception
2) Throwing an exception in a destructor whilst processing an exception
(which means IIRC that there are some not entirely contrived constructs
where an error is guaranteed to cause a non-recoverable crash)
3) Throwing an exception which isn't in the exception specification of
some potentially arbitrarily distant client.

Of course, if you can't use exceptions, you can't really use the
default new - which means you have to code everything to explicitly
cope with memory exhaustion.
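
A sketch of what coding explicitly for memory exhaustion can look like, using the standard non-throwing form of new. The function sum_first_n is hypothetical and nullptr assumes C++11 or later.

```cpp
#include <cassert>  // assert, for checking the example
#include <cstddef>
#include <new>

// Hypothetical helper: fill a heap buffer with 1..n and sum it.
// Allocation failure is reported through the return value, not by a throw.
bool sum_first_n(std::size_t n, unsigned long& sum) {
    unsigned long* values = new (std::nothrow) unsigned long[n];
    if (values == nullptr) {
        return false;  // memory exhausted: the caller must handle this
    }
    sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        values[i] = static_cast<unsigned long>(i) + 1;
        sum += values[i];
    }
    delete[] values;
    return true;
}
```

Every caller now has to test the result, which is the burden the paragraph above describes.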

As to amount of work done to eliminate undefined behaviour - given the
howls of protest whenever anyone suggests a language change to
eliminate one or other aspect of undefined behaviour, it seems unlikely
without a proposal for a new language that would drop C (or indeed C++)
backward compatibility where it was necessary to avoid undefined /
unspecified / implementation defined / defined-but-deadly behaviour.

Pete Becker
Guest
Posted: Fri Apr 14, 2006 6:06 pm    Post subject: Re: Defining undefined, etc., behavior

Scott Meyers wrote:

Quote:
Other undefined behavior -- or maybe it's
unspecified or implementation-defined behavior, I can never keep those
terms straight

Undefined behavior means that the language definition says nothing about
what happens if you do it (de-referencing a null pointer, for example).
For the other two, the behavior is more clearly bounded.
Implementation-defined means what it sounds like: the implementation
must document what it does (for example, whether char is signed or
unsigned). In contrast to unspecified, which means there is a choice
within a clear set of alternatives and the implementation is not
required to document what it does (for example, order of evaluation of
function arguments, which can change with different optimization
settings for the same compiler).
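
A sketch contrasting the two bounded categories, with hypothetical helper names. The signedness of char is implementation-defined and can be queried; the argument evaluation order is unspecified, so the only portable expectation is that one of the two possible orders occurs.

```cpp
#include <cassert>
#include <limits>
#include <string>

// Implementation-defined: the choice is documented and can be queried.
bool char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// Hypothetical log that makes the evaluation order observable.
static std::string order_log;

int note(char tag) { order_log += tag; return 0; }
int take_two(int, int) { return 0; }

// Unspecified: the arguments are evaluated in some order, but the
// implementation need not say which, so both outcomes must be accepted.
std::string observed_argument_order() {
    order_log.clear();
    take_two(note('a'), note('b'));
    assert(order_log == "ab" || order_log == "ba");
    return order_log;
}
```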

--

Pete Becker
Roundhouse Consulting, Ltd.

SuperKoko
Guest
Posted: Fri Apr 14, 2006 8:06 pm    Post subject: Re: Defining undefined, etc., behavior

Scott Meyers wrote:
Quote:
any work that's been done on identifying these kinds of behavior (beyond
grepping the standards for "undefined", "unspecified", etc.) and suggesting ways
to eliminate or avoid them, ideally without any obvious runtime cost. (Checking
array bounds would fail the "no obvious runtime cost" test, while defining
function argument evaluation order would pass, because, well, because it's not
obvious to me that imposing a required order would necessarily incur a runtime
penalty :-})


Assume that the order of evaluation were defined (for example left to
right), and that a sequence point were inserted between the evaluation
of any two arguments (which may already reduce the efficiency of strongly
optimizing compilers on superscalar microprocessors).

But I'll take a realistic example.
The 8088 CPU.
This CPU has no cache, so optimizing for speed is equivalent to
optimizing for size.
Assume a compiler using the common __cdecl calling convention which
requires arguments to be pushed from right to left.

f(g(),h());
With a right-to-left evaluation order, a function call is done as
follows:

call near ptr h
push ax
call near ptr g
push ax
call near ptr f
add sp,4

Total CPU cycles for the "function call" : "push ax" + "push ax" +
"call near ptr f"+ "add sp,4":
15+15+23+4 = 57

Push operations require only 1 machine code byte, and a few CPU
cycles.

With a left-to-right evaluation order, the stack can only be sensibly
accessed with the bp register. sp can't be used, and bx, si and di use
the ds segment instead of the ss segment.
A segment prefix opcode could be used, but it is slow.
Moreover, each register is precious, so we mustn't lose bx, si or di.

At best, the compiler (if it is smart enough) can write to parameters
via the bp register like that:

sub sp,4
call near ptr g
mov [bp - ... ], ax
call near ptr h
mov [bp - ... ], ax
call near ptr f
add sp,4

CPU cycles for the "function call" :
4+(13+9)+(13+9)+23+4 = 75

The function call is significantly slower

But, if the compiler allows dynamic allocation on the stack (it is not
standard in C++, but it can be a sensible implementation of VLA in a C
compiler...) via an implementation-specific alloca function, then the
compiler can't know the quantity of space between bp and sp, and thus
must do that:

push bp
mov bp,sp
sub sp,4
call near ptr g
mov [bp-4],ax
call near ptr h
mov [bp-2],ax
call near ptr f
mov sp,bp
pop bp

CPU cycles for the "function call" :
15+2+4+(13+9)+(13+9)+23+2+12 = 102


Okay, for a two-arguments function, there is a better way:

call near ptr h
push ax
call near ptr g
pop cx
push ax
push cx
call near ptr f
add sp,4

CPU cycles for the "function call"
15+12+15+15+23+4 = 84

And if a register variable is free (which is very seldom the case, since
there are only two registers for register variables: si and di),
it can be reduced to:

call near ptr h
mov di,ax
call near ptr g
push ax
push di
call near ptr f
add sp,4

CPU cycles for the "function call"
2+15+15+23+4 = 59

But, the slow way is the only sensible implementation I see for a
function having more parameters (3 or more).

And there are probably CPUs for which that would be worse!
Calling sequences can already be very problematic for some CPUs:
Look at http://cm.bell-labs.com/cm/cs/who/dmr/clcs.html

James Kanze
Guest
Posted: Fri Apr 14, 2006 10:06 pm    Post subject: Re: Defining undefined, etc., behavior

SuperKoko wrote:
Quote:
Scott Meyers wrote:
any work that's been done on identifying these kinds of
behavior (beyond grepping the standards for "undefined",
"unspecified", etc.) and suggesting ways to eliminate or
avoid them, ideally without any obvious runtime cost.
(Checking array bounds would fail the "no obvious runtime
cost" test, while defining function argument evaluation order
would pass, because, well, because it's not obvious to me
that imposing a required order would necessarily incur a
runtime penalty :-})

Assume that the order of evaluation were defined (for example
left to right), and that a sequence point were inserted between
the evaluation of any two arguments (which may already reduce
the efficiency of strongly optimizing compilers on superscalar
microprocessors).

But I'll take a realistic example.
The 8088 CPU.
This CPU has no cache, so optimizing for speed is equivalent to
optimizing for size.

Not really. Not all bytes in the instruction flow take the same
amount of time -- a multiply of AX with another register, and a
move between two registers, are both the same size, but the
multiply will (from memory) take over 50 times more time than
the move.

Quote:
Assume a compiler using the common __cdecl calling convention
which requires arguments to be pushed from right to left.

I'm not sure what you mean by the __cdecl convention, but the
usual convention doesn't require anything to be pushed -- I
don't actually see how it could, since the called function has
no way of determining how the arguments got where they were.

Quote:
f(g(),h());
With a right-to-left evaluation order, a function call is done as
follows:

call near ptr h
push ax
call near ptr g
push ax
call near ptr f
add sp,4

Or (more likely):

sub sp,8              # at the start of the function, reserving
                      # enough space for the parameters of all
                      # of the functions called

call h
mov [bp+whatever],ax
call g
mov [bp+whatever],ax
call f

mov sp,bp             # at the end of the function

Don't forget that push is one of the more expensive
instructions.

Quote:
Total CPU cycles for the "function call" : "push ax" + "push ax" +
"call near ptr f"+ "add sp,4":
15+15+23+4 = 57

I presume that the 15 is for the push. I'm no longer very sure
(it's been 25 years since I taught ASM 86), but I think that's
more than the mov [bp+x],ax.

Quote:
Push operations require only 1 machine code byte, and a few
CPU cycles.

If you consider 15 a few. I suspect that in most cases, my
strategy will be faster.

Quote:
With a left-to-right evaluation order, the stack can only be
sensibly accessed with the bp register. sp can't be used, and
bx, si and di use the ds segment instead of the ss segment. A
segment prefix opcode could be used, but it is slow.
Moreover, each register is precious, so we mustn't lose bx, si
or di.

At best, the compiler (if it is smart enough) can write to
parameters via the bp register like that:

sub sp,4
call near ptr g
mov [bp - ... ], ax
call near ptr h
mov [bp - ... ], ax
call near ptr f
add sp,4

CPU cycles for the "function call" :
4+(13+9)+(13+9)+23+4 = 75

The function call is significantly slower

With my strategy, there's no difference. But I wonder about
what you mean by significantly. The three calls take so much
time on the processor you're talking about that anything else is
just chicken feed.

Quote:
But, if the compiler allows dynamic allocation on the stack
(it is not standard in C++, but it can be a sensible
implementation of VLA in a C compiler...) via an
implementation-specific alloca function, then the compiler
can't know the quantity of space between bp and sp, and thus
must do that:

push bp
mov bp,sp
sub sp,4
call near ptr g
mov [bp-4],ax
call near ptr h
mov [bp-2],ax
call near ptr f
mov sp,bp
pop bp

CPU cycles for the "function call" :
15+2+4+(13+9)+(13+9)+23+2+12 = 102

Okay, for a two-arguments function, there is a better way:

call near ptr h
push ax
call near ptr g
pop cx
push ax
push cx
call near ptr f
add sp,4

CPU cycles for the "function call"
15+12+15+15+23+4 = 84

Except that normally, the push bp; mov bp,sp; sub sp,x and the
mov sp,bp; pop bp are there anyway.

Quote:
And if a register variable is free (which is very seldom the
case, since there are only two registers for register variables:
si and di), it can be reduced to:

call near ptr h
mov di,ax
call near ptr g
push ax
push di
call near ptr f
add sp,4

CPU cycles for the "function call"
2+15+15+23+4 = 59

But, the slow way is the only sensible implementation I see
for a function having more parameters (3 or more).

And there are probably CPUs for which that would be worse!

I doubt that. The 8086 architecture is about the worst ever
invented for optimization.

But you seem to be making a lot of assumptions about the way the
compiler works. If the functions f() and g() are very small,
they can be made inline, and the compiler can do whatever it
likes with them. And if they aren't extremely trivial, any
difference in time here (which will never be more than a couple
of cycles) will be negligible compared to the time spent in the
functions.

Quote:
Calling sequences can already be very problematic for some CPU :
Look at http://cm.bell-labs.com/cm/cs/who/dmr/clcs.html

Which was written when? Are you trying to say that there has
been no progress in the last 25 years?

--
James Kanze kanze.james (AT) neuf (DOT) fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

James Kanze
Guest
Posted: Fri Apr 14, 2006 10:06 pm    Post subject: Re: Defining undefined, etc., behavior

NULL (AT) NULL (DOT) NULL wrote:
Quote:
Scott Meyers posted:

If you're writing portable code, then the solution is simple:
Don't allow situations that could, depending on the
implementation, result in Undefined Behaviour. Here's an
example:

char p = 120;

char += 100; //Could cause Undefined Behaviour

(I suppose you meant p += 100, rather than char += 100.)

Undefined, or implementation defined. I think in this case,
implementation defined: the semantics of the statement is:
read p, convert it to an int, add 100 to the int, then convert
the results back to char and store into p. The only ambiguous
part is converting the results (220) back into char -- a char
may not be (and typically isn't) capable of representing 220.
In that case, the results are implementation defined (and may
result in an implementation defined signal).
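
The steps described above can be sketched as follows, with hypothetical helper names. The addition itself happens in int and is exact; only the final narrowing back to char is in question, so a range check before narrowing sidesteps the implementation-defined result.

```cpp
#include <cassert>  // assert, for checking the example
#include <climits>

// Steps 1 and 2 of "p += 100": p is promoted to int and the addition is
// done in int, so 120 + 100 is exactly 220 on every implementation.
int promoted_sum(char p, int increment) {
    return p + increment;
}

// Step 3 is the conversion back to char. Testing the range first avoids
// depending on what the implementation does with a value that does not fit.
bool fits_in_char(int value) {
    return value >= CHAR_MIN && value <= CHAR_MAX;
}
```

On an implementation with an 8-bit signed char, fits_in_char(220) is false; with an unsigned char it is true.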

Quote:
This has no place in portable code.

Agreed. But I don't think that this is the sort of thing that
Scott was talking about.

Quote:
On the other hand, if you're programming for a platform where
you know that "char" is unsigned, then you are free to do the
above.

I'd be interested in any work that's been done on identifying
these kinds of behavior (beyond grepping the standards for
"undefined", "unspecified", etc.) and suggesting ways to
eliminate or avoid them, ideally without any obvious runtime
cost.

Basically you just have to know the C++ language;

The problem is precisely that there are many cases where the C++
language doesn't specify what may happen.

Quote:
you've to know things like:

a) It's undefined behaviour for a signed integral type to overflow.

It's undefined behavior for an arithmetic operation to overflow
if the type isn't unsigned. Given a value, however, it is only
implementation defined (and not undefined) what happens when
converting it to a narrower type.
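
A minimal sketch of avoiding the undefined case, assuming a hypothetical helper checked_add. The range test is done before the addition, so the overflowing operation is never executed.

```cpp
#include <cassert>  // assert, for checking the example
#include <climits>

// Hypothetical helper: add two ints, refusing the cases that would overflow.
// Returns false and leaves result untouched when a + b is not representable.
bool checked_add(int a, int b, int& result) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    result = a + b;
    return true;
}
```

Unlike defining an evaluation order, this check does have an obvious runtime cost.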

Quote:
b) It's undefined behaviour to check an intrinsic variable's
value if it was never initialised.

(Checking array bounds would fail the "no obvious runtime
cost" test, while defining function argument evaluation order
would pass, because, well, because it's not obvious to me
that imposing a required order would necessarily incur a
runtime penalty
:-})

We already have the C++ programming language. Unless we change
it with future Standards, we can simply work around any
problems.

My impression was that Scott was asking about possible changes,
to eliminate unnecessary undefined behaviors.

Some are, IMHO, simply inexcusable: using a symbol with two
successive _ is undefined behavior, for example. Surely the
compiler can check for this, at no cost in performance. If a
dynamically resolved function call resolves to a pure virtual
function, it's also undefined behavior -- this could also surely
be specified to result in a call to terminate, or to some
special function which calls terminate by default.

Those are the easy ones. I have yet to see where rigorously
defining an order of evaluation would have a measurable impact
on real programs, and it is probably the worst problem we face
today in terms of undefined (or underspecified) behavior.

Quote:
Instead of writing:

j = f() + g();

Simply write:

j = f(); j += g();

Sure. Why not just do away with binary operators other than the
assignment operators completely?

Quote:
My motivation for the question is that it's my understanding
that safety-critical applications typically identify language
subsets to reduce risk, and they then typically impose usage
rules on the features remaining in the subsets to further
reduce risk. One form of risk is unpredictable program
behavior, and such behavior can arise from the use of
language constructs with
undefined/unspecified/implementation-defined behavior, so I'm
wondering what work has been done on identifying such
constructs and dealing with them.

None that I'm aware of -- because none is needed. Here's an
example of how I write portable code:

I invariably use Windows XP, so all my programs are Win32. On
Win32, the built-in types are as follows:

char = 8 bits
short = 16 bits
int = 32 bits
long = 32 bits

On my platform, I can safely store a 32-bit number in an
"int". If I'm writing portable code however, I won't. Why?
Because the Standard says an "int" can be 16-Bit. Thankfully,
it also says that a "long" must be at least 32-Bit, so I use a
"long" when I need to store a 32-Bit number.

But that's not the real problem. In most safety critical
systems, portability isn't an issue. The problem is undefined
behavior even in non-portable code, due to order of evaluation,
etc.

Quote:
Also, there may be some systems where, if you don't initialise
a local variable, its value is 0, as follows:

void Func()
{
int k;

k += 56;

//k's value is now 56 on some platforms
}

However, the Standard doesn't guarantee this,

And it's pretty rare. I don't know of any compiler which does
it in production builds.

Quote:
so I don't do it in portable code (I wouldn't do it in
platform-specific code either -- I'd opt to explicitly set it
to zero).

You miss what I think is Scott's point. Let's say that by
accident or error, you do do it -- nobody's perfect, after all.
You then test your program, and get a certain result, which is
deemed correct. Your tests are only valid, however, if that
result is reproducible.

Quote:
So at the end of the day, you're either writing portable code,
or platform specific code. For both, avoid undefined
behaviour. For the former, avoid implementation specific
behaviour which may result in undefined behaviour depending on
flexibility given to the platform/implementation.

We all know that we should avoid undefined behavior. The
problem with undefined behavior, however, is that it isn't
always detectable. You make a mistake, and it doesn't have any
apparent symptoms. For the moment.

--
James Kanze kanze.james (AT) neuf (DOT) fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

James Kanze
Guest
Posted: Fri Apr 14, 2006 10:06 pm    Post subject: Re: Defining undefined, etc., behavior

Pete Becker wrote:
Quote:
Scott Meyers wrote:

Other undefined behavior -- or maybe it's unspecified or
implementation-defined behavior, I can never keep those terms
straight

Undefined behavior means that the language definition says
nothing about what happens if you do it (de-referencing a null
pointer, for example). For the other two, the behavior is
more clearly bounded. Implementation-defined means what it
sounds like: the implementation must document what it does
(for example, whether char is signed or unsigned). In contrast
to unspecified, which means there is a choice within a clear
set of alternatives and the implementation is not required to
document what it does (for example, order of evaluation of
function arguments, which can change with different
optimization settings for the same compiler).

That's what the standard says (and I suspect that Scott knows
this). The problem Scott has is more likely knowing when each
of the above applies. (There's also the problem that although
the standard requires an implementation to document
implementation defined behavior, to my knowledge none do.)

There are two problems, really. The first is being able to look
at a piece of code, and know whether it is fully defined or not.
For example: "i = f() + g()". This should normally be fully
defined, but if f() and g() both have side effects, or if one
has side effects, and the result of the other depends on these
side effects, then it is unspecified. (And let's not forget the
classic "f( std::auto_ptr<T>( new T ), std::auto_ptr<T>( new T
) )", where it is unspecified whether the code leaks memory or
not.) The other is being able to have any confidence in the
tests you run -- tests are only valid if the results are
reproducible, and in the presence of undefined or unspecified
behavior, by definition, they aren't. Taken to the extreme,
this means that there is no point in running any tests on a C++
program, because they don't mean anything anyways. (Of course,
in practice, it's not that bad. Not quite, anyway.)
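
A sketch of the usual way around the classic leak, under two assumptions: std::unique_ptr (C++11) stands in for std::auto_ptr, which was removed in C++17, and Tracked and consume are hypothetical names. Each allocation is owned by a named smart pointer before the next statement runs, so no evaluation order can strand a raw pointer.

```cpp
#include <cassert>  // assert, for checking the example
#include <memory>
#include <utility>

// Hypothetical payload type that counts live instances.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Takes ownership of both objects; they are destroyed when the call ends.
int consume(std::unique_ptr<Tracked> first, std::unique_ptr<Tracked> second) {
    return (first && second) ? Tracked::live : -1;
}

int build_then_call() {
    // One allocation per statement: if the second constructor throws,
    // the first object is already owned and is released during unwinding.
    std::unique_ptr<Tracked> first(new Tracked);
    std::unique_ptr<Tracked> second(new Tracked);
    return consume(std::move(first), std::move(second));
}
```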

--
James Kanze kanze.james (AT) neuf (DOT) fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France +33 (0)1 30 23 00 34

Michael Karcher
Guest
Posted: Fri Apr 14, 2006 11:06 pm    Post subject: Re: Defining undefined, etc., behavior

James Kanze <kanze.james (AT) neuf (DOT) fr> wrote:
Quote:
That's what the standard says (and I suspect that Scott knows
this). The problem Scott has is more likely knowing when each
of the above applies. (There's also the problem that although
the standard requires an implementation to document
implementation defined behavior, to my knowledge none do.)

What's wrong with 'info gcc "C Implementation"'?

Michael Karcher

Guest
Posted: Sat Apr 15, 2006 1:06 am    Post subject: Re: Defining undefined, etc., behavior

"Tomás" wrote:
Quote:

So at the end of the day, you're either writing portable code, or platform
specific code. For both, avoid undefined behaviour.

Not quite. In the context of the C++ standard, "undefined behaviour"
means only that the C++ standard chooses not to provide any definition
for the behavior. That doesn't prevent a particular implementation from
providing its own, unportable definition of the behavior. It's
commonplace for extensions to be implemented by defining
standard-undefined behavior. For instance, it's commonplace for an
implementation to implement an extension by requiring #inclusion of a
non-standard header file that defines or declares identifiers whose
names are reserved to the implementation, which has undefined behavior
according to 17.4.3.1p3. There's nothing wrong with code that's
intended to be non-portable relying on a particular implementation's
definition of behavior that is otherwise undefined.


Francis Glassborow
Posted: Sat Apr 15, 2006 1:06 am    Post subject: Re: Defining undefined, etc., behavior

In article <e1p3fi$bj2$1 (AT) emma (DOT) aioe.org>, James Kanze
<kanze.james (AT) neuf (DOT) fr> writes
Quote:
That's what the standard says (and I suspect that Scott knows
this). The problem Scott has is more likely knowing when each
of the above applies. (There's also the problem that although
the standard requires an implementation to document
implementation defined behavior, to my knowledge none do.)

The problem is that it seems that the choices for implementation defined
behaviour must always be ones that allow a program to continue
execution.

The problem with this is that it makes overflow of signed and floating
point values undefined behaviour. If we allowed an implementation to
specify that, for example, overflow of an int value results in the
program terminating we could then provide well-defined behaviour for all
the cases where the implementation does not go to that extreme.

When we couple unspecified order of evaluation with the undefined nature
of overflow of an int value we finish up with forcing very tortuous code
on programmers who do not need portability but do need behaviour
guarantees.
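
As a rough illustration of the kind of code this forces, here is a sketch of an addition that never overflows because it checks the operands first. The helper name checked_add and its bool-plus-out-parameter interface are assumptions made for the example, not anything proposed in this thread.

```cpp
#include <limits>

// Hypothetical helper: adds a and b only when the result fits in an int.
// Returns false and leaves `out` untouched instead of overflowing.
inline bool checked_add(int a, int b, int& out) {
    const int max = std::numeric_limits<int>::max();
    const int min = std::numeric_limits<int>::min();
    if (b > 0 && a > max - b) return false;  // sum would exceed the maximum
    if (b < 0 && a < min - b) return false;  // sum would go below the minimum
    out = a + b;                             // safe: the sum is representable
    return true;
}
```

Every signed addition that might overflow needs this sort of guard, which gives some idea of why the avoidance is tedious and easy to get wrong.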

I am well aware that the elite can program round this form of undefined
behaviour but requiring such avoidance seriously reduces programmer
productivity and adds too many opportunities for buggy code.

I believe that Scott shares my concern for the need to make undefined
behaviour easier to avoid and less demanding of expertise.


--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit
For project ideas and contributions: http://www.spellen.org/youcandoit/projects

Andrei Alexandrescu (See
Posted: Sat Apr 15, 2006 3:06 am    Post subject: Re: Defining undefined, etc., behavior

SuperKoko wrote:
Quote:
Assume that the order of evaluation were defined (for example left to
right), and a sequence point was included between two arguments
evaluation (which may already reduce the efficiency of strongly
optimizing compilers on superscalar microprocessors).

But let me take a realistic example: the 8088 CPU.
This CPU has no cache, so optimizing for speed is equivalent to
optimizing for size.

It is quite understood that right-to-left evaluation appears to be more
natural given a stack-based calling convention. However, I conjecture
that the performance differences are practically nonexistent on today's
architectures.
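
For readers who want to see what is at stake in source terms, here is a small sketch. The names f, g and h are borrowed from the quoted assembly; their bodies and the trace() helper are invented for the illustration. It makes no claim about the machine code any given compiler produces.

```cpp
#include <string>

// Hypothetical trace recording which function ran first.
inline std::string& trace() { static std::string t; return t; }
inline int g() { trace() += 'g'; return 1; }
inline int h() { trace() += 'h'; return 2; }
inline int f(int a, int b) { return a * 10 + b; }

// The order of g() and h() is unspecified here: the trace may end up
// as "gh" or "hg", depending on the implementation.
inline int call_unordered() { return f(g(), h()); }

// Named locals force left-to-right order, because each declaration is a
// complete statement that finishes before the next one begins.
inline int call_ordered() {
    const int a = g();
    const int b = h();
    return f(a, b);
}
```

Both functions return the same value; only the order of the side effects differs, and only call_ordered() pins it down.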

Quote:
At best, the compiler (if it is smart enough) can write to parameters
via the bp register like that:

sub sp,4
call near ptr g
mov [bp - ... ], ax
call near ptr h
mov [bp - ... ], ax
call near ptr f
add sp,4

CPU cycles for the "function call" :
4+(13+9)+(13+9)+23+4 = 75

The function call is significantly slower

It could be argued that the function call can be faster. Each PUSH will
decrement the stack register. PUSH looks small, but it's not; it entails
a fair amount of microcode, and since we restrict ourselves to 8088,
that microcode will execute slowly, too - only a tad faster than the
equivalent assembly code. Or is it just as slow? I don't remember.

In contrast, this other method computes the stack frame of f()
statically, and then writes to the stack without needing to manipulate
sp. As a consequence, there are fewer read-after-write (RAW) dependencies
and the potential for parallelization is superior (a potential not
illustrated by the example at hand).

But, as I said: on today's architectures, the differences in speed are
immeasurable.

Quote:
And there are probably CPUs for which that would be worse!
Calling sequences can already be very problematic for some CPUs:
Look at http://cm.bell-labs.com/cm/cs/who/dmr/clcs.html

That document was written 25 years ago. I failed to find a newer one
that discusses the same issues.


Andrei

Andrei Alexandrescu (See
Posted: Sat Apr 15, 2006 3:06 am    Post subject: Re: Defining undefined, etc., behavior

James Kanze wrote:
Quote:
We all know that we should avoid undefined behavior. The
problem with undefined behavior, however, is that it isn't
always detectable. You make a mistake, and it doesn't have any
apparent symptoms. For the moment.

It's not detectable _efficiently_ and _with today's technology_. There
are less efficient languages with no undefined behavior at all. And IMHO
there's quite some gratuitous un*d behavior in C++.

Andrei

John Nagle
Posted: Sat Apr 15, 2006 7:06 am    Post subject: Re: Defining undefined, etc., behavior

Scott Meyers wrote:
Quote:
What work has been done on eliminating undefined behavior in C or C++?
Off the top of my head, some undefined behavior is due to "stupid user
behavior," e.g., indexing beyond the end of an array, dereferencing a
null pointer, etc. Other undefined behavior -- or maybe it's
unspecified or implementation-defined behavior, I can never keep those
terms straight -- arises from flexibility for implementers, e.g., order
of evaluating arguments for function calls, order of expression
evaluation between sequence points, order of initialization of non-local
statics defined in separate translation units. I'd be interested in any
work that's been done on identifying these kinds of behavior (beyond
grepping the standards for "undefined", "unspecified", etc.) and
suggesting ways to eliminate or avoid them, ideally without any obvious
runtime cost. (Checking array bounds would fail the "no obvious runtime
cost" test, while defining function argument evaluation order would
pass, because, well, because it's not obvious to me that imposing a
required order would necessarily incur a runtime penalty :-})

There really is more than one type of undefined behavior, and
classifying the ones in the C++ specification might be useful.
A few obvious classes:

-- Violations of the storage model
   Examples:
   -- Indexing beyond the limits of an array
   -- Use of an iterator after an invalidating operation on its collection

-- Violations of the construction model
   Examples:
   -- Access to an object before the constructor has completed or after
      the destructor has run (includes references to uninitialized variables)
   -- Casting into a type for which not all possible values are valid
      (pointers, constructed objects)

-- Violations of the sequencing model
   Examples:
   -- An expression with calls to two functions with (interfering?) side effects
      for which the evaluation rules do not explicitly determine the order
      of evaluation

-- Violations of the concurrency model (insofar as C++ has one)
   Examples:
   -- Non-volatile data shared between two threads.

With a taxonomy like this, some issues become clearer. The last three
categories, for example, are potentially detectable at compile time.
The issues for those categories revolve around whether certain
formally undefined behavior should be permitted.
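
As a small illustration of the first category, here is a sketch of how the two storage-model examples are commonly sidestepped with std::vector. The helper names try_read and first_after_growth are made up for this example, and the bounds check in at() does carry a runtime cost, so it would not pass Scott's "no obvious runtime cost" test.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reads v[i] through at(), which throws
// std::out_of_range instead of invoking undefined behavior.
inline bool try_read(const std::vector<int>& v, std::size_t i, int& out) {
    try {
        out = v.at(i);
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}

// Hypothetical helper: keeps an index rather than an iterator across
// push_back, since growth may reallocate and invalidate iterators.
inline int first_after_growth(std::vector<int>& v) {
    const std::size_t pos = 0;  // an index stays meaningful after reallocation
    v.push_back(42);            // may invalidate every iterator into v
    return v[pos];              // v is non-empty here, so pos is in range
}
```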

John Nagle
Animats
