
How should compile time integer overflow, etc. behave?

 
yossi.kreinin@gmail.com
Guest





Posted: Mon Nov 28, 2005 3:57 pm    Post subject: How should compile time integer overflow, etc. behave?



Hi!

What is the exact specification for compile-time integer
arithmetic in standard C++? My specific question is:
can a cross compiler use 64 bit integral data types for
intermediate results of compile-time computations?

More generic questions which I can't answer about this:

- Should a compile time expression such as a+b*c
behave EXACTLY as it would if a,b,c were variables
with values known at runtime?
- In unsigned arithmetic, should it simulate overflow in
intermediate results? If yes, should a cross compiler
simulate the data types of the target machine?
- In signed arithmetic (where AFAIK overflow causes
implementation-dependent behaviour), should the
behaviour of the target machine be simulated?
- Should division by zero be deferred until runtime so
that the target can handle it in its favorite way? gcc
on Linux stunned me by considering `N/0'
a non-compile-time expression, and thus refusing to
instantiate templates with it, but it accepted
`const int n = N/0;' with a warning; a runtime reference
to n caused a core dump.
So it looks like gcc is working hard to defer division
by zero until runtime and thus let the target handle it.
But shouldn't signed arithmetic overflow be handled
this way too, by the same reasoning (it is not
numerically specified)? gcc considers expressions
with overflows constant.
- Do C & C++ agree on these issues?

Thanks in advance!
Yossi


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

kanze
Guest





Posted: Mon Nov 28, 2005 9:05 pm    Post subject: Re: How should compile time integer overflow, etc. behave?



yossi.kreinin (AT) gmail (DOT) com wrote:

Quote:
What is the exact specification for compile-time integer
arithmetics in standard C++?

That it obey the same rules as run-time integer arithmetic,
except that (§5/5):

If during the evaluation of an expression, the result is not
mathematically defined or not in the range of representable
values for its type, the behavior is undefined, unless such
an expression is a constant expression, in which case the
program is ill-formed.

(I think this actually goes farther than intended; I don't think
one can reasonably expect a compile time error in a case like:

extern int a[] ;

int *pa = a + 20 ;

when a is defined int[10] in another module. This is, however,
a constant expression, and the results of the evaluation are not
in the range of representable values for its type. There's no
problem with integral constant expressions, however.)

Quote:
My specific question is: can a cross compiler use 64 bit
integral data types for intermediate results of compile-time
computations?

Sure, as long as 1) in cases of no overflow, it ensures that the
results are the same as what they would be on the target
machine, and 2) it detects cases which would cause overflow on
the target machine.

Quote:
More generic questions which I can't answer about this:

- Should a compile time expression such as a+b*c
behave EXACTLY as it would if a,b,c were variables
with values known at runtime?

If a, b and c have integral types, the *results* should be
exactly the same as if the expression were evaluated at
run-time, and if any result (including intermediate results)
would overflow at runtime, the compiler must issue a diagnostic
(although there might be a result which you could use at
runtime).

Quote:
- In unsigned arithmetics, should it simulate overflow in
intermediate results? If yes, should a cross compiler
simulate the data types of the target machine?

Obviously. That's the definition of unsigned arithmetic.
Unsigned arithmetic never results in a value not in the range of
representable values.

Typically, the cross compiler doesn't have to simulate anything,
at least if there is a large enough unsigned type on the host
machine. It just does the usual unsigned arithmetic, and when
it is finished, the modulo. About the only special treatment
necessary *for* *unsigned* *integral* arithmetic is to avoid
dividing by 0.
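
A minimal sketch of that approach, assuming a hypothetical target whose unsigned int is 16 bits wide and a host with a 64-bit unsigned type. The names `kTargetUIntBits`, `wrap_to_target`, `ufold_add`, `ufold_sub`, `ufold_mul`, `ufold_div` and `UFolded` are illustrative only and are not taken from any real compiler:

```cpp
#include <cstdint>

// Assumption for illustration: the target's unsigned int has 16 value bits.
constexpr unsigned kTargetUIntBits = 16;
constexpr std::uint64_t kTargetUIntMask =
    (std::uint64_t{1} << kTargetUIntBits) - 1;

// Reduce a host-side value modulo 2^16, as the target would.
constexpr std::uint64_t wrap_to_target(std::uint64_t v) {
    return v & kTargetUIntMask;
}

// Addition, subtraction and multiplication: ordinary host arithmetic,
// followed by the modulo reduction.
constexpr std::uint64_t ufold_add(std::uint64_t a, std::uint64_t b) {
    return wrap_to_target(a + b);
}
constexpr std::uint64_t ufold_sub(std::uint64_t a, std::uint64_t b) {
    return wrap_to_target(a - b);
}
constexpr std::uint64_t ufold_mul(std::uint64_t a, std::uint64_t b) {
    return wrap_to_target(a * b);
}

// Division is the one case needing special treatment: operands must be
// reduced first, and a zero divisor must be reported rather than executed.
struct UFolded {
    bool ok;              // false: the folder should report division by zero
    std::uint64_t value;  // meaningful only when ok is true
};
constexpr UFolded ufold_div(std::uint64_t a, std::uint64_t b) {
    return wrap_to_target(b) == 0
               ? UFolded{false, 0}
               : UFolded{true, wrap_to_target(a) / wrap_to_target(b)};
}
```

With these definitions `ufold_mul(40000, 2)` yields 14464 (80000 modulo 65536), which is what a 16-bit unsigned multiplication would produce on the assumed target.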

Quote:
- In signed arithmetics (where AFAIK overflow causes
implementation-dependent behaviour), should the
behaviour of the target machine be simulated?

Not necessarily. First, because the behavior here isn't
implementation defined, but undefined. And secondly, because
the standard says that a diagnostic is required. Note that this
is true even in the case of a*b/c, where the results are in
range, but the intermediate value a*b isn't.

Quote:
- Should division by zero be deferred until runtime so
that the target can handle it in it's favorite way?

No. It must trigger a compile time error.

At least, I think that that is the intent of §5/5, especially
given the note. It could be made clearer, however.

Quote:
gcc on Linux stunned me by considering `N/0' a
non-compile-time expression, and thus refusing to
instantiate templates with it, but it accepted `const int n
= N/0;' with a warning; a runtime reference to n caused a
core dump.

Well, I don't see how you could instantiate a template on it, in
any way. My interpretation of the standard is that it requires
a compiler diagnostic.

Quote:
So it looks like gcc is working hard to defer division
by zero until runtime and thus let the target handle it.
But shouldn't signed arithmetic overflow be handled
this way, either, by the same reasoning (it is not
numerically specified)? gcc considers expressions
with overflows constant.

Well, the standard is very explicit concerning overflow. It
only sort of hints that division by zero should be handled the
same.

If the compiler authors believe that §5/5 does not apply to
division by zero, then the compiler can only refuse to compile
the code if it can prove that the code must be executed.
Otherwise, it more or less has to take the approach that gcc is
taking -- it obviously cannot compute the value at compile time,
and if it cannot give an error either...

Note that initialization of static objects is guaranteed to
occur, and that the standard also guarantees that if the
initialization expression is an integral constant expression,
the initialization is static. So at the very least, if the
compiler authors take this point of view, they must handle the
case differently when the expression is used to initialize a
static object.

Quote:
- Do C & C++ agree on these issues?

Off hand, I don't see the special case requiring a compiler
error in C. Other than that, the same rules govern, i.e.
(§6.5/5 in C99): "If an exceptional condition occurs during the
evaluation of an expression (that is, if the result is not
mathematically defined or not in the range of representable
values for its type), the behavior is undefined."

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Greg Herlihy
Guest





Posted: Tue Nov 29, 2005 10:26 am    Post subject: Re: How should compile time integer overflow, etc. behave?



yossi.kreinin (AT) gmail (DOT) com wrote:

Quote:
What is the exact specification for compile-time integer
arithmetics in standard C++? My specific question is:
can a cross compiler use 64 bit integral data types for
intermediate results of compile-time computations?

More generic questions which I can't answer about this:

- Should a compile time expression such as a+b*c
behave EXACTLY as it would if a,b,c were variables
with values known at runtime?

For a+b*c to be a compile-time expression (that is, a constant
expression), it must first appear in one of these contexts:

- as an array bounds
- as a case expression in a switch statement
- as a bitfield length
- as an enumerator initializer
- as a static member initializer
- as an integral or enumeration non-type template argument

Furthermore, a, b, and c would all have to be one of the following:
an enumerator, a non-volatile const variable, a static data member
initialized with a constant expression, or a non-type template
parameter [§5.19/1]. So while it is not impossible for all of those
conditions to be met by the expression a+b*c, it is highly unlikely
that such an expression would turn out to be a constant expression as
one would find in most C++ programs. And given that the question
provides no context for the expression, a+b*c, there is no reason to
assume that it is a constant expression as the question assumes.

Quote:
- In unsigned arithmetics, should it simulate overflow in
intermediate results? If yes, should a cross compiler
simulate the data types of the target machine?

I don't see any reason why it should. The expression is not evaluated
by the program being compiled but by the program doing the compiling.
As such, it does not matter that the program happens to be a compiler,
it could be any kind of program and it would still evaluate an
expression in just one way.

Quote:
- In signed arithmetics (where AFAIK overflow causes
implementation-dependent behaviour), should the
behaviour of the target machine be simulated?

The compiler's behavior was specified by its own source code and the
compiler that compiled it - and not by the source code and target
architecture of a program that it may be compiling.

Quote:
- Should division by zero be deferred until runtime so
that the target can handle it in it's favorite way? gcc
on Linux stunned me by considering `N/0'
a non-compile-time expression, and thus refusing to
instantiate templates with it, but it accepted
`const int n = N/0;' with a warning; a runtime reference
to n caused a core dump.

A non-type template parameter such as N/0 is a compile-time constant
expression that the compiler must evaluate in order to instantiate the
template in which it appears. It is not a runtime expression that could
be evaluated by the compiled program. Therefore gcc evaluated the
expression and reported the error, as should be expected.

A divide-by-zero operation in a constant expression initializer means
that the program is ill-formed (§5.0/5). Gcc probably should report an
error and not a warning in this case. But since initializing a variable
is a runtime operation, there is a rationale for allowing the program
to run.

Quote:
So it looks like gcc is working hard to defer division
by zero until runtime and thus let the target handle it.
But shouldn't signed arithmetic overflow be handled
this way, either, by the same reasoning (it is not
numerically specified)? gcc considers expressions
with overflows constant.

I don't really see that the compiler is "working hard" with any set
agenda. There is little room for discretion here since the difference
between compile-time expressions and run-time expressions is clearly
spelled out. Any expression that is not a constant expression (which
were enumerated above) is a runtime expression that will be evaluated
by the program being compiled. And there is no reason to expect an
evaluation by the compiled program would produce results identical to
the results the compiler would have produced, since the two programs
were not in fact compiled with the same compiler, nor were they
compiled for the same machine architecture.

Greg


kanze
Guest





Posted: Wed Nov 30, 2005 11:49 am    Post subject: Re: How should compile time integer overflow, etc. behave?

Greg Herlihy wrote:
Quote:
yossi.kreinin (AT) gmail (DOT) com wrote:

What is the exact specification for compile-time integer
arithmetics in standard C++? My specific question is: can a
cross compiler use 64 bit integral data types for
intermediate results of compile-time computations?

More generic questions which I can't answer about this:

- Should a compile time expression such as a+b*c
behave EXACTLY as it would if a,b,c were variables
with values known at runtime?

For a+b*c to be a compile-time expression (that is, a constant
expression), it must first appear in one of these contexts:

- as an array bounds
- as a case expression in a switch statement
- as a bitfield length
- as an enumerator initializer
- as a static member initializer
- as an integral or enumeration non-type template argument

You've got it backwards. For a+b*c to legally appear in one of
those contexts, it must be an integral constant expression
(which is, I suppose, what you mean by a compile-time
expression). Whether it is an integral constant expression is
determined by the form of a, b and c. (Generally speaking, with
very few exceptions, C++ is context independent; the meaning of
an expression doesn't depend on the context in which it is
used.)
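
A small sketch of the point, with illustrative names (`Buffer`, `table`, `n` are hypothetical): each operand is an enumerator, a const integral variable initialized with a constant expression, or a non-type template parameter, so `a + b*c` is an integral constant expression wherever it appears, and is therefore usable as an array bound:

```cpp
enum { a = 2 };   // an enumerator
const int b = 3;  // a const int initialized with a constant expression

template <int C>  // a non-type template parameter
struct Buffer {
    // a + b * C is an integral constant expression because of the
    // form of its operands, so it may serve as an array bound.
    char data[a + b * C];
};

// The same kind of expression also yields a const variable that is
// itself usable in constant expressions.
const int n = a + b * 4;
char table[n];
```

Here `sizeof(Buffer<4>)` and `sizeof(table)` are both 14.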

Note that C++ also has other constant expressions; these play a
role in determining whether initialization of a non-local object
is static or dynamic, for example.

Quote:
Furthermore, a, b, and c would all have to be one of the
following: an enumerator, a non-volatile const variable, a
static data member initialized with a constant expression, or
a non-type template parameter [§5.19/1].

So while it is not impossible for all of those conditions to
be met by the expression a+b*c, it is highly unlikely that
such an expression would turn out to be a constant expression
as one would find in most C++ programs.

It happens. I think that his question was meant to be more
general, however, and integral constant expressions involving
operators are actually fairly frequent in C/C++ code.

Quote:
And given that the question provides no context for the
expression, a+b*c, there is no reason to assume that it is a
constant expression as the question assumes.

That's not what he asked. He asked about what happens when it
is a constant expression. The fact that there are cases where
it isn't is irrelevant.

Quote:
- In unsigned arithmetics, should it simulate overflow in
intermediate results? If yes, should a cross compiler
simulate the data types of the target machine?

I don't see any reason why it should.

Because the standard requires it?

In practice, there isn't much "simulation" required for integral
types, unless the host doesn't support binary arithmetic. A bit
of masking here and there, and verifying that the intermediate
values don't overflow.

The problem is more difficult in the case of floating point, but
there are no contexts where the standard requires the compiler
to evaluate a floating point expression. (If the compiler
doesn't evaluate floating point expressions, it must still
determine whether the expression is a constant expression, and
if it is, ensure that the "runtime" initialization for it occurs
before any dynamic initialization.)

Quote:
The expression is not evaluated by the program being compiled
but by the program doing the compiling. As such, it does not
matter that the program happens to be a compiler, it could be
any kind of program and it would still evaluate an expression
in just one way.

I think you missed the point of his question. In a cross
compiler, the context in the compiler is different from that
during execution. Things like floating point precision and
integer size are different. One of the most important uses of
cross compilers today is for small embedded processors, where
typically, the host will have a 32 bit int, and the target a 16
bit int. So if I write something like:
char array[ 4 * 10000 / 1000 ] ;
what is the status of this expression? (Obviously, in practice,
4, 10000, and 1000 would be symbolic constants -- the expression
might look something like "sizeof(long) * maxMessageLength /
blockingFactor")

Note that the expression has type int, and that the result,
<int,40> is easily representable on any machine, and that the
intermediate value <int,40000> is representable on the compiling
machine. But *not* on the target machine. I think that §5/5
means that the program is ill-formed, and the compiler must give
a diagnostic.
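
The check described above might be sketched as follows, assuming a hypothetical target with a 16-bit two's complement int and a host with a 64-bit integer type. `Folded`, `in_target_range`, `checked_mul` and `checked_div` are made-up names for illustration, not part of any real compiler:

```cpp
#include <cstdint>

// Assumption for illustration: the target's int is 16 bits, two's complement.
constexpr std::int64_t kTargetIntMin = -32768;
constexpr std::int64_t kTargetIntMax = 32767;

struct Folded {
    bool ok;             // false: a diagnostic would be required
    std::int64_t value;  // meaningful only when ok is true
};

// Accept a host-side value only if the target's int can represent it.
constexpr Folded in_target_range(std::int64_t v) {
    return (v >= kTargetIntMin && v <= kTargetIntMax) ? Folded{true, v}
                                                      : Folded{false, 0};
}

// Operands already fit in 16 bits, so the 64-bit host product cannot overflow.
constexpr Folded checked_mul(Folded x, Folded y) {
    return (x.ok && y.ok) ? in_target_range(x.value * y.value)
                          : Folded{false, 0};
}

// Division by zero is rejected; -32768 / -1 is caught by the range check.
constexpr Folded checked_div(Folded x, Folded y) {
    return (x.ok && y.ok && y.value != 0) ? in_target_range(x.value / y.value)
                                          : Folded{false, 0};
}

// 4 * 10000 / 1000 groups as (4 * 10000) / 1000.
constexpr Folded kArrayBound = checked_div(
    checked_mul(in_target_range(4), in_target_range(10000)),
    in_target_range(1000));

// For comparison: the explicitly parenthesized 4 * (10000 / 1000).
constexpr Folded kParenthesized = checked_mul(
    in_target_range(4),
    checked_div(in_target_range(10000), in_target_range(1000)));
```

`kArrayBound.ok` is false because the intermediate 40000 does not fit in the assumed 16-bit int, even though the final value 40 would; `kParenthesized` is accepted with value 40.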

Quote:
- In signed arithmetics (where AFAIK overflow causes
implementation-dependent behaviour), should the
behaviour of the target machine be simulated?

The compiler's behavior was specified by its own source code
and the compiler that compiled it - and not by the source code
and target architecture of a program that it may be compiling.

The compiler's behavior is specified by the C++ standard. It's
up to the compiler authors to ensure that the compiler source
code and the compiler that compiled it generate what is
required. To be frank, the very idea that the functional
requirements of a program are something like: whatever the
source code and the compiler that compiles it does, shocks me.

Quote:
- Should division by zero be deferred until runtime so that
the target can handle it in it's favorite way? gcc on
Linux stunned me by considering `N/0' a non-compile-time
expression, and thus refusing to instantiate templates
with it, but it accepted `const int n = N/0;' with a
warning; a runtime reference to n caused a core dump.

A non-type template parameter such as N/0 is a compile-time
constant expression that the compiler must evaluate in order
to instantiate the template in which it appears. It is not a
runtime expression that could be evaluated by the compiled
program. Therefore gcc evaluated the expression and reported
the error, as should be expected.

A divide-by-zero operation in a constant expression
initializer means that the program is ill-formed (§5.0/5).
Gcc probably should report an error and not a warning in this
case. But since initializing a variable is a runtime
operation, there is a rationale for allowing the program to
run.

The rationale is that it is only undefined behavior if the
expression is actually evaluated at run-time. If the variable
in question is a local variable, and the block is never entered,
then there is no undefined behavior.

I think that this interpretation is due to a misreading of the C
standard. The C standard speaks of "an exceptional condition
[which] occurs during the evaluation of an expression"; if the
expression is never evaluated, then there is no problem.
However, the C standard also makes it clear that constant
expressions may be (and in certain cases must be) evaluated
at compile time, so the "undefined behavior" can occur at
compile time, even if actual execution never passes through the
initialization. Since it is undefined behavior, however, the
C standard doesn't forbid what gcc does.

The C standard doesn't forbid it; the C++ standard is clearer.
If the expression is a constant expression, the program is
ill-formed (which in turn requires a compiler diagnostic).

There is an interesting point here. If I remember the
discussions during the standardization of C, back in the 1980's,
there was a definite intent that a cross compiler not be
required to emulate the floating point arithmetic of the target
machine; the current standard even has language which strongly
suggests that the compiler may evaluate constant expressions
without emulating the target machine; §6.6/5 (of C99) says "If a
floating expression is evaluated in the translation environment,
the arithmetic precision and range shall be at least as great as
if the expression were being evaluated in the execution
environment." (Note that this means that the exact results of
the expression may depend on whether it is evaluated by the
compiler or not!) The C++ standard has nothing similar, and in
fact, requires a compiler diagnostic if the expression would
cause undefined behavior on the target machine -- if, for
example, the host used IEEE double, and the target used IBM, an
expression like 1e50*1e50/1e60 would require a diagnostic, even
though the host machine can calculate it without problems, and
the final result does not cause a problem; the intermediate
value, 1e100, is not representable in IBM double format. So
while I don't think that there is a requirement that the host
fully emulate the arithmetic, some additional checking is
required.
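
A rough sketch of such "additional checking", assuming an IEEE double on the host and a hypothetical target whose double tops out near 7.2e75 (an approximate figure for the IBM hexadecimal format, used here only as an assumption). The names `FoldedFp`, `fp_in_target_range`, `fp_mul` and `fp_div` are illustrative. Only the range is checked; precision and rounding differences are ignored:

```cpp
// Assumption for illustration: approximate largest finite double on the target.
constexpr double kAssumedTargetMax = 7.2e75;

struct FoldedFp {
    bool ok;       // false: a diagnostic would be required
    double value;  // meaningful only when ok is true
};

// Accept a host-side value only if its magnitude fits the assumed target range.
constexpr FoldedFp fp_in_target_range(double v) {
    return (v <= kAssumedTargetMax && v >= -kAssumedTargetMax)
               ? FoldedFp{true, v}
               : FoldedFp{false, 0.0};
}

constexpr FoldedFp fp_mul(FoldedFp x, FoldedFp y) {
    return (x.ok && y.ok) ? fp_in_target_range(x.value * y.value)
                          : FoldedFp{false, 0.0};
}

constexpr FoldedFp fp_div(FoldedFp x, FoldedFp y) {
    return (x.ok && y.ok && y.value != 0.0)
               ? fp_in_target_range(x.value / y.value)
               : FoldedFp{false, 0.0};
}

// 1e50 * 1e50 / 1e60: the host computes 1e100 without trouble, but the
// intermediate value is outside the assumed target range.
constexpr FoldedFp kExample = fp_div(
    fp_mul(fp_in_target_range(1e50), fp_in_target_range(1e50)),
    fp_in_target_range(1e60));
```

`kExample.ok` is false: the host can form 1e100, but the range check rejects it as an intermediate value for the assumed target.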

Quote:
So it looks like gcc is working hard to defer division by
zero until runtime and thus let the target handle it.
But shouldn't signed arithmetic overflow be handled this
way, either, by the same reasoning (it is not numerically
specified)? gcc considers expressions with overflows
constant.

I don't really see that the compiler is "working hard" with
any set agenda.

In this case, it is pretty obvious that it is. The code to
evaluate integral constant expressions at compile time must be
present. It is doubtlessly used by default, as well; if you
write something like 5 + 3, you really don't expect to see a
machine instruction for add in the generated code.

Quote:
There is little room for discretion here since the difference
between compile-time expressions and run-time expressions is
clearly spelled out.

In C, not really. C++ does go a little further.

Taking the as-if rule into effect, the difference is even more
vague.

Quote:
Any expression that is not a constant expression (which were
enumerated above) is a runtime expression that will be
evaluated by the program being compiled. And there is no
reason to expect an evaluation by the compiled program would
produce results identical to the results the compiler would
have produced, since the two programs were not in fact
compiled with the same compiler, nor were they compiled for
the same machine architecture.

The standard does not make the language definition dependent on
the machine the compiler is running on.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Greg Herlihy
Guest





Posted: Thu Dec 01, 2005 3:44 pm    Post subject: Re: How should compile time integer overflow, etc. behave?

kanze wrote:
Quote:
Greg Herlihy wrote:
yossi.kreinin (AT) gmail (DOT) com wrote:

Furthermore, a, b, and c would all have to be one of the
following: an enumerator, a non-volatile const variable, a
static data member initialized with a constant expression, or
a non-type template parameter [§5.19/1].

So while it is not impossible for all of those conditions to
be met by the expression a+b*c, it is highly unlikely that
such an expression would turn out to be a constant expression
as one would find in most C++ programs.

It happens. I think that his question was meant to be more
general, however, and integral constant expressions involving
operators are actually fairly frequent in C/C++ code.

And given that the question provides no context for the
expression, a+b*c, there is no reason to assume that it is a
constant expression as the question assumes.

That's not what he asked. He asked about what happens when it
is a constant expression. The fact that there are cases where
it isn't is irrelevant.

Given that the original question seemed to have switched the meaning of
"compile-time" and "run-time", it seemed that deciding on a common
vocabulary and describing the scope of the question actually being
asked would be at least as useful as providing an answer.

Quote:
- In unsigned arithmetics, should it simulate overflow in
intermediate results? If yes, should a cross compiler
simulate the data types of the target machine?

I don't see any reason why it should.

Because the standard requires it?

In practice, there isn't much "simulation" required for integral
types, unless the host doesn't support binary arithmetic. A bit
of masking here and there, and verifying that the intermediate
values don't overflow.

The problem is more difficult in the case of floating point, but
there are no contexts where the standard requires the compiler
to evaluate a floating point expression. (If the compiler
doesn't evaluate floating point expressions, it must still
determine whether the expression is a constant expression, and
if it is, ensure that the "runtime" initialization for it occurs
before any dynamic initialization.)

The expression is not evaluated by the program being compiled
but by the program doing the compiling. As such, it does not
matter that the program happens to be a compiler, it could be
any kind of program and it would still evaluate an expression
in just one way.

I think you missed the point of his question. In a cross
compiler, the context in the compiler is different from that
during execution. Things like floating point precision and
integer size are different. One of the most important uses of
cross compilers today is for small embedded processors, where
typically, the host will have a 32 bit int, and the target a 16
bit int. So if I write something like:
char array[ 4 * 10000 / 1000 ] ;
what is the status of this expression? (Obviously, in practice,
4, 10000, and 1000 would be symbolic constants -- the expression
might look something like "sizeof(long) * maxMessageLength /
blockingFactor")

Note that the expression has type int, and that the result,
<int,40> is easily representable on any machine, and that the
intermediate value <int,40000> is representable on the compiling
machine. But *not* on the target machine. I think that §5/5
means that the program is ill-formed, and the compiler must give
a diagnostic.

It's not clear at all that the translation phase evaluations have to
agree with execution phase evaluations, or that the array would necessarily
be ill-formed in this case. Since the maximum array size is an
implementation-defined quantity in the first place, the evaluation
of this expression would also seem to be implementation-defined. At the
very least it would depend on whether overflow is reversible on the target
machine; the compiler is allowed to evaluate the 10000/1000 portion
first, since the subexpressions are associative.

And there are certainly cases in which compile-time evaluations differ
from runtime evaluations - the preprocessor being a clear case. For
example, it's quite possible that a compiler would evaluate this
preprocessor directive:

#if 'z' - 'a' == 25

differently than the program being compiled evaluates this expression:

if ('z' - 'a' == 25)
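
A small sketch that records both evaluations so they can be compared. The names `kPreprocessorSaysContiguous`, `kExpressionSaysContiguous` and `evaluations_agree` are made up for illustration. Whether the two agree is up to the implementation; on a typical ASCII host compiling for an ASCII target they do:

```cpp
// Evaluated by the preprocessor, using whatever character values it assigns.
#if 'z' - 'a' == 25
constexpr bool kPreprocessorSaysContiguous = true;
#else
constexpr bool kPreprocessorSaysContiguous = false;
#endif

// Evaluated as an ordinary expression, using the execution character set.
constexpr bool kExpressionSaysContiguous = ('z' - 'a' == 25);

// True when both evaluations reach the same answer on this implementation.
constexpr bool evaluations_agree() {
    return kPreprocessorSaysContiguous == kExpressionSaysContiguous;
}
```

A cross compiler whose host and target use different character sets is the kind of implementation where `evaluations_agree()` could in principle be false.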

Quote:
- In signed arithmetics (where AFAIK overflow causes
implementation-dependent behaviour), should the
behaviour of the target machine be simulated?

The compiler's behavior was specified by its own source code
and the compiler that compiled it - and not by the source code
and target architecture of a program that it may be compiling.

The compiler's behavior is specified by the C++ standard. It's
up to the compiler authors to ensure that the compiler source
code and the compiler that compiled it generate what is
required. To be frank, the very idea that the functional
requirements of a program are something like: whatever the
source code and the compiler that compiles it does, shocks me.

One of the more popular lines of cross compilers is gcc's. Compilers
running on one machine, called the "host", compile
programs to run on a different kind of machine, called the "target". Now,
in theory, practically any combination of host and target should be
supported - provided that the compiler on the host faithfully respected
the implementation defined aspects of its target.

The reality is that every gcc cross compiler requires that the word
size on the host machine be equal to the word size on the target. In
other words, the cross compiler is in fact limited to compiling
programs for only those targets that sufficiently resemble whatever the
implementation defined for its compilation.

In order to create a cross compiler without these kinds of
restrictions, it would practically be necessary to compile the cross
compiler with a special compiler that builds cross compilers. This
special compiler would have to "see through" the source code that it
was compiling and know that it was building a compiler to compile
programs for an environment different than the one it would otherwise
be built for. And in fact gcc adopts a similar strategy when building
itself on a single platform: the old gcc compiler builds the new gcc
compiler, and the newly-built gcc compiler then builds itself.

Quote:
- Should division by zero be deferred until runtime so that
the target can handle it in it's favorite way? gcc on
Linux stunned me by considering `N/0' a non-compile-time
expression, and thus refusing to instantiate templates
with it, but it accepted `const int n = N/0;' with a
warning; a runtime reference to n caused a core dump.

A non-type template parameter such as N/0 is a compile-time
constant expression that the compiler must evaluate in order
to instantiate the template in which it appears. It is not a
runtime expression that could be evaluated by the compiled
program. Therefore gcc evaluated the expression and reported
the error, as should be expected.

A divide-by-zero operation in a constant expression
initializer means that the program is ill-formed (§5.0/5).
Gcc probably should report an error and not a warning in this
case. But since initializing a variable is a runtime
operation, there is a rationale for allowing the program to
run.

The rationale is that it is only undefined behavior if the
expression is actually evaluated at run-time. If the variable
in question is a local variable, and the block is never entered,
then there is no undefined behavior.

A template is instantiated only at compile time, if it is instantiated
at all. Therefore a program like the following:

template <int I>
struct Number
{
};

template <int N>
struct Number2 : public Number<N/0>
{
};

int main()
{
Number2<3> n;
}

would never cause a runtime error. Because 3/0 is undefined
mathematically, it cannot be used as the constant expression needed
to instantiate the template, so the program cannot be compiled.

Quote:
I think that this interpretation is due to a misreading of the C
standard. The C standard speaks of "an exceptional condition
[which] occurs during the evaluation of an expression"; if the
expression is never evaluated, then there is no problem.
However, the C standard also makes it clear that constant
expressions may be (and in certain cases must be) evaluated
at compile time, so the "undefined behavior" can occur at
compile time, even if actual execution never passes through the
initialization. Since it is undefined behavior, however, the
C standard doesn't forbid what gcc does.

No, the only result of a compile-time divide-by-zero operation is that
the program that necessitated the operation is ill-formed. It is only
the runtime divide-by-zero error that leads to undefined behavior on
the part of the program that performed the division. The Standard could
not specify that the compiler's behavior would be undefined, because
it governs two principal aspects of C++: the translation of - and the
execution of - a C++ program; it does not oversee the execution of the
translator - which may not even be a C++ program at all.

Quote:
Any expression that is not a constant expression (which were
enumerated above) is a runtime expression that will be
evaluated by the program being compiled. And there is no
reason to expect an evaluation by the compiled program would
produce results identical to the results the compiler would
have produced, since the two programs were not in fact
compiled with the same compiler, nor were they compiled for
the same machine architecture.

The standard does not make the language definition dependent on
the machine the compiler is running on.

Implementation-defined behavior can depend on whatever the
implementation decides. And as a practical matter, the behavior defined
by a compiler is often almost inextricably tied to the type of machine
on which the compiler was itself compiled, simply because creating a
cross compiler for which that is not the case, at least to some
extent, is a much more difficult task.

Greg


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


kanze
Guest





PostPosted: Fri Dec 02, 2005 11:53 am    Post subject: Re: How should compile time integer overflow, etc. behave? Reply with quote

Greg Herlihy wrote:
Quote:
kanze wrote:
Greg Herlihy wrote:
yossi.kreinin (AT) gmail (DOT) com wrote:

[...]
Quote:
And given that the question provides no context for the
expression, a+b*c, there is no reason to assume that it is
a constant expression as the question assumes.

That's not what he asked. He asked about what happens when
it is a constant expression. The fact that there are cases
where it isn't is irrelevant.

Given that the original question seemed to have switched the
meaning of "compile-time" and "run-time", it seemed that
deciding on a common vocabulary and describing the scope of
the question actually being asked would be at least as useful
as providing an answer.

The original question seemed in context quite clear to me, even
if it wasn't couched in the same terms used in the standard. He
was asking about "compile time expression", i.e. "constant
expressions", in standardese. He was asking with regards to a
cross compiler -- questions like "can a cross compiler use 64
bit integral data types for intermediate results of compile-time
computations?" make the context extremely clear.

[...]
Quote:
Note that the expression has type int, and that the result,
<int,40> is easily representable on any machine, and that
the intermediate value <int,40000> is representable on the
compiling machine. But *not* on the target machine. I
think that §5/5 means that the program is ill-formed, and
the compiler must give a diagnostic.

It's not clear at all that the translation phase evaluations
have to agree with execution phase evaluations,

The semantics of an expression are strictly defined, *unless*
overflow occurs. The only thing I see in the standard which
allows for a difference is that in a constant expression,
overflow makes the program ill-formed, whereas in any other
expression, it is undefined behavior.

There is another, much larger exception, which I forgot to
mention: preprocessor arithmetic; in the preprocessor, it is
explicitly allowed to not emulate the target machine (even to
the point of using 32 bit arithmetic when the target is a 64 bit
machine), see §16.1/4.

Quote:
and the array would necessarily be ill-formed in this case.

I'll admit that the requirement surprised me as well. I
previously thought undefined behavior. However, I took the
precaution of verifying in the standard before posting. It's
undefined behavior in C, but ill-formed (and thus requiring a
diagnostic) in C++.

Quote:
Since the maximum array size is an implementation-defined
quantity in the first place, then the evaluation of this
expression would also seem to be implementation-defined. At
the very least it would depend whether overflow is reversible
on the target machine; the compiler is allowed to evaluate the
100000/1000 portion first, since the subexpressions are
associative.

And where do you find this in the standard? In K&R C, the
compiler was allowed to reorder subexpressions of associative
operators. The freedom was expressly removed by the C
standardization committee; it only remains under the as-if rule.
Thus, for example, if I write an expression such as
10000/1000*4, I am guaranteed that there will be no overflow,
even on a 16 bit machine.

Actually, I don't think that even K&R C would have allowed the
reordering here, since integer division is NOT associative with
multiplication: (3*1)/3 is not the same thing as 3*(1/3).
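
A minimal sketch of the grouping point, assuming a compiler with C++11 constexpr and static_assert (both newer than this thread); the variable names are invented for illustration. Integer division truncates, so regrouping a mixed multiply/divide expression changes its value, and the left-to-right grouping of 10000/1000*4 keeps the intermediate result small:

```cpp
// Integer division truncates toward zero, so * and / cannot be regrouped freely.
constexpr int grouped_left  = (3 * 1) / 3;  // 3 / 3 == 1
constexpr int grouped_right = 3 * (1 / 3);  // 3 * 0 == 0

// The grammar makes 10000/1000*4 mean (10000/1000)*4: the largest
// intermediate value is 10000, which fits even in a 16-bit int.
constexpr int as_written = 10000 / 1000 * 4;

static_assert(grouped_left == 1, "left grouping keeps the exact quotient");
static_assert(grouped_right == 0, "right grouping truncates 1/3 to 0 first");
static_assert(as_written == 40, "evaluated as (10000/1000)*4");
```

If a compiler were free to regroup, the second form would silently yield 0, which is why such a rewrite is only allowed when the as-if rule guarantees the same result.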

Quote:
And there is certainly cases in which compile-time evaluations
differ from runtime evaluations - the preprocessor being a
clear case.

Yes. Because there is an explicit exemption for the
preprocessor, which only evaluates expressions in the context of
conditional inclusion, and which has a different definition as
to what can be part of a constant expression as well.
Preprocessor arithmetic is a completely separate case, defined
in §16.1/4.

Part of the original design of C was that the preprocessor could
be a separate process, with no knowledge of the target
machine. The standard is carefully designed to make this
possible.

[...]
Quote:
The compiler's behavior is specified by the C++ standard.
It's up to the compiler authors to ensure that the compiler
source code and the compiler that compiled it generate what
is required. To be frank, the very idea that the functional
requirements of a program are something like: whatever the
source code and the compiler that compiles it does, shocks
me.

One of the more popular lines of cross compilers is the gcc
cross compilers. Compilers running on one machine, called the
"host", compile programs to run on a different kind of
machine, called the "target". Now in theory, practically any
combination of host and target should be supported - provided
that the compiler on the host faithfully respected the
implementation defined aspects of its target.

The reality is that every gcc cross compiler requires that the
word size on the host machine be equal to the word size on the
target. In other words, the cross compiler is in fact limited
to compiling programs for only those targets that sufficiently
resemble whatever the implementation defined for its
compilation.

Fine. If that's a restriction on gcc, that's a restriction on
gcc. There are definitely cross compilers for C which don't have
this restriction, and I'm fairly certain that they exist as well
for C++ -- I suspect that Green Hills would have them, and
probably some other suppliers as well.

Quote:
In order to create a cross compiler without these kinds of
restrictions, it would practically be necessary to compile the
cross compiler with a special compiler that builds cross
compilers.

Why? All that's needed is that the compiler know the limits of
the target machine. And it needs to know them anyway, more or
less -- at the very least, it needs to know the size and the
alignment restriction of the different types in order to
correctly lay out class types. It doesn't seem too much of a
problem for it to know the max and min values for each type, and
do correct bounds checking on all constant arithmetic.

Of course, if the host has a smaller word size than the target,
the compiler will have to use some sort of BigInteger class
internally, instead of long.
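
A rough sketch of that idea, not taken from any real compiler: the host folds constants in a type wider than the target's int and then range-checks the result against the target's limits. The 16-bit target, the limits and every name below (Folded, fits_target_int, fold_mul, fold_div) are assumptions made up for illustration, and the code assumes C++11 constexpr:

```cpp
#include <cstdint>

// Hypothetical target: a 16-bit two's complement int. A real cross
// compiler would read these limits from its target description.
constexpr std::int64_t kTargetIntMin = -32768;
constexpr std::int64_t kTargetIntMax = 32767;

// Result of folding one operation: `ok` is false when the value cannot be
// represented in the target's int, or when the divisor was zero.
struct Folded {
    bool ok;
    std::int64_t value;
};

constexpr bool fits_target_int(std::int64_t v) {
    return v >= kTargetIntMin && v <= kTargetIntMax;
}

// Both operands are assumed to already fit the 16-bit target int, so the
// 64-bit host product cannot itself overflow.
constexpr Folded fold_mul(std::int64_t a, std::int64_t b) {
    return Folded{fits_target_int(a * b), a * b};
}

// A zero divisor is reported as a failure instead of being evaluated.
constexpr Folded fold_div(std::int64_t a, std::int64_t b) {
    return b == 0 ? Folded{false, 0}
                  : Folded{fits_target_int(a / b), a / b};
}

static_assert(!fold_mul(100, 400).ok, "40000 does not fit a 16-bit int");
static_assert(fold_div(10000, 1000).value == 10, "10000/1000 folds to 10");
static_assert(!fold_div(1, 0).ok, "a zero divisor is rejected");
```

A compiler built along these lines would issue its diagnostic wherever ok comes back false in a constant expression, whatever the word size of the host.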

[...]
Quote:
A divide-by-zero operation in a constant expression
initializer means that the program is ill-formed (§5.0/5).
Gcc probably should report an error and not a warning in
this case. But since initializing a variable is a runtime
operation, there is a rationale for allowing the program
to run.

The rationale is that it is only undefined behavior if the
expression is actually evaluated at run-time. If the
variable in question is a local variable, and the block is
never entered, then there is no undefined behavior.

A template is instantiated only at compile time, if it is
instantiated at all.

The issue doesn't really concern templates, any more than
anything else. In C++, a division by zero in a constant
expression makes the program ill formed, regardless of what the
expression is used for. In C, it makes the program undefined;
historically, there has been an argument that the undefined
behavior doesn't occur unless the expression is actually
evaluated, but I don't think that that's the case in C, either,
since const expressions are (or can be) evaluated by the
compiler.

In practice, of course, there's not much a compiler could do in
the case of instantiating a template to defer detection to
run-time, even if the standard allowed it. But the point here
is more what the standard requires, rather than implementation
problems or details.

Quote:
I think that this interpretation is due to a misreading of
the C standard. The C standard speaks of "an exceptional
condition [which] occurs during the evaluation of an
expression"; if the expression is never evaluated, then
there is no problem. However, the C standard also makes it
clear that constant expressions may be (and in certain cases
must be) evaluated at compile time, so the "undefined
behavior" can occur at compile time, even if actual
execution never passes through the initialization. Since it
is undefined behavior, however, the C standard doesn't
forbid what gcc does.

No, the only result of a compile-time divide-by-zero operation
is that the program that necessitated the operation is
ill-formed.

That's true in C++, but not in C.

Quote:
It is only the runtime divide-by-zero error that leads to
undefined behavior on the part of the program that performed
the division.

That's true in C++, but not in C.

In the comment you are trying to refute, I explicitly said that
I was explaining why some compilers behaved as they did. I also
stated that I didn't think it necessary, even in C, and that it
wasn't even legal in C++.

Quote:
The Standard could not specify that the compiler's behavior
would be undefined, because it governs two principal aspects
of C++: the translation of - and the execution of - a C++
program; it does not oversee the execution of the translator -
which may not even be a C++ program at all.

The standard does specify that certain specific constructions
result in undefined behavior. It also states that in the face
of undefined behavior, the compiler may refuse to compile the
program.

Many cases of undefined behavior depend on actual program
behavior: it is undefined behavior to fall off the end of a
non-void function, for example. In such cases, the compiler can
refuse to compile the program only if it can prove that every
possible execution actually does fall off the end.

Other cases of undefined behavior are purely compile-time: if
you use the operator<< on an ostream without including
<ostream>, for example. In practice, the
code will actually work if you've included <iostream>; on some
compilers, it will work if you've only included <string>. And I
know of no compiler where it will even compile if you've not
included anything. But according to the standard, it is always
undefined behavior. Even if, in practice, it is purely a
compile time problem. (I can't imagine an implementation which
would compile, but then core dump when you actually executed the
statement.)

Quote:
Any expression that is not a constant expression (which
were enumerated above) is a runtime expression that will
be evaluated by the program being compiled. And there is
no reason to expect an evaluation by the compiled program
would produce results identical to the results the
compiler would have produced, since the two programs were
not in fact compiled with the same compiler, nor were they
compiled for the same machine architecture.

The standard does not make the language definition dependent
on the machine the compiler is running on.

Implementation-defined behavior can depend on whatever the
implementation decides.

True enough. But we're not talking about implementation defined
behavior here; the implementation defines the range of the
different data types, but they must have only one range, not one
range in constant expressions, and another when executed at run
time.
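
To illustrate the "only one range" point for the unsigned case from the original question, here is a small sketch assuming C++11 constexpr; the function and variable names are invented. The same expression is evaluated once by the compiler and once at run time, and both must wrap modulo 2^N for the same N:

```cpp
#include <cassert>
#include <climits>

// Unsigned arithmetic is defined modulo 2^N, where N is the width of the type.
constexpr unsigned add_unsigned(unsigned a, unsigned b) { return a + b; }

// Initializing a constexpr variable forces evaluation by the compiler.
constexpr unsigned folded_sum = add_unsigned(UINT_MAX, 2u);
static_assert(folded_sum == 1u, "UINT_MAX + 2 wraps to 1 at compile time");

// Evaluate the same expression at run time; the volatile copies keep the
// compiler from treating the operands as known constants.
inline unsigned runtime_sum() {
    volatile unsigned a = UINT_MAX;
    volatile unsigned b = 2u;
    return add_unsigned(a, b);
}

// Hypothetical check: the run-time result must match the folded constant.
inline void check_agreement() {
    assert(runtime_sum() == folded_sum);
}
```

On a cross compiler the interesting case is when host and target disagree about the width of unsigned; the folded value has to follow the target's width.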

Quote:
And as a practical matter, the behavior defined by a compiler
is often almost inextricably tied to the type of machine on
which the compiler was itself compiled

Are you kidding? The behavior defined by the compiler is defined
by the source code of the compiler. And it's quite possible to
write portable C or C++ code. As a practical matter, the
requirements of the standard here aren't even particularly
difficult to implement in a portable manner.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Ben Hutchings
Guest





PostPosted: Sat Dec 03, 2005 11:31 am    Post subject: Re: How should compile time integer overflow, etc. behave? Reply with quote

James Kanze <kanze (AT) gabi-soft (DOT) fr> wrote:
Quote:
Greg Herlihy wrote:
snip
And there is certainly cases in which compile-time evaluations
differ from runtime evaluations - the preprocessor being a
clear case.

Yes. Because there is an explicit exemption for the
preprocessor, which only evaluates expressions in the context of
conditional inclusion, and which has a different definition as
to what can be part of a constant expression as well.
Preprocessor arithmetic is a completely separate case, defined
in §16.1/4.

Part of the original design of C was that the preprocessor could
be a separate process, with no knowledge of the target
machine. The standard is carefully designed to make this
possible.
snip


This used to be true, but in C99 the preprocessor is required to
implement arithmetic using intmax_t (which must be at least 64-bit).
I'm not aware of any C or C++ implementation still in development that
uses a separate preprocessor, though.
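
A small sketch of what that rule means in practice, assuming a preprocessor that follows C99 (or C++11, which adopted the same rule) and evaluates #if in intmax_t; the macro and variable names are made up. The sum below would overflow a 32-bit int in an ordinary constant expression, but in a #if it is computed in the wider type:

```cpp
// In #if, every signed integer operand acts as if it had type intmax_t,
// which is at least 64 bits wide, so this sum does not overflow.
#if 2147483647 + 1 > 0
#define PP_SUM_IS_POSITIVE 1  // the sum was computed in the wider type
#else
#define PP_SUM_IS_POSITIVE 0
#endif

// Hypothetical flag exposing the preprocessor's answer to ordinary code.
constexpr bool kPreprocessorUsedWideArithmetic = (PP_SUM_IS_POSITIVE == 1);
```

The result says nothing about the width of int on the target, which is exactly why preprocessor arithmetic and ordinary constant expressions have to be treated as separate cases.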

Ben.

--
Ben Hutchings
Having problems with C++ templates? Your questions may be answered by
<http://womble.decadentplace.org.uk/c++/template-faq.html>.

neil@daikokuya.co.uk
Guest





PostPosted: Fri Dec 09, 2005 11:06 am    Post subject: Re: How should compile time integer overflow, etc. behave? Reply with quote

Greg Herlihy wrote:

Quote:
The reality is that every gcc cross compiler requires that the word
size on the host machine be equal to the word size on the target. In
other words, the cross compiler is in fact limited to compiling
programs for only those targets that sufficiently resemble whatever the
implementation defined for its compilation.

This is simply untrue; please check your facts.

Quote:
In order to create a cross compiler without these kinds of
restrictions, it would practically be necessary to compile the cross
compiler with a special compiler that builds cross compilers. This
special compiler would have to "see through" the source code that it
was compiling and know that it was building a compiler to compile
programs for an environment different than the one it would otherwise
be built for. And in fact gcc adopts a similar strategy when building
itself on a single platform: the old gcc compiler builds the new gcc
compiler and the newly built gcc compiler then builds itself.

This is not true either.

Neil.

