Out-of-bounds nonsense
Frederick Gotham
Posted: Wed Nov 01, 2006 10:10 am    Post subject: Out-of-bounds nonsense



[ This post deals with both C and C++, but does not alienate either
language because the language feature being discussed is common to both
languages. ]

Over on comp.lang.c, we've been discussing the accessing of array elements
via subscript indices which may appear to be out of range. In particular,
accesses similar to the following:

int arr[2][2];

arr[0][3] = 7;

Both the C Standard and the C++ Standard require that the four ints be
laid out in memory in ascending order with no padding in between, i.e.:

(best viewed with a monowidth font)

--------------------------------
| Memory Address |   Object    |
--------------------------------
|       0        |  arr[0][0]  |
|       1        |  arr[0][1]  |
|       2        |  arr[1][0]  |
|       3        |  arr[1][1]  |
--------------------------------

One can see plainly that there should be no problem with the little snippet
above because arr[0][3] should be the same as arr[1][1], but I've had
people over on comp.lang.c telling me that the behaviour of the snippet is
undefined because of an "out of bounds" array access. They've even backed
this up with a quote from the C Standard:

J.2 Undefined behavior:
The behavior is undefined in the following circumstances:
[...]
- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).
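
The phrase "apparently accessible" can be made concrete with plain integer arithmetic, no pointers involved. The sketch below uses a hypothetical helper, `apparent_element`, which does not come from the thread. It only reports which in-range element shares the flat offset of an out-of-range subscript pair; it says nothing about whether the access itself is defined.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper (not from the thread): given the column count of a
// row-major array and a subscript pair (i, j) whose j may exceed the row
// length, return the in-range pair that occupies the same flat offset.
std::pair<std::size_t, std::size_t>
apparent_element(std::size_t cols, std::size_t i, std::size_t j)
{
    const std::size_t flat = i * cols + j;   // offset in elements from a[0][0]
    return std::make_pair(flat / cols, flat % cols);
}
```

For `int a[4][5]`, `apparent_element(5, 1, 7)` yields (2, 2), so `a[1][7]` shares its offset with `a[2][2]`. For `int arr[2][2]`, the pair (0, 3) maps to (1, 1).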

Does anyone make the same claim of undefined behaviour for C++?

If it is claimed that the snippet's behaviour is undefined because the
second subscript index is out of range of the dimension, then this
rationale can be brought into doubt by the following breakdown. First let's
look at the expression statement:

arr[0][3] = 9;

The compiler, both in C and in C++, must interpret this as:

*( *(arr+0) + 3 ) = 9;

In the inner-most set of parentheses, "arr" decays to a pointer to its
first element, i.e. an R-value of the type int(*)[2]. The value 0 is then
added to this address, which has no effect. The address is then
dereferenced, yielding an L-value of the type int[2]. This expression then
decays to a pointer to its first element, yielding an R-value of the type
int*. The value 3 is then added to this address. (In terms of bytes, it's p
+= 3 * sizeof(int)). This address is then dereferenced, yielding an L-value
of the type int. The L-value int is then assigned to.

The only thing that sounds a little dodgy in the above paragraph is that an
L-value of the type int[2] is used as a stepping stone to access an element
whose index is greater than 1 -- but this shouldn't be a problem, because
the L-value decays to a simple R-value int pointer prior to the accessing
of the int object, so any dimension info should be lost by then.
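
The chain of conversions described above can be checked mechanically. What follows is a sketch that assumes a C++11 compiler (`decltype`, `static_assert` and `<type_traits>` postdate this thread). It deliberately uses the in-range index 1 rather than 3, so nothing in it depends on the disputed access.

```cpp
#include <type_traits>

int arr[2][2];

// Each step of *( *(arr+0) + 1 ), with the type the language assigns to it.
static_assert(std::is_same<decltype(arr + 0), int (*)[2]>::value,
              "arr decays to a pointer to its first row");
static_assert(std::is_same<decltype(*(arr + 0)), int (&)[2]>::value,
              "dereferencing yields an lvalue of type int[2]");
static_assert(std::is_same<decltype(*(arr + 0) + 1), int*>::value,
              "the row decays to int* before the offset is added");
static_assert(std::is_same<decltype(*(*(arr + 0) + 1)), int&>::value,
              "the final dereference yields an int lvalue");

// In-range check: the subscript form and the pointer form name one object.
bool same_object() { return &arr[0][1] == &*(*(arr + 0) + 1); }
```

The assertions confirm only the static types. Whether an implementation may also track the bounds of the row behind that `int*` is a separate question that the types alone do not answer.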

To the C++ programmers: Is the snippet viewed as invoking undefined
behaviour? If so, why?

To the C programmers: How can you rationalise the assertion that it
actually does invoke undefined behaviour?

I'd like to remind both camps that, in other places, we're free to use our
memory however we please (given that it's suitably aligned, of course). For
instance, look at the following. The code is an absolute dog's dinner, but
it should work perfectly on all implementations:

/* Assume the inclusion of all necessary headers */

void Output(int); /* Defined elsewhere */

int main(void)
{
    assert( sizeof(double) > sizeof(int) );

    { /* Start */

        double *p;
        int *q;
        char unsigned const *pover;
        char unsigned const *ptr;

        p = malloc(5 * sizeof*p);
        q = (int*)p++;
        pover = (char unsigned*)(p+4);
        ptr = (char unsigned*)p;
        p[3] = 2423.234;
        *q++ = -9;

        do Output(*ptr++);
        while (pover != ptr);

        return 0;

    } /* End */
}

Another thing I would remind both camps of, is that we can access any
memory as if it were simply an array of unsigned char's. That means we can
access an "int[2][2]" as if it were simply an object of the type "char
unsigned[sizeof(int[2][2])]".

The reason I'm writing this is that, at the moment, it sounds like absolute
nonsense to me that the original snippet's behaviour is undefined, and so I
challenge those who support its alleged undefinedness.

I leave you with this:

int arr[2][2];

void *const pv = &arr;

int *const pi = (int*)pv; /* Cast used for C++ programmers! */

pi[3] = 8;

--

Frederick Gotham

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Frederick Gotham
Posted: Wed Nov 01, 2006 11:33 pm    Post subject: Re: Out-of-bounds nonsense



Carlos Moreno:

Quote:
In both cases --- a compiler is free to perform bounds-checking and
place code that throws an exception when an out-of-bounds occurs.


It _shouldn't_ be allowed to. Given the following object:

int arr[2][2];

, the following snippets should all behave identically:

(1)
arr[0][3] = 8;

(2)
*( *(arr+0) + 3 ) = 8;

(3)
*(*arr + 3) = 8;

(4)

*(int*)( (char*)*arr + 3*sizeof(int) ) = 8;

--

Frederick Gotham

Carlos Moreno
Posted: Thu Nov 02, 2006 3:54 am    Post subject: Re: Out-of-bounds nonsense



Frederick Gotham wrote:
Quote:
Carlos Moreno:


In both cases --- a compiler is free to perform bounds-checking and
place code that throws an exception when an out-of-bounds occurs.



It _shouldn't_ be allowed to. Given the following object:

int arr[2][2];

, the following snippets should all behave identically:

(1)
arr[0][3] = 8;

I don't know if they *should* --- and perhaps *they do* in practice,
with any version of any existing compiler on any existing platform;
that does not contradict the fact that it is undefined behaviour.

If you need to do tricks that rely on this feature, you have two
options:

1) Make sure it does work on your platform, under any operating
conditions.

2) Come up with some other idea that is *guaranteed* to work
on a conformant compiler.

That's perhaps the key difference --- your code *works* on any
compiler on any OS I can think of. But it is not *guaranteed*
to work; so, at best, it is a time-bomb; a month from now you
could port it to a newer version of the compiler, and it will
mysteriously stop working (could be obvious, could be a few
days or few weeks worth of painful debugging).

If you need to access arr[0][3] or arr[1][1], why not create
an array of 4 ints and do the subscript arithmetic yourself?
You could even encapsulate that into a class that handles it
transparently (hopefully, you'll use something *guaranteed*
to work, and not simply int arr[2][2] and map the subscripts
directly ;-) )
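
A minimal sketch of such a class might look like the following. The name `Grid` and its members are illustrative assumptions, not code from the thread. It stores one flat array and does the subscript arithmetic explicitly, so every access stays inside a single array object.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical wrapper (names are illustrative, not from the thread):
// one flat array of Rows * Cols ints, with the index arithmetic done by hand.
template <std::size_t Rows, std::size_t Cols>
class Grid
{
public:
    Grid() : data_() {}                       // value-initialise: all zeros

    // Two-dimensional access, checked against both dimensions.
    int& at(std::size_t r, std::size_t c)
    {
        if (r >= Rows || c >= Cols) throw std::out_of_range("Grid::at");
        return data_[r * Cols + c];
    }

    // Flat access, checked against the total element count.
    int& flat(std::size_t k)
    {
        if (k >= Rows * Cols) throw std::out_of_range("Grid::flat");
        return data_[k];
    }

private:
    int data_[Rows * Cols];
};
```

With `Grid<2, 2> g;`, writing through `g.flat(3)` and reading through `g.at(1, 1)` refer to the same element, while `g.at(0, 3)` is rejected instead of silently aliasing it.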

Carlos
James Kanze
Posted: Thu Nov 02, 2006 3:55 am    Post subject: Re: Out-of-bounds nonsense

Frederick Gotham wrote:
Quote:
Over on comp.lang.c, we've been discussing the accessing of array elements
via subscript indices which may appear to be out of range. In particular,
accesses similar to the following:

int arr[2][2];

arr[0][3] = 7;

Undefined behavior. In both C and C++. Since at least C90
(which as far as I know, introduced the concept).

Quote:
Both the C Standard and the C++ Standard require that the four ints be
laid out in memory in ascending order with no padding in between, i.e.:

(best viewed with a monowidth font)

--------------------------------
| Memory Address |   Object    |
--------------------------------
|       0        |  arr[0][0]  |
|       1        |  arr[0][1]  |
|       2        |  arr[1][0]  |
|       3        |  arr[1][1]  |
--------------------------------

And how is the memory layout relevant?

Quote:
One can see plainly that there should be no problem with the little snippet
above because arr[0][3] should be the same as arr[1][1],

No. Since arr[0][3] doesn't exist in a legal program, it
doesn't refer to anything. With at least one compiler I've
heard of, it would cause an assertion failure.

Quote:
but I've had
people over on comp.lang.c telling me that the behaviour of the snippet is
undefined because of an "out of bounds" array access. They've even backed
this up with a quote from the C Standard:

J.2 Undefined behavior:
The behavior is undefined in the following circumstances:
[...]
- An array subscript is out of range, even if an object is apparently
accessible with the given subscript (as in the lvalue expression
a[1][7] given the declaration int a[4][5]) (6.5.6).

Does anyone make the same claim of undefined behaviour for C++?

They're part of the standard. At no time was it even considered
that C style arrays would work differently in C++.

Quote:
If it is claimed that the snippet's behaviour is undefined because the
second subscript index is out of range of the dimension,

It's not "claimed". It's the standard. The reason there is an
explicit example in C99 is because people tried to reason like
you in C90. The C committee has explicitly ruled that this
reasoning is fallacious. Given an explicit ruling by the
committee, I'd say that the question is closed.

Quote:
then this
rationale can be brought into doubt by the following breakdown. First let's
look at the expression statement:

arr[0][3] = 9;

The compiler, both in C and in C++, must interpret this as:

*( *(arr+0) + 3 ) = 9;

In the inner-most set of parentheses, "arr" decays to a pointer to its
first element, i.e. an R-value of the type int(*)[2]. The value 0 is then
added to this address, which has no effect. The address is then
dereferenced, yielding an L-value of the type int[2]. This expression then
decays to a pointer to its first element, yielding an R-value of the type
int*. The value 3 is then added to this address.

Which is undefined behavior. You cannot add three to a pointer
to the first element of an int[2].

Quote:
(In terms of bytes, it's p
+= 3 * sizeof(int)). This address is then dereferenced, yielding an L-value
of the type int. The L-value int is then assigned to.

The only thing that sounds a little dodgy in the above paragraph is that an
L-value of the type int[2] is used as a stepping stone to access an element
whose index is greater than 1 -- but this shouldn't be a problem, because
the L-value decays to a simple R-value int pointer prior to the accessing
of the int object, so any dimension info should be lost by then.

Who says dimension information will be lost? The wording of the
C standard was carefully crafted intentionally to allow
implementations in which pointers carried bounds information.
At least one compiler (Centerline) used this technique in a
debugging implementation.

Quote:
To the C++ programmers: Is the snippet viewed as invoking undefined
behaviour? If so, why?

It's clearly undefined behavior. If by why, you mean why the
standard says it's undefined behavior, the reason is clear:
compatibility with C. And the reason why C says it is undefined
behavior is precisely to allow checking implementations.

Quote:
To the C programmers: How can you rationalise the assertion that it
actually does invoke undefined behaviour?

The standards committee has said so, explicitly. If the
standard says that X is undefined behavior, it is undefined
behavior. If you don't think it should be, write up a proposal
to change it, but until the committee has accepted your
proposal, there's not much room for argument here.

Quote:
I'd like to remind both camps that, in other places, we're free to use our
memory however we please (given that it's suitably aligned, of course).

Such as? The C++ standard (and to a lesser degree the C
standard), is fairly strict in what it allows.

Quote:
For
instance, look at the following. The code is an absolute dog's dinner, but
it should work perfectly on all implementations:

/* Assume the inclusion of all necessary headers */

void Output(int); /* Defined elsewhere */

int main(void)
{
assert( sizeof(double) > sizeof(int) );

{ /* Start */

double *p;
int *q;
char unsigned const *pover;
char unsigned const *ptr;

p = malloc(5 * sizeof*p);
q = (int*)p++;

pover = (char unsigned*)(p+4);
ptr = (char unsigned*)p;
p[3] = 2423.234;
*q++ = -9;

Undefined behavior. About the only thing you can legally do
with q here is cast it back to a double*, and even that
isn't guaranteed to work (since in theory at least, an int*
could be smaller than a double*, and information could have been
lost).

Quote:
do Output(*ptr++);
while (pover != ptr);

Note that this is only legal insofar as ptr is an unsigned char*
(or a char*, in C++). There is a special exception to the
typing rules that allows you to access any type as if it were an
array of unsigned char (or char in C++). Thus, for example, in
your initial example of int[2][2], you can iterate through the
array with:

unsigned char* const bytes = reinterpret_cast< unsigned char* >( &arr ) ;
for ( unsigned char* p = bytes ; p != bytes + sizeof( arr ) ; ++ p ) {
// ...
}

But this special exception only concerns unsigned char (and char
in C++).
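
As an illustration of what this exception permits, the sketch below reads the k-th int of a two-dimensional array by copying bytes out of its object representation, rather than by indexing past the end of a row. The helper name `read_flat` is hypothetical. The sketch assumes, as the discussion here does, that the byte-access exception covers the whole array object, and that the caller keeps k below Rows * Cols.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: fetch the k-th int of a 2-D array through its bytes.
// Precondition (not checked here): k < Rows * Cols.
template <std::size_t Rows, std::size_t Cols>
int read_flat(const int (&arr)[Rows][Cols], std::size_t k)
{
    // View the whole array object as a sequence of unsigned char.
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&arr);

    int value;
    std::memcpy(&value, bytes + k * sizeof(int), sizeof value);
    return value;
}
```

Only `unsigned char` pointer arithmetic and `std::memcpy` are involved, so no `int*` is ever moved beyond the row it came from.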

Quote:
return 0;

} /* End */
}

Another thing I would remind both camps of, is that we can access any
memory as if it were simply an array of unsigned char's. That means we can
access an "int[2][2]" as if it were simply an object of the type "char
unsigned[sizeof(int[2][2])]".

Yes. That is a special exception.

Quote:
The reason I'm writing this is that, at the moment, it sounds like absolute
nonsense to me that the original snippet's behaviour is undefined, and so I
challenge those who support its alleged undefinedness.

Whether you consider it nonsense or not, the C committee has
explicitly ruled on the issue. I don't know what further
support you need than an explicit statement in the standard that
it is undefined.

Quote:
I leave you with this:

int arr[2][2];

void *const pv = &arr;

int *const pi = (int*)pv; /* Cast used for C++ programmers! */

pi[3] = 8;

Undefined behavior. In C++, this is a static_cast; the only
thing you can legally do with a void* in C++ is cast it back to
the original type (except, of course, copying and assigning it).
In this code, the original type is int (*)[2], and you convert
it back to int*. This is no more legal than if you cast a
double* to void*, then to int*, and expected it to work.

--
James Kanze (Gabi Software) email: james.kanze (AT) gmail (DOT) com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


James Kanze
Posted: Thu Nov 02, 2006 3:56 am    Post subject: Re: Out-of-bounds nonsense

Frederick Gotham wrote:
Quote:
Carlos Moreno:

In both cases --- a compiler is free to perform bounds-checking and
place code that throws an exception when an out-of-bounds occurs.

It _shouldn't_ be allowed to.

That's your opinion. The standards are very clear that it
currently is. If you don't think it should be allowed to,
you'll have to write up a proposition to change the standard.
(Given that the issue was extensively discussed in the C
standard committee, and that the current status represents an
explicit decision on the part of that committee, I rather doubt
that such a proposition will have much chance of success.)

Compilers have done this sort of checking in the past.

Compilers can also use the fact in optimization. If the
compiler sees that you've initialized a pointer with &arr[0][0],
it knows that nothing in arr[1] will be accessed through this
pointer.

Quote:
Given the following object:

int arr[2][2];

, the following snippets should all behave identically:

(1)
arr[0][3] = 8;

(2)
*( *(arr+0) + 3 ) = 8;

(3)
*(*arr + 3) = 8;

(4)

*(int*)( (char*)*arr + 3*sizeof(int) ) = 8;

Well, they all have the same status according to the standard:
undefined behavior. But there's nothing in the standard to
guarantee that all instances of undefined behavior will behave
the same. In practice, I can easily imagine an optimizer
treating them differently.

--
James Kanze (Gabi Software) email: james.kanze (AT) gmail (DOT) com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Seungbeom Kim
Posted: Fri Nov 03, 2006 3:23 am    Post subject: Re: Out-of-bounds nonsense

Carlos Moreno wrote:
Quote:

What I'm trying to say: with the code that you posted, yes, you're
attempting to access contiguous memory --- the problem is that the
whole setup involves *other things* in addition to just accessing
contiguous memory; it is those other things the ones that cause
the trouble.

Not only because it *can* cause trouble, but also to *allow* an
implementation to cause trouble deliberately (i.e. to signal an error
if an out-of-bounds access is detected).

If I have a matrix A[3][3] and a bug in my algorithm that attempts
to access A[0][4] just works fine with accessing A[1][1] instead,
I won't be very happy about it; in other words, I will be very happy
to use an implementation that detects and signals the error.
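
A library-level approximation of that behaviour can be sketched with `std::array::at`, which throws `std::out_of_range` on a bad index. This assumes C++11 (`std::array` postdates the thread), and it is a checked container, not the compiler-inserted checking discussed above. The names `Matrix3` and `access_is_rejected` are illustrative.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

typedef std::array<std::array<int, 3>, 3> Matrix3;

// Returns true if reading m[r][c] through the checked accessor is rejected.
bool access_is_rejected(const Matrix3& m, std::size_t r, std::size_t c)
{
    try {
        (void)m.at(r).at(c);   // at() throws std::out_of_range on a bad index
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Here the buggy access (0, 4) is reported as an error, while the intended access (1, 1) goes through.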

--
Seungbeom Kim

James Kanze
Posted: Fri Nov 03, 2006 5:23 pm    Post subject: Re: Out-of-bounds nonsense

Frederick Gotham wrote:
Quote:
James Kanze:

If the compiler sees that you've initialized a pointer with &arr[0][0],
it knows that nothing in arr[1] will be accessed through this pointer

But there's no such thing as &arr[0][0] -- it's just a pretty way of
writing:

&*(*(arr+0) + 0)

And what does that change?

Quote:
The type information is lost when it finally boils down to an R-value int*.

Formal type information, perhaps. But the compiler still knows
where you got that int* from, and knows that in no case can you
add more than 2 to it.

Quote:
Ah... I just don't see why nobody else can see that it's
utterly ridiculous that we can't access contiguous memory.

Perhaps because that's what a typed language is all about.
Whether the memory is contiguous or not is completely
irrelevant.

--
James Kanze (GABI Software) email:james.kanze (AT) gmail (DOT) com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Francis Glassborow
Posted: Fri Nov 03, 2006 6:50 pm    Post subject: Re: Out-of-bounds nonsense

In article <1162549447.427729.77070 (AT) m73g2000cwd (DOT) googlegroups.com>,
AlbertSSj <alberto.ing (AT) gmail (DOT) com> writes
Quote:
The whole discussion raised a question to me.

I've seen a lot of this

int a[N][M];

memset(&a, 0, sizeof(a))
or
memset(&a, 0, sizeof(int) * N * M)

Then you have seen a lot of code written by people who do not understand
what the language provides:

int a[d1][d2] = {0};
not least because that solution generalises:

double b[d1][d2] = {0.0};
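
As a quick check of that claim, the sketch below initialises two small arrays this way and verifies that every element is zero. The dimensions 3 and 4 are arbitrary stand-ins for d1 and d2, and the function name is illustrative.

```cpp
#include <cstddef>

// Returns true if every element of the freshly initialised arrays is zero.
bool braces_zero_everything()
{
    int    a[3][4] = {0};     // first element given, the rest zero-filled
    double b[3][4] = {0.0};   // same rule, so every double is 0.0

    for (std::size_t i = 0; i < 3; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            if (a[i][j] != 0 || b[i][j] != 0.0) return false;
    return true;
}
```

Elements without an explicit initialiser are set to zero of their own type, which is why the same form works for double where an all-bits-zero memset carries no such guarantee.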

Quote:

so I assumed that also this was acceptable:

std::fill(&a[0][0],&a[N][M], 0)

How bad is all this?
No comment :-)



--
Francis Glassborow ACCU
Author of 'You Can Do It!' and "You Can Program in C++"
see http://www.spellen.org/youcandoit
For project ideas and contributions:
http://www.spellen.org/youcandoit/projects


Gerhard Menzl
Posted: Fri Nov 03, 2006 11:00 pm    Post subject: Re: Out-of-bounds nonsense

AlbertSSj wrote:

Quote:
The whole discussion raised a question to me.

I've seen a lot of this

int a[N][M];

memset(&a, 0, sizeof(a))
or
memset(&a, 0, sizeof(int) * N * M)

so I assumed that also this was acceptable:

std::fill(&a[0][0],&a[N][M], 0)

How bad is all this?

Very bad, in that std::fill expects a valid end iterator, in this case a
pointer to one past the last element, but &a[N][M] points M elements
beyond that.
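
One way to stay within valid ranges is to fill one row at a time, so that each [first, last) pair lies inside a single row. The helper below is a sketch with a hypothetical name, `fill_2d`, not something proposed in the thread.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: assign `value` to every element of a 2-D array,
// one row per std::fill call.
template <typename T, std::size_t N, std::size_t M>
void fill_2d(T (&a)[N][M], const T& value)
{
    for (std::size_t i = 0; i < N; ++i)
        std::fill(a[i], a[i] + M, value);   // a[i] + M is one past row i
}
```

Called as `fill_2d(a, 0)` for an int array or `fill_2d(d, 0.0)` for a double array, it assigns values rather than bytes, so it does not rely on all-bits-zero representing zero.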

--
Gerhard Menzl

Non-spammers may respond to my email address, which is composed of my
full name, separated by a dot, followed by at, followed by "fwz",
followed by a dot, followed by "aero".



AlbertSSj
Posted: Fri Nov 03, 2006 11:08 pm    Post subject: Re: Out-of-bounds nonsense

Francis Glassborow wrote:
Quote:
Then you have seen a lot of code written by people who do not understand
what the language provides:

They probably understand what the language does not provide...

Quote:
int a[d1][d2] = {0};

This form is good for initialization, but nothing else.

Quote:
How bad is all this?
No comment :-)

OK, it's pretty bad ^^''; it just popped into my mind...

Now, is a 'for' loop the only way to set an array to 0?

I mean, this is safe:
for (int i = 0; i < d1; ++i) for (int j = 0; j < d2; ++j) a[i][j] = 0;

But it isn't the prettiest code in the world
(the fact that it is still possible to use memset for the inner loop
does not count ^^)


Martin Bonner
Posted: Tue Nov 07, 2006 1:31 am    Post subject: Re: Out-of-bounds nonsense

AlbertSSj wrote:
Quote:
Francis Glassborow wrote:
Then you have seen a lot of code written by people who do not understand
what the language provides:

They probably understand what the language does not provide...

int a[d1][d2] = {0};

This form is good for initialization, but nothing else.

How bad is all this?
No comment :-)

OK, it's pretty bad ^^''; it just popped into my mind...

Now, is a 'for' loop the only way to set an array to 0?

I mean, this is safe:
for (int i = 0; i < d1; ++i) for (int j = 0; j < d2; ++j) a[i][j] = 0;

But it isn't the prettiest code in the world
(the fact that it is still possible to use memset for the inner loop
does not count ^^)

It /certainly/ is not portable to use memset for the inner loop if the
array is floating point or pointer. I would need to study the standard
closely to decide if it is portable for /any/ type other than unsigned
char (and possibly char/signed char).


Thomas Richter
Posted: Tue Nov 07, 2006 1:32 am    Post subject: Re: Out-of-bounds nonsense

James Kanze wrote:

Quote:
Over on comp.lang.c, we've been discussing the accessing of array elements
via subscript indices which may appear to be out of range. In particular,
accesses similar to the following:


int arr[2][2];


arr[0][3] = 7;


Undefined behavior. In both C and C++. Since at least C90
(which as far as I know, introduced the concept).

How does all that fit with one of the favourite pieces of C trivia, namely that

int a[7];

a[4]=1;

assert(4[a]==1); /* no typo */

"works". Works in which way? The usual explanation one finds is that
a[4] == *(a+4), but according to what I've read here, that would mean
the latter is undefined behaviour.

Or, to put it differently, if bounds checking is a valid thing to do,
how can operator[] on the built-in arrays be "commutative"?

So long,
Thomas

Seungbeom Kim
Posted: Tue Nov 07, 2006 4:21 am    Post subject: Re: Out-of-bounds nonsense

Thomas Richter wrote:
Quote:
James Kanze wrote:

int arr[2][2];
arr[0][3] = 7;

Undefined behavior. In both C and C++. Since at least C90
(which as far as I know, introduced the concept).

How does all that fit with one of the favourite pieces of C trivia, namely that

int a[7];

a[4]=1;

assert(4[a]==1); /* no typo */

"works". Works in which way? The usual explanation one finds is that
a[4] == *(a+4), but according to what I've read here, that would mean
the latter is undefined behaviour.

Or, to put it differently, if bounds checking is a valid thing to do,
how can operator[] on the built-in arrays be "commutative"?

I don't see why it cannot be, nor why the assert shouldn't work.
No matter how much information a pointer has or how much checking
is done, pointer+int and int+pointer can be defined in the same way
(and they should, I believe).

For example,

// pseudocode
template <typename T>
T* operator+(T* pointer, int offset) { /* whatever */ }

template <typename T> inline
T* operator+(int offset, T* pointer) { return pointer + offset; }

--
Seungbeom Kim

James Kanze
Posted: Tue Nov 07, 2006 10:10 am    Post subject: Re: Out-of-bounds nonsense

Thomas Richter wrote:
Quote:
James Kanze wrote:

Over on comp.lang.c, we've been discussing the accessing of
array elements via subscript indices which may appear to be
out of range. In particular, accesses similar to the
following:

int arr[2][2];

arr[0][3] = 7;

Undefined behavior. In both C and C++. Since at least C90
(which as far as I know, introduced the concept).

How does all that fit with one of the favourite pieces of C trivia, namely that

int a[7];

a[4]=1;

assert(4[a]==1); /* no typo */

"works". Works in which way? The usual explanation one finds is that
a[4] == *(a+4), but according to what I've read here, that would mean
the latter is undefined behaviour.

What makes you say that? The variable a has the type int[7].
When it converts to a pointer, it converts to a pointer with an
upper bound of a+7 (and a lower bound of a). The expression
a[4] is the equivalent of *(a+4) and 4[a] the equivalent of
*(4+a). Both involve pointer arithmetic on the pointer, and
bounds checking works exactly the same with both of them.

If we imagine that the pointer type is the equivalent of:

struct Ptr { int* current ; int* lower ; int* upper ; } ;

we end up with:

Ptr { a+4 /* or 4+a */, a, a+7 }

After each pointer add or subtract, the implementation checks
that current >= lower && current <= upper, and before
dereferencing, it checks that current >= lower && current <
upper.

(This is sometimes referred to as "fat pointers". The C
committee discussed the issue in depth, and intentionally worded
the standard in such a way as to make this type of
implementation legal.)
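
The scheme just described can be sketched in ordinary C++ as a model of what such an implementation might do internally. The `Ptr` layout follows the struct above; the function names, the use of exceptions and the exact form of the checks are assumptions made for illustration.

```cpp
#include <cstddef>
#include <stdexcept>

// Bounds-carrying pointer, laid out as in the Ptr struct above.
struct Ptr
{
    int* current;
    int* lower;
    int* upper;   // one past the last element
};

// Build a Ptr from an array, standing in for array-to-pointer conversion.
template <std::size_t N>
Ptr from_array(int (&a)[N])
{
    Ptr p = { a, a, a + N };
    return p;
}

// Pointer addition: the result may equal upper but must not pass it.
// The offset is validated before current is moved.
Ptr operator+(Ptr p, std::ptrdiff_t n)
{
    if (n < p.lower - p.current || n > p.upper - p.current)
        throw std::out_of_range("pointer arithmetic out of bounds");
    p.current += n;
    return p;
}

// Defined in terms of the other overload, so n + p and p + n cannot differ.
Ptr operator+(std::ptrdiff_t n, Ptr p) { return p + n; }

// Dereference: current must designate an actual element.
int& deref(Ptr p)
{
    if (p.current < p.lower || p.current >= p.upper)
        throw std::out_of_range("dereference out of bounds");
    return *p.current;
}
```

With `int a[7]`, both `deref(from_array(a) + 4)` and `deref(4 + from_array(a))` reach the same element. Forming `from_array(a) + 7` is accepted but dereferencing it is not, and `from_array(a) + 8` is rejected outright.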

Quote:
Or, to put it differently, if bounds checking is a valid thing
to do, how can operator[] on the built-in arrays be
"commutative"?

Why shouldn't it be?

Note that bounds checking isn't applied to arrays, in general,
but to the pointers which derive from the array.

--
James Kanze (GABI Software) email:james.kanze (AT) gmail (DOT) com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Frederick Gotham
Posted: Wed Nov 08, 2006 5:53 am    Post subject: Re: Out-of-bounds nonsense

werasm:

Quote:
struct xs
{
    enum { Row = 10, Col = 10 };
    int array[Row][Col];
    xs();
};

xs::xs()
    : array() //???
{

}

This does not seem to zero initialize array (or fill it with zeros, but
rather the minimum).


Then I suggest you dispose of your K++ compiler and get a C++ one. Here's a
simple test to make sure you have a C++ compiler:

#include <cstddef>

template <class T,std::size_t len>
bool IsAnyElementTrue(T const (&arr)[len])
{
    T const *p = arr;
    T const *const pover = arr + len;

    do if (*p++) return true;
    while (pover != p);

    return false;
}

#include <iostream>
#include <ostream>
using std::cout;
using std::endl;

int main()
{
    int arr1[32] = {};
    double arr2[32] = {};
    char *arr3[32] = {};

    int (&arr4)[32] = *new int[1][32]();
    double (&arr5)[32] = *new double[1][32]();
    char *(&arr6)[32] = *new char*[1][32]();

    if (IsAnyElementTrue(arr1)) cout << "Problem with arr1." << endl;
    if (IsAnyElementTrue(arr2)) cout << "Problem with arr2." << endl;
    if (IsAnyElementTrue(arr3)) cout << "Problem with arr3." << endl;
    if (IsAnyElementTrue(arr4)) cout << "Problem with arr4." << endl;
    if (IsAnyElementTrue(arr5)) cout << "Problem with arr5." << endl;
    if (IsAnyElementTrue(arr6)) cout << "Problem with arr6." << endl;

    delete [] &arr4;
    delete [] &arr5;
    delete [] &arr6;
}

To my disgust, this failed when I ran it with g++. Specifically, it failed
to correctly default-initialise the dynamically created arrays.

I'll upgrade to the latest g++ compiler -- with any luck, it won't be a K++
compiler.

--

Frederick Gotham
