C++Talk.NET C++ language newsgroups
Thomas Mang (Guest)
Posted: Sun Feb 27, 2005 10:02 pm
Subject: pointer arithmetic and raw memory

Greetings,
Consider the following program (stripped down to the relevant parts):
#include <cstddef>  // std::size_t
#include <new>      // ::operator new, placement new

struct Test
{
    Test();
    ~Test();
};

int main()
{
    std::size_t arraySize = 10;
    void * Mem = ::operator new(sizeof(Test) * arraySize);
    Test * initPointer = static_cast<Test*>(Mem);
    for (std::size_t i = 0; i < arraySize; ++i)
    {
        new (initPointer) Test;  // construct a Test in the raw storage
        ++initPointer;           // #1
    }
    // cleanup
}
Ignore destruction of the Test objects, exception handling, possible use
of a raw_storage_iterator, etc.
What I am interested in: does this code have undefined behavior?
I think it does, because of the ++initPointer expression at #1.
++Pointer increments a pointer by one (Pointer += 1), which means Pointer =
Pointer + 1. For the additive operator+, the Standard says in 5.7/5:
"... If the pointer operand points to an element of an array object ...".
Here, initPointer does not point to an element of an array object, since
there is no array.
Is my conclusion correct, and does this program have undefined behavior?
Thank you,
Thomas
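A minimal complete version of the loop above might look like the following sketch. The instance counter `live` and the function name `build_and_tear_down` are made up for illustration and are not part of the original post. The sketch destroys the objects in reverse order and releases the storage; it only shows the mechanics and does not settle whether `++cur` is formally defined, which is the question being asked.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical Test type that counts its live instances.
struct Test {
    static int live;
    Test() { ++live; }
    ~Test() { --live; }
};
int Test::live = 0;

// Constructs n Test objects in raw storage, then destroys them in reverse
// order. Returns how many objects were alive right after construction.
int build_and_tear_down(std::size_t n) {
    const int before = Test::live;
    void* mem = ::operator new(sizeof(Test) * n);
    Test* first = static_cast<Test*>(mem);
    Test* cur = first;
    for (std::size_t i = 0; i < n; ++i) {
        new (cur) Test;  // placement new: construct in the raw storage
        ++cur;           // the increment marked #1 in the post
    }
    const int created = Test::live - before;
    while (cur != first) {
        --cur;
        cur->~Test();    // explicit destructor call, last object first
    }
    assert(Test::live == before);
    ::operator delete(mem);
    return created;
}
```

Calling `build_and_tear_down(10)` would be expected to return 10 and leave `Test::live` unchanged.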
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.jamesd.demon.co.uk/csc/faq.html ]

kuyper@wizard.net (Guest)
Posted: Mon Feb 28, 2005 4:16 am
Subject: Re: pointer arithmetic and raw memory

"Thomas Mang" wrote:
| Quote: | What I am interested in: Is this code undefined behavior?
I think it is because of the ++initPointer expression.
[...]
Is my conclusion correct and this program yields undefined behavior?
|
That is not correct. See 5.7p4: "For the purposes of these operators, a
pointer to a non-array object behaves the same as a pointer to the
first element of an array of length one with the type of the object as
its element type."
It's perfectly legal to increment a pointer to the last element of an
array; the result is called a one-past-the-end pointer. Such a pointer
cannot be safely dereferenced or incremented, but it can be
decremented, compared with other pointers for equality, and compared
for relative order with a pointer to the object or to one past its end.
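The operations listed above can be illustrated with a small sketch. The function name `one_past_the_end_distance` is made up for illustration; the code only uses operations the reply describes as permitted on a one-past-the-end pointer formed from a single object.

```cpp
#include <cassert>
#include <cstddef>

// Forms the one-past-the-end pointer of a single int and exercises the
// operations that are allowed on it. Returns the distance end - p.
std::ptrdiff_t one_past_the_end_distance() {
    int x = 42;
    int* p = &x;       // behaves like a pointer to the first element of int[1]
    int* end = p + 1;  // forming one-past-the-end is allowed

    assert(end != p);      // equality comparison
    assert(p < end);       // relational comparison with the object itself
    assert(end - 1 == p);  // decrementing gets back to the object
    // Not allowed: *end (dereference) or end + 1 (incrementing again).
    return end - p;
}
```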

Thomas Mang (Guest)
Posted: Mon Feb 28, 2005 12:53 pm
Subject: Re: pointer arithmetic and raw memory

<kuyper (AT) wizard (DOT) net> wrote in message
news:1109561763.636738.327210 (AT) o13g2000cwo (DOT) googlegroups.com...
| Quote: |
"Thomas Mang" wrote:
[...]
Is my conclusion correct and this program yields undefined behavior?
That is not correct. See 5.7p4: "For the purposes of these operators, a
pointer to a non-array object behaves the same as a pointer to the
first element of an array of length one with the type of the object as
its element type."
|
Thanks, I missed that. However, incrementing the pointer twice (without
creating an object in between) is undefined behavior (even if it points to
valid memory), and destruction has to happen in reverse order; otherwise the
first element is destroyed first, and then the pointer no longer points to a
valid object. The list in 3.8/5 of things that can be done with such
pointers does not include arithmetic operations (although one can
dereference the pointer), so it's undefined behavior.
Correct, or did I miss something again?
Thomas

Thomas Mang (Guest)
Posted: Mon Feb 28, 2005 12:54 pm
Subject: Re: pointer arithmetic and raw memory

Just another issue (the one that prompted the whole question):
Suppose you write a memory pool - one that preallocates a chunk of memory
and hands it out to objects as they are created (by overloading new).
Then my impression is:
situation 1) I use ::operator new to obtain the memory.
Then indexing into the chunk is not possible, except for indices 0 and 1 (1
only if there is an element at index 0), because there are no arrays, and
adding more than 1 is undefined.
However, I can reach the first free spot by always incrementing a pointer
by one, i.e. jumping from one object to the next (each object acts as a
"one-element array", and therefore increment can be used). This requires all
objects to be of the same type.
Casting to char* etc. and using that for indexing is not possible, because
there are formally no char elements and no char array.
situation 2) I allocate the memory using char[].
Is indexing into that array with [sizeof(T) * index] possible? I don't think
so, because when the memory is overwritten, the lifetime of the char ends
(although the bits still form a valid char), and therefore I no longer have
elements of an array. At the least contiguity is broken; IMO even the
whole array status is lost.
Furthermore, an implementation is allowed to add an array overhead to the
size it passes to the allocation function for any array new-expression.
It will probably do so for non-PODs / types with destructors with side
effects, but nothing forbids adding an overhead for a char[] as well. The
problem: if this overhead makes the size overflow std::size_t, all bets are
off (the allocation function will probably return less memory than needed.
I just tried it out, and the program crashed nicely, although the memory
needed to hold n objects of type T did not exceed
std::numeric_limits<std::size_t>::max() ).
I fail to see how to determine the maximum array overhead:
-) An implementation is not required to document it.
-) It may vary during execution.
Something like max_align or 2^8 will likely work, but that is not guaranteed.
Is travelling forward one object at a time the only portable solution?
Thomas
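The kind of pool described in this post can be sketched as below. `FixedPool`, its members and its bump-style `allocate` are hypothetical names chosen for illustration. The sketch takes its storage from `::operator new` (situation 1) and computes slot addresses by indexing through an `unsigned char*`, which is exactly the operation whose formal status is being questioned; it also assumes `slot_size` is a multiple of the alignment of whatever is stored.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-size pool: hands out slot_count slots of slot_size bytes.
class FixedPool {
public:
    FixedPool(std::size_t slot_size, std::size_t slot_count)
        : slot_size_(slot_size), slot_count_(slot_count), used_(0),
          base_(static_cast<unsigned char*>(
              ::operator new(slot_size * slot_count))) {
        assert(slot_size_ > 0 && slot_count_ > 0);
    }
    ~FixedPool() { ::operator delete(base_); }

    FixedPool(const FixedPool&) = delete;
    FixedPool& operator=(const FixedPool&) = delete;

    // Returns the next free slot, or nullptr when the pool is exhausted.
    void* allocate() {
        if (used_ == slot_count_) return nullptr;
        return base_ + slot_size_ * used_++;  // byte-wise indexing into the chunk
    }

    std::size_t used() const { return used_; }

private:
    std::size_t slot_size_;
    std::size_t slot_count_;
    std::size_t used_;
    unsigned char* base_;
};
```

An object would then be created with placement new, for example `int* a = new (pool.allocate()) int(1);`, after checking that `allocate()` did not return a null pointer.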

Marc Schoolderman (Guest)
Posted: Mon Feb 28, 2005 10:20 pm
Subject: Re: pointer arithmetic and raw memory

[Somehow posting to the newsgroup itself doesn't work for me]
Thomas Mang wrote:
| Quote: | Thanks, missed that. However, incrementing the pointer twice (without
creating an object) is undefined behavior (even if it points to valid
memory), and destruction has to appear in the reverse order - otherwise the
first element will be destroyed first, then the pointer doesn't point to a
valid object any more. The list of actions what can be done with those
pointers in 3.8/5 does not include arithmetic operations (although one can
dereference the pointer), so it's undefined behavior.
Correct, or did I miss again something?
|
I believe your assumption that initPointer does not point to a valid
array is incorrect. Consider:
| Quote: | void * Mem = ::operator new(sizeof(Test) * arraySize);
|
3.7.3.1/2 "The pointer returned shall be suitably aligned so that it can
be converted to a pointer of any complete object type and then used to
access the object or array in the storage allocated."
So Mem is properly aligned and when converted by the static_cast<>,
points to (storage for) arraySize objects.
You also seem worried that incrementing a pointer is only valid when the
storage it points to contains a valid object. Intuitively, that implies
pointers get dereferenced when you do math on them, which in turn would
mean using past-the-end values would be illegal (since you can't
dereference those).
3.8/5 seems to codify this - it limits using pointers to objects whose
lifetime has not begun yet. Pointer arithmetic doesn't seem to be
restricted.
~Marc

Thomas Mang (Guest)
Posted: Thu Mar 03, 2005 6:47 am
Subject: Re: pointer arithmetic and raw memory

I am throwing in another case:
In the 2003 revision, 23.2.4/1 was modified to guarantee that vector
elements are stored contiguously, to make vector compatible with older
libraries dealing with pointers into _arrays_.
As far as I read the Standard, whenever those libraries do pointer
arithmetic with an offset other than 0 or 1, the behavior is undefined,
because again there is simply no array. The current wording of the Standard
does use pointer arithmetic, however; but it talks about identity
guarantees, not pointer arithmetic guarantees.
Does this suffice to make the current wording defined behavior? Or does it
make use of an expression that has undefined behavior?
Of course, I am not questioning the intent of that para; I am questioning
the legality of the expression used therein, and very likely within many
libraries. std::vector is simply not an array in the technical sense,
therefore pointer arithmetic (other than 0 / 1) means all bets are off.
Unless I am missing something that makes the current situation defined as it
was intended, my impression is that pointer arithmetic should not be
restricted to arrays, but allowed on any contiguous series of objects of
the same type.
Your thoughts?
Thomas
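The identity guarantee referred to here is, as quoted later in the thread, that `&v[n] == &v[0] + n` holds for every `n < v.size()`. A small sketch that checks it at run time is shown below; the function names `satisfies_identity` and `contiguity_self_check` are made up for illustration. Note that the check itself performs the pointer arithmetic whose formal validity is being questioned.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when &v[n] == &v[0] + n for every valid index n.
bool satisfies_identity(const std::vector<int>& v) {
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (&v[n] != &v[0] + n) return false;
    }
    return true;  // vacuously true for an empty vector
}

// Small self-check that could be called from main() or a test.
void contiguity_self_check() {
    std::vector<int> v(5, 7);  // five elements, all equal to 7
    assert(satisfies_identity(v));
}
```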

msalters (Guest)
Posted: Thu Mar 03, 2005 8:56 pm
Subject: Re: pointer arithmetic and raw memory

"Thomas Mang" wrote:
| Quote: | Just another issue (how the whole idea derived from):
Suppose you write a memory pool - one that preallocates some chunk of
memory
and assigns it to objects if they are created (overloading new).
Then my impression is:
situation 1) I use ::operator new to obtain the memory.
Then indexing into the chunk is not possible, except indeces 0 and 1
(1 if
there is an element at index 0) because there are no arrays, and
adding more
than 1 is undefined.
However, I can access the first free spot by always incrementing a
pointer
by one - e.g. jumping from object to the next object(all objects
represent a
"one-element-array", and therefore increment can be used). This
requires all
objects to be of the same type.
|
No. You have a valid char* pointer, enough memory, so the only concern
is whether the char* is properly aligned for the type you're creating.
This will always be the case if you're creating only objects of a
single type, but that's because they will be objects with the same
sizeof() and thus the same alignment. Any other type with the same
sizeof() can be substituted.
| Quote: | Casting to char* etc. and using that for indexing is not possible,
there are
formally no char-elements / no char-array.
|
Huh? I've got no idea what you're talking about. You can use char* for
indexing; a char* can hold any address.
| Quote: | situation 2) I allocate the memory using char[]
Is indexing into that array [sizeof(T) * index] possible? I don't
think so,
because when the memory is overwritten, the lifetime of the char ends
(although the bits still are a valid char) - and therefore I don't
have
elements of an array any more.
|
Wrong, for a number of reasons. To name a few: lifetime doesn't matter for
chars, any memory can be accessed by chars, you're not using the chars but
instead the storage they used to occupy.
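Situation 2 can be sketched as follows: storage comes from a char array, objects are placed at byte offsets `i * sizeof(T)`, and access goes through the pointers that placement new returns. `Point` and `sum_x_in_char_buffer` are hypothetical names for illustration. Whether the byte arithmetic and the final `delete[]` remain formally valid after the storage has been reused is the point under dispute in this thread, so the sketch reflects common practice, not a ruling.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Point { int x; int y; };  // hypothetical trivially destructible type

// Places n Point objects into a char buffer and returns the sum of their x.
int sum_x_in_char_buffer(std::size_t n) {
    // Guard against n * sizeof(Point) overflowing std::size_t.
    assert(n <= static_cast<std::size_t>(-1) / sizeof(Point));
    unsigned char* raw = new unsigned char[n * sizeof(Point)];
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Index in bytes through the char buffer, then construct in place.
        Point* p = new (raw + i * sizeof(Point)) Point;
        p->x = static_cast<int>(i);
        p->y = 0;
        total += p->x;
    }
    // Point has a trivial destructor, so no explicit destructor calls here.
    delete[] raw;
    return total;
}
```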
| Quote: | Furthermore, an implementation is allowed to add - for a call to the
allocation function, an array-overhead to all array-new expressions -
probably for non-PODs / types with destructors with side effects.
However,
there is nothing that forbids adding an overhead for a char[]. The
problem
is if this overhead causes an overflow for std::size_t, bets are off
(the
allocation function will probably return less memory than needed.
Just tried
it out, the program crashed nicely, although the memory needed to
hold n
objects of type T did not exceed
std::numeric_limits<std::size_t>::max() ). |
Is this a problem in the real world, where size_t is much larger than
available memory anyway? In theory, it's not an issue because an
implementation can just document the maximum object size such that
new[] won't overflow. (An array counts as one object.)
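Independently of what an implementation documents, the multiplication `count * sizeof(T)` can be checked by the caller before any allocation is attempted. The helper below is a sketch with a made-up name, `checked_bytes`; it guards only the caller's own multiplication and says nothing about any extra array overhead a new[] expression might add.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Stores count * elem_size in out and returns true, or returns false when
// the product would not fit into std::size_t.
bool checked_bytes(std::size_t count, std::size_t elem_size, std::size_t& out) {
    assert(elem_size != 0);  // sizeof never yields 0
    if (count > std::numeric_limits<std::size_t>::max() / elem_size) {
        return false;        // the multiplication would wrap around
    }
    out = count * elem_size;
    return true;
}
```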
HTH,
Michiel Salters

Thomas Mang (Guest)
Posted: Tue Mar 08, 2005 8:34 pm
Subject: Re: pointer arithmetic and raw memory

"msalters" <Michiel.Salters (AT) logicacmg (DOT) com> wrote in message
news:1109859186.935261.201880 (AT) f14g2000cwb (DOT) googlegroups.com...
| Quote: | "Thomas Mang" wrote:
Just another issue (how the whole idea derived from):
Suppose you write a memory pool - one that preallocates some chunk of
memory
and assigns it to objects if they are created (overloading new).
Then my impression is:
situation 1) I use ::operator new to obtain the memory.
Then indexing into the chunk is not possible, except indeces 0 and 1
(1 if
there is an element at index 0) because there are no arrays, and
adding more
than 1 is undefined.
However, I can access the first free spot by always incrementing a
pointer
by one - e.g. jumping from object to the next object(all objects
represent a
"one-element-array", and therefore increment can be used). This
requires all
objects to be of the same type.
No. You have a valid char* pointer, enough memory, so the only concern
is whether the char* is properly aligned for the type you're creating.
This will always be the case if you're creating only objects of a
single type, but that's because they will be objects with the same
sizeof() and thus the same alignment. Any other type with the same
sizeof() can be substituted.
Casting to char* etc. and using that for indexing is not possible,
there are
formally no char-elements / no char-array.
Huh? I've got no idea what you're talking about. You can use char* for
indexing; a char* can hold any address.
|
Sorry for answering late.
I am talking about the fact there was never ever a char[] created. A char*
has the same representation as a void*, but I fail to see how that matters
here.
The para about pointer arithmetic (5.7/5) clearly requires pointers to
elements of an array. Where is the char-array? I know I can treat the bits
in the memory as chars, but memory is not a char-array.
| Quote: |
situation 2) I allocate the memory using char[]
Is indexing into that array [sizeof(T) * index] possible? I don't
think so,
because when the memory is overwritten, the lifetime of the char ends
(although the bits still are a valid char) - and therefore I don't
have
elements of an array any more.
Wrong, for a number of reasons. To name a few: lifetime doesn't matter
for
chars, any memory can be accessed by chars, you're not using the chars
but
instead the storage they used to occupy.
|
I have destroyed the char array by reusing the storage of the chars. 3.8/4
says that this ends their lifetime.
Once the char[] is lost, I fail to see how pointer arithmetic is still
valid. In my opinion, the fact that the bits still make up a valid char is
irrelevant.
Suppose an implementation internally keeps a table of created arrays,
tracks whether they are still valid, and terminates the program whenever
pointer arithmetic (other than 0/1) is applied to a pointer that does not
point to an address belonging to that table, even if it is a char*. Is there
anything that forbids such an implementation? Is there anything that says
memory can be treated as a char[]?
To add something to the vector example:
The more I think about it, the more I am convinced a DR is needed. There is
no array of objects, so formally pointer arithmetic is undefined behavior.
One can discuss whether the expression used in 23.2.4/1 makes the behavior
valid for these special cases, but I think it doesn't matter.
Suppose the expression makes it valid (personally, I don't think so). Then
the only thing you can do is add an offset n < vec.size(). Since n is
required to be < vec.size(), you cannot compute the one-past-the-end
address. That means something like:
&vec[0] + vec.size()
is undefined behavior. I see nothing wrong with an implementation that chokes
on this and quits (remember, we could increment the pointer to the last
vector element by one, but we cannot compute that address from pointers to
other elements, since again there is no array). But probably exactly this
expression will be used in conjunction with older libraries. Finally, the
expression does not cover subtracting a value from a pointer, or subtracting
a pointer from a pointer to get the number of elements.
The problem does not seem to arise from 23.2.4/1 (it says "the elements of a
vector are stored contiguously"); it seems to come from the pointer
arithmetic para, which explicitly requires arrays.
Maybe I am way too pedantic, but maybe not.
Thomas
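The use case behind this concern, passing a vector to an older pointer-based interface, typically looks like the sketch below. `sum_range` stands in for a hypothetical legacy function taking a half-open range `[first, last)`; neither name comes from the thread. The expression `first + v.size()` is the one-past-the-end computation whose status is being questioned.

```cpp
#include <cassert>
#include <vector>

// Hypothetical legacy-style function: sums the ints in [first, last).
long sum_range(const int* first, const int* last) {
    assert(first != 0 && last != 0);
    long total = 0;
    for (; first != last; ++first) total += *first;
    return total;
}

// Adapts a vector to the pointer-based interface above.
long sum_vector(const std::vector<int>& v) {
    if (v.empty()) return 0;   // &v[0] is not usable on an empty vector
    const int* first = &v[0];
    return sum_range(first, first + v.size());  // one-past-the-end address
}
```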

Marc Schoolderman (Guest)
Posted: Wed Mar 09, 2005 2:49 pm
Subject: Re: pointer arithmetic and raw memory

Thomas Mang wrote:
| Quote: | The para about pointer arithmetic (5.7/5) clearly requires pointers to
elements of an array. Where is the char-array? I know I can treat the bits
in the memory as chars, but memory is not a char-array.
|
Memory consists of one or more contiguous sequences of bytes (1.7).
Elsewhere you can read that a uchar is a byte, that objects
have object representations that are sequences of uchars, and so on.
| Quote: | Suppose an implementation keeps internally a table of created arrays, when
they are still valid and so on and terminates the program whenever pointer
arithmetic (other than 0/1) is applied to a pointer pointing not to an
address belonging to that table, even if it is a char*. Is there anything
that forbids such an implementation?
|
This gets very close to a protected memory model, which is in fact what
a lot of systems implement.
| Quote: | The problem seems not to arise in 23.2.4/1 (it says "the elements of a
vector are stored contiguously"), it seems to come from the pointer
arithmetic para which explicitly requires arrays.
|
You're right. Hence to satisfy the identity mapping, references returned
by vec[n] MUST refer to elements of an array. Simple as that.
Why would it be a 'special case'? There are lots of places where the
standard explains parts of the language/library in terms of other parts.
~Marc

Thomas Mang (Guest)
Posted: Fri Mar 11, 2005 5:35 pm
Subject: Re: pointer arithmetic and raw memory

"Marc Schoolderman" <squell (AT) alumina (DOT) nl> wrote in message
news:422E62CD.9060105 (AT) alumina (DOT) nl...
| Quote: | Thomas Mang wrote:
The para about pointer arithmetic (5.7/5) clearly requires pointers to
elements of an array. Where is the char-array? I know I can treat the
bits
in the memory as chars, but memory is not a char-array.
Memory consists of one or more contiguous series of bytes (1.7).
|
True, but how does it matter?
| Quote: |
At other places you can read that an uchar is a byte, and that objects
have object representations that are sequences of uchars, and so on.
|
Still, they are not a C++ array. A non-empty contiguous series of objects
is not automatically an array.
| Quote: |
Suppose an implementation keeps internally a table of created arrays,
when
they are still valid and so on and terminates the program whenever
pointer
arithmetic (other than 0/1) is applied to a pointer pointing not to an
address belonging to that table, even if it is a char*. Is there
anything
that forbids such an implementation?
This gets very close to a protected memory model, which is in fact what
a lot of systems implement.
|
Yes, but note the important extension to distinguish whether an address
belongs to an array or not.
Consider this simple snippet:
// 1
char* c = new char[10 * sizeof(T)];
// 2
for (int i = 0; i < 10; ++i)
    new (c + i*sizeof(T)) T;
// 3
T* p = (T*) c;
// 4
p + 8;
and suppose the compiler translates it into this code:
// 1
char* c = new char[10 * sizeof(T)];
arrayAddresses.addArrayRange(c, c + 10 * sizeof(T));
// 2
for (int i = 0; i < 10; ++i)
{
    new (c + i*sizeof(T)) T;
    arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c + i*sizeof(T));
}
// 3
T* p = (T*) c;
// 4
if (! arrayAddresses.addressBelongsToArrayAndTargetBelongsToArray(p, p + 8))
    abort();
p + 8;
Is this implementation forbidden?
The same of course goes for creating a contiguous series of objects using
placement new etc. (which is probably what std::vector does internally),
where an address range was never added to arrayAddresses, because obtaining
memory by a call to placement new is very different from creating an array.
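The bookkeeping such a checking implementation would need can be sketched as an ordinary class. `ArrayRangeTable` and its member names are hypothetical, chosen to mirror the `arrayAddresses` object in the snippet above; a real implementation would do this inside the compiler or runtime. The sketch uses `std::less` for pointer comparisons because it provides a total order even for unrelated pointers.

```cpp
#include <cassert>
#include <functional>
#include <map>

// Hypothetical table of live arrays, keyed by their start address.
class ArrayRangeTable {
    // Maps the start of each array to its one-past-the-end address.
    typedef std::map<const char*, const char*> Map;

public:
    // Records [begin, end) as a live array.
    void add(const void* begin, const void* end) {
        const char* b = static_cast<const char*>(begin);
        const char* e = static_cast<const char*>(end);
        assert(!std::less<const char*>()(e, b));  // end must not precede begin
        ranges_[b] = e;
    }

    // Forgets the array containing p, modelling "its storage was reused".
    void remove_containing(const void* p) {
        Map::const_iterator it = find(static_cast<const char*>(p));
        if (it != ranges_.end()) ranges_.erase(it);
    }

    // True when p and q belong to the same recorded array; the
    // one-past-the-end address counts as belonging to it.
    bool same_array(const void* p, const void* q) const {
        Map::const_iterator a = find(static_cast<const char*>(p));
        Map::const_iterator b = find(static_cast<const char*>(q));
        return a != ranges_.end() && a == b;
    }

private:
    // Finds the recorded range whose [begin, end] contains p, if any.
    Map::const_iterator find(const char* p) const {
        Map::const_iterator it = ranges_.upper_bound(p);  // first start > p
        if (it == ranges_.begin()) return ranges_.end();
        --it;                                             // last start <= p
        return std::less<const char*>()(it->second, p) ? ranges_.end() : it;
    }

    Map ranges_;
};
```

With this table, step 4 of the translated code corresponds to calling `same_array(p, p + 8)` and aborting when it returns false.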
| Quote: |
The problem seems not to arise in 23.2.4/1 (it says "the elements of a
vector are stored contiguously"), it seems to come from the pointer
arithmetic para which explicitly requires arrays.
You're right. Hence to satisfy the identity mapping, references returned
by vec[n] MUST refer to elements of an array. Simple as that.
|
I disagree. First, I think the para is invalid because the expression used
therein is invalid and there is no wording saying that 5.7/5 is overruled.
Second, even if no explicit overruling is needed, it does not follow that
you have an array. Fulfilling only the limited operations guaranteed by that
para seems to be enough to me. An implementation that terminates on
calculating the past-the-end pointer value seems to be perfectly legal.
And of course, for user-defined types that only _simulate_ arrays (meaning
identical memory layout), for allocators and so on, the problem remains.
Again, let me emphasize that I am not questioning the intent of either
std::vector in combination with plain pointers, or doing pointer
arithmetic within a block of allocated memory. I want to express that IMHO
all current usage of this is undefined behavior because 5.7/5 is
violated, since that para explicitly requires arrays. I could not find any
para in the Standard permitting this, not even for char*.
Thomas

Marc Schoolderman (Guest)
Posted: Sat Mar 12, 2005 4:18 pm
Subject: Re: pointer arithmetic and raw memory

Thomas Mang wrote:
| Quote: | Still they are not a C++ - array. Non-empty contiguous serieses of objects
are not automatically an array.
|
I think I've found something. 1.8/1 says "An object is a region of
storage". So, an array object is (using this definition) simply a region
of storage containing contiguously allocated sub-objects.
But anyway, I think this is a moot issue. The only standard way to get a
contiguous series of objects in C++ is by creating an array! Anything
else will at least involve a type cast, which is a story in itself.
| Quote: | Suppose this simple snippet:
..
new (c + i*sizeof(T)) T;
arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c + i*sizeof(T));
|
I assume you agree this violates all common sense but are looking for
the standard to back you up.
Given 3.8/2; "the lifetime of an array object or of an object of POD
type starts as soon as storage [...] is obtained and ends when the
storage which [it] occupies is reused or released."
The most convincing answer is that your statement is re-using storage of
sub-objects of the array, not the storage of the array object proper, so
the lifetime of the array itself is not affected.
Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
'non-POD class type'. So the list of undefined operations doesn't apply
to arrays and POD objects regardless.
However, you've convinced me that 3.8 is confusing at best. I'm unsure
what "reusing storage" actually means.
| Quote: | The problem seems not to arise in 23.2.4/1 (it says "the elements of a
vector are stored contiguously"), it seems to come from the pointer
arithmetic para which explicitly requires arrays.
You're right. Hence to satisfy the identity mapping, references returned
by vec[n] MUST refer to elements of an array. Simple as that.
I disagree. First, I think the para is invalid because the expression used
therein is invalid an there is no wording 5.7/5 is overruled.
|
I don't see how a library specification would overrule the language.
vector<T>::operator[] is a function returning an lvalue of T. The &
operator will make it a pointer to T. So the identity mapping is merely
making a statement about how two pointers obtained through this function
are related. And the only way they can be related that way is if those
pointers point to elements of an array.
vector<> is already constrained by iterator invalidation rules and
complexity guarantees so that it can only (realistically) be implemented
by having it maintain an array privately, with its [] operator returning
references to objects inside that private array.
| Quote: | Fulfilling only the limited operations guaranteed by that para seems
to be enough for me. An implementation that terminates on calculating the
past-the-end pointer value seems to be perfectly legal.
|
It can't terminate and obey the identity. Assume vec is a vector<int>
with size() > 0.
int* end = &vec[vec.size()-1] + 1;
is valid because we're incrementing a pointer past an object. By
applying the identity of 23.2.4 to &vec[], and simplifying:
int* end = &vec[0] + vec.size();
Which must therefore be valid as well.
Of course, &vec[vec.size()] is still undefined because the identity
doesn't hold for n >= size().
~Marc
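The two ways of forming the end pointer discussed in this post can be written out as a short sketch; the function names `end_via_last` and `end_via_first` are made up for illustration. Both require a non-empty vector, matching the stated assumption size() > 0, and neither evaluates `vec[vec.size()]`.

```cpp
#include <cassert>
#include <vector>

// Increment a pointer to the last element by one.
const int* end_via_last(const std::vector<int>& vec) {
    assert(!vec.empty());
    return &vec[vec.size() - 1] + 1;
}

// Apply the 23.2.4 identity and simplify: &vec[0] + vec.size().
const int* end_via_first(const std::vector<int>& vec) {
    assert(!vec.empty());
    return &vec[0] + vec.size();
}
```

On an implementation where the identity holds, both functions would be expected to return the same address for the same vector.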

Thomas Mang (Guest)
Posted: Sat Mar 12, 2005 8:19 pm
Subject: Re: pointer arithmetic and raw memory

"Marc Schoolderman" <squell (AT) alumina (DOT) nl> wrote in message
news:423305A9.5010401 (AT) alumina (DOT) nl...
| Quote: | Thomas Mang wrote:
Still they are not a C++ - array. Non-empty contiguous serieses of
objects
are not automatically an array.
I think I've found something. 1.8/1 says "An object is a region of
storage". So, an array object is (using this definition) simply a region
of storage containing contiguously allocated sub-objects.
But anyway, I think this is a moot issue. The only standard way to get a
contiguous series of objects in C++ is by creating an array! Anything
else will at least involve a type cast, which is a story in itself.
|
I am not following you here. std::vector does, internally, most likely never
create an array, still the objects are laid out in a contiguous series.
| Quote: |
Suppose this simple snippet:
..
new (c + i*sizeof(T)) T;
arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c +
i*sizeof(T));
I assume you agree this violates all common sense but are looking for
the standard to back you up.
|
Honestly, I think this is very, very unlikely to bite in practice, but is it
forbidden?
Please note I have never said it is a problem in practical code; I am saying
it is a problem in the current wording of the standard.
And BTW, I wouldn't mind at all an implementation that, in its _most
strict_ mode, tells me about everything that yields undefined behavior.
| Quote: |
Given 3.8/2; "the lifetime of an array object or of an object of POD
type starts as soon as storage [...] is obtained and ends when the
storage which [it] occupies is reused or released."
The most convincing answer is that your statement is re-using storage of
sub-objects of the array, not the storage of the array object proper, so
the lifetime of the array itself is not affected.
|
No, the array itself counts as one object. Whether I reuse only one byte of
that object or all of its bytes does not seem to matter. I am reusing the
storage to assign it to a newly created object of type T. That ends the
lifetime of the array, so all pointer arithmetic into that dead array seems
like undefined behavior to me.
I also think that delete [] on the "array" is undefined behavior, since the
array does not exist any more. But I am not sure about that. What do others
think about it?
| Quote: |
Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
'non-POD class type'. So the list of undefined operations doesn't apply
to arrays and POD objects regardless.
However, you've convinced me that 3.8 is confusing at best. I'm unsure
what "reusing storage" actually means.
|
Probably writing to the storage as something other than the object type it
was created with. But good point, that could use clarification too.
Note, however, that my point does not concern what can be done with the
char subobjects (that's pretty clear); I am saying that because the array is
destroyed, the pointer arithmetic yields undefined behavior.
| Quote: |
The problem seems not to arise in 23.2.4/1 (it says "the elements of a
vector are stored contiguously"), it seems to come from the pointer
arithmetic para which explicitly requires arrays.
You're right. Hence to satisfy the identity mapping, references returned
by vec[n] MUST refer to elements of an array. Simple as that.
I disagree. First, I think the para is invalid because the expression used
therein is invalid and there is no wording saying 5.7/5 is overruled.
I don't see how a library specification would overrule the language.
|
Well, by clearly contradicting another para? Wouldn't be the first diverging
wording.
General question to the language lawyers: Is the wording of 23.2.4/1
extending the para about pointer arithmetic (by meaning "everything in the
Standard is correct, so it can't contradict, it must extend"), or is it
simply using something that yields undefined behavior, thus the para is
meaningless?
| Quote: | vector<T>::operator[] is a function returning an lvalue of T. The &
operator will make it a pointer to T. So the identity mapping is merely
making a statement about how two pointers obtained through this function
are related. And the only way they can be related that way is if those
pointers point to elements of an array.
|
Not necessarily. Remember the array-check implementation I presented? One
that relaxes the rules to the operations of std::vector in that para still
seems perfectly legal. Including aborting the program on calculating the
past-the-end pointer, or using operator- .....
| Quote: |
vector<> is already constrained by iterator invalidation rules and
complexity guarantees so that it can only (realistically) be implemented
by having it maintain an array privately, with its [] operator returning
references to objects inside that private array.
|
The fun is I would bet quite a lot most vector implementations do not hold
internally an array, they create object by object using the allocators
construct-function. No array.
| Quote: |
Fulfilling only the limited operations guaranteed by that para seems
to be enough for me. An implementation that terminates on calculating
the
past-the-end pointer value seems to be perfectly legal.
It can't terminate and obey the identity. Assume vec is a vector<int>
with size() > 0.
int* end = &vec[vec.size()-1] + 1;
is valid because we're incrementing a pointer past an object.
|
Correct.
| Quote: | By applying the identity of 23.2.4 to &vec[], and simplifying:
int* end = &vec[0] + vec.size();
Which must therefore be valid as well.
|
I don't think this is automatically guaranteed - even if the para is 100%
legal and not already undefined behavior. Simply because vec.size() need not
be one, and the identity guarantees only apply to real objects. And they
apply only to operator+; nothing says other pointer arithmetic expressions
can be used safely. I am pretty convinced explicit relaxing of 5.7/5 would
be needed, which is not there.
| Quote: |
Of course, &vec[vec.size()] is still undefined because the identity
doesn't hold for n >= size().
|
Yes, and having read the thread about null-references, my opinion based on my
current knowledge (that is, I am not 100% familiar with the details of the
proposals) is that I hope this will always remain undefined behavior :-)
Anyways, please note that std::vector is only a special case. It is very
special because it is part of the Standard, but non-Standard libraries such
as allocators etc. yield IMHO clearly undefined behavior. That is, all the
manual memory-managements I have ever seen doing some pointer arithmetic are
undefined behavior, unless someone can point me to a para that says
otherwise.
I would even hope someone points me to such a para, otherwise part of
Easter will be spent writing a DR. Searching for Easter bunnies and eggs would
be more fun :-)
Thomas
|
 |
Marc Schoolderman Guest
|
Posted: Wed Mar 16, 2005 9:24 pm Post subject: Re: pointer arithmetic and raw memory |
|
|
Thomas Mang wrote:
[Is a contiguous series of objects an array?]
| Quote: | I think I've found something. 1.8/1 says "An object is a region of
storage". So, an array object is (using this definition) simply a region
of storage containing contiguously allocated sub-objects.
But anyway, I think this is a moot issue. The only standard way to get a
contiguous series of objects in C++ is by creating an array! Anything
else will at least involve a type cast, which is a story in itself.
I am not following you here. std::vector does, internally, most likely never
create an array, still the objects are laid out in a contiguous series.
|
But std::vector is just a class. We are exposed only to its interface,
and the contract on that interface. A std::vector "isn't" a contiguous
series of objects, it provides an abstraction for them.
| Quote: | new (c + i*sizeof(T)) T;
arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c +
i*sizeof(T));
I assume you agree this violates all common sense but are looking for
the standard to back you up.
Honestly, I think this is very^very unlikely to hit in practice, but is it
forbidden?
Please note I have never said it is a problem in practical code, I am saying
it is a problem in the current wording of the standard.
|
Yes, and at the moment, I agree somewhat.
| Quote: | The most convincing answer is that your statement is re-using storage of
sub-objects of the array, not the storage of the array object proper, so
the lifetime of the array itself is not affected.
No, the array itself counts as one object. If I reuse only one byte of that
object, or all bytes, does not seem to matter. I am reusing the storage to
assign it to a newly created object of type T. That ends the lifetime of the
array, thus all pointer arithmetic into that dead array seems like undefined
behavior to me.
|
While the array itself counts as one object, it also contains other
sub-objects (which it is the 'complete object' for).
The exact same situation happens with structs and classes. So if we
re-use the storage of a class member, does that end the lifetime of the
encompassing class as well?
| Quote: | Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
'non-POD class type'. So the list of undefined operations doesn't apply
to arrays and POD objects regardless.
However, you've convinced me that 3.8 is confusing at best. I'm unsure
what "reusing storage" actually means.
Probably reusing the storage in a write-manner other than the object type it
was created with.
|
I was forgetting the strict aliasing rules of 3.10/15, which forbid
most of the things you could use the 'old' lvalue for anyway.
| Quote: | I don't see how a library specification would overrule the language.
Well, by clearly contradicting another para? Wouldn't be the first diverging
wording.
|
But there's a clear separation in the standard (1.5). I'd really like to
see examples of the library introducing new meaning to the language.
| Quote: | General question to the language lawyers: Is the wording of 23.2.4/1
extending the para about pointer arithmetic (by meaning "everything in the
Standard is correct, so it can't contradict, it must extend"), or is it
simply using something that yields undefined behavior, thus the para is
meaningless?
|
This is a false dichotomy. You're dismissing the possibility of the
identity mapping as saying something about std::vector, not pointer
arithmetic.
| Quote: | vector<> is already constrained by iterator invalidation rules and
complexity guarantees so that it can only (realistically) be implemented
by having it maintain an array privately, with its [] operator returning
references to objects inside that private array.
The fun is I would bet quite a lot most vector implementations do not hold
internally an array, they create object by object using the allocators
construct-function. No array.
|
But how can you construct them object-by-object in a contiguous fashion
unless you already have an array you are constructing them into?
And since 3.8/1 says that the lifetime of [an array] begins as soon as
storage with the proper size and alignment for it is obtained - there's
your array.
| Quote: | It can't terminate and obey the identity. Assume vec is a vector<int>
with size() > 0.
int* end = &vec[vec.size()-1] + 1;
is valid because we're incrementing a pointer past an object.
Correct.
applying the identity of 23.2.4 to &vec[], and simplifying:
int* end = &vec[0] + vec.size();
Which must therefore be valid as well.
I don't think this is automatically guaranteed - even if the para is 100%
legal and not already undefined behavior.
|
Well, if you assume that the paragraph is undefined behaviour, then the
first statement is undefined behaviour as well. If we assume the
paragraph is valid, then the second one must be valid as well.
Remember that this example wasn't to show that the identity mapping
itself is valid, it was to show that IF it is, you also can calculate
the past-the-end pointer in the ordinary fashion. You called that into
doubt as well.
| Quote: | and the identity guarantees only apply to real objects. And they
apply only to operator+; nothing says other pointer arithmetic expression
can be used safely. I am pretty convinced explicit relaxing of 5.7/5 would
be needed, which is not there.
|
So, in your view, the identity map doesn't say, for example,
&vec[0] == &vec[n] - n; with 0 <= n < vec.size() ?
| Quote: | Anyways, please note that std::vector is only a special case. It is very
special because it is part of the Standard, but non-Standard libraries such
as allocators etc. yield IMHO clearly undefined behavior. That is, all the
manual memory-managements I have ever seen doing some pointer arithmetic are
undefined behavior, unless someone can point me to a para that says
otherwise.
|
But this is C++, I do think it's basically heretical to suggest that the
standard library can do things that other libraries can not.
So I'd spend my easter searching for bunnies, not defects ;)
~Marc
|
 |
Thomas Mang Guest
|
Posted: Sat Mar 19, 2005 7:29 am Post subject: Re: pointer arithmetic and raw memory |
|
|
"Marc Schoolderman" <squell (AT) alumina (DOT) nl> wrote in message
news:42385F79.6090504 (AT) alumina (DOT) nl...
| Quote: | Thomas Mang wrote:
[Is a contiguous series of objects an array?]
I think I've found something. 1.8/1 says "An object is a region of
storage". So, an array object is (using this definition) simply a region
of storage containing contiguously allocated sub-objects.
But anyway, I think this is a moot issue. The only standard way to get a
contiguous series of objects in C++ is by creating an array! Anything
else will at least involve a type cast, which is a story in itself.
I am not following you here. std::vector does, internally, most likely
never
create an array, still the objects are laid out in a contiguous series.
But std::vector is just a class. We are exposed only to its interface,
and the contract on that interface. A std::vector "isn't" a contiguous
series of objects, it provides an abstraction for them.
|
Yes, that's what I said. However, I am pretty sure the intent of the
internal layout of std::vector was to make it compatible with old libraries
dealing with pointers into arrays and performing pointer arithmetic.
But the current Standard says it is undefined behavior doing so, because no
array exists.
Note that basic_string<>::c_str() does return a pointer into an array,
because the description explicitly guarantees it.
| Quote: |
new (c + i*sizeof(T)) T;
arrayAddresses.deleteArrayRangeBecauseMemoryWasOverwritten(c +
i*sizeof(T));
I assume you agree this violates all common sense but are looking for
the standard to back you up.
Honestly, I think this is very^very unlikely to hit in practice, but is it
forbidden?
Please note I have never said it is a problem in practical code, I am
saying
it is a problem in the current wording of the standard.
Yes, and at the moment, I agree somewhat.
The most convincing answer is that your statement is re-using storage of
sub-objects of the array, not the storage of the array object proper, so
the lifetime of the array itself is not affected.
No, the array itself counts as one object. If I reuse only one byte of
that
object, or all bytes, does not seem to matter. I am reusing the storage
to
assign it to a newly created object of type T. That ends the lifetime of
the
array, thus all pointer arithmetic into that dead array seems like
undefined
behavior to me.
While the array itself counts as one object, it also contains other
sub-objects (which it is the 'complete object' for).
The exact same situation happens with structs and classes. So if we
re-use the storage of a class member, does that end the lifetime of the
encompassing class as well?
|
I think so, because the storage still belongs to the outermost object.
I am fairly certain one cannot overwrite the storage occupied by the first
member of a std::pair<std::vector and still claim it to
be a std::pair<>.
| Quote: |
Failing that; 3.8/5 and 3.8/6 only limit operations on objects of
'non-POD class type'. So the list of undefined operations doesn't apply
to arrays and POD objects regardless.
However, you've convinced me that 3.8 is confusing at best. I'm unsure
what "reusing storage" actually means.
Probably reusing the storage in a write-manner other than the object
type it
was created with.
I was forgetting the strict aliasing rules of 3.10/15, which forbid
most of the things you could use the 'old' lvalue for anyway.
I don't see how a library specification would overrule the language.
Well, by clearly contradicting another para? Wouldn't be the first
diverging
wording.
But there's a clear separation in the standard (1.5). I'd really like to
see examples of the library introducing new meaning to the language.
|
Here I agree, at least for vector, and that's what I base my opinion
about a necessary defect report upon:
5.7/5 says pointer arithmetic (other than 0/1) requires an array. Nothing in
std::vector specifies it is an array (although it has internally the same
layout), so the para about the identity guarantees is useless to me, because
the expression used therein invokes undefined behavior.
| Quote: |
General question to the language lawyers: Is the wording of 23.2.4/1
extending the para about pointer arithmetic (by meaning "everything in
the
Standard is correct, so it can't contradict, it must extend"), or is it
simply using something that yields undefined behavior, thus the para is
meaningless?
This is a false dichotomy. You're dismissing the possibility of the
identity mapping as saying something about std::vector, not pointer
arithmetic.
|
Hmm. The para does pointer arithmetic, doesn't it? The para is supposed to
give the guarantee of pointer arithmetic other pre-standard era libraries
can count on, isn't it? Unfortunately, there is 5.7/5.
But I admit I have a problem here to understand priorities. The issue is:
One para says undefined behavior, the other says certain operations are
guaranteed. Which one has priority? The safe way (undefined behavior), or
the brave (guaranteed behavior) ?
| Quote: |
vector<> is already constrained by iterator invalidation rules and
complexity guarantees so that it can only (realistically) be implemented
by having it maintain an array privately, with its [] operator returning
references to objects inside that private array.
The fun is I would bet quite a lot most vector implementations do not
hold
internally an array, they create object by object using the allocators
construct-function. No array.
But how can you construct them object-by-object in a contiguous fashion
unless you already have an array you are constructing them into?
|
By allocating raw memory using the allocator, and then creating every object
directly past (that is, sizeof(T) bytes) the previous.
The memory layout is the same as an array, but officially it is not an
array.
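To make that scheme concrete, here is a minimal sketch. It is not taken from any real vector implementation: Widget and build_contiguously are made-up names, and plain ::operator new stands in for the allocator. It constructs each object sizeof(Widget) bytes past the previous one and then checks the resulting layout. Whether the stepping `first + i` is formally covered by 5.7/5 is exactly the open question of this thread; the sketch only shows what such code typically looks like.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical element type, used only for illustration.
struct Widget {
    int id;
    explicit Widget(int i) : id(i) {}
};

// Constructs `count` Widgets back to back in raw storage, inspects the
// layout, then destroys the objects and releases the storage.
// Returns true if each object holds the expected value and sits exactly
// sizeof(Widget) bytes past its predecessor.
bool build_contiguously(std::size_t count) {
    void* raw = ::operator new(sizeof(Widget) * count);
    Widget* first = static_cast<Widget*>(raw);

    // Create every object directly past the previous one.
    // No array of Widget is ever declared or allocated with new[].
    for (std::size_t i = 0; i < count; ++i)
        new (first + i) Widget(static_cast<int>(i));

    bool ok = true;
    for (std::size_t i = 0; i < count; ++i) {
        if (first[i].id != static_cast<int>(i)) ok = false;
        if (i + 1 < count) {
            const char* a = reinterpret_cast<const char*>(first + i);
            const char* b = reinterpret_cast<const char*>(first + i + 1);
            if (b - a != static_cast<std::ptrdiff_t>(sizeof(Widget))) ok = false;
        }
    }

    // Destroy in reverse order of construction, then free the raw block.
    for (std::size_t i = count; i > 0; --i)
        first[i - 1].~Widget();
    ::operator delete(raw);
    return ok;
}
```

Calling build_contiguously(10) mirrors the arraySize = 10 case from the original post.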
| Quote: | And since 3.8/1 says that the lifetime of [an array] begins as soon as
storage with the proper size and alignment for it is obtained - there's
your array.
|
Here you have found an interesting para. Indeed, I agree, allocating memory
seems to create a char[] (or an int[] or POD[]).
But the problem remains in my opinion; when the memory is assigned to
another object T, the lifetime ends and the array is gone. It's in the same
para, 4 lines later.
| Quote: |
It can't terminate and obey the identity. Assume vec is a vector<int>
with size() > 0.
int* end = &vec[vec.size()-1] + 1;
is valid because we're incrementing a pointer past an object.
Correct.
applying the identity of 23.2.4 to &vec[], and simplifying:
int* end = &vec[0] + vec.size();
Which must therefore be valid as well.
I don't think this is automatically guaranteed - even if the para is
100%
legal and not already undefined behavior.
Well, if you assume that the paragraph is undefined behaviour, then the
first statement is undefined behaviour as well. If we assume the
paragraph is valid, then the second one must be valid as well.
|
Here I disagree. There is clearly 5.7/5. If 23.2.4/1 overrules 5.7/5
(repeating myself: IMHO it doesn't because the expression is already
undefined behavior), then in my opinion _only_ for operator+, and only for
n < vec.size(). Forget for a moment implementation stupidity and imagine one
that has fat pointers storing whether they point into a vector-"array" and
aborts on any operation other than operator+. Is that illegal? Is it illegal
if it terminates on calculating the one-past-end pointer?
Personally, I think you read too much into 23.2.4/1, but of course it's also
possible I am not reading enough into it.
| Quote: |
Remember that this example wasn't to show that the identity mapping
itself is valid, it was to show that IF it is, you also can calculate
the past-the-end pointer in the ordinary fashion. You called that into
doubt as well.
|
Yes, indeed I do. I think I have learnt not to read the C++ Standard in a
way "B is guaranteed, so from that it follows automatically C and D are
guaranteed too". This might, of course, sometimes be the case, but in this
particular case I don't think so.
| Quote: |
and the identity guarantees only apply to real objects. And they
apply only to operator+; nothing says other pointer arithmetic
expression
can be used safely. I am pretty convinced explicit relaxing of 5.7/5
would
be needed, which is not there.
So, in your view, the identity map doesn't say, for example,
&vec[0] == &vec[n] - n; with 0 <= n < vec.size() ?
|
Yes, because that violates 5.7/5. And
T* pastEnd = &vec[0] + vec.size();
is IMHO also undefined behavior.
IOW: Can you come up with a para that clearly guarantees these expressions to
be valid, or forbids an implementation choking on that when used?
| Quote: |
Anyways, please note that std::vector is only a special case. It is very
special because it is part of the Standard, but non-Standard libraries
such
as allocators etc. yield IMHO clearly undefined behavior. That is, all
the
manual memory-managements I have ever seen doing some pointer arithmetic
are
undefined behavior, unless someone can point me to a para that says
otherwise.
But this is C++, I do think it's basically heretical to suggest that the
standard library can do things that other libraries can not.
|
Well, of course it can! Standard Library Simon says, so that's the way it
is:-)
Take a look at any allocator you like that is optimized to perform better than
::operator new. Those will very probably keep internally a block of raw
memory, and assign the memory to objects on request. They do it by pointer
arithmetic, usually by adding n*sizeof(T) to the starting address of that
block of memory. Here, only 5.7/5 applies, so it is undefined behavior
(because no array exists).
I am fairly certain this was not intended (which would make it - as far as I
know - a defect, not a proposal). Fixing std::vector is more or less only a
side-effect.
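For illustration, a stripped-down sketch of the kind of allocator meant above. RawPool and slot_distance are invented names, not from any real library, and the pool never reuses slots; it only shows the "start address + n*sizeof(T)" step. The arithmetic is done on a char pointer into the block obtained from ::operator new, which is the pattern whose status under 5.7/5 is in question here.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical fixed-capacity pool that hands out raw, T-sized slots
// from one block of memory. It only allocates; it never recycles slots.
template <typename T>
class RawPool {
public:
    explicit RawPool(std::size_t capacity)
        : base_(static_cast<char*>(::operator new(sizeof(T) * capacity))),
          capacity_(capacity),
          used_(0) {}

    ~RawPool() { ::operator delete(base_); }

    RawPool(const RawPool&) = delete;
    RawPool& operator=(const RawPool&) = delete;

    // Returns the address of the next free slot, or a null pointer
    // when the pool is exhausted. The caller would placement-new a T there.
    void* allocate() {
        if (used_ == capacity_) return nullptr;
        void* slot = base_ + used_ * sizeof(T);  // start address + n*sizeof(T)
        ++used_;
        return slot;
    }

private:
    char* base_;            // start of the raw block
    std::size_t capacity_;  // number of slots in the block
    std::size_t used_;      // number of slots handed out so far
};

// Returns the byte distance between two consecutive slots of a fresh pool.
std::ptrdiff_t slot_distance() {
    RawPool<double> pool(4);
    char* a = static_cast<char*>(pool.allocate());
    char* b = static_cast<char*>(pool.allocate());
    return b - a;
}
```

A caller would typically follow allocate() with `new (slot) T(...)`, which is the same construct-into-raw-memory step as in the program from the first post.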
Thomas
|
 |
Ray Lischner Guest
|
Posted: Tue Mar 22, 2005 8:41 pm Post subject: Re: pointer arithmetic and raw memory |
|
|
On Saturday 19 March 2005 02:29 am, Thomas Mang wrote:
| Quote: | I am pretty sure the intent of the
internal layout of std::vector was to make it compatible with old
libraries dealing with pointers into arrays and performing pointer
arithmetic. But the current Standard says it is undefined behavior
doing so, because no array exists.
|
You are correct about the intent, but not about the standard. This
oversight was corrected. The current Standard, that is, ISO/IEC
14882:2003(E), says, "The elements of a vector are stored contiguously,
meaning that if v is a vector<T, Allocator> where T is some type other
than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n
< v.size()."
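As a quick sanity check of what that sentence promises, here is a small sketch. The helper name obeys_contiguity_identity is made up for this example, and it assumes T is not bool and does not overload unary operator&. It simply evaluates the quoted identity for every valid index; it says nothing about how the implementation obtained the storage.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if &v[n] == &v[0] + n holds for every 0 <= n < v.size().
// For an empty vector the loop body never runs, so v[0] is never evaluated.
template <typename T>
bool obeys_contiguity_identity(const std::vector<T>& v) {
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (&v[n] != &v[0] + n) return false;
    }
    return true;
}
```

For example, obeys_contiguity_identity(std::vector<int>(100, 7)) should hold on any implementation conforming to the 2003 wording.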
--
Ray Lischner, author of C++ in a Nutshell
http://www.tempest-sw.com/cpp