C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Multidimensional arrays

 
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated)
View previous topic :: View next topic  
Author Message
Seungbeom Kim
Guest





PostPosted: Thu Apr 14, 2005 7:08 am    Post subject: Multidimensional arrays Reply with quote



It has been mentioned quite often that we should avoid using plain arrays
and use std::vector<> instead wherever possible.

This is fine with one-dimensional arrays, but what about multidimensional
arrays? I know that "vector of vectors" has been a quite popular answer:

- std::vector< std::vector v(nrows, std::vector<T>(ncols));

but it is far from optimal, because:

- The creation involves multiple times of dynamic allocation
- Indexing involves multiple times of dereferencing into a vector
- The shape (all rows having the same # of columns) not ensured
by the language (though it's not required at all times)

Boost.MultiArray seems somewhat better, but I feel that the syntax is not
straightforward enough for novices to use comfortably. And for some
reason that I don't know, it seems not to have been mentioned as often as
the "vector of vectors" solution. (This is not based on any statistics,
but just on how I feel, so it may be wrong.)

What would be *the* answer to the multidimensional array problem?
This should certainly be on the FAQ, I think.

--
Seungbeom Kim

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Back to top
Rob
Guest





PostPosted: Thu Apr 14, 2005 1:00 pm    Post subject: Re: Multidimensional arrays Reply with quote



Seungbeom Kim wrote:

Quote:
It has been mentioned quite often that we should avoid using plain
arrays and use std::vector<> instead wherever possible.

This is fine with one-dimensional arrays, but what about
multidimensional arrays? I know that "vector of vectors" has been a
quite popular answer:

- std::vector< std::vector v(nrows, std::vector<T>(ncols));

but it is far from optimal, because:

Depends on your criteria i.e. that is your basis for saying A is
better than B?

Quote:

- The creation involves multiple times of dynamic allocation

Sure. That is the trade-off. You either do dynamic allocation
multiple times, or you spend more effort to manage the indexing
into the arrays.

Quote:
- Indexing involves multiple times of dereferencing into a vector

That is true for any multi-dimensional array, using the standard array
syntax.

Again, if you work with a single dimensional array, but map indices
into a location in that array, you have to cost of doing that
mapping.

For example;

int array[2*3];

void change_array_element(int i, int j, int new_value)
{
// assume indices valid
array(i*3 + j) = new_value;
}

versus

int array[2][3];

void change_array_element(int i, int j, int new_value)
{

// assume indices valid
array[i][j] = new_value;
}

Which is "better" depends on your point of view.

Quote:
- The shape (all rows having the same # of columns) not ensured
by the language (though it's not required at all times)


Sure. That comes in handy, however, in cases where you don't want all
rows to have the same length.

Again, it depends on what you are trying to achieve.
Quote:

Boost.MultiArray seems somewhat better, but I feel that the syntax is
not straightforward enough for novices to use comfortably. And for
some reason that I don't know, it seems not to have been mentioned as
often as the "vector of vectors" solution. (This is not based on any
statistics, but just on how I feel, so it may be wrong.)

It is not mentioned as often, because it is non-standard (not in the C++
standard).

Like any option, it has advantages and disadvantages.

Quote:

What would be the answer to the multidimensional array problem?

Depends on how you define "the problem". There are a number of ways
of managing mult-dimensional arrays. All of them have trade-offs.
For some applications, one approach is "optimal". For other
applications, a different approach is "optimal".

Quote:
This should certainly be on the FAQ, I think.

Not without a lot more clarity about exactly what you're asking, I
suggest.....

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Oliver Kreylos
Guest





PostPosted: Fri Apr 15, 2005 7:14 am    Post subject: Re: Multidimensional arrays Reply with quote



Seungbeom Kim wrote:
[snip]
Quote:
What would be *the* answer to the multidimensional array problem?
This should certainly be on the FAQ, I think.

As Rob correctly mentions above, there is no *the* answer unless the
problem is well-specified.

That aside, from a practical standpoint, the main drawbacks of standard
C arrays are manual memory management, and manual index manipulation
when treating a 1D array as an nD array. In practice, I have been
getting good mileage out of a templatized Array class like so:

template <class ContentParam,int dimensionParam>
class Array
{
...
};

This goes together with an ArrayIndex class templatized by dimension
which you can use to overload the () or the [] operator to access array
elements. I then continue to provide specializations for common cases
(1, 2, 3, 4) to have - hopefully better - special versions of the access
methods and constructors that just take n integer arguments to create an
nD array.

This setup guarantees that all rows, planes, hyperplanes etc. have the
same dimension, that the array contents are tightly packed in memory,
and that iterating through the array in C-order is efficient. Also, I
found out that on all architectures I tried, accessing elements a la
basePtr[i * planeSize + j * rowSize + k] is faster than basePtr[i][j][k]
when the array is done dynamically and the index lookup requires going
to memory.

As always, your mileage may vary.

Oliver

--
Email: [email]kreylos (AT) cs (DOT) ucdavis.edu[/email]
WWW: http://graphics.cs.ucdavis.edu/~okreylos/ResDev


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Valentin Samko
Guest





PostPosted: Fri Apr 15, 2005 7:16 am    Post subject: Re: Multidimensional arrays Reply with quote

Seungbeom Kim wrote:
Quote:
It has been mentioned quite often that we should avoid using plain arrays
and use std::vector<> instead wherever possible.

This is fine with one-dimensional arrays, but what about multidimensional
arrays? I know that "vector of vectors" has been a quite popular answer:

- std::vector< std::vector v(nrows, std::vector<T>(ncols));

but it is far from optimal, because:

- The creation involves multiple times of dynamic allocation
- Indexing involves multiple times of dereferencing into a vector
- The shape (all rows having the same # of columns) not ensured
by the language (though it's not required at all times)

There is another major problem with this approach if you need to be able
to resize the outer std::vector (or insert/delete elements). In this
case, all the inner std::vector objects will be copied, and this can be
quite expensive.

Some time ago I solved this problem by implementing my own version of
std::vector, where the only difference in behaviour was that my version
used std::swap instead of copy constructors when moving elements in the
array (when deleting/inserting/... elements). So, my version performs
much better when constructor+swap is much cheaper than the copy
constructor, which is the case with all the STL containers.
Unfortunately I have not found any way to configure std::vector
(providing custom allocator) for such behaviour.

Valentin

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Maxim Yegorushkin
Guest





PostPosted: Fri Apr 15, 2005 9:27 am    Post subject: Re: Multidimensional arrays Reply with quote

On 15 Apr 2005 03:16:18 -0400, Valentin Samko <c++.moderated (AT) digiways (DOT) com>
wrote:

[]

Quote:
Some time ago I solved this problem by implementing my own version of
std::vector, where the only difference in behaviour was that my version
used std::swap instead of copy constructors when moving elements in the
array (when deleting/inserting/... elements). So, my version performs
much better when constructor+swap is much cheaper than the copy
constructor, which is the case with all the STL containers.
Unfortunately I have not found any way to configure std::vector
(providing custom allocator) for such behaviour.

Well, existing implementations allow you to override that, unfortunately
in a non portable way.

A hack for avoiding copying strings during vector reallocation for Dinkum
bundled with VC7.1:

namespace std
{

template<class Alloc>
inline vector<string>::pointer _Uninit_copy(
vector<string>::pointer First
, vector<string>::pointer Last
, vector<string>::pointer Dest
, Alloc& Al
, _Nonscalar_ptr_iterator_tag
)
{
for (; First != Last; ++Dest, ++First)
{
Al.construct(Dest, string());
Dest->swap(*First);
Al.destroy(First);
}
return Dest;
}

}

I'm pretty sure you can hack in the same manner other STL implementations,
but I'd like there was a standard way to customize reallocate.

--
Maxim Yegorushkin

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Seungbeom Kim
Guest





PostPosted: Sat Apr 16, 2005 12:03 am    Post subject: Re: Multidimensional arrays Reply with quote

Rob wrote:
Quote:
Seungbeom Kim wrote:

It has been mentioned quite often that we should avoid using plain
arrays and use std::vector<> instead wherever possible.

This is fine with one-dimensional arrays, but what about
multidimensional arrays? I know that "vector of vectors" has been a
quite popular answer:

- std::vector< std::vector v(nrows, std::vector<T>(ncols));

but it is far from optimal, because:

Depends on your criteria i.e. that is your basis for saying A is
better than B?

As listed below. :)

Quote:
- The creation involves multiple times of dynamic allocation

Sure. That is the trade-off. You either do dynamic allocation
multiple times, or you spend more effort to manage the indexing
into the arrays.

Not in the case of built-in arrays:

int a[2][3][4];
int (*pa)[3][4] = new int[2][3][4];

No multiple dynamic allocations, no manual index management.

Quote:
- Indexing involves multiple times of dereferencing into a vector

That is true for any multi-dimensional array, using the standard array
syntax.

Again, if you work with a single dimensional array, but map indices
into a location in that array, you have to cost of doing that
mapping.

Again, not in the case of built-in arrays. Given the declaration above,
a[i][j][k] is accessed just like *(&a[0][0][0] + i*3*4 + j*4 + k) without
multiple times of dereferencing; it's not anything like "get base, add i,
dereference, add j, dereference, add k, dereference."

Quote:
For example;

int array[2*3];

void change_array_element(int i, int j, int new_value)
{
// assume indices valid
array(i*3 + j) = new_value;
}

versus

int array[2][3];

void change_array_element(int i, int j, int new_value)
{
// assume indices valid
array[i][j] = new_value;
}

Which is "better" depends on your point of view.

I don't see in which case the former could be better than the latter. I
expect both to be compiled into the same machine codes, and the latter
looks more natural and cleaner.

Quote:
What would be the answer to the multidimensional array problem?

Depends on how you define "the problem". There are a number of ways
of managing mult-dimensional arrays. All of them have trade-offs.
For some applications, one approach is "optimal". For other
applications, a different approach is "optimal".

What I have in mind is a very typical situation like this:

int scores[MAX_STUDENTS][MAX_SUBJECTS][MAX_SESSIONS];
// wasteful; size should be determined at run-time

This is just an example, but does this clarify the problem?
Can you enumerate what trade-offs there could be?

Quote:
This should certainly be on the FAQ, I think.

Not without a lot more clarity about exactly what you're asking, I
suggest.....

I'm seeking for a general advice on how to use a multidimensional array
that is elegant in syntax, flexible in size, and efficient in run time.
If there can be many different choices depending on the circumstances, a
good FAQ should list all of them, of course.

--
Seungbeom Kim

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
kanze@gabi-soft.fr
Guest





PostPosted: Sat Apr 16, 2005 12:04 am    Post subject: Re: Multidimensional arrays Reply with quote

Rob wrote:
Quote:
Seungbeom Kim wrote:

It has been mentioned quite often that we should avoid using
plain arrays and use std::vector<> instead wherever
possible.

This is fine with one-dimensional arrays, but what about
multidimensional arrays? I know that "vector of vectors" has
been a quite popular answer:

- std::vector< std::vector v(nrows, std::vector<T>(ncols));

but it is far from optimal, because:

Depends on your criteria i.e. that is your basis for saying A
is better than B?

He explained his criteria in what followed. On the whole,
however, it's hard to see how anyone could claim std::vector was
optimal for user level two dimentional arrays. The syntax alone
makes it far from optimal.

Quote:
- The creation involves multiple times of dynamic allocation

Sure. That is the trade-off. You either do dynamic
allocation multiple times, or you spend more effort to manage
the indexing into the arrays.

With the noticeable difference: it's the implementation which
takes care of the allocations automatically, whereas if you use
one big array, the user has to take care of the index
calculations manually.

Quote:
- Indexing involves multiple times of dereferencing into a
vector

That is true for any multi-dimensional array, using the
standard array syntax.

Not really. Conceptually, the first index accesses to return an
array, but practically, the compiler calculates the final index
as if it were a single dimentional array.

Quote:
Again, if you work with a single dimensional array, but map
indices into a location in that array, you have to cost of
doing that mapping.

For example;

int array[2*3];

void change_array_element(int i, int j, int new_value)
{
// assume indices valid
array(i*3 + j) = new_value;
}

versus

int array[2][3];

void change_array_element(int i, int j, int new_value)
{

// assume indices valid
array[i][j] = new_value;
}

I've yet too see a compiler which would generate significantly
different code for these two. If there is a difference, it
would probably favor the second; the compiler has more
information (e.g. why you are multiplying, etc.), and so can
presumably do a better job. But in practice, the compile will
normally generate exactly the same code.

Quote:
Which is "better" depends on your point of view.

Quite. If you need a one dimensional array, the first is
better; if you need a two dimensional array, the second is. But
the initial posting was explicitly about multidimensional
arrays.

Quote:
- The shape (all rows having the same # of columns) not
ensured by the language (though it's not required at all
times)

Sure. That comes in handy, however, in cases where you don't
want all rows to have the same length.

Again, it depends on what you are trying to achieve.

Yes, but he spoke of multi-dimensional arrays. Which is a
particular data structure. There are certainly cases where
other data structures are more appropriate, but that's not the
question. Unless your argument is that multi-dimensional arrays
are so rare that they don't deserve standard support. (This is,
of course, a potential argument. I can't remember the last time
I used a multi-dimensional array. On the other hand, I know
I've never used a complex, but I understand the reasons why it
belongs in the standard.)

Quote:
Boost.MultiArray seems somewhat better, but I feel that the
syntax is not straightforward enough for novices to use
comfortably. And for some reason that I don't know, it seems
not to have been mentioned as often as the "vector of
vectors" solution. (This is not based on any statistics, but
just on how I feel, so it may be wrong.)

It is not mentioned as often, because it is non-standard (not
in the C++ standard).

A lot of people here do consider Boost more or less "standard".
While I wouldn't go that far, if the standard doesn't have a
solution, and Boost does, I'd certainly mention it.

Quote:
Like any option, it has advantages and disadvantages.

What would be the answer to the multidimensional array
problem?

Depends on how you define "the problem". There are a number
of ways of managing mult-dimensional arrays.

There are a number of ways of implementing multi-dimensional
arrays. From the users point of view, they should all have an
interface pretty much like a[i][j], since this is what works for
the built-in types.

Quote:
All of them have trade-offs. For some applications, one
approach is "optimal". For other applications, a different
approach is "optimal".

If we are considering multi-dimensional arrays, it's hard to
imagine a case where nested std::vectors would be "optimal".
The declaration syntax ensures that much. From what little I
understand, too, applications which do use multi-dimensional
arrays are often CPU bound -- placing the entire array in
contiguous memory (which std::vector will not do) will probably
help locality, and thus could improve performance considerably.

Quote:
This should certainly be on the FAQ, I think.

Not without a lot more clarity about exactly what you're
asking, I suggest.....

What he's asking is the C++ solution for multi-dimensional
arrays. I don't see any problem there. I rather suspect that
there is a good answer involving valarray; I think that they
were designed to support numeric applications, and I think that
that is where most of the uses of multi-dimensional arrays
occurs. But I'm not familiar enough with either valarray or
numeric applications to say for sure. What does seem obvious is
that asking people to write:

std::vector< std::vector< double > >
a( 5, std::vector< double >( 10 ) ) ;

instead of:

double a[ 5 ][ 10 ] ;

isn't an acceptable answer.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Carlos Moreno
Guest





PostPosted: Sat Apr 16, 2005 12:10 am    Post subject: Re: Multidimensional arrays Reply with quote

Valentin Samko wrote:

Quote:
There is another major problem with this approach if you need to be able
to resize the outer std::vector (or insert/delete elements). In this
case, all the inner std::vector objects will be copied, and this can be
quite expensive.

Not if you do it right:

template <typename T>
resize_outer (vector< vector & v, int new_size)
{
vector< vector tmp (new_size);

for (int i = 0; i < min(new_size, v.size()); i++)
{
v[i].swap (tmp[i]);
}

v.swap (tmp);
}


Carlos
--

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Valentin Samko
Guest





PostPosted: Sat Apr 16, 2005 12:18 am    Post subject: Re: Multidimensional arrays Reply with quote

Quote:
Some time ago I solved this problem by implementing my own version of
std::vector, where the only difference in behaviour was that my version
used std::swap instead of copy constructors when moving elements in the
array (when deleting/inserting/... elements). So, my version performs
much better when constructor+swap is much cheaper than the copy
constructor, which is the case with all the STL containers.
Unfortunately I have not found any way to configure std::vector
(providing custom allocator) for such behaviour.


Well, existing implementations allow you to override that, unfortunately
in a non portable way.

A hack for avoiding copying strings during vector reallocation for Dinkum
bundled with VC7.1:

namespace std
{

template<class Alloc
inline vector vector<string>::pointer First
, vector<string>::pointer Last
, vector<string>::pointer Dest
, Alloc& Al
, _Nonscalar_ptr_iterator_tag
)
{
for (; First != Last; ++Dest, ++First)
{
Al.construct(Dest, string());
Dest->swap(*First);
Al.destroy(First);
}
return Dest;
}

}

this is quite an interesting approach, but portability is important in
my case, and there is a chance that with some STL implementation this is
not possible. Moreover, there is no guarantee that the next version of
VC++ will allow the same hack. I really wish allocators had "move"
functions.

Btw, 23.2.4.1 mentiones constructor of value_type, and not allocator.
For instance,

"The constructor template <class InputIterator> vector(InputIterator
first, InputIterator last) makes only N calls to the copy constructor of
T (where N is the
distance between first and last) and no reallocations if iterators first
and last are of forward, bidirectional,
or random access categories."

I wonder why std::vector is not required to use allocator::construct.
Did I miss something?

Valentin

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Gerhard Wesp
Guest





PostPosted: Wed Apr 20, 2005 9:17 am    Post subject: Re: Multidimensional arrays Reply with quote

Seungbeom Kim <musiphil (AT) bawi (DOT) org> wrote:
Quote:
What would be *the* answer to the multidimensional array problem?
This should certainly be on the FAQ, I think.

I attach the description of my multidimensional array which is part of
cpp-lib, cf. http://www.cosy.sbg.ac.at/~gwesp/ . Contrary to other
designs, it supports dynamic dimensioning (before you ask: yes, cases
arise in practice where this is necessary! :-)

It's interface tries to follow the std container interface as closely as
possible. E.g., it is possible to push_back() a (d-1)-dimensional array
on top of a d-dimensional one.

Currently, it's indexed by a std::vector< std::size_t > of size d
(straightforward extension of indexing by a single std::size_t). One
could probably squeeze out a few additional performance percents in
special cases by templatizing on the index type. In the same manner, one
can add additional indexing syntax.

I'd very much appreciate any questions and comments!

Cheers
-Gerhard

===============================< snip >====================================


template< typename T > struct table ;
=====================================

A table< T > is a data type which maps a d-dimensional interval of integers
to T. It is indexed by std::vector< size_t >, typedef'ed as index_type.
Index vectors must have size() d.

Table< T > supports operations similar to std::vector<>. A table
can be empty(), which is equivalent to dimension() = 0.

A non-empty table t has a dimension() >= 1 (denoted by d ) and a size(),
which is an index_type indicating the one-past-the end index in each
dimension. The elements of t can be accessed by t[ i ] where i is an
index_type with d elements and i[ j ] < size()[ j ] for 0 <= j < d.

A table can be constructed with given maximum index values in each dimension.

Empty tables can be constructed and then populated by push_back().

If empty(), push_back( t ) for t.dimension() = d - 1 ``impresses''
dimension d on *this. size()[ d - 1 ] will be 1 and size()[ i ] will
agree with t.size()[ i ] for 0 <= i < d - 1. *this with the d-th index
equal to zero will have t as a subtable.

If empty(), push_back( t ) for a t of type T results in *this having
dimension 1, containing exactly one element equal to t.

If dimension() = d >= 2, push_back( t ) will succeed iff

t.dimension() = d - 1
t.size()[ i ] == size()[ i ] for 0 <= i < d - 1.

size()[ d - 1 ] will increase by one and *this with the d-th index equal to
size()[ d - 1 ] - 1 will have t as a subtable.

A table< T > stores its values in a std::vector< T >. It hands out
the begin() and end() iterators of this value vector. It is recommended
to use the iterators to speed up element access.

The d-dimensional structure is mapped to the value vector as follows.
The strides() function returns an index_type::const_iterator to the
beginning of a vector of length d + 1.

strides() is the ``cumulated product'' of size(), i.e.

strides()[ 0 ] = 1
strides()[ 1 ] = size()[ 0 ]
strides()[ 2 ] = size()[ 0 ] * size()[ 1 ]
....
strides()[ d ] = size()[ 0 ] * ... * size()[ d - 1 ]

strides()[ dimension() ] is the total number of elements in the table.

Let 0 <= i < dimension(). Then, for an index j in dimension i,

begin()[ j * strides()[ d ] ]

traverses dimension i with all other indices fixed.

--
Gerhard Wesp o o Tel.: +41 (0) 43 5347636
Bachtobelstrasse 56 | http://www.cosy.sbg.ac.at/~gwesp/
CH-8045 Zuerich _/ See homepage for email address!

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Display posts from previous:   
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated) All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.