C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

trusting the compiler to optimize the passing of large objec

 
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated)
View previous topic :: View next topic  
Author Message
Walt Karas
Guest





PostPosted: Sun Dec 28, 2003 10:01 am    Post subject: trusting the compiler to optimize the passing of large objec Reply with quote



Hopefully some of you who are familiar with the internals of state-of-
the-art C++ compilers can give me some info about this issue. I'm
wondering to what extent we can rely on the compiler to optimize the
passing of large objects as function return values or as value
parameters. For example, can I depend on the compilier to generate
object code for this:

really_big_class foo(void);

as though it were this:

void foo(really_big_class &tmp);

This can be a significant issue when using templates. One example
can be seen in the STL templates (map, set) whose implementations
are based on binary search trees. It can be argued that the
"natural" implementation of an iterator through a binary tree uses
a vector of pointers to the nodes on the path from the root to the
"current" node. In the SGI implementation, however, each tree node
is augmented with a parent pointer, which means that tree iterators
require only a single pointer to the current node. Given that
iterators are passed by value throughout the STL, I would guess that
the person who did the implementation did not trust the compiler to
optimize the passing of large (tree iterator) objects by value.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Back to top
John Potter
Guest





PostPosted: Sun Dec 28, 2003 5:16 pm    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote



On 28 Dec 2003 05:01:59 -0500, [email]wkaras (AT) yahoo (DOT) com[/email] (Walt Karas) wrote:

Quote:
Hopefully some of you who are familiar with the internals of state-of-
the-art C++ compilers can give me some info about this issue. I'm
wondering to what extent we can rely on the compiler to optimize the
passing of large objects as function return values or as value
parameters. For example, can I depend on the compilier to generate
object code for this:

really_big_class foo(void);

as though it were this:

void foo(really_big_class &tmp);

No. The return value is a *value* not a reference. It must be created
within the function.

For most compilers, you can depend upon it being implemented as

void foo (void* raw_memory_aligned_for_really_big_class);

The function will do a placement new in the space supplied. That will
save a copy and may save two if used in initialization.

really_big_class rbc(foo());

The caller will pass the address of rbc to foo and foo will create the
object in that space. OTOH,

rbc = foo();

The caller will pass the address of some temporary space in which the
initialization will take place. That space may be a value parameter
to operator= or bound to a const& parameter to operator=. There will
be a temporary created and destroyed somewhere.

On the other side with value parameters, they are initialized by the
caller and are local variables for the callee. There is no way to
pass a reference. The copy is modifiable and that must not modify
the original. If you want to avoid the copy, passing by const& is
appropriate.

Inlining the function might save some copies.

Quote:
This can be a significant issue when using templates. One example
can be seen in the STL templates (map, set) whose implementations
are based on binary search trees. It can be argued that the
"natural" implementation of an iterator through a binary tree uses
a vector of pointers to the nodes on the path from the root to the
"current" node. In the SGI implementation, however, each tree node
is augmented with a parent pointer, which means that tree iterators
require only a single pointer to the current node. Given that
iterators are passed by value throughout the STL, I would guess that
the person who did the implementation did not trust the compiler to
optimize the passing of large (tree iterator) objects by value.

Iterators are assumed to be simple things like pointers which should
be passes by value. If your iterators violate the assumption, there
will be a cost. Consider istream_iterator<string> which will likely
contain a string member. For algorithms, the one pass by value to
initialize an internal copy is not that expensive compared to the
work done by the algorithm. For container members which receive an
iterator and do little work, big iterators would be expensive.

John

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Pete Becker
Guest





PostPosted: Sun Dec 28, 2003 10:49 pm    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote



Walt Karas wrote:
Quote:

It can be argued that the
"natural" implementation of an iterator through a binary tree uses
a vector of pointers to the nodes on the path from the root to the
"current" node. In the SGI implementation, however, each tree node
is augmented with a parent pointer, which means that tree iterators
require only a single pointer to the current node. Given that
iterators are passed by value throughout the STL, I would guess that
the person who did the implementation did not trust the compiler to
optimize the passing of large (tree iterator) objects by value.


The standard requires that insert not invalidate any iterators and that
erase not invalidate any iterators pointing to elements other than the
one being erased. This "natural" implementation is a horror to
implement, since the path from the root to an element can change when
other elements are removed or added, requiring changest to all iteartors
that point to affected elements.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Walt Karas
Guest





PostPosted: Mon Dec 29, 2003 11:04 am    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote

Pete Becker <petebecker (AT) acm (DOT) org> wrote

Quote:
The standard requires that insert not invalidate any iterators and that
erase not invalidate any iterators pointing to elements other than the
one being erased.
...

Your statement contradicts C++PL 3rd Ed. section 16.3.8. I can't
see how any reasonable implementation of the vector template
could avoid invalidating iterators upon insert/erase.

Preservation of iterators over erase is also not good for allocation
schemes that involve allocating blocks of space that can hold
multiple elements. Preserving iterators disallows consolidation
to free up sparsely populated blocks.

Not to discourage discussion of STL design/implementation issues,
but that wasn't my main question. I wrote an AVL tree template:

http://www.geocities.com/wkaras/gen_cpp/avl_tree.html

I decided to make the iterators incompatible with the STL precisely
because I thought they were too big to be passed by value. I want
to know if my lack of faith in the compiler to optimize for this
was justified or not.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
John Potter
Guest





PostPosted: Tue Dec 30, 2003 5:13 am    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote

On 29 Dec 2003 06:04:37 -0500, [email]wkaras (AT) yahoo (DOT) com[/email] (Walt Karas) wrote:

Quote:
Pete Becker <petebecker (AT) acm (DOT) org> wrote

The standard requires that insert not invalidate any iterators and that
erase not invalidate any iterators pointing to elements other than the
one being erased.
...

Your statement contradicts C++PL 3rd Ed. section 16.3.8. I can't
see how any reasonable implementation of the vector template
could avoid invalidating iterators upon insert/erase.

You talked about associative containers and he responded to that. The
standrad requires what he said for associatives and list.

Quote:
I wrote an AVL tree template:

http://www.geocities.com/wkaras/gen_cpp/avl_tree.html

On a quick look, I found nothing useful. If the goal was to explain an
AVL tree and have fun with low level implementation details, it seems
to have worked. If the goal was to provide something useful for
applications, the documentation is misdirected.

A map or set is useful. Consider a bidirectional map which is two
maps from key to iterator into other. It is important to not have
iterators invalidated by the next insertion.

Quote:
I decided to make the iterators incompatible with the STL precisely
because I thought they were too big to be passed by value. I want
to know if my lack of faith in the compiler to optimize for this
was justified or not.

As stated in my other post, they can't.

I would suggest that you rewrite it taking advantage of parent pointers
and compare. I have always found that insert/erase are faster with
parent pointers than otherwise. Given parent pointers, there is no
reason for large iterators. Your pointers are indexes in an
implementation detail. I don't think insertion needs to invalidate
iterators. KISS.

John

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Stephen Howe
Guest





PostPosted: Tue Dec 30, 2003 12:38 pm    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote

Quote:
Your statement contradicts C++PL 3rd Ed. section 16.3.8. I can't
see how any reasonable implementation of the vector template
could avoid invalidating iterators upon insert/erase.

His statement does not contradict the said reference.
Your original post referred to map and set not vector and Pete's response
was for iterators to those containers.
Iterators for node-based containers (map,set,multimap,multiset,list) are
valid for as long as the node exists.

Iterator invalidation differs from container to container. You need to read
the standard on that.

Quote:
Not to discourage discussion of STL design/implementation issues,
but that wasn't my main question. I wrote an AVL tree template:

http://www.geocities.com/wkaras/gen_cpp/avl_tree.html

I decided to make the iterators incompatible with the STL precisely
because I thought they were too big to be passed by value.

Then that means you give up being able to use STL algorithms.

Quote:
I want
to know if my lack of faith in the compiler to optimize for this
was justified or not.

I would rethink your decision as to why your iterators cannot be pointers
(or a class-based pointer).

AFAIK, all implementations of <set>,<map> etc use red-black trees because
only 2 bits are required to store the colour, whereas AVL trees (or
height-balanced trees of height 1) require 3 bits (for -1, 0 or 1). Further
all implementations of <set>,<map> etc have 3 pointers-per-node (left,
right, parent) and it is not because the "person who did the implementation
did not trust the compiler to optimize the passing of large (tree iterator)
objects by value" but because there are certain complexity requirements
that must be met for insert(), erase() etc. Having a parent pointer makes it
easier.

There is no real reason why you cannot use AVL trees or any other tree type
for implementing set,map etc just as long as the complexity requirements are
met.

Stephen Howe




[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Pete Becker
Guest





PostPosted: Tue Dec 30, 2003 12:39 pm    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote

Walt Karas wrote:
Quote:

Pete Becker <petebecker (AT) acm (DOT) org> wrote

The standard requires that insert not invalidate any iterators and that
erase not invalidate any iterators pointing to elements other than the
one being erased.
...

Your statement contradicts C++PL 3rd Ed. section 16.3.8. I can't
see how any reasonable implementation of the vector template
could avoid invalidating iterators upon insert/erase.

Sigh. Your message was about an iterator "through a binary tree." A
binary tree is not a vector.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Walt Karas
Guest





PostPosted: Wed Dec 31, 2003 12:59 am    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote

Quote:
I wrote an AVL tree template:

http://www.geocities.com/wkaras/gen_cpp/avl_tree.html

On a quick look, I found nothing useful. If the goal was to explain an
AVL tree and have fun with low level implementation details, it seems
to have worked. If the goal was to provide something useful for
applications, the documentation is misdirected.

I discuss this in the documentation. This template would be useful if
memory was limited and the size of the associative container needed to
be compressed. Or in a small embedded system with no heap. Or if you
wanted to avoid using the heap for performance reasons. Or if you
wanted to keep the associative container in a single array for easy
storage to/retreival from non-volatile secondary storage.

This AVL tree template, with necessary modifications, could be used
to implement the STL associative containers. What (if any) modifications
would be needed depends on the "passing large values" issue (yes,
faithful readers, I still would like some input on this) and the
"preserving iterators over insert/erase" issue.

Although I chose to do it in C rather than C++, This:

http://www.geocities.com/wkaras/heapmm/heapmm.html

is an example showing the use of an associative container that couldn't
be done with the STL. (A heap that needs a heap is not very useful.)

Quote:

A map or set is useful. Consider a bidirectional map which is two
maps from key to iterator into other. It is important to not have
iterators invalidated by the next insertion.


I'm not sure I understand, but these kinds of situations are often
handled by having the container hold pointers rather than the actual
objects.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Walt Karas
Guest





PostPosted: Wed Dec 31, 2003 1:03 am    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote

Pete Becker <petebecker (AT) acm (DOT) org> wrote

Quote:
Walt Karas wrote:

Pete Becker <petebecker (AT) acm (DOT) org> wrote

The standard requires that insert not invalidate any iterators and that
erase not invalidate any iterators pointing to elements other than the
one being erased.
...

Your statement contradicts C++PL 3rd Ed. section 16.3.8. I can't
see how any reasonable implementation of the vector template
could avoid invalidating iterators upon insert/erase.

Sigh. Your message was about an iterator "through a binary tree." A
binary tree is not a vector.

Sorry I misunderstood you.

I looked at the draft standard, and I can't find the section that says if
associative containers do or don't preserve iterator values over insert/
erase. Does anyone have the section number on the tip of their typing
fingers?

Regardless, you've convinced me that there are other advantages to
using a per-node parent pointer in a binary search tree implementation,
besides keeping iterators small. I assume you're not arguing that there
are never any circumstances where it's preferable to use a BST
implementation without parent pointers.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Jonathan Turkanis
Guest





PostPosted: Wed Dec 31, 2003 1:04 pm    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote


"Walt Karas" <wkaras (AT) yahoo (DOT) com> wrote in
Quote:

I looked at the draft standard, and I can't find the section that says if
associative containers do or don't preserve iterator values over insert/
erase. Does anyone have the section number on the tip of their typing
fingers?


Section 23.1.2[lib.associative.reqmts], par 8.

BTW, why are you using the 'draft' standard?

Jonathan





[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
John Potter
Guest





PostPosted: Wed Dec 31, 2003 1:13 pm    Post subject: Re: trusting the compiler to optimize the passing of large o Reply with quote

On 30 Dec 2003 19:59:10 -0500, [email]wkaras (AT) yahoo (DOT) com[/email] (Walt Karas) wrote:

Quote:
I wrote an AVL tree template:

http://www.geocities.com/wkaras/gen_cpp/avl_tree.html

On a quick look, I found nothing useful. If the goal was to explain an
AVL tree and have fun with low level implementation details, it seems
to have worked. If the goal was to provide something useful for
applications, the documentation is misdirected.

I discuss this in the documentation. This template would be useful if
memory was limited and the size of the associative container needed to
be compressed. Or in a small embedded system with no heap. Or if you
wanted to avoid using the heap for performance reasons. Or if you
wanted to keep the associative container in a single array for easy
storage to/retreival from non-volatile secondary storage.

Sorry. What I missed was an interface. I must have been tired or got
sidetracked. A second look shows exactly what I thought was missing.

Quote:
This AVL tree template, with necessary modifications, could be used
to implement the STL associative containers.

I have done that with an avl-tree. Slightly faster than the usual
rb-tree for my tests.

Quote:
What (if any) modifications
would be needed depends on the "passing large values" issue (yes,
faithful readers, I still would like some input on this) and the
"preserving iterators over insert/erase" issue.

I think you now know that you must have parent pointers to conform.
The requirement follows the table for associatives 23.1.2/8. The
requirement for list is in the insert/erase paragraphs.

Quote:
A map or set is useful. Consider a bidirectional map which is two
maps from key to iterator into other. It is important to not have
iterators invalidated by the next insertion.

I'm not sure I understand, but these kinds of situations are often
handled by having the container hold pointers rather than the actual
objects.

But we are thinking at a higher level. There are no pointers into
STL containers, there are iterators.

template <class K1, class K2, class Comp1 = less less<K2> >
class Bimap {
typedef map<K1, K2, Comp1> Map;
typedef typename Map::iterator iterator;
struct Comp {
bool operator() (iterator lhs, iterator rhs) {
return Comp2()(lhs->second, rhs->second);
}
};
Map data;
set<iterator, Comp> index;
// ...
};

We can't have an insertion into data destroy the index. We also
do not want a set of huge iterators.

John

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Display posts from previous:   
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated) All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.