 |
C++Talk.NET C++ language newsgroups
|
| View previous topic :: View next topic |
| Author |
Message |
Andrei Alexandrescu Guest
|
|
| Back to top |
|
 |
kanze@gabi-soft.fr Guest
|
Posted: Fri Jun 27, 2003 2:10 pm Post subject: Re: vector.push_front |
|
|
John Potter <jpotter (AT) falcon (DOT) lhup.edu> wrote
| Quote: | On 21 Jun 2003 15:53:00 -0400, Corey Murtagh
[email]emonk (AT) slingshot (DOT) co.nz.no.uce[/email]> wrote:
The order of an operation will tell you, in general, how it relates
to the number of items, etc. It will however tell you very little
about the actual speed of the operation. Two algorithms can be
O(n) for the same task and still have huge difference in speed of
operation.
So what? We are talking about an O(n^2) push_front to O(n) push_back.
The difference in speed between copy-by-iteration and memcpy is
significant, so long as the STL implementation uses memcpy to copy
the whole block rather than each item (which would be slower than
copy-by-iteration for std::vector<char>).
You are comparing memmove(dat + 1, dat, size) to copy_backward(dat,
dat + size, dat + size + 1). Are you sure one is better than the
other? Post your benchmark please.
|
OK. I actually compared memcpy with std::copy. Obviously, the results
are very implementation dependant, but on my machine (Sun Sparc,
Solaris, Sun CC 5.1), I get the following results for copying an array
of 100000 elements (the first values are for int's, the second for long
long's) :
memcpy (int's) 2.3 milliseconds
std::copy (int's) 1.5 milliseconds
memcpy (ll's) 4.5 milliseconds
std::copy (ll's) 2.5 milliseconds
This is about what I would expect. The instantiation of std::copy is
less general, and so should normally be faster.
I do get a curious result, however. If I double the size of the array
(200000 elements), you'd expect the times to double. This is more or
less true for the first three:
memcpy (int's) 4.8 milliseconds
std::copy (int's) 3.1 milliseconds
memcpy (ll's) 9.7 milliseconds
std::copy (ll's) 11.5 milliseconds
For some reason, std::copy becomes significantly slower when the size
(in bytes) of the array is larger than 1 MByte.
Which just goes to show that you really do have to measure, and measure
exactly what it is that interests you. Intuitively, I find it hard to
believe that std::copy could ever be slower than memcpy, but the figures
are there.
--
James Kanze GABI Software mailto:kanze (AT) gabi-soft (DOT) fr
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dhruv Guest
|
Posted: Sat Jun 28, 2003 12:30 pm Post subject: Re: vector.push_front |
|
|
On Thu, 26 Jun 2003 18:25:02 -0400, Andrei Alexandrescu wrote:
This article does make some interesting reading...... I wanted to know
something... Were the compilers operated at full optimization? (-O3 for
g++)? Also, typically, I've seen that the memcpy function in the c library
is implemented as a duff's device algorothm, so it's surprising that they
give some difference in speed.
Regards,
-Dhruv.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Andrei Alexandrescu Guest
|
Posted: Sat Jun 28, 2003 11:31 pm Post subject: Re: vector.push_front |
|
|
"Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote in message
I used -O2.
| Quote: | Also, typically, I've seen that the memcpy function in the c library
is implemented as a duff's device algorothm, so it's surprising that they
give some difference in speed.
|
Memcpy is sometimes (just a generalization of what I've seen) implemented
with the assembler instruction that copies blocks of memory. Some compilers
detect uses of memcpy and generate inline code with various implementations,
depending on the arguments.
Andrei
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
kanze@gabi-soft.fr Guest
|
Posted: Mon Jun 30, 2003 4:04 pm Post subject: Re: vector.push_front |
|
|
"Andrei Alexandrescu" <SeeWebsiteForEmail (AT) moderncppdesign (DOT) com> wrote in
message news:<bdkhf7$t8p86$1 (AT) ID-14036 (DOT) news.dfncis.de>...
| Quote: | "Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote in message
http://moderncppdesign.com/publications/cuj-10-2001.html
This article does make some interesting reading...... I wanted to
know something... Were the compilers operated at full optimization?
(-O3 for g++)?
I used -O2.
Also, typically, I've seen that the memcpy function in the c library
is implemented as a duff's device algorothm, so it's surprising that
they give some difference in speed.
Memcpy is sometimes (just a generalization of what I've seen)
implemented with the assembler instruction that copies blocks of
memory. Some compilers detect uses of memcpy and generate inline code
with various implementations, depending on the arguments.
|
Correct. When this isn't the case, memcpy has to deal with all possible
alignments. I've seen some pretty fancy code to handle this, but it
still means extra tests, etc. Whereas when the code is generated
inline, the compiler typically knows the alignment of the pointers, that
the count is a multiple of some power of 2, etc., and can do a better
job.
--
James Kanze GABI Software
mailto:kanze (AT) gabi-soft (DOT) fr
Conseils en informatique orientée objet/
http://www.gabi-soft.fr
Beratung in objektorientierter
Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45
16
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dhruv Guest
|
Posted: Wed Jul 02, 2003 5:01 pm Post subject: Re: vector.push_front |
|
|
| Quote: | Here's the std::copy() from BCB4 (RogueWave STL, which is crap):
As usual. |
| Quote: | template
OutputIterator copy(InputIterator first, InputIterator last,
OutputIterator result)
{
while (first != last)
*result++ = *first++;
return result;
}
They should not use post-increment (I have a thread on this though). |
while (first != last)
{*result = *first;
++result; ++first;}
return result;
Would be GUARANTEED to be much faster for non-PODs.
[program snipped]... moderators have a problem with the size of the
message.
Please refer to the original post for the program.
| Quote: |
Results:
9.07937e-05
3.35238e-06 //Is this correct.
Changing the order (doing the memcpy first) does not affect the
results.
So from this I infer that the std::copy()'s iteration method is
vastly
slower than memcpy... on the order of 30 times slower in fact.
If the exponent that you have shown is correct, then it's way beyond 30 |
times faster.
| Quote: | If I raise the buffer sizes to 64k, memcpy() slows down (due to cache
limits), giving an increase in speed of only 10 times or so over
std::copy().
If you really think that will make a difference in the problem being
discussed, benchmark that and post also. Push_front is so much
slower
that I did not have the patience to wait for it on a measurable
push_back size.
The point is, vector::push_front() *could* be specialised to use
memmove() and memcpy() rather than iteration copying for batch
insertions, although only for cases where it can be safely used.
Complex classes often need to be copied by their own operators to
prevent pointer duplication and so on, so obviously it isn't safe to
copy those around using memcpy(), and it often isn't safe to move them
with memmove either.
BTW, I did similar tests on my compiler+platform of memmove vs
std::copy_backward() (with a 64k buffer), and they show that memmove
is
around 25 times faster than std::copy_backward().
If you can't repeat the above, either you have a much more efficient
(ie: uses memcpy()/memmove()) implementation of std::copy() and
std::copy_backward(), or you have a very slow memcpy()/memmove().
|
The results are in front, for everyone to see.
Thanks Corey.
Regards,
-Dhruv.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dhruv Guest
|
Posted: Thu Jul 03, 2003 6:07 pm Post subject: Re: vector.push_front |
|
|
On Wed, 02 Jul 2003 22:04:37 -0400, Ron Natalie wrote:
[snip]............
| Quote: |
The iterators for a vector are usually some POD t ype (a pointer)
It's the iterator you're incrementing here not the object.
|
Yes, but the iterators for a list, say are not that trivial.
Run the following program:
#include <iostream>
#include <list>
#include <vector>
#include <nstl/timer.hpp>
using std::list;
using std::vector;
using std::cout;
using std::endl;
using nstd::timer;
template <class I, class Dest>
void bad_copy (I first, I last, Dest d)
{
while (first != last)
*d++ = *first++;
}
template <class I, class Dest>
void good_copy (I first, I last, Dest d)
{
while (first != last) {
*d = *first;
++d; ++first;
}
//*d++ = *first++;
}
int main ()
{
timer t;
list<int> l1;// (500000, 23);
vector<int> v1;
v1.resize (3000000);
for (int i = 0; i < 3000000; ++i)
l1.push_back (i);
t.start ();
bad_copy (l1.begin(), l1.end(), v1.begin());
t.stop ();
cout<<"Time for vector copy with post-increment:
"<
t.start ();
good_copy (l1.begin(), l1.end(), v1.begin());
t.stop ();
cout<<"Time for vector copy with pre-increment:
"<
}
I got this output:
Time for vector copy with post-increment: 0.37
Time for vector copy with pre-increment: 0.34
After replacing vector with list, and list with vector, I got this:
Time for list copy with post-increment: 0.44
Time for list copy with pre-increment: 0.29
This is of course without optimizations. With optimization enabled, both
give the exact same timing.
-Dhruv.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
kanze@gabi-soft.fr Guest
|
Posted: Fri Jul 04, 2003 2:23 am Post subject: Re: vector.push_front |
|
|
"Ron Natalie" <ron (AT) sensor (DOT) com> wrote
| Quote: | "Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote in message
news:pan.2003.06.27.14.22.19.784515 (AT) gmx (DOT) net...
while (first != last)
*result++ = *first++;
return result;
}
They should not use post-increment (I have a thread on this though).
while (first != last)
{*result = *first;
++result; ++first;}
return result;
Would be GUARANTEED to be much faster for non-PODs.
The iterators for a vector are usually some POD type (a pointer). It's
the iterator you're incrementing here not the object.
|
Actually, all iterators in a quality implementation will be class types.
(I know that this is the case for g++ -- I think it is also the case for
most other recent releases.) On the other hand, they are extremely
simple classes, with all functions completely inline. In the actual
tests I did, I was unable to measure any differences with the g++
(3.2.2) implementations, when optimization was at -O3.
--
James Kanze GABI Software mailto:kanze (AT) gabi-soft (DOT) fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
kanze@gabi-soft.fr Guest
|
Posted: Fri Jul 04, 2003 3:02 am Post subject: Re: vector.push_front |
|
|
"Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote
| Quote: | Here's the std::copy() from BCB4 (RogueWave STL, which is crap):
As usual.
template
OutputIterator copy(InputIterator first, InputIterator last,
OutputIterator result)
{
while (first != last)
*result++ = *first++;
return result;
}
They should not use post-increment (I have a thread on this though).
while (first != last)
{*result = *first;
++result; ++first;}
return result;
Would be GUARANTEED to be much faster for non-PODs.
|
Guaranteed? And how much do you pay me if they aren't.
I did some measures some time back, and found no difference between
pre-increment and post-increment for the g++ libraries. In general,
enough of the iterator code is inline (the iterators are templates,
after all, so the implementation must be visible anyway) that the
compiler was able to suppress the extra copy when it wasn't used.
On the other hand, it is possible (or even probable, for some
containers) that:
*result = *first ;
result ++ ;
first ++ ;
is faster than the original version. After all, the compiler can't
suppress the copy if it is used. In this case, your version will also
be faster.
--
James Kanze GABI Software mailto:kanze (AT) gabi-soft (DOT) fr
Conseils en informatique orientée objet/ http://www.gabi-soft.fr
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France, Tél. : +33 (0)1 30 23 45 16
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Ron Natalie Guest
|
Posted: Fri Jul 04, 2003 3:42 am Post subject: Re: vector.push_front |
|
|
"Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote
| Quote: | This is of course without optimizations. With optimization enabled, both
give the exact same timing.
|
There is no point in doing any timing or discussing efficeincy when you have inlining
disabled for the standard library. The standard library in most implementations makes
heavy use of lots of tiny inline functions that no sane person would write that way if they
didn't expect the overhead to get inlined away.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|