C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 

My Quick memcpy

 
young
Guest





Posted: Tue May 01, 2007 9:11 am    Post subject: My Quick memcpy



typedef unsigned char ylib_byte_t;
typedef unsigned int ylib_word_t;

enum YOUNG_LIBRARY_MEMORY_FUNCTION_CONSTANT
{
    OUTSPREAD = 16,

    RESIDUE = OUTSPREAD - 1
};

#define CYCLE_OUTSPREAD( expression ) \
    expression; expression; expression; expression; \
    expression; expression; expression; expression; \
    expression; expression; expression; expression; \
    expression; expression; expression; expression;

void* ylib_memcopy( void* dst, const void* src, size_t count )
{
    ylib_byte_t *d, *s;
    ylib_word_t* dstword = (ylib_word_t*)dst;
    const ylib_word_t* srcword = (const ylib_word_t*)src;
    size_t tmp, word_count = count / sizeof(ylib_word_t);
    size_t word_outspread = word_count & ~((size_t)RESIDUE);

    for( tmp = 0; tmp < word_outspread; tmp += OUTSPREAD )
    {
        CYCLE_OUTSPREAD( *dstword++ = *srcword++ )
    }

    for( ; tmp < word_count; ++tmp )
        *dstword++ = *srcword++;

    d = (ylib_byte_t*)dstword;
    s = (ylib_byte_t*)srcword;
    for( tmp *= sizeof(ylib_word_t); tmp < count; ++tmp )
        *d++ = *s++;

    return dst;
}

void test_memcpy(void)
{
    double sec;
    clock_t begin, end;
    int i, dst[10000];
    int src[10000];

    begin = clock();
    for( i = 0; i < 100000; ++i )
        ylib_memcopy( dst, src, sizeof(int) * 10000 );
    end = clock();
    sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;
    printf( "\nylib_memcopy time = %f\n", sec );

    begin = clock();
    for( i = 0; i < 100000; ++i )
        memcpy( dst, src, sizeof(int) * 10000 );
    end = clock();
    sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;
    printf( "\nmemcpy time = %f\n", sec );
}
--
comp.lang.c.moderated - moderation address: clcm (AT) plethora (DOT) net -- you must
have an appropriate newsgroups line in your header for your mail to be seen,
or the newsgroup name in square brackets in the subject line. Sorry.
Jonathan Leffler
Guest





Posted: Mon May 07, 2007 9:12 am    Post subject: Re: My Quick memcpy



young wrote:
Quote:
typedef unsigned char ylib_byte_t;
typedef unsigned int ylib_word_t;

enum YOUNG_LIBRARY_MEMORY_FUNCTION_CONSTANT
{
OUTSPREAD = 16,
RESIDUE = OUTSPREAD - 1
};

#define CYCLE_OUTSPREAD( expression ) \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression;

void* ylib_memcopy( void* dst, const void* src, size_t count )
[...]

Doesn't it perform rather horribly if your src or dst are not
sufficiently well aligned?

For example, suppose src is aligned such that (src % 4) == 1 and dst
such that (dst % 4) == 3, assuming 4-byte integers. Depending on the
architecture, that could mean a crash (misaligned memory access) or poor
performance (while the CPU fetches two ints, creates one int from the
appropriate sub-sections of the two, and then fetches, assigns and
writes to two ints).
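The alignment concern above can be illustrated with a minimal sketch. It is not the original poster's code and not how any particular library memcpy works; the names word_t, is_word_aligned, aligned_copy and aligned_copy_selfcheck are hypothetical. It assumes that converting a pointer to uintptr_t yields the address (true on common flat-address platforms, but implementation-defined), and that the buffers may legitimately be accessed as unsigned int. It copies bytes until dst reaches a word boundary and uses word copies only when src is then aligned as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef unsigned int word_t;   /* same word type as ylib_word_t in the post */

/* Hypothetical helper: nonzero if p sits on a word_t boundary.
   Assumes the uintptr_t conversion yields the address. */
static int is_word_aligned(const void *p)
{
    return (uintptr_t)p % sizeof(word_t) == 0;
}

/* Sketch: byte-copy until dst is aligned, then copy whole words only
   if src is aligned too; otherwise finish with plain byte copies. */
void *aligned_copy(void *dst, const void *src, size_t count)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(count == 0 || (dst != NULL && src != NULL));

    while (count > 0 && !is_word_aligned(d)) {       /* leading bytes */
        *d++ = *s++;
        --count;
    }

    if (is_word_aligned(d) && is_word_aligned(s)) {  /* both aligned */
        word_t *dw = (word_t *)d;
        const word_t *sw = (const word_t *)s;
        for (; count >= sizeof(word_t); count -= sizeof(word_t))
            *dw++ = *sw++;
        d = (unsigned char *)dw;
        s = (const unsigned char *)sw;
    }

    for (; count > 0; --count)                       /* tail or unaligned src */
        *d++ = *s++;

    return dst;
}

/* Compare against memcpy for every pair of byte offsets 0..3.
   Returns 0 if all copies match, 1 otherwise. */
int aligned_copy_selfcheck(void)
{
    enum { WORDS = 64 };
    word_t src[WORDS], a[WORDS], b[WORDS];
    unsigned char *sp = (unsigned char *)src;
    size_t i, so, dofs, n = sizeof src - 8;

    for (i = 0; i < sizeof src; ++i)
        sp[i] = (unsigned char)(i * 7u + 1u);

    for (so = 0; so < 4; ++so)
        for (dofs = 0; dofs < 4; ++dofs) {
            memset(a, 0, sizeof a);
            memset(b, 0, sizeof b);
            aligned_copy((unsigned char *)a + dofs, sp + so, n);
            memcpy((unsigned char *)b + dofs, sp + so, n);
            if (memcmp(a, b, sizeof a) != 0)
                return 1;
        }
    return 0;
}
```

When src and dst have different residues modulo the word size, this sketch never reaches the word loop, so it is safe but no faster than a byte copy; production implementations typically handle that case with shifts or unaligned load instructions.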

What performance gain do you get with copies of smaller quantities?

Is the memcpy() on your compiler an inline assembler function, or a
plain old C function?

On my old 1 GHz G4 Mac (running MacOS X 10.4.9 and GCC 4.0.1), the
timings I got were:

ylib_memcopy time = 3.840000

memcpy time = 1.570000

ylib_memcopy time = 7.570000

memcpy time = 1.670000

The first pair of timings is from your test, verbatim; the second pair
is from a test using the misaligned code:

char dst[10000 * sizeof(int) + 8];
char src[10000 * sizeof(int) + 8];
...
for( i = 0; i < 100000; ++i )
    ylib_memcopy( dst + 1, src + 3, sizeof(int) * 10000 );
...
for( i = 0; i < 100000; ++i )
    memcpy( dst + 1, src + 3, sizeof(int) * 10000 );

The '+ 8' leaves space for trampling; it copies the same amount of data
as your test.

The system (built-in) memcpy() will almost inevitably outperform
anything you write in C because, most usually, it is not written in C.

--
Jonathan Leffler #include <disclaimer.h>
Email: jleffler (AT) earthlink (DOT) net, jleffler (AT) us (DOT) ibm.com
Guardian of DBD::Informix v2007.0226 -- http://dbi.perl.org/
Roberto Waltman
Guest





Posted: Mon May 07, 2007 9:12 am    Post subject: Re: My Quick memcpy



young <younghuan1980 (AT) gmail (DOT) com> wrote:

Quote:
typedef unsigned char ylib_byte_t;
typedef unsigned int ylib_word_t;

enum YOUNG_LIBRARY_MEMORY_FUNCTION_CONSTANT
{
OUTSPREAD = 16,

RESIDUE = OUTSPREAD - 1
};

#define CYCLE_OUTSPREAD( expression ) \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression;

void* ylib_memcopy( void* dst, const void* src, size_t count )
{
..
}


... assert(count > 16);


Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Jonas
Guest





Posted: Mon May 07, 2007 9:12 am    Post subject: Re: My Quick memcpy

"young" <younghuan1980 (AT) gmail (DOT) com> wrote in message
news:clcm-20070501-0005 (AT) plethora (DOT) net...
Quote:
typedef unsigned char ylib_byte_t;
typedef unsigned int ylib_word_t;

enum YOUNG_LIBRARY_MEMORY_FUNCTION_CONSTANT
{
OUTSPREAD = 16,

RESIDUE = OUTSPREAD - 1
};

#define CYCLE_OUTSPREAD( expression ) \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression;

void* ylib_memcopy( void* dst, const void* src, size_t count )
{
ylib_byte_t *d, *s;
ylib_word_t* dstword = (ylib_word_t*)dst;
const ylib_word_t* srcword = (const ylib_word_t*)src;
size_t tmp, word_count = count / sizeof(ylib_word_t);
size_t word_outspread = word_count & ~((size_t)RESIDUE);

for( tmp = 0; tmp < word_outspread; tmp += OUTSPREAD )
{
CYCLE_OUTSPREAD( *dstword++ = *srcword++ )
}

for( ; tmp < word_count; ++tmp )
*dstword++ = *srcword++;

d = (ylib_byte_t*)dstword;
s = (ylib_byte_t*)srcword;
for( tmp *= sizeof(ylib_word_t); tmp < count; ++tmp )
*d++ = *s++;

return dst;
}

void test_memcpy(void)
{
double sec;
clock_t begin, end;
int i, dst[10000];
int src[10000];

begin = clock();
for( i = 0; i < 100000; ++i )
ylib_memcopy( dst, src, sizeof(int) * 10000 );
end = clock();
sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;
printf( "\nylib_memcopy time = %f\n", sec );

begin = clock();
for( i = 0; i < 100000; ++i )
memcpy( dst, src, sizeof(int) * 10000 );
end = clock();
sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;
printf( "\nmemcpy time = %f\n", sec );
}

On my system (after including headers necessary to make your code compile):

ylib_memcopy time = 1.328000

memcpy time = 0.672000


In general, trying to improve on a standard library function isn't such a
great idea. You are attempting to create a general solution that improves on
one that is most probably optimized for the system you are compiling for. In
the case of memcpy, optimization may for example include using SIMD
instructions where available.

--
Jonas
WillerZ
Guest





Posted: Mon May 07, 2007 9:12 am    Post subject: Re: My Quick memcpy

If you want to challenge both algorithms, that's a bad test. See below.

young wrote:
Quote:
void test_memcpy(void)
{
double sec;
clock_t begin, end;
int i, dst[10000];
int src[10000];

begin = clock();
for( i = 0; i < 100000; ++i )
ylib_memcopy( dst, src, sizeof(int) * 10000 );

Change this to:

ylib_memcopy( ((char *)&dst[0]) + 1, ((char *)&src[0]) + 1,
              sizeof(int) * 10000 - 1 );

Quote:
end = clock();
sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;
printf( "\nylib_memcopy time = %f\n", sec );

begin = clock();
for( i = 0; i < 100000; ++i )
memcpy( dst, src, sizeof(int) * 10000 );

And this to:

memcpy( ((char *)&dst[0]) + 1, ((char *)&src[0]) + 1,
        sizeof(int) * 10000 - 1 );

Quote:
end = clock();
sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;
printf( "\nmemcpy time = %f\n", sec );
}
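A test along these lines can be sketched as a small harness that times any copy routine at chosen byte offsets. This is only an illustration: the names copy_fn, time_copy, ELEMS and SLACK are hypothetical, and the buffer size simply mirrors the 10000-int arrays in the original test. The harness fills the source buffer before timing, so no indeterminate values are read, and it verifies the result afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Any routine with memcpy's signature can be plugged in. */
typedef void *(*copy_fn)(void *, const void *, size_t);

enum { ELEMS = 10000, SLACK = 8 };   /* SLACK leaves room for the offsets */

static unsigned char src_buf[ELEMS * sizeof(int) + SLACK];
static unsigned char dst_buf[ELEMS * sizeof(int) + SLACK];

/* Time `reps` copies of ELEMS ints with the given byte offsets applied
   to dst and src. Returns elapsed CPU seconds, or a negative value if
   the final copy did not reproduce the source bytes. */
double time_copy(copy_fn fn, size_t dst_off, size_t src_off, long reps)
{
    size_t i, n = ELEMS * sizeof(int);
    clock_t begin, end;
    long r;

    assert(dst_off <= SLACK && src_off <= SLACK);

    /* Give the source defined contents before it is read. */
    for (i = 0; i < sizeof src_buf; ++i)
        src_buf[i] = (unsigned char)i;
    memset(dst_buf, 0, sizeof dst_buf);

    begin = clock();
    for (r = 0; r < reps; ++r)
        fn(dst_buf + dst_off, src_buf + src_off, n);
    end = clock();

    if (memcmp(dst_buf + dst_off, src_buf + src_off, n) != 0)
        return -1.0;

    return (double)(end - begin) / CLOCKS_PER_SEC;
}
```

A call such as time_copy(memcpy, 1, 1, 100000) exercises equal offsets, and time_copy(memcpy, 1, 3, 100000) exercises differing ones. Passing ylib_memcopy with a nonzero offset is only safe on machines that tolerate misaligned word access. An optimizing compiler may still simplify repeated identical copies, so the generated code is worth inspecting if the numbers look implausibly small.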
Barry Schwarz
Guest





Posted: Mon May 07, 2007 9:12 am    Post subject: Re: My Quick memcpy

On 01 May 2007 07:38:48 GMT, young <younghuan1980 (AT) gmail (DOT) com> wrote:

Quote:
typedef unsigned char ylib_byte_t;
typedef unsigned int ylib_word_t;

enum YOUNG_LIBRARY_MEMORY_FUNCTION_CONSTANT
{
OUTSPREAD = 16,

RESIDUE = OUTSPREAD - 1
};

#define CYCLE_OUTSPREAD( expression ) \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression; \
expression; expression; expression; expression;

void* ylib_memcopy( void* dst, const void* src, size_t count )
{
ylib_byte_t *d, *s;
ylib_word_t* dstword = (ylib_word_t*)dst;

Have you done something to ensure that the address in dst is properly
aligned to point to an int?

Quote:
const ylib_word_t* srcword = (const ylib_word_t*)src;

The cast is unnecessary in both preceding statements.

Quote:
size_t tmp, word_count = count / sizeof(ylib_word_t);
size_t word_outspread = word_count & ~((size_t)RESIDUE);

for( tmp = 0; tmp < word_outspread; tmp += OUTSPREAD )
{
CYCLE_OUTSPREAD( *dstword++ = *srcword++ )
}

for( ; tmp < word_count; ++tmp )
*dstword++ = *srcword++;

d = (ylib_byte_t*)dstword;
s = (ylib_byte_t*)srcword;
for( tmp *= sizeof(ylib_word_t); tmp < count; ++tmp )
*d++ = *s++;

return dst;
}

void test_memcpy(void)
{
double sec;
clock_t begin, end;
int i, dst[10000];
int src[10000];

None of the elements of either array have been assigned values. They
are all indeterminate. Each of the last two for loops in ylib_memcopy
will invoke undefined behavior.

Quote:

begin = clock();
for( i = 0; i < 100000; ++i )
ylib_memcopy( dst, src, sizeof(int) * 10000 );
end = clock();
sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;

One of the casts is superfluous.

Quote:
printf( "\nylib_memcopy time = %f\n", sec );


Why print it as a float instead of a double?

Quote:

begin = clock();
for( i = 0; i < 100000; ++i )
memcpy( dst, src, sizeof(int) * 10000 );
end = clock();
sec = (double)(end - begin) / (double)CLOCKS_PER_SEC;
printf( "\nmemcpy time = %f\n", sec );
}

Did you have a reason for posting the code, such as a question or
concern?
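On the question of printing "as a float", it may help to recall that in standard C the %f conversion of printf takes a double argument, and that a float passed through a variable argument list is promoted to double first. The small check below illustrates this; the function name percent_f_demo is hypothetical, and the expected text assumes the default "C" locale.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if "%f" formats a double and a float of the same value
   identically, which follows from the default argument promotions. */
int percent_f_demo(void)
{
    char from_double[32], from_float[32];
    double d = 1.5;
    float f = 1.5f;
    int n;

    n = snprintf(from_double, sizeof from_double, "%f", d);
    assert(n > 0 && (size_t)n < sizeof from_double);

    /* f is promoted to double before snprintf ever sees it. */
    n = snprintf(from_float, sizeof from_float, "%f", f);
    assert(n > 0 && (size_t)n < sizeof from_float);

    return strcmp(from_double, from_float) == 0
        && strcmp(from_double, "1.500000") == 0;
}
```

Under that reading, the printf calls in the quoted test, which pass the double variable sec to a %f conversion, match their format string.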


Remove del for email