C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Calculating safe alignment for any type

 
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated)
View previous topic :: View next topic  
Author Message
Tom
Guest





PostPosted: Sun Nov 28, 2004 6:08 pm    Post subject: Calculating safe alignment for any type Reply with quote



I recently finished reading Herb Sutter's Exceptional C++ style. In the
book he mentions again the long standing problem of calculating alignment
for a type T.

In this post I will (hopefully) show how to calculate safe alignment for any
type T.

First a few definitions:
I will refer to arbitrary addresses as a(n) (e.g. a(1), a(2)) and to
addresses of specific data as a(<name>).
(e.g. a(t) == &t)

Unknown constants, n(<name>), where n(<name>) >= 0

Alignment of T is X(T), s.t. a(t) % X(T) = 0, or X(T) * n(t) = a(t), for
some n(t).

Now to the solution:

Consider the following struct:

template<class T>
struct A
{
char c;
T t;
};

The "c" member is in the stuct to ensure that sizeof(A<T>) > sizeof(T).

Now consider an array of A<T> in memory starting at address a(A).
We can say that

a(t) = a(A) + spaceBefore(c) + sizeof(char) + spaceAfter(c) //1

but because A<T>::t needs to be aligned on a T boundary, we have

padBefore(c) + sizeof(char) + padAfter(c) = X(T) * n(0)

Rewrite 1 as

a(t) = a(A) + X(T) * n(0) //2

Quote:
From 2, we can see that A<T> has to be aligned on a multiple of X(T), since

a(A) = a(t) - X(T)*n(0)

but a(t) = X(T) * n(t)

so a(A) = X(T) * n(t) - X(T) * n(0) = X(T) * ( n(t) - n(0) )

=> a(A) % X(T) = 0 //3

We can also say that

a(t) + sizeof(T) + spaceAfter(t) = a(next A in memory)

but since a must be aligned on a multiple of X(T),

padAfter(t) = X(T) * n(1)

so we get

a(t) + sizeof(T) + X(T)*n(1) = a(next A in memory) //4

Putting 2) and 4) together

a(A) + X(T) *n(0) + sizeof(T) + X(T)*n(1) = a(next A in memory)

But by definition of "next in memory" we also know that

a(next A in memory) = a(A) + sizeof(A)

So

a(A) + X(T) * n(0) + sizeof(T) + X(T) * n(1) = a(A) + sizeof(A)

Rewrite and simplify:

X(T) * (n(0) + n(1)) = sizeof(A) - sizeof(T) //5

Q.E.D.

So 5) above implies that sizeof(A<T>) - sizeof(T) is a always a safe
alignment for T (i.e. is a multiple of X(T) and
by construction sizeof(A<T>) > sizeof(T).
Also, note that n(0) is the multiplier of X(T) bytes in the struct before t
and n(1) is the number after.
It's easy to see that any decent compiler would make n(0) = 1 and n(1) = 0,
so that
sizeof(A<T>) - sizeof(T) = X(T), i.e. the exact alignment.

Comments?

Tom



[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
M Jared Finder
Guest





PostPosted: Mon Nov 29, 2004 10:43 am    Post subject: Re: Calculating safe alignment for any type Reply with quote



Tom wrote:
Quote:
I recently finished reading Herb Sutter's Exceptional C++ style. In the
book he mentions again the long standing problem of calculating alignment
for a type T.

In this post I will (hopefully) show how to calculate safe alignment for any
type T.

First a few definitions:
I will refer to arbitrary addresses as a(n) (e.g. a(1), a(2)) and to
addresses of specific data as a(<name>).
(e.g. a(t) == &t)

Unknown constants, n(<name>), where n(<name>) >= 0

Alignment of T is X(T), s.t. a(t) % X(T) = 0, or X(T) * n(t) = a(t), for
some n(t).

Now to the solution:

Consider the following struct:

template<class T
struct A
{
char c;
T t;
};

The "c" member is in the stuct to ensure that sizeof(A sizeof(T).

Now consider an array of A<T> in memory starting at address a(A).
We can say that

a(t) = a(A) + spaceBefore(c) + sizeof(char) + spaceAfter(c) //1

spaceBefore(c) = 0, because of struct layout requirements.

Quote:
but because A<T>::t needs to be aligned on a T boundary, we have

padBefore(c) + sizeof(char) + padAfter(c) = X(T) * n(0)

Are padBefore(c) and padAfter(c) the same as spaceBefore(c) and
spaceAfter(c), respectively? If so, then how can you state that the
offset into the struct must be aligned to T, without previously stating
that the whole struct must be aligned to T? I'm not saying that a
struct could have alignment weaker than any one of its members (it
can't), but you should state that first.

Quote:
Rewrite 1 as

a(t) = a(A) + X(T) * n(0) //2

From 2, we can see that A<T> has to be aligned on a multiple of X(T), since

a(A) = a(t) - X(T)*n(0)

but a(t) = X(T) * n(t)

so a(A) = X(T) * n(t) - X(T) * n(0) = X(T) * ( n(t) - n(0) )

=> a(A) % X(T) = 0 //3

We can also say that

a(t) + sizeof(T) + spaceAfter(t) = a(next A in memory)

but since a must be aligned on a multiple of X(T),

padAfter(t) = X(T) * n(1)

so we get

a(t) + sizeof(T) + X(T)*n(1) = a(next A in memory) //4

Putting 2) and 4) together

a(A) + X(T) *n(0) + sizeof(T) + X(T)*n(1) = a(next A in memory)

But by definition of "next in memory" we also know that

a(next A in memory) = a(A) + sizeof(A)

So

a(A) + X(T) * n(0) + sizeof(T) + X(T) * n(1) = a(A) + sizeof(A)

Rewrite and simplify:

X(T) * (n(0) + n(1)) = sizeof(A) - sizeof(T) //5

Q.E.D.

So 5) above implies that sizeof(A<T>) - sizeof(T) is a always a safe
alignment for T (i.e. is a multiple of X(T) and
by construction sizeof(A<T>) > sizeof(T).
Also, note that n(0) is the multiplier of X(T) bytes in the struct before t
and n(1) is the number after.
It's easy to see that any decent compiler would make n(0) = 1 and n(1) = 0,
so that
sizeof(A<T>) - sizeof(T) = X(T), i.e. the exact alignment.

When is it useful to know the alignment of a type T as a value less than
sizeof(T) and some platform specific maximum alignment? Surely I am
just overlooking something simple.

-- MJF

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Vladimir Marko
Guest





PostPosted: Mon Nov 29, 2004 10:56 am    Post subject: Re: Calculating safe alignment for any type Reply with quote



Tom <NoSpam (AT) ThankYouVeryMuch (DOT) com> wrote

[snip]
Quote:
Now consider an array of A<T> in memory starting at address a(A).
We can say that

a(t) = a(A) + spaceBefore(c) + sizeof(char) + spaceAfter(c) //1

but because A<T>::t needs to be aligned on a T boundary, we have

padBefore(c) + sizeof(char) + padAfter(c) = X(T) * n(0)

You silently assumed that A is aligned on a T boundary here.

Quote:
Rewrite 1 as

a(t) = a(A) + X(T) * n(0) //2

From 2, we can see that A<T> has to be aligned on a multiple of X(T), since

a(A) = a(t) - X(T)*n(0)

but a(t) = X(T) * n(t)

so a(A) = X(T) * n(t) - X(T) * n(0) = X(T) * ( n(t) - n(0) )

=> a(A) % X(T) = 0 //3

And here you have successfuly proved the assumption used.

I've seen a lot of pseudoproofs and using P to prove P is quite
common. It's really sad that the vast majority of people can't
distinguish a proof and a pseudoproof.

A correct proof that sizeof(A<T>) is a multiple of the alignment
of T is as follows:

Given the definitions
addr(s) := reinterpret_cast<char*>(&s)
and
A<T> a[2];
we know that addr(a[0].t) and addr(a[1].t) are suitably aligned
for the type T,
addr(a[0].t)-addr(a[0]) == addr(a[1].t)-addr(a[1])
and
addr(a[1])-addr(a[0]) == sizeof(A<T>) .
Thus,
sizeof(A<T>) == addr(a[1].t)-addr(a[0].t) ,
i.e. sizeof(A<T>) is a multiple of the alignment of T Q.E.D.

AFAICT a "decent compiler" is not covered by the standard and
your final result
X(T) == sizeof(A<T>)-sizeof(T)
really depends on it.

Regards,
Vladimir Marko

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Ben Hutchings
Guest





PostPosted: Mon Nov 29, 2004 9:44 pm    Post subject: Re: Calculating safe alignment for any type Reply with quote

Vladimir Marko wrote:
<snip>
Quote:
A correct proof that sizeof(A<T>) is a multiple of the alignment
of T is as follows:

Given the definitions
addr(s) := reinterpret_cast<char*>(&s)
and
A<T> a[2];
we know that addr(a[0].t) and addr(a[1].t) are suitably aligned
for the type T,
addr(a[0].t)-addr(a[0]) == addr(a[1].t)-addr(a[1])
and
addr(a[1])-addr(a[0]) == sizeof(A<T>) .
Thus,
sizeof(A<T>) == addr(a[1].t)-addr(a[0].t) ,
i.e. sizeof(A<T>) is a multiple of the alignment of T Q.E.D.

Similarly, so is sizeof(T).

Quote:
AFAICT a "decent compiler" is not covered by the standard and
your final result
X(T) == sizeof(A<T>)-sizeof(T)
really depends on it.
snip


We do know that sizeof(A<T>) - sizeof(T) is a multiple of the
alignment (since both terms are multiples) and we know that it's
non-zero because members can't overlap. So it seems to be a safe
value to use for alignment, even though it could conceivably be
greater than the true value.

--
Ben Hutchings
Reality is just a crutch for people who can't handle science fiction.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
David Abrahams
Guest





PostPosted: Mon Nov 29, 2004 9:48 pm    Post subject: Re: Calculating safe alignment for any type Reply with quote

Vladimir Marko wrote:

Quote:
A correct proof that sizeof(A<T>) is a multiple of the alignment
of T is as follows:

Given the definitions
addr(s) := reinterpret_cast<char*>(&s)
and
A<T> a[2];
we know that addr(a[0].t) and addr(a[1].t) are suitably aligned
for the type T,
addr(a[0].t)-addr(a[0]) == addr(a[1].t)-addr(a[1])

Actually nothing in the standard supports that assumption unless T is
POD. An implementation could build structs as arrays of pointers to
memory that's allocated elsewhere, for example. So addr(a[0]) and
addr(a[0].t) are allowed to have no relationship at all, and subtracting
them could lead to undefined behavior.

Too bad, IMO, because it's a portable assumption in practice and without
it, some very useful programs have undefined behavior.

Quote:
and
addr(a[1])-addr(a[0]) == sizeof(A<T>) .
Thus,
sizeof(A<T>) == addr(a[1].t)-addr(a[0].t) ,
i.e. sizeof(A<T>) is a multiple of the alignment of T Q.E.D.

Right.

But, stepping back a moment, what does this tell you? It's not very
interesting to have this information, because sizeof(T) is also a
multiple of the alignment of T. And it very well might be the same as
sizeof(A<T>). With that information you can't even reliably use GCD to
find what you're really interested in, which is probably the least
multiple of the alignment of T you can find.

Anyone interested in a _practical_ algorithm for finding alignment might
look at the implementation of boost::alignment_of in the type traits
library.

--
Dave Abrahams
Boost Consulting
http://www.boost-consulting.com

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Tom
Guest





PostPosted: Tue Nov 30, 2004 11:06 am    Post subject: Re: Calculating safe alignment for any type Reply with quote

Quote:
Anyone interested in a _practical_ algorithm for finding alignment might
look at the implementation of boost::alignment_of in the type traits
library.

I see that my initial post is very similar to the code in boost.
Damn. One day I will have an idea someone else hasn't already thought of,
but it's not going to be this one.

Tom



[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Tom
Guest





PostPosted: Tue Nov 30, 2004 11:06 am    Post subject: Re: Calculating safe alignment for any type Reply with quote

<snip>
Quote:
spaceBefore(c) = 0, because of struct layout requirements.

No, this is only guaranteed by the standard for POD types.
T could be a non-POD, which would also make A<T> non-POD...

Quote:
but because A<T>::t needs to be aligned on a T boundary, we have

padBefore(c) + sizeof(char) + padAfter(c) = X(T) * n(0)

Are padBefore(c) and padAfter(c) the same as spaceBefore(c) and
spaceAfter(c), respectively?

Yes, sorry.
I tried to rename them from padAfter to spaceAfter, because in non-POD types
there may be more than padding in the class - e.g. the Vfn table pointer.

Quote:
If so, then how can you state that the
offset into the struct must be aligned to T, without previously stating
that the whole struct must be aligned to T? I'm not saying that a
struct could have alignment weaker than any one of its members (it
can't), but you should state that first.

Yes, I made a logical error in the argument, as pointed out by another
poster.
I shot myself in the foot while trying to write down the explanation of the
idea.
However, he proposed a way to fix it and hence the rest of the argument
still stands.

Quote:
When is it useful to know the alignment of a type T as a value less than
sizeof(T) and some platform specific maximum alignment? Surely I am
just overlooking something simple.

I was trying to show how to calculate this value portably.
Also, on all compilers I've tried the result of the calculation is the
actual alignment, because if
the compiler doesn't do anything stupid, like adding unnecessary padding
(which is what my definition of "decent compiler" was) the value is exact.

Tom



[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Tom
Guest





PostPosted: Tue Nov 30, 2004 11:07 am    Post subject: Re: Calculating safe alignment for any type Reply with quote

<snip correction>

Thanks for pointing that out.
The error was an oversight while I was trying to write things down into the
email.
As have shown yourself, it's still possible to show that X(A) = n * X(T).

Quote:
AFAICT a "decent compiler" is not covered by the standard and
your final result
X(T) == sizeof(A<T>)-sizeof(T)
really depends on it.

The final result is

X(T) * (n(0) + n(1)) = sizeof(A) - sizeof(T)

Which is still a safe alignment for X, and is as much as I claimed to
calculate.

There are some guarantees regarding n(0) for POD types, but not for user
defined types.

Now my definition of a "decent compiler" is one that doesn't add unnecessary
padding.

So for example if you assume 32 bit ints and a 1 byte char,

struct foo
{
char c;

int i;
};

On a "decent compiler" sizeof(foo) would be 8, with n(0) being 1 and n(1)
being 0.

Thanks for the feedback,

Tom



[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Terje Slettebų
Guest





PostPosted: Tue Nov 30, 2004 11:08 am    Post subject: Re: Calculating safe alignment for any type Reply with quote

"Tom" <NoSpam (AT) ThankYouVeryMuch (DOT) com> wrote

Quote:
I recently finished reading Herb Sutter's Exceptional C++ style. In the
book he mentions again the long standing problem of calculating alignment
for a type T.

In this post I will (hopefully) show how to calculate safe alignment for
any
type T.

<snip>

Quote:
So 5) above implies that sizeof(A<T>) - sizeof(T) is a always a safe
alignment for T (i.e. is a multiple of X(T) and
by construction sizeof(A<T>) > sizeof(T).
Also, note that n(0) is the multiplier of X(T) bytes in the struct before
t
and n(1) is the number after.
It's easy to see that any decent compiler would make n(0) = 1 and n(1) =
0,
so that
sizeof(A<T>) - sizeof(T) = X(T), i.e. the exact alignment.

Comments?

You might also look at boost::alignment_of, which uses the same technique,
i.e. guaranteeing to get a safe alignment (a value that is a multiple of the
exact alignment): http://www.boost.org/libs/type_traits/index.html.

Regards,

Terje



[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Bill Wade
Guest





PostPosted: Thu Dec 02, 2004 12:49 am    Post subject: Re: Calculating safe alignment for any type Reply with quote

David Abrahams <dave (AT) boost-consulting (DOT) com> wrote

Quote:
An implementation could build structs as arrays of pointers to
memory that's allocated elsewhere, for example. So addr(a[0]) and
addr(a[0].t) are allowed to have no relationship at all

I disagree:

9.2p12 says "Nonstatic data members ... have ... addresses within a
class object."

I would expect that to be verifiable with std::less<void*>.

Quote:
and subtracting
them could lead to undefined behavior.

I agree with this for various reasons. For instance it seems that the
'offset' of nonstatic data elements in a non-pod is not required to be
a multiple of 1 char.

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Vladimir Marko
Guest





PostPosted: Thu Dec 02, 2004 1:02 am    Post subject: Re: Calculating safe alignment for any type Reply with quote

David Abrahams <dave (AT) boost-consulting (DOT) com> wrote

[snip]
Quote:
addr(s) := reinterpret_cast<char*>(&s)
[snip]
addr(a[0].t)-addr(a[0]) == addr(a[1].t)-addr(a[1])

Actually nothing in the standard supports that assumption unless T is
POD. An implementation could build structs as arrays of pointers to
memory that's allocated elsewhere, for example. So addr(a[0]) and
addr(a[0].t) are allowed to have no relationship at all, and subtracting
them could lead to undefined behavior.

Too bad, IMO, because it's a portable assumption in practice and without
it, some very useful programs have undefined behavior.
[snip]


Well, it seems that my "proof" contains an "axiom" that is in fact
just an unproved statement. And a statement that I'm starting to
believe to be false.

IMHO the implementation you suggested is invalid. The constructor would
have to allocate the additional storage and such an allocation might
fail. Thus, it would be impossible for a non-POD constructor to be
non-throwing (and if this does not break something in the language it
breaks at least the stdlib, for example auto_ptr).

So, let's propose another malicious implementation. For each data member
the object will contain this member on a random position and its offset
(relative to "this") at a fixed position. The size of the object will
be sufficient for bases, offsets, members and necessary padding for any
permutation of the members. After constructing the bases the constructor
will create a random permutation of the new data members and calculate
and store their offsets before initializing these members. (Random
offsets of bases are impossible since it would break static_cast up and
down the hierarchy.) If we don't introduce a random padding,
addr(a[0].t)-addr(a[0]) == addr(a[1].t)-addr(a[1])
will evidently have 50% chance to hold (&a[0].t points to the randomly
located member, not to the index).

If you can find anything that makes this implementation broken wrt the
standard let me know.

Regards,
Vladimir Marko

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]


Back to top
Display posts from previous:   
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated) All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.