 |
C++Talk.NET C++ language newsgroups
OSA Guest
|
Posted: Thu Jun 26, 2003 10:30 pm Post subject: sizeof |
|
|
I have the following structure:
struct A
{
double d;
short s;
};
and I am using the following code to write this structure to a file:
A a;
..............
pFile->Write(&a,sizeof(a));
// .............. write other data
This code seems correct. But as I understand it, the result returned
by sizeof depends on the packing alignment of structures and
unions. So it is possible that this code may go wrong:
A a;
pFile->Read(&a, sizeof(a));
// .................... read other data.
The problem is that sizeof(a) in the writing code above may
not equal sizeof(a) in this reading code. How can I make this
code independent of the alignment?
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 |
Ron Natalie Guest
|
Posted: Sat Jun 28, 2003 12:01 am Post subject: Re: sizeof |
|
|
"OSA" <s_osa2000 (AT) ua (DOT) fm> wrote
| Quote: | This code seems correct. But as I understand it, the result returned
by sizeof depends on the packing alignment of structures and
unions.
|
sizeof includes all the packing, padding, etc.
One issue is that the packing and padding (and in turn sizeof) may
differ from one compiler implementation to another (or
even with the settings within a single implementation), so you have
to be very careful to use this only in cases where you won't be
affected by it.
 |
Dhruv Guest
|
Posted: Sat Jun 28, 2003 12:29 pm Post subject: Re: sizeof |
|
|
On Thu, 26 Jun 2003 18:30:18 -0400, OSA wrote:
[snip]......
| Quote: | A a;
pFile->Read(&a, sizeof(a));
// .................... read other data.
The problem is that sizeof(a) in the writing code above may
not equal sizeof(a) in this reading code. How can I make this
code independent of the alignment?
|
Do not use a binary format to store data. Store it in ASCII format.
Define operator<< for your class. Something like this:
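std::ostream& operator<<(std::ostream& out, const A& a)
{
    // Non-member function taking a non-const stream; members are
    // written one after another, separated by a space.
    out << a.d << ' ' << a.s;
    return out;
}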
Regards,
Dhruv.
 |
Matthias Hofmann Guest
|
Posted: Sat Jun 28, 2003 11:25 pm Post subject: Re: sizeof |
|
|
OSA <s_osa2000 (AT) ua (DOT) fm> wrote in message
bde8tf$3fv$1 (AT) pandora (DOT) alkar.net...
| Quote: | I have the following structure:
struct A
{
double d;
short s;
};
and I am using the following code to write this structure to a file:
A a;
.............
pFile->Write(&a,sizeof(a));
// .............. write other data
This code seems correct. But as I understand it, the result returned
by sizeof depends on the packing alignment of structures and
unions. So it is possible that this code may go wrong:
A a;
pFile->Read(&a, sizeof(a));
// .................... read other data.
The problem is that sizeof(a) in the writing code above may
not equal sizeof(a) in this reading code. How can I make this
code independent of the alignment?
|
Do not write the structure as a whole but save each member individually:
pFile->Write( &a.d, sizeof( a.d ) );
pFile->Write( &a.s, sizeof( a.s ) );
// Write other data.
Reading is analogous:
pFile->Read( &a.d, sizeof( a.d ) );
pFile->Read( &a.s, sizeof( a.s ) );
// Read other data.
As far as I know, this should be platform independent. I got this
advice from a book about common errors in C.
Regards,
Matthias
 |
MiniDisc_2k2 Guest
|
Posted: Sat Jun 28, 2003 11:28 pm Post subject: Re: sizeof |
|
|
You could avoid Write and Read altogether:
(I'm assuming that you're using ofstream/ifstream/fstream, from the
<fstream> header file)
| Quote: | struct A
{
double d;
short s;
};
A a;
.............
pFile->Write(&a,sizeof(a));
|
Instead of that, do this:
(*pFile) << a.d;
(*pFile) << endl;
(*pFile) << a.s;
| Quote: | // .............. write other data
This code seems correct. But as I understand it, the result returned
by sizeof depends on the packing alignment of structures and
unions. So it is possible that this code may go wrong:
A a;
pFile->Read(&a, sizeof(a));
|
And instead of that, use this:
(*pFile) >> a.d;
(*pFile) >> a.s;
Just make sure you do the read and write in the same order.
| Quote: | // .................... read other data.
The problem is that sizeof(a) in the writing code above may
not equal sizeof(a) in this reading code. How can I make this
code independent of the alignment?
|
Using a text file always works (as long as you can format it properly so you
can read the data) and is easy to do.
-- MiniDisc_2k2
To reply, replace nospam.com with cox dot net.
 |
KIM Seungbeom Guest
|
Posted: Sun Jun 29, 2003 9:50 pm Post subject: Re: sizeof |
|
|
"Matthias Hofmann" <hofmann (AT) anvil-soft (DOT) com> wrote
| Quote: |
Do not write the structure as a whole but save each member individually:
pFile->Write( &a.d, sizeof( a.d ) );
pFile->Write( &a.s, sizeof( a.s ) );
// Write other data.
Reading is analogous:
pFile->Read( &a.d, sizeof( a.d ) );
pFile->Read( &a.s, sizeof( a.s ) );
// Read other data.
As far as I know, this should be platform independent. I have got this
advice from a book dealing with frequent errors in C.
|
If you want it to be really portable, you cannot depend on this:
how would you know how many bytes a double would contain, and a short?
Without taking that into account, there would be little meaning in
decomposing the write and read calls into those for each member.
When you transfer binary data across the "internal-external" boundary
(e.g. between memory and file, or between memory and network),
you should define the layout on the level of the smallest size that is
guaranteed to be portably read and written: that is usually "byte".
So, in the OP's case
struct A
{
double d;
short s;
};
a layout like this should be defined:
d : 8 bytes, the most significant byte first
s : 2 bytes, the most significant byte first
----------------------------------------------
total 10 bytes
and each member should be accordingly packed and unpacked.
This is the platform-dependent part.
--
KIM Seungbeom <musiphil (AT) bawi (DOT) org>
 |
Thomas Mang Guest
|
Posted: Sun Jun 29, 2003 10:10 pm Post subject: Re: sizeof |
|
|
Matthias Hofmann wrote:
| Quote: |
Do not write the structure as a whole but save each member individually:
pFile->Write( &a.d, sizeof( a.d ) );
pFile->Write( &a.s, sizeof( a.s ) );
// Write other data.
Reading is analogous:
pFile->Read( &a.d, sizeof( a.d ) );
pFile->Read( &a.s, sizeof( a.s ) );
// Read other data.
As far as I know, this should be platform independent. I have got this
advice from a book dealing with frequent errors in C.
|
Matthias,
How could this code be platform independent?
For a simple example, take one system where sizeof(int) == 2, and one
where sizeof(int) == 4.
Do you really expect writing an int on one platform and then reading it
on the other to be guaranteed to work?
Even if sizes were the same, how can you assume different
implementations use the same internal representation of data?
What's a valid bit pattern on one system may well be trash on another
system.
regards,
Thomas
 |
Matthias Hofmann Guest
|
Posted: Tue Jul 01, 2003 6:23 pm Post subject: Re: sizeof |
|
|
Thomas Mang <a9804814 (AT) unet (DOT) univie.ac.at> wrote in message
3EFEFACF.35283224 (AT) unet (DOT) univie.ac.at...
| Quote: |
Matthias Hofmann wrote:
Do not write the structure as a whole but save each member individually:
pFile->Write( &a.d, sizeof( a.d ) );
pFile->Write( &a.s, sizeof( a.s ) );
// Write other data.
Reading is analogous:
pFile->Read( &a.d, sizeof( a.d ) );
pFile->Read( &a.s, sizeof( a.s ) );
// Read other data.
As far as I know, this should be platform independent. I have got this
advice from a book dealing with frequent errors in C.
Matthias,
How could this code be platform independent?
For a simple example, take one system where sizeof(int) == 2, and one
where sizeof(int) == 4.
Do you really expect writing an int on one platform and then reading it
on the other to be guaranteed to work?
Even if sizes were the same, how can you assume different
implementations use the same internal representation of data?
What's a valid bit pattern on one system may well be trash on another
system.
|
You are right, my method does not guarantee that doubles or ints can be
interchanged between individual systems using a file. But it does solve the
problem of compilers using pad bytes to ensure a certain alignment.
If interchangeability is crucial, it might be a good idea to stick to data
types whose size is well defined, i.e. prefer "long" or "short" over
"int" to make sure the number of bytes is the same.
In order to ensure interchangeability between little-endian (least
significant byte first) and big-endian (most significant byte first)
machines, you could use the functions htonl() and ntohl(), which convert
from "Host TO Network Long" and "Network TO Host Long", respectively. These
functions are available on many platforms (e.g. Linux and Windows). Using
these functions, you have to proceed as follows:
Writing:
long myLong = 1234;
myLong = htonl( myLong ); // Convert to network byte order.
SaveToFile( hFile, &myLong, sizeof (myLong) );
Reading:
long myLong;
ReadFromFile( hFile, &myLong, sizeof (myLong) );
myLong = ntohl( myLong ); // Convert back to host byte order.
The basic idea is that there already is a kind of an agreement on the layout
of data "outside" your computer (in a file or when transmitting data over a
network): The layout used is the big-endian format, which means most
significant byte first. However, I don't know how to deal with data that
does not fit into 32 bit, especially if the layout is quite different (e.g.
there is no rule about the representation of floating point values).
Regards,
Matthias
 |
Matthias Hofmann Guest
|
Posted: Thu Jul 03, 2003 6:02 pm Post subject: Re: sizeof |
|
|
<kanze (AT) gabi-soft (DOT) fr> wrote in message
d6652001.0307020113.348784c1 (AT) posting (DOT) google.com...
| Quote: | "Matthias Hofmann" <hofmann (AT) anvil-soft (DOT) com> wrote in message
news:<3f007df0 (AT) news (DOT) nefonline.de>...
Thomas Mang <a9804814 (AT) unet (DOT) univie.ac.at> wrote in message
3EFEFACF.35283224 (AT) unet (DOT) univie.ac.at...
Matthias Hofmann wrote:
If interchangeability is crucial, it might be a good idea to stick to
data types whose size is well defined, i.e. prefer "long" or "short"
over "int" to make sure the number of bytes is the same.
There are no types whose size is well defined. On my system, short is
always 16 bits and int 32; long depends on the command line options of
the compiler. And that's just my system today -- I've seen systems with
char 8, 9 or 32 bits, shorts of 16, 18, 32, 36 or 48 bits, etc., etc.
Not to mention that some systems will encode -1 as 0xffff, others as
0xfffe and still others as 0x8001 (for a 16 bit short).
|
Well, I am surprised to hear that there are data types whose size is not a
multiple of 8 - I thought that a byte was always 8 bits and all other types
were multiples of a byte. Who on earth builds a CPU that works with 9, 18 or
36 bit integers?
| Quote: | For floating point, of course, there are even more differences.
In order to ensure interchangeability between little-endian (least
significant byte first) and big-endian (most significant byte first)
machines, you could use the functions htonl() and ntohl(), which
convert from "Host TO Network Long" and "Network TO Host Long",
respectively. These functions are available on many platforms (e.g.
Linux and Windows). Using these functions, you have to proceed as
follows:
Writing:
long myLong = 1234;
myLong = htonl( myLong ); // Convert to network byte order.
SaveToFile( hFile, &myLong, sizeof (myLong) );
Reading:
long myLong;
ReadFromFile( hFile, &myLong, sizeof (myLong) );
myLong = ntohl( myLong ); // Convert back to host byte order.
And why not something simple and portable, like:
void
writeLong( std::ostream& dest, long value )
{
    unsigned long x = value ;
    int shiftCount = 32 ;
    while ( shiftCount > 0 ) {
        shiftCount -= 8 ;
        dest.put( (x >> shiftCount) & 0xff ) ;
    }
}
|
According to what you have said about the size of a "long" being dependent
on command line options of the compiler, I don't see what makes this code
portable, as it assumes that the size of a "long" is 32 bits. Maybe it would
be OK if the corresponding line were replaced with
int shiftCount = sizeof ( long ) * CHAR_BIT; // size in bits, not bytes
Or do you suggest that the function writeLong() itself be ported and
adjusted accordingly on each new system?
Best regards,
Matthias
 |
Bo Persson Guest
|
Posted: Fri Jul 04, 2003 8:02 am Post subject: Re: sizeof |
|
|
"Matthias Hofmann" <hofmann (AT) anvil-soft (DOT) com> wrote in message
news:3f035654 (AT) news (DOT) nefonline.de...
| Quote: |
<kanze (AT) gabi-soft (DOT) fr> wrote in message:
There are no types whose size is well defined. On my system, short is
always 16 bits and int 32; long depends on the command line options of
the compiler. And that's just my system today -- I've seen systems with
char 8, 9 or 32 bits, shorts of 16, 18, 32, 36 or 48 bits, etc., etc.
Not to mention that some systems will encode -1 as 0xffff, others as
0xfffe and still others as 0x8001 (for a 16 bit short).
Well, I am surprised to hear that there are data types whose size is not a
multiple of 8 - I thought that a byte was always 8 bits and all other types
were multiples of a byte. Who on earth builds a CPU that works with 9, 18 or
36 bit integers?
|
Not many nowadays, but there have been quite a few from some very well
known manufacturers:
http://www.36bit.org/
Bo Persson
bop2 (AT) telia (DOT) com
 |
Rob Guest
|
Posted: Fri Jul 04, 2003 5:39 pm Post subject: Re: sizeof |
|
|
"Ben Hutchings" <do-not-spam-ben.hutchings (AT) businesswebsoftware (DOT) com> wrote in
message news:slrnbg8taa.1eg.do-not-spam-ben.hutchings (AT) tin (DOT) bwsint.com...
| Quote: | In article <3f035654 (AT) news (DOT) nefonline.de>, Matthias Hofmann wrote:
[Snip]
Groups of 8 bits are properly called octets. Bytes, however, have
as many bits as are convenient for encoding text. The C and C++
standards require them to have at least 8 bits and require the
character encoding to encode certain characters as a single byte.
|
That's not quite true. The C++ standard says nothing about the number
of bits that must be in a byte. It simply specifies minimum ranges of values
that an implementation must support for various types.
The minimum ranges for char types basically equate to 8 (or more) bits
in practice. But that is actually an implementation detail rather than a
specifically stated requirement in the standard.
[Other interesting stuff on machines and encodings of chars with different
number of bits snipped].
 |
Andy Sawyer Guest
|
Posted: Sun Jul 06, 2003 2:36 am Post subject: Re: sizeof |
|
|
In article <be32ai$uuc$1 (AT) fang (DOT) dsto.defence.gov.au>,
on 4 Jul 2003 13:39:31 -0400,
"Rob" <nospam (AT) nonexistant (DOT) com> wrote:
| Quote: | "Ben Hutchings" <do-not-spam-ben.hutchings (AT) businesswebsoftware (DOT) com> wrote in
message news:slrnbg8taa.1eg.do-not-spam-ben.hutchings (AT) tin (DOT) bwsint.com...
In article <3f035654 (AT) news (DOT) nefonline.de>, Matthias Hofmann wrote:
[Snip]
Groups of 8 bits are properly called octets. Bytes, however, have
as many bits as are convenient for encoding text. The C and C++
standards require them to have at least 8 bits and require the
character encoding to encode certain characters as a single byte.
That's not quite true.
|
Actually, it's entirely true.
| Quote: | The C++ standard says nothing about the number of bits that must be
in a byte.
|
I think you'll find it does...
| Quote: | It simply specifies minimum ranges of values that an
implementation must support for various types.
|
...on the same page as it specifies those ranges.
| Quote: | The minimum ranges for char types basically equate to 8 (or more) bits
in practice. But that is actually an implementation detail rather than
a specifically stated requirement in the standard.
|
No. The C++ standard includes, by reference, certain sections of the C
standard. Importantly (as far as this discussion is concerned), one of
those is the definition of and minimum value of CHAR_BIT. CHAR_BIT is
(to quote the base document) "number of bits for smallest object that
is not a bit-field (byte)", and must have a value of at least 8. So in
fact it IS a specifically stated requirement - you just have to read
the standard very carefully.
--
"Light thinks it travels faster than anything but it is wrong. No matter
how fast light travels it finds the darkness has always got there first,
and is waiting for it." -- Terry Pratchett, Reaper Man
 |
Matthew Collett Guest
|
Posted: Sun Jul 06, 2003 10:20 am Post subject: Re: sizeof |
|
|
In article <3f05c338 (AT) news (DOT) nefonline.de>,
"Matthias Hofmann" <hofmann (AT) anvil-soft (DOT) com> wrote:
| Quote: | [I]n a world where real
standards are hard to find, the number of bits in a byte seemed to be
the only reliable thing to me. And now you are telling me that the
sizes of bytes, words and whatever are just arbitrary. Under these
circumstances, how am I supposed to write portable code? [...]
It would be fine if C/C++ provided a means of determining
the bit size of a data type
|
std::numeric_limits<unsigned char>::digits tells you the number of bits
in a byte. Multiply by sizeof to get the number of bits in any other
type, or use std::numeric_limits<other_type>::digits directly for
integer types.
Best wishes,
Matthew Collett
--
Those who assert that the mathematical sciences have nothing to say
about the good or the beautiful are mistaken. -- Aristotle
 |
James Kanze Guest
|
Posted: Sun Jul 06, 2003 6:34 pm Post subject: Re: sizeof |
|
|
"Matthias Hofmann" <hofmann (AT) anvil-soft (DOT) com> writes:
| Quote: | There have also been machines with 12, 18 and 60-bit words
(examples: PDP-5, PDP-1, CDC 6600). In C these would probably
have 9, 12 and 10-bit bytes respectively, though most
programming environments used 6-bit characters.
Dear, this sounds like a pure nightmare to me - in a world where
real standards are hard to find, the number of bits in a byte
seemed to be the only reliable thing to me. And now you are
telling me that the sizes of bytes, words and whatever are just
arbitrary. Under these circumstances, how am I supposed to write
portable code?
|
Don't depend on having exactly 8 bits in a byte.
Note that this is only necessary for code which has extreme portability
requirements. Windows, all of the mainstream Unix, and IBM mainframes
all have 8 bit bytes, as do all of the recent embedded processors I've
seen. In fact, I think that the only remaining systems with bytes of
another size are the large Unisys OS2200 systems and the PDP-10
emulator.
| Quote: | So far I have thought that it was easy to get the
most significant bit in a byte:
char byte;
bool bit = ( byte >> 7 ) & 1; // Actually the "& 1" is redundant in this case
|
The standard idiom used to be something like:
bool highBit = ((unsigned char)( byte )
& ~((unsigned char)( ~0 ) >> 1)) != 0 ;
Once ISO C became prevalent, of course, you could write:
bool highBit = (byte & (1 << (CHAR_BIT - 1))) != 0 ;
| Quote: | But how is this going to work if the byte has 9, 3 or 27 bits on
some weird machine from mars?
|
See above. And Unisys is based somewhere in Pennsylvania, I think,
and not on Mars.
| Quote: | It would be fine if C/C++ provided a means of determining the bit
size of a data type, so I could code something like:
char byte;
bool bit = ( byte >> (bitsize( byte ) - 1) ) & 1; // Get MSB using "operator bitsize"
|
You mean something like CHAR_BIT? Defined in <limits.h>. (I think
that there is something similar in <limits>. But not all of my
compilers support <limits>.)
| Quote: | It is not even possible to get the bit size of a data type by
multiplying the return value of "operator sizeof" by 8, because on
some machine it would have to be multiplied by another number of
bits.
|
Correct.
None of this is new. In "The C Programming Language", Kernighan and
Ritchie, 1978, there is a table on page 34 with "some representative
values" for existing implementations at the time: the Honeywell 6000
has 9 bit bytes. Given this, I really don't know where the idea comes
from that all bytes are 8 bits.
--
James Kanze mailto:kanze (AT) gabi-soft (DOT) fr
Conseils en informatique oriente objet/
Beratung in objektorientierter Datenverarbeitung
11 rue de Rambouillet, 78460 Chevreuse, France Tel. +33 1 41 89 80 93
 |
Ron Natalie Guest
|
Posted: Mon Jul 07, 2003 3:35 pm Post subject: Re: sizeof |
|
|
"Matthias Hofmann" <hofmann (AT) anvil-soft (DOT) com> wrote
| Quote: | Dear, this sounds like a pure nightmare to me - in a world where real
standards are hard to find, the number of bits in a byte seemed to be the
only reliable thing to me. And now you are telling me that the sizes of
bytes, words and whatever are just arbitrary. Under these circumstances, how
am I supposed to write portable code? So far I have thought that it was easy
to get the most significant bit in a byte:
|
Understand that back in the past, the entire world was not slaved to a
terminal. Why allocate 8 bits when the entire allowable character set on the
printers fit nicely into 6 or 7? Memory was not sold or allocated by the byte.
| Quote: | char byte;
bool bit = ( byte >> 7 ) & 1; // Actually the "& 1" is redundant in this case
But how is this going to work if the byte has 9, 3 or 27 bits on some weird
machine from mars? It would be fine if C/C++ provided a means of determining
the bit size of a data type, so I could code something like:
|
CHAR_BIT is what you are looking for.