C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 
Archives   FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 

Re: Linked list takes 10X more memory than expected. Why?

 
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated)
View previous topic :: View next topic  
Author Message
Ulrich Eckhardt
Guest





PostPosted: Sun Jan 25, 2004 1:41 am    Post subject: Re: Linked list takes 10X more memory than expected. Why? Reply with quote



Chocawok wrote:
Quote:
struct Node
{
Node* next;
Node* down;
char letter;
bool terminal;
};

As you can see, the node struct should be about 4 bytes per node.

Hmmm, on IA32, that is 2x4 bytes for the pointers and one byte each for the
char and the bool. And that is without padding, which might blow this up
to 12 or even 16 bytes. Plus the overhead when allocating these with new
to store the MM-internal data.

Uli

--
Questions ?
see C++-FAQ Lite: http://parashift.com/c++-faq-lite/ first !


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Raoul Gough
Guest





PostPosted: Tue Jan 27, 2004 6:44 pm    Post subject: Re: Linked list takes 10X more memory than expected. Why? Reply with quote



"Chocawok" <chocawok (AT) hotmail (DOT) com> writes:

Quote:
Hi Guys,
this is my first post here so please forgive any faux pas's such as too long
a post.

I have a problem which I am totally baffled by.

I have written a program which reads in a text file of words, which looks
like this:
ABLE
ARDVARK
BALLOON
ZYGOTE


It converts this into a doubly linked list, with each character of the word
as node.

I then search this structure using a depth first search.

The problem is that this structure is taking a HUGE memory foot print.
I have tried setting up the structure using 2 different input files. Both
use OVER TEN TIMES more memory than I expect.

To try and trouble shoot I have added to my code a "count_node" variable
which is
increased by 1 every time a node is added.

Here are the details:

First file, 1.5Million characters (170,000 words) and comes out about
373,683 nodes.
Second file I have is 3.5Million characters (215,000 words) 1,833,930 nodes
are created for this.

Now, my node structure is this:

-----------------------------------------------------------

struct Node

{
Node* next;
Node* down;
char letter;
bool terminal;

//constructor
Node()
{
down = 0;
next = 0;
letter = 0;
terminal = false; }

//constructor
Node(char CurrentLetter)
{
down = 0;
next = 0;
letter = CurrentLetter;
terminal = false;
}
};

------------------------------------------------------------


As you can see, the node struct should be about 4 bytes per node.

Well, I would guess on any 32-bit architecture I know of, each Node
would take up at least 12 bytes - 4 bytes each for the pointers, one
each for char and bool, and then two bytes of padding to align to a
four-byte boundary (so that the structure can be stored in an array).
That's a lot of overhead for the char and bool that you're actually
storing.

Try adding this to your main function:

std::cout << sizeof (Node) << "n";

This will tell you exactly how much space the structure takes up. As
Ulrich pointed out, there is also a certain amount of space used up by
memory management for every Node you allocate with "new".

Quote:

For the first file 373,683 X 4 = 1.5MegaBytes
For the second file 1,833,930 X 4 = 7.3 MegaBytes.

Even if the pointers in the struct are long (pointers are not a strong point
for me) then these figures should only be about 2.5 and 10 MB.

If you're talking about millions of nodes, a linked list is probably
the wrong data structure for storing the chars. Without knowing
exactly what you're doing, I would guess something like this would be
better:

struct AnnotatedString {
std::string m_chars;
std::vector };

or maybe you can get away with this:

struct AnnotatedString {
std::string m_chars;
unsigned int m_terminal_char_location;
};

Using std::string or std::vector<char> will considerably reduce the
overhead per character, provided that you're storing "many" characters
in each string (e.g. words or sentences rather than single
characters). Maybe you would have to go to something like this:

struct Dictionary {
std::vector<char> m_chars;
std::vector<bool> m_terminal;
};

Hard to say without knowing more about your design goals.

[snip]

Quote:
--=_Turnpike_97kOI3Fm4mEAR4fF=
Content-Type: application/zip
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename=dictionary_test2.zip
Content-MD5: 43V0k25sKSg25BQuSiGQhw==

Wow, attachments on clc++m? Never seen that before.

{Moderator's note: Normally, articles with attachments are indeed filtered
and rejected autotmatically. However, this one slipped through due to
an unlikely chain of conditions. -mod/dk}

--
Raoul Gough.
export LESS='-X'

[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Francis Glassborow
Guest





PostPosted: Wed Jan 28, 2004 2:13 pm    Post subject: Re: Linked list takes 10X more memory than expected. Why? Reply with quote



In message <buudqj$ma8nd$1 (AT) ID-178288 (DOT) news.uni-berlin.de>, Ulrich
Eckhardt <doomster (AT) knuut (DOT) de> writes
Quote:
Chocawok wrote:
struct Node
{
Node* next;
Node* down;
char letter;
bool terminal;
};

As you can see, the node struct should be about 4 bytes per node.

Hmmm, on IA32, that is 2x4 bytes for the pointers and one byte each for the
char and the bool. And that is without padding, which might blow this up
to 12 or even 16 bytes. Plus the overhead when allocating these with new
to store the MM-internal data.

alignment requirements will require 12 bytes (with 4-byte alignment) or
16 bytes (with eight or 16 byte alignment). Added to which some systems
are moving toward 8 byte pointers. Creating a doubly linked list of
single chars is unreasonable, particularly if the nodes are being
constructed with the new expression calling the global operator new
(some systems handle dynamic memory in multiples of 32 bytes.

At best using a linked list will multiply memory usage by ten and you
can expect factors of sixteen or thirty-two.

The OP should look at what he is trying to do (which was far from clear
to me from the post) because I cannot believe that a linked list is
appropriate.

--
Francis Glassborow ACCU
Author of 'You Can Do It!' see http://www.spellen.org/youcandoit


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Back to top
Display posts from previous:   
Post new topic   Reply to topic    C++Talk.NET Forum Index -> C++ Language (Moderated) All times are GMT
Page 1 of 1

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum


Powered by phpBB © 2001, 2006 phpBB Group
SEO toolkit © 2004-2006 webmedic.