C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 

source code organization

 
suresh_shukla3@rediffmail
Posted: Fri Oct 14, 2005 5:38 pm    Post subject: source code organization



I am facing a problem of code organization.

There is a system with multiple run-time components.
Currently the usage is single CVS repository with all components
sharing a lot of header files. Over time this seems to have made a
subtle interdependency web.
The noticeable things are when I can see one component being aware of a
compiler padding for another component which will run on RISC machine
etc.
The components are being developed by different teams. Often a change
in a component code silently breaks another. Integration is often a
nightmare.

Combined code base is huge over 1 million lines of C++ code with 7-8
components. Most of inter-component communication is over TCP/IP
sockets.

I am looking for a better way of managing code.

One alternative I have thought of is to separate code for each runtime
component into smaller isolated code repositories. This will introduce
some code level redundancy but also reduce build time. Can this code
base reduction improve programmer understanding?
Each component would document its interfaces, and any change in one
component would be communicated from producer to consumer(s) through
documents, not through code.

But I am not sure whether this component isolation with some code level
redundancy will improve the situation.
Has anybody faced a similar problem, or does anyone have practical suggestions?


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

Murali
Posted: Sat Oct 15, 2005 1:10 pm    Post subject: Re: source code organization



Hi,

suresh_shukla3 (AT) rediffmail (DOT) com wrote:
Quote:
I am facing a problem of code organization.

There is a system with multiple run-time components.
Currently the usage is single CVS repository with all components
sharing a lot of header files. Over time this seems to have made a
subtle interdependency web.

How does sharing header files cause a problem? Ideally, this should
ensure that all components use the same interface, which is a good
thing to have.

Quote:
The noticeable things are when I can see one component being aware of a
compiler padding for another component which will run on RISC machine
etc.

Do you mean implementation details are being exposed? Then this is a
design/implementation level issue. Not a code organization issue.

Quote:
The components are being developed by different teams. Often a change
in a component code silently breaks another. Integration is often a
nightmare.

Combined code base is huge over 1 million lines of C++ code with 7-8
components. Most of inter-component communication is over TCP/IP
sockets.

I am looking for a better way of managing code.

One alternative I have thought of is to separate code for each runtime
component into smaller isolated code repositories. This will introduce
some code level redundancy but also reduce build time. Can this code
base reduction improve programmer understanding?

With over 1 million lines of code, any reorganizations that you make
will have a huge impact. Also, having multiple copies of the same code
will be even more of a maintenance nightmare.

Based on the details that you have provided, it appears that doing the
following may solve your problems:

1) Have a well-defined source code configuration policy and ensure that
all groups follow that.
2) Automate the building and testing of each component and schedule a
build & test of each component daily. This will catch any
inconsistencies caused by any change immediately.
3) Analyze your source code and identify core components (components
which are self-contained and on which other components depend). Ensure
that only one group/person maintains these core components. All others
should use a pre-built version of these components from a standard
location. Others should not be allowed to modify these components.
4) Enforce a code review process so that all changes undergo peer
reviews by different groups. This will help in changes being
communicated in a better way and also in catching potential issues much
earlier.

Hope this helps.

Regards,
Murali


AnonMail2005@gmail.com
Posted: Sat Oct 15, 2005 1:13 pm    Post subject: Re: source code organization



Since there are only 7-8 components in a code base of 1 million
lines of code, these components are still large.

As a first cut, you should try to factor out common code that
these components use and put them in separate libraries. Anything
that is used in more than one component can be factored out so
there won't be code redundancy when you separate out the components
in the future.

There can be a lot of commonly used low level classes that are used
across components - logging classes, communications classes, etc.
This would be a logical first step to breaking up the code.

Two good rules of thumb are that libraries should have a tree
structure of dependencies - no circular dependencies. And they
should be versioned. Once they are factored out, one component
may need a change in, say, the logging class. You want to be
able to change the library and use it in one component without
being _forced_ to upgrade your other components to use the new
library.
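
As a rough illustration of the "no circular dependencies" rule, the check below walks a library dependency graph depth-first and reports whether any library can reach itself. The graph type, the function names and the idea of listing dependencies by library name are assumptions made for this sketch; where the dependency lists come from (build files, link lines) is not shown.

```cpp
#include <map>
#include <string>
#include <vector>

// Library name -> names of the libraries it depends on (hypothetical input).
using DepGraph = std::map<std::string, std::vector<std::string>>;

enum class Mark { Unvisited, InProgress, Done };

// Depth-first search. Reaching a library that is still InProgress means an
// edge leads back into the path currently being explored, i.e. a cycle.
static bool visit(const DepGraph& g, const std::string& lib,
                  std::map<std::string, Mark>& marks) {
    Mark current = marks[lib];  // new keys start out as Unvisited
    if (current == Mark::InProgress) return true;
    if (current == Mark::Done) return false;
    marks[lib] = Mark::InProgress;
    auto it = g.find(lib);
    if (it != g.end()) {
        for (const std::string& dep : it->second) {
            if (visit(g, dep, marks)) return true;
        }
    }
    marks[lib] = Mark::Done;
    return false;
}

// True if any library depends, directly or indirectly, on itself.
bool has_cycle(const DepGraph& g) {
    std::map<std::string, Mark> marks;
    for (const auto& entry : g) {
        if (visit(g, entry.first, marks)) return true;
    }
    return false;
}
```

A check like this could run as part of the build, so that a new dependency which closes a loop is rejected before it spreads.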

A good book on physical code organization is by Lakos, Large Scale
C++ Software Design. It's not easy reading but it's the only book
I know of that systematically addresses these types of issues.


Abhishek Pamecha
Posted: Sun Oct 16, 2005 9:47 am    Post subject: Re: source code organization

I don't think reorganizing a lot of inter-dependent code into chunks of
redundant code is gonna make life easier for you. What you are
suggesting is more of a workaround to the situation. Trust me, if that
code-base is active, you will run into more errors in the long run,
should you choose that path.

Ideally, if you can compare a topology (dependency tree) of the
interdependencies as seen by the compiler with the real topology based
on function calls actually made (you can write a script for it), you
can reduce and factorize the entire code base.
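
The comparison itself can be sketched very simply once both topologies are available as sets. The function below is only an assumed shape for such a script: it takes the headers a file includes (what the compiler sees) and the headers whose functions it really calls, and returns the difference. How the two sets are collected is left out and would depend on the tools at hand.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Returns the headers that are included but never actually used.
// Both inputs are hypothetical: `included` would come from the compiler's
// view of the file, `used` from an analysis of the calls actually made.
std::set<std::string> unused_dependencies(const std::set<std::string>& included,
                                          const std::set<std::string>& used) {
    std::set<std::string> result;
    // std::set iterates in sorted order, which std::set_difference requires.
    std::set_difference(included.begin(), included.end(),
                        used.begin(), used.end(),
                        std::inserter(result, result.end()));
    return result;
}
```

Every entry in the result is a candidate edge to cut from the dependency tree.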

What you can also do is manually refactor the files into smaller files
with one function/class per file. This technique is used to refactor
and componentize older code base for which much is not known. You can
then group similar files under one directory and archive them while
building. Of course, here you have to be careful of file-static vars
being used across funcs. (and other existing file scopes)

At least you can write a script for the first draft of such refactoring
and then 'beautify' :-> it manually.

While documentation is good as a guide to understand the code-base it
should "not" be used to "steer" it.

Redundant code is acceptable if that code base is not touched a lot.
Or else you are opening a can of worms for you.

hth
abhishek


Jorgen Grahn
Posted: Sun Oct 16, 2005 4:16 pm    Post subject: Re: source code organization

On 16 Oct 2005 05:47:23 -0400, Abhishek Pamecha <abhishekpamecha (AT) gmail (DOT) com> wrote:
Quote:
I don't think reorganizing a lot of inter-dependent code into chunks of
redundant code is gonna make life easier for you. What you are
suggesting is more of a workaround to the situation. Trust me, if that
code-base is active, you will run into more errors in the long run,
should you choose that path.
....
While documentation is good as a guide to understand the code-base it
should "not" be used to "steer" it.

Redundant code is acceptable if that code base is not touched a lot.
Or else you are opening a can of worms for you.

Yes, but are you taking into account that this code base, according to the
poster, is really a handful of standalone processes communicating over TCP?

For me, the first impulse would be to split that code base, and document
the protocol(s) (in text and/or executable test cases).

That is partly because I've too frequently had to reverse-engineer protocols
because the original authors (a) let shared source code implicitly define
the protocol and (b) failed to see that they had a protocol in the first
place. And (c) because if you /do/ have a protocol, you find uses for it
which don't involve all of the different parts -- in testing and debugging,
for example.

I don't know if I'm right in this particular case, of course. Whatever the
solution is, it's not going to be easy, given the size of that code base.

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
X/ algonet.se> R'lyeh wgah'nagl fhtagn!

Abhishek Pamecha
Posted: Mon Oct 17, 2005 11:13 am    Post subject: Re: source code organization

Quote:
Yes, but are you taking into account that this code base, according to the
poster, is really a handful of standalone processes communicating over TCP?

Yes, but the little number of standalone processes might have a lot in
common which can be refactored and archived to act as foundation layers
for those components to build upon.

I also agree with you in what you said about documentation but what the
original poster was suggesting is to "depend" on documentation to
propagate and inform about replicating changes from one code-base to
all its redundant clones. Would you agree to doing that? Shouldn't
that be automated or better centralized?


I agree with another poster about suggesting John Lakos's book about
source code organisation. It has a wealth of ideas...


Ralf Fassel
Posted: Mon Oct 17, 2005 11:21 am    Post subject: Re: source code organization

* suresh_shukla3 (AT) rediffmail (DOT) com
Quote:
One alternative I have thought of is to separate code for each
runtime component into smaller isolated code repositories. This will
introduce some code level redundancy but also reduce build time. Can
this code base reduction improve programmer understanding?

Even with better understanding in each team you will get errors that
derive from subtle interdependencies between the modules which no one
is aware of.

IMHO, automated testing is the way to go: you need a regression test
based policy which allows changes to the code base only if all tests
succeed. If you want to try something different in that direction,
check aegis: http://aegis.sourceforge.net (Unix only).

R'

kanze
Posted: Mon Oct 17, 2005 5:11 pm    Post subject: Re: source code organization

Jorgen Grahn wrote:
Quote:
On 16 Oct 2005 05:47:23 -0400, Abhishek Pamecha
<abhishekpamecha (AT) gmail (DOT) com> wrote:
I don't think reorganizing a lot of inter-dependent code into
chunks of redundant code is gonna make life easier for
you. What you are suggesting is more of a workaround to the
situation. Trust me, if that code-base is active, you will
run into more errors in the long run, should you choose that
path.
...
While documentation is good as a guide to understand the
code-base it should "not" be used to "steer" it.

Redundant code is acceptable if that code base is not
touched a lot. Or else you are opening a can of worms for
you.

Yes, but are you taking into account that this code base,
according to the poster, is really a handful of standalone
processes communicating over TCP?

For me, the first impulse would be to split that code base,
and document the protocol(s) (in text and/or executable test
cases).

That is partly because I've too frequently had to
reverse-engineer protocols because the original authors (a)
let shared source code implicitly define the protocol and (b)
failed to see that they had a protocol in the first place. And
(c) because if you /do/ have a protocol, you find uses for it
which don't involve all of the different parts -- in testing
and debugging, for example.

I don't know if I'm right in this particular case, of
course. Whatever the solution is, it's not going to be easy,
given the size of that code base.

The original poster mentioned "one component being aware of a
compiler padding for another component which will run on RISC
machine." This sounds very much like they are simply using
"write" on a structure to transmit data. If this is the case,
that's definitely what they have to fix first; one machine should
never be aware of how another machine pads structures.
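
To make the padding issue concrete, here is a small sketch (the struct and its field names are invented for illustration, and it assumes a C++11 compiler). It compares the bytes the fields need with what the compiler actually allocates. The difference is padding, and its amount depends on the compiler and the target, which is exactly why the in-memory struct must not be the wire format.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical record; the field names are made up for this example.
struct Sample {
    std::uint8_t  kind;   // 1 byte
    std::uint32_t value;  // 4 bytes, usually aligned to a 4-byte boundary
};

// Bytes the fields themselves need, with no padding at all.
constexpr std::size_t kFieldBytes = sizeof(std::uint8_t) + sizeof(std::uint32_t);

// Padding added by this particular compiler for this particular target.
// Another toolchain may legitimately produce a different number.
constexpr std::size_t kPaddingBytes = sizeof(Sample) - kFieldBytes;

// The only thing that holds everywhere: the struct is at least as large
// as the sum of its fields.
static_assert(sizeof(Sample) >= kFieldBytes,
              "a struct cannot be smaller than its fields");
```

On many common targets kPaddingBytes comes out non-zero here, so a raw "write" of the struct would put bytes on the line that no field accounts for.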

Basically, if this is the case, I'd start by defining what needs
to be transmitted on the line, as a series of classes, with the
functions necessary to read and write them, and I'd put that in
a common library, to be shared between all of the programs.
Depending on the application, it may be possible to also define
a basic library for manipulating data which is common in many
different programs; this library would also be shared.
Everything else would be specific to the program which needed
it.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Martin Bonner
Posted: Mon Oct 17, 2005 5:19 pm    Post subject: Re: source code organization


suresh_shukla3 (AT) rediffmail (DOT) com wrote:
Quote:
I am facing a problem of code organization.

There is a system with multiple run-time components.
[snip]
The noticeable things are when I can see one component being aware of a
compiler padding for another component which will run on RISC machine
etc.
[snip]
Most of inter-component communication is over TCP/IP sockets.

To follow up what another poster said about documenting your protocols.
It sounds to me like you are expecting to be able to read a stream of
bytes from a socket and store them in a struct. I **strongly** urge
you not to allow that. What you get from / put into a socket is a
sequence of unsigned char. Specify each record as such, and put in
place the pack/unpack code to deal with converting your structs to
sequences of unsigned char. This will mean that compiler padding will
just cease to be an issue (as will endianness, and conversion to 64-bit
machines). Of course you probably won't have room in your protocol for
a 64-bit quantity, but at least you won't be killed by a 64-bit long.
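
A minimal sketch of such pack/unpack code follows. The record, its field widths and the choice of most-significant-byte-first order are all assumptions for the example; the point is only that every byte on the wire is written and read explicitly, so neither padding nor the host's byte order can leak into the protocol.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record; names and field widths are invented for illustration.
struct Reading {
    std::uint16_t sensor_id;
    std::uint32_t value;
};

// 2 + 4 bytes on the wire, regardless of sizeof(Reading) on any machine.
constexpr std::size_t kReadingWireSize = 6;

// Append the record to `out` as 6 bytes, most significant byte first
// (an assumed convention, any fixed and documented order would do).
void pack(const Reading& r, std::vector<unsigned char>& out) {
    out.push_back(static_cast<unsigned char>((r.sensor_id >> 8) & 0xFF));
    out.push_back(static_cast<unsigned char>(r.sensor_id & 0xFF));
    out.push_back(static_cast<unsigned char>((r.value >> 24) & 0xFF));
    out.push_back(static_cast<unsigned char>((r.value >> 16) & 0xFF));
    out.push_back(static_cast<unsigned char>((r.value >> 8) & 0xFF));
    out.push_back(static_cast<unsigned char>(r.value & 0xFF));
}

// Rebuild a record from the first 6 bytes of `in`.
// Returns false if the buffer is too short to hold one record.
bool unpack(const std::vector<unsigned char>& in, Reading& r) {
    if (in.size() < kReadingWireSize) return false;
    r.sensor_id = static_cast<std::uint16_t>((in[0] << 8) | in[1]);
    r.value = (static_cast<std::uint32_t>(in[2]) << 24) |
              (static_cast<std::uint32_t>(in[3]) << 16) |
              (static_cast<std::uint32_t>(in[4]) << 8) |
              static_cast<std::uint32_t>(in[5]);
    return true;
}
```

The shifts work on values rather than on memory, so the same source gives the same bytes on a little-endian PC and a big-endian RISC box.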


Jorgen Grahn
Posted: Tue Oct 18, 2005 10:32 am    Post subject: Re: source code organization

On 17 Oct 2005 07:13:39 -0400, Abhishek Pamecha <abhishekpamecha (AT) gmail (DOT) com> wrote:
Quote:
Yes, but are you taking into account that this code base, according to the
poster, is really a handful of standalone processes communicating over TCP?

Yes, but the little number of standalone processes might have a lot in
common which can be refactored and archived to act as foundation layers
for those components to build upon.

I also agree with you in what you said about documentation but what the
original poster was suggesting is to "depend" on documentation to
propagate and inform about replicating changes from one code-base to
all its redundant clones. Would you agree to doing that? Shouldn't
that be automated or better centralized?

My first instinct would be to agree to that, yes -- if I could see a
protocol between those pieces that might be generally useful.

Like, if there's A talking to B with a text protocol over TCP: I might want
to use telnet to debug B, or I might want to write another client C in a
different language, which cannot easily use the C++ API shared by A and B.

But that's not the situation the original poster is in, I guess.

Quote:
I agree with another poster about suggesting John Lakos's book about
source code organisation. It has a wealth of ideas...

I should really buy and read it -- that was one of the hardest things to
grasp when I started programming (in C) way back, and also the one thing the
books had very little to say about.

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
X/ algonet.se> R'lyeh wgah'nagl fhtagn!

Jorgen Grahn
Posted: Tue Oct 18, 2005 10:33 am    Post subject: Re: source code organization

On 17 Oct 2005 13:11:16 -0400, kanze <kanze (AT) gabi-soft (DOT) fr> wrote:
Quote:
Jorgen Grahn wrote:
....
That is partly because I've too frequently had to
reverse-engineer protocols because the original authors (a)
let shared source code implicitly define the protocol and (b)
failed to see that they had a protocol in the first place. And
(c) because if you /do/ have a protocol, you find uses for it
which don't involve all of the different parts -- in testing
and debugging, for example.
....
The original poster mentioned "one component being aware of a
compiler padding for another component which will run on RISC
machine." This sounds very much like they are simply using
"write" on a structure to transmit data. If this is the case,
that's definitely what they have to fix first; one machine should
never be aware of how another machine pads structures.

And yet it ends up like that so often!

Quote:
Basically, if this is the case, I'd start by defining what needs
to be transmitted on the line, as a series of classes, with the
functions necessary to read and write them, and I'd put that in
a common library, to be shared between all of the programs.

That's where I'd instead start by defining the protocol, on the bit level,
to be exactly what his toolchains and structs (more or less accidentally)
make it right now. That way, the software on one side can be migrated before the
software on the other (and maybe even come up with reusable code).
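
As a sketch of what "defining the protocol to be what the structs make it right now" could look like: suppose (purely as an assumption for this example) that the existing sender writes a one-byte field, three bytes of compiler padding, and a 32-bit value with the most significant byte first. That layout is then written down as the protocol and decoded by offset, so a migrated component no longer needs the sender's struct or compiler to agree with its own.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed legacy layout, now documented as the protocol:
//   offset 0     kind     (1 byte)
//   offset 1..3  padding  (inserted by the sender's compiler, ignored on read)
//   offset 4..7  value    (32 bits, most significant byte first)
constexpr std::size_t kKindOffset = 0;
constexpr std::size_t kValueOffset = 4;
constexpr std::size_t kLegacyWireSize = 8;

// In-memory form on the receiving side; its own layout is irrelevant.
struct LegacyRecord {
    std::uint8_t kind;
    std::uint32_t value;
};

// Decode by documented offset. Returns false if fewer than 8 bytes are given.
bool decode_legacy(const unsigned char* buf, std::size_t len, LegacyRecord& out) {
    if (len < kLegacyWireSize) return false;
    out.kind = buf[kKindOffset];
    out.value = (static_cast<std::uint32_t>(buf[kValueOffset]) << 24) |
                (static_cast<std::uint32_t>(buf[kValueOffset + 1]) << 16) |
                (static_cast<std::uint32_t>(buf[kValueOffset + 2]) << 8) |
                static_cast<std::uint32_t>(buf[kValueOffset + 3]);
    return true;
}

// Encode the same layout. The padding bytes are written as zero; an
// unmigrated peer reading the bytes into its struct never looks at them.
void encode_legacy(const LegacyRecord& in, unsigned char (&buf)[kLegacyWireSize]) {
    for (std::size_t i = 0; i < kLegacyWireSize; ++i) buf[i] = 0;
    buf[kKindOffset] = in.kind;
    buf[kValueOffset] = static_cast<unsigned char>((in.value >> 24) & 0xFF);
    buf[kValueOffset + 1] = static_cast<unsigned char>((in.value >> 16) & 0xFF);
    buf[kValueOffset + 2] = static_cast<unsigned char>((in.value >> 8) & 0xFF);
    buf[kValueOffset + 3] = static_cast<unsigned char>(in.value & 0xFF);
}
```

With this in place one side can switch to the explicit code while the other keeps doing what it always did, since the bytes on the wire stay the same.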

But I have to admit that it depends. Once, when I had pseudo-protocol
problems, one side was carved in stone in three ways: politically, CM-wise,
and by being ROMmed. That left little choice ;-)

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
X/ algonet.se> R'lyeh wgah'nagl fhtagn!

kanze
Posted: Wed Oct 19, 2005 8:49 am    Post subject: Re: source code organization

Jorgen Grahn wrote:
Quote:
On 17 Oct 2005 13:11:16 -0400, kanze <kanze (AT) gabi-soft (DOT) fr> wrote:
Jorgen Grahn wrote:
...
That is partly because I've too frequently had to
reverse-engineer protocols because the original authors (a)
let shared source code implicitly define the protocol and
(b) failed to see that they had a protocol in the first
place. And (c) because if you /do/ have a protocol, you
find uses for it which don't involve all of the different
parts -- in testing and debugging, for example.
...
The original poster mentioned "one component being aware of
a compiler padding for another component which will run on
RISC machine." This sounds very much like they are simply
using "write" on a structure to transmit data. If this is
the case, that's definitely what they have to fix first; one
machine should never be aware of how another machine pads
structures.

And yet it ends up like that so often!

Regretfully.

I always insist on the fact that all data is formatted. The
question isn't whether you are writing formatted data or not;
the question is whether you know and have documented the format
you are writing. Using undocumented formats is in the long run
a recipe for disaster.

Quote:
Basically, if this is the case, I'd start by defining what
needs to be transmitted on the line, as a series of classes,
with the functions necessary to read and write them, and I'd
put that in a common library, to be shared between all of
the programs.

That's where I'd instead start by defining the protocol, on
the bit level, to be exactly what his toolchains and structs
(more or less accidentally) make it right now. That way, the
software on one side can be migrated before the software on the
other (and maybe even come up with reusable code).

Good point. You'd start by defining the protocol currently
being used. And reuse it (but intentionally, as a defined
protocol) if possible. Sounds like a good idea.

Quote:
But I have to admit that it depends. Once, when I had
pseudo-protocol problems, one side was carved in stone in
three ways: politically, CM-wise, and by being ROMmed. That
left little choice ;-)

In that case, you have to use the existing protocol, yes. Your
point about being able to migrate the two ends separately is
valid even in less extreme situations, however.

Note that if we are talking about disk files, and we want to
keep existing data, the situation is similar to your case: you
can't go back in time to change the program that was used to
write the data, so you're stuck with the existing protocol (or
writing an extra program to migrate it).

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


All times are GMT