C++Talk.NET Forum Index C++Talk.NET
C++ language newsgroups
 

C++ private/protected hack
Bo Persson
Guest





Posted: Thu May 17, 2007 1:12 pm    Post subject: Re: C++ private/protected hack



bji-ggcpp (AT) ischo (DOT) com wrote:
:: On May 17, 3:47 am, ytrem...@nyx.net (Yannick Tremblay) wrote:
::
::: Key point: *You* didn't tell the compiler that nobody should
::: access this field. *I* did. *I* do not want *you* to tell the
::: compiler that what *I* told it to do is not valid. *I* decide
::: that myself.
:::
:::: I am the programmer; why *shouldn't* I be allowed to tell the
:::: compiler to do whatever I want it to do in this case?
:::
::: Because *you* didn't write the original class. Someone else did.
::: If you did, then that whole discussion would be pointless since
::: you could just have declared all these fields public in the first
::: place.
::
:: I've gotten this same sort of response in other emails. I guess it
:: highlights a difference of how I view code from how other people
:: view it. Did you realize that once you give me your class
:: definitions and your libraries, you are no longer in total control
:: of them? Did you know that there is no law against me using your
:: classes in ways that you did not intend? Did you know that I
:: could write a "proxy" class with the same exact structure, but
:: with different access specifiers, cast a pointer to your class to
:: mine, and then modify the members of your class, and that there's
:: nothing you can do to stop me?

This is another of your hacks, that doesn't work portably. :-)

When you change the access specifiers, how do you know where to find
the member variables? What if I change the number and order of the
private variables in the next compile of my class?
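To make the objection concrete, here is a small sketch. The class names (Account, AccountProxy) and the two "releases" are made up for illustration, and the size comparison is only a crude check that happens to hold on common compilers; the standard does not promise any of it. It deliberately avoids the pointer cast itself, since that cast is the part with no guarantees.

```cpp
// Hypothetical library class as first released.
namespace release1 {
class Account {
public:
    Account() : id_(0), balance_(0.0) {}
private:
    int id_;
    double balance_;
};
}

// The same class after its author reorders and extends the private data.
namespace release2 {
class Account {
public:
    Account() : balance_(0.0), limit_(0.0), id_(0), frozen_(false) {}
private:
    double balance_;
    double limit_;
    int id_;
    bool frozen_;
};
}

// A hand-written "proxy" that mirrors release1 with public members.
struct AccountProxy {
    int id_;
    double balance_;
};

// The proxy lines up with the release it was copied from (in practice)...
constexpr bool proxy_matches_release1 =
    sizeof(AccountProxy) == sizeof(release1::Account);

// ...but nothing updates it when the library author changes the class.
constexpr bool proxy_matches_release2 =
    sizeof(AccountProxy) == sizeof(release2::Account);
```

Even if the sizes happened to agree, id_ would now sit at a different offset, so a cast through the proxy would read the wrong bytes without any compiler diagnostic.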

::
:: I find it interesting that people have repeatedly expressed this
:: concept of "ownership" of their classes. Once you release it, it's
:: not yours. It's not even your original header file in fact; it's a
:: copy of the header file. I could modify my copy directly and
:: replace "private" with "public" if I wanted to.

Ownership comes with responsibility. As long as I have to be
responsible for what my class does, I must maintain the ownership.

If you want to make a similar class, based on mine, that's fine
(barring copyright violation :-) - but now it is your responsibility to
make it work. It is not my class anymore.

::
:: My point here is that this barrier you are speaking of, that I
:: can't try to get access to internals of classes that someone else
:: has defined, is wrong on the face of it; I most certainly can, and
:: *will*, if it solves my problem in the best way.

I also have a lock on my front door. It will stop most people from
entering my home, but it will not stop everyone. A determined SWAT
team is likely to succeed, for example. But that's not what I am
protecting myself from.

If they care to ring the bell, I might unlock the door for them. Or
add a friend declaration.

:: If that leaves
:: you feeling violated, or that your intentions have been subverted
:: for something evil, then I most sincerely apologize. But don't
:: take it personally. And besides - my goal was not to subvert your
:: classes; it was to provide a tool that allowed *you* to subvert
:: *your own* classes, in a way that I thought might be useful to you.

But I don't want to subvert my own classes. I don't trust myself
either! That's why I use the tools of the language.


Bo Persson




--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
Guest






Posted: Fri May 18, 2007 12:47 am    Post subject: Re: C++ private/protected hack



On May 17, 8:19 pm, "Stephen Howe"
<sjhoweATdialDOTpipexDOT...@giganews.com> wrote:
Quote:
And besides - my goal was not to subvert your classes; it was to
provide a tool that allowed *you* to subvert *your own* classes, in a
way that I thought might be useful to you.

But if we are using your tool and we have some 3rd-party class libraries,
your tool may subvert the use of 3rd-party class libraries when we did not
sign up to your design philosophy. Your approach means it is impossible to
use _both_ your tool and 3rd-party class libraries as your tool subverts
their header files.
And it would be interesting to read the small print of whatever license is in
effect for the 3rd-party class library.

You just wouldn't run the tool on their header files. The tool would
allow you to specify what header files you want it to run on, and
which classes from those header files you want it to use. But I
thought that this would be obvious? Do you really need me to tell you
that this is the kind of option that tools like this would provide?

Quote:
Yes I have. And let me respond with a question in turn: have you ever
tried to push the boundaries of a problem to try to come up with a
unique solution - or do you always just follow the rules that others
have set out for you?

You mean, I can push boundaries like overwriting memory I don't own, dividing
by 0, reading and writing to the NULL pointer, writing an assignment
operator in terms of calling a destructor and copy constructor, those kinds
of rules - all meant to be broken?

This is a logical fallacy called "false dilemma". You are saying that
I can only choose to use techniques that violate the C++ language
rules that have no chance of being useful, or choose to use techniques
that do not violate the C++ language rules. I am talking about a
third category which you are pretending does not exist: techniques
that violate the C++ language rules but that have some chance of being
useful. I know that my technique works because I've already run lots
of test code based on it using the g++ compiler (all the way back to
2.x versions and up to 4.x versions, and it worked in every case). I
can predict in advance that none of the things that you describe would
work for any compiler.

Now I fully recognize that I just might be "lucky" with the GNU
tools. And that my luck could run out at any time. But I knew that
all along. I'm not saying that what I am proposing would be the best
choice for you or anyone else, given that it is definitely a risk -
breaking the language rules for some benefit that may end up being
lost when the code is moved to a new version of the compiler, a new
compiler, or a new platform. But I think it trivializes the issue to
try to say that a technique which works on some compilers, which can
be theorized would always work given "standard" implementations of C++
compilers (I would guess that most compilers only use access
specifiers to error check against illegal usage of classes and class
members, not to affect the resulting compiled code), is the same as
techniques which could never work on any compiler.

I just want to make the point that there is no absolute right and
absolute wrong when it comes to using C++ compilers. There are only
varying degrees of risk. Perhaps 99.9% of other people out there
would not want to take the risk of using a hack that violated the
rules of C++. That is fine. But that's a risk assessment. The 0.1%
of people who would use such a hack are not "wrong". They just have
made a different risk-vs-reward assessment.

Quote:
It is not a case of "best". Many others offer solutions.
But the difference is they don't try to subvert the language.
I would not use any 3rd-party library that subverted the language even
if it claimed it was the best.

I can respect that. But please recognize that all you're saying is
that you are too averse to the risks incurred in violating the C++
standard. Please don't say that you don't want to do it because
subverting class access mechanisms is "wrong".

Thanks,
Bryan


Sebastian Redl
Guest





Posted: Fri May 18, 2007 11:50 pm    Post subject: Re: C++ private/protected hack



On Thu, 17 May 2007 coal (AT) mailvault (DOT) com wrote:

Quote:
On May 14, 10:38 am, Sebastian Redl <e0226...@stud3.tuwien.ac.at>
wrote:
On Sun, 13 May 2007 bji-gg...@ischo.com wrote:


I don't think the majority does. Quite on the contrary, we see it as a
workaround for a limitation of templates, but a necessary one because we
are not willing to sacrifice the performance or even the correctness of
the program. (Just because an object is swappable doesn't mean it's
copy-constructible. Without the template magic, the code wouldn't work for
objects that are the former, but not the latter.)

I read that professor Stroustrup thinks there are around 3 million C++
programmers out there. Are you sure you can speak for the majority?

No, which is why I introduced the paragraph with "I think".
Still, it would greatly surprise me if it were otherwise. I would also
consider it a problem in the viewpoint of the C++ community. Modern C++
template code is cool because of what the client can do with the
libraries, definitely not because of the libraries' implementations.

Quote:
OT, but XML is overhyped, not useless.

It isn't OT IMO. One of the few ways to get meta/about information
in C++ uses gccxml. IMO he was rightly complaining about his
options.

My comment was off-topic, not necessarily his. This thread is not about
the merits of XML, and thus my comment that the language is merely
overhyped, not outright useless, is off-topic. I considered it short
enough that I still put it in, because the contribution to the overall
post length was negligible.

Quote:
Right, but that doesn't mean there isn't a place for the "just works"
approach. There should be a way to indicate that you will write the
marshalling code for a given class by hand. The "just works" way
would still be used in many cases and its use may increase as the
technology matures.

I see it the other way round. If I had lots of trivial classes to
serialize, I would still want to explicitly mention them. E.g. I could put
in:

TRIVIAL_SERIALIZATION(ClassName);

and then have an additional header preprocessor (not unlike this
gccxml-based tool) go over the source and replace these declarations with
trivial serialization functions. For Boost.Serialization, it could look
like:

class Foo
{
    int a, b, c;

    TRIVIAL_SERIALIZATION(Foo);

    // ...
};

->

class Foo
{
    int a, b, c;

    template <typename Archive>
    void serialize(Archive &ar, /* others */)
    {
        ar & a & b & c;
    }
};
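For anyone who wants to see the expanded form run without Boost installed, here is a minimal sketch. ToyTextArchive is a made-up stand-in, not the Boost.Serialization API; it only supplies the chained operator& that the generated serialize() relies on. The extra parameters hinted at by /* others */ above are left out, and the constructor and save_foo helper are added just so the example can be exercised.

```cpp
#include <sstream>
#include <string>

// Toy stand-in for an output archive (an assumption, not Boost's class).
class ToyTextArchive {
public:
    // Write one value followed by a space and return *this, so that
    // calls can be chained as in "ar & a & b & c".
    template <typename T>
    ToyTextArchive& operator&(const T& value) {
        out_ << value << ' ';
        return *this;
    }

    std::string str() const { return out_.str(); }

private:
    std::ostringstream out_;
};

class Foo {
public:
    Foo(int a_in, int b_in, int c_in) : a(a_in), b(b_in), c(c_in) {}

    // What a preprocessor might emit in place of TRIVIAL_SERIALIZATION(Foo).
    template <typename Archive>
    void serialize(Archive& ar) {
        ar & a & b & c;
    }

private:
    int a, b, c;
};

// Helper for the example: serialise one Foo into a string.
std::string save_foo(Foo& foo) {
    ToyTextArchive ar;
    foo.serialize(ar);
    return ar.str();
}
```

The point of the explicit marker is that the class itself says it takes part in serialization, so whoever edits the members sees it.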

Quote:
I just don't think your "just works" approach is feasible or a good thing.

CORBA and others prove it is feasible. I enjoy not working with CORBA
so won't comment about the goodness part.

I haven't studied CORBA in depth, only used it during part of a university
class in Java, but I didn't get the feeling that it "just works" at all.

Quote:
Again, I really don't think that's a good thing. A class is usable only
when all its invariants are fulfilled. A partially deserialized class is
worse than useless: it's dangerous.

I disagree, but acknowledge your point. Think in terms of messages
instead of objects. "I went tk prayr meeting. Be nacl around" is
better than nothing as long as you understand you didn't get the
whole thing. Sometimes you have to limp along and trust things
will work out.

I'm afraid that, as long as we're talking about objects, I cannot make
myself think of messages instead. Half-baked objects are dangerous, and
code to handle these situations would make the deserialization mechanism
anything but non-intrusive. In fact, intuitively I'd say it would take
over the majority of the class's code, not to mention affect the
interface.

It might make sense for some classes. But they are the (rare) exception
instead of the rule, and as such I would consider them for a general
serialization library.

Sebastian Redl

Guest






Posted: Sun May 20, 2007 11:44 pm    Post subject: Re: C++ private/protected hack

Quote:
Right, but that doesn't mean there isn't a place for the "just works"
approach. There should be a way to indicate that you will write the
marshalling code for a given class by hand. The "just works" way
would still be used in many cases and its use may increase as the
technology matures.

I see it the other way round. If I had lots of trivial classes to
serialize, I would still want to explicitly mention them. E.g. I could put
in:

TRIVIAL_SERIALIZATION(ClassName);

and then have an additional header preprocessor (not unlike this
gccxml-based tool) go over the source and replace these declarations with
trivial serialization functions. For Boost.Serialization, it could look
like:

class Foo
{
    int a, b, c;

    TRIVIAL_SERIALIZATION(Foo);

    // ...
};

->

class Foo
{
    int a, b, c;

    template <typename Archive>
    void serialize(Archive &ar, /* others */)
    {
        ar & a & b & c;
    }
};


If there are n total classes in a program, then m <= n
of those classes will have instances marshalled. And
h <= m would have hand written marshalling code. The
Ebenezer approach has something like what you want to
indicate a class is part of m -- the presence of
Send/Receive function prototypes. What we don't
currently have (online) is a way to indicate you want
the class to be a part of h. We're thinking about
how to do that. One of the questions we currently
have is how fine grained to make it. Should it be
possible to have a computer written send function and
a hand written receive? In our experience, sending
isn't as tricky as receiving in part because memory
allocation is not an issue.
We might do something like this:

class A
{

int hwSend(...);
int hwReceive(...);
....
};

The "hw"s would tell us you intend to write both of
those functions by hand.


Quote:

Again, I really don't think that's a good thing. A class is usable only
when all its invariants are fulfilled. A partially deserialized class is
worse than useless: it's dangerous.

I disagree, but acknowledge your point. Think in terms of messages
instead of objects. "I went tk prayr meeting. Be nacl around" is
better than nothing as long as you understand you didn't get the
whole thing. Sometimes you have to limp along and trust things
will work out.

I'm afraid that, as long as we're talking about objects, I cannot make
myself think of messages instead. Half-baked objects are dangerous, and
code to handle these situations would make the deserialization mechanism
anything but non-intrusive. In fact, intuitively I'd say it would take
over the majority of the class's code, not to mention affect the
interface.


Say you have two releases, 1.1 and 1.2. If a given 1.2
class only extends its 1.1 predecessor and the stuff that
is new to 1.2 is "after" the 1.1 stuff, you might be able
to produce a 1.1 object from a message that was intended
to yield a 1.2 object. Often 1.2 servers have to be able
to work with both 1.1 and 1.2 ambassadors(clients) so it
fits in with that understanding.
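A rough sketch of that idea, under heavy assumptions: the message is modelled as a plain vector of ints with one entry per field, and StopV11, StopV12, receiveV11 and receiveV12 are names invented for the example. None of this is the Ebenezer wire format or generated code.

```cpp
#include <vector>

typedef std::vector<int> Message;  // toy wire format: one int per field

// The class as it existed in release 1.1.
struct StopV11 {
    int id;
    int minutes;
    StopV11() : id(0), minutes(0) {}
};

// Release 1.2 only extends 1.1; the new field comes "after" the old ones.
struct StopV12 : StopV11 {
    int platform;
    StopV12() : platform(0) {}
};

// Reads the 1.1 fields from the front of the message and ignores the rest.
bool receiveV11(const Message& msg, StopV11& out) {
    if (msg.size() < 2) return false;
    out.id = msg[0];
    out.minutes = msg[1];
    return true;
}

// Reads the 1.1 part first, then the field appended in 1.2.
bool receiveV12(const Message& msg, StopV12& out) {
    if (msg.size() < 3) return false;
    receiveV11(msg, out);  // a StopV12 can be passed where a StopV11& is expected
    out.platform = msg[2];
    return true;
}
```

Because the 1.2 field is appended, receiveV11 can build a 1.1 object from a message that was meant to yield a 1.2 object, while a 1.1-sized message is rejected by receiveV12.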

As far as the deserialization mechanism, the receive
code casts the object to its base class if an error
occurs in a derived part of the process. This makes
it possible to have some functionality and not waste the
effort that has been expended to that point. There are levels/steps
involved here that can be used beneficially.
You don't have to fall all the way down the stairs.

Quote:

It might make sense for some classes. But they are the (rare) exception
instead of the rule,

I think it should be the norm.

Quote:
and as such I would consider them for a general
serialization library.


I wouldn't recommend using B.Ser if you want to do this.
It makes a mush of 1.1 and 1.2 code/data by lumping all
of it into the same class. There is no longer a 1.1
class to use in a cast. This is another source of
inefficiency with B.Ser. If a 1.1 ambassador works with
a 1.2 B.Ser-based server, the server has to use bloated
1.2 instances to receive the 1.1 data because that is
all it has. On the other hand, the Ebenezer approach
would have two distinct classes.

Brian Wood
www.webEbenezer.net

"It is better to send(give) than to receive."


Yannick Tremblay
Guest





Posted: Mon May 21, 2007 7:43 pm    Post subject: Re: C++ private/protected hack

In article <1179348985.571477.43380 (AT) q23g2000hsg (DOT) googlegroups.com>,
<bji-ggcpp (AT) ischo (DOT) com> wrote:
Quote:
On May 17, 3:47 am, ytrem...@nyx.net (Yannick Tremblay) wrote:

Because *you* didn't write the original class. Someone else did. If
you did, then that whole discussion would be pointless since you could
just have declared all these fields public in the first place.

I've gotten this same sort of response in other emails. I guess it
highlights a difference of how I view code from how other people view
it. Did you realize that once you give me your class definitions and
your libraries, you are no longer in total control of them? Did you
know that there is no law against me using your classes in ways that
you did not intend? Did you know that I could write a "proxy" class
with the same exact structure, but with different access specifiers,
cast a pointer to your class to mine, and then modify the members of
your class, and that there's nothing you can do to stop me?

I am afraid, you've lost me here. What's your goal?

My understanding was that you were trying to write a library for other
people to use, i.e. to have other people use your classes, not to take
other people's classes, hack them and do whatever you want with
them.

If it is the second case, by all means, if you have legal rights to the
code and are willing to take responsibility for your modifications,
you are totally free to make whatever modifications to it you
want.

However, if this is the first case, my classes stay my classes, I am
still responsible for them. You are publishing a library in the hope
that I will choose to use it because it will be useful for me but you
will not take the responsibilities for issues caused by the breaking
of the class internals due to some incompatibilities between your
implementation and my class. So in this case, as far as I am
concerned, they stay my class because I keep responsibility.

Quote:
And besides - my goal was not to subvert your classes; it was to
provide a tool that allowed *you* to subvert *your own* classes, in a
way that I thought might be useful to you.

I don't want to subvert my own classes. I wrote them that way because
I decided that it was the way they should be written. If I had wanted
some fields to be accessible by anyone, I'd have declared them public.
If I'd wanted some fields not to be accessible by everyone but only
by a selected few, I'd have declared these few friends. I see no need
to subvert a conscious decision I made.

Quote:
Yes I have. And let me respond with a question in turn: have you ever
tried to push the boundaries of a problem to try to come up with a
unique solution - or do you always just follow the rules that others
have set out for you?

Depends if these rules make sense to me and if I respect the
opinions and knowledge of those that set them. :-)

In this particular case however, I don't see the need to break the
rules because I can see a lot of potential for a creative solution
that doesn't involve rule-breaking.

Rules are made to be broken but you had better be ready with a good
explanation why. :-)

Quote:
No, you are not *the* programmer. You are not the designer of the
class. You may well be *a* programmer but you are not *the*
programmer. You are a user of the class. There is a very fundamental
difference between the two.

There is something you need to understand: the compiler doesn't care
who wrote the code. I don't care either when it comes to the purposes
of my tool. Your code, my code, it doesn't make a difference in the
end. All that matters is that the resulting application work, and
that the code itself is maintainable. With every single approach you
can ever mention for anything programming-related, there are
tradeoffs, and my tool would be no different. It will make some code
easier to maintain and some code harder. It becomes a matter for the
developer to decide how and when to best apply it.

I totally agree with the above paragraph. Unfortunately, it leads me
to a totally different conclusion than you. The key sentence is in
the middle but I didn't want to cut the context:

"All that matters is that the resulting application work, and that the
code itself is maintainable."

IMO, your suggestion might save a small amount of time at getting the
application to work at huge cost to the maintainability of the code
base.

Implementation (information) hiding is a very important tool in the
arsenal of the software engineer. Much better known people than me
have written on the matter so I won't try to explain all the
advantages here but essentially, your approach breaks any attempt at
implementation hiding by the original designer of the class and
couples the code that needs a serialised object to the internal
implementation of the object. A much safer approach couples code that
needs serialised objects to a serialisation interface.

Quote:
Look, I'm not stupid; I fully realize that there are going to be cases
where not all member variables make sense when serialized. I will
provide mechanisms for allowing the programmer to tell the serializer
what to do. My approach is one that will make serializing "trivially
serializable" classes completely and utterly trivial. But it will
still require programmer effort for "non-trivially serializable"
classes.

Here's my problem: your tool as you plan it would, for the "trivially
serializable" classes, give me serialisation for free, but for the
"non-trivially serializable" classes it would break my application
unless I make a specific effort (and do not forget to make it) to explain
that this class is not trivially serializable.

By contrast, an approach that requires me to add either member methods
for serialisation or "friend TrivialSerialiser" gives me serialisation
of my trivial class cheaply (consider whether the class is trivial; if so, add
one line to the header) but is no danger to non-trivially serializable
classes unless I make a specific mistake.

Unsafe approach: requires an active action to avoid a bug.
Safer approach: will work unless a mistake is actively done.
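A minimal sketch of the opt-in variant, assuming a serialiser class named TrivialSerialiser as in the text above. Reading, its members and the comma-separated output are made up for the example; a real serialiser would be generic rather than written against one class.

```cpp
#include <string>

class TrivialSerialiser;  // hypothetical serialiser, declared up front

class Reading {
public:
    Reading(int sensor, int value) : sensor_(sensor), value_(value) {}

private:
    // The one visible line in the header that grants access. A maintainer
    // changing the private members sees that serialisation depends on them.
    friend class TrivialSerialiser;

    int sensor_;
    int value_;
};

class TrivialSerialiser {
public:
    // Allowed to read the private members only because Reading said so.
    static std::string save(const Reading& r) {
        return std::to_string(r.sensor_) + "," + std::to_string(r.value_);
    }
};
```

A class that never adds the friend line is simply out of reach: any attempt by the serialiser to touch its private members is a compile error rather than a runtime surprise.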

Worse, I could write a class that is trivially serializable. I could
make the conscious decision to use your tool with that class because I
know it is safe in that case. A different developer in the future
could modify the internal implementation of the class and, as a
consequence, change it so that it is no longer trivially serializable.
Unfortunately, there would be nothing in the class to
inform that developer that what he is doing is unsafe. All he is
doing is modifying the private internals of the class without touching the
interface, so he should be able to do it in isolation. Nothing
tells him otherwise. He shouldn't have to analyse the whole code base to find
out that one communication module, in a part of the code that he is
not working on, actually x-rays the class without asking permission.
So this apparently safe little internal change has now broken a
totally separate part of the product. The breakage is only detectable by
noticing that a client received a message different from the one it
expected, and it cannot be caught by the class's own unit tests
because the class doesn't know it is being x-rayed.

The alternative approach clearly states in the class that the
maintenance programmer needs to consider serialisation, greatly reducing
the risk of it being forgotten or mishandled.

So forgive me but my evaluation of this approach is:

- Probably save myself 10 minutes of development time by not having to
edit a header file to add a friend declaration for trivial classes.

- Potentially trip myself by forgetting to use your facility for
non-trivial serialisation since the compiler doesn't warn me of
anything.

- Potentially trip a future maintainer by hiding implementation
dependency deep into client code that should only care about
interface.

- Potentially release broken application to customers because these
bugs between a client that x-ray a class and a class implementation
are fundamentally hard to detect. (compiler doesn't detect, class
unit test won't detect, etc.)

- Potentially need to spend several man-weeks of time tracking and
fixing the problem of this released buggy application.

- Potentially lose $ LARGE_AMOUNT due to the release of a buggy
application to customers.

So I am afraid the saving of 10 minutes (or let's call it 1 hour with
decent testing) is not IMO worth the risk introduced by breaking the
interface. So if I were looking for a serialisation tool, came across
a tool such as the one you originally proposed and evaluated it, I
would very quickly reject it. In addition, I would also strongly
recommend against its use to any people I collaborate with and I
would refuse to approve its use where I have authority.


Yan




Guest






Posted: Tue May 22, 2007 3:37 am    Post subject: Re: C++ private/protected hack

On May 21, 11:44 am, c...@mailvault.com wrote:

Quote:
If there are n total classes in a program, then m <= n
of those classes will have instances marshalled. And
h <= m would have hand written marshalling code. The
Ebenezer approach has something like what you want to
indicate a class is part of m -- the presence of
Send/Receive function prototypes. What we don't
currently have (online) is a way to indicate you want
the class to be a part of h. We're thinking about
how to do that. One of the questions we currently
have is how fine grained to make it. Should it be
possible to have a computer written send function and
a hand written receive? In our experience, sending
isn't as tricky as receiving in part because memory
allocation is not an issue.
We might do something like this:

class A
{

int hwSend(...);
int hwReceive(...);
....

};

The "hw"s would tell us you intend to write both of
those functions by hand.

I think that sounds like a good approach. You detect if the class has
defined hwSend and if so, you generate a Send method that just calls
hwSend. Otherwise, you assume that the developer wants you to
generate the Send method logic and you generate a fully implemented
method. Same for hwReceive/Receive. In this way the class always
ends up with a generated Send/Receive method, but the implementation
is either supplied by your tool, or supplied by the developer.
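The same decision can be sketched inside the language, which is not how a code generator would do it but shows the logic. Everything here is an assumption made for illustration: hwSend is given a no-argument, string-returning signature (the post leaves the parameters as "..."), the "generated" branch is a trivial placeholder, and Plain, Custom, has_hwSend and Send are invented names.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Compile-time check: does T have a member callable as obj.hwSend()?
template <typename T>
class has_hwSend {
    template <typename U>
    static auto test(int)
        -> decltype(std::declval<const U&>().hwSend(), std::true_type());

    template <typename U>
    static std::false_type test(...);

public:
    static const bool value = decltype(test<T>(0))::value;
};

// Chosen when the developer supplied hwSend: just forward to it.
template <typename T>
std::string sendImpl(const T& obj, std::true_type) {
    return obj.hwSend();
}

// Chosen otherwise: placeholder for the logic a tool would generate.
template <typename T>
std::string sendImpl(const T& obj, std::false_type) {
    return "auto:" + std::to_string(obj.value);
}

// Every class ends up with a Send; only the implementation differs.
template <typename T>
std::string Send(const T& obj) {
    return sendImpl(obj, std::integral_constant<bool, has_hwSend<T>::value>());
}

struct Plain {
    int value;
};

struct Custom {
    int value;
    std::string hwSend() const { return "hand:" + std::to_string(value); }
};
```

With these definitions, Send on a Plain takes the generated path and Send on a Custom calls the hand-written hwSend.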

I will use something similar in my serialization tool; however, I
define the problem a little differently than Send and Receive
methods. I am taking a different (and to most people, probably pretty
weird) approach of never letting the developer write the serialization
code. The serialization code is always generated by the tool. The
thing that the developer does get to do is to define "proxy" classes
that convert classes which are not in an automatically-serializable
form, into an automatically-serializable form, such that the
serialization logic for them can be generated by the tool.

So the developer's job for "nontrivial" classes becomes, instead of
"write Send/Receive methods" (or Serialize/Deserialize methods,
whatever you want to call them), "write a different representation of
the class, one which is trivially serializable, and write methods for
converting to/from your class and its serializable representation".
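As a sketch of what such a pair might look like: Inventory and InventoryProxy are invented names, and the assumption that a struct of two parallel vectors counts as "trivially serializable" while a std::map does not is mine, since the actual rules of the tool are not spelled out here.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Flat representation assumed to be acceptable to the generator.
struct InventoryProxy {
    std::vector<std::string> items;
    std::vector<int> counts;
};

// The "real" class, whose internal representation is not flat.
class Inventory {
public:
    void add(const std::string& item, int count) { counts_[item] += count; }

    int count(const std::string& item) const {
        std::map<std::string, int>::const_iterator it = counts_.find(item);
        return it == counts_.end() ? 0 : it->second;
    }

    // Convert to the flat form; the tool would serialise the proxy.
    InventoryProxy toProxy() const {
        InventoryProxy p;
        for (std::map<std::string, int>::const_iterator it = counts_.begin();
             it != counts_.end(); ++it) {
            p.items.push_back(it->first);
            p.counts.push_back(it->second);
        }
        return p;
    }

    // Rebuild the real object from a deserialised proxy.
    static Inventory fromProxy(const InventoryProxy& p) {
        Inventory inv;
        for (std::size_t i = 0; i < p.items.size() && i < p.counts.size(); ++i)
            inv.add(p.items[i], p.counts[i]);
        return inv;
    }

private:
    std::map<std::string, int> counts_;
};
```

The developer writes only the two conversions; the byte-level reading and writing stays with the generated code.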

You might wonder why this wacky approach is any better than letting
the developer hand-write serialization code. Well, the thing is this:
my serialization format, and the associated serialization logic,
allows for deserialization in "pieces" (i.e. without requiring that
all of the data be available immediately; you can be reading chunks of
data off of the network and calling into the deserializer to "add
them" to the object being deserialized; you could be adding 1 byte at
a time to the deserialized object and the deserialization logic
handles it by keeping track of the deserialization state so that when
you have more data available, it can pick up where it left off). This
would be very, very hard to do manually by writing your own
deserialization method. It is best left up to the tool which can
generate this error-prone logic for you.
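To give a feel for the bookkeeping involved, here is a hand-written toy version for a single fixed record. The record layout (two 32-bit little-endian integers) and the name PointDecoder are assumptions for the example, not the format or the generated code of the tool being described.

```cpp
#include <cstddef>
#include <cstdint>

// Resumable decoder for a toy record: x then y, each a 32-bit
// little-endian unsigned integer (8 bytes in total).
class PointDecoder {
public:
    PointDecoder() : x_(0), y_(0), received_(0) {}

    // Consume up to len bytes and return how many were used. May be called
    // repeatedly with chunks of any size, including one byte at a time.
    std::size_t feed(const unsigned char* data, std::size_t len) {
        std::size_t used = 0;
        while (used < len && !done()) {
            std::uint32_t& field = (received_ < 4) ? x_ : y_;
            unsigned shift = static_cast<unsigned>(8 * (received_ % 4));
            field |= static_cast<std::uint32_t>(data[used]) << shift;
            ++received_;  // this counter is the saved deserialization state
            ++used;
        }
        return used;
    }

    bool done() const { return received_ == 8; }
    std::uint32_t x() const { return x_; }
    std::uint32_t y() const { return y_; }

private:
    std::uint32_t x_;
    std::uint32_t y_;
    std::size_t received_;
};
```

Even for two integers the state tracking is fiddly; with nested objects, strings and containers it grows quickly, which is the argument for generating it.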

Your job instead is to write logic for converting your non-trivially-
serializable class into a trivial form. This I expect to be much
easier code to write and debug. But I admit, it is a little weird, to
have to write "proxy classes" whose data members obey certain rules,
and convert your classes to/from them. I can only hope that most
people would start out writing trivially-serializable classes if they
knew that they were going to be using serialization with them, thus
eliminating the need for later writing a proxy class, and only in a
few extraordinary cases resorting to proxies.

I intend to provide pre-written proxies for common classes such as STL
library classes.


Quote:
Say you have two releases, 1.1 and 1.2. If a given 1.2
class only extends its 1.1 predecessor and the stuff that
is new to 1.2 is "after" the 1.1 stuff, you might be able
to produce a 1.1 object from a message that was intended
to yield a 1.2 object. Often 1.2 servers have to be able
to work with both 1.1 and 1.2 ambassadors(clients) so it
fits in with that understanding.

If you're talking about versioning, I intend to leave this up to the
developer to handle. I see versioning as a process whereby:

- Software is released into the wild and generates serialized versions
of internal data structures to persistent storage

- Software is later updated to change its internal data structures,
and needs to read the data structures written by previous versions of
the software

- At this time, the developer should be defining new classes to
represent the new internal data structures while leaving the old
intact. So if you had a "BusRoute" class that you were serializing
with a previous release, then you would rename that class "BusRouteV1"
and make a new "BusRoute" class for your new release. Your code would
contain logic for serializing both types, and when you encountered a
BusRouteV1 object, you'd read it in and either use it as it was, or
write methods to convert it into a BusRoute object. In this way you
could have the vast majority of your code written to use only the new
BusRoute object, and only have your save/load code know that there is
such a thing as a BusRouteV1 object and how to read it in and convert
it into the BusRoute object that the rest of your code uses.

In this way no particular class is versioned per se; each class
represents an unalterable representation of a particular data
structure as it was used in a particular release of the software.

This has three advantages over the B.Ser approach:

1) It takes versioning support out of the serialization library and
API itself, simplifying that library and the API

2) It allows the software to easily "end-of-life" old data structures
simply by removing them from the code. For example, you might
concurrently support "BusRouteV1", "BusRouteV2", and "BusRoute" data
structures for a while, with code to convert the V1 and V2 objects
into the "current" version whenever you read a V1 or V2. Now for the
next release, you decide to remove V1 support because it's pretty old
and you don't want to carry that baggage anymore. All you have to do
is to delete BusRouteV1 from your source files.

3) It keeps each class "pure" in that each version of the data
structure is self-contained and doesn't keep track of multiple
concurrent versions of itself. The serialization/deserialization
logic is not "cluttered" with multi-version support.

The disadvantages are:

1) You have to explicitly keep track of multiple versions of class
files in your source code and write conversion logic to take an old V1
version of a class and convert it into a new version of that class.

2) You end up doing some more work at runtime because you read in a V1
object, then create a "current" version of the object from it, then
delete the read-in V1 that you don't need any more.

3) You have to use something other than the class name to identify the
class in the serialized form, to allow the class name to change but
the data structure being represented to continue to have the same
serialization identifier. This is similar to Java's serialVersionUID.
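The scheme above might look roughly like this in code. The field names, the upgrade function and the numeric identifiers are all made up; the only things taken from the description are the BusRouteV1/BusRoute naming and the idea of an identifier that is independent of the class name.

```cpp
#include <string>

// The structure exactly as release 1 wrote it. It was called "BusRoute"
// back then; the identifier stays the same even though the name changed.
struct BusRouteV1 {
    static constexpr unsigned kSerialId = 1001;  // hypothetical identifier
    int number;
    int stopCount;
};

// The structure the rest of the current code works with.
struct BusRoute {
    static constexpr unsigned kSerialId = 1002;  // hypothetical identifier
    std::string name;
    int stopCount;
    bool express;
};

// Conversion logic lives in the load path only; fields that did not exist
// in release 1 get defaults.
BusRoute upgrade(const BusRouteV1& old) {
    BusRoute r;
    r.name = "Route " + std::to_string(old.number);
    r.stopCount = old.stopCount;
    r.express = false;
    return r;
}
```

Dropping support for the old format then amounts to deleting BusRouteV1 and upgrade, as in advantage 2 above; the cost is the extra temporary object noted in disadvantage 2.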

Quote:
As far as the deserialization mechanism, the receive
code casts the object to its base class if an error
occurs in a derived part of the process. This makes
it possible to have some functionality and not waste the
effort that has been expended to that point. There are levels/steps
involved here that can be used beneficially.

Hm, I don't quite see how this could work. It does seem dangerous to
me to have created a Bar (subclass of Foo) for the purposes of
deserialization, then having deserialization fail and decide that you
already deserialized all of the Foo fields, so just cast the thing to
a Foo and use it like that. Later when you go to destroy the object,
you're going to run into trouble because you'll be treating it like a
Foo when it's really a Bar, and the destructor may do very wrong
things with a partially-deserialized class.

While I think that it could work in certain limited situations, it is
error-prone enough that I would tend to not want to use such a
technique. If the thing can't be deserialized fully and correctly,
then you have to destroy it (by the way this is another thing that the
generated deserialization logic can do easily that a developer would
have a hard time doing without errors: stopping a deserialization
process halfway through and cleaning up a half-deserialized object)
and consider the deserialization as having failed entirely.

Quote:
You don't have to fall all the way down the stairs.

I like the way you put things. I will have to remember these quips
and use them myself :)

Quote:
It might make sense for some classes. But they are the (rare) exception
instead of the rule,

I think it should be the norm.

I would hope that anyone writing classes which are intended to be
serialized would take the time to ensure that the structure of these
classes matches the requirements of the serialization tool. If they do
this, then there would be no need for proxy classes or other hand coding
later on to support the serialization process.

Thanks,
Bryan


All times are GMT