 |
C++Talk.NET C++ language newsgroups
|
| View previous topic :: View next topic |
| Author |
Message |
Gennaro Prota Guest
|
Posted: Tue Sep 26, 2006 8:37 pm Post subject: Standard back inserter on non-container |
|
|
Hi all,
I'm currently refining a mini-framework to implement one-way hash
functions. To simplify things a bit let's assume the main abstractions
of the sistem are
template< typename Engine >
class hasher;
(where Engine can be MD5, SHA1, etc.) and
template< class Hasher >
class digest;
The only user-callable functions in class hasher (besides
constructors) are:
void push_back( byte_type b );
template< typename InputIter >
void append( InputIter begin, InputIter end );
which allow to "enqueue" a byte, or sequence thereof, into an input
block (merkle damgard tecnique).
Now, what I wonder is: if in addition to the push_back() function my
hasher just provides
typedef byte_type & reference;
typedef const byte_type & const_reference;
is it legal/moral to use a standard back_insert_iterator on it even if
the hasher itself doesn't qualify as a standard container?
Example:
md5_hasher h;
std::copy( s.begin(), s.end(), std::back_inserter( h ) );
--
Gennaro Prota
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Greg Herlihy Guest
|
Posted: Wed Sep 27, 2006 4:13 am Post subject: Re: Standard back inserter on non-container |
|
|
Gennaro Prota wrote:
| Quote: | Hi all,
I'm currently refining a mini-framework to implement one-way hash
functions.
Now, what I wonder is: if in addition to the push_back() function my
hasher just provides
typedef byte_type & reference;
typedef const byte_type & const_reference;
is it legal/moral to use a standard back_insert_iterator on it even if
the hasher itself doesn't qualify as a standard container?
Example:
md5_hasher h;
std::copy( s.begin(), s.end(), std::back_inserter( h ) );
|
The insert iterators are adapter class templates that wrap standard
containers. I would certainly interpret "container" to mean a class
that satisifies the container requirements spelled out in Chapter 23.
And judging from the looks of them, there is more to writing a standard
container class than declaring two typedefs.
But why even go to the trouble of converting your hashing container
class to be a standard container, instead of just using one of the two
that have already been written? From what I can tell, the two
requirements for this container are a) a customizable hashing function
b) standard container conformance. And given those criteria,
std::tr1::unordered_set and std::tr1::unordered_map have no trouble
qualifiying.
Greg
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
kanze Guest
|
Posted: Wed Sep 27, 2006 4:26 pm Post subject: Re: Standard back inserter on non-container |
|
|
Gennaro Prota wrote:
| Quote: | I'm currently refining a mini-framework to implement one-way
hash functions. To simplify things a bit let's assume the main
abstractions of the sistem are
template< typename Engine
class hasher;
(where Engine can be MD5, SHA1, etc.) and
template< class Hasher
class digest;
|
See http://kanze.james.neuf.fr/code-en.html. In the Digest
component of the IO subsystem, there are template classes Hasher
and Digest, as well as policy classes for MD5 and the known (to
me) SHA hashes.
| Quote: | The only user-callable functions in class hasher (besides
constructors) are:
void push_back( byte_type b );
template< typename InputIter
void append( InputIter begin, InputIter end );
|
How do you read the value? .
I've also found a reset() function useful.
Interestingly enough, I don't have a push_back function,
although I probably should. The initial design was for the
class to be an "accumulator", to be used with std::accumulate,
thus:
out << std::accumulate( text.begin(), text.end(), MD5Hasher() ) ;
(where MD5Hasher is a typedef for Hasher<MD5Policy>, of course).
This ends up resulting in a lot of copying, so I added the
append function. There is a short discussion about this (an my
own considerations about using an output iterator) in the
documentation:
http://kanze.james.neuf.fr/doc/en/IO/html/index.html, then
navigate to the class Hasher.
(Note that I was fairly careful to implement the class in a way
that copying would take a minimum amount of time. On the other
hand, I haven't really tried to find any means of sharing the
state, without copying---operator+ would have to return a
temporary which would convert to a Hasher in general case, but
for which assignment would be overloaded so that it modified the
state in place.)
| Quote: | which allow to "enqueue" a byte, or sequence thereof, into an
input block (merkle damgard tecnique).
Now, what I wonder is: if in addition to the push_back()
function my hasher just provides
typedef byte_type & reference;
typedef const byte_type & const_reference;
is it legal/moral to use a standard back_insert_iterator on it
even if the hasher itself doesn't qualify as a standard
container?
Example:
md5_hasher h;
std::copy( s.begin(), s.end(), std::back_inserter( h ) );
|
You could probably make it work, but it seems like abuse (and
almost obfuscation) to me. The route I was considering was to
write my own insertion iterator, and add a function which
returned it, so you could write:
std::copy( s.begin(), s.end(), h.inserter() ) ;
I haven't yet done so; to quote my documentation:
I'm not completely sure of this however: it looks almost
like abuse, and since unlike using std::accumulate, it
requires an actual instance lasting beyond the end of the
expression in order to extract the results, I wonder about
the actual utility. (If you have an instance, there's no
reason not to use the template form of append.)
Personally, I like the accumulate idiom, and if I can find time
(ha!), I'd like to investigate the possibility of writing a
version of accumulate which uses += if it can find it, and +
then = otherwise. If accumulate used +=, there would be no real
time penalty (unless you called accumulate for a lot of little
sequences, instead of just once).
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
kanze Guest
|
Posted: Wed Sep 27, 2006 5:08 pm Post subject: Re: Standard back inserter on non-container |
|
|
Greg Herlihy wrote:
| Quote: | Gennaro Prota wrote:
I'm currently refining a mini-framework to implement one-way hash
functions.
Now, what I wonder is: if in addition to the push_back() function my
hasher just provides
typedef byte_type & reference;
typedef const byte_type & const_reference;
is it legal/moral to use a standard back_insert_iterator on it even if
the hasher itself doesn't qualify as a standard container?
Example:
md5_hasher h;
std::copy( s.begin(), s.end(), std::back_inserter( h ) );
The insert iterators are adapter class templates that wrap standard
containers. I would certainly interpret "container" to mean a class
that satisifies the container requirements spelled out in Chapter 23.
And judging from the looks of them, there is more to writing a standard
container class than declaring two typedefs.
But why even go to the trouble of converting your hashing
container class to be a standard container, instead of just
using one of the two that have already been written?
|
Because his hasher isn't a container at all, to begin with, at
least not in the sense of the standard. He simply wants to fool
the system into thinking it's one. Thus, for example, push_back
doesn't increase the size of container; it just changes some
internal state.
As I mentionned in my own response to his question, I considered
myself creating an "insertion iterator", an output iterator
which "pushed back" each element into the hasher, and decided
against it. Somehow, a hasher just doesn't feel like a
container, and copying into it seems almost like obfuscation.
(Logically, I find that a hasher is more of an accumulator: it
has state of constant length, which changes each time you append
an element. When attempting to make it conform to a "standard"
model or idiom of some sort, the most natural overload to me
seemed +=. Although it's different enough that I probably
wouldn't have used it except that accumulate uses +, and I
didn't want + without +=.)
| Quote: | From what I can tell, the two requirements for this container
are a) a customizable hashing function b) standard container
conformance. And given those criteria, std::tr1::unordered_set
and std::tr1::unordered_map have no trouble qualifiying.
|
Except that he doesn't want standard container conformance. In
particular, a function like size() doesn't make sense; if there
is a size(), it is a constant, and you certainly cannot meet the
post condition of the default constructor, that size() == 0. I
can't find any formal requirement that size() == old.size() + 1
is a post-condition of push_back(), but I think it's implicit
from the other requirements, and I think we'd all be pretty
surprised if it wasn't true for something claiming to be a
container.
I think his question is really two questions: given something
that really isn't a container, but can be twisted to sort of
look like one for the sort of things that back_inserters do, is
it legal, and is it moral? In my mind, the answer to the second
is a resounding no---giving the impression that something is a
container when it isn't will generally lead to the sort of
confusion about it that you seem to have. The answer to the
first is a bit more complicated. As you say, the insert
iterators in general seem to have been conceived with the
standard containers in mind. And while it wouldn't be at all in
the philosophy of the STL if they wouldn't work equally as well
with a user defined container, the standard does seem rather
silent about what parts of the container requirements are really
requirements. In the absense of any more definite statement, I
would say that to be a legal container argument for an insertion
iterator, a class must meet all of the container requirements
specified in §2.3.1---and as you say, there's a lot more too it
than just a few typedef's. Logically, I can't imagine any
conceivable reason why an insertion iterator would need a
default constructor which creates an object of size() == 0, but
in the absense of any other specification, I'd say that that is
what the standard requires.
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Guest
|
Posted: Wed Sep 27, 2006 5:09 pm Post subject: Re: Standard back inserter on non-container |
|
|
Gennaro Prota wrote:
[trimmed a bit]
| Quote: | class hasher {
public: void push_back( byte_type b );
If in addition to the push_back() function my
hasher just provides
typedef byte_type & reference;
typedef const byte_type & const_reference;
is it legal/moral to use a standard back_insert_iterator on it even if
the hasher itself doesn't qualify as a standard container?
|
No, for the reasons given, but is it really hard to create your own
iterator
in this case? operator* and operator++ are trivial, and you can get the
typedefs by inheriting.
HTH,
Michiel Salters
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Guest
|
Posted: Wed Sep 27, 2006 10:39 pm Post subject: Re: Standard back inserter on non-container |
|
|
kanze wrote:
| Quote: | Because his hasher isn't a container at all, to begin with, at
least not in the sense of the standard. He simply wants to fool
the system into thinking it's one. Thus, for example, push_back
doesn't increase the size of container; it just changes some
internal state.
|
In some ways, it looks more like a stream than a container. In
particular, a hasher resembles a stringstream: you put a bunch of
things into it (via operator <<) , then extract a value from it (via
str()). If that analogy makes sense, you'd want an ostream_iterator
rather than a back_insert_iterator.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Peter Dimov Guest
|
Posted: Wed Sep 27, 2006 10:39 pm Post subject: Re: Standard back inserter on non-container |
|
|
Gennaro Prota wrote:
| Quote: | Hi all,
I'm currently refining a mini-framework to implement one-way hash
functions. To simplify things a bit let's assume the main abstractions
of the sistem are
template< typename Engine
class hasher;
(where Engine can be MD5, SHA1, etc.) and
template< class Hasher
class digest;
The only user-callable functions in class hasher (besides
constructors) are:
void push_back( byte_type b );
template< typename InputIter
void append( InputIter begin, InputIter end );
which allow to "enqueue" a byte, or sequence thereof, into an input
block (merkle damgard tecnique).
Now, what I wonder is: if in addition to the push_back() function my
hasher just provides
typedef byte_type & reference;
typedef const byte_type & const_reference;
is it legal/moral to use a standard back_insert_iterator on it even if
the hasher itself doesn't qualify as a standard container?
|
My opinion is that:
1. Yes, it is moral to do so, as the only requirements that the
standard explicitly places on the Container template parameter are
const_reference, push_back and unary operator&.
2. If it isn't technically legal to do so (if, for example, a future
concept-aware compiler is technically allowed to reject the code), then
this is a defect in the standard and needs to be fixed.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Gennaro Prota Guest
|
Posted: Thu Sep 28, 2006 2:03 am Post subject: Re: Standard back inserter on non-container |
|
|
On Tue, 26 Sep 2006 22:13:03 CST, "Greg Herlihy" <greghe (AT) pacbell (DOT) net>
wrote:
| Quote: | But why even go to the trouble of converting your hashing container
class to be a standard container, instead of just using one of the two
that have already been written? From what I can tell, the two
requirements for this container are a) a customizable hashing function
b) standard container conformance. And given those criteria,
std::tr1::unordered_set and std::tr1::unordered_map have no trouble
qualifiying.
|
I think there has been a misunderstanding. My hasher template has
nothing to do with TR1's unordered_map and has no concept of "key" in
itself. It is just a machine for computing digests. Basically you
provide a policy which implements a small number of algorithm-specific
operations and it provides all the scaffolding necessary to obtain a
full fledged Merkle-Damgard construction:
<http://en.wikipedia.org/wiki/Merkle-Damgard>
(in effect, in my mini-framework it is called merkle_damgard_hasher;
that's not the only possible kind of hasher)
--
Gennaro Prota
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
kanze Guest
|
Posted: Thu Sep 28, 2006 3:42 pm Post subject: Re: Standard back inserter on non-container |
|
|
johnchx2 (AT) yahoo (DOT) com wrote:
| Quote: | kanze wrote:
Because his hasher isn't a container at all, to begin with, at
least not in the sense of the standard. He simply wants to fool
the system into thinking it's one. Thus, for example, push_back
doesn't increase the size of container; it just changes some
internal state.
In some ways, it looks more like a stream than a container. In
particular, a hasher resembles a stringstream: you put a bunch of
things into it (via operator <<) , then extract a value from it (via
str()). If that analogy makes sense, you'd want an ostream_iterator
rather than a back_insert_iterator.
|
Agreed, although the analogy isn't perfect either. In both
cases, you expect what you put in to be what you can get out
later, which isn't the case here. Myself, I would tend toward
a custom output iterator, and leave both ostream_iterator and
back_insert_iterator be.
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
kanze Guest
|
Posted: Thu Sep 28, 2006 4:03 pm Post subject: Re: Standard back inserter on non-container |
|
|
johnchx2 (AT) yahoo (DOT) com wrote:
| Quote: | kanze wrote:
Because his hasher isn't a container at all, to begin with, at
least not in the sense of the standard. He simply wants to fool
the system into thinking it's one. Thus, for example, push_back
doesn't increase the size of container; it just changes some
internal state.
In some ways, it looks more like a stream than a container. In
particular, a hasher resembles a stringstream: you put a bunch of
things into it (via operator <<) , then extract a value from it (via
str()). If that analogy makes sense, you'd want an ostream_iterator
rather than a back_insert_iterator.
|
Agreed, although the analogy isn't perfect either. In both
cases, you expect what you put in to be what you can get out
later, which isn't the case here. Myself, I would tend toward
a custom output iterator, and leave both ostream_iterator and
back_insert_iterator be.
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
kanze Guest
|
Posted: Thu Sep 28, 2006 5:11 pm Post subject: Re: Standard back inserter on non-container |
|
|
Peter Dimov wrote:
| Quote: | Gennaro Prota wrote:
I'm currently refining a mini-framework to implement one-way
hash functions. To simplify things a bit let's assume the
main abstractions of the sistem are
template< typename Engine
class hasher;
(where Engine can be MD5, SHA1, etc.) and
template< class Hasher
class digest;
The only user-callable functions in class hasher (besides
constructors) are:
void push_back( byte_type b );
template< typename InputIter
void append( InputIter begin, InputIter end );
which allow to "enqueue" a byte, or sequence thereof, into
an input block (merkle damgard tecnique).
Now, what I wonder is: if in addition to the push_back()
function my hasher just provides
typedef byte_type & reference;
typedef const byte_type & const_reference;
is it legal/moral to use a standard back_insert_iterator on
it even if the hasher itself doesn't qualify as a standard
container?
My opinion is that:
1. Yes, it is moral to do so, as the only requirements that the
standard explicitly places on the Container template parameter are
const_reference, push_back and unary operator&.
|
Where does it say that? I can't find any requirements per se,
but only a statements along the lines of "An insert iterator is
constructed from a container". And the only standard definition
for a container is in section 23, which contains the
requirements for standard containers.
Another thing to take into account is the fact that the template
parameter is named Container. Although it doesn't say so here,
in most, if not all other contexts where the name of a template
paramter corresponds to the name of a concept, the actual
argument must meed the requirements of that concept.
My impression is that the intent is to require a Container,
although I think that the standard is far from clear about it.
| Quote: | 2. If it isn't technically legal to do so (if, for example, a
future concept-aware compiler is technically allowed to reject
the code), then this is a defect in the standard and needs to
be fixed.
|
I agree that it is a defect that the standard doesn't specify
the requirements. And although I think that the intent was to
require a container, period, I would have no problem with looser
requirements, if the experts are sure that they are sufficient
for all reasonable implementations.
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Gennaro Prota Guest
|
Posted: Thu Sep 28, 2006 7:41 pm Post subject: Re: Standard back inserter on non-container |
|
|
On Wed, 27 Sep 2006 10:26:24 CST, "kanze" <kanze@gabi-soft.fr> wrote:
| Quote: | Gennaro Prota wrote:
I'm currently refining a mini-framework to implement one-way
hash functions. To simplify things a bit let's assume the main
abstractions of the sistem are
template< typename Engine
class hasher;
(where Engine can be MD5, SHA1, etc.) and
template< class Hasher
class digest;
See http://kanze.james.neuf.fr/code-en.html. In the Digest
component of the IO subsystem, there are template classes Hasher
and Digest, as well as policy classes for MD5 and the known (to
me) SHA hashes.
|
Hmm As I said you a while ago I didn't want to look at your code
to avoid taking too much inspiration. My classes will be released
under the GPL. I got no replies when I asked you if you agreed to use
the same license for your code so I gave up. From what you say it
seems to me we came up with very similar solutions which certainly is
a good sign For me, of course.
| Quote: | The only user-callable functions in class hasher (besides
constructors) are:
void push_back( byte_type b );
template< typename InputIter
void append( InputIter begin, InputIter end );
How do you read the value? .
|
The digest you mean? There was a create_digest() function which just
forwarded the work to the digest constructor. In the end I removed it;
why not writing just md5_digest( h ).
| Quote: | I've also found a reset() function useful.
|
I'm still undecided about this. So far I haven't found a usage for it.
The digest constructor takes a hasher argument *by value*. IOWs its
parameter is a copy of the original hasher and the finalization is
performed on the copy.
| Quote: | Interestingly enough, I don't have a push_back function,
although I probably should.
|
In effect it was called append() in the first place The only
reason why I renamed it to push_back is, as you clearly inferred, that
I wanted to use a standard back inserter, if possible.
| Quote: | The initial design was for the
class to be an "accumulator", to be used with std::accumulate,
thus:
out << std::accumulate( text.begin(), text.end(), MD5Hasher() ) ;
(where MD5Hasher is a typedef for Hasher<MD5Policy>, of course).
This ends up resulting in a lot of copying, so I added the
append function. There is a short discussion about this (an my
own considerations about using an output iterator) in the
documentation:
http://kanze.james.neuf.fr/doc/en/IO/html/index.html, then
navigate to the class Hasher.
|
Ok, I looked at the docs. Yes, accumulate() seems conceptually more
appropriate (though in general I think along the lines of Peter
Dimov). I think the "why not just using append" argument is a bit
weak, in that there can be generic programming situations where some
of the types you want to instantiate the template on don't have an
append function but do have an operator+. I can't think of any, though
:-O.
| Quote: | (Note that I was fairly careful to implement the class in a way
that copying would take a minimum amount of time. On the other
hand, I haven't really tried to find any means of sharing the
state, without copying---operator+ would have to return a
temporary which would convert to a Hasher in general case, but
for which assignment would be overloaded so that it modified the
state in place.)
|
Yeah. I mentally went though that route as well. Then
Sutter/Alexandrescu peeped through my mind (Only add complexity
when you know you benefit from it)
In the end it seemed to me the complexity/benefit ratio was too high.
| Quote: | [snip]
Personally, I like the accumulate idiom, and if I can find time
(ha!), I'd like to investigate the possibility of writing a
version of accumulate which uses += if it can find it, and +
then = otherwise. If accumulate used +=, there would be no real
time penalty (unless you called accumulate for a lot of little
sequences, instead of just once).
|
That's interesting. In fact I asked here why the standard version is
defined in terms of + instead of +=, with no replies. Performance-wise
my first tests, modeled after the time trial in RFC 1321, show that
for MD5 there's quite a difference between calls to push_back and
calls to append (Not surprisingly, as push_back has to check if the
input buffer is full at each insertion. Incidentally, using append
remains around 1.1 Mbit/s on a Pentium 4 2.6 GHz --that's still lower
than what I've read around the Net, though)
Thanks for your detailed reply :-)
--
Gennaro Prota
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Guest
|
Posted: Fri Sep 29, 2006 12:06 am Post subject: Re: Standard back inserter on non-container |
|
|
kanze wrote:
| Quote: | Peter Dimov wrote:
Gennaro Prota wrote:
...
1. Yes, it is moral to do so, as the only requirements that the
standard explicitly places on the Container template parameter are
const_reference, push_back and unary operator&.
Where does it say that? I can't find any requirements per se,
|
Those requirements are implicitly deriveable from the fact that the
behavior of std::back_insert_iterator is defined in terms of the
corresponding members of the Container class, whether or not that class
actually meets the container requirements.
| Quote: | but only a statements along the lines of "An insert iterator is
constructed from a container". And the only standard definition
for a container is in section 23, which contains the
requirements for standard containers.
Another thing to take into account is the fact that the template
parameter is named Container. Although it doesn't say so here,
in most, if not all other contexts where the name of a template
paramter corresponds to the name of a concept, the actual
argument must meed the requirements of that concept.
|
There are cases where that is true, but it's always the case that the
standard says so explicitly. I can find no such statement which covers
"Container".
| Quote: | My impression is that the intent is to require a Container,
although I think that the standard is far from clear about it.
|
That may have been the intent; but it would certainly be possible to
impose less restrictive requirements, and that may have been the
intent. Clarification would be desireable.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
Earl Purple Guest
|
Posted: Thu Oct 05, 2006 4:03 pm Post subject: Re: Standard back inserter on non-container |
|
|
kanze wrote:
| Quote: |
Personally, I like the accumulate idiom, and if I can find time
(ha!), I'd like to investigate the possibility of writing a
version of accumulate which uses += if it can find it, and +
then = otherwise. If accumulate used +=, there would be no real
time penalty (unless you called accumulate for a lot of little
sequences, instead of just once).
|
boost are clever at doing things like that. I'm not that clever but am
just about clever enough to write my own accumulate that uses +=
template <class InputIterator, class T>
T accumulate(InputIterator start, InputIterator finish, T init )
{
while ( start != finish )
{
init += *start;
++start;
}
return init;
}
template <class InputIterator, class T, class BinaryOperation >
T accumulate(InputIterator start, InputIterator finish, T init,
BinaryOperation binary_op)
{
while ( start != finish )
{
binary_op( init, *start );
++start;
}
return init;
}
This requires that binary_op takes its first parameter by non-const
reference.
Each makes 2 copies, one on entry to the function and one on exit, the
last of which may be optimised away by RVO so only one is strictly
necessary.
If you force the template parameter to a reference (you can do this by
specificing the template parameters on calling the algorithm instead of
relying on automatic deduction) then you can optimise away even the one
copy. You cannot then however pass in a temporary or a literal.
Of course you'd put these implementations into your own namespace.
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
kanze Guest
|
Posted: Fri Oct 06, 2006 3:36 pm Post subject: Re: Standard back inserter on non-container |
|
|
Gennaro Prota wrote:
| Quote: | On Wed, 27 Sep 2006 10:26:24 CST, "kanze" <kanze@gabi-soft.fr> wrote:
Gennaro Prota wrote:
I'm currently refining a mini-framework to implement one-way
hash functions. To simplify things a bit let's assume the main
abstractions of the sistem are
template< typename Engine
class hasher;
(where Engine can be MD5, SHA1, etc.) and
template< class Hasher
class digest;
See http://kanze.james.neuf.fr/code-en.html. In the Digest
component of the IO subsystem, there are template classes
Hasher and Digest, as well as policy classes for MD5 and the
known (to me) SHA hashes.
Hmm As I said you a while ago I didn't want to look at
your code to avoid taking too much inspiration. My classes
will be released under the GPL. I got no replies when I asked
you if you agreed to use the same license for your code so I
gave up.
|
I'm having connectivity problems again. Email seems to work in
batch mode---nothing for a week or two, then everything comes
through.
My license is really pretty simple: you can do anything you want
with the code as long as you don't expect me to maintain your
modifications (or even claim that I am responsible for them).
As I think I state at the site, I don't consider much of the
code there as really being "production quality"; it's more a
sounding board for ideas than anything else. (The actual
licence is a BSD licence. I'm too lazy to come up with words of
my own, and that seemed to be the most open I'd seen.)
My own code will NOT be released under the GPL---it takes away
too much freedom from the user. However, it can be incorporated
into other things which are GPL. As long as you take the
responsibility for the GPL on the copy you are using. (I don't
think I'm the first to do this, and it certainly isn't unusual
for code to be available under different licences. At least if
the original sources aren't GPL.)
Note that you probably won't want to copy it verbatim anyway; my
naming conventions are likely different than yours, and it also
depends a bit on my build system for handling portability issues
(e.g. it includes "gb/stdint.hh", from an entirely different
component, to get things like uint32_t in a portable manner).
And copyright only covers literal copying and "derived works";
| Quote: | From what you say it seems to me we came up with very similar
solutions which certainly is a good sign For me, of
course.
The only user-callable functions in class hasher (besides
constructors) are:
void push_back( byte_type b );
template< typename InputIter
void append( InputIter begin, InputIter end );
How do you read the value? .
The digest you mean? There was a create_digest() function
which just forwarded the work to the digest constructor. In
the end I removed it; why not writing just md5_digest( h ).
|
My code actually evolved in the opposite direction. I
originally provided the functions to read the digest directly in
the hasher, e.g. toString() or an << operator. After a short
discussion on comp.lang.c++.moderated, I added a Digest class
and an asDigest() function. (Although you can do it either way:
Digest also has a constructor which takes a Hasher.)
The only real interest of the separate Digest class, I think, is
its support for comparisons (< and ==, particularly) and hash
coding. In case you want to use it as a key in an std::map or
(soon) in an std::unordered_map.
| Quote: | I've also found a reset() function useful.
I'm still undecided about this. So far I haven't found a usage
for it. The digest constructor takes a hasher argument *by
value*. IOWs its parameter is a copy of the original hasher
and the finalization is performed on the copy.
|
It permets reusing the hasher. Personally, I generally prefer
creating a new object, each time I need it, but I know from
experience that this preference is not universal. And I can
sort of imagine cases where you might hash sections of a
document separately---when you detect a new section, you output
the digest, and reset the hasher.
But it's more of a convenience function. The hasher supports
assignment (obviously, if it is to be used with accumulate), so
something like:
h = MD5Hasher() ;
could also be used.
| Quote: | Interestingly enough, I don't have a push_back function,
although I probably should.
In effect it was called append() in the first place The
only reason why I renamed it to push_back is, as you clearly
inferred, that I wanted to use a standard back inserter, if
possible.
|
Back inserter or no, the standard idiom for appending a single
element in the library is push_back. (I'm not 100% sure that
either name is really right here, since we aren't really adding
anything to the sequence, and the size doesn't increase, but I
don't have any better alternatives.)
| Quote: | The initial design was for the
class to be an "accumulator", to be used with std::accumulate,
thus:
out << std::accumulate( text.begin(), text.end(), MD5Hasher() ) ;
(where MD5Hasher is a typedef for Hasher<MD5Policy>, of course).
This ends up resulting in a lot of copying, so I added the
append function. There is a short discussion about this (an my
own considerations about using an output iterator) in the
documentation:
http://kanze.james.neuf.fr/doc/en/IO/html/index.html, then
navigate to the class Hasher.
Ok, I looked at the docs. Yes, accumulate() seems conceptually
more appropriate (though in general I think along the lines of
Peter Dimov). I think the "why not just using append" argument
is a bit weak, in that there can be generic programming
situations where some of the types you want to instantiate the
template on don't have an append function but do have an
operator+. I can't think of any, though :-O.
(Note that I was fairly careful to implement the class in a
way that copying would take a minimum amount of time. On the
other hand, I haven't really tried to find any means of
sharing the state, without copying---operator+ would have to
return a temporary which would convert to a Hasher in general
case, but for which assignment would be overloaded so that it
modified the state in place.)
Yeah. I mentally went though that route as well. Then
Sutter/Alexandrescu peeped through my mind (Only add complexity
when you know you benefit from it)
|
Actually, it's not a case of more or less complexity. In my
mind, the implementation would be a lot simpler if I'd used
std::vector, instead of fixed size C style arrays. But even
without profiling , the idea of severall allocations for each
copy scared me off. Recent postings in it.comp.lang.c++ may
show that using [] on an std::vector is even faster (at least in
some cases) than using it on a C style array, but I' pretty sure
that assigning a class with a char[8] or a uint32_t[16] is a
lot faster than if the class used std::vector for the two. Like
an order of magnitude or more.
The main reason I didn't go for a solution with shared data is
that I couldn't figure out a simple critera for knowing when to
unshare. Simple COW doesn't help, because you do end up writing
a large number of the objects which you copy. And giving the
class reference semantics, with + and += having the same
semantics, just didn't cut it, in my mind.
| Quote: | In the end it seemed to me the complexity/benefit ratio was
too high.
[snip]
Personally, I like the accumulate idiom, and if I can find time
(ha!), I'd like to investigate the possibility of writing a
version of accumulate which uses += if it can find it, and +
then = otherwise. If accumulate used +=, there would be no real
time penalty (unless you called accumulate for a lot of little
sequences, instead of just once).
That's interesting. In fact I asked here why the standard
version is defined in terms of + instead of +=, with no
replies.
Performance-wise my first tests, modeled after the time trial
in RFC 1321, show that for MD5 there's quite a difference
between calls to push_back and calls to append (Not
surprisingly, as push_back has to check if the input buffer is
full at each insertion.
|
My append affectively checks each time anyway. I suppose I
could have optimized it for random access iterators, so that it
would insert the maximum each time around. (It's probably worth
it, since I imagine that most of the use is with string
iterators, which are random access.)
--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
---
[ comp.std.c++ is moderated. To submit articles, try just posting with ]
[ your news-reader. If that fails, use mailto:std-c++@ncar.ucar.edu ]
[ --- Please see the FAQ before posting. --- ]
[ FAQ: http://www.comeaucomputing.com/csc/faq.html ] |
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|