 |
C++Talk.NET C++ language newsgroups
|
| View previous topic :: View next topic |
| Author |
Message |
Dhruv Guest
|
Posted: Sun Jan 25, 2004 10:55 am Post subject: Rationale behind std::binary_search? |
|
|
What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find? I find it extremely useless this way. On
the other hand, I would not only get the existence status of a particular
element in a container, but I would also get it's specific position if it
exists.
Regards,
-Dhruv.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Benoit Dejean Guest
|
Posted: Sun Jan 25, 2004 5:33 pm Post subject: Re: Rationale behind std::binary_search? |
|
|
Le Sun, 25 Jan 2004 05:55:46 -0500, Dhruv a écrit :
| Quote: | What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find?
|
see std::lower_bound, std::upper_bound and std::equal_range
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Jeff Schwab Guest
|
Posted: Sun Jan 25, 2004 10:11 pm Post subject: Re: Rationale behind std::binary_search? |
|
|
Dhruv wrote:
| Quote: | What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find? I find it extremely useless this way. On
the other hand, I would not only get the existence status of a particular
element in a container, but I would also get it's specific position if it
exists.
|
What if the element appears more than once in the sequence? Which
position should be returned? This ambiguity is resolved by
std::lower_bound and std::upper_bound, both of which do binary searches
and return iterators. Of course, sometimes you just want to check
whether an element is in a sequence at all, and you don't want to have
to compare iterators...
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Ivan Vecerina Guest
|
Posted: Sun Jan 25, 2004 10:25 pm Post subject: Re: Rationale behind std::binary_search? |
|
|
"Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote
| Quote: | What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find? I find it extremely useless this way. On
the other hand, I would not only get the existence status of a particular
element in a container, but I would also get it's specific position if it
exists.
Note that the standard library provides 3 functions similar |
to binary_search: lower_bound, upper_bound, and equal_range.
These are the functions to be used if it is the location
of the item within the sorted sequence that is of interest.
If binary_search was returning the item's position, failure
would have to be indicated either by returning the end of
the sequence, or by returning an <iterator,bool> pair.
Maybe the latter could be of interest, but can be emulated
easily enough with a call to lower_bound or equal_range...
The current set of 4 functions might not provide an optimal
solution in few cases. But the offered coverage is rather
complete, while avoiding redundancies.
Regards,
Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- e-mail contact form
Brainbench MVP for C++ <> http://www.brainbench.com
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
P.J. Plauger Guest
|
Posted: Sun Jan 25, 2004 10:27 pm Post subject: Re: Rationale behind std::binary_search? |
|
|
"Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote
| Quote: | What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find? I find it extremely useless this way. On
the other hand, I would not only get the existence status of a particular
element in a container, but I would also get it's specific position if it
exists.
|
lower_bound does that job for you.
P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
John Potter Guest
|
Posted: Mon Jan 26, 2004 12:04 am Post subject: Re: Rationale behind std::binary_search? |
|
|
On 25 Jan 2004 05:55:46 -0500, "Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote:
| Quote: | What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find? I find it extremely useless this way. On
the other hand, I would not only get the existence status of a particular
element in a container, but I would also get it's specific position if it
exists.
|
If it returned an iterator, you would need to check that the iteratee
is the thing that you seek. Lower_bound returns an iterator that you
can check. Binary_search answers the question without further testing.
bool binary_search (Iter first, Iter past, item) {
Iter found(lower_bound(first, past, item));
return found != past && ! (item < *found);
}
If you don't like the result, use lower_bound and write your own
conditional. Yes, binary_search is about as useful as
John
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dhruv Guest
|
Posted: Mon Jan 26, 2004 10:32 am Post subject: Re: Rationale behind std::binary_search? |
|
|
On Sun, 25 Jan 2004 19:04:46 -0500, John Potter wrote:
| Quote: | On 25 Jan 2004 05:55:46 -0500, "Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote:
What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find? I find it extremely useless this way. On
the other hand, I would not only get the existence status of a particular
element in a container, but I would also get it's specific position if it
exists.
If it returned an iterator, you would need to check that the iteratee
is the thing that you seek. Lower_bound returns an iterator that you
can check. Binary_search answers the question without further testing.
|
Ok. Thanks a lot! And also to everyone else who replied.
| Quote: | bool binary_search (Iter first, Iter past, item) {
Iter found(lower_bound(first, past, item));
return found != past && ! (item < *found);
}
|
I checked in libstdc++, and that's pretty much how binary_search is
implemented. Now, I was wondering that when you have a sequence:
1 2 2 2 2 2 2 2 2 2 2 2 2 3
And you do a binary search on it (for value 2), lower_bound would make so
many comparisions to find the first occurance of 2 (value before which
another 2 could be inserted), however if you have a simple binary_search
that does not use lower_bound internally, then the same result (of finding
whether 2 exists in the given sequence or not) could be implemneted in a
single check itself. So, it seems that lower_bound's worst case and
average case complexity is both approx. lg(N), while binary_search coul be
implemented with a better average case complexity.
Or is it mandated by the standard that binary_search should be using
lower_bound internally? Or is it a QoI issue?
| Quote: | If you don't like the result, use lower_bound and write your own
conditional. Yes, binary_search is about as useful as
cstdlib>::bsearch.
|
Yes.
Regards,
-Dhruv.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Jeff Schwab Guest
|
Posted: Mon Jan 26, 2004 8:32 pm Post subject: Re: Rationale behind std::binary_search? |
|
|
Dhruv wrote:
| Quote: | On Sun, 25 Jan 2004 19:04:46 -0500, John Potter wrote:
On 25 Jan 2004 05:55:46 -0500, "Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote:
What is the rationale behind making std::binary_search return bool instead
of an iterator like std::find? I find it extremely useless this way. On
the other hand, I would not only get the existence status of a particular
element in a container, but I would also get it's specific position if it
exists.
If it returned an iterator, you would need to check that the iteratee
is the thing that you seek. Lower_bound returns an iterator that you
can check. Binary_search answers the question without further testing.
Ok. Thanks a lot! And also to everyone else who replied.
bool binary_search (Iter first, Iter past, item) {
Iter found(lower_bound(first, past, item));
return found != past && ! (item < *found);
}
I checked in libstdc++, and that's pretty much how binary_search is
implemented. Now, I was wondering that when you have a sequence:
1 2 2 2 2 2 2 2 2 2 2 2 2 3
And you do a binary search on it (for value 2), lower_bound would make so
many comparisions to find the first occurance of 2 (value before which
another 2 could be inserted), however if you have a simple binary_search
that does not use lower_bound internally, then the same result (of finding
whether 2 exists in the given sequence or not) could be implemneted in a
single check itself. So, it seems that lower_bound's worst case and
average case complexity is both approx. lg(N), while binary_search coul be
implemented with a better average case complexity.
Or is it mandated by the standard that binary_search should be using
lower_bound internally? Or is it a QoI issue?
|
No, it's mandated that lower_bound has logarithmic complexity. A binary
search can be implemented in such a way as always to find the first
occurrence of a value; there's no need to find a random value, then walk
backward to the first occurrence. You might find your
instructive.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dave Harris Guest
|
Posted: Tue Jan 27, 2004 10:57 am Post subject: Re: Rationale behind std::binary_search? |
|
|
[email]dhruvbird (AT) gmx (DOT) net[/email] (Dhruv) wrote (abridged):
| Quote: | Now, I was wondering that when you have a sequence:
1 2 2 2 2 2 2 2 2 2 2 2 2 3
And you do a binary search on it (for value 2), lower_bound would
make so many comparisions to find the first occurance of 2 (value
before which another 2 could be inserted),
|
It would still be O(ln(N)), though. It isn't doing a linear search
over the 2s.
| Quote: | however if you have a simple binary_search that does not use
lower_bound internally, then the same result (of finding whether
2 exists in the given sequence or not) could be implemneted in a
single check itself.
|
But at the cost of a 3-way comparison. Lower_bound uses 2-way
comparisons; it does not check for equality at each iteration, just
for less-then. If you want to stop as soon as you find an equal
value, then you do need to check for equality. If the item is absent
from the sequence, it would mean an extra ln(N) checks. Your
implementation would be slower in that case.
-- Dave Harris, Nottingham, UK
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
John Potter Guest
|
Posted: Tue Jan 27, 2004 11:05 am Post subject: Re: Rationale behind std::binary_search? |
|
|
On 26 Jan 2004 05:32:13 -0500, "Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote:
| Quote: | I checked in libstdc++, and that's pretty much how binary_search is
implemented. Now, I was wondering that when you have a sequence:
1 2 2 2 2 2 2 2 2 2 2 2 2 3
And you do a binary search on it (for value 2), lower_bound would make so
many comparisions to find the first occurance of 2 (value before which
another 2 could be inserted), however if you have a simple binary_search
that does not use lower_bound internally, then the same result (of finding
whether 2 exists in the given sequence or not) could be implemneted in a
single check itself. So, it seems that lower_bound's worst case and
average case complexity is both approx. lg(N), while binary_search coul be
implemented with a better average case complexity.
|
Depends upon what you call better average. It takes two, not one,
compares to detect equality using less. If we write binary_search
to make two compares stopping when found, we must also note that
the number of compares to find a leaf node is now 2lgN and half of
the nodes are leaves. Calculate that average and I think you will
find that the lower_bound version has a better average behavior
across all possible data sets.
| Quote: | Or is it mandated by the standard that binary_search should be using
lower_bound internally? Or is it a QoI issue?
|
QoI. 2lgN is still lgN.
John
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dylan Nicholson Guest
|
Posted: Tue Jan 27, 2004 6:28 pm Post subject: Re: Rationale behind std::binary_search? |
|
|
Jeff Schwab <jeffplus (AT) comcast (DOT) net> wrote
| Quote: | Dhruv wrote:
Now, I was wondering that when you have a sequence:
1 2 2 2 2 2 2 2 2 2 2 2 2 3
And you do a binary search on it (for value 2), lower_bound would make so
many comparisions to find the first occurance of 2 (value before which
another 2 could be inserted), however if you have a simple binary_search
that does not use lower_bound internally, then the same result (of finding
whether 2 exists in the given sequence or not) could be implemneted in a
single check itself. So, it seems that lower_bound's worst case and
average case complexity is both approx. lg(N), while binary_search coul be
implemented with a better average case complexity.
Or is it mandated by the standard that binary_search should be using
lower_bound internally? Or is it a QoI issue?
No, it's mandated that lower_bound has logarithmic complexity. A binary
search can be implemented in such a way as always to find the first
occurrence of a value; there's no need to find a random value, then walk
backward to the first occurrence. You might find your <algorithms> file
instructive.
That doesn't really answer the OP's point - which is that in cases |
where there are repeated elements, it must on average be faster just
to find *any* occurrence of an element, rather than the first one.
But I doubt too many implementers of binary_search would see this sort
of optimization worth-while. It would have to be a fairly specific
sort of problem that would benefit from it.
Dylan
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dhruv Guest
|
Posted: Wed Jan 28, 2004 9:34 am Post subject: Re: Rationale behind std::binary_search? |
|
|
On Tue, 27 Jan 2004 06:05:00 -0500, John Potter wrote:
| Quote: | On 26 Jan 2004 05:32:13 -0500, "Dhruv" <dhruvbird (AT) gmx (DOT) net> wrote:
I checked in libstdc++, and that's pretty much how binary_search is
implemented. Now, I was wondering that when you have a sequence:
1 2 2 2 2 2 2 2 2 2 2 2 2 3
And you do a binary search on it (for value 2), lower_bound would make so
many comparisions to find the first occurance of 2 (value before which
another 2 could be inserted), however if you have a simple binary_search
that does not use lower_bound internally, then the same result (of finding
whether 2 exists in the given sequence or not) could be implemneted in a
single check itself. So, it seems that lower_bound's worst case and
average case complexity is both approx. lg(N), while binary_search coul be
implemented with a better average case complexity.
Depends upon what you call better average. It takes two, not one,
compares to detect equality using less. If we write binary_search
to make two compares stopping when found, we must also note that
the number of compares to find a leaf node is now 2lgN and half of
the nodes are leaves. Calculate that average and I think you will
find that the lower_bound version has a better average behavior
across all possible data sets.
|
Yes, you're right. it was a horrible mis-calculation on my part! Even
Dave Harris mentioned this. Even then, I guess that typically, a data set
would not contain so many repeats, so I can't even say that from a cache
point of view, the one wth 2 comparisons would be better.
| Quote: | Or is it mandated by the standard that binary_search should be using
lower_bound internally? Or is it a QoI issue?
QoI. 2lgN is still lgN.
|
2ln(N) comparisons at max. is O(lg(N))! Are you feeling lazy to write the
brackets!!!
Regards,
-Dhruv.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Jeff Schwab Guest
|
Posted: Wed Jan 28, 2004 1:31 pm Post subject: Re: Rationale behind std::binary_search? |
|
|
Dylan Nicholson wrote:
| Quote: | Jeff Schwab <jeffplus (AT) comcast (DOT) net> wrote
Dhruv wrote:
Now, I was wondering that when you have a sequence:
1 2 2 2 2 2 2 2 2 2 2 2 2 3
And you do a binary search on it (for value 2), lower_bound would make so
many comparisions to find the first occurance of 2 (value before which
another 2 could be inserted), however if you have a simple binary_search
that does not use lower_bound internally, then the same result (of finding
whether 2 exists in the given sequence or not) could be implemneted in a
single check itself. So, it seems that lower_bound's worst case and
average case complexity is both approx. lg(N), while binary_search coul be
implemented with a better average case complexity.
Or is it mandated by the standard that binary_search should be using
lower_bound internally? Or is it a QoI issue?
No, it's mandated that lower_bound has logarithmic complexity. A binary
search can be implemented in such a way as always to find the first
occurrence of a value; there's no need to find a random value, then walk
backward to the first occurrence. You might find your <algorithms> file
instructive.
That doesn't really answer the OP's point - which is that in cases
where there are repeated elements, it must on average be faster just
to find *any* occurrence of an element, rather than the first one.
|
No, I think you (and OP) are wrong. Finding the "first" occurrence
should be just as fast as finding "any" occurrence. If I'm wrong (this
wouldn't be the first time , please explain why.
| Quote: | But I doubt too many implementers of binary_search would see this sort
of optimization worth-while. It would have to be a fairly specific
sort of problem that would benefit from it.
|
Wrong again. If you were right about a linear number of comparisons
being required to implement lower bound, then this would be an obvious
candidate for optimization.
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dylan Nicholson Guest
|
Posted: Thu Jan 29, 2004 9:51 am Post subject: Re: Rationale behind std::binary_search? |
|
|
Jeff Schwab <jeffplus (AT) comcast (DOT) net> wrote
| Quote: | Dylan Nicholson wrote:
Jeff Schwab <jeffplus (AT) comcast (DOT) net> wrote
That doesn't really answer the OP's point - which is that in cases
where there are repeated elements, it must on average be faster just
to find *any* occurrence of an element, rather than the first one.
No, I think you (and OP) are wrong. Finding the "first" occurrence
should be just as fast as finding "any" occurrence. If I'm wrong (this
wouldn't be the first time , please explain why.
Adding in an extra == comparison won't give you better performance |
unless a fairly particular set of conditions happen to exist, so I
agree it's no better on average (and of course, could actually give
you worse performance).
Dylan
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
Dylan Nicholson Guest
|
Posted: Thu Jan 29, 2004 9:52 am Post subject: Re: Rationale behind std::binary_search? |
|
|
Jeff Schwab <jeffplus (AT) comcast (DOT) net> wrote
| Quote: | Dylan Nicholson wrote:
That doesn't really answer the OP's point - which is that in cases
where there are repeated elements, it must on average be faster just
to find *any* occurrence of an element, rather than the first one.
No, I think you (and OP) are wrong. Finding the "first" occurrence
should be just as fast as finding "any" occurrence. If I'm wrong (this
wouldn't be the first time , please explain why.
|
One example might be a list of ints. On x86 chips, and I would guess
many others, you don't even need to perform an extra comparison:
cmp eax, ecx
jge label1
; eax less than ecx, adjust iterators accordingly
; ...
label1:
je exit_func
; eax greater than ecx, adjust iterators accordingly
exit_func:
; return current iterator
Dylan
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
|
|
| Back to top |
|
 |
|
|
You cannot post new topics in this forum You cannot reply to topics in this forum You cannot edit your posts in this forum You cannot delete your posts in this forum You cannot vote in polls in this forum
|
|