 |
C++Talk.NET C++ language newsgroups
|
| View previous topic :: View next topic |
| Author |
Message |
Yannick Tremblay Guest
|
Posted: Wed Aug 30, 2006 9:11 am Post subject: Median of values in a std::map |
|
|
Hi,
I have a std::map<id, timestamp> collection.
The id is the normal reference so the key-value ordering is correct.
However, once in a blue moon, I want to prune this collection. So
that it doesn't happens to often, I prune the 50% oldest entries and
go back to normal usage of the collection. In order to do so, I need
to find the median timestamp then delete all entries older than the
median timestamp. The second part is easy, just iterate through the
map and delete items. My problem is finding the median. Since it
shouldn't be executed often, I used the dumb way: create a vector,
iterate through the map and insert the timestamp associated to each
element in the vector, sort the vector, item[size()/2] is the median.
i.e.
typedef std::map<id, timestamp> collection_t;
collection_t collection;
//...
std::vector<timestamp_t> allTimes;
allTimes.reserve(collection.size());
for(collection_t::const_iterator iter = collection.begin();
iter != collection.end() ; ++iter)
{
allTimes.push_back(iter->second);
}
std::sort(allTimes.begin(), allTimes.end());
const timestamp_t timeToPrune = allTimes.at(allTimes.size()/2);
It works, it does what I want, it's easy to understand, performance
isn't needed. It does the job. I should leave it there. But,
but... it just seems that there must be a more elegant way to do it
:-(
Any suggestions?
Thanks
Yan |
|
| Back to top |
|
 |
Kai-Uwe Bux Guest
|
Posted: Wed Aug 30, 2006 9:11 am Post subject: Re: Median of values in a std::map |
|
|
Yannick Tremblay wrote:
| Quote: | Hi,
I have a std::map<id, timestamp> collection.
The id is the normal reference so the key-value ordering is correct.
However, once in a blue moon, I want to prune this collection. So
that it doesn't happens to often, I prune the 50% oldest entries and
go back to normal usage of the collection. In order to do so, I need
to find the median timestamp then delete all entries older than the
median timestamp. The second part is easy, just iterate through the
map and delete items. My problem is finding the median. Since it
shouldn't be executed often, I used the dumb way: create a vector,
iterate through the map and insert the timestamp associated to each
element in the vector, sort the vector, item[size()/2] is the median.
i.e.
typedef std::map<id, timestamp> collection_t;
collection_t collection;
//...
std::vector<timestamp_t> allTimes;
allTimes.reserve(collection.size());
for(collection_t::const_iterator iter = collection.begin();
iter != collection.end() ; ++iter)
{
allTimes.push_back(iter->second);
}
std::sort(allTimes.begin(), allTimes.end());
const timestamp_t timeToPrune = allTimes.at(allTimes.size()/2);
It works, it does what I want, it's easy to understand, performance
isn't needed. It does the job. I should leave it there. But,
but... it just seems that there must be a more elegant way to do it
:-(
Any suggestions?
|
To convey intent more clearly, you could use nth_element() instead of sort.
That would be all, I'd change. [It could also be faster since nth_element
is linear on average.]
Best
Kai-Uwe Bux |
|
| Back to top |
|
 |
Old Wolf Guest
|
Posted: Thu Aug 31, 2006 9:10 am Post subject: Re: Median of values in a std::map |
|
|
Yannick Tremblay wrote:
| Quote: |
I have a std::map<id, timestamp> collection.
The id is the normal reference so the key-value ordering is correct.
However, once in a blue moon, I want to prune this collection. So
that it doesn't happens to often, I prune the 50% oldest entries and
go back to normal usage of the collection. In order to do so, I need
to find the median timestamp then delete all entries older than the
median timestamp. The second part is easy, just iterate through the
map and delete items. My problem is finding the median. Since it
shouldn't be executed often, I used the dumb way: create a vector,
iterate through the map and insert the timestamp associated to each
element in the vector, sort the vector, item[size()/2] is the median.
|
You could avoid the space and time overhead by using the mean
instead of the median. The mean will be quite close to the median,
unless there is something strange about your distribution.
Of course, the code to do this would probably be longer  |
|
| Back to top |
|
 |
Powered by phpBB © 2001, 2006 phpBB Group
|