~roguescholar/boost/+git/multi_index:feature/ranked_index_count

Last commit made on 2022-02-05
Get this branch:
git clone -b feature/ranked_index_count https://git.launchpad.net/~roguescholar/boost/+git/multi_index

Branch merges

Branch information

Name:
feature/ranked_index_count
Repository:
lp:~roguescholar/boost/+git/multi_index

Recent commits

3f557bf... by joaquintides <email address hidden>

updated docs

446bd18... by joaquintides <email address hidden>

covered ranked_index::count

bca81e3... by joaquintides <email address hidden>

reorganized code

4628403... by joaquintides <email address hidden>

simplified Damian's contribution

84b0658... by DamianSawicki <email address hidden>

Documentation updated with new complexity of count in ranked_index

473b85b... by DamianSawicki <email address hidden>

Benchmark of count(): ordered_index vs ranked_index

A benchmark is added as count_benchmark.cpp in the 'example'
directory.

When the values of an index are unique, both implementations are
comparable (ranked_index is 10-15% faster).

However, for highly non-unique indices (like the age of people),
the new implementation in ranked_index outperforms ordered_index.
For 1 000 people of age in 0..99 ranked_index is ~2x faster,
for 10 000 people it is 12-13x faster, and
for 100 000 people it is 95-100x times faster.

For even more non-unique indices (like sex or the age of pupils)
or coarse comparison predicates (grouping people in age groups
like 0..9, 10..19 etc.) the gap in performance grows further.
For a comparison predicate comparing 'age/10' for age in 0..99,
similar gaps in performance occur already for 10x smaller
containers:
for 100 people count in ranked_index is 2x faster,
for 1 000 people it is ~9x faster,
for 10 000 people it is 95-100x faster,
for 100 000 people it is almost 1000x faster.

b64f1f1... by DamianSawicki <email address hidden>

Added test for new count(x) and count(x,comp) in ranked_index

Both, the existing implementation of count from ordered_index and
the new implementation for ranked_index are compared with the
common sense. Positive results of this test show in particular that
the numbers produced by the new implementation are consistent with
those from the existing implementation and hence correct.

9992c4d... by DamianSawicki <email address hidden>

count for ranked_index uses rank

Normally, count is calculated as the distance between iterators,
which takes linear time when count(x,comp) is comparable with n,
but for ranked indices we can subtract the values or rank,
reducing the complexity from log(n)+count(x,comp) to log(n).

7c3cb66... by joaquintides <email address hidden>

corrected statements on SCARYness

2026b94... by joaquintides <email address hidden>

fixed 1b38fdcc77c6298a2eb83cb9c7554ad2337b3938