include/lshkit/forest.h File Reference

A preliminary implementation of LSH Forest index. More...

#include <algorithm>
#include <lshkit/common.h>
#include <lshkit/topk.h>

Go to the source code of this file.

Namespaces

namespace  lshkit

Classes

class  lshkit::ForestIndex< LSH, KEY >
 LSH Forest index. More...
struct  lshkit::ForestIndex< LSH, KEY >::Tree
struct  lshkit::ForestIndex< LSH, KEY >::Tree::Node


Detailed Description

A preliminary implementation of LSH Forest index.

This is a preliminary implementation of main memory LSH Forest index. The implementation is largely based on the WWW'05 LSH Forest paper. The descend and synchascend algorithms are implemented in a different but equivalent way, so the candidate set do not have to be explicitly generated. I also made a simplification to the synchascending algorithm, so that the original line

    while (x > 0 and (|P| < cl or |distinct(P)| < m)) {
is simplified to
    while (x > 0 and |P| < M) {
the deduplication is left to the scanning process.

The implementation is not efficient. The initial goal of this implementation is to study the selectivity of the algorithm. I'll further optimize the implementation if selectivity is proved competitive.

Reference

Mayank Bawa , Tyson Condie , Prasanna Ganesan, LSH forest: self-tuning indexes for similarity search, Proceedings of the 14th international conference on World Wide Web, May 10-14, 2005, Chiba, Japan.


Get LSHKIT at SourceForge.net. Fast, secure and Free Open Source software downloads doxygen