Go to the source code of this file.

## Namespaces | |

namespace | lshkit |

## Classes | |

class | lshkit::Histogram< LSH > |

Random histogram construction. More... |

Random histograms are used to match sets of feature vectors. A random histogram is a simplified version of LSH hash table, except that for each bin, only a count, instead of a list of features, is maintained. If two sets are similar, then the counts in the corresponding bins would be similar. Thus, set similarity is mapped to vector similarity.

Following is an example to embed a set of sift features into a random histogram. We use the ThresholdingLsh to approximate L1 distance between SIFT features.

#include <lshkit.h> using namespace lshkit; // One ThresholdingLsh only produces 1 bit. We use the Repeat composition // to produce a hash value of 8 bits, which can be used to produce one // histogram of 2^8 = 256 dimensions. typedef Repeat<ThresholdingLsh> MyLSH; typedef Histogram<MyLSH> MyHistogram; TresholdingLsh::Parameter param; param.repeat = 8; // repeat 8 times, produce 8 bits. param.dim = 128; param.min = 0.0; param.max = 255.0; // Parameters are set for SIFT features. DefaultRng rng; unsigned N = 10, M = 10; MyHistogram hist; hist.reset(M, N, param, rng); // The histogram is constructed by concatenating N histograms of 2^8 dimensions, // So there will be 2560 dimensions in all. float *output = new float[hist.getDim()]; // should be 2560. hist.zero(output); // init the output histogram // IMAGE is a container of SIFT features. BOOST_FOREACH(const float *sift, IMAGE) { hist.add(output, sift); } // now the output should contain the desired histogram. // The histogram can be fed into SVM for classification.

See the following reference for details.

Wei Dong, Zhe Wang, Moses Charikar, Kai Li. Efficiently Matching Sets of Features with Random Histograms. In Proceedings of the 16th ACM International Conference on Multimedia. Vancouver, Canada. October 2008.