E - the type of data objects in the hash table.
public class LSH<E> extends java.lang.Object implements NearestNeighborSearch<double[],E>, KNNSearch<double[],E>, RNNSearch<double[],E>
By default, the query object (compared by reference equality) is excluded from the neighborhood.
You may change this behavior with setIdenticalExcluded. Note that
you may observe surprising behavior with String objects, because the JVM pools string
literals. For example, the variables
String a = "ABC";
String b = "ABC";
String c = "AB" + "C";
all refer to the same object, so both a == b and b == c are true. With toy data that you
type explicitly in the code, distinct entries may therefore be treated as the query object
itself and excluded. Fortunately, production data is usually read from secondary storage,
where strings are not pooled literals.
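The pooling behavior described above can be checked with a short, self-contained program. This is a minimal sketch: the class name StringPoolDemo and the helper sameReference are hypothetical and exist only for illustration.

```java
// Illustration only: StringPoolDemo and sameReference are hypothetical names.
final class StringPoolDemo {
    /** Reference (identity) comparison, as opposed to equals(). */
    static boolean sameReference(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "ABC";
        String b = "ABC";
        String c = "AB" + "C";         // compile-time constant, folded to "ABC" and pooled
        String d = new String("ABC");  // explicitly allocated, a distinct object

        System.out.println(sameReference(a, b)); // true
        System.out.println(sameReference(b, c)); // true
        System.out.println(sameReference(a, d)); // false
        System.out.println(a.equals(d));         // true
    }
}
```

Strings built at run time, such as those read from a file, behave like d here: equal in content but not in reference.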
See Also: MPLSH
Constructor and Description 

LSH(double[][] keys,
E[] data)
Constructor.

LSH(double[][] keys,
E[] data,
double w)
Constructor.

LSH(double[][] keys,
E[] data,
double w,
int H)
Constructor.

LSH(int d,
int L,
int k)
Constructor.

LSH(int d,
int L,
int k,
double w)
Constructor.

LSH(int d,
int L,
int k,
double w,
int H)
Constructor.

Modifier and Type  Method and Description 

boolean 
isIdenticalExcluded()
Get whether the query object itself is excluded from the neighborhood.

Neighbor<double[],E>[] 
knn(double[] q,
int k)
Search the k nearest neighbors to the query.

Neighbor<double[],E> 
nearest(double[] q)
Search the nearest neighbor to the given sample.

void 
put(double[] key,
E value)
Insert an item into the hash table.

void 
range(double[] q,
double radius,
java.util.List<Neighbor<double[],E>> neighbors)
Search for the neighbors within the given radius of the query object.

LSH<E> 
setIdenticalExcluded(boolean excluded)
Set whether to exclude the query object itself from the neighborhood.

java.lang.String 
toString() 
public LSH(double[][] keys, E[] data)
keys - the keys of data objects.
data - the data objects.
public LSH(double[][] keys, E[] data, double w)
keys - the keys of data objects.
data - the data objects.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
public LSH(double[][] keys, E[] data, double w, int H)
keys - the keys of data objects.
data - the data objects.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
H - the size of universal hash tables.
public LSH(int d, int L, int k)
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
public LSH(int d, int L, int k, double w)
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
public LSH(int d, int L, int k, double w, int H)
d - the dimensionality of data.
L - the number of hash tables.
k - the number of random projection hash functions, which is usually set to log(N) where N is the dataset size.
w - the width of random projections. It should be sufficiently far from 0, but not too large, because a large w increases the query time.
H - the size of universal hash tables.
public java.lang.String toString()
toString in class java.lang.Object
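To make the constructor parameters w and k more concrete, here is a minimal sketch of a single random projection hash function. It assumes the common scheme h(x) = floor((a . x + b) / w), with a Gaussian direction a and an offset b drawn from [0, w). That scheme is an assumption for illustration and is not confirmed as this class's internals. The class ProjectionHashSketch, its methods, and the use of the natural logarithm for log(N) are likewise hypothetical.

```java
import java.util.Random;

// Hypothetical helper for illustration; not part of the documented API.
final class ProjectionHashSketch {
    private final double[] a; // random Gaussian projection direction
    private final double b;   // random offset, drawn from [0, w)
    private final double w;   // bucket width

    ProjectionHashSketch(int d, double w, Random rng) {
        if (w <= 0.0) {
            throw new IllegalArgumentException("w must be positive");
        }
        this.a = new double[d];
        for (int i = 0; i < d; i++) {
            a[i] = rng.nextGaussian();
        }
        this.b = rng.nextDouble() * w;
        this.w = w;
    }

    /** Projects x onto a, shifts by b, and returns the index of the bucket of width w. */
    int hash(double[] x) {
        double dot = b;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * x[i];
        }
        return (int) Math.floor(dot / w);
    }

    /** Rule of thumb k = log(N); the natural logarithm is an assumption here. */
    static int suggestK(int n) {
        return Math.max(1, (int) Math.round(Math.log(n)));
    }

    public static void main(String[] args) {
        double[] x = {1.0, 2.0, 3.0};
        double[] y = {1.1, 2.0, 2.9}; // a nearby point
        for (double w : new double[] {0.5, 4.0, 32.0}) {
            ProjectionHashSketch h = new ProjectionHashSketch(3, w, new Random(42));
            System.out.printf("w=%.1f  h(x)=%d  h(y)=%d%n", w, h.hash(x), h.hash(y));
        }
        System.out.println("k for N=1000: " + suggestK(1000));
    }
}
```

Under this scheme, a very small w puts even close points into different buckets, while a very large w puts many points into the same bucket, so each query has more candidates to scan. This matches the guidance to keep w away from 0 without making it too large.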
public boolean isIdenticalExcluded()
public LSH<E> setIdenticalExcluded(boolean excluded)
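The effect of the identical-excluded flag can be illustrated with a brute-force scan. This is only a sketch of the reference-equality rule stated in the class description, not this class's implementation. The class IdentityExclusionSketch and its nearest method are hypothetical.

```java
// Hypothetical linear-scan search that mimics the reference-equality exclusion rule.
final class IdentityExclusionSketch {
    /** Returns the index of the key closest to q, or -1 if no key is eligible. */
    static int nearest(double[][] keys, double[] q, boolean identicalExcluded) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < keys.length; i++) {
            // Skip the query object itself: same reference, not merely equal content.
            if (identicalExcluded && keys[i] == q) {
                continue;
            }
            double dist = 0.0;
            for (int j = 0; j < q.length; j++) {
                double diff = keys[i][j] - q[j];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] keys = {{0.0, 0.0}, {1.0, 0.0}, {5.0, 5.0}};
        System.out.println(nearest(keys, keys[0], true));         // 1
        System.out.println(nearest(keys, keys[0], false));        // 0
        System.out.println(nearest(keys, keys[0].clone(), true)); // 0
    }
}
```

The last call shows that a copy of a stored key is not reference-equal to it, so the stored key is still returned even when exclusion is enabled.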
public void put(double[] key, E value)
public Neighbor<double[],E> nearest(double[] q)
NearestNeighborSearch
nearest in interface NearestNeighborSearch<double[],E>
q - the query key.
public Neighbor<double[],E>[] knn(double[] q, int k)
KNNSearch