Class BKTree<K,V>

java.lang.Object
smile.neighbor.BKTree<K,V>
Type Parameters:
K - the type of keys.
V - the type of associated objects.
All Implemented Interfaces:
Serializable, RNNSearch<K,V>

public class BKTree<K,V> extends Object implements RNNSearch<K,V>, Serializable
A BK-tree is a metric tree specifically adapted to discrete metric spaces. For simplicity, let us consider integer discrete metric d(x,y). Then, BK-tree is defined in the following way. An arbitrary element a is selected as root. Root may have zero or more subtrees. The k-th subtree is recursively built of all elements a such that d(a,b) = k. BK-trees can be used for approximate string matching in a dictionary.

By default, the query object (reference equality) is excluded from the neighborhood. You may change this behavior with setIdenticalExcluded. Note that you may observe weird behavior with String objects. JVM will pool the string literal objects. So the below variables String a = "ABC"; String b = "ABC"; String c = "AB" + "C"; are actually equal in reference test a == b == c. With toy data that you type explicitly in the code, this will cause problems. Fortunately, the data would be generally read from secondary storage in production.

References

  1. W. Burkhard and R. Keller. Some approaches to best-match file searching. CACM, 1973.
See Also:
  • Constructor Summary

    Constructors
    Constructor
    Description
    BKTree(Metric<K> distance)
    Constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    add(Map<K,V> data)
    Adds a dataset into BK-tree.
    void
    add(K key, V value)
    Adds a datum into the BK-tree.
    boolean
    Returns true if the BK-tree is empty.
    static <T> BKTree<T,T>
    of(List<T> data, Metric<T> distance)
    Return a BK-tree of the data.
    static <T> BKTree<T,T>
    of(T[] data, Metric<T> distance)
    Return a BK-tree of the data.
    void
    search(K q, double radius, List<Neighbor<K,V>> neighbors)
    Retrieves the neighbors in a fixed radius of query object, i.e.
    void
    search(K q, int radius, List<Neighbor<K,V>> neighbors)
    Search the neighbors in the given radius of query object, i.e.
    int
    Returns the number of nodes in the BK-tree.
     

    Methods inherited from class Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • BKTree

      public BKTree(Metric<K> distance)
      Constructor.
      Parameters:
      distance - the metric used to build BK-tree. Note that the metric must be a discrete distance, e.g. edit distance, Hamming distance, Lee distance, Jaccard distance, and taxonomic distance, etc.
  • Method Details

    • size

      public int size()
      Returns the number of nodes in the BK-tree.
      Returns:
      the number of nodes in the BK-tree.
    • isEmpty

      public boolean isEmpty()
      Returns true if the BK-tree is empty.
      Returns:
      true if the BK-tree is empty.
    • of

      public static <T> BKTree<T,T> of(T[] data, Metric<T> distance)
      Return a BK-tree of the data.
      Type Parameters:
      T - the type of keys and values.
      Parameters:
      data - the data objects, which are also used as key.
      distance - the metric used to build BK-tree. Note that the metric must be a discrete distance, e.g. edit distance, Hamming distance, Lee distance, Jaccard distance, and taxonomic distance, etc.
      Returns:
      BK-tree.
    • of

      public static <T> BKTree<T,T> of(List<T> data, Metric<T> distance)
      Return a BK-tree of the data.
      Type Parameters:
      T - the type of keys and values.
      Parameters:
      data - the data objects, which are also used as key.
      distance - the metric used to build BK-tree. Note that the metric must be a discrete distance, e.g. edit distance, Hamming distance, Lee distance, Jaccard distance, and taxonomic distance, etc.
      Returns:
      BK-tree.
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • add

      public void add(K key, V value)
      Adds a datum into the BK-tree.
      Parameters:
      key - the data key.
      value - the data object.
    • add

      public void add(Map<K,V> data)
      Adds a dataset into BK-tree.
      Parameters:
      data - the dataset to insert into the BK-tree.
    • search

      public void search(K q, double radius, List<Neighbor<K,V>> neighbors)
      Description copied from interface: RNNSearch
      Retrieves the neighbors in a fixed radius of query object, i.e. d(q, v) <= radius.
      Specified by:
      search in interface RNNSearch<K,V>
      Parameters:
      q - the query key.
      radius - the radius of search range from target.
      neighbors - the list to store found neighbors on output.
    • search

      public void search(K q, int radius, List<Neighbor<K,V>> neighbors)
      Search the neighbors in the given radius of query object, i.e. d(q, v) <= radius.
      Parameters:
      q - the query object.
      radius - the radius of search range from target.
      neighbors - the list to store found neighbors in the given range on output.