smile.sequence

package smile.sequence

Sequence labeling algorithms.

Type members

Classlikes

object $dummy

Hacking scaladoc issue-8124. The user should ignore this object.

Supertypes
class Object
trait Matchable
class Any
Self type
$dummy.type

Value members

Concrete methods

def crf(sequences: Array[Array[Tuple]], labels: Array[Array[Int]], ntrees: Int, maxDepth: Int, maxNodes: Int, nodeSize: Int, shrinkage: Double): CRF

First-order linear conditional random field. A conditional random field is a type of discriminative undirected probabilistic graphical model. It is most often used for labeling or parsing of sequential data.

A CRF is a Markov random field that is trained discriminatively. It therefore does not need to model the distribution over the variables that are always observed, which makes it possible to include arbitrarily complicated features of those variables in the model.

References:

  • J. Lafferty, A. McCallum and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. ICML, 2001.
  • Thomas G. Dietterich, Guohua Hao, and Adam Ashenfelter. Gradient Tree Boosting for Training Conditional Random Fields. JMLR, 2008.

Value parameters

labels

sequence labels.

maxDepth

the maximum depth of the tree.

maxNodes

the maximum number of leaf nodes in the tree.

nodeSize

the minimum number of instances in a node; a node with fewer instances will not be split. Setting nodeSize = 5 generally gives good results.

ntrees

the number of trees/iterations.

sequences

the observation attribute sequences.

shrinkage

the shrinkage parameter in (0, 1], which controls the learning rate of the boosting procedure.
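
The parameter descriptions above imply a few constraints that are cheap to check before training. The following is a minimal sketch of such a pre-flight check, not part of Smile itself: `CrfInputCheck` and `problems` are hypothetical names, and the rule that every sequence must have exactly one label per position is an assumption based on the usual definition of sequence labeling.

```scala
object CrfInputCheck {
  /** Hypothetical helper: lists the problems found in the training inputs (empty if none). */
  def problems[T](sequences: Array[Array[T]],
                  labels: Array[Array[Int]],
                  shrinkage: Double): List[String] = {
    val issues = List.newBuilder[String]
    if (sequences.length != labels.length)
      issues += s"${sequences.length} sequences but ${labels.length} label arrays"
    // Assumption: each position in a sequence carries exactly one label.
    for (i <- 0 until math.min(sequences.length, labels.length)) {
      if (sequences(i).length != labels(i).length)
        issues += s"sequence $i has ${sequences(i).length} observations but ${labels(i).length} labels"
    }
    // The documented range for shrinkage is (0, 1].
    if (!(shrinkage > 0.0 && shrinkage <= 1.0))
      issues += "shrinkage must be in (0, 1]"
    issues.result()
  }
}
```

If the returned list is empty, the same `sequences`, `labels` and `shrinkage` values could then be passed to `crf` together with `ntrees`, `maxDepth`, `maxNodes` and `nodeSize`.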

def gcrf[T <: AnyRef](sequences: Array[Array[T]], labels: Array[Array[Int]], features: Function[T, Tuple], ntrees: Int, maxDepth: Int, maxNodes: Int, nodeSize: Int, shrinkage: Double): CRFLabeler[T]

First-order linear conditional random field. A conditional random field is a type of discriminative undirected probabilistic graphical model. It is most often used for labeling or parsing of sequential data.

A CRF is a Markov random field that is trained discriminatively. It therefore does not need to model the distribution over the variables that are always observed, which makes it possible to include arbitrarily complicated features of those variables in the model.

References:

  • J. Lafferty, A. McCallum and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. ICML, 2001.
  • Thomas G. Dietterich, Guohua Hao, and Adam Ashenfelter. Gradient Tree Boosting for Training Conditional Random Fields. JMLR, 2008.

Value parameters

labels

sequence labels.

maxDepth

the maximum depth of the tree.

maxNodes

the maximum number of leaf nodes in the tree.

nodeSize

the minimum number of instances in a node; a node with fewer instances will not be split. Setting nodeSize = 5 generally gives good results.

ntrees

the number of trees/iterations.

sequences

the observation attribute sequences.

shrinkage

the shrinkage parameter in (0, 1], which controls the learning rate of the boosting procedure.

def hmm(observations: Array[Array[Int]], labels: Array[Array[Int]]): HMM

First-order Hidden Markov Model. A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. An HMM can be considered as the simplest dynamic Bayesian network.

In a regular Markov model, the state is directly visible to the observer, so the state transition probabilities are the only parameters. In a hidden Markov model, the state is not directly visible, but the output, which depends on the state, is visible. Each state has a probability distribution over the possible output tokens. The sequence of tokens generated by an HMM therefore gives some information about the sequence of states.
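
To make the last point concrete, the sketch below recovers the most likely hidden state sequence from observed tokens with the standard Viterbi algorithm. It is a self-contained illustration and does not use or describe Smile's own implementation; `ViterbiSketch`, `decode` and the parameter names `pi`, `a` and `b` (initial, transition and emission probabilities) are assumptions made for this example.

```scala
object ViterbiSketch {
  /** Most likely state path for `obs` under the given HMM parameters.
    * pi(i): initial probability of state i
    * a(i)(j): probability of moving from state i to state j
    * b(i)(k): probability that state i emits symbol k
    */
  def decode(obs: Array[Int], pi: Array[Double],
             a: Array[Array[Double]], b: Array[Array[Double]]): Array[Int] = {
    val p = pi.length
    val t = obs.length
    if (t == 0) Array.empty[Int]
    else {
      // delta(k)(i): best log-probability of any path that ends in state i at time k.
      val delta = Array.ofDim[Double](t, p)
      // back(k)(j): predecessor of state j on that best path.
      val back = Array.ofDim[Int](t, p)
      for (i <- 0 until p) delta(0)(i) = math.log(pi(i)) + math.log(b(i)(obs(0)))
      for (k <- 1 until t; j <- 0 until p) {
        val best = (0 until p).maxBy(i => delta(k - 1)(i) + math.log(a(i)(j)))
        back(k)(j) = best
        delta(k)(j) = delta(k - 1)(best) + math.log(a(best)(j)) + math.log(b(j)(obs(k)))
      }
      // Trace back from the best final state.
      val path = new Array[Int](t)
      path(t - 1) = (0 until p).maxBy(i => delta(t - 1)(i))
      for (k <- t - 1 to 1 by -1) path(k - 1) = back(k)(path(k))
      path
    }
  }
}
```

With two "sticky" states that each favor one symbol, the observed tokens 0, 0, 1, 1 decode to the state path 0, 0, 1, 1: the tokens alone are enough to infer where the hidden state most plausibly switched.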

Value parameters

labels

the state labels of the observations. States take values in [0, p), where p is the number of hidden states.

observations

the observation sequences. Symbols take values in [0, n), where n is the number of unique symbols.
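
Because both the observations and their state labels are supplied, the model parameters can in principle be estimated by simple counting. The sketch below shows one such relative-frequency estimate on integer-encoded data in the format described above. It is an illustration only: `HmmCounts`, `Params` and `estimate` are hypothetical names, and Smile's actual estimator may differ (for example in how it smooths zero counts).

```scala
object HmmCounts {
  /** Estimated initial (pi), transition (a) and emission (b) probabilities. */
  final case class Params(pi: Array[Double], a: Array[Array[Double]], b: Array[Array[Double]])

  /** Divides each row by its sum; rows with no counts stay all zeros. */
  private def normalize(counts: Array[Array[Double]]): Array[Array[Double]] =
    counts.map { row =>
      val total = row.sum
      if (total == 0.0) row else row.map(_ / total)
    }

  /** Relative-frequency estimate; assumes at least one non-empty sequence. */
  def estimate(observations: Array[Array[Int]], labels: Array[Array[Int]]): Params = {
    val p = labels.flatten.max + 1       // number of hidden states
    val n = observations.flatten.max + 1 // number of unique symbols
    val pi = new Array[Double](p)
    val a = Array.ofDim[Double](p, p)
    val b = Array.ofDim[Double](p, n)
    for (s <- observations.indices if observations(s).nonEmpty) {
      val obs = observations(s)
      val states = labels(s)
      pi(states(0)) += 1 // which state the sequence starts in
      for (t <- obs.indices) {
        b(states(t))(obs(t)) += 1                    // state emits symbol
        if (t > 0) a(states(t - 1))(states(t)) += 1  // state-to-state transition
      }
    }
    Params(normalize(Array(pi)).head, normalize(a), normalize(b))
  }
}
```

The example derives p and n from the largest label and symbol seen, which relies on the documented encoding: states in [0, p) and symbols in [0, n).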

def hmm[T <: AnyRef](observations: Array[Array[T]], labels: Array[Array[Int]], ordinal: ToIntFunction[T]): HMMLabeler[T]

Trains a first-order Hidden Markov Model.

Value parameters

labels

the state labels of the observations. States take values in [0, p), where p is the number of hidden states.

observations

the observation sequences. Symbols take values in [0, n), where n is the number of unique symbols.
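
The `ordinal` parameter in the signature is a `java.util.function.ToIntFunction[T]`. The sketch below shows one way to build such a function from the training data, assuming its role is to map each symbol of type `T` to an integer index in [0, n). `OrdinalSketch` and `ordinalOf` are hypothetical names, and numbering symbols in order of first appearance is a choice made for this example.

```scala
import java.util.function.ToIntFunction

object OrdinalSketch {
  /** Maps each distinct symbol to an index in [0, n), in order of first appearance. */
  def ordinalOf[T <: AnyRef](observations: Array[Array[T]]): ToIntFunction[T] = {
    val index: Map[T, Int] =
      observations.iterator.flatMap(_.iterator).toList.distinct.zipWithIndex.toMap
    new ToIntFunction[T] {
      // Throws NoSuchElementException for symbols not seen in `observations`.
      override def applyAsInt(symbol: T): Int = index(symbol)
    }
  }
}
```

The resulting function could then be passed as the `ordinal` argument, for example `hmm(observations, labels, OrdinalSketch.ordinalOf(observations))`. Symbols that never occur in the training data need separate handling, since this sketch throws on them.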
