Class CRF
- All Implemented Interfaces:
Serializable
A CRF is a Markov random field trained discriminatively. It therefore does not need to model the distribution over the always-observed variables, which makes it possible to include arbitrarily complicated features of the observed variables in the model.
This class implements an algorithm that trains CRFs via gradient tree boosting. In tree boosting, the CRF potential functions are represented as weighted sums of regression trees, which provide compact representations of feature interactions; the algorithm therefore never explicitly enumerates the potentially large parameter space. As a result, gradient tree boosting scales linearly in the order of the Markov model and in the order of the feature interactions, rather than exponentially as in previous algorithms based on iterative scaling and gradient descent.
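As a sketch, a potential function represented this way is simply a shrinkage-weighted sum of fitted regression trees. The following minimal Java illustration uses hypothetical names (a `BoostedPotential` class with `ToDoubleFunction` stand-ins for trees); it is not Smile's internal representation:

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustrative sketch: a CRF potential function represented as a
// shrinkage-weighted sum of regression trees, as in gradient tree boosting.
// The names here are hypothetical, not Smile's internal API.
public class BoostedPotential {
    private final List<ToDoubleFunction<double[]>> trees; // fitted regression trees
    private final double shrinkage;                       // learning rate

    public BoostedPotential(List<ToDoubleFunction<double[]>> trees, double shrinkage) {
        this.trees = trees;
        this.shrinkage = shrinkage;
    }

    // Evaluates F(x) = shrinkage * sum_m tree_m(x).
    public double value(double[] x) {
        double sum = 0.0;
        for (ToDoubleFunction<double[]> tree : trees) {
            sum += tree.applyAsDouble(x);
        }
        return shrinkage * sum;
    }
}
```

Each boosting iteration would fit one additional tree to the functional gradient of the conditional log-likelihood and append it to this sum, which is why the representation stays compact.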
References
- J. Lafferty, A. McCallum and F. Pereira. Conditional random fields: Probabilistic models for segmenting and labeling sequence data. ICML, 2001.
- Thomas G. Dietterich, Guohua Hao, and Adam Ashenfelter. Gradient Tree Boosting for Training Conditional Random Fields. JMLR, 2008.
Nested Class Summary
Nested Classes
- CRF.Options - the hyperparameters.
Constructor Summary
Constructors
- CRF(StructType schema, RegressionTree[][] potentials, double shrinkage) - Constructor.
Method Summary
- static CRF fit - Fits a CRF model.
- static CRF fit(Tuple[][] sequences, int[][] labels, CRF.Options options) - Fits a CRF model.
- int[] predict - Labels each position in the feature sequence independently by the forward-backward algorithm.
- int[] viterbi - Labels a sequence with the Viterbi algorithm.
Constructor Details
CRF
Constructor.
- Parameters:
  - schema - the schema of features.
  - potentials - the potential functions.
  - shrinkage - the learning rate.
Method Details
viterbi
Labels a sequence with the Viterbi algorithm. The Viterbi algorithm returns the whole-sequence labeling with maximum joint probability, which makes sense in applications (e.g. part-of-speech tagging) that require coherent sequential labeling. In contrast, the forward-backward algorithm labels each position independently, which usually yields better per-position accuracy although the resulting sequence may not be coherent.
- Parameters:
  - x - the sequence.
- Returns:
  - the sequence labels.
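To illustrate the decoding step itself (not this class's implementation, which evaluates tree-based potentials), here is a minimal self-contained Viterbi decoder for a linear chain, given per-position node scores and pairwise transition scores in log space:

```java
// Illustrative sketch of Viterbi decoding for a linear chain.
// node[t][y] is the (log-space) score of label y at position t;
// trans[p][y] is the score of transitioning from label p to label y.
public class ViterbiDemo {
    public static int[] viterbi(double[][] node, double[][] trans) {
        int T = node.length, K = node[0].length;
        double[][] delta = new double[T][K]; // best score of any path ending in (t, y)
        int[][] back = new int[T][K];        // backpointers for path recovery
        delta[0] = node[0].clone();
        for (int t = 1; t < T; t++) {
            for (int y = 0; y < K; y++) {
                double best = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < K; p++) {
                    double s = delta[t - 1][p] + trans[p][y];
                    if (s > best) { best = s; back[t][y] = p; }
                }
                delta[t][y] = best + node[t][y];
            }
        }
        // Recover the jointly best path by backtracking from the best final label.
        int[] path = new int[T];
        for (int y = 1; y < K; y++)
            if (delta[T - 1][y] > delta[T - 1][path[T - 1]]) path[T - 1] = y;
        for (int t = T - 1; t > 0; t--) path[t - 1] = back[t][path[t]];
        return path;
    }
}
```

Note that the returned path maximizes the joint score of the whole sequence, which is exactly the coherence property described above.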
predict
Labels each position in the feature sequence independently by the forward-backward algorithm. At each position the label with the highest marginal probability (alpha * beta) is selected. This per-position marginal argmax generally achieves lower token-level error than Viterbi, but the resulting label sequence may not correspond to a single globally most-likely path.
- Parameters:
  - x - a sequence.
- Returns:
  - the per-position most likely label sequence.
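A minimal sketch of this decoding rule for a linear chain, using unnormalized non-negative potentials (again illustrative only, not this class's implementation):

```java
// Illustrative sketch of forward-backward marginal decoding for a linear chain.
// node[t][y] and trans[p][y] are unnormalized non-negative potentials
// (e.g. exponentiated scores). Each position takes the label maximizing
// the marginal alpha[t][y] * beta[t][y].
public class ForwardBackwardDemo {
    public static int[] decode(double[][] node, double[][] trans) {
        int T = node.length, K = node[0].length;
        double[][] alpha = new double[T][K], beta = new double[T][K];
        // Forward pass: alpha[t][y] sums over all paths ending in (t, y).
        alpha[0] = node[0].clone();
        for (int t = 1; t < T; t++)
            for (int y = 0; y < K; y++)
                for (int p = 0; p < K; p++)
                    alpha[t][y] += alpha[t - 1][p] * trans[p][y] * node[t][y];
        // Backward pass: beta[t][y] sums over all continuations from (t, y).
        for (int y = 0; y < K; y++) beta[T - 1][y] = 1.0;
        for (int t = T - 2; t >= 0; t--)
            for (int y = 0; y < K; y++)
                for (int n = 0; n < K; n++)
                    beta[t][y] += trans[y][n] * node[t + 1][n] * beta[t + 1][n];
        // Per-position argmax of the marginal alpha * beta.
        int[] labels = new int[T];
        for (int t = 0; t < T; t++)
            for (int y = 1; y < K; y++)
                if (alpha[t][y] * beta[t][y] > alpha[t][labels[t]] * beta[t][labels[t]])
                    labels[t] = y;
        return labels;
    }
}
```

Because each position is decided by its own marginal, the output need not be a path that any single joint assignment would produce, which is the coherence caveat noted above. (A production implementation would also rescale alpha and beta per position to avoid numerical underflow on long sequences.)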
fit
Fits a CRF model.
- Parameters:
  - sequences - the training data.
  - labels - the training sequence labels.
  - options - the hyperparameters.
- Returns:
  - the model.