vectorize

Converts a bag of words to a feature vector.

Return

a vector of the frequencies of the feature tokens in the bag.

Parameters

terms

the token list used as features.

bag

the bag of words.
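
As a rough illustration of the frequency-vector idea, the sketch below builds one count per feature token. It is not the library's source: the function name `vectorizeCounts`, the `Map<String, Int>` bag type, and the `DoubleArray` return type are assumptions made for this example.

```kotlin
// Hypothetical sketch, not the library implementation.
// Assumption: the bag maps each word to the number of times it occurs.
fun vectorizeCounts(terms: List<String>, bag: Map<String, Int>): DoubleArray =
    // Position i holds the count of terms[i]; tokens missing from the bag count as 0.
    DoubleArray(terms.size) { i -> (bag[terms[i]] ?: 0).toDouble() }

fun main() {
    val terms = listOf("cat", "dog", "fish")
    val bag = mapOf("dog" to 2, "cat" to 1, "bird" to 5)
    // "bird" is not a feature token, so it is ignored.
    println(vectorizeCounts(terms, bag).contentToString()) // [1.0, 2.0, 0.0]
}
```

The result has the same length as `terms`, and words in the bag that are not feature tokens do not contribute.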


fun vectorize(terms: List<String>, bag: Set<String>): IntArray

Converts a binary bag of words to a sparse feature vector.

Return

an integer vector whose elements are the indices of the feature tokens present in the bag, in ascending order.

Parameters

terms

the token list used as features.

bag

the bag of words.
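
The sparse form can be sketched as follows. This is a minimal illustration of the described behavior, not the library's code; the name `vectorizeSparse` is hypothetical, and only the parameter and return types are taken from the signature above.

```kotlin
// Hypothetical sketch, not the library implementation.
// Returns the positions in `terms` whose token occurs in the bag.
fun vectorizeSparse(terms: List<String>, bag: Set<String>): IntArray =
    // Indices are visited in increasing order, so the result is ascending.
    terms.indices.filter { terms[it] in bag }.toIntArray()

fun main() {
    val terms = listOf("cat", "dog", "fish")
    val bag = setOf("fish", "cat", "bird")
    // "cat" is at index 0 and "fish" at index 2; "bird" is not a feature token.
    println(vectorizeSparse(terms, bag).contentToString()) // [0, 2]
}
```

Unlike the frequency vector, the length of the result depends on how many feature tokens appear in the bag, which keeps the representation compact when the feature list is large.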