arm

fun arm(minSupport: Int, confidence: Double, itemsets: Array<IntArray>): Stream<AssociationRule>

Association Rule Mining. Let I = {i1, i2,..., in} be a set of n binary attributes called items. Let D = {t1, t2,..., tm} be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I. An association rule is defined as an implication of the form X ⇒ Y where X, Y ⊆ I and X ∩ Y = Ø. The item sets X and Y are called antecedent (left-hand-side or LHS) and consequent (right-hand-side or RHS) of the rule, respectively. The support supp(X) of an item set X is defined as the proportion of transactions in the database which contain the item set. Note that the support of an association rule X ⇒ Y is supp(X ∪ Y). The confidence of a rule is defined conf(X ⇒ Y) = supp(X ∪ Y) / supp(X). Confidence can be interpreted as an estimate of the probability P(Y | X), the probability of finding the RHS of the rule in transactions under the condition that these transactions also contain the LHS. Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time.

Return

the stream of discovered association rules.

Parameters

itemsets

the item set database. Each row is a item set, which may have different length. The item identifiers have to be in [0, n), where n is the number of items. Item set should NOT contain duplicated items. Note that it is reordered after the call.

minSupport

the required minimum support of item sets in terms of frequency.

confidence

the confidence threshold for association rules.


fun arm(confidence: Double, tree: FPTree): Stream<AssociationRule>

Association Rule Mining. Let I = {i1, i2,..., in} be a set of n binary attributes called items. Let D = {t1, t2,..., tm} be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I. An association rule is defined as an implication of the form X ⇒ Y where X, Y ⊆ I and X ∩ Y = Ø. The item sets X and Y are called antecedent (left-hand-side or LHS) and consequent (right-hand-side or RHS) of the rule, respectively. The support supp(X) of an item set X is defined as the proportion of transactions in the database which contain the item set. Note that the support of an association rule X ⇒ Y is supp(X ∪ Y). The confidence of a rule is defined conf(X ⇒ Y) = supp(X ∪ Y) / supp(X). Confidence can be interpreted as an estimate of the probability P(Y | X), the probability of finding the RHS of the rule in transactions under the condition that these transactions also contain the LHS. Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time.

Return

the stream of discovered association rules.

Parameters

tree

the FP-tree of item set database.

confidence

the confidence threshold for association rules.