trait Operators extends AnyRef
High level association rule operators.
Value Members
-
final
def
!=(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
final
def
##(): Int
- Definition Classes
- AnyRef → Any
- def +(other: String): String
- def ->[B](y: B): (Operators, B)
-
final
def
==(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
arm(file: String, minSupport: Int, confidence: Double, output: String): Long
Association Rule Mining.
This method scans the data twice. The first scan obtains the frequency of single items. The second scan constructs the FP-tree, which is a compressed form of the data. This way, the whole database does not need to be loaded into main memory. In the data, the item identifiers have to be in [0, n), where n is the number of items.
- file
the input file of item sets. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items.
- minSupport
the required minimum support of item sets in terms of frequency.
- confidence
the confidence threshold for association rules.
- output
the output file.
- returns
the number of discovered association rules.
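The two scans described above can be sketched without any library. This is a minimal illustration, not the library's implementation: the object `TwoScanSketch`, its method names, and the whitespace-separated line format are all assumptions made for the example. The first pass counts item frequencies; the second pass drops infrequent items and orders the remaining ones by descending frequency, which is the order in which FP-tree construction conventionally inserts them.

```scala
object TwoScanSketch {
  // Assumed format: one transaction per line, item ids separated by whitespace.
  def parse(line: String): Array[Int] =
    line.trim.split("\\s+").filter(_.nonEmpty).map(_.toInt)

  // First scan: count how many transactions contain each item.
  def countItems(lines: Iterator[String]): Map[Int, Int] = {
    val counts = scala.collection.mutable.Map.empty[Int, Int].withDefaultValue(0)
    for (line <- lines; item <- parse(line).distinct) counts(item) += 1
    counts.toMap
  }

  // Second scan: drop infrequent items and sort the rest by descending
  // frequency (ties broken by item id). Empty transactions are skipped.
  def prepare(lines: Iterator[String], counts: Map[Int, Int], minSupport: Int): List[Array[Int]] =
    lines.map { line =>
      parse(line).distinct
        .filter(item => counts.getOrElse(item, 0) >= minSupport)
        .sortBy(item => (-counts.getOrElse(item, 0), item))
    }.filter(_.nonEmpty).toList
}
```

Because both passes consume an iterator of lines, only the item counts and the current line are held in memory during the first scan. With the four transactions `0 1 2`, `0 2`, `0 3`, `1 2` and a minimum support of 2, item 3 is dropped and the first transaction becomes `0 2 1`.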
-
def
arm(file: String, minSupport: Int, confidence: Double, output: PrintStream): Long
Association Rule Mining.
This method scans the data twice. The first scan obtains the frequency of single items. The second scan constructs the FP-tree, which is a compressed form of the data. This way, the whole database does not need to be loaded into main memory. In the data, the item identifiers have to be in [0, n), where n is the number of items.
- file
the input file of item sets. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items. An item set should NOT contain duplicate items.
- minSupport
the required minimum support of item sets in terms of frequency.
- confidence
the confidence threshold for association rules.
- output
a print stream for output of association rules.
- returns
the number of discovered association rules.
-
def
arm(itemsets: Array[Array[Int]], minSupport: Int, confidence: Double, output: String): Long
Association Rule Mining.
The algorithm often generates more results than fit in memory. This variant writes the results directly to the output file instead of storing them in memory.
- itemsets
the item set database. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items. An item set should NOT contain duplicate items. Note that the database is reordered by the call.
- minSupport
the required minimum support of item sets in terms of frequency.
- confidence
the confidence threshold for association rules.
- output
the output file.
- returns
the number of discovered association rules.
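The constraints on `itemsets` (identifiers in [0, n), no duplicate items in a row, and in-place reordering by the call) can be checked and guarded against before mining. The helper below is a hypothetical sketch; `ItemSetChecks`, `isValid`, and `deepCopy` are names invented for this example and are not part of the documented API.

```scala
object ItemSetChecks {
  // True if every id lies in [0, n) and no row repeats an item.
  def isValid(itemsets: Array[Array[Int]], n: Int): Boolean =
    itemsets.forall { row =>
      row.forall(id => id >= 0 && id < n) && row.distinct.length == row.length
    }

  // Deep copy of the database, so that a call which reorders its input
  // leaves the caller's original arrays untouched.
  def deepCopy(itemsets: Array[Array[Int]]): Array[Array[Int]] =
    itemsets.map(_.clone())
}
```

A caller that still needs the original row order after mining could pass `ItemSetChecks.deepCopy(itemsets)` instead of `itemsets` itself.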
-
def
arm(itemsets: Array[Array[Int]], minSupport: Int, confidence: Double, output: PrintStream): Long
Association Rule Mining.
The algorithm often generates more results than fit in memory. This variant prints the results directly to a stream instead of storing them in memory.
- itemsets
the item set database. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items. An item set should NOT contain duplicate items. Note that the database is reordered by the call.
- minSupport
the required minimum support of item sets in terms of frequency.
- confidence
the confidence threshold for association rules.
- output
a print stream for output of association rules.
- returns
the number of discovered association rules.
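The design of the stream-based variants (print each result as it is found, return only a count) can be illustrated with a small stand-alone sketch. `StreamingOutput.emit` is a hypothetical helper, and the textual form of each result is an assumption; the actual output format is not specified here.

```scala
import java.io.{ByteArrayOutputStream, PrintStream}

object StreamingOutput {
  // Writes each result as soon as it is produced and returns only the count,
  // so the full result list never has to be held in memory.
  def emit(results: Iterator[String], output: PrintStream): Long = {
    var n = 0L
    results.foreach { r =>
      output.println(r)
      n += 1
    }
    output.flush()
    n
  }

  def main(args: Array[String]): Unit = {
    val buffer = new ByteArrayOutputStream()
    val out = new PrintStream(buffer, true, "UTF-8")
    // The rule strings below are placeholders for illustration only.
    val count = emit(Iterator("(0) => (2)", "(1) => (2)"), out)
    println(s"$count results written")
    print(buffer.toString("UTF-8"))
  }
}
```

Passing `System.out` as the stream would print results to the console, while a `PrintStream` wrapping a file stream would write them to disk.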
-
def
arm(itemsets: Array[Array[Int]], minSupport: Int, confidence: Double): Buffer[AssociationRule]
Association Rule Mining.
Let I = {i1, i2,..., in} be a set of n binary attributes called items. Let D = {t1, t2,..., tm} be a set of transactions called the database. Each transaction in D has a unique transaction ID and contains a subset of the items in I. An association rule is defined as an implication of the form X ⇒ Y where X, Y ⊆ I and X ∩ Y = ∅. The item sets X and Y are called the antecedent (left-hand side or LHS) and the consequent (right-hand side or RHS) of the rule, respectively.
The support supp(X) of an item set X is defined as the proportion of transactions in the database that contain the item set. Note that the support of an association rule X ⇒ Y is supp(X ∪ Y). The confidence of a rule is defined as conf(X ⇒ Y) = supp(X ∪ Y) / supp(X). Confidence can be interpreted as an estimate of the probability P(Y | X), the probability of finding the RHS of the rule in transactions under the condition that these transactions also contain the LHS.
Association rules are usually required to satisfy a user-specified minimum support and a user-specified minimum confidence at the same time.
- itemsets
the item set database. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items. An item set should NOT contain duplicate items. Note that the database is reordered by the call.
- minSupport
the required minimum support of item sets in terms of frequency.
- confidence
the confidence threshold for association rules.
- returns
the list of discovered association rules.
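The definitions of support and confidence can be worked through on a tiny database. This is a minimal sketch; `RuleMetrics` and its methods are hypothetical names for this example. Since `minSupport` is given in terms of frequency, a raw count is shown alongside the proportion.

```scala
object RuleMetrics {
  // Number of transactions that contain every item of `items`.
  def supportCount(db: Seq[Set[Int]], items: Set[Int]): Int =
    db.count(t => items.subsetOf(t))

  // supp(X): proportion of transactions that contain X.
  def support(db: Seq[Set[Int]], items: Set[Int]): Double =
    supportCount(db, items).toDouble / db.size

  // conf(X => Y) = supp(X ∪ Y) / supp(X)
  def confidence(db: Seq[Set[Int]], x: Set[Int], y: Set[Int]): Double =
    support(db, x union y) / support(db, x)
}
```

For the database {0, 1, 2}, {0, 2}, {0, 3}, {1, 2}, item set {0} appears in three of four transactions, so supp({0}) = 0.75. Item set {0, 2} appears in two, so supp({0, 2}) = 0.5 and conf({0} ⇒ {2}) = 0.5 / 0.75 = 2/3.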
-
final
def
asInstanceOf[T0]: T0
- Definition Classes
- Any
-
def
clone(): AnyRef
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- def ensuring(cond: (Operators) ⇒ Boolean, msg: ⇒ Any): Operators
- def ensuring(cond: (Operators) ⇒ Boolean): Operators
- def ensuring(cond: Boolean, msg: ⇒ Any): Operators
- def ensuring(cond: Boolean): Operators
-
final
def
eq(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
def
equals(arg0: Any): Boolean
- Definition Classes
- AnyRef → Any
-
def
finalize(): Unit
- Attributes
- protected[java.lang]
- Definition Classes
- AnyRef
- Annotations
- @throws( classOf[java.lang.Throwable] )
- def formatted(fmtstr: String): String
-
def
fpgrowth(file: String, minSupport: Int, output: String): Long
Frequent item set mining based on the FP-growth (frequent pattern growth) algorithm.
This method mines frequent item sets by scanning the data twice. The first scan obtains the frequency of single items. The second scan constructs the FP-tree, which is a compressed form of the data. This way, the whole database does not need to be loaded into main memory. In the data, the item identifiers have to be in [0, n), where n is the number of items.
- file
the input file of item sets. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items.
- minSupport
the required minimum support of item sets in terms of frequency.
- output
the output file.
- returns
the number of discovered frequent item sets.
-
def
fpgrowth(file: String, minSupport: Int, output: PrintStream): Long
Frequent item set mining based on the FP-growth (frequent pattern growth) algorithm.
This method mines frequent item sets by scanning the data twice. The first scan obtains the frequency of single items. The second scan constructs the FP-tree, which is a compressed form of the data. This way, the whole database does not need to be loaded into main memory. In the data, the item identifiers have to be in [0, n), where n is the number of items.
- file
the input file of item sets. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items.
- minSupport
the required minimum support of item sets in terms of frequency.
- output
a print stream for output of frequent item sets.
- returns
the number of discovered frequent item sets.
-
def
fpgrowth(itemsets: Array[Array[Int]], minSupport: Int, output: String): Long
Frequent item set mining based on the FP-growth (frequent pattern growth) algorithm.
The algorithm often generates more results than fit in memory. This variant writes the results directly to the output file instead of storing them in memory.
- itemsets
the item set database. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items. An item set should NOT contain duplicate items. Note that the database is reordered by the call.
- minSupport
the required minimum support of item sets in terms of frequency.
- output
the output file.
- returns
the number of discovered frequent item sets.
-
def
fpgrowth(itemsets: Array[Array[Int]], minSupport: Int, output: PrintStream): Long
Frequent item set mining based on the FP-growth (frequent pattern growth) algorithm.
The algorithm often generates more results than fit in memory. This variant prints the results directly to a stream instead of storing them in memory.
- itemsets
the item set database. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items. An item set should NOT contain duplicate items. Note that the database is reordered by the call.
- minSupport
the required minimum support of item sets in terms of frequency.
- output
a print stream for output of frequent item sets.
- returns
the number of discovered frequent item sets.
-
def
fpgrowth(itemsets: Array[Array[Int]], minSupport: Int): Buffer[ItemSet]
Frequent item set mining based on the FP-growth (frequent pattern growth) algorithm, which employs an extended prefix-tree (FP-tree) structure to store the database in a compressed form.
The FP-growth algorithm is currently one of the fastest approaches to discovering frequent item sets. It adopts a divide-and-conquer approach to decompose both the mining tasks and the databases, and it uses a pattern fragment growth method to avoid the costly candidate generation and testing used by Apriori.
The basic idea of the FP-growth algorithm can be described as a recursive elimination scheme. In a preprocessing step, delete all items from the transactions that are not frequent individually, i.e., that do not appear in a user-specified minimum number of transactions. Then select all transactions that contain the least frequent item (least frequent among those that are frequent) and delete this item from them. Recurse to process the resulting reduced (also known as projected) database, remembering that the item sets found in the recursion share the deleted item as a prefix. On return, remove the processed item from all transactions in the database and start over, i.e., process the second least frequent item, and so on. In these processing steps, the prefix tree, which is enhanced by links between the branches, is exploited to quickly find the transactions containing a given item and to remove this item from the transactions after it has been processed.
- itemsets
the item set database. Each row is an item set, and rows may have different lengths. The item identifiers have to be in [0, n), where n is the number of items. An item set should NOT contain duplicate items. Note that the database is reordered by the call.
- minSupport
the required minimum support of item sets in terms of frequency.
- returns
the list of frequent item sets.
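The recursive elimination scheme can be sketched directly on sets of items. This is a simplified illustration under stated assumptions, not the library's implementation: it works on plain Scala collections rather than a linked prefix tree, so it shows the recursion but not the FP-tree's speed. `RecursiveElimination.mine` is a hypothetical name.

```scala
object RecursiveElimination {
  // Returns every frequent item set together with its support count.
  def mine(db: List[Set[Int]], minSupport: Int): Map[Set[Int], Int] = {
    // Preprocessing: count items and keep only the individually frequent ones.
    val counts = db.flatten.groupBy(identity).map { case (item, occ) => item -> occ.size }
    val frequent = counts.filter { case (_, c) => c >= minSupport }
    if (frequent.isEmpty) Map.empty[Set[Int], Int]
    else {
      val pruned = db.map(_.filter(frequent.contains)).filter(_.nonEmpty)
      // Least frequent item among the frequent ones (ties broken by id).
      val item = frequent.toList.minBy { case (i, c) => (c, i) }._1
      // Projected database: transactions containing `item`, with `item` deleted.
      val projected = pruned.filter(_.contains(item)).map(_ - item)
      // Every item set found in the recursion is extended with `item`.
      val withItem = mine(projected, minSupport).map { case (s, c) => (s + item) -> c } +
        (Set(item) -> frequent(item))
      // Remove the processed item from all transactions and start over.
      val withoutItem = mine(pruned.map(_ - item), minSupport)
      withItem ++ withoutItem
    }
  }
}
```

For the database {0, 1, 2}, {0, 2}, {0, 3}, {1, 2} with a minimum support of 2, item 3 is removed in preprocessing and item 1 is processed first. The frequent item sets found are {0}, {1}, {2}, {0, 2}, and {1, 2}.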
-
final
def
getClass(): Class[_]
- Definition Classes
- AnyRef → Any
-
def
hashCode(): Int
- Definition Classes
- AnyRef → Any
-
final
def
isInstanceOf[T0]: Boolean
- Definition Classes
- Any
-
final
def
ne(arg0: AnyRef): Boolean
- Definition Classes
- AnyRef
-
final
def
notify(): Unit
- Definition Classes
- AnyRef
-
final
def
notifyAll(): Unit
- Definition Classes
- AnyRef
-
final
def
synchronized[T0](arg0: ⇒ T0): T0
- Definition Classes
- AnyRef
-
def
toString(): String
- Definition Classes
- AnyRef → Any
-
final
def
wait(): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long, arg1: Int): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
-
final
def
wait(arg0: Long): Unit
- Definition Classes
- AnyRef
- Annotations
- @throws( ... )
- def →[B](y: B): (Operators, B)
High level Smile operators in Scala.