smile.association.FPGrowth

All Implemented Interfaces:: Iterable<ItemSet>

public class FPGrowth extends Object implements Iterable<ItemSet>

Frequent item set mining based on the FP-growth (frequent pattern growth) algorithm, which employs an extended prefix-tree (FP-tree) structure to store the database in a compressed form. The FP-growth algorithm is currently one of the fastest approaches to discover frequent item sets. FP-growth adopts a divide-and-conquer approach to decompose both the mining tasks and the databases. It uses a pattern fragment growth method to avoid the costly process of candidate generation and testing used by Apriori.

The basic idea of the FP-growth algorithm can be described as a recursive elimination scheme: in a preprocessing step delete all items from the transactions that are not frequent individually, i.e., do not appear in a user-specified minimum number of transactions. Then select all transactions that contain the least frequent item (least frequent among those that are frequent) and delete this item from them. Recurse to process the obtained reduced (also known as projected) database, remembering that the item sets found in the recursion share the deleted item as a prefix. On return, remove the processed item from the database of all transactions and start over, i.e., process the second frequent item etc. In these processing steps the prefix tree, which is enhanced by links between the branches, is exploited to quickly find the transactions containing a given item and also to remove this item from the transactions after it has been processed.

References

Jiawei Han, Jian Pei, Yiwen Yin, and Runying Mao. Mining frequent patterns without candidate generation. Data Mining and Knowledge Discovery 8:53-87, 2004.
Gosta Grahne and Jianfei Zhu. Fast algorithms for frequent itemset mining using FP-trees. IEEE TRANS. ON KNOWLEDGE AND DATA ENGINEERING 17(10):1347-1362, 2005.
Christian Borgelt. An Implementation of the FP-growth Algorithm. OSDM, 1-5, 2005.

Method Summary

Modifier and Type

Method

Description

static Stream<ItemSet>

apply(FPTree tree)

Mines the frequent item sets.

Iterator<ItemSet>

iterator()

int

size()

Returns the number transactions in the database.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface java.lang.Iterable
forEach, spliterator

Method Details
- size
  
  public int size()
  
  Returns the number transactions in the database.
  
  Returns:
  
  the number transactions in the database
- iterator
  
  public Iterator<ItemSet> iterator()
  
  Specified by:
  
  iterator in interface Iterable<ItemSet>
- apply
  
  public static Stream<ItemSet> apply(FPTree tree)
  
  Mines the frequent item sets.
  
  Parameters:
  
  tree - the FP-tree of item sets.
  
  Returns:
  
  the stream of frequent item sets.

Class FPGrowth

References

Method Summary

Methods inherited from class java.lang.Object

Methods inherited from interface java.lang.Iterable

Method Details

size

iterator

apply