Many efforts have been made to couple frequent itemset extraction with relational DBMSs, but true integration into a relational DBMS kernel has rarely been achieved. The I(temset)-tree index allows tight integration of the frequent itemset extraction task into a relational DBMS kernel. The stored information is structured to allow selective access to only the index blocks needed for the current extraction phase. The representation of the data is complete, i.e., no support threshold is enforced, so that the index can be reused for any support threshold.
The I-tree index has been implemented in the PostgreSQL open source DBMS and exploits its physical-level access methods. Experiments have been run on various datasets characterized by different data distributions. Results show that the execution time of the frequent itemset extraction task exploiting the I-tree index is always comparable with, and sometimes faster than, that of a C++ implementation of the FP-growth algorithm accessing data stored in a flat file.