Title

Frequent Pattern Queries with Optimized Constraint-pushing Operational Semantics

Authors

Francesco Bonchi, Dino Pedreschi and Fosca Giannotti

Abstract

This talk is about data mining query languages and their optimization in the context of a Logic-based Knowledge Discovery Support Environment, i.e., a flexible knowledge discovery system with capabilities to obtain, maintain, represent, and utilize both induced and deduced knowledge. In particular, we focus on frequent pattern queries, since this kind of query underlies many mining tasks and seems appropriate to encapsulate in a knowledge discovery system as a primitive operation. We introduce an inductive language for frequent pattern queries that is simple enough to be highly optimized and expressive enough to cover most interesting queries.

The formal semantics of the new inductive rules is provided by showing that there exists a unique mapping from each inductive rule of the language to a Datalog++ program with aggregates. Thanks to this mapping, we can define the formal declarative semantics of an inductive rule as the iterated stable model of the corresponding Datalog++ program. The expressiveness of the proposed inductive query language is studied by means of examples of complex queries involving frequent patterns.

Finally, we define an optimized constraint-pushing operational semantics for our inductive language. A frequent pattern mining operator is defined that exploits the most recent results on algorithms for constrained frequent itemset mining. The mining operator exploits the given set of constraints as much as possible and can adapt its behavior to the characteristics of the given input data.
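To make the idea of constraint pushing concrete, the following is a minimal sketch and not the mining operator presented in the talk. It assumes a plain level-wise (Apriori-style) search and a single anti-monotone constraint, a cap on the total price of an itemset. The names `frequent_itemsets`, `price`, and `max_total_price` are hypothetical and chosen only for illustration. Because the constraint is anti-monotone (with non-negative prices, every superset of a violating itemset also violates it), it can be checked before support counting, so pruned itemsets never trigger a scan of the data.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, price, max_total_price):
    """Level-wise search with an anti-monotone constraint pushed into it.

    Illustrative assumption: the constraint is sum(price) <= max_total_price
    with non-negative prices, so any superset of a violating itemset also
    violates it and can be discarded without counting its support.
    """
    transactions = [frozenset(t) for t in transactions]

    def satisfies(itemset):
        return sum(price[i] for i in itemset) <= max_total_price

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t)

    def keep(candidates):
        # Check the cheap constraint first; count support only for survivors.
        kept = {}
        for c in candidates:
            if satisfies(c):
                s = support(c)
                if s >= min_support:
                    kept[c] = s
        return kept

    items = {i for t in transactions for i in t}
    level = keep(frozenset([i]) for i in items)
    result = {}
    k = 1
    while level:
        result.update(level)
        candidates = set()
        for a in level:
            for b in level:
                union = a | b
                # Join step: keep unions of size k + 1 whose k-subsets all
                # survived the previous level (frequent and satisfying).
                if len(union) == k + 1 and all(
                    frozenset(sub) in level for sub in combinations(union, k)
                ):
                    candidates.add(union)
        level = keep(candidates)
        k += 1
    return result


# Hypothetical toy data, for illustration only.
demo_transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
demo_price = {"a": 1, "b": 2, "c": 5}
demo = frequent_itemsets(demo_transactions, 2, demo_price, 6)
```

In this toy run the itemset {b, c} is frequent but costs 7, so it is dropped at level 2 and the candidate {a, b, c} is never generated. Constraints that are not anti-monotone need different pruning strategies, which this sketch does not attempt.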

Slides

PPT (1284096 bytes)

Last modified: 2004/04/05 13:22:16 (UTC)