Title

Mining Complex Patterns in Massive Relational Datasets

Authors

Mohammed Zaki
Department of Computer Science
Rensselaer Polytechnic Institute
Troy, NY 12180-3590, USA

Abstract

Frequent Pattern Mining (FPM) is a very powerful paradigm for mining informative and useful patterns in massive, complex datasets. In this talk, I will describe recent work on tree and graph mining. I will present the Data Mining Template Library (DMTL), a collection of generic containers and algorithms for data mining, as well as persistence and database management classes. DMTL provides a systematic solution to a whole class of common FPM tasks such as itemset, sequence, tree, and graph mining. DMTL is extensible and scalable, and it is designed for high performance and rapid response on massive datasets. A detailed set of experiments shows that DMTL is competitive with special-purpose algorithms designed for a particular pattern type, and that DMTL outperforms those methods when the database exceeds main memory.

Slides

PPT (1272320 bytes)

Last modified: 2004/04/05 12:30:44 (UTC)