Web Mining
Prof. Dr. Luc De Raedt
Dipl.-Inf. Kristian Kersting
Seminar (2), Thursdays, 14:00-16:00, SR 079-10-019 (basement)
Next meeting: Thursday, February 1, 2001, 13:30 s.t.
The World-Wide Web has seen a period of enormous growth and is still
growing at a rapid pace. Retrieval of data and extraction of knowledge
from the Web is considered one of the most challenging research problems
in practical computer science. Researchers from the Artificial Intelligence
(AI) and the Data Mining communities have realized that the Web is an area
where AI and Data Mining techniques are really needed and where they have
a great potential to solve some of the most important problems. A lot of
research on these topics is currently under way worldwide and is published
every year at the major AI conferences. In this seminar, students have
the opportunity to familiarize themselves with the application of Machine
Learning and Data Mining techniques to the Web.
Topics of papers presented in the seminar are:
- Text Classification
- Personal Information Agents, i.e., agents (autonomously "acting" and
  "sensing" computer programs) that gather information for the user.
- Web Mining, i.e., the extraction of knowledge from the Web.
- Intelligent Browsing, i.e., the extension of Web browsers by intelligent
  capabilities, such as suggesting which links to follow next.
- Web Search
- Collaborative Filtering, i.e., techniques that enable users to exploit
  similarities between interests and tastes for information filtering.
The first meeting for this seminar will be in the third week of October.
Schedule
Short presentations on January 11, 2001
| Name | Topic |
| Goetz Sattler | WebWatcher |
| Marcin Nadolny | Data Mining |
| Ulrich Kuhn | Agents |
Main talks on January 11, 2001
| Kristian Kersting | Relational Learning with Statistical Predicate Invention |
Main talks on February 1, 2001
| Name | Topic |
| Goetz Sattler | Adaptive Web sites |
| Marcin Nadolny | WHIRL |
| Ulrich Kuhn | Internet Portals |
List of literature
- From Data Mining to Knowledge Discovery: An Overview. Usama M. Fayyad,
  Gregory Piatetsky-Shapiro and Padhraic Smyth. In Usama M. Fayyad, Gregory
  Piatetsky-Shapiro, Padhraic Smyth and Ramasamy Uthurusamy, editors,
  "Advances in Knowledge Discovery and Data Mining", pages 1-36, AAAI
  Press / The MIT Press. (Book available on request)
- The Process of Knowledge Discovery in Databases: A Human-Centered Approach.
  Ronald J. Brachman and Tej Anand. In Usama M. Fayyad, Gregory
  Piatetsky-Shapiro, Padhraic Smyth and Ramasamy Uthurusamy, editors,
  "Advances in Knowledge Discovery and Data Mining", pages 37-58, AAAI
  Press / The MIT Press. (Book available on request)
- Data mining for hypertext: A tutorial survey. Soumen Chakrabarti.
  In SIGKDD Explorations, Volume 1, Number 2, January 2000. (pdf, ps)
- Wrapper induction: Efficiency and expressiveness. Nicholas Kushmerick.
  In Artificial Intelligence, Volume 118, Issues 1-2, pages 15-68,
  April 2000. (pdf)
- Learning to construct knowledge bases from the World Wide Web. Mark
  Craven, Dan DiPasquo, Dayne Freitag, Andrew McCallum, Tom Mitchell, Kamal
  Nigam and Seán Slattery. In Artificial Intelligence, Volume 118,
  Issues 1-2, pages 69-113, April 2000. (pdf)
- Relational Learning with Statistical Predicate Invention: Better Models
  for Hypertext. Mark Craven and Sean Slattery. To appear in Machine
  Learning Journal. (Printed article available on request)
- WHIRL: A word-based information representation language. William
  W. Cohen. In Artificial Intelligence, Volume 118, Issues 1-2,
  pages 163-196, April 2000. (pdf)
- Towards adaptive Web sites: Conceptual framework and case study. Mike
  Perkowitz and Oren Etzioni. In Artificial Intelligence, Volume 118,
  Issues 1-2, pages 245-275, April 2000. (pdf)
- WebWatcher: A Learning Apprentice for the World Wide Web.
  R. Armstrong, D. Freitag, T. Joachims, T. Mitchell. In Working Notes
  of the AAAI Spring Symposium Series on Information Gathering from
  Distributed, Heterogeneous Environments, Stanford, 1995. (ps.Z)
- WebWatcher: A Tour Guide for the World Wide Web. T. Joachims, D.
  Freitag, and T. Mitchell. In Proceedings of the 1997 IJCAI, August 1997.
  (ps.gz)
- Automating the Construction of Internet Portals with Machine Learning.
  Andrew McCallum, Kamal Nigam, Jason Rennie, Kristie Seymore. In Information
  Retrieval Journal, Volume 3, pages 127-163. Kluwer. 2000. (ps.gz)
- Using Reinforcement Learning to Spider the Web Efficiently. Jason
  Rennie and Andrew K. McCallum. In Proceedings of ICML-1999 Workshop
  "Machine Learning in Text Data Analysis". (ps.gz)
Background:
- Intelligent Internet systems. Alon Y. Levy and Daniel S. Weld.
  In Artificial Intelligence, Volume 118, Issues 1-2, pages 1-14,
  April 2000. (pdf)
- Web Mining Research: A Survey. R. Kosala and H. Blockeel.
  In SIGKDD Explorations, Volume 2, Number 1, pages 1-15, 2000. (pdf, ps)
Slides:
- Introductory lecture, 19.10.2000. (ps, in German)