Institute for Computer Science

Machine Learning and Natural Language Processing Lab

Master Thesis

Speeding Up Logistic Model Tree Induction

Marc Sumner, 2005


Logistic Model Trees have been shown to be very accurate and compact classifiers. Their greatest disadvantage is the computational cost of inducing the tree. I address this issue at two levels of the induction process. First, the base learner for the logistic regression models at the nodes of the tree is sped up by eliminating attributes early on. Second, the method used to build the logistic regression models is modified to stop automatically at the optimum iteration, and a weight trimming heuristic is applied, which produces a significant speedup. I compare the training time and accuracy of the new induction process with those of the original on various datasets and show that the training time often decreases while the classification accuracy diminishes only slightly.
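
As a rough illustration of weight trimming in general, and not necessarily the exact heuristic used in the thesis, the sketch below keeps only the heaviest training instances, which together carry at least a fraction (1 - beta) of the total weight, so that a boosting iteration can fit its base learner on fewer instances. The function name `trimmed_indices` and the parameter `beta` are assumptions made for this example.

```python
def trimmed_indices(weights, beta=0.1):
    """Return the indices of the heaviest instances that together hold
    at least (1 - beta) of the total weight (hypothetical helper)."""
    total = sum(weights)
    # Order instance indices from heaviest to lightest.
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        # Stop once the retained instances cover the required weight mass.
        if mass >= (1.0 - beta) * total:
            break
        kept.append(i)
        mass += weights[i]
    return sorted(kept)


# Usage: drop the lightest instances that hold at most 25% of the weight.
subset = trimmed_indices([4, 2, 1, 1], beta=0.25)
```

Because instance weights in boosting tend to concentrate on a few hard examples in later iterations, the retained subset can become much smaller than the full training set, which is where the speedup would come from.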