A Layered Approach to People Detection in 3D Range Data

L. Spinello, K. Arras, R. Triebel and R. Siegwart

AAAI Conference on Artificial Intelligence 2010 (AAAI)

Keywords: 3D people detection, 3D pedestrian detection, 3D range data learning

 

People tracking is a key technology for autonomous systems, intelligent cars and social robots operating in populated environments. What makes the task difficult is that the appearance of humans in range data can change drastically as a function of body pose, distance to the sensor, self-occlusion and occlusion by other objects. In this paper we propose a novel approach to pedestrian detection in 3D range data based on supervised learning techniques to create a bank of classifiers for different height levels of the human body. In particular, our approach applies AdaBoost to train a strong classifier from geometrical and statistical features of groups of neighboring points at the same height. In a second step, the AdaBoost classifiers mutually enforce their evidence across different heights by voting into a continuous space. Pedestrians are finally found efficiently by mean-shift search for local maxima in the voting space. Experimental results carried out with 3D laser range data illustrate the robustness and efficiency of our approach even in cluttered urban environments. The learned people detector reaches a classification rate up to 96% from a single 3D scan.

 


 

3D point cloud
3D point cloud recorded in a busy urban environment where people, trams and cars are moving around.
Range data segmentation
The layers of range data are segmented: colors indicate different segments.
Voting space
Each segment is classified according to whether it belongs to a part of the learned 3D model of a person. A bank of AdaBoost classifiers trained with the 'one vs. all' technique is used for this task. Each segment casts votes for the center of a person in a 3.5D voting space: position (x,y,z) and classification 'confidence' w. Votes for different 3D model parts are displayed as colored balls.
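The voting step can be illustrated with a minimal sketch. It assumes that each classified segment is summarized by its centroid, the index of the body part assigned by the classifier bank, and the confidence w, and that each part stores one learned displacement to the person center. The names Segment, part_offsets and cast_votes are hypothetical and do not come from the paper; the actual method may cast several votes per segment.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vote = Tuple[float, float, float, float]  # (x, y, z, w)


@dataclass
class Segment:
    centroid: Tuple[float, float, float]  # mean (x, y, z) of the segment's points
    part: int                             # body part (height layer) assigned by the classifiers
    confidence: float                     # classification confidence w


def cast_votes(
    segments: List[Segment],
    part_offsets: Dict[int, Tuple[float, float, float]],
    min_confidence: float = 0.0,
) -> List[Vote]:
    """Cast one vote per segment for the center of a person.

    part_offsets is an assumed lookup table: for every part index it holds
    the displacement from a segment of that part to the person center.
    """
    votes: List[Vote] = []
    for seg in segments:
        # Segments the classifier bank rejected do not vote.
        if seg.confidence <= min_confidence:
            continue
        dx, dy, dz = part_offsets[seg.part]
        x, y, z = seg.centroid
        votes.append((x + dx, y + dy, z + dz, seg.confidence))
    return votes
```

In this sketch a leg segment and a torso segment of the same person end up voting for roughly the same point, which is what lets evidence from different heights accumulate.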
Detection hypotheses
High-density loci in the voting space represent the detection hypotheses. Mean-shift is applied to extract the modes of the vote distribution. The bigger the ball, the higher the detection score, which combines the number of votes, the detection confidence, and the parts detected.
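The mode search can be sketched as a confidence-weighted mean-shift over the votes. This is an illustration under stated assumptions, not the authors' implementation: it uses a Gaussian kernel with a hand-picked bandwidth, starts one search from every vote, and reports the weighted kernel density at each mode as a simplified score (the score described above also accounts for the parts detected). The function name and parameters are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vote = Tuple[float, float, float, float]  # (x, y, z, w)


def _distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def mean_shift_modes(
    votes: List[Vote],
    bandwidth: float = 0.5,
    tol: float = 1e-4,
    max_iter: int = 100,
) -> List[Tuple[float, float, float, float]]:
    """Return the modes (x, y, z, score) of the weighted vote distribution."""
    votes = [v for v in votes if v[3] > 0.0]  # only positive weights take part

    def shift(p: Sequence[float]) -> Tuple[List[float], float]:
        # Weighted Gaussian-kernel mean around p, plus the kernel density at p.
        num = [0.0, 0.0, 0.0]
        den = 0.0
        for x, y, z, w in votes:
            d2 = (x - p[0]) ** 2 + (y - p[1]) ** 2 + (z - p[2]) ** 2
            k = w * math.exp(-d2 / (2.0 * bandwidth ** 2))
            num[0] += k * x
            num[1] += k * y
            num[2] += k * z
            den += k
        return [n / den for n in num], den

    modes: List[Tuple[float, float, float, float]] = []
    for x, y, z, _ in votes:
        p = [x, y, z]
        for _ in range(max_iter):
            q, _ = shift(p)
            moved = _distance(p, q)
            p = q
            if moved < tol:
                break
        # Searches that converge to an already known mode are merged into it.
        if any(_distance(m[:3], p) < bandwidth for m in modes):
            continue
        _, score = shift(p)
        modes.append((p[0], p[1], p[2], score))

    # Strongest hypothesis first.
    return sorted(modes, key=lambda m: m[3], reverse=True)
```

The bandwidth plays the role of the expected spread of votes around a person center; too small a value splits one person into several hypotheses, too large a value merges people standing close together.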
Detection results
Detected people are depicted as boxes (Tannenstrasse dataset).
Detection results
Detected people are depicted as boxes (Polyterrasse dataset).