Q-Trajectories – People detection and tracking in complex image sequences (2012)

Team:  T. Klinger
Year:  2012
Funding:  DFG
Duration:  1.12.2010 to 30.11.2012
Completed:  yes

The project Q-Trajectories addresses the acquisition of high-quality motion trajectories derived from an event-driven smart camera network. Q-Trajectories is a bundle project funded by the German Research Foundation (DFG) and is carried out by the IPI together with the institutes ikg and SRA at Leibniz Universität Hannover.

Motivation

The detection, tracking and classification of persons in image sequences is one of the most active research topics in image analysis and computer vision. For controlled environments, where persons are clearly recognisable in the images of static monoscopic cameras, the problem is largely solved. In more complex scenarios, a large depth range, inter-object occlusions and time-variable orientation parameters of the cameras involved often require a combination of several person descriptors and the integration of motion models. A so-called smart camera can change its orientation using pan-tilt-zoom (PTZ) actuators. A collaborative network of self-organised smart cameras enables the surveillance of large areas with a minimum number of camera units and constitutes the basis for the bundle project.


Figure 1: Circle of collaboration

Goal

The project aims at developing new methodological approaches and tools for detecting and tracking persons in complex image sequences. The subprojects support each other through the integration of self-organised camera alignment, object tracking and pattern recognition, cf. Figure 1.

The IPI subproject deals with the detection and tracking of persons in image sequences captured by the smart cameras. Detection here means finding evidence for the presence of a pedestrian together with at least a coarse localisation. Tracking means linking individual detections over successive frames (time), which becomes a combinatorial problem as soon as multiple persons are present in the scene. To address this, a classification strategy was developed that captures appearance-based features at run-time and gradually specialises on individual persons; a sketch of this tracking-by-detection idea is given below. Figure 2 shows the confidence achieved by the classifier for each visible person in the scene after processing two frames (left sub-figure) and after processing forty frames (right sub-figure). The columns in the graph visualise the confidence achieved by classifying the colour-coded image regions, where each colour represents a different person. The three diagrams beneath each sub-figure indicate the confidences achieved by classifying the blue, red and yellow framed image regions, respectively.
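
The following Python sketch illustrates the tracking-by-detection idea described above. It is not the project's implementation, but a minimal stand-in that assumes colour-histogram appearance features, scikit-learn's SGDClassifier as the instance-specific classifier trained online, and a greedy confidence-based assignment of detections to tracks; all names and parameters are hypothetical.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier


def colour_histogram(patch, bins=8):
    """Appearance feature: per-channel colour histogram of an image
    patch (H x W x 3 uint8 array) -- a simple stand-in descriptor."""
    hist = [np.histogram(patch[..., c], bins=bins, range=(0, 256),
                         density=True)[0] for c in range(3)]
    return np.concatenate(hist)


class PersonTrack:
    """One track per person, each with its own classifier that is
    updated at run-time and thus specialises on that person."""

    def __init__(self, track_id):
        self.id = track_id
        self.clf = SGDClassifier(loss="log_loss")  # probabilistic output
        self.initialised = False

    def confidence(self, feature):
        """Classifier confidence that the feature belongs to this person."""
        if not self.initialised:
            return 0.0
        return float(self.clf.predict_proba([feature])[0, 1])

    def update(self, positive, negatives):
        """Online update: the matched patch is a positive sample, the
        patches assigned to the other persons serve as negatives."""
        X = np.vstack([positive] + negatives)
        y = np.array([1] + [0] * len(negatives))
        self.clf.partial_fit(X, y, classes=[0, 1])
        self.initialised = True


def associate(tracks, detections):
    """Greedy association of detections to tracks by classifier
    confidence -- a simple stand-in for the combinatorial assignment
    problem mentioned in the text."""
    features = [colour_histogram(d) for d in detections]
    assignment = {}
    free = set(range(len(detections)))
    for track in tracks:
        if not free:
            break
        best = max(free, key=lambda i: track.confidence(features[i]))
        assignment[track.id] = best
        free.remove(best)
    # After assignment, update every matched classifier online.
    for track in tracks:
        if track.id in assignment:
            pos = features[assignment[track.id]]
            negs = [features[i] for tid, i in assignment.items()
                    if tid != track.id]
            track.update(pos, negs)
    return assignment
```

In this sketch, each matched detection serves as a positive sample for its own track's classifier and as a negative sample for all other tracks, which is one simple way to let the per-person classifiers accumulate training samples and gain confidence over successive frames, as illustrated in Figure 2.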


Figure 2: Tracking-by-detection using an instance-specific classifier trained at run-time. As further training samples are incorporated, the confidence of the classification increases gradually. Left: trajectories and achieved confidence values after processing two frames; right: the same after forty frames.