
Annals of the BMVA Vol. 2010, No. 3, pp 1−14 (2010)

Active Range Imaging Dataset for Indoor Surveillance

Cosimo Distante 1, Giovanni Diraco 2, Alessandro Leone 2

1 National Institute for Applied Optics INOA

Viale della Libertà 3 Arnesano, 73010 Lecce (Italy)

2 Institute for Microelectronics and Microsystems

National Research Council

Provinciale per Arnesano, Palazzina A3, 73100 Lecce (Italy)

<cosimo.distante@inoa.it>

Abstract

Range Imaging (RIM) is a suitable new choice for measurement and modelling in many different applications. The ability to describe scenes in three dimensions opens new scenarios, providing new opportunities in different fields, including visual monitoring and security. Active range sensors provide depth information that allows much less complex algorithms to be developed and problems to be approached in new, robust and efficient ways. This paper presents a wide dataset for indoor surveillance applications acquired by a state-of-the-art range camera. The main aim is the definition of a common basis for the comparative evaluation of the performance of vision algorithms.

1 Introduction

Acquiring 3D information in real environments is an important task for many computer vision applications (e.g. surveillance, security, autonomous navigation, computer graphics, human-computer interaction, automotive, biomedical, augmented environments) and the benefits of range image acquisition are apparent. In recent years several active range sensors with small dimensions, low power consumption and real-time performance have been released. Since Time-Of-Flight (TOF) range cameras are widely used by researchers in monitoring and surveillance applications (as described in the next section), ad-hoc workshops on this topic have been organized [1, 2] and projects have been started [3, 4, 40] in order to develop TOF sensors and related software technologies for ambient intelligence (people tracking, head tracking, eye movement tracking, behaviour analysis, etc.).

© 2010. The copyright of this document resides with its authors. It may be distributed unchanged freely in print or electronic forms.

Comparing algorithms is especially difficult since they are tested on different datasets under widely varying conditions. A large number of real and synthetic image datasets have been defined for the evaluation of the performance of passive vision-based algorithms in surveillance applications [6, 7, 8, 9], whereas datasets of range images for surveillance and monitoring applications are limited to head/face detection and gesture analysis [15] (other databases are presented in the literature with the aim of evaluating object segmentation on synthetic still range images, as well as for biometrics and robot navigation [16, 17, 18, 41]). However, 3D range imaging datasets for people posture analysis and behaviour recognition are not yet available. For this reason, the definition of a database of real 3D range images for human behaviour analysis and posture recognition appears useful and important for the scientific community, in order to define a common basis for the evaluation of algorithms in the indoor surveillance field.

This paper presents a large database, called ARIDIS (Active Range Imaging Dataset for Indoor Surveillance), of benchmark sequences captured by a state-of-the-art TOF range camera. The paper is organized as follows. In section 2, the state of the art of TOF sensors is discussed and the widely used commercial TOF cameras and related works are presented. Section 3 provides some details about the advantages and drawbacks of using 3D active range vision instead of passive stereo vision in surveillance. In section 4, the ARIDIS dataset is described, whereas concluding considerations are presented in section 5.

2 TOF Cameras: State of the Art and Related Works

2.1 TOF State of the Art

Optical distance measurement techniques are usually classified into three fundamental categories: Interferometry, Stereo/Triangulation and Time-of-Flight [14]. The Interferometry method makes use of the periodic nature of light by implementing an interferometric setup in which the light from a reference beam path interferes with the light reflected from an object. The high accuracy mainly depends on the coherence length of the light source, so Interferometry is not suitable for ranges greater than a few centimetres, since the method is based on the evaluation of very short optical wavelengths. The active triangulation method exploits geometrical relations between an illumination source and a camera, whereas passive triangulation concerns relations between two or more cameras. The geometrical relations between the object, the sensor and a known baseline are used for the calculation of the distance. In the passive case the acquisition of depth is performed by stereoscopy or multi-view geometry, i.e. the clever combination of at least two 2D images taken from different points of view. The Time-of-Flight technique exploits the finite speed of light for the determination of distances through a modulated light signal. The distribution of distances is determined by measuring the round-trip time of the modulated light from the illuminator to the object and back. The round-trip time of the emitted signal is related to the object distance and may be determined in various ways.

Figure 1. a) TOF PMD principle; b) PMD 19k; c) SwissRanger 3000; d) Canesta PD200.

Currently, devices employing nanosecond laser pulses are widely investigated; however, architectural constraints and the integration of highly precise, miniaturized timing circuitry increase the cost of such systems.
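As a rough back-of-the-envelope illustration (not from the original text) of why direct pulse-based timing is so demanding, the round-trip time over typical indoor distances is only tens of nanoseconds:

$$ t = \frac{2d}{c}, \qquad d = 3\,\text{m} \;\Rightarrow\; t = \frac{2 \cdot 3\,\text{m}}{3\times 10^{8}\,\text{m/s}} = 20\,\text{ns}, $$

and resolving a depth difference of $\Delta d = 1\,\text{cm}$ requires a timing resolution of $\Delta t = 2\,\Delta d / c \approx 67\,\text{ps}$.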

Sensors based on the Photonic Mixer Device (PMD) technology offer advantages in terms of size, weight and scan time. A PMD sensor employs incoherent, near-infrared, amplitude-modulated, continuous-wave light and determines the signal phase shift (and hence the object distance) by mixing the emitted signal with the returned signal at every pixel (Fig. 1.a) [13]. The illumination units of PMD cameras are usually realized with low-cost LED arrays. Various TOF cameras support the Suppression of Background Intensity (SBI) modality or an equivalent IR-suppression scheme, allowing the use of the devices in outdoor applications. As mentioned above, the TOF technique is based on the indirect estimation of the arrival time by measuring the phase shift between the transmitted and the received signal. The phase shift φ is directly proportional to the round-trip distance d, as shown in the following relation:

$$ d = \frac{\varphi}{2\pi} \cdot \frac{\lambda_{mod}}{2}, \qquad \text{with} \qquad d \le \frac{\lambda_{mod}}{2} \tag{1} $$

where λ_mod is the modulation wavelength, which is about 15 m in most TOF cameras. By substituting λ_mod = 15 m in Eq. (1), the resulting non-ambiguous distance is 7.5 m. Recent cameras allow the use of more than one modulation frequency in order to increase the non-ambiguity range, at the cost of increased device complexity and decreased measurement accuracy.
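As a minimal illustration of Eq. (1) and of the aliasing behaviour beyond the non-ambiguity range, the following Python sketch converts a phase measurement into a distance; the helper names are hypothetical and not part of any camera SDK.

```python
import numpy as np

C = 299_792_458.0           # speed of light (m/s)
F_MOD = 20e6                # modulation frequency (Hz), typical default
LAMBDA_MOD = C / F_MOD      # modulation wavelength, about 15 m
MAX_RANGE = LAMBDA_MOD / 2  # non-ambiguous range, about 7.5 m

def phase_to_distance(phi):
    """Eq. (1): distance is proportional to the measured phase shift."""
    return (phi / (2 * np.pi)) * MAX_RANGE

def simulate_measured_phase(true_distance):
    """Phase observed by the sensor: it wraps every MAX_RANGE metres."""
    return (2 * np.pi * true_distance / MAX_RANGE) % (2 * np.pi)

# A target at 8 m lies beyond the ~7.5 m non-ambiguity range,
# so it is "seen" at roughly 0.5 m (aliasing).
for d_true in (2.0, 7.0, 8.0):
    phi = simulate_measured_phase(d_true)
    print(f"true {d_true:4.1f} m -> measured {phase_to_distance(phi):4.2f} m")
```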

However, TOF sensors normally suffer from the following drawbacks [29]:

Low Resolution: Current sensors have a resolution between 64x48 and 176x144 pixels, which is low compared to standard RGB sensors but sufficient for most current applications. Additionally, the resulting large solid angle per pixel affects the measurement in areas of inhomogeneous depth (e.g. at silhouette boundaries).

Depth Distortion: Since a perfectly sinusoidal signal is practically not achievable, the measurement is affected by systematic errors.

Motion Artefacts: Since four phase images are acquired sequentially in order to reconstruct the received signal (see the sketch after this list), movements of the device or in the scene lead to erroneous distance values at object boundaries.

Multiple Cameras: The illuminator of a camera can affect another camera placed in the same environment. Recent developments allow the use of multiple TOF cameras.
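For concreteness, a common way to recover phase, amplitude and offset from four sequentially acquired correlation images (sampled at 0°, 90°, 180° and 270°) is the standard four-bucket formula sketched below. This is a generic illustration of continuous-wave demodulation, not necessarily the exact processing pipeline of any specific camera mentioned in this paper.

```python
import numpy as np

def demodulate_four_phase(c0, c90, c180, c270):
    """Standard four-bucket CW-TOF demodulation (sign conventions vary).

    c0..c270 are the four correlation images acquired sequentially at
    phase offsets of 0, 90, 180 and 270 degrees (arrays of equal shape).
    If the scene moves between the four exposures, the samples belong to
    different surfaces and the recovered phase is wrong at object edges,
    which is the "motion artefact" drawback discussed above.
    """
    phase = np.arctan2(c270 - c90, c0 - c180) % (2 * np.pi)   # phase shift
    amplitude = 0.5 * np.sqrt((c270 - c90) ** 2 + (c0 - c180) ** 2)
    offset = 0.25 * (c0 + c90 + c180 + c270)                  # background level
    return phase, amplitude, offset
```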

More technical details on range imaging can be found in [35].

2.2 Commercial TOF Cameras

The TOF phase-measurement principle is used by several manufacturers of TOF cameras, such as Canesta Inc. [30], PMDTechnologies GmbH [31] and MESA Imaging AG [32] (Fig. 1.b, Fig. 1.c, Fig. 1.d). Canesta provides several models of depth vision sensors, differing in pixel resolution, covered distance, frame rate and field of view. Canesta distributes sensors with fields of view ranging between 30 and 114 degrees, depending on the nature of the application. Currently, the maximum resolution of Canesta sensors is 160x120. Although some applications would benefit from more pixels, the sensor cost grows with the number of pixels. Canesta sensors are mainly employed in automotive applications [10, 11] and customized for specific application requirements. The TOF cameras by PMDTechnologies GmbH include all the key technologies for a TOF system. The illumination LEDs are generally mounted in two arrays, one on each side of the camera. This illumination configuration is not optimal [34], since it introduces errors due to TOF differences between the left and the right array, which are corrected in hardware. The correlation inside the camera is performed using a CMOS-based optical semiconductor, increasing speed and decreasing the cost and noise of the system. PMDTechnologies GmbH provides several models of TOF cameras with different features, suitable for measurements in bright daylight since all cameras are equipped with the SBI technology.


MESA SR3000
- Distance resolution: 1% of range
- Distance accuracy (single pixel): +/- 1 cm (absolute accuracy)
- Frame rate: 25 fps, typical
- Outdoor: Yes
- Pixel array size: 176x144 (QCIF)
- Field of view (H x V): 47.5° x 39.6°
- Interface: USB 2.0
- Illumination power: 1 Watt @ 850 nm
- Power supply (V): 12
- Modulation frequency (MHz): 20, default
- Non-ambiguous range (meters): 7.5

MESA SR4000
- Distance resolution: 1% of range
- Distance accuracy (single pixel): +/- 1 cm (absolute accuracy)
- Frame rate: 54 fps max
- Outdoor: No
- Pixel array size: 176x144 (QCIF)
- Field of view (H x V): 43.6° x 34.6°
- Interface: USB 2.0, Ethernet
- Illumination power: 1 Watt @ 850 nm
- Power supply (V): 12
- Modulation frequency (MHz): 29 ÷ 31 (30, default)
- Non-ambiguous range (meters): 0.3 ÷ 5

PMDTechnologies PMD19k
- Distance resolution: 3% of range
- Distance accuracy (single pixel): 6 cm (std. dev.) at 2 m and 90% reflectivity
- Frame rate: 15 fps
- Outdoor: No
- Pixel array size: 160x120 (QQVGA)
- Field of view (H x V): 40° x 40°
- Interface: Firewire, Ethernet
- Illumination power: 3 Watt @ 870 nm
- Power supply (V): 9 … 18
- Modulation frequency (MHz): 20, default
- Non-ambiguous range (meters): 7.5

PMDTechnologies CamCube
- Distance resolution: 2% of range
- Distance accuracy (single pixel): 3 cm (std. dev.) at 2 m and 90% reflectivity
- Frame rate: 25 fps
- Outdoor: Yes
- Pixel array size: 204x204
- Field of view (H x V): 32° x 32°
- Interface: USB 2.0
- Illumination power: 3 Watt @ 870 nm
- Power supply (V): 12
- Modulation frequency (MHz): 20 ÷ 90 (20, default)
- Non-ambiguous range (meters): 0.3 ÷ 7

Canesta DPxxx
- Distance resolution: 2% of range
- Distance accuracy (single pixel): 0.6–20 cm (std. dev.) at 0.3–20 m
- Frame rate: 30 fps
- Outdoor: Yes
- Pixel array size: 64x64 / 120x160
- Field of view (H x V): (30° ÷ 88°) x (30° ÷ 114°)
- Interface: USB 1.1 / 2.0
- Illumination power: 1 Watt @ 785 nm / 5 Watt @ 870 nm
- Power supply (V): 5
- Modulation frequency (MHz): 13, 26, 52, 104
- Non-ambiguous range (meters): 0.3 ÷ 20

Table 1. Main features of the TOF cameras employed by the scientific community.

The PMDTechnologies cameras provide depth information as well as a grey-scale image of the scene. Currently, the PMD devices provide resolutions of 48x64 (model 13k) and 64x16 (model A2), with a non-ambiguity distance up to 40 m. Greater pixel resolution can be achieved with the PMD 19k (160x120) and with the CamCube model (204x204), with a non-ambiguity range up to 7.5 m. However, at present the only camera models referenced in scientific works are the 13k, the A2 and the 19k, employed in automotive and robotics applications [12, 19, 20, 21].

MESA Imaging AG has developed the SwissRanger TOF camera series: SR-2, SR3000 and SR4000. MESA SwissRanger cameras are employed in automotive, medical, biometrics, robotics and surveillance applications [22-27]. Since the SwissRanger SR3000 offers an excellent trade-off between pixel resolution and covered distance, it has been employed for the ARIDIS dataset acquisition. Table 1 summarizes the main characteristics of the discussed optical devices. Since the MESA SR3000 has been employed for the dataset acquisition, a more detailed description is given in the following [5]. The SR3000 camera provides two kinds of images at QCIF spatial resolution (a depth map and a grey-level image) at a rate of at least 15 fps (up to 25 fps, depending on the setting parameters). The camera works with an integrated, modulated near-infrared light source (the illuminator is composed of an array of 55 near-infrared LEDs). The emitted light is reflected by the objects in the scene and sensed by a pixel array realized in a specialized mixed CCD/CMOS process: the target depth is estimated by measuring the phase shift of the signal round trip from the device to the target and back. Moreover, attention must be paid to the Field of View (FOV) of the device: active range sensors normally exhibit a narrow FOV, so a pan-tilt solution could be used for monitoring large areas. A narrow-band infrared filter is used so that the depth map and intensity image are not affected by environmental illumination conditions (the camera is suitable for night-vision applications).

Figure 2. Unwanted effects for different acquisition configurations. Noise corrupts the measurement (a) when the Integration Time (IT) is too low (IT = 5 ms). Saturation effects (b) are due to a high IT setting (IT = 30 ms) and are limited (c) when the IT is correctly adjusted (IT = 20 ms). The effects of IT tuning are also visible in the acquired dataset (in (f) the intensity image): a 10 ms IT causes noise in the measurement (d), whereas a better tuning (IT = 20 ms) is shown in (e).

By using the default parameters, the SR camera is able to produce a depth map with good accuracy (greater than 99%) when the target object falls within the non-ambiguity range (up to 7.5 meters). When the camera-target distance exceeds the non-ambiguity range, aliasing effects appear (e.g. an object at 8 meters is "seen" at 0.5 meters), as shown in Fig. 2.d and Fig. 2.e, degrading only the depth estimation. Another important parameter is the Integration Time (IT), which must be tuned properly to limit saturation and noise effects. In particular, if the IT is too short (Fig. 2.a) the results are very noisy; if it is too long (Fig. 2.b) the results are smoothed and overflow starts to contaminate the measurements of objects close to the camera. The above discussion also applies to other TOF sensors using similar technologies, as they exhibit the same problems (non-ambiguity range, etc.). A 3D plot of the scene shown in Fig. 2.e is provided in Fig. 3.
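One simple, commonly used way to cope with the noise and saturation effects illustrated in Fig. 2 is to mask out pixels whose returned intensity falls outside a trusted interval before any further processing. The sketch below illustrates this; the threshold values are arbitrary placeholders, not values prescribed by the camera or by the dataset.

```python
import numpy as np

def reliability_mask(depth, intensity, low=0.02, high=0.98):
    """Boolean mask of pixels whose measurement can be trusted.

    Pixels with a very low returned intensity are dominated by noise
    (integration time too short or target too distant/dark), while
    pixels near the sensor's full-scale value are saturated
    (integration time too long or target too close/bright).
    `low` and `high` are fractions of full scale, to be tuned per sequence.
    """
    if intensity.dtype.kind == "u":
        full_scale = np.iinfo(intensity.dtype).max
    else:
        full_scale = intensity.max()
    frac = intensity.astype(np.float64) / float(full_scale)
    return (frac > low) & (frac < high) & np.isfinite(depth)
```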


3 Active Range Vision vs. Passive Vision


In this section we focus on the advantages and drawbacks of using an active vision system instead of a classical 2D/3D (passive) vision system. Table 2 summarizes the most important characteristics of TOF cameras in comparison with those of triangulation systems (passive stereo vision). The main advantage of RIM is that it describes a scene with more detailed information, since RIM provides both a depth map and an intensity image, which can be used at the same time in image processing applications. In particular, typical problems of passive vision (foreground camouflage, shadows, partial occlusions, etc.) can be overcome by using depth information, which is not affected by illumination conditions and object appearance. On the other hand, in several situations active RIM exhibits unwanted behaviours due to limitations of the specific 3D sensing technology (limited depth range due to aliasing, multi-path effects, object reflection properties). Note that depth information could also be obtained by using inexpensive passive stereo vision. However, this approach has high computational costs and fails when the scene is poorly textured or the illumination is insufficient; conversely, active vision provides depth maps even when the appearance is untextured and in all illumination conditions [28]. It is important to note that both the distance and the amplitude images delivered by the TOF camera have a number of systematic drawbacks that must be compensated. The main amplitude-related problem comes from the fact that the power of a wave decreases with the square of the distance it covers. Consequently, the light reflected by imaged objects decreases rapidly with the distance between object and camera: objects with the same reflectance located at different distances from the camera will appear with different amplitudes in the image. The benefits of TOF RIM in surveillance contexts are summarized in Table 3, whereas the drawbacks are emphasized in Table 4. The drawbacks are shown in Fig. 4 using typical frames from the ARIDIS dataset sequences.
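To make the amplitude image comparable across depths, a common correction rescales each pixel's amplitude by the square of its measured distance. The sketch below assumes exactly the inverse-square fall-off just described and ignores integration time and incidence angle; it is an illustration, not a calibrated compensation.

```python
import numpy as np

def compensate_amplitude(amplitude, depth, d_ref=1.0):
    """Rescale amplitudes so that equally reflective surfaces obtain
    similar values regardless of their distance from the camera.

    Assumes the received power falls off with the square of the distance,
    so amplitude * (depth / d_ref)**2 refers every pixel to a reference
    distance d_ref (in metres).
    """
    depth = np.maximum(depth, 1e-6)   # avoid division issues at zero depth
    return amplitude * (depth / d_ref) ** 2
```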


Depth resolution:
- TOF sensor: sub-centimetre (if conditions, e.g. on chromaticity, are satisfied).
- Stereo (passive) vision: sub-millimetre (if images are highly textured).

Spatial resolution:
- TOF sensor: medium (about QCIF, 144x176 pixels).
- Stereo (passive) vision: high (over 4CIF, 704x576 pixels).

Portability:
- TOF sensor: dimensions are the same as those of a normal camera.
- Stereo (passive) vision: two video cameras are needed, as well as an external light source.

Computational effort:
- TOF sensor: on-board FPGA for phase and intensity measurement.
- Stereo (passive) vision: high workload (the calibration step and the correspondence search process are demanding).

Cost:
- TOF sensor: high for a customizable prototype (1500€ - 5000€).
- Stereo (passive) vision: it depends on the quality of the stereo vision system.

Table 2. Comparison of important characteristics of TOF cameras and stereo vision systems.

Illumination conditions:
- TOF sensor: accurate depth measurement in all illumination conditions.
- Passive vision: sensitive to illumination variations and artificial lights; unable to operate in dark environments.

Shadow presence:
- TOF sensor: it does not affect the principal steps of monitoring applications.
- Passive vision: reduced performance in segmentation, recognition, etc.

Partial occlusions:
- TOF sensor: partially occluded objects are detected as separate (if they are at different depths).
- Passive vision: due to projective ambiguity, occlusions are difficult to detect (merging blobs) and to handle (predictive tracking strategies).

Object appearance:
- TOF sensor: camouflage is avoided, but appearance can affect depth precision (chromaticity dependence).
- Passive vision: camouflage effects arise when foreground and background present the same appearance properties.

Table 3. Advantages in the use of TOF sensors in surveillance contexts.

Aliasing: it affects the non-ambiguity range, i.e. the maximum achievable depth is reduced (up to 7.5 meters).

Multi-path effects: depth measurement is strongly corrupted when the target surface presents corners.

Object reflection properties: materials with different colours exhibit dissimilar reflection properties that affect the reflected light intensity and, therefore, the depth resolution.

Field of view: usually limited, so that accurate positioning of the sensor is needed; a pan-tilt architecture could be useful.

Table 4. Drawbacks in the use of TOF sensors in surveillance contexts.


Figure 4. The aliasing effect (a) appears in regions whose distance from the camera exceeds 7.5 m (the open door). Multi-path effects are apparent (a) in the regions where adjacent walls intersect. Moreover, the depth measurement is corrupted by object reflection properties (the LCD TV in (a)). Note that the corresponding intensity image (b) does not present the previously discussed problems.

In order to understand the advantages of using range imaging in surveillance, a qualitative comparison between intensity-based and depth-based segmentation is presented in Fig. 5, where the same well-known segmentation approach (Mixture-of-Gaussians-based background modelling and a Bayesian framework for segmentation) [39] is used. The better segmentation is achieved by using the depth image, whereas the same segmentation approach applied to the intensity image suffers from mimetic (camouflage) effects.
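As a rough stand-in for the segmentation pipeline of [39] (which is not reproduced here), the sketch below applies OpenCV's Gaussian-mixture background subtractor to a depth sequence. The depth scaling, MOG2 parameters and morphological clean-up are placeholders to be tuned; only the general background-modelling idea is illustrated.

```python
import cv2
import numpy as np

# Gaussian-mixture background model (OpenCV's MOG2), used here only as a
# stand-in for the background-modelling approach of [39].
mog = cv2.createBackgroundSubtractorMOG2(history=300, varThreshold=16,
                                         detectShadows=False)

def segment_depth_frame(depth_m, max_depth=7.5):
    """Foreground mask from a single depth frame (in metres).

    The depth map is clipped to the non-ambiguity range and rescaled to
    8 bits before being fed to the background model.
    """
    depth_8u = np.clip(depth_m / max_depth, 0.0, 1.0)
    depth_8u = (depth_8u * 255).astype(np.uint8)
    fg = mog.apply(depth_8u)
    # Small morphological opening removes isolated noisy pixels.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
    return cv2.morphologyEx(fg, cv2.MORPH_OPEN, kernel)
```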

Figure 5. Segmentation results (b, d) when the segmentation approach of [39] is applied to depth (a) and intensity (c) information, respectively. The better segmentation is achieved starting from the depth image, whereas the same segmentation approach applied to the intensity image suffers from mimetic effects (the sweater and the wall present the same brightness).


4 Dataset Description


In the visual surveillance context, several benchmark sequences and ground-truth data have been defined using traditional monocular (passive) vision systems (ETISEO [6], CAVIAR [7], PETS [8]). Datasets for the evaluation of range imaging algorithms on dynamic scenes are not yet available. The goal of ARIDIS is to define a common dataset that can be used to evaluate the performance of computer vision algorithms based on active RIM technologies in indoor scenes. The main idea is to investigate RIM approaches when the typical troublesome issues of traditional passive vision occur (changing illumination conditions, shadows, mimetic and untextured regions, occlusions, as described in Table 3). Moreover, the dataset is useful for the evaluation of active vision-based algorithms when the typical problems of active range imaging are present (aliasing, reflective surfaces, multi-path effects, as described in Table 4). Due to the limited FOV and the non-ambiguity range of the device, the benchmark sequences have been captured in small indoor environments (up to 5×5 meters).

The camera configuration employed in data collection is shown in Fig. 6, according to the following notation:

- L: covered room length (5.0 meters)
- H: distance camera/ground floor
- θ: camera tilt angle
- β: camera roll angle
- (x, y, z): camera coordinates

The camera is statically positioned in a wall-mounted configuration at height H from the ground floor, with tilt angle θ and roll angle β. For each collected sequence, a text file provides the parameters (H, β, θ). The dataset is composed of 61 sequences, each about 1800 frames long, captured at different frame rates and with different values of the most critical parameter (the Integration Time). In order to cover a large number of events, several persons have been captured with different orientations in the scene, different heights and body sizes, and differently coloured clothes.

Figure 6. Wall-mounted camera setup used in the data collection. The metric information is referred to the TOF camera coordinate axes (x, y, z) shown. The setup parameters (H, β, θ) are provided for each sequence collected in the ARIDIS dataset.
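As an illustration of how the per-sequence setup parameters could be used, the sketch below maps a 3D point from the camera frame to a room frame anchored on the floor below the camera. The axis and rotation conventions here are assumptions for the example only; the authoritative conventions are those of Fig. 6 and of the dataset documentation.

```python
import numpy as np

def camera_to_room(p_cam, H, tilt_deg, roll_deg):
    """Map a point from camera coordinates to a floor-anchored room frame.

    p_cam    : (3,) point in the camera frame (metres)
    H        : camera height above the floor (metres)
    tilt_deg : camera tilt angle (theta), here a rotation about the x axis
    roll_deg : camera roll angle (beta), here a rotation about the z axis
    The up axis and rotation order below are illustrative assumptions.
    """
    t, r = np.radians(tilt_deg), np.radians(roll_deg)
    Rx = np.array([[1, 0, 0],
                   [0, np.cos(t), -np.sin(t)],
                   [0, np.sin(t),  np.cos(t)]])
    Rz = np.array([[np.cos(r), -np.sin(r), 0],
                   [np.sin(r),  np.cos(r), 0],
                   [0, 0, 1]])
    p_room = Rx @ Rz @ np.asarray(p_cam, dtype=float)
    p_room[1] += H   # shift so that y = 0 lies on the floor (assumed up axis)
    return p_room
```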


Behaviour / Posture (sequences):
- Walking / Stand: 1, 2, 5, 6, 9, 10, 11, 12, 13, 14, 17, 18, 19, 20, 21, 22, 25, 26, 29, 30, 33, 34, 35, 36, 37, 38, 39, 40, 43, 44, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61
- Falling / Lying down: 11, 12, 13, 14, 43, 44, 51, 52, 53, 54, 55, 56, 58, 60, 61
- Limping: 19, 20, 59
- Sitting / Sit: 21, 22, 33, 34, 37, 38, 39, 40, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57
- Crouching / Squat: 19, 20, 35, 36, 37, 38, 53, 54, 55
- Object moving: 17, 18, 21, 22, 25, 26, 29, 30, 33, 34, 39, 40, 47, 48, 49, 50, 55, 56, 57, 58, 59
- Bending / Bent: 19, 20, 35, 36, 37, 38, 43, 44, 51, 52, 53, 54, 55, 56, 57

Table 5. Behaviours and postures collected in the ARIDIS sequences.

The sequences present typical indoor situations, such as walking people (one or more people in the scene), persons assuming different postures (standing, sitting, lying down, bent, squatting) and persons exhibiting different behaviours (walking, falling down, limping, sitting, crouching, moving objects, bending).

Furthermore, the dataset acquisition has been carried out under different occlusion conditions: without occlusions, with static occlusions (an object occluding a person, denoted as object-person occlusion) and with dynamic occlusions (two or more people in the scene occluding each other, denoted as person-person occlusion) [36, 37, 38]. Tables 5 and 6 report the sequences in the ARIDIS dataset grouped by behaviour/posture and by occlusion type, respectively. Other useful parameters can be found in the documentation downloadable together with the dataset [33], such as the Integration Time of the device, the camera setup parameters, the number of persons in the scene and the kind of objects present in the scene for each sequence. A background modelling sequence is also provided when the background of the scene changes. Fig. 7 shows keyframes corresponding to the collected posture/behaviour typologies. The proposed dataset does not include ground-truth data for segmentation and tracking. Future work will address the definition of ground truth to allow a quantitative performance comparison of algorithms.

Occlusions (sequences):
- None: 1, 2, 11, 12, 19, 20, 60, 61
- Object-Person: 5, 6, 17, 18, 21, 22, 25, 26, 29, 30, 33, 34, 35, 36, 37, 38, 39, 40, 43, 44, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59
- Person-Person: 9, 10, 13, 14, 43, 44

Table 6. ARIDIS sequences grouped by occlusion type (if any).
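For convenience, the groupings of Tables 5 and 6 can be transcribed into a small lookup structure to select evaluation subsets. The sketch below is only an example: the sequence numbers come from the tables above, while the structure and helper function are not part of the dataset distribution.

```python
# Sequence groupings transcribed from Tables 5 and 6 (excerpt only).
OCCLUSIONS = {
    "none":          [1, 2, 11, 12, 19, 20, 60, 61],
    "person-person": [9, 10, 13, 14, 43, 44],
}
BEHAVIOURS = {
    "limping":   [19, 20, 59],
    "crouching": [19, 20, 35, 36, 37, 38, 53, 54, 55],
}

def select_sequences(behaviour, occlusion):
    """Sequences showing a given behaviour under a given occlusion condition."""
    return sorted(set(BEHAVIOURS[behaviour]) & set(OCCLUSIONS[occlusion]))

# Example: crouching sequences recorded without occlusions.
print(select_sequences("crouching", "none"))  # -> [19, 20]
```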


Figure 7. Keyframes of the collected postures and behaviours (depth and intensity images). a) Object-person occlusion; b) Person-person occlusion; c) Walking; d) Falling down; e) Sitting; f) Crouching; g) Object moving; h) Bending. Postures and behaviours are collected in scenes with chairs (i), a desk table, a chair and a stool (j), a large table, chairs and a stool (k), and a vertical desk (l).


5 Conclusions


In indoor surveillance applications, range images provide a better perception of the scene in all illumination conditions, avoiding the use of cheap stereo systems, which fail in dark or low-textured environments. If the critical parameters of the TOF sensor are properly adjusted, reliable, computationally inexpensive and real-time segmentation/tracking can be realized using only the depth measurements, since intensity images present unwanted fluctuations. Depth information overcomes projective ambiguity, whereas the intensity image provides appearance information, so that their joint use could improve critical steps (object recognition, behaviour analysis, etc.), allowing a better description of moving objects. The proposed dataset provides a common basis for investigating vision algorithms; it can be improved by defining ground-truth data to quantify performance. Future work will address the definition of ground truth to allow quantitative performance analysis.

References

[1] Workshop on Time of Flight Camera based Computer Vision (TOF-CV), IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2008), Anchorage, Alaska (USA), June 2008.

[2] Workshop on Dynamic 3D Imaging, 31st Annual Symposium of the German Association for Pattern Recognition (DAGM 2009), Jena (Germany), September 2009.

[3] ARTTS EU Project, http://www.artts.eu/ Last accessed 2 November, 2009.

[4] NETCARITY EU Project, http://www.netcarity.org/ Last accessed 2 November, 2009.

[5] T. Oggier, M. Lehmann, R. Kaufmann, M. Schweizer, M. Richter, P. Metzler, G. Lang, F. Lustenberger, N. Blanc, An all-solid-state optical range camera for 3D real-time imaging with sub-centimeter depth resolution (SwissRanger), Proc. SPIE Vol. 5249, pp. 534-545, 2003.

[6] A.T. Nghiem, F. Bremond, M. Thonnat, V. Valentin, ETISEO, performance evaluation for video surveillance systems, Proc. IEEE AVSS 2007, pp. 476-481, 2007.

[7] The CAVIAR website, http://homepages.inf.ed.ac.uk/rbf/CAVIAR/ Last accessed 2 November, 2009.

[8] http://pets2006.net Last accessed 2 November, 2009.

[9] http://marathon.csee.usf.edu/range/raw-image-data Last accessed 2 November, 2009.

[10] O. Gallo, R. Manduchi, A. Rafii, Robust Curb and Ramp Detection for Safe Parking Using the Canesta TOF Camera, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), Anchorage (Alaska), June 2008.

[11] S. Acharya, C. Tracey, A. Rafii, System design of time-of-flight range camera for car park assist and backup application, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), Anchorage (Alaska), June 2008.


[12] T. Ringbeck, B. Hagebeuker, A 3D Time of Flight Camera for Object Detection, Optical 3-D Measurement Techniques, ETH Zürich, 2007.

[13] W. Karel, Integrated Range Camera Calibration Using Image Sequences from Hand-Held Operation, XXIst Congress of the International Society for Photogrammetry and Remote Sensing (ISPRS 2008), Beijing (China), July 2008.

[14] B. Buttgen, P. Seitz, Robust Optical Time-of-Flight Range Imaging Based on Smart Pixel Structures, IEEE Transactions on Circuits and Systems, Vol. 55, Issue 6, pp. 1512-1525, July 2008.

[15] ARTTS 3D-TOF Database, http://www.artts.eu/publications/3d_tof_db Last accessed 2 November, 2009.

[16] A. Birk, K. Pathak, S. Schwertfeger, W. Chonnaparamutt, The IUB Rugbot: an intelligent, rugged mobile robot for search and rescue operations, International Workshop on Safety, Security and Rescue Robotics (SSRR), IEEE Press, 2006.

[17] A. Birk, S. Markov, I. Delchev, K. Pathak, Autonomous Rescue Operations on the IUB Rugbot, International Workshop on Safety, Security and Rescue Robotics (SSRR), IEEE Press, 2006.

[18] Jacobs University, Robotics group, http://robotics.iu-bremen.de Last accessed 2 November, 2009.

[19] S. Hussmann, T. Liepert, Three-Dimensional TOF Robot Vision System, IEEE Transactions on Instrumentation and Measurement, accepted for future publication.

[20] C. Joochim, H. Roth, Development of a 3D mapping using 2D/3D sensors for mobile robot locomotion, IEEE International Conference on Technologies for Practical Robot Applications (TePRA 2008), Nov. 2008.

[21] T. Ringbeck, B. Hagebeuker, A 3D Time of Flight Camera for Object Detection, Optical 3-D Measurement Techniques, ETH Zürich.

[22] J.T. Thielemann, G.M. Breivik, A. Berge, Pipeline Landmark Detection for Autonomous Robot Navigation using Time-of-Flight Imagery, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.

[23] S.A. Gudmundsson, R. Larsen, H. Aanaes, M. Pardas, J.R. Casas, TOF Imaging in Smart Room Environments towards Improved People Tracking, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.

[24] M.B. Holte, T.B. Moeslund, P. Fihl, Fusion of Range and Intensity Information for View Invariant Gesture Recognition, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.

[25] Youding Zhu, B. Dariush, K. Fujimura, Controlled human pose estimation from depth image streams, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.


[26] J. Penne, C. Schaller, J. Hornegger, T. Kuwert, Robust real-time 3D respiratory motion detection using time-of-flight cameras, International Journal of Computer Assisted Radiology and Surgery, pp. 427-431, July 2008.

[27] T. Oggier, B. Büttgen, F. Lustenberger, G. Becker, B. Rüegg, A. Hodac, SwissRanger SR3000 and first experiences based on miniaturized 3D-TOF cameras, Proceedings of the 1st Range Imaging Research Day, Zurich, Switzerland, 2005.

[28] Z. Hu, T. Kawamura, K. Uchimura, A Performance Review of 3D TOF Vision Systems in Comparison to Stereo Vision Systems, Stereo Vision, InTech Education and Publishing, November 2008.

[29] A. Kolb, E. Barth, R. Koch, ToF-Sensors: New Dimensions for Realism and Interactivity, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW 2008), June 2008.

[30] Canesta Inc., www.canesta.com Last accessed 2 November, 2009.

[31] PMDTechnologies GmbH, www.pmdtec.com Last accessed 2 November, 2009.

[32] MESA Imaging AG, www.mesa-imaging.ch Last accessed 2 November, 2009.

[33] SIPLab, IMM CNR, http://siplab.le.imm.cnr.it/3D/index.htm Last accessed 2 November, 2009.

[34] H. Rapp, Experimental and Theoretical Investigation of Correlating TOF-Camera Systems, Master Thesis, Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, September 2007.

[35] B. Büttgen, T. Oggier, M. Lehmann, R. Kaufmann, F. Lustenberger, CCD/CMOS lock-in pixel for range imaging: Challenges, limitations and state-of-the-art, 1st Range Imaging Research Day, ETH, Zurich, Switzerland, pp. 21-32, Sep. 2005.

[36] C.J. Lin, J.G. Wang, Human Body Posture Classification Using a Neural Fuzzy Network based on Improved Particle Swarm Optimization, 8th International Symposium on Advanced Intelligent Systems, Sokcho-City, Korea, pp. 414-419, Sep. 5-8, 2007.

[37] Nooritawati Md Tahir, Aini Hussain, Salina Abdul Samad, Hafizah Husain, Posture recognition using correlation filter classifier, Journal of Theoretical and Applied Information Technology, 2008.

[38] B.B. Francois, F. Bremond, M. Thonnat, I. S. Antipolis, Human Posture Recognition in Video Sequence, Joint IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance (VS-PETS 2003), pp. 23-29, 2003.

[39] D. Lee, Effective Gaussian Mixture Learning for Video Background Subtraction, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 5, pp. 827-832, 2005.

[40] COGAIN Network of Excellence, http://www.cogain.org Last accessed 2 November, 2009.

[41] G. Passalis, I.A. Kakadiaris, T. Theoharis, G. Toderici, T. Papaioannou, Towards fast 3D ear recognition for real-life biometric applications, IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS 2007), pp. 39-44, 2007. (http://vislab.ucr.edu) Last accessed 2 November, 2009.
