Technical Setup
Recording Platform
- A custom-built recording platform was used.
- The system, called "AWEAR II", consists of two cameras and four microphones and can be powered by rechargeable batteries or mains power.
- It can be controlled via a Wi-Fi-connected netbook.
>>Detailed information<<
Data Format
- Portable Network Graphics (PNG) format is used for visual data.
- Audio data are stored as WAV files with a 48 kHz sampling rate and 32-bit resolution.
- Optionally, the audio-visual scene is rendered as a preview video using one or both camera signals and the front stereo microphone pair.
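A downloaded audio file can be checked against the stated format (48 kHz, 32 bit) with a few lines of Python. This is a minimal sketch, not part of the dataset tooling: the function name `check_wav_format` is hypothetical, and it assumes the files are integer PCM, which is what Python's standard `wave` module reads. If the 32-bit samples are stored as floating point, `wave.open` may raise `wave.Error` and a different reader would be needed.

```python
import wave

EXPECTED_RATE_HZ = 48_000      # sampling rate stated in the format description
EXPECTED_SAMPLE_BYTES = 4      # 32-bit resolution = 4 bytes per sample


def check_wav_format(path):
    """Read the WAV header and compare it with the documented format.

    Returns (ok, details), where details holds the values found in the header.
    Assumes integer PCM data (an assumption, not stated in the description).
    """
    with wave.open(path, "rb") as wav:
        details = {
            "channels": wav.getnchannels(),
            "sample_rate_hz": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }
    ok = (
        details["sample_rate_hz"] == EXPECTED_RATE_HZ
        and details["sample_width_bytes"] == EXPECTED_SAMPLE_BYTES
    )
    return ok, details
```

The number of channels is reported but not checked, because the description does not say how the four microphone signals are distributed across files.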
>>Detailed information<<
Metadata Description
- The metadata, i.e. the detector output and the ground truth annotations, are available as a zip file for each audio-visual sequence. The name of the sequence contains a 5-bit key that indicates which metadata files are present in the zip file:
- 1st bit: detector_SnS.mat
- 2nd bit: detector_DoA.mat
- 3rd bit: detector_TT.mat
- 4th bit: groundtruth_SnS.lab
- 5th bit: groundtruth_TT.txt
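The bit-to-file mapping above can be turned into a small lookup. The sketch below rests on several assumptions that the description does not confirm: the key is handled as a string of five `0`/`1` characters, the 1st bit is the leftmost character, and `1` means the file is present. How the key is embedded in the sequence name is not specified here, so extracting it is left to the caller. The function names `decode_metadata_key` and `missing_from_zip` are hypothetical.

```python
import zipfile

# File order follows the bit order listed above (1st bit first).
METADATA_FILES = (
    "detector_SnS.mat",
    "detector_DoA.mat",
    "detector_TT.mat",
    "groundtruth_SnS.lab",
    "groundtruth_TT.txt",
)


def decode_metadata_key(key):
    """Return the metadata file names that a 5-bit key marks as present.

    Assumes the leftmost character is the 1st bit and '1' means present.
    """
    if len(key) != len(METADATA_FILES) or set(key) - {"0", "1"}:
        raise ValueError(f"expected five characters of 0/1, got {key!r}")
    return [name for bit, name in zip(key, METADATA_FILES) if bit == "1"]


def missing_from_zip(key, zip_path):
    """List files announced by the key that are absent from the zip archive."""
    expected = decode_metadata_key(key)
    with zipfile.ZipFile(zip_path) as archive:
        # Compare base names only, in case files sit inside a subfolder.
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return [name for name in expected if name not in present]
```

Under these assumptions, the key `10110` would decode to `detector_SnS.mat`, `detector_TT.mat`, and `groundtruth_SnS.lab`.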
>>Detailed information<<
Ground Truth Audio
- Ground truth annotations were generated for the database of audio-visual recordings.
- These include labels for speech/non-speech discrimination and acoustic object detection, as well as ground truth labels for acoustic object localization.
- A semi-supervised tool was developed to annotate the data efficiently. This tool provides output interfaces for MATLAB, Excel, and the HTK speech recognition toolkit.
>>Detailed information<<
Ground Truth Video
- Ground truth annotations were generated for the database of audio-visual recordings.
- These include binary labels for human movement analysis and interpretation, as well as the absolute positions, in pixels, of body parts within a video frame.
- A semi-supervised tool was developed to annotate the data efficiently. This tool provides a *.txt output interface for easy import and further processing.
>>Detailed information<<
Label
- Each audio-visual recording contains a label file that gives the name of the recording and the scene, the date, the location, the device used, the frame rate, a placeholder for comments, and a detailed description of the scene.
- All labels were created manually, either entirely by hand or with custom-made semi-supervised tools.
>>Detailed information<<