Acoustic Object Localization
For the audio localization, recordings from Zurich were evaluated. Because of the special evaluation scheme, only hit and miss counts are available. The detailed results are listed in the table below. For 87.8% of the 3756 frames (3297 hits), a localized position matches the annotated position of a speaking person.
Hits | Misses |
3297 | 459 |
Audio-visual Recording | Hits | Misses |
22_17Mar2010_Zurich_living_lab | | |
scene_01_woman_telephone_light_take_a | 435 | 39 |
scene_01_woman_telephone_take_c | 394 | 32 |
scene_01_woman_telephone_take_d | 367 | 71 |
scene_02_knocking_light_take_c | 39 | 33 |
scene_02_knocking_take_a | 42 | 48 |
scene_02_knocking_take_b | 6 | 30 |
scene_07_fab_standup_talkshimself_light_take_d | 261 | 15 |
scene_07_fab_standup_talkshimself_light_take_e | 225 | 45 |
scene_07_fab_standup_talkshimself_take_b | 280 | 8 |
scene_07_fab_standup_talkshimself_take_c | 291 | 3 |
scene_16_fab_oov_couch_light_take_c | 165 | 15 |
scene_16_fab_oov_couch_take_e | 194 | 16 |
scene_16_fab_oov_couch_take_f | 176 | 16 |
scene_20_fab_hits_limping_speech_light_take_a | 90 | 18 |
scene_20_fab_hits_limping_speech_light_take_b | 87 | 21 |
scene_20_fab_hits_limping_speech_take_c | 84 | 12 |
scene_20_woman_hits_limping_speech_light_take_f | 49 | 11 |
scene_20_woman_hits_limping_speech_take_d | 45 | 15 |
scene_20_woman_hits_limping_speech_take_e | 67 | 11 |
Sum | 3297 | 459 |
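
The overall hit rate can be recomputed from the per-recording counts. The following is a minimal sketch, assuming the first numeric column holds hits and the second holds misses (this matches the totals of 3297 and 459). The names `COUNTS` and `hit_rate` are illustrative and not part of the original evaluation tooling.

```python
# (hits, misses) per recording, copied from the table above.
# Assumption: first numeric column = hits, second = misses.
COUNTS = {
    "scene_01_woman_telephone_light_take_a": (435, 39),
    "scene_01_woman_telephone_take_c": (394, 32),
    "scene_01_woman_telephone_take_d": (367, 71),
    "scene_02_knocking_light_take_c": (39, 33),
    "scene_02_knocking_take_a": (42, 48),
    "scene_02_knocking_take_b": (6, 30),
    "scene_07_fab_standup_talkshimself_light_take_d": (261, 15),
    "scene_07_fab_standup_talkshimself_light_take_e": (225, 45),
    "scene_07_fab_standup_talkshimself_take_b": (280, 8),
    "scene_07_fab_standup_talkshimself_take_c": (291, 3),
    "scene_16_fab_oov_couch_light_take_c": (165, 15),
    "scene_16_fab_oov_couch_take_e": (194, 16),
    "scene_16_fab_oov_couch_take_f": (176, 16),
    "scene_20_fab_hits_limping_speech_light_take_a": (90, 18),
    "scene_20_fab_hits_limping_speech_light_take_b": (87, 21),
    "scene_20_fab_hits_limping_speech_take_c": (84, 12),
    "scene_20_woman_hits_limping_speech_light_take_f": (49, 11),
    "scene_20_woman_hits_limping_speech_take_d": (45, 15),
    "scene_20_woman_hits_limping_speech_take_e": (67, 11),
}


def hit_rate(hits: int, misses: int) -> float:
    """Return the fraction of evaluated frames counted as hits."""
    total = hits + misses
    return hits / total if total else 0.0


# Totals over all recordings.
total_hits = sum(h for h, _ in COUNTS.values())
total_misses = sum(m for _, m in COUNTS.values())
overall = hit_rate(total_hits, total_misses)

# Recording with the lowest per-recording hit rate.
worst = min(COUNTS, key=lambda name: hit_rate(*COUNTS[name]))

print(f"frames: {total_hits + total_misses}, hit rate: {overall:.1%}")
print(f"lowest hit rate: {worst} ({hit_rate(*COUNTS[worst]):.1%})")
```

Computing the rate per recording as well as overall makes it easier to see which recordings contribute most of the misses.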