Audio-Visual Speaker Change Detection

Speaker change detection provides valuable information for speaker identification
and serves as metadata for search and retrieval of multimedia content. Audio-based
detection can lack robustness because of mismatches between training and test
conditions, such as changes in acoustic channel and background noise.
This research focuses on exploiting visual speaker and scene change information
to overcome the limitations of audio-based speaker change detection.

Key component technologies:

  • Visual scene/speaker change detection
  • Audio-based speaker change detection
  • Fusion techniques
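
As an illustration of how the three components could fit together, the sketch below shows one simple possibility: score-level (late) fusion. It is not the method used in this research. It assumes that the audio and visual detectors each produce a per-frame change score in [0, 1] on a common time grid. The names fuse_scores, detect_changes and w_audio, the weight value, and the threshold are all hypothetical.

```python
from typing import List


def fuse_scores(audio: List[float], visual: List[float],
                w_audio: float = 0.6) -> List[float]:
    """Weighted sum of per-frame change scores (assumed to lie in [0, 1])."""
    if len(audio) != len(visual):
        raise ValueError("audio and visual scores must be time-aligned")
    return [w_audio * a + (1.0 - w_audio) * v for a, v in zip(audio, visual)]


def detect_changes(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Return indices of local maxima whose fused score reaches the threshold."""
    changes = []
    for i, s in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if s >= threshold and s > left and s >= right:
            changes.append(i)
    return changes


# Hypothetical scores: the audio detector is weak at frame 5 (e.g. noise),
# but the visual detector sees a clear change there.
audio_scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.4, 0.2]
visual_scores = [0.0, 0.1, 0.8, 0.2, 0.0, 0.9, 0.1]

fused = fuse_scores(audio_scores, visual_scores)
change_frames = detect_changes(fused)
print(change_frames)
```

In this made-up example the audio scores alone would miss the change at frame 5 with a threshold of 0.5, while the fused scores recover it. A real system would need to learn or tune the weight and threshold on held-out data.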

Paper: