Measuring Speaking Time from Privacy-Preserving Videos

Shun Maeda, Chunzhi Gu, Chao Zhang*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The ongoing COVID-19 pandemic is affecting many aspects of daily life, including restrictions on conversation time. A vision-based face analysis system is one option for measuring and managing per-person speaking time; however, pointing a camera directly at people would be offensive and intrusive. In addition, privacy-sensitive content such as the speakers' identifiable faces should not be recorded during measurement. In this paper, we adopt a deep multimodal clustering method, called DMC, to perform unsupervised audiovisual learning that matches preprocessed audio with the corresponding locations in the video. We place the camera above the speakers, and by feeding a pair of captured audio and visual data to a pre-trained DMC, we generate a series of heatmaps that identify the location of the person speaking. Finally, the speaking time of each speaker is measured by accumulating the speaking durations indicated by the corresponding heatmap.
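The final accumulation step can be illustrated with a minimal sketch. It is not the authors' implementation: the fixed rectangular speaker regions, the activation threshold, the per-frame duration, and all function names below are assumptions introduced only for illustration, and the heatmaps are taken as already produced by some upstream model.

```python
from typing import Dict, Sequence, Tuple

Heatmap = Sequence[Sequence[float]]
# Hypothetical region format: (row_start, row_end, col_start, col_end), ends exclusive.
Region = Tuple[int, int, int, int]


def region_peak(heatmap: Heatmap, region: Region) -> float:
    """Return the highest activation inside one speaker's region."""
    r0, r1, c0, c1 = region
    return max(heatmap[r][c] for r in range(r0, r1) for c in range(c0, c1))


def accumulate_speaking_time(
    heatmaps: Sequence[Heatmap],
    regions: Dict[str, Region],
    frame_duration: float = 1.0,  # assumed seconds covered by each heatmap
    threshold: float = 0.5,       # assumed activation level that counts as speaking
) -> Dict[str, float]:
    """Sum, per speaker, the duration of frames whose region is active."""
    totals = {name: 0.0 for name in regions}
    for heatmap in heatmaps:
        for name, region in regions.items():
            if region_peak(heatmap, region) >= threshold:
                totals[name] += frame_duration
    return totals


if __name__ == "__main__":
    # Toy 2x4 heatmaps from an overhead view: speaker A on the left, B on the right.
    left_active = [[0.9, 0.2, 0.0, 0.0], [0.3, 0.1, 0.0, 0.0]]
    right_active = [[0.0, 0.0, 0.1, 0.8], [0.0, 0.0, 0.2, 0.4]]
    silent = [[0.0, 0.1, 0.0, 0.1], [0.0, 0.0, 0.0, 0.0]]
    regions = {"A": (0, 2, 0, 2), "B": (0, 2, 2, 4)}
    frames = [left_active, left_active, right_active, silent]
    print(accumulate_speaking_time(frames, regions))
```

Because the camera looks down from above, each speaker is represented here only by a seat region rather than by a face, which is one simple way to keep identities out of the bookkeeping.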

Original language: English
Title of host publication: International Workshop on Advanced Imaging Technology, IWAIT 2022
Editors: Masayuki Nakajima, Shogo Muramatsu, Jae-Gon Kim, Jing-Ming Guo, Qian Kemao
Publisher: SPIE
ISBN (Electronic): 9781510653313
DOIs
State: Published - 2022
Event: 2022 International Workshop on Advanced Imaging Technology, IWAIT 2022 - Hong Kong, China
Duration: 2022/01/04 → 2022/01/06

Publication series

Name: Proceedings of SPIE - The International Society for Optical Engineering
Volume: 12177
ISSN (Print): 0277-786X
ISSN (Electronic): 1996-756X

Conference

Conference: 2022 International Workshop on Advanced Imaging Technology, IWAIT 2022
Country/Territory: China
City: Hong Kong
Period: 2022/01/04 → 2022/01/06

Keywords

  • Deep Multi-modal Clustering
  • Speaking time measurement
  • Video-based sound source localization

ASJC Scopus subject areas

  • Electronic, Optical and Magnetic Materials
  • Condensed Matter Physics
  • Computer Science Applications
  • Applied Mathematics
  • Electrical and Electronic Engineering
