Using deep features for video scene detection and annotation

Stanislav Protasov*, Adil Mehmood Khan, Konstantin Sozykin, Muhammad Ahmad

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

38 Scopus citations

Abstract

The semantic video indexing problem remains underexplored. Solutions to it would significantly enrich video search, monitoring, and surveillance. This paper concerns scene detection and annotation, and specifically the task of mining video structure for video indexing using deep features. It proposes and implements a pipeline consisting of feature extraction and filtering, shot clustering, and labeling stages, with a deep convolutional network serving as the source of the features. The pipeline is evaluated with metrics for both scene detection and annotation, and the results show high scene detection and annotation quality across these metrics. Additionally, we provide an overview and analysis of contemporary segmentation and annotation metrics. The outcome of this work can be applied to semantic video annotation in real time.
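
The abstract outlines a three-stage pipeline (deep feature extraction and filtering, shot clustering, labeling). The following is a minimal sketch of the feature-extraction and shot-clustering stages only, assuming a pretrained ResNet-50 backbone, one representative keyframe per shot, and agglomerative clustering with an arbitrary distance threshold; these choices, and the omission of the filtering and labeling stages, are illustrative assumptions rather than the authors' exact configuration.

import torch
import torchvision.transforms as T
from torchvision.models import resnet50
from sklearn.cluster import AgglomerativeClustering

# Standard ImageNet preprocessing for the pretrained backbone
preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_shot_features(keyframes):
    """Return one deep feature vector per shot keyframe (list of PIL images)."""
    model = resnet50(weights="IMAGENET1K_V1")  # pretrained CNN as the feature source
    model.fc = torch.nn.Identity()             # drop the classifier head, keep pooled features
    model.eval()
    with torch.no_grad():
        batch = torch.stack([preprocess(img) for img in keyframes])
        return model(batch).numpy()            # shape: (num_shots, 2048)

def cluster_shots_into_scenes(features, distance_threshold=60.0):
    """Group shot features into scene labels; the threshold is an illustrative value."""
    clustering = AgglomerativeClustering(
        n_clusters=None, distance_threshold=distance_threshold)
    return clustering.fit_predict(features)    # scene id per shot

Note that plain agglomerative clustering ignores temporal order; a practical scene detector would additionally constrain merges to temporally adjacent shots.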

Original language: English
Pages (from-to): 991-999
Number of pages: 9
Journal: Signal, Image and Video Processing
Volume: 12
Issue number: 5
DOIs
State: Published - 1 Jul 2018
Externally published: Yes

Bibliographical note

Publisher Copyright:
© 2018, Springer-Verlag London Ltd., part of Springer Nature.

Keywords

  • Deep convolutional networks
  • Image recognition
  • Scene detection
  • Semantic mining

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
