Understanding and Predicting
Interestingness of Videos
Yu-Gang Jiang, Yanran Wang, Rui Feng
Xiangyang Xue, Yingbin Zheng, Hanfang Yang
Fudan University, Shanghai, China
AAAI 2013, Bellevue, USA, July 2013
Motivation
• Large number of videos on the Internet
– Consumer videos, advertisements…
• Some videos are interesting, while many are not
More interesting
Less interesting
Two Advertisements of digital products
Applications
• Web Video Search
• Recommendation System
• ...
Related Work
• Predicting Aesthetics and Interestingness of Images
– Datta et al. ECCV, 2006; Dhar et al. CVPR, 2011; Murray et al. CVPR, 2012…
More interesting
Less interesting
• We are the first to explore the interestingness of videos
Two New Datasets
• Flickr
– source: Flickr.com Consumer Video
– videos: 1200 (20 hrs in total)
• YouTube
– source: YouTube.com Advertisement Video
– videos: 420 (4.2 hrs in total)
Flickr Dataset
• Collected by 15 interestingness-enabled queries
– Per query: top 10% of the 400 returned videos taken as interesting; bottom 10% as uninteresting
– 80 videos per category/query
YouTube Dataset
• Collected by 15 ads queries on YouTube
• 10 human assessors (5 females, 5 males)
– Compare video pairs
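The slides do not say how the ten assessors' pairwise judgments were consolidated. As a minimal sketch, assuming a simple majority vote per video pair (an assumption, not the authors' stated procedure), the comparisons could be reduced to one preference per pair as follows. The function name and the tuple layout are hypothetical.

```python
from collections import Counter


def majority_preferences(judgments):
    """Reduce assessor votes to one preference per video pair.

    judgments: iterable of (video_a, video_b, winner) tuples, one per vote.
    Returns a dict mapping the sorted pair to the majority winner.
    Pairs with tied votes are left out (an assumption of this sketch).
    """
    votes = {}
    for a, b, winner in judgments:
        if winner not in (a, b):
            raise ValueError("winner must be one of the two compared videos")
        pair = tuple(sorted((a, b)))
        votes.setdefault(pair, Counter())[winner] += 1

    preferences = {}
    for pair, counter in votes.items():
        first, second = pair
        if counter[first] != counter[second]:
            preferences[pair] = first if counter[first] > counter[second] else second
    return preferences
```

The resulting preferences are the kind of pairwise labels a ranking model can be trained on.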
General observation: videos with humorous stories, attractive
background music, or better professional editing tend to be
more interesting
Annotation Interface
Our Computational Framework
• Aim: compare two videos and tell which is more
interesting
• Pipeline (applied to the two videos being compared): visual features, audio features, and high-level attribute features → multi-modal fusion → Ranking SVM → results
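The fusion step is not detailed on this slide. A minimal sketch, assuming kernel-level fusion with equal weights (an assumption), is to compute one kernel value per feature and average them. The deck names chi-square kernels for histogram-like features and RBF kernels for the others; the chi-square form below is one common definition, and the feature names and `gamma` values are hypothetical.

```python
import math


def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel for non-negative histogram features."""
    distance = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * distance)


def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel on the squared Euclidean distance."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def fused_kernel(feats_x, feats_y, kernels):
    """Average the per-feature kernel values (equal weights, assumed).

    feats_x, feats_y: dicts mapping feature name -> feature vector.
    kernels: dict mapping feature name -> kernel function for that feature.
    """
    values = [kernel(feats_x[name], feats_y[name]) for name, kernel in kernels.items()]
    return sum(values) / len(values)
```

A call such as `fused_kernel(video_a, video_b, {"hist": chi2_kernel, "other": rbf_kernel})` yields a single similarity value that a kernel-based ranker could consume.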
Features
• Visual features
– Color Histogram, SIFT, HOG, SSIM, GIST
• Audio features
– MFCC, Spectrogram SIFT, Audio-Six
• High-level attribute features
– Classemes, ObjectBank: Flower, Tree, Cat, Face…
– Style: Rule of Thirds, Vanishing Point, Soft Focus, Motion Blur, Shallow DOF, …
Prediction & Evaluation
• Prediction
– Ranking SVM trained on our dataset
• Chi square kernel for histogram-like features
• RBF kernel for the others
– 2/3 for training and 1/3 for testing
• Evaluation
– Prediction accuracy
• The percentage of correctly ranked test video pairs
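The pairwise idea behind the ranking model and the accuracy measure can be sketched as follows. This is a simplified linear stand-in trained by subgradient descent on a pairwise hinge loss, not the kernelized Ranking SVM used in the talk; the hyperparameters and function names are hypothetical.

```python
import random


def train_linear_rank_svm(pairs, dim, epochs=200, lr=0.1, reg=0.01, seed=0):
    """Learn w so that w.x_more exceeds w.x_less by a margin.

    pairs: list of (x_more, x_less) feature vectors, where x_more belongs
    to the video judged more interesting.
    """
    rng = random.Random(seed)
    weights = [0.0] * dim
    # Ranking reduces to classifying difference vectors as positive.
    diffs = [[a - b for a, b in zip(x_more, x_less)] for x_more, x_less in pairs]
    for _ in range(epochs):
        rng.shuffle(diffs)
        for diff in diffs:
            margin = sum(w * d for w, d in zip(weights, diff))
            violated = margin < 1.0
            for i in range(dim):
                grad = reg * weights[i] - (diff[i] if violated else 0.0)
                weights[i] -= lr * grad
    return weights


def score(weights, x):
    """Predicted interestingness score of one video."""
    return sum(w * v for w, v in zip(weights, x))


def pairwise_accuracy(weights, test_pairs):
    """Percentage of test pairs whose order is predicted correctly."""
    correct = sum(
        1 for x_more, x_less in test_pairs
        if score(weights, x_more) > score(weights, x_less)
    )
    return 100.0 * correct / len(test_pairs)
```

Under the 2/3 to 1/3 split described above, the model would be fit on pairs from the training videos and `pairwise_accuracy` reported on pairs from the held-out videos.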
Visual Feature Results
• Prediction accuracies (%), bar chart; bar values as extracted:
– Flickr: 74.5, 74.2, 76.6
– YouTube: 67.0, 67.1, 68.0
Audio Feature Results
• Prediction accuracies (%), bar chart; bar values as extracted:
– Flickr: 74.7, 76.4
– YouTube: 64.8, 65.7
Attribute Feature Results
• Prediction accuracies (%), bar chart; bar values as extracted:
– Flickr: 74.8, 64.5
– YouTube: 56.8, 64.3
• Different from predicting image interestingness
Visual+Audio+Attribute Results
• Prediction accuracies (%), bar chart; legend: Visual, Audio, Attribute, Visual+Audio, Visual+Audio+Attribute
– Flickr: 78.6 and 76.6, annotated 2.6%
– YouTube: 68.0 and 71.7, annotated 5.4%
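The annotated percentages are consistent with relative gains, assuming the lower value in each pair (76.6 on Flickr, 68.0 on YouTube) is the single-modality baseline and the higher value is the fused result. That reading is an inference from the numbers, not something the slide states. A quick check:

```python
def relative_gain(baseline, fused):
    """Relative improvement of `fused` over `baseline`, in percent."""
    return 100.0 * (fused - baseline) / baseline


flickr_gain = round(relative_gain(76.6, 78.6), 1)
youtube_gain = round(relative_gain(68.0, 71.7), 1)
print(flickr_gain, youtube_gain)
```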
Summary
• Conducted a pilot study on video interestingness
• Built two datasets to support this study
– Publicly available at:
www.yugangjiang.info/research/interestingness
• Evaluated a large number of features
– Visual + audio features are very effective
– A few features useful for image interestingness do not work in the video domain (e.g., Style Attributes…)
Thank you!