Understanding and Predicting Interestingness of Videos

Yu-Gang Jiang, Yanran Wang, Rui Feng, Xiangyang Xue, Yingbin Zheng, Hanfang Yang
Fudan University, Shanghai, China
AAAI 2013, Bellevue, USA, July 2013

Motivation
• Large number of videos on the Internet
  – Consumer videos, advertisements…
• Some videos are interesting, while many are not
[Figure: two advertisements of digital products, one more interesting and one less interesting]

Applications
• Web Video Search
• Recommendation System
• ...

Related Work
• Predicting Aesthetics and Interestingness of Images
  – Datta et al. ECCV, 2006; Dhar et al. CVPR, 2011; N. Murray et al. CVPR, 2012…
  [Figure: more interesting vs. less interesting examples]
• We are the first to explore the interestingness of videos

Two New Datasets
• Flickr
  – source: Flickr.com (Consumer Video)
  – videos: 1200 (20 hrs in total)
• YouTube
  – source: Youtube.com (Advertisement Video)
  – videos: 420 (4.2 hrs in total)

Flickr Dataset
• Collected by 15 interestingness-enabled queries
  – Top 10% of 400 as interesting videos; bottom 10% as uninteresting
  – 80 videos per category/query

YouTube Dataset
• Collected by 15 ads queries on YouTube
• 10 human assessors (5 females, 5 males)
  – Compare video pairs
• General observation: videos with humorous stories, attractive background music, or better professional editing tend to be more interesting

Annotation Interface
[Figure: annotation interface]

Our Computational Framework
• Aim: compare two videos and tell which is more interesting
[Diagram: video vs. video → visual features, audio features, high-level attribute features → multi-modal fusion → Ranking SVM → results]

Feature
• Visual features: Color Histogram, SIFT, HOG, SSIM, GIST
• Audio features: MFCC, Spectrogram SIFT, Audio-Six
• High-level attribute features
  – Classemes, ObjectBank: Flower, Tree, Cat, Face…
  – Style: Rule of Thirds, Vanishing Point, Soft Focus, Motion Blur, Shallow DOF…

Prediction & Evaluation
• Prediction
  – Ranking SVM trained on our dataset
    • Chi square kernel for histogram-like features
    • RBF kernel for the others
  – 2/3 for training and 1/3 for testing
• Evaluation
  – Prediction accuracy
    • The percentage of correctly ranked test video pairs

Visual Feature Results
[Bar chart: prediction accuracies (%), axis from 50 to 80]
• Flickr: 74.5, 74.2, 76.6
• YouTube: 67.0, 67.1, 68.0

Audio Feature Results
[Bar chart: prediction accuracies (%), axis from 50 to 80]
• Flickr: 74.7, 76.4
• YouTube: 64.8, 65.7

Attribute Feature Results
[Bar chart: prediction accuracies (%), axis from 50 to 80]
• Flickr: 74.8, 64.5
• YouTube: 56.8, 64.3
• Different from predicting image interestingness

Visual+Audio+Attribute Results
[Bar chart: prediction accuracies (%), axis from 50 to 80; legend: Visual, Audio, Attribute, Visual+Audio, Visual+Audio+Attribute]
• Flickr: 76.6 and 78.6 (2.6%)
• YouTube: 68.0 and 71.7 (5.4%)

Summary
• Conducted a pilot study on video interestingness
• Built two datasets to support this study
  – Publicly available at: www.yugangjiang.info/research/interestingness
• Evaluated a large number of features
  – Visual + audio features are very effective
  – A few features useful in image interestingness do not work in video domain (e.g., Style Attributes…)

Thank you!
Datasets are available at: www.yugangjiang.info/research/interestingness
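
The kernels and the evaluation metric named on the "Prediction & Evaluation" slide can be illustrated with a minimal Python sketch. This is not the authors' implementation: the slides do not give the exact kernel formulas, so the exponential chi-square form, the `gamma` parameters, and all function names here are assumptions made for illustration. The `relative_gain` helper checks that the percentages marked on the fusion chart (2.6% and 5.4%) are consistent with relative improvements between the two annotated accuracies.

```python
import math


def chi_square_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel for non-negative histogram features.

    Assumed form: exp(-gamma * sum((x_i - y_i)^2 / (x_i + y_i))).
    Bins where both histograms are zero contribute nothing.
    """
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * dist)


def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def pairwise_accuracy(scores, test_pairs):
    """Percentage of correctly ranked test video pairs.

    scores: dict mapping a video id to its predicted interestingness score.
    test_pairs: list of (more_interesting_id, less_interesting_id) tuples.
    A pair counts as correct when the first video gets the higher score.
    """
    correct = sum(1 for more, less in test_pairs if scores[more] > scores[less])
    return 100.0 * correct / len(test_pairs)


def relative_gain(baseline, improved):
    """Relative improvement of `improved` over `baseline`, in percent."""
    return 100.0 * (improved - baseline) / baseline


# Hypothetical scores and ground-truth pairs, for illustration only.
example_scores = {"a": 0.9, "b": 0.4, "c": 0.1, "d": 0.5}
example_pairs = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")]
print(pairwise_accuracy(example_scores, example_pairs))  # 75.0

# Accuracies taken from the fusion chart in the slides.
print(round(relative_gain(76.6, 78.6), 1))  # 2.6 (Flickr)
print(round(relative_gain(68.0, 71.7), 1))  # 5.4 (YouTube)
```

In a full pipeline, the scores passed to `pairwise_accuracy` would come from a ranking model trained on the 2/3 training split and applied to the 1/3 test split; that training step is omitted here.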