Distance Metric Learning for Large Margin Nearest Neighbor (LMNN) Classification
Presenter: Yen-Cheng Chou
[Title image downloaded from the website of Kilian Q. Weinberger (the first author), http://www.cse.wustl.edu/~kilian/code/lmnn/lmnn.html. The cake is for his 30th birthday.]

How do you classify these images?
It depends on your purpose.
You will get the same results for both problems if you use kNN in Euclidean space. (This is not what you want!)
Can we find a better distance measurement for kNN?

For foreground classification
Intuition: similar objects should be close.
For background classification
[Figure.]

Metric Learning: a way to make similar objects closer
• Metric = distance function.
• Design a better distance function (for our purpose).
• For foreground object classification, we want D(pair with different foreground objects) > D(pair with the same foreground object).

LMNN: a metric learning method for kNN
Intuition for robust kNN classification:
• A sample should share the same label as its (k) neighbors.
• Samples with different labels should be widely separated.

Goal of LMNN
• Design a good distance function such that:
  1. Similarly-labeled predefined neighbors are close.
  2. Differently-labeled samples are out of the neighborhood.
[Figure legend: the sample; a similarly-labeled predefined neighbor; a similarly-labeled sample that is not a predefined neighbor; differently-labeled samples.]

Example of LMNN
[Figure with the same legend as above.]

Result of LMNN
[Result figures.]

Model of LMNN
Cost function, optimization, test-case classification.

Distance Function
• Define the distance function as a Mahalanobis distance, D_M(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j), where M is positive semidefinite.
• Learn the best M with an optimization problem based on this distance function.

Cost Function of LMNN
• Define a loss function that leverages our two goals, then minimize it:
  - Similarly-labeled predefined neighbors should be close.
  - Differently-labeled samples should be out of the neighborhood.

Loss Function
• Pull term: similarly-labeled predefined (target) neighbors should be close. This contributes the sum of D_M(x_i, x_j) over every sample x_i and each of its target neighbors x_j.
• Push term: differently-labeled samples should be out of the neighborhood. This contributes the sum of hinge losses [1 + D_M(x_i, x_j) - D_M(x_i, x_l)]_+ over target-neighbor pairs (x_i, x_j) and differently-labeled samples x_l, weighted by a constant c.
[Figure legend as above.]

Illustration of the Ideal Metric
Similarly-labeled predefined neighbors should be close; differently-labeled samples should be out of the neighborhood.
[Image downloaded from http://en.wikipedia.org/wiki/File:Lmnn.png]

Formulate as a Convex Optimization Problem
• The loss is minimized over M subject to M being positive semidefinite, which gives a convex (semidefinite) program.

Test-Case Classification
• Use the learned distance function with kNN.
• Energy-based classification:
  - Apply every label to the test case.
  - Compute its loss.
  - Use the label with the smallest loss.
[Figure legend as above.]

Advantages of LMNN
• Enforces both goals: similarly-labeled predefined neighbors are close, and differently-labeled samples are out of the neighborhood.
• Focuses on (local) predefined neighbors.
• Fewer constraints:
  - It does not require all pairwise constraints.
  - It only requires differently-labeled data to be far enough away (outside the neighborhood), not as far as possible.
• Formulated as a convex optimization problem, so a globally optimal solution exists.
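To make the pull/push structure of the objective concrete, below is a minimal NumPy sketch of the LMNN loss described above, assuming the predefined (target) neighbors are given as index pairs. The function names and the toy data are illustrative, not the authors' code, and M is simply evaluated here rather than optimized.

```python
import numpy as np

def mahalanobis_sq(M, xi, xj):
    """Squared Mahalanobis distance D_M(xi, xj) = (xi - xj)^T M (xi - xj)."""
    d = xi - xj
    return float(d @ M @ d)

def lmnn_loss(M, X, y, target_neighbors, c=1.0):
    """LMNN objective: pull target neighbors close, and push differently-labeled
    samples at least one unit farther away than each target neighbor (hinge loss).

    X: (n, d) data matrix, y: (n,) labels,
    target_neighbors: list of (i, j) pairs with y[i] == y[j], fixed in advance.
    """
    # Pull term: distances to similarly-labeled predefined neighbors.
    pull = sum(mahalanobis_sq(M, X[i], X[j]) for i, j in target_neighbors)

    # Push term: hinge loss over differently-labeled samples ("impostors").
    push = 0.0
    for i, j in target_neighbors:
        d_ij = mahalanobis_sq(M, X[i], X[j])
        for l in range(len(X)):
            if y[l] != y[i]:  # differently-labeled sample
                margin = 1.0 + d_ij - mahalanobis_sq(M, X[i], X[l])
                push += max(0.0, margin)  # only penalized while inside the margin
    return pull + c * push

# Tiny usage example with M initialized to the Euclidean metric (identity).
X = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0], [2.1, 2.0]])
y = np.array([0, 0, 1, 1])
target_neighbors = [(0, 1), (1, 0), (2, 3), (3, 2)]
print(lmnn_loss(np.eye(2), X, y, target_neighbors))
```

In the actual method this loss is minimized over positive semidefinite M; the sketch only shows how a candidate M would be scored.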
Two Views of LMNN
• As a distance metric function: D_M(x_i, x_j) = (x_i - x_j)^T M (x_i - x_j).
• As a transformation onto another space: with M = L^T L, the learned distance equals the squared Euclidean distance after mapping x to Lx.

Model Comparison

LMNN versus PCA
[Left image: downloaded from http://en.wikipedia.org/wiki/File:Lmnn.png. Right image: cropped from the Principal Component Analysis (PCA) slides of Mario Savvides.]

LMNN versus MMC
• MMC penalizes all differently-labeled samples; the sum of distances within the same label must be strictly < 1.
[Figures: LMNN on the left (the sample, its similarly-labeled predefined neighbors, other similarly-labeled samples, differently-labeled samples); MMC on the right (the sample, similarly-labeled samples, differently-labeled samples).]

LMNN versus POLA
• POLA places constraints over every similarly-labeled pair and every differently-labeled pair.
[Figures with the same legends as above.]

LMNN versus SVM
[Comparison figures.]

Extensions
• Multi-pass LMNN.
• Multi-metric LMNN: learn a separate metric for each label.

Discussion & Conclusion
How do we define a good cost function for metric learning?
• Purpose: for kNN? For SVM? For de-noising?
• Constraints:
  - Too weak? (bad performance)
  - Too strong? (hard to achieve)
• Computational issues: is there a closed-form solution? Can it be cast as a convex optimization?
LMNN has a well-designed cost function:
• Purpose: kNN.
• Constraints over:
  - similarly-labeled predefined neighbors;
  - differently-labeled samples, which only need to be outside the neighborhood.
• Well suited to convex optimization.

VisualRank: Applying PageRank to Large-Scale Image Search
Yushi Jing, Member, IEEE, and Shumeet Baluja, Member, IEEE
Slides: Jonathan Chao

Summary
There are not many image-based search engines around. Image-based search methods have been proposed, but their effectiveness on large-scale search is still uncertain. Thus VisualRank, an efficient computation of similarities over large quantities of images, is proposed.
• Why VisualRank?
• How is it done?
• What is the result?
• Further work?

Text-Based Search Methods
Because text search is easy, commercial image search often borrows techniques from text search. Instead of using the images themselves as sources, the search engines take the context of web pages that contain certain keywords and display the images from those pages.
However, such a method can display misleading results due to ambiguous wording, which is sometimes inconvenient for users.
[Example figures for the keyword "Coca Cola".]

Image-Based Search
Instead, the distribution of visual similarities is analyzed. "Visual themes" and their strength in images are used to create a visual ranking system.
VisualRank is an end-to-end system that improves Google image search results, with an emphasis on robust and efficient computation of image similarities, applicable to a large number of queries and images.

VisualRank Methodology
• Each graph has vertices and edges; each vertex is a representation of an image.
• Eigenvector centrality: the principal eigenvector of a square stochastic adjacency matrix constructed from the weights of the edges in the graph. Equivalently, the ranking scores correspond to the likelihood of a random walk traveling from one (random) vertex to another.
• Visual hyperlinks: visual similarity among the initial results. If images A and B share a visual hyperlink, there is a probability that a user will jump from A to B. The importance of an image is measured by how many visual hyperlinks point to it.
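As a concrete illustration of the random-walk view of eigenvector centrality, here is a minimal sketch that column-normalizes a similarity ("visual hyperlink") matrix and runs power iteration until the dominant eigenvector emerges. The function name, the toy similarity values, and the inclusion of self-similarity on the diagonal are my own choices for illustration; the damping factor used by full VisualRank appears in the next section.

```python
import numpy as np

def eigenvector_centrality(S, iterations=100, tol=1e-9):
    """Rank vertices by the principal eigenvector of a column-stochastic matrix.

    S: (n, n) symmetric matrix of edge weights (visual similarities), where
       S[i, j] > 0 means a 'visual hyperlink' between images i and j.
    Returns scores summing to 1 (a stationary random-walk distribution).
    """
    S = np.asarray(S, dtype=float)
    S_star = S / S.sum(axis=0, keepdims=True)   # column-normalize: columns sum to 1
    vr = np.full(S.shape[0], 1.0 / S.shape[0])  # start from a uniform distribution
    for _ in range(iterations):
        new_vr = S_star @ vr                    # one random-walk step
        if np.linalg.norm(new_vr - vr, 1) < tol:
            break
        vr = new_vr
    return vr

# Toy similarity graph over 4 images: image 0 is visually similar to every
# other image, so it should receive the highest centrality score.
S = np.array([
    [1.0, 0.8, 0.7, 0.6],
    [0.8, 1.0, 0.1, 0.1],
    [0.7, 0.1, 1.0, 0.1],
    [0.6, 0.1, 0.1, 1.0],
])
print(eigenvector_centrality(S))
```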
VisualRank Methodology
If an image is important, the images it points to are also important. VisualRank is defined as VR = S* × VR, where S* is the column-normalized adjacency (similarity) matrix S.
By repeatedly multiplying VR by S*, the dominant eigenvector of S* stands out.
Does VR converge? S* needs to be aperiodic and irreducible.
• Aperiodicity: generally true in all cases.
• Irreducibility: guaranteed by introducing a damping factor into the equation.
The new equation becomes VR = d · S* × VR + (1 - d) · p, where:
• d: a scalar damping factor;
• p: a damping vector that can be uniform or non-uniform.

Measurement of Visual Similarity
Local features are used to obtain more stable information and to remain robust to changes in rotation, transformation, and scale.

Local Feature Extractor: SIFT
Standard SIFT:
• Difference-of-Gaussian (DoG) interest point detector:
  - Builds a pyramid of scaled images by iteratively applying a Gaussian filter to the original image.
  - Adjacent images are subtracted to create the DoG.
  - Interest points located at the local extrema of 2D image space and scale space are selected.
• Orientation histogram feature representation:
  - The gradient map around each interest point is computed and then divided into sub-regions, from which the orientation histograms are computed.
• In the end, a 128-D vector is created: a 4 × 4 grid of orientation histograms with 8 bins each, concatenated into one vector.
Visual similarity is defined as the number of local features shared between two images divided by their average number of interest points.

Web-Scale VisualRank
It is infeasible to generate the S matrix over billions of images.
• Precluster web images based on metadata, or use an existing commercial search engine to create such clusters.
• Example: the top N results from Google Image Search can form a cluster over which visual similarity is calculated.

Hashing Local Features
• Using hash tables to store the features is more efficient.
• Locality-Sensitive Hashing: used for approximate K-NNR (return all points that are within range r).

Web-Scale VisualRank (Summary)
1. Local features are generated.
2. A collection of hash tables is created, each with its own set of hash functions. Each descriptor is indexed into each of the hash tables.
3. Descriptors that share the same key in more than a certain number of hash tables are considered a match.
4. Matched features are regrouped by the images they are associated with. A 4-D histogram is used to store "votes," and the histogram with the most votes is used to compute the similarity score.
5. A pair of images is considered a match if they share more than C descriptors (C = 3, an empirical result).
6. Given the similarity matrix S, results can be generated using VisualRank (VR).

Experimental Results
[Figures: VisualRank vs. Google on product queries; VisualRank vs. other related work on product queries; VisualRank vs. Google on landmark queries.]

Future Work
• Working on both labeling and ranking unlabeled images from personal image collections.
• Studying the relationship between image similarity and the "likelihood of transition" more extensively.
• Extending this work to other domains that are not amenable to traditional text-based analysis, including video and audio analysis.
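To tie the methodology above together, here is a minimal end-to-end sketch: the similarity score is the number of shared local-feature matches divided by the average number of interest points, and the ranking uses the damped update VR = d · S* × VR + (1 - d) · p with a uniform p. The matching counts are invented placeholders rather than real SIFT output, and d = 0.85 is a typical PageRank-style choice, not a value taken from the paper.

```python
import numpy as np

def similarity(shared_matches, n_points_a, n_points_b):
    """Visual similarity: shared local features divided by the average
    number of interest points in the two images."""
    return shared_matches / ((n_points_a + n_points_b) / 2.0)

def visual_rank(S, d=0.85, iterations=100, tol=1e-9):
    """Damped VisualRank iteration: VR = d * S* @ VR + (1 - d) * p, uniform p."""
    S = np.asarray(S, dtype=float)
    n = S.shape[0]
    S_star = S / S.sum(axis=0, keepdims=True)  # column-normalized similarity matrix
    p = np.full(n, 1.0 / n)                    # uniform damping vector
    vr = p.copy()
    for _ in range(iterations):
        new_vr = d * (S_star @ vr) + (1.0 - d) * p
        if np.linalg.norm(new_vr - vr, 1) < tol:
            break
        vr = new_vr
    return vr

# Invented matching counts for 3 images (shared descriptors, interest points).
S = np.array([
    [0.0,                       similarity(12, 100, 140), similarity(9, 100, 120)],
    [similarity(12, 100, 140),  0.0,                       similarity(4, 140, 120)],
    [similarity(9, 100, 120),   similarity(4, 140, 120),  0.0],
])
order = np.argsort(-visual_rank(S))
print("images ranked by VisualRank:", order)
```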