Interactive Feature Tracking using K-D Trees and Dynamic Programming Aeron M. Buchanan, [email protected] (Engineering, University of Oxford, UK) Andrew W. Fitzgibbon, [email protected] (Microsoft Research, Cambridge, UK) • with a human in the loop, which is modelled by the system, • a comprehensive track definition allowing optimal interactive tracking with massive automatic preprocessing. Interactivity Preprocessing Detection Filter jet creation K-D tree searches Extremely simple and intuitive user-guided process: 1. Single user click generates a track through the entire sequence at over 100fps. 2. Further clicks lock down keyframe features and improve the track. Optimization Refinement Dynamic programming Sub-pixel precision >100fps Interactive Overview A faster than real-time tracker: 500 Prior State of the Art: • including Mean-shift, KLT, WSL, etc, • do not model the possibility of failure, • fail even on ‘easy’ sequences, due to I “Track-to-first”—do not model appearance change, I “Track-to-previous”—track is lost through drift, • require user to continually monitor and restart, • time taken to track sequence: >100% of sequence length. Instead, User-in-Loop Batch Tracking: Our approach overcomes traditional conflict between: 1. Small search region—tracking is fast per frame; must use every frame; unable to recover after occlusions. 2. Large search region—tracking is slow per frame, but subsampling allows speed-ups; recovery possible after occlusions, but bad tracks due to mismatches. 950 2000 3500 5000 6500 8420 9500 11750 5 50 500 725 950 2000 3500 5000 6500 8420 9500 11750 5 50 500 725 950 2000 4550 4910 6515 7940 9140 10040 Input: Image stream, I(x, y, t); keyframe features, Q = {(xi, yi, ti, qi)}; tuning parameters, µ1,2. Output: Optimal track, T = {(xt, yt, pt)}, minimizing: E(T ) = X t ? µ1kxt − xt−1k | {z } trajectory smoothness ? ? ? + µ2kpt − pt−1k + min(kqi − ptk ) | {z } appearance smoothness | i {z } keyframe similarity note: norms are robust for occlusions and multichannel th with pt being the vectorized appearance of the feature at xt in frame t and qi the appearance of the i keyframe feature. Immense speed-ups, taking this optimization to interactive rates, are achieved by separating detection from tracking: • Fast detection is facilitated by converting each image to a 16D point cloud, using convolution with a filter bank (obtained via PCA of natural image patches), and storing it in a K-D tree. • The optimal track is efficiently extracted using dynamic programming (Viterbi algorithm) from the small set of candidate matches for each frame selected by the detection stage. Occlusions are incorporated by treating them as a special type of feature. Refinement automatically updates the track trajectory to sub-pixel precision. This is required because the image locations of the data points in the K-D trees are necessarily quantized. 20 60 100 140 180 220 260 300 340 380 420 460 8 100 200 300 400 500 600 700 800 900 1000 1100 320×240, 200fps Results 360×270, 135fps Comparison • user specifies keyframe features, one at a time, • the optimal track over the entire sequence is generated, • optimizing over I smoothness of trajectory, I similarity to keyframe features, I smoothness of appearance, I summed occlusion penalties, • tuning parameters can be updated interactively, • time taken to track sequence: 40% of sequence length. 725 Details 50 Mean-shift Colour KLT Frame 5
© Copyright 2024 ExpyDoc