Systematic Evaluation Methodology for Fingerprint-Image Quality Assessment Techniques

J. Hämmerle-Uhl, M. Pober, A. Uhl
Presenter: Christof Kauba
Department of Computer Sciences, University of Salzburg
May 30th, 2014

Overview
1 Introduction & Motivation
2 The StirMark Toolkit
3 Experiments
4 Conclusion

Introduction & Motivation

Fingerprint recognition robustness and sample quality
- Sample image quality impacts recognition accuracy.
- Influencing factors: skin conditions (e.g. dryness, moisture, dirt, cuts and bruises, ageing), sensor type and condition (e.g. dirt, noise, size), user cooperation, crime-scene preservation.
- Benchmarking frameworks: FVC, BioSecure, SFinGe, StirMark [1]
- Essential: indices that reliably determine fingerprint image quality under various circumstances. But how can such proposed indices themselves be assessed?
- Here: we propose a standardised tool to assess the correlation between image quality indices and recognition accuracy on fingerprint image data representing a wide range of real-world acquisition conditions and quality levels (simulated with StirMark [1]).

[1] J. Hämmerle-Uhl, M. Pober, A. Uhl, "Towards Standardised Fingerprint Matching Robustness Assessment: The StirMark Toolkit - Cross-Database Comparisons with Minutiae-based Matching", in Proceedings of the 1st ACM Workshop on Information Hiding and Multimedia Security (IH&MMSec'13), pp. 111-116, Montpellier, France, June 17-19, 2013.

The StirMark Toolkit

Basic idea: the StirMark Benchmark is a generic benchmark for evaluating the robustness of digital image watermarking methods; many of the systematic errors it introduces into the data can be interpreted as specific fingerprint acquisition conditions. A minimal code sketch of such degradations is given after the example figure below.
- Additive noise: actual dust on the contact area, sensor noise, or the grainy surface from which a latent fingerprint has been lifted.
- Median cut filtering: simulates blur in fingerprint images, e.g. smudgy fingerprints (too much moisture).
- Remove lines and columns: sensor errors; sweep sensors in particular can be affected by missing lines (real examples are shown later).
- Rotation: an omnipresent challenge in fingerprint recognition.
- Stretching: a higher force applied when pressing the finger onto the contact area; in forensics, a soft or flexible surface.
- Shearing: simulates a setting where the applied pressing force is not perpendicular to the contact area.
- Random distortions: model e.g. unevenly distributed pressure or a latent fingerprint scanned from an uneven surface.

StirMark Examples
Figure: (a) Noise (level 15), (b) Median cut filter (size 9), (c) Rotation of −15°, (d) Stretching (d = 1.350), (e) Shearing (b = c = 0.20), (f) Random Dist. (lrnddist 4.2)
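As a rough illustration of what such degradations look like in code, the following sketch applies comparable transforms (additive noise, median filtering, rotation, shearing, line removal) to a grayscale fingerprint image with NumPy/SciPy. This is not the StirMark implementation itself; all function names and parameter values are hypothetical stand-ins for StirMark's manipulation types and intensity levels.

```python
# Illustrative stand-ins for some StirMark-style degradations (not the
# StirMark tool itself); names and parameters are hypothetical.
import numpy as np
from scipy import ndimage

def add_noise(img, level=15):
    """Additive Gaussian noise, loosely mimicking dust or sensor noise."""
    noisy = img.astype(np.float64) + np.random.normal(0.0, level, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def median_blur(img, size=9):
    """Median filtering as a stand-in for smudgy / blurred prints."""
    return ndimage.median_filter(img, size=size)

def rotate(img, angle_deg=-15.0):
    """Rotation, an omnipresent variation in fingerprint acquisition."""
    return ndimage.rotate(img, angle_deg, reshape=False, mode='nearest')

def shear(img, b=0.2, c=0.2):
    """Affine shear, mimicking a non-perpendicular pressing force."""
    matrix = np.array([[1.0, b], [c, 1.0]])  # output -> input coordinate map
    return ndimage.affine_transform(img, matrix, mode='nearest')

def remove_lines(img, every=20):
    """Drop every n-th row, mimicking missing lines of a sweep sensor."""
    keep = [r for r in range(img.shape[0]) if r % every != 0]
    return img[keep, :]
```

Sweeping the intensity parameters of such functions is the idea behind the graded quality levels used in the experiments below.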
Real Examples
Figure: Examples of distortions from actual acquisition problems: (a) missing lines, (b) warping effects.

Fingerprint quality indices assessment strategy
- Generate a large corpus of test data exhibiting various quality levels:
  - Start with an available dataset of the target sensor (or multiple sensors of interest).
  - Apply StirMark image manipulations of different types at various intensities.
- Conduct fingerprint recognition experiments on these data with different types of feature extraction / matching algorithms.
- Correlate recognition result parameters (e.g. EER) with the quality index values across the manipulation intensities.
- In this manner, specific strengths and weaknesses of candidate quality indices can be identified.

Types of Fingerprint Matchers

Correlation-based matchers
Use the fingerprint images in their entirety; the global ridge and furrow structure of a fingerprint is decisive. Images are correlated at different rotational and translational alignments.

Ridge feature-based matchers
Deal with the overall ridge and furrow structure of the fingerprint, but in a localised manner. Characteristics such as local ridge orientation or local ridge frequency are used.

Minutiae-based matchers
The set of minutiae within each fingerprint is determined and stored as a list, each minutia being represented (at least) by its location and direction. The matching process then essentially tries to establish an optimal alignment between the minutiae sets.

Experimental Settings: Data & Recognition Software
- DB1 - DB3 from the FVC2004 data are used in a verification setting employing the evaluation protocol specified by FVC.
- Fingerprint matching software:
  - Three minutiae-based schemes [2], including the "NIST Biometric Image Software" (NBIS) package (mindtct and bozorth3). Due to the similarity of their results, only the average EER is given.
  - Phase-only correlation (POC), custom implementation. First, the normalised cross spectrum (or cross-phase spectrum) of the DFTs of the two images is computed. The POC is then obtained by taking the inverse DFT of the normalised cross spectrum (a minimal sketch is given after this slide).
  - Fingercode (FC), custom implementation. A Gabor filter bank is applied to the orientation image, resulting in a "ridge feature map" which is translationally and rotationally aligned for matching.
- EER is used to assess recognition accuracy.

[2] J. Hämmerle-Uhl, M. Pober, A. Uhl, "Towards Standardised Fingerprint Matching Robustness Assessment: The StirMark Toolkit - Cross-Feature Type Comparisons", in Proceedings of the 14th IFIP International Conference on Communications and Multimedia Security (CMS'13), Springer Lecture Notes in Computer Science, vol. 8099, pp. 3-17, Magdeburg, Germany, Sept 25-26, 2013.
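To make the POC description above concrete, here is a minimal phase-only correlation sketch, assuming NumPy and two equally sized grayscale images. It is not the authors' custom implementation, and using the POC peak height as the match score is an assumption.

```python
# Minimal phase-only correlation (POC) sketch; illustrative only, not the
# authors' custom implementation. The peak-height score is an assumption.
import numpy as np

def poc_score(img_a, img_b, eps=1e-12):
    """Return a POC similarity score for two equally sized grayscale images."""
    fa = np.fft.fft2(img_a.astype(np.float64))
    fb = np.fft.fft2(img_b.astype(np.float64))
    # Normalised cross(-phase) spectrum: keep the phase, discard the magnitude.
    cross = fa * np.conj(fb)
    cross /= (np.abs(cross) + eps)
    # The inverse DFT gives the POC function; a sharp peak indicates a match.
    poc = np.real(np.fft.ifft2(cross))
    return float(poc.max())
```

A genuine pair produces a pronounced peak in the POC function, while an impostor pair gives a flat response; an actual matcher would additionally handle alignment and possibly band limitation.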
Experimental Settings: StirMark Settings & Quality Indices

StirMark settings: we use 12 types of StirMark manipulations, each with 3 to 10 intensity levels, resulting in 91 different manipulations per image overall.

Quality indices:
- nfiq: part of the NIST Biometric Image Software (NBIS) package. It relies on information produced by the minutiae detector mindtct and essentially performs a neural-net-based classification of the minutiae vector into one of five overall fingerprint quality classes.
- SpatDom: based on determining the block-wise clarity of the ridge and furrow orientation in the spatial domain. Per foreground block, the gradient vectors of the gray-level intensities are used to build a covariance matrix, from which a normalised coherence measure is computed; the block values are then combined over all foreground blocks in a weighted sum.
- FreqDom: image quality is defined in terms of energy concentration within a specific frequency band containing the ridge frequency, measured in terms of entropy.

Experimental Settings: Determining Quality Indices' Reliability
- When matching StirMark-manipulated fingerprint images, the enrolled gallery image is used in its original (non-manipulated) version, while the probe image involved in matching is the manipulated one.
- Thus, for each matching scheme, each manipulation type, and each manipulation strength we compute the corresponding EER, giving 91 EER values per database per matching scheme, arranged in lists of increasing manipulation strength.
- We generate quality indices for all involved images and compute mean values per manipulation type and intensity level. From these data, we build lists of quality mean values, likewise ordered by manipulation intensity.
- Finally, we compute Spearman's rank order correlation per manipulation type, per quality measure, per database, and per fingerprint matcher.

Result Selection Strategy
- We screen the results for quality indices where the correlation is close to zero (low) for some data sets or matcher types, while it is clearly positive or negative (high) under other conditions.
- The existence of such results indicates that quality indices need to be assessed for each data set and matching scheme separately, whereas their absence would allow data- and matcher-independent conclusions. A minimal sketch of this correlation and screening step follows below.
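A minimal sketch of the correlation and screening step, assuming SciPy and that the EERs and mean quality values have already been collected into per-manipulation lists ordered by intensity; all data structures, names, and thresholds here are hypothetical.

```python
# Sketch of the reliability analysis: Spearman rank correlation between
# per-intensity EER values and mean quality-index values, per manipulation
# type. Data structures, names, and thresholds are hypothetical.
from scipy.stats import spearmanr

def correlate_quality_with_eer(eers, qualities):
    """eers, qualities: dicts mapping manipulation type -> list of values,
    both ordered by increasing manipulation intensity."""
    results = {}
    for manip, eer_list in eers.items():
        rho, _ = spearmanr(eer_list, qualities[manip])
        results[manip] = rho
    return results

def flag_inconsistent(correlations_per_condition, low=0.2, high=0.7):
    """correlations_per_condition: {(dataset, matcher): {manip: rho}}.
    Flags manipulation types whose |rho| is near zero under one condition
    but clearly high under another (thresholds are illustrative)."""
    manips = next(iter(correlations_per_condition.values())).keys()
    flagged = set()
    for manip in manips:
        values = [abs(d[manip]) for d in correlations_per_condition.values()]
        if min(values) < low and max(values) > high:
            flagged.add(manip)
    return flagged
```

Running the correlation per database and per matcher, and then screening across conditions, yields the kind of comparison visualised in the result slides that follow.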
Results: Minutiae-based schemes vs. FC on DB1
[Figure: Mean Spearman correlation coefficients per manipulation type (lrnddist, shear, aff.X, aff.Y, stretch, rot, rml, convGauss, convMean, medianCut, noise) for the quality indices nfiq, SpatDom, and FreqDom; (a) minutiae-based matchers, (b) Fingercode (FC).]
Figure: Mean correlation for DB1.
→ Significant inter-matcher variability is observed for nfiq (for rml).

Results: FC vs. POC on DB1
[Figure: Mean correlation coefficients per manipulation type for nfiq, SpatDom, and FreqDom; (a) Fingercode (FC), (b) phase-only correlation (POC).]
Figure: Mean correlation for DB1.
→ FreqDom exhibits inter-matcher variability (for noise).

Results: Minutiae-based vs. FC on DB2
[Figure: Mean correlation coefficients per manipulation type for nfiq, SpatDom, and FreqDom; (a) minutiae-based matchers, (b) FC.]
Figure: Mean correlation for DB2.
→ Obvious inter-matcher variability for nfiq.
→ Comparing the minutiae-based and FC schemes on DB1 & DB2, significant inter-data variability is found for all three matching types.

Results: Minutiae-based vs. FC on DB3
[Figure: Mean correlation coefficients per manipulation type for nfiq, SpatDom, and FreqDom; (a) minutiae-based matchers, (b) FC.]
Figure: Mean correlation for DB3.
→ Inter-data set variability: the nfiq results correspond better to those on DB2; the behaviour on DB1 is quite different.
→ Inter-matcher variability: for SpatDom, low correlation values are seen with FC matching for medianCut, convMean and convGauss, while high values are obtained for the minutiae-based matchers.

Conclusion

Lessons learnt:
- We observe significant inter-data set variability as well as cases of significant inter-matcher variability with respect to the rank order correlation between recognition accuracy and fingerprint image quality. Such effects are found for all three example fingerprint quality indices.
→ This reveals that fingerprint image quality indices need to be assessed separately for different data sets (or sensors) and different fingerprint matching schemes, which can be done efficiently with the proposed methodology.
→ A general-purpose fingerprint quality index reliably applicable to any sensor / matching-scheme combination does not seem to exist so far.

Thank you for your attention! Questions?