Program of Module 1 "Computer Vision"

Free course
MODULE 1
COMPUTER VISION
Funded by the Province of Bologna
PROGRAM
1. Introduction – Basic definitions of image processing and computer vision.
Overview of the main application scenarios.
2. Image Formation and Acquisition – Geometric model of image formation. Pinhole camera
and perspective projection. 3D reconstruction by stereo vision. Use of lenses. Field of view
and depth of field. Projective coordinates and the PPM (Perspective Projection Matrix).
Camera calibration: intrinsic parameters, extrinsic parameters, and optical distortion.
Calibration with planar targets and homography estimation (Zhang's algorithm). Stereo
rectification and calibration. Basic concepts of image sensing, sampling, and quantization.
3. Intensity Transformations – Histogram. Linear and non-linear contrast enhancement.
Histogram equalization and matching.
4. Image Filtering – Linear shift-invariant operators and convolution. Fourier transform
for 2D signals. Mean and Gaussian filters. Sharpening filter. Median filter. Bilateral
filter. Non-local means.
5. Image Segmentation – Binarization by global thresholding. Automatic threshold
selection. Adaptive thresholding. Region growing. Color-based segmentation.
6. Segmentation by Motion Estimation – Differences between successive frames and
comparison with the background. Background initialization and updating. Robustness to
illumination changes.
7. Binary Morphology – Dilation and erosion. Opening and closing. Hit-and-Miss transform.
Thinning.
T3LAB – Via Sario Bassanelli n° 9/11 - 40129 Bologna (BO) – Tax Code and VAT No. 02451831206
Tel: +39 051-58.70.187 Fax: +39 051-58.70.186 [email protected] www.t3lab.it
8. Connected Component Analysis – Distances on the image plane and connectivity.
Connected component labeling. Basic descriptors: area, perimeter, compactness, circularity,
Euler number. Orientation and bounding rectangle of the object. Form factor and
related descriptors. Image moments and invariant moments.
9. Edge Detection – Image gradient. “Smooth” derivatives: Prewitt, Sobel, Frei-Chen.
Detection of gradient extrema. Laplacian of Gaussian. Canny operator.
10. Invariant Local Features – The “detector/descriptor” paradigm. Harris corners. SIFT and
SURF. Recognition of local features with randomized trees.
11. Object Detection – Pattern matching using SSD, SAD, NCC, and ZNCC. Fast pattern
matching. Shape-based matching. Hough transform for analytic shapes. Generalized Hough
transform. Object detection using invariant local features: matching with kd-trees,
Hough-based voting, least-squares estimation of the similarity transformation.
12. 3D Computer Vision – Technologies: stereo vision, laser scanning, time-of-flight (TOF).
RGB-D images (e.g. the Kinect sensor). Stereo matching algorithms: local, semi-global, and
global approaches. Basic elements of point cloud processing and analysis.
INSTRUCTORS
Prof. Luigi Di Stefano
Luigi Di Stefano received the degree in electronic engineering from the
University of Bologna, Italy, in 1989 and the PhD degree in electronic
engineering and computer science from the Department of Electronics,
Computer Science and Systems (DEIS) at the University of Bologna in
1994. In 1995, he was a postdoctoral research fellow at Trinity College,
Dublin. He is currently an associate professor at the Department of
Computer Science and Engineering, University of Bologna. His research
interests include computer vision, image processing and computer architecture. Prof. Di Stefano is
the author of more than 150 papers and five patents. He is a member of the IEEE Computer
Society and the IAPR-IC. Since 2012 he has been a member of the Scientific Advisory Board of Datalogic
Group.
T3LAB | Technology Transfer Team | www.t3lab.it
Federico Tombari
Federico Tombari is an Assistant Professor (“RTD”) at the University of Bologna,
where he obtained his Ph.D. in 2009. His current research concerns computer vision
and robotic perception. He has co-authored more than 60 papers in peer-reviewed
international conferences and journals, mainly focused on 2D/3D object recognition,
stereo vision, video analysis for surveillance, and efficient indexing. In 2004 he was
a visiting student at the University of Technology, Sydney, and in 2008 he was an
intern at Willow Garage, California. He is a Senior Scientist volunteer for the
Open Perception foundation and a developer for the Point Cloud Library. In 2012 and 2013 he held
a position as an Adjunct Professor at the University of Bologna. He is a member of IEEE and
IAPR-GIRPR. He received the “Best Paper Award Runner-up” at the International Conference
on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT 2011).
Alioscia Petrelli
Alioscia Petrelli received the degree in computer science engineering from the University of
Bologna, Italy, in 2005. He spent four years as a research fellow at the Computer Vision Laboratory
of the Department of Electronics, Computer Science, and Systems in Bologna. Currently, he is a
Ph.D. student with the Department of Computer Science and Engineering, University of Bologna.
His research focuses on computer vision, including 3D surface matching and machine learning. He
serves as a reviewer for the IEEE International Conference on Computer Vision and is a member of
the IEEE Computer Society.
Samuele Salti
Samuele Salti received the M.Sc. degree in computer science engineering in 2007 and the Ph.D.
degree in computer science engineering in 2011, both from the University of Bologna, Italy. Since
2011 he has been a Post-Doc at the Computer Vision Lab, DISI (Department of Computer Science and
Engineering), University of Bologna. In 2007 he visited the Heinrich-Hertz-Institute in Berlin,
Germany, working on human-computer interaction. In 2010 he visited the Multimedia and Vision
Research Group (MMV) at Queen Mary, University of London, where he investigated adaptive
appearance models for video tracking. His research interests are adaptive video tracking, 3D shape
matching, Bayesian filtering and object recognition. Dr. Salti has co-authored 19 publications in
international conferences and journals. He was awarded the best paper award runner-up at
3DIMPVT, the International Conference on 3D Imaging, Modeling, Processing, Visualization and
Transmission in 2011. He serves as a reviewer for IEEE Transactions on Signal Processing, IEEE
Transactions on Image Processing and a number of international conferences. He is a member of
the IEEE and GIRPR.
Tommaso Cavallari
Tommaso Cavallari received an M.Sc. degree in Computer Science Engineering from the University of
Bologna, Italy, in spring 2013. He spent six months as a research fellow at the Computer Vision
Lab, DISI (Department of Computer Science and Engineering), University of Bologna, working on
the topics of computer vision and machine learning applied to industrial problems. He is currently
enrolled as a Ph.D. student, working for the Computer Vision Lab; his research is focused on
computer vision. In 2012 he was an intern at Willow Garage, California, working on the
detection and tracking of moving objects to enable reliable grasping by a robot.