
A Thesis for the Degree of Master of Engineering

Development of an Image Matching Algorithm
for the Measurement of 3-D Geometry
Using Stereo Vision

February 2000

Graduate School, University of Seoul
Department of Control and Instrumentation Engineering
Lee, Ho-Jae
Development of an Image Matching Algorithm
for the Measurement of 3-D Geometry
Using Stereo Vision

Advisor: Professor Kim Hee-Sik

Submitted as a thesis for the degree of Master of Engineering
November 2000

Graduate School, University of Seoul
Department of Control and Instrumentation Engineering
Lee, Ho-Jae

This certifies that the master's thesis in Control and Instrumentation Engineering of Lee Ho-Jae is approved.

Chair of the Thesis Committee        (seal)
Committee Member                     (seal)
Committee Member                     (seal)

December 2000
Graduate School, University of Seoul
Abstract
In this thesis, we present an approach for matching stereo image pairs by finding interesting points and applying a gradient-based image matching algorithm. Stereo matching is a correspondence problem.
The correspondence problem can be stated as follows: for each point in the left image, find the corresponding point in the right image. To match two points, it is necessary to measure the similarity of the points. Clearly, a point to be matched should be distinctly different from its surrounding pixels; otherwise, every point would be a match. Thus, it is necessary to extract interesting features.
There are several methods for finding distinct points in an image. We use the Förstner Interest Operator, a common choice, which finds edges, outstanding points, the centers of circles, etc.
We first apply this operator to find outstanding points in the left image; we then need to match the conjugate points in the right image. It is clearly not efficient to search the whole right image for matching points. Instead, we first restrict the search to a region of the right image, and within this range we apply the gradient-based image matching method.
Even though there are several limitations, such as occlusion and compression effects, we conclude that the proposed method is well suited to real images.

Keywords: Stereo Vision, Image Matching, Interesting Point, Pattern Recognition, Machine Vision, Robot Vision
Contents

Abstract
Contents
List of Figures
List of Tables
1. Introduction
2. Stereo Image Matching
  2.1 A Review of Stereo Vision
    2.1.1 Camera Model and Image Formation
    2.1.2 Stereo Geometry
  2.2 Stereo Image Matching Methods
    2.2.1 What is Stereo Matching?
    2.2.2 Edge-Based Matching Method
    2.2.3 Region Correlation
    2.2.4 Least-Squares Matching Method
    2.2.5 Gradient-Based Matching
  2.3 Interesting Points
    2.3.1 The Difference of Direction
    2.3.2 The Förstner Interest Operator
3. Experimental System of H/W and S/W
  3.1 H/W Specification
  3.2 S/W Specification
4. Results
  4.1 Result of Finding Interest Points Using the Förstner Interest Operator
  4.2 Applying the Gradient-Based Image Matching Method to Real Images
5. Conclusion
References
Abstract (Korean)
Acknowledgements
List of Figures

<Figure 1> Typical optical 3D-measurement setup
<Figure 2> An illustration of perspective projection showing the line of sight that is used to calculate the projected point (x', y') from the object point (x, y, z)
<Figure 3> Principle of bundle adjustment for image triangulation
<Figure 4> Stereo geometry: the disparity of a scene point P of depth Z
<Figure 5> A reference image area (10x10) and a searching image area (100x100)
<Figure 6> Flow chart of the system
<Figure 7> Original pair images. (a) The left image. (b) The right image
<Figure 8> The results of finding interesting points. The white points represent interesting points. (a) The left image. (b) The right image
<Figure 9> The final results of finding interesting points of <Figure 7> by removing non-local maxima. The white points represent interesting points. (a) The left image. (b) The right image
<Figure 10> Original pair images. (a) The left image. (b) The right image
<Figure 11> The final results of finding interesting points by removing non-local maxima. The white points represent interesting points. (a) The left image. (b) The right image
<Figure 12> Result of gradient-based evidence (1)
<Figure 13> Result of gradient-based evidence (2)
<Figure 14> The 2D matching result. The lined white points represent matched pairs
<Figure 15> The 2D matching result. The lined white points represent matched pairs
<Figure 16> Original captured images. (a) The image of the left view. (b) The image of the right view
<Figure 17> The 3D image matching result of <Figure 16> with error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points
<Figure 18> The 3D image matching result of <Figure 16> after eliminating error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points
<Figure 19> Original captured images. (a) The image of the left view. (b) The image of the right view
<Figure 20> The 3D image matching result of <Figure 19> with error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points
<Figure 21> The 3D image matching result of <Figure 19> after eliminating error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points
<Figure 22> Original captured images. (a) The image of the left view. (b) The image of the right view
<Figure 23> The 3D image matching result of <Figure 22> with error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points
<Figure 24> The 3D image matching result of <Figure 22> after eliminating error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points
List of Tables

<Table 1> History of photogrammetry
<Table 2> Application of optical 3D-measurement techniques
<Table 3> Properties of some correspondence algorithms for image matching
<Table 4> Precision of image matching and edge detection
<Table 5> Image matching system specification
<Table 6> Camera specification
<Table 7> Software specification
<Table 8> The result of 2D and 3D image matching
1. Introduction
Stereo vision is one of the earliest and most thoroughly investigated topics in computer vision (see Table 1), and an exhaustive discussion of related work in stereo vision is well beyond the scope of this thesis.
All stereo methods must address the correspondence problem, that is, the problem of finding matching points in two images of the same scene. Two image points match if they result from the projection of the same point in the scene. The desired output of a stereo correspondence algorithm is a disparity map, specifying the relative displacement of matching points between images.
There are several approaches to the matching problem: feature-based, area-based, etc. Feature-based approaches only match points with a certain amount of local information (such as intensity edges). Area-based approaches match small image areas, relying on the assumption that nearby points usually have similar displacements.
The typical area-based stereo matching algorithm proceeds in the following way (a minimal sketch follows the list):
1. Preprocess the images (band-pass filtering, etc.).
2. Compute a per-pixel matching cost.
3. Aggregate support spatially (e.g., by summing over a window, or by diffusion).
4. Across all disparities, find the best match based on the aggregated support.
5. Compute a sub-pixel disparity estimate (optional).
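As a minimal illustration of steps 2-4 of this pipeline, the following C++ sketch computes, for one pixel of a rectified grayscale pair, a sum-of-absolute-differences matching cost aggregated over a square window, and searches all disparities for the best match. The row-major image layout, function names, and parameter values are illustrative assumptions, not details taken from the thesis.

// Steps 2-4 of the area-based pipeline for a single pixel (x, y).
// Caller must keep the window fully inside both images.
#include <vector>
#include <cstdlib>
#include <climits>

// Sum-of-absolute-differences cost aggregated over a (2r+1)x(2r+1) window.
int sadCost(const std::vector<unsigned char>& left,
            const std::vector<unsigned char>& right,
            int width, int x, int y, int d, int r)
{
    int cost = 0;
    for (int dy = -r; dy <= r; ++dy)
        for (int dx = -r; dx <= r; ++dx) {
            int l  = left [(y + dy) * width + (x + dx)];
            int rv = right[(y + dy) * width + (x + dx - d)];
            cost += std::abs(l - rv);
        }
    return cost;
}

// Across all disparities, keep the one with the minimal aggregated cost.
int bestDisparity(const std::vector<unsigned char>& left,
                  const std::vector<unsigned char>& right,
                  int width, int x, int y, int maxDisp, int r)
{
    int best = 0, bestCost = INT_MAX;
    for (int d = 0; d <= maxDisp && d <= x - r; ++d) {
        int c = sadCost(left, right, width, x, y, d, r);
        if (c < bestCost) { bestCost = c; best = d; }
    }
    return best;
}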
Other stereo techniques include hybrid and interactive techniques, such as stochastic search [Marroquin et al., 1987] and joint matching and surface reconstruction [Hoff and Ahuja, 1989; Olsen, 1990; Stewart et al., 1996].
More than two images are used in multiframe stereo to increase the stability of the algorithm [Bolles et al., 1987; Kang et al., 1995]. A special case is multiple-baseline stereo, where all images have identical epipolar lines [Okutomi and Kanade, 1993].
Table 1 summarizes the historical development of photogrammetry. The fundamental study of perspective projection dates back to 1759.
<Table 1> History of Photogrammetry

1759   J.H. Lambert, "The Free Perspective" (geometric foundations of photogrammetry)
1839   Arago, Daguerre, Niépce, Talbot: the negative picture
1859   A. Laussedat, Commission of the Parisian Scientific Academy (the father of photogrammetry)
1901   C. Pulfrich: stereophotogrammetry
1923   W. Bauersfeld: the stereoplanigraph (aerial photography)
1930s  Optical-mechanical evaluation machines
1950s  Analytic photogrammetry, aerotriangulation, computers, TV
1980   From ISP to ISPRS
1990s  Digital photogrammetry
Table 2 lists applications of optical 3D-measurement techniques. The method presented in this thesis can be used for both industrial and non-industrial purposes.
<Table 2> Application of Optical 3D-Measurement Techniques

Industrial:     automobile, shipbuilding, aerospace, plant, mining, architecture
Non-industrial: ergonomics; shoes, textiles, and clothing; archeology; anthropology; medical rehabilitation; monument preservation; sport; entertainment; police and military
2. Stereo Image Matching
2.1 A Review of Stereo Vision
2.1.1 Camera model and image formation
In general, we set up cameras to measure 3D geometry as shown in Figure 1. Throughout this thesis, we use perspective projection (see Figures 2, 3, and 4) as our geometric model of image formation: an image is formed by projecting each scene point along a straight line through the center of projection onto an image plane. This is commonly referred to as the pinhole camera model. The pinhole camera is a powerful model that resembles very closely the operation of real cameras. The only principal differences are that real cameras have a lens instead of a simple hole, and the imaging surface is an array of sensors or film.
Mathematically, perspective projection is most easily described using homogeneous coordinates (also called projective coordinates). In homogeneous coordinates, each point is extended by a dummy coordinate ω ≠ 0 that maps the point to a line through the origin in a space whose dimension is one higher than that of the original space. For example, a two-dimensional image point (x, y) is represented by the set of vectors [ωx ωy ω]^T in homogeneous coordinates; a three-dimensional point (X, Y, Z) is represented by the set of vectors [ωX ωY ωZ ω]^T. Although homogeneous coordinates are redundant, they are very useful as they allow us to express otherwise non-linear transformations linearly. In particular, the perspective projection of a 3D scene point onto a 2D image plane can be written as the following linear equation in homogeneous coordinates:

$$ \begin{bmatrix} u \\ v \\ w \end{bmatrix} = P \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \qquad (2.1) $$
<Figure 1> Typical Optical 3D-measurement Setup
In this equation, (X, Y, Z) are the coordinates of a scene point, and (x,y) = (u/w, v/w)
are the coordinates of its projection. The projection matrix P is a 3x4 matrix defined
up to a scalar factor that captures both the extrinsic and intrinsic camera parameters.
The extrinsic parameters specify the position and orientation of the camera with
respect to the scene coordinate system, while the intrinsic parameters specify the focal
length, the aspect ratio, and the position of the origin of the image coordinate system.
Therefore, if the camera is moved to a new position, we need to change only the
extrinsic parameters. If all parameters are known, we can speak of a calibrated camera.
Camera calibration can be achieved by observing a special calibration object whose
dimensions and position are known.
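As a minimal sketch of Equation (2.1) above (in C++, the language used for the experiments in Chapter 3), the following function projects a scene point through a known 3x4 projection matrix P; the type and function names are illustrative assumptions.

// Project a scene point (X, Y, Z) via homogeneous coordinates:
// [u v w]^T = P [X Y Z 1]^T, then (x, y) = (u/w, v/w).
#include <array>

struct Point2 { double x, y; };

Point2 project(const std::array<std::array<double, 4>, 3>& P,
               double X, double Y, double Z)
{
    double h[4] = { X, Y, Z, 1.0 };
    double uvw[3];
    for (int i = 0; i < 3; ++i) {
        uvw[i] = 0.0;
        for (int j = 0; j < 4; ++j)
            uvw[i] += P[i][j] * h[j];
    }
    return { uvw[0] / uvw[2], uvw[1] / uvw[2] };  // assumes w != 0
}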
<Figure 2> An illustration of perspective projection showing the line of sight that is
used to calculate the projected point (x’,y’) from the object point (x,y,z).
2.1.2 Stereo Geometry
Stereo vision is the process of estimating the depth of scene points from their change in position between two images. Commonly, two or more cameras are used to calculate the depth of scene points (see Figure 3). This is done effortlessly by the human visual system, which translates the differences between the views from the two eyes into a three-dimensional impression of the scene. Figure 4 illustrates how the disparity, or change of image location, of a point is related to its depth for two identical parallel cameras. The figure shows a scene point P and its two images P_L and P_R in the left and right images, respectively. Let us denote the focal length by f and the baseline (the distance between the two cameras) by b. Then, given that the scene point P has depth Z and lateral offset X, and given further that P's images P_L and P_R have coordinates x_L and x_R, we can conclude from consideration of similar triangles that

$$ \frac{x_L}{f} = \frac{X}{Z} \qquad (2.2) $$

$$ \frac{x_R}{f} = \frac{X + b}{Z} \qquad (2.3) $$

The disparity d is

$$ d = x_R - x_L = \frac{fb}{Z} \qquad (2.4) $$

Thus the disparity is proportional to the focal length f and the baseline b, and inversely proportional to the depth Z.
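The following minimal C++ sketch inverts Equation (2.4) to recover depth from disparity for two identical parallel cameras; the guard against non-positive disparity is an added safeguard, not part of the thesis.

// Depth from disparity: Z = f*b/d, with d = xR - xL (Equation 2.4).
#include <stdexcept>

double depthFromDisparity(double f,            // focal length
                          double b,            // baseline
                          double xL, double xR)  // image coordinates
{
    double d = xR - xL;                        // disparity
    if (d <= 0.0)
        throw std::invalid_argument("disparity must be positive");
    return f * b / d;
}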
<Figure 3> Principle of bundle adjustment for image triangulation
<Figure 4> Stereo geometry. The disparity of a scene point P of depth Z.
2.2 Stereo Image Matching Methods
2.2.1 What is stereo matching?
The stereo image matching problem is known as the correspondence problem. The correspondence problem can be stated as follows: for each point in the left image, find the corresponding point in the right image. To decide that two points, one in each image, form a conjugate pair, it is necessary to measure the similarity of the points. Clearly, the point to be matched should be distinctly different from its surrounding pixels; otherwise, every point would be a match. Thus, it is necessary to extract interesting features.
There are several methods for finding matching points, such as region matching, line matching, and point matching. However many methods there are, the objective is always to find matching points between two or more images. For example, in region matching we commonly use the cross-correlation method and the least-squares matching method. When we choose line matching, an edge-based matching method can be used. Point matching uses interest points, which represent high-variance locations such as edges or isolated areas.
Tables 3 and 4 list several of the proposed methods chronologically.
2.2.2 Edge-Based Matching Method
In this matching algorithm, features are extracted from the left and right images by filtering the images, and the features are matched along the epipolar lines. The algorithm uses edges detected by the first derivative of a Gaussian, because edges computed using the gradient of a Gaussian are more reliable with respect to noise. In order to simplify the process, the search for a match for each feature in one image takes place along the corresponding epipolar line in the second image, over a limited distance centered around its expected position.
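As a minimal sketch of the feature-extraction step just described, the following C++ fragment builds a first-derivative-of-Gaussian kernel and convolves one scanline with it; the kernel radius and sigma are illustrative assumptions, since the thesis does not specify them. Edges are then taken at local extrema of the response along each epipolar line.

// First derivative of a Gaussian, sampled on [-radius, radius].
#include <vector>
#include <cmath>

std::vector<double> gaussianDerivKernel(double sigma, int radius)
{
    std::vector<double> k(2 * radius + 1);
    for (int i = -radius; i <= radius; ++i)
        // d/dx of exp(-x^2 / (2 sigma^2)), up to a constant factor
        k[i + radius] = -i * std::exp(-(double)(i * i) / (2.0 * sigma * sigma));
    return k;
}

// Convolve one scanline with the kernel (borders left at zero).
std::vector<double> convolveRow(const std::vector<double>& row,
                                const std::vector<double>& k)
{
    int r = (int)k.size() / 2;
    std::vector<double> out(row.size(), 0.0);
    for (int x = r; x + r < (int)row.size(); ++x)
        for (int i = -r; i <= r; ++i)
            out[x] += row[x - i] * k[i + r];
    return out;
}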
<Table 3> Properties of some correspondence algorithms for image matching

Author                           Year        Features and attributes             Similarity measure
Hannah                           1974        Interest operator (variance)        Correlation
Moravec                          1980        Interest operator (interest value)  Intensity difference
Barnard and Thompson             1981        Interest operator (interest value)  Intensity difference
Dreschler                        1982        Corners, blobs                      Intensity difference class
Baker and Binford                1981        Edges                               Sign, strength
Grimson                          1981/1985   Edges                               Direction, sign
Stockman, Kopstein, and Bennett  1982        Abstract edges                      Class
Benard                           1983        Edges, blobs                        Intensity difference
Nagel and Enkelmann              1983/1986   Corners                             Gradient + intensity difference, direction
Förstner                         1984/1986   Corners, blobs                      Roundness and curvature of ACF
Zimmer and Kories                1984        Blobs                               SNR, interest value, distinctness, class
Ohta and Kanade                  1985        Edges                               Intensity difference
Barnard                          1986        Intensity values                    Interest value
Ayache and Faverjon              1985/1987   Edges                               Orientation sign
Witkin, Terzopoulos, and Kass    1987        Intensity values                    Intensity constraint, correlation
<Table 4> Precision of image matching and edge detection (precision in pixels; empirical, simulated, or theoretical values)

Author                           Year   Precision   Application/Remarks
Sharp, Christensen, and Gilman   1965   1           Digital terrain models
Bernstein                        1973   0.1         Registration
Klaasman                         1975   0.05        Edge detection
Cafforio and Rocca               1976   0.1         TV image sequences
McGillem and Svedlov             1976   0.5/SNR     Registration
Hill                             1980   0.02-0.1    Binary images, target location
Huang/Hsu                        1981   0.02-0.1    Parallax estimation
Förstner                         1982   0.01-0.1    Parallax estimation
Thurgood and Mikhail             1982   0.02-0.1    Target location
Ackermann and Pertl              1983   0.02-0.2    Parallax estimation
Ho                               1983   0.05-0.1    Binary images, target location
Grün/Baltsavias                  1985   0.05-0.1    Parallax estimation
Vosselman                        1986   0.02-0.03   Target location
2.2.3 Region Correlation
An important limitation of edge-based matching methods for stereo matching is that the computed depth is not meaningful along occluding edges.
Region correlation finds a matching point using the region around the point. We choose one point, together with the area around it covered by a reference window of a specific size, in the first image, and then select in the second image the searching area where the matching point is expected to lie. The searching area is also called the searching window. If the reference window size is N×M, the correlation between the two windows is
$$ r(n,m) = \frac{\sum_{x=1}^{N}\sum_{y=1}^{M}\{G_w(x,y)-\bar G_w\}\,\{G_s(x,y)-\bar G_s\}}{\left[\sum_{x=1}^{N}\sum_{y=1}^{M}\{G_w(x,y)-\bar G_w\}^2 \;\sum_{x=1}^{N}\sum_{y=1}^{M}\{G_s(x,y)-\bar G_s\}^2\right]^{1/2}} \qquad (2.5) $$

where
r(n,m): correlation coefficient, −1 ≤ r(n,m) ≤ 1
G_w(x,y): the grayscale value at point (x,y) in the reference window
G_s(x,y): the grayscale value at point (x,y) in the searching window
\bar G_w: the mean value of the reference window
\bar G_s: the mean value of the searching window
First, we find the maximum of r(n,m) between the reference and searching windows; if this value is above a certain threshold, we conclude that the corresponding point is the matching point.
This method, however, has several problems. For example, if the reference window is not big enough or the searching window is not small enough, the error rate increases.
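A minimal C++ sketch of the correlation coefficient r of Equation (2.5), evaluated for one placement of the reference window inside the searching window, might look as follows; the flattened window layout and the names are illustrative assumptions.

// Normalized cross-correlation of two equally sized windows.
#include <vector>
#include <cmath>

double correlation(const std::vector<double>& Gw,  // reference window, N*M values
                   const std::vector<double>& Gs)  // same-size patch of the searching window
{
    const int n = (int)Gw.size();
    double meanW = 0.0, meanS = 0.0;
    for (int i = 0; i < n; ++i) { meanW += Gw[i]; meanS += Gs[i]; }
    meanW /= n; meanS /= n;

    double num = 0.0, varW = 0.0, varS = 0.0;
    for (int i = 0; i < n; ++i) {
        double a = Gw[i] - meanW, b = Gs[i] - meanS;
        num += a * b; varW += a * a; varS += b * b;
    }
    double den = std::sqrt(varW * varS);
    return den > 0.0 ? num / den : 0.0;  // -1 <= r <= 1
}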
2.2.4 Least-Squares Matching Method
In this method, the matching point is the point that minimizes the squared gray-value difference between the reference and searching windows. We assume that the reference window size is N×M, that the gray value at a point of this window is G_w(x_w, y_w), that the gray value at a point of a searching window of the same size is G_s(x_s, y_s), and that the gray value at a nearby point in the searching window is G_s^0(x_s^0, y_s^0). We need to find the point (x_s, y_s) that makes the difference e(x,y) between the two windows small:

$$ G_w(x_w, y_w) - G_s(x_s, y_s) = e(x, y) \qquad (2.6) $$

Linearizing around (x_s^0, y_s^0), we can rewrite this equation as

$$ G_w(x_w, y_w) - e(x, y) = G_s^0(x_s^0, y_s^0) + g_{s_x}\,dx_s + g_{s_y}\,dy_s \qquad (2.7) $$

where

$$ g_{s_x} = \frac{\partial G_s^0(x_s^0, y_s^0)}{\partial x_s} \qquad (2.8) $$

$$ g_{s_y} = \frac{\partial G_s^0(x_s^0, y_s^0)}{\partial y_s} \qquad (2.9) $$
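The following minimal C++ sketch shows one least-squares update implied by Equations (2.6)-(2.9): it linearizes the searching window around the current position and solves the 2x2 normal equations for the shift (dx_s, dy_s). All names are illustrative assumptions, and the outer iteration around this step is omitted.

// One Gauss-Newton step of least-squares matching.
#include <vector>

bool lsmStep(const std::vector<double>& e,   // residuals Gw - Gs0, per pixel
             const std::vector<double>& gx,  // dGs0/dx per pixel
             const std::vector<double>& gy,  // dGs0/dy per pixel
             double& dx, double& dy)
{
    double a = 0, b = 0, c = 0, p = 0, q = 0;  // normal-equation sums
    for (size_t i = 0; i < e.size(); ++i) {
        a += gx[i] * gx[i];  b += gx[i] * gy[i];  c += gy[i] * gy[i];
        p += gx[i] * e[i];   q += gy[i] * e[i];
    }
    double det = a * c - b * b;
    if (det == 0.0) return false;        // degenerate texture
    dx = ( c * p - b * q) / det;         // solve [a b; b c][dx dy]^T = [p q]^T
    dy = (-b * p + a * q) / det;
    return true;
}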
2.2.5 Gradient-Based Matching
This matching method is similar to region correlation in that it measures how well two locations in the two images resemble each other. It differs, however, in that it uses the gradient values of the images rather than the intensity values used in region correlation. If g_L and g_R are the two gradient vectors to be compared, we use the average magnitude of the two gradients,

$$ m = \frac{|g_L| + |g_R|}{2} \qquad (2.10) $$

at a certain point to represent confidence, and the negated magnitude of the difference of the two gradients,

$$ -d = -|g_L - g_R| \qquad (2.11) $$

to represent similarity. In this method, we combine similarity and distinctiveness into a single measure of evidence for a match at a certain location under a certain displacement. Thus we define the evidence as

$$ e = m - \alpha d = \frac{|g_L| + |g_R|}{2} - \alpha\,|g_L - g_R| \qquad (2.12) $$

Here the similarity expresses how well the two locations in the two images match, and the distinctiveness expresses how confident we can be that the match is correct.
In order to apply this method to discrete images, we approximate the gradients by differences. After an initial smoothing step with a Gaussian filter to compensate for quantization error and noise, the gradients in the x and y directions are computed by convolution with the simple filter coefficients [-1 0 1] and [-1 0 1]^T. In these experiments, we used a Gaussian filter to reduce noise in the images.
<Figure 5> A reference image area (10x10), around an interesting point in the 1024x768 left image, and a searching image area (100x100) in the 1024x768 right image
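A minimal C++ sketch of the evidence measure of Equation (2.12), evaluated for one displacement of an interesting point inside the searching window of Figure 5, might look as follows; the gradient vectors are assumed precomputed, and alpha is a free parameter.

// Evidence e = (|gL| + |gR|) / 2 - alpha * |gL - gR|.
#include <cmath>

struct Grad { double x, y; };

double magnitude(Grad g) { return std::sqrt(g.x * g.x + g.y * g.y); }

double evidence(Grad gL, Grad gR, double alpha)
{
    Grad diff = { gL.x - gR.x, gL.y - gR.y };
    double m = (magnitude(gL) + magnitude(gR)) / 2.0;  // confidence
    double d = magnitude(diff);                        // dissimilarity
    return m - alpha * d;
}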
2.3 Interesting Points
To match points from two images, we need points that can be easily identified and matched in both images. Needless to say, points in a uniform region are not good candidates for matching. The interest operator finds areas of the image with high variance, such as edges or isolated areas. It is expected that there will be enough such areas in the images for matching.
2.3.1 The Difference of Direction
The variances along different directions, computed using all pixels in a window centered about a point, are good measures of the distinctness of the point along those directions. The directional variances are given by

$$ I_1 = \sum_{(x,y)\in S} [f(x,y) - f(x, y+1)]^2 \qquad (2.13) $$

$$ I_2 = \sum_{(x,y)\in S} [f(x,y) - f(x+1, y)]^2 \qquad (2.14) $$

$$ I_3 = \sum_{(x,y)\in S} [f(x,y) - f(x+1, y+1)]^2 \qquad (2.15) $$

$$ I_4 = \sum_{(x,y)\in S} [f(x,y) - f(x+1, y-1)]^2 \qquad (2.16) $$
where S represents the pixels in the window. Typical window sizes range from 5×5 to 11×11 pixels. Since simple edge points have no variance in the direction of the edge, the minimum value of the above directional variances is taken as the interest value at the central pixel (x_c, y_c). This eliminates edge pixels from consideration, since an edge pixel in one image would match all pixels along the same edge in the second image, making it difficult to determine the exact disparity. Thus, we have

$$ I(x_c, y_c) = \min(I_1, I_2, I_3, I_4) \qquad (2.17) $$

Finally, to prevent multiple neighboring points from being selected as interesting points for the same feature, feature points are chosen where the interest measure has a local maximum. A point is considered a good interesting point if this local maximum is greater than a preset threshold.
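A minimal C++ sketch of Equations (2.13)-(2.17) might look as follows; the image representation and window radius are illustrative assumptions, and the caller must keep the window at least one pixel inside the image border.

// Minimum directional variance over a (2r+1)x(2r+1) window at (xc, yc).
#include <vector>
#include <algorithm>

double interestValue(const std::vector<std::vector<double>>& f,  // f[y][x]
                     int xc, int yc, int r)
{
    double I1 = 0, I2 = 0, I3 = 0, I4 = 0;
    for (int y = yc - r; y <= yc + r; ++y)
        for (int x = xc - r; x <= xc + r; ++x) {
            double v = f[y][x];
            I1 += (v - f[y + 1][x]) * (v - f[y + 1][x]);          // vertical
            I2 += (v - f[y][x + 1]) * (v - f[y][x + 1]);          // horizontal
            I3 += (v - f[y + 1][x + 1]) * (v - f[y + 1][x + 1]);  // diagonal
            I4 += (v - f[y - 1][x + 1]) * (v - f[y - 1][x + 1]);  // anti-diagonal
        }
    return std::min(std::min(I1, I2), std::min(I3, I4));          // Equation (2.17)
}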
2.3.2 The Förstner Interest Operator
Many other methods have been suggested, by Moravec, Hannah, Förstner, and others. Among them, the Förstner Interest Operator is the most commonly used. The Förstner Interest Operator finds edges, outstanding points, and the centers of circles.
First, the image gradients g_x and g_y are computed along each axis, using the Roberts operator or the Sobel operator:

Roberts operator:

$$ h_m(m,n) = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \qquad h_n(m,n) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \qquad (2.18) $$

Sobel operator:

$$ h_m(m,n) = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \qquad h_n(m,n) = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 0 & 0 \\ -1 & -2 & -1 \end{bmatrix} \qquad (2.19) $$
After convolving the whole image with one of these operators, we form the normal matrix

$$ N = \begin{bmatrix} \sum g_x^2 & \sum g_x g_y \\ \sum g_x g_y & \sum g_y^2 \end{bmatrix} \qquad (2.20) $$

For example, over a small window (7×7), the interest values q (the roundness) and w (the weight) are as follows:

$$ w = \frac{|N|}{\mathrm{trace}(N)} \qquad (2.21) $$

$$ q = \frac{4|N|}{\mathrm{trace}^2(N)}, \qquad 0 < q < 1 \qquad (2.22) $$

The weight w is related to the size of the error ellipse and is proportional to contrast, and q is related to the shape of the error ellipse.
If we apply thresholds, the weights become

$$ w(x, y) = w(x, y) \quad \text{if } q(x,y) \ge q_{\min} \text{ and } w(x,y) \ge w_{\min} \qquad (2.23) $$

$$ w(x, y) = 0 \quad \text{otherwise} \qquad (2.24) $$

The values q_min and w_min are thresholds. In this thesis, we chose, experimentally,

$$ q_{\min} = 0.6 \qquad (2.25) $$

$$ w_{\min} = \frac{w_{\max} + w_{\mathrm{avg}}}{5} \qquad (2.26) $$

After thresholding w by w_min, we remove non-local maxima: we discard all weights except the maximum w within a specific window (13×13 pixels in this experiment).
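The following minimal C++ sketch computes the Förstner measures of Equations (2.20)-(2.22) for one window; the gradient images and window bounds are illustrative assumptions.

// Accumulate the normal matrix N over a window and derive w and q.
#include <vector>

struct Forstner { double w, q; };

Forstner forstnerMeasures(const std::vector<std::vector<double>>& gx,  // gx[y][x]
                          const std::vector<std::vector<double>>& gy,  // gy[y][x]
                          int xc, int yc, int r)  // e.g. r = 3 for a 7x7 window
{
    double sxx = 0, sxy = 0, syy = 0;
    for (int y = yc - r; y <= yc + r; ++y)
        for (int x = xc - r; x <= xc + r; ++x) {
            sxx += gx[y][x] * gx[y][x];
            sxy += gx[y][x] * gy[y][x];
            syy += gy[y][x] * gy[y][x];
        }
    double det = sxx * syy - sxy * sxy;          // |N|
    double tr  = sxx + syy;                      // trace(N)
    if (tr == 0.0) return { 0.0, 0.0 };          // flat window: no interest
    return { det / tr, 4.0 * det / (tr * tr) };  // w (2.21), q (2.22)
}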
3. Experimental System of H/W and S/W
3.1 H/W Specification
The stereo image matching system consists mainly of a computer system and a camera system, shown in Table 5 and Table 6 respectively.
<Table 5> Image matching system specification

CPU:           Pentium II (350 MHz)
RAM:           128 MB
Video card:    Matrox Millennium G-200
Capture board: MV-1500
Camera:        PULNiX TM-1001
Before applying the stereo matching algorithm, we need to acquire the stereo images. In this thesis, we used a PULNiX TM-1001 camera and an MV-1500 capture board. For the image processing and image matching algorithms, we captured 1024×768-pixel grayscale images. This image size is rather large, so the algorithm takes considerable time to run; in return, however, we obtain accurate results. The detailed camera specification is shown in Table 6.
<Table 6> Camera specification (PULNiX TM-1001)

Imager:              1" (9.1 mm x 9.2 mm) progressive-scanning interline-transfer CCD
Pixels:              1024 (H) x 1024 (V)
Cell size:           9.0 µm x 9.0 µm
Sync.:               Internal/external auto switch; HD/VD, 4.0 Vp-p, impedance 4.7 kΩ; VD = 15 Hz ±5%, non-interlace; HD = 15.75 kHz ±5%
Data clock output:   20.034 MHz (40.068 MHz for DSP)
Resolution:          Digital: 1008 (H) x 1018 (V); analog: over 700 TV lines (H) x 800 TV lines (V)
S/N ratio:           50 dB min. (AGC off)
Min. illumination:   1.0 lux, f = 1.4 without IR cut filter (no shutter); sensitivity 10 µV/e
Video output:        1.0 Vp-p composite video, 75 Ω, and 8-bit RS-422 output
AGC:                 ON*/OFF (OFF std.); *AGC applies to the analog output only
Gamma:               0.45 or 1.0 (1.0 std.)
Lens:                C-mount (use 1" format lenses), adjustable back focus
Power req.:          12 V DC, 500 mA
Operating temp.:     -10 °C to 50 °C
Vibration & shock:   Random vibration 7 G (200 Hz to 2000 Hz); shock 70 G
Size (W x H x L):    44 mm x 48 mm x 136 mm (1.73" x 1.91" x 5.35")
Weight:              330 grams (11.6 oz)
Power supply:        PD-12P (includes power connector)
Auto iris connector: None
I/O:                 MP-211-031-113-4300 31-pin mating connector; 30DG-02 or 30DG-02K interface cable
3.2 S/W Specification
The image matching software specification is shown in Table 7.

<Table 7> Software specification

Operating system: Windows 98 SE
Language:         Visual C++ 6.0

In order to find matching points among two or more images, we acquire several images from different camera positions. First, to save time, we find the interesting points on one image. Of the several available methods, we adopted the Förstner Interest Operator, which is very commonly used for finding interesting points in real images. In the next step, we find matching points based on the interesting points found, using the gradient-based image matching method. Figure 6 shows the detailed flow chart.
The flow chart comprises the following steps:

START
→ Find interest points: calculate the gradients g_x and g_y of the image, form the normal matrix N, compute the weight w, and remove non-local maxima.
→ Find matching points: apply Gaussian filtering to remove noise, compute the evidence value e, and find the maximum value of e.
→ Draw lines between matching points.
END

<Figure 6> Flow chart of the system
4. Results
4.1 Result of Finding Interest Points Using the Förstner Interest Operator
Figure 7 shows the original image pair. Figure 8 shows the results of finding interesting points; the white points represent the interest points found. We use the thresholding values shown in Equations (2.25) and (2.26).
After this first step of finding interest points, we apply the removal of non-local maxima. The results are shown in Figure 9; the white points display the final interesting points. Figure 11 likewise shows the interesting points of the pair in Figure 10 after removing non-local maxima.
<Figure 7> Original pair images. (a) The left image. (b) The right image.

<Figure 8> The results of finding interesting points. The white points represent interesting points. (a) The left image. (b) The right image.

<Figure 9> The final results of finding interesting points of <Figure 7> by removing non-local maxima. The white points represent interesting points. (a) The left image. (b) The right image.

<Figure 10> Original pair images. (a) The left image. (b) The right image.

<Figure 11> The final results of finding interesting points by removing non-local maxima. The white points represent interesting points. (a) The left image. (b) The right image.
4.2 Applying the Gradient-Based Image Matching Method to Real Images
After finding the interesting points in one image, we need to find the matching points in the other image. To find the best match for an isolated point, all we can do is maximize E_δ at this point over all displacements δ under consideration (see Figures 12 and 13). Doing this for every point is not very efficient and produces a noisy and inconsistent displacement field.
To deal with this problem, motion computation methods usually make the assumption that nearby points have similar displacements, based on the observation that motion in real scenes varies smoothly almost everywhere. Many point-oriented methods utilize the assumption of a smooth motion field after computing initial matches, by smoothing the displacement field, often employing some confidence measure associated with each match to constrain the smoothing process.
In contrast, our displacement-oriented method uses the assumption of a smooth motion field while finding the matches. The idea is that if a certain displacement δ aligns two matching objects, E_δ will have a strong positive response at the location of the match. By accumulating E_δ over a certain area, dominant motions can be detected. That is, only the correct displacement will yield support for a match over a larger area, thereby creating a maximum among all δ under consideration. A minimal sketch of this accumulation follows.
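The C++ sketch below illustrates this accumulation idea; evidence values E_δ are supplied by a caller-provided callable (such as the evidence() sketch in Section 2.2.5), and the search ranges are illustrative assumptions.

// Sum E_delta over an area for every candidate displacement; keep the maximum.
#include <utility>
#include <limits>

// Ed(x, y, dx, dy) returns the evidence at (x, y) under displacement (dx, dy).
template <typename EvidenceAt>
std::pair<int, int> dominantDisplacement(EvidenceAt Ed,
                                         int x0, int y0, int x1, int y1,
                                         int maxShift)
{
    std::pair<int, int> best{0, 0};
    double bestSum = -std::numeric_limits<double>::infinity();
    for (int dy = -maxShift; dy <= maxShift; ++dy)
        for (int dx = -maxShift; dx <= maxShift; ++dx) {
            double sum = 0.0;
            for (int y = y0; y <= y1; ++y)
                for (int x = x0; x <= x1; ++x)
                    sum += Ed(x, y, dx, dy);   // accumulate E_delta over the area
            if (sum > bestSum) { bestSum = sum; best = {dx, dy}; }
        }
    return best;                               // the dominant displacement
}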
The two-dimensional stereo image matching results are shown in Figures 14 and 15, and the three-dimensional stereo image matching results are shown in Figures 16 to 24.
<Figure 12> Result of gradient-based evidence (1): the evidence value plotted over the searching window.

<Figure 13> Result of gradient-based evidence (2): the evidence value plotted over the searching window.
<Figure 14> The 2D matching result. The lined white points represent matched pairs.
<Figure 15> The 2D matching result. The lined white points represent matched pairs.
<Figure 16> Original captured images. (a) The image of the left view. (b) The image of the right view.

<Figure 17> The 3D image matching result of <Figure 16> with error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points.

<Figure 18> The 3D image matching result of <Figure 16> after eliminating error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points.

<Figure 19> Original captured images. (a) The image of the left view. (b) The image of the right view.

<Figure 20> The 3D image matching result of <Figure 19> with error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points.

<Figure 21> The 3D image matching result of <Figure 19> after eliminating error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points.

<Figure 22> Original captured images. (a) The image of the left view. (b) The image of the right view.

<Figure 23> The 3D image matching result of <Figure 22> with error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points.

<Figure 24> The 3D image matching result of <Figure 22> after eliminating error points. The white points of the left image represent interesting points; the white points of the right image represent the matched points.
5. Conclusion
In this thesis, we have presented the use of stereo vision for the application of image matching. The majority of the research is concerned with interesting points and a gradient-based image matching algorithm.
It was shown (Section 4.1) that the interesting points in the images are located at the vertices of rectangles, along edges, and at isolated points. Because these kinds of points are good candidates for image matching, we can use them in the next step.
In Section 4.2, the results of applying the proposed image matching algorithm are shown. First, we find the gradient evidence e in the searching window of the right image. Figures 14 and 15 show the 2D image matching results.
Figures 16 to 24 illustrate the results of 3D image matching. By eliminating the wrongly placed points, we obtain the best matching results.
Table 8 summarizes the results of 2D and 3D image matching. We conclude that the proposed image matching algorithm is very efficient.
<Table 8> The result of 2D and 3D image matching

Figure No.   Total Points   Wrongly Placed Points   Correctness
Figure 14    54             0                       100%
Figure 15    60             0                       100%
Figure 18    11             0                       100%
Figure 21    28             0                       100%
Figure 24    22             3                       86%
References

1. Kim Hee-Seung, Image Recognition: Image Processing, Computer Vision, Pattern Recognition, and Neural Networks, Saengneung Publishing, Seoul, 1993
2. Park Sung-Han, "A Study on a Stereo Vision System for Obtaining 3D Information", Hanyang University, 1988
3. Park Hee-Joo, "A Study on Stereo Matching Techniques for the Orientation of Photogrammetry", Sungkyunkwan University, 1989
4. NHK Science and Technical Research Laboratories, Practical Image Processing in the C Language (Korean edition), International Techno Information Institute, Seoul, 1995
5. Lee Seong-Whan, Principles of Pattern Recognition, Hongneung Science Publishing, Seoul, 1994
6. Japan Industrial Technology Center, Introduction to Computer Image Processing, Kijeon Research Institute, 1993
7. Craig A. Lindley, C Image Processing (Korean edition), Dongil Publishing, Seoul, 1991
8. Brian Lee Curless, New Methods for Surface Reconstruction from Range Images, Stanford University, 1997
9. Clark F. Olson, Recognizing 3D Objects from 2D Images Using the Probabilistic Peaking Effect, University of California at Berkeley
10. Dana H. Ballard, Rajesh P.N. Rao, Seeing Behind Occlusions, University of Rochester, 1994
11. Daniel Scharstein, View Synthesis Using Stereo Vision, Cornell University, 1997
12. Dwayne Phillips, Image Processing in C, R&D Technical Books, 1994
13. Edward R. Dougherty, Digital Image Processing Methods, Marcel Dekker, 1994
14. Faugeras, O.D., "What can be seen in three dimensions with an uncalibrated stereo rig", in Proceedings of the Second European Conference on Computer Vision, pp. 563-578, Santa Margherita Ligure, Italy, May 1992
15. Ganapathy, S., "Decomposition of transformation matrices for robot vision", Pattern Recognition Letters 2, pp. 401-412, 1984
16. Gideon P. Stein, Internal Camera Calibration Using Rotation and Geometric Shapes, MIT, p. 7, February 1993
17. Jitendra Malik, Ruth Rosenholtz, Computing Local Surface Orientation and Shape from Texture for Curved Surfaces, University of California at Berkeley
18. Kiriakos N. Kutulakos, Charles R. Dyer, Recovering Shape by Purposive Viewpoint Adjustment, University of Wisconsin
19. Mohr, R. and Arbogast, E., "It can be done without camera calibration", Pattern Recognition Letters 12, pp. 39-43, 1991
20. Morton Nadler, Eric P. Smith, Pattern Recognition Engineering, Wiley Interscience, 1993
21. Rabindranath Dutta, Depth from Motion and Stereo: Parallel and Sequential Algorithms, Robustness and Lower Bounds, University of Massachusetts, 1994
22. Rafael C. Gonzalez, Richard E. Woods, Digital Image Processing, Addison-Wesley, 1992
23. Ramesh Jain et al., Machine Vision, McGraw-Hill, 1995
24. Reg G. Willson, Steven A. Shafer, What is the Center of the Image?, Carnegie Mellon University, 1993
25. Stephen Gomory, Richard Wallace, Cursor Stereo, New York University
26. Steven Maxwell Seitz, Image-Based Transformation of Viewpoint and Scene Appearance, University of Wisconsin
27. Strat, T.M., "Recovering the Camera Parameters from a Transformation Matrix", in Proc. DARPA Image Understanding Workshop, pp. 264-271, Oct. 1984
Abstract (Korean, translated)

In this thesis, we propose interesting-point detection and a gradient-based image matching algorithm for matching stereo images. Stereo image matching can be summarized as the problem of how each point of the left image can be brought into correspondence with its similar point in the right image.
To find matching points by measuring the similarity of the points in the left and right images, we first have to find the interesting points. Interesting points are points that differ strongly from the surrounding background, such as the edges of objects appearing in the image or points standing in isolation.
A widely used method for finding such interesting points is the Förstner Interest Operator, which is effective for finding edges within an image, the centers of circles, and points that differ strongly from the background.
To find the matching points of the images, we first find the interesting points in the left image and then find the points in the right image that correspond to them. The first thing to consider is that searching the whole area of the right image for the point corresponding to an interesting point found in the left image is quite inefficient. Therefore, in this thesis, we reduce the search to a specific region of the right image and then carry out the gradient-based image matching algorithm within it.
Although matching techniques using stereo images captured through cameras suffer from occlusion by other objects and from image compression effects, the experimental results obtained on real images with the method presented here are fully satisfactory.

Keywords: Stereo Vision, Image Matching, Interesting Point, Pattern Recognition, Machine Vision, Robot Vision
Acknowledgements

I am already greeting my third autumn of graduate school life. Two would have been enough, but through my own shortcomings, and with some "help" from the times, it grew a year longer.
First of all, I thank my parents, who always watched over me from behind though I still lack so much, and who for 25 years helped me over the many walls of reality that I could not cross on my own strength, and my one and only older sister, who gave me so much advice.
I also thank my teacher, Professor Kim Hee-Sik, who has helped me in so many ways during my seven years at the University of Seoul. His health is not so good at the moment, and I wish him a speedy recovery. I especially thank the professors of the Department of Control and Instrumentation Engineering, who over no short time gave me not only instruction in their various specialties but also lessons for life: Professor Park Sun-Woo, Professor Choi Ki-Sang, Professor Kim Kyu-Sik, Professor Kim Yong-Chul, Professor Lee Jun-Hwa, and Professor Cho Nam-Ik, now at Seoul National University.
Starting a new life in the Precision Measurement Laboratory in 1997, fresh from four undergraduate years, gave me many challenges and much confidence. I thank Young-Jae, who in those days when I knew nothing gave me much advice on programming, and who gave us the faith that any new game would always be mastered by our laboratory first; now that he has a girlfriend, I imagine he will have a very warm winter. And Pyung-Won, who has still not held his housewarming and who, sitting beside me, always teased me: what can I do about having been born so capable? That is my parents' doing. But Pyung-Won will soon marry as well, so good for him. Looking back, I also want to thank Eung-Chan, "poverty amid plenty", whom I nagged every day in return. To quiet Hyun-Chul, my senior and also my junior, and to sturdy Jae-Hwang: our time together was short, and I am sorry I was not more of a help. Finally, I want to thank the laboratory's newcomer Young-Su, and of course Seung-Young as well.
It would be no exaggeration to say that club activities were everything in my university life. That the club a few clueless freshmen and I founded upon entering in 1993 has carried on to this day seems, in a way, almost a miracle. I thank the many seniors and juniors I met there who became lifelong friends, and especially my classmates of '93: Jae-Seop, Byung-Oh, Byung-Ryul, Yong-Woo, Ms. Woo, Jong-Soon, Hye-Young, and Jin-Hee. Some are still single, though we are all approaching the age of sending out wedding invitations; anyone interested may contact me and I will gladly make introductions.
This year I spent six months in Canada under the pretext of studying English, and it became a year of much thought and reflection. I especially thank Ken, Chyrl, and Ian, who let me think hard about the differences in outlook between us and them, and Anna, who taught me English with real enthusiasm and like a friend. Of the many friends I met in Canada, I sincerely thank Mi-Sun, who as my Canadian "sweetheart(?)" helped me so much, and Yong-Chul and Byung-Chul, who as my roommates for two months worked hard to keep me fed. I also thank Jong-Young, who helped me find a new house.
Finally, I want to send my thanks and love to Ki-Hoon, who must be studying hard in England, and to my lovely girlfriend Young-Hye, with whom I have promised eternal love.

December 1999
Lee Ho-Jae