Image Indexing and Retrieval Using LSH on GIST features

WP2.T2 - Efficient image indexing, both geographically and visually
ATLAS – Advanced Tourism Planning System
Pairwise similarity judgments between all photos in a huge dataset
are not practical, since the number of pairs grows quadratically
with the number of photos. Efficient indexing facilitates information
search and organization in large datasets. Accordingly, near-neighbour
retrieval is proposed: for a query image, only the images within a
distance threshold are retrieved.
This threshold can be chosen:
1. Visually
2. Geographically
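As an illustration of threshold-based near-neighbour retrieval, the sketch below does a plain linear scan for one query. The function names, the toy 2-D vectors and the threshold value are assumptions made for this example. An index such as LSH is meant to approximate this result without scanning the whole dataset.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def near_neighbours(query, items, distance, threshold):
    """Return the keys of all items whose distance to `query` is at most `threshold`."""
    return [key for key, value in items.items() if distance(query, value) <= threshold]


# Toy 2-D "feature vectors" (hypothetical data, for illustration only).
features = {"a.jpg": (0.0, 0.0), "b.jpg": (0.1, 0.0), "c.jpg": (3.0, 4.0)}
result = near_neighbours((0.0, 0.0), features, euclidean, threshold=0.5)
```

The same function works for the geographical case if `distance` is replaced by a geo-distance and the threshold is given in kilometres.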
Visual Indexing
• GIST features
▫ GIST features were computed for each image.
▫ The images were resized to 256x256 pixels for the
feature computation.
▫ Each image yields a 512-dimensional feature vector.
Locality Sensitive Hashing (LSH)
• The LSH implementation by Andoni was used
▫ http://www.mit.edu/~andoni/LSH/
• Each image is assigned to a number of hash
buckets, together with the images that are similar to it.
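To make the bucket assignment concrete, here is a minimal sketch of Euclidean LSH with random projections, h(v) = floor((a·v + b) / w). It is not the Andoni implementation referenced above. The class name, the number of tables, the number of hashes per table and the bucket width are all assumptions chosen for illustration.

```python
import math
import random


class EuclideanLSH:
    """Toy LSH index: several hash tables, each keyed by a tuple of projections."""

    def __init__(self, dim, num_tables=8, hashes_per_table=4, bucket_width=1.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        # One (a, b) pair per hash function: a has Gaussian entries, b is uniform in [0, w].
        self.params = [
            [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, bucket_width))
             for _ in range(hashes_per_table)]
            for _ in range(num_tables)
        ]
        self.tables = [dict() for _ in range(num_tables)]

    def _key(self, table_idx, v):
        # Concatenate the quantized projections of v into one bucket key.
        return tuple(
            math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / self.w)
            for a, b in self.params[table_idx]
        )

    def add(self, name, v):
        """Insert an image name into one bucket of every table."""
        for t, table in enumerate(self.tables):
            table.setdefault(self._key(t, v), []).append(name)

    def candidates(self, v):
        """Return all names that share at least one bucket with v."""
        found = set()
        for t, table in enumerate(self.tables):
            found.update(table.get(self._key(t, v), []))
        return found


index = EuclideanLSH(dim=512)
vec = [0.01 * i for i in range(512)]  # stand-in for a 512-dimensional GIST vector
index.add("img_001.jpg", vec)
found = index.candidates(vec)
```

The candidates returned by such an index are usually checked against the true distance and the chosen threshold before they are reported as neighbours.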
Results
Geographical Indexing
Modern electronic devices record the location at which an
image was taken. The geo-coordinates of an image are a
valuable tool: geographic distance can serve as the distance
threshold.
In a simple first approach, the images in our dataset are
indexed by latitude and longitude. More precisely, the image
dataset is split:
1. According to latitude (North - South regions of Greece)
2. According to longitude (West - East)
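The split above can be sketched as two simple comparisons. The split values below are hypothetical, since the slides do not state where the North/South and West/East boundaries were drawn, and the function name is an assumption as well.

```python
def region_labels(lat, lon, lat_split=38.5, lon_split=23.0):
    """Label a point as North/South and West/East.

    lat_split and lon_split are hypothetical boundaries in decimal degrees.
    """
    north_south = "North" if lat >= lat_split else "South"
    west_east = "East" if lon >= lon_split else "West"
    return north_south, west_east


# Approximate city-center coordinates, used only as sample input.
athens_labels = region_labels(37.98, 23.73)
thessaloniki_labels = region_labels(40.64, 22.94)
```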
Geographical Indexing
A second, slightly more complex approach indexes images
that were taken in large city centers.
The five largest cities of Greece are:
1. Athens
2. Thessaloniki
3. Patra
4. Heraklion
5. Larissa
Geographical Indexing
1. Locate the GPS coordinates of the centers of these cities.
2. Compute the geo-distance between each city center and
each image using the “Haversine formula”*.
3. Set a different distance threshold for each city.
4. Label each image whose geo-distance to a city center is
below that city's threshold with the city's number (1-5).
* http://www.movable-type.co.uk/scripts/latlong.html
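The four steps above can be sketched as follows. The city coordinates are approximate values and the per-city thresholds are hypothetical, since neither is given in the slides. Returning 0 for images outside every city, and picking the nearest city when several match, are also assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return EARTH_RADIUS_KM * 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))


# index: (name, latitude, longitude, threshold in km)
# Coordinates are approximate; thresholds are hypothetical.
CITIES = {
    1: ("Athens", 37.98, 23.73, 20.0),
    2: ("Thessaloniki", 40.64, 22.94, 15.0),
    3: ("Patra", 38.25, 21.73, 10.0),
    4: ("Heraklion", 35.34, 25.13, 10.0),
    5: ("Larissa", 39.64, 22.42, 10.0),
}


def city_index(lat, lon):
    """Return the number (1-5) of the nearest city whose threshold covers the point, else 0."""
    best = None
    for idx, (_name, clat, clon, threshold_km) in CITIES.items():
        d = haversine_km(lat, lon, clat, clon)
        if d < threshold_km and (best is None or d < best[1]):
            best = (idx, d)
    return best[0] if best else 0


in_athens = city_index(37.9715, 23.7267)  # a point close to the Athens center
outside = city_index(36.40, 25.40)        # a point far from all five centers
```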
Geographical Indexing
Finally, a hierarchical clustering algorithm has been
implemented, using the “Haversine formula” as the distance
measure, in order to organize our data geographically.
Haversine formula:
a = sin²(Δφ/2) + cos(φ1)·cos(φ2)·sin²(Δλ/2)
c = 2·atan2(√a, √(1−a))
d = R·c
where φ is latitude, λ is longitude, Δφ = φ2 − φ1, Δλ = λ2 − λ1,
R is earth’s radius (mean radius = 6371 km) and d is the
distance between the two points.
Note that angles need to be in radians before they are
passed to trig functions.
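A minimal sketch of the formula and of a hierarchical clustering built on it is given below. The slides do not state which linkage or stopping rule was used, so the single-linkage criterion, the cut-off distance `max_km` and all function names are assumptions. The sample coordinates are approximate.

```python
import math


def haversine_km(p, q, radius_km=6371.0):
    """Haversine distance between two (lat, lon) points given in decimal degrees."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])  # degrees -> radians
    dphi = math.radians(q[0] - p[0])
    dlam = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return radius_km * c


def agglomerate(points, max_km):
    """Naive single-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than max_km. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(c1, c2):
        # Single linkage: distance between the closest members of the two clusters.
        return min(haversine_km(points[i], points[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > max_km:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Two points near Athens and two near Thessaloniki (approximate coordinates).
points = [(37.98, 23.73), (37.97, 23.72), (40.64, 22.94), (40.63, 22.95)]
athens_to_thessaloniki = haversine_km(points[0], points[2])  # roughly 300 km
clusters = agglomerate(points, max_km=50.0)
```

The naive loop recomputes all cluster distances at every merge, which is fine for a small example but too slow for a large dataset.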