Content-Based Retrieval System
Xiang Lan Zhuo

Abstract

The image retrieval system is implemented in the RGB color space. A 3-D histogram with 17 bins along each axis is built for each image. A match is scored by a histogram intersection algorithm, which computes the sum of the minimum values of corresponding buckets between two histograms. Creating individual histograms for database images at run-time is extremely time consuming, so histograms are instead computed and stored in a text file before the matching program runs. This method, however, limits the ability to dynamically partition images for local histogram matching.





Image Matching - Whole Image Histogram Matching 

Matching images based on their entire histogram is simple and straightforward. To account for different image sizes, each histogram is normalized. If two images are a perfect match (i.e., an image matched against itself), the histogram intersection algorithm returns a value of 1.
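The whole-image matching step can be sketched as follows. This is a minimal sketch using NumPy; the 17-bin count matches the system described above, but the function names are illustrative:

```python
import numpy as np

BINS = 17  # 17 bins along each RGB axis, as described above

def rgb_histogram(image):
    """Normalized 3-D RGB histogram of an H x W x 3 uint8 image."""
    hist, _ = np.histogramdd(
        image.reshape(-1, 3).astype(float),
        bins=(BINS, BINS, BINS),
        range=((0, 256), (0, 256), (0, 256)),
    )
    return hist / hist.sum()  # normalization accounts for different image sizes

def histogram_intersection(h1, h2):
    """Sum of the minimum values of corresponding buckets."""
    return np.minimum(h1, h2).sum()
```

Because the histograms are normalized, matching an image against itself yields exactly 1, and any other pair scores between 0 and 1.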


Image Matching - Interior & Exterior Histogram Matching

For this task, it is assumed that most images in the database contain one dominant object, and that the object is located near the center of the image. Ideally, the interior (object) and exterior (background) regions of the image would be determined dynamically. But given the long processing time needed to build histograms at run-time, the interior is statically defined as a frame covering 33% of the image, centered on it. A histogram database is pre-calculated to include both the interior and exterior histograms for each image. The interior and exterior histogram intersection values are weighted equally in determining matches.
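A sketch of the interior/exterior split and equal weighting, assuming NumPy; the fixed 33% central frame matches the description above, while the names and the exact frame arithmetic are illustrative assumptions:

```python
import numpy as np

BINS = 17

def norm_hist(px):
    """Normalized 3-D RGB histogram over pixel data."""
    h, _ = np.histogramdd(px.reshape(-1, 3).astype(float),
                          bins=(BINS,)*3, range=((0, 256),)*3)
    return h / h.sum()

def interior_exterior_score(query, cand, frac=1/3):
    """Equal-weight sum of interior and exterior histogram intersections."""
    def split(img):
        # Interior = statically defined central frame covering `frac`
        # of each dimension; exterior = everything outside it.
        hgt, wid = img.shape[:2]
        t, b = int(hgt * (1 - frac) / 2), int(hgt * (1 + frac) / 2)
        l, r = int(wid * (1 - frac) / 2), int(wid * (1 + frac) / 2)
        mask = np.ones((hgt, wid), dtype=bool)
        mask[t:b, l:r] = False
        return norm_hist(img[t:b, l:r]), norm_hist(img[mask])

    qi, qe = split(query)
    ci, ce = split(cand)
    inter = np.minimum(qi, ci).sum()  # interior histogram intersection
    exter = np.minimum(qe, ce).sum()  # exterior histogram intersection
    return 0.5 * inter + 0.5 * exter  # equal weighting
```

A self-match again scores 1, since both the interior and exterior intersections are 1.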


Image Matching - Interior/Exterior & Emphasized Background Histogram Matching

Matching an object that is placed in various settings is difficult: it requires a well-defined object that can be easily segmented from its environment. Since some images lack such an object, putting emphasis on matching the overall setting of the image can return relatively good results, especially for settings such as trees and grass. Besides calculating the histogram intersections for the interior and exterior frames as described above, each image in the database is also divided into four quadrants (top-left, top-right, bottom-right, and bottom-left). Histogram intersections are then calculated between each of these quadrants and the exterior histogram of the input image. All six intersection values are weighted equally, giving a maximum total score of 6.
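The six-term score could be sketched like this, again as an illustrative NumPy sketch rather than the system's actual code; the frame fraction and function names are assumptions carried over from the description above:

```python
import numpy as np

BINS = 17

def norm_hist(px):
    """Normalized 3-D RGB histogram over pixel data."""
    h, _ = np.histogramdd(px.reshape(-1, 3).astype(float),
                          bins=(BINS,)*3, range=((0, 256),)*3)
    return h / h.sum()

def intersect(a, b):
    return np.minimum(a, b).sum()

def center_split(img, frac=1/3):
    """Interior = central frame covering `frac` of each dimension; exterior = rest."""
    hgt, wid = img.shape[:2]
    t, b = int(hgt * (1 - frac) / 2), int(hgt * (1 + frac) / 2)
    l, r = int(wid * (1 - frac) / 2), int(wid * (1 + frac) / 2)
    mask = np.ones((hgt, wid), dtype=bool)
    mask[t:b, l:r] = False
    return norm_hist(img[t:b, l:r]), norm_hist(img[mask])

def emphasized_background_score(query, cand):
    """Six equally weighted intersections, each in [0, 1]; maximum score is 6."""
    qi, qe = center_split(query)
    ci, ce = center_split(cand)
    score = intersect(qi, ci) + intersect(qe, ce)
    # Four quadrants of the database image, each matched against the
    # exterior (background) histogram of the input image.
    hgt, wid = cand.shape[:2]
    hm, wm = hgt // 2, wid // 2
    for quad in (cand[:hm, :wm], cand[:hm, wm:], cand[hm:, wm:], cand[hm:, :wm]):
        score += intersect(norm_hist(quad), qe)
    return score
```

Note that even a self-match does not generally reach 6, since each quadrant is compared against the query's exterior histogram rather than against itself; 6 is only the theoretical maximum.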



Comparing results


Original Image: pic.0002.ppm

Rank   Method 1       Method 2       Method 3
1      pic.0002.ppm   pic.0002.ppm   pic.0002.ppm
2      pic.0237.ppm   pic.0544.ppm   pic.0001.ppm
3      pic.0199.ppm   pic.0252.ppm   pic.0583.ppm
4      pic.0057.ppm   pic.0237.ppm   pic.0661.ppm
5      pic.0011.ppm   pic.0064.ppm   pic.0237.ppm
6      pic.0006.ppm   pic.0049.ppm   pic.0631.ppm
7      pic.0401.ppm   pic.0207.ppm   pic.0064.ppm
8      pic.0662.ppm   pic.0661.ppm   pic.0057.ppm
9      pic.0583.ppm   pic.0081.ppm   pic.0207.ppm
10     pic.0661.ppm   pic.0026.ppm   pic.0584.ppm




Original Image: pic.0017.ppm

Rank   Method 1       Method 2       Method 3
1      pic.0017.ppm   pic.0017.ppm   pic.0017.ppm
2      pic.0002.ppm   pic.0016.ppm   pic.0002.ppm
3      pic.0019.ppm   pic.0019.ppm   pic.0019.ppm
4      pic.0024.ppm   pic.0004.ppm   pic.0024.ppm
5      pic.0005.ppm   pic.0024.ppm   pic.0005.ppm
6      pic.0025.ppm   pic.0020.ppm   pic.0025.ppm
7      pic.0007.ppm   pic.0025.ppm   pic.0088.ppm
8      pic.0088.ppm   pic.0088.ppm   pic.0021.ppm
9      pic.0009.ppm   pic.0022.ppm   pic.0009.ppm
10     pic.0022.ppm   pic.0010.ppm   pic.0022.ppm




Original Image: pic.0250.ppm

Rank   Method 1       Method 2       Method 3
1      pic.0250.ppm   pic.0250.ppm   pic.0250.ppm
2      pic.0251.ppm   pic.0251.ppm   pic.0251.ppm
3      pic.0072.ppm   pic.0257.ppm   pic.0147.ppm
4      pic.0451.ppm   pic.0184.ppm   pic.0184.ppm
5      pic.0146.ppm   pic.0093.ppm   pic.0405.ppm
6      pic.0257.ppm   pic.0451.ppm   pic.0210.ppm
7      pic.0154.ppm   pic.0405.ppm   pic.0154.ppm
8      pic.0100.ppm   pic.0447.ppm   pic.0451.ppm
9      pic.0334.ppm   pic.0334.ppm   pic.0447.ppm
10     pic.0447.ppm   pic.0259.ppm   pic.0142.ppm

The major difference between object recognition and content-based image retrieval is that the latter does not assume prior knowledge of the exact shape or appearance of an object. Generic object recognition tries to identify objects based on their prototypical features, whereas content-based image retrieval relies heavily on statistical analysis. Object recognition can be quite efficient and accurate by identifying object features. Put another way, content-based image retrieval is probabilistic, while object recognition can be either deterministic or probabilistic. Content-based image retrieval can be more difficult since we don't know beforehand what we are searching for.

Histogram-based CBIR returns fairly good results for images that are dominated by a distinctive intensity, such as trees, grass, and sky. When an image's intensity is spread out over the whole spectrum, the result is a rather non-characteristic histogram in RGB space, and matches are not very accurate.

The greatest source of error in the CBIR task is variation in the lighting of the image. In indoor images, lighting can significantly change the intensity of objects and their surroundings. Matching the overall shape of the histogram, rather than its magnitude, might mitigate this.

This system will probably not scale well to a very large database. The histogram for each image has 17x17x17 bins, and matching against a database of 672 images takes about 10 seconds. Instead of scanning every image in the database and returning the top matches, the search could set a minimum score threshold and stop once the number of hits requested by the user is reached. This could shorten search time.
