Wednesday, June 07, 2006

Removing The Background of Training Images

I recently wrote a program to white-out the background of my soda can training images. Since I wanted to reuse the training set from my experiments with Adaboost, I had to implement two different stages to accomplish this.

The first stage involved writing a Perl script to reformat the polygon files that were created by the "Javimap" program I used. This script read all of the polygons in all of the files and merged them into a single text file. Thanks to Perl's built-in regular expression matching capabilities, this process worked quite well.

The second stage involved writing a C program to cut out each soda can subimage from the training images and white-out its background. I chose C because David Lowe's provided code is written in C and contains some functions for manipulating PGM files (which he chose for his SIFT implementation).

In order to white-out the background of the soda-can images, I checked each pixel to see if it was within the polygon that I specified when first creating the training images. To accomplish this task, I implemented an algorithm that was provided on this website for raytracing:

http://neptune.cs.byu.edu/egbert/455/CourseNotes/Lecture13-RayTracingIntersections.ppt

My entire algorithm went as follows:

1) Read each polygon from the Perl-generated text file of polygon coordinates and load its corresponding image.

2) Cycle through each set of polygon coordinates to find the smallest and largest x and y values. Use these values to specify a bounding box for each soda can.

3) Using the bounding box from (2), crop the image to that region, which should contain a single soda can.

4) Translate all polygon coordinates by subtracting the smaller bounding-box coordinate from each, so that they are expressed with respect to (0,0) rather than the smaller bounding-box coordinate. This is necessary because a new image was created from the bounding-box region.

5) For each pixel in the subimage, check whether it lies within the bounding polygon. If not, set that pixel value to 1.0, which is white. To determine whether a pixel is within the polygon, the polygon is translated to a new coordinate system in which the pixel in question is at the origin. The edges of the polygon are then checked to see whether they intersect the positive x-axis, that is, a ray cast from the pixel. If the number of intersections is odd, the pixel is inside the polygon; if it is even, the pixel is outside. More details on this algorithm can be found at the aforementioned website.
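Steps 2, 4 and 5 can be sketched in C roughly as follows. This is a minimal illustration rather than the actual program: the `Point` and `BBox` types, the function names, and the row-major `float` image layout with values in [0, 1] are all assumptions made for the example, and the cropping in step 3 and the PGM reading and writing are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for this sketch; not taken from the original program. */
typedef struct { double x, y; } Point;
typedef struct { double min_x, min_y, max_x, max_y; } BBox;

/* Step 2: smallest and largest x and y over all polygon vertices. */
BBox bounding_box(const Point *poly, size_t n)
{
    assert(n > 0);
    BBox b = { poly[0].x, poly[0].y, poly[0].x, poly[0].y };
    for (size_t i = 1; i < n; i++) {
        if (poly[i].x < b.min_x) b.min_x = poly[i].x;
        if (poly[i].y < b.min_y) b.min_y = poly[i].y;
        if (poly[i].x > b.max_x) b.max_x = poly[i].x;
        if (poly[i].y > b.max_y) b.max_y = poly[i].y;
    }
    return b;
}

/* Step 4: express the vertices relative to the cropped image's origin. */
void translate_polygon(Point *poly, size_t n, double dx, double dy)
{
    for (size_t i = 0; i < n; i++) {
        poly[i].x -= dx;
        poly[i].y -= dy;
    }
}

/* Step 5: count how many edges cross the positive x-axis once the
 * pixel (px, py) has been moved to the origin. Returns 1 for an odd
 * count (inside) and 0 for an even count (outside). */
int inside_polygon(const Point *poly, size_t n, double px, double py)
{
    int crossings = 0;
    for (size_t i = 0; i < n; i++) {
        size_t j = (i + 1) % n;
        double x0 = poly[i].x - px, y0 = poly[i].y - py;
        double x1 = poly[j].x - px, y1 = poly[j].y - py;
        /* Endpoints on opposite sides of the x-axis; y == 0 counts as
         * "above" so a shared vertex is not counted twice. */
        if ((y0 >= 0) != (y1 >= 0)) {
            /* x-coordinate at which the edge meets the x-axis. */
            double xi = x0 - y0 * (x1 - x0) / (y1 - y0);
            if (xi > 0)
                crossings++;
        }
    }
    return crossings % 2 == 1;
}

/* Set every pixel outside the polygon to 1.0 (white). The image is
 * assumed to be row-major with one float per pixel. */
void white_out(float *pixels, int width, int height,
               const Point *poly, size_t n)
{
    for (int row = 0; row < height; row++)
        for (int col = 0; col < width; col++)
            if (!inside_polygon(poly, n, col, row))
                pixels[row * width + col] = 1.0f;
}
```

In this sketch, a caller would compute the bounding box, crop the image to it, pass `min_x` and `min_y` to `translate_polygon`, and then call `white_out` on the cropped pixels. Pixels that fall exactly on a polygon edge may be classified either way by this simple test.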

Now that I have created an algorithm to automatically extract training images with clutter-free backgrounds, I can redo the training set very easily if needed without having to manually white-out the background with Gimp.

The next step will be to proceed with matching input test images against my new database of soda-can images. I will then try applying additional concepts from the SIFT paper to filter out false positives.
