Wednesday, April 05, 2006

Finding an Image Annotation Program

Rebuilding the Training Set

After I lower the false positive rate on my soda can classifiers, I plan to try some sort of SIFT-based approach to detecting soda cans and other objects as well. Before attempting to lower the false positive rate, I need to rebuild the training set...again. I believe that the horizontally-oriented soda cans that I have been extracting from my training footage are throwing off my detection rate. Furthermore, I have been mistakenly training classifiers using 720x480 resolution BMP images extracted from the training MPEG movies.

At the time, I was using Adobe Premiere to extract video frames, since I had not yet set up ffmpeg and it had not occurred to me to use it. To work in Adobe Premiere, I had to convert the MPEG movies to AVI, and the conversion produced 720x480 output by default. I did not notice this and mistakenly thought that the original footage was 720x480 when in fact it was 320x240.

By rebuilding the training set, I will be able to use smaller sliding windows for each classifier, which should reduce both the training and detection times by quite a bit. When the original footage was accidentally resized to 720x480, I believe that the pixels were blurred. It will be interesting to see how much the detection rates are affected by using unmodified 320x240 footage. The 320x240 footage probably contains a greater amount of noise. However, it will also train faster, with each image taking up less memory, which will probably allow me to use a greater number of negative training examples and run more bootstrapping rounds with false positives.

Since I will probably soon be applying some SIFT-based approach to soda can detection, it would be beneficial to use bounding polygons instead of bounding rectangles. I think this helps SIFT avoid mistakenly extracting features from the background and allows the spread of detected points to be exploited. With this in mind, I finally decided to create a Java GUI program that will allow the user to specify bounding polygons that will be saved into different files. I will then write some code to find the extreme coordinates of each polygon (min x, max x, min y and max y) and use them as the corners of a rectangular bounding box. I will then use these bounding boxes to retrain some strong classifiers. The polygon data can then be saved for later use with SIFT or some other algorithm that would benefit from tight-fitting polygons.
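As a rough sketch of the polygon-to-rectangle step, assuming each polygon is stored as parallel arrays of integer x and y vertex coordinates (the class name PolygonBounds, the method name, and the array layout are all hypothetical, not taken from any existing program):

```java
// Hypothetical helper: reduces a bounding polygon to an axis-aligned bounding box.
class PolygonBounds {

    /**
     * Returns {minX, minY, maxX, maxY} for the polygon whose i-th vertex
     * is (xs[i], ys[i]). The array layout is an assumption of this sketch.
     */
    static int[] boundingBox(int[] xs, int[] ys) {
        if (xs.length == 0 || xs.length != ys.length) {
            throw new IllegalArgumentException("need matching, non-empty vertex arrays");
        }
        int minX = xs[0], maxX = xs[0];
        int minY = ys[0], maxY = ys[0];
        // Track the extreme coordinate in each direction across all vertices.
        for (int i = 1; i < xs.length; i++) {
            minX = Math.min(minX, xs[i]);
            maxX = Math.max(maxX, xs[i]);
            minY = Math.min(minY, ys[i]);
            maxY = Math.max(maxY, ys[i]);
        }
        return new int[] { minX, minY, maxX, maxY };
    }
}
```

For a four-vertex polygon at (120,40), (180,55), (170,140), (110,130), this gives a box running from (110,40) to (180,140). If the GUI already holds its shapes as java.awt.Polygon objects, calling getBounds() on them should give the same rectangle without any extra code.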

Java Program to Modify

I found the Java program linked below by searching for "image map java" on Google. The potential candidates for image annotation programs seemed overly complex and generally only used bounding boxes. I realized that the image maps one might find on a website need the same sort of region marking as "image annotation toolbox" programs. I wanted to avoid using Matlab since I am less familiar with its object-oriented features than with Java's and I can only use Matlab on my laptop.

Here is the program I will modify:

http://www4.vc-net.ne.jp/~klivo/mapo/javimap.htm

It seems to work extremely well, especially for a 7-year-old program. I will most likely modify it to load all images from an entire directory and to export bounding box data in addition to polygon data.
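Loading a whole directory could look something like the following minimal sketch. It is not based on the program's actual source; the class name ImageDirectory and the list of extensions are assumptions (only BMP is known to be used here), and it relies only on the standard java.io.File API.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: collects the image files in one directory.
class ImageDirectory {

    // Assumed extension list; adjust to whatever formats the frames use.
    static final List<String> EXTENSIONS =
            Arrays.asList(".bmp", ".jpg", ".jpeg", ".png", ".gif");

    /** True if the file name ends with one of the assumed image extensions. */
    static boolean isImageFile(String name) {
        String lower = name.toLowerCase(Locale.ROOT);
        for (String ext : EXTENSIONS) {
            if (lower.endsWith(ext)) {
                return true;
            }
        }
        return false;
    }

    /** Returns the image files directly inside dir, sorted by path name. */
    static List<File> listImages(File dir) {
        List<File> images = new ArrayList<>();
        File[] entries = dir.listFiles(); // null if dir is not a readable directory
        if (entries == null) {
            return images;
        }
        Arrays.sort(entries); // File is Comparable, so frames come back in a stable order
        for (File entry : entries) {
            if (entry.isFile() && isImageFile(entry.getName())) {
                images.add(entry);
            }
        }
        return images;
    }
}
```

The GUI could then step through the returned list one image at a time, writing a polygon file and a bounding box file for each.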
