Shawn Lankton Online

vision, science, engineering, and fun

Archive for the 'Vision' Category

I return today from a week-long trip to Anchorage, Alaska. I spent the week enjoying the beautiful mountains and the exciting science being presented at the Conference on Computer Vision and Pattern Recognition (CVPR 2008) [here are some links to lots of papers from the conference]. This was my first trip to this conference, and I must say that I was impressed with the quality of the work presented. Below, I list some of my favorite papers and give a (very) brief overview:

Read the rest of this entry »

Read Comments(0)  

I will be presenting “Tracking Through Changes in Scale” at the International Conference on Image Processing (ICIP) in San Diego in October 2008. This tracker uses a two-phase template matching algorithm in conjunction with a novel template update scheme to keep track of objects as their appearance and size change drastically over the course of a video sequence.

The pdf, presentation material, and citation information will be available on the publications page after the conference. Below are videos of the experiments shown in the paper:

 
LEAVES Sequence (High Resolution Download - 11.2Mb)

 
VEHICLE Sequence (High Resolution Download - 34.8Mb)

 
BOAT Sequence (High Resolution Download - 2.34Mb)

Read Comments(0)  

I took a special topics course in Spring 2008 at Georgia Tech, ECE 8893: Embedded Video Surveillance Systems. The course included three projects, each shown below. Detailed information about each algorithm is in the source code comments. (All of the source is in Python.)

Project 1: Activity Density Estimation

Use background subtraction to find moving foreground objects in a video sequence. Then, color-code regions with the most activity. Here is the result:

Source: p1.py
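
For readers who want the gist without opening the source, here is a minimal sketch of the idea. It is not the code in p1.py: it assumes grayscale frames stored as nested lists, treats the first frame as a static background, and the names `activity_density` and `threshold` are made up for illustration.

```python
def activity_density(frames, threshold=30):
    """Count, per pixel, how many frames differ from the background.

    Assumptions (not taken from p1.py): frames are equally sized 2-D
    lists of grayscale values, and frames[0] is a static background.
    """
    background = frames[0]
    rows, cols = len(background), len(background[0])
    density = [[0] * cols for _ in range(rows)]
    for frame in frames[1:]:
        for r in range(rows):
            for c in range(cols):
                # A pixel counts as foreground when it differs enough
                # from the background at the same location.
                if abs(frame[r][c] - background[r][c]) > threshold:
                    density[r][c] += 1
    return density


if __name__ == "__main__":
    empty = [[10, 10], [10, 10]]
    busy = [[200, 10], [10, 10]]
    print(activity_density([empty, busy, busy, empty]))
```

The per-pixel counts can then be mapped onto a color scale so that the busiest regions stand out.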

Project 2: Styrofoam Airplane Tracking

Find all white styrofoam planes in the scene and track them throughout the scene. We used color thresholding and simple dynamics to do the tracking.

Source: p2.py
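
A rough sketch of the two ingredients, color thresholding and simple dynamics, might look like the following. This is an illustration under my own assumptions, not the code in p2.py: it handles a single plane, stores the image as rows of (R, G, B) tuples, and the function names and the `min_value` cutoff are hypothetical.

```python
def white_centroid(image, min_value=200):
    """Return the (row, col) centroid of near-white pixels, or None.

    A pixel is treated as white when all three channels reach
    min_value (an assumed cutoff, not a value from p2.py).
    """
    row_sum = col_sum = count = 0
    for r, line in enumerate(image):
        for c, (red, green, blue) in enumerate(line):
            if min(red, green, blue) >= min_value:
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    return (row_sum / count, col_sum / count)


def predict_next(previous, current):
    """Constant-velocity guess: next = current + (current - previous)."""
    return (2 * current[0] - previous[0], 2 * current[1] - previous[1])
```

The prediction gives a place to look for the plane in the next frame, which helps when the threshold picks up other white objects.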

Project 3: Pedestrian Tracking

Count and track the pedestrians that cross on a busy sidewalk. We use a combination of motion estimation via background subtraction and feature matching using the Bhattacharyya measure.

Source: p3.py
Final Report: p3.pdf
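
The Bhattacharyya measure compares two histograms. A minimal sketch of the coefficient is below; it is not lifted from p3.py, the function names are my own, and it assumes both histograms have the same number of bins and are not empty.

```python
import math


def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = float(sum(hist))
    return [h / total for h in hist]


def bhattacharyya_coefficient(hist_a, hist_b):
    """Sum of sqrt(p_i * q_i) over bins: 1 for identical, 0 for disjoint."""
    p = normalize(hist_a)
    q = normalize(hist_b)
    return sum(math.sqrt(a * b) for a, b in zip(p, q))
```

A pedestrian's color histogram in one frame can be compared against candidate regions in the next frame, keeping the candidate with the highest coefficient.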

Most of this code is very hack-y because it was done quickly. However, it was
fun to learn Python, and the class was enjoyable overall.

Read Comments(3)  

Python is a very nice programming language. Fast. Simple. Free. I recently spent some time learning it for a class on computer vision. I was using the PIL and numpy packages to make Python feel more like my old friend Matlab.

The two functions that I couldn’t find, and missed the most (especially when writing hack-y code for class projects) were median filtering and morphological dilation. So, in hopes of sparing others the pain of writing them… here they are! The file medfilt_dilate.py has both functions.

medfilt_dilate.py

The medfilt() function uses the PIL filtering code. The dilate() function was written from scratch with NumPy.
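
To show what the two operations do, here is a small dependency-free sketch. It is not the code in medfilt_dilate.py (which relies on PIL and NumPy); the names `median_filter` and `dilate`, the square window, and the nested-list image format are assumptions made for illustration.

```python
from statistics import median_low


def _window(image, r, c, radius):
    """Values in the square window around (r, c), clipped at the border."""
    rows, cols = len(image), len(image[0])
    return [image[i][j]
            for i in range(max(0, r - radius), min(rows, r + radius + 1))
            for j in range(max(0, c - radius), min(cols, c + radius + 1))]


def median_filter(image, radius=1):
    """Replace each pixel with the (lower) median of its window."""
    return [[median_low(_window(image, r, c, radius))
             for c in range(len(image[0]))]
            for r in range(len(image))]


def dilate(image, radius=1):
    """Dilation with a square element: each pixel becomes the window max."""
    return [[max(_window(image, r, c, radius))
             for c in range(len(image[0]))]
            for r in range(len(image))]
```

On an image with a single bright pixel, the median filter removes the pixel while dilation grows it into a square.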

Read Comments(0)  

A few months ago, I wrote about and uploaded some stereo vision work I had been doing. This work was an attempt to implement someone else’s paper. The paper I was implementing had three main components:

  1. Compute an estimate of the pixel disparity
  2. Segment the image with mean-shift segmentation
  3. Use the segments to filter the disparity measurements
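
To give a feel for the first step, here is a sketch of disparity estimation on a single scanline using plain block matching. This is a generic technique and not the method from the paper or from my code; the function name, the sum-of-absolute-differences cost, and the window size are all assumptions.

```python
def scanline_disparity(left, right, max_disparity=4, half_window=1):
    """Per-pixel disparity for one row of a rectified stereo pair.

    For each x in the left row, try every shift d and keep the one whose
    window in the right row (centered at x - d) matches best.
    """
    n = len(left)
    disparities = []
    for x in range(n):
        best_d, best_cost = 0, float("inf")
        for d in range(max_disparity + 1):
            cost = 0
            for k in range(-half_window, half_window + 1):
                # Clamp indices so windows near the border stay valid.
                xl = min(max(x + k, 0), n - 1)
                xr = min(max(x + k - d, 0), n - 1)
                cost += abs(left[xl] - right[xr])
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparities.append(best_d)
    return disparities
```

Textureless stretches of the row match equally well at many shifts, which is one reason the raw estimates are noisy and need the later steps.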

Recently, I was thinking of doing some work on stereo videos. Video processing requires very quick processing speeds, and mean-shift segmentation isn’t very quick! Thus, I started looking at faster ways to perform steps 2 and 3. After some experimentation, I found out that by using a selective mode filter, I was able to get satisfactory results much more quickly. Check out the slide show below for some results!



[red indicates close, blue indicates far away]

Another realization was that I could do Step 1 better too! In my original implementation, I had thrown away some useful feature information that could have been used to get measurements with less noise. Since then, those oversights have been corrected (in the code below and in the other project page).

Now that Steps 2 and 3 have been sped up with selective mode filtering, my implementation of Step 1 is the big bottleneck. The Step 1 code can be made much faster if I re-implement it in C++.

TO DO:

  1. Implement Step 1 code in C++
  2. Capture stereo video
  3. Process video
  4. …
  5. Profit

Can anybody help with suggestions for Step 4? By the way, here’s the code so far.

stereo_modefilt.zip

Read Comments(7)  

The median filter is a well-known image processing filter. It provides a very nice way to smooth an image while preserving edges. The median filter replaces each pixel in the image with the median value of its neighboring pixels. A similar non-linear filter with slightly different properties is the mode filter which replaces each pixel with the mode of its neighboring pixels. I additionally make a slight modification so that “bad” pixels are ignored entirely in the computation of the mode.
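
As a rough illustration of that description, here is a small Python sketch. It is not the downloadable Matlab/C++ code: it assumes pixel values are already quantized to discrete levels, uses `None` as a made-up marker for bad pixels, and the function name is hypothetical.

```python
from collections import Counter

BAD = None  # assumed marker for pixels that have no value


def selective_mode_filter(image, radius=1, bad=BAD):
    """Replace each pixel with the mode of the good pixels in its window."""
    rows, cols = len(image), len(image[0])
    result = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            votes = Counter(
                image[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
                if image[i][j] != bad  # bad pixels never get a vote
            )
            # A pixel surrounded only by bad pixels stays bad.
            out_row.append(votes.most_common(1)[0][0] if votes else bad)
        result.append(out_row)
    return result
```

Because bad pixels are skipped rather than counted, a hole gets filled with the most common good value around it, and an isolated noisy value gets outvoted by its neighbors. Ties between equally common values are broken arbitrarily in this sketch.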

This idea arose when I was trying to de-noise some images as well as do some in-painting of “bad” pixels (that have no value). Consider the image below. The darkest-blue areas are bad pixels: we have no information for those pixels. The other pixels are colored to show how far that pixel is from the camera (see the post on Stereo Vision). However, some of the good pixels still have the wrong value. These are the noise pixels.


original data
[Initial Image]

In the rest of this post I talk about how we use a selective mode filter to convert the above image into the one below. (There’s also downloadable Matlab/C++ code.)


selective modefilt
[Final Result of Selective Mode Filter]

Read the rest of this entry »

Read Comments(1)  

Today, I added demo code for the Hybrid Segmentation project. This segmentation algorithm (in the publications section) can be used to find the boundary of objects in images. This approach uses localized statistics and sometimes gets better results than classic methods. For an example, see the video below: The contour begins as a rectangle, but deforms over time so that it finally forms the outline of the monkey.

This can be used to segment many different classes of image. To try it out, download the demo below and run >>localized_seg_demo

localized_seg.zip

This code is based on a standard level set segmentation; it just optimizes a different energy. I’ve also made a demo which implements the well-known Chan-Vese segmentation algorithm. This technique is similar to the one above, but it looks at global statistics. This makes it more robust to initialization, but it also means that more constraints are placed on the image. Download it and see what you think! Again, unzip the file and run >>region_seg_demo

regionbased_seg.zip
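
To make “global statistics” concrete, here is a sketch of the data term that a Chan-Vese style method tries to minimize: the squared deviation from the mean intensity inside the contour plus the same quantity outside. This is a simplified illustration in Python, not the Matlab demo; it leaves out the curve-length penalty and the level set evolution, and the function name and binary-mask representation are my own assumptions.

```python
def chan_vese_data_energy(image, mask):
    """Squared deviation from the region mean, summed inside and outside.

    image and mask are equally sized 2-D lists; a truthy mask entry
    marks a pixel as inside the contour.
    """
    inside, outside = [], []
    for image_row, mask_row in zip(image, mask):
        for value, is_inside in zip(image_row, mask_row):
            (inside if is_inside else outside).append(value)

    def squared_deviation(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    return squared_deviation(inside) + squared_deviation(outside)
```

A mask that splits the image into two uniform regions has lower energy than one that mixes bright and dark pixels on each side, which is why the method assumes the object and background each have roughly constant intensity.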

For another Matlab implementation of Active Contours check out: James Malcolm’s Webpage. He has code for very fast approximate implementations as well as a full numerical implementation.

Read Comments(4)  

I came across a cute segmentation idea called “Grow Cut” [pdf]. This paper by Vladimir Vezhnevets and Vadim Konouchine presents a very simple idea that has very nice results. I always feel that the simplest ideas are the best! Below I give a brief description of the algorithm and link to the Matlab/C/mex code.

Algorithm

This algorithm is presented as an alternative to graph-cuts. The operation is very simple, and can be thought of with a biological metaphor: Imagine each image pixel is a “cell” of a certain type. These cells can be foreground, background, undefined, or others. As the algorithm proceeds, these cells compete to dominate the image domain. The ability of the cells to spread is related to the image pixel intensity.

The authors give some pseudocode that very concisely describes the algorithm.


//for every cell p
for all p in image
  //copy previous state
  labels_new(p) = labels(p);
  strength_new(p) = strength(p);
  // all neighbors q of p attack
  for all q neighbors of p
    // attack_force shrinks as the colors of p and q differ more
    if(attack_force*strength(q)>strength_new(p))
      labels_new(p) = labels(q)
      strength_new(p) = attack_force*strength(q)
    end if
  end for
end for
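
For anyone who prefers runnable code, the loop can be sketched in Python as follows. This is an illustration and not the Matlab/mex implementation: it assumes a grayscale image with values in [0, 1], a 4-connected neighborhood, an attack force of one minus the absolute intensity difference, seed labels of 1 and -1 with 0 for unlabeled pixels, and function names of my own choosing.

```python
def growcut_step(image, labels, strength):
    """One synchronous update: every cell is attacked by its neighbors."""
    rows, cols = len(image), len(image[0])
    labels_new = [row[:] for row in labels]
    strength_new = [row[:] for row in strength]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                qr, qc = r + dr, c + dc
                if not (0 <= qr < rows and 0 <= qc < cols):
                    continue
                # The attack weakens as the intensities differ more.
                attack_force = 1.0 - abs(image[r][c] - image[qr][qc])
                attack = attack_force * strength[qr][qc]
                if attack > strength_new[r][c]:
                    labels_new[r][c] = labels[qr][qc]
                    strength_new[r][c] = attack
    return labels_new, strength_new


def growcut(image, labels, strength, max_iterations=100):
    """Repeat the update until nothing changes or the budget runs out."""
    for _ in range(max_iterations):
        new_labels, new_strength = growcut_step(image, labels, strength)
        if new_labels == labels and new_strength == strength:
            break
        labels, strength = new_labels, new_strength
    return labels
```

Seed pixels start with strength 1 and everything else with strength 0, so labels spread outward from the seeds and stall at strong intensity edges.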

Results

Once implemented, this is a nice way to get segmentations. It is quite fast, and the initialization is very intuitive. Consider this picture of a lotus flower:

growcut image

I made an initialization by clicking 20 points in the flower and 30 points outside. I then made a “label map” where unlabeled pixels are 0 (gray), foreground pixels are 1 (white) and background pixels are -1 (black).

growcut seeds

Based on this simple initialization, we obtain a very decent segmentation:

growcut output

As you can see, it isn’t perfect, but it is quite good. It’s possible to interactively refine the seed points to improve the segmentation, but I didn’t do that here.

Downloads

I implemented this code in Matlab (using mex files due to the extensive use of for loops). You can download this below with compiled binaries for mac, linux, and windows. Unzip the file and run >>growcut_test for a demo.

UPDATE: I’ve made a bug-fix thanks to a note from a reader, Lin. The code works much better now!

Please let me know if you find this useful, and if you make improvements! Also, if you’re interested in segmentation, check out these other algorithms:

Mean-Shift Image Segmentation
Localized Active Contours
Classic “Chan-Vese” Active Contours

Read Comments(0)  

After spending about an hour fighting with PIL (Python Imaging Library) and trying to get it to install properly with all of its dependencies, I discovered that some wonderful person posted a ready-made PIL installer for OS X. This worked like a charm. First try. No problems.

God I love it when people do stuff like this. Now, hang on as I learn to do image processing with Python on my mac.

Read Comments(3)  

2D is nice, but these days I’m getting interested in doing computer vision in 3D. One way to get 3D data is to use two cameras and determine distance by looking at the differences in the two pictures (just like eyes!). In this project I show some initial results and code for computing disparity from stereo images.

Read the rest of this entry »

Read Comments(8)