Tag Archives: fourier transforms

Elliptic Fourier Descriptors

Elliptic Fourier Descriptors (EFDs) are one really good answer to the question: How can I mathematically describe a close contour?

That’s nice, but why would we want to do this?

One good reason, with apologies to Randall Munroe, is that after reading xkcd #26 you probably really wanted to take the Fourier transform of your cat. You and me both! Below, we’ll do exactly that, using the method described by Kuhl and Giardina (1982) in their paper ‘Elliptic Fourier Descriptors with Chain Encodings’.

00

Image: xkcd by Randall Munroe

Of course this isn’t exactly fair- you’ve already taken the Fourier transform of your cat if you’ve ever saved a picture of it as a jpg or listened to it meow over the phone (listen, I don’t know what you do with your free time, but I’m not here to judge) because Fourier transforms are really popular in both video and audio compression.

Step One: Image Pre-Processing

I’m actually a dog person, so I found my creative-commons licensed cat photo through flickr. To make preprocessing easier, I was looking for a monotone cat that was entirely in the frame.

First I extracted the cat from the background using Preview’s great cropping tool. You could automate this segmentation with a script, but for one image it’s easier to do it by hand.

Next I created a binary version of the segmented image with the Matlab one-liner: imwrite(im2bw(imread(‘cat.png’)), ‘bw_cat.png’). The image is binary because every pixel is either pure black or pure white. This binary image can be thought of as a matrix where every pixel makes up one matrix entry and is either 0 (white) or 1 (black).

01

Original image: Emiliyan Dikanarov, via Flickr.

Step Two: Extracting the Boundary

Now that we have our binary image let’s grab our contour. At this step we’ll represent the contour as a list of the (x,y) positions of the pixels that make up the boundary of the cat.

From our preprocessing step we assured that all non-zero (white) pixels belonged to our cat, so we can just walk through each column of our matrix until we find the first non-zero entry. This pixel must be on the boundary of the cat! From there, we can just walk around the contour by following neighboring boundary pixels until we’ve found every pixel in the contour. At this point we’ll have a list of (x, y) positions of all the boundary pixels.

02

But wait!

I claimed that Elliptic Fourier Descriptors were a way to mathematically describe a close contour- why isn’t this description, a list of the (x,y) coordinates of all the pixels in the contour good enough of a mathematical description?

I lied when I motivated EFDs as a great way to take a Fourier transform of your cat.

The real use-case for EFDs is creating features for machine learning algorithms; something like teaching a computer to classify photos of cats from non-cats. What we really need is some way to saying how ‘cat-like’ an object in a photo looks- and that’s why this mathematical representation falls short. This representation isn’t position or scale invariant, and is really susceptible to noise- this means that if I have a more zoomed out photo of the same cat in a different position in the frame or it’s a bit blurry, then my algorithm won’t stand a chance of realizing it’s still a cat.

We’ll see below that all these problems are resolved when we take the Fourier transform.

But first, to make taking the Fourier transform easier, we’ll change out of a coordinate encoding. Instead, we’ll use a Freeman chain encoding, which describes each point in a shape relative to each previous point.

Step Three: Freeman Chain Encodings

A Freeman chain encoding is a piece-wise linear approximation of a continuous curve. Each segment of the continuous curve is approximated by one of eight line segments, denoted by a0 – a7 in the graph below. The line segments with even index have unit length, and the odd line segments with odd with odd index have a length of √2. This will be convenient later on!

03

For our purposes, we’re encoding a contour in a digital image, which is already a discrete approximation of a continuous curve. Below on the left is an example of the type of curve we might want to approximate; the chart on the right shows its Freeman chain encoding.

04

Detour: Fourier Series

In general, a Fourier series is a way to approximate a periodic function. A periodic function is just a waveform, and a Fourier transform breaks complex waveforms into sums of simple ones.

We can also use Fourier series to approximate non-periodic functions so long as we specify an interval. Below I’ll show an example of how we can approximate x2 on [-pi, pi]; we do this by setting the “period” of x2 to be [-pi, pi]. We pretend that the part of the non-periodic function we want to approximate is periodic outside of the interval we’re considering it over.

The really beautiful thing about this approximation is that we can make it arbitrarily good, that is, we can make the error between the approximation and the real function as small as we want.

The Fourier series f(x) is described by the equation below- we see that the Fourier series is parameterized entirely by its constituent an’s and bn’s.

05

This approximation approximates the true function arbitrarily well on any finite interval. If we instead replace the infinite sum with a finite one, from n = 1 to k, this is called the k’th harmonic; the higher the value of k, the better the approximation.

Let’s look at a few harmonics of x2 on [-pi, pi]. The graphs below were generated in WolframAlpha with the command “overlay[{plot FourierSeries[x^2, x, k], plot x^2}]” with k replaced by the number of the harmonic we want to look at (ie, 1, 2, 3).

Notice that the function of the harmonic is written in blue in the legend of the graph. It’s amazing how fast we converge to a good approximation!

06

Click to embiggen.

 

Step Four: Fourier Transform and Time Derivative

Okay, back to what we were doing with our cat.

Ask not what you can do for your Fourier series, but what your Fourier series can do for you.

Now that we have the Freeman chain encoding of our cat contour, and are convinced that Fourier series are the way of the future, let’s look at what they can do for us. Below is a really quick explanation of the Kuhl-Giardina 1982 paper.

Our first goal is to find how we can compute those coefficients we talked about in the previous section, an and bn.

First we’ll notice that we can take separate our chain encoding into its x and y projections. We’ll define xp and yp, the projection of the first p links in our chain encoding to be the sum of the differences between all the previous links.

07

Notice that it’s the exact same story in both the x and y direction, so for ease we’ll just work the x version out below and understand that exactly the same logic will hold for y.

We’ll consider the time derivative of x. When we say “time” here we actually mean the length of the chain, for instance, the “time” contribution of any horizontal or vertical link is 1, and the contribution of any diagonal link is √2. The “time” of the p’th link is just the sum of all the times of the previous links.

08

The punchline is going to be that we’ll write this derivative in two different ways- one way which we can compute from the chain code, and one way which will involve the coefficients we’re trying to find an and bn. Then we can solve for an and bn.

Time derivative: First Form!

09

Here ‘n’ is which harmonic we’re on, T is the period (the interval we’re looking at the function on) and alphan and betan depend on the slope of each line segment.

This is very good news, because we can compute everything in this form of the time derivative! Because we’re using a Freeman chain encoding (just piecewise linear segments) the slope of any line segment is either 0, (plus/minus) 1, or (plus/minus) √2.

Time derivative: Second Form!

10

This is also good for us, because this form includes the coefficients we’re trying to solve for. Solving for an and bn we get:

11

We know how to compute an and bn! T is the period (the interval), n is the number of the harmonic that we’re looking at, p is the index of the chain link, K is the total number of chain links, (del xp/del tp) is the slope of each link, and tp and tp-1 are the lengths of the chain at the p’th link.

The exact same thing happens in the y direction, so we call the an value in the y direction cn and the bn value in the y direction dn.

Step Five: Elliptic Equation

Phew! We figured out how to find all our coefficients. How’s this going to help us with our cat?

We’ll use these coefficients to modify an ellipse- after all, we basically want a modified ellipse to approximate our closed contour.

Below is the equation that gives the n’th Elliptic Fourier Descriptor.

12

Notice that this looks a lot like the equation for an ellipse; the xy term comes in so that we can rotate the ellipse.

Another good thing about these coefficients is that they’re the same regardless of how the contour is rotated, what size it is, or where in the frame of the image it is.

Now we’ve got the cat in the bag.

Step Six: Results!

Using the description we found above, we’ll approximate the cat contour that we found. The true contour is given in blue, and the approximation is given in red.

Notice that with 10 harmonics we already have a pretty good approximation, but with 70 harmonics we can fit much finer features.

For real-world applications, like creating features for machine learning, fitting fine features might be not be desirable, because you might be over-fitting to random noise instead your data.

13

10 harmonics on the left, 20 in the middle, and 70 on the right.

I also decided this was a great opportunity to find the EFD of Randall Munroe.
I grabbed another creative commons image, and found the 70th harmonic of Randall Munroe.

14

Appendix: Matlab code

Here’s the code you need to replicate everything we did.

Segment the image
imwrite(im2bw(imread(‘cat.png’)), ‘bw_cat.png’)
Generate the chain code
chain = mk_chain(bw_cat.png’);
[cc] = chaincode(chain);

You’ll need to grab mk_chain.m off of this website first.
Really all it does is make a call to the Matlab function bwtraceboundary
Also be warned that this code often failed for me, if this happens for you just replace the <dir> on the last line (line 36) with <‘SE’>.

Generate the EFD
chain = mk_chain(bw_cat.png’);
[cc] = chaincode(chain);
coeffs = calc_harmonic_coefficients(transpose([cc.code]), 70)

First you’ll need to grab all the m files off of this website.

You can replace 70 with whatever number of harmonic you want to look at. Coeffs will be a matrix with 4 numbers, these are the coefficients a70, b70, c70, d70.

See what your chain code looks like
chain = mk_chain(bw_cat.png’);
[cc] = chaincode(chain);
figure
plot_chain_code(transpose([cc.code]));

See what your EFD looks like compared to your chain code
chain = mk_chain(bw_cat.png’);
[cc] = chaincode(chain);
figure
plot_chain_code(transpose([cc.code]));
hold
plot_fourier_approx(transpose([cc.code]), 70, 50, 0, ‘r’);

Again, you can replace 70 with however many harmonics you’d like to see.

Advertisements