Image Approximation with Genetically Selected Cosines (nklein.com)
30 points by edw519 on Oct 2, 2009 | 8 comments


This seems like a rather odd thing to do, and here's why.

1. We can already represent the image exactly (but non-sparsely) as a sum of cosine waves. If you restrict the frequencies to an appropriate lattice, there's exactly one way to do this and it's given by the Fourier transform.

2. The Fourier transform is orthogonal (this is Parseval's theorem), so, at least in the least-squares sense, the amount of error in the image is the same as the amount of error in the coefficients of those cosines.

3. So if you want the best approximation of the image by N cosine waves, and (like nklein) your notion of "best" is RMS error, then what you do is to Fourier transform the image, zero all but the N largest values, and you're done.

(This is rather like what JPEG compression does to each block of an image.)
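For anyone who wants to see point 3 concretely, here's a minimal sketch (my own code, not nklein's) using scipy's 2-D DCT and keeping the N largest coefficients; the function name and the choice of N are just illustration:

    # Minimal sketch of "keep the N largest transform coefficients", using an
    # orthonormal 2-D DCT (the real-valued cousin of the Fourier transform).
    # Assumes a grayscale image as a 2-D float array; ties at the threshold may
    # keep a few extra coefficients.
    import numpy as np
    from scipy.fft import dctn, idctn

    def best_n_cosine_approx(image, n):
        coeffs = dctn(image, norm='ortho')        # orthonormal, so Parseval applies
        threshold = np.partition(np.abs(coeffs).ravel(), -n)[-n]
        sparse = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)
        return idctn(sparse, norm='ortho')

    # Example: approximate a random 64x64 "image" with 100 cosine terms.
    img = np.random.rand(64, 64)
    approx = best_n_cosine_approx(img, 100)
    rms = np.sqrt(np.mean((img - approx) ** 2))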

So what this approach buys you is the ability to have a slightly wider range of values for the frequencies. Maybe that helps, but it doesn't seem all that likely; and even if it does, I bet you'd do better by beginning with the Fourier transform approach and then doing a simpler optimization process on the coefficients and frequencies and phases.

Of course, if the real point was to have some fun and learn about genetic algorithms for optimization, fair enough!


The real problem is that it attempts to break down the whole image in a single transform, rather than splitting it into blocks.

There is nothing wrong with an optimized set of basis functions, and tests have shown that an overcomplete set of basis functions actually gives better results than a complete transform like the DCT. Albeit at higher computational cost: finding the RD-optimal representation in an overcomplete dictionary, even when the basis functions are already known, is NP-hard (related to subset-sum, IIRC).

Even if you're not going overcomplete, tests have also shown that optimized basis functions are useful. However, there are two catches here. The first is that if you try to make an optimal basis-function set over a huge amount of real-world input data, you will get the DCT! The second is that the problem of creating an optimal classifier, i.e. splitting inputs into groups that each get their own set of basis functions (and then optimizing a basis for each group), is also NP-hard. Though it might be possible to approximate it efficiently with K-means.
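The "you will get the DCT" claim is easy to poke at numerically, for what it's worth: the optimal (Karhunen-Loeve) basis for highly correlated signals comes out very close to the DCT basis. Rough sketch below; the AR(1) signal model, rho = 0.95, and the block size of 8 are my own illustrative choices, not anything from the article:

    # Rough numerical check that the "optimal" basis for highly correlated
    # 1-D signals looks like the DCT basis.
    import numpy as np
    from scipy.fft import idct

    rho, block = 0.95, 8

    # Covariance of an AR(1) process: C[i, j] = rho ** |i - j|
    idx = np.arange(block)
    cov = rho ** np.abs(idx[:, None] - idx[None, :])

    # The KLT basis is the eigenvector set of the covariance, by decreasing eigenvalue.
    _, klt = np.linalg.eigh(cov)
    klt = klt[:, ::-1]

    # DCT-II basis vectors: inverse-transform the identity, one basis vector per column.
    dct_basis = idct(np.eye(block), norm='ortho', axis=0)

    # Compare each KLT vector with the corresponding DCT vector (sign is arbitrary,
    # so take the absolute value of the inner product).
    for k in range(block):
        print(k, abs(np.dot(klt[:, k], dct_basis[:, k])))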

However, regardless, you want to perform transforms on relatively uniform areas of the image. Thus, you want the individual transforms to be small enough to cover relatively uniform areas and to localize nonuniformity as much as possible. That's why JPEG uses 8x8 blocks, not a full-image DCT. That's also why H.264 has two transform sizes (and effectively a third if you count the hierarchical transform in i16x16 blocks): so it can adapt to uniform areas of different size.
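For reference, a blockwise transform is just the full-image transform applied tile by tile, roughly like this sketch (the block size of 8 matches JPEG; the function name and the no-padding assumption are mine):

    # JPEG-style blockwise transform: DCT each 8x8 tile independently instead of
    # taking one full-image transform, so coefficients only describe a local area.
    import numpy as np
    from scipy.fft import dctn

    def blockwise_dct(image, block=8):
        h, w = image.shape
        assert h % block == 0 and w % block == 0, "sketch assumes dimensions divisible by block"
        out = np.empty_like(image, dtype=float)
        for y in range(0, h, block):
            for x in range(0, w, block):
                out[y:y+block, x:x+block] = dctn(image[y:y+block, x:x+block], norm='ortho')
        return out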


Check out his other videos on Vimeo for stuff he's been trying out with cosines & inverse Fourier transforms.


Genetic algorithms are in a certain sense doing things in the stupidest possible way.

Wherever it would make a random mutation, a better algorithm, programmed with at least a little knowledge of the problem, could make a change that is more likely to be beneficial: an educated guess instead of an uneducated one.

Since we know a fair bit about image compression, regular algorithms just take their own best guess without any selection step. (Or am I wrong about this?)

Unless you really don't know anything about how to solve the problem, you should be able to make an algorithm that's faster by mutating the population more intelligently.
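To make the "uneducated guess" concrete, here's roughly what a bare-bones genetic optimizer looks like for fitting a sum of cosines to a 1-D signal. This is my own toy sketch of the general idea, not the article's code; note that the mutation step is just a blind random perturbation of (amplitude, frequency, phase) triples:

    # Toy genetic-style optimizer: evolve sets of cosines toward a target signal.
    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0, 1, 256)
    target = np.sin(2 * np.pi * 3 * x) + 0.5 * np.cos(2 * np.pi * 7 * x)

    N_TERMS, POP, GENERATIONS = 8, 40, 500

    def render(genome):
        # genome: (N_TERMS, 3) array of amplitude, frequency, phase
        amp, freq, phase = genome[:, 0:1], genome[:, 1:2], genome[:, 2:3]
        return (amp * np.cos(2 * np.pi * freq * x + phase)).sum(axis=0)

    def rms(genome):
        return np.sqrt(np.mean((render(genome) - target) ** 2))

    population = [rng.normal(size=(N_TERMS, 3)) for _ in range(POP)]
    for _ in range(GENERATIONS):
        population.sort(key=rms)                   # selection: keep the fittest half
        survivors = population[:POP // 2]
        children = [g + rng.normal(scale=0.05, size=g.shape) for g in survivors]
        population = survivors + children          # blind mutation, no domain knowledge

    print("best RMS error:", rms(min(population, key=rms)))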


It's a fair point, but genetic algorithms could be used on known problems to discover something new. You might think you know enough to make an educated guess, but there might be some area of the search space that the theory you are using hasn't covered.

That being said, I don't know anything about image compression and we may well have a provably best way to do this...


I've thought about your comment for a while.

If you want to test for potentially interesting search space, you should do so systematically, not randomly.

A random scattering could leave vast areas unsearched, and you might not even know. (I think you'd have to look at a visualization that happened to make it clear, or do some other analysis.)

Covering the entire search space evenly is a better approach to start with. Increasing the sampling density around high-scoring points would then automatically dig down into interesting regions.
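Something like this coarse-then-fine sweep, say (the toy 1-D objective and all the constants are made up):

    # Even coverage first, then denser sampling around the best coarse points.
    import numpy as np

    def objective(x):
        return -(x - 0.37) ** 2 + 0.1 * np.cos(40 * x)   # toy score to maximize

    coarse = np.linspace(0.0, 1.0, 21)                   # pass 1: even coverage
    best = coarse[np.argsort(objective(coarse))[-3:]]    # top few coarse points

    fine = np.concatenate([np.linspace(b - 0.05, b + 0.05, 21) for b in best])
    print("best point found:", fine[np.argmax(objective(fine))])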


Thanks! Lots of interesting stuff there.


It'd probably be more fun to do this with wavelets.
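E.g. the same keep-the-N-largest-coefficients game with a wavelet basis, via PyWavelets; the choice of wavelet and decomposition level here are arbitrary, just a sketch:

    # Sparse wavelet approximation: keep the n largest wavelet coefficients.
    import numpy as np
    import pywt

    def best_n_wavelet_approx(image, n, wavelet='db2', level=3):
        coeffs = pywt.wavedec2(image, wavelet, level=level)
        arr, slices = pywt.coeffs_to_array(coeffs)
        threshold = np.partition(np.abs(arr).ravel(), -n)[-n]
        arr = np.where(np.abs(arr) >= threshold, arr, 0.0)
        rec = pywt.waverec2(pywt.array_to_coeffs(arr, slices, output_format='wavedec2'), wavelet)
        return rec[:image.shape[0], :image.shape[1]]     # waverec2 may pad odd sizes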



