Fast Fourier Transform for image processing

BlitzMax Forums/BlitzMax Programming/Fast Fourier Transform for image processing

ImaginaryHuman(Posted 2010) [#1]
Hey, I'm fiddling with image processing and trying to get a Fast Fourier Transform to work... i.e. you convert an image with the Fourier transform, then supposedly multiply the `complex numbers` of the transformed image by the complex numbers of the transformed `convolution kernel`, and then inverse-transform the result back - thus applying the convolution to the image.

I found C code and have tried to convert it to BlitzMax. Apart from the multiplying/convolving part, I can get an image to `Fourier transform` and come back to an image again. I *think* this means the FFT is working, but it could be misleading. But I cannot get the multiply/convolution to work - I just get next to nothing as output. If you take out the two lines within the `multiply the image` loop you'll see the image restored to normal at the bottom right, but with the multiply in (which is supposed to do a horizontal blur), I get junk back. Any ideas what I'm doing wrong or how to make it work? Did I make a mistake in the conversion from C that might be stopping the FFT from working right?

Any help greatly appreciated.




Warpy(Posted 2010) [#2]
This looks like a perfect topic for my maths blog! I'll write a proper post later, but if multiplying complex numbers is your problem:

A complex number is something like a+ib, where a is the real part and b the imaginary part. i is the square root of -1, so i*i=-1. So, when you multiply two complex numbers together, you get:

(a+ib)*(c+id) = a*c + ib*c + a*id + ib*id
= a*c + i*(b*c+a*d) + i*i*b*d
= a*c - b*d + i(b*c + a*d)

So the real part is ac-bd, and the imaginary part is bc+ad. I can't see what the multiplication in your code is meant to be doing, but maybe that will help.
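That multiplication rule is easy to sanity-check. Here's a minimal sketch in Python (the thread's actual code is BlitzMax, but the arithmetic is identical; `complex_mul` is just an illustrative name):

```python
# Multiply two complex numbers stored as (real, imag) pairs, using the
# rule derived above: (a+ib)*(c+id) = (a*c - b*d) + i*(b*c + a*d).
def complex_mul(a, b, c, d):
    return (a * c - b * d, b * c + a * d)

# Check against Python's built-in complex type: (1+2i)*(3+4i) = -5+10i.
re, im = complex_mul(1.0, 2.0, 3.0, 4.0)
z = (1 + 2j) * (3 + 4j)
assert (re, im) == (z.real, z.imag)
```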


ImaginaryHuman(Posted 2010) [#3]
Hmm. What does the i() part mean, multiply what's in the brackets by i?

The multiplication in my code above is meant to simply multiply one Fourier-transformed image by another. I can't find anywhere online where it tells you exactly HOW to multiply it, and the two pieces of C and Java source code I tried to mimic didn't work.

Basically I take an image and convert it with an FFT, I also take the image of a kernel, e.g. 5 horizontal pixels, and convert it with the FFT too, then `multiply` the two together and do an inverse FFT to convert back to an image... this is supposed to perform an image `convolution`, like applying a blur. If I can get the blur to work that would be great.
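For what it's worth, that pipeline can be sketched in a few lines of Python with a naive DFT (an FFT computes the same transform, just faster; the names here are illustrative, not from the thread's code). One detail that commonly produces junk output: the kernel must be zero-padded to the same length as the image before its transform, and the frequency-domain multiply must be a proper complex multiply.

```python
import cmath

def dft(x, inverse=False):
    # Naive discrete Fourier transform, O(n^2); an FFT gives the same result.
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out

# A 1-D "scanline" and a 5-tap box-blur kernel, zero-padded to the same length.
signal = [0., 0., 0., 255., 0., 0., 0., 0.]
kernel = [0.2, 0.2, 0.2, 0.2, 0.2, 0., 0., 0.]

# Transform both, multiply pointwise (complex multiply), inverse-transform.
F, K = dft(signal), dft(kernel)
blurred = [v.real for v in dft([f * k for f, k in zip(F, K)], inverse=True)]

# This matches a direct circular convolution of the two.
direct = [sum(signal[(i - j) % 8] * kernel[j] for j in range(8)) for i in range(8)]
assert all(abs(a - b) < 1e-9 for a, b in zip(blurred, direct))
```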

Do you see any signs of errors in my conversion of the FFT from C code? Like I said, I DO get the image back when inverse-transforming, but I don't know if that means it works or if it's just a broken system undoing itself.


Amon(Posted 2010) [#4]
We need to clone warpy.


ImaginaryHuman(Posted 2010) [#5]
Good idea. :-) Personal assistants for everyone :-D Do you wash cars too, Warpy?


Warpy(Posted 2010) [#6]
DOES NOT COMPUTE


Pineapple(Posted 2010) [#7]
* pokes warpy with a stick *

What are you??


slenkar(Posted 2010) [#8]
-0-0
--+
--\/


ImaginaryHuman(Posted 2010) [#9]
So er, anyway, how do I get these fouriers to transform properly and multiply properly?


Warpy(Posted 2010) [#10]
Weirdly, I've just encountered a bug where I can use DrawPixmap or DrawRect in a frame but not both - if I use DrawRect the Pixmap disappears! I've had to switch to d3d9 because I can't use opengl, but that's weird.

Can't see where you're going wrong with the FFT, far too much code to trawl through. Maybe find another example and put it together piece by piece.


Floyd(Posted 2010) [#11]
If the goal here is to exactly undo a "very smooth blur" as in your other thread then I would judge that to be impossible. Although we don't have a precise definition of "smooth blur", the fact that you only save pixel values means we just don't have enough information at the end.

I think what you are trying to do here is essentially the same as before, but with FFT doing the grunt work of calculating the inverse. But the inverse is uniquely determined, no matter what technique you use to find it.

Here's one way to think about the problem and whether or not it is feasible. If the forward transformation maps two different inputs to the same output then inversion is impossible. The simplest output to consider is all zeroes. Consider taking the average of five consecutive pixels. What happens if the row of input pixels have color values of 0 or 1, in this pattern:

10001000100010001000...

Every "average of five" will be 1/5 or 2/5, which is 0.2 or 0.4 exactly. This output could be inverted. But now we take the additional step of converting the output to pixel values. That means every output value is 0. This is exactly the same as if the row of input pixels had been

0000000000000000...

Different input, same output, hence not invertible.
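Floyd's argument can be checked directly; here's a small Python sketch (the `blur5` helper is hypothetical, and the row is treated as circular to keep the windowing simple):

```python
# Two different input rows produce the SAME output once the
# "average of five" is rounded down to integer pixel values.
def blur5(row):
    n = len(row)
    return [int(sum(row[(i + k) % n] for k in range(5)) / 5) for i in range(n)]

a = [1, 0, 0, 0] * 5   # the 10001000... pattern from above
b = [0] * 20           # all zeroes

# Every window sums to 1 or 2, so every average is 0.2 or 0.4 -> rounds to 0.
assert blur5(a) == blur5(b) == [0] * 20
```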


ImaginaryHuman(Posted 2010) [#12]
Yes, you're right. I came to the same conclusion. If I don't store the exact precision of the result of the blur, whether as floats, doubles, or just by adding up all of the blur contributions (which actually fits into 2 bytes per pixel for a 256-pixel blur), then that data is lost and cannot be recovered precisely.

One thing I did do is add the `error` difference to the blurred pixel values, so that when the blur is undone the numbers add up and it produces the right output. However this means the blurred image is noisy, and I need the blurred image to be smoothly blurred at the time of it being saved to disk. The other issue is that simply doing the blur at all means that multiple pixel values are essentially being added together, and that needs more precision, at least 16-bit - and the lower 8 bits of that 16-bit value are very very high frequency noise, which is what I was trying to avoid. Just a no-go all round, so I'm giving up for now. I can see that the FFT would not help at all since it would be reversing an image that has missing data.


Floyd(Posted 2010) [#13]
If you're up for more adventures I just had an idea.

Assuming only color is of interest you could stuff the missing color information into the unused alpha channel. In the "average of seven neighbors" scheme the resulting value is an integer plus a remainder of 0,1,2,3,4,5 or 6. So for all R,G,B there are 6*6*6 = 216 possible sets of remainders. This fits into the 8-bit alpha channel.

The three remainders can be encoded as a "base 7" number. For example, remainders of 2,4,3 would be represented by the integer 2*7*7 + 4*7 + 3 = 129.


ImaginaryHuman(Posted 2010) [#14]
What would happen if you wanted to blur longer than 7 pixels, say, at least 64 or more? Would it have a modulo of up to 64?

I'm still concerned that the alpha data is still `noisy` with all these random modulos. ALL data stored has to be smooth. But I suppose 0..6 isn't too major.

I think I almost need like an `integer blur` if there is such a thing.


Floyd(Posted 2010) [#15]
This would actually be exact. You take seven color values, each in the range 0..255, and add them. Let's say they added up to 1500. Dividing by 7 gives 214, with a remainder of 2. The 214 is the color value and the 2 will be part of the information stored in the alpha channel.

By coincidence 7 is the largest number for which this could work. If we used 8 then each remainder would be 0..7, and there would be 7*7*7 = 343 possible remainders for the three colors. This won't fit in 8 bits.


ImaginaryHuman(Posted 2010) [#16]
How are you storing 3 modulos of 0..7 in a single byte?

By my figuring, each byte is going to have to have its own modulo. For a blur that's 256 bytes long, it would require a whole byte for the modulo of each. So 3 original bytes, plus 3 modulo bytes. Which is the same thing as just storing a Short total and not doing the divide at all.


Floyd(Posted 2010) [#17]
Oops. I was hallucinating for a moment. I said there were 216 possible remainders for the "average of seven pixels". Each remainder is 0 to 6. But that is seven values, not six, so there are 7*7*7 = 343 sets of remainders. That won't fit in a byte, as I stated in my similarly flawed analysis of eight pixels.

But this would work for blocks of six pixels. That feels wrong because of the lack of symmetry. You probably want an odd number of pixels, a middle one and neighbors to either side. So we drop down to five pixels.

Let's think about how this works. We use an average value based on five pixels to produce one pixel in the blurred image. The three color channels are handled separately. First we add five reds, getting an integer value in the range 0 to 5*255 = 1275. Suppose the sum was 942. The resulting color value is 942/5. As an integer this is 188. But that loses information. We really need to remember 188 + 2/5.

There are three color values and three remainders. But we don't need three bytes for the remainders since each is only in the range 0 to 4. If the remainders happened to be 2,4,3 then we encode all three of them as the single integer 2*5*5 + 4*5 + 3 = 73. This is stored in the alpha of the blurred image, along with the 188 red value and whatever green and blue were.

All of this can be reversed. Starting with the pixel value in the blurred image we pick out the colors and the alpha. Red is 188, alpha is 73. The remainder for red must have been the integer 73 / (5*5), which is 2. And that tells us the sum of five reds which produced this must have been 188*5 + 2 = 942. Nothing has been lost.

Similarly the exact green and blue sums are recovered. The remainder for blue is ( alpha mod 5 ). The one for green is ( alpha / 5 ) mod 5.
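Floyd's whole round trip works out; here it is as a Python sketch (the function names are illustrative). Each remainder is 0..4, so the three of them pack into one alpha byte in base 5 and unpack the same way:

```python
# Encode: the three average-of-five channel sums -> stored (r, g, b, alpha).
def encode(red_sum, grn_sum, blu_sum):
    r, rr = divmod(red_sum, 5)      # color value plus remainder 0..4
    g, gr = divmod(grn_sum, 5)
    b, br = divmod(blu_sum, 5)
    alpha = rr * 25 + gr * 5 + br   # base-5 packing; at most 4*25 + 4*5 + 4 = 124
    return r, g, b, alpha

# Decode: recover the exact sums - nothing has been lost.
def decode(r, g, b, alpha):
    rr, gr, br = alpha // 25, (alpha // 5) % 5, alpha % 5
    return r * 5 + rr, g * 5 + gr, b * 5 + br

# Floyd's red example: sum 942 -> value 188, remainder 2 -> back to 942.
assert encode(942, 600, 1275)[0] == 188
assert decode(*encode(942, 600, 1275)) == (942, 600, 1275)
```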


ImaginaryHuman(Posted 2010) [#18]
Interesting. So it works, but the problem still remains, how would this work with a 64-pixel blur? A short blur is not enough.

It's interesting though how you encoded three remainders in one value.


Floyd(Posted 2010) [#19]
how would this work with a 64-pixel blur?

It wouldn't. As stated before there is now too much extra information to be saved.

You could experiment with using a small number of pixels, such as 5, but spread out farther from the center. That would change the image more, but it wouldn't be so smooth.


ImaginaryHuman(Posted 2010) [#20]
What if the second set of blurring was done including output from the first set, ie blur a group of 5, overwrite the original 5 bytes (plus store the modulo thing somewhere), then move to the right? Would the accumulative effect of it work and be reversible?


Floyd(Posted 2010) [#21]
I'm not sure what you mean about overwriting the original five. Five pixels are averaged to create one new pixel value.

You can certainly apply the blur procedure to the newly blurred image as many times as you like. The image will continue to get fuzzier. But there is no going back. The color and alpha of a blurred image can be used to retrieve the color values of the image from which it came, but they can't encode the information needed to reverse through a series of images. That would be rather like running a file through a compression routine repeatedly, hoping it will get smaller and smaller.

I see no hope for this short of coming up with a new image format containing extra data, not visible on screen.


ImaginaryHuman(Posted 2010) [#22]
I guess what I meant wouldn't work, because if you output a blurred pixel to the source image and advance one pixel to the right for the next average of 5, it won't get included. That's what I meant - include the blurred pixel from the last averaging as part of the next averaging.


ImaginaryHuman(Posted 2010) [#23]
Hey Floyd, do you think it would be possible to somehow store the modulo of 5 bytes as part of the blurred result, and somehow extract it again later? ie instead of storing the modulo in a separate byte for every 1-3 blurred bytes, can we like add it to the blur value and do something clever to extract it just before the deblur, so that it is all still reversible?


Floyd(Posted 2010) [#24]
If you mean blur by taking the average of five pixels then that is exactly the example I gave earlier. Scroll up to message #17.


ImaginaryHuman(Posted 2010) [#25]
I follow what you said there, but what I'm asking is: instead of storing the modulos in the alpha channel, could you store them as part of the color channel, `on top of` the pixel values?

e.g.

Say you are doing a 2-pixel blur: (Pixel A + Pixel B) / 2 = blur value. Let's say the result was supposed to be 54.5, so we store 54 plus a remainder of 1, = 55. How can we then later take this 55 and know it to be really 54 + 1? I.e. to take 54.5*2 instead of 55*2?


Floyd(Posted 2010) [#26]
There is not enough space. Starting with 0 to 255 in steps of 1 the 256 possible values completely fill one byte.

The average of two such values is 0, .5, 1, 1.5, ... 255. The range is still 0 to 255, but in steps of .5 so there are 511 values. They can't all fit into one byte.

It gets worse with bigger averages. With five values the result is 0 to 255 in steps of .2 so we get 1276 values to encode.
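Those counts check out; here's a brute-force Python sketch using exact fractions to avoid any rounding:

```python
from fractions import Fraction

# Distinct averages of two bytes: every half-integer 0, 1/2, 1, ..., 255.
two = {Fraction(a + b, 2) for a in range(256) for b in range(256)}
assert len(two) == 511          # too many for the 256 slots of one byte

# With five bytes every sum 0..5*255 occurs, so averages step by 1/5.
five = {Fraction(s, 5) for s in range(5 * 255 + 1)}
assert len(five) == 1276
```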


ImaginaryHuman(Posted 2010) [#27]
How about a modulo 256?

And if we just blur 2 numbers the modulo is either going to be a 0 or a 1, so can we not add 0 or 1 to the 0..254 byte values?


Floyd(Posted 2010) [#28]
I don't know where you got those numbers.

The sum of two bytes covers every integer from 0 to 510. No amount of trickery will fit all of them into one byte.


ImaginaryHuman(Posted 2010) [#29]
What I meant is that if the result of 0..255 + 0..255 is more than 255, you can store it in a byte by wrapping it with modulo 256. I.e. if it's >255 you subtract 256, if it's <0 you add 256. It works out and is reversible. Problem is it adds noise to the results.


xlsior(Posted 2010) [#30]
What I meant is that if the result of 0..255 + 0..255 is more than 255, you can store it in a byte by wrapping it with modulo 256.


But how do you know if there was any wrapping at all when looking at the end result?


Floyd(Posted 2010) [#31]
Exactly. I calculate x+y, and if necessary subtract 256. The result is 100. What was x+y?


ImaginaryHuman(Posted 2010) [#32]
They do it in the PNG filters.

e.g. the `subtract` filter

ByteValue=OriginalValue-PreviousByteValue

If byte A is 251 and byte B is 88, 88-251=-163 ...+256=93

93 gets stored in the byte

Later when it comes to reverse the subtract filter, you go 251+93=344, then 344-256=88.

It works. It's just that this means that each time you add two bytes together to get a value 0..511, and it wraps into one byte value, that makes a lot of noise which works against the blurring effort.
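The PNG Sub filter round trip sketched in Python (illustrative names), which also shows why the wrap is reversible in that case: the value being recovered is known to be an integer 0..255, so mod-256 arithmetic loses nothing.

```python
# PNG-style "subtract" filter: store the wrapped difference from the
# previous byte, then reverse it with a wrapped add.
def sub_filter(prev, value):
    return (value - prev) % 256

def unfilter(prev, stored):
    return (prev + stored) % 256

stored = sub_filter(251, 88)     # 88 - 251 = -163, wraps to 93
assert stored == 93
assert unfilter(251, stored) == 88

# The round trip holds for every pair of byte values.
assert all(unfilter(p, sub_filter(p, v)) == v
           for p in range(256) for v in range(256))
```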


Floyd(Posted 2010) [#33]
Your "subtract" example is not at all the same. Going in reverse you know the final value is an integer 0 to 255. In the "average of two" blur the reverse is 0 to 255 in steps of 0.5. You can't recover the fractional part.

That being the case no special technique is required. We are back where we started: compute the average and round to integer. The remainder is lost.


ImaginaryHuman(Posted 2010) [#34]
Hmm.
Well anyway, no biggie, not planning to use that.

Am almost giving up on this whole blur thing, have tried many many different ideas and none of them work, lol