new file compression idea


Alaric(Posted 2006) [#1]
I have been working on a file compressor for quite some time now. At one point I got to thinking about ways to compress files without the common pattern-searching approach. Around that same time, I was working on simple XOR encryption (I'm a hobbyist, not a pro, so I don't stick to one project at a time). I started mixing up the two projects in my mind and sometimes got ideas for one program while working on the other. This confusion led me to ask the following question: "If you XOR'd a file enough times using bit masks of varying lengths, eventually would not all of the data inside be set back to 0's?" If this is true, I thought, it MAY be more space-efficient to simply keep a small log of these XORs than to use the standard compression methods. Is there any life in this idea? I know that it is a long shot, but I thought that it might be worth investigating... Also, even with the billions of calculations per second that most computers do today, it might not be worth the lengthy compression time. However, perhaps you could XOR the file just enough to bring out large patterns that a standard compressor could exploit. What do you pros think about this?
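In BlitzPlus terms, the basic property I'm relying on is that XOR undoes itself when applied twice - a quick sketch with arbitrary values:

    b%    = 201         ; some byte from the file
    mask% = 57          ; an arbitrary one-byte mask
    e%    = b Xor mask  ; the masked ("encrypted") byte
    Print e
    Print e Xor mask    ; prints 201 again - XOR is its own inverse
    WaitKey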


DjBigWorm(Posted 2006) [#2]
Alaric,

I have made my own image bank compression; I think it was as simple as RLE when I was done. Lol, but I did compare it to the PNG format and mine was smaller in size, and I can pack many images together, plus save extra Blitz-specific data in them, and still be smaller than a standard PNG. What you are talking about sounds interesting. I hope you can work out the tech details and make it fly.
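Roughly, the byte-level RLE idea looks like this (just a from-memory sketch over a Blitz bank, not my actual image bank code):

    Function RleCompress(src)
        ; Worst case every run has length 1, so two output bytes per input byte.
        dst = CreateBank(BankSize(src) * 2)
        out = 0
        i = 0
        While i < BankSize(src)
            v = PeekByte(src, i)
            n = 1
            ; Extend the run, capping the count at 255 so it fits in one byte.
            While i + n < BankSize(src) And n < 255
                If PeekByte(src, i + n) <> v Then Exit
                n = n + 1
            Wend
            PokeByte dst, out, n        ; write (count, value) pairs
            PokeByte dst, out + 1, v
            out = out + 2
            i = i + n
        Wend
        ResizeBank dst, out             ; shrink to the bytes actually written
        Return dst
    End Function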


Alaric(Posted 2006) [#3]
LoL, the point of this post was not to call myself an idiot or to ask for help on the details of the usual compression methods. The whole question was: is there even a possibility that this method of logging XORs of the data would be a worthwhile compression method? Though I'm not sure I explained the theory thoroughly enough... This is how it would work. You take a large file that you want to compress. Then you just continue XORing the data (keeping track of the masks you XOR with in another file) until all you are left with is 0's. Then you also tack on the count of those 0's to the end of the file. This way, you can reconstruct the original file using only this "XOR map". I'm sure that this would be possible (though highly time-consuming during the compression process), but might this actually be able to output an "XOR map" file SMALLER than the original file?
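As a rough, completely untested shape for one forward step (simplified to a single one-byte mask repeated across the whole bank): XOR every byte with the mask and append that mask to a log bank so the step can be undone later.

    Function XorStep(data, log, mask%)
        ; XOR every byte of the bank with the same one-byte mask...
        For i = 0 To BankSize(data) - 1
            PokeByte data, i, PeekByte(data, i) Xor mask
        Next
        ; ...and append the mask to the log so the step can be undone later.
        ResizeBank log, BankSize(log) + 1
        PokeByte log, BankSize(log) - 1, mask
    End Function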


Adam Novagen(Posted 2006) [#4]
Alaric,
Hey there, ADAM here. I'm not sure if you are still working on this compression project, but if you are, I would be VERY interested in some working source code I could use. Let me know if you strike oil, okay?


Alaric(Posted 2006) [#5]
Wow, good to see someone hasn't called me a complete crackpot. However, it's just a theory; such a project would involve much more time and experience than I have right now. I just wanted to see if anybody else thought it was interesting.


plash(Posted 2006) [#6]
Crackpot :), just kidding. I'd just stick to the blitz zip userlibs for compression.


Adam Novagen(Posted 2006) [#7]
Alaric, I've been reviewing your idea for "XOR compression," and I have to admit, it sounds promising; I familiarized myself with the Xor command especially to grasp the concept. One thing, though: the file should indeed (in my opinion) be smaller than the original, but I'm wondering...

Did you, by any chance, have any thoughts on how to DEcompress the file?


Alaric(Posted 2006) [#8]
My idea was to include the original file's size in the compressed version. Once you have the file's size, all you have to do is XOR in a reverse manner. Therefore, you need to store which XOR patterns you used, in the reverse order you used them. Think of it like a Rubik's Cube. Start out with an unsolved cube and write down directions on how to solve it. Once the cube is solved, if you follow the directions in reverse, you get back to the original unsolved cube. So you have to write a program to "solve" the file in the shortest way possible, "writing down the directions" for each XOR it uses. Then you can backtrack through the directions to get the file back to its original "unsolved" state.
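Building on the same hypothetical log layout as before, "following the directions backwards" is just the forward loop replayed last-to-first (again an untested sketch):

    Function XorUndoAll(data, log)
        ; Replay the logged masks last-to-first; since (x Xor m) Xor m = x,
        ; each pass exactly undoes the matching forward step.
        For j = BankSize(log) - 1 To 0 Step -1
            mask = PeekByte(log, j)
            For i = 0 To BankSize(data) - 1
                PokeByte data, i, PeekByte(data, i) Xor mask
            Next
        Next
    End Function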


*(Posted 2006) [#9]
One thing I thought of years ago was randomizing a string to a certain length; when the generated string matched the bytes read from the file, the seed number is what you store. This way loads of bytes could be compressed down to 8 bytes. The only problem with this solution is that it takes ages to do.
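Something like this untested sketch - brute-force a seed until the random stream reproduces a chunk of the file, then store just the seed:

    Function FindSeed(data, offset, n, tries)
        ; Re-seed the generator and see if its first n bytes match the
        ; n bytes at 'offset' in the bank. Returns the seed, or -1 if
        ; nothing matched within 'tries' attempts.
        For seed = 0 To tries - 1
            SeedRnd seed
            ok = True
            For i = 0 To n - 1
                If Rand(0, 255) <> PeekByte(data, offset + i) Then
                    ok = False
                    Exit
                EndIf
            Next
            If ok Then Return seed
        Next
        Return -1
    End Function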


Adam Novagen(Posted 2006) [#10]
Any source code, EdzUp?