String compression advice
| ||
If I had a very long string containing numbers + alphabetic characters e.g. +1200121050023/0200134000233430+1200121450012 what would be a good method to compress the string into a smaller string? |
| ||
Can we assume this string is a packet of network data? So the speed of compression would be critical? If so, and if memory serves, I'm pretty sure I remember Antony Wells tested zip compression for realtime networking and found that to be pretty performant. I can't offer any personal experience, as I've never gone beyond four players over a network (KSPool) so compression was never necessary for me. |
| ||
"Can we assume this string is a packet of network data? So the speed of compression would be critical?" Actually no - I'm toying with the idea of storing custom map data in a string short enough to be copied/pasted in emails - so the speed of de/compression is not an issue. |
| ||
Wasn't there a ZIP userlib for Blitz? (Edit: oh, Gabriel already mentioned it, sorry.) The link can be found here: http://www.blitzbasic.com/Community/posts.php?topic=63176 If you can convert your string into a bank, you could use CompressBank, then, after emailing, use UnCompressBank and turn the bank into a string again. |
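To make the round trip above concrete, here is a minimal sketch in Python rather than Blitz3D. It assumes zlib as a stand-in for the userlib's CompressBank/UnCompressBank, and it adds a Base64 step (not mentioned in the post) so that the compressed bytes survive copy/paste in an email. The function names are made up for illustration.

```python
import base64
import zlib


def to_paste_text(map_data: str) -> str:
    """Compress the map string, then encode the bytes as paste-safe text."""
    raw = map_data.encode("ascii")          # string -> bytes (the "bank")
    packed = zlib.compress(raw, 9)          # stand-in for CompressBank
    return base64.b64encode(packed).decode("ascii")


def from_paste_text(text: str) -> str:
    """Reverse of to_paste_text: decode the text, then decompress."""
    packed = base64.b64decode(text)
    return zlib.decompress(packed).decode("ascii")  # stand-in for UnCompressBank
```

Base64 grows the compressed bytes by about a third, and zip-style compression carries a fixed overhead, so on a string as short as the example in the first post this may not save anything. It should pay off more on long, repetitive map data.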
| ||
Compression, if it's just numbers like that and a couple of symbols, should be fairly easy, man. You have 0,1,2,3,4,5,6,7,8,9,+,-,/,* (14 different characters, based on the string you posted). So each character needs 4 bits at most: 4 bits give you 16 different states, the 14 characters use codes 0 to 13, and 15 can be the termination character. 4 divides nicely into 16 (the bits in a 16-bit character) and gives you 4, so you should be able to shorten the above by about 4 times.
You will need, in your code, an array that holds the character for each 4-bit value, so:
; string array: index = 4-bit code, value = character
Dim com_array$(15)
For i = 0 To 9
    com_array$(i) = Str(i)
Next
com_array$(10) = "+"
com_array$(11) = "-"
com_array$(12) = "*"
com_array$(13) = "/"
Then you can divide the string into chunks of 4 characters: +1200121050023/0200134000233430+1200121450012 would start with "+120". Take each character in turn and convert it into bits using your array: 1010, 0001, 0010, 0000. Joined end to end (not summed), these give a 16-bit character code, 1010000100100000, that you can append to the rest of them.
However, in hindsight, I don't know whether you can copy and paste 16-bit character codes (UTF-16) or whether Blitz3D supports them... So you might have to opt for 8-bit characters instead, packing two 4-bit codes into each, which means you'll only be able to halve your string size, which is still pretty good I reckon. |
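The 8-bit variant of this idea can be sketched as follows. It is written in Python for illustration, not Blitz3D, and the names ALPHABET, PAD, pack and unpack are made up here. It assumes the alphabet is exactly the 14 symbols listed above and uses the spare code 15 as the end marker when the string has an odd length.

```python
ALPHABET = "0123456789+-*/"   # 14 symbols -> 4-bit codes 0..13
PAD = 0xF                     # spare code 15, used as the end marker


def pack(text: str) -> bytes:
    """Pack two symbols into each byte (high nibble first)."""
    codes = [ALPHABET.index(ch) for ch in text]  # ValueError on unknown symbol
    if len(codes) % 2:                           # odd length: fill the last nibble
        codes.append(PAD)
    return bytes((hi << 4) | lo for hi, lo in zip(codes[0::2], codes[1::2]))


def unpack(data: bytes) -> str:
    """Split each byte back into two 4-bit codes and look them up."""
    out = []
    for byte in data:
        for code in (byte >> 4, byte & 0xF):
            if code == PAD:                      # end marker: stop decoding
                return "".join(out)
            out.append(ALPHABET[code])
    return "".join(out)
```

In Blitz3D the same shifts and masks would be written with Shl, Shr, And and Or. Note that the packed bytes can take any value from 0 to 255, so they would still need a text-safe encoding before being pasted into an email.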
| ||
RossC Heh - I guess I could have thought this through for myself, but this is why I post questions to the forum. Thanks for the pointer, I'll develop it further. |
| ||
"If I had a very long string containing numbers + alphabetic characters e.g. +1200121050023/0200134000233430+1200121450012 what would be a good method to compress the string into a smaller string?" I see numbers and '/' and '+', but what other alphabetic characters are there? Generally you model the compression technique on the data. |
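One simple way to "model on the data" is to count the distinct symbols that actually occur and work out how many bits a fixed-width code would need. This is a rough Python sketch, and bits_per_symbol is a made-up helper name. It only looks at a sample, so the real map format may use more symbols than the sample shows.

```python
def bits_per_symbol(sample: str) -> int:
    """Smallest fixed code width (in bits) that can label every distinct symbol."""
    distinct = len(set(sample))
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding
    return max(1, (distinct - 1).bit_length())


sample = "+1200121050023/0200134000233430+1200121450012"
print(sorted(set(sample)), bits_per_symbol(sample))
```

If alphabetic characters are added later, the symbol count and therefore the code width goes up, which is why the full alphabet matters before picking a scheme.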
| ||
>MadJack< This might be useful for you. http://www.blitzbasic.com/codearcs/codearcs.php?code=2405 |