Extension counter program [___.ext]


virtlands(Posted 2013) [#1]
Here's some weird code.

It's a program that reports the count of each extension type found within a directory (path).

What's unusual about this code is not what it does,
but how it does it.
It's just an exercise in string logic, and stuff...

This version of the program is flexible enough to handle extensions of any length (most extensions seem to be of length 3).

Examples of exts: BB, bmx, jpeg, png, exe, dat, torrent, decls ...

How it works:
As the program runs and encounters new extensions, these extensions are appended to a superset string EXT$
which grows longer with each new addition.

The clever part is that there is an int array ExtCount() that runs parallel to it.
Each index in the array corresponds to the same character position within EXT$.
The array slot at the position where an extension starts stores that extension's count (and its length).

This means that extension types don't need to be known ahead of time; they are appended to EXT$ only if they are not already in it.

So I thought of various complications that could happen with this method:

What if EXT$ is currently = "JPEG"
and a newcomer extension is "JPE"? Well, "JPE" is a substring of "JPEG".
We don't want the two to be confused, so they shouldn't share the same index, or should they??

The current solution is to put them in like "JPEJPEG..." or "JPEGJPE...", so that separate and accurate counts are kept of each.
(There can be intermediate data of course, like "JPExxxJPEGxyxy".)
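
A minimal sketch of the idea described so far, written in Python rather than B3D and not taken from the demo code. It assumes two parallel lists (counts and lengths, both hypothetical names) instead of one packed ExtCount() array, and it assumes a hit inside EXT$ only counts when an entry of exactly the same length starts at that position. That assumption is what keeps "JPE" from being mistaken for the front of "JPEG".

```python
def find_slot(ext_string, lengths, ext):
    """Return the 0-based start of ext inside ext_string, or -1.

    A match is accepted only if an entry of exactly len(ext)
    was registered at that position; otherwise keep searching.
    """
    pos = ext_string.find(ext)
    while pos != -1:
        if lengths[pos] == len(ext):
            return pos
        pos = ext_string.find(ext, pos + 1)
    return -1


def count_extensions(names):
    """Build the superset string and its parallel arrays from file names."""
    ext_string = ""   # plays the role of EXT$
    counts = []       # counts[i] is meaningful where an entry starts at i
    lengths = []      # lengths[i] > 0 only where an entry starts at i
    for name in names:
        dot = name.rfind(".")
        if dot <= 0 or dot == len(name) - 1:
            continue  # no usable extension
        ext = name[dot + 1:].upper()
        pos = find_slot(ext_string, lengths, ext)
        if pos == -1:
            # New extension: append it and grow both arrays in step.
            pos = len(ext_string)
            ext_string += ext
            counts.extend([0] * len(ext))
            lengths.extend([len(ext)] + [0] * (len(ext) - 1))
        counts[pos] += 1
    return ext_string, counts, lengths


def report(ext_string, counts, lengths):
    """Turn the parallel arrays back into an {extension: count} mapping."""
    return {
        ext_string[i:i + n]: counts[i]
        for i, n in enumerate(lengths)
        if n > 0
    }


if __name__ == "__main__":
    s, c, l = count_extensions(["a.jpeg", "b.jpe", "c.JPEG", "d.png", "noext"])
    print(s)
    print(report(s, c, l))
```

With this layout, a search for "JPE" that lands on the start of "JPEG" is rejected because the stored length there is 4, so "JPE" gets appended as its own entry.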

;;------------

Another idea, which may be an improvement that I haven't tried yet,
is to create an array of EXT strings like DIM EXT$[7],
which allows for extension lengths 1 to 7.
The reasoning there is that each EXT array index implies one extension length, so I don't need to store lengths.
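
A rough Python sketch of that untried variant, under stated assumptions: one string per extension length, entries packed back to back with no overlap, so a match is only valid when it starts on a multiple of the length. The names MAX_LEN, buckets, counts and add_ext are made up for illustration.

```python
MAX_LEN = 7  # assumed upper bound on extension length


def add_ext(buckets, counts, ext):
    """Count ext in the bucket string reserved for its length."""
    n = len(ext)
    if not 1 <= n <= MAX_LEN:
        raise ValueError("extension length out of range")
    s = buckets[n]
    pos = s.find(ext)
    # Skip matches that straddle two entries (not aligned to n).
    while pos != -1 and pos % n != 0:
        pos = s.find(ext, pos + 1)
    if pos == -1:
        pos = len(s)
        buckets[n] += ext
        counts[n].append(0)
    counts[n][pos // n] += 1


if __name__ == "__main__":
    buckets = [""] * (MAX_LEN + 1)              # index = extension length
    counts = [[] for _ in range(MAX_LEN + 1)]   # one count per entry
    for e in ["ABC", "DEF", "BCD", "ABC", "BB"]:
        add_ext(buckets, counts, e)
    print(buckets[3], counts[3])
    print(buckets[2], counts[2])
```

Because the length is implied by the bucket, the count array needs only one slot per entry instead of one slot per character.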

Then an additional possibility pops up when transferring this logic
to other things too (that have nothing to do with file extensions):

Create a sort of De Bruijn sequence, or a sparse version of one,
where the substrings to be found within each EXT[] are all of equal
length and can overlap to create the shortest superstring possible.

For example, in the superstring, "ABCDEFxxG", if each substring has a length of 3, then you just stored 7 strings in 9 bytes.
The substrings are "ABC", "BCD", "CDE", "DEF", "EFx", "Fxx", "xxG"
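
To make the overlap concrete, here is a small Python sketch (the helper name windows is made up) that slides a fixed-width window over the superstring. A string of length L holds L - n + 1 overlapping substrings of length n, which is where 7 strings in 9 bytes comes from.

```python
def windows(s, n):
    """Return every length-n substring of s, in order of position."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


if __name__ == "__main__":
    subs = windows("ABCDEFxxG", 3)
    print(len(subs), subs)
```

In this scheme the start position of a substring can serve as its ID, so no separate length needs to be stored.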

Another fun fact:
If you were to consider extensions of length 3, and you wanted
to include all English letters A-Z (26 letters),
then the shortest superstring that would include all combinations
(taken 3 at a time: ... ABC,ABD,ABE,.....)
would be a string of length 17601 bytes (= 26^3 + 25 ).
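
For anyone who wants to check the length, here is a Python sketch that builds such a string with the standard Lyndon-word construction of a De Bruijn sequence and then unwraps the cycle by repeating its first n - 1 symbols. The function name de_bruijn_string is made up, and this is not part of the B3D program. For an alphabet of k symbols and words of length n, the linear string has k^n + n - 1 characters.

```python
import string


def de_bruijn_string(alphabet, n):
    """Shortest linear string containing every length-n word over alphabet."""
    k = len(alphabet)
    a = [0] * (k * n)
    cycle = []

    def db(t, p):
        # Standard recursive generation of Lyndon words whose
        # lengths divide n; concatenated, they form the cycle.
        if t > n:
            if n % p == 0:
                cycle.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    # The cycle has k**n symbols; repeat the first n-1 to linearize it.
    linear = cycle + cycle[:n - 1]
    return "".join(alphabet[i] for i in linear)


if __name__ == "__main__":
    s = de_bruijn_string(string.ascii_uppercase, 3)
    distinct = {s[i:i + 3] for i in range(len(s) - 2)}
    print(len(s), len(distinct))
```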

;=================================================================

Here's the B3D demo code:



_PJ_(Posted 2013) [#2]
Have you considered just checking the CLASSES registry for existing registered extensions?


virtlands(Posted 2013) [#3]
Hi, _PJ_

I don't yet know how to do registry reads from within Blitz3D. I heard it's a complicated thing.

The point of the above program is not just for reading file extensions,
but for somehow encountering any random strings, then smashing them together, then counting them.