An effective way to track strings?

Abomination(Posted 2009) [#1]
Is there a way to keep track of strings effectively?
For now I keep them in an array, but that's slow and ugly.
Perhaps a string pointer, or something like Varptr?
Any help is really appreciated!


Mahan(Posted 2009) [#2]
I posted this a short while ago:

http://blitzmax.com/codearcs/codearcs.php?code=2381

This works if you don't have insane amounts of text, and lets you create "StringList" entities that can hold zero to many strings. You then use the functions to add/delete/get/count strings in the list.

I also have another solution at home for when I deal with larger amounts of data, but that functionality is built into an external DLL/userlib.


Abomination(Posted 2009) [#3]
I must say, that is very nice and neat :)
Alas, I need more Speed. I have a LARGE amount of Strings I
have to move around a database. And I need a way to select a certain one
in the fastest possible way.


Mahan(Posted 2009) [#4]
Since I'm such a nice dude (hehe) I've packaged up the lib I wrote myself to handle quite large amounts of strings:

This DLL has:

1) An array-based StringList that handles millions of strings with ease.
2) A StringHashList that provides constant-time lookups using a string as key and an integer as value.
3) A pretty decent array-based list.
4) A local Windows string to UTF-8 function (which is fantastic when using the FastText lib with Unicode support).

http://download.ecma.webfactional.com/CollLib.zip

This software is in an alpha state. I recently reformatted my PC and my CollLib.decls file was wiped out, so I had to write a new one today, and I haven't tested all the calls since.

Please report back if any of the functions don't work.

License:

* I own all rights to this DLL.

* I take no responsibility for anything that might happen as a result of using it, directly or indirectly.

* Anyone is allowed to use >this version< of this DLL in their programs and redistribute the DLL itself with their program provided that:

1) The name of the DLL itself is unchanged. (always CollLib.dll)
2) They don't build a "wrapper" and redistribute it as their own.

A small mention on the credits-page or in a README file is encouraged if you use this lib. (example: "CollLib.dll by MaHan")


big10p(Posted 2009) [#5]
"I have a LARGE amount of Strings I have to move around a database."

Can you explain a bit more what exactly you mean by 'database'? What exactly do you need to do? If you just want to hold a load of strings, quickly find a given string, and possibly have other data associated with each string, I'd look into using a hash table. They're generally very fast indeed.
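
To give a rough idea of why a hash table finds a string quickly, here is a minimal sketch. It is written in Python rather than Blitz3D purely so it can be run anywhere, and the class and method names (StringHashTable, put, get) are made up for this example. It assumes a simple chained table: the key is hashed to pick a bucket, and only that bucket is searched.

```python
class StringHashTable:
    """Minimal chained hash table: string key -> associated data."""

    def __init__(self, bucket_count=64):
        # Each bucket is a list of (key, data) pairs whose hashes collide.
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        # Simple multiplicative string hash, reduced to a bucket number.
        h = 0
        for ch in key:
            h = (h * 31 + ord(ch)) & 0xFFFFFFFF
        return self.buckets[h % len(self.buckets)]

    def put(self, key, data):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:
                bucket[i] = (key, data)  # key already present: overwrite
                return
        bucket.append((key, data))

    def get(self, key, default=None):
        # Only one bucket is scanned, not the whole collection.
        for existing, data in self._bucket(key):
            if existing == key:
                return data
        return default


table = StringHashTable()
table.put("apple", 3)
table.put("pear", 7)
table.put("apple", 5)  # replaces the earlier value for "apple"
```

This sketch keeps the bucket count fixed for brevity; a table meant for very large numbers of strings would normally grow its bucket array as it fills, so the buckets stay short.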


Mahan(Posted 2009) [#6]
@big10p: Just wanted to promote my DLL :)

Hashed String -> String lookups in constant time are possible with CollLib.zip by combining a StringHashList and a StringList, like this:

You have:

1) key$
2) value$

Create one StringList and one StringHashList (the creation calls are listed in CollLib.decls).


To add key-value pairs do this:

StringHashAddValue(shlHandle%, key$, StringListAdd(slHandle%, value$))

Note that the StringListAdd(slHandle%, value$) function returns the index of the added string.

To look up a value using a key do this:

 
result$=StringListGetString$(slHandle%, StringHashGetValue(shlHandle%, key$))


Both operations involved in the lookup (the hash lookup and the list access by index) run in constant time.
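
The same key -> index -> string idea can be sketched outside of Blitz3D. The following is a Python illustration only, not CollLib code: the class KeyedStrings and its methods are hypothetical names, a plain list stands in for the StringList, and a dict stands in for the StringHashList.

```python
class KeyedStrings:
    """String -> string lookup built from an index map plus a value list."""

    def __init__(self):
        self.values = []    # stands in for the StringList
        self.index_of = {}  # stands in for the StringHashList: key -> integer index

    def add(self, key, value):
        # Append the value, then remember its index under the key.
        self.values.append(value)
        idx = len(self.values) - 1
        self.index_of[key] = idx
        return idx

    def lookup(self, key):
        # Step 1: key -> index (hash lookup). Step 2: index -> string (list access).
        return self.values[self.index_of[key]]


store = KeyedStrings()
store.add("greeting", "hello world")
store.add("farewell", "see you later")
```

In this sketch the map only stores indices, so if values are later reordered or removed from the list, the stored indices have to be updated as well.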


Abomination(Posted 2009) [#7]
I'll try to explain a bit more...
Let's say you try to make a text-processor.
Each string the user enters gets put in an array, and the number
of the element is put in a memory bank (as an index into the array).
If you want to create a certain selection of these strings,
you put their "location in the bank" in a list.
If you want to swap 2 strings, you swap their element-numbers in the Bank.
The problem with this is that if the number of strings gets bigger than the array, you'd have to redim.
I'd rather use a pointer to a string instead of a number of an element in an array.
(It's quite basic actually...) Hash table, eh? ;)
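
The scheme described above (strings stored once, an index table that gets reordered, selections that refer to positions in that table) can be sketched like this. It is a Python illustration rather than Blitz3D, and all names (strings, order, add_string, swap, text_at) are invented for the example; a Python list stands in for both the array and the bank.

```python
strings = []  # storage: each entered string lives here once and never moves
order = []    # the "bank": position -> element number in `strings`


def add_string(text):
    """Store a string and return its position in the bank."""
    strings.append(text)
    order.append(len(strings) - 1)
    return len(order) - 1


def swap(pos_a, pos_b):
    """Swap two strings by swapping their element numbers, not the text."""
    order[pos_a], order[pos_b] = order[pos_b], order[pos_a]


def text_at(pos):
    """Resolve a bank position to the actual string."""
    return strings[order[pos]]


for line in ("first line", "second line", "third line"):
    add_string(line)

swap(0, 2)                                   # bank now maps positions to [2, 1, 0]
selection = [2, 1]                           # a selection is a list of bank positions
selected = [text_at(p) for p in selection]
```

Python lists grow on their own, so the redim problem does not show up here; in Blitz3D the storage array would still need resizing, which is the part a growable list takes care of.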


Mahan(Posted 2009) [#8]
I updated the StringList and added the following function:

StringListSwap%(slHandle%, idx1%, idx2%)

With this function you swap the element at one index (idx1) with the element at another (idx2) by reference and not by value.

It's not pointer operations as you initially requested, but imho this should be even simpler than pointers and do the same thing.

The function returns true if it successfully swapped the strings and false if it was unable to do so (out-of-range check).
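
For anyone curious what a range-checked swap of this kind amounts to, here is a small Python sketch. It is only an illustration of the behaviour described above, not the DLL's actual code, and the function name string_list_swap is made up.

```python
def string_list_swap(items, idx1, idx2):
    """Swap two entries in place; return False if either index is out of range."""
    n = len(items)
    if not (0 <= idx1 < n and 0 <= idx2 < n):
        return False  # out of range: leave the list untouched
    # Only the references are exchanged; the string data itself is not copied.
    items[idx1], items[idx2] = items[idx2], items[idx1]
    return True


lines = ["alpha", "beta", "gamma"]
ok = string_list_swap(lines, 0, 2)   # valid indices
bad = string_list_swap(lines, 0, 3)  # index 3 does not exist
```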

http://download.ecma.webfactional.com/CollLib.zip is updated.

Hope it helps.


Abomination(Posted 2009) [#9]
@Mahan

Is it possible to resize the list in realtime?
I don't want to create a list of thousands of elements,
if the "virtual" user just wants to create some small document.
It'd have to have that flexibility...


Mahan(Posted 2009) [#10]
When you create the list it's empty. You never create a list of thousands of elements, but you create an empty list, and fill it up yourself with 1 or 2000000 elements or whatever you choose to in your program:

  myList = CreateStringList()


Then you can add strings to it using:
  StringListAdd(myList, "This is the new string im adding")


You can ask the list how many strings there are in it at any time:
  nrOfStrings=StringListCount(myList)


And you can ask the list to return any of the strings in it:
  print StringListGetString$(myList, 0)


Note: The first item in the list (if count > 0) is at index 0.
The last item in the list is at index Count-1.

So if you wanna write out all strings in your list you do:
  for i = 0 to StringListCount(myList)-1
    print StringListGetString$(myList, i)
  next


And when you are finished with the list you delete it like this:
  FreeStringList(myList)



There are more things you can do with a StringList. Check out the CollLib.decls file for a complete list of Calls/Commands available.