Required change in TList

BlitzMax Forums/BlitzMax Programming/Required change in TList

Paposo(Posted 2007) [#1]
Hello guys.

The method count() in TList is very ineficient. It have a cost of O(n). Is very easy obtain O(k) using a counter in TList incremented in add methods and decremented in remove methods.
Other method like toArray() are benefited with this small change.
¿ What is the motive to use actually count()? I not understand.

Bye,
Paposo


Who was John Galt?(Posted 2007) [#2]
I would rather this change wasn't made. I generally avoid list counts and conversion to arrays in my main loop. I do however and and remove lots of stuff from lists, so this would be a performance hit.


GfK(Posted 2007) [#3]
I don't want this change either. Its fine as it is.

If you don't want to use Count(), then you can manually keep track of how many listitems you've added/removed.


Paposo(Posted 2007) [#4]
mmmmmm.
I calculate the cost

try it:


1000000 increments have a cost of 4 millisecs in my machine.

Are you sure that is a performance hit?

Bye,
Paposo


GfK(Posted 2007) [#5]
Under what circumstances are you planning to add one million items per frame to a TList?

Your "4 millisecs per one million iterations" example is grossly exaggerated, imho.


Paposo(Posted 2007) [#6]
I not planning add one million items per frame!!!
It is absurd!

I make a very very bad situation. Is evident that use a internal counter not reduce performance of TList and the methods count() and toArray() are more optimizeds!

Bye,
Paposo


Space_guy(Posted 2007) [#7]
well perhaps the difference is small but it seems silly to do a O(n) calculation when a O(1) can be done with so little memory cost. Also isnt a count variable standard in list, are they not?


Paposo(Posted 2007) [#8]
Yes. That is!

the change only increment the memory cost in 4 bytes for each list created, reduce the effort in coding and optimize two methods.

Bye,
Paposo


Grey Alien(Posted 2007) [#9]
I assumed count was just reading an Int, didn't realise it was so slow, shame...


N(Posted 2007) [#10]
. (Removed because evidently nobody wanted a linked list with the changes requested.)


GfK(Posted 2007) [#11]
Paposo: I see. The way you wrote your post above, you made it sound like 4ms was a bad thing.
I assumed count was just reading an Int, didn't realise it was so slow, shame...
No, it iterates through the entire list and counts items manually. Unless you have several thousand items in a list, though, it wouldn't usually be noticable or a problem.


Damien Sturdy(Posted 2007) [#12]
Hmmm! This is not the behavior I require for one of our projects...

SO, I will modify the source myself (it's hardly difficult.)


Gabriel(Posted 2007) [#13]
Much like Grey, I had always assumed that it was just returning an internal count field. My fault for not opening up the source and looking, clearly, but since Paposo has demonstrated that using an internal counter does not reduce TList performance, I have to agree that this should be added.


H&K(Posted 2007) [#14]
Its a tossup between a small increase in time every TList expresstion, vrs A Big increase in time for a few exprestions.

I personaly always favour the faster the commabds you are likly to use in a main loop, the better.

In this case, its piddley easy peasy to extend TList and add a MyCount (or if Count isnt Final, just override that), and so I dont want a change to the core object for everyone.

but since Paposo has demonstrated that using an internal counter does not reduce TList performance
I have seen no demonstration, I have seen one were I:+1 is shown to be arbitary, but this does not include any fetching time, for what would be a field of a type, and not simply a local variable that is very probaly held as a register in the example given


ziggy(Posted 2007) [#15]
Extend the TList to add it this feature. It's easy. Then use the extended version where necessary, and the regular version when not.

Just add the objCount field, and increase it and decrease it on every Add or Remove command.

Doing so, there's no need to change the count() routine.


Gabriel(Posted 2007) [#16]
In this case, its piddley easy peasy to extend TList and add a MyCount (or if Count isnt Final, just override that)

Not if you have a project with god-knows-how-many references to tList that you have to manually change over to MytList it's not.

I have seen no demonstration, I have seen one were I:+1 is shown to be arbitary, but this does not include any fetching time, for what would be a field of a type, and not simply a local variable that is very probaly held as a register in the example given

Fair enough. If no one writes a better demonstration, I'll try to find time to do it myself later.


ziggy(Posted 2007) [#17]
@Gabriel:
Type TList extends brl.pub.tlist
...
End Type


This works, and you don't have to change any reference as in-code defined types have priority


Gabriel(Posted 2007) [#18]
That works even in a module?


ziggy(Posted 2007) [#19]
Of course!
The only thing you need is, if the code where your module is imported needs to instantiate your extended Tlist, it has to be declared like
MyObject:gabrielmodserver.gabrielmodule.tlist = new (gabrielmodserver.gabrielmodule.tlist)

don't forget the ( ) in the new sentence, as the New operator has priority over the "." operator (weird, but it works this way).

If this TList is internally used by your module, you soudn't worry about that. If not, as your TList extends the existing one, a given instance should be late-binded with the original TList object without any problem, but I have not tested it.


Paposo(Posted 2007) [#20]
Hello.

H&K , Try it :-)
THIS ARE DEMOSTRATION


If the changes are made in TList is eliminated the overhead produced by inheritance. The modification in TList original is little fast than example TListTest

Bye,
Paposo


H&K(Posted 2007) [#21]
Ok, I accept that Ive now seen an example that prove both of our assertions.
Yours that the addition of CTA:+-1 is an arbitary increase, and mine that its piddley easy peasy to extend TList and add a MyCount (or if Count isnt Final, just override that)

I would say, that even if Mark adds this to TList, I would probably then just override it back to the way it was.

(Edit, Im assuming that your code did what you said, I simply ran it, and havent analised it)


Grey Alien(Posted 2007) [#22]
If you extend TList, won't there be an overhead everytime you call the existing functions because you'll have to do this

Method AddLast()
  Super.AddLast()
  Counter:+1
End Method


I mean overhead from having to wrap the super method in another method. Or does Super.AddLast() compile in a special optimised way?


Oddball(Posted 2007) [#23]
To avoid cyclic referencing a TLink has no knowledge of which TList it is associated with. This means that if you use the TLink method Remove() to remove an object from a TList your count will not be updated and therefore will yield incorrect results. In other words your new count() function simply doesn't work.


Derron(Posted 2007) [#24]
What about

MyList.link = Null

no method is used - so no count-variable is refreshed.

If this would be the normal way bmax acts, the count-variable would differ from actually stored items in the list.


bye
MB


Gabriel(Posted 2007) [#25]
Oddball's right.

MichaelB: And neither should a count variable be refreshed with your example. Destroying a link does not and should not remove the item from the list.


Paposo(Posted 2007) [#26]
Yes, Oddball. You are true :-(

If use TList for certains operations, is more readable and clean use tlist for all operations. Is very easy add a method removelink() in TList

The use of tlink is a good reason for not modify the behavior of tlist. Some guys use tlink and is not good a change in your code.

Sorry. I retire the proposal of change tlist. I create another tlist for me :-(

Bye,
Paposo


ziggy(Posted 2007) [#27]
edit: wrong post. please dete it
Anyway, having an optimized count method of an extended TList, in some situations can improve the collection performance. Obviously there can be some compatibility issues. I will use an extended version of TList for a module I'm writting, as it is using a lot of count() calls for large TLists. These Tlists are being used internally by the module, and there's no speciffic TLink manipulation, so in this situation, I think it can be a good improvement of performance.
As soon as I have tested it properly, I can provide some results, if anyone here is interested.


Blueapples(Posted 2007) [#28]
Linked lists (I mean the formal data structure, not just the way it is implemented in BlitzMax) do not keep track of item counts. If you need that number for anything, either keep track of it yourself or use an array. If you're doing any kind of loop using a count on a TList then your algorithm is written incorrectly.

A significant problem with implementing the suggested change is that TLink objects do not and should not know what list they belong to. The only fields of TLink are:

	Field _value:Object
	Field _succ:TLink,_pred:TLink


How can it's Remove() method (which is what RemoveLink() calls) know what count value to reduce? Yeah you can keep a reference to it but then you have a circular reference and the GC won't clean up the object very well...

It's just not a good idea, it isn't useful. List algorithms don't use counts.


ziggy(Posted 2007) [#29]
@Blueaples: You're completely right, but but there are other kind of collections (not available on BlitzMax) that do need this count (Stack collections as instance). Now stack collection algorithms can be done using TList, but they're very slow. It would be great to have LIFO and LILO collections.

Just to make a simple keyframing timeline routine I have had to write my own collection class becouse using TList was simply too slow, and TMap was not letting me have duplicate keys (more than one keyframe in the same timecode in my particular code).


Blueapples(Posted 2007) [#30]
@ziggy: Point taken, but you're exactly right in your example with stacks: they are inefficiently implemented as linked lists. Usually a stack has so much thrash that it's much better to calculate how much space you will need and use an array to store it. It can look like a list maybe, but it really should be an array under the implementation covers. Look at the most efficient stack ever, the one in a CPU: it is certainly not a list, much more like an array.

I just really like the finer points of CS topics, so I like to chime in on some of these discussions. It's kind of a shame more people aren't exposed to the "right" way of doing things... it's a joy.


SculptureOfSoul(Posted 2007) [#31]
Ziggy, if you need duplicate keys, and speeds as fast or faster than TMap, you might consider my THashtable class: http://www.blitzbasic.com/codearcs/codearcs.php?code=1907