Object Composition and Aggregation

BlitzMax Forums/BlitzMax Programming/Object Composition and Aggregation

zoqfotpik(Posted 2014) [#1]
Getting my mind around run-time object composition. It's been a little bit of a bear but I think I understand it.

In example code by AlexO posted in this thread, what he's doing is giving a parent object a list (really in this case a TMap, for speed) of components, then calling Update on all members of that list, with Self as the parameter to Update(), to pass the parent to the component. Then, within the component, values within the parent are twiddled.

Can people with experience explain how this is useful in composition of game objects? That is to say, what big wins are available from that sort of implementation?

How is it useful in general program flow and game state management?


Yasha(Posted 2014) [#2]
There's been a lot written about this... the search term you're looking for is "entity component system". Some introductory articles here and a decent series here.

The big win is in terms of abstract architecture. You get away from the mess caused by inheritance, and move closer towards the kind of flexible, omnidirectional relationship graph your game logic probably actually embodies.

The downside is that performance may suffer for small games, and is quite likely to suffer for large games thanks to huge amounts of wasted memory and fragmentation - to get actual performance improvements, you need a lot of low-level runtime support code and quite possible a component-aware language/compiler (which Max is not).

The tradeoff therefore is that you get a much better high-level abstract design for your overall game architecture, at the cost of a very messy low level of implementation code that might become easy to get bogged down in. Avoid thinking about code (like TMaps vs. TLists for implementing entities) - this system is for very high-level organisation of your project, and to see the advantages you need to be working mostly or entirely outside the entities themselves, using high-level generic procedures to build your program logic. In practice you probably wouldn't implement a full entity system, simply so that the low level stays manageable and still uses some native objects (maybe a half-system over the top of a small set of classic-OOP components).

It has some similarities to "generic method"-based OOP as exemplified in CLOS, where the methods exist outside the classes.


zoqfotpik(Posted 2014) [#3]
I don't like the sound of performance problems, if I didn't mind that I would have stayed with Python, or if I didn't like shooting myself in the head, 6502 on the Apple II.

How bad IS the performance trouble?

What if you implemented this with function pointers?


Yasha(Posted 2014) [#4]
It depends entirely on your program, but at a guess I wouldn't expect it to drag your performance down quite as far as using Python (and since Python is fast enough for many realistic uses anyway...).

It's probably the kind of "performance problem" that's an issue for people who write MMO engines in C++ - it's much less likely to bother someone writing a smaller game or something that doesn't do so much data crunching.

Bigger, fewer components will go much of the way to addressing such a problem if it arises, anyway. You don't need to use it for everything; the main problems are likely to come about if you go overkill (or... conceptually pure) and start using components even for things as small as object positions.


Derron(Posted 2014) [#5]
Yasha: the main problems are likely to come about if you go overkill (or... conceptually pure) and start using components even for things as small as object positions.


Do you think that this subtracts so much performancewise?

I use my Vec2D-classes for positions - or Rectangleclasses (using vec2d for dimension and position) for areas in my entities.

Of course it adds another "layer" but allows for easier things like return values (the screen rect does not get to have created on each request), allows for shorter code (pos.swap(otherpos), pos.copy(otherpos), rect.growRelative(1.5) ...).

I know that each "benefit" comes with some costs - but I do not know how much it adds compared to eg. "setters/getters" (I assume they do not get "optimized away" during compilation).


@entities/components
I even store strings in a specific Class (TLocalizationString) because I need to change the language "on keypress". The alternative to a dynamic wrapping class is then to have a function changing all available strings - or to send out an event so that all objects handle it on their own. Of course the wrapper is more "convenient" to use - but increases overall memory usage (eg. for people not changing the language at all).
But if you follow such a theory, you wouldn't even bother using an "AssetManager".

Benefit of components might even be things like "team colors" (all the entities of one team "share" the same color object - change for one in the team, and all others recolor automatically. Alternatively use a Color-Collection and store the id of the color in each entity -- seems there is in many cases a diversity of options available).


So back to my original question: does the component based approach add more than "memory usage" - especially when using "getters/setters" in both cases? Don't know how expensive it is to bring the "components object" to the front to operate on it (stack?).


bye
Ron


zoqfotpik(Posted 2014) [#6]
It seems to me that he's talking about runtime composition (stack of types) rather than compile-time (types within types) because boxed variables for things like rgbcolor are common and seemingly necessary. And seemingly cause very little performance overhead. I can see how it might become a cache thrashing issue though to have a list of objects, each of which is getting passed a pointer to parent and then acting on the parent. My guess is that runtime creation of these object trees might be slow if there were more than a few nodes.

The sorts of apps I am writing are typically well within performance tolerances, and are frankly at the very low end (even a raspberry pi laughs at early 80s arcade games. Even emulated). My main question is what sorts of productivity gains and low hanging fruit emerge from runtime composition (rather than compile time which I already use.).

It's sort of a conceptual headache but I suppose I'm using it already with a Renderables list and an Updatables list, sort of. It's also looking like there are lots of ways of doing it, like having a list of function pointers, which I'm thinking would be much less processor intensive. I guess I just don't like the idea of passing a parent pointer to a child component and having the child act on the parent, it seems to break encapsulation, but maybe I am wrong about that.

And certainly old fashioned. And old. This kind of paradigm shift seems to get harder every time.

Back to performance of components, I've been doing things with inheritance hierarchies at a 2x2 pixel block level for 640x480 windows and I didn't notice any slowdown from that, but I suppose I could time that with both flat arrays ( r[],g[],b[]) and runtime aggregation, to see if there's any substantial difference.


Derron(Posted 2014) [#7]
@colors
that is why I asked if the overhead of calling a function is more expensive than the overhead in memory.

rgba could be stored in an int - so using "GetA/GetR/..." and then returning the desired value using bitshifting is an option.

This would replace an "object" with a simple number. In BlitzMax this also means, you cannot check for "is not set" - as "colorobject = null" works, but "colorint = null" just is also true for "black" (int = 0).

So having "components" might have benefits in more cases than one might think of in the first moment.


@runtime components
Instead of passing parents to children my objects have a "parent" property of the basetype they extend from too.
Then all "linked" data is asking the parent which asks its parent etc.

So GetX() means "x + parent.GetX()".

Dunno if the overhead of storing the "reference" of a parent (or nothing) is bigger than having that value as parameter in a function. Maybe Yasha can shed some light on this (field reference:object VS function(reference:object).


bye
Ron


zoqfotpik(Posted 2014) [#8]
Let's figure out an empirical test for this stuff.

My feeling is that there are probably huge benefits from runtime composition. An example would be if you wanted to have a unit editor be part of the game, or have huge numbers of procedurally generated enemies, weapons etc. (one of my projects is like that). Just come up with a bunch of properties that are either roughly balanced with each other or have some sort of cost associated with their power, and then as the skill level of the game increases, throw more of those properties on the list. There have to be other things you can do with it as well.


zoqfotpik(Posted 2014) [#9]
Well, I went ahead and tested it using AlexO's code.

These were my results:

Adding 1000000 Component Objects
Millisecs elapsed:407
Doing 1000000 updates
Millisecs elapsed:106

That's on a quad core i5. On my dual core Mac Mini 2011 I got:

Adding 1000000 Component Objects
Millisecs elapsed since last call:1019
Doing 1000000 updates
Millisecs elapsed since last call:279

I'm not going to bother testing compile-time composite objects or raw arrays. I imagine they are faster in that order but a million updates in a tenth of a second is far more than sufficient.

That leads me back to the other part of my question: what can we do with this sort of thing? One obvious idea is a learning AI. Another: I understand that people use this for game state management, networking and other things. How does that work? Do they just snap in a network controller object instead of an AI or console input?

Derron:
that is why I asked if the overhead of calling a function is more expensive than the overhead in memory.

This is the face I make when I'm concerned about memory footprint of a single-developer indie game in a 32 gig machine:

:|

Obviously it would probably be much more of an issue on mobiles but there I will just use manager objects containing raw arrays. And even now it's still good to have some concern about that kind of thing because cache etc. But when Yasha said "performance hit" I was really expecting it to be something much more substantial.

This is the testing code:
Rem
bmax composition example
written by: AlexO (Testing code by ZFP)
End Rem

Global time:Int

Global playerlist:TList = New TList
Global i:Int

' lets make an object that has physics, AND can be controlled by our player.
Print "Adding 1000000 Component Objects"
timer()
For i = 1 To 1000000
Local myPlayer:TBaseObject = TBaseObject.Create("Robot", 40, 40)
'...now to give our new empty object physics and controls we simply add the functionality to them
myPlayer.AddComponent(New TPhysicsComponent)
myPlayer.AddComponent(New TControllerComponent)
myPlayer.Init()
playerlist.addlast(myplayer)
Next
timer()

Print "Doing 1000000 updates"
timer()
For myplayer = EachIn playerlist
myPlayer.Update()
Next
timer()



' now lets make a bullet with laser flight
Local myBullet:TBaseObject = TBaseObject.Create("Laser", 30, 30)
myBullet.AddComponent(New TLaserComponent)
myBullet.AddComponent(New TCollisionComponent) 
myBullet.Init()

myBullet.Update()
' and so on...


Rem
this is our 'container' class.
it provides very basic functionality that we 
determine every 'object' should have.
in this example I've only determined
that each 'object type' should have a name and position
end rem
Type TBaseObject
	Field _name:String
	Field _components:TMap = CreateMap()
	Field _x:Float
	Field _y:Float
	
	' factory method to help create a base object.
	Function Create:TBaseObject(name:String, x:Float, y:Float)
		Local obj:TBaseObject = New TBaseObject
		obj._name = name
		obj._x = x
		obj._y = y
		Return obj
	End Function
	
	' some basic setters/getters
	Method SetPosition(x:Float, y:Float)
		_x = x
		_y = y
	End Method
	
	Method GetX:Float()
		Return _x
	End Method
	
	Method GetY:Float()
		Return _y
	End Method
	
	Method GetName:String()
		Return _name
	End Method
	
	' to provide some coherency
	' we'll add an init() function
	' that we call once all the components
	' for an object have been added
	' and wish to initialize themselves
	Method Init()
		For Local comp:TComponent = EachIn _components.Values()
			comp._OnAdd(Self)
		Next
	End Method
	
	' our update function becomes
	' simplified for this object
	' since all the functionality
	' is tucked away in it's components.
	Method Update()
		For Local comp:TComponent = EachIn _components.Values()
			comp.Update()
		Next
	End Method
	
	' this allows use to search for a particular component
	' and functionality based on name. Since it is 
	' using a tmap, retrevial is fast compared to a linked list.
	' the only downside is 'update' order is not easily determined. a trade-off
	' but could be solved with a more complex solution.
	Method GetComponent:TComponent(componentName:String)
		If _components.Contains(componentName) Then
			Return TComponent(_components.ValueForKey(componentName))
		End If
		Return Null
	End Method
	
	' this adds a new type of component
	' to this object. In this implementation
	'you cannot add, for instance, 2 physics components, as it doesn't make much sense
	Method AddComponent(component:TComponent)
		If Not _components.Contains(component.GetType()) Then
			_components.Insert(component.GetType(), component)
		End If
	End Method
End Type


Rem
our base component class.
each component class _must_ have a way of identifying what type 
of component it is. For this example we will just use a 'type' string. 
This isn't the most 'strongly' typed way to do it, but for simplicity
it will do.
end rem
Type TComponent Abstract
	Field _owner:TBaseObject
	Field _type:String
	
	Method New ()
		_type = "TComponent"
	End Method
	
	Method GetOwner:TBaseObject()
		Return _owner
	End Method
	
	Method GetType:String()
		Return _type
	End Method
	
	Method SetType(typeName:String)
		_type = typeName
	End Method
	
	
	Method OnAdd(owner:TBaseObject) Abstract
	Method OnRemove() Abstract
	Method Update() Abstract
	
	Rem
	this is consider an 'internal' method
	used only by our base component class to help facilitate the aggregation
	end rem
	Method _OnAdd(owner:TBaseObject)
		_owner = owner
		OnAdd(owner)
	End Method
	
	Rem
	another internal method to help facilitate the component structure
	end rem
	Method _OnRemove()
		_owner = Null
		OnRemove()
	End Method
End Type


'now we will define some custom components to give our container object
' some usability.


' a basic physics component could encapsulate the physics calculations
' and formulas needed to do simulations.
Type TPhysicsComponent Extends TComponent
	Field _velx:Float
	Field _vely:Float
	
	Method New ()
		SetType("TPhysicsComponent")
	End Method
	
	Method OnAdd(owner:TBaseObject)
		' sometimes a component may want to query for other components
		' in the owner to gain access to other data if there are dependencies.
	End Method
	
	Method OnRemove()
		
	End Method
	
	' a trivial example for 'physics' functionality.
	Method Update()
		Local owner:TBaseObject = Self.GetOwner()
		Local x:Float = _velx + 3
		Local y:Float = _vely + 3
		owner.SetPosition(x, y)
		'Print "physics update!"
	End Method
End Type

'one interesting separation to think about is between physics calculations, and collision calculations.
' both are needed for physics simulations. but say you did not want to run physics on an object but wanted to see when it hit something to
' do some other non-physics related action? adding this component allows you to collide this object against others who have collision components installed.
Type TCollisionComponent Extends TComponent

	Method New ()
		SetType("TCollisionComponent")
		
	End Method
	
	Method OnAdd(owner:TBaseObject)
		
	End Method
	
	Method OnRemove()
		
	End Method
	
	Method Update()
		'Print "collision update!"
	End Method
End Type

' now another powerful use is if you wish to encapsulate different AI/input behaviors, you can do so by making separate components.

' a basic controler behavior that could attached.
Type TControllerComponent Extends TComponent
	
	Field _physics:TPhysicsComponent
	
	Method New ()
		SetType("TRobotAIComponent")
	End Method
	
	Method OnAdd(owner:TBaseObject)
		'lets say for example this controller type needs to know the object's velocity to function. lets query for it:
		_physics = TPhysicsComponent(owner.GetComponent("TPhysicsComponent"))
		' now the above is not ideal if you wish to decouple your components. for more information on a solution that'll 
		' solve that look into Data ports:
		' http://www.gamasutra.com/view/feature/1779/implementing_dataports.php
		' these can be easily adapted to the component architecture where the 'baseObject' object is it's own
		' individual data port manager, and components and register/read/write to it's owner's data ports.
	End Method
	
	Method OnRemove()
		
	End Method
	
	Method Update()
		'Print "controller update!"
	End Method
End Type


'lets say your game character can pick up items and store them. this component adds that functionality, by working together
' with the 'collision' component to see if we collided with some type of object. then we can handle the object internally.
Type TInventoryComponent Extends TComponent

	Method New ()
		SetType("TInventoryComponent")
	End Method
	
	Method OnAdd(owner:TBaseObject)
		
	End Method
	
	Method OnRemove()
		
	End Method
	
	Method Update()
		
	End Method
End Type


'what if you have different bullets and wish to have different flight behaviors for each type of bullet?
Type TLaserComponent Extends TComponent
	Method New ()
		SetType("TLaserComponent")
	End Method
	
	Method OnAdd(owner:TBaseObject)
		
	End Method
	
	Method OnRemove()
		
	End Method
	
	Method Update()
		'Print "laser!"
	End Method
End Type

Function timer()
	If time > 0 
		Print "Millisecs elapsed since last call:" + ( MilliSecs()-time)
		time = 0
	Else time = MilliSecs()
	EndIf
End Function



Yasha(Posted 2014) [#10]
Yeah I meant runtime composition of things like vectors... simply owning a vector object in the static way is perfectly normal and the cost (in all ways) is negligible.

And it's good to see the performance isn't a problem (hey, I did say I was just guessing). That's the modern world for you; these days you could write a game in a slow-interpreted language and it'd still be blazingly fast. On desktop you just don't need to worry.

The real thing to focus on is taking advantage of the dynamism.


Just after a cursory glance though... doesn't this implementation have a memory leak? If a BaseObject has a map of components and each component has a hard link to its owner, you've got a circular reference. Stock BlitzMax's GC can't detect these, as it's based on refcounting. (Interestingly, this is no problem for Brucey's implementation.)


Derron(Posted 2014) [#11]
@circular reference

I think you should just have a "Delete()" method in the base object - which calls Delete for all components (which then remove their "parents").
But one should not rely on the GC to call it - so you have to manually call it.

Another option is a Collection containing every baseobject. Then instead of storing the parent itself you would have the GUID/hash of a parent - and fetch that from the globally available collection (collection.get(guid)). This adds then of course overhead when using it often.

Another way is to pass around the "parent" in each update/render call. So the parent informs the children on each action, that he is the boss :p.


@code:
	Method GetComponent:TComponent(componentName:String)
		If _components.Contains(componentName) Then
			Return TComponent(_components.ValueForKey(componentName))
		End If
		Return Null
	End Method
	
	Method AddComponent(component:TComponent)
		If Not _components.Contains(component.GetType()) Then
			_components.Insert(component.GetType(), component)
		End If
	End Method


There is no need to have the "contains" check (performance!!)
Getter: if it does not exist, it just returns null
Setter: the tmap just contains 1 object per key, so it overrides the already set value (it is just important if the insert is "expensive".


shortened code:
	Method GetComponent:TComponent(componentName:String)
		Return TComponent(_components.ValueForKey(componentName))
	End Method
	
	Method AddComponent(component:TComponent)
		_components.Insert(component.GetType(), component)
	End Method


If you want multiple components of 1 type you need to introduce arrays for each type (tmap-> key|componentArray[]).

Generally you might want to have a look at Artemax (to not reinvent the wheel :P):
Community/posts.php?topic=97192


@networking/ai
I connected them via "events" - my game creates events (or the specific functions emit events). My network handler listens to everything of interest and handles it then. A generic "network component" ... dunno how this should work for diversive objects ... some contain fields to share, some contain lists not to share if longer than X etc. . So you end up writing multiple components extending the "networkbaseComponent" just to handle specific objects.
Think at the end the workload for us coders is the same: if we use approach A or B does not matter :p.


@Yasha
With overhead I meant things like this: BlitzMax does not "optimize out" a single line Getter (Method GetX(); return x; EndMethod). But I do not know how much has to get done to pass something via a parameter. So I do not know if accessing a field-referenced object's method is less expensive than a parameter passed object:

Type blubb
...
End Type

Type bla
  field parent:blubb

  Method Run:int(paramParent:blubb)
    paramParent.do()
  End Method

  Method Run2:int()
    parent.do()
  End Method
End Type



Of course the param one avoids the potential circular reference - but does not allow for "picking" an object and traverse back along the ancestors - eg you got hit by a bullet and want to know which enemy shot you (which then needs the collection I spoke about at the beginning of my post).


bye
Ron


Yasha(Posted 2014) [#12]
So I do not know if accessing a field-referenced object's method is less expensive than a parameter passed object


Technically the field should be faster because one less item gets pushed to the stack (self goes on the stack both times, but the field can then be read from it directly to a register).

In practice, on modern hardware (which does a whole lot of runtime optimisation), I doubt there's any difference at all; if there is it'll be on the order of nanoseconds.