Please make GC optional


JaviCervera(Posted 2015) [#1]
While garbage collection can help in some situations such as detecting and breaking circular references, it would be great to be able to disable it entirely and rely on other stuff like smart pointers, which add less overhead (being able to mark references as 'weak' would help with breaking circular refs). I am still not sure that GC is the best way to go in game programming, especially when trying to write low-level stuff with it, like a 3D engine, where every cycle you can save counts.


Danilo(Posted 2015) [#2]
+1

Very useful for frameworks and complete library systems.


Michael Flad(Posted 2015) [#3]
Yeah, this, as well as a way to place instances/classes into a given memory block, since memory performance nowadays is almost always one of the most limiting factors.

There's a pretty good CppCon talk by Mike Acton (Insomniac Games): https://www.youtube.com/watch?v=rX0ItVEVjHc

In addition, Unity is a pretty good example of this issue, because fighting the GC seems to be one of the most prominent tasks developers have to focus on, and paradoxically it's much harder with languages like C#.

Another decent read: http://sebastiansylvan.com/2015/04/13/why-most-high-level-languages-are-slow/


ziggy(Posted 2015) [#4]
Please don't make the GC optional. Providing a good GC is a much better option. Unity's is horribly slow in some situations.


JaviCervera(Posted 2015) [#5]
What's the problem with providing the option to disable it, Ziggy?


ziggy(Posted 2015) [#6]
Usually, mixing a GC model and a non-GC model in the same code causes lots of memory fragmentation, and forces third-party APIs to be written sometimes in a managed fashion and sometimes not, depending on the preference of the module creator. IMHO this adds complexity to the whole language ecosystem for no good reason. Modern GCs are fast enough (sometimes faster than manual memory management).


JaviCervera(Posted 2015) [#7]
Supporting both wouldn't actually cause that problem: when GC is disabled, the 'weak' keyword could be used to mark weak references, and all object references could be exported as std::shared_ptr and std::weak_ptr (assuming the translator uses the STL; if not, whatever smart pointer implementation it uses). When GC is enabled, the translator would ignore the 'weak' keyword and emit the proper code to make everything garbage collected. There would never be a situation where GC and non-GC code is mixed.
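
For illustration, the exported code could look something like this hand-written C++ sketch (the Node/child/parent names are made up for the example; this isn't actual translator output):

#include <memory>

struct Node {
	std::shared_ptr<Node> child;   // normal (strong) reference: keeps the child alive
	std::weak_ptr<Node>   parent;  // a field marked 'weak' would map to weak_ptr
};

int main() {
	auto a = std::make_shared<Node>();
	auto b = std::make_shared<Node>();
	a->child  = b;        // strong edge
	b->parent = a;        // weak edge: does not keep 'a' alive, so no reference cycle

	if (auto p = b->parent.lock()) {   // weak references must be checked before use
		// p is a valid shared_ptr<Node> while 'a' is still alive
	}
	// When a and b go out of scope, both nodes are destroyed; with two strong
	// shared_ptrs (a cycle) they would have leaked.
	return 0;
}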


Danilo(Posted 2015) [#8]
If GC is disabled, it's a choice. A choice by the user/king (for some reason)...


ziggy(Posted 2015) [#9]
It's not a choice if any module relies on proper elimination of circular references. You can either have both modes at the same time, as in D, which adds complexity to language usage because users have to deal with two memory management models; or allow only one at a time and expect all module and third-party developers to design their modules without circular references, just in case anyone wants to work without a proper GC.


Nobuyuki(Posted 2015) [#10]
I like my circular references being handled automatically, thank you very much! +1 to keeping GC 'mandatory'. Maybe there can be a clearly "unsupported" way to deal with this, like the current #CPP_GC_MODE=0 but with some limited control. Needs to be global as heck though so it's impossible to mix and match models in the same process. I think the majority of users will prefer GC being ON, maybe with a small amount of added control such as the ability to force the GC to flush (or to delay it, where possible -- I don't think it is on Android). Better support for async background processes would pretty much whisk away most remaining GC anxiety to the land of minor concerns. (Parsing heavy XML/JSON metadata or doing procedural asset generation in another thread would be useful)

Sincerely, a coding pleb (nobuyuki)


JaviCervera(Posted 2015) [#11]
It's not a choice if any module relies on proper elimination of circular references.
The only problem with that would be that you would be forced to work with GC on when using that framework if you don't want memory leaks.


Shinkiro1(Posted 2015) [#12]
Giving options can't hurt, right? Actually, it does.
There is a cost to giving users options: increased complexity, maintenance (time, money), a model that is not as simple anymore, and probably longer debug times.

Therefore I think MX2 should pick one model and only one. Otherwise it will just become a mess.


Skn3(Posted 2015) [#13]
I would agree that supporting two memory management models is a headache! But I can definitely see some benefit in adding some form of "destructor" for objects at least. Currently we just cut all ties to an object and pray. If we had some kind of Destroy() (pseudo-destructor) mechanism built into all objects, I think that could be a good stopgap.

Class Base<T>
	Global pool:Pool<T> = New Pool<T>
	
	'Factory function instead of a constructor, since Method New() can't return a recycled instance
	Function Create:T()
		If pool.IsEmpty()
			Return New T
		Else
			Return pool.Pop()
		EndIf
	End
	
	Method Destroy() 'pseudo destructor: hand the object back to the pool
		pool.Push(Self)
	End
End

Class Item Extends Base<Item>
End

Function Main:Int()
	Local i:= Item.Create()
	i.Destroy()
End


The GC could immediately call Destroy on the object when the internal ref count reaches 0. That way nothing gets freed in an unexpected manner. In the example above we could change Function Main to:

Function Main:Int()
	Local i:= New Item
	i = Null
End


and the Destroy() would get called for the object.


JoshKlint(Posted 2015) [#14]
My view is that garbage collection is an outdated fad. Performance never goes out of style.


JaviCervera(Posted 2015) [#15]
Yeah, I agree. Apple did it right with its ARC system.


GW_(Posted 2015) [#16]
I'm voting no, for what it's worth. It would be a support nightmare for Mark and the other maintainers.
Unity is a bad example: its GC is like 8 years old and wasn't designed well in the first place.
Mark has already mentioned something about being able to disable the GC for certain things.
Performance is always my first priority, and I've never encountered the kind of GC issues with BMax/Monkey that would make me want to turn it off.


ziggy(Posted 2015) [#17]
The idea that any GC implementation is worse than manual handling of memory allocation and deallocation is debatable. Manual handling of memory can lead to worse performance, specifically when there are lots of allocations and deallocations and memory starts to get fragmented.
When a GC'ed application uses less RAM than is available, collection may never be triggered, and in that situation the whole application is way faster than one that uses weak pointers, which require ref counting and freeing (and eventually compacting) memory as you go.

Then, there are several scenarios where a good GC can work wonders compared to hand-written unmanaged code or ref-counting-based garbage collection. Ref counting (and weak pointers) can be a bad idea because when a reference count hits zero, memory is deallocated immediately, even when there's no need to deallocate it just yet. That means every object being destroyed produces a small performance hit, and every reference being set or unset on a given object has a cost that adds up linearly. Also, ref counting needs to be atomic in multithreaded scenarios, which is very bad for performance.

That said, the problem with stop-the-world mark & sweep collectors is that they may introduce pauses while performing collection (as in D, or Dalvik), which is very unpleasant in games. While overall performance may be better than with manual management or refcounting, performance is much more consistent when using refcounting or manual management. But there are very nice GC implementations that minimize these pauses to the point that they're unnoticeable (realtime garbage collectors, some generational ones, etc.).

@Skn3: BlitzMax had destructor methods, so maybe we could have them in Monkey2 too? I think they were left out of Monkey1 because the Dalvik GC did not support them, but if we're not going the Java route, maybe Mark will add them. They can be handy sometimes.


Michael Flad(Posted 2015) [#18]
I'd love to learn about those scenarios where a GC will do wonders compared with manually written code (by a reasonably experienced developer).


Nobuyuki(Posted 2015) [#19]
I think ziggy mentioned memory fragmentation. Do reasonably experienced developers love faffing about with that sorta stuff?


ziggy(Posted 2015) [#20]
Take any multithreaded application where you have to increase/decrease the ref counts of weak pointers using atomic operations. A simple variable reference swap means 3 blocking subtractions, 3 blocking additions and 6 blocking comparisons. Compare this to a null garbage collector when memory usage is low, or to a realtime garbage collector. (I would recommend "The Metronome: A Simpler Approach to Garbage Collection in Real-Time Systems".)
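
As a rough sketch of the hidden work involved in that kind of assignment (plain C++ here, with an invented Enemy type just for illustration, not Monkey2 code):

#include <memory>

struct Enemy { int hp = 100; };

// Reassigning a shared_ptr does hidden work: it atomically bumps the count of
// the incoming object's control block, atomically drops the count of the object
// previously held, and frees that object right away if its count reached zero.
void assign_counted(std::shared_ptr<Enemy>& slot, const std::shared_ptr<Enemy>& incoming) {
	slot = incoming;
}

// The equivalent raw-pointer assignment (or a reference assignment under a
// tracing GC) is a single store; the collector defers the bookkeeping to
// collection time instead.
void assign_raw(Enemy*& slot, Enemy* incoming) {
	slot = incoming;
}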

This one is curious: https://gist.github.com/spion/3049314 - LuaJIT seems to perform faster than C and C++ in those benchmarks.

Then, I remember that some time ago there was a long debate between two Microsoft coders who were both trying to make a Chinese dictionary sorting algorithm as fast as possible. One coder was using C++ while the other was using C#. In the end C# was marginally faster, but not significantly so.


Danilo(Posted 2015) [#21]
ziggy wrote:
@Skn3: BlitzMax had destructor methods, so maybe we could have them in Monkey2 too? I think they were left out of Monkey1 because the Dalvik GC did not support them, but if we're not going the Java route, maybe Mark will add them. They can be handy sometimes.

Destructors and use of Smart Pointers (and Pool) would be perfectly fine, wouldn't they?

- Destructor_(computer_programming)

- Boost Library Documentation

-- Smart Pointers
-- Smart Pointer Programming Techniques

-- Boost.Pool
-- Boost.Thread
-- Boost Timer library
-- Boost.Asio
-- checked_delete

- A new approach to memory management that solves the issues with shared_ptrs (contains more links -> see Conclusion)


Michael Flad(Posted 2015) [#22]
If you have performance-sensitive code and you have to do threaded reference counting, you're just not what I'd call a reasonably experienced developer.

I have no issue with having GC for those cases, as well as for areas where the improvement in productivity outweighs the costs, as long as there's a way to lay out data structures in a sensible way for the given problem, for those who know what they're doing and what they need.


ziggy(Posted 2015) [#23]
Destructors and use of Smart Pointers (and Pool) would be perfectly fine, wouldn't they?
Perl was implemented like this, and it has proven to not be a very good approach.
The most problematic areas with this approach:
1.- Slow ref counting when data is shared among threads (atomic operations mean all threads are paused for at least a cycle).
2.- When a reference count reaches 0, the object is released and its memory freed. OK, but that's expensive. What if deallocation isn't required yet because the machine has lots of free RAM? Allocating and deallocating RAM is expensive, and doing it all the time is an absurd performance loss.
3.- In addition to performing worse and adding unneeded complexity to multithreaded handling, it makes circularly referenced objects an automatic memory leak.

In contrast, working with a good GC does not have any of these issues. However, when a GC not designed for realtime applications is used, memory allocation and deallocation is concentrated in big chunks. This improves performance over a large, complex process, but introduces performance variations. That is, a complex process will finish earlier (better throughput), which is good, but it will have some tiny pauses along the way (worse stability). On a regular server or desktop app this doesn't matter, but when we're working on a game, we need a realtime garbage collector.

Fortunately, there are realtime garbage collectors that solve this issue completely and very efficiently. The trade-off is that they're much more complex to implement, and they're not the default on Android (Dalvik, AFAIK) nor in .NET or the JVM. That's why GC in general is considered to introduce performance degradation in games: even when it improves performance over the whole process, it can introduce visible "pauses" in a realtime process. IMHO, using a proper GC is the answer.


marksibly(Posted 2015) [#24]
I think monkey1 ended up with a reasonable solution to GC, and that was pretty much 'don't use new if you don't want GC', as GC is triggered by the new operator.

As a result of this, the graphics and audio modules ended up being developed in such a way that they don't use New in 'realtime' bits of code, so they never cause GC.

You can't write a module that magically works with or without GC - either the module manages its memory issues (via GC or manually) or it doesn't. You can write 2 modules, one with manual memory management and one without, but if the manual version handles memory management issues on its own efficiently, why write the other?

You can suspend GC, but in realtime situations that will only make things worse, as it'll cause 'lumps' in incremental GC. The only time you should probably suspend GC is when you're messing around with extern libs and pointers.

So I think the main thing is to add features that make it easier to do manual memory management and avoid the New operator.
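
For example, something along these lines (a rough C++ sketch with made-up names, just to show the "no allocations in the realtime path" pattern):

#include <vector>

struct Particle { float x = 0, y = 0, vx = 0, vy = 0; bool alive = false; };

// All storage is created once, up front, outside the realtime path.
std::vector<Particle> particles(10000);

// The per-frame update only touches preallocated slots, so nothing in here
// can allocate, and therefore nothing in here can trigger a collection.
void UpdateFrame(float dt) {
	for (auto& p : particles) {
		if (!p.alive) continue;
		p.x += p.vx * dt;
		p.y += p.vy * dt;
	}
}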

> A new approach to memory management that solves the issues with shared_ptrs (contains more links -> see Conclusion)

Interesting article! However, I don't see how it'd be possible to do without it ending up being quite heavyweight - won't each object need a list of 'assigned to' objects so it can find its way back to the root? Gonna take a closer look at this...

I have been thinking about ref counting in general though, one idea being to pinch b3d's system. B3d used refcounting, but also allowed for manual 'safe' delete via a hidden 'this' pointer in objects, eg:

Class RefCounted
...header...
Field this:Object=Self 'real object to actually use
Field refs:int 'ref count
...fields...
End

Whenever you used blah.F in b3d, it actually expanded to blah.this.F. 'This' initially just pointed back to the object so was effectively sort of a NOP. But deleting an object would set 'this' to null, so once an object was deleted it magically appeared to be 'null' to every variable. Behind the scenes, an object would still consume memory until refs went to 0. It worked pretty well in b3d and I think the general idea could be useful in monkey2.
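
A rough C++ sketch of that mechanism (names invented here; the real thing would be generated by the compiler rather than written by hand):

#include <cassert>

struct Entity {
	Entity* self;   // plays the role of b3d's hidden 'this' field
	int     refs;   // ref count: the memory lives until this reaches 0
	int     hp;
	Entity() : self(this), refs(1), hp(100) {}
};

// "Deleting" only nulls the hidden pointer; the memory stays around until the
// last reference goes away, so stale handles see null instead of garbage.
void SafeDelete(Entity* e) { e->self = nullptr; }

int main() {
	Entity* t = new Entity();

	// Debug-mode access: t.hp would expand to t->self->hp, trapping use-after-delete.
	assert(t->self != nullptr);
	t->self->hp -= 10;

	SafeDelete(t);
	if (t->self == nullptr) {
		// every variable still holding 't' now observes it as deleted
	}

	delete t;   // stands in for the ref count finally dropping to zero
	return 0;
}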

There is still a 'write barrier' involved in assigning a refcounted object to a var, but I think the double dereference of 'this' could be removed in release mode by only using 'this' when comparing objects, eg:

Class Thing RefCounted
   Method Delete:Void()   'only for refcounted as yet...
   End
   Method Update:Void()
   End
End

Local t:=New RefCounted Thing   'probably an idea not to reuse plain 'New'
DoSomeStuff( t )
if t<>Null   'not deleted?
   t.Update()
Endif


In debug mode, the 't.Update()' would expand to 't.this.Update()', providing a runtime check that an object hasn't been deleted. But in release mode, it can just be 't.Update()', as the code has (theoretically) already validated the object. If not, they would have got a null object error anyway...

Of course, as with my 'Value' and 'Const' idea, making 'RefCounted' a compile time thing is a bit of a drawback. But it does dramatically reduce the crap involved in using objects in code, as you don't have to qualify vars everywhere with Const, Ptr, RefCounted etc, and it allows for some nice optimisations.

Anyway, just thinking out loud, but one thing's for sure - monkey2 will have GC based on Monkey1's.


marksibly(Posted 2015) [#25]
And regarding destructors (aka dtors, 'coz destructors is a hassle to write), there are 3 main issues here:

* No guarantee dtors will be called in time, eg: if you're relying on dtors to release video memory, but are avoiding 'New' for speed, GC never gets called and nothing is released! That's an extreme case, but the point is that dtors will be called 'at some time in the future', and relying on dtors being called often enough to be used as a memory management tool is probably not a good idea.

* No guarantee of when dtors will be called. Sure, dtors will (probably) be called inside 'New', but lots of code uses New and it could be in a sensitive section of code, eg: when some global data structure is being modified. I consider this to be different from normal callbacks where an API can make some sort of guarantees about when a callback will be invoked - but 'any time new is called' is kind of dangerous.

* The 'zombie problem', consider...

Class C
   Method Delete:Void()
     Print "Deleting..."
     g=Self
   End
End

Global g:C

Function Main()
   New C
   GCCollect
   If g Print "It's ALIVE!"
End


The issue here is that dtors can theoretically 'resurrect' themselves (or other objects) by assigning them to fields/globals. In the above code, if Main tries to use 'g' the app will probably crash, as 'g' has been deleted/deallocated.

I'm yet to find a clean solution to this, so as things stand normal GC objects won't have dtors.


GW_(Posted 2015) [#26]
but one thing's for sure - monkey2 will have GC based on Monkey1's

I'm glad to hear this. Thanks for letting us know your thoughts on it.


Danilo(Posted 2015) [#27]
marksibly wrote:
I'm yet to find a clean solution to this, so as things stand normal GC objects won't have dtors.

Maybe an alternative to automatic dtors could be to optionally register a clean-up function/method within New, so the object manager / garbage collector knows which objects want to be informed when they are dying.
Class Window
    Method New()
        GCregister(Self, addressOf free()) 'hypothetical hook: let the GC call free() before collecting
    End
End

I guess I'll just need to wait and see what you come up with. I still think importing and interfacing with (external) libraries is one of the most important aspects if you want to make MX2 a bit more generalized, like BlitzMax. The better and more powerful that system is, the more useful add-on modules and lib imports will be made... ;)

Maybe Mike will write a new book about advanced MX2 topics...?


JaviCervera(Posted 2015) [#28]
Very enlightening posts, Mark. Thank you for your replies.

Actually, I already took the "avoid New" approach with a 3D rendering library I have written for Monkey (it's already finished; I will publish it as open source as soon as I write some documentation). The 3D math library relies on some cached objects to perform calculations, and no new vectors, quaternions, or matrices are created between frames.
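
As a rough sketch of that pattern (plain C++ with invented names; the actual library isn't published yet, so this only illustrates the general idea):

#include <cmath>

struct Vec3 { float x = 0, y = 0, z = 0; };

// A single cached temporary, created once and reused by every call, so the
// per-frame math path performs no allocations at all.
static Vec3 tmp;

// Writes the normalised direction from a to b into the cached temporary and
// returns a reference to it, rather than returning a freshly created Vec3.
const Vec3& Direction(const Vec3& a, const Vec3& b) {
	tmp.x = b.x - a.x; tmp.y = b.y - a.y; tmp.z = b.z - a.z;
	float len = std::sqrt(tmp.x * tmp.x + tmp.y * tmp.y + tmp.z * tmp.z);
	if (len > 0.0f) { tmp.x /= len; tmp.y /= len; tmp.z /= len; }
	return tmp;
}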