Beware the GC

Archives Forums/BlitzMax Bug Reports/Beware the GC

Otus(Posted 2008) [#1]
You might expect this to print a million hello worlds:
SuperStrict
Framework BRL.StandardIO
Import BRL.LinkedList

Local l:TList = New TList
l.AddLast "Hello "
l.AddLast "World"

Local link:TLink = l.FirstLink(), s:String

For Local i:Int = 1 To 1000000
	s = String(link.Value())
	link = link.NextLink()
	s:+ String(link.Value()) 
	link = link.PrevLink()
	Print s
Next


It does. If you compile in debug mode.

In release mode the garbage collector thinks the list is ready to be collected, when it sees there is no code to operate on the variable - it gets collected before l goes out of scope.

Took me a long while to find one of these in a larger project...


Grey Alien(Posted 2008) [#2]
Interesting, thanks for the heads up. What happens if the list is made Global?


Otus(Posted 2008) [#3]
It seems the list is not collected if a Global references it.


Blueapples(Posted 2008) [#4]
You have to be kidding me...


TaskMaster(Posted 2008) [#5]
Is it the TList (l) getting collected or the TLink (link)?


ziggy(Posted 2008) [#6]
Move this to bug reports?


Azathoth(Posted 2008) [#7]
Is it the TList (l) getting collected or the TLink (link)?

It looks like that the List is getting collected which calls TList.Delete which is setting the fields in the link to null.


Otus(Posted 2008) [#8]
It looks like that the List is getting collected which calls TList.Delete which is setting the fields in TLink to null.


Exactly.

Move this to bug reports?


I was unsure as to whether this would be called a bug or not. What is the way things should work?


Gabriel(Posted 2008) [#9]
I wonder if this isn't due to Mark changing the way local variables are handled by the GC. I remember him telling me a while back that the GC no longer reference counted local variables in the same way that it used to. I mean it used to be the case that if all you had left was a local handle, the internal reference count would show one, but now it shows zero. Perhaps that's because of this apparent urge to collect something mid-scope is a sign of that.

First thing I tried was putting all the code in a function and the same thing happens, so it's not only happening with a local variable in the global program scope.


Grey Alien(Posted 2008) [#10]
When was this GC change made? After V1.28?


Gabriel(Posted 2008) [#11]
I don't think so. I haven't upgraded past 1.28.


Floyd(Posted 2008) [#12]
I still have 1.26, for testing. It has the same problem.


Gabriel(Posted 2008) [#13]
Oh, well then it might not be the change I referred to then, because I don't think it was that long ago either.


ziggy(Posted 2008) [#14]
I'm not sure this is a bug. Nowhere is stated that the GC waits for the entire scope to perform the collection (or I haven't seen it).

It seems to me that the list is being collected just before entering the FOR loop, and doing link = link.nextlink() is doing the rest, as the link is set to the second item of the chain, so the first item is disposed (the list no longer exists) and that causes the hole chain to be also disposed.

Usually debug mode has a much less agresive GC, so I supose the list object remains alive along all its scope. If that's not the case, and this is really a bug, please let me know it!


Azathoth(Posted 2008) [#15]
It seems to me that the list is being collected just before entering the FOR loop, and doing link = link.nextlink() is doing the rest, as the link is set to the second item of the chain, so the first item is disposed (the list no longer exists) and that causes the hole chain to be also disposed.


The TList's Delete method is explicitly setting the link's fields to null, so nextlink is going to give a null object.


Otus(Posted 2008) [#16]
I'm not sure this is a bug. Nowhere is stated that the GC waits for the entire scope to perform the collection (or I haven't seen it).


No, but with it being based on reference counting, one would assume an object will not be collected as long as it is referenced.

It seems to me that the list is being collected just before entering the FOR loop, and doing link = link.nextlink() is doing the rest, as the link is set to the second item of the chain, so the first item is disposed (the list no longer exists) and that causes the hole chain to be also disposed.


Actually, the list is collected only after several rounds of string concatenation, when the GC wants to clear some memory. Also, the reason is not that only one link is referenced at a time, but that list .Clears itself on Delete (removing references between links).

Usually debug mode has a much less agresive GC, so I supose the list object remains alive along all its scope. If that's not the case, and this is really a bug, please let me know it!


The variable cannot lose its value in debug mode, because it must be available for debugging. That's the reason the GC cannot "optimize" in the same way as release.


Czar Flavius(Posted 2008) [#17]
SuperStrict
Framework BRL.StandardIO
Import BRL.LinkedList

Local l:TList = New TList
l.AddLast "Hello "
l.AddLast "World"

Local link:TLink = l.FirstLink(), s:String

For Local i:Int = 1 To 1000000
	s = String(link.Value())
	link = link.NextLink()
	s:+ String(link.Value()) 
	link = link.PrevLink()
	Print s
	Local a:TList = l
	l = null
	l = a
Next


This seems to work.


TomToad(Posted 2008) [#18]
To me, this would be a bug. l should not be GCed until there is no longer a reference to it, which would be at program exit in this example.


Grey Alien(Posted 2008) [#19]
I agree, it's a bug because that local TList variable never goes out of scope (until program exit).


Beaker(Posted 2008) [#20]
It's a bug to me as well, all local variables should be scope based IMO. [moving to bug reports]


Otus(Posted 2008) [#21]
This seems to work.


A better fix is to simply Clear the list after it is not longer needed:

SuperStrict
Framework BRL.StandardIO
Import BRL.LinkedList

Local l:TList = New TList
l.AddLast "Hello "
l.AddLast "World"

Local link:TLink = l.FirstLink(), s:String

For Local i:Int = 1 To 1000000
	s = String(link.Value())
	link = link.NextLink()
	s:+ String(link.Value()) 
	link = link.PrevLink()
	Print s
Next

l.Clear()



Vlad(Posted 2008) [#22]
I'm don't agree to call this as bug. Just give me to explain.
Better is declare list as global or write correct code.
Global list:
Global l:TList = New TList

Or right code to work with lists:
For Local i:Int = 1 To 1000000
	s = String(l.first())
	s :+  String(l.last())
	Print s
Next

or
For Local i:Int = 1 To 1000000
	Local s:String
	For Local ls:String = EachIn l
		s :+ ls
	Next
	Print s
Next

or write code more 'clear' (use local variables at local spaces and not in global space):
For Local i:Int = 1 To 1000000
	Local  link:TLink = l.FirstLink(), s:String
	s = String(link.Value())
	link = link.NextLink()
	s:+ String(link.Value()) 
	link = link.PrevLink()
	Print s
Next

TLink is for internal use of TList. But not for 'only' ;) because access to TLink give more flexible programming.
If u declare list as local, this means list was deleted from memory after last use. After this code: "Local link:TLink = l.FirstLink()", if garbage collector calls after each command.


TomToad(Posted 2008) [#23]
Vlad, I disagree. This is a bug with the optimizer in the Blitz compiler. An object shouldn't be GCed after the last use, but instead after the object loses scope. In the case of Local objects in the main program, that'll be at program exit.

The above example demonstrates why. The GC cannot anticipate all the different ways that an object will be referenced from a program. What about Var and Ptr? Or passing the object to a .dll?

It is also possible for custom types to be accessed similarly to the TLink example above. Create a SpriteGroup type and within a list of Sprite types. You decide you want to manipulate the sprites directly instead of the entire group at once. The SpriteGroup as well as all the sprites get GCed because SpriteGroup is no longer being referenced in your program.

Take a look at this example. Debug mode prints all 50s, release prints all 0s
SuperStrict


Type TAlpha
	Field Array:Int[]
	
	Method New()
		Array = New Int[100]
		For Local i:Int = 0 To 99
			Array[i] = 50
		Next
	End Method
	
	Method Delete()
		For Local i:Int = 0 To 99
			Array[i] = 0
		Next
	End Method
End Type

Local Alpha:TAlpha = New TAlpha

Local Array:Int[] = Alpha.Array
Local s:String

For Local i:Int = 0 To 10000
	s = ""
	For Local j:Int = 0 To 99
		s :+ Array[j]+" "
	Next
	Print s
Next


I think it is very reasonable for a programmer to expect that an object will be valid and components can be accessed throughout it's entire scope, not just until the main object is no longer used.


Azathoth(Posted 2008) [#24]
I'm don't agree to call this as bug. Just give me to explain.
Then you don't understand that the GC is meant to collect when there are no references to the object, 'l' is still referencing the Tlist when it gets collected.


Vlad(Posted 2008) [#25]
2 TomToad
To protect list from GC, if u don't want use it as regular, is declare it as global var. See my first sample. Or use trick below.

2 Azathoth
Then you don't understand errors in code by Otus. I showed it already. Or u don't understand what 'l' don't use in code after "Local link:TLink = l.FirstLink()"? And any references to list not exist after this?
Well how about this code, when 'l' REALLY still reference until end of program:
SuperStrict
Framework BRL.StandardIO
Import BRL.LinkedList

Local l:TList = New TList
l.AddLast "Hello "
l.AddLast "World"

Local link:TLink = l.FirstLink(), s:String

For Local i:Int = 1 To 1000000
	s = String(link.Value())
	link = link.NextLink()
	s:+ String(link.Value()) 
	link = link.PrevLink()
	Print s
Next
Print l.count();

Just write right and clean code and all be OK ;P
Use Global for global variables and don't declare global variables as local.


Azathoth(Posted 2008) [#26]
Then you don't understand errors in code by Otus.
Sure I do, I've already pointed out whats wrong.
I showed it already.
You showed you think its perfectly safe for the GC to collect live objects.

Or u don't understand what 'l' don't use in code after "Local link:TLink = l.FirstLink()"? And any references to list not exist after this?
'l' is still referencing an instance of TList, so until it either goes out of scope or 'l' gets assigned to a different instance the current instance is still alive.


Vlad(Posted 2008) [#27]
2 Azathoth
'l' is still referencing an instance of TList, so until it either goes out of scope or 'l' gets assigned to a different instance the current instance is still alive.
The 'l' variable not exist after 'Local link:TLink = l.FirstLink()'. So it removes and reference removes too. If no variable with reference then no reference :)
You showed you think its perfectly safe for the GC to collect live objects.
Yup. I don't detailed explain this but i think samples is enough. Sample from my second post explain it.
Also this problem a bit specific. The TLink class not have a 'parent' references to TList. And if Otus mean that using TLink have to protect parent TList, that this is wrong for Blitzmax. But this is just specific of this class, and not a bug.


Azathoth(Posted 2008) [#28]
The 'l' variable not exist after 'Local link:TLink = l.FirstLink()'. So it removes and reference removes too. If no variable with reference then no reference :)
Sure it does, show me where it goes out of scope.

Edit: The only thing even to suggest its removed is this bug; 'l' is still in scope, it still exists, its just not directly used any longer.


Vlad(Posted 2008) [#29]
But where scope for local variable, if it declares in global scope? This is specificity of language to decide where is that.

But I agree that I was wrong when say this so peremptory.
I test this code:
SuperStrict
Framework BRL.StandardIO
Import BRL.LinkedList

Function fn()
	Local l:TList = New TList
	l.AddLast "Hello "
	l.AddLast "World"
	
	Local link:TLink = l.FirstLink(), s:String
	
	For Local i:Int = 1 To 1000000
		s = String(link.Value())
		link = link.NextLink()
		s:+ String(link.Value()) 
		link = link.PrevLink()
		Print s
	Next
	'l.clear()
End Function

fn()
In this sample a scope of variables is clearly defined. But error occurs. This is wrong. So I can still say that this is a specificity of language. But this is very different from the generally accepted rules of programming.
We still can make code work with tricks as use list at end of function or use list as return parameter. But this is weird.

So I partially agree that this is a bug.
When we declare global variable as local this can be language specific. Sure more logical to assume if local variable was declared in global scope that it have to be global. Things like 'holywar'.
But when we clearly define variable scope this is bug.


Otus(Posted 2008) [#30]
But where scope for local variable, if it declares in global scope? This is specificity of language to decide where is that.


As you show, the same thing happens if you move everything to a function. The list object will still get collected, even though the scope of the local variable - the function - has not ended. There is no logical reason for not having the same functionality in the "main function", which is all code outside functions.

Also this problem a bit specific. The TLink class not have a 'parent' references to TList. And if Otus mean that using TLink have to protect parent TList, that this is wrong for Blitzmax. But this is just specific of this class, and not a bug.


The TLink nodes cannot reference the TList object or lists would never be collected due to circular references. Also, I did not use TLink to protect the list - I referenced the list in a local variable to protect it.


Dreamora(Posted 2008) [#31]
It was mentioned that local in global scope behave different to clearly specified scopes. In global scopes there is no define lifespan for them, as the local reference count is handled differently than the global reference count for performance reasons (at least my guess thats the reason. that it is done differently was stated by mark and is already mentioned above).

One could see this behavior as a bug. up to a given point I would actually agree.
But on the other hand, you are not meant to have locals on the global level.
Either its global or it has nothing to do there but has to be part of a local scope. Reason is pretty simple: Locals are tried to be kept as near to the cpu as possible. Globals remain in RAM / L2 cache most likely.

If BM would try to use that optimizations of above without GCing your design errors (which they are in the current context of local and global), your CPU performance would ugly degrade!! (especially on AMDs with their inexistant caches)
What would be needed in this case is that the compiler makes them "global without global access", this means that they are internally handled like globals, not like locals, but bmk and bcc making them unaccessable for protected subscopes like functions

I've never had such problems thought, my apps are object oriented ...