K, how the hell do I debug this?

GfK(Posted 2014) [#1]
A problem has just cropped up in my new game, which results in an "EXCEPTION_ACCESS_VIOLATION" in release mode. The game runs perfectly in debug mode.

I can reproduce the problem 10/10 times, so I know vaguely the area I should be looking.

But is there any way I can speed up the debugging process here and find out exactly what's going on?


GfK(Posted 2014) [#2]
Interesting... is there some issue with filestreams being left open that can cause this?


Derron(Posted 2014) [#3]
Shouldn't EAVs be problems arising when accessing null objects?

So I guess you should add Null checks in that area (and deeper, in the case of module types).


If it happens in release mode and not in debug, that does not necessarily mean the debug code "differs" from the release code: I have also had binaries run flawlessly on my Linux distro and crash on Windows, but in the end it was a null variable (I don't know why it did not segfault on Linux). Maybe the system has some kind of "crash protection", or the GC works differently, or whatever.


bye
Ron


GfK(Posted 2014) [#4]
Shouldn't EAVs be problems arising when accessing null objects?
Normally I would say so, but not in this case, it seems.

I wrote some code to spit out code markers to an external file, so I could see where execution was stopping. If I do that in a certain place in the code, the problem magically goes away, which leads me to believe there may be some connection here to file pointers. The very act of opening a file stream, seeking to the end, adding something to the file, then closing the stream, is inadvertently fixing the problem, rather than helping to find what the problem actually is.
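A minimal sketch of the kind of marker logging described above, not the actual code used in the game. The function name LogMarker and the file name debuglog.txt are assumptions made up for illustration; the stream calls are the standard BRL ones.

```blitzmax
SuperStrict

' Hypothetical helper: appends one marker line to a log file.
' "debuglog.txt" is an assumed name, not taken from the original code.
Function LogMarker(marker:String)
	' OpenFile opens an existing file for reading and writing
	Local stream:TStream = OpenFile("debuglog.txt")
	' If the file does not exist yet, create it instead
	If stream = Null Then stream = WriteFile("debuglog.txt")
	If stream = Null Then Return

	' Seek to the end so the new marker is appended
	SeekStream(stream, StreamSize(stream))
	WriteLine(stream, marker)

	' Close straight away so the marker is on disk before any crash
	CloseStream(stream)
End Function

LogMarker("before loop")
LogMarker("after loop")
```

Because the stream is opened and closed on every call, each marker allocates and frees memory, which is one way a logging call like this can shift where (or whether) a memory fault shows up.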

I'm using quite a few LoadBanks in the bit of code where I thought the problem was, but everything seems to check out OK, no Null stuff appearing.


dan_upright(Posted 2014) [#5]
You could try checking the module that handles filestreams for ?Debug stuff, maybe it's not doing something it should be in release mode?


GfK(Posted 2014) [#6]
Already tried that; there's nothing.

I've managed to track down the exact line where this is falling over:

		For Local s:String = EachIn tempMap.Values()
			filename = binPath + "graphics/characters/large/" + s.ToInt() + ".png"
			Local t:TBank = LoadBank(filename)
			If t <> Null
				result.graphic.Insert(filename, t)
			EndIf
			filename = binPath + "graphics/characters/large/eyes" + s.ToInt() + ".png"
			t = LoadBank(filename)
			If t <> Null
				result.graphic.Insert(filename, t)
			EndIf
		Next                    '<HERE
Completely stumped.


Brucey(Posted 2014) [#7]
Could be something in your map that's breaking the Eachin code. Specifically, since you are casting to String, it will be trying to "downcast" the contents of your map to only pick out the Strings from it (I assume you only have strings in there, but there may be some bad data somehow).

You could try making another loop above this one, which is basically :
For Local s:String = EachIn tempMap.Values()
Next

If that bit breaks, then you know it's a data problem in your map.

If not... then it's something else ;-)


GfK(Posted 2014) [#8]
Just tried that, no problems there - completes the loop every time.

I even tried changing the TMap to a TList, which resulted in exactly the same problem. So I've at least ruled out TMaps as somehow being the cause.

Can't get my mind off this file handle thing. I have a vague memory of there being some issue before with file handles being left open. Anyone else?


Brucey(Posted 2014) [#9]
LoadBank() opens and closes its own streams. So there's no problem there.

What does result.graphic.Insert() do, exactly?


GfK(Posted 2014) [#10]
Hmm... I think I've fixed it.

[edit] Never mind, no I haven't. :/


GfK(Posted 2014) [#11]
What does result.graphic.Insert() do, exactly?
Sorry, "graphic" is a TMap.


H&K(Posted 2014) [#12]
Something in

"result.graphic.Insert(filename, t)"

is altering tempMap.Values()

(But then it probably wouldn't run in debug either)


H&K(Posted 2014) [#13]
Make a closed EachIn loop (re Brucey) that pre-loops and stores the "s" values; then have the real loop compare the "s" values just before Next, and see whether they differ.
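A rough sketch of that check, assuming a small stand-in TMap with two string values (the real tempMap and loop body are not shown in the thread, so the contents here are placeholders):

```blitzmax
SuperStrict

' Stand-in for the real map; the contents are assumed for illustration
Local tempMap:TMap = New TMap
tempMap.Insert("1", "1")
tempMap.Insert("2", "2")

' Pass 1: closed loop that only records the values
Local before:String[]
For Local s:String = EachIn tempMap.Values()
	before = before[..before.Length + 1]
	before[before.Length - 1] = s
Next

' Pass 2: the real loop, comparing against the snapshot just before Next
Local i:Int = 0
For Local s:String = EachIn tempMap.Values()
	' ... the real loop body would go here ...
	If i >= before.Length
		Print "Extra value seen during iteration: " + s
	Else If before[i] <> s
		Print "Value " + i + " differs: was " + before[i] + ", now " + s
	EndIf
	i :+ 1
Next

If i <> before.Length Then Print "Value count changed: " + before.Length + " -> " + i
```

If nothing is printed, the loop body is not altering what the enumerator sees, at least on that run.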


dan_upright(Posted 2014) [#14]
Can't get my mind off this file handle thing. I have a vague memory of there being some issue before with file handles being left open. Anyone else?
That rings a bell with me too, though I'm buggered if I can remember specifics.


GfK(Posted 2014) [#15]
Just put this in:
		Local Count:Int
		For Local o:Object = EachIn tempMap.Values()
			Count:+1
		Next
		Print "counted " + Count + " objects"

It says there are two objects, which is correct. Both objects are strings.

Why would this EVER fail on "Next"? At the end of my rope with this. :/


GfK(Posted 2014) [#16]
Just tried this, to get around the need for a For/Next loop:
		For Local t:tDialogLine = EachIn dialog.lines
			If t.characterID > 0
				temp = temp[..temp.Length + 1]
				temp[temp.Length - 1] = String(t.characterID)
			EndIf
		Next

		If temp.Length > 0
			Local tPtr:Int
			Repeat
			filename = binPath + "graphics/characters/large/" + temp[tPtr] + ".png"
			Local t:TBank = LoadBank(filename)
			If t <> Null
				result.graphic.Insert(filename, t)
			EndIf
			filename = binPath + "graphics/characters/large/eyes" + temp[tPtr] + ".png"
			t = LoadBank(filename)
			If t <> Null
				result.graphic.Insert(filename, t)
			EndIf
			tPtr:+1
			Until tPtr >= temp.Length
		EndIf
It still crashes, but nowhere near as much. Probably 1/15 now, instead of 100% of the time. It makes no sense.


Floyd(Posted 2014) [#17]
I wrote some code to spit out code markers to an external file, so I could see where execution was stopping. If I do that in a certain place in the code, the problem magically goes away,

That used to happen way back in the olden days of C programming. The typical scenario was that you had some flawed code, such as an out-of-bounds array reference. This could crash, or not, depending on what memory the incorrect access happened to hit.

Changing totally unrelated code before the bad code, as a debugging measure, could make a crash disappear. The bad code was still bad, but because it had moved in memory the wrong memory access happened to be non-fatal.


Jur(Posted 2014) [#18]
Why would this EVER fail on "Next"?

It is possible the error occurred earlier but the program crashed at that point. I have encountered such behavior when accessing non-allocated memory with pointers.


Henri(Posted 2014) [#19]
Maybe the result object is being collected by GC ?

-Henri


dan_upright(Posted 2014) [#20]
Maybe the result object is being collected by GC ?
Obviously it shouldn't be but that's a good point, have you tried turning GC off for that section of code Dave?
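For reference, a minimal sketch of suspending the collector around a suspect section. GCSuspend, GCResume, GCCollect and GCMemAlloced are the standard BRL.Blitz calls; the placement around the loading loop is an assumption about where one would try it.

```blitzmax
SuperStrict

Print "Allocated before: " + GCMemAlloced()

' Stop the garbage collector from running during the suspect section
GCSuspend()

' ... the loading loop under suspicion would go here ...

' Re-enable the collector and force a collection afterwards
GCResume()
GCCollect()

Print "Allocated after: " + GCMemAlloced()
```

If the crash disappears with the collector suspended, that points towards an object being collected (or memory being corrupted) while it is still in use, rather than towards the loop itself.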


degac(Posted 2014) [#21]
If you have ruled out TMap, try commenting out the LoadBank commands to see whether they have something to do with it.
Maybe it is something to do with timing and reallocating the same bank/reference. Move the Local t:TBank declaration out of the main loop (likewise in the Repeat...Until version) to see what happens.


GfK(Posted 2014) [#22]
I've got rid of all of the LoadBank stuff and rewritten a lump of code, and still have the same problem but it seems to happen quite rarely now.

I'm still using For/Next with a TMap enumerator and it's tripping up on "Next" when the loop is complete. I know this because:

					Self.characters.Insert(String(c.id), c)
					Print "char done"
				Next
				Print "all done"
				tempMap = Null

(tempMap is the TMap).

I get "char done" but NOT "all done". Just about to rip out the TMap enumerator and try something else in its place. Is it at all possible there's a bug in the TMap enumerator somewhere?


dan_upright(Posted 2014) [#23]
Is it at all possible there's a bug in the TMap enumerator somewhere?
I hope not, I'm using that. I'll have a look but I don't promise I'll be able to figure it out. Did you try disabling the garbage collector?


H&K(Posted 2014) [#24]
Is dialog.lines terminated with anything specific? (Is this what's crashing it?)

As a test, add a terminating character and, when it is encountered, Exit from the loop.


GfK(Posted 2014) [#25]
@dan - no, I didn't yet. But I did replace the TMap with a TList, and it appears to have fixed the problem.

So, TList = good puppy, TMap = bad puppy, it seems.

More testing to do.


Derron(Posted 2014) [#26]
If you tested it with a TList and it still failed, it won't be a TMap problem.

I assume that somewhere you modify an object you are iterating over.

You use "result.graphic.Insert": have you made sure you have a valid "result" and a valid "result.graphic"?



Why would this EVER fail on "Next"? At the end of my rope with this. :/


As soon as you access the Values() of a null map, you will run into this. I have also had errors reported at a different position from where they really occurred a few times (this is connected to that "errors on Win32 but not on Linux" symptom too).

In particular, "Print xyz"-style debugging was not 100% accurate in these cases.

bye
Ron


Kryzon(Posted 2014) [#27]
Is this happening in the main thread or in a child thread?


GfK(Posted 2014) [#28]
No threading involved.


GfK(Posted 2014) [#29]
I hope not, I'm using that.
I'm using TMap enumerators elsewhere, too, without problems. Which only adds to the confusion.

Whether it's a mix of TMaps, large blocks of data (OGG files) and long TMap keys (file paths), I don't know. It was all very odd.


therevills(Posted 2014) [#30]
Just wondering why you are using LoadBank in the first place?


GfK(Posted 2014) [#31]
Because I have a bunch of OGG files which I wanted to load quickly into RAM, then LoadSound individually as needed from there (they're played one after the other).

So basically I've just switched from that, to loading them straight from HD. They're really small files (most under 50k) so no noticeable performance difference. I was probably just over-thinking it and creating problems that don't exist.


Derron(Posted 2014) [#32]
Did that also solve the problem of crashes?

Maybe the thread handling OGG playback crashes (those threads are created in the C files, not the BMX ones) and takes your app down with it?
Do you use MaxMod2? It has its own "PlayMusic" functionality with threads and so on.

Instead of loading all files into RAM, you can keep a queue of the songs to play. Each one gets loaded as soon as it is added to the queue. After playing, you can send it to a cache or free it, depending on how dynamic your playlist is (e.g. you allow a user to add a directory of OGGs as a playlist).
If you have made sure the files are only a few kilobytes each, though, this is surely overhead you want to avoid.
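A rough sketch of such a queue, assuming plain BRL.Audio rather than MaxMod2. The type name TSoundQueue and the file names are made up for illustration.

```blitzmax
SuperStrict

' Hypothetical play queue: sounds are loaded when added, played in order
Type TSoundQueue
	Field pending:TList = New TList
	Field channel:TChannel

	' Load the sound as soon as it is queued
	Method Add(path:String)
		Local sound:TSound = LoadSound(path)
		If sound <> Null Then pending.AddLast(sound)
	End Method

	' Call once per frame: starts the next sound when the current one ends
	Method Update()
		If channel <> Null
			If ChannelPlaying(channel) Then Return
		EndIf
		If pending.IsEmpty() Then Return
		' Removing the sound from the list drops the queue's reference,
		' so the GC can free it once playback has finished
		channel = PlaySound(TSound(pending.RemoveFirst()))
	End Method
End Type

Local queue:TSoundQueue = New TSoundQueue
queue.Add("line1.ogg") ' assumed file names
queue.Add("line2.ogg")
queue.Update() ' in a real game this would be called from the main loop
```

To cache instead of freeing, Update could move each finished sound into a TMap keyed by path rather than just dropping it.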


GfK(Posted 2014) [#33]
Did that also solve the problem of crashes?
Yep. Not a single crash all day, whereas yesterday it didn't go 60 seconds without falling over.

I do use MaxMod2, but disabling that was among the first things I tried as it seems to be responsible for most of my Blitzmax woes.


Derron(Posted 2014) [#34]
Maybe some odd bug in ogg-decoding (or corrupt ogg files)?

Maybe try LoadSound with pre-decoded OGGs (i.e. convert them to WAV).

Another thing to try: use one file copied under different filenames (so the copies do not share contexts) and check whether it still happens.



bye
Ron


GfK(Posted 2014) [#35]
Maybe some odd bug in ogg-decoding (or corrupt ogg files)?
Maybe, but in my original post the problem was with images, not audio files.


Derron(Posted 2014) [#36]
The thread is way too long to remember all the details. Sorry.

So decoding an image from memory was bugged ... maybe something is corrupting your memory, or your RAM is faulty. Did you check on other computers, or has the bug been reported by other users?


bye
Ron


GfK(Posted 2014) [#37]
Tested on four PCs, failed on all four. 1 x Win8, 2 x Win7, 1 x XP.


Kryzon(Posted 2014) [#38]
In order to try and get more information on the problem, remember you can encapsulate the code inside that For...Next loop with a Try...Catch statement, to see if the crash behaves differently.

For (...)
	Try
		(...)
	Catch e:String
		Notify( "EXCEPTION:~n" + e )
	End Try
Next



Brucey(Posted 2014) [#39]
Can you catch an EXCEPTION_ACCESS_VIOLATION ?


Derron(Posted 2014) [#40]
OK, so RAM won't be the source of the problem (hmpf... that would have been the easiest one to solve).

So just to see if I got that right:
- you have a loop loading resources into banks - to get rid of hdd accesses.
- they are not getting decoded in that loop

If you replace that "loading into bank" with a straight LoadSound/LoadImage (within that loop!), the loop does not crash?


TBank.Load() and LoadPixmapPNG do the same things to streams, so if that were the source of the bug, it would be something within streams (memory and file). Again, I doubt that (such things are so heavily used that the bug would have been known for years by now).


I assume you are not able to shrink the reproducible piece of code down to a postable size?

Hmmppf.


bye
Ron


LittleDave(Posted 2014) [#41]
Your images might be corrupt; it's happened to me before.