(for the pros) Random errors, help, ideas?

Vertigo(Posted 2007) [#1]
Hey everyone. I have quite a little issue going on here. I have some code that is not playing nice on all machines. It uses quite a few banks, and before any call to RenderWorld or any graphics commands (besides loading models in general), it will MAV. It always MAVs on something relating to the bank commands, saying "bbBank does not exist". The strange thing is that I can Print and DebugLog the handle of the said bank.

Now, if I restart my computer it will present the same exact error, but at a totally different place. There are about 5 or 6 ResizeBank commands in a row. After a restart it will always fail at the same one of those commands until I restart my machine again, then it will always fail on a different one. The thing that gets me... This only errors out if the machine I'm trying to run this on has onboard integrated video. If it has an actual accelerator card then it runs just fine, no errors at all.

Now, if I lower the total number of banks on the machines that have onboard/integrated video and compile the code to an exe, then it runs fine. Yet it STILL gives the "bbBank does not exist" error from within Blitz... debug on or not. Does anyone have any idea what the heck might be causing this?


Rob Farley(Posted 2007) [#2]
1. Use paragraphs
2. Without seeing your code it's damn near impossible to say.


big10p(Posted 2007) [#3]
What he ^^ said, and...

The strange thing is that I can Print and DebugLog the handle of the said bank.
Just because you can display the bank handle doesn't mean the bank still exists. Handles don't get zeroed when freed.
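
A minimal sketch of that point, assuming nothing beyond the standard bank commands (the variable name is only for illustration). FreeBank is handed the handle's value, so the variable keeps its old number until you clear it yourself:

```blitzbasic
bank = CreateBank(16)
DebugLog "handle after CreateBank: " + bank

FreeBank bank
; The variable still holds the old number, but the bank itself is gone.
DebugLog "handle after FreeBank: " + bank

; Clearing it by hand makes "freed" distinguishable from "live".
bank = 0
If bank = 0 Then DebugLog "bank has been released"
```

Zeroing the variable right after FreeBank is just a convention, but it turns a stale handle into something you can test for.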


Vertigo(Posted 2007) [#4]
OK, OK. So on machines that have integrated video cards this will fail...

If you have a real 3D card (AGP, PCI, PCI-X, whatever), the code runs fine...

So big10p, why does the bank get zeroed when you use an integrated video card and not when you have a real video card?

And why does it "sometimes" let you compile it to an exe and run just fine, but still give a bank error when running in Blitz, debug on or off?

This isn't anything specific to the code. If that were the case then it would fail on ALL machines this has been tested on, and not just 4 out of the 8. Right?

It's not the code so much that's the issue, but rather Blitz, and why it's deciding to act differently and crash at random places on machines that do not have a real video card.

Below is a link to a video I made demonstrating the issue. This is on a Core 2 Duo 2.0 GHz with 1 GB RAM and Intel 945G onboard video.

http://205.234.98.59/~vertigo/web_cms/pub/wtf.avi

Note you may need the codec from www.camstudio.org, and use windows media player.

See how it fails here in Blitz, yet when I compile it... it runs just fine.

On 4 other machines with various ATI and NVIDIA cards it works fine, with no errors in Blitz. Same code, same version of Blitz, all running Blitz off a thumb drive with the same decls, all on Windows XP.

I'm just curious if anyone has ever experienced anything like this, or has done any HEAVY work with banks.


Gabriel(Posted 2007) [#5]
This isn't anything specific to the code. If that were the case then it would fail on ALL machines this has been tested on, and not just 4 out of the 8.

Equally, if it wasn't anything specific to the code, then everyone would be reporting that their programs mysteriously fail on bank commands when run with debug enabled on a machine with integrated video. They're not, so either a huge number of people are very remiss or it *is* something specific to your code, even if it's not something obviously specific to your code or even related to the parts of your code which you would logically expect to be involved.

For instance, reading and/or writing out of bounds to a bank may crash and it may not. Just because you shouldn't do it, doesn't mean it will necessarily pop up and nicely report to you that you've done it. It may only do that under certain, improbable circumstances.

Another example, any Windows API stuff, kernel32, all that, can do very unpredictable things if you pass it a bad pointer or something like that. It won't necessarily crash on the command you would expect.

Furthermore, any libraries you use which may spawn off a new thread can cause unpredictable errors. You may find it crashing on those lines because they just happen to be the next lines you execute after doing something which will shortly trigger an error in another thread. The timing won't be precise because threading is inherently unpredictable.

Heaven knows if any of those examples could be of any use, but these are the kind of things which could happen, and if it's any of them, it will still come down to your code. All I can suggest is to give it a bit of Sherlock Holmes. Eliminate things. Block comment sections of code. Write to the debuglog at certain points, try calling timewasting functions which do nothing particular, to see if you can get the program to crash on different lines. Without code, all I can really suggest is trial and error.


big10p(Posted 2007) [#6]
So big10p, why does the bank get zeroed when you use an integrated video card and not when you have a real video card?
It doesn't. If the bank handle is zero and you haven't explicitly set it to zero after freeing the bank, then the bank was never created in the first place, for whatever reason.

How big are these banks and how many are you allocating?


Vertigo(Posted 2007) [#7]
OK, fair enough. Yes, I realize the code has "something" to do with it... What I mean is that I have managed to narrow this down to a hardware issue of sorts: something Blitz is doing differently between machines that have an integrated video card (failing at parts of the code that make no sense) and those that have a dedicated card (works perfectly).

I mean, at this point it could be a million things. I can't physically see the memory heap Blitz has allocated on the machines that fail, or set a breakpoint on the machines that work and compare the data. Regardless, the only piece of the puzzle that is consistent is the fact that ALL machines with dedicated video cards execute the code flawlessly. And that makes absolutely zero sense to ol' boy Scott here.


Vertigo(Posted 2007) [#8]
big10p, as Gabriel mentioned, the fact that it fails at random parts could very well be due to something with the thread pool. It may have some sort of delay before it crashes, so the actual cause of the crash isn't known. But what I find more important is how the hell (without using any 3D graphics commands besides building a mesh from vertices) a video card has anything to do with this when graphics commands aren't really being used yet. I mean... it's crashing during the initialization, not during the loop... no flips, no calls to D3D draw or anything.

*** scratch that idea ***... If I put a For...Next loop right before ANY type of bank operation on that handle it will execute fine... and bam, be it 1 second later or 15 seconds later, it will crash on any type of bank command.

So to anyone super savvy with Blitz... why do bank commands function differently between integrated and non-integrated systems? Aren't Blitz banks allocated from the heap Blitz holds in SYSTEM memory anyway, not in VRAM? Plus, in total, when the code is on a machine that has a card, it's only using about 8 MB of the VRAM. Any card under the sun should handle that fine.

*head hits desk contemplating suicide*


Gabriel(Posted 2007) [#9]
Banks are never allocated in VRam. However, if a bank handle became corrupted ( we'll come to why shortly ) then you might address a bank, Blitz looks up the bank handle ( which now points somewhere completely different ) and kaboom!

Why / how could a bank handle become corrupted? Passing an invalid pointer to a DLL or a Win32 function. Passing incorrect parameters to the abovementioned.

Frankly, I wouldn't focus on the symptoms, because they're seemingly not helpful. Focus on the cause. And the only way you can find a cause without symptoms is.. trial and error. Cut as much out as possible. Get it bare bones. As soon as you cut something out which stops the bug happening, put it back in and play with it.


Floyd(Posted 2007) [#10]
Why / how could a bank handle become corrupted?

Another favorite mistake is using a floating point variable to store a handle.

This is particularly baffling because it will sometimes work. It depends on whether the address value can be exactly represented as a float.
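
A rough sketch of why that happens, assuming Blitz3D's floats are 32-bit singles (which can only represent every integer exactly up to 2^24). The number below is a hypothetical stand-in for a handle, not a real one:

```blitzbasic
h% = 16777217          ; stand-in for a handle value just above 2^24
f# = h                 ; what happens when the variable was declared with #
back% = f              ; converted back to an integer when passed to a bank command

If back <> h
	Print "Handle changed from " + h + " to " + back
Else
	Print "Handle survived the round trip"
EndIf
WaitKey()
```

Small handle values come through untouched, which is why code like this can appear to work for a long time before it breaks.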


Vertigo(Posted 2007) [#11]
OK cats, to add something even weirder to the mix: I have a Dell D620 with Intel 945GM Express graphics. Core 2 Duo 2.0... blah blah. With a clean install of XP, using the same Blitz folder as the rest, it fails... However, out of sheer curiosity I installed the copy of Windows Vista Ultimate that came with the machine. Clean install, same Blitz folder... and guess what... NOW it runs without failing. There is absolutely zero consistency with this.

I guess I'm sorry for bothering you guys. There is a lot more going on here, and I'm frustrated beyond belief. I realize there is not sufficient data to present to you guys for any real type of diagnosis. I do appreciate the help you have given me though. So thanks. I give up for now.


Danny(Posted 2007) [#12]
Hi All - I'm the one responsible for the Code Vertigo is talking about....

I JUST FOUND OUT THAT BLITZ WILL ACCEPT A NEGATIVE OFFSET WHEN POKING TO A DATABANK!!!

A mistake/bug set my 'offset' variable to a negative number, but instead of crashing immediately it gives a MAV at a much later stage... AND WHEN YOU COMPILE it does NOT GIVE A MAV AT ALL!
Meaning I was probably poking in some 'random' bit of memory, quite possibly my own compiled code!!...
Funny thing is that on 2 of my machines here I guess I was 'lucky' enough that the random poking did NOT crash them, but Vertigo's systems went belly up...

Here's proof, just copy/paste and run this:

	;note: DEBUG ON or OFF: MAV at End
	;note: COMPILED: NO MAV WHATSOEVER!!! I'm POKING ILLEGALLY IN 'some' memory!!!
 
	bb = CreateBank(40)
	PokeInt bb, -4, 666		; ILLEGAL OFFSET! -> Should MAV!!!

	Print "But I'm here!"		; SHOULD NEVER BE HERE !!
	WaitKey()
	End			; NOW you get the MAV !!


How you like them apples?!

:Danny


big10p(Posted 2007) [#13]
Ah.

Well, presumably you're calculating the offset in code with something like 'offset = A - B'. As such, the compiler can't tell this is erroneous as the values of A and B aren't known until runtime.

Also, when compiled with debug off, no checks for out of bounds errors like this are carried out.
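
Since the check has to happen at runtime, one option is to do it yourself. This is only a sketch: CheckedPokeInt is a made-up helper name, not a built-in command, and the values of A and B are placeholders for whatever your real calculation produces.

```blitzbasic
; Hypothetical wrapper: validates the offset before poking,
; whether or not the program was built with debug on.
Function CheckedPokeInt(bank, offset, value)
	; An int needs 4 bytes, so the last legal offset is BankSize - 4.
	If (offset < 0) Or (offset + 4 > BankSize(bank))
		RuntimeError "PokeInt out of range: offset=" + offset + ", bank size=" + BankSize(bank)
	EndIf
	PokeInt bank, offset, value
End Function

bb = CreateBank(40)
A = 8 : B = 12
offset = A - B                   ; -4, only known at runtime
CheckedPokeInt(bb, offset, 666)  ; stops here instead of poking stray memory
```

The extra comparison costs a little on every poke, so you may want to route only the suspicious writes through it while hunting a bug like this one.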


Danny(Posted 2007) [#14]
Touché!


Floyd(Posted 2007) [#15]
With Debug On the negative offset really should be caught, just like an offset which is too large.

Has this just started with some recent update? It's hard to believe it went unnoticed all these years.


Vertigo(Posted 2007) [#16]
Nope, I tried this thing with every version of the Blitz runtime and linker from version .88 to .99 :) *phew* I can breathe now. Blitz is usually nice to me and doesn't pull illogical crap like this haha.


Gabriel(Posted 2007) [#17]
I agree with Floyd, it really should be caught in debug mode. Perhaps it's not because there are no pointers in Blitz3D and this presents a way for people to hack in memory access if they need it, but I can't help thinking that should be done via a DLL or something if it's truly required.

After all, there's a reason a lot of people choose Blitz languages over C/C++ and one of those is not having unpredictable pointer crashes.

Anyway, glad you found it.


SLotman(Posted 2007) [#18]
Well, if your offset were an "unsigned byte" then -1 would actually be passed as 255. So Blitz may be using "unsigned integers" or "unsigned longs" for that, and -4 would be some large positive value after compiling.

This *could* (I'm guessing here) explain the problem on integrated chipsets: the onboard video probably uses the top memory area as VRAM, and then this would actually be writing to or reading from there...


Gabriel(Posted 2007) [#19]
But surely -4 as an offset would be out of bounds in a bank of size 40 whether it were an unsigned long, an unsigned int or even an unsigned byte?
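
To put numbers on that, here is a small sketch (the variable names are just for illustration). In 32-bit two's complement, -4 has the bit pattern FFFFFFFC, which is 4294967292 if the same bits are read as unsigned, so it fails a bounds check either way:

```blitzbasic
offset = -4
size = 40

; Show the raw bit pattern of the offset.
Print "offset bits: " + Hex$(offset)

; Read as signed it is below 0; read as unsigned it is far above size.
If (offset < 0) Or (offset >= size) Then Print "out of bounds for a bank of size " + size
WaitKey()
```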


Vertigo(Posted 2007) [#20]
No, I don't believe -4 would start to wrap around... As Gabriel said, it's only a size of 40. Anything outside that allocated space should present an error right then and there in Blitz. However, SLotman, you are partly correct. It would appear that integrated video cards that use shared memory store their data in the same region where Blitz starts storing these banks, at least on XP Pro with onboard Intel graphics. When Blitz "worked" with these out-of-bounds offsets, it was simply poking memory that wasn't used by system processes or hardware. On integrated chipsets, I'm sure those programs reclaimed the allocated space (especially the shared RAM used as VRAM for graphics), making handles in Blitz point to data that it can no longer make heads or tails of. This was a two-part problem: a slight coding mistake, and Blitz not reporting errors like it should.


SLotman(Posted 2007) [#21]
Even -1 would wrap around... but not within the bank area; it wraps around the RAM area. For example, if you have 64 MB of RAM, a -1 would point to the very top of that 64 MB (just under 65536 KB), regardless of what your bank size is.

So I suppose integrated video with shared memory uses the last RAM banks as VRAM, and when you write to -1, -4 or -(anything), you will be writing to that memory (or reading from it), trashing its contents.

On machines without integrated video, you just happen to write to some unused memory area... Maybe if some other application were using it, then it would crash as well...

But I do agree Blitz should report it as "invalid offset", or something like it (unless you are using a variable, in which case there's no way the compiler/IDE could find out about it).