blitz max great language but slower than expected?

BlitzMax Forums/BlitzMax Programming/blitz max great language but slower than expected?

Nate the Great(Posted 2009) [#1]
hi,

I am in no way saying BlitzMax is a bad product; it is an awesome language and very fast at most things, but I just thought it would be as fast as C++. Am I mistaken? It is much faster than B3D but the speed just doesn't compare to C++. It could just be that I am a noob to BMax though and I don't know how to optimize things properly. Also, is it possible to program in C++ in BlitzMax? Or is that only with B3D SDK?


plash(Posted 2009) [#2]
BlitzMax is actually just as fast as C++, or faster in some cases.

I wouldn't start throwing out claims like that unless you know what you're talking about. There have been several tests in the past.
Even if BlitzMax is slower than C++ in some cases, it usually isn't far behind at all.


Nate the Great(Posted 2009) [#3]
ok I guess it's just because I'm new to BMax and not very good at optimizing things yet.


marksibly(Posted 2009) [#4]
Hi,

BlitzMax isn't quite as fast as C++ - but it often comes close!

If you're having problems with speed, post some specific examples and I'm sure people would be able to suggest ways to improve them.


Nate the Great(Posted 2009) [#5]
ok thanks mark!

I just did a basic test and it seems bmax is very close to C++. It must be some inefficient coding by me.

thanks


slenkar(Posted 2009) [#6]
maybe its lists?

A lot of people say they are slow

Can you do some comparisons between blitz and C lists?


Nate the Great(Posted 2009) [#7]
lists! if lists are slow then I'm in trouble because I have at least 10 of them. that could be what is slowing me down so much. would it be faster to use one global list?


ImaginaryHuman(Posted 2009) [#8]
Choosing a better algorithm can sometimes help more than anything, too.


Nate the Great(Posted 2009) [#9]
I did some testing and found that using 2 lists for 2 types runs 50% faster than using one list for both types all done in blitz max.

here were the programs I used

this uses 2 lists for 2 types


this uses one list for 2 types



slenkar(Posted 2009) [#10]
I suppose it would run 50% faster because you are iterating through everything twice with the single list.

if you compare lists to arrays or lists to maps you may find a large difference


Nate the Great(Posted 2009) [#11]
by a large difference do you mean arrays are faster or slower than types and lists?


Matty(Posted 2009) [#12]
These are unlikely to be bottlenecks in your code. Even though I've never programmed in BlitzMax, and therefore haven't used its 'lists' or 'arrays', these sorts of things are usually not going to make a lot of difference speed-wise.

As ImaginaryHuman says above - a different algorithm may make the world of difference.


Nate the Great(Posted 2009) [#13]
@matty - yeah, I'm working on optimization right now, so I'm heavily researching new, more efficient algorithms


xlsior(Posted 2009) [#14]
by a large difference do you mean arrays are faster or slower than types and lists?


Arrays are faster, but types and lists can be easier to deal with.


MGE(Posted 2009) [#15]
wait... you're new to BlitzMax and you're ALREADY optimizing? C'mon... the speed of the language is fine out of the box; never optimize unless you need to. That's the joy of a language like BlitzMax. It allows you to get your idea up and running fairly quickly, and then if needed... you go back and optimize. I've NEVER had to optimize anything other than some rendering code. So... enjoy learning the language, code things any way you want, get something up and running, and after it's almost done, if it turns out it's lagging somewhere... THEN post for suggestions on optimizing key areas of your code (collision, AI, etc.). :)


Nate the Great(Posted 2009) [#16]
@MGE it's a physics engine, I have to optimize! Without optimization it slows to a crawl.


ImaginaryHuman(Posted 2009) [#17]
I guess it depends on your background as to how relevant you think it is to optimize your code. Like MGE said you can still create efficient programs without having to think about optimization at all. BlitzMax is pretty quick, and as Mark said it's almost as fast as C++ and sometimes faster. It's not like some slow script language or interpreted language.

For me personally, my background started with BASIC and Pascal but also went into assembly language on the Amiga's 7MHz 68000 CPU. In those days memory access was slower, instructions took more CPU cycles to execute (e.g. a divide operation took 40 or more clock cycles), you had less CPU/data cache, and you had to get a job done more efficiently. From that perspective, coding `sloppily/lazily` in higher-level languages without much of a care about how your clock cycles are being spent was seen as inefficient. The faster CPUs became, the less people had to worry about how fast something would be, so they stopped paying attention to how efficient their code was, as is the case today.

There are ways to optimize most code which will help with speed, but speed is really a subjective experience based on *perceived speed* by the user. If you are doing billions of raytracing calculations, obviously it will take a while to render a scene, and someone would see that delay as `slow`, but considering the amount of calculations it's doing maybe it's really super-efficient. Slowness and delay, in the eyes of most users, means `its taking some noticeable time`.

It doesn't matter to them how efficient your code is so long as they are not kept waiting too much. It also doesn't matter to them how efficient your algorithm is or how much faster your algorithm is than it used to be - unless it reflects directly on their perception of how fast it is. Faster CPUs make it easy to be oblivious to sloppiness. If you have more power at your hands and you aren't programming efficiently, the sloppiness will absorb the power. That's partly why we have such bloated and relatively slow operating systems now, and partly why Linux is often faster than both Mac and Windows (in my experience anyway).

That said, I still do focus on starting out with an efficient algorithm and optimizing it somewhat based on what I think/know will be bottlenecks or ways of coding that would slow it down. I like things to be efficient even if I have a tonne of CPU time at my disposal, that's just me.

One thing you can look at, besides better algorithms which can make a HUGE difference to speed (e.g. using grids for collision detection rather than brute-force comparisons), is pipelining. The CPU's `pipeline` is a series of several smaller processing units. Executing an instruction can involve fetching the instruction, fetching operands from memory, performing a calculation, writing something out to memory, etc. As soon as stage 1 of an instruction is done and the CPU starts to move onto stage 2, stage 1 is free and the next instruction can be started. The CPU can be executing several instructions at once.

This is especially noticeable where you are accessing the contents of memory. Each time you read something from memory, and especially if it is not in the cache, the next stage of your program has to wait until the data has finished being read before it can proceed with the next memory-accessing instruction. Meanwhile, if there are other instructions it can start working on which *do not depend on the results of the previous memory read*, those instructions can be pipelined and you start to get extra performance `for free`. So if you can rearrange the sequence of instructions in your source code, without breaking your algorithm's logic, so that memory reads and writes don't all occur at the same time (space them out in between non-memory-based instructions), and so that you are doing some work which doesn't depend on the data you're waiting to be read, then you can get a speed gain. This DOES apply to Blitz code as well as any other language. In a test I ran, it noticeably increased performance. As a general rule it probably helps speed if you simply make a habit of not referring to something you just read from memory immediately after the memory read. Try to do something else with some local variables before you use the new data - give it time to be read.

The trick then is to know which instructions will be trying to read/write memory. Using local variables a lot helps because this puts them in cpu registers where possible. So if you read something from an array (which will be in main memory) and then immediately follow it by some operation on a local variable which doesn't require memory access, and then following that do something with the array element you read, you might get the local variable operation to be performed almost for free. Give it a try.

Also you should use as many Locals as you can get your hands on. Avoid Globals as much as possible, and especially avoid the sloppy non-Strict Blitz code where you don't declare a variable Global or Local, because those just default to main memory. Locals will always make your code faster. Use them as much as possible, as a general rule. Especially where you are doing intensive computation, try to use as many Locals as possible, and if you know a variable is stored in a Field in a custom type, I recommend loading that variable into a new Local first before using it. Also, if you reference the same memory content/value/variable more than once in a section of code, and that value isn't getting written out to memory in the meantime, you should load it into a Local and refer to the Local rather than keep reading from memory.

The CPU has a limited number of registers it can use at once; if you were savvy you could account for that and avoid defining too many Locals at a time (which could force values to be spilled to and from memory), but you can probably make good gains by just using as many Locals as you like.
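The same idea sketched in Python (the thread is about BlitzMax, but the principle is language-independent; the `Body` type and the numbers here are made up for illustration) - hoist a repeatedly-read field into a local before a hot loop:

```python
class Body:
    """Illustrative stand-in for a custom type with a Field."""
    def __init__(self, mass, positions):
        self.mass = mass
        self.positions = positions

def weighted_sum_field(body):
    # Re-reads body.mass (a field lookup) on every single iteration.
    total = 0.0
    for p in body.positions:
        total += p * body.mass
    return total

def weighted_sum_local(body):
    # Hoist the field into a local once; the loop then only touches locals.
    m = body.mass
    total = 0.0
    for p in body.positions:
        total += p * m
    return total

body = Body(2.5, list(range(1000)))
print(weighted_sum_field(body))   # 1248750.0
print(weighted_sum_local(body))   # 1248750.0 - same result, fewer lookups
```

Both versions compute the same number; the second just avoids re-fetching the field inside the loop, which is the habit being recommended above.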

Also I recommend using arrays rather than linked lists where possible. Also if you have to use linked lists, you might even gain some speed by creating your own linked list. Instead of having a separate `link` type containing objects, just add a Previous and Next field to your custom type. I also recommend trying something which I came up with - circular linked lists. Rather than the list having a fixed start and end, simply point the last element in the list back at the first. It makes the list coding easier and more efficient, plus you can call any element the start or end since there really is no start or end. Adding and removing elements requires less code because you don't have to account for starts and ends.
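A minimal sketch of such a circular list, in Python for illustration (the `Node` class and method names are invented for this example, not anything from BlitzMax):

```python
class Node:
    """Prev/Next live directly on the object: no separate link type."""
    def __init__(self, value):
        self.value = value
        self.prev = self.next = self   # a lone node is a 1-element ring

    def insert_after(self, other):
        # Splice 'other' into the ring after self - no start/end cases.
        other.prev = self
        other.next = self.next
        self.next.prev = other
        self.next = other

    def unlink(self):
        # Remove self from the ring; returns a surviving node or None.
        if self.next is self:
            return None                # ring is now empty
        self.prev.next = self.next
        self.next.prev = self.prev
        return self.next

def walk(start):
    # One full lap of the ring, starting from any node you like.
    out, node = [], start
    while True:
        out.append(node.value)
        node = node.next
        if node is start:
            return out

a, b, c = Node(1), Node(2), Node(3)
a.insert_after(b)
b.insert_after(c)
print(walk(a))        # [1, 2, 3]
ring = b.unlink()
print(walk(ring))     # [3, 1]
```

Note how insert and unlink never mention a head or tail - that is the saving being described above.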

There are a couple of ways you can use arrays while still having some flexibility over adding/removing objects randomly - sort of a cross between the benefits of arrays and linked lists.

If you don't care what order the elements of an array are in, ie not sorted, then you can keep two variables: space and count. Space defines the size of the array which you predefine at a fixed length, and Count defines how many elements are stored. Each time you store an element at Array[Count] you do Count:+1. Each time you remove an element (randomly) you copy the element from Array[Count-1] to array[DeletionIndex] and do Count:-1. You don't have to delete Array[Count], just let it be garbage.

This automatically fills in the gap created from deleting the array element. When your array does not have any gaps in the sequence of objects, you no longer have to visit every element in the array trying to find objects. You just go from 0 to count-1. I think of it as a self-defragmenting array. If you need to make more space, you can do it in a big chunk, like adding 1000 extra elements using slices - this way you don't have to do memory allocations/copies very often, because those are slow.
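A sketch of the `space`/`count` self-defragmenting array in Python (illustrative only; a BlitzMax version would use a typed array and `Count:+1`/`Count:-1` exactly as described above):

```python
class DefragArray:
    """Fixed-capacity array; removal copies the last live element
    over the hole, so indexes 0..count-1 are always gap-free."""
    def __init__(self, space):
        self.slots = [None] * space   # preallocated storage ('space')
        self.count = 0                # number of live elements

    def add(self, obj):
        self.slots[self.count] = obj
        self.count += 1

    def remove_at(self, i):
        # Order is not preserved, but no gap is left behind.
        self.count -= 1
        self.slots[i] = self.slots[self.count]
        # slots[count] is now garbage; no need to clear it

    def live(self):
        return self.slots[:self.count]

arr = DefragArray(1000)
for v in (10, 20, 30, 40, 50):
    arr.add(v)
arr.remove_at(1)        # the last element (50) is copied over 20
print(arr.live())       # [10, 50, 30, 40]
```

Iteration then only ever touches indexes 0 to count-1, with no empty slots to skip.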

If you are concerned about keeping array indexes the same and not moving objects around - e.g. if other objects are pointing to specific array elements at given indexes and you don't want to do two-way references, you can add a second array of `available indexes`, like a heap, where when you want to store a new element you get the index of the next free element from the top of the heap and reduce its count by 1. You then store your object at that index within your normal array of objects. When you remove an object you don't defragment the array but you store the free index on top of the heap again and increase its size by 1. This way you keep a running heap of `last in first out` available indexes and your allocated indexes can stay put. The only drawback is you then have to iterate through the entire object array to find all allocated objects, unless you have some optimized reference array somewhere.
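A sketch of the free-index heap idea in Python (class and method names are made up for illustration):

```python
class SlotPool:
    """Stable indexes: a LIFO stack of free slots, so removing an
    object never moves any other object."""
    def __init__(self, space):
        self.objects = [None] * space
        # initially every index is free; index 0 sits on top
        self.free = list(range(space - 1, -1, -1))

    def alloc(self, obj):
        i = self.free.pop()      # next free index, last-in first-out
        self.objects[i] = obj
        return i                 # callers may hold this index forever

    def release(self, i):
        self.objects[i] = None
        self.free.append(i)      # index goes back on top of the stack

pool = SlotPool(8)
a = pool.alloc("first")    # index 0
b = pool.alloc("second")   # index 1
pool.release(a)
c = pool.alloc("third")    # reuses index 0 (last freed, first reused)
print(a, b, c)             # 0 1 0
```

The trade-off mentioned above is visible here: `objects` may contain holes, so finding all live objects means scanning the whole array unless you keep a separate reference list.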

Also something I am currently using in my bigger project for dealing with lots of objects is a combination of arrays and linked lists. The nice thing with lists is you can add and remove elements randomly really easily, but the downside of that is every element has to have memory allocation performed and can lead to fragmentation and scattering. Not good for the CPU cache. What I do is have a linked list of arrays. They are most likely self defragmenting arrays of a given size. When I run out of room in the array, I either jump to the next list element and start using its array, or I insert a new link in the list with a new array. This way you can spend most of your time working on consecutive memory accesses with arrays, which is best for the cache, and yet you still have some flexibility to expand the space as needed. Yes it does mean that you might have allocated array space which is going unused, but totally unused arrays can have their list links deleted. Overall I use a circular custom linked list where each instance of a link contains an array and a count. This works well for me.

Arrays are also faster than linked lists because a) you don't have the overhead of indirectly jumping to the list element, and b) you can randomly access straight to an object in the array rather than having to step through the list. Arrays are definitely faster so you should use them as much as possible.

You could also try partially or fully unrolling some loops. Each time the CPU has to evaluate the expression in a FOR loop, for example, that's extra overhead. If you can do your loop in steps and then have several iterations pasted after each other within the loop, you have less loop overhead and it will be faster. This is especially true when the contents of your loop code is fairly short, e.g a for loop which adds 100 to a variable 1 million times - change the loop to run only 10000 times and replicate the loop code 100 times, it will be faster. You can't go too big with unrolling though, the loop has to fit within the CPU cache.
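The unrolling idea, sketched in Python for illustration (in an interpreted language the win is small; the technique pays off most in compiled languages like BlitzMax or C, where the loop test is real per-iteration overhead):

```python
def add_rolled():
    # One million iterations, each paying the loop overhead.
    v = 0
    for _ in range(1_000_000):
        v += 100
    return v

def add_unrolled():
    # Same total work, but only 250,000 iterations:
    # four additions pasted inline per pass.
    v = 0
    for _ in range(250_000):
        v += 100
        v += 100
        v += 100
        v += 100
    return v

print(add_rolled(), add_unrolled())   # 100000000 100000000
```

As noted above, don't unroll so far that the loop body no longer fits in the instruction cache; modern optimizing compilers will also often do this for you.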


Vilu(Posted 2009) [#18]
That's an excellent idea creating custom circular linked lists. Thanks, Imaginary, I can definitely see some benefits in it. :)


JoshK(Posted 2009) [#19]
Avoid dynamic allocations.


Warpy(Posted 2009) [#20]
good lord that's a lot of writing


Grey Alien(Posted 2009) [#21]
It's good enough for me. Does a game really need ultra speed anyway? It's doubtful a few extra % in speed will make a game more "fun".


Nate the Great(Posted 2009) [#22]
I just posted because I am having bottlenecks in speed... I think it's fixed now though. one more question: what are bitwise operations and how do I use them to be more efficient than just normal math? ie + - / *

@ Imaginary Human - thanks! thats quite a bit of writing and its very useful!


ImaginaryHuman(Posted 2009) [#23]
Bitwise operations are operations which you perform at the level of binary rather than using base-10 numbers. Usually it's where you do things like set or clear or toggle one or several bits, and you operate on each bit separately. You can also shift the bits left or right and have them wrap around to the other side.

As you know, base-10 numbers have columns which go:

1000's, 100's, 10's, 1's

So for example the number 3456 comprises 3 thousands, 4 hundreds, 5 tens and 6 ones. In binary the column values are different, they go up in powers of 2, and the value in each column can only be 0 or 1 rather than the range 0 to 9. eg:

1024, 512, 256, 128, 64, 32, 16, 8, 4, 2, 1

So a number like 1816 would be 11100011000. Base 2 (binary) means the number in each column goes from 0 to 1 (2 states), rather than 0 to 7 (octal), or 0 to 9 (decimal), or 0 to 15 (Hexadecimal).

Okay, so then if you want to modify those `bits` somehow, you would use the `and`, `or`, `not` and `xor` operations, eg:

a & b 'a AND b
a | b 'a OR b
~a 'Not a
a ~ b 'a XOR b


AND: Bits which are 1 in A, AND 1 in B, will be set to 1, otherwise it's cleared to 0

OR: If a bit is set in A, OR the bit is set in B, then the bit is set to 1, otherwise it's cleared to 0

NOT: If a bit in A is a 1 it is set to 0, otherwise it is set to 1 (all the bits are inverted)

XOR: If a bit is set in A but is NOT set in B, or the bit is not set in A but is set in B, the bit is set to 1 otherwise it is cleared to 0. In other words, one of the two bits can be a 1 but not both - the set bit has to exist `exclusively` in A OR B but not both.

You usually use AND to say which bits you want to keep and which ones will be set to 0, like a mask. For example, if you have an RGBA pixel value and you want to isolate the red component, it would be MyRGBA:& $FF000000

You usually use OR to set bits to a `1` state regardless of what they were before. You can use that to set `bit flags`, e.g. you might have 32 flags corresponding to which of 32 sprites are visible, ie Visibility:| %10000000000000000000000000000000 would set the 32nd sprite to be visible.

You usually use NOT to invert bits, ie to toggle them to whatever state they are currently not in, like flicking a switch on and off. If you had a sprite enemy which gets dazed when you jump on its head, but wakes up again when you jump on its head a second time, you could toggle a bit between `dazed` and `awake`, 1 or 0.

You usually use XOR for strange things, like if you draw a hollow rectangle on the screen on top of other graphics, using XOR logic operation on the pixel values, it changes the pixels in such a way that when you XOR over the top of the same pixels they revert back to their original colors. XOR is reversible. Some people use it for basic encryption.

None of these are particularly useful for doing faster math. You can use AND to mask off some lower bits, which has the effect of rounding a number down to a multiple of a power of 2. I suppose XOR'ing the lowest bit would toggle it between odd and even. There's not much else I can think of.

For math you should look at Shl and Shr - Shift Left and Shift Right. If you Shift a value left by 1 bit it multiplies it by 2. The more bits you shift it the more powers of 2 it multiplies by. e.g. %1100 Shl 2 becomes %110000 which is 4 times bigger. So if you are multiplying or dividing a number by a power of 2 (ie 2, 4, 8, 16, 32 etc), using a shift is WAY faster than doing multiplication - but it only works on integers.
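A quick demonstration of these operations, in Python syntax for illustration (Python's `&`, `|`, `^`, `<<`, `>>` correspond to BlitzMax's `&`, `|`, `~`, `Shl`, `Shr`; the RGBA value here is made up):

```python
rgba = 0x11223344

# AND as a mask: keep only the top (red) byte of an RGBA value
red = rgba & 0xFF000000
print(hex(red))            # 0x11000000

# OR sets a bit, XOR toggles it
flags = 0
flags |= 1 << 5            # set flag 5
flags ^= 1 << 5            # toggle it straight back off
print(flags)               # 0

# Shifts multiply/divide by powers of two (integers only)
x = 0b1100                 # 12
print(x << 2, x >> 1)      # 48 6  (%110000 is four times bigger)
```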

There are many other more advanced things you can do with binary and bits, see the bit-twiddling hacks page:

http://graphics.stanford.edu/~seander/bithacks.html

+ and - are such simple binary operations you aren't going to speed them up. Multiply is sped up by Shl, Divide is sped up by Shr, but that's about it.

Sometimes, algorithmically speaking, there are things you would do in an algorithm where you might have to do lots of Ifs and Elses etc, but for which there are bit manipulations that produce the same result without branches, and are thus faster - see the bit twiddling page above. What that means is you do some kind of `processing` of the bits based on simple operations like those above, which change the state of the bits in such a way as to produce the result you want, without having to make any program-flow decisions.

Bitmasks also save a lot of memory space if you want to store 32 bit conditions in one integer rather than 32 integers (4 bytes versus 128) which can be accessed faster from memory.


SLotman(Posted 2009) [#24]

~a 'Not a
a ~ b 'a XOR b



I don't get it. Why use "~" for not, when there was the commonly used !a in C/C++?

Even worse, using the same symbol for XOR... that has to confuse people, even (maybe even more) if they have a lot of background in other languages!

This seems to be a general problem with almost all programming languages... why create symbols for bitwise operations, instead of plainly using keywords for them, and thus making the code way more readable and easier to understand?

I doubt that "!(a >> 2) % 2" is easier to read than "Not (a Shr 2) Mod 2" :P


marksibly(Posted 2009) [#25]
> why create symbols for bitwise operations, instead of plain using
> keywords for them

Well, the thing is what to call them.

'Not' and ~ actually perform different operations.

Ditto, 'And' and &, and 'Or' and | are not the same.

I've seen languages that use 'Band' and 'Bor' ('B' for 'bitwise'), but I don't really like 'made up' words in languages.

In the end, I decided to simply make them C-like. I decided to reuse '~' for 'bitwise not' because it was sort of symmetrical with '-' (eg: negate;minus and invert;eor), and saved a token.

Perhaps Band, Bor would have been better? Or even 'BitwiseAnd', 'BitwiseOr' and 'BitwiseNot'?

Still not sure what I prefer out of all the above to be honest!


TeraBit(Posted 2009) [#26]
Personally I'm not that keen on symbols for these things either. I do take your point about made up words though, as these can be just as difficult to scan / remember. Compound words are a likely compromise:

BitAnd
BitNot
BitOr
BitXor

etc.

(Prepending 'BitWise' was a 'bit' long (pun intended) and 'Band' could be misinterpreted as something else)


dmaz(Posted 2009) [#27]
Lists have been getting an undeserved bum rap on these boards, even though iterating through an array is not inherently faster than iterating through a list. Actually, *given everything the same*, lists should be very slightly faster.

Random access is the best thing you get from an array, and it should be the core reason to use one over a list. Now, not all list collections are created equal, but a singly linked list will perform the same as an array for iteration, while insertions and expansion of a list can be orders of magnitude better than the array.

here's is a test...
http://www.blitzbasic.com/Community/posts.php?topic=76706#858425
remember to always compile in release mode.

*what I mean by everything the same... if you are testing performance, the objects should be preallocated in a block.


dmaz(Posted 2009) [#28]
personally, I like the Max's syntax very much.... that includes the symbols but if I had to choose between BitwiseAnd, BitAnd and BAnd, I'd go with BitAnd.


Nate the Great(Posted 2009) [#29]
Imaginary Human - ooo thats a neat website and definitely worth a look. thanks


ImaginaryHuman(Posted 2009) [#30]
I generally feel that arrays will still always be faster to use than a linked list. Each time you add a linked list element you have to allocate memory, whereas with a pre-sized array the memory allocation only occurs when you want to add a significant chunk of space.

I will grant you that iterating through the array/list will possibly be similar speed. That's because a) arrays require you to read the base pointer of the array from memory first and to then read the contents, while b) lists require you to read the pointer to the next link from memory first and to then get the object. So each has an indirect pointer read making iteration potentially the same speed. One issue though is the cache - sequential array data is better for the cache than randomly scattered fragmented memory of a linked list.

Also, btw, using pointers with allocated memory (not an array or bank) can be a little faster even than arrays and lists because you no longer have an indirect jump to get to the base of the array memory. You can just keep the base pointer in a local variable and index it.

As to the syntax, I vote that it stay as it is - &|~ makes simple sense and I like them being similar in length to +-*/ - typing `BitAnd` seems a bit silly.

Slotman, you asked why `~` is used as the symbol for NOT as well as for XOR. Well, you may not know this, but if you XOR a variable with another value which has all its bits set to 1, e.g. $B4 ~ $FF, it performs a NOT operation. Due to every bit in $FF being set, every bit in $B4 is toggled - giving the same result as an inverse. So doing A ~ $FF is the same as doing `Not A`, so really NOT (as ~) is just an abbreviated way of writing it. So in a way it does make sense to use ~ for both XOR'ing and NOT'ing.

If you specify both variables, e.g. A ~ B, those two specific values are XOR'd. If you only specify the second variable, e.g. ~B, then A is assumed to be $FFFFFFFF. Makes sense to me.
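That equivalence is easy to check - here in Python for illustration, masking to 32 bits since Python integers are unbounded:

```python
MASK = 0xFFFFFFFF          # 32 bits, all set to 1

a = 0xB4
via_xor = a ^ MASK         # XOR against all-ones toggles every bit...
via_not = ~a & MASK        # ...which is exactly a 32-bit NOT
print(hex(via_xor))        # 0xffffff4b
print(via_xor == via_not)  # True
```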


Oddball(Posted 2009) [#31]
I switched PhysLite from using lists to managed arrays and saw a large speed increase. It's a little bit more work at the back end, but when it means you can throw around a few extra hundred objects it's more than worth it.

Here's an example of what I mean by managed arrays.


Then you can iterate through them quickly by doing something like this
For Local index:Int=0 Until TMyType.aNext
	If TMyType.array[index]
		'Do stuff
	EndIf
Next
Obviously there is very little speed gained by doing this when dealing with small collections, but once you're getting into the thousands and adding in nested loops that's when you'll see the advantage.


dmaz(Posted 2009) [#32]
@Oddball,
sorry, no offense, but a list would be more efficient; your slowdown must have come from your implementation of a list. If I read this code correctly, this example creates objects at the first available spot in the array, but on freeing an object it just marks the spot as free. This means you can actually have a situation where you are iterating through a whole mess of empty spots before you get to a valid one... that's something that would never happen in a list.

You would be better off using a singly linked list... in addition to being more efficient, the code would be much cleaner.

I want to point out that when I say list, I'm not referring to Blitz's TList. TList is a generic cyclic doubly linked list that stores any type, or even a combination of types. It has a lot of features, but all those features are a trade-off for a little speed.

[edit]still for most things... blitz's tlist is fine... I think we are all talking about very large amounts of objects


Nate the Great(Posted 2009) [#33]
hmmm I see your point oddball. Ill play with that idea as well
Physlite is actually what inspired me to convert my b3d physics engines into blitz max.

@dmaz - I think oddball has a point considering his physics engine is twice as fast as mine... I get 200 boxes before it slows down in the physlite demos as opposed to 100 boxes with my engine.


ImaginaryHuman(Posted 2009) [#34]
dmaz if you use a self defragmenting array then you still only need to iterate through the objects which are defined and not the whole array.


Nate the Great(Posted 2009) [#35]
would it work just as fast as an array if I have each verlet have a pointer to the next one? that seems to be what your example is doing oddball


markcw(Posted 2009) [#36]
I don't get it. Why using "~" for not, when there was the so common used !a on C/C++?

Because ! is used for doubles.

If you had ! then you'd have confusion with doubles assignment..

!a 'Not a
a ~ b 'a XOR b
Local mydoub!=1.0

So the current form is less confusing, to me anyway.


dmaz(Posted 2009) [#37]
dmaz if you use a self defragmenting array then you still only need to iterate through the objects which are defined and not the whole array.

sure but Oddball's example isn't defragmenting... that example uses the stack index just to store the free locations. now if you were to defragment then yes but is defragmenting free? no, especially on release of the object... right? if you are going to go through all that trouble and code, why not just use a list especially since there is no speed gain...?

would it work just as fast as an array if I have each verlet have a pointer to the next one?
that's a list.

that seems to be what your example is doing oddball
why do you think that?


dmaz(Posted 2009) [#38]
ok, I have to apologize... after a lot of testing it seems AMD athlons perform differently with BMax with array iteration coming out well ahead.

see this
http://www.blitzbasic.com/Community/posts.php?topic=83750#945222


Nate the Great(Posted 2009) [#39]
so say I have a maximum of 800 objects in a list (kind of overshooting) but I need to do collision checking between each object, so that's 640,000 collision checks. would the following be a good system to use for arrays?

if I have an array like in oddball's example limited to 800

keep track of number of objects

if an object is added, add it to the end of the array

if an object is deleted, shift every object after it down one, which would take less than a millisec with 800 or less


xlsior(Posted 2009) [#40]
640,000 collision checks are going to be murder on your PC regardless of what system you use...


Nate the Great(Posted 2009) [#41]
160,000 didnt murder it so I would assume 640,0... oh nevermind yeah I guess it would :p unless of course you use the special NTG collision checking system (which unfortunately doesnt exist yet)


ImaginaryHuman(Posted 2009) [#42]
Interesting dmaz.

Nate the Great - a self-defragmenting array would be more efficient if you don't mind the order in which your objects are stored in the array. Add new objects to the end. When you remove one (randomly), copy the last object over the top of the one you're removing. Then you don't have any gaps and you don't have to shift everything above that object.

As to collision detection, you MUST use an acceleration structure like bounding boxes or grids to radically reduce the number of collisions you need to test. You do NOT want to test every single object against every other object with brute force, it'll grind to a halt. Instead, use at least a grid, where you only have to test the objects which are located locally within a grid cell plus perhaps a few surrounding cells. That majorly cuts down the number of combinations. Or in other words, you need a better algorithm.
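A rough sketch of the grid idea in Python (the field size, cell size, and names are all made up for illustration; the point is how few candidate pairs survive compared with the 800*799/2 = 319,600 brute-force pairs):

```python
import random

# Hypothetical setup: N point objects on a W x W field, bucketed into
# CELL-sized grid cells; only objects in the same cell or an adjacent
# cell are candidate collision pairs.
N, W, CELL = 800, 1000, 50
GRID = W // CELL

random.seed(1)
points = [(random.uniform(0, W), random.uniform(0, W)) for _ in range(N)]

cells = {}
for i, (x, y) in enumerate(points):
    cells.setdefault((int(x) // CELL, int(y) // CELL), []).append(i)

brute = N * (N - 1) // 2        # every unordered pair: 319,600
narrow = 0
for (cx, cy), members in cells.items():
    # pairs inside this cell
    narrow += len(members) * (len(members) - 1) // 2
    # pairs against four "forward" neighbours, so each pair counts once
    for nx, ny in ((cx + 1, cy), (cx, cy + 1),
                   (cx + 1, cy + 1), (cx - 1, cy + 1)):
        narrow += len(members) * len(cells.get((nx, ny), []))

print(brute, narrow)   # narrow is a tiny fraction of brute
```

With 800 objects spread over 400 cells, the grid leaves only a few thousand candidate pairs instead of 319,600 - that is the "better algorithm" doing the work, before any micro-optimization.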


Nate the Great(Posted 2009) [#43]
@Imaginary Human - exactly what I am doing at the moment :) you read my mind... or at least my screen :p sorry for the weird jokes.. its late

@ your second paragraph
yeah I already have done a lot of those and it works beautifully


Nate the Great(Posted 2009) [#44]
so a quick question.. is it better to use a for next loop for cycling through the array or a for eachin?


TaskMaster(Posted 2009) [#45]
800 objects should not take 640,000 checks. You are doing things wrong.

Object 1 checks against 2-800
Object 2 checks against 3-800
Object 3 checks against 4-800
etc...

2 was already checked against 1...
3 was already checked against 1 and 2...
etc...
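In code form (Python, for illustration), the triangular loop above checks each unordered pair exactly once:

```python
def count_pair_checks(n):
    # Object i only checks objects i+1 .. n-1: each pair tested once.
    checks = 0
    for i in range(n):
        for j in range(i + 1, n):
            checks += 1
    return checks

print(count_pair_checks(800))   # 319600 - n*(n-1)/2, not n*n = 640000
```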


Nate the Great(Posted 2009) [#46]
@taskmaster - true, but it becomes unstable that way. It is much more stable with 640,000


JoshK(Posted 2009) [#47]
so say I have a maximum of 800 objects in a list (kind of overshooting) but I need to do collision checking between each object so thats 640,000 collision checks so would the following be a good system to use for arrays?

And you think C++ will be fast at performing an n*n process?


dmaz(Posted 2009) [#48]
so a quick question.. is it better to use a for next loop for cycling through the array or a for eachin?
you will not want to use EachIn... it will iterate through the whole array looking for valid objects, so you will need to keep the index position of your last object (unless you create a custom enumerator)

@Imaginary Human
Nate the Great - a self defragmenting array would be more efficient if you don't mind the order in which your objects are stored in the array. Add new objects to the end. When you remove one (randomly) copy the last object over the top of the one you're removing. Then you don't have any gaps and you don't have shift everything above that object.
if you do that you may have issues because you may remove one from a position that has already been processed. so in effect, you could miss processing any objects that you pulled from the end. I guess you could either queue up your removals or better yet, immediately process the one you pulled from the end?


ImaginaryHuman(Posted 2009) [#49]
That's the idea dmaz, when you need to remove an object, you immediately copy the object from the end of the array over the top of the one you're trying to remove, and decrement the count by 1. There is no danger to it. You still proceed through the array in order until you get to the end.
You can even be removing objects at the same time as reading the array to draw objects, so long as you don't go back on yourself and remove something from an index prior to where you already got to.

I would recommend not using EachIn except with lists, and only where the list doesn't have any gaps - ie no links holding a Null object.


dmaz(Posted 2009) [#50]
but with Nate's, he's checking each against everything currently... so it could result in removing not the current item but a previous item. so if you are processing item 9 and you want to remove item 2, then you can't fill that with the last one..?


Nate the Great(Posted 2009) [#51]
hey oddball your code above works! and doubled the speed of my engine. thanks

so it could result in removing not the current item but a previous item. so if you are processing item 9 and you want to remove item 2 then you can't fill that with the last one..?



no, it works, I tried it. I won't be processing verlets and removing them at the same time