Interesting speed test

BlitzMax Forums/BlitzMax Programming/Interesting speed test

Jake L.(Posted 2007) [#1]
I've tested two different versions of LinesCollide. The method is pretty the same (solving line-equatations). Version 1 (LinesCollide) calculates everything and then finally determine if the lines collide.

Version 2 (LinesCollide2) tries to calculate only the neccessary to leave the function as fast as possible before everything is calculated. Less calculation, more if-then-else.

I've called both versions with 500k random lines (same seed for both), and LinesCollide (=calc everything, only one if) was always faster (750ms against 790ms for LinesCollide2). That's with debug off.

I personally thought that version 2 would be faster, at least with debug off.

If you want to try by yourself:




ImaginaryHuman(Posted 2007) [#2]
Interesting. I guess sometimes in order to implement a special case of something requires extra overhead. Sometimes it takes longer to ask yourself `if` and to take a branch, than to just do something straight away. Your second line program is obviously quite a bit larger.


H&K(Posted 2007) [#3]
Deleted. Full code posted later


Derron(Posted 2007) [#4]
Don't blame the methods...
Most of the time is spend on randomizing the numbers...

So the code posted above isn't really usable to compare effectivity.


bye
MB


Perturbatio(Posted 2007) [#5]


Method 1 is always faster for me, H&K's occasionally is slower, but never faster than method 2


H&K(Posted 2007) [#6]
Don't blame the methods...
Most of the time is spend on randomizing the numbers...
You are of course correct


Method 1 is always faster for me, H&K's occasionally is slower, but never faster than method 2
Well on mine, mine is twice as fast as method 2, but never faster than method 1
The Rnds were the major bottle neck. Test the one Ive just posted. It runs until "Space Down"


Perturbatio(Posted 2007) [#7]
Method 1: 93866 colisions found in 62 ms
Method 2: 93866 colisions found in 87 ms
Method 3: 93866 colisions found in 96 ms
Method 1: 93773 colisions found in 60 ms
Method 2: 93773 colisions found in 87 ms
Method 3: 93773 colisions found in 96 ms
Method 1: 93680 colisions found in 60 ms
Method 2: 93680 colisions found in 87 ms
Method 3: 93680 colisions found in 96 ms
Method 1: 93500 colisions found in 63 ms
Method 2: 93500 colisions found in 87 ms
Method 3: 93500 colisions found in 96 ms
Method 1: 93566 colisions found in 61 ms
Method 2: 93566 colisions found in 87 ms
Method 3: 93566 colisions found in 96 ms
Method 1: 93830 colisions found in 60 ms
Method 2: 93830 colisions found in 87 ms
Method 3: 93830 colisions found in 96 ms
Method 1: 93749 colisions found in 62 ms
Method 2: 93749 colisions found in 87 ms
Method 3: 93749 colisions found in 96 ms
Method 1: 94207 colisions found in 62 ms
Method 2: 94207 colisions found in 87 ms
Method 3: 94207 colisions found in 96 ms



H&K(Posted 2007) [#8]
Method 1: 116225 colisions found in 221 ms
Method 2: 116225 colisions found in 455 ms
Method 3: 116225 colisions found in 264 ms

Method 1: 115406 colisions found in 238 ms
Method 2: 115406 colisions found in 470 ms
Method 3: 115406 colisions found in 270 ms

Method 1: 115792 colisions found in 220 ms
Method 2: 115792 colisions found in 453 ms
Method 3: 115792 colisions found in 263 ms

Method 1: 115383 colisions found in 219 ms
Method 2: 115383 colisions found in 466 ms
Method 3: 115383 colisions found in 271 ms

Method 1: 115599 colisions found in 220 ms
Method 2: 115599 colisions found in 453 ms
Method 3: 115599 colisions found in 265 ms

Method 1: 115614 colisions found in 221 ms
Method 2: 115614 colisions found in 461 ms
Method 3: 115614 colisions found in 277 ms

Method 1: 115886 colisions found in 219 ms
Method 2: 115886 colisions found in 448 ms
Method 3: 115886 colisions found in 272 ms

Method 1: 115507 colisions found in 225 ms
Method 2: 115507 colisions found in 467 ms
Method 3: 115507 colisions found in 278 ms
HUmmmmmm. Can we have a few more

Acer Aspire 3000 winXP Home SP2


grable(Posted 2007) [#9]
Method 1: 115783 colisions found in 46 ms
Method 2: 115783 colisions found in 79 ms
Method 3: 115783 colisions found in 74 ms

Method 1: 115762 colisions found in 48 ms
Method 2: 115762 colisions found in 70 ms
Method 3: 115762 colisions found in 85 ms

Method 1: 115269 colisions found in 47 ms
Method 2: 115269 colisions found in 67 ms
Method 3: 115269 colisions found in 76 ms

Method 1: 115647 colisions found in 48 ms
Method 2: 115647 colisions found in 69 ms
Method 3: 115647 colisions found in 78 ms

Method 1: 115558 colisions found in 47 ms
Method 2: 115558 colisions found in 68 ms
Method 3: 115558 colisions found in 76 ms

Method 1: 115874 colisions found in 48 ms
Method 2: 115874 colisions found in 68 ms
Method 3: 115874 colisions found in 75 ms

Method 1: 115545 colisions found in 48 ms
Method 2: 115545 colisions found in 69 ms
Method 3: 115545 colisions found in 75 ms

Method 1: 115689 colisions found in 47 ms
Method 2: 115689 colisions found in 70 ms
Method 3: 115689 colisions found in 79 ms

Athlon64 3400+ Windows XP SP2 (32bit)


SpaceAce(Posted 2007) [#10]
Method 1: 115153 colisions found in 47 ms
Method 2: 115153 colisions found in 66 ms
Method 3: 115153 colisions found in 74 ms
Method 1: 115393 colisions found in 47 ms
Method 2: 115393 colisions found in 66 ms
Method 3: 115393 colisions found in 74 ms
Method 1: 115747 colisions found in 46 ms
Method 2: 115747 colisions found in 67 ms
Method 3: 115747 colisions found in 74 ms
Method 1: 116007 colisions found in 47 ms
Method 2: 116007 colisions found in 67 ms
Method 3: 116007 colisions found in 74 ms
Method 1: 115949 colisions found in 51 ms
Method 2: 115949 colisions found in 67 ms
Method 3: 115949 colisions found in 74 ms
Method 1: 116075 colisions found in 46 ms
Method 2: 116075 colisions found in 68 ms
Method 3: 116075 colisions found in 74 ms
Method 1: 115516 colisions found in 46 ms
Method 2: 115516 colisions found in 69 ms
Method 3: 115516 colisions found in 74 ms
Method 1: 115772 colisions found in 46 ms
Method 2: 115772 colisions found in 68 ms
Method 3: 115772 colisions found in 74 ms
Method 1: 115824 colisions found in 46 ms
Method 2: 115824 colisions found in 67 ms
Method 3: 115824 colisions found in 74 ms
Method 1: 115375 colisions found in 47 ms
Method 2: 115375 colisions found in 66 ms
Method 3: 115375 colisions found in 75 ms

AMD Athlon 64 X2 4600+, 2 gigs of RAM, Windows XP Home Edition with Service Pack 2

SpaceAce


Torrente(Posted 2007) [#11]
Method 1: 115909 colisions found in 55 ms
Method 2: 115909 colisions found in 80 ms
Method 3: 115909 colisions found in 88 ms
Method 1: 115636 colisions found in 56 ms
Method 2: 115636 colisions found in 79 ms
Method 3: 115636 colisions found in 87 ms
Method 1: 115657 colisions found in 55 ms
Method 2: 115657 colisions found in 79 ms
Method 3: 115657 colisions found in 88 ms
Method 1: 115406 colisions found in 56 ms
Method 2: 115406 colisions found in 79 ms
Method 3: 115406 colisions found in 87 ms
Method 1: 115461 colisions found in 55 ms
Method 2: 115461 colisions found in 79 ms
Method 3: 115461 colisions found in 88 ms
Method 1: 115402 colisions found in 55 ms
Method 2: 115402 colisions found in 79 ms
Method 3: 115402 colisions found in 87 ms
Method 1: 115741 colisions found in 56 ms
Method 2: 115741 colisions found in 79 ms
Method 3: 115741 colisions found in 88 ms
Method 1: 116169 colisions found in 56 ms
Method 2: 116169 colisions found in 79 ms
Method 3: 116169 colisions found in 88 ms
Method 1: 115926 colisions found in 55 ms
Method 2: 115926 colisions found in 78 ms
Method 3: 115926 colisions found in 87 ms
Method 1: 115758 colisions found in 57 ms
Method 2: 115758 colisions found in 78 ms
Method 3: 115758 colisions found in 87 ms
Method 1: 116293 colisions found in 55 ms
Method 2: 116293 colisions found in 78 ms
Method 3: 116293 colisions found in 88 ms


AMD Athlon 64 X2 3800+, 1G RAM, Windows XP Service Pack 2

Oh, btw, you spelled collisions wrong ;)


Jake L.(Posted 2007) [#12]
Sure, Rnd takes the most time, but that doesn't matter as it should need the same time for both methods, so the result is compareable again.

Jake

PS: Torrente, you're right, here's the missing letter: l ;)


LarsG(Posted 2007) [#13]
Method 1: 116066 colisions found in 52 ms
Method 2: 116066 colisions found in 79 ms
Method 3: 116066 colisions found in 109 ms

Method 1: 115853 colisions found in 52 ms
Method 2: 115853 colisions found in 81 ms
Method 3: 115853 colisions found in 112 ms

Method 1: 115801 colisions found in 51 ms
Method 2: 115801 colisions found in 79 ms
Method 3: 115801 colisions found in 111 ms

Method 1: 116252 colisions found in 53 ms
Method 2: 116252 colisions found in 80 ms
Method 3: 116252 colisions found in 110 ms

Method 1: 115178 colisions found in 51 ms
Method 2: 115178 colisions found in 80 ms
Method 3: 115178 colisions found in 109 ms

Method 1: 115494 colisions found in 53 ms
Method 2: 115494 colisions found in 81 ms
Method 3: 115494 colisions found in 111 ms

Method 1: 115880 colisions found in 52 ms
Method 2: 115880 colisions found in 79 ms
Method 3: 115880 colisions found in 110 ms

Method 1: 116172 colisions found in 52 ms
Method 2: 116172 colisions found in 80 ms
Method 3: 116172 colisions found in 109 ms

Method 1: 115674 colisions found in 51 ms
Method 2: 115674 colisions found in 80 ms
Method 3: 115674 colisions found in 111 ms

Method 1: 115286 colisions found in 51 ms
Method 2: 115286 colisions found in 80 ms
Method 3: 115286 colisions found in 110 ms


with a fair amount of apps running on the compy..


H&K(Posted 2007) [#14]
Sure, Rnd takes the most time, but that doesn't matter as it should need the same time for both methods, so the result is compareable again.
Well, I was going to post an answer like that, but then I realised that it does make a differance. For wereas the results are comparible, the % differance is greater.
So I agree with you that the fastest will still be the fastest, but it will be by a greater amount. eg You had Method1 as about 6% faster. With the edited program is it not now closer to 80% faster? (LarsG Has it as 120%Faster)


Jake L.(Posted 2007) [#15]
Who cares about how much faster one method is about another? No offense, but unless I (or anyone) finds a method faster than method 1 (which seems to be the fastest at the moment), this is the method to use, right?

Well, maybe it could be interesting to have a general guide which basic operations (+-*/, if-then-else, select-case, etc..) takes more time than others. With such a guide one could code faster functions more easy. On the other side I don't profile most highlevel-functions. Linechecking and other essential stuff called a hundred times each frame are very special cases and the only ones worth for optimization imho.

In fact I'm excited how fast this little function is. I thought about splitting concave polygons into convex ones for faster collision checking. Now I see no need for this step. Should be fast enough for my project when doing AABB-check prior. More time for other things...

Jake

PS: What about starting a competition for "usual tasks"-functions like different collisions, basic physics-calculations and such? I'm sure we find a lot of common functions everyone of us use, so having a FAQ/Howto with the fastest way to do it in BM would be nice, wouldn't it?


H&K(Posted 2007) [#16]
Who cares about how much faster one method is about another?
Well you. As proven by
No offense, but unless I (or anyone) finds a method faster than method 1 (which seems to be the fastest at the moment), this is the method to use, right?


What was interesting is that Method 2 is only 50% of the speed of Method three in DEBUG, but about 110% of the speed in release. Somthing which I didnt think would happen.


Jake L.(Posted 2007) [#17]
That's indeed interesting, not to say a little bit weird.

P.S: I meant the %-difference with "how much faster than another"...


zzz(Posted 2007) [#18]
Some results from me.



Thats not enough code in the linescollide function(s) to justify branching for optimization at all, and the use of abs/max/min slows it down even more.

Having said that, im not sure if you can manage without the latter.


zzz(Posted 2007) [#19]
Shaved of a few milliseconds.



And the code.



I noticed that between blitzmax 118 and 124, method 2 and 3 were one or two milliseconds faster on 124, but method 4 was 10! milliseconds faster on 118.


H&K(Posted 2007) [#20]
Ok bare in mind that Im rubbish at C++, but I had a go anyway, just Method1

Note 1: As you may be able to tell, its using DarkGDK, and The free VC8 express. And possibly someone who know what they are doing could improve it.
Note 2:I got it going with a Low Array size, and then it stopped working with the larger size. Ok so it was just nessesary to increase the Stack size, but Ive never had to do anything like that with BMax
Note 3:I dont think that I compiled it as Managed C++, and am almost certain that I ran it as core C++

DeBug
116109     211
115805     223
115598     198
115891     199
115433     199
115914     204
Max, 271,274,276,270,269

Release
115598     161
115878     156
116235     159
116088     157
115766     157
115657     158
Max, 70,65,65,65,64

I dont think Ive made the code any differently in the C version, but if anyone want to point out a bottleneck Ive introduced please do.