Interesting speed test
BlitzMax Forums/BlitzMax Programming/Interesting speed test
| ||
I've tested two different versions of LinesCollide. The method is pretty the same (solving line-equatations). Version 1 (LinesCollide) calculates everything and then finally determine if the lines collide. Version 2 (LinesCollide2) tries to calculate only the neccessary to leave the function as fast as possible before everything is calculated. Less calculation, more if-then-else. I've called both versions with 500k random lines (same seed for both), and LinesCollide (=calc everything, only one if) was always faster (750ms against 790ms for LinesCollide2). That's with debug off. I personally thought that version 2 would be faster, at least with debug off. If you want to try by yourself: |
| ||
Interesting. I guess sometimes in order to implement a special case of something requires extra overhead. Sometimes it takes longer to ask yourself `if` and to take a branch, than to just do something straight away. Your second line program is obviously quite a bit larger. |
| ||
Deleted. Full code posted later |
| ||
Don't blame the methods... Most of the time is spend on randomizing the numbers... So the code posted above isn't really usable to compare effectivity. bye MB |
| ||
Method 1 is always faster for me, H&K's occasionally is slower, but never faster than method 2 |
| ||
Don't blame the methods... You are of course correctMost of the time is spend on randomizing the numbers... Method 1 is always faster for me, H&K's occasionally is slower, but never faster than method 2 Well on mine, mine is twice as fast as method 2, but never faster than method 1The Rnds were the major bottle neck. Test the one Ive just posted. It runs until "Space Down" |
| ||
Method 1: 93866 colisions found in 62 ms Method 2: 93866 colisions found in 87 ms Method 3: 93866 colisions found in 96 ms Method 1: 93773 colisions found in 60 ms Method 2: 93773 colisions found in 87 ms Method 3: 93773 colisions found in 96 ms Method 1: 93680 colisions found in 60 ms Method 2: 93680 colisions found in 87 ms Method 3: 93680 colisions found in 96 ms Method 1: 93500 colisions found in 63 ms Method 2: 93500 colisions found in 87 ms Method 3: 93500 colisions found in 96 ms Method 1: 93566 colisions found in 61 ms Method 2: 93566 colisions found in 87 ms Method 3: 93566 colisions found in 96 ms Method 1: 93830 colisions found in 60 ms Method 2: 93830 colisions found in 87 ms Method 3: 93830 colisions found in 96 ms Method 1: 93749 colisions found in 62 ms Method 2: 93749 colisions found in 87 ms Method 3: 93749 colisions found in 96 ms Method 1: 94207 colisions found in 62 ms Method 2: 94207 colisions found in 87 ms Method 3: 94207 colisions found in 96 ms |
| ||
Method 1: 116225 colisions found in 221 ms Method 2: 116225 colisions found in 455 ms Method 3: 116225 colisions found in 264 ms Method 1: 115406 colisions found in 238 ms Method 2: 115406 colisions found in 470 ms Method 3: 115406 colisions found in 270 ms Method 1: 115792 colisions found in 220 ms Method 2: 115792 colisions found in 453 ms Method 3: 115792 colisions found in 263 ms Method 1: 115383 colisions found in 219 ms Method 2: 115383 colisions found in 466 ms Method 3: 115383 colisions found in 271 ms Method 1: 115599 colisions found in 220 ms Method 2: 115599 colisions found in 453 ms Method 3: 115599 colisions found in 265 ms Method 1: 115614 colisions found in 221 ms Method 2: 115614 colisions found in 461 ms Method 3: 115614 colisions found in 277 ms Method 1: 115886 colisions found in 219 ms Method 2: 115886 colisions found in 448 ms Method 3: 115886 colisions found in 272 ms Method 1: 115507 colisions found in 225 ms Method 2: 115507 colisions found in 467 ms Method 3: 115507 colisions found in 278 msHUmmmmmm. Can we have a few more Acer Aspire 3000 winXP Home SP2 |
| ||
Method 1: 115783 colisions found in 46 ms Method 2: 115783 colisions found in 79 ms Method 3: 115783 colisions found in 74 ms Method 1: 115762 colisions found in 48 ms Method 2: 115762 colisions found in 70 ms Method 3: 115762 colisions found in 85 ms Method 1: 115269 colisions found in 47 ms Method 2: 115269 colisions found in 67 ms Method 3: 115269 colisions found in 76 ms Method 1: 115647 colisions found in 48 ms Method 2: 115647 colisions found in 69 ms Method 3: 115647 colisions found in 78 ms Method 1: 115558 colisions found in 47 ms Method 2: 115558 colisions found in 68 ms Method 3: 115558 colisions found in 76 ms Method 1: 115874 colisions found in 48 ms Method 2: 115874 colisions found in 68 ms Method 3: 115874 colisions found in 75 ms Method 1: 115545 colisions found in 48 ms Method 2: 115545 colisions found in 69 ms Method 3: 115545 colisions found in 75 ms Method 1: 115689 colisions found in 47 ms Method 2: 115689 colisions found in 70 ms Method 3: 115689 colisions found in 79 ms Athlon64 3400+ Windows XP SP2 (32bit) |
| ||
Method 1: 115153 colisions found in 47 ms Method 2: 115153 colisions found in 66 ms Method 3: 115153 colisions found in 74 ms Method 1: 115393 colisions found in 47 ms Method 2: 115393 colisions found in 66 ms Method 3: 115393 colisions found in 74 ms Method 1: 115747 colisions found in 46 ms Method 2: 115747 colisions found in 67 ms Method 3: 115747 colisions found in 74 ms Method 1: 116007 colisions found in 47 ms Method 2: 116007 colisions found in 67 ms Method 3: 116007 colisions found in 74 ms Method 1: 115949 colisions found in 51 ms Method 2: 115949 colisions found in 67 ms Method 3: 115949 colisions found in 74 ms Method 1: 116075 colisions found in 46 ms Method 2: 116075 colisions found in 68 ms Method 3: 116075 colisions found in 74 ms Method 1: 115516 colisions found in 46 ms Method 2: 115516 colisions found in 69 ms Method 3: 115516 colisions found in 74 ms Method 1: 115772 colisions found in 46 ms Method 2: 115772 colisions found in 68 ms Method 3: 115772 colisions found in 74 ms Method 1: 115824 colisions found in 46 ms Method 2: 115824 colisions found in 67 ms Method 3: 115824 colisions found in 74 ms Method 1: 115375 colisions found in 47 ms Method 2: 115375 colisions found in 66 ms Method 3: 115375 colisions found in 75 ms AMD Athlon 64 X2 4600+, 2 gigs of RAM, Windows XP Home Edition with Service Pack 2 SpaceAce |
| ||
Method 1: 115909 colisions found in 55 ms Method 2: 115909 colisions found in 80 ms Method 3: 115909 colisions found in 88 ms Method 1: 115636 colisions found in 56 ms Method 2: 115636 colisions found in 79 ms Method 3: 115636 colisions found in 87 ms Method 1: 115657 colisions found in 55 ms Method 2: 115657 colisions found in 79 ms Method 3: 115657 colisions found in 88 ms Method 1: 115406 colisions found in 56 ms Method 2: 115406 colisions found in 79 ms Method 3: 115406 colisions found in 87 ms Method 1: 115461 colisions found in 55 ms Method 2: 115461 colisions found in 79 ms Method 3: 115461 colisions found in 88 ms Method 1: 115402 colisions found in 55 ms Method 2: 115402 colisions found in 79 ms Method 3: 115402 colisions found in 87 ms Method 1: 115741 colisions found in 56 ms Method 2: 115741 colisions found in 79 ms Method 3: 115741 colisions found in 88 ms Method 1: 116169 colisions found in 56 ms Method 2: 116169 colisions found in 79 ms Method 3: 116169 colisions found in 88 ms Method 1: 115926 colisions found in 55 ms Method 2: 115926 colisions found in 78 ms Method 3: 115926 colisions found in 87 ms Method 1: 115758 colisions found in 57 ms Method 2: 115758 colisions found in 78 ms Method 3: 115758 colisions found in 87 ms Method 1: 116293 colisions found in 55 ms Method 2: 116293 colisions found in 78 ms Method 3: 116293 colisions found in 88 ms AMD Athlon 64 X2 3800+, 1G RAM, Windows XP Service Pack 2 Oh, btw, you spelled collisions wrong ;) |
| ||
Sure, Rnd takes the most time, but that doesn't matter as it should need the same time for both methods, so the result is compareable again. Jake PS: Torrente, you're right, here's the missing letter: l ;) |
| ||
Method 1: 116066 colisions found in 52 ms Method 2: 116066 colisions found in 79 ms Method 3: 116066 colisions found in 109 ms Method 1: 115853 colisions found in 52 ms Method 2: 115853 colisions found in 81 ms Method 3: 115853 colisions found in 112 ms Method 1: 115801 colisions found in 51 ms Method 2: 115801 colisions found in 79 ms Method 3: 115801 colisions found in 111 ms Method 1: 116252 colisions found in 53 ms Method 2: 116252 colisions found in 80 ms Method 3: 116252 colisions found in 110 ms Method 1: 115178 colisions found in 51 ms Method 2: 115178 colisions found in 80 ms Method 3: 115178 colisions found in 109 ms Method 1: 115494 colisions found in 53 ms Method 2: 115494 colisions found in 81 ms Method 3: 115494 colisions found in 111 ms Method 1: 115880 colisions found in 52 ms Method 2: 115880 colisions found in 79 ms Method 3: 115880 colisions found in 110 ms Method 1: 116172 colisions found in 52 ms Method 2: 116172 colisions found in 80 ms Method 3: 116172 colisions found in 109 ms Method 1: 115674 colisions found in 51 ms Method 2: 115674 colisions found in 80 ms Method 3: 115674 colisions found in 111 ms Method 1: 115286 colisions found in 51 ms Method 2: 115286 colisions found in 80 ms Method 3: 115286 colisions found in 110 ms with a fair amount of apps running on the compy.. |
| ||
Sure, Rnd takes the most time, but that doesn't matter as it should need the same time for both methods, so the result is compareable again. Well, I was going to post an answer like that, but then I realised that it does make a differance. For wereas the results are comparible, the % differance is greater.So I agree with you that the fastest will still be the fastest, but it will be by a greater amount. eg You had Method1 as about 6% faster. With the edited program is it not now closer to 80% faster? (LarsG Has it as 120%Faster) |
| ||
Who cares about how much faster one method is about another? No offense, but unless I (or anyone) finds a method faster than method 1 (which seems to be the fastest at the moment), this is the method to use, right? Well, maybe it could be interesting to have a general guide which basic operations (+-*/, if-then-else, select-case, etc..) takes more time than others. With such a guide one could code faster functions more easy. On the other side I don't profile most highlevel-functions. Linechecking and other essential stuff called a hundred times each frame are very special cases and the only ones worth for optimization imho. In fact I'm excited how fast this little function is. I thought about splitting concave polygons into convex ones for faster collision checking. Now I see no need for this step. Should be fast enough for my project when doing AABB-check prior. More time for other things... Jake PS: What about starting a competition for "usual tasks"-functions like different collisions, basic physics-calculations and such? I'm sure we find a lot of common functions everyone of us use, so having a FAQ/Howto with the fastest way to do it in BM would be nice, wouldn't it? |
| ||
Who cares about how much faster one method is about another? Well you. As proven by No offense, but unless I (or anyone) finds a method faster than method 1 (which seems to be the fastest at the moment), this is the method to use, right? What was interesting is that Method 2 is only 50% of the speed of Method three in DEBUG, but about 110% of the speed in release. Somthing which I didnt think would happen. |
| ||
That's indeed interesting, not to say a little bit weird. P.S: I meant the %-difference with "how much faster than another"... |
| ||
Some results from me. Thats not enough code in the linescollide function(s) to justify branching for optimization at all, and the use of abs/max/min slows it down even more. Having said that, im not sure if you can manage without the latter. |
| ||
Shaved of a few milliseconds. And the code. I noticed that between blitzmax 118 and 124, method 2 and 3 were one or two milliseconds faster on 124, but method 4 was 10! milliseconds faster on 118. |
| ||
Ok bare in mind that Im rubbish at C++, but I had a go anyway, just Method1 Note 1: As you may be able to tell, its using DarkGDK, and The free VC8 express. And possibly someone who know what they are doing could improve it. Note 2:I got it going with a Low Array size, and then it stopped working with the larger size. Ok so it was just nessesary to increase the Stack size, but Ive never had to do anything like that with BMax Note 3:I dont think that I compiled it as Managed C++, and am almost certain that I ran it as core C++ DeBug 116109 211 115805 223 115598 198 115891 199 115433 199 115914 204Max, 271,274,276,270,269 Release 115598 161 115878 156 116235 159 116088 157 115766 157 115657 158Max, 70,65,65,65,64 I dont think Ive made the code any differently in the C version, but if anyone want to point out a bottleneck Ive introduced please do. |