quick poll - precalc sin & cos or perform every frame ?

BlitzPlus Forums/BlitzPlus Programming/quick poll - precalc sin & cos or perform every frame ?

Dax Trajero(Posted 2004) [#1]
Hi, I'm new to all this programming business, and have a set of 4 sprites, each of which can move at any angle (e.g. 0 deg, 11.25 deg, 22.50 deg, etc.)

Regardless of which method* I choose, I use

dx = cos(a)
dy = sin(a)

*My question is, is it best to precalc these values and put them in a data structure (array, etc.) so it's less CPU intensive

or

Every frame, do I just do the COS & SIN calcs for each of the 4 sprites, and to hell with the cpu intensity ?

Thanks in advance,
-Dax


WillKoh(Posted 2004) [#2]
If they can move in a somewhat limited number of angles, one could use an array with precalc. I wouldn't have thought about using precalc for this until you brought it up. Would it be much faster? One could just write a small file to include each time that stores the cos and sin of, say, every integer angle from 0-359... unless Blitz already uses a built-in table for that?
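
For the 11.25-degree steps in the original question, a precalculated table would only need 32 entries. A minimal sketch, assuming each sprite stores its heading as an index from 0 to 31; the names Dirs, DirX, DirY, dir and speed are hypothetical and not from the thread:

[Code]
; Hypothetical sketch: 32 headings, 11.25 degrees apart
Const Dirs = 32
Dim DirX#(Dirs - 1)
Dim DirY#(Dirs - 1)

; Fill the tables once at startup (Blitz Sin/Cos take degrees)
For d = 0 To Dirs - 1
    DirX#(d) = Cos(d * 360.0 / Dirs)
    DirY#(d) = Sin(d * 360.0 / Dirs)
Next

; Per frame, for one sprite: dir is its heading index (0..31)
dir = 3          ; 3 * 11.25 = 33.75 degrees
speed# = 2.0
x# = 100
y# = 100
x# = x# + DirX#(dir) * speed#
y# = y# + DirY#(dir) * speed#
[/Code]

Note the float arrays (the # tag). Whether this is measurably faster than calling Sin() and Cos() directly for four sprites is a separate question that has to be measured.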


Rob Farley(Posted 2004) [#3]
I'd finish your game and see if you need the extra speed afterwards.

It's easy enough to create a couple of arrays afterwards.

Personally I always just use sin() and cos(). I've not noticed any massive speed hit and I don't think it's really worth worrying about until you're getting to optimisation right at the end of your project.

It's a bit like worrying about floats and ints... I don't... I just use them as necessary.

PCs are so quick now I don't think it's worth spending too much time optimising unless you can get an extra 10 or 20%. To get an extra 3 cycles in 200 hardly seems worth the effort.


Beaker(Posted 2004) [#4]
The only time I've seen it being worth it is for demos that don't really do much else apart from Sin/Cos and floating point number crunching.


Kevin_(Posted 2004) [#5]
Anything that saves you having to calculate in real time has got to be a good thing. Therefore I would recommend using the lookup tables.


Warren(Posted 2004) [#6]
Profile twice, optimize once.

I seriously doubt sin/cos will be your primary bottleneck.


Kevin_(Posted 2004) [#7]
Epicboy....

Please explain....


Warren(Posted 2004) [#8]
Was I not clear?

Profile your code. Look at the bottlenecks before optimizing something like sin/cos calls. Unless you're doing a high intensity graphics demo, sin/cos are almost certainly not going to be a significant chunk of your CPU usage.

Concentrate on optimizing the code that matters.
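
One rough way to profile in Blitz is to bracket each section of the main loop with MilliSecs() and add up the totals over many frames. A sketch under assumptions: UpdateSprites() and OtherWork() are hypothetical stand-ins for the real game code, and the Junk array exists only to give the second routine something to do.

[Code]
; Profiling sketch. UpdateSprites() and OtherWork() are hypothetical
; stand-ins for your own routines.
Graphics 640,480,32,2

Dim Angle#(3)        ; four sprites, as in the original question
Dim Junk(100000)     ; dummy data for the stand-in "other work"

For frame = 1 To 1000
    t0 = MilliSecs()
    UpdateSprites()
    t1 = MilliSecs()
    OtherWork()
    t2 = MilliSecs()
    SpriteTime = SpriteTime + (t1 - t0)
    OtherTime = OtherTime + (t2 - t1)
Next

Text 10,10,"Sprite maths took "+Str(SpriteTime)+" millisecs"
Text 10,30,"Everything else took "+Str(OtherTime)+" millisecs"

Flip
WaitKey()
End

; Sin/Cos on the fly for each of the four sprites
Function UpdateSprites()
    For i = 0 To 3
        dx# = Cos(Angle#(i))
        dy# = Sin(Angle#(i))
    Next
End Function

; Placeholder for drawing, collisions, AI and so on
Function OtherWork()
    For i = 0 To 100000
        Junk(i) = Junk(i) + 1
    Next
End Function
[/Code]

MilliSecs() only counts whole milliseconds, so the totals mean something only when summed over many frames.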


sswift(Posted 2004) [#9]
I concur.


Kevin_(Posted 2004) [#10]
Epicboy...

You are wrong. Sin and Cos are bottlenecks; that is why people use lookup tables. If you designed your program correctly based on an industry standard model then you would have identified potential bottlenecks even before you did any programming at all.

Good design is about the model on paper NOT the program. Designing the program is one of the last phases in the system development life cycle.

Why do people use Bubble Sort? Because they haven't designed their models correctly, that's why. They think that because the code needed to perform this algorithm is short, it must be better. WRONG! If they had designed it correctly they would use an algorithm with an Order of Magnitude of 'On' instead of 'On Squared'.

It amazes me that people assume they can get away with sloppy code because CPUs are so fast today that it does not make a significant difference to the speed of their programs.

How wrong they are.


Rottbott(Posted 2004) [#11]
Prof, you are right in theory, but it doesn't apply to this case. Using lookup tables for this isn't really faster than simply calling the function these days.


Warren(Posted 2004) [#12]
Prof

No, Sin/Cos -used- to be bottlenecks and that is why people -used- to have to use look up tables. These days, it's not a big deal unless you are calling them thousands of times every frame.

This is the same sort of thing that goes along with unrolling loops and fixed point math. Used to be useful optimizations. Nowadays? Not so much.


Kevin_(Posted 2004) [#13]
If that is the way that you do it then fine. It is not the way I do it because I want my programs to run as fast as possible and having done benchmarks on both methods, the lookup tables method wins every time.


Kevin_(Posted 2004) [#14]
Do you want proof? Instead of just making a statement, Epicboy, why don't you prove it? I have. Run the code below that shows I am right and you are wrong (as usual).

The results on my Cel(2GHz) were as follows....

On the fly took 31 milliseconds
Pre-calced tables took 1 millisecond

But it doesn't really matter how much proof I provide does it Epicboy? You are always right about these things (like Flameduck is also).



[Code]
Graphics 640,480,32,2

;Lookup Table
Dim Sin2(360)
Dim Cos2(360)
For a=0 To 359
Sin2(a)=Sin(a)
Cos2(a)=Cos(a)
Next

Iterations=100 ; change

; Calc on the fly
Time1=MilliSecs()
For t=1 To Iterations
For a=0 To 359
dx = Cos(a)
dy = Sin(a)
Next
Next
Time2=MilliSecs()
TimeTaken1=Time2-Time1


; Now Calc using tables
Time1=MilliSecs()
For t=1 To Iterations
For a=0 To 359
dx = Cos2(a)
dy = Sin2(a)
Next
Next
Time2=MilliSecs()
TimeTaken2=Time2-Time1


Text 10,10,"REPORT"
Text 10,30,"Calc On the fly took "+Str(TimeTaken1)+" millisecs"
Text 10,45,"Look up tables took "+Str(TimeTaken2)+" millisecs"

Text 10,70,"Conclusion... Look up tables are much faster"
Text 10,85,"and Epicboy talks out of his arse (as usual)."

Flip
WaitKey()
End

[/Code]


DJWoodgate(Posted 2004) [#15]

Every frame, do I just do the COS & SIN calcs for each of the 4 sprites, and to hell with the cpu intensity ?



Yes, your CPU can take the strain.


Warren(Posted 2004) [#16]
As I said, if you are calling them hundreds or thousands of times each frame, optimize it. If you're writing a normal application, and not a pathological test case, it's generally unnecessary.


WolRon(Posted 2004) [#17]
Do you want proof? Instead of just making a statement, Epicboy, why don't you prove it? I have. Run the code below that shows I am right and you are wrong (as usual).

The results on my Cel(2GHz) were as follows....

On the fly took 31 milliseconds
Pre-calced tables took 1 millisecond

Actually, EpicBoy is closer to the truth. There was another thread on this recently (it's in here somewhere) and people found out that the faster processors could actually compute the math faster than they could look up the tables (because they had to slow down to RAM access speed). Nowadays, it actually does depend on the system you are running it on. As far as SIN/COS are concerned, it's not really worth the effort worrying about anymore.

If you want a real test, you should run it on MANY systems and compare the results. But as EpicBoy said, it hardly matters in most cases.


Warren(Posted 2004) [#18]
Prof, if it makes you happy, go ahead and use the look up tables. Enjoy. The fact is, for most applications, the difference is negligible and there are far, FAR larger fish to fry when it comes to speeding them up.


Kevin_(Posted 2004) [#19]
Prove it!


Dax Trajero(Posted 2004) [#20]
Please guys, calm down - I don't want to start an argument.

Taking all that has been written on this thread into consideration, perhaps it would be pertinent to ask what my target hardware is?

People talk about running the game on 1GHz systems, but ideally I'd also like people with slower CPUs to be able to run the game, e.g. Pentium II class systems. I'm not going for state-of-the-art graphics here - I'm doing a retro remake.

Taking that into account, I'm sure PROF's argument becomes that much more pertinent when running the game on a Pentium II 350MHz?

For now and for quickness, I'll use EPICBOY's approach, then when the game is done, I'll go back and try PROF's approach and see what the gains are.

Thanks for your input guys

-Dax


Rob Farley(Posted 2004) [#21]
Prof,

I find optimisations come from looking at your code once you've finished, when you can shave cycles off loops, or indeed remove loops altogether.

For example, you might not need to run AI code for every enemy every loop. You might be able to get away with running the AI code for one enemy per game loop, cycling through your enemies, thereby removing an entire loop from your main loop.

You might find that you're not hiding entities that are out of view, or that you're processing collisions that could never actually happen. All of these things will probably have a greater impact than a lookup of sine and cosine.

That's just me though.
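
The one-AI-update-per-loop idea could be sketched like this, assuming enemies are kept in a Blitz Type list. The Enemy type, the thinkCount field, and the UpdateOneAI() and ThinkAI() functions are all hypothetical names for illustration:

[Code]
; Sketch: update one enemy's AI per game loop, round-robin.
Type Enemy
    Field x#, y#
    Field thinkCount
End Type

Global NextThinker.Enemy

; Create a few enemies
For i = 1 To 10
    e.Enemy = New Enemy
Next

; Stand-in for the main loop: one AI update per pass
For frame = 1 To 25
    UpdateOneAI()
Next

Function UpdateOneAI()
    ; Wrap around to the start of the list when we run off the end
    If NextThinker = Null Then NextThinker = First Enemy
    If NextThinker = Null Then Return    ; no enemies at all
    ThinkAI(NextThinker)
    NextThinker = After NextThinker      ; Null after the last enemy
End Function

Function ThinkAI(e.Enemy)
    e\thinkCount = e\thinkCount + 1      ; placeholder for real AI work
End Function
[/Code]

If enemies can be deleted mid-game, NextThinker would need to be moved on before deleting the enemy it points at.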


Kevin_(Posted 2004) [#22]
Dax....

You have answered your own question! Lookup tables will give better performance on ANY system, whereas calculating on the fly may induce slowdown on older systems.

Why design two systems when one will do the job?

Rob....

I agree with what you (and Epicboy) say about other factors but the fact remains that mappings are so much faster than calculations. The overhead is bigger because of the arrays, but it is worth it for the extra speed you gain. Run the code in my previous post. Change the ITERATIONS value to 1000 and view the results. You are looking at a speed increase of 300 TIMES FASTER using lookup tables!

Mappings are FAST! Calculations are Slooowwwwwwwww!


WolRon(Posted 2004) [#23]
After comparing apples to apples (your code was doing an extra (unfair) conversion of floating point to integer):
[Code]
Graphics 640,480,32,2

;Lookup Table
Dim Sin2#(360)
Dim Cos2#(360)
For a=0 To 359
    Sin2#(a)=Sin(a)
    Cos2#(a)=Cos(a)
Next

Iterations=71500   ; change

; Calc on the fly
Time1=MilliSecs()
For t=1 To Iterations
    For a=0 To 359
        dx# = Cos(a)
        dy# = Sin(a)
    Next
Next
Time2=MilliSecs()
TimeTaken1=Time2-Time1


; Now Calc using tables
Time1=MilliSecs()
For t=1 To Iterations
    For a=0 To 359
        dx# = Cos2#(a)
        dy# = Sin2#(a)
    Next
Next
Time2=MilliSecs()
TimeTaken2=Time2-Time1


Text 10,10,"REPORT"
Text 10,30,"Calc On the fly took "+Str(TimeTaken1)+" millisecs"
Text 10,45,"Look up tables took "+Str(TimeTaken2)+" millisecs"

Flip
WaitKey()
End
[/Code]
These are the results I came up with (on the fly - lookup tables, in millisecs):
B+ :6693 - 95
B3D:5531 - 122 (approx. 4307 - 95)

Looks like B3D is faster at doing the math than B+ is (35% faster) and slower at looking up the tables.

My machine is a P4-2.53GHz.
I imagine with such small tables (360 elements), it fits entirely in the L2 cache so the processor doesn't have to access the system RAM. With a larger table this may not be true.

I got these results with substantially larger tables which suggest what I was talking about.
B+ :3102 - 103
B3D:2565 - 99
[Code]
Graphics 640,480,32,2

;Lookup Table
Dim Sin2#(36000000)
Dim Cos2#(36000000)
For a=0 To 35999999
    Sin2#(a)=Sin(a)
    Cos2#(a)=Cos(a)
Next

Iterations=1   ; change

; Calc on the fly
Time1=MilliSecs()
For t=1 To Iterations
    For a=0 To 35999999
        dx# = Cos(a)
        dy# = Sin(a)
    Next
Next
Time2=MilliSecs()
TimeTaken1=Time2-Time1


; Now Calc using tables
Time1=MilliSecs()
For t=1 To Iterations
    For a=0 To 35999999
        dx# = Cos2#(a)
        dy# = Sin2#(a)
    Next
Next
Time2=MilliSecs()
TimeTaken2=Time2-Time1


Text 10,10,"REPORT"
Text 10,30,"Calc On the fly took "+Str(TimeTaken1)+" millisecs"
Text 10,45,"Look up tables took "+Str(TimeTaken2)+" millisecs"

Flip
WaitKey()
End
[/Code]



Kevin_(Posted 2004) [#24]
WolRon....

Your first example on my machine....

On the fly = 30471 millisecs
Look up tables = 157 millisecs

The look up tables are still roughly 200 times faster using floating point numbers.

Your second example was rather interesting because it crashed Blitz Plus! However, removing three digits from the right allowed the program to run and....

On The Fly = 51 millisecs
Lookup Tables = 0 millisecs

Conclusion.....

According to my tests, using INTs is roughly 300 times faster and using FLOATs is roughly 200 times faster. Whichever way you look at it, lookup tables are much faster.


RexRhino(Posted 2004) [#25]
According to my tests using INT's are roughly 300 times faster and FLOATS are roughly 200 times faster. Which ever way you look at it, Lookup tables are much faster.

Your test program doesn't conclusively prove anything. It is seriously flawed. Here is why...

Your program is small and uses very little memory per loop, and therefore your look-up table is stored in the L1 high speed memory cache. However, depending on the size of your program and how much memory it uses in each loop, your lookup table could be outside the L1 high speed memory cache (or even outside the L2 cache). Because your test program is small enough to fit in L1 cache, the values you are getting are not accurate (unless your game is going to be very small).

To get a more accurate set of values, you need to operate on an amount of memory greater than your L1 and L2 caches in each loop. Try creating a one million value array, and incrementing each slot every loop along with using your Sin and Cos array. Or, try doing a lot of graphics manipulation in each loop.

On my older machine, when using a lot of memory so that the array exists outside of the cache, on the fly and lookup tables are almost exactly the same. And on a faster machine, I would venture to guess on the fly is quicker.
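
A sketch of the test described above, adapted from the benchmark earlier in the thread. It assumes that a filler array of about one million integers (roughly 4MB) is enough to push the tables out of cache on the machine being tested; FillerSize and Filler are hypothetical names, and the right size depends on the CPU.

[Code]
; Sketch: touch a large array on every pass so the lookup tables have
; to compete for cache, as they would in a real game.
Graphics 640,480,32,2

Dim Sin2#(359)
Dim Cos2#(359)
For a=0 To 359
    Sin2#(a)=Sin(a)
    Cos2#(a)=Cos(a)
Next

Const FillerSize=1000000   ; guess; should exceed your L2 cache
Dim Filler(FillerSize)

Iterations=200   ; change

; Calc on the fly, with the filler work in between
Time1=MilliSecs()
For t=1 To Iterations
    For i=0 To FillerSize
        Filler(i)=Filler(i)+1
    Next
    For a=0 To 359
        dx# = Cos(a)
        dy# = Sin(a)
    Next
Next
TimeTaken1=MilliSecs()-Time1

; Same again using tables
Time1=MilliSecs()
For t=1 To Iterations
    For i=0 To FillerSize
        Filler(i)=Filler(i)+1
    Next
    For a=0 To 359
        dx# = Cos2#(a)
        dy# = Sin2#(a)
    Next
Next
TimeTaken2=MilliSecs()-Time1

Text 10,10,"REPORT (both figures include the filler work)"
Text 10,30,"Calc On the fly took "+Str(TimeTaken1)+" millisecs"
Text 10,45,"Look up tables took "+Str(TimeTaken2)+" millisecs"

Flip
WaitKey()
End
[/Code]

Both timings include the same filler work, so only the difference between them reflects the trig cost. Comparing that difference with the totals also shows how small a share of each pass the trig takes.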


Warren(Posted 2004) [#26]
Prove it!

I just released a new game on my site that calcs sin/cos every frame for the various effects. Does it seem slow to you?

I dunno, I've developed several games now (both on my own and for Epic Games) and look up tables are just NOT used.

I work with the Unreal engine every day and you know what? No look up tables for sin/cos.

What more can I say?


Warren(Posted 2004) [#27]
Dax
For now and for quickness, I'll use EPICBOY's approach, then when the game is done. I'll go back an try PROF's approach and see what the gains are.

That's the sensible thing to do, yes. Mind you, I'm not AGAINST look up tables if they make a difference for you. I just don't think they will.


Kevin_(Posted 2004) [#28]
I just released a new game on my site that calcs sin/cos every frame for the various effects. Does it seem slow to you?


That is a ridiculous thing to say because I cannot compare it with anything. Compile another version using lookup tables and compare the results.


Warren(Posted 2004) [#29]
My point stands ... does it seem slow? No? Then that optimization, in the case of my game, is irrelevant and unnecessary. And that's what I've been trying to get through to you this entire time.


Kevin_(Posted 2004) [#30]
It may be fast on your system but what about other people's computers? I cannot believe that you have just said that.

For someone who works regularly on the Unreal Engine (your comments) I am amazed that someone like yourself does not understand the principles and importance of comparisons.

I'm afraid your point doesn't stand because yet again I have provided evidence proving that lookup tables are much faster than calculating on the fly.

I have yet to see some of your code that proves that I am wrong. I wonder why?


RexRhino(Posted 2004) [#31]
I'm afraid your point doesn't stand because yet again I have provided evidence proving that lookup tables are much faster than calculating on the fly.

No you haven't!!! Your test program doesn't use enough memory to accurately test it the way it would be used in a game. Try loading and manipulating a bunch of graphics so that everything can't sit in your L1 cache, and you will get orders of magnitude slower results.


Andy(Posted 2004) [#32]
Prof,

REX has a very important point here, and just to slap you around a bit for dragging people into this thread that haven't even posted, a point that Flameduck has raised many times.

Just because someone decides to argue their point doesn't mean that they don't know what they are talking about - in fact this community has a large number of very knowledgeable people.

What used to be essential optimizations on a 486, are a trivial waste of time on a P3/P4... Knowing where to optimize is as important as the optimizations themselves.

Andy


Warren(Posted 2004) [#33]
I have yet to see some of your code that proves that I am wrong. I wonder why?

Because, and this is just a hunch, I don't believe that you would accept anything that I showed you.

You seem to be stuck in this "optimization is always essential" mindset, which is usually the mark of an inexperienced programmer. Not trying to insult you, just stating the facts.

A few people have pointed out the caching problems inherent to skewing your example but you've done nothing to prove them wrong. Why not?

Profile twice, optimize once.


Kevin_(Posted 2004) [#34]
Because, and this is just a hunch, I don't believe that you would accept anything that I showed you.


Rubbish! Being able to prove something is the only way to get credibility. Show me some code that proves that calculating sin/cos is faster than using lookup tables and I'll take back what I said. I already ran WolRon's example and the lookup tables are still loads faster.

Just because someone decides to argue their point, doesn't mean that they don't know what they are talking about - in fact this community has a large number of very knowledgable people.


But how can someone argue their point without producing evidence? This is one of the biggest problems in this forum. Some people are making comments and even giving the wrong advice to others without providing a single shred of evidence to back up their argument.

Anyway, I have provided evidence to the original starter of this thread that lookup tables are faster. I am happy in the knowledge that this person has been given the correct advice by me and not some 'know it all amateur' who gives the impression he knows his stuff without providing evidence.


Andy(Posted 2004) [#35]
You know, your condescending attitude won't score you any points... The original poster seemed very level-headed and I am sure that he can see through both 'self-important bravado' and 'know it all amateurs'.

Andy


Warren(Posted 2004) [#36]
Rubbish! Being able to prove something is the only way to get credibility. Show me some code that proves that calculating sin/cos is faster than using lookup tables and I'll take back what I said. I already ran WolRon's example and the lookup tables are still loads faster.

As I feared, you've totally missed my point.

I'm out.


Sledge(Posted 2004) [#37]

I already ran Wolron's example and the lookup tables are still loads faster.



Isn't the point that the difference between realtime calculations and still-loads-faster lookup tables is negligible in terms of the application's overall performance these days? I mean, "slow" is still pretty damn fast!


Tracer(Posted 2004) [#38]
Sin and Cos shouldn't put much strain on the CPU itself, I'd think; on the FPU perhaps. The Celeron seems to not like that as much as a real processor like the P4.

Tracer


Kevin_(Posted 2004) [#39]
Epicboy...

I don't believe that you would accept anything that I showed you


A strange answer, which is probably due to the fact that you can't provide any proof.


Warren(Posted 2004) [#40]
www.respawngames.com


Kevin_(Posted 2004) [#41]
That is not proof that calculating sin & cos is just as fast as using lookup tables.

How about posting a little code so that everyone can see for themselves? (like I did at the top of this thread).


Kevin_(Posted 2004) [#42]
I'm still waiting :-)


Warren(Posted 2004) [#43]
That is not proof that calculating sin & cos is just as fast as using lookup tables.

I KNOW! Like I said, you've missed my point and no matter how many times I state it, it's just not sinking in.

I'm tired of repeating myself. Grow up a little, get some real world experience, come back and we'll talk.


WolRon(Posted 2004) [#44]
I agree with EpicBoy. Prof is being childish and obviously misses the point.


Al Mackey(Posted 2004) [#45]
Prof, the point I think these people are trying to make has nothing to do with what's faster. The point they're trying to make, in a nutshell, is this:

Sin/Cos functions will take up such a small percentage of your overall CPU load in your average complete game or application that wasting memory and load time on a lookup table is unnecessary. They can't post example code because that would mean posting the code for an entire game or application.

So, yes, lookup tables are technically faster, but in a real-world application it would be the difference between spending, say, 3.55 milliseconds per frame on calculations as opposed to 3.58. And as CPUs get faster, this margin is becoming less and less.

But go ahead and use lookup tables if you want to. Makes no difference to me.


Lazze(Posted 2004) [#46]
CPUs get faster - we make slower code - we need faster CPUs to run our slower code - now we have even faster CPUs so we are satisfied with even slower code - now we need faster CPUs again.... get my point???


Al Mackey(Posted 2004) [#47]
Are you saying that our code is the same level it was ten years ago? It isn't inefficient code that's driving the processor market faster, it's larger and more powerful applications.


Warren(Posted 2004) [#48]
CPUs get faster - we make slower code - we need faster CPUs to run our slower code - now we have even faster CPUs so we are satisfied with even slower code - now we need faster CPUs again.... get my point???

The thing is, it's not all about performance. I don't know why young programmers are so wound up about clock cycles and such. If your app runs acceptably on your target platform, that's enough. Faster processors allow you to use higher and higher levels of abstraction and layers of libraries (like Blitz products).

Code from the old days was highly optimized, yes. It was also hard to read and almost impossible to maintain.

Bottom line ... hacking in assembler sucks. Nobody wants to do it anymore.


Lazze(Posted 2004) [#49]
My point is not that we should code it in assembler (would probably make it even slower, given the complexity of today's systems), nor is it that we should make code that is impossible to maintain or cut features from our applications. My point is: why make programs that only run acceptably on high-end systems by only partially optimizing, when we could make them accessible to weaker computers by using all our possible means of optimizing? Of course, if you're only coding for your own system it makes no difference, but if you want to sell it, you narrow the market if it only runs on fast computers... but that's just my opinion.


Warren(Posted 2004) [#50]
Lazze

Assembler was going to an extreme to make my point. My original point still stands, and I'll repeat one final time:

Optimize when necessary. Compacting/optimizing/obfuscating code just because you think it's "cool" or you do it "just because" is to be a nightmare programmer to work with. I don't want that person on my team. Ever.

And note : selected optimizing is NOT being lazy. It's working smart.


Lazze(Posted 2004) [#51]
Optimize when necessary.


I agree - believe it or not :o) If it runs acceptably on your target computers - no need to work more on that.

And note: If you want to limit your programs to high-end systems, well, that's your choice.


Kevin_(Posted 2004) [#52]
Lazze...

The problem with Epicboy is that he is always right. Even when you provide code as proof he still won't admit he's wrong. He has still yet to provide one bit of code that backs up his argument that the difference is not worth bothering with. Oh, sorry, correction.... It works fast on his computer so it must be OK (LOL!)

And to the others that think I'm being childish, I work in industry on a regular basis as well as having my own business. Anyone in a similar position will tell you that nothing gets past quality control unless it can be proved!

Proof is everything. Saying "I don't think it will make a difference" shows inexperience and a lack of understanding of the basic principles of the system development life cycle.

But it doesn't really matter what I say. Epicboy is always right about these things no matter how much proof you provide. It's a real pity that he can't provide us with an example of his argument. Never mind.


Warren(Posted 2004) [#53]
Are you even reading my posts? Seriously.


Kevin_(Posted 2004) [#54]
Still no code to prove your point? Provide a bit of code that proves you are right, otherwise no one will take you seriously.

I don't believe that you would accept anything that I showed you.


Of course I would. Just show me an example. The truth is that you can't, can you?

Either admit you are wrong or provide some proof. How many times have I got to ask?


Warren(Posted 2004) [#55]
Yeah, you're right. I admit it. There is never a case where using lookup tables would be slower or unnecessary than straight calcs. They are useful in every situation, every time, without exception.

Look up tables are always faster, always better and always necessary.


Kevin_(Posted 2004) [#56]
I don't care. I'm arguing that it DOESN'T MATTER.


Where is the proof that IT DOESN'T MATTER? I have proved that IT DOES MATTER near the top of this thread. Ints were 300 times faster and floats were 200 times faster using lookup tables.

So where is your evidence that proves it doesn't matter?


Warren(Posted 2004) [#57]
Look, I already said that you're right. Look up tables are always necessary and will speed up any application in a noticeable way. They are essential to any application written for any reason on any platform or system spec.

You win.


WolRon(Posted 2004) [#58]
I bow down to you oh great and mighty Prof.
You are the supreme being.


Kevin_(Posted 2004) [#59]
That's not what it's about at all. It's all about proving your argument and I did. Epicboy didn't.

I am not an almighty supreme being at all and I occasionally get things wrong like most people. On the occasions when I do get things wrong I often back down from my argument because someone has provided proof to suggest otherwise - something which Epicboy just cannot do regardless of how much evidence is provided.

In this case I won the argument simply because I provided proof whereas Epicboy didn't. It is as simple as that.


Stoop Solo(Posted 2004) [#60]
Good god, I can't believe this childish bickering is still continuing.

My own contribution to the original question: yes, in my last project I did use pre-calc tables. However, only for part of it. It was for a routine that performed a lot more than just a SIN calculation, and sometimes did it many times per loop. I precalculated the whole formula and stuffed it into a bank. It had an impact, I'd estimate, of about 6% or so at most. When I tried substituting the simpler stuff with SIN tables, I couldn't even tell if it was making any difference. If it was, it was so insignificant compared to what else was going on that I simply couldn't tell. I could create a few test programs, but they wouldn't necessarily accurately reflect a real-world scenario.

So yeah, I suppose if I was doing a humungous number of SIN calcs, especially as part of a larger equation, I would most likely precalculate the entire equation. But to be honest, it will depend upon the program and how heavy the usage is as to whether or not tables are worth bothering with.
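
Precalculating a whole formula into a bank, rather than just SIN, might look something like the sketch below. The Wave() formula is invented purely for illustration (the actual routine is not given in the post), and WaveBank is a hypothetical name. It assumes one 4-byte float per whole degree.

[Code]
; Sketch: precalculate an entire (made-up) formula into a bank,
; one 4-byte float per whole degree.
WaveBank = CreateBank(360 * 4)
For a = 0 To 359
    PokeFloat WaveBank, a * 4, Wave(a)
Next

; Later, in the main loop: one read instead of the full formula
angle = 45
offset# = PeekFloat(WaveBank, angle * 4)

; Hypothetical formula standing in for the real routine
Function Wave#(a#)
    Return Sin(a# * 3) * Cos(a#) * 16 + Sin(a# * 7) * 4
End Function
[/Code]

The more work the formula does per call, the more a single bank read can save, which fits the observation that plain SIN tables made no visible difference.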


Warren(Posted 2004) [#61]
I could create a few test programs, but they wouldn't necessarily accurately reflect a real-world scenario.

Your logic and facts have no place here, hippie!


podperson(Posted 2004) [#62]
Epicboy's original point was correct, even if it could have been put more diplomatically, and could be stated thus:

1) Write your program as simply and cleanly as you can.

2) When you're finished, if performance is an issue, profile your code to find the bottlenecks.

3) Optimize where it will do the most good.

A program that is finished is infinitely preferable to a faster program that isn't finished.

As for the example at hand:

It's almost certain that using a lookup table for trig functions will be faster than not doing so -- but the odds of this being significant with today's hardware are fairly small, and it may cause visual artifacts (non-smooth rotations, jerky control behavior).

More important, there's the question of OPPORTUNITY COST. You could spend the time you've spent optimizing something that doesn't NEED to be optimised optimizing something that DOES. And by optimising too soon you're also making code more difficult to maintain (optimised code almost always is nastier than unoptimised, the exception being when you find a more elegant way to do something) and debug (you know that sin() works; does your optimised version work?).


Warren(Posted 2004) [#63]
Or to use the classic quote, "Premature optimization is the root of all evil".


Kevin_(Posted 2004) [#64]
Premature optimization is the root of all evil


No, that is incorrect.


podperson(Posted 2004) [#65]
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."

Donald Knuth, "Literate Programming"

I guess Epicboy could have put the period inside the quotation marks.

(If you don't know who Donald Knuth is, I suggest you look him up before posting.)


Craig Watson(Posted 2004) [#66]
Small test programs like that will not produce realistic results. That Sin/Cos test is not a game engine, and as such is not a valid response to the original question, which related to the use of those functions within an actual game. Think of how many benchmarking tools actually use code from real programs or game engines to provide a more realistic result.

To accurately test whether look up tables would be faster than the processor's own functions, you would have to take all cache levels out of the equation, or ensure the caches were adequately flooded with data as would be representative of a real world game engine.

In a real world engine it is unlikely that the lookup table would be stored long term in the cache, and thus on a processor from Pentium level up, using lookup tables would most likely be SLOWER than using the internal functions, which are performed in few cycles. To load the lookup data from system RAM would actually take several processor cycles (processors being massively faster than RAM) and as such is far less efficient in reality for most cases.

Thus the need to profile your code to find out where the true bottlenecks are.

If however your target market is the 386SX, or your main loop and program code consists largely of Sin and Cos calls, by all means use lookup tables. For 99% of people, the performance difference will be negligible, or possibly worse.


Simon S(Posted 2004) [#67]
Well, not keen to add fire to this argument, but I can say that my game won't run full speed without look up tables.

Let's see, there's a heat haze effect, needing two passes, one with about 350 cos operations and about 250 sin operations per frame.

It runs very comfortably on my Athlon 1.33GHz using a look up, but crawls at about 40fps if I do the math on the fly.

I've not had a chance to test it on a more powerful CPU, but I find it difficult to believe new processors find it faster to calculate sin than to access a memory field. Still, when I finally upgrade I may be proved wrong. We will see.


But unless I'm only using a handful of math, the look up tables always improve things for me.


bobbo(Posted 2004) [#68]
I agree that these kinds of tests are not reliable.

For example:

[Code]
Graphics 640,480,32,2


; Calc using * operator
Time1=MilliSecs()
For a=0 To 10000000
    cont=a
    cont=cont*2
    cont=cont*4
Next
Time2=MilliSecs()
TimeTaken1=Time2-Time1

; Now Calc using Binary Shift
Time1=MilliSecs()
For a=0 To 10000000
    cont=a
    cont=cont Shl 1
    cont=cont Shl 2
Next
Time2=MilliSecs()
TimeTaken2=Time2-Time1


Text 10,10,"REPORT"
Text 10,30,"Calc using * operator took "+Str(TimeTaken1)+" millisecs"
Text 10,45,"Calc using Binary Shift Left took "+Str(TimeTaken2)+" millisecs"

Flip
WaitKey()
End
[/Code]


and this
[Code]
Graphics 640,480,32,2


; Calc using Binary Shift
Time1=MilliSecs()
For a=0 To 10000000
    cont=a
    cont=cont Shl 1
    cont=cont Shl 2
Next
Time2=MilliSecs()
TimeTaken2=Time2-Time1

; Now Calc using * operator
Time1=MilliSecs()
For a=0 To 10000000
    cont=a
    cont=cont*2
    cont=cont*4
Next
Time2=MilliSecs()
TimeTaken1=Time2-Time1


Text 10,10,"REPORT"
Text 10,30,"Calc using * operator took "+Str(TimeTaken1)+" millisecs"
Text 10,45,"Calc using Binary Shift Left took "+Str(TimeTaken2)+" millisecs"

Flip
WaitKey()
End
[/Code]


I wanted to test the fastest way to multiply by 2 and 4 (* operator or binary shift), but the result changes if you just switch the order of the test...
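
One way to reduce (not eliminate) that order effect is to run both tests several times in alternation and keep the best time for each. The sketch below is only an assumption about how that could be set up: the function names TimeMul and TimeShl, the ROUNDS constant and the "best of" rule are invented for the example and are not part of the tests above.

[Code]
Graphics 640,480,32,2

Const LOOPS = 10000000
Const ROUNDS = 5        ; how many times each test is repeated (arbitrary)

bestMul = 1000000000
bestShl = 1000000000

; Alternate the two tests so neither always runs first
For r = 1 To ROUNDS
	t = TimeMul()
	If t < bestMul Then bestMul = t
	t = TimeShl()
	If t < bestShl Then bestShl = t
Next

Text 10,10,"REPORT (best of "+Str(ROUNDS)+" rounds)"
Text 10,30,"Calc using * operator took "+Str(bestMul)+" millisecs"
Text 10,45,"Calc using Binary Shift Left took "+Str(bestShl)+" millisecs"

Flip
WaitKey()
End

; Times one pass of the multiply version
Function TimeMul()
	start = MilliSecs()
	For a = 0 To LOOPS
		cont = a
		cont = cont * 2
		cont = cont * 4
	Next
	Return MilliSecs() - start
End Function

; Times one pass of the shift version
Function TimeShl()
	start = MilliSecs()
	For a = 0 To LOOPS
		cont = a
		cont = cont Shl 1
		cont = cont Shl 2
	Next
	Return MilliSecs() - start
End Function
[/Code]

If the two best times still differ by only a few milliseconds, the difference is probably within the noise of the machine and not worth acting on.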


Kevin_(Posted 2004) [#69]
Simon...

Try telling that to Epicboy. Those who say that it doesn't matter in a real world example just don't know what they are talking about OR they are just arguing for the sake of arguing without producing any evidence.


Warren(Posted 2004) [#70]
Prof

Your ability to selectively read posts and only acknowledge the ones that support your argument is truly admirable.


Stoop Solo(Posted 2004) [#71]
"Those who say that it doesn't matter in a real world example just don't know what they are talking about OR they are just arguing for the sake of arguing without producing any evidence."


...or might simply find that utilising such a table just isn't worth the effort in their particular piece of software.

Use them if it's going to make a worthwhile difference to your particular piece of software. Seems reasonable enough.

As for the poor soul who opened this pandora's box of a thread, I can't see look-up tables making a vastly huge impact. It will depend on what you are doing with these four sprites. And backgrounds. And whatever else. Yes, it may be that using look-up tables gives you a gajillionty-hundredty frames per second, rather than only a billionty-twillionty. Of course, the end-user's monitor may only display 100 of them.

So there isn't a yes-or-no answer, really. All you can do is suck it 'n' see.


Kevin_(Posted 2004) [#72]
Epicboy...

Your ability to selectively read posts and only acknowledge the ones that support your argument is truly admirable.


This is truly amazing. 71 posts and you still cannot give us a bit of code that supports your argument. This suggests to me that you are just arguing for the sake of it. No evidence, no credibility. Not surprising really considering some of your other posts.


Mental Image(Posted 2004) [#73]
Moderators.......save me, I'm sinking..... :-)


Warren(Posted 2004) [#74]
Hey Prof, is that ambrosia from heaven or urine spattering on your teeth right now? Can you tell the difference?


Craig Watson(Posted 2004) [#75]
"Well, not keen to add fire to this argument, but I can say that my game won't run full speed without look up tables.

Let's see, there's a heat haze effect, needing two passes, one with about 350 cos operations and about 250 sin operations per frame."

"If however your target market is the 386SX, OR YOUR MAIN LOOP AND PROGRAM CODE CONSISTS LARGELY OF SIN AND COS CALLS, BY ALL MEANS USE LOOKUP TABLES. For 99% of people, the performance difference will be negligible, or possibly worse."

I think 350 cos and 250 sin operations would fit exactly for that situation.

Most people however are not doing anywhere near that many calls in a loop.

Code should be profiled on a situation by situation basis and there is no uniform, one shot method of making fast code.


Craig Watson(Posted 2004) [#76]
I've hacked up some of the sample code from the standard Blitz3D install to demonstrate this. You may have to locate the media yourself or save the file in the "samples/birdie/fire effect/fire effect" folder to get it to work. I would've created the media on the fly but I can't really be bothered putting any real effort into this argument.

I've created a situation that you would think would actually favour Lookup tables (that being a huge amount of consecutive calls to Sin and Cos) however on my machine, an AMD Athlon 2600+ (the Barton version with the larger cache), using the processor's own functions is almost twice as fast as doing a lookup.

You can test this yourself by changing the uselookup constant.

The amount of milliseconds running the game loop takes is displayed in the top left corner.

I get between 77 and 80 for the processor's function and 139-144 for using lookups.

;Fire & Particles scene
;David Bird
;dave@...
Graphics3D 640,480

;Lookup Table
Dim Sin2#(36000000)
Dim Cos2#(36000000)
For a=0 To 35999999
Sin2#(a)=Sin(a)
Cos2#(a)=Cos(a)
Next
;Change this to switch test
Const UseLookup=0

HidePointer
SetBuffer BackBuffer()

lit=CreateLight()

TurnEntity lit,30,0,0

piv=CreatePivot()
cam=CreateCamera(piv)
CameraRange cam,.1,1000
PositionEntity cam,0,8,-25
TurnEntity cam,10,0,0

; Pre load sprites ** Hint Copy Entity is faster and more efficient **
Global flame01=LoadSprite("smk01.bmp"):HideEntity flame01
Global Part=LoadSprite("particle.bmp"):HideEntity Part

mirror=CreateMirror()	;Stick a mirror in there for fun
cube=CreateCube():PositionEntity cube,0,2,0
;
tfire.fire=Add_Fire(0,0,0)
rad#=10	;Radius of path
While Not KeyDown(1)
	starmill = MilliSecs()
	TurnEntity cube,0,1,0.2
	;Simple movement of fire.
	PositionEntity tfire\piv,xp#,1,zp#
	xp=rad*Cos(MilliSecs()/30)
	zp=rad*Sin(MilliSecs()/30)
	
	If uselookup=0 Then
		For t = 1 To 300000
			x1#=Sin#(Rnd(0,35999999))
			x2#=Cos#(Rnd(0,35999999))
		Next
	Else
		For t = 1 To 300000
			x1#=Sin2#(Rnd(0,35999999))
			x2#=Cos2#(Rnd(0,35999999))
		Next
	End If
	
	If KeyDown(203) TurnEntity piv,0,1,0
	If KeyDown(205) TurnEntity piv,0,-1,0
	If KeyDown(200) TurnEntity cam,1,0,0
	If KeyDown(208) TurnEntity cam,-1,0,0
	If KeyDown(44) MoveEntity cam,0,0,-.2
	If KeyDown(30) MoveEntity cam,0,0,.2
	
	;Press Space to change the colour of particles
	If KeyDown(57) Then
		add_particle(EntityX(tfire\piv),EntityY(tfire\piv),EntityZ(tfire\piv),Rnd(255),Rnd(255),Rnd(255))
	Else
		add_particle(EntityX(tfire\piv),EntityY(tfire\piv),EntityZ(tfire\piv))
	End If
	
	;Update all fires and particles
	Update_Fires()
	Update_Particles()

	UpdateWorld 
	RenderWorld
	Text 0,0,MilliSecs()-starmill
	Flip False
Wend

;Clean Up everything
Erase_Particles()
Erase_flames()
Erase_Fires()
FreeEntity Part
FreeEntity Flame01
FreeEntity lit
FreeEntity cam
FreeEntity piv
EndGraphics
End

;Each Flame of the fire
Type flame
	Field ent
	Field ang#
	Field size#
	Field alph#
	Field dis#
	Field dx#, dy#, dz#
End Type

;The fire itself
Type fire
	Field piv
	; Direction
	Field dx#, dy#, dz#
End Type

;Hot ashes
Type particle
	Field ent
	Field alpha#
	Field dx#,dy#,dz#
	Field pop
End Type

;Add a flame to the fire
Function Add_flame(x#,y#,z#,size#=1,dis#=.016,dx#=0,dy#=0.3,dz#=0)
	a.flame=New flame
	a\ent=CopyEntity(flame01)
	PositionEntity a\ent,x,y,z
	a\alph=1
	a\size=size
	a\dis=dis
	a\ang=Rnd(360)
	ScaleSprite a\ent,a\size,a\size
	EntityColor a\ent,Rnd(150,255),Rnd(0,100),0
	a\dx=dx
	a\dy=dy
	a\dz=dz
End Function

;Update flames
Function Update_flames()
	For a.flame=Each flame
		If a\alph>0.01 Then
			a\alph=a\alph-a\dis
			EntityAlpha a\ent,a\alph
			RotateSprite a\ent,a\ang
			a\ang=a\ang+.2
			MoveEntity a\ent,a\dx,a\dy,a\dz
		Else
			FreeEntity a\ent
			Delete a
		End If
	Next
End Function

;Erase all flames
Function Erase_flames()
	For a.flame=Each flame
		If a\ent<>0 Then FreeEntity a\ent
	Next
	Delete Each flame
End Function

;Update all fires
Function Update_Fires()
	For a.fire=Each fire
		Add_Flame(EntityX(a\piv),EntityY(a\piv),EntityZ(a\piv),Rnd(1,4),.04,a\dx,a\dy,a\dz)
	Next
	Update_Flames()
End Function

;Erase all fires
Function Erase_Fires()
	For a.fire=Each fire
		If a\piv<>0 Then FreeEntity a\piv
	Next
	Delete Each fire
End Function

;Add a fire to the scene
Function Add_Fire.fire(x#,y#,z#,dx#=0,dy#=.23,dz#=0)
	a.fire=New fire
	a\piv=CreatePivot()
	PositionEntity a\piv,x,y,z
	a\dx=dx:a\dy=dy:a\dz=dz
	Return a
End Function

;Add a particle to the scene
Function Add_Particle(x#,y#,z#,r=255,g=255,b=255)
	a.particle=New particle
	a\ent=CopyEntity(Part)
	PositionEntity a\ent,x,y,z
	a\dx=Rnd(-.1,.1)
	a\dy=Rnd(0.1,.7)
	a\dz=Rnd(-.1,.1)
	ScaleSprite a\ent,Rnd(.1,.2),Rnd(.1,.2)
	a\alpha=1
	a\pop=False
	EntityColor a\ent,r,g,b
End Function

;Update all particles
Function Update_Particles()
	For a.particle = Each particle
		MoveEntity a\ent,a\dx,a\dy,a\dz
		If EntityY(a\ent)<.3 Then 
			a\dy=-a\dy
			a\dy=a\dy*.62
			a\pop=True
		End If
		a\dy=a\dy-.02

		If a\pop Then
			a\alpha=a\alpha-.02
			EntityAlpha a\ent,a\alpha
			If a\alpha<0.05 Then
				FreeEntity a\ent
				Delete a
			End If
		End If
	Next
End Function

;Erase all particles
Function Erase_Particles()
	For a.particle = Each particle
		If a\ent<>0 Then FreeEntity a\ent
	Next
	Delete Each particle
End Function



Sledge(Posted 2004) [#77]

Those who say that it doesn't matter in a real world example just don't know what they are talking about



Nope, on the contrary the answer seems obvious: A look-up table is trivial to add so it is straightforward to let the user decide which method your program should employ. That way owners of 386SX's have total freedom of choice as to whether their machines choke on the maths or just the rendering.


Kevin_(Posted 2004) [#78]
Epicboy....

Hey Prof, is that ambrosia from heaven or urine spattering on your teeth right now? Can you tell the difference?


Wow! I really hit a nerve there didn't I?

Graig....

I cannot test this because I don't have Blitz3D. Can it be modified to use 2D-only functions?


Craig Watson(Posted 2004) [#79]
Qrof, it's easy enough to take out the bits you need.

Add
;Lookup Table
Dim Sin2#(36000000)
Dim Cos2#(36000000)
For a=0 To 35999999
Sin2#(a)=Sin(a)
Cos2#(a)=Cos(a)
Next
;Change this to switch test
Const UseLookup=0

To the beginning of a program and put:
	starmill = MilliSecs()	
	If uselookup=0 Then
		For t = 1 To 300000
		x1#=Sin#(Rnd(0,35999999))
		x2#=Cos#(Rnd(0,35999999))
		Next
	Else
		For t = 1 To 300000
		x1#=Sin2#(Rnd(0,35999999))
		x2#=Cos2#(Rnd(0,35999999))
		Next	
	End If
	result=MilliSecs() - starmill
	Text 0,0,result

Within the main loop.

I tried it in Tracer's asteroid.bb demo under BlitzPlus and the results were far closer, although the CPU again beat the lookup tables on my machine. The difference was pretty negligible though. BlitzPlus in general appears to be slower at maths than Blitz3D.

Those large arrays ran fine on my copy of BlitzPlus by the way.

We're performing huge numbers of calculations here, and it's unlikely any real game is going to do that much. My intention here is only to prove that a real game is likely to perform very differently from a program that calculates Sin and Cos only.

That is why your time is best spent trying to find the real slow points in your code, and not worrying over 5ms of speed over a 300000 iteration loop. In smaller numbers of iterations, the millisecs difference actually reads as 0 on my machine, for both.

Running those code snippets alone on my machine puts CPU in front on Blitz3D at about twice as fast, and under BlitzPlus puts lookups in front by about 10ms. Blitz3D was twice as fast as BlitzPlus overall.


Craig Watson(Posted 2004) [#80]
Additionally, just had a play about (trying to figure out why my results were so very different to the other tests), and the speed of the Lookup results appear to be attributable to the sequential method of accessing them in those tests above.

If you provide the Lookup arrays with more realistic out of order random data, they actually perform far more in line with the CPU results.

Since it's highly unlikely any game is actually going to access lookup arrays sequentially, it appears to be quite a bad idea to use them at all under Blitz3D, and in my tests they have a negligible or sometimes negative effect under BlitzPlus too.
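
For anyone who wants to check the sequential versus out-of-order claim on their own machine, something along these lines should do it. This is a sketch under assumptions, not the code I actually ran: the table size, the number of lookups and the names SinTab, SeqIdx and RndIdx are all made up, and both index lists are built before the clock starts so that the random number generator is not part of the measurement.

[Code]
Graphics 640,480,32,2

Const TABLESIZE = 1000000   ; assumed size, meant to be too big for the cache
Const LOOKUPS = 1000000     ; raise this if both times read as 0

Dim SinTab#(TABLESIZE - 1)
Dim SeqIdx(LOOKUPS - 1)
Dim RndIdx(LOOKUPS - 1)

For a = 0 To TABLESIZE - 1
	SinTab(a) = Sin(a)
Next

; Build both index lists up front: one in order, one scattered
For i = 0 To LOOKUPS - 1
	SeqIdx(i) = i Mod TABLESIZE
	RndIdx(i) = Rand(0, TABLESIZE - 1)
Next

; Pass 1: walk the table in order
start = MilliSecs()
For i = 0 To LOOKUPS - 1
	x# = SinTab(SeqIdx(i))
Next
seqTime = MilliSecs() - start

; Pass 2: jump around the table
start = MilliSecs()
For i = 0 To LOOKUPS - 1
	x# = SinTab(RndIdx(i))
Next
rndTime = MilliSecs() - start

Text 0,0,"Sequential lookups: "+Str(seqTime)+" millisecs"
Text 0,15,"Random lookups: "+Str(rndTime)+" millisecs"
Flip
WaitKey()
End
[/Code]

Both passes pay the same cost for reading the index array, so any gap between the two times should come mostly from the access pattern into SinTab.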


Kevin_(Posted 2004) [#81]
Interesting Graig. I ran your code and my swap file kicked in (only 256 MB Ram). Not surprising considering the large numbers involved here.

Three points to your code....

1. Sin & Cos only need an angle of 0 to 359 degrees so it is rather unfair trying to get the Sin & Cos of an angle like 35999999. Haven't tried using radians so this may be a better test.

2. We only need to Dim the arrays of 360 elements because that is all people use. An angle of 361 degrees is the same as 1 degree.

3. The important thing is the iterations part in your For...Loops because this is what takes the time to process the results.

I've modified your code below which now uses 1,000,000 iterations. Run it once then change your USELOOKUP variable to a 1. The results on my 2GHz Celeron were.....

Where USELOOKUP=0 : Time = 1179 Millisecs

Where USELOOKUP=1 : Time = 167 Millisecs

One interesting point to note is that using lookup tables on my computer was only approx 9 times faster. This may have something to do with the RND() function that you have included.


[Code]
Graphics 640,480,32,2

;Lookup Table
N=1000000

Dim Sin2#(360)
Dim Cos2#(360)

For a=0 To 359
Sin2#(a)=Sin(a)
Cos2#(a)=Cos(a)
Next
;Change this to switch test
Const UseLookup=0

starmill = MilliSecs()
If uselookup=0 Then
For t = 1 To N
x1#=Sin#(Rnd(0,359))
x2#=Cos#(Rnd(0,359))
Next
Else
For t = 1 To N
x1#=Sin2#(Rnd(0,359))
x2#=Cos2#(Rnd(0,359))
Next
End If

result=MilliSecs() - starmill
Text 0,0,result

Flip
WaitKey()
End

[/Code]


Kevin_(Posted 2004) [#82]
Craig... Check the code out below. It is the same as above only now we draw 1000 rectangles after our Sin & Cos loops. The results on my machine were....

Where USELOOKUP=0 : Time 1117

Where USELOOKUP=1 : Time 207

This has got to be conclusive proof that pre-calcing the Sin & Cos of an angle is still lots faster even when using graphical operations. This is a fair test because software 3D engines usually pre-calc sin & cos and then store the results in another array. This new array is then used to plot each vertex (after perspective correction etc). So this could be considered a real world situation.

[Code]

Graphics 640,480,32,2

N=1000000

Dim Sin2#(360)
Dim Cos2#(360)

For a=0 To 359
Sin2#(a)=Sin(a)
Cos2#(a)=Cos(a)
Next
;Change this to switch test
Const UseLookup=1

starmill = MilliSecs()
If uselookup=0 Then
For t = 1 To N
x1#=Sin#(Rnd(0,359))
x2#=Cos#(Rnd(0,359))
Next
; Now Draw 1000 rectangles
For t=1 To 1000
Rect 50,50,100,100
Next
Else
For t = 1 To N
x1#=Sin2#(Rnd(0,359))
x2#=Cos2#(Rnd(0,359))
Next
; Now Draw 1000 rectangles
For t=1 To 1000
Rect 50,50,100,100
Next
End If

result=MilliSecs() - starmill
Text 0,0,result

Flip
WaitKey()
End

[/Code]


Craig Watson(Posted 2004) [#83]
Sin and Cos can take decimals as well, which means you'll need far more than 360 items, unless you don't care about accuracy. The initial post used decimal angles, so presumably more accuracy is required for that poster.
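
If the angles only ever fall on fixed steps, like the 11.25 degree steps mentioned in the first post, the table can be indexed by step number instead of by degree. A minimal sketch, assuming 32 headings (360 / 11.25); the names HEADINGS, StepCos, StepSin and heading are invented for the example:

[Code]
Const HEADINGS = 32              ; 360 / 11.25

Dim StepCos#(HEADINGS - 1)
Dim StepSin#(HEADINGS - 1)

For i = 0 To HEADINGS - 1
	StepCos(i) = Cos(i * 11.25)
	StepSin(i) = Sin(i * 11.25)
Next

; A sprite stores its heading as an index 0..31 instead of an angle
heading = 3                      ; 3 * 11.25 = 33.75 degrees
dx# = StepCos(heading)
dy# = StepSin(heading)

; Turning one step either way wraps around the table
heading = (heading + 1) Mod HEADINGS
heading = (heading + HEADINGS - 1) Mod HEADINGS
[/Code]

That keeps full accuracy for those particular angles with only 32 entries per table, but it stops working as soon as an arbitrary decimal angle is needed.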

The large array was intended to make the test depend upon memory speed also. Unfortunately as pointed out before, 360 elements is small enough to fit in most caches. Running your test on my Athlon 1700 with all cache disabled yielded near identical results under Blitz3D (the lack of cache also made the machine slow as hell.)

Your observation of it being only 9 times faster is due to the fact the array is no longer being read sequentially, as I noted above.

The massive performance differences between Blitz3D and BlitzPlus aren't encouraging. I wouldn't mind knowing what's behind that.

For a potentially larger lookup table, under more realistic conditions (within a game) and for Blitz3D, there appears to be little reason to use a lookup table. Under BlitzPlus, the real performance is similar within a realistic game environment.

If you are using only 360 elements under BlitzPlus, and calling a _lot_ of trig functions, I'd recommend using a lookup array at this stage. Over more elements, the array overhead makes it less worthwhile though.

For those using tens of sin or cos calls within a loop, the results here at least would indicate using a lookup array is a bit of a waste of time; changing your code to do only 1000 iterations should be proof enough of that.

Your best bet, as I and others have repeatedly pointed out, is to profile your code to see which bits are performing poorly, and act accordingly.


Craig Watson(Posted 2004) [#84]
Your above post is massively flawed.

You are batching the operations. That's not at all what happens in a game engine. Interleaving the commands is way more realistic, because rarely would people perform a million trig functions, only to THEN draw something.

And on Blitz3D and BlitzPlus the results become exactly the same, which shows that the weakest link of your test is actually the drawing of the squares, and not the trig functions.

That's exactly what I've been trying to tell you, that in an actual game, with realistic function calls, the net effect of lookups is going to be worse or negligible compared with just using the functions built in to Blitz. Just interspersing the array lookups with even one function call totally ruins any advantage you would have got from hogging the L1 cache with just the array.

Graphics 640,480,32,2
N=100000
Dim Sin2#(360)
Dim Cos2#(360)
For a=0 To 359
    Sin2#(a)=Sin(a)
    Cos2#(a)=Cos(a)
Next
;Change this to switch test
Const UseLookup=0
starmill = MilliSecs()
If uselookup=0 Then
For t = 1 To N
x1#=Sin#(Rnd(0,359))
x2#=Cos#(Rnd(0,359))
Rect 50,50,100,100
Next
Else
For t = 1 To N
x1#=Sin2#(Rnd(0,359))
x2#=Cos2#(Rnd(0,359))
Rect 50,50,100,100
Next
End If
result=MilliSecs() - starmill
Text 0,0,result
Flip
WaitKey()
End



Sledge(Posted 2004) [#85]

Sin and Cos can take decimals as well, which means you'll need far more than 360 items, unless you don't care about accuracy. The initial post used decimal angles, so presumably more accuracy is required for that poster.



Doesn't usually matter... 360 is more distinct degrees than you'll ever need for most stuff and the really important factor is that you can have decimals returned so using parametric equations (and the like) to plot movement remains pretty convincing. If you REALLY want to optimise, you work in ints on a large scale then factor them down for rendering anyway... something else that isn't really worth the bother on a modern PC.


2. We only need to Dim the arrays of 360 elements because that is all people use.



No, it's not. Employing two tables is overkill; instead dim a single array of 450 elements, write the sin of each element to it and you have sin(x) as table(x) AND cos(x) as table(x+90). Much smaller table, same functionality.
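
Roughly like this, as a sketch. It relies only on the identity cos(x) = sin(x + 90); the names Table, FastSin and FastCos are made up for the example, and it assumes whole-degree angles. The wrap-around handles negative angles too, on the assumption that Mod can return a negative result for a negative angle.

[Code]
Graphics 640,480,32,2

; One table: sin for 0..449 degrees, so x+90 never runs off the end
Dim Table#(449)
For a = 0 To 449
	Table(a) = Sin(a)
Next

Text 0,0,"FastSin(30) = "+Str(FastSin(30))
Text 0,15,"FastCos(60) = "+Str(FastCos(60))
Flip
WaitKey()
End

; Sine of a whole-degree angle, wrapped into 0..359
Function FastSin#(a)
	a = a Mod 360
	If a < 0 Then a = a + 360
	Return Table(a)
End Function

; Cosine read from the same table, shifted by 90 degrees
Function FastCos#(a)
	a = a Mod 360
	If a < 0 Then a = a + 360
	Return Table(a + 90)
End Function
[/Code]

Both lines should show a value close to 0.5. Bear in mind that wrapping the lookup in a function adds call overhead of its own, so in a tight loop you may prefer to index Table directly.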


Craig Watson(Posted 2004) [#86]
I suppose we can also say that Pi is 22/7. It depends what people need. I did acknowledge as much anyway.

Your second point isn't lost on me however I think lookup tables are generally useless anyway. I wouldn't bother with them unless I was programming an ARM-7 or 6502 or something.


Sledge(Posted 2004) [#87]

It depends what people need. I did acknowledge as much anyway.



For sure -- I wasn't really saying it for your benefit, more the original poster's.


Warren(Posted 2004) [#88]
Wow! I really hit a nerve there didn't I?

Enjoying the taste of asparagus then? Great.


Kevin_(Posted 2004) [#89]
Interleaving the commands is way more realistic, because rarely would people perform a million trig functions, only to THEN draw something.


That might be the case for a novice programmer who doesn't know about optimization but for professionals there is no way they would do this.....

calc, draw, calc, draw, calc, draw, flip.

It is more efficient to do

calc, calc, calc, draw, draw, draw, flip.

I cannot quote results for Blitz3D although I am quite surprised to see that you get similar time-test results. I don't know why this is happening.

Also...


Your observation of it being only 9 times faster is due to the fact the array is no longer being read sequentially, as I noted above.



Nothing is being read sequentially. The whole idea of lookup tables is the ability to directly MAP to an element in an array. So I do not buy this at all. I suspect the reason for this slowdown is the RND() function.
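
A way to check the RND() suspicion would be to time a loop containing only the Rnd() calls and compare it against the lookup loop. The following is a sketch of that idea rather than something I have results for; the variable names rndOnly and withLookup and the LOOPS constant are invented for the example.

[Code]
Graphics 640,480,32,2

Const LOOPS = 1000000

Dim Sin2#(360)
Dim Cos2#(360)
For a = 0 To 359
	Sin2(a) = Sin(a)
	Cos2(a) = Cos(a)
Next

; Baseline: the same loop with only the Rnd() calls in it
start = MilliSecs()
For t = 1 To LOOPS
	i1 = Rnd(0,359)
	i2 = Rnd(0,359)
Next
rndOnly = MilliSecs() - start

; Lookup loop, same shape as the earlier test
start = MilliSecs()
For t = 1 To LOOPS
	x1# = Sin2(Rnd(0,359))
	x2# = Cos2(Rnd(0,359))
Next
withLookup = MilliSecs() - start

Text 0,0,"Rnd() only: "+Str(rndOnly)+" millisecs"
Text 0,15,"Rnd() plus lookup: "+Str(withLookup)+" millisecs"
Text 0,30,"Difference: "+Str(withLookup - rndOnly)+" millisecs"
Flip
WaitKey()
End
[/Code]

If the first two figures are close, most of the time in the lookup test is being spent in Rnd() and not in the array access.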


Wiebo(Posted 2004) [#90]
I was under the impression that this thread contained a quick poll =] wheee!


Kevin_(Posted 2004) [#91]
Graig...

I ran your code and I got similar results also...

LOOKUP=0: Time 6964 Millisecs
LOOKUP=1: Time 5230 Millisecs

OK, very close results BUT! The code is badly designed as I pointed out above. Lets compare results....

Scenario 1: Calc, Draw, Calc, Draw, Calc, Draw, Flip.
Using Lookups: 5230 Millisecs

Scenario 2: Calc, Calc, Calc, Draw, Draw, Draw, Flip.
Using Lookups: 4072 Millisecs

It appears that the difference is getting less and less the more calculations and more drawing is used. But is this a real world scenario? 200,000 trig functions and 100,000 drawing operations. I would say that the trig functions are an overkill here but I feel that the drawing operations are realistic. There is still over a one second difference on my machine.


Craig Watson(Posted 2004) [#92]
I think you'll find that the more commands you implement aside from those trig functions, the less advantage you will have using lookups. They benefit greatly from being able to use the processor cache and it seems to work best when you have one trig command after the other.

It's not particularly real world, but to do a real world test is going to take a lot of time and effort, and the differences will be too insignificant to measure.

It seems clear enough that under BlitzPlus, when doing a lot of trig operations sequentially, there can be some advantage in using lookup tables, but otherwise your optimisation effort is best spent elsewhere I think, and particularly if you're using Blitz3D, where there is a small or even negative difference.

I still wouldn't mind knowing why there is such a big difference in maths performance between the two though. Blitz3D would come out 5 times faster for maths operations than BlitzPlus in my tests.

Anyway, I've said enough.


Kevin_(Posted 2004) [#93]
Dare I say it? Me too.


Warren(Posted 2004) [#94]
Definitely urine.


Kevin_(Posted 2004) [#95]
You just couldn't resist it could you EpicBOY?


WolRon(Posted 2004) [#96]
Perhaps the time difference is because of the way the two are designed.
Blitz3D is a CPU hog, taking up as many cycles as it can grab.
BlitzPlus is a law-abiding CPU sharer, and lets the operating system get a chance at the cycles once in a while.


Michael Reitzenstein(Posted 2004) [#97]
That might be the case for a novice programmer who doesn't know about optimization but for professionals there is no way they would do this.....

Right, and so EpicBoy, being someone who lives off of the income he gets as a programmer, isn't a 'professional'? Professional programmers are invariably people who get results (which means they justify their paycheck), not people who sit around all day doing pointless optimizations to save a few, mostly worthless, CPU cycles. Honestly, you have no idea what you are talking about.


Kevin_(Posted 2004) [#98]
Yea sure.


Michael Reitzenstein(Posted 2004) [#99]
Glad we agree, then.


Tracer(Posted 2004) [#100]
Wait, isn't EpicBoy a level designer who hobbies with Blitz?

Tracer


Warren(Posted 2004) [#101]
Yes, I'm a level designer, but I also write code.

I write and maintain the level editor, UnrealEd, here at Epic. I also wrote the cut scene camera system we use, Matinee.

Beyond that, I've released 2 games on my own site, Respawn Games, along with a screensaver.

But I'm clearly not up to Prof's standards, so disregard me.


Craig Watson(Posted 2004) [#102]
BlitzPlus is a law-abiding CPU sharer, and lets the operating system get a chance at the cycles once in a while
Not necessarily.

Try doing an infinite loop and see how much CPU blitzcc takes up.

BlitzPlus behaves nice when you program it explicitly to do so, but you can make it just as much of a hog as Blitz3D if you avoid all the event driven stuff.


Kevin_(Posted 2004) [#103]
But I'm clearly not up to Prof's standards, so disregard me.


That's not what it's about. It's about proving your argument and I produced many time tests whereas you didn't produce any.

I am sure you could give plenty of advice on level design because that's what you do. I wouldn't give advice to anyone who wanted to do level design in 3D environments because I am not proficient at it. Therefore I stay clear. Something which you should have done in this Sin / Cos debate.


Warren(Posted 2004) [#104]
You're right. I just can't hold a candle to a programmer of your stature.

I mean, the fact that Epic Games employs me as a programmer tells you right off the bat that I have no idea what I'm talking about. Hell, "Learn C++ in 24 hours" is sitting on my desk and I still haven't opened it! Gah!

And the fact that I've released several shareware products on my own is the clearest indication of all that when it comes to programming, I am clearly a n00b. A stupid n00b, at that.

When's the short bus get here? I need a ride home.


Michael Reitzenstein(Posted 2004) [#105]
Never argue with an idiot. They will bring you down to their level and beat you with experience.


Warren(Posted 2004) [#106]
"Arguing on the internet is like competing in the special olympics. Even if you win, you're still retarded."


Kevin_(Posted 2004) [#107]
Michael..

Never argue with an idiot


That is why I am not arguing with you :-)

Epicboy...

Even if you win, you're still retarded


But it is better than being a loser :-)


Michael Reitzenstein(Posted 2004) [#108]
I am distinctly reminded of an eight year old brat replying "I know you are, you said you are, now what am I?!".


Rob(Posted 2004) [#109]
Hi Prof

Remember that you cannot use lookup tables where any kind of numerical accuracy is needed - for example small angle changes. It's only approximate and that's pretty bad sometimes. I can't see myself using cos and sin much per loop though.

what real world fps differences do you get by using look up tables? not a lot I suppose. I would only add look up tables if that was the slowest part.
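
To put a number on the accuracy point, here is a minimal sketch that compares a whole-degree table against Sin() at quarter-degree angles and reports the largest gap. The 0.25 degree step, the choice to round the angle down to its table slot, and the names SinTab, maxDiff and idx are all assumptions for the example.

[Code]
Graphics 640,480,32,2

Dim SinTab#(359)
For a = 0 To 359
	SinTab(a) = Sin(a)
Next

maxDiff# = 0
ang# = 0
While ang < 360
	idx = Floor(ang)              ; whole-degree slot the table would use
	diff# = Abs(Sin(ang) - SinTab(idx))
	If diff > maxDiff Then maxDiff = diff
	ang = ang + 0.25
Wend

Text 0,0,"Largest error with a 360-entry table: "+Str(maxDiff)
Flip
WaitKey()
End
[/Code]

The result should be on the order of 0.01, which is harmless for moving a sprite a few pixels but adds up quickly if the value is multiplied by a large radius or accumulated every frame.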


Hotcakes(Posted 2004) [#110]
Michael, 8 year old? I still say things like that =]


Michael Reitzenstein(Posted 2004) [#111]
Maturity of an eight year old, then.

:)


Hotcakes(Posted 2004) [#112]
That's better!!!

.

..

...

Heyyyyyy!!!


Kevin_(Posted 2004) [#113]
I am distinctly reminded of an eight year old brat replying "I know you are, you said you are, now what am I?!".


I am distinctly reminded of an adult with the mental age of 8 who just cannot admit when he is wrong. Fear not, you will grow out of it.


skidracer(Posted 2004) [#114]
Instead of locking this thread I'm just going to ban anybody that continues to take this discussion personally.