Regarding performance
| ||
Having a background in other languages where it's the required form, I've always declared my variables in this kind of style in B3D:

    Function test()
        Local a
        a = 24234324
        Return a
    End Function

(For me this makes it easier to keep track of what's going on, and I find it more readable.) However, I recently found out that, due to the underlying C++ and the way it works, this results in up to 15% more time spent in a function compared to declaring the variable and assigning it a value on the same line, i.e.

    Function test()
        Local a = 24234324
        Return a
    End Function

If you're also using a similar style, it's worth checking functions that are called a lot, as there's potentially a fair bit of performance difference. |
| ||
It makes sense that more instructions will take more time. However, over how many iterations was the 15% measured? For a few iterations it may be negligible... |
| ||
Depends on the function. Some of mine that were being called many, many times per frame were up to 15% slower; in others the difference isn't measurable. Some of the languages I've worked with in the past have compiler optimisations that mean it doesn't matter which style you use, as the compiled code is the same, but typically with C++ (and, it seems, by extension B3D) the code is left alone. |
| ||
This is not true! Both functions are translated into the exact same assembler code:

tmp.bb

    Function test()
        Local a
        a = 24234324
        Return a
    End Function

    test()

tmp.asm

    BlitzCC V11.6 (C)opyright 2000-2003 Blitz Research Ltd
    Compiling "tmp.bb"
    Parsing...
    Generating...
    Translating...
        .align 16
    __MAIN
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,4
        mov eax,__DATA
        mov [esp],eax
        call __bbRestore
        sub esp,4
        mov eax,__LIBS
        mov [esp],eax
        call __bbLoadLibs
        call _2_begin
        jmp _2_leave
    _2_begin
        call _ftest
        ret
    _2_leave
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
        .align 16
    _ftest
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,4
        mov [ebp-4],0
        mov [ebp-4],24234324
        mov eax,[ebp-4]
        jmp _3_leave
        mov eax,0
        jmp _3_leave
    _3_leave
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
        .align 4
    __LIBS
        .db "",0
        .align 4
    __DATA
        .dd 0
    Assembling...

tmp2.bb

    Function test()
        Local a = 24234324
        Return a
    End Function

    test()

tmp2.asm

    BlitzCC V11.6 (C)opyright 2000-2003 Blitz Research Ltd
    Compiling "tmp2.bb"
    Parsing...
    Generating...
    Translating...
        .align 16
    __MAIN
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,4
        mov eax,__DATA
        mov [esp],eax
        call __bbRestore
        sub esp,4
        mov eax,__LIBS
        mov [esp],eax
        call __bbLoadLibs
        call _2_begin
        jmp _2_leave
    _2_begin
        call _ftest
        ret
    _2_leave
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
        .align 16
    _ftest
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,4
        mov [ebp-4],0
        mov [ebp-4],24234324
        mov eax,[ebp-4]
        jmp _3_leave
        mov eax,0
        jmp _3_leave
    _3_leave
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
        .align 4
    __LIBS
        .db "",0
        .align 4
    __DATA
        .dd 0
    Assembling...

Please compare - I did not find any difference! And if you don't find any difference either, there cannot be the slightest performance difference. |
| ||
Ooops, good thing you posted that. I was compiling with debugging enabled, which does produce up to a 15% performance difference between the two ways of doing it. With debugging disabled they both produce exactly the same thing. |
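For anyone wanting to check a difference like this themselves, a timing loop along the following lines could be run once with Debug enabled and once with it disabled. This is only a rough sketch: the names `TestSplit`, `TestInline` and `ITERATIONS` and the iteration count are made up for illustration, and `MilliSecs()` only has millisecond resolution, so the loop needs to run long enough for the numbers to mean anything.

```blitzbasic
; Rough timing sketch. All names here are hypothetical, not from the thread.
Const ITERATIONS = 10000000

; Declaration and assignment on separate lines
Function TestSplit()
	Local a
	a = 24234324
	Return a
End Function

; Declaration and assignment on the same line
Function TestInline()
	Local a = 24234324
	Return a
End Function

; Time the split version
Local startMs = MilliSecs()
For i = 1 To ITERATIONS
	TestSplit()
Next
Local splitMs = MilliSecs() - startMs

; Time the inline version
startMs = MilliSecs()
For i = 1 To ITERATIONS
	TestInline()
Next
Local inlineMs = MilliSecs() - startMs

Print "split:  " + splitMs + " ms"
Print "inline: " + inlineMs + " ms"
WaitKey()
```

Running each variant several times and swapping the order of the two loops helps to rule out one-off effects before reading anything into a small gap.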
| ||
tmp.asm (debug mode)

    BlitzCC V11.6 (C)opyright 2000-2003 Blitz Research Ltd
    Compiling "tmp.bb"
    Parsing...
    Generating...
    Translating...
        .align 16
    __MAIN
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,4
        sub esp,4
        mov eax,__DATA
        mov [esp],eax
        call __bbRestore
        sub esp,4
        mov eax,__LIBS
        mov [esp],eax
        call __bbLoadLibs
        call _2_begin
        jmp _2_leave
    _2_begin
        sub esp,12
        lea eax,[ebp]
        mov [esp],eax
        mov [esp+4],3014400
        mov [esp+8],_4
        call __bbDebugEnter
        sub esp,8
        mov [esp],393216
        mov [esp+4],_1
        call __bbDebugStmt
        call _ftest
        ret
    _2_leave
        mov [ebp-4],eax
        mov eax,ebx
        call __bbDebugLeave
        mov ebx,eax
        mov eax,[ebp-4]
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
        .align 16
    _ftest
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,8
        mov [ebp-4],0
        sub esp,12
        lea eax,[ebp]
        mov [esp],eax
        mov [esp+4],2383992
        mov [esp+8],_5
        call __bbDebugEnter
        sub esp,8
        mov [esp],65537
        mov [esp+4],_1
        call __bbDebugStmt
        sub esp,8
        mov [esp],131073
        mov [esp+4],_1
        call __bbDebugStmt
        mov [ebp-4],24234324
        sub esp,8
        mov [esp],196609
        mov [esp+4],_1
        call __bbDebugStmt
        mov eax,[ebp-4]
        jmp _3_leave
        sub esp,8
        mov [esp],262144
        mov [esp+4],_1
        call __bbDebugStmt
        mov eax,0
        jmp _3_leave
    _3_leave
        mov [ebp-8],eax
        mov eax,ebx
        call __bbDebugLeave
        mov ebx,eax
        mov eax,[ebp-8]
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
    _1
        .db "tmp.bb",0
    _4
        .db "<main program>",0
    _5
        .db "test",0
        .align 4
    __LIBS
        .db "",0
        .align 4
    __DATA
        .dd 0
    Assembling...

tmp2.asm (debug mode)

    BlitzCC V11.6 (C)opyright 2000-2003 Blitz Research Ltd
    Compiling "tmp2.bb"
    Parsing...
    Generating...
    Translating...
        .align 16
    __MAIN
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,4
        sub esp,4
        mov eax,__DATA
        mov [esp],eax
        call __bbRestore
        sub esp,4
        mov eax,__LIBS
        mov [esp],eax
        call __bbLoadLibs
        call _2_begin
        jmp _2_leave
    _2_begin
        sub esp,12
        lea eax,[ebp]
        mov [esp],eax
        mov [esp+4],6293016
        mov [esp+8],_4
        call __bbDebugEnter
        sub esp,8
        mov [esp],327680
        mov [esp+4],_1
        call __bbDebugStmt
        call _ftest
        ret
    _2_leave
        mov [ebp-4],eax
        mov eax,ebx
        call __bbDebugLeave
        mov ebx,eax
        mov eax,[ebp-4]
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
        .align 16
    _ftest
        push ebx
        push esi
        push edi
        push ebp
        mov ebp,esp
        sub esp,8
        mov [ebp-4],0
        sub esp,12
        lea eax,[ebp]
        mov [esp],eax
        mov [esp+4],6293136
        mov [esp+8],_5
        call __bbDebugEnter
        sub esp,8
        mov [esp],65537
        mov [esp+4],_1
        call __bbDebugStmt
        mov [ebp-4],24234324
        sub esp,8
        mov [esp],131073
        mov [esp+4],_1
        call __bbDebugStmt
        mov eax,[ebp-4]
        jmp _3_leave
        sub esp,8
        mov [esp],196608
        mov [esp+4],_1
        call __bbDebugStmt
        mov eax,0
        jmp _3_leave
    _3_leave
        mov [ebp-8],eax
        mov eax,ebx
        call __bbDebugLeave
        mov ebx,eax
        mov eax,[ebp-8]
        mov esp,ebp
        pop ebp
        pop edi
        pop esi
        pop ebx
        ret word 0
    _1
        .db "tmp2.bb",0
    _4
        .db "<main program>",0
    _5
        .db "test",0
        .align 4
    __LIBS
        .db "",0
        .align 4
    __DATA
        .dd 0
    Assembling...

It is interesting to see how Mark's compiler handles BASIC code and how the code gets optimised. You can see above why the two versions differ in debug mode: the debugger has to *remember* which line of BASIC code is being executed so it can report where something goes wrong, so every statement gets its own call to __bbDebugStmt. The additional line of BASIC code therefore produces an extra call. |
| ||
Optimised? I think the answer to that is "it doesn't"! My experience so far has largely been that you shouldn't worry about this too much because there are so many factors that affect performance you just can't easily control with Blitz3D. e.g. you can reverse the order of two lines of code with no data dependencies on each other, and see a 20% speedup because the instruction alignments are suddenly better (don't ask for an example, long since lost it). I should also point out that if you really care about assembly, you can use the TCC library (old wrapper) to include "inline" assembly (or C) routines in your application. Doesn't do SSE/AVX though. |
| ||
Don't read too much into that little sentence I added. We are talking about a time some 13 - 14 years ago. I just wanted to point out that Mark was very good at what he did in those days, and that this is the reason we had such a good tool for making games. |
| ||
@Yasha I found a situation like that years ago. I can't remember the exact details now, but having certain commands in a certain order, and alternating between /2 and *0.5, made a massive difference to speed, and not in the way you'd expect. AFAIK it was just down to instruction alignment. |
| ||
Do these sorts of issues really cause a problem? Won't the major source of slowdowns be somewhere else, usually terribly inefficient algorithms working on large amounts of data, or rendering? |
| ||
How did you generate the assembly output from the blitz compiler? |