Inline assembly problems
| ||
I've been running tests with inline assembly in Max. I can make it work with ordinary integer code, but when I try to use FPU instructions, I keep hitting a wall.

Here's what works:

test1.c

    int add(int x, int y)
    {
        asm (
            "addl %1, %0"
            : "=r"(x)
            : "m"(y), "0"(x)
        );
        return x;
    }

test1.bmx

    Import "test1.c"

    Extern
        Function add:Int(x:Int,y:Int)
    End Extern

    Print add(10,20)

Here's what doesn't work:

test2.c

    float sinx( float degree )
    {
        float result, two_right_angles = 180.0f ;
        __asm__ __volatile__ (
            "fld %1;"
            "fld %2;"
            "fldpi;"
            "fmul;"
            "fdiv;"
            "fsin;"
            "fstp %0;"
            : "=g" (result)
            : "g"(two_right_angles), "g" (degree)
        ) ;
        return result ;
    }

test2.bmx

    Import "test2.c"

    Extern
        Function sinx:Float(degree:Float)
    End Extern

    Print sinx(10)

The compiler keeps giving me "Error: suffix or operands invalid for `fstp'". I've been searching for something on this for hours, but I'm still at square one. I have a hunch it's trying to tell me something about converting the result to float, but I wouldn't even know what to do about it.
| ||
This probably has more to do with the operand constraint than with converting the result: "=g" lets the compiler pick a general-purpose register for %0, and fstp can only store to memory or to an FPU stack register, not to something like %eax. Requiring a memory operand with "=m" works:

    float sinx( float degree )
    {
        float result, two_right_angles = 180.0f ;
        __asm__ __volatile__ (
            "fld %1;"
            "fld %2;"
            "fldpi;"
            "fmul;"
            "fdiv;"
            "fsin;"
            "fstp %0;"
            : "=m" (result)
            : "g"(two_right_angles), "g" (degree)
        ) ;
        return result ;
    }
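For anyone who wants to rule out constraint surprises on the input side as well, here is a minimal sketch that pins every operand to memory with "m" and spells out the operand size with the `s` suffix (flds, fmuls, fstps). It is not the code from the post: the name sinx_deg and the precomputed deg_to_rad factor are my own assumptions, and the fldpi/fdiv sequence is swapped for a single multiply to keep the FPU stack usage easy to follow. It assumes GCC or Clang targeting x86 or x86-64.

```c
/* Sketch only: sinx_deg and deg_to_rad are hypothetical names, not from the
 * original post. Requires GCC or Clang on x86 / x86-64 (x87 FPU). */
float sinx_deg(float degree)
{
    float result;
    /* Assumption: replace the fldpi/fdiv sequence with one precomputed factor. */
    float deg_to_rad = 3.14159265f / 180.0f;

    __asm__ (
        "flds  %1\n\t"   /* push degree:        st0 = degree           */
        "fmuls %2\n\t"   /* scale to radians:   st0 = degree * pi/180  */
        "fsin\n\t"       /* st0 = sin(st0)                             */
        "fstps %0"       /* pop st0 into result, FPU stack is balanced */
        : "=m"(result)                   /* output must be a memory operand */
        : "m"(degree), "m"(deg_to_rad)   /* inputs forced to memory as well */
    );
    return result;
}
```

On the BlitzMax side this should import the same way as test2.c, with the Extern declaration naming sinx_deg instead of sinx.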
| ||
Thank you very much, this worked!

I did some tests with a distance calculation between two 2D points, and it appears it doesn't need asm or even C optimization. The speed results came out so close to each other that it doesn't matter. Actually, the version in C came out fastest. :( The GCC optimizer really knows what it is doing... Or I'm doing something stupid in asm. And the Max version was only 0.5% slower than the C version.

Maybe this will come in handy some day.
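The asm used in that distance test was not posted, so the following is only a guess at what such a routine might look like, using the same "m" constraint approach. The name dist2d_x87 and the parameter layout are assumptions. It assumes GCC or Clang on x86 or x86-64.

```c
/* Sketch only: dist2d_x87 is a hypothetical name; the poster's actual test
 * code is not shown in the thread. Requires GCC or Clang on x86 / x86-64. */
float dist2d_x87(float x1, float y1, float x2, float y2)
{
    float dx = x2 - x1;   /* the subtraction is left to the compiler */
    float dy = y2 - y1;
    float result;

    __asm__ (
        "flds  %1\n\t"               /* st0 = dx                         */
        "fmuls %1\n\t"               /* st0 = dx*dx                      */
        "flds  %2\n\t"               /* st0 = dy,    st1 = dx*dx         */
        "fmuls %2\n\t"               /* st0 = dy*dy, st1 = dx*dx         */
        "faddp %%st, %%st(1)\n\t"    /* add and pop: st0 = dx*dx + dy*dy */
        "fsqrt\n\t"                  /* st0 = sqrt(st0)                  */
        "fstps %0"                   /* pop into result, stack balanced  */
        : "=m"(result)
        : "m"(dx), "m"(dy)
    );
    return result;
}
```

A routine this small leaves little for hand-written asm to win, since the compiler can emit much the same handful of instructions for the plain C expression, which would fit the timings reported above.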
| ||
Are you aware that BlitzMax produces an asm file from your .bmx source code? You can look at those files (.s in the .bmx folder) and see just what the BlitzMax compiler is doing. If you're looking for optimizations, I'd suggest seeing what the BlitzMax compiler is doing with your external code calls. This will also let you try restructuring your BlitzMax source and possibly come up with a faster, more efficient routine, since you'll have a better understanding of what the BlitzMax compiler is doing behind the scenes.
| ||
Yes, I did take a look at the generated .s files when I was searching for hints about where I was going wrong. The BlitzMax compiler was emitting unnecessary fxch instructions, but it seems the performance hit is very marginal. Exchanging values on the stack is probably a very fast operation for the FPU.

I'll see if I can boost the performance of some more complex math problems, like the quadratic formula.