Inline assembly problems

Jasu(Posted 2010) [#1]
I've been running tests with assembly stuff in Max. I can make it work with usual integer stuff, but when I try to use FPU instructions, I keep hitting a wall. Here's what works:

test1.c
int add(int x, int y) {
   asm
      (
      "addl %1, %0"
      : "=r"(x)
      : "m"(y), "0"(x)
      );
   return x;
}

test1.bmx
Import "test1.c"
Extern
   Function add:Int(x:Int,y:Int)
End Extern

Print add(10,20)


Here's what doesn't work:

test2.c
float sinx( float degree ) {
    float result, two_right_angles = 180.0f ;

    __asm__ __volatile__ ( "fld %1;"
                            "fld %2;"
                            "fldpi;"
                            "fmul;"
                            "fdiv;"
                            "fsin;"
                            "fstp %0;" : "=g" (result) : 
				"g"(two_right_angles), "g" (degree)
    ) ;
    return result ;
}

test2.bmx

Import "test2.c"
Extern
   Function sinx:Float(degree:Float)
End Extern

Print sinx(10)


The compiler keeps giving me "Error: suffix or operands invalid for `fstp'". I've been searching for something on this for hours, but I'm still at square one. I have a hunch it's trying to tell me something about converting the result to float, but I wouldn't even know what to do about it.


Otus(Posted 2010) [#2]
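Probably has to do with the output constraint rather than with converting to float: "=g" lets GCC pick a general-purpose register for %0, and fstp can only store to memory or to another FPU stack register. With "=m" the result is forced into memory, and this works: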
float sinx( float degree ) {
    float result, two_right_angles = 180.0f ;

    __asm__ __volatile__ ( "fld %1;"
                            "fld %2;"
                            "fldpi;"
                            "fmul;"
                            "fdiv;"
                            "fsin;"
                            "fstp %0;" : "=m" (result) : 
				"g"(two_right_angles), "g" (degree)
    ) ;
    return result ;
}
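
A variant of the same routine, as a sketch only. It assumes GCC-style extended asm on an x86 or x86-64 target. Every operand gets an "m" constraint so that both the loads and the store see memory operands, the operand size is spelled out with the s suffix, and the two FPU registers used as scratch space are listed as clobbers. The names sinx_deg and half_turn are made up for this example.

```c
/* Sine of an angle given in degrees, computed on the x87 FPU.
   Assumption: x86 or x86-64 target, GCC-style extended asm. */
float sinx_deg(float degree) {
    float result;
    float half_turn = 180.0f;   /* degrees in pi radians */

    __asm__ __volatile__(
        "flds  %1\n\t"      /* st0 = degree                     */
        "fldpi\n\t"         /* st0 = pi, st1 = degree           */
        "fmulp\n\t"         /* st0 = degree * pi (pops once)    */
        "fdivs %2\n\t"      /* st0 = degree * pi / 180          */
        "fsin\n\t"          /* st0 = sin(angle in radians)      */
        "fstps %0"          /* store 32-bit result, pop st0     */
        : "=m"(result)                  /* output must live in memory */
        : "m"(degree), "m"(half_turn)   /* inputs read from memory    */
        : "st", "st(1)"                 /* x87 slots used as scratch  */
    );
    return result;
}
```

It should import into BlitzMax the same way as sinx in test2.bmx, with the Extern declaration renamed to match.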



Jasu(Posted 2010) [#3]
Thank you very much, this worked!

I did some tests with distance calculation between two 2D points, and it appears it doesn't need asm or even C optimization. The speed results came out so close to each other that it doesn't matter. Actually, the C version came out fastest. :( The GCC optimizer really knows what it is doing... Or I'm doing something stupid in asm. And the Max version was only 0.5% slower than the C version.

Maybe this will come in handy some day.
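
The test code itself isn't shown in this thread, so the following is only a guess at what an asm distance routine of this kind might look like. It assumes GCC on x86 or x86-64 and uses the "t" constraint (top of the FPU stack) with a matching "0" input, so GCC handles the load and store and the asm body is a single fsqrt. The name dist2d_x87 is hypothetical.

```c
/* Distance between (x1, y1) and (x2, y2), with the square root
   taken by the x87 fsqrt instruction.
   Assumption: x86 or x86-64 target, GCC-style extended asm. */
float dist2d_x87(float x1, float y1, float x2, float y2) {
    float dx = x2 - x1;
    float dy = y2 - y1;
    float sum = dx * dx + dy * dy;   /* squared distance, plain C */
    float root;

    /* "=t" puts the output in st(0); "0" asks GCC to load the
       input into that same register before the asm runs. */
    __asm__("fsqrt" : "=t"(root) : "0"(sum));
    return root;
}
```

With only one instruction left in the asm body, there is little for hand-written code to gain over what the compiler generates for the same function in plain C, which fits the timings above.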


Shortwind(Posted 2010) [#4]
You are aware that BlitzMax produces an asm file from your .bmx source code? You can look at those files (.s in the .bmx folder) and see just what the BlitzMax compiler is doing.

If you're looking for optimizations, I'd suggest seeing what the BlitzMax compiler is doing with your external code calls. That will also let you try restructuring your BlitzMax source and possibly come up with a faster, more efficient routine, since you'll have a better understanding of what the BlitzMax compiler is doing behind the scenes.


Jasu(Posted 2010) [#5]
Yes, I did take a look at the generated .s files when I was searching for hints about where I was going wrong. The BlitzMax compiler was emitting unnecessary fxch instructions, but it seems the performance hit is very marginal. Exchanging values on the stack is probably a very fast operation for the FPU. I'll see if I can boost the performance of some more complex math problems, like the quadratic formula.