NG extern struct example...

col(Posted 2016) [#1]
Hey,

NG is getting better and better. I love it!!
I'm especially loving the constructor parameters, and method/function overloads - just awesome.

But ;-)
Do you have an example use case for extern structs, please? My brain is melting and I can't think how to piece things together.

Cheers!!!

Dave.


Derron(Posted 2016) [#2]
Isn't it just similar to:


Extern
?bmxng
  Struct
?not bmxng
  Type
?

    field bla:int
    field blubb:int

?bmxng
  End Struct
?not bmxng
  End Type
?
End Extern


Example usage (without vanilla-compatibility, as other things surely were adjusted too):
https://github.com/bmx-ng/pub.mod/blob/master/lua.mod/lua.bmx

Edit: this uses my suggested variant:
https://github.com/maxmods/bah.mod/blob/master/fmod.mod/common.bmx



bye
Ron


col(Posted 2016) [#3]
Thanks Derron,

Feeling silly now...
I don't know why but I was thinking of stack based structs both sides of the api lol. AND the clue is in the name - "extern" LOL.


Brucey(Posted 2016) [#4]
I was thinking of making "extern" optional for this, as it's not exclusively an external struct. You can use them anywhere...

SuperStrict

Framework brl.standardio

Local v1:SVec2
v1.x = 100
v1.y = 150

Local v2:SVec2
v2.x = 10
v2.y = 80

Local v:SVec2 = sub(v1, v2)

Print v.x + ", " + v.y


Extern
Struct SVec2
	Field x:Float
	Field y:Float
End Struct
End Extern

Function sub:SVec2(v1:SVec2, v2:SVec2)
	Local v:SVec2
	v.x = v1.x - v2.x
	v.y = v1.y - v2.y
	Return v
End Function


Note too, that the generated name is "SVec2", rather than being mangled. Perhaps if we were to accept non-extern for this, we might want to mangle the name... or we could leave them the same.


col(Posted 2016) [#5]
@Brucey
I nearly missed your reply here.

Holy crap! I didn't realise you could do that.

For code clarity I'd remove the need for Extern - definitely. The way 'Max uses Extern implies that the variable is created externally but this is direct in the language itself.

This could have a huge impact on what I'm doing here - especially if/when we have 128-bit datatypes and structs are also 16-byte aligned ;-)


Derron(Posted 2016) [#6]
Col, but as Brucey wrote: if you remove "extern", the "SVec2" name becomes something unpredictable for "C" (read: it might change in later incarnations of BCC with a maybe "bugfixed"-mangler).

So when interfacing with BMax code from C you might run into problems at some point.


Just wanted to emphasize the difference so it does not get missed ;-)


bye
Ron


col(Posted 2016) [#7]
Ahh ok, I see, still early days.
I'll have a play here and see how things work out.

Thanks for the heads up :O)


col(Posted 2016) [#8]
Brucey,
Have you tried your example in a debug build? ;-)


Brucey(Posted 2016) [#9]
Have you tried your example in a debug build?

Apparently not! :-p


Brucey(Posted 2016) [#10]
Should be working now...


I actually can't remember why I am not name-mangling the Structs. I'm sure I had a good reason at the time - at least, I must have believed that to be the case.

So it may as well just mangle the names regardless. Unless someone has a reason for doing otherwise?

@alignment
I'd suggest only 16-byte aligning structs if they are at least 16 bytes in size. Otherwise one is potentially wasting space. Unless you have a good reason to 16-byte align smaller structs?

@128-bit numbers.
Are there 128-bit ints in 32-bit architectures? Or should I assume we would only enable this for 64-bit (x64) targets?
On those not supporting it the type would not be defined, and should fall over at compile time - eg. unknown type Int128.
Talking of names... what types would you like, exactly?

@more Structs
I should also note, if I haven't before, that you can't really put Object refs in a Struct - they will not be visible to the GC, since the struct is on the stack. Well, that is as far as I understand it - in which case, YMMV.


col(Posted 2016) [#11]
@alignment
Yes I agree.

@128bit numbers
I'm sure the later 32bit x86 intels have them, I'm not sure on the other cpus that NG compiles to though :/ edit: personally I'd do as you suggest and have them only for x64 builds.

I've been thinking about this all night and I'm not sure how this will work out. My thoughts are totally on using a 128-bit datatype to pack floats and ints. Getting data into them, extracting it and keeping it there is where my thoughts are... For example, in C++ you have the SSE intrinsic functions for moving data between your variables and an xmm register, and there are more functions for the actual data manipulation/bit swapping etc. We don't have those in 'Max so I'm not sure how the syntax for that would work out in order to keep the benefit of using vectors in the first place. Maybe just allow the datatype to be defined on the stack and drop out to C/C++ to do the SSE work? I don't know - I'm just throwing thoughts around. What are your thoughts?

@structs
I did initially try an array in one but it wouldn't compile, and I also thought it was because of garbage handling. There will be limits for sure. Maybe in the future something may be developed there.

edit2:
@name mangling
If you introduce name mangling for structs what syntax would we use if we want to use them in both external code and 'max code?


Brucey(Posted 2016) [#12]
@arrays
Yes, I am intending to support "arrays", and was thinking along the lines of :
Struct WithArray
  Field x:Float
  Field y:Float
  Field arr:Byte[100]
End Struct

Where you would indicate the size of the array in the Struct, just as you might in a normal C struct...

For undetermined sizes, I'd probably go with a size field and a byte ptr to a block of memory.
(now, you'd probably have to manage that memory yourself though)

@intrinsics
Wouldn't it "simply" (heh) be a case of exposing the intrinsic functions in BlitzMax?...
Extern
    Function _mm_setr_epi32:Int128(e3:Int, e2:Int, e1:Int, e0:Int) = "_mm_setr_epi32"
End Extern

Local num:Int128 = _mm_setr_epi32(10, 20, 30, 40)

This assumes that "Int128" is in fact mapped to __m128i. Which it could be, if that made life easier?

Then one could have a Pub.Intrinsics module or some such with all the functions declared there...


col(Posted 2016) [#13]
@arrays
I think the arrays would work out ok if you just allow arrays of 'non object' types to keep it simple, then as you say maybe use a pointer for other stuff that the coder can manage themselves.

@intrinsics
Of course lol - it would be a case of exposing the functions! Talk about me over-thinking things *rolls eyes*

@mapping
In that case then Int128 and Float128 could map directly to __m128i and __m128 respectively - otherwise it may be a fruitless tree. Otherwise each time they are used the assembly would create a 'load' and 'store' - we want to keep away from those as far as possible if we can ;-)

@Pub.Intrinsics
Sounds good! Do you want me to write up the functions to help out?

edit-
ps To really leverage the advantage of using xmm registers we may need another calling convention for functions - the __vectorcall convention which prefers to pass xmm registers by value as opposed to passing them by reference. It may be a bit much at this time though? :p


Brucey(Posted 2016) [#14]
Here's a little test example...
SuperStrict

Framework brl.standardio

Local a:Int128 = _mm_set_epi32(10, 20, 30, 40)
Local b:Int128

b = a + a

Local c:Int Ptr = Varptr b

Print c[0]
Print c[1]
Print c[2]
Print c[3]


Extern
	Function _mm_set_epi32:Int128(e3:Int, e2:Int, e1:Int, e0:Int) = "_mm_set_epi32"
End Extern

produces the following on OS X...
80
60
40
20


Not very scientific... but it appears to work so far.

There's lots of stuff you are not allowed to do with them - like assign values (Int constant) - which I haven't implemented yet, so it will error at C compilation time, rather than when bcc is processing it.


col(Posted 2016) [#15]
Excellent stuff! Looks good!

Not very scientific... but it appears to work so far.

When it comes down to brass tacks, they're nothing special at all, just like normal registers they are a bit pattern, except they can be used with the dedicated sse instructions to parallelise basic math and bit shifting. If you can write code to keep variable data in the xmm registers and write code to manipulate those registers efficiently then the speed up is usually over 300% compared to regular math. This can make a very significant difference when working on many vectors and matrices which are incredibly math heavy. They can also be used for shaders parameters too - as I say they are just a bit pattern :-)

If you want I can take a look at the assembly to make sure that gcc is doing its job properly with regard to the addition operator that you have there? It may be that to produce better assembly you need to use the _mm_add_epi32 intrinsic, which should produce a single 'padd' assembly instruction - 4 adds for 1 instruction. Feel free to email me or maybe you'd create a separate branch? Or, knowing you, you have almost finished it already :D


Brucey(Posted 2016) [#16]
you have almost finished it already

uh, yeah... you probably want to get the latest brl/pub/bcc and have a play.
I should add a Double128 (__m128d) at some point too, just to complete the set.

It may be that to produce better assembly you need to use the _mm_add_epi32 intrinsic

Indeed. It was just an example :-)
I was surprised straight addition worked at all, to be fair.


col(Posted 2016) [#17]
Haha, why am I not surprised ;-)

For sure.. I'll update as soon as my broadband connection comes back to life.

I was surprised straight addition worked at all, to be fair

GCC should be intelligent enough to spit out the right instructions already. I'd be surprised if it wasn't, but you never know.

Awesome job!!


col(Posted 2016) [#18]
First impressions are that it's working absolutely great :-)
It was a little strange viewing the assembly initially as it's so aggressively optimised, but I have some functions that produce very good output.

Any bugs I come across I'll post on the repo.
Awesome awesome job! Thank you!

I'll get to work making up a module...

edit: just for the record, yes the add operator did what it was supposed to do too :^)


Brucey(Posted 2016) [#19]
Added Double128 (__m128d) and added a load of SSE2 Double128 functions to pub.intrinsics.

Note that any functions with "const *" parameters will need to be declared in intrinsics.x too (see contents for example).
I also didn't implement any __m64 functions, since I haven't added any 64-bit-sized intrinsic types.

I can do the SSE and remaining SSE2 functions, if you like? Unless you've done most of them already?

:o)


col(Posted 2016) [#20]
Hah,

I have the complete set of sse functions all done - sse/sse2/sse3/ssse3/sse41/sse42 - complete with a little 'make' utility to automate the process, just tidying things up now :D

There are some conflicts that I've mentioned in the repo :p
I asked about using a .x file there - I'll add that in.

Edit...
In the .x file is there a specific syntax that needs to be adhered to? I notice that you mix some 'Max types with c types ( in other .x files )

Also, the '!' at the end of each line, what's that for? Some kind of 'end of function' or EOL marker?

Cheers!


col(Posted 2016) [#21]
Brucey,
Is your email in your blitz profile still valid?


Derron(Posted 2016) [#22]
Yes it is.

At least he tends to respond to mails at this address ;-)


bye
Ron


Brucey(Posted 2016) [#23]
Is your email in your blitz profile still valid?

Yes it is :-)


col(Posted 2016) [#24]
Wow,
Well that didn't take long eh.

BlitzMax NG now supports 99% of all sse/sse2/sse3/ssse3/sse4.1/sse4.2 instructions with new Float64, Float128, Double128, Int128 datatypes on x64 builds.
There are hundreds of new specific functions to use, so be sure to bookmark Intel's reference page for guidance.

:O)