
This code has been declared by its author to be Public Domain code.


FAST bank to string by Yasha (2012)
It's not difficult to copy data between a bank and a string in pure Blitz Basic. Just loop over the characters or bytes and either poke each ASCII value into the bank, or append Chr() of each byte to the string being built.

Unfortunately, this is really slow, because the automatic string memory management behind the scenes has to do a lot of copying and reallocation.
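
For reference, the by-hand approach described above might look something like the sketch below. The function names StringToBank_SLOW and BankToString_SLOW are made up for illustration and are not part of the original entry; only standard Blitz3D commands are used.

```blitzbasic
; Hypothetical "slow" versions, written in pure Blitz Basic for comparison.

Function StringToBank_SLOW(s$)
	Local n = Len(s)
	Local b = CreateBank(n)
	Local i
	For i = 1 To n
		; Blitz strings are 1-based, bank offsets are 0-based
		PokeByte b, i - 1, Asc(Mid$(s, i, 1))
	Next
	Return b
End Function

Function BankToString_SLOW$(b)
	Local s$ = ""
	Local i
	For i = 0 To BankSize(b) - 1
		; every append produces a new, longer string
		s = s + Chr$(PeekByte(b, i))
	Next
	Return s
End Function
```

The string-building loop is the expensive half, since each pass through it creates another intermediate string.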

However, with not one but two dirty tricks, we can get the system runtime libraries to do the hard work for us, resulting in a pair of copy functions that are something ridiculous like fifty times faster than doing it by hand. This could come in very handy when looping over all of the characters in a string!

Here's a speed comparison:
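
A minimal timing sketch along these lines (this is not the author's original listing). It assumes the .decls below are installed and that the StringToBank_FAST and BankToString_FAST wrappers have been uncommented in your program; SIZE and REPS are arbitrary values chosen for illustration.

```blitzbasic
; Hypothetical timing harness. SIZE and REPS are arbitrary.
Const SIZE = 10000
Const REPS = 100

Local src$ = String$("x", SIZE)
Local b = StringToBank_FAST(src)
Local i, j, t, s$

; Slow path: rebuild the string one Chr$() at a time
t = MilliSecs()
For i = 1 To REPS
	s = ""
	For j = 0 To SIZE - 1
		s = s + Chr$(PeekByte(b, j))
	Next
Next
Print "Slow: " + (MilliSecs() - t) + " ms"

; Fast path: let the userlib glue code build the string
t = MilliSecs()
For i = 1 To REPS
	s = BankToString_FAST(b)
Next
Print "Fast: " + (MilliSecs() - t) + " ms"

FreeBank b
WaitKey
```

Actual timings will depend on the machine and on the string length chosen.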



You need the userlib .decls given below, but the trick itself only uses system DLLs, so there are no extra files to package with your program.

Important note: B3D and C handle strings differently, and the userlib interface assumes C conventions. As a result, API_BankToString_ will not work correctly on any bank containing a 0 byte, which is the terminator for C strings: the returned string stops at the first 0. It also won't work correctly, and may crash, if your bank is not null-terminated. I strongly recommend using the wrapper functions, which perform the necessary safety check, instead of calling the raw API functions directly.

API_StringToBank_ will still work as expected for B3D strings containing Chr(0), however, and this is likely to be the more useful of the two.
; The decls:

; Quickly copy data from string to bank and bank to string using system functions
; Both of these functions are more usefully wrapped, to hide the dirty details

.lib "Kernel32.dll"

API_StringToBank_(Dest*, Src$, Sz%) : "RtlMoveMemory"

.lib "msvcrt.dll"

; Dest and Src should ideally be the same bank (dirty hack)
API_BankToString_$(Dest*, Src*, Sz%) : "memmove"

; The wrappers:
;
;Function StringToBank_FAST(s$)
;	Local b = CreateBank(Len(s))
;	API_StringToBank_ b, s, BankSize(b)
;	Return b
;End Function
;
;Function BankToString_FAST$(b)    ;The extra code around the API function here is important!
;	Local hasExtra = False, sz = BankSize(b)
;	If sz = 0
;		ResizeBank b, 1 : PokeByte b, 0, 0 : hasExtra = True
;	ElseIf PeekByte(b, sz - 1)
;		ResizeBank b, sz + 1 : PokeByte b, sz, 0 : hasExtra = True
;	EndIf
;	Local res$ = API_BankToString_(b, b, 0)    ; 0 should technically be fine here
;	If hasExtra Then ResizeBank b, sz
;	Return res
;End Function
;
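
A short usage sketch, assuming the .decls file above is in your userlibs folder and the two wrapper functions have been uncommented and pasted into your program:

```blitzbasic
; Round trip through the wrappers
Local b = StringToBank_FAST("Hello, bank!")
Print BankSize(b)                  ; one byte per character, no terminator stored

Local s$ = BankToString_FAST(b)    ; wrapper appends a 0 byte if needed, then removes it again
Print s
Print BankSize(b)                  ; bank size is back to what it was before the call

FreeBank b
WaitKey
```

Because the wrapper restores the bank size before returning, the bank can be reused afterwards as if nothing had been added to it.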

Comments

* (2012)
Couldn't you use memcpy, or doesn't that work?


Yasha (2012)
memcpy works just as well on my machine, but it also provided no noticeable speedup, so I figured it's better to put the one that doesn't rely on Undefined Behaviour in the public archive.

It's actually just using memmove/memcpy as an identity function, because it happens to return the dest buffer. In fact, I just changed it to pass a 0 for "bytes to copy", which should be fastest. The work of actually creating the new string is done entirely by B3D's automated glue code for userlib functions that return strings. That's why it's a dirty, sneaky trick: tell B3D that the returned value is a char *, and it will create a string from whatever the function returns (the untouched contents of the bank).

TBH when you're talking about an order-of-magnitude speedup from the "slow copy", the difference between memmove and memcpy is a seriously minor detail, anyway.


virtlands (2013)
"FAST bank to string by Yasha"

This is awesome code. I just tried it and received the following results on my PC:

Fast code timing = 0.411 sec
Slow code timing = 25.500 sec

The difference is that the Fast functions are about 62 times faster than the slow B3D code.

This combines well with Yasha's other topic of using pointers on strings:

String banks/data buffers by Yasha
http://www.blitzbasic.com/codearcs/codearcs.php?code=3017#comments

;;-------------------------------------
Function BankToString_FAST$(b)
Return API_BankToString_(b, b, 0) ; 0 should technically be fine here
End Function

In that function, I'm assuming that a B3D string of the appropriate length is created automatically.

In case those banks have zeroes in them, we could perhaps temporarily replace the zeroes with a stand-in value that doesn't otherwise occur in the bank.
The rough part would then be to turn the stand-ins back into zeroes in the destination string.
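
One way the stand-in idea could be sketched is shown below. BankToString_ZEROSAFE is a made-up name, not part of the archive entry; it assumes the BankToString_FAST wrapper is available, and it simply gives up if every non-zero byte value already occurs in the bank.

```blitzbasic
; Hypothetical helper: copies a bank that may contain 0 bytes into a string.
Function BankToString_ZEROSAFE$(b)
	Local used[255]
	Local i, standIn = 0, sz = BankSize(b)

	; Record which byte values occur in the bank
	For i = 0 To sz - 1
		used[PeekByte(b, i)] = True
	Next
	If used[0] = False Then Return BankToString_FAST(b)   ; no zeroes, nothing special to do

	; Pick a non-zero byte value that never occurs
	For i = 1 To 255
		If used[i] = False
			standIn = i
			Exit
		EndIf
	Next
	If standIn = 0 Then RuntimeError "No unused byte value available as a stand-in"

	; Swap zeroes for the stand-in, convert, then restore the bank
	For i = 0 To sz - 1
		If PeekByte(b, i) = 0 Then PokeByte b, i, standIn
	Next
	Local res$ = BankToString_FAST(b)
	For i = 0 To sz - 1
		If PeekByte(b, i) = standIn Then PokeByte b, i, 0
	Next

	; Turn the stand-ins in the result back into Chr$(0)
	Return Replace$(res, Chr$(standIn), Chr$(0))
End Function
```

The extra passes over the bank are ordinary Blitz loops, so this will likely be slower than the plain fast path, though it still avoids building the string one character at a time.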


virtlands (2013)
Oh my! I just found some errors with the Fast functions.

I tested the Fast functions with strings that vary in length from 0 to 1000.

As seen in the screenshots, the BankToString_Fast() function sometimes returns inaccurate results:

The first screenshot shows about 7 errors.
[ String lengths are shown from 0 to 32; errors happen at bank lengths 0, 16, 17, 18, 19, 20 and 32. ]


Testing code for string lengths 0 to 64


Here's a more streamlined program for testing string lengths 0 to 1000.
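
A round-trip test of this kind might be sketched as follows (this is not the poster's original program). It assumes the .decls are installed and that StringToBank_FAST and BankToString_FAST are defined in the program; the fill character "a" is an arbitrary choice.

```blitzbasic
; Hypothetical round-trip check for string lengths 0 to 1000
Local n, errors = 0, b
Local src$, res$

For n = 0 To 1000
	src = String$("a", n)
	b = StringToBank_FAST(src)
	res = BankToString_FAST(b)
	If res <> src
		errors = errors + 1
		Print "Mismatch at length " + n + ": got length " + Len(res)
	EndIf
	FreeBank b
Next

Print "Mismatches: " + errors
WaitKey
```

Comparing against the source string catches both truncated results and results with extra trailing data.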


The errors happen kind of randomly...


Yasha (2013)
Well that's interesting. Bears investigation. Let me get back to ya on that...


EDIT: well, I have an answer for you.

The problem is with API_BankToString_.

Banks are not null-terminated; the data just goes right up to the end. However, B3D's .decls interface glue code (which is the part doing all the work here) expects to be given a null-terminated string. This is why it will return a shorter-than-expected string when the bank contains a 0 byte.

The converse is also in effect: if the end of the bank isn't a 0 byte, the glue code will keep copying string data until it finds one, whether that's still within the bounds of the bank or not! That's why the errors consist of extra data beyond the expected string, but everything put into it originally is still returned.

The obvious fix - to simply trim the result string down to the expected length - is not safe, because if there are no 0 bytes in between the end of the string and unallocated memory, the program will crash. The safe thing is to test the end of the bank to see if it's a 0, and if not, append one (then remove it afterwards because it wasn't part of the user's original data).

This is a C-programming rookie mistake and I'm embarrassed to have left it in there (in my defence, that's because BankToString already has the showstopper bug that it can't copy all strings anyway, so I didn't think anyone would use it).

I will make the necessary change to the example code and add a note explaining that users should copy that whole function.


virtlands (2013)
Good luck on that.

Local teststr$, S$, B,L

L = Len(teststr)
B = stringtoBank_Fast( teststr )
S = banktoString_Fast(B)
S = Left$( S, L )
;; <----

Simply forcing the resulting string S to be the same length as the original teststr may be a temporary way of doing it.


Yasha (2013)
It is done.

(As I said in comment #5, using Left$ will get the right string data out, but it won't prevent the possibility of a crash.)

