String terminators

_PJ_(Posted 2010) [#1]
What 'control code' is used to recognise a string terminator?

I had assumed it was Chr(0), but somehow, this is bypassed with the following example:

d$=CurrentDir()
wf=WriteFile(d$+"file.txt")
WriteString wf,"Hello "+Chr(0)+"World"

CloseFile wf

rf=ReadFile(d$+"file.txt")
s$=ReadString(rf)

CloseFile rf
Print s$



Yasha(Posted 2010) [#2]
I'm pretty sure Blitz doesn't use terminators, and just records the string's length internally somewhere in the string's (inaccessible) data structure. Consider the way strings are written as data to files or streams.


Zethrax(Posted 2010) [#3]
There's no string termination character for Blitz3D's internal strings. Instead, they have a 32-bit length header which defines the length of the string.
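
One way to see that header is to read a file back by hand instead of using ReadString. This is only a rough sketch: it assumes WriteString stores a 4-byte length followed by the raw characters, and the file name header_test.dat is made up for the example.

```blitzbasic
; Write one string, then inspect the file manually.
path$ = "header_test.dat" ; hypothetical file name
wf = WriteFile(path$)
WriteString wf, "Hello " + Chr(0) + "World"
CloseFile wf

; If the assumption holds, this should be 4 header bytes + 12 characters.
Print "File size: " + Str(FileSize(path$))

rf = ReadFile(path$)
n = ReadInt(rf) ; the length header
Print "Header says: " + Str(n)
For i = 1 To n
    ; Byte 7 should come back as 0, the embedded Chr(0).
    Print Str(i) + ": " + Str(ReadByte(rf))
Next
CloseFile rf
WaitKey
```

If the embedded Chr(0) shows up as an ordinary byte in the middle of the listing, that would fit the idea that the length header, not a terminator, decides where the string ends.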

If you need a terminator for a string in a text file, then use an end-of-line character sequence (a Carriage Return character followed by a LineFeed character - Chr(13)+Chr(10)), and use the WriteLine and ReadLine functions to write and read the data.
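
A small sketch of that text-file approach, with a made-up file name (lines_test.txt). Each WriteLine call puts one string on its own line, and ReadLine hands them back one at a time:

```blitzbasic
; Text-file approach: one string per line.
path$ = "lines_test.txt" ; hypothetical file name
wf = WriteFile(path$)
WriteLine wf, "Hello"
WriteLine wf, "World"
CloseFile wf

rf = ReadFile(path$)
While Not Eof(rf)
    Print ReadLine(rf)
Wend
CloseFile rf
WaitKey
```

The catch is that the strings themselves can't contain a line break, since that is what marks the end of each one.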

For a data file, you'll need to create your own terminator characters, plus the code to split up the string using those characters. Or just write out multiple strings.
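
As a rough illustration of the "own terminator" route, here is a sketch that splits a string on a separator of your choosing. PrintFields and the "|" separator are invented for this example, not anything built into Blitz3D:

```blitzbasic
; Split s$ on sep$ and print each piece (hypothetical helper).
Function PrintFields(s$, sep$)
    start = 1
    Repeat
        p = Instr(s$, sep$, start)
        If p = 0 Then
            ; Last field: everything after the final separator.
            Print Mid$(s$, start, Len(s$) - start + 1)
            Return
        EndIf
        Print Mid$(s$, start, p - start)
        start = p + Len(sep$)
    Forever
End Function

PrintFields "alpha|beta|gamma", "|"
WaitKey
```

The separator can never appear inside the data itself, so pick a character you are sure won't occur, or write the pieces out as separate strings instead.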


Dreamora(Posted 2010) [#4]
Ideally, handle files the same way Blitz does: write a short or int holding the length of the string, then write the string itself.

The reason is that this lets you use a single ReadBytes call to read the whole string in one go, whereas a string terminator forces you to read every character one by one to find it.
That's an order of magnitude slower (and one of the main reasons TCP in Blitz has such a bad reputation: people did such stupid things in production code).
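
A minimal sketch of that idea, using a 16-bit length prefix. WriteShortString, ReadShortString and the file name short_test.dat are hypothetical names for this example; only the stream and bank commands are Blitz3D's own.

```blitzbasic
; Hypothetical helpers: 16-bit length prefix, then the raw characters.
Function WriteShortString(stream, s$)
    n = Len(s$)
    If n > 65535 Then n = 65535 ; a 16-bit prefix cannot describe more
    WriteShort stream, n
    For i = 1 To n
        WriteByte stream, Asc(Mid$(s$, i, 1))
    Next
End Function

Function ReadShortString$(stream)
    n = ReadShort(stream)
    If n = 0 Then Return ""
    bank = CreateBank(n)
    ReadBytes bank, stream, 0, n ; one call fetches the whole payload
    s$ = ""
    For i = 0 To n - 1
        s$ = s$ + Chr$(PeekByte(bank, i))
    Next
    FreeBank bank
    Return s$
End Function

; Round trip, including an embedded Chr(0).
wf = WriteFile("short_test.dat") ; hypothetical file name
WriteShortString wf, "Hello " + Chr(0) + "World"
CloseFile wf

rf = ReadFile("short_test.dat")
Print "Length read back: " + Str(Len(ReadShortString(rf)))
CloseFile rf
WaitKey
```

The string is still rebuilt character by character from the bank, so the saving in this sketch is in the stream access: one ReadBytes call instead of a read per character while hunting for a terminator.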


_PJ_(Posted 2010) [#5]
Thanks all, that really helps.
Surprised to hear about the inaccessible value, but I agree it's a lot better to store the length rather than process each byte until reaching a terminator.
Presumably this limits string lengths to 2^32 chars (really way more than should ever be needed ;) ), so maybe a 16-bit Short would be a small improvement.

The main reason for asking was in relation to ways of making reading/writing a little more efficient or a sort of compression routine for filesizes, so it's very helpful, Thanks!


Floyd(Posted 2010) [#6]
Today's history lesson...

It's traditional for Basics to handle strings like this. It means that every character code can be a string element. C took the attitude that strings were text, so a terminator code could be chosen.

The fact that strings could hold all byte values was used back in the 8-bit days for embedding machine language into interpreted (non-compiled) Basic code. The bytes of the machine language subroutine were stored in the string. The code was executed by branching to the address of the start of the string.


_PJ_(Posted 2010) [#7]
That's right, I remember now the Z80 assembly code had 256 commands in its set, which of course, could also be represented with the ASCII char set or rather, any 8-bit byte.
Terminators are messy in my opinion, and in having a "terminator code", we're losing out on a value which might otherwise be used... maybe... :P


stanrol(Posted 2010) [#8]
Pascal strings.


_PJ_(Posted 2010) [#9]
Pascal strings. 
What?


stanrol(Posted 2010) [#10]
@Malice Turbo Pascal used that string type.