String terminators
| ||
What 'control code' is used to recognise a string terminator? I had assumed it was Chr(0), but somehow this is bypassed in the following example:

; write a string containing an embedded Chr(0)
d$=CurrentDir()
wf=WriteFile(d$+"file.txt")
WriteString wf,"Hello "+Chr(0)+"World"
CloseFile wf
; read it back and print it
rf=ReadFile(d$+"file.txt")
s$=ReadString(rf)
CloseFile rf
Print s$
| ||
I'm pretty sure Blitz doesn't use terminators, and just records the string's length internally somewhere in the string's (inaccessible) data structure. Consider the way strings are written as data to files or streams.
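
One way to check this is to read a WriteString record back as raw values instead of with ReadString. This is only a minimal sketch: it assumes WriteString stores a 4-byte length followed by the characters, and the file name is made up for the example.

```blitzbasic
; Write one string, then read the same bytes back "by hand".
path$ = "lengthtest.dat"            ; hypothetical file name
wf = WriteFile(path$)
WriteString wf, "Hello " + Chr$(0) + "World"
CloseFile wf

rf = ReadFile(path$)
n = ReadInt(rf)                     ; assumed: the 4-byte length written by WriteString
Print "Stored length: " + n
For i = 1 To n
	Print ReadByte(rf)              ; raw character codes, including the embedded 0
Next
CloseFile rf

Print "File size: " + FileSize(path$)
WaitKey
```

If the assumption holds, the stored length should match Len() of the original string, the file should be four bytes larger than that, and the Chr(0) should show up as an ordinary 0 in the middle of the codes.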
| ||
There's no string termination character for Blitz3D's internal strings. Instead, they have a 32-bit length header which defines the length of the string. If you need a terminator for a string in a text file, use an end-of-line sequence (a carriage return followed by a line feed: Chr(13)+Chr(10)), and use the WriteLine and ReadLine functions to write and read the data. For a data file, you'll need to define your own terminator characters, plus the code to split up the string using those characters. Or just write out multiple strings.
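
Both options could look something like the sketch below. The file name, the sample text and the "|" separator are all made up for illustration, and the splitting loop assumes the separator never appears inside a field.

```blitzbasic
; Text file: one string per line, each ended by the end-of-line sequence.
path$ = "lines.txt"                 ; hypothetical file name
wf = WriteFile(path$)
WriteLine wf, "first string"
WriteLine wf, "second string"
CloseFile wf

rf = ReadFile(path$)
While Not Eof(rf)
	Print ReadLine$(rf)
Wend
CloseFile rf

; Data file alternative: pack fields with a home-made terminator, then split them.
packed$ = "alpha|beta|gamma"
sep$ = "|"                          ; hypothetical separator character
start = 1
Repeat
	p = Instr(packed$, sep$, start) ; 0 when no further separator is found
	If p = 0 Then
		Print Mid$(packed$, start)  ; last field: everything that is left
		Exit
	EndIf
	Print Mid$(packed$, start, p - start)
	start = p + 1
Forever
WaitKey
```

Note that neither approach can carry every byte value: a line-based file can't hold Chr(13) or Chr(10) inside a string, and a separator-based one can't hold the separator.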
| ||
Optimally, you go the same way for files as Blitz does: write a short or int holding the length of the string, then write the string. The reason is that this lets you use a single ReadBytes call to read the whole string in one go, whereas a string terminator requires you to read every character one by one to find it. That's an order of magnitude or more slower (and one of the main reasons TCP in Blitz has such a bad reputation, because people did such stupid things in production code).
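
A rough sketch of that approach, using a bank as the buffer for ReadBytes and WriteBytes. WriteBlock, ReadBlock$ and the file name are hypothetical names invented for this example, not built-in commands.

```blitzbasic
wf = WriteFile("block.dat")         ; hypothetical file name
WriteBlock wf, "Hello " + Chr$(0) + "World"
CloseFile wf

rf = ReadFile("block.dat")
s$ = ReadBlock$(rf)
CloseFile rf
Print "Characters read back: " + Len(s$)
WaitKey

; Write a 4-byte length, then the raw characters in a single WriteBytes call.
Function WriteBlock(stream, s$)
	n = Len(s$)
	bank = CreateBank(n)
	For i = 0 To n - 1
		PokeByte bank, i, Asc(Mid$(s$, i + 1, 1))
	Next
	WriteInt stream, n
	WriteBytes bank, stream, 0, n
	FreeBank bank
End Function

; Read the length first, then fetch the whole payload with one ReadBytes call.
Function ReadBlock$(stream)
	n = ReadInt(stream)
	bank = CreateBank(n)
	ReadBytes bank, stream, 0, n
	s$ = ""
	For i = 0 To n - 1
		s$ = s$ + Chr$(PeekByte(bank, i))
	Next
	FreeBank bank
	Return s$
End Function
```

Copying between the bank and the string still loops over the characters, but that happens in memory. The stream itself is only touched twice per string, which is where the saving is on a file or TCP stream.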
| ||
Thanks all, that really helps. Surprised to hear about the inaccessible value, but I agree it's a lot better to store the length than to process each byte until reaching a terminator. Presumably this limits string lengths to 2^32 chars (really way more than should ever be needed ;) ), so maybe a 16-bit Short would be a little improvement. The main reason for asking was in relation to ways of making reading/writing a little more efficient, or a sort of compression routine for file sizes, so it's very helpful. Thanks!
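
For what it's worth, the 16-bit idea might be sketched like this. WriteShortString and ReadShortString$ are made-up names, and the sketch simply truncates anything longer than 65535 characters, which may or may not be what you want.

```blitzbasic
wf = WriteFile("short.dat")         ; hypothetical file name
WriteShortString wf, "Hello " + Chr$(0) + "World"
CloseFile wf

rf = ReadFile("short.dat")
s$ = ReadShortString$(rf)
CloseFile rf
Print "Characters read back: " + Len(s$)
WaitKey

; Write a 2-byte length, then the characters one byte at a time.
Function WriteShortString(stream, s$)
	n = Len(s$)
	If n > 65535 Then n = 65535     ; a 16-bit prefix cannot describe more (silently truncates)
	WriteShort stream, n
	For i = 1 To n
		WriteByte stream, Asc(Mid$(s$, i, 1))
	Next
End Function

; Read the 2-byte length, then exactly that many characters.
Function ReadShortString$(stream)
	n = ReadShort(stream)
	s$ = ""
	For i = 1 To n
		s$ = s$ + Chr$(ReadByte(stream))
	Next
	Return s$
End Function
```

Compared with a 4-byte prefix this saves two bytes per string, so it only really pays off for files holding lots of short strings.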
| ||
Today's history lesson... It's traditional for Basics to handle strings like this. It means that every character code can be a string element. C took the attitude that strings were text, so a terminator code could be chosen. The fact that strings could hold all byte values was used back in the 8-bit days for embedding machine language into interpreted (non-compiled) Basic code. The bytes of the machine language subroutine were stored in the string. The code was executed by branching to the address of the start of the string.
| ||
That's right, I remember now that Z80 assembly code had 256 commands in its set, which of course could also be represented with the ASCII char set, or rather with any 8-bit byte. Terminators are messy in my opinion, and in having a "terminator code" we're losing out on a value which might otherwise be used... maybe... :P
| ||
Pascal strings.
| ||
"Pascal strings." What?
| ||
@Malice Turbo Pascal used that string type: a length byte followed by the characters.