max length of blitz string?

ashmantle(Posted 2003) [#1]
The title says it all: how many characters can I have in a Blitz string?


cyberseth(Posted 2003) [#2]
I'd say unlimited, as long as there's sufficient memory to hold it. But you can always check with this little example:

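; Grow a string one character at a time until its length stops
; matching the counter, or until Esc (scancode 1) is pressed.
Repeat
    If KeyHit(1) Then Exit
    i = i + 1
    strtest$ = strtest$ + "X"
Until Len(strtest$) <> i

Print "Length reached: " + i
WaitKey()
End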



DarkEagle(Posted 2003) [#3]
heheh... how long is a piece of string? :P

the max length is twice as long as it is from one end to the middle. comprende? :P


podperson(Posted 2003) [#4]
It's limited by RAM (until we get 64-bit addressing, then it will be limited to 4GB).


sswift(Posted 2003) [#5]
A string in Blitz uses a longint at the start of the string, which is 4 bytes, to define the length of it.

So a string can be up to 4,294,967,295 characters long.

So no, there's effectively no limit.

When you save a string into a file, keep in mind that those 4 length bytes are saved with it at the start of the string, so the number of bytes it takes up is 4 + Len(String$).
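
For anyone who wants to check the 4-byte overhead, here is a minimal sketch. It assumes the string is saved with WriteString, and the file name strtest.dat is made up for the example:

```blitzbasic
; Sketch: compare the size of a file written by WriteString with 4 + Len().
; "strtest.dat" is a hypothetical file name used only for this test.
s$ = "Hello there"

file = WriteFile("strtest.dat")
If file = 0 Then RuntimeError "Could not create strtest.dat"
WriteString file, s$
CloseFile file

Print "Len(s$)     = " + Len(s$)
Print "4 + Len(s$) = " + (4 + Len(s$))
Print "FileSize    = " + FileSize("strtest.dat")

WaitKey()
End
```

If the description above is right, the last two numbers should match.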


podperson(Posted 2003) [#6]
Since Windows can only manage an address space of 2GB, it's actually limited by RAM. Hence my earlier remark :)


Anthony Flack(Posted 2003) [#7]
What? Only TWO GB?!? That's only about 4 million pages of text per string! That is unacceptable. I'm tired of this dead language.


ashmantle(Posted 2003) [#8]
Anthony: I wholeheartedly agree.. I was going to make the best next-gen MMORPG, utilizing next-gen string manipulation, but now all my dreams are ruined.. thanks A LOT, Blitz!

:) actually.. thanks to all you people above for answering this question!


podperson(Posted 2003) [#9]
You might like to check out my string utilities in Code Archives | Misc (allows you to treat strings as associative arrays).


SoggyP(Posted 2003) [#10]
Hi Folks,

6 characters. Probably.

Later,

Jes


Tricky(Posted 2003) [#11]
Most programming languages have a max of 255 characters for a string... I don't know if that holds for Blitz, but I simply live by that rule in whatever language I use...


ashmantle(Posted 2003) [#12]
Tricrokra: that's what I thought too, but I am glad I was wrong :)


Tricky(Posted 2003) [#13]
Longer strings are possible, I suppose... It doesn't really matter to me, since I've always had a max of 255 in my past 20 years of coding. I'm so used to it that I never need more...


cyberseth(Posted 2003) [#14]
Strings longer than 255 are necessary for a Windows app like Blitz+, since you might be handling text files and textarea gadgets containing a LOT of text. But since Blitz+ strings are pretty much limitless, there's no problem! :)


Tricky(Posted 2003) [#15]
I've analyzed the structure of files I save with a Blitz application... It uses a 32-bit number to indicate how long a string is.... Is this because Blitz does not make any difference between BYTE, WORD, DWORD, INTEGER or LONG, or is it because it supports strings longer than 255 characters? In the latter case, my knowledge indicates that Blitz should generally be able to handle strings up to 2 gigabytes (provided your RAM is big enough), and if an unsigned LONG is used it could theoretically take up to 4 gigabytes...

But this theory is only based on data I could analyze from a file created with code like this:
Local BT = WriteFile("Test")
WriteString BT,"Hello there"
CloseFile BT

So I've no clue if my theory is actually the right one...
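
One way to test the theory is to dump the raw bytes of that file. A minimal sketch, assuming the "Test" file written by the snippet above sits in the current directory:

```blitzbasic
; Sketch: print every byte of the file written by WriteString above.
; If the length really is stored as a 32-bit number, the first four
; bytes should hold the length and the rest should be the characters.
file = ReadFile("Test")
If file = 0 Then RuntimeError "Could not open Test"

pos = 0
While Not Eof(file)
    b = ReadByte(file)
    Print "byte " + pos + ": " + b
    pos = pos + 1
Wend

CloseFile file
WaitKey()
End
```

This only shows the on-disk layout; it says nothing certain about how the string is held in memory.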


ashmantle(Posted 2003) [#16]
As long as I don't have to worry about it exceeding 255 chars, I'm fine.. I am going to create a "compiler" for my RPG scripts.. it compiles a page of instructions into one string..


Koriolis(Posted 2003) [#17]
Like you say, Tricrokra, the way it's stored on disk is not necessarily related to the way it's handled in memory, but here it is (at least that's what has been said here and there many times): strings use a 4-byte integer to store the length, which means strings of up to 4 GB (or at least 2 GB with signed integers)!
I've seen that Mark is going to use ASCIIZ strings (NULL-terminated) in BlitzMax, probably to ease communication with DLLs and other libraries that use these C-style strings.
And why do you say "Most programming languages have a max of 255 characters"??? That's completely wrong for most of the popular languages except Pascal. The standard until now has been C-style strings (which are not very efficient, since getting the length of a string requires scanning every char until the end, but at least you have no size limitation).
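
To show what "scanning every char until the end" means, here is a rough sketch. It imitates a C-style string by poking bytes into a bank; the bank is only a stand-in for raw memory chosen for illustration, not a claim about how Blitz stores its own strings:

```blitzbasic
; Sketch: measure a null-terminated string by scanning for the 0 byte.
s$ = "Blitz"
bank = CreateBank(Len(s$) + 1)

; Copy the characters into the bank, then add the terminating 0.
For n = 1 To Len(s$)
    PokeByte bank, n - 1, Asc(Mid$(s$, n, 1))
Next
PokeByte bank, Len(s$), 0

; C-style length: walk the bytes until the terminator is found.
count = 0
While PeekByte(bank, count) <> 0
    count = count + 1
Wend

Print "Scanned length: " + count
Print "Len() result:   " + Len(s$)

FreeBank bank
WaitKey()
End
```

The scan takes longer the longer the string is, whereas a stored length can be read directly.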


Tricky(Posted 2003) [#18]
I've coded in many DOS languages, and the Windows ones are not really within my full knowledge, so my apologies if I stated anything wrong about "most languages"...

Null-terminated strings, you say? I hope he keeps it separate from the system already in place... I mean, it could cause incompatibilities with older files :)


Koriolis(Posted 2003) [#19]
When I say Null-terminated strings, I mean in memory. When writing/reading to/from files, you can still write/read the length first in 4 bytes, followed by the characters. That's probably what Mark will do for compatibility (plus it can be faster to read this way: you get the size, then you can load all the following characters in one shot rather than character by character).
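
The "one shot" read could look roughly like this in current Blitz. It is only a sketch: it assumes a file named "Test" written with WriteString, as in Tricky's snippet, and it uses a bank as the buffer purely for illustration:

```blitzbasic
; Sketch: read the 4-byte length, then fetch the whole body in one call.
file = ReadFile("Test")
If file = 0 Then RuntimeError "Could not open Test"

length = ReadInt(file)             ; 4-byte length prefix
bank = CreateBank(length)
ReadBytes bank, file, 0, length    ; one read for all the characters
CloseFile file

; Rebuild the string from the buffer so it can be printed.
txt$ = ""
For n = 0 To length - 1
    txt$ = txt$ + Chr$(PeekByte(bank, n))
Next
FreeBank bank

Print "Length: " + length
Print "Text:   " + txt$
WaitKey()
End
```

With a null-terminated layout on disk, the reader would not know the size up front and would have to test each byte for 0 instead.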


Tricky(Posted 2003) [#20]
Okay... That would be good....


Oldefoxx(Posted 2003) [#21]
Initially, early BASIC interpreters only had a single byte at the beginning of a string to signify how long it was. That meant that it could have a length of zero bytes (0), up to a maximum length of 255 characters (255 decimal, FF in hexadecimal).

Since GWBASIC and BASICA, the number of bytes for the size of a string was extended to two or four. With two bytes, the maximum length of a string could be at most 32,767. That is because the two bytes were treated as a signed integer, and since you could not have a negative number of bytes, only the positive values were allowed. In addition, the string might contain an identifier or a memory segment reference, so a full 32,767 bytes was not always permitted.

Some languages supported an unsigned integer format to express the length, so that they could have string lengths up to 65,535 bytes.

As more memory was made available, it became possible to think in terms of even larger string structures, and some languages now support four bytes for a long representation of a string length. That would be 256^4-1, or 4,294,967,295 bytes in length. Again though, you have to consider whether the language treats this as a signed or unsigned value, as a signed value would allow just half the size of an unsigned one. You also have to consider whether the language has made the move to support Unicode, since a character in a 16-bit Unicode encoding takes two bytes, not just one.
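
The limits above can be reproduced with a few shifts. This is just a sketch of the arithmetic, written on the assumption that Blitz integers are 32-bit and signed, which is why the unsigned 4-byte maximum cannot be shown as a positive Blitz Int:

```blitzbasic
; Sketch: largest length each size of length field can hold.
; Assumes Blitz Ints are 32-bit signed, so 4,294,967,295 does not fit;
; the all-ones bit pattern $FFFFFFFF should read back as a negative value.
Print "1 byte,  unsigned: " + ((1 Shl 8) - 1)
Print "2 bytes, signed:   " + ((1 Shl 15) - 1)
Print "2 bytes, unsigned: " + ((1 Shl 16) - 1)
Print "4 bytes, signed:   " + $7FFFFFFF
Print "4 bytes, all bits set, seen as a Blitz Int: " + $FFFFFFFF

WaitKey()
End
```

The last line is the signed versus unsigned question in miniature: the same four bytes mean a huge length or a negative number depending on how the language reads them.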

Keep in mind that working with strings, which have variable lengths, is not as efficient as working with fixed-length structures, and it often pays to render string data into other forms to speed processing and improve performance. The longer the strings are, the more inefficient the results are likely to be. One principal reason is that if the processor can limit itself to integer indexing, it can work far more efficiently than if it has to somehow cope with offsets that require base address modifications or other techniques to extend its index range. This imposes a lot of overhead on the underlying processes that can seriously affect performance.

But sticking to a 255-byte maximum is unnecessary and does not significantly affect performance. That has not been a factor since the early days of microprocessors, when we only had the old 6502, Z80, and 6800 processors, which only had a single byte of index addressing and a few kilobytes of total memory space.