string comparisons are fast!

BlitzMax Forums/BlitzMax Programming/string comparisons are fast!

slenkar(Posted 2011) [#1]
I made a little test app to see if string comparison could made faster:

Print Chr(48)

Const num_compares=100000

Local string_array$[num_compares+1]
For x=1 To num_compares
Local this_string$=""
For z=1 To 20
Local r=Rand(1,2)
If r=1
this_string=this_string+Chr(Rand(65,85))
EndIf
Next
string_array[x]=this_String
Next

Local rs$="HERYTR"

Local time1=MilliSecs()

Local trys



For x=1 To num_compares

If string_array(x)=rs
trys=trys+1
EndIf
Next
Print MilliSecs()-time1+" normal compare"

time1=MilliSecs()
For x=1 To num_compares
If compare_strings(string_array(x),rs)
trys=trys+1
EndIf
Next

Print MilliSecs()-time1+" my compare"


Function compare_strings(s1$,s2$)
If s1.length<>s2.length
Return False
EndIf
If s1=s2
Return True
EndIf

EndFunction


it basically checks the length of the 2 strings before doing a normal comparison,
turns out its faster to just do comparisons!


EDIT-
I just did it in Release mode and 'mycompare' is faster then (!)

Last edited 2011


Jesse(Posted 2011) [#2]
that is not how compare works.

Last edited 2011


Czar Flavius(Posted 2011) [#3]
You are adding the overhead of an extra function call. Try putting the check logic straight into the for loop to see if it's faster.


slenkar(Posted 2011) [#4]
interesting, it is faster when you add it straight in,


Czar Flavius(Posted 2011) [#5]
Unless you are comparing very long strings, I wouldn't worry too much about this. Strings usually come from users, and users are limited by how fast they can enter input, so comparision speed isn't very important. If you are using strings to store your in-game data, it will be slow anyway.. still interesting exercise.


slenkar(Posted 2011) [#6]
yeah if it can compare 100,000 strings in about 5 millisecs it doesnt need any optimizing :)


Czar Flavius(Posted 2011) [#7]
If you put it in Strict mode it's twice as slow on my computer. That's concerning.
Edit: that was in debug mode. In release mode, they both execute about the same speed.

Strict
Print Chr(48)

Const num_compares=100000

Local string_array$[num_compares+1]
For Local x=1 To num_compares
	Local this_string$=""
	For Local z=1 To 20
		Local r=Rand(1,2)
		If r=1
			this_string=this_string+Chr(Rand(65,85))
		EndIf
		Next
	string_array[x]=this_string
Next

Local rs$="HERYTR"

Local time1=MilliSecs()

Local trys



For Local y=1 To num_compares	
	If string_array[y]=rs
		trys=trys+1
	EndIf
Next

Print MilliSecs()-time1+" normal compare"

time1=MilliSecs()
For Local a=1 To num_compares
	If compare_strings(string_array[a],rs)
		trys=trys+1
	EndIf
Next

Print MilliSecs()-time1+" my compare"


Function compare_strings(s1$,s2$)
	If s1.length<>s2.length
		Return False
	EndIf
	If s1=s2
		Return True
	EndIf
EndFunction


Last edited 2011


Perturbatio(Posted 2011) [#8]
Strict mode doesn't provide a speed increase, superstrict does, but potentially only a marginal one in this case.

I put together a test of my own:


It would appear that internally, the equals operator uses compare which means = is the same speed as compare().

Testing length appears to be beneficial in some cases and a hindrance in others.

(i.e. the more likely it is that the strings are different, then length tests are a good idea).

I modified the compare_strings function to not use the compare method of the string class at all

with the above code, my output is:
Variable results tests (25% equal)
----------------------------------
Using string.compare(): 849
Using = operator: 852
Using inline length check and = operator: 358
Using inline length check string.compare(): 352
Using compare_strings: 389

fixed string (always equal)
----------------------------------
Using string.compare(): 1078
Using = operator: 1076
Using inline length check and = operator: 2414
Using inline length check string.compare(): 2414
Using compare_strings: 898

fixed string (never equal)
----------------------------------
Using string.compare(): 291
Using = operator: 290
Using inline length check and = operator: 2418
Using inline length check string.compare(): 2417
Using compare_strings: 184

*EDIT*
Oops, fixed about a billion bugs

Last edited 2011

Last edited 2011

Last edited 2011

Last edited 2011


Czar Flavius(Posted 2011) [#9]
Your code goes out of bounds on an array, but works in release mode. Here are my results.

Variable results tests (25% equal)

----------------------------------

Using string.compare(): 1138

Using = operator: 1148

Using inline length check and = operator: 439

Using inline length check string.compare(): 434

Using compare_strings: 380



fixed string (always equal)

----------------------------------

Using string.compare(): 1350

Using = operator: 1350

Using inline length check and = operator: 3099

Using inline length check string.compare(): 3101

Using compare_strings: 909



fixed string (never equal)

----------------------------------

Using string.compare(): 382

Using = operator: 377

Using inline length check and = operator: 3109

Using inline length check string.compare(): 3110

Using compare_strings: 239


The times are pretty much the same on all the strict and non-strict modes on my computer.

Last edited 2011


Perturbatio(Posted 2011) [#10]
I inadvertantly added an error (in an edit), which I've fixed

Superstrict and strict may compile to the same code in this instance, however Mark specifically stated (a long time ago) that there are less checks necessary for code produced with superstrict vs strict and so there is a speed increase

Last edited 2011


Czar Flavius(Posted 2011) [#11]
The Blitz modules themselves are only made in Strict. I don't understand what extra checks are needed? The only possible slowdown is that all non-returning functions implicitly return an Int, so there is an unneeded Return 0 but that's nothing.

Last edited 2011


Perturbatio(Posted 2011) [#12]
It would appear that there might be some benefit in finding a way to make the equals operator for strings not use the compare method (so that you don't have to use the compare_strings function).

*EDIT*
I don't understand what extra checks are needed? The only possible slowdown is that all non-returning functions implicitly return an Int, so there is an unneeded Return 0 but that's nothing.


I'm not arguing that superstrict is faster than strict (in this instance), just stating that you probably wouldn't notice a speed increase between strict and non-strict since strict just adds a helper to alert you to undefined variables

Last edited 2011


Czar Flavius(Posted 2011) [#13]
You have it the wrong way round. Either kind of strict provides a speed boost compared to non-strict, because the program won't be suprised by undefined variables popping up and it can optimize based on scope. The only differences between strict and superstrict is, as a shortcut, if you don't specify the type of a variable it is assumed to be an int and a function without a return value is assumed to return an int. With the exception that function without a return value will return an unused 0, the speed of the code should otherwise be the same.

Last edited 2011


Perturbatio(Posted 2011) [#14]
Possibly I have, I can't recall where I picked that up from unfortunately.

*EDIT*

although I'm not alone in my assumption of this:
http://www.blitzmax.com/Community/posts.php?topic=80515#905608

incidentally, a similar thread appears to have already covered this string stuff:
http://blitzbasic.com/Community/posts.php?topic=92179

Last edited 2011

Last edited 2011