Comparing Arrays....?
BlitzMax Forums/BlitzMax Programming/Comparing Arrays....?
| ||
Hi. There is so much about BMax that I'm sure I don't know yet. Is there a fast way to check if the contents of one array is identical to another? I do basically this: For X=0 to 99 For Y=0 to 99 If Array1[x,y] <> Array2[x,y] then return false Next Next Return True But thinking about it, and knowing there is array notation I don't know about, is there a more CPU efficient way? It would be nice news if there was because in this little project I've started, such a comparison is the main time-taker in the main loop. Cheers. :D |
| ||
Sure, that this loop causes your time problems? Your sample needs <0.2msec time! |
| ||
An array of 100 x 100 is not going to take long to test. You probably don't need to worry about it. |
| ||
Your test is failing if you want to make sure that array1 and array2 contain totally the same. you need to prepend: 'not equal if one array is not existing - or lengths differ if (Array1=null) <> (Array2=null) or Array1.length <> Array2.length then return False As you use a multi-dim-array you might need to check for dimensions too... So how to speed up things? Keyword here is "cache". It only works if you have a "governor" taking care of the content in Array1 and Array2 - so every modification of the array - needs to be done by a manager/governor. Principle is: you have a variable stating if array1 is modified. Call it "array1Modified:int = False". When adding, removing to ... or manipulating the array in other ways, set "array1Modified = True" So this allows you to know right before your loops if the array is the same as "last time". Ok, so how to use it? Assume we got this manager and now have "array1Modified" and "array2Modified". if array1Modified or array2Modified 'do your loops endif ... 'once everything is done, set your helper var to "handled" array1Modified = False array2Modified = False bye Ron |
| ||
Here's an example, simulating calling it at 60fps over 1 minute :SuperStrict Framework BRL.Standardio Extern ?bmxng Function memcmp:Int(p1:Byte Ptr, p2:Byte Ptr, size:Int)="int memcmp(const void *,const void *,size_t)!" ?Not bmxng Function memcmp:Int(p1:Byte Ptr, p2:Byte Ptr, size:Int) ? End Extern Const iterations:Int = 16 * 60 * 60 Const x:Int = 100 Const y:Int = 100 Local array1:Int[x,y] Local array2:Int[x,y] Local res:Int Local start:Int = MilliSecs() For Local i:Int = 0 Until iterations res = loop1test(array1, array2, x, y) Next time(start, "loop1 same", res) start = MilliSecs() For Local i:Int = 0 Until iterations res = loop2test(array1, array2, x, y) Next time(start, "loop2 same", res) start = MilliSecs() For Local i:Int = 0 Until iterations res = cmptest(array1, array2, x * y * 4) Next time(start, "cmp same", res) array2[x - 1, y - 1] = 1 start = MilliSecs() For Local i:Int = 0 Until iterations res = loop1test(array1, array2, x, y) Next time(start, "loop1 diff", res) start = MilliSecs() For Local i:Int = 0 Until iterations res = loop2test(array1, array2, x, y) Next time(start, "loop2 diff", res) start = MilliSecs() For Local i:Int = 0 Until iterations res = cmptest(array1, array2, x * y * 4) Next time(start, "cmp diff", res) Function loop1test:Int(array1:Int[,], array2:Int[,], xLen:Int, yLen:Int) For Local x:Int = 0 Until xLen For Local y:Int = 0 Until yLen If array1[x, y] <> array2[x, y] Then Return False End If Next Next Return True End Function Function loop2test:Int(array1:Int[,], array2:Int[,], xLen:Int, yLen:Int) Local a1:Int Ptr = array1 Local a2:Int Ptr = array2 For Local i:Int = 0 Until xLen * yLen If a1[i] <> a2[i] Then Return False End If Next Return True End Function Function cmptest:Int(p1:Byte Ptr, p2:Byte Ptr, size:Int) Return memcmp(p1, p2, size) = 0 End Function Function time(start:Int, text:String, result:Int) Local diff:Int = MilliSecs() - start Print text + " (" + result + ") : " + diff End Function There are 3 different comparisons, an x,y compare, a single loop compare, and one with memcmp. Some results from an i7 on Windows 10: Legacy BlitzMax 1.50 : loop1 same (1) : 528 loop2 same (1) : 306 cmp same (1) : 240 loop1 diff (0) : 526 loop2 diff (0) : 306 cmp diff (0) : 238 BlitzMax NG loop1 same (1) : 245 loop2 same (1) : 214 cmp same (1) : 0 loop1 diff (0) : 240 loop2 diff (0) : 213 cmp diff (0) : 0 528 is half a second over the course of 1 minute. A single loop with BlitzMax 1.50 is slightly faster. And memcmp() takes about half the time of your example. Interestingly for NG, with this data, the compiler/runtime does some sorcery and the repeated calls to memcmp() with the same data results in it deciding it doesn't need to retest the compare since the last time... giving a total time of 0 milliseconds. For the other loop compares, they take almost the same time overall, and even finish faster than 1.50's use of memcmp(). Take (silly) test results with a pinch of salt ;-) |
| ||
Just hopping in to gain some knowledge about internals: I am not sure why "memcpm" is really doing what it should. I understand "memcpm" is returning whether two memory blocks contain the same data. So "0" means: both contain the same. But what I am not sure about (not able to test now) is why you can add 100 objects to two _different_ arrays. So while the pointers to the 100 objects might be the same, the properties of the array will differ. So arr1 has a memory address of A and the arr2 a memory address of B. At A2 we find "length"-property of arr1 and at B2 the length for arr2. So why are both memory blocks identical in your tests? Does the Pointer to the array lead directly to the connected data and skips "properties" ? Another thing: If the content of the array are "strings" ... would memcpm still work? A arr1[x,y] = arr2[x,y] would still be valid even with these two strings being stored at different memory locations. bye Ron |
| ||
Does the Pointer to the array lead directly to the connected data and skips "properties" ? Yes. If the content of the array are "strings" ... would memcpm still work? An array of Strings, is an array of pointers to memory locations where String data is stored. memcmp is only useful for primitive data types or raw data. |
| ||
A pity but seems to behave as I expected. Seems "cache" is the fastest one - if entries do not change every execution (ohh how unexpected ;-)). bye Ron |
| ||
Derron: It already has that overall HasBeenUpdated=1 thing. :D Brucey: Interesting. Thanks for the tim check code. I implemented it and it actually takes a lot longer because (and I should have said) it's actually checking several arrays with the same dimensions... For X=1 to 10000 For Y=1 to 10000 If GridVar1a[x,y] <> GridVar1a[x,y] then Return false If GridVar1b[x,y] <> GridVar1b[x,y] then Return false If GridVar1c[x,y] <> GridVar1c[x,y] then Return false If GridVar1d[x,y] <> GridVar1d[x,y] then Return false If GridVar1e[x,y] <> GridVar1e[x,y] then Return false Next Next Return true The 1-100 in my first example was just arbitrary for the illustration of the problem. And I'm not too bothered about framerate because the program already gets 13000 fps on my mediocre PC with vsync off. Actually I tried that atlas image thing you suggested on my last forum post which multiplied the framerate by like 5. The thing is though thats great for the client but this current array comparison thing is intended for server code and it will be doing this per user, so squeezing out extra FPS could be a life-saver. Thinking about it.... the variables being compared are bytes. Would it be faster if they were integers? If so it might be worth multiplying the per-user grid memory by 4 and just having integers... |
| ||
and I should have said Yes, it will help if the code you provide is fairly accurate ;-) For example : GridVar1a[x,y] <> GridVar1a[x,y] is comparing the same thing... so even this code doesn't really help :-p |
| ||
Gawd, I make mistakes like that while actually coding as well. Cheers again Brucey. |
| ||
If you have fields of 100.000.000 elements, I would suggest to keep track of the changes when happen instead of checking all 100.000.000 elements later. instead of... ... If Bomb=TRUE GridA[x,y]=15 Endif .... For X=1 to 10000 For Y=1 to 10000 If GridA[x,y] <> GridB[x,y] then Return false Next Next i would do something like... ... Type Pos Global List:TList Field X%,Y% End Type .... If Bomb=TRUE ChangeGrid(x,y,15) Endif .... For local loc:Pos = EachIn Pos.List If GridA(loc.X,loc.Y)<>GridB(loc.X,loc.Y) then Return false Next Pos.List.Clear ..... Function ChangeGrid(x%,y%,value%) GridA[x,y]=value Local loc:Pos=New Pos Pos.X=x Pos.Y=Y Pos.List.AddLast Pos End function To check you can process the list. With perhaps 1000 entries it is much faster than checking all 100.000.000 elements |
| ||
Many ways to do this. One way is to use a TMap to track differences. Every time you change a value in the array, check the corresponding cell in the other array. If they do not match, you can enter a key into the map. if they do match, then remove the key from the map. If the map is empty, then the arrays are identical. |
| ||
Yeah good stuff. And always wondered what the hell a TMap is... |