Copying array 2x slower than CopyBank?
| ||
Have I done something wrong here, or is copying an array just really slow?

    SuperStrict

    Local T0%, T1%, T2%, T3%
    Local Bank1:TBank = CreateBank(40000*4)
    Local Bank2:TBank = CreateBank(40000*4)
    Local Pixels1:Float[40000]
    Local Pixels2:Float[40000]
    Local Ptr1:Byte Ptr, Ptr2:Byte Ptr
    Local Loop%, Loops% = 10000

    Ptr1 = MemAlloc(40000*4)
    Ptr2 = MemAlloc(40000*4)

    T0 = MilliSecs()

    ' Bank to bank copy
    For Loop = 1 To Loops
        CopyBank(Bank1, 0, Bank2, 0, 40000*4)
    Next

    T1 = MilliSecs()

    ' Array copy via a full slice
    For Loop = 1 To Loops
        Pixels1 = Pixels2[..]
    Next

    T2 = MilliSecs()

    ' Raw memory copy
    For Loop = 1 To Loops
        MemCopy(Ptr1, Ptr2, 40000*4)
    Next

    T3 = MilliSecs()

    Print "CopyBank = " + (T1-T0)
    Print "Copy Array = " + (T2-T1)
    Print "MemCopy = " + (T3-T2)
| ||
Yes, Pixels1 = Pixels2[..] is slow, but copying element by element is fast:

    For Local i:Int = 0 Until Pixels2.length
        Pixels1[i] = Pixels2[i]
    Next
| ||
A slice is a new array.

    Pixels1 = Pixels2[..]

This allocates memory for a new array, copies Pixels2[] into it, and points Pixels1 at the new memory. The old memory used by Pixels1 is now unused and can be garbage collected. I'm surprised it runs as fast as it does. Here it is with garbage retained:

    Local Pixels1:Float[100000]
    Local Pixels2:Float[100000]

    ' Stop the collector so every discarded array stays allocated
    GCSuspend

    For Local n:Int = 1 To 20
        Pixels1 = Pixels2[..]
        Print GCMemAlloced()
    Next
| ||
Zeke: Afraid not. It is faster, but it's still twice as slow as copying a bank:

    SuperStrict

    Local T0%, T1%, T2%, T3%
    Local Bank1:TBank = CreateBank(40000*4)
    Local Bank2:TBank = CreateBank(40000*4)
    Local Pixels1:Float[40000]
    Local Pixels2:Float[40000]
    Local Ptr1:Byte Ptr, Ptr2:Byte Ptr
    Local Loop%, Loops% = 10000

    Ptr1 = MemAlloc(40000*4)
    Ptr2 = MemAlloc(40000*4)

    T0 = MilliSecs()

    ' Bank to bank copy
    For Loop = 1 To Loops
        CopyBank(Bank1, 0, Bank2, 0, 40000*4)
    Next

    T1 = MilliSecs()

    ' Element-by-element array copy
    Local Loop2%
    For Loop = 1 To Loops
        For Loop2 = 0 Until 40000
            Pixels1[Loop2] = Pixels2[Loop2]
        Next
    Next

    T2 = MilliSecs()

    ' Raw memory copy
    For Loop = 1 To Loops
        MemCopy(Ptr1, Ptr2, 40000*4)
    Next

    T3 = MilliSecs()

    Print "CopyBank = " + (T1-T0)
    Print "Copy Array = " + (T2-T1)
    Print "MemCopy = " + (T3-T2)

It is faster than doing a slice (which was actually 2.5x slower, not 2x like I originally said), but it's still terribly slow. This isn't too surprising, come to think of it, because we're copying 32-bit floats one at a time here, whereas MemCopy, which the other two examples use, is likely copying 64 bits at a time or something. Or perhaps it uses an unrolled loop. I imagine using an unrolled loop on the array might speed it up a bit. But it's really not worth all the effort to use an array instead of MemAlloc.

[edit] Yep. Unrolling the loop 4x results in array copying going almost as fast as MemCopy.
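The unrolled version was not posted in the thread, so the following is only a minimal sketch of what a 4x unrolled array copy might look like. The variable names `Count`, `Limit`, and `i` are made up for illustration, and the trailing loop is an added safeguard for array lengths that are not a multiple of four (40000 is, so it does nothing here).

```blitzmax
SuperStrict

Local Pixels1:Float[40000]
Local Pixels2:Float[40000]

Local Count:Int = Pixels2.length
Local Limit:Int = Count - (Count Mod 4)   ' largest multiple of 4 not above Count
Local i:Int = 0

' Copy four elements per iteration to cut down on loop overhead
While i < Limit
    Pixels1[i]     = Pixels2[i]
    Pixels1[i + 1] = Pixels2[i + 1]
    Pixels1[i + 2] = Pixels2[i + 2]
    Pixels1[i + 3] = Pixels2[i + 3]
    i :+ 4
Wend

' Copy any leftover elements one at a time
While i < Count
    Pixels1[i] = Pixels2[i]
    i :+ 1
Wend
```

To compare it with the other methods, this body would replace the inner `For Loop2` loop in the benchmark above, with `i` reset to 0 on each pass. Whether it actually approaches MemCopy will depend on the compiler and build settings (debug builds add array bounds checks).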
| ||
Most modern CPUs provide instructions to quickly copy large amounts of memory at a time, and implementations of the standard library's memcpy are almost certainly going to take advantage of them instead of using a simple loop. Remember, though, that arrays are objects and are therefore implicitly convertible to and from byte pointers. You might find this more legible than the alternatives:

    For Loop = 1 To Loops
        MemCopy(Pixels1, Pixels2, 40000*4)
    Next

    T4 = MilliSecs()

It should be tied in speed with MemCopy'ing on the byte pointers, and you'll still be able to access the arrays with Pixels1[index], etc., which is almost certainly more legible than messing with peeks or dereferencing byte pointers as float pointers.