Unusual array behavior apparently due to MemMove

BlitzMax Forums/BlitzMax Programming/Unusual array behavior apparently due to MemMove

Pineapple(Posted 2015) [#1]
I wrote a method which is meant to shift elements in an object array (called "buffer") to the left or right by some distance using MemMove. OBJECT_POINTER_SIZE is a constant assigned the value 4, since an object pointer is 4 bytes long.

    method shift( index%, count%, distance% )
        local copysource@ ptr = byte ptr(buffer) + (index * OBJECT_POINTER_SIZE)
        local copydest@ ptr = copysource + (distance * OBJECT_POINTER_SIZE)
        local copysize% = count * OBJECT_POINTER_SIZE
        memmove( copydest, copysource, copysize )
    end method


Sometimes, using this method results in apparently patternless (but consistent between executions of identical code) corruption of the array. For example, I stringified the array twice in a row with no operations between. The result was correct (identical strings) if I removed one method call significantly earlier in the program that would perform a shift, otherwise the console looked like this. Note the alteration of the third member. By commenting out various other lines I found that the member overwritten would change, but it would always become occupied by the object at index 0.

list: 1,2,A,B,C,E,F,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,a,b,c,d
list: 1,2,1,B,C,E,F,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,a,b,c,d


Why is this happening, and how can I fix it? I'm using 1.50 on OSX Mavericks.


Yasha(Posted 2015) [#2]
This is just a wild guess (without so much as opening the IDE), but -

If you copy or overwrite object pointers using raw memory operations, BlitzMax will not be updating the reference counts for each object as it normally does. If the reference counts become out of sync, you might encounter serious problems.

In this case, say you have an array of size 10 populated with objects. You shift it right by one. The refcount for object [0] is now too low by one, because it's been copied unknown to the GC into [1]; the refcount for the previous inhabitant of [9] is also too high by one and the object will probably leak.

Now my guess is that if something happens to clear the array, object [0] will be freed too early because it has two references in the array but only one was counted. The object's memory will then be reused for a new object (Max pools small instances). However, there's still one "live" reference to the old object floating around somewhere in the program - because it was off-by-one - and that reference now "magically" points to some object completely unrelated to what it originally aimed at because the memory has been reused ahead of schedule.

A related guess is that this is why an operation much earlier in the program is affecting this array via spooky-action-at-a-distance; objects from that array have presumably gone through some reference inc/dec cycles and had the chance to suffer for their incorrect refcount.


Brucey(Posted 2015) [#3]
I (personally) wouldn't be moving BBArray contents around internally from BlitzMax. If you really want to do hacky stuff like this, you are safer doing it C where you have complete control.

Note that copying objects like this will not auto-magically increase the refs of the copied objects. (depending on which GC you are using - there are two available, and I can't, off the top of my head, remember which is used for threaded and non-threaded).

Presumably you are also ensuring index + distance + count < array.length ?

<edit>
Like what Yasha said ;-)
</edit>


Pineapple(Posted 2015) [#4]
There shouldn't be any issues as a result of array length. And what's the next-best way to handle this after rewriting everything in C? Have I any better options than iterating and moving each element individually?


Brucey(Posted 2015) [#5]
Have I any better options than iterating and moving each element individually?

That would be the safest way :-)

Objects are not like Ints and friends. You can happily memmove blocks of ints and things around without any worries. However, due to the garbage-collector, you need to be careful to handle Objects correctly. Doing things with their pointers is not considered "correct" unless you know what you are doing.


Yasha(Posted 2015) [#6]
If you're only concerned about syntactic brevity and not performance, you should be able to work out something equally short using slices (and the fact that arrays can use the + operator). This will be slower though as it involves allocating new array objects.

If you really need performance, I guess you could work something out to correct the refcounts before the shift? But you really don't want to be playing with that stuff. It's hidden so that it can change in future (e.g. your code would break if you moved over to Brucey's compiler).

If you can change the types of object, you might also have some success replacing the array of object types with a memory buffer (probably not an array, extern type arrays are horribly broken in 1.50) of extern types. This would be totally untracked by Max, with nothing to corrupt (but you do then need to manage the memory yourself somehow). This requires some C-level programming knowledge too, but at least it's a stable and official interface.


Pineapple(Posted 2015) [#7]
Thanks much for the explanations. I'm trying to write substitutes for BlitzMax's native data structures and so performance is a priority, which is why I was hoping I could offload it onto simpler memory operations. I'm also wanting to have it written all in BlitzMax itself. So I'm settling for making friends with a for loop. (The aforementioned code was involved in an arraylist's insertion and removal methods.) Someday I need to find a different language to prefer working in.


Brucey(Posted 2015) [#8]
Someday I need to find a different language to prefer working in.

Is there anything in particular you are struggling to accomplish with BlitzMax?


Pineapple(Posted 2015) [#9]
I love BlitzMax's native libraries and its community (especially you - I can always count on you to help with my more confusing issues, you rock), I appreciate the ease of project management and compiling. No need for makefiles, forward declarations, or worrying over importing the same thing multiple times makes me happy. And since I've been developing with it for so long, making anything interesting in another language entails rewriting half of my personal BlitzMax libraries and convenience functions and such.

But its syntax could be denser, it could offer method inlining, multiple inheritance, operator overloading, templates, macros, more fleshed out preprocessor directives, actual constructors, cleaner generators, a less opaque GC, it could use normal operators. Having = instead of == and <> instead of != for example drives me up the wall. Not having a ternary operator, too.