waitevent locks up application?

Rozek(Posted 2010) [#1]
Hello!

For several days I have been trying to isolate an occasional lock-up of my multi-threaded application, and finally found that "waitEvent" no longer returns, although there is a timer which should fire every 1/30 second. Surprisingly, after a few seconds, the non-returning "waitEvent" fully loads both(!) cores of my CPU (although the operating system remains responsive).

I have not yet managed to build a simpler application which reproduces the problem. As usual in an MT environment, my application may run fine for a long time and then suddenly stop.

Has anybody out there had a similar experience? It seems to happen under both Windows and Mac OS X and, thus, might be a BMX-specific problem.

Thanks in advance for any hints!


Rozek(Posted 2010) [#2]
By looking into the source of "waitEvent" I realized that it mainly consists of a loop which internally invokes "waitSystem".

I tried to limit the number of iterations and/or use "pollSystem" instead of "waitSystem", but still got stuck in it - thus, the problem might be within "waitSystem" or "pollSystem".

[edit]Hmmm, "waitSystem" and "pollSystem" look very similar, namely
  If _busy Return
  _busy=True
  Driver.Wait
  _busy=False

I don't know which routines may call "waitSystem" or "pollSystem" (as far as I know, I have moved all GUI-related stuff into the main thread), but could the "_busy = true/false" bracketing introduce a "race condition"?

[edit]Just as a note: I already modified the code to be more thread-safe:
' If _busy Return
' _busy=True
  if (not compareAndSwap(_busy,false,true)) then return ' return if already _busy
  Driver.Wait
  _busy=False

but this did not have the desired effect (i.e. the application still got stuck, loading the whole CPU).
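
For comparison, the guard discussed above can be sketched in C11. This is only an illustration and not the BlitzMax source: `busy`, `driver_wait` and `wait_system` are hypothetical stand-ins for "_busy", "Driver.Wait" and "waitSystem". The compare-and-swap closes the window between testing and setting the flag, but it cannot help if the hang originates inside the driver call itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the _busy flag. */
static atomic_bool busy = false;

/* Hypothetical stand-in for Driver.Wait; it only counts calls
   so that the effect of the guard can be observed. */
static int driver_calls = 0;
static void driver_wait(void) { driver_calls++; }

/* Returns true if this caller ran the driver,
   false if another caller was already inside. */
bool wait_system(void)
{
    bool expected = false;
    /* Atomically: if busy is false, set it to true;
       otherwise leave it untouched and bail out. */
    if (!atomic_compare_exchange_strong(&busy, &expected, true))
        return false;
    driver_wait();
    atomic_store(&busy, false);
    return true;
}
```

With the plain "If _busy Return / _busy=True" pair, two threads could both read the flag as false before either one sets it; with the atomic version only one of them gets past the guard.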

[edit]By looking even deeper into the source (under MacOSX) I found that there are even more routines which use a similar approach, e.g. "bbSystemIntr" ("Intr" sounds like "interrupt"?)
void bbSystemIntr(){
  if( !appWaiting ) return;
  appWaiting=0;
  [NSApp postEvent:anullEvent atStart:NO];
}

Is this approach "thread-safe"?
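
If a routine like this can be reached from more than one thread, two callers could both see the flag set and both post the null event. A minimal C11 sketch of a test-and-clear that lets only one caller through is shown here; `app_waiting`, `post_null_event` and `system_intr` are hypothetical names, and the stand-in merely counts where the real routine posts to NSApp.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for appWaiting. */
static atomic_int app_waiting = 0;

/* Hypothetical stand-in for posting the null event; counts calls only. */
static int posted = 0;
static void post_null_event(void) { posted++; }

void system_intr(void)
{
    /* atomic_exchange reads the old value and writes 0 in one
       indivisible step, so only one caller can observe a non-zero flag. */
    if (atomic_exchange(&app_waiting, 0) == 0)
        return;
    post_null_event();
}
```

This only makes the flag handling atomic; whether the posting call itself may be issued from a secondary thread is a separate question that the sketch does not answer.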


Rozek(Posted 2010) [#3]
Hmmm,

just for the record: "postEvent" and "waitEvent" etc. use a circular buffer with "queue_get" and "queue_put" as indices. Shouldn't these variables also be bit-ANDed with "QUEUEMASK" when incremented? Or what will happen in case of a range overflow (well, 2 billion events may be quite a lot, though, until something bad may happen)?
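
The overflow question can be explored with a small C sketch. It assumes a power-of-two buffer whose indices run freely and are masked only at the point of use; the names are borrowed from the post, while the `int` payload and the functions `put_event` and `get_event` are hypothetical. Unsigned indices are used because signed overflow is undefined in C, so this says nothing definitive about BlitzMax's own integer type.

```c
#include <stdint.h>

#define QUEUESIZE 256u                 /* must be a power of two */
#define QUEUEMASK (QUEUESIZE - 1u)

static int      queue[QUEUESIZE];
static uint32_t queue_put = 0;         /* free-running, masked only when indexing */
static uint32_t queue_get = 0;

/* Returns 1 on success, 0 if the buffer is full. */
int put_event(int id)
{
    /* Unsigned subtraction stays correct across wrap-around. */
    if (queue_put - queue_get == QUEUESIZE) return 0;
    queue[queue_put++ & QUEUEMASK] = id;
    return 1;
}

/* Returns 1 and stores the oldest event in *id, or 0 if the buffer is empty. */
int get_event(int *id)
{
    if (queue_put == queue_get) return 0;
    *id = queue[queue_get++ & QUEUEMASK];
    return 1;
}
```

Starting both indices just below the 32-bit limit and pushing a few events through shows that the slots wrap from 255 back to 0 and the fill level stays correct, so under these assumptions masking at increment time is not required.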


Rozek(Posted 2010) [#4]
Well,

my application seems to lock itself up within "waitEvent" - unfortunately, however, I don't know how to deal with ".m" files and, thus, can't help myself any further...

In order to proceed (and progress is now getting urgent for me - after dealing with MT issues for quite a while!), it might already be helpful to know where bbSystemIntr, bbSystemWait and bbSystemPoll get called - I *might* be able to provide workarounds then...

Did anybody have similar experiences with events in MT applications?


ima747(Posted 2010) [#5]
I don't know exactly how you're working with the events, but events are not thread-safe. I didn't have any problems posting and emitting from child threads, though it shouldn't be safe... but you definitely can't poll or wait without some nasty issues. I currently use a secondary event queue. I have a function called QueEvent, which a child thread calls with an event. It locks the mutex guarding the queue TList, appends the new event at the end, then unlocks. The main thread, meanwhile, locks the queue TList, posts (or emits, whichever is better for you) all the events in the queue, then clears it and unlocks. This requires your event loop to be running constantly to check the queue for new events. I don't need events from a child THAT frequently, so I can call waitsystem() to idle the main thread when there's nothing major going on and it's not a big deal; your flow may vary.
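
The secondary-queue pattern described above might look roughly like this in C with a POSIX mutex. It is a sketch under assumptions, not the poster's code: events are reduced to `int` ids, the fixed-size array replaces the TList, and `que_event`, `flush_events` and `record_post` are hypothetical names, with `record_post` standing in for the real post or emit call.

```c
#include <pthread.h>

#define PENDING_MAX 64

static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;
static int pending[PENDING_MAX];
static int pending_count = 0;

/* Hypothetical stand-in for posting to the real event queue. */
static int posted_ids[PENDING_MAX];
static int posted_n = 0;
void record_post(int id) { posted_ids[posted_n++] = id; }

/* Called from child threads. Returns 1 if queued, 0 if the side queue is full. */
int que_event(int id)
{
    int ok = 0;
    pthread_mutex_lock(&pending_lock);
    if (pending_count < PENDING_MAX) {
        pending[pending_count++] = id;
        ok = 1;
    }
    pthread_mutex_unlock(&pending_lock);
    return ok;
}

/* Called from the main thread only. Copies the pending events while
   holding the lock, clears the side queue, then posts them after
   unlocking. Returns the number of events posted. */
int flush_events(void (*post)(int id))
{
    int batch[PENDING_MAX];
    int n;
    pthread_mutex_lock(&pending_lock);
    n = pending_count;
    for (int i = 0; i < n; i++) batch[i] = pending[i];
    pending_count = 0;
    pthread_mutex_unlock(&pending_lock);
    for (int i = 0; i < n; i++) post(batch[i]);
    return n;
}
```

The main loop would call `flush_events(record_post)` once per iteration. One deliberate difference from the description: the events are posted after the lock is released, so child threads are never blocked while the main thread is posting.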


Rozek(Posted 2010) [#6]
Logan,

in principle, all "postEvent" and "wait/pollEvent" calls should already be made from my main thread only (but it's difficult to say as there might be some BMX commands which access the event queue internally).

Additionally, there are those bbSystemIntr, bbSystemWait and bbSystemPoll calls which don't look very thread-safe but might get called from non-MaxGUI commands (perhaps even from "delay"?)

As a consequence, even with all MaxGUI-related stuff in the main thread, there might be situations where event-related stuff gets called from within a sub-thread. My problem is that I am heavily switching contexts and, thus, have a high probability of running into race conditions (on the other hand, this simplifies testing, of course).


Rozek(Posted 2010) [#7]
Hmmm,

I just tried the same application under Windows - and could not yet get it to lock up. Thus, it seems as if it could be a Mac-specific problem.

I'll keep you informed...