Threads - LockMutex freeze up?

BlitzMax Forums/BlitzMax Programming/Threads - LockMutex freeze up?

Retimer(Posted 2010) [#1]
I'm having occasional freezes caused by LockMutex. If I understand correctly, if I lock a mutex and it is never unlocked, any other thread that tries to lock that mutex will basically freeze up... however, that doesn't seem to be what's happening here.

I have a thread for each user, and they tend to access the same functions. Those functions use LockMutex and, of course, UnlockMutex right after accessing/dealing with the resource. There's nothing overly complicated (and it's not happening with just one function), but for some reason it still randomly freezes over time.

I haven't tried TryLockMutex because the logic the threads are calling is pretty important (such as saving).

I'm still using BlitzMax 1.34, as I didn't notice any major changes to threads in the latest update.

Any clues? I'm going crazy with this issue :(


Kurator(Posted 2010) [#2]
Sounds like a deadlock :)

Is it possible that those functions are called several times from various threads?

If so, look for:

Timestep 1: Thread A locks a resource in Function D
Timestep 2: Thread B locks a resource in Function E
Timestep 3: Thread A also wants to access Function E --> blocks and waits...
Timestep 4: Thread B also wants to access Function D --> blocks and waits...
Both A and B wait forever...

Be sure to always lock resources in the same order (and preferably release them in reverse order). Hold each lock for as short a time as possible.
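Kurator's scenario can be sketched in a few lines (Python here, purely to illustrate the pattern; the same logic applies to BlitzMax mutexes). The fix shown is the one he describes: both threads acquire the two locks in the same global order, so the circular wait from the timestep example above cannot happen.

```python
import threading

lock_d = threading.Lock()  # protects the resource used by "Function D"
lock_e = threading.Lock()  # protects the resource used by "Function E"

results = []

def worker(name):
    # Deadlock prevention: BOTH threads acquire the locks in the same
    # global order (D before E). If thread A took D->E while thread B
    # took E->D, they could each block waiting on the other forever.
    with lock_d:
        with lock_e:
            results.append(name)

a = threading.Thread(target=worker, args=("A",))
b = threading.Thread(target=worker, args=("B",))
a.start(); b.start()
a.join(); b.join()
print(sorted(results))  # both threads finish: ['A', 'B']
```

If you swap the acquisition order in only one of the threads, the program can hang exactly as described in the timestep scenario.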


Retimer(Posted 2010) [#3]
Sounds haunting...

Pretty much all of my mutex locks are short, but I'll go through them all again and see if any of the functions could be hitting that scenario.

Really appreciate it Kurator!


GfK(Posted 2010) [#4]
Has anybody else noticed that LockMutex works, while myMutex.Lock() does not? I get an EXCEPTION_ACCESS_VIOLATION if I do it the OO way.

I haven't posted in bug reports yet, but much of the OO threading stuff seems to have problems - I get the same error with Thread.Create() too.


Retimer(Posted 2010) [#5]
Yep... I finally narrowed it down to a timer that randomly called a couple of functions that caused this. I've gone a safer but more costly route: each user processes the timer functions separately in their own thread, instead of a single separate thread working through multiple mutexes.

Thanks again, Kurator. This issue was so random and so annoying that I don't think I would have figured out deadlocking by trial and error, haha.


@gfk
Could that possibly just be a 1.36 issue? I was getting that from so many modules when I tried updating, so I reverted to 1.34 for now.

And... I just tested on 1.34; I'm not receiving the error using the threading object methods like that. 1.36 seems far from ready; a bit disappointing.

Also, from what I've seen in the past, using object methods appears a bit quicker than using global functions, so that's a shame.


GfK(Posted 2010) [#6]
Just been playing with threading stuff in 1.36 and can't reproduce any of the errors today.

Not too concerned with having to resort to a procedural approach anyway, as long as it works.


Retimer(Posted 2010) [#7]
There are not many examples to go by, but I find that I still need (or prefer) separate threads calling certain functions, such as NPC AI (if an NPC attacks a player, I need to access that player's resources for logic and the socket).
There are also times when one player's logic interferes with another player's (player 1 heals player 2, so player 1's thread accesses player 2's resources for logic/data sending).

What I'm doing adds a bit of overhead, but I think it's the safest bet: I pass the 'igniter' (the user whose thread is performing the function) into every function. If the function accesses multiple users' resources, it makes sure to lock their mutexes while NOT locking the igniter's own mutex (which is already held), to prevent deadlocking.


example:

PlayerA thread function (repeats):
	LockMutex(GetMutex(PlayerA))
	Player_Heals_All(PlayerA)
	UnlockMutex(GetMutex(PlayerA))
End of PlayerA thread function

Function Player_Heals_All(PlayerA:TPlayer)
	PlayerA.Mana :- 10
	For Local t:TPlayer = EachIn PlayerList
		If t <> PlayerA Then LockMutex(GetMutex(t))
		HealPlayer(t)
		If t <> PlayerA Then UnlockMutex(GetMutex(t))
	Next
End Function


Just curious if there's a better way, or if someone has done something similar in a different way?

It's hectic, but I don't really see a better or safer way. I feel like I'm in another dimension with this stuff... threads appear to be a bit more complicated than I initially thought, but useful nonetheless.


Otus(Posted 2010) [#8]
It's hectic, but I don't really see a better or safer way. I feel like I'm in another dimension with this stuff... threads appear to be a bit more complicated than I initially thought, but useful nonetheless.

Some general things I find helpful when thinking about threads:

If you never lock two mutexes at the same time, you cannot have deadlocks. E.g. in your example, it might be better to have one mutex for all players.

If you do need to lock two mutexes, make sure it always happens in the same order. E.g. use .Compare to always lock the mutex for the "smaller" object first, when locking mutexes for objects of the same type.

The more specific your mutexes, the less likely a thread has to wait on one. E.g. one mutex for players' position, one for health+mana etc.

Keep the mutex locked for as short as possible. Move everything you can to before locking or after unlocking.

In AI and some other cases you don't care about an occasional glitch in the data you *read* (like opponent's position, own health). Then you can often get by without a lock, though the code isn't technically thread-safe.
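Otus's second tip (use .Compare to lock the "smaller" object first) can be sketched like this. The example is Python, used only to illustrate the pattern; the Player class, pid field, and heal helper are hypothetical stand-ins, with an integer id playing the role of .Compare:

```python
import threading

class Player:
    def __init__(self, pid):
        self.pid = pid                   # stand-in for .Compare ordering
        self.lock = threading.Lock()     # stand-in for per-player mutex
        self.health = 100

def lock_pair(p1, p2):
    # Always lock the "smaller" player first, so any two threads that
    # need the same pair of locks agree on the acquisition order.
    first, second = (p1, p2) if p1.pid < p2.pid else (p2, p1)
    first.lock.acquire()
    second.lock.acquire()
    return first, second

def unlock_pair(first, second):
    second.lock.release()
    first.lock.release()

def heal(healer, target, amount=10):
    first, second = lock_pair(healer, target)
    try:
        target.health += amount
    finally:
        unlock_pair(first, second)

alice, bob = Player(1), Player(2)
heal(alice, bob)
heal(bob, alice)  # opposite direction, but same lock order -> no deadlock
print(alice.health, bob.health)  # 110 110
```

The key point is that heal(alice, bob) and heal(bob, alice) both acquire alice's lock first, so two threads healing each other concurrently can never end up in the circular wait from the earlier posts.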


Retimer(Posted 2010) [#9]
Thanks for the tips!

Although in that example, if PlayerB, PlayerC, etc. were all performing heals at the same time, wouldn't a single mutex mean boom, as they'd end up deadlocking when calling HealPlayer (assuming HealPlayer accesses that player's resources / sends data to the player)?


Dreamora(Posted 2010) [#10]
The question, though, is how heavy your AI is if you actually need more than one thread for it. AI isn't so math-heavy that it easily fills more than a core, since it isn't evaluated in real time but every XX / XXX ms per AI entity (at least not if you want entities to gather knowledge about the environment, which would be too costly if you did the AI in a distinct thread, and would flood the physics / collision / path system backends with requests).
Just using threads for the sake of "you might need it in a hundred years" is a pretty bad idea: the overhead of threads is not small, and spawning threads like crazy will in the end lead to worse performance than you had before using threads at all.

The function above is a good example of at least two reasons not to use threads:

1. On its own, the function takes so little time that it's basically a fraction of the thread creation / destruction time, so you are losing 100% - xxxx% performance over not threading it.

2. Due to the lock in the loop, you trade that overhead for an even worse situation: there's a real chance it just waits "forever" to finish the function, at no gain. A single mutex would not reduce the problem; it would be just as bad, as any access by any AI entity to its data would cause it to lock.


Spawning a thread for anything that takes only moments to execute is a waste of execution time as well as other resources.
In this case you might consider writing a thread-pool-based worker system: a fixed number of threads that just keep running, plus a job queue. The threads check the job queue, process job after job, and otherwise idle; you add jobs to the queue to feed them.
Jobs could look like network messages.
These threads would then call other micro-functions based on the job.

That way you get much better use of the resources than with the naive approach above.
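The worker-pool idea Dreamora describes can be sketched minimally like this (Python for illustration; the pool size, job format, and "square" job kind are all made up for the example):

```python
import queue
import threading

NUM_WORKERS = 4                 # fixed pool size (assumption; tune to cores)
jobs = queue.Queue()            # the shared job queue
results = queue.Queue()         # where workers put their output

def worker():
    # Each worker pulls jobs (e.g. decoded network messages) and
    # dispatches them; it idles inside Queue.get() when there is no work,
    # so no threads are created or destroyed per job.
    while True:
        job = jobs.get()
        if job is None:          # sentinel value: shut this worker down
            break
        kind, payload = job
        if kind == "square":     # a job "kind" dispatches to a micro-function
            results.put(payload * payload)

pool = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
for t in pool:
    t.start()

for n in range(5):               # enqueue some work
    jobs.put(("square", n))
for _ in pool:                   # one sentinel per worker to stop the pool
    jobs.put(None)
for t in pool:
    t.join()

print(sorted(results.queue))     # [0, 1, 4, 9, 16]
```

The point is the shape, not the math: the threads live for the whole run, and the only synchronization is the thread-safe queue, so per-job overhead stays tiny compared with spawning a thread per task.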


Retimer(Posted 2010) [#11]
The use of threads is needed in my case for sockets, but I'll admit I'm getting a bit carried away by using each thread to process a player's entire logic. At this point I'm treating it more as a challenge than anything, since I'm sure I'll end up needing to understand the complexities of threads in other uses at some point anyway.

Also - is that you, Dreamora, that I see a lot over at Unity?


Otus(Posted 2010) [#12]
Although in that example, if PlayerB, PlayerC, etc. were all performing heals at the same time, wouldn't a single mutex mean boom, as they'd end up deadlocking when calling HealPlayer (assuming HealPlayer accesses that player's resources / sends data to the player)?

No, the others would just wait on the mutex until the first one completed, so the healing would be serialized.

But yeah, I agree with Dreamora that if you don't need threads, it is often faster not to use them. If you do need them, try to keep the data that multiple threads need to access to a minimum. And if you are interested in a job queue approach, you might want to have a look at my actor module.


Kurator(Posted 2010) [#13]
I fully agree with Dreamora. Threads are relatively expensive, and even mutexes are costly, because they involve syscalls.

This is also the reason why the "big" languages like C++, C#, and Java came up with higher-level parallel-processing abstractions: thread pools, worker threads, agents, message orientation, and so on.


Dreamora(Posted 2010) [#14]
The use of threads is needed in my case for sockets, but I'll admit I'm getting a bit carried away by using each thread to process a player's entire logic. At this point I'm treating it more as a challenge than anything, since I'm sure I'll end up needing to understand the complexities of threads in other uses at some point anyway.


I understand the reasoning for a thread to handle all the networking, naturally (especially on a server backend).
Assuming that thread processes the incoming messages and creates events for the application to use, you could put the whole AI into a single thread too, and handle communication between them through event queues (potentially dedicated ones for those two threads to communicate).

Also - is that you dreamora that I see a lot over at unity?

Yes


Retimer(Posted 2010) [#15]
Assuming that thread processes the incoming messages and creates events for the application to use,


Couple questions here..
1. Do you mean using one thread for incoming messages from all clients, or multiple threads?

2. The reason for multithreaded sockets is so that sending data to one client doesn't affect the others through an overflowed buffer, delays and so forth, correct? So reading what is already in all the sockets' buffers and dealing with it doesn't require a heavily multithreaded approach, I'm assuming?

Would it make sense to use a single thread for basically all server logic (AI, timers, dealing with incoming data), and multiple other threads for sending data to the clients?

If so, would it be unwise to have the server logic thread (the main thread) write to a bank stream when needed, and have the client threads check when a client's bank has data and write that data to the socket stream?


By the way, I really appreciate all the help here. A lot of unanswered questions... I would have thought Blitz users would have gotten more into the hype around threads.


Dreamora(Posted 2010) [#16]
1. One thread for the whole incoming message handling, unless you intend it for a backend with tens of thousands of users.
The problem I could foresee is that with multiple threads you would potentially need more than one queue for the network events (that requires testing, though), because the locking could stall the various network threads and actually decrease throughput.
In that case I would go with per-thread queues plus a worker thread that grabs all the events from those queues and pushes them in a full batch (once per x milliseconds) into a single application event queue.
As I've never written anything on that scale, I might be missing smarter solutions, so please take this as a basic "lock prevention approach".
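The per-thread-queue idea above can be sketched like this (Python for illustration; the queue counts, the 1 ms drain interval, and the event strings are made-up stand-ins for the "once per x milliseconds" batching Dreamora describes):

```python
import queue
import threading
import time

# One private queue per network thread, so producers never contend
# with each other; only the collector touches all of them.
per_thread_queues = [queue.Queue() for _ in range(3)]
app_queue = queue.Queue()   # the single queue the application reads

def collector(stop):
    # Periodically drain every per-thread queue and push the events
    # into the application queue as one batch (the lock-prevention idea).
    while not stop.is_set() or any(not q.empty() for q in per_thread_queues):
        batch = []
        for q in per_thread_queues:
            while not q.empty():
                batch.append(q.get())
        for event in batch:
            app_queue.put(event)
        time.sleep(0.001)   # stand-in for "once per x milliseconds"

stop = threading.Event()
t = threading.Thread(target=collector, args=(stop,))
t.start()

# Simulate each network thread posting an event to its own queue.
for i, q in enumerate(per_thread_queues):
    q.put("event-%d" % i)

stop.set()                  # collector drains everything before exiting
t.join()
print(sorted(app_queue.queue))  # ['event-0', 'event-1', 'event-2']
```

The producers only ever touch their own queue, so the only shared point of contention is the application queue, and it is hit in batches rather than per event.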

2. That's one reason. The other is that the network buffers are restricted in size, so with many connected users you need 2-4 threads to work through the incoming messages (each thread taking care of a specific, balanced port range).


The outgoing amount of data naturally isn't easy to handle either, but unlike incoming data, it can't be lost to an overflowing receive buffer on the server.

Depending on the number of users you expect, your approach can easily work too, and it makes sense to keep it as simple as possible. More threading just adds more work on the design end if you want it to work right, because debugging "not right" is hard to near impossible when the code flow is the issue.



Threading hype: I guess quite a few got into it and then realized that it is not as trivial as they thought, or that the work to get it running nicely enough to see a gain is just too much for their own needs.
Casual games, for example, will never need a second thread.
It's tools that generate cleanly separated data, and servers, that benefit the most (and most easily) from it.

It's no coincidence that even big-budget games often don't unleash much threading power beyond AI, physics and networking.
Supreme Commander is one of the few games I know that succeeded in using the full scale that multicores can offer, but to do so it basically had to resort to scripting for a large part of the game logic (the AI and such is basically in Lua, fully accessible to any user for modding) so it can run the VMs on distinct cores.
I'm hoping we will someday see a Stackless Python module for BlitzMax, as I think it would be one of the strongest and cleanest ways to unleash multicore usage. But you could naturally just write a base in C++ and the rest straight in Stackless Python on top of it...


Retimer(Posted 2010) [#17]
Thanks for clearing that up Dreamora.

While I expect a large player base over time, I certainly don't expect several thousand simultaneous players, especially not in the next couple of months.

Since you say my simple design idea should work, I'm going to give that a whirl, and possibly leave some room for scalability.

If I had more time to play around with this I would, but I jumped the gun before noticing this deadlocking (a few testers rarely hit a 1-in-10,000 chance; many simultaneous users multiply the odds) and have already begun pushing the app.

Cheers to all the shared knowledge :)

After this project, I'm sure you'll see some "HELP!" posts from me over at Unity, hehe.