Load a bunch of stuff in a thread

GfK(Posted 2014) [#1]
I've never really got my head around this threading lark, despite reading about it over and over (and making my PC crash a lot), so I wonder if somebody could write some example code for me.

I want to load a bunch of stuff - some images, and some OGG files - it takes about half a second, during which I'd rather not have a pause, so I figured I might be able to do it in a separate thread.

I won't be attempting to use anything that's being loaded, until everything is loaded. I basically want to set the thread off loading stuff, then go "yoohoo! I'm finished!", when it's done what it's got to do, so my main program can start faffing with the things that have just been loaded.

Any help muchly appreciated.


col(Posted 2014) [#2]
Hiya,

If that really is all you want to do, then you don't need any special techniques or locks. Think of a second thread as a standalone function that runs side by side with your main thread. All threads have access to the main variables (globals), or you can pass in an object that holds any variables you want the thread to use. You can use the ThreadRunning function to find out whether your thread is done. As long as your main thread doesn't touch the variables that are being used in the second 'loading thread', it's really straightforward.

A bare-bones example to help get your head around it (it needs a threaded build); in your game you'd obviously flesh it out to something realistic:

Graphics 800,600

Local f#
Local LoadingThread:TThread

Repeat

	' fake a loading process by pressing SPACE
	If KeyHit(KEY_SPACE) And Not LoadingThread
		LoadingThread = CreateThread(LoadThread,Null)
	EndIf
	
	' Is loading finished?
	If LoadingThread
		If ThreadRunning(LoadingThread)
			' do some anim while loading
			Cls
			
			DrawText "Loading something in another thread and doing anim in main thread",10,10
			f :+ 1
			DrawLine 400 + 200 * Cos(f),300 + 200*Sin(f),400 - 200 * Cos(f),300 - 200*Sin(f)
		Else
			LoadingThread = Null
		EndIf
	Else
		Cls
		DrawText "Not loading anything so play game or do something else.",10,10
	EndIf
	
	Flip
Until KeyDown(KEY_ESCAPE)

Function LoadThread:Object(Data:Object)
	' Fake loading by putting a msg in the console window 
	For Local a:Int = 1 To 5
		Delay 1000	' delay 1 second
		Print a
	Next
	Return Null	' nothing to hand back to the main thread
EndFunction



GfK(Posted 2014) [#3]
Hmm... seems straightforward enough. Before, it was all the talk about mutexes and semaphores and God knows what else that flummoxed me.

I made this. It's not pretty but it's a start, and it seems to work. Am I doing anything potentially dangerous?

Oh, and the thread Function - can it be a method in a class, instead of a function?

Strict

Graphics 1024, 768

Global sndMap:TMap = New TMap 'created before the thread starts, because the thread writes to it
Local sound:TSound

Local thread:TThread = CreateThread(loadStuff, Null)

Repeat
	Cls
	If ThreadRunning(thread)
		DrawText "Loading", 50, 50
	Else
		DrawText "Stuff is loaded", 50, 50
		If sound = Null
			sound = TSound(sndMap.ValueForKey("map.ogg"))
			If sound
				PlaySound(sound)
			EndIf
		EndIf
	EndIf
	DrawText MilliSecs(), 50, 100 'to show that stuff is still happening
	Flip
Until KeyHit(KEY_ESCAPE) Or AppTerminate()


Function loadStuff:Object(data:Object)
	Local sound:TSound = LoadSound("map.ogg")
	sndMap.Insert("map.ogg", sound)
End Function



col(Posted 2014) [#4]
Yep, that's ok.

The multithreading disasters only arise when one thread is writing to a variable while another thread is reading it. In that scenario you get inconsistent results at best, and more likely a crash from incomplete or corrupt data. A mutex is a lock that only one thread can hold at a time; any other thread that tries to lock it has to wait until it is unlocked. That waiting can stall threads, which matters, especially if you forget to unlock a mutex. You can also end up with deadlocks: one thread holds a lock while it waits for data from a second thread, but the second thread is itself waiting for that lock, so neither can ever continue. Something else to think about: once the data is loaded, and if you know it is never going to change, it's perfectly safe for any number of threads to READ it at the same time. The problems only come about when reading AND writing the same variable from different threads at the same time.

On the other hand, there are atomic functions that let you read and write a variable of an atomic type without using a mutex to protect it - in 'Max land an atomic type would be an Int. Without going on about this bit too much, an Object variable is a 32bit Int value as it's passed around by reference, so things can be carefully planned out to make 'lock free' code, but you need to be REALLY careful and it's not easy - it's easier to use a mutex. Multithreading is a big area, but what you want to do with it is nice, easy and straightforward.
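
As a small illustration of the atomic functions mentioned above, here is a minimal sketch of a progress counter, assuming a threaded build. The names loadedCount, TOTAL_FILES and CountingLoader are made up, and the Delay only stands in for real loading:

```blitzmax
SuperStrict

Const TOTAL_FILES:Int = 5       ' hypothetical number of files to load
Global loadedCount:Int = 0      ' written by the loader thread, read by the main thread

Function CountingLoader:Object(data:Object)
	For Local i:Int = 1 To TOTAL_FILES
		Delay 500                    ' stand-in for a real LoadImage/LoadSound call
		AtomicAdd(loadedCount, 1)    ' atomic increment, no mutex involved
	Next
	Return Null
End Function

Graphics 800, 600
Local thread:TThread = CreateThread(CountingLoader, Null)

Repeat
	Cls
	DrawText "Loaded " + loadedCount + " of " + TOTAL_FILES, 10, 10
	Flip
Until Not ThreadRunning(thread) Or KeyHit(KEY_ESCAPE)
```

Only one thread ever writes the counter and the main thread only reads it, which keeps this well inside the "easy" territory; anything more elaborate is probably better served by a mutex.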


Oh, and the thread Function - can it be a method in a class, instead of a function?



Yes, it has to be a function. You can use a function inside a type, but it does need to be a function, not a method.
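
One way to get method-like behaviour is to pass the instance in through the data parameter and call a method from inside the function. A minimal sketch, assuming a threaded build; TLoader, Run and LoadAll are made-up names:

```blitzmax
SuperStrict

Type TLoader
	Field sounds:TMap = New TMap

	' Thread entry point: it has to be a Function, so the instance arrives via data.
	Function Run:Object(data:Object)
		Local loader:TLoader = TLoader(data)
		If loader Then loader.LoadAll()
		Return loader
	End Function

	' Ordinary method, called on the loader thread by Run.
	Method LoadAll()
		Local sound:TSound = LoadSound("map.ogg")
		If sound Then sounds.Insert("map.ogg", sound)
	End Method
End Type

Local loader:TLoader = New TLoader
Local thread:TThread = CreateThread(TLoader.Run, loader)
WaitThread(thread)   ' a real game would poll ThreadRunning instead of blocking here
If loader.sounds.Contains("map.ogg") Then Print "map.ogg loaded"
```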


Derron(Posted 2014) [#5]
Because it all sounds like you want a "loading screen" that shows what is being done right now:

Use mutexes. Mutexes are useful for blocking access to variables used by multiple threads.

Before manipulating a variable which both threads access, you call LockMutex(aMutexYouCreatedFirst), and after finishing the manipulation you call UnlockMutex(aMutexYouCreatedFirst).

LockMutex makes sure an object is not manipulated by two threads at the same time: it waits until the other thread has unlocked the mutex again.

If the data is not crucial and you do not want to block the flow of a thread, use TryLockMutex(...) - it returns True if it was able to get the lock on this pass through the loop. This is useful if you e.g. want to fetch from a queue of something (events, objects to load).

A more specific example for you:
Your loader object has a list of what to load.
The loader should do its work in a second thread.
If your main thread now wants to add resources to load, it can do a TryLockMutex() - if that succeeds, the loader is not holding the mutex at that moment and you can modify e.g. loader.list. After adding, you unlock the mutex.
Meanwhile the loader object loops in its own thread and checks whether there is something to do. Once it has loaded something and wants to remove it from the list of "to-load objects", the loader cannot just use TryLockMutex - it has to be sure the entry gets removed from the list, which is why LockMutex has to be used. The loader thread is then halted until it gets the lock. Once it has the mutex (the next line of code is executed now) it can safely remove the entry from the list and unlock the mutex again.

You see: use TryLockMutex to avoid halted threads (e.g. the main thread, to keep displaying things smoothly) and use LockMutex if something has to happen on this pass through the loop.
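
The scheme described above could be sketched roughly like this, assuming a threaded build. The names listMutex, toLoad, stopLoader, RequestLoad and LoaderThread are made up for illustration, and the actual loading is left as a comment:

```blitzmax
SuperStrict

Global listMutex:TMutex = CreateMutex()
Global toLoad:TList = New TList      ' paths still waiting to be loaded
Global stopLoader:Int = False        ' the main thread sets this to True to end the loader

' Main thread: never block the render loop. If the lock is taken, give up and retry next frame.
Function RequestLoad:Int(path:String)
	If Not TryLockMutex(listMutex) Then Return False
	toLoad.AddLast(path)
	UnlockMutex(listMutex)
	Return True
End Function

' Loader thread: it must see and update the list, so it blocks on LockMutex.
Function LoaderThread:Object(data:Object)
	Repeat
		Local path:String = ""
		LockMutex(listMutex)
		If Not toLoad.IsEmpty() Then path = String(toLoad.First())
		UnlockMutex(listMutex)

		If path = ""
			Delay 10                     ' nothing queued, idle briefly
		Else
			' ... load the file at path here, outside the lock ...
			LockMutex(listMutex)
			toLoad.RemoveFirst()         ' finished, take it off the to-do list
			UnlockMutex(listMutex)
		EndIf
	Until stopLoader
	Return Null
End Function
```

The main thread would start the loader with CreateThread(LoaderThread, Null) and call RequestLoad once per frame until it returns True. Loading happens outside the lock so the main thread's TryLockMutex rarely fails.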


I know that prose isn't the same as posting sample code - but maybe it helps to understand things.



bye
Ron


Yasha(Posted 2014) [#6]
an Object variable is a 32bit Int value as it's passed around by reference so things can be carefully planned out to make 'lock free' code, but you need to be REALLY careful and it's not easy - it's easier to use a mutex.


If I may add a little bit to this:

It is not, repeat not, possible to write safe (and efficient) lock-free code in pure BlitzMax that allows writes from multiple threads. All the efficient/classical lock-free algorithms predate modern CPU design and rely on in-order instruction execution; the fact that modern CPUs execute instructions in parallel even on a single core effectively wrecks all of the assumptions behind this style of coding. As a result it's no safer than uncontrolled data sharing (the solution is either to fake locks using something extremely inefficient like a stream or a timer, which is pointless as it's worse than the real thing, or to use barrier instructions, which you don't have access to from above the assembly level). (edit: only true for archaic definitions of "lock free")

The standard solution proposed by most languages/systems is to avoid code that writes outside its own thread at all. Don't allow your threads to share any writable data objects and you're guaranteed to be safe: just give them something read-only to work from and have them all report their newly created result data in turn when they're done through a standardised safe returning object (write mutex code once for this object and never think about it again, basically). You can enforce read-only attributes either by convention or by abstracting the interface. In practice it's rare for good code to need to share data between threads that often (if they're sharing, they're waiting, and the operation is then basically serial anyway), and this is increasingly the case as the number of threads rises.

In the case of the original question, for instance, there's no need for any sharing in between the order to start loading, and the report that loading is done containing the created data. No locks, or "lock-free" substitutes: no contact at all.


GfK(Posted 2014) [#7]
What I've decided to do is have an object containing two TMaps. The thread loads stuff into one TMap for images, the other for sounds. When it's done it passes the whole lot to the main thread via WaitThread(). Then all the main thread has to do is pull out the sound/image handles, referenced by filename.
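
A minimal sketch of that plan, assuming a threaded build; TAssets, LoadAssets and the file names are made up for illustration. WaitThread is only called once ThreadRunning has returned False, so the main loop never stalls on it:

```blitzmax
SuperStrict

Type TAssets   ' hypothetical container handed back by the thread
	Field images:TMap = New TMap
	Field sounds:TMap = New TMap
End Type

Function LoadAssets:Object(data:Object)
	Local assets:TAssets = New TAssets   ' only this thread touches it while loading
	assets.images.Insert("logo.png", LoadImage("logo.png"))
	assets.sounds.Insert("map.ogg", LoadSound("map.ogg"))
	Return assets                        ' picked up by WaitThread in the main thread
End Function

Graphics 800, 600

Local thread:TThread = CreateThread(LoadAssets, Null)
Local assets:TAssets

Repeat
	Cls
	If thread And Not ThreadRunning(thread)
		assets = TAssets(WaitThread(thread))   ' thread has finished, so this returns at once
		thread = Null
	EndIf
	If assets
		DrawText "Stuff is loaded", 10, 10
	Else
		DrawText "Loading", 10, 10
	EndIf
	Flip
Until KeyHit(KEY_ESCAPE) Or AppTerminate()
```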

It's not for a loading screen, Derron. And I'm sure all that you said (well, most of it) is correct, but all of that is exactly why I've avoided threading so far. And the documentation for threading is still beyond useless, which has not helped matters. It explains almost nothing.


Derron(Posted 2014) [#8]
I did not get your last paragraph...
[EDIT: GfK posted in the same second - the above refers to Yasha]

Sure, there is no need to share data between the "loading screen" and the "loading thread" - but sharing would allow the progress to be displayed smoothly (loading one item can take longer than the render interval).

bye
Ron


Kryzon(Posted 2014) [#9]
When it's done it passes the whole lot to the main thread via WaitThread().

I believe that WaitThread() stalls the thread that calls it so it can wait for the other to finish.
This would defeat the purpose of using threaded loading for not stalling the program.

I always detach the thread - it won't return an Object value.
You can still retrieve data from it by having it emit events.

I won't be attempting to use anything that's being loaded, until everything is loaded.

In case one is after a way to use the data as soon as it is loaded by the thread, one can do it with events as such:
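
A rough sketch of per-item notification, assuming a threaded build. Note the swap: instead of the built-in event queue, whose thread safety is questioned in post #12, this uses a mutex-guarded TList as the message channel. All names and file names here are made up:

```blitzmax
SuperStrict

Global resultMutex:TMutex = CreateMutex()
Global results:TList = New TList     ' names of files the loader has finished

Function NotifyingLoader:Object(data:Object)
	Local files:String[] = ["a.png", "b.png", "map.ogg"]   ' hypothetical files
	For Local name:String = EachIn files
		Delay 300                     ' stand-in for the real load
		LockMutex(resultMutex)
		results.AddLast(name)         ' tell the main thread this one is ready
		UnlockMutex(resultMutex)
	Next
	Return Null
End Function

' Main thread, once per frame: returns "" when nothing new has finished.
Function NextLoaded:String()
	Local name:String = ""
	If TryLockMutex(resultMutex)
		If Not results.IsEmpty() Then name = String(results.RemoveFirst())
		UnlockMutex(resultMutex)
	EndIf
	Return name
End Function

Local thread:TThread = CreateThread(NotifyingLoader, Null)
Repeat
	Local name:String = NextLoaded()
	If name <> "" Then Print "Ready: " + name
	Delay 16                          ' stand-in for one frame of the game loop
Until Not ThreadRunning(thread)

' Pick up anything posted just before the thread ended.
Local leftover:String = NextLoaded()
While leftover <> ""
	Print "Ready: " + leftover
	leftover = NextLoaded()
Wend
```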




GfK(Posted 2014) [#10]
I believe that WaitThread() stalls the thread that calls it so it can wait for the other to finish.
This would defeat the purpose of using threaded loading for not stalling the program.

Not if ThreadRunning() has just returned False.


Kryzon(Posted 2014) [#11]
I see. You are using ThreadRunning() to know if you can call WaitThread() or not.
If ThreadRunning( myThread ) Then
	
	'...

Else

	data = WaitThread( myThread )

EndIf



Derron(Posted 2014) [#12]
@Kryzon this is the way I do it too (with events), albeit I have a "TData" object which can contain everything in its TMap:

new TData().AddNumber("fileNumber",1).AddString("fileUrl", url).Add("file", file)

and afterwards
eventData:TData = TData(event.data) '(or how this eventmanager handles it)
if not eventData then return False
local fileNumber:int = eventData.GetInt("fileNumber", 0)
local fileUrl:string = eventData.GetString("fileUrl", "unknownString")
...

It is really nifty and easier to read than an array of objects [object("1"), ...].


As only object references (int values) are transported in the case of "objects", you should even be able to add your images/sounds to the event data, which then gets transported to the main thread using a threadsafe event system (the default event system is not threadsafe - at least I remember having read something like that).


@ThreadRunning/WaitThread:
So this is a one-time task instead of something waiting in the background for new things to do... Okay, for a loader this might be useful. I never thought of using it this way.

Nevertheless, I currently do not use threading at all, as I still have to use this ugly hack that I do not know how to get around:
http://www.blitzmax.com/Community/posts.php?topic=98439
(had to redirect some of the attention from this thread to mine... bad hijacking attempt :p).


bye
Ron


GfK(Posted 2014) [#13]
I got it working with little more than col's original post.

Just one question though - is it totally safe to just use LoadImage in a thread, as long as I'm not attempting to draw anything from there? Or should I load all the graphics as pixmaps/banks and pass that back to the main thread?


xlsior(Posted 2014) [#14]
is it totally safe to just use LoadImage in a thread, as long as I'm not attempting to draw anything from there?


As best as I can remember reading on the forums: Yes.

The *drawing* operations are definitely not thread-safe and should only be done from the main program.


col(Posted 2014) [#15]
Hiya,

All TImage create/load functions use TPixmaps in the background, and the image only gets uploaded to the GPU (via the appropriate driver texture creation functions) on the first DrawImage call that uses it, so yep, it's completely safe to load from a different thread, again as long as no other threads are trying to access it at the same time.

As xlsior says - DrawImage should only ever be called from the main thread.
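
For anyone who still prefers the more conservative route GfK asks about, here is a minimal sketch that loads only a pixmap in the thread and creates the TImage on the main thread. It assumes a threaded build; LoadPixmapThread and the file name are made up:

```blitzmax
SuperStrict

' Loader thread: decode the file only; nothing here draws anything.
Function LoadPixmapThread:Object(data:Object)
	Return LoadPixmap(String(data))
End Function

Graphics 800, 600

Local thread:TThread = CreateThread(LoadPixmapThread, "logo.png")   ' hypothetical file
Local image:TImage

Repeat
	Cls
	If thread And Not ThreadRunning(thread)
		Local pixmap:TPixmap = TPixmap(WaitThread(thread))
		If pixmap Then image = LoadImage(pixmap)   ' TImage is created on the main thread
		thread = Null
	EndIf
	If image Then DrawImage image, 0, 0            ' drawing stays on the main thread
	Flip
Until KeyHit(KEY_ESCAPE) Or AppTerminate()
```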


GfK(Posted 2014) [#16]
OK, guess I'm good to go then. Thanks for the help.


Kryzon(Posted 2014) [#17]
The default event system is not threadsafe - at least I remember having read something like that.

I was reflecting on this statement and it seems that the only problem is the array of TEvent objects that is used as the event queue.
If you use PollEvent or WaitEvent to retrieve threaded events in your main thread, you might read the array at the same time as it's being written to by one of your child threads - this is the "multiple access" disruption described by col in post #4.

To address this, the "brl.eventqueue.mod" would need to be modified (or receive additions) so that a mutex is used when accessing the array.
I believe these modifications should be enough:



Using the "mt" versions in the place of the regular ones.


Derron(Posted 2014) [#18]
Feel free to make it compilable with/without mt (?Threaded ... ?, ?Not Threaded ... ?) and then push your changes to Brucey's hub.

I think so too: only the "changed" parts should be mutex-secured.


Albeit integers seem to be threadsafe by default, I do not know whether this might change somehow; in that case the "index" variable should be secured too. That means you may be better off placing the "queue_put:+1" or "queue_get:+1" inside the locked section as well.


bye
Ron


GfK(Posted 2014) [#19]
Another related question. Suppose I've got a thread running to load some game stuff, and while the thread is running, the player decides he wants to go back to the main game menu. I need to stop the thread from running.

How do I achieve that? Do I just Null the thread handle?


[edit] Nope, Nulling the handle doesn't stop the thread immediately, which is what I want to do:
Graphics 800,600


Local thread:TThread = CreateThread(mythread,Null)
While Not KeyDown(key_escape)
	If KeyHit(key_space)
		If thread	
			thread = Null
		EndIf	
	EndIf	
	Delay 1
Wend	

Function mythread:Object(data:Object)
	Local s:Int = MilliSecs()
	Repeat
		DebugLog MilliSecs()
	Until MilliSecs() > S + 100000
End Function	



Derron(Posted 2014) [#20]
Doesn't a thread stop on its own?


If your thread function does not contain a "loop forever", it is just a simple task which ends when the function ends.


So if you do have this "repeat forever" approach in your thread, you will need a flag/variable that tells the thread to break out of the loop: "global _endThisThread:int=FALSE" ...


edit for your edit-inserted code:
global _endMyThread:int=FALSE
Function mythread:Object(data:Object)
	Local s:Int = MilliSecs()
	Repeat
		DebugLog MilliSecs()
	Until MilliSecs() > S + 100000 or _endMythread
End Function

'...
_endMyThread = TRUE



bye
Ron


Kryzon(Posted 2014) [#21]
To be extra immediate: since the global might be changed just after the "Until" test fails, in which case the thread would carry on and load something else, don't load or do anything unless the global is still true - add extra tests for the global in your thread code.

Global _runThread:int=True

Function myThread:Object( data:Object )

	Local s:Int = MilliSecs()

	Repeat

		If Not _runThread Then Exit

		'Only load resources if the thread can run.

		If _runThread
			'Load images.
		EndIf

		If _runThread
			'Load sounds.
		EndIf

		'...

		DebugLog MilliSecs()

	Until MilliSecs() > S + 100000 Or Not _runThread

End Function

'...

_runThread = False
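
Once the flag is set, the main thread can call WaitThread to be certain the loader has actually stopped before going back to the menu. A minimal sketch, assuming a threaded build; cancelLoad and CancellableLoader are made-up names and the Delay calls stand in for real work:

```blitzmax
SuperStrict

Global cancelLoad:Int = False

Function CancellableLoader:Object(data:Object)
	For Local i:Int = 1 To 100       ' 100 hypothetical load steps
		If cancelLoad Then Return Null   ' checked before every step
		Delay 50                      ' stand-in for loading one file
	Next
	Return "done"
End Function

Local thread:TThread = CreateThread(CancellableLoader, Null)

Delay 500                         ' pretend the player backs out after half a second
cancelLoad = True                 ' ask the loader to stop
Local result:Object = WaitThread(thread)   ' blocks only until the current step ends

If result Then Print "finished" Else Print "cancelled"
```

The wait is bounded by the longest single step, so checking the flag between small steps keeps the stall short.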



Ravl(Posted 2014) [#22]
Question:

In my game.bmx I have this code:

	Function threadLoadMainMenu:Object(Data:Object)
		myMenu.loadGraphics()				
	End Function


myMenu exists when the game starts.

The problem is that when I try to compile, I get:
Compile Error: Identifier 'loadGraphics' not found
Build Error: failed to compile D:/share/HOPAEngine/HOPAEngine/game.bmx



Derron(Posted 2014) [#23]
What is "myMenu" - is it a class name, an instance ?...


bye
Ron


Ravl(Posted 2014) [#24]
myMenu is an instance of the TMainMenu class.

I had to declare it as a Global and not as a Field. It is working now.