Multithread this


JoshK(Posted 2009) [#1]
Here's a routine that could be multithreaded. It reads through a directory and loads all files into a TMap. How would I go about creating different threads to make the process faster?
Global AbstractFileMap:TMap=New TMap

Function RegisterAbstractPath(filepath:String)	
	Local name:String,filename:String,d:int
	filepath=RealPath(filepath)
	d=ReadDir(filepath)
	If Not d Return
	If Right(filepath,1)="/" filepath=Left(filepath,Len(filepath)-1)
	If Right(filepath,1)="\" filepath=Left(filepath,Len(filepath)-1)
	Repeat
		filename:String=NextFile(d)
		If filename="" Exit
		Select FileType(filepath+"/"+filename)
			Case 0 ' Entry no longer exists, so stop
				Exit
			Case 1 ' Regular file: add it to the map
				name=StripDir(filename).tolower()
				AbstractFileMap.insert(name,RealPath(filepath+"/"+filename).tolower())
			Case 2 ' Directory: recurse into it
				If filename<>"." And filename<>".."
					RegisterAbstractPath(filepath+"/"+filename)
				EndIf
		EndSelect
	Forever
	CloseDir d
EndFunction



Gabriel(Posted 2009) [#2]
Are you sure this is worth multithreading? On the face of it, it seems like your bottleneck here is going to be the hard drive, not the CPU. It's like when you run two Windows Explorer searches at once and they take the same amount of time as running them one after the other. At least, they do for me.


JoshK(Posted 2009) [#3]
I don't know, but it would be a good example to learn how it works.


Gabriel(Posted 2009) [#4]
Oh, I see, just a theoretical exercise to get some practice in. Fair enough.

Well, as I see it, you have two options. You can't simply split the entire job in two, because you would kill any performance gains by checking which files the other thread has already done.

So one way would be to run through the entire directory with ReadDir and a loop of NextFile() to build yourself a list before you start. Put that into an array or something, then slice the array into two or more parts and give a part of the array to each thread to process further (see the sketch at the end of this post).

Another way to do it would be to have one thread processing all the files with NextFile() and putting the filenames in a list/array. Then another thread could look to see which files have been found with NextFile and do the rest of the processing.

With the second way though, I guess you would need some kind of locking to prevent the processing thread from accessing the list of found files while the file reading thread was writing to it. In the first way, each thread would have its own data to work on and there would be no crossover. Therefore nothing to lock.
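
A minimal sketch of that first approach (build the full filename list, then split it between threads) might look like the hypothetical example below -- TSliceJob, ProcessSlice and the hard-coded filenames are made up for illustration; a real program would fill the array from a ReadDir/NextFile() loop:

Import BRL.Threads ' Requires the program to be built in threaded mode

' A small holder type so a String array can be handed to CreateThread as an Object.
Type TSliceJob
	Field files:String[]

	Function Create:TSliceJob(files:String[])
		Local job:TSliceJob = New TSliceJob
		job.files = files
		Return job
	EndFunction
EndType

' Each thread processes its own slice of the array, so there is no shared data
' and therefore nothing to lock.
Function ProcessSlice:Object(data:Object)
	For Local f:String = EachIn TSliceJob(data).files
		' The per-file work (StripDir, RealPath, map insert etc.) would go here.
		Print "Processing " + f
	Next
	Return Null
EndFunction

' Pretend this list came from a ReadDir/NextFile() loop.
Local allfiles:String[] = ["a.txt", "b.txt", "c.txt", "d.txt"]
Local half:Int = allfiles.Length / 2

' Slice the array in two and hand one slice to each thread.
Local t1:TThread = CreateThread(ProcessSlice, TSliceJob.Create(allfiles[..half]))
Local t2:TThread = CreateThread(ProcessSlice, TSliceJob.Create(allfiles[half..]))

WaitThread t1
WaitThread t2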


JoshK(Posted 2009) [#5]
because you will kill any performance gains checking which files the other thread had done.

Why would you need to do that? You can just assign new directories to a thread and let them work out the sub-hierarchy. The only time the threads have to be locked is when the global TMap is added to.


Gabriel(Posted 2009) [#6]
Oh, OK, I don't think I understood you very well. I assumed this was a function you called once or twice, with everything hanging off one or two main folder paths, so I thought you wanted to use multiple threads each time the function was called.

But you're actually suggesting that you might call this dozens of times with different folders, and you want each separate call of the function to use its own thread? Have I got that right now?


Kurator(Posted 2009) [#7]
I would not partition the function; just put the whole function in its own 'background' thread.

This is clearly an I/O-bound function. Splitting it across more than one thread would simply create more I/O syscalls, causing the hard drive to do some funny things - like trying to read different sectors at the same time - which results in longer response times.

Partitioning a problem domain will give you a speed gain on calculation-intensive algorithms, or algorithms that are not I/O (HD, DVD...) bound.


JoshK(Posted 2009) [#8]
This is just a simple example of a hierarchical problem. The interaction between the threads is minimal, so I want to use it to learn from.


Kurator(Posted 2009) [#9]
So here's some short pseudocode:

Make the map a global and define a mutex for it, to avoid concurrent access to the map.

In your function:

In case 1: do your StripDir, lock the mutex, write to the map, release the mutex.
In case 2: set up a new thread and call your function in it; after it returns, destroy it.

This will create a bunch of threads, but it will partition your problem. If you want to keep the number of threads limited, use a sort of thread pool with a static number of prepared threads.


ImaginaryHuman(Posted 2009) [#10]
I agree there isn't a lot you can do to speed up the file access - x amount of file data has to be transferred into memory and it takes y amount of time to seek and read the disk.

I would at least experiment with trying to read two directories at once to see if there is any gain - i.e. whether there is any idle time, where the disk is not being accessed, during which another thread could be reading file data.

If there isn't any idle disk time, then it's going to be more a case of doing some other work while you're waiting for the file data to be read - i.e. as you read each file into a list, process it somehow while another thread is off reading the next file. File access probably doesn't use all the CPU time, so you could gain in that regard as long as you have some other non-file work to do. For example, read the files in one or two threads while another thread monitors when a file is completely loaded (or maybe starts working on it based on how much is loaded so far) and then perhaps uploads it as a texture to video RAM. The texture uploads can probably be done in parallel with the file spooling.


BlitzSupport(Posted 2009) [#11]
[EDIT: Got both varieties working but can't be bothered editing all this again!]

[EDIT 2: Nope, the threaded recursion is getting a smaller number of files. Not sure why yet... I expected more! Will try and figure out... ]

I did this before realising you wanted to iterate through the same folder list via multiple threads (at least I think so?), but here's an example of using multi-threading to get the directory list in the background, which might help in a different way. The only modification to the function was changing the return type, the parameter type, and adding a local version of filepath:String.

A mutex isn't necessary here, as the thread calls the function recursively as a normal function, rather than spawning a new thread for each recursion (which I think is what you wanted).

Anyway, this test opens a window so you can draw with the mouse while obtaining the file list, then closes it and lists the files.


Global AbstractFileMap:TMap=New TMap

Function RegisterAbstractPathThreaded:Object (data:Object)
	Local filepath:String = String (data)
	Local name:String,filename:String,d:Int
	filepath=RealPath(filepath)
	d=ReadDir(filepath)
	If Not d Return
	If Right(filepath,1)="/" filepath=Left(filepath,Len(filepath)-1)
	If Right(filepath,1)="\" filepath=Left(filepath,Len(filepath)-1)
	Repeat
		filename:String=NextFile(d)
		If filename="" Exit
		Select FileType(filepath+"/"+filename)
			Case 0
				Exit
			Case 1
				name=StripDir(filename).tolower()
				AbstractFileMap.insert(name,RealPath(filepath+"/"+filename).tolower())
			Case 2
				If filename<>"." And filename<>".."
					RegisterAbstractPathThreaded(filepath+"/"+filename) ' Called as a normal function from same thread, ie. dirs:TThread...
				EndIf
		EndSelect
	Forever
	CloseDir d
EndFunction

' Test...

Graphics 640, 480
SetHandle 16, 16

Print ""
Print "Getting directory list..."

Local dirs:TThread = CreateThread (RegisterAbstractPathThreaded, "C:\Windows") ' Change to suit...

Local threaddone = False

Repeat

	If Not ThreadRunning (dirs)
		Exit ' Exit loop and list dirs...
	EndIf

	Cls
	DrawRect MouseX (), MouseY (), 32, 32
	Flip
	
Until KeyHit (KEY_ESCAPE)

EndGraphics

For p$ = EachIn MapKeys (AbstractFileMap)
	Print String (MapValueForKey (AbstractFileMap, p$))
Next

End


To have a new thread for each recursive call, do what Kurator says, though I imagine it will cripple the speed -- you'll also have to discard multiple filename entries, since each call will go through all the same sub-folders, if my logic is working correctly (recursion hurts me).

Define Global mapmutex:TMutex = CreateMutex () at the top and call LockMutex/UnlockMutex mapmutex before and after AbstractFileMap.insert (), then each thread will wait if another has the mutex locked. (Just modify the CreateThread line above for your RegisterAbstractPathThreaded call inside the function.)

Actually, I got curious and just did it, and it's much faster, but it's probably full of duplicates -- I'll leave that for you to deal with as you see fit. Probably quicker to do it afterwards.

If you add a WaitMouse after the Graphics setup, open Task Manager and add Thread Count to your list (see View menu), you can highlight the program in Task Manager and see the thread count.

Global AbstractFileMap:TMap=New TMap
Global MapMutex:TMutex = CreateMutex ()

Function RegisterAbstractPathThreaded:Object (data:Object)
	Local filepath:String = String (data)
	Local name:String,filename:String,d:Int
	filepath=RealPath(filepath)
	d=ReadDir(filepath)
	If Not d Return
	If Right(filepath,1)="/" filepath=Left(filepath,Len(filepath)-1)
	If Right(filepath,1)="\" filepath=Left(filepath,Len(filepath)-1)
	Repeat
		filename:String=NextFile(d)
		If filename="" Exit
		Select FileType(filepath+"/"+filename)
			Case 0
				Exit
			Case 1
				name=StripDir(filename).tolower()
				LockMutex MapMutex
				AbstractFileMap.insert(name,RealPath(filepath+"/"+filename).tolower())
				UnlockMutex MapMutex
			Case 2
				If filename<>"." And filename<>".."
					Local dirs:TThread = CreateThread (RegisterAbstractPathThreaded, filepath+"/"+filename) ' Spawns a new thread for each sub-folder...
				EndIf
		EndSelect
	Forever
	CloseDir d
EndFunction

' Test...

Graphics 640, 480
SetHandle 16, 16

Print ""
Print "Getting directory list..."

Local dirs:TThread = CreateThread (RegisterAbstractPathThreaded, "C:\Windows") ' Change to suit...

Local threaddone = False

Repeat

	If Not ThreadRunning (dirs)
		Exit ' Exit loop and list dirs...
	EndIf

	Cls
	DrawRect MouseX (), MouseY (), 32, 32
	Flip
	
Until KeyHit (KEY_ESCAPE)

EndGraphics

For p$ = EachIn MapKeys (AbstractFileMap)
	Print String (MapValueForKey (AbstractFileMap, p$))
Next

End



JoshK(Posted 2009) [#12]
In case 2, your program just creates the thread and leaves without waiting for it to complete. Here is a correction:
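
Roughly, the pattern being described here -- spawn the thread, then wait on it before carrying on -- boils down to something like this cut-down, hypothetical example (ScanFolder and the folder names are just placeholders; this isn't the actual corrected code):

Import BRL.Threads ' Requires a threaded build

' Illustrative only: spawn a worker per folder and wait for each one before
' moving on. As Brendane points out further down, waiting immediately like this
' means only one worker is ever actually running at a time.
Function ScanFolder:Object(data:Object)
	Print "Scanning " + String(data)
	Return Null
EndFunction

Local folders:String[] = ["dirA", "dirB", "dirC"] ' Placeholder folder names

For Local folder:String = EachIn folders
	Local t:TThread = CreateThread(ScanFolder, folder)
	WaitThread t ' Don't carry on (or count results) until this thread completes.
Next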


Threaded results:
Getting directory list...
24446 msecs
36843 files found.
Process complete

Non-threaded results:
Getting directory list...
10554 msecs
36843 files found.
Process complete

So obviously there is some overhead in creating a thread, and since each thread in this example is doing so little, it makes sense that it would be slower.


JoshK(Posted 2009) [#13]
Now I have added a check to keep the thread count at the theoretical optimum. I have a quad core, so I chose four threads. Performance is about the same as the previous MT attempt:



JoshK(Posted 2009) [#14]
And here I set it up so threads are only created for the top-level directory. So one thread is created for each folder in the Windows directory, and each one is then just left to do all the work for its subfolders:
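
Roughly sketched, the idea is to spawn one thread per folder found at the top level and then wait for them all at the end. The hypothetical version below (not the actual code) assumes the RegisterAbstractPathThreaded, AbstractFileMap and MapMutex names from the earlier listing, with no CreateThread calls left inside the recursive function itself:

' Sketch only - assumes RegisterAbstractPathThreaded recurses normally (no
' CreateThread inside it) and that AbstractFileMap/MapMutex exist as above.
Function ScanTopLevel(rootpath:String)
	Local threads:TList = CreateList()
	Local d:Int = ReadDir(rootpath)
	If Not d Return
	Repeat
		Local filename:String = NextFile(d)
		If filename = "" Exit
		If filename = "." Or filename = ".." Continue
		Local path:String = rootpath + "/" + filename
		Select FileType(path)
			Case 1 ' Top-level file: insert it directly.
				LockMutex MapMutex
				AbstractFileMap.Insert(StripDir(filename).ToLower(), RealPath(path).ToLower())
				UnlockMutex MapMutex
			Case 2 ' Top-level folder: one thread each, left to handle its own subfolders.
				threads.AddLast(CreateThread(RegisterAbstractPathThreaded, path))
		EndSelect
	Forever
	CloseDir d
	' Wait for every top-level thread before counting files or ending the program.
	For Local t:TThread = EachIn threads
		WaitThread t
	Next
EndFunction

A call would then just be ScanTopLevel("C:\Windows") from the main program, with the file listing and timing done afterwards as before.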


Here are the results:

Getting directory list...
10688 msecs
36843 files found.


Since we have already determined the limiting step is the hard drive, it seems like just getting the MT version to match the ST version is the goal. So we see here that the most important thing is dividing the task up intelligently.


BlitzSupport(Posted 2009) [#15]
Actually, my If Not ThreadRunning (dirs) loop was just to allow checking while drawing. You can just do this in your modified version (or, for visual clarity, store the CreateThread result as before and WaitThread on that):

WaitThread CreateThread (RegisterAbstractPathThreaded, "C:\Windows")


However, the file count (even taking directories into account) is not correct for my C:\Windows folder, so something's still not quite right. I have 12,194 files and 1,318 folders, but the result of this program is 9752. How does your output compare with right-click -> Properties on the C:\Windows folder?

This gets the correct count for me (except for some reason there's a consistent discrepancy of two folders and one file):




JoshK(Posted 2009) [#16]
The reason it is skipping files is that the threads created inside the function aren't given a chance to complete.

This has been a good exercise because now I understand mutexes. It will be interesting to see how this works out with hierarchical culling routines.


BlitzSupport(Posted 2009) [#17]
I mean your final version still doesn't count correctly, and that does wait.

I'm not totally convinced that you should have to wait for the threads, though -- they should complete regardless once spawned off, surely? Unless there's a weird recursion/queuing effect I can't quite get a mental grasp on.


JoshK(Posted 2009) [#18]
You have to make sure the threads are done before you count files or end the program.


Kurator(Posted 2009) [#19]
Do not test C:\Windows - the count of files may vary while running, because other processes are doing some I/O there :)

To skip the loop for waiting, simply use:

Local dirs:TThread = CreateThread (RegisterAbstractPathThreaded, filepath+"/"+filename) ' Spawn a thread for the sub-folder, then wait for it...
dirs.Wait()



Kurator(Posted 2009) [#20]
The difference in count between Josh's and your implementation is that the TMap in Josh's code eliminates files with duplicate filenames :)


BlitzSupport(Posted 2009) [#21]

You have to make sure the threads are done before you count files or end the program



Yes, I see now -- I wasn't thinking about that final count/program exit.

However, your final version is waiting for each thread to exit as far as I can see, but it still isn't returning the correct number of files for multiple levels of folders containing multiple folders. The count from the non-threaded function in my last codebox above is returning the correct number of files and folders (ie. matching Windows' Properties dialog).

For example, my Downloads folder is reported in Windows Properties as "3,771 Files, 113 Folders". This is what CountFiles, above, returns, while your function (which does wait on its threads) is reporting "3757 files found".

Kurator, yes, good point regarding TMap -- I did eventually realise myself that the TMap would take care of duplicates!


Kurator(Posted 2009) [#22]
I get the same results. I modified Josh's code a little bit, using AtomicAdd and Thread.Wait(); give it a look:
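
For anyone who hasn't met it, AtomicAdd (in BRL.Threads, threaded builds only) adds to an Int variable atomically, so a shared counter can be bumped from several threads without a mutex. A tiny standalone illustration -- the names are made up, this is not Kurator's code:

Import BRL.Threads ' Requires a threaded build

Global FileCount:Int = 0

' Each worker bumps the shared counter atomically - no LockMutex/UnlockMutex needed.
Function CountWorker:Object(data:Object)
	For Local i:Int = 1 To 10000
		AtomicAdd(FileCount, 1)
	Next
	Return Null
EndFunction

Local t1:TThread = CreateThread(CountWorker, Null)
Local t2:TThread = CreateThread(CountWorker, Null)
WaitThread t1
WaitThread t2

Print String(FileCount) + " increments counted." ' Always 20000; a plain FileCount:+1 could lose updates.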




BlitzSupport(Posted 2009) [#23]
Hmm, didn't know about AtomicAdd. I guess that's similar to doing a LockMutex and modifying the global? My result was...

Getting directory list...
366 msecs
345 filenames found.
349 files found.
30 folders found.


... which shows the discrepancy. I can't actually see what's causing it though.

Thread.Wait () is good -- my ThreadRunning loop was only there to allow the main loop in my original demo to keep going, while checking for the thread to finish. I've been doing this:

WaitThread CreateThread (RegisterAbstractPathThreaded, "D:\Downloads")


Same thing as Thread.Wait () in the end, of course...


Brendane(Posted 2009) [#24]
I would just like to point out that the thread.Wait() within the 'scanning' thread is effectively limiting the number of running 'scanning' threads to just 1.

So it's not actually running multi-threaded as you intended.

Yes, you're creating a number of threads, but each one immediately goes bye-byes until its child has finished its job... and so on.


Brendane(Posted 2009) [#25]
An appropriate model to use here would be to create a pool of worker threads that wait on a semaphore.

Each time a new folder needs to be parsed, you add it to a list and release a semaphore. The first available worker thread takes the folder off the list and parses it.

A count of folders being parsed is maintained... a master thread (could be the main thread) waits for this to become 0 and then knows the job is done.

Now all that's left is to clean up by terminating the worker thread pool. I would normally set a global 'terminate' flag here and then just release 1 semaphore per thread. The first thing a worker thread should do after picking up a semaphore is test whether this 'terminate' flag is set; if so, it skips its code and ends naturally.

The main thread then waits on each worker thread in turn.

The thread body pseudocode would look something like this (a fuller sketch follows at the end of this post):

While Not terminate
	WaitSemaphore(semaphore)
	If Not terminate
		' take next folder off list and process it
		' decrement global folder count now we've finished with it.
	EndIf
Wend

There is also an efficient way to put the main thread to sleep until all is done, by having another semaphore that the main thread waits on... each time a worker finishes with a folder and decrements the count, it also releases one of these semaphores.
When the main thread wakes upon seeing a semaphore, it checks the folder count to see if it's 0; if it's not, it waits again. At 0, it is time to do the clean-up.
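
A fuller hypothetical BlitzMax sketch of this worker-pool model, using BRL.Threads' CreateSemaphore/PostSemaphore/WaitSemaphore for the signalling -- all names below are illustrative rather than taken from anyone's actual code:

Import BRL.Threads ' Requires a threaded build
Import BRL.LinkedList

Global Queue:TList = CreateList()              ' Folders waiting to be scanned.
Global QueueMutex:TMutex = CreateMutex()
Global WorkSem:TSemaphore = CreateSemaphore(0) ' One post per queued folder.
Global DoneSem:TSemaphore = CreateSemaphore(0) ' One post per finished folder.
Global Pending:Int = 0                         ' Folders queued but not yet finished.
Global Terminate:Int = False

Function AddJob(folder:String)
	AtomicAdd(Pending, 1)
	LockMutex QueueMutex
	Queue.AddLast(folder)
	UnlockMutex QueueMutex
	PostSemaphore WorkSem                      ' Wake one worker.
EndFunction

Function Worker:Object(data:Object)
	While Not Terminate
		WaitSemaphore WorkSem
		If Terminate Exit                      ' Woken only so we can shut down cleanly.
		LockMutex QueueMutex
		Local folder:String = String(Queue.RemoveFirst())
		UnlockMutex QueueMutex
		' Scan 'folder' here; any sub-folders found would get AddJob()'d.
		Print "Processed " + folder
		AtomicAdd(Pending, -1)
		PostSemaphore DoneSem                  ' Let the main thread re-check the count.
	Wend
	Return Null
EndFunction

' Start a fixed pool of workers (4 here, matching a quad core).
Const NUM_WORKERS:Int = 4
Local workers:TThread[NUM_WORKERS]
For Local i:Int = 0 Until NUM_WORKERS
	workers[i] = CreateThread(Worker, Null)
Next

AddJob("C:\Windows")                           ' Seed the queue with the root folder.

' The main thread sleeps until every queued folder has been processed.
Repeat
	WaitSemaphore DoneSem
Until Pending = 0

' Clean shutdown: set the flag, then post one semaphore per worker so each wakes up.
Terminate = True
For Local i:Int = 0 Until NUM_WORKERS
	PostSemaphore WorkSem
Next
For Local i:Int = 0 Until NUM_WORKERS
	WaitThread workers[i]
Next

The Pending counter only reaches zero after a folder's sub-folders have already been queued (they are added before the decrement), which is what lets the main thread trust the zero check.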


Kurator(Posted 2009) [#26]
Yep, you are absolutely right - because of the inner wait there is never more than one thread working...

Here's a revised approach, but take care - the number of threads is not limited in this case :)




BlitzSupport(Posted 2009) [#27]
That worked really quickly. On my C:\Windows folder I saw 61 threads in the Task Manager! I imagine that if they were limited it would be faster, but as this thread has shown... who knows? It's also shown how much attention we need to pay when trying to multi-thread programs...


virtlands(Posted 2013) [#28]
This is an interesting 4-year-old topic.

Though it is a great experiment for learning threading techniques, I think you should not apply this to disk reads, since simultaneous disk reads (for directory purposes) will tend to thrash your hard drive and wear it out.