LoadDir limited to 100 folders?


Czar Flavius(Posted 2010) [#1]
Is LoadDir limited to 100 folders?

Function LoadDir$[]( dir$,skip_dots=True )
	FixPath dir,True
	Local d=ReadDir( dir )
	If Not d Return
	Local i$[100],n
	Repeat
		Local f$=NextFile( d )
		If Not f Exit
		If skip_dots And (f="." Or f="..") Continue
		If n=i.length i=i[..n+100]
		i[n]=f
		n=n+1
	Forever
	CloseDir d
	Return i[..n]
End Function


Why not use a TList?


Edit: I just realised it isn't, but still, why not use a TList?

Last edited 2010


Gabriel(Posted 2010) [#2]
Probably because it's more performant to allocate a block of 100 objects at a time rather than adding them one by one to a TList. It's similar to how Vectors work in C++.
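
For comparison, the TList way would be roughly this (untested sketch; LoadDirList is just a made-up name), allocating one link object per file instead of resizing an array:

Function LoadDirList:TList( dir$,skip_dots=True )
	FixPath dir,True
	Local d=ReadDir( dir )
	If Not d Return Null
	Local list:TList=New TList
	Repeat
		Local f$=NextFile( d )
		If Not f Exit
		If skip_dots And (f="." Or f="..") Continue
		' every AddLast creates another link object for the GC to track
		list.AddLast( f )
	Forever
	CloseDir d
	Return list
End Function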

EDIT: Besides, what could you do with a TList that you can't do with an array?

Last edited 2010


Czar Flavius(Posted 2010) [#3]
Well, if the folder had thousands of files, TLists would be much faster. And when are you ever going to have LoadDir in such a performance-critical, hundred-times-a-second loop that it would even matter that TLists are negligibly slower?


Gabriel(Posted 2010) [#4]
Aren't you sort of contradicting yourself there? Your sole reason for thinking it should be a TList is performance (and you admit that it would only matter in rare situations), but then you also admit that it's not something where performance really matters.


Czar Flavius(Posted 2010) [#5]
As it's a standard library function, it should be applicable to the widest variety of situations. TList is 99% good for a small number of files and 99% good for a large number of files. Array is 100% good for a small number of files and 50% good for a large number of files.


Gabriel(Posted 2010) [#6]
We can all make up a bunch of numbers out of thin air, but general purpose functions are designed for a general situation. If you have a specific, unusual situation, you write your own.


Czar Flavius(Posted 2010) [#7]
I don't understand how a performance hit every 100th file makes for a general purpose solution.


Gabriel(Posted 2010) [#8]
A wise man once said

And when are you ever going to have LoadDir in such a performance-critical, hundred-times-a-second loop that it would even matter...



Czar Flavius(Posted 2010) [#9]
I was just musing; you came in with serious-business "it's more performant to allocate a block of 100 objects at a time" nonsense, so I responded. A folder with more than 100 items is not an impossibility.


Gabriel(Posted 2010) [#10]
I was just musing; you came in with serious-business "it's more performant to allocate a block of 100 objects at a time" nonsense, so I responded.

You're right. I tried to answer things in terms you would understand. Since you seem (not just in this thread) obsessed with micro-optimizing things which don't need optimizing, I thought an explanation which appealed to your love of optimizing things which are plenty fast enough would satisfy you.

A folder with more than 100 items is not an impossibility.


No, it's not. It's also not faster to do with a TList. As I explained earlier, this kind of allocation, in blocks of 100 (or whatever) at a time, is very fast, and it's the way vectors work in C++. There's a reason vectors are used over lists, and that reason is that they're faster. There may be some point at which growing the array in blocks of 100 becomes more expensive than a list (I got bored when I got up to 999), but when that happens you just need to increase your block size. The block size should be proportional to the number of allocations you generally find yourself doing. Mark has hardcoded this for a general-purpose solution. If you find yourself routinely running through directories with more than a thousand files, you should definitely look at rewriting Mark's function to allocate in blocks of a thousand at a time.
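
Something along these lines, untested, with the block size pulled out as a parameter (LoadDirBlock and the default of 1000 are just illustrative):

Function LoadDirBlock$[]( dir$,skip_dots=True,block=1000 )
	FixPath dir,True
	Local d=ReadDir( dir )
	If Not d Return
	Local i$[block],n
	Repeat
		Local f$=NextFile( d )
		If Not f Exit
		If skip_dots And (f="." Or f="..") Continue
		' grow by another block of slots whenever the array fills up
		If n=i.length i=i[..n+block]
		i[n]=f
		n=n+1
	Forever
	CloseDir d
	Return i[..n]
End Function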


Czar Flavius(Posted 2010) [#11]
Dude, take a chill pill.

I'd be interested to see in which of these other threads my obsession with optimization is revealed. This one, perhaps? http://blitzbasic.com/Community/posts.php?topic=91945

The saying goes, your program spends 80% of its time in 20% of the code. Optimize only what you have to or you will be doing it all year.


If anything you seem to be the one with an optimization obsession.

You're right. I tried to answer things in terms you would understand.
That's just mean.

I was just interested in the design thinking behind this function, and why an array was chosen instead of a list. No need to jump down my throat.

Last edited 2010


skidracer(Posted 2010) [#12]
I only take issue with the fact that the array doesn't grow exponentially, so it will come to a grinding halt at around 500,000 files, whereas changing the [..n+100] to [..n*2] avoids that performance issue nicely.
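
In Mark's function that would be a one-line change, roughly (untested):

	' double the capacity instead of adding a fixed 100 slots each time;
	' this relies on the array starting at a non-zero size (it starts at 100)
	If n=i.length i=i[..n*2]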

TLists, on the other hand, will create millions of extra objects and be an order of complexity more difficult for the garbage collector to clean up, so unless they are being sorted they don't really offer any advantage in the performance department.


Czar Flavius(Posted 2010) [#13]
What about scanning the directory twice: once to count the number of files, and a second time to fill an array of exactly the required size?
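
Something like this, perhaps (untested sketch; LoadDirTwoPass is just a made-up name):

Function LoadDirTwoPass$[]( dir$,skip_dots=True )
	FixPath dir,True

	' first pass: just count the entries
	Local d=ReadDir( dir )
	If Not d Return
	Local n
	Repeat
		Local f$=NextFile( d )
		If Not f Exit
		If skip_dots And (f="." Or f="..") Continue
		n=n+1
	Forever
	CloseDir d

	' second pass: fill an array of exactly the right size
	Local i$[n],k
	d=ReadDir( dir )
	If Not d Return
	Repeat
		Local f$=NextFile( d )
		If Not f Exit
		If skip_dots And (f="." Or f="..") Continue
		' guard in case the directory grew between the passes
		If k<n
			i[k]=f
			k=k+1
		EndIf
	Forever
	CloseDir d
	Return i[..k]
End Function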


skidracer(Posted 2010) [#14]
You should generally avoid asking an OS like Windows to do anything twice, especially on an operating system such as Vista with its supernaturally slow file operations.


Leon Drake(Posted 2010) [#15]
It's because Remote Differential Compression is usually on. You can turn it off by going to Programs and Features in the Control Panel and then clicking "Turn Windows features on/off".

You'll probably notice a 200%+ speed increase in Vista by doing this.