Major slowdown creating more than 600 particles.

Amon_old(Posted 2005) [#1]
I'm getting severe slowdown when I create more than 600 individual stars for my game's starfield. I have a very fast system, and I remember doing the same in B3D without it having a problem. So, I'm guessing that it's my create_stars function.

Here are da codeeeez:

Below is the type for the stars :
Global num_stars = 0
Global max_stars = 50

Global invaderlist:TList = CreateList()
Global starlist:TList = CreateList()

Type star

	Field xpos:Int
	Field ypos:Int
	Field speed:Int
	

	Function create_star()
	
		For Local x:Int = 0 To max_stars
		Local s:star = New star
		ListAddLast starlist,(s)
		s.xpos = rnd(0,GR_WIDTH)
		s.ypos = -1
		s.speed = Rand(2,15)
		num_stars:+1
			If num_stars > max_stars
				Exit
			EndIf
		Next
		
	End Function
	
	Function update_stars()
	
		For Local s:star = EachIn starlist
		
			s.ypos:+s.speed
			
		Next
		
	End Function
	
	Function draw_star()
	
		For Local s:star = EachIn starlist
		
			DrawImage stars,s.xpos,s.ypos,0
			
		Next
		
	End Function
	
	Method remove_star()
	
		For Local s:star = EachIn starlist
		If s.ypos > GR_HEIGHT
		ListRemove starlist,(s)
		num_stars:-1
		EndIf
		Next
		
	End Method
	
End Type


Is this efficient enough? I was thinking that instead of deleting a star when it exits the screen borders, I could reuse it and change its y position to the top of the screen again. Would that be better than creating a new star each time?

To help you guys help me, here is all the code.




Ryan Moody(Posted 2005) [#2]
So, I'm guessing that it's my create_stars function.


Likewise, all the other functions seem to be fine. Replace with:

Function create_star()

	For Local x:Int = 0 To (max_stars - num_stars)

		Local s:star = New star
		ListAddLast starlist,(s)

		s.xpos = rnd(0,GR_WIDTH)
		s.ypos = -1
		s.speed = Rand(2,15)

		num_stars:+1

	Next

End Function


deleting a star when it exits the screen borders


Stick to that method.

Ryan


Amon_old(Posted 2005) [#3]
Hmm! I tried it your way Ryan and I get the same slowdown. There's no difference in speed. If I leave max_stars at 50 I get top speeds, but the more I make the slower it gets. I've managed to get OK speeds by setting max_stars to 300, but anything above that kills the show.

The thing is, I remember reading somewhere that people have made particle systems with B3D which are capable of handling thousands of individual particles, yet BlitzMax can't handle over 300. So, it's either BlitzMax or the way I have coded this.


Ryan Moody(Posted 2005) [#4]
Some particle handling suggestions -

1. Reduce the size of your particles
2. Make them move off screen as fast as possible
3. Give particles a life span, so that they can still disappear before going off-screen, so your stars could fade out, for example.
4. Do you really need hundreds of particles?
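
As a rough illustration of suggestion 3, here is a minimal sketch of a particle with a life span that fades out as it ages. The type name TSpark and the fields life and maxLife are made up for this example and are not from Amon's code; treat it as one possible way to do it, not a tested drop-in.

```blitzmax
SuperStrict

' Hypothetical particle type; TSpark, life and maxLife are names invented for this sketch.
Type TSpark
	Field x:Float, y:Float
	Field speed:Float
	Field life:Int      ' frames left before the particle is retired
	Field maxLife:Int   ' starting life, used to work out the fade

	' Move the particle and age it by one frame.
	' Returns False once the particle has expired.
	Method Update:Int()
		y :+ speed
		life :- 1
		Return life > 0
	End Method

	' Draw with alpha proportional to the remaining life.
	Method Draw()
		SetAlpha Float(life) / maxLife
		DrawRect x, y, 2, 2
	End Method
End Type

Graphics 640, 480
SetBlend ALPHABLEND

Local sparks:TList = CreateList()
For Local i:Int = 0 Until 200
	Local s:TSpark = New TSpark
	s.x = Rand(0, 639)
	s.y = Rand(0, 479)
	s.speed = Rnd(1, 4)
	s.maxLife = Rand(30, 120)
	s.life = s.maxLife
	ListAddLast sparks, s
Next

Repeat
	Cls
	For Local s:TSpark = EachIn sparks
		If s.Update() Then
			s.Draw()
		Else
			ListRemove sparks, s   ' expired, drop it from the list
		EndIf
	Next
	SetAlpha 1
	Flip
Until KeyHit(KEY_ESCAPE)
```

Each particle is updated and drawn exactly once per frame, and the list shrinks as particles expire.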

Gotta go now, night night.

Ryan


GW(Posted 2005) [#5]
Those are good tips, but doing any of those things now would be just burying your head in the sand. It's best to find the source of the slowdown and fix that.


Takuan(Posted 2005) [#6]
I took that type above and tested stars only.
15000 stars at 24*24*16 each and no noticeable slowdown.
6600GT, 2 GHz

Sometimes decent OpenGL drivers aren't part of the normal driver package and you have to install them separately.
Without them, things get slow.
Don't know how ATI handles this...


teamonkey(Posted 2005) [#7]
In your main loop, shouldn't
	For Local s:star = EachIn starlist
		s.remove_star()
	Next

Simply be:
        star.remove_star()
?
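
To make the cost of that pattern concrete, here is a small sketch that only counts loop iterations. TDemo, SweepAll and visits are stand-in names invented for this example; the point is just that a method which walks the whole list, called once per object, touches every object n times over.

```blitzmax
SuperStrict

' Stand-in for the star type; only the loop structure matters here.
Type TDemo
	Global list:TList = CreateList()
	Global visits:Int = 0

	' Walks the WHOLE list, like remove_star() does.
	Function SweepAll()
		For Local d:TDemo = EachIn list
			visits :+ 1
		Next
	End Function

	' A method that does the same sweep, as in the original type.
	Method Sweep()
		SweepAll()
	End Method
End Type

Local n:Int = 400
For Local i:Int = 1 To n
	ListAddLast TDemo.list, New TDemo
Next

' Pattern from the main loop: call the sweeping method once per object.
TDemo.visits = 0
For Local d:TDemo = EachIn TDemo.list
	d.Sweep()
Next
Print "Called once per object: " + TDemo.visits + " visits"   ' 160000

' One sweep is enough, so call it a single time.
TDemo.visits = 0
TDemo.SweepAll()
Print "Called once in total:   " + TDemo.visits + " visits"   ' 400
```

Note that calling the sweep through the type name only works if it is declared as a Function; a Method needs an instance to be called on.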


Arcadenut(Posted 2005) [#8]

Is this efficient enough? I was thinking that instead of deleting a star when it exits the screen borders, I could reuse it and change its y position to the top of the screen again. Would that be better than creating a new star each time?



Yes. Creating an Object is very expensive compared to reusing an existing one. I have star code that uses DrawRect and I can do 10,000 stars in about 4.9ms. That is A LOT of time (and way too many stars :-).

You figure at 60FPS, you have approximately 16.67ms to do EVERYTHING. Taking up 25+% of the time drawing stars is bad. (1000ms/60FPS = 16.67ms per Frame)
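
The arithmetic above can be sketched in a few lines. The variable names are made up for this example, and 4.9 ms is simply the 10,000-star figure quoted in this post.

```blitzmax
SuperStrict

' Frame budget at a given refresh rate, and the share taken by one task.
Local fps:Float = 60
Local budgetMs:Float = 1000.0 / fps          ' about 16.67 ms per frame
Local starMs:Float = 4.9                     ' time spent drawing the stars
Local share:Float = starMs / budgetMs * 100  ' percentage of the frame used

Print "Budget per frame: " + budgetMs + " ms"
Print "Star drawing:     " + share + " % of the frame"
```

With these numbers the star drawing comes out at a little over 29% of the frame, which is where the "25+%" comes from.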

I use 100 stars for my game, which looks just fine and that drawing time is something along the lines of less than 0.5ms which is plenty fast for my needs (right now).

You should create your stars outside of your main loop and don't destroy them unless it's absolutely necessary (or it is not noticeable to the user).

I would also recommend taking a more OOP approach and creating a Star Object with Methods rather than what you have here. If you're not sure how to do that, I can post some code for you later.


ImaginaryHuman(Posted 2005) [#9]
Man, just for a starfield you really don't need all that `Type` stuff, just use an array so that you can be sure the memory used is sequential and it should be a lot faster. You don't need a linked list to generate a starfield.
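
A minimal sketch of the array approach, assuming a 640x480 window and plain plotted pixels rather than Amon's star image. The constant and array names are invented for this example.

```blitzmax
SuperStrict

Const NUM_STARS:Int = 1000
Const SCREEN_W:Int = 640
Const SCREEN_H:Int = 480

' Parallel arrays instead of a TList of objects.
Local starX:Int[NUM_STARS]
Local starY:Int[NUM_STARS]
Local starSpeed:Int[NUM_STARS]

Graphics SCREEN_W, SCREEN_H

' Create every star once, before the main loop.
For Local i:Int = 0 Until NUM_STARS
	starX[i] = Rand(0, SCREEN_W - 1)
	starY[i] = Rand(0, SCREEN_H - 1)
	starSpeed[i] = Rand(2, 15)
Next

Repeat
	Cls
	For Local i:Int = 0 Until NUM_STARS
		starY[i] :+ starSpeed[i]
		' Recycle: wrap back to the top instead of creating a new star.
		If starY[i] >= SCREEN_H
			starY[i] = 0
			starX[i] = Rand(0, SCREEN_W - 1)
		EndIf
		Plot starX[i], starY[i]
	Next
	Flip
Until KeyHit(KEY_ESCAPE)
```

Nothing is allocated or freed inside the loop, and each star is visited once per frame.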


Dreamora(Posted 2005) [#10]
I think one of the problems is drawimage itself or the way it is working: It always recreates the quad with the texture from scratch instead of using the already existing quad and just reposition it.

I get around 12000-15000 particles with my particle system (at first it was meant to be a port of a B3D one, but I dropped that idea and restarted to use the BM possibilities and 2D-only optimisations) until it breaks, which is "crap" compared with around 30000 with Blitz3D and more stuff going on.

I'm on a radeon 9700 with quite good drivers.


Amon_old(Posted 2005) [#11]
Hi Guys. Thanks for the replies. I've changed the code so that it recycles each particle and resets its position to -1 when it goes beyond the boundaries of the screen. I've noticed a slight increase in speed but it's still lagging with just 401 particles. I have taken the create_star() call out of the main loop so it can only ever create 401 particles. I still get lag with this.

Here is the current code for the stars and the code for the whole game:

Global invaderlist:TList = CreateList()
Global starlist:TList = CreateList()

Type star

	Field xpos:Int
	Field ypos:Int
	Field speed:Int
	

	Function create_star()
	
		For Local x:Int = 0 To max_stars
		Local s:star = New star
		ListAddLast starlist,(s)
		s.xpos = rnd(0,GR_WIDTH)
		s.ypos = Rand(-1,-GR_HEIGHT)
		s.speed = Rand(2,15)
		num_stars:+1
		Next
		
	End Function
	
	Function update_stars()
	
		For Local s:star = EachIn starlist
		
			s.ypos:+s.speed
			
		Next
		
	End Function
	
	Function draw_star()
	
		For Local s:star = EachIn starlist
		
			DrawImage stars,s.xpos,s.ypos,0
			
		Next
		
	End Function
	
	Method recycle_star()
	
		For Local s:star = EachIn starlist
		If s.ypos > GR_HEIGHT
		s.ypos = -1
		EndIf
		Next
		
	End Method
	
End Type




Now Dreamora mentioned that it may be DrawImage which is causing the problem, so I'll try a simple pixel plot instead. I'll post the code in just a sec.

:)

[edit]Plot didn't make a single bit of difference. :/


Ryan Moody(Posted 2005) [#12]
I think the only real problem is the constraints of hardware I'm afraid - it's difficult getting a lot of action on the screen to move smoothly, especially in 3D.

Would it be possible to create an animation instead of using particles? Do you really need hundreds of particles?

Ryan


Amon_old(Posted 2005) [#13]
Well I don't really need all those particles; it's just that with B3D when I did the same it handled them with style. There may be a possibility of my OpenGL drivers not being up to scratch. I'll look into that in a bit.

Thanks for the help anyway fellas. :)


Who was John Galt?(Posted 2005) [#14]
Amon -

You are moving in the right direction, but there are still some problems. To whoever recommended making sure u have the most up to date OGL drivers - I agree. Made a massive difference on my old PC.

Avoid creating stuff and deleting it each loop - again as someone else said. If you have say falling raindrops and u delete them each time they hit the bottom of the screen and create a new one at the top, this will probs not kill your speed but it is better to recycle. Originally you were recreating and destroying your whole type list each loop which is BAD - but that problem is solved.

In your main loop you have a sub loop over all invaders and do an invader.draw(), but invader.draw() loops over all the invaders inside itself - so if you have 3 invaders, you'll draw each of them 3 times! (I think). Try taking the loop out of the drawinvader function (and any others like this).

There are further optimisations you can make - let's prove MAX can kick B3Ds ***.


Amon_old(Posted 2005) [#15]
In your main loop you have a sub loop over all invaders and do an invader.draw(), but invader.draw() loops over all the invaders inside itself - so if you have 3 invaders, you'll draw each of them 3 times! (I think).


Hi Falken. I did some tests and it definitely doesn't draw the aliens 3 times. To be absolutely sure I removed the draw method and changed it to a function. This didn't help with the framerate. I guess it has to be my OpenGL drivers.

Thanks for your help and suggestions. I guess I'll try to reduce the number of particles and assign them each a life span to reduce how long they are on screen.

Thanks again :)


Cajun17(Posted 2005) [#16]
I found your problem. You were calling the recycle method for each star, but there's also a loop over every star inside the method. So for 400 stars you were doing 160000 iterations (400 x 400)! I also think your aliens are suffering from a similar situation.

Just put a call to the recycle method inside the loop in the update function and remove it from your main loop.

Here are the changes I made:

[code]
Strict
'image made global within type: slight speed increase
'star list global within type: very slight speed increase

Const GR_WIDTH = 1024, GR_HEIGHT = 768, DEPTH = 16, HERTZ = 60

Graphics GR_WIDTH,GR_HEIGHT,DEPTH,HERTZ

SetMaskColor 255,0,255


Type star

	Field xpos:Int
	Field ypos:Int
	Field speed:Int
	Global starImg:TImage
	Global starlist:TList
	Global num_stars
	Global max_stars

	Function create_star()
		num_stars = 0
		max_stars = 500
		starlist=CreateList()
		For Local x:Int = 0 To max_stars-1
			Local s:star = New star
			ListAddLast starlist,(s)
			s.xpos = rnd(0,GR_WIDTH)
			s.ypos = Rand(-1,-GR_HEIGHT)
			s.speed = Rand(2,15)
			num_stars:+1
		Next
		
		'create star image
		SetColor 255,0,255
		DrawRect(0,0,3,3)
		SetColor 255,255,255
		Plot 1,0
		Plot 0,1	
		Plot 1,1
		Plot 2,1
		Plot 1,2
		starImg=CreateImage(3,3)
		GrabImage(starImg,0,0)	
		
	End Function
	
	Function update_stars()
	
		For Local s:star = EachIn starlist
		
			s.ypos:+s.speed
			s.recycle_star()
			s.draw_star()
			
		Next
		
	End Function
	
	Method draw_star()
		DrawImage starImg,xpos,ypos,0
	End Method
	
	Method recycle_star()
		If ypos > GR_HEIGHT Then ypos = -1
	End Method
	
End Type

Const TARGET:Int=60
Local frames:Int=0
star.create_star()
Local frameTime:Int
Local totalTime:Int=0
Local tCounter:Int=0
Repeat
	frameTime=MilliSecs()
	Cls
		SetColor 255,255,255
		star.update_stars()
		DrawRect 0,0,160,28
		SetColor 0,0,0
		DrawText TARGET+" Frames:"+totalTime+"ms",0,12
		DrawText "Stars:"+star.num_stars,0,0
		tCounter:+(MilliSecs()-frameTime)
		If frames=TARGET Then
			totalTime=tCounter
			frames=0
			tCounter=0
		Else
			frames:+1
		EndIf
	FlushMem
	Flip
	
Until KeyHit(KEY_ESCAPE)
[/code]



ImaginaryHuman(Posted 2005) [#17]
It's probably a slow graphics card combined with the overhead of OpenGL, then.


Amon_old(Posted 2005) [#18]
Hi Cajun17.

Below you have xpos and ypos but I don't see them declared anywhere.

	Method draw_star()
		DrawImage starImg,xpos,ypos,0
	End Method
	
	Method recycle_star()
		If ypos > GR_HEIGHT Then ypos = -1
	End Method


You "DrawImage starImg,xpos,ypos" but I don't see where xpos and ypos come from. If it's to store the positions, shouldn't it be s.xpos and s.ypos? I'm confused :/


Dreamora(Posted 2005) [#19]
No they shouldn't as they are declared in the type itself (the first 2 field declarations) and in methods you can access them without any need for self.xxx or similar.
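
A small sketch of the difference, using a made-up TPoint type rather than the star type from this thread: inside a Method the fields of the current object are in scope, Self makes that explicit, and a Function has no current object at all.

```blitzmax
SuperStrict

' TPoint and its members are hypothetical names for this illustration.
Type TPoint
	Field xpos:Int
	Field ypos:Int

	' Method: runs on one instance, so fields can be named directly.
	Method Describe:String()
		Return "(" + xpos + "," + ypos + ")"
	End Method

	' Same thing with an explicit Self.
	Method DescribeSelf:String()
		Return "(" + Self.xpos + "," + Self.ypos + ")"
	End Method

	' Function: belongs to the type and has no Self, so it needs an instance passed in.
	Function DescribeOther:String(p:TPoint)
		Return "(" + p.xpos + "," + p.ypos + ")"
	End Function
End Type

Local p:TPoint = New TPoint
p.xpos = 3
p.ypos = 7

Print p.Describe()              ' (3,7)
Print p.DescribeSelf()          ' (3,7)
Print TPoint.DescribeOther(p)   ' (3,7)
```

That is why s.xpos is needed inside the loop in update_stars() (a Function) but plain xpos is enough inside draw_star() and recycle_star() (Methods).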


Amon_old(Posted 2005) [#20]
Ahh, I see. Thanks for that Dreamora.

Well, I have altered my code accordingly and it's running at top speeds. I have also learnt a lot more about OOP programming because of this thread.

Thanks again all :)


rod54(Posted 2005) [#21]
so how many particles does it take now to see a slow down ??


Cajun17(Posted 2005) [#22]
I could get 45k particles every frame at 60 FPS (3400+ AMD 1Gig DDR 9700 ATI 128MB).


xlsior(Posted 2005) [#23]
I can get 42.000 particles/frame at 60FPS (Athlon 2800+, ATI 9600Pro)

Slightly less than that if I move the mouse.


Ryan Moody(Posted 2005) [#24]
Is that all?! Looks like that code could do with some more work Amon!

Ryan


Amon_old(Posted 2005) [#25]
Well I tested it and at 60fps there's no slowdown with over 9000 particles. I pushed it up to 50,000 and got a slight lag for a split second when they were created but then it ran at top speeds. This is just with my code. I'm sure with more optimized code you can easily push 70-80,000 particles.

The problem is I'm never going to need 80,000 particles. :/


Ryan Moody(Posted 2005) [#26]
Oh I was joking Amon!

Ryan


itoleck(Posted 2006) [#27]
If you set the graphics driver to buffered DirectX first, you should be able to push way more particles. I used buffered DirectX and could push 225,000 particles.

SetGraphicsDriver BufferedD3D7Max2DDriver()

This is on Dual 3.2Ghz Intel - ATI9700Pro 128MB


sswift(Posted 2006) [#28]
What is this buffered mode? Are there drawbacks? Why isn't it used normally? And is there something equivalent for OpenGL?


Dreamora(Posted 2006) [#29]
No, there is no equivalent for OpenGL