Nehe 19: 2500 Particles (Xcode) vs 150 (BlitzMax)

salric(Posted 2005) [#1]
Hi All,

I've been going through the excellent NeHe tutorials (thanks very much to Extron for converting them to BlitzMax) and noticed that tutorial #19 runs very slowly on my PowerBook G4. To get a framerate above 30fps I had to reduce the maximum number of particles to 150 (the default is 1000). That seemed an extremely low count, so I went back to the NeHe site and downloaded the Xcode (Cocoa) version, which runs smoothly with over 2500 particles.

You can find the OSX code at the bottom of this page:
http://nehe.gamedev.net/data/lessons/lesson.asp?lesson=19

In case you don't know, the BlitzMax code can be found in this thread:
http://www.blitzbasic.com/Community/posts.php?topic=41689

Can anyone shed some light on the difference in performance between the BlitzMax and Xcode versions?


Sveinung(Posted 2005) [#2]
I have no problems running either of them. You will get the same result if you set the swap interval to zero in the Max example, I think.

Sveinung


salric(Posted 2005) [#3]
Thanks for the reply - I tried setting the swap interval to zero and it had no effect. Just in case anyone else asks, I'm not running in debug mode either.


T-Light(Posted 2005) [#4]
No problems on the PC version of Max.
Whether the settings are
MAX_PARTICLES=150
or
MAX_PARTICLES=2500
the frame rates remain at a constant 61fps


LarsG(Posted 2005) [#5]
1000 ran just fine.. so did 2500..
I even got 25000 running at 56 fps.. :p


salric(Posted 2005) [#6]
I've just tested the same code on an iMac G5; the performance issues are the same on that machine. If anyone could test these examples on OS X I would appreciate it.

Thanks.


bortels(Posted 2005) [#7]
Just tried both on a dual G5 Power Mac - same results. The Cocoa version is very fast, the BMax version dog slow (6 to 7 fps). Commenting out the actual For loop that draws the particles brings it up to 60, so we're looking at something in there.

So, I poked around commenting things out - and the things slowing it down are the "If (KeyDown..." statements, surprisingly enough - commenting them out brings the framerate back up to 60. Even leaving in the single "If KeyDown(KEY_TAB)" brought the fps down to 30. And this is without touching the keys during the run, so it's not the contents of the If statements - it's the checks themselves.

... time passes...

Ah. They're inside the "For loop=0 to MAX_PARTICLES-1" loop - that explains it. Rather than check those keys once per frame, we're checking once per particle - presumably as a cheap way to adjust all of the particles. Dandy, but slow.

Moving those If statements outside that loop, so each key is checked once per frame, and changing them to the form:
			If KeyDown(KEY_NUM8)									' Check The Key Once Per Frame
				For n = 0 To MAX_PARTICLES-1
					If particle[n].yg<1.5 Then particle[n].yg:+0.01	' Keep The Per-Particle Limit
				Next
			EndIf

Gives us a nice 60 fps.

(Edited for code clarity)
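
The once-per-frame pattern described above can be sketched as a small standalone program. This is only an illustration, not the tutorial's code: TParticle, UpKeyDown and UpdateFrame are made-up names, and UpKeyDown is a stand-in for KeyDown(KEY_NUM8) so the snippet runs without opening a graphics window. It counts its own calls to show how often the "key" is polled.

```blitzmax
SuperStrict

Const MAX_PARTICLES:Int = 1000

' Hypothetical cut-down particle: only the vertical pull is kept.
Type TParticle
	Field yg:Float = -0.8
End Type

' Counts how many times the (simulated) key state is polled.
Global keyChecks:Int = 0

' Stand-in for KeyDown(KEY_NUM8); always reports "pressed".
Function UpKeyDown:Int()
	keyChecks :+ 1
	Return True
End Function

' One frame of input handling: poll the key once, then adjust every particle.
Function UpdateFrame(particle:TParticle[])
	If UpKeyDown()
		For Local n:Int = 0 Until particle.length
			If particle[n].yg < 1.5 Then particle[n].yg :+ 0.01
		Next
	EndIf
End Function

' Allocate the array once and fill it with particle objects.
Local particle:TParticle[] = New TParticle[MAX_PARTICLES]
For Local i:Int = 0 Until MAX_PARTICLES
	particle[i] = New TParticle
Next

UpdateFrame(particle)
Print "key checks this frame: " + keyChecks
```

With the check inside the particle loop instead, the counter would rise by MAX_PARTICLES every frame, which is the cost the unmodified demo pays for each key it tests.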


bortels(Posted 2005) [#8]
As a followup, now that I think of it - it then makes no sense that the PC version would run without slowdown, as it's doing the same work - unless the implementation of KeyDown is significantly cheaper on the PC.

And in fact, on my older PC (Intel P4 2.4 GHz, 3/4 GB of RAM, ATI video card) I get 35 fps with the unmodified version. My guess is the other PC testers had better hardware than I do :-)

And - the KeyDown code on the mac appears to be significantly slower than on the PC. I am sad.

UPDATE: I filed a bug report on the slow KeyDown issue in the bugs forum...


salric(Posted 2005) [#9]
Bortels,

Thanks very much for solving this problem, the last thing I expected it to be was the KeyDown checking! On the iMac G5 the demo in question now moves around 20,000 particles at full frame rate. I'm very impressed.

Another point of interest is that it takes about 1 minute to create 20,000 instances of the particle type in the array. I've tried putting a FlushMem in the loop, but it makes very little difference. Interesting?

Thanks again Bortels - you're a champion!!


Robert Cummings(Posted 2005) [#10]
The init stuff is likely to be much faster in the next BlitzMax update.


Dreamora(Posted 2005) [#11]
There is a simple reason initialisation takes that long: the particle array is initialised in one of the worst possible ways. Growing the array by one element on every pass might be fine in C++, but it is deadly in BlitzMax, where the memory management makes resizing an array much more work than just resizing a memory block.

Change the following code:

Function InitGl()
	LoadGlTextures()
	glEnable(GL_TEXTURE_2D)											' Enable Texture Mapping
	glShadeModel(GL_SMOOTH)											' Enable Smooth Shading
	glClearColor(0.0, 0.0, 0.0, 0.0)									' Black Background
	glClearDepth(1.0)													' Depth Buffer Setup
	glDisable(GL_DEPTH_TEST)											' Disable Depth Testing
	glEnable(GL_BLEND)												' Enable Blending
	glBlendFunc(GL_SRC_ALPHA,GL_ONE)									' Type Of Blending To Perform
	glDepthFunc(GL_LEQUAL)												' The Type Of Depth Testing To Do
	glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_NICEST)					' Really Nice Perspective Calculations
	glHint(GL_POINT_SMOOTH_HINT,GL_NICEST)								' Really Nice Point Smoothing
	glBindTexture(GL_TEXTURE_2D,Texname)								' Select Our Texture

	For loop=0 To MAX_PARTICLES-1										' Initials All The Textures
		particle=particle:particles[..loop+1]
		particle[loop]=New particles


to

Function InitGl()
	LoadGlTextures()
	glEnable(GL_TEXTURE_2D)											' Enable Texture Mapping
	glShadeModel(GL_SMOOTH)											' Enable Smooth Shading
	glClearColor(0.0, 0.0, 0.0, 0.0)									' Black Background
	glClearDepth(1.0)													' Depth Buffer Setup
	glDisable(GL_DEPTH_TEST)											' Disable Depth Testing
	glEnable(GL_BLEND)												' Enable Blending
	glBlendFunc(GL_SRC_ALPHA,GL_ONE)									' Type Of Blending To Perform
	glDepthFunc(GL_LEQUAL)												' The Type Of Depth Testing To Do
	glHint(GL_PERSPECTIVE_CORRECTION_HINT, GL_NICEST)					' Really Nice Perspective Calculations
	glHint(GL_POINT_SMOOTH_HINT,GL_NICEST)								' Really Nice Point Smoothing
	glBindTexture(GL_TEXTURE_2D,Texname)								' Select Our Texture
	particle= New particles[MAX_PARTICLES]								' Allocate The Whole Array Once
	For loop=0 To MAX_PARTICLES-1										' Initialise All The Particles
		particle[loop]=New particles



This should speed the whole thing up quite a bit.
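
To see the difference between the two initialisation styles in isolation, a rough timing sketch along these lines may help. It is not taken from the tutorial: TParticle, BuildBySlicing and BuildPreallocated are made-up names, and the timings will vary by machine and BlitzMax version. It assumes, as described above, that each slice produces a fresh copy of the array.

```blitzmax
SuperStrict

Const NUM_PARTICLES:Int = 20000

' Hypothetical cut-down particle type.
Type TParticle
	Field life:Float = 1.0
End Type

' Slow pattern: grow the array by one element per particle.
' Every slice builds a new array and copies the old contents into it.
Function BuildBySlicing:TParticle[](n:Int)
	Local arr:TParticle[] = New TParticle[0]
	For Local i:Int = 0 Until n
		arr = arr[..i + 1]
		arr[i] = New TParticle
	Next
	Return arr
End Function

' Fast pattern: allocate the whole array once, then fill it.
Function BuildPreallocated:TParticle[](n:Int)
	Local arr:TParticle[] = New TParticle[n]
	For Local i:Int = 0 Until n
		arr[i] = New TParticle
	Next
	Return arr
End Function

Local t0:Int = MilliSecs()
Local slow:TParticle[] = BuildBySlicing(NUM_PARTICLES)
Local t1:Int = MilliSecs()
Local fast:TParticle[] = BuildPreallocated(NUM_PARTICLES)
Local t2:Int = MilliSecs()

Print "slicing:      " + (t1 - t0) + " ms"
Print "preallocated: " + (t2 - t1) + " ms"
```

Both functions end up with the same array contents; the first simply does a growing amount of copying on every pass, so its cost rises much faster than the particle count.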


salric(Posted 2005) [#12]
Dreamora, this is really excellent! The improvement in speed is amazing!

Thank you!