Fastest way to draw unique pixels everywhere

AnthonyB(Posted 2010) [#1]
Hello there,

I'm looking for the fastest way to draw unique pixels on every pixel of a window. Is there a faster way than:
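
In essence, a SetColor/Plot pair for every pixel; a minimal sketch of the kind of loop I mean (window size and the random colours are just placeholder details):

SuperStrict

' One SetColor and one Plot call per screen pixel.
' (No Framework statement, so BlitzMax links all standard modules.)
Graphics 800, 600

While Not KeyHit(KEY_ESCAPE)
	For Local y:Int = 0 Until 600
		For Local x:Int = 0 Until 800
			SetColor Rand(0, 255), Rand(0, 255), Rand(0, 255)
			Plot x, y
		Next
	Next
	Flip
Wend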



The reason I ask is that I'm trying my luck at writing a ray caster. It's really fast when using DrawLine() and no textures, but when I add textures and use Plot() instead, I get 2-4 fps when I'm really close to a wall. After a lot of tracking down, I found that SetColor() and Plot() are the biggest thieves.

The code I use in my raycaster is this (since it's a ray caster I draw vertical slices of the walls, so I'm only including the relevant slice-drawing code):



I've also tried using OpenGL, but I get pretty much the same result (is Max2D using OpenGL maybe?). I've tried different drivers, etc. It doesn't seem to get any faster. I am using ALPHABLEND btw (when using anything else I get the same fps, but boxy-looking text). Any help will be appreciated!

Regards,
Anthony


ImaginaryHuman(Posted 2010) [#2]
Plot is the slowest way to draw lots of pixels. Talking here about OpenGL only ... each time you want to plot a pixel it's calling the plot function and passing parameters (overhead), it's starting a new sequence of geometry with glBegin() (overhead), then it's drawing ONE vertex with glVertex2f() or similar, and then returning to your code. All that just to plot one single pixel. Actually it reminds us of how efficient it *used* to be to draw to the backbuffer using the CPU with a single pointer and a byte write like nextpixel[offset]=value. Anyway..

Faster way #1 is to write your own GL code to do ONE glBegin() and lots of glVertex4i and glColor4ub inside of it.
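
A sketch of #1, assuming a GL window from GLGraphics() and a Max2D-style orthographic projection (the random colours are stand-in data):

SuperStrict

Const W:Int = 800, H:Int = 600

GLGraphics W, H

' Pixel-aligned orthographic projection with a top-left origin, like Max2D.
glMatrixMode GL_PROJECTION
glLoadIdentity
glOrtho 0, W, H, 0, -1, 1
glMatrixMode GL_MODELVIEW
glLoadIdentity

While Not KeyHit(KEY_ESCAPE)
	glClear GL_COLOR_BUFFER_BIT
	' One glBegin/glEnd pair for the whole frame instead of one per Plot.
	glBegin GL_POINTS
	For Local y:Int = 0 Until H
		For Local x:Int = 0 Until W
			glColor3ub Rand(0, 255), Rand(0, 255), Rand(0, 255)
			glVertex2i x, y
		Next
	Next
	glEnd
	Flip
Wend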

Faster still by a long way #2 is to write your own GL code to use vertex arrays... put all your coordinates in a single array and put all your colors in another array as unsigned bytes, set up the vertex array system to read these two arrays, then call glDrawArrays() to draw the whole lot in one go. It'll probably be *at least* twice as fast as the previous method, which itself is probably twice as fast as Plot.
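
And a sketch of #2 under the same assumptions; the coordinates never change, so they're filled once up front and only the colour array is rewritten each frame:

SuperStrict

Const W:Int = 800, H:Int = 600

GLGraphics W, H

glMatrixMode GL_PROJECTION
glLoadIdentity
glOrtho 0, W, H, 0, -1, 1
glMatrixMode GL_MODELVIEW
glLoadIdentity

Local verts:Int[W * H * 2]    ' x,y per point
Local cols:Byte[W * H * 4]    ' r,g,b,a per point

' Fill the (static) coordinate array once.
Local n:Int = 0
For Local y:Int = 0 Until H
	For Local x:Int = 0 Until W
		verts[n] = x ; verts[n + 1] = y
		n :+ 2
	Next
Next

glEnableClientState GL_VERTEX_ARRAY
glEnableClientState GL_COLOR_ARRAY
glVertexPointer 2, GL_INT, 0, Varptr verts[0]
glColorPointer 4, GL_UNSIGNED_BYTE, 0, Varptr cols[0]

While Not KeyHit(KEY_ESCAPE)
	' Rewrite only the colours each frame (stand-in random data).
	For Local i:Int = 0 Until W * H * 4
		cols[i] = Rand(0, 255)
	Next
	glClear GL_COLOR_BUFFER_BIT
	glDrawArrays GL_POINTS, 0, W * H   ' the whole screen in one call
	Flip
Wend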


AnthonyB(Posted 2010) [#3]
Oh. Thanks for that information. I'm pretty new to graphical programming, and I've been trying to research it on my own, but it's not that easy coming from a purely text-based or GUI-based programming background. Again, thanks a lot! :)


ima747(Posted 2010) [#4]
Not sure about performance comparisons, but what about creating an image and locking it, altering the pixmap data directly, then unlocking, drawing and flipping? I'm sure it's nowhere near as fast as ImaginaryHuman's suggestions, but it might be easier to transition to from your existing code. Not sure what the performance difference might be, though... just a thought.
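
Something like this, sketched from the description (the sizes and the DYNAMICIMAGE flag are assumptions):

SuperStrict

Graphics 800, 600

Local image:TImage = CreateImage(800, 600, 1, DYNAMICIMAGE)

While Not KeyHit(KEY_ESCAPE)
	Local pix:TPixmap = LockImage(image)   ' direct access to the image's pixels
	For Local y:Int = 0 Until 600
		For Local x:Int = 0 Until 800
			WritePixel pix, x, y, $FF000000 | Rand(0, $FFFFFF)   ' opaque ARGB
		Next
	Next
	UnlockImage image
	DrawImage image, 0, 0
	Flip
Wend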

I miss directly drawing to the back buffer as well *sigh* good times...


Otus(Posted 2010) [#5]
Not sure about performance comparisons, but what about creating an image and locking it, altering the pixmap data directly, then unlocking, drawing and flipping? I'm sure it's nowhere near as fast as ImaginaryHuman's suggestions, but it might be easier to transition to from your existing code. Not sure what the performance difference might be, though... just a thought.

When I last wrote a raytracer, this was the fastest way by my testing. If you will only display the data once, you might as well create a pixmap and draw that directly without needing to bother with locking.
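
A sketch of that, skipping TImage entirely and poking the pixmap through a raw pointer (the PF_BGRA8888 format is an assumption; swap it if the colour channels come out reversed):

SuperStrict

Graphics 800, 600

Local pix:TPixmap = CreatePixmap(800, 600, PF_BGRA8888)

While Not KeyHit(KEY_ESCAPE)
	For Local y:Int = 0 Until 600
		' One pointer per row; each Int write lays down a whole 32-bit pixel.
		Local row:Int Ptr = Int Ptr(PixmapPixelPtr(pix, 0, y))
		For Local x:Int = 0 Until 800
			row[x] = $FF000000 | Rand(0, $FFFFFF)
		Next
	Next
	DrawPixmap pix, 0, 0
	Flip
Wend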

I guess some advanced GL code like what ImaginaryHuman suggested might be faster...


ima747(Posted 2010) [#6]
I forgot you can DrawPixmap, so you don't need a TImage as such; not sure about that speed-wise, though... I think the implication of the FPS dip is that it will be redrawn as fast as possible, including flipping to the screen. A real-time textured ray tracer is a lot to ask for, but anyone who wants to try has my support; maybe we can get past the alpha depth issues :0)


jhans0n(Posted 2010) [#7]
Is there a DirectX equivalent of ImaginaryHuman's suggestion? What he describes works very well for OpenGL, but I'd like to be able to do it in DirectX too.


AnthonyB(Posted 2010) [#8]
I wrote a test using the OpenGL technique ImaginaryHuman mentioned, but I can't get it to work. This is what I've got so far:



All I get is a black screen. I wrote it so that it will be easy to make the transition from my test program to my raycaster.


ImaginaryHuman(Posted 2010) [#9]
You might need a glViewport() in your initGL() function?

Also, if you are drawing plain untextured pixels you can switch off blending, texturing, and smoothing; by smoothing I mean use glShadeModel(GL_FLAT). You can also probably set the texture environment to GL_DECAL instead of GL_MODULATE to save on some multiplications.
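
The relevant calls, as a sketch (the 800x600 size is an assumption; initGL is the function mentioned above):

Function initGL(width:Int = 800, height:Int = 600)
	glViewport 0, 0, width, height   ' map clip space onto the whole window
	glDisable GL_BLEND               ' opaque pixels: no blending needed
	glDisable GL_TEXTURE_2D          ' plain coloured points: no texturing
	glShadeModel GL_FLAT             ' skip per-vertex colour interpolation
	' If texturing comes back later, GL_DECAL skips the per-texel
	' multiply that GL_MODULATE performs:
	glTexEnvi GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_DECAL
End Function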


AnthonyB(Posted 2010) [#10]


This is my code at the moment. Still a black screen. Since I can't get GLDrawText to work, I print the fps afterwards; because it measures only the render time and not the time a whole iteration takes, the fps figure is still accurate. I get 7-9 fps when I run this program.

I must have missed something. I added the viewport where I think it should go, and I added all the optimizations you mentioned in your last post (I think I did them the right way, anyway), and it's still slow and draws a blank screen. What have I missed? I know there are programs that do exactly what this program does with OpenGL and get MUCH higher fps on even slower computers than my own (AMD Phenom II quad-core 3.2GHz, Radeon X1650 XT 256MB), so I must have missed something!


matibee(Posted 2010) [#11]
I think you're barking up the wrong tree, Anthony. I can fill an 800x600 screen with glDrawPixels at 100 fps, but doing anything meaningful with those 480,000 pixels (nigh on 2 million bytes) takes a heck of a lot longer.

This takes around 10ms..
glDrawPixels( 800, 600, GL_RGBA, GL_UNSIGNED_BYTE, m_vertexColors )

It's only got the same amount of data to upload to the video card as any other method, so I would guess at bandwidth being the bottleneck there (assuming a decent driver implementation).

Nevertheless, this alone takes around 60ms..
For Local y:Int = 0 To 599
	Local ty:Int = y * 800   ' row offset into the 800x600 buffer
	For Local x:Int = 0 To 799
		Local i:Int = Rand(0, 1000)   ' random index into a precomputed byte table
		' Pack an opaque ARGB pixel from three consecutive table entries.
		m_vertexColors[ x + ty ] = $FF000000 | (randBuff[i] Shl 16) | (randBuff[i+1] Shl 8) | randBuff[i+2]
	Next
Next


70ms (14fps) just to fill the screen with random junk :/

It's a quad-core CPU, so maybe I could fill it with 4 separate threads in 17ms (at best), but even that's only 37fps for random junk. That's further than I have time to try, though.

Here's your code modified for glDrawPixels. I'm no OpenGL expert; one thing that tripped me up is that with GL_BYTE the colour components are treated as signed, so $FF reads as -1 and clamps to black, which is why filling the buffer with $FFFFFFFF didn't come out bright white. GL_UNSIGNED_BYTE is the type to use. Still, take this code with a pinch of salt. For the profiler just install the framework from my sig.




AnthonyB(Posted 2010) [#12]
@Matibee: Thanks for that information. But then I wonder: how could a textured raycaster (like the classic Wolfenstein 3D) do this on hardware of that era without killing the FPS? If it takes such a long time to fill the screen with pixels, how come we get all of these cool games today, and how come Wolfenstein 3D and Doom both ran so well on very slow hardware?


Otus(Posted 2010) [#13]
For Local y:Int = 0 To 599
		Local ty:Int = y * 800
		For Local x:Int = 0 To 799
			Local i:Int = Rand(0, 1000)
			m_vertexColors[ x + ty ] = $FF000000 | (randBuff[i] Shl 16) | (randBuff[i+1] Shl 8) | randBuff[i+2]
		Next
	Next


That's 480,000 calls to Rand, which is a rather slow function (it calls another function internally). If that takes 60 ms, I'd say that's rather good: a few hundred cycles per call.

If it takes such a long time to fill the screen with pixels, then how come we get all of these cool games today, and how come Wolfenstein 3D and Doom both worked so well with very slow hardware?

While processing power and various bandwidths have increased a lot since those days, latencies have not gone down by nearly as much. I don't know anything about the internals of OpenGL, but I imagine getting the data to the graphics card is what takes most of the time. Computation of the visuals in modern games mostly happens inside the graphics card.


AnthonyB(Posted 2010) [#14]
I think I know what makes it so slow. I should draw quads instead of points, and just change the size of the quads when getting closer to a wall. That way, when near a wall, it will only draw at most 64x64 quads. That should be MUCH faster than drawing individual pixels for the walls when one texel of the wall covers something like 32x32 screen pixels.


matibee(Posted 2010) [#15]
I think I know what makes it so slow. I should draw quads instead of points, and just change the size of the quads when getting closer to a wall. That way, when near a wall, it will only draw at most 64x64 quads. That should be MUCH faster than drawing individual pixels for the walls when one texel of the wall covers something like 32x32 screen pixels.


But in raycasting you draw vertical stripes. Keeping track of which pixels have been plotted will be a lot of overhead, and in the cases where you're not right up against a wall (i.e. most of the time) it'll be worse.

I got this code to run at around 35 to 40 fps on my machine by rendering to a pixmap and drawing that. I just profiled it the same way at 800x600 res: the ray casting (including writing all the necessary pixels into the pixmap) averaged 10ms, and the pixmap drawing averaged 6ms.


ImaginaryHuman(Posted 2010) [#16]
If you are going to be drawing columns only it would be way faster to draw vertical strips from textures in video ram.


AnthonyB(Posted 2010) [#17]
And how do I draw vertical stripes? I have thought about that as well, but I haven't found anything when searching, and I can't figure it out myself.


matibee(Posted 2010) [#18]
Good call IH! Look up DrawSubImageRect()
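
The gist: one thin slice of the texture per screen column, stretched to the projected wall height. A sketch (the wall.tga file, its 64x64 size, and the texX/h values are all assumptions; in the real thing they come from the caster):

SuperStrict

Graphics 800, 600

Local wall:TImage = LoadImage("wall.tga", 0)   ' 0 = no FILTEREDIMAGE

While Not KeyHit(KEY_ESCAPE)
	Cls
	For Local x:Int = 0 Until 800
		Local texX:Int = x Mod 64                  ' texture column this ray hit (demo value)
		Local h:Int = 200 + 100 * Sin(x * 0.45)    ' projected wall height (demo value)
		' Source rect (texX, 0, 1, 64) stretched onto a 1-pixel-wide, h-tall strip.
		DrawSubImageRect wall, x, (600 - h) / 2, 1, h, texX, 0, 1, 64
	Next
	Flip
Wend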

I just edited the code from the old post, and even cranking the resolution up to 1600x1200 struggles to push the ray casting and screen drawing much over a millisecond combined!

In my defence, drawsubimagerect didn't exist 9 months ago ;)


AnthonyB(Posted 2010) [#19]
@Matibee: Could you maybe share that code? I'm way too tired to think right now, so an example of how you would do that would be very much appreciated! :) I know asking for code isn't well liked on programming forums, but I've been at this for weeks and I'm studying for a math test at the moment, so I'm really, really tired, hehe. I hope it's not too much to ask.


AdamRedwoods(Posted 2010) [#20]
Faster, pseudo-random.
You get speed from doing it twice, but not 3 times.
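
A guess at the sort of generator meant, a shift/xor routine where the third mixing step can be dropped for extra speed at some cost in quality (all names below are assumptions, not the original snippet):

Global seed:Int = $2545F491

' Hypothetical cheap inline generator: two shift/xor steps instead of
' the classic three of xorshift32. Good enough for visual noise, but a
' poorer-quality (and shorter-cycle) sequence than Rand().
Function FastRand:Int()
	seed = seed ~ (seed Shl 13)
	seed = seed ~ (seed Shr 17)
	' A third step, seed = seed ~ (seed Shl 5), improves quality but costs time.
	Return seed
End Function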




matibee(Posted 2010) [#21]
Anthony, the original pixel-plotting code is still here. The OP never asked me to take it down, so I assume he doesn't mind it being there. The modified version using DrawSubImageRect (which returns the best part of 2000fps on my machine, without vsync obviously) is here...

Thanks for the wake up call IH :)




ImaginaryHuman(Posted 2010) [#22]
Yah, well I didn't know you were doing columns, otherwise I'd have said it sooner; I thought you were doing some fancy raytracing thing.


AnthonyB(Posted 2010) [#23]
Thanks for your replies guys. I got it working at like 2000-3000 fps now! :)


AnthonyB(Posted 2010) [#24]
I have one more question though. How do I turn off the anti-aliasing when using GLMax2D, Graphics() and SetGraphicsDriver(GLMax2DDriver())?


AnthonyB(Posted 2010) [#25]
No one? :O It really looks cheap when the antialiasing is only applied on the y-axis and not the x-axis, so I want to turn it off completely!


matibee(Posted 2010) [#26]
You don't "turn off" antialiasing as such; you just tell Blitz not to filter the image when you load it, by passing 0 for the flags so FILTEREDIMAGE isn't set:

Global wallImage:TImage = LoadImage("wall.tga",0)



AnthonyB(Posted 2010) [#27]
Oh, thanks! :)