B+, Emulators, and ALPHA Sprites!
| ||
Yeah, that's a weird topic, I know... lol. Anyway, here is my thinking, and please correct me if I am wrong (only in theory for now).

We all know that B3D (or 3D hardware in general) is much faster at transparency, rotation, scaling, etc. than 2D/software. However, is 2D/software fast enough anyway for smaller sprites?

Let me back up. The other day I was playing Super Metroid on my laptop via the ZSNES emulator (don't worry, I own the game so my rom is legal). I was amazed at how SMOOTH the scrolling was! Even full screen! Also, in the game, there are sections where clouds appear to float around and anything under the clouds is slightly altered in color. Transparency at its greatest.

Then I got to thinking. ZSNES more than likely just uses DirectDraw. I don't think it uses any 3D. My laptop is only an 800 MHz P3 with a crappy 32 MB video card (not meant for gaming). Yet everything was so smooth and nice! I was getting full screen scrolling, great sound, SMOOTH graphics (WITH ALPHA) on a "crappy" laptop using 2D???

So, why can't B+ (or BB) do some of the ALPHA tricks in 2D?? I realize you will never get 60,000 sprites at 500 FPS, but come on. Why couldn't I make a game that was 640x480 (same res as ZSNES) with maybe one layer with transparency? Or maybe a dozen small sprites with transparency? Does this seem reasonable? I know there are some ALPHA commands out there for Blitz, but are any of them any good? Has anyone done anything like this in 2D?

Bottom line is this. My goal is to have maybe 4 layers scrolling (platform game), with one of those layers a simple "transparent" layer. Or have 3 layers with about 10 sprites that appear to be transparent, all using B+. I think it should be possible. I could maybe even include a "Turn ALPHA OFF" option for really slow computers.

What do you programmers think?

Thanks
cb |
| ||
I wish this was possible too... Alpha transparency in BB is beyond reach for real time use at the moment =( |
| ||
I think the Extended BB .dll allows for alpha transparency, etc. I'll dig out the thread...hang on...Here you go :) |
| ||
It's possible, especially if you just want to use transparency on 10-12 sprites every now and then. In my demo vectorized2 I have a big 3D cube done in Blitz+ that rotates and is twisted and transparent at the same time. When you consider how much time the calculations and drawing of the cube itself take, you can see that if you were just drawing a screen full of squares you would have ample time to move a good number of smaller transparent images around the screen without too many problems. Everything must be done on a locked buffer with WritePixelFast, but it can be done.

Here is a link to that demo: http://www.blitzcoder.com/cgi-bin/showcase/showcase_showentry.pl?id=zawran10302003103604&comments=no
Or you can check my demos at zac-interactive.com

To get the best result, have a routine that takes the sprite and puts it into an array, and then read from the array as you move the sprite around. That way you only have to do one ReadPixelFast for each pixel, because the biggest slowdown is with the ReadPixel, not so much with the WritePixel. |
| ||
That's the thing - DON'T use readpixel. Reads from video memory are slow. It's perfectly possible to get nice full-screen alpha stuff in software, but programs that achieve this do so by keeping everything in system memory, and then dumping the entire buffer into screen memory when it's done.

"(don't worry, I own the game so my rom is legal)."

Yeah I was so worried. Can't have people admitting to having an illegal copy of something they stopped selling years ago... |
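A minimal sketch of the "keep it in system memory, dump it once" idea, written in Python rather than Blitz so it can be run anywhere. Everything here is an assumption for illustration: `frame` plays the system-memory buffer, `screen` only pretends to be video memory, and `put_pixel`, `get_pixel` and `present` are made-up helper names, not Blitz commands.

```python
# Sketch: draw into a system-memory frame, then copy it to the "screen"
# in one bulk operation. `screen` is only a stand-in for video memory.

WIDTH, HEIGHT, BPP = 640, 480, 4          # 32-bit pixels
PITCH = WIDTH * BPP                       # bytes per row

frame = bytearray(PITCH * HEIGHT)         # system-memory back buffer
screen = bytearray(PITCH * HEIGHT)        # pretend video memory


def put_pixel(buf, x, y, rgb):
    """Write one packed $RRGGBB pixel as little-endian XRGB8888."""
    offset = y * PITCH + x * BPP
    buf[offset:offset + BPP] = rgb.to_bytes(BPP, "little")


def get_pixel(buf, x, y):
    """Read one pixel back; cheap here because buf is ordinary RAM."""
    offset = y * PITCH + x * BPP
    return int.from_bytes(buf[offset:offset + BPP], "little")


def present(src, dst):
    """Copy the finished frame across with a single slice assignment."""
    dst[:] = src


put_pixel(frame, 10, 20, 0x00FF00)        # all drawing touches `frame` only
present(frame, screen)                    # one transfer per frame
```

The point of the layout is that every read happens against `frame`, so the slow path (reading back from the display) never occurs; the display only ever receives one large write per frame.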
| ||
Yeah, I think you two are correct. Reading pixels is SLOOW. Hmmm... what do you guys think about doing everything in an array or bank? This would be like using system memory to do complex blits and then copying that to the backbuffer?? I may experiment with that. What do you guys think?

Listen, here is what I think... years and years ago, I could take my 7.14 MHz Amiga 500, read from video memory, and plot sprites at 60 fps. SURELY our 500+ MHz machines nowadays can do it... lol

cb

********** EDIT ****************

Ok, I just coded a short example for array blitting. I am not too impressed. I haven't tried hard, but I don't think it could get much faster. Normally I get 100 FPS (vsync) with a standard cls/flip loop, but using this code I only get 47 fps. Jeesh. I might forget the alpha blitting for a while. Or just stick with other people's code for the handful of sprites that I will need alpha-blitted.

-cb

    ;This is a test of advanced blits using system memory
    ;cbmeeks

    Const SCR_WIDTH = 640
    Const SCR_HEIGHT = 480
    Global Timer, FPS_Real, FPS_Temp, FPS

    ;screen array (system memory copy of the screen)
    Dim Screen(SCR_WIDTH,SCR_HEIGHT)

    ;graphics
    Graphics SCR_WIDTH,SCR_HEIGHT,16,1
    SetBuffer BackBuffer()

    ;clear arrays
    Clear(0,128,0)

    ;main loop
    Repeat
        ;copy arrays to backbuffer
        LockBuffer
        For y=0 To SCR_HEIGHT-1
            For x=0 To SCR_WIDTH-1
                WritePixelFast x,y,Screen(x,y)
            Next
        Next
        UnlockBuffer
        DisplayFPS(0,0)
        Flip
    Until KeyDown(1)

    Function Clear(r,g,b)
        Local x,y
        ;pack the color once instead of once per pixel
        Local rgb = GetRGB(r,g,b)
        For y=0 To SCR_HEIGHT-1
            For x=0 To SCR_WIDTH-1
                Screen(x,y)=rgb
            Next
        Next
    End Function

    Function GetRGB% ( Red% , Green% , Blue% )
        ; Combines Red, Green and Blue values into one RGB value
        Return (Red Shl 16) Or (Green Shl 8) Or Blue
    End Function

    Function DisplayFPS(x#,y#)
        Color 255,255,255
        ;once a second, latch the frame count and start counting again
        If Timer + 1000 <= MilliSecs() Then
            Timer = MilliSecs()
            FPS_Real = FPS_Temp
            FPS_Temp = 0
        EndIf
        FPS_Temp = FPS_Temp + 1
        Text x#,y#,"FPS: " + FPS_Real
    End Function
|
| ||
If you are only doing a few sprites here and there, you don't have to wpf the entire screen. Just put the color values of the sprites into arrays and then only read the pixels that are in the actual location you are drawing to. If you are drawing a 32x32 pixel sprite then that is only 1024 rpf plus another 1024 wpf to write it back. If you have some kind of masking color you might even avoid a bunch of those as well with an if-check. A full screen in the above is 307,200 rpf+wpf's, which of course is somewhat on the slow side. But you can have plenty of smaller sprites moving around, no problem. You could do something like this:

    Function drawTransp(x,y,transp#)
        ;transp# is the sprite opacity from 0.0 to 1.0 (must be a float)
        transpn# = 1.0 - transp
        LockBuffer BackBuffer()
        For yy=0 To 39
            For xx=0 To 39
                mask = spritemaskdata(xx,yy)
                ;skip masked pixels and anything outside the 640x480 screen
                ;before touching the buffer at all
                If mask <> $ff0000 And x+xx > -1 And x+xx < 640 And y+yy > -1 And y+yy < 480 Then
                    rgbs = spritedata(xx,yy)
                    rgbd = ReadPixelFast(x+xx,y+yy,BackBuffer())
                    r = (((rgbs Shr 16) And $ff) * transp) + (((rgbd Shr 16) And $ff) * transpn)
                    g = (((rgbs Shr 8) And $ff) * transp) + (((rgbd Shr 8) And $ff) * transpn)
                    b = ((rgbs And $ff) * transp) + ((rgbd And $ff) * transpn)
                    If r > 255 Then r = 255
                    If g > 255 Then g = 255
                    If b > 255 Then b = 255
                    If r < 0 Then r = 0
                    If g < 0 Then g = 0
                    If b < 0 Then b = 0
                    rgbf = (r Shl 16) Or (g Shl 8) Or b
                    WritePixelFast x+xx,y+yy,rgbf
                End If
            Next
        Next
        UnlockBuffer BackBuffer()
    End Function

And before you use it, put the rgb values of the sprite and mask into the arrays set up for it (spritedata and spritemaskdata above). |
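The same blend can be sketched outside Blitz to check the arithmetic. This is only an illustration in Python under stated assumptions: `blend_pixel` and `draw_transp` are hypothetical names, the frame is a plain list of rows rather than a locked buffer, and `transp` is a float between 0.0 and 1.0. It returns how many destination pixels were actually read, to show how the mask check and clipping cut down the expensive reads.

```python
# Sketch of per-channel alpha blending with a mask color.
# `transp` is a float in 0.0..1.0; all names are illustrative.

MASK = 0xFF0000  # sprite pixels flagged with this value are skipped


def blend_pixel(src, dst, transp):
    """Return src blended over dst; transp=1.0 gives pure src."""
    out = 0
    for shift in (16, 8, 0):
        s = (src >> shift) & 0xFF
        d = (dst >> shift) & 0xFF
        c = int(s * transp + d * (1.0 - transp))
        out |= max(0, min(255, c)) << shift
    return out


def draw_transp(frame, width, height, sprite, mask, x, y, transp):
    """Blend a cached sprite (list of rows) into frame (list of rows)."""
    reads = 0
    for yy, row in enumerate(sprite):
        for xx, src in enumerate(row):
            px, py = x + xx, y + yy
            if mask[yy][xx] == MASK:
                continue                    # masked: no read, no write
            if not (0 <= px < width and 0 <= py < height):
                continue                    # clipped before any access
            frame[py][px] = blend_pixel(src, frame[py][px], transp)
            reads += 1
    return reads


frame = [[0x0000FF] * 4 for _ in range(4)]          # blue background
sprite = [[0xFFFFFF, 0xFFFFFF], [0xFFFFFF, 0xFFFFFF]]
mask = [[0, MASK], [0, 0]]                          # one masked pixel
reads = draw_transp(frame, 4, 4, sprite, mask, 1, 1, 0.5)
```

Here a 2x2 white sprite with one masked pixel costs three reads instead of four; on a real sprite with a lot of empty space around it, the saving is proportionally larger.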
| ||
If this is B+, try using the new lockedpixel etc commands. I'll try and knock up an example to see if it's faster. |
| ||
I just tried poking the R, G and B data into three different memory banks. Then I peeked the values out of the banks and used writepixelfast to blit them to the screen with alpha alteration. This worked quite well. I achieved around 30,000-35,000 alpha shaded pixels per frame, plus the drawing of the background, in 640x480x32 at 85 Hz. That's around 2.7 million alpha pixels per second, running on a 2.6 GHz P4 with a Radeon 9700 Pro. It could probably be tweaked, however.

Downsides:
> uses a lot of memory
> only allows alpha shading of the background and not of sprites on top
> takes time to create the memory banks
> probably more downsides too =) |
| ||
"If this is B+, try using the new lockedpixel etc commands."

I saw that command but don't quite understand it. What's it for?

cb |
| ||
Does that extended library (or any other for that matter) allow me to have a 2d image and then apply an alpha mask to it (so some parts of the image are more transparent than other sections) - for 2d shadow effects, etc? Cheers, A |
| ||
I don't think it does, but I could be wrong.

Oh, and I found out what the lockedpixels commands do. In fact, I converted my test to use lockedpixels and it almost doubled in speed! However, it is still too slow for full screen. But I think I will write a version for drawing sprites. I really want to use it to create a "cloud" layer over my maps. That is going to be difficult.

-cb

    ;This is a test of advanced blits using system memory
    ;cbmeeks

    Const SCR_WIDTH = 640
    Const SCR_HEIGHT = 480
    Global Timer, FPS_Real, FPS_Temp, FPS
    Global bank

    Const FORMAT_RGB565=1
    Const FORMAT_XRGB1555=2
    Const FORMAT_RGB888=3
    Const FORMAT_XRGB8888=4

    ;graphics
    Graphics SCR_WIDTH,SCR_HEIGHT,16,1
    SetBuffer BackBuffer()

    ;main loop
    Repeat
        LockBuffer
        bank = LockedPixels()
        ;pitch and format do not change while the buffer is locked
        pitch = LockedPitch()
        fmt = LockedFormat()
        For y=0 To SCR_HEIGHT-1
            offset=y*pitch
            Select fmt
                Case FORMAT_RGB565
                    ;two 16-bit red pixels per int
                    For x=0 To 319
                        PokeInt bank,offset+x*4,$f800f800
                    Next
                Case FORMAT_XRGB1555
                    ;two 15-bit red pixels per int
                    For x=0 To 319
                        PokeInt bank,offset+x*4,$7c007c00
                    Next
                Case FORMAT_RGB888
                    ;three bytes per pixel (assumed order: blue, green, red),
                    ;so write bytes here; 640 ints would overrun the row
                    For x=0 To 639
                        PokeByte bank,offset+x*3,0
                        PokeByte bank,offset+x*3+1,0
                        PokeByte bank,offset+x*3+2,$ff
                    Next
                Case FORMAT_XRGB8888
                    ;one 32-bit red pixel per int
                    For x=0 To 639
                        PokeInt bank,offset+x*4,$00ff0000
                    Next
            End Select
        Next
        UnlockBuffer
        DisplayFPS(0,0)
        Flip
    Until KeyDown(1)

    Function GetRGB% ( Red% , Green% , Blue% )
        ; Combines Red, Green and Blue values into one RGB value
        Return (Red Shl 16) Or (Green Shl 8) Or Blue
    End Function

    Function DisplayFPS(x#,y#)
        Color 255,255,255
        ;once a second, latch the frame count and start counting again
        If Timer + 1000 <= MilliSecs() Then
            Timer = MilliSecs()
            FPS_Real = FPS_Temp
            FPS_Temp = 0
        EndIf
        FPS_Temp = FPS_Temp + 1
        Text x#,y#,"FPS: " + FPS_Real
    End Function
|
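The magic numbers in the 16-bit cases are just "pure red" packed into each pixel layout, with two pixels stored per 32-bit write. A small Python sketch, assuming the usual bit layouts for RGB565 and XRGB1555 (the helper names are made up for illustration), shows where `$f800f800` and `$7c007c00` come from:

```python
# Sketch: pack 8-bit R, G, B into the 16-bit layouts named above and
# put two 16-bit pixels into one 32-bit word, as the PokeInt loop does.


def pack_rgb565(r, g, b):
    """5 bits red, 6 bits green, 5 bits blue."""
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)


def pack_xrgb1555(r, g, b):
    """1 unused bit, then 5 bits each of red, green, blue."""
    return ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)


def two_pixels(pixel16):
    """Duplicate one 16-bit pixel into both halves of a 32-bit word."""
    return (pixel16 << 16) | pixel16


red565 = pack_rgb565(255, 0, 0)
red1555 = pack_xrgb1555(255, 0, 0)
print(hex(two_pixels(red565)))   # 0xf800f800
print(hex(two_pixels(red1555)))  # 0x7c007c00
```

This is also why the 16-bit loops only run to 319: each 32-bit write covers two of the 640 pixels in a row.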
| ||
cb, what FPS do you get on your machine? Your first example runs at 10FPS on mine, the second one at 4FPS?? (P4 2.6GHz, Radeon 9700) I suspect it may have something to do with the version of B+? I'm using a quite old one, 1.34, that I received for testing CTCC in. |
| ||
Andyboy - the dlls in that link do indeed allow you to have alpha maps. Download it and take a look at the demos. |
| ||
[reply] cb, what FPS do you get on your machine? Your first example runs at 10FPS on mine, the second one at 4FPS?? (P4 2.6GHz, Radeon 9700) [/reply]

What?? If anything the second should be faster... Using VSYNC, I get 47 fps on the first and 95 on the second. When I turn VSYNC off, I get over 200 on the second one. When I do a:

    Repeat
        Cls
        Flip False
    Until KeyDown(1)

I get almost 1000 fps! Clearly, plotting the screen pixel by pixel is much slower. Normally, you would never do this anyway. Much faster to copy rows of pixels or lines.

cb |
| ||
cbmeeks >> I am extremely interested in where things are going wrong on my machine! I can't get any performance out of B+ and writepixelfast at all. What version are you using?? Edit: Hmm, I switched off debug mode and it went from 4FPS to 550 FPS ?!? I didn't know the debug mode made THAT much difference?? |
| ||
I am using 1.37, the newest. Hey, try this code out. I am curious what FPS you get.

    Global Timer, FPS_Real, FPS_Temp, FPS

    Graphics 640,480,16,1
    SetBuffer BackBuffer()

    Repeat
        Cls
        DisplayFPS(0,0)
        Flip False
    Until KeyDown(1)

    Function DisplayFPS(x#,y#)
        Color 255,255,255
        ;once a second, latch the frame count and start counting again
        If Timer + 1000 <= MilliSecs() Then
            Timer = MilliSecs()
            FPS_Real = FPS_Temp
            FPS_Temp = 0
        EndIf
        FPS_Temp = FPS_Temp + 1
        Text x#,y#,"FPS: " + FPS_Real
    End Function

I get about 1000 FPS on my P3 800 laptop.

cb |
| ||
3820 FPS in BlitzPlus
5200 FPS in Blitz2D |
| ||
8070 FPS in Blitz2D on an Athlon 2000+, which is 1.67 GHz... although maybe a pointless test :P |
| ||
I can't see why Blitz can't have methods fast enough to at least do alpha at a good speed. You shouldn't need a 3D card for that.

I know the main killer is ReadPixelFast. I've done many tests with alpha, and the fewer ReadPixelFasts you do, the faster it runs, even if you do double the WritePixelFasts. So the wpf is really fast, but the rpf is really slow :( But as has been stated, reading from video memory is slow.

If only we could create a buffer in system memory that we could do all our drawing commands to, just as easily as we do with the backbuffer. Maybe instead of SetBuffer BackBuffer() it could be SetBuffer SysMemoryBuffer(). After that, you would use it exactly the same as any other buffer: DrawImage to it, write pixels, whatnot, and Flip would still flip that SysMemoryBuffer to the front screen. Basically nothing would change; all the changes would be internal ones that Mark would have to set up, so that everything draws to system memory instead of video memory.

The only thing is, I don't know how fast it would be for the Flip to take the image in system memory and put it in the FrontBuffer(). If it's fast, then all would be great! Then you could do really fast readpixels from system memory, do alphas, do all kinds of magical pixel stuff, now that it's all in system memory :D Although if the flip from system memory to the front buffer is slow, then it's pointless :( |
| ||
Whoah... hang on guys, think this through... the SNES was only capable of displaying something like 64 colors at one time, meaning it's a palette-based graphics system with each pixel being only 4 bits in length (the GPU then uses the current palette as a lookup table to define the color visible on screen).

You aren't seeing true alpha blitting of sprites and such. It's a simple hack-like trick of setting up the color palette in such a way that you can use logical operators (OR, AND, XOR) when blitting the sprite to the background. Say the background pixel is 0010 (palette index 2), and you use the OR operator with a sprite pixel value of 0001. The result (0010 combined with 0001 using OR) is 0011 (a value of 3). The color of palette index 1 (the sprite) is bright red, and index 2 (the background) is black. If you had made palette index 3 a dark red color (3 being the result of the OR operation on background and sprite), then it would seem that the sprite was alpha blitted onto the background.

When you get into "true" color modes (16-bit, 24-bit, 32-bit, etc.) these sorts of palette tricks don't work the same way. However, you can do a "virtual" palette type of thing by creating an array to hold the palette values and doing all your blitting in software. Then, instead of transferring each software blitted pixel directly, you use the value of the software pixel as an index into the palette array, and write that color value to the screen buffer. This can increase the speed of your software blitting, because you don't need each pixel to be a direct 16-bit or 32-bit color value. |
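The OR trick described above can be sketched in a few lines of Python. This only illustrates the post's own example (index 1 bright red, index 2 black, index 3 dark red); the exact color values and the variable names are assumptions, and it makes no claim about how any particular console or emulator actually does it.

```python
# Sketch of the palette trick: pixels are palette indices, the sprite is
# OR-ed onto the background, and the palette supplies the "blended" color.

palette = {
    0b0000: 0x000000,  # unused here
    0b0001: 0xFF0000,  # sprite: bright red
    0b0010: 0x000000,  # background: black
    0b0011: 0x800000,  # sprite OR background: dark red
}

background = [0b0010, 0b0010, 0b0010, 0b0010]
sprite = [0b0000, 0b0001, 0b0001, 0b0000]  # 0 = nothing drawn

# Logical OR per pixel; no color arithmetic happens at blit time.
combined = [b | s for b, s in zip(background, sprite)]

# Only at display time is each index looked up in the palette.
colors = [palette[i] for i in combined]
```

The "virtual palette" idea for true color modes is the last line: keep small indices in the software buffer and translate them through the palette array only when writing to the screen buffer.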
| ||
Yeah, I do miss OR, AND, XOR, and cool stuff like palette rotations, etc. If only Blitz could do 8-bit modes, load palettes, and had OR/AND/XOR image commands. That alone would speed up graphics that don't require 16-bit colors.

But, as is, would a system like I described above work? Would flipping from a system memory buffer to the front buffer be fast enough? Because if so, doing all graphics in system memory with pixel commands would be much faster. |
| ||
Depends on a number of factors, color depth and image resolution being the key ones. An 800 by 600 32-bit color buffer uses approx 1.83 MB, and that's per full update of the frame: a whole lot of data to transfer from system to video memory. Trying to do that at 60 frames per second requires about 110 MB of data to be transferred from system to video memory each second. That's not really possible even on modern hardware, since the system-to-video memory pathway lacks the bandwidth to do this. However, a 320 by 200 16-bit image transferred 60 times a second requires something like 7 MB of bandwidth per second, which is more reasonable. |
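The figures above are easy to reproduce. A quick Python sketch of the arithmetic, assuming "MB" means 1024 x 1024 bytes (which is what makes 800x600x32 come out at 1.83) and using a made-up helper name:

```python
# Sketch: bytes per frame times frames per second, in MB (1024*1024 bytes).


def bandwidth_mb_per_s(width, height, bits_per_pixel, fps):
    frame_bytes = width * height * bits_per_pixel // 8
    return frame_bytes * fps / (1024 * 1024)


big = bandwidth_mb_per_s(800, 600, 32, 60)
small = bandwidth_mb_per_s(320, 200, 16, 60)
print(f"{big:.1f} MB/s, {small:.1f} MB/s")  # 109.9 MB/s, 7.3 MB/s
```

Plugging in the 640x480x16 mode used in the test programs earlier in the thread gives a number between the two, which can be compared against whatever a given machine actually manages.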
| ||
Erm... what about CreateImage()? Instead of using a "true" system memory buffer, use a system memory "image" that you draw to, and when finished use the Blitz CopyRect to transfer it to the video buffer? |
| ||
Just read that BlitzGL can also do 2D alpha sprites.... |