CopyRect: Is it really this slow?


gburgess(Posted 2004) [#1]
CopyRect 0,0,1024,1024,0,0,ImageBuffer(shadowim),TextureBuffer(shadowtex)

Every frame, my game needs to run this command. shadowim and shadowtex are a little on the big side at 1024x1024 each, but this single CopyRect command puts a serious dent in the frame rate. Taking out that one command speeds up execution massively.

The commands used to generate the image + texture are:
shadowtex=CreateTexture(1024,1024,8)
shadowim=CreateImage(1024,1024)


Not missing something obvious, am I? Any thoughts, anyone?


GfK(Posted 2004) [#2]
Try using flag 256 with CreateTexture. Should speed it up, although the size of the area you're copying is always going to be an issue.
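
A rough sketch of that suggestion, assuming the mipmap flag (8) from the original post is kept. Blitz3D texture flags are combined by adding them, so 256 would be added to the existing 8. Whether this helps or hurts appears to depend on the graphics card, so treat it as something to test rather than a guaranteed fix.

```blitzbasic
Graphics3D 800,600,16

; Same sizes as in the original post.
shadowim=CreateImage(1024,1024)

; 8 = mipmapped (from the original post), 256 = store the texture in VRAM.
shadowtex=CreateTexture(1024,1024,8+256)

; The per-frame copy itself is unchanged.
CopyRect 0,0,1024,1024,0,0,ImageBuffer(shadowim),TextureBuffer(shadowtex)

End
```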


gburgess(Posted 2004) [#3]
Thanks for the response, GfK. I've had mixed results with the 256 flag in the past. It actually slows down games on the GeForce2MX that I do some of my work on.


gburgess(Posted 2004) [#4]
Well that fixed it, anyway. Cheers! So it's normally that slow? I know it was a large area that I was copying, but even then, it's only a little bigger than a full XGA screen.


jfk EO-11110(Posted 2004) [#5]
It's slow because textures are involved. Compare the speed with some image-to-image or image-to-backbuffer operations: those are pretty fast. I made the underwater distortion effect in my latest demo using this CopyRect command.
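
One way to make that comparison is a small timing test along these lines. This is only a sketch: TimeCopy is a hypothetical helper, the 512x512 size and the run count are arbitrary choices, and nothing is rendered, so any cost that is deferred until RenderWorld is not measured.

```blitzbasic
Graphics3D 800,600,16
SetBuffer BackBuffer()

; 512x512 so the same rectangle also fits inside the 800x600 backbuffer.
size=512
runs=50

src=CreateImage(size,size)
dstim=CreateImage(size,size)
dsttex=CreateTexture(size,size)

toImage=TimeCopy(ImageBuffer(src),ImageBuffer(dstim),size,runs)
toBack=TimeCopy(ImageBuffer(src),BackBuffer(),size,runs)
toTexture=TimeCopy(ImageBuffer(src),TextureBuffer(dsttex),size,runs)

Cls
Text 0,0,"image -> image:      "+toImage+" ms for "+runs+" copies"
Text 0,20,"image -> backbuffer: "+toBack+" ms for "+runs+" copies"
Text 0,40,"image -> texture:    "+toTexture+" ms for "+runs+" copies"
Flip
WaitKey
End

; Hypothetical helper: total milliseconds taken by 'runs' identical CopyRect calls.
Function TimeCopy(src,dest,size,runs)
	start=MilliSecs()
	For i=1 To runs
		CopyRect 0,0,size,size,0,0,src,dest
	Next
	Return MilliSecs()-start
End Function
```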


Ross C(Posted 2004) [#6]
Hey Glenny-boy, how does this run?

Graphics3D 800,600,16

cube=CreateCube()
PositionEntity cube,-3,0,2

plane=CreatePlane()
EntityColor plane,40,40,200
MoveEntity plane,0,-2,0

planetex=CreateTexture(256,256)
SetBuffer TextureBuffer(planetex)
For loop=0 To 255 Step 2
	For loop1=0 To 255 Step 2
		Color 50,50,200
		Rect loop,loop1,1,1
		Color 20,20,200
		Rect loop+1,loop1,1,1
		Color 50,50,200
		Rect loop,loop1+1,1,1
		Color 20,20,200
		Rect loop+1,loop1+1,1,1
	Next
Next

SetBuffer BackBuffer()

ScaleTexture planetex,200,200
EntityTexture plane,planetex

sphere=CreateSphere()


camera=CreateCamera(sphere)
PositionEntity camera,0,3,-5
CameraRange camera,1,100

mirror=CreateCube()
PositionEntity mirror,0,3,10
ScaleEntity mirror,3,1.5,0.2

mirrorframe=CreateCube()
PositionEntity mirrorframe,0,3,10.2
ScaleEntity mirrorframe,3.3,1.7,0.3
RotateEntity mirrorframe,20,180,0
EntityColor mirrorframe,70,70,200

mirrorcam=CreateCamera(mirror)
PositionEntity mirrorcam,0,0,0
HideEntity mirrorcam
CameraRange mirrorcam,1,100
CameraZoom mirrorcam,3

mirrortex=CreateTexture(256,256,256)
EntityTexture mirror,mirrortex
ScaleTexture mirrortex,-1,1

light=CreateLight()

RotateEntity mirror,20,180,0

Color 255,255,255

While Not KeyHit(1)
	
	
	If KeyDown(200) Then MoveEntity sphere,0,0,0.1
	If KeyDown(208) Then MoveEntity sphere,0,0,-0.1
	If KeyDown(203) Then TurnEntity sphere,0,1,0
	If KeyDown(205) Then TurnEntity sphere,0,-1,0

	If MilliSecs()<timer+1000 Then
		frame=frame+1
	Else
		fps=frame
		frame=0
		timer=MilliSecs()
	End If
	Gosub updatemirror
	UpdateWorld
	RenderWorld
	Text 0,0,"fps="+fps
	Flip 0
Wend
End

.updatemirror
	HideEntity camera
	ShowEntity mirrorcam
	RenderWorld

	CopyRect 272,172,255,255,0,0,BackBuffer(),TextureBuffer(mirrortex)

	HideEntity mirrorcam
	ShowEntity camera
Return



gburgess(Posted 2004) [#7]
That seems to work perfectly fine, including on the GeForce2MX. I notice you're going from backbuffer to texture, whereas I'm going from imagebuffer to texturebuffer. Is imagebuffer -> texturebuffer vastly slower, then? Because there's nothing like the speed hit here that I seem to be getting.


Ross C(Posted 2004) [#8]
My texture is only 256x256. That might be the main reason. You could speed it up further by only copying to the texture every 2 or 3 frames.

Why are you copying from image to texture, if you don't mind me asking?
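
Updating the texture only every few frames could be sketched roughly like this. The interval UPDATE_EVERY and the throwaway scene (a spinning cube whose texture is refilled with a random colour) are assumptions for illustration only, not part of the mirror demo above.

```blitzbasic
Graphics3D 800,600,16
SetBuffer BackBuffer()

camera=CreateCamera()
light=CreateLight()
cube=CreateCube()
PositionEntity cube,0,0,5

im=CreateImage(256,256)
tex=CreateTexture(256,256)
EntityTexture cube,tex

Const UPDATE_EVERY=3   ; assumed interval: refresh the texture every 3rd frame
frame=0

While Not KeyHit(1)
	frame=frame+1

	If frame Mod UPDATE_EVERY=0 Then
		; Redraw the image, then copy it to the texture in one go.
		SetBuffer ImageBuffer(im)
		Color Rand(255),Rand(255),Rand(255)
		Rect 0,0,256,256
		SetBuffer BackBuffer()
		CopyRect 0,0,256,256,0,0,ImageBuffer(im),TextureBuffer(tex)
	End If

	TurnEntity cube,0,1,0
	RenderWorld
	Flip
Wend
End
```

The trade-off is that the texture lags the scene by up to UPDATE_EVERY-1 frames, which may or may not be noticeable for slow-moving content.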


gburgess(Posted 2004) [#9]
I tried increasing your texture size to be the same as mine, and it had almost no effect on speed.

Basically, for a simple shadows routine, I'm taking the position of every item that casts a shadow, turning its X and Z values into X and Y values on a 1024x1024 image, and plotting a crappy shadow blob there. Then I copy that image to a 1024x1024 texture that's mapped to the landscape. That's quicker than drawing each blob direct to the texture. It's a cheesy method, I know. :D
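
As a sketch of the routine described above: the landscape extents (worldMin, worldMax), the blob size, the pivots standing in for shadow casters, and the helper WorldToPixel are all assumptions made for illustration. Depending on how the texture is mapped to the landscape, the Z axis may also need flipping.

```blitzbasic
Graphics3D 800,600,16

Const TEXSIZE=1024
worldMin#=-100.0   ; assumed landscape extent on both X and Z
worldMax#=100.0

shadowim=CreateImage(TEXSIZE,TEXSIZE)
shadowtex=CreateTexture(TEXSIZE,TEXSIZE,8)

; Three stand-in shadow casters at random positions.
Dim caster(2)
For i=0 To 2
	caster(i)=CreatePivot()
	PositionEntity caster(i),Rnd(worldMin,worldMax),0,Rnd(worldMin,worldMax)
Next

; Clear the image to white, then plot one dark blob per caster.
SetBuffer ImageBuffer(shadowim)
ClsColor 255,255,255
Cls
Color 80,80,80
For i=0 To 2
	px=WorldToPixel(EntityX(caster(i)),worldMin,worldMax,TEXSIZE)
	py=WorldToPixel(EntityZ(caster(i)),worldMin,worldMax,TEXSIZE)
	Oval px-8,py-8,16,16,1
Next
SetBuffer BackBuffer()

; One bulk copy of the finished image into the texture.
CopyRect 0,0,TEXSIZE,TEXSIZE,0,0,ImageBuffer(shadowim),TextureBuffer(shadowtex)
End

; Hypothetical helper: map a world coordinate in [worldMin,worldMax] to a pixel in [0,size-1].
Function WorldToPixel(coord#,worldMin#,worldMax#,size)
	Return Int((coord-worldMin)/(worldMax-worldMin)*(size-1))
End Function
```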


Ross C(Posted 2004) [#10]
Why don't you try drawing them directly onto the texture? It's mainly reading from V-RAM that's slow :)


gburgess(Posted 2004) [#11]
That seems to be slower... and I don't do any reading from VRAM...?

I'm drawing the shadow blob to an image, then copying the whole image once to a texture. I was under the impression that one write to a texturebuffer was quicker than several smaller ones. Is that right?
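
Since the answer seems to vary by machine, the two approaches can be timed side by side. This is a rough sketch only: the blob count, blob size and frame count are assumed values, and MilliSecs() is coarse, so small differences should not be read into.

```blitzbasic
Graphics3D 800,600,16

blobs=200    ; assumed number of shadow blobs per frame
frames=20    ; assumed number of simulated frames per method
im=CreateImage(1024,1024)
tex=CreateTexture(1024,1024,8)

; Method 1: draw every blob straight into the texture buffer.
t0=MilliSecs()
For f=1 To frames
	SetBuffer TextureBuffer(tex)
	For i=1 To blobs
		Oval Rand(0,1007),Rand(0,1007),16,16,1
	Next
Next
t1=MilliSecs()

; Method 2: draw every blob into the image, then copy the image once per frame.
For f=1 To frames
	SetBuffer ImageBuffer(im)
	For i=1 To blobs
		Oval Rand(0,1007),Rand(0,1007),16,16,1
	Next
	CopyRect 0,0,1024,1024,0,0,ImageBuffer(im),TextureBuffer(tex)
Next
t2=MilliSecs()

SetBuffer BackBuffer()
Cls
Text 0,0,"direct to texture:    "+(t1-t0)+" ms for "+frames+" frames"
Text 0,20,"image + one CopyRect: "+(t2-t1)+" ms for "+frames+" frames"
Flip
WaitKey
End
```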


Ross C(Posted 2004) [#12]
Yeah, but you're writing to V-RAM, then reading from V-RAM, then writing to it again. Write to image, copy (read/write) to V-RAM.


gburgess(Posted 2004) [#13]
Sorry if I'm being really dense here: images aren't stored in VRAM, are they? I thought it was just textures? So when I draw the blobs, they're being written to an image in main memory, then CopyRect copies it once from memory to VRAM. Otherwise, I'd be doing several writes to VRAM?


Ross C(Posted 2004) [#14]
Images are stored in V-RAM, as are textures, meshes etc :) Only code, variables, arrays and types are really stored in main memory :)

When you draw blobs onto an image, you're writing to V-RAM.


jfk EO-11110(Posted 2004) [#15]
Images are not stored in VRAM. They are in conventional RAM, to make this clear. Run a loop that creates images continuously and print AvailVidMem()...
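
A minimal version of that test might look like the following. The image size and count are arbitrary choices. If the reported value drops as images are created, they are probably being placed in video memory on that machine; if it stays flat, probably not.

```blitzbasic
Graphics 640,480,16

Dim img(9)

Print "Start: "+AvailVidMem()+" free of "+TotalVidMem()

; Create ten images and report free video memory after each one.
For i=0 To 9
	img(i)=CreateImage(512,512)
	Print "After image "+(i+1)+": "+AvailVidMem()
Next

; Clean up.
For i=0 To 9
	FreeImage img(i)
Next

Print "After freeing: "+AvailVidMem()
WaitKey
End
```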


DJWoodgate(Posted 2004) [#16]
Really the docs should delve into this a bit more. I suspect there are differences in the way images are stored from system to system though, so maybe a definitive answer is not possible. I thought the common case was that images are stored in VRAM if there is space; otherwise they are created in system RAM. Textures are created in system RAM and copied to VRAM if they are used or modified, unless flag 256 is used to force them into VRAM from the outset. But as has been noted, flag 256 does not work on all systems and can make things slower. In any event, I was also under the impression that use of any 2D graphics command will stall the 3D hardware, which is yet another performance issue to take into account, quite apart from the amount of data that needs to be moved and where it is being moved from or to.


Ross C(Posted 2004) [#17]
Jfk, when I load images into Blitz, AvailVidMem() goes down. My RAM stays at the same level. Is this misinformation from Blitz then??


Yan(Posted 2004) [#18]
AFAIK in B2D/B3D images are stored in VRAM (as per DJWoodgate's post) and in B+ images are 'managed' (by default), much as textures are in B3D.

Perhaps that's the cause of the confusion?


YAN


gburgess(Posted 2004) [#19]
Well, I was going by what I thought I read in the Blitz3D help files, where it recommends writing to an image several times and then CopyRect-ing that image to a texture once. Maybe I misread or it's outta date or something.


jfk EO-11110(Posted 2004) [#20]
It seems DJWoodgate was right: it depends on the machine specs. On some machines Blitz3D stores images in conventional RAM (like on mine, at least in 2D graphics mode), and on others it stores them in VRAM.