Stuttering on Xbox

Raz(Posted 2011) [#1]
On a modest PC with Intel graphics, the game I am working on runs fine.

On the Xbox (built as release, not debug) it stutters, even on a very simple menu screen (4 buttons, 1 title, 1 background image).

Has anyone had a similar problem and managed to fix it?

Thanks
-Chris


luggage(Posted 2011) [#2]
I did a fair bit of XNA stuff without Monkey. The biggest problem was the garbage collector kicking in. As soon as your game allocates more than a certain amount, it will try to clear up older memory, and when it does this it is easy to see a stall. The only way around it was not to allocate any memory during the game.

The obvious problems were things like allocating a new bullet every time the player fires. To avoid this you create a list of all the bullets you think you need, give each one a flag like "isUsed". Then instead of using New to get a new bullet, you look through the list for one that isn't used and use that slot.
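
A minimal C# sketch of the pooling idea described above. The `Bullet` and `BulletPool` names, the fields and the pool size are hypothetical and not taken from any real project; the only point is that every `new` happens once at start-up, and firing just flips the `IsUsed` flag on an existing object.

```csharp
using System;

// Hypothetical bullet type; the fields are illustrative only.
public sealed class Bullet
{
    public bool IsUsed;
    public float X, Y;
}

// Fixed-size pool: all allocation happens in the constructor.
public sealed class BulletPool
{
    private readonly Bullet[] bullets;

    public BulletPool(int capacity)
    {
        bullets = new Bullet[capacity];
        for (int i = 0; i < capacity; i++)
            bullets[i] = new Bullet();
    }

    // Returns a free bullet, or null when every slot is in use.
    public Bullet Acquire(float x, float y)
    {
        // Indexed loop over an array; no enumerator object is involved.
        for (int i = 0; i < bullets.Length; i++)
        {
            Bullet b = bullets[i];
            if (!b.IsUsed)
            {
                b.IsUsed = true;
                b.X = x;
                b.Y = y;
                return b;
            }
        }
        return null;
    }

    // Marks the bullet as free so a later Acquire can reuse the slot.
    public void Release(Bullet b)
    {
        b.IsUsed = false;
    }
}
```

When the pool runs dry, `Acquire` returns null rather than growing, so the caller has to decide whether to skip the shot or recycle the oldest bullet.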

Using For-Each would also allocate, as well as messing with Strings and boxing types.

I'm not sure how Monkey would handle this; it would take some very careful coding.


Raz(Posted 2011) [#3]
Hi Luggage, thanks for your reply.

I too have done XNA stuff without Monkey and found that I had to use arrays instead of lists because eventually lists caused slowdown (so at the start of the app I would create 100 new bullets, 1000 particles and whatnot in an array and activate them as required).

Irritatingly, I am finding slowdown even before I get to an area of the game that uses lists, and then throughout the game the stuttering is consistent regardless of content (which leads me to believe the issue is above my code).

I'll keep looking to see what I find.


Raz(Posted 2011) [#4]
Here are two videos of the main menu in Ninjah.

1) On PC - http://www.youtube.com/watch?v=jGUBq9rG6-E

2) On Xbox - http://www.youtube.com/watch?v=ym4K0epGfeE

You will notice a brief pause on the Xbox 360 video.

Nothing is getting added or removed from a list at this time so I can't imagine there will be a garbage collection problem.

I am assuming it's an XNA thing and not a PC thing, but if anyone has fixed something similar I'd be very happy to hear about it.


MikeHart(Posted 2011) [#5]
Are you using DrawText there?

Edit: I like your "FPS" counter.


Raz(Posted 2011) [#6]
Hi Mike,

I am using Angel font to draw text (within the buttons). The title is an image.

I've created a new app that only contains the code from the main menu screen and the stuttering isn't happening.

I think I am going to have to add content bit by bit to see what's going on. I can't see how loaded content would have an impact on performance unless it was being used though, so I am at a loss.


Raz(Posted 2011) [#7]
All the reading I've done that relates to stuttering suggests that it's to do with garbage collection, and that lists can cause issues with Xbox performance.

I store my screens within a StringMap so maybe this is doing something. I've just seen that there is a remote performance monitor, so I am going to try that too.


MikeHart(Posted 2011) [#8]
Keep us informed about what you discover. It will certainly be interesting to hear what the real cause is.

It will take me around 2 months till I hit the XBOX with Monkey code. So your info and discoveries are valuable for me.


Raz(Posted 2011) [#9]
I think it's to do with strings (or bytes allocated to string). From what I can tell from a very early comparison, the more string operations I have, the more often GC is having to run.


Raz(Posted 2011) [#10]
I am pretty certain it's to do with strings. The more times you use a string, the worse it gets.

I just removed all instances of string code from the main menu and the stutter stops. Strings cause long garbage collection, long garbage collection causes stutter.


Raz(Posted 2011) [#11]
Actually, from what I can tell, just about everything is causing too much garbage for the GC to cope with :/


Raz(Posted 2011) [#12]
I've just made a test app that does absolutely nothing but draw and update X/Y coords of particles.

The update rate is set to 60
There is one image loaded (for the particle)
There is an array of particles (500) that gets created at start up. No additional particles are created during runtime. These particles are activated as required.
Directly, my code does not ask for additional memory at this point.
If the A button is down, activate a new particle (so max 60 newly activated particles a second)
If the particle reaches the age of 600 (with one update call counting as 1), deactivate the particle. Only active particles are drawn.

If I hold down the A button to create particles, to start with it's fine. However, once this goes over a certain number of active particles, the GC kicks in and the amount of bytes being collected by the GC skyrockets.

As far as I can tell the GC is having to work very hard right up until there are no more particles being drawn. Take away the DrawImage(gfxParticle,X,Y,0) call altogether and this GC doesn't kick in.

So, I am thinking that there is a limit somewhere in the way Monkey handles drawing via XNA code.

Are you able to offer any advice Mark?

Thanks
-Chris


luggage(Posted 2011) [#13]
I had a quick look at the XNA code that's spit out for my project. In the DrawImage function at the top it does...

bb_graphics_Frame bbt_f=bbt_image.bb_frames[bbt_frame];


bb_graphics_Frame being a class, I think this will allocate a new class and copy across the values to it. This will then have to be collected. In DrawQuad it also new's vertex definitions. This will also cause garbage.

There may be some boxing/unboxing of types somewhere in there as well.


Raz(Posted 2011) [#14]
Hi Luggage,

Using the above as an example, if instead of making a local bb_graphics_Frame object each function call, there was a static bb_graphics_Frame that got reassigned each time the function was called, would that help?


luggage(Posted 2011) [#15]
I've got to be honest, I'm not sure. It sounds like it might work, but I'd have to dust off my old XNA code to have a look at what I did while doing something similar.

I'd definitely look in the DrawQuad functions though, those 'news' will definitely be creating garbage.


Raz(Posted 2011) [#16]
Mark

I've been running the CLR Profiler against a very simple app that draws particles allocated on init within an array.

The function that is causing the most GC problems is

public virtual int DrawSurface( gxtkSurface surface,float x,float y )


And as luggage mentioned before, this seems to be because of the new's. In this function's case, new Vector2's and Vector3's.



I don't know too much about what's going on here. But is there any way you can avoid using new Vector3's and Vector2's?


Raz(Posted 2011) [#17]
I've also just done a test drawing a series of pre-allocated strings using DrawText() and this suffers from the same thing.




muddy_shoes(Posted 2011) [#18]
I'm not an XNA coder, but Vector2 and Vector3 are value types and arrays of value types aren't a big deal on the GC performance front. See this blog post: http://blogs.msdn.com/b/shawnhar/archive/2007/07/02/twin-paths-to-garbage-collector-nirvana.aspx .

Do the examples you've posted actually display the stuttering framerate problem? If they don't then they're not going to be of much use as investigative tools. If they do then can you post the code?


Raz(Posted 2011) [#19]
Hi muddy, yes they do, not to the extent of the game I am working on, but the stutter is still there. When running the remote performance monitor, I can see that lots of garbage is being generated and the GC is sometimes taking up to 240ms to run (each frame at 60fps has a total of 16.666ms to be ready).
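
To put those figures side by side, here is a small C# sketch of the arithmetic. The `FrameBudget` class and its method names are made up for illustration; only the 60 fps and 240 ms inputs come from the post above.

```csharp
using System;

public static class FrameBudget
{
    // Milliseconds available per frame at the given update rate.
    public static double MsPerFrame(int framesPerSecond)
    {
        return 1000.0 / framesPerSecond;
    }

    // How many whole frame intervals fit inside a pause of the given length.
    public static int FramesSpanned(double pauseMs, int framesPerSecond)
    {
        return (int)Math.Floor(pauseMs / MsPerFrame(framesPerSecond));
    }
}
```

With these inputs, `FrameBudget.FramesSpanned(240, 60)` comes out at 14, so a single 240 ms collection covers roughly fourteen frames' worth of time.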

I'm assuming there will never be a case when there's no garbage for a game generated through Monkey, but I've seen guides showing no garbage collection happening for an app that draws X amount of particles on screen.

I'm working with half an understanding of things here, but the Xbox is not good at garbage collection.

I'll post the code when I am next home.


Raz(Posted 2011) [#20]
The source for my test...

http://www.chrismingay.co.uk/monkey/XboxDrawText.zip

The GC kicks in after every 1 MB of allocation since the last collection. On PC it's fine. On Xbox, it stutters.

I ran a CLR Profile for Ninjah as well, and all of the allocations appear to be Vector2's and 3's.


Raz(Posted 2011) [#21]
Running the following code...

Import mojo

Class TestApp Extends App

	Field star:Image

	Method OnCreate()
		SetUpdateRate(60)
		star = LoadImage("particle_star.png",1,Image.DefaultFlags)
		Print "Started"
	End

	Method OnUpdate()
		If JoyDown(JOY_BACK)
			Error ""
		End
	End
	
	Method OnRender()
		Cls(128,128,128)
		SetColor(255,255,255)
		DrawLine(40,40,400,400)
		DrawLine(80,80,300,300)
		DrawRect(10,600,10,10)
		DrawImage(star,200,500,0)
	End

End

Function Main()
	New TestApp
End


gives the following allocations per 60 frames

At 60FPS
Description                       Allocs   Bytes
Nothing                           60       3840
CLS                               60       3840
CLS + SetColor                    60       3840
CLS + SC + DrawLine               120      6240
CLS + SC + DrawLine x 2           180      8640
CLS + SC + DLx2 + DrawRect        240      12480
CLS + SC + DLx2 + DR + DrawImage  360      19200


So doing a clear, a set color, 2 line draws, a rectangle draw and an image draw will cause the GC to kick in roughly every 54 seconds. Cls and SetColor do not allocate bytes, but all of the draw commands do, with DrawImage making twice as many allocations as the others.

So if 4 draw commands makes the GC kick in in about a minute, there really isn't too much hope as it stands for drawing a screen of tiles.
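
The 54-second figure can be reproduced from the table. This C# sketch assumes the 1 MB collection threshold mentioned in post #20 is 1,048,576 bytes, and that the table's byte counts per 60 frames at 60 fps can be read as bytes per second. The `GcInterval` class and its method names are hypothetical.

```csharp
using System;

public static class GcInterval
{
    // Assumed threshold: one collection per 1 MB (1,048,576 bytes) allocated.
    public const double ThresholdBytes = 1024.0 * 1024.0;

    // Seconds between collections for a given allocation rate.
    public static double SecondsBetweenCollections(double bytesPerSecond)
    {
        return ThresholdBytes / bytesPerSecond;
    }

    // Bytes allocated per call of one draw command, taken from the
    // difference between two adjacent table rows.
    public static double BytesPerCall(double bytesWith, double bytesWithout, int calls)
    {
        return (bytesWith - bytesWithout) / calls;
    }
}
```

For the last table row, `SecondsBetweenCollections(19200)` is about 54.6. The row differences work out to 40 bytes per DrawLine call, 64 per DrawRect and 112 per DrawImage.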


slenkar(Posted 2011) [#22]
So how about setting a Vector2 as a field of Image and reusing it each time?


Raz(Posted 2011) [#23]
That might work for images I guess, but I'm not sure how that'd work for rectangles and lines.

Anyway, I'm going to have a look today, not that I have a clue what I am doing really!


Raz(Posted 2011) [#24]
As one final check, I made an app directly in XNA that draws a single image.

The Monkey version leaks memory / causes GC.
The XNA version does not.

I consider this a bug that needs fixing.


clevijoki(Posted 2011) [#25]
So I just looked through the mojo.xna.cs and I noticed a few odd things.

The first thing is this function:

	
public virtual int DrawLine( float x1,float y1,float x2,float y2 )
{
	Vector3[] verts={
		new Vector3( x1,y1,0 ),
		new Vector3( x2,y2,0 ) };



So Vector3 is a struct so it should be on the stack, but perhaps it's still allocating the verts array of Vector3's on the heap? That could explain the mystery allocations. Every draw call seems to use this pattern.

Try adding these two functions to the DrawBuffer class
    public void DrawLine(Vector3 v0, Vector3 v1, Color color)
    {
        tverts[0] = new VertexPositionColorTexture(v0, color, texcoords00);
        tverts[1] = new VertexPositionColorTexture(v1, color, texcoords00);
        DrawLine(tverts, null);
    }

    public void DrawLine(Vector3 v0, Vector3 v1, Color color, Matrix matrix)
    {
        tverts[0] = new VertexPositionColorTexture(Vector3.Transform(v0, matrix), color, texcoords00);
        tverts[1] = new VertexPositionColorTexture(Vector3.Transform(v1, matrix), color, texcoords00);
        DrawLine(tverts, null);
    }


And replace the
public virtual int DrawLine( float x1,float y1,float x2,float y2 )
function with this one:
    public virtual int DrawLine(float x1, float y1, float x2, float y2)
    {
        if (tformed)
        {
            drawBuf.DrawLine(new Vector3(x1, y1, 0), new Vector3(x2,y2,0), color, matrix);
        }
        else
        {
            drawBuf.DrawLine(new Vector3(x1, y1, 0), new Vector3(x2,y2,0), color);
        }
        return 0;
    }


Then run your sample again. I don't have xna setup so I can't test it myself.

The second issue is that everything has unnecessary virtual calls. Virtual calls are more expensive and generally cannot be inlined, and unless a method is overridden it should not be virtual. The 360 has simple branch prediction, and virtual calls in C++ represent a branch prediction failure; most likely they do in C# too. Virtual should be removed from everything and the classes should be marked sealed.

The third thing is the weird tverts business, where it seems to be copying twice as much data as necessary. Unless there is something I am not understanding, the DrawBuffer class should be rewritten to remove this entirely and it'd do half as much work.


clevijoki(Posted 2011) [#26]
Yeah I'm 95% sure this is the issue, all those arrays are being allocated on the heap. If you look at the profiler shot in #16, it's not allocating Vector3's it's allocating Vector3[]'s.
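
A self-contained C# sketch of the difference being described. `Vec3` is a hypothetical stand-in for XNA's `Vector3` so that the snippet compiles without XNA, and `LineVerts` is not part of mojo. It only contrasts an array initialiser inside a method (a new heap array on every call) with writing into a buffer that was allocated once.

```csharp
using System;

// Hypothetical stand-in for XNA's Vector3 (a struct, i.e. a value type).
public struct Vec3
{
    public float X, Y, Z;
    public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public sealed class LineVerts
{
    // Allocated once; reused by every call to Reused().
    private readonly Vec3[] buffer = new Vec3[2];

    // Same shape as the original DrawLine: the structs are plain values,
    // but the array holding them is a fresh heap object on every call.
    public Vec3[] PerCall(float x1, float y1, float x2, float y2)
    {
        Vec3[] verts = {
            new Vec3(x1, y1, 0),
            new Vec3(x2, y2, 0) };
        return verts;
    }

    // Same data written into the preallocated buffer: no new array.
    public Vec3[] Reused(float x1, float y1, float x2, float y2)
    {
        buffer[0] = new Vec3(x1, y1, 0);
        buffer[1] = new Vec3(x2, y2, 0);
        return buffer;
    }
}
```

Two calls to `PerCall` hand back two different array objects, while two calls to `Reused` hand back the same one, which is why the second pattern leaves nothing for the collector.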


Raz(Posted 2011) [#27]
Hi Clev, thanks for looking into this, it's very much appreciated :)

I'll try your changes when I get home today.

I see now I didn't word things properly, but yes I found it to be Vector2[]'s and Vector3[]'s

I'm gonna go read up on virtual functions so I can get my head round it.


Raz(Posted 2011) [#28]
Clevijoki, I added your fix and then applied the logic you used to the DrawQuad, DrawSurface and DrawSurface2 functions and the Vector2[]/Vector3[] allocations have stopped :)



The use of strings is still causing quite a bit of junk, but I may be able to do without them.

Thanks again for looking into this, Clev!


Raz(Posted 2011) [#29]
I updated DrawSurface and DrawSurface2 to be the following

public virtual int DrawSurface( gxtkSurface surface,float x,float y ){
		const float HALF=0;//.5f;
		float w=surface.Width();
		float h=surface.Height();
		float u0=HALF/w,v0=HALF/h;
		float u1=(w+HALF)/w,v1=(h+HALF)/h;
		
		/*Vector3[] verts={
			new Vector3( x,y,0 ),
			new Vector3( x+w,y,0 ),
			new Vector3( x+w,y+h,0 ),
			new Vector3( x,y+h,0 ) };
		Vector2[] texcoords={
			new Vector2( u0,v0 ),
			new Vector2( u1,v0 ),
			new Vector2( u1,v1 ),
			new Vector2( u0,v1 ) };*/
			
		if( tformed ){
            drawBuf.DrawQuad(color, surface.texture, matrix, x, y, w, h, u0, u1, v0, v1);
		}else{
			drawBuf.DrawQuad( color,surface.texture, x, y, w, h, u0, u1, v0, v1 );
		}

		return 0;
	}

	public virtual int DrawSurface2( gxtkSurface surface,float x,float y,int srcx,int srcy,int srcw,int srch ){
		const float HALF=0;//.5f;
		float w=surface.Width();
		float h=surface.Height();
		float u0=(float)(srcx+HALF)/w;
		float v0=(float)(srcy+HALF)/h;
		float u1=(float)(srcx+srcw+HALF)/w;
		float v1=(float)(srcy+srch+HALF)/h;

		/*Vector3[] verts={
			new Vector3( x,y,0 ),
			new Vector3( x+srcw,y,0 ),
			new Vector3( x+srcw,y+srch,0 ),
			new Vector3( x,y+srch,0 ) };
		Vector2[] texcoords={
			new Vector2( u0,v0 ),
			new Vector2( u1,v0 ),
			new Vector2( u1,v1 ),
			new Vector2( u0,v1 ) };*/

		if( tformed ){
            drawBuf.DrawQuad(color, surface.texture, matrix, x, y, srcw, srch, u0, u1, v0, v1);
		}else{
            drawBuf.DrawQuad(color, surface.texture, x, y, srcw, srch, u0, u1, v0, v1);
		}
		return 0;
	}


and I added the following functions to the DrawBuffer class

public void DrawQuad(Color color, Texture2D texture, float x, float y, float w, float h, float u0, float u1, float v0, float v1)
    {

        tverts[0] = new VertexPositionColorTexture(new Vector3(x,y,0), color, new Vector2(u0,v0));
        tverts[1] = new VertexPositionColorTexture(new Vector3(x + w, y, 0), color, new Vector2(u1, v0));
        tverts[2] = new VertexPositionColorTexture(new Vector3(x + w, y + h, 0), color, new Vector2(u1, v1));
        tverts[3] = new VertexPositionColorTexture(new Vector3(x, y + h, 0), color, new Vector2(u0, v1));
        DrawQuad(tverts, texture);
    }
	
	public void DrawQuad(Color color, Texture2D texture, Matrix matrix, float x, float y, float w, float h, float u0, float u1, float v0, float v1)
    {
        tverts[0] = new VertexPositionColorTexture(Vector3.Transform(new Vector3(x, y, 0), matrix), color, new Vector2(u0, v0));
        tverts[1] = new VertexPositionColorTexture(Vector3.Transform(new Vector3(x + w, y, 0), matrix), color, new Vector2(u1, v0));
        tverts[2] = new VertexPositionColorTexture(Vector3.Transform(new Vector3(x + w, y + h, 0), matrix), color, new Vector2(u1, v1));
        tverts[3] = new VertexPositionColorTexture(Vector3.Transform(new Vector3(x, y + h, 0), matrix), color, new Vector2(u0, v1));
        DrawQuad(tverts, texture);
    }



marksibly(Posted 2011) [#30]
Hi,

Ok, I've rewritten large chunks of the XNA graphics stuff so it *should* be much faster!

The new version (theoretically) performs no allocations (although XNA probably does some behind the scenes) and eliminates all copying.

Please see the new 'experimental' section of the product updates page to download a test version.


dopeyrulz(Posted 2011) [#31]
Mark,

These changes affect Windows Phone as well?

Will try on both my Xbox and WP.


marksibly(Posted 2011) [#32]
Hi,

> These changes affect Windows Phone as well?

Yep.


dopeyrulz(Posted 2011) [#33]
Ok - thanks. Will have a play and report back


Raz(Posted 2011) [#34]
Well done Mark. Angelfont and converting an Int32 to a String still create junk (which is expected), but if I disable any code to do with the runtime processing of strings there are practically no heap allocations :)
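
One possible way to avoid the Int32-to-String garbage for something like a score or FPS counter is to write the digits into a reusable character buffer instead of building a string. A minimal sketch: `DigitBuffer` and its members are hypothetical, and how the characters would then be drawn (for example one glyph at a time through a bitmap font) is left out.

```csharp
using System;

public sealed class DigitBuffer
{
    // Large enough for "-2147483648" (11 characters). Allocated once.
    private readonly char[] chars = new char[11];

    public char[] Chars { get { return chars; } }

    // Writes the decimal digits of value into Chars and returns the length.
    // No string is created.
    public int Write(int value)
    {
        long v = value;            // long, so negating int.MinValue cannot overflow
        bool negative = v < 0;
        if (negative) v = -v;

        // Produce digits right to left at the end of the buffer.
        int pos = chars.Length;
        do
        {
            chars[--pos] = (char)('0' + (int)(v % 10));
            v /= 10;
        } while (v > 0);
        if (negative) chars[--pos] = '-';

        // Shift the result to the start of the buffer.
        int length = chars.Length - pos;
        Array.Copy(chars, pos, chars, 0, length);
        return length;
    }
}
```

The caller reads `Chars[0]` up to `Chars[length - 1]` and must finish with them before the next `Write`, since the buffer is shared.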



On the Xbox everything runs smoothly once again, so thank you very much for fixing this.


clevijoki(Posted 2011) [#35]
Something that *might* reduce the GC impacts of strings is the C# 'using' keyword:

http://msdn.microsoft.com/en-us/library/yh598w02%28v=vs.80%29.aspx

They may still show up as heap allocations, but should get disposed of immediately, meaning they won't count towards a GC 'hitch'.


marksibly(Posted 2011) [#36]
Hi,

Ok, just uploaded a new version that should get rid of the SetScissor allocation.


MikeHart(Posted 2011) [#37]
That is what I call great support!


Raz(Posted 2011) [#38]
Fantastic :) thank you for the fixes


dopeyrulz(Posted 2011) [#39]
Haven't had a chance yet to take a look - hopefully today. Kids back to school!