Tile Engine Considerations

BlitzMax Forums/BlitzMax Programming/Tile Engine Considerations

SculptureOfSoul(Posted 2006) [#1]
Hello everyone. First post here, as I just purchased BlitzMax today, 15 days into my demo trial! :)

Anyways, I'm fairly new to real programming (used to use Clickteams Click & Create and Multimedia Fusion a few years back) and I'm not really sure if the approach I'm taking with my tile engine is particularly well crafted or efficient. Any suggestions or alterations that could help maximize the efficiency of design would be greatly appreciated!

I apologize in advance for the length of this - I figure it's just easier to explain the whole approach and how it's integrated than it is to take out bits and pieces that might not make much sense outside of the framework.

My tile engine design as of now...

A tile engine that:

Can theoretically have an unlimited number of layers (right now it's limited to 255 since I'm using a byte to represent it's layer number), and each tile can utilize from 1 to the max number of layers. (i.e. a spot on the map that only features empty grass might only need one layer, while the tile next to it might feature 4 layers...the grass, some rocks, a barrel, and a roof overhang)

Dynamically loads and unloads map regions and tilesets. If desired, I'd like it so that you never need to exit one map and load another.

Basically at all times I've got 9 regions (a 3x3 array) of the world map loaded. So for instance, if you move far enough left, it loads the new data into the rightmost column of regions and then treats the leftmost region as the center region, the center region as the right region, and the right region as the left region. (Sorry if this doesn't make a lot of sense, it's hard to explain concisely).

Basically, the horizontal and vertical centers are adjusted so that any column or any row can be 'considered' the current center - this way no data copying is ever needed - the new regions that need to be loaded are simply loaded into the region row or column that is 'farthest' away from the player.

Streaming Loads: I found out early on trying to fully load a new region, including the tilesets and such, would cause a momentary pause (just a split second, but still noticeable).

I thought of 'buffering' the load command and waiting til the player wasn't moving or went to a menu to load, but this still had the problem of a moment of sluggishness if the player tried to start moving while the load was occurring.

Instead I've decided to stream the load in over the course of a couple seconds. Once the player gets sufficiently close to the edge of a region (distance configureable), I start streaming the necessary tilesets into separate banks, and also start streaming the map file into separate banks. Then, after the load is complete I create the necessary region out of the map bank, and the tiles out of the tileset banks. Actually, this leads me into a question regarding my approach w/ the tiles themselves.

Some preliminary tests, as well as checking out these boards, verified just how much faster it is to deal with images than it is to deal with pixmaps. Well, for convenience I still wanted to utilize tilesets, but I wanted the speed of images.

My current approach involves loading the tileset image as a pixmap, and then using PixMapWindow to create pixmaps for each tile in the tileset. Then I create an image from the tilesized pixmap. I stash all of these individual tiles into an array whose size I calculate from the tileset image.

As mentioned I wanted each spot on the map to have any number of tiles. Also, I didn't want things like alpha level, layer level, etc, to be hardcoded into the tile, so each "map tile" contains not only a "tile" object that stores the actual image, but also stores information about how that tile should be drawn. The current approach would allow, for instance, a particular tile at X:40,Y:20 to be animated and set to 50% transparcency and not an obstacle, while the same tile image could be found elsewhere on the map unanimated (or animated at a different speed, whatever) at 0% transparency, and act as an obstacle.

The downside to this is that a fair amount of extra data is stored in this tile container class. In fact, large maps (1000x1000 or so) can get ridiculously large, and this is another reason why I've gone with dynamic loading.

Well, that's the basic design, although I've left out a lot of the intricacies of how it's all connected.

At 640x480, it only takes approx 335 tiles to draw a single layer on screen (assuming partial tiles are showing). In my early tests I was able to get 5000-6000 tiles while still getting 50+ fps, and was still getting 30 fps or so at 12000+ tiles. 5000 tiles = an average of 15 layers of tiles per each map location - which is obviously a ton. I imagine most spaces on the map I could get away with only 3 layers, possibly going up to 5 or 6 layers in a few spots. That's why I designed the engine so that a variable number of layers can exist per tile location - if I don't need 5 or 6 layers except in a few spots...why should the whole map have those extra layers?

Anyways - my early tests were conducted before all of the framework was set up. The framework introduces a bit more overhead, but at this point I'm not sure how much as I don't have the framework complete enough to do any accurate tests yet.

One thing I'm really unsure about is how to sort my tiles for proper drawing order. As I see it there's two approaches. The approach I'm planning on implementing goes like this:

The game looks at the currently loaded regions, and then creates an 1d array with the dimension being equal to the max number of layers needed in the current regions. So, if one particularly loaded tile spot has 15 layers, the array is 15 elements long.

The array itself is an array of lists. The lists are going to store these maptile objects (which contain the tile image and how it is drawn).

Anyhow, the game goes through each tile location needed for the screen(going row by row instead of column by column). At each tile location, it looks at each tile and adds it to the end of the list in the list array. So the pseudo code looks like

if tile.tilelayer = 2 then Array[2].ListAddlast(Tile).

So, after this is done, each list in the array should contain all the tiles in that layer, and by default they'll be Y sorted (ascending...since the tiles were added row by row, starting at the top of the screen).

Then I iterate through each list in the array and draw the tiles. This will (theoretically...I don't have it all coded yet) draw all of the tiles with the higher layer tiles overlapping lower level tiles, and the tiles will be Y sorted as well (so a player's avater will be drawn "over" a building if he is standing in front of (below) it).

The other approach I was considering was just straight iterating through each spot on the map and drawing all the tiles while adding them to an OpenGL depth buffer, and letting it take care of the sorting for me. I really don't know anything about OpenGL though, except that from what I've read you can't really properly use the depth buffer with partially transparent textures. If that's the case, then that idea is right out.

Anyhow, like I said, I'm really a novice to all this stuff and I think this design is a bit beyond my abilities but it's been fun designing and implementing it. I haven't hit any major roadblocks at this point - I'm just concerned about speed issues.

Has anyone had any experience using a depth buffer for a 2D game?

Actually, this still hasn't really explored all of the framework. Explaining all of the details would result in this excessively long post becoming RIDICULOUSLY long =).

I guess if anyone is really interested or anything you can contact me to further discuss this. As it stands though, If you notice anything particularly foolish about my approach, PLEASE let me know.

Thanks,
SoS

SculptureOfSoul@...


Cajun17(Posted 2006) [#2]
I had a similar problem with my game. In my case it was more troublesome because I wanted to be able to change the layer an object was on so I could get effects like particles orbiting around a sprite. I debated using the depth buffer, but that's a little much for just sorting rectangles.

In any tile based game(that I can think of at the moment) all the layers are just images. Each location can only have one set of data / logic so I separate the two. When I update I think of the tiles as a 1 layer 2d array and when I render the screen I think of the map as just a huge pool of objects that are sorted by some Z value.

If this makes some kind of sense and you want me to go into the details let me know.


ImaginaryHuman(Posted 2006) [#3]
It sounds like a pretty solid approach over all.

One idea comes to mind, in that you are letting each tile store animation and transparency data separately. I presume you do that for ALL tiles for each used layer. If you think about it, there *probably* aren't going to be that many variations of how you use a particular tile image. You might have, say, used it with 20% transparency unanimated, 50% transparency gradually animated, solid with no animation, etc. Why not have your tiles refer to `animation objects` rather than a given image, so that you only have to store the transparency/animation info once for each way that you're using a given image. Then you don't have to store animation and transparency data separately for EVERY tile. It will save memory. Your `animation object` will then describe what image it used, the transparency, and the animation. Then you can have separate image objects in some array or list that the animation objects refer to.

I think you may also get a smoother update for multiple regions if you implement what is basically a tilemap scrolling technique. Ie, as you move around the map, you spool in one strip of tile data that covers the same amount of area that you scrolled. And if it takes you say 8 frames to scroll 32 pixels then also divide up the spooling into 8 stages so it only needs to load in a few tiles per frame. This is how old-school tile engines used to keep the number of new tiles that need to be drawn per frame to a minimum. You would of course be constantly spooling whenever you scroll but the amount would be small enough that it should be smooth. As for loading in new tile image data, you could do that based on proximity to the new region, ie as you scroll closer to it you spool in more images, as you move away you spool in images for the opposite region.

I think also you might rethink your use of images. Having one Image per tile, presuming your tiles are pretty small, you are wasting a heck of a lot of video ram. You need to get a whole tileset into a single or a few large textures and then draw only rectangles from that texture to draw individual tiles. This is much more efficient for rendering speed and memory use - less memory has to be allocated, plus if you don't have to keep switching to a different texture/image for each tile it cuts down a LOT of overhead and should speed up your rendering.

Also here's another idea that you could implement .... if some areas of the tilemap are identical to others, you don't need to store them multiple times. Just have a simple relative `reference` offset for each tile, which instead of taking tile data from that tile, it takes it from the tile at the given X,Y offset (or a single offset value if you prefer). So then say if you have a 10x10 tile area which is identical in a few parts of the game, you can just store it once in its full state with all the animation data and layers, etc, then have a 1-layer 10x10 tile area elsewhere which is more or less empty and just says `use the tile data from that other tile area over there`. What this also means is that if you then edit that original 10x10 `master` tile group, all other areas that tap into that group are automatically updated.

Think about how spreadsheet software works and how it lets you make formulas and have areas automatically update etc.


SculptureOfSoul(Posted 2006) [#4]
Hi Daniel,

What (I think) I'm doing is using one image per tile, but then that single image is getting drawn wherever the tile is on screen. So, for instance, I have "Water Tile 42" or whatever, and if there are 50 of them on screen, they are all referring to that single image.

Regarding storing alpha data and animation data etc...well, I'm not storing it in the tile, per se. See, I've got my base TTile class. ALL this class does is store it's associated image.

Then I have a TTileSet class. This class, when it is first instantiated, loads a tileset into a pixmap, and then creates individual TTile objects for each tile in the pixmap and stores the image in the TTile. The TTile is then stored in an TileArray:TTile. The position in the array that it is stored at is dependent on it's position in the Tileset. I do a normal 'column scan' through the pixmap, and the first tile created is stored at TileArray[1], second tile at [2], and so on. Also, the location in the array is ALSO the tiles "ID" number, which I will explain in a bit. Basically I use the ID as a lookup into the array.

Next I have a TTileMap 'container' class. This particular class holds info about a tile (but does not hold the actual image). It holds a TileID (which is used to get the image from the tile array mentioned above), a layer value (depth value), and flags regarding if it's an obstacle, a stairway (which causes the Y value handling to be different), etc.

As mentioned, I don't actually store a TTile object in this container class though, instead the class uses the TileID to refer to a tile in a Tilearray. This container class also contains a Draw method, which in this particular class calls draw image on the Tile object that it looks up.

I've then extended my TTileMap container class into other classes, such as a TAnimatedTileMap. This class holds extra info about the animation info, alpha info, etc, and overrides the Draw method to utilize the alpha info and such.

As far as I know this approach involves only using one copy of any particular tile in the tileset regardless of how many times that tile appears on screen. The actual tiles are owned by a tileset object...stored in a 1D array. Their TileID represents their location in that array - and that TileID is the what the TMapTile object holds. So in regards to the master map file - each X,Y location holds a series of "MapTiles" which hold TileID's which are utilized as a reference to the actual Tile image. Sorry for basically repeating that a couple of times, but I'm not sure if I'm making it clear what I'm doing.

The only reason I decided to break up my tile map pixmap objects into individual tile sized images is because I didn't see any way to draw only a partial rectangle of an image. It's of course easy to do with pixmaps, but ridiculously slow when we're talking a potential 1200 tiles or more per screen.

Is there an easy way to draw a partial rectangle of a large image within Bmax, or perhaps with direct calls to OpenGL? And if not, is there any difference in the amount of video ram that will be taken up if I have say, 100 32x32 images verses one 320x320 tileset? I was inclined to believe there wasn't, (minus perhaps a negligible amount of overhead) - and since I didn't see any easy way to draw partial images, I decided to break the tileset pixmap into images when I first load the tileset. (afterwards the pixmap is discarded and the individual images, the tiles, are stashed in an array (this allows direct access when the maptiles are referencing them with the TileID)).

The only reason I'm not drawing individual tile rows or columns as they appear on screen is because when it so happens that one of the newly uncovered tiles contains a tile in a tileset not currently loaded, the process of loading that tileset and then drawing the tiles causes a slight hiccup. Also, eventually I plan to give the player a means of rapid transportation, and then the above problem would simply be exacerbated. I could of course 'pre-screen' a rectangular area of the map larger than the screen area and check to see if any tilesets need to be loaded and then do so before the character gets there. But that's not much different than the region approach I have now - and that has the benefit of having the map pre-loaded as well.

The other thing that has me worried is that I will no doubt have certain tile locations that have 6-10 layers, and if the game suddenly had to load and draw say, 35 tile spaces (20 wide and 15 tall in 640x480 when using 32x32 tiles...this might happen if the player was moving diagonally, for instance) each with say 6 tiles...well, again I imagine the game would hiccup a bit.

I really appreciate your feedback and insight though, it's given me a lot to think about. I guess what makes it really difficult is that the framework I'm building is really quite large, and it's the kind of framework that it'll be hard to test 'realworld results' until I have most of it implemented. I'm trying to plan for it to be as optimized as possible while still being as flexible as possible. I want the engine to offer as few constraints as possible, as I plan on using this thing for many other 2D projects down the road.

Thanks again - let me know if any of this made sense and if I'm right or (probably) way off with some of the assumptions I've made.


SculptureOfSoul(Posted 2006) [#5]
Well, I split a 512x512 object into individual 32x32 tiles, and then generated a random map 1000x1000 tiles long. Video Memory Watcher is reporting that I'm only using just over 2 megs of VRAM, which sounds exactly about right since 512x512x4bpp = 1,048,576. I assume the rest is just overhead or reserved by BMax?

That number quadruples to over 8 megs of VRAm when I'm in full screen mode though. Any idea why that is?

I should mention that I'm using the OpenGL drivers. When I switch to DirectX the vram usage is nearly cut in half, but the framerate drops a bit too. My gfx card is a GeForce 6600 Gt.

I'll admit, I'm pretty clueless when it comes to hardware and hardware interaction :(.


Cajun17(Posted 2006) [#6]
Space is allocated for the backbuffer as well. Assuming a 640x480x4bpp resolution that puts you at 1,288k + 1048k = just over 2 megs. Not sure why it jumps in full screen mode though.


SculptureOfSoul(Posted 2006) [#7]
Cajun, I'd be interested to hear you elaborate more on your approach if you've got the time.

I think I follow what you were saying, but at the same time I don't think I can utilize that approach given my engine goals.

For instance, while I could treat a layer as a 2d array the size of the screen, that becomes problematic when one tile space has 8 tiles, but the rest of the tile spaces only need 3.

[edit] - the main issue being the need to iterate across the whole map 8 times when only 1 square has 8 tiles and the rest have 3. Assuming an average of 400 tiles per screen, that's roughly 2000 (5 passes) of wasted 'checks'.

This wouldn't be a problem if I confined all tiles in the game to a specific size and didn't allow freeform movement. If I knew everything would fit nicely in it's current tile location, I could just loop through my map and draw each tile at each tile location without regard to any other tile location and the layers needed there. But, since I need to allow multiple sized tiles that won't work.

Let's say at 0,0 I've got a grass tile and a bird tile. The grass is at a layer depth of 0 (background) and the bird is at say 5. (not a 5th layer I guess, just a value representing it's pseudo Z value)

Now, lets say the player is 2 tiles tall, and is at layer depth 3. Also, he's standing at (0,1). If I just zoom through each tile location and draw all of the tiles in that spot, the bird would disappear behind the player in this case.

I guess I'm not really utilizing "layers" at all, per se, but instead pseudo layers. The real reason I need this functionality is I want the game to seamlessly allow the following, for instance:

A player walks into an abandoned house. Instead of loading a new map, the ceiling tiles instead are drawn at 75% transparency and the inner tiles of the houses first floor are now shown. The player walks to the corner of the room and descends a staircase into the basement. Upon reaching the bottom the ceiling tiles have now disappeared and the first floor tiles (the new "ceiling" ) are now drawn at 75% transparency.

Assuming that the basement has 2 pseudo layers... objects that are drawn over the background tiles and the background tiles themselves and assuming the first floor and the roof of the house was the same - well, that right there is essentially '6 layers'. There might be another pseudo layer above the roof of the house too, that holds clouds or some kind of truly 'foreground' images.

My approach as it stands now is to iterate through each spot on the map and toss each tile into a list that is assigned to a specific 'pseudo layer'.

i.e.
List0 holds all objects with a layer val of 0
List 1 holds all objects with a layer val of 1

Then I can just zoom through the lists sequentially and draw each tile - knowing that they'll be Y sorted (I'm going to go across my map row by row when inserting into the list) and also depth sorted (since the first list represents the lowest layer, the next list the second lowest, etc).


SculptureOfSoul(Posted 2006) [#8]
Duh...the backbuffer. Thanks! :)

I've been up far too long (nightshift worker and it's... 2:30pm here, way past my bed time). I've got to get to sleep.

Looking forward to a reply.


tonyg(Posted 2006) [#9]
Not sure I follow completely but, if a player goes below ground level, I would stop drawing anything outside the house.


Cajun17(Posted 2006) [#10]
First of all why are you using pixmaps? They're much slower than Textures (TImage). Also you don't have to cut them up yourself.

LoadAnimImage("tiles.png",32,32,0,64) will cut a 512x512 file into 64 32x32 pieces.
DrawImage(tileImage,x,y,5) will draw the 6th tile (0 is 1st) at x,y


Anyway, my system is lower level than a tile engine. Think of it only as a software z buffer. I designed it to allow objects to change their depth easily without having to sort a big list of objects every loop.

It works by creating a ZBuffer object (there are a few kinds of buffers) and filling it with objects that extend the ZObject type. There are basic objects for all the shapes and images(ZRect,ZImage,etc.) that you can use or you can create a more complex object yourself.

The ZObjects can be handled in one of three ways. Staticly, Dynamicly, or Streamed.
Static objects aren't sorted before being drawn. They're sorted only when a buffer is in an 'unlocked' state and upon the locking of the buffer. Terrain / tiles make good static objects.
Dynamic objects are sorted before every draw. These are the whole point of the system. By keeping them separate they can be sorted that much faster.
Both dynamic and static objects remain in the buffer until freed. Static images won't be removed until the buffer is unlocked, although they won't be drawn either.
Streamed objects are removed from the buffer once they're drawn.
I can't think of a reason to use all three types at once, but you can.

Buffers extend ZObject so you can further segregate your objects. For example all GUI or HUD objects would always be above all game content so you can create a buffer for your GUI with a very large Z, make it static and place it in the main buffer. Then your GUI objects would only be sorted relative to each other.

That's it in a nutshell. Hopefully I explained it well enough. In practice it's easier to use than it is to explain or understand. The easiest way to use it is with all streamed objects and use only the main buffer.

Example:
'initialize max2d

ZDraw.Initialize(_STREAMED_)
ZDraw.SetBuffer(ZDraw.BackBuffer())

'from here just replace draw commands
ZDraw.Rect(100,100,30,30,z)
ZDraw.Oval(200,100,20,20,z)
'Text, image, plot, line etc...

ZDraw.Flip()

I can get close to 1000 images at 60fps in this manner. By managing the buffers myself using only static and dynamic objects I can get between 4000 and 5000 depending on how many of each type there are.

Hopefully that made more sense. I do plan on releasing this module once I optimize, tweak and document it.


SculptureOfSoul(Posted 2006) [#11]
Regarding LoadAnimImage...

When I first read that I thought - well, duh, that's obvious. So whipped up a test to compare LoadAnimImage versus loading a pixmap and then loading images from it using Pixmap window.

Strangely, there was really not much difference in speed.
In fact, on this old computer at work the pixmap method was faster.

On a 512x512 tileset:

LoadAnimImage took: 122 ms
LoadPixmap took: 82 ms
LoadImage from Pixmapwindow looped 255 times took: 3 ms

So the pixmap method was 37 ms faster. I tested this multiple times and it always turned out to be faster, strangely.

This test stirred up a revelation though. Both 85ms or 122ms are just too long to load a single tileset. There are times when I might need to load 2 or 3 tilesets, and I'd really prefer not to have any frame-rate hiccups while doing so. I thought that I could stream the pixmap into a bank and then load it from the bank instead of straight from disk, but that didn't seem to speed things up at all.

So pre-streaming the tilesets isn't going to get rid of the bottleneck since the bottleneck is when I actually call LoadAnimImage or LoadPixmap.

I don't know how to get around this. I tried loading a single tile sized image (with the idea that if this was considerably faster I could just load a single tile every frame or couple of frames) - but this is only marginally faster than loading the whole tileset. Loading each tile this way would take 25 times longer than loading the whole tileset - so that's a no go.

Loading from banks doesn't seem to be any faster than loading from disk, either. So pre-streaming individual tiles into banks to be loaded doesn't help, just as it didn't w/ full tilesets.

Incbinning speeds things up but I'd really prefer not to go that way.

The only other idea I've got is to try and utilize a Ram stream, although I really don't know if that's possible. I'm going to go play around with that for a bit.

Your software z-buffer approach is interesting. I've got some thoughts and questions that I'll toss around later, after I get some more tests in.


SculptureOfSoul(Posted 2006) [#12]
Tada - LoadAnimImage called on the same image now loaded into a RAMStream took not 122ms (or 200+ as it did in some of my tests), but instead took only 5.

Woot! I love it when things work like you hope they will :).
(Which is definitely not often enough when you're a noob programmer, hehe)


SculptureOfSoul(Posted 2006) [#13]
Okay, silly me.

Turned out that loading the image from the RamStream was fast...but streaming the data from the file into the RamStream was really slow (took about 700 ms). While I could simply space the streaming out over a couple of seconds, I thought - "Well, why don't I just try loading direct from the file stream."

Not sure why but I didn't think I could do that and hadn't tried up to that point. Turns out that loading from the file stream is just as fast as loading from the RamStream, and doesn't incur the overhead cost of moving the data from the file to the Ramstream.

So now, loading direct from the filestream, I'm getting a load time of about 5-20 ms (the exact range I was getting with the ramstream).

I'm not sure if this is well known or not but for those who aren't aware, it appears that you can cut down your load times drastically ( on this comp, comparing best case speeds and worst case speeds, it was about 10-15 times faster) if you call LoadImage, LoadAnimImage, or LoadPixmap directly on a filestream, instead of simply specifying the file path.


SculptureOfSoul(Posted 2006) [#14]
Shoot. Turns out that as soon as I commented out the Ramstream code, loading the image from the filestream took just as long as loading it from the file path.

I'm guessing that it's a cache thing. When I had the Ramstream load the file, the subsequent call to LoadImage( filestream ) was pulling the cached file data?

Well, looks like I'm back to using RamStreams. Strangely, loading the file into the RamStream seems to be taking much less time now.

I can't wait to get home and try this on a computer that isn't 10 years old. =)