Would it be fast to use a large texture to render the entire background of a level?

I’m writing a game engine for an existing older game. The backgrounds for each level are static sprites, split into a large number of small tiles. Instead of reassembling them each frame, I’m thinking of rendering the entire level to one texture once, when the level loads, and then drawing a sub-rectangle out of it wherever the camera currently is.
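To make the "sub-rectangle wherever the camera is" idea concrete, here is a minimal, language-neutral sketch of the rectangle math, written in Python for brevity. The `Rect` type, the function name, and the 1280x720 view size are assumptions for illustration and are not taken from any engine API; coordinates are assumed to be integer pixels.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def visible_source_rect(camera_x, camera_y, view_w, view_h, level_w, level_h):
    """Return the part of the level texture covered by the camera view.

    The result is clamped to the texture bounds, so a camera that hangs
    over the edge of the level yields a smaller rectangle.
    """
    left = max(0, min(camera_x, level_w))
    top = max(0, min(camera_y, level_h))
    right = max(0, min(camera_x + view_w, level_w))
    bottom = max(0, min(camera_y + view_h, level_h))
    return Rect(left, top, right - left, bottom - top)


# Camera fully inside a 3800x2900 level: the rectangle matches the view.
inside = visible_source_rect(1000, 500, 1280, 720, 3800, 2900)

# Camera near the bottom-right corner: the rectangle is clipped to the level.
corner = visible_source_rect(3000, 2500, 1280, 720, 3800, 2900)
```

In an engine, the resulting rectangle would be passed as the source region of the draw call, with the destination offset adjusted by the same amount whenever clamping kicks in.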

The largest levels are around 3800x2900 pixels. No level is bigger than 4000 pixels in one dimension, so they’d fit the GPU limitations I believe. I’m not sure though if memory access in a ~48MB buffer will actually be faster than accessing a smaller amount of data but spread across multiple small buffers. I’m thinking of laptops with integrated graphics with only shared VRAM that’s actually just regular RAM and not particularly fast. Also, well, it’s 48MB just for the level background.
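As a rough check on the memory figure, here is a small sketch of the arithmetic. It assumes an uncompressed 32-bit RGBA texture (4 bytes per pixel) with no mipmaps; the function name is made up for illustration, and a real driver may add padding or alignment on top.

```python
def texture_bytes(width, height, bytes_per_pixel=4):
    """Size of an uncompressed texture without mipmaps (assumption: RGBA8)."""
    return width * height * bytes_per_pixel


largest_level = texture_bytes(3800, 2900)   # 44,080,000 bytes
rounded_up = texture_bytes(4000, 3000)      # 48,000,000 bytes
worst_case = texture_bytes(4000, 4000)      # 64,000,000 bytes

for label, size in [("3800x2900", largest_level),
                    ("4000x3000", rounded_up),
                    ("4000x4000", worst_case)]:
    print(f"{label}: {size / 1_000_000:.1f} MB")
```

Under these assumptions the largest level lands at about 44 MB, and the ~48 MB estimate corresponds to rounding the level up to 4000x3000.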

I googled around and the only caveats I read about the approach were the size limits, but nothing about performance.

Does it sound like a good idea?


In my game I have two backgrounds. One is 3860x2171 and the other is 5760x3240. No issues so far on the platforms I tested on. Doing a desktop game. I have texture atlases that are bigger.

Anyway, I consider it fine. If some day I find a platform I want to support that can’t handle it, I’d find a workaround.

1 large texture is definitely a lot faster than many small textures.

But if you're talking 1 large texture vs 100 small textures, the performance increase is small. Once you start getting into hundreds and thousands of textures it will be very noticeable.

It depends. Yes, it's (slightly) faster, but you trade performance for GPU memory (which can be shared memory depending on the device). So if you target devices where this (small) speed gain is noticeable, I would guess GPU memory is something you want to go easy on as well.

I would just render it as tiles (I did this almost 10 years ago on older Android devices and it worked fine performance-wise). If you find later that you need that bit of performance, you can change it then, but save yourself the time for now.

You basically just save a bit of vertex shader work, which is going to be parallelized anyway. You save nothing in the pixel shader (as long as you put the individual tiles in an atlas or texture array, NOT in separate textures). You may even be faster there with a smaller atlas because of the texture cache. It would be interesting to do a test …

So I tried it and checked the performance on my old laptop with integrated graphics.

Using a large texture is much faster.

A bit surprisingly, there is no need to compute the source rectangle to only render the visible portion of the texture. I can just render the entire thing every time with the right transform passed to SpriteBatch.Begin. GPU load and framerate remained the same with or without a source rectangle corresponding to the visible area, and scaled the same way when I zoomed out to reveal more of the map.

Sometimes the simplest solution is the best, instead of over-engineering it.

There is quite a lot of overhead in breaking your map up into small chunks when there is not much to draw, so the simplest approach is to draw one texture, as long as it does not take up a lot of resources.

The benefit of using one large texture is that it's all in 1 draw call, and the GPU will cull the offscreen part of the texture.

When you're tiling the screen you're doing many draw calls. If you don't cull the offscreen tiles, you're still batching all these draw calls and sending them and their vertex data to the GPU before it decides it doesn't need to draw them.

What I’ve noticed is that the slowness doesn’t scale further the more tiles you draw, it only slows more for each different texture that you draw.

That means the speed between the two algorithms is only a base. Let’s say I have 30 different floor and wall tiles, my poor integrated GPU might go from say 5% usage for the single texture to say 20% usage drawing each tile. BUT I can draw many many more tiles beyond that in one frame and my GPU won’t go much further beyond that 20% base, if at all, if you see what I mean.
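One way to picture why the cost tracks the number of different textures rather than the number of tiles is a simplified batching model. The sketch below assumes a batcher that submits sprites in order and starts a new draw call whenever the texture changes; that is an assumption for illustration, not a description of any particular engine's internals, and the names are hypothetical.

```python
def count_batches(texture_sequence):
    """Count draw calls under the assumed 'new batch on texture change' model."""
    batches = 0
    previous = object()  # sentinel that never equals a real texture
    for texture in texture_sequence:
        if texture != previous:
            batches += 1
            previous = texture
    return batches


# 1000 tiles that all come from one atlas collapse into a single batch.
one_atlas = count_batches(["atlas"] * 1000)

# Alternating between two textures breaks the batch on every tile.
interleaved = count_batches(["floor", "wall"] * 500)

# The same 1000 tiles grouped by texture need only two batches.
grouped = count_batches(sorted(["floor", "wall"] * 500))
```

Under this model, adding more tiles that reuse already-bound textures in grouped order does not add draw calls, which would be consistent with the usage plateau described above.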


“When you're tiling the screen you're doing many draw calls.”

When you use SpriteBatch you don’t do many draw calls as long as they refer to the same texture (atlas): it’s still just 1 draw call. That’s basically the whole point of SpriteBatch. Yes, it will send more quads, but basic culling can easily be done locally, which should be done anyway.
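For reference, the local culling mentioned here can be as simple as working out which tile columns and rows overlap the camera view and only submitting those. The following is a minimal sketch of that index math, assuming square tiles on a regular grid and integer pixel coordinates; the 32-pixel tile size, the view size, and the function name are assumptions for illustration.

```python
def visible_tile_range(camera_x, camera_y, view_w, view_h, tile_size, cols, rows):
    """Return (first_col, last_col, first_row, last_row), inclusive.

    If the view lies entirely outside the grid, a 'last' index ends up
    smaller than the matching 'first' index, so the loops below draw nothing.
    """
    first_col = max(0, camera_x // tile_size)
    first_row = max(0, camera_y // tile_size)
    last_col = min(cols - 1, (camera_x + view_w - 1) // tile_size)
    last_row = min(rows - 1, (camera_y + view_h - 1) // tile_size)
    return first_col, last_col, first_row, last_row


# A 3800x2900 level cut into 32-pixel tiles needs 119 columns and 91 rows.
first_col, last_col, first_row, last_row = visible_tile_range(
    1000, 500, 1280, 720, tile_size=32, cols=119, rows=91)

visible = [(col, row)
           for row in range(first_row, last_row + 1)
           for col in range(first_col, last_col + 1)]
```

With these numbers only the tiles under the view are submitted instead of the full 119 x 91 grid.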

And you don’t use a separate texture for each tile. You make a texture atlas or texture array.

That said, pre-generating a big background image is no issue. It will work and it will be fast, especially if you need to fill the whole background anyway and that background is of a reasonable size. One would need to measure, but I would assume that drawing the same tile repeatedly as multiple quads from the same texture should technically be faster than drawing one giant texture, because of fewer cache misses in the sampler for each pixel. But I doubt it will be of any relevance in terms of performance :)