Why does a shader give worse performance than a simple Color.Lerp();?

The original thread can be found here: https://gamedev.stackexchange.com/q/148260/63179

Any and all help is appreciated.

SpriteSortMode.Immediate makes SpriteBatch do an actual draw call every time you call Draw on it. By default SpriteBatch batches draw calls and only actually draws them when you call SpriteBatch.End. That means it tries to combine draw calls and send them to the GPU at once. If you draw all your tiles in separate draw calls that will be terrible for performance. Also if you Lerp on the CPU it only happens once, whereas if you do it in a pixel shader, it happens for every pixel, but that really won’t hurt performance as much.
EDIT: Also, you’re sampling twice in your pixel shader and SpriteEffect only samples once.

If subsequent draws use the same texture they can be batched by SpriteBatch. In your case that means it might speed up drawing some more if you draw each type of tile in 1 batch by sorting them beforehand (maybe even at build time if your maps aren’t dynamic) or by letting SpriteBatch sort them by specifying SpriteSortMode.Texture in SpriteBatch.Begin.
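To make the batching point concrete, here is a small sketch that only counts texture changes between consecutive draws. It does not touch MonoGame at all: the integer ids are hypothetical stand-ins for Texture2D objects, and the sort only approximates what SpriteSortMode.Texture would do for you.

```csharp
using System;
using System.Linq;

public static class TextureSwitchDemo
{
    // Counts how often the texture changes between consecutive draws.
    // Assumption: each change ends the current batch, so fewer changes
    // means fewer real draw calls.
    public static int CountSwitches(int[] textureIds)
    {
        int switches = 0;
        for (int i = 1; i < textureIds.Length; i++)
        {
            if (textureIds[i] != textureIds[i - 1])
                switches++;
        }
        return switches;
    }

    // Orders the draws by texture id (hypothetical ids, not real textures).
    public static int[] SortByTexture(int[] textureIds)
    {
        return textureIds.OrderBy(id => id).ToArray();
    }
}
```

For a draw order like { 0, 1, 0, 2, 1, 0, 2, 1 } the unsorted sequence changes texture on nearly every draw, while the sorted one changes only when a new tile type starts.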

Something's really wrong if you're only getting 45 fps.

Looking at the first few lines, there are some mistakes.

In the first few lines of your code, the leading else in the else if (… is not required. Also, that first else if line will never do anything.

for (int j = 0; j < bord_width; j++)
{
else if (Players[PlayerID].UncoveredMap[tiles[i, j]] == Types.MapVisibility.NotWatched)
{
else if (drawingIndex == 1)
{

Evaluates to the following

if (drawingIndex == 1)
{

I don’t see the reason for tiles[i, j].HeightType being what decides which texture to switch to, unless these tiles are really huge. HeightType appears to be a misnomer, where HeightType is really a texture ID. Furthermore, sending these to the GPU in random, non-batched order means that potentially every draw call could require the GPU to switch textures back and forth.

Some further advice on top of what jjagg said.

These textures should be in a spritesheet, and then you should only need to set a texture once. Pass the source rectangle for the corresponding sprite to a fuller Draw overload.

This would…

Remove the need for immediate mode at all, let the draws batch, and alleviate pressure on the GPU.
Also, done this way, you would not need the switch case at all.

It would look like the below.

  for (int i = 0; i < bord_height; i++)
  {
      for (int j = 0; j < bord_width; j++)
      {
          // I'm guessing this was intended:
          // if (Players[PlayerID].UncoveredMap[tiles[i, j]] == Types.MapVisibility.NotWatched || drawingIndex == 1)
          if (drawingIndex == 1)
          {
              spriteBatch.Draw
              (
                  tileSheet,                                        // one shared sheet, set once
                  tiles[i, j].Position,
                  tiles[i, j].HeightType.TileSheetSourceRectangle,  // picks the sprite in the sheet
                  Color.Lerp(Color.White, Color.Black, 0.5f)
                  // etc...
              );
          }
      }
  }

I would rename HeightType to TileInfo
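As a rough sketch of where such a source rectangle could come from: assuming the sheet is a plain grid of equally sized square tiles, the rectangle follows from the tile's index. The names TileSheet, SourceRect, columns and tileSize are made up for illustration; in MonoGame the four values would go into a Rectangle.

```csharp
public static class TileSheet
{
    // Returns (x, y, width, height) in pixels for a tile in a grid-shaped sheet.
    // Assumption: tiles are laid out left to right, top to bottom,
    // with no padding between them.
    public static (int X, int Y, int Width, int Height) SourceRect(
        int tileIndex, int columns, int tileSize)
    {
        int col = tileIndex % columns;   // position within the row
        int row = tileIndex / columns;   // which row of the sheet
        return (col * tileSize, row * tileSize, tileSize, tileSize);
    }
}
```

If an atlas tool such as TexturePacker lays the sprites out differently, the rectangles would come from its exported data instead of this arithmetic.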

In the second example you should be passing your effect to Begin. Make a SpriteEffect from the pipeline, copy your shader into it, and use that.

I would imagine in the second example you could also just move that pixel shader code to the vertex shader, where by its nature the GPU is essentially lerping; then pass the color to the shader and right back out of it.

Hope I’m not too late to add this little piece of info…

  1. I’m getting 45 fps/15 fps on a 300 x 300 tile map (this is 2x the size of the biggest Civ 6 map); the map can’t be pre-organized, as it is generated randomly with noise.

  2. The name HeightType is exactly what you think: the value of height represented by a name (to make it easier and more understandable). And yes, there are 4 types in total (HeatType, MoistureType, HeightType and BiomeType, and yes, I’m trying to make this game as deep as possible).

  3. The “else if” you are seeing is not wrong, because first I’m drawing the tiles that are Visible and later the ones that are Not Watched.

  4. I have also thought about putting the effects in Begin(), but I was aiming to render all tiles in the same SpriteBatch (the ones with fog over them and the ones without). If I put the effects in Begin, that is not possible unless I put if statements in the shader, and I have read that this is the worst thing you can do in a shader.

Thank you both for taking a moment of your time to help a man in need, but unfortunately I still don’t have a permanent solution (if we discard the Color.Lerp() method which works, but I was hoping to gain performance with shaders).

Have you considered making your own quad vertex buffer mesh and assigning texture coordinates, on generation, to the tiles located in a single spritesheet? This would also allow for multi-texturing and many effects, and should result in a substantial speed increase.

128000 textures is not much for an instanced vertex/index buffer. It’s a lot worse for SpriteBatch; even batching with deferred mode in this case requires considerable CPU overhead.

If you would like, I can give you some simple custom user-indexed quad mesh generation code; this also allows you to actually make a height map from an image for real 3D heights. The main thing I know is bottlenecking here, and even worse on slower computers, is texture fetching: you need these images in a tilesheet, which gets rid of that big problem for both deferred and immediate mode.

Remember that underneath SpriteBatch is 3D quad drawing; 2D is just obfuscated 3D.

If I’m honest, I have never heard of a “quad vertex buffer mesh”; I will give it a read when I wake up. Sure, you can give me an example. This is strictly 2D: yes, the actual world is generated in 4D and could be wrapped around a sphere, but I will not be rendering any 3D objects.

Edit: Correct me if I’m wrong, but in 3D models, those triangles that compose the actual model, are they the “quad vertex buffer mesh” you are talking about?

Sorry, I’m really trying to combine a bunch of ideas into one sentence. What I mean is the following.

You have 360x360 tiles; each is a rectangle with 4 vertices, drawn contiguously, aligned side to side.

You can create (programmatically) all these points as they would be drawn, exactly as SpriteBatch would, but instead figure out how they would be placed as real coordinates side to side. You know that you have 360x360 tiles, which means you have 129600 tiles in total. Each has 4 vertices, and all of those vertices except the ones on the outer edges (about 360x4 = 1440) are shared with neighboring tiles. So even if you did this as a straight vertex buffer, it’s at most 518400 vertices as quad draws. A vertex buffer with (360+1) * (360+1) vertices should handle all those tiles, provided you did it right.
You then define an index buffer by quads, not triangles.

Such that
quad (0,0) as quad (x, y)
equates to a set of vertices for that quad, by:
quadindex = x + y * width // width is 360 in your case; this is known as the stride.
With shared vertices each row of the grid holds width + 1 vertices, so the vertices for this quad are then defined by…
top left = x + y * (width + 1);
top right = (x + 1) + y * (width + 1);
bottom left = x + (y + 1) * (width + 1);
bottom right = (x + 1) + (y + 1) * (width + 1);

This defines the quad’s indices into the vertices for the specified tile (x, y);
this is a rectangle.
Knowing this allows you to set texture coordinates on those vertices.
Instance the data and use DrawUserIndexedPrimitives to draw a continuous mesh of rectangles, where each of the many vertices’ texture coordinates aligns to a spritesheet at the positions you specify.
Such a mesh can be built on load, CPU side, very quickly, and is perfect for procedural height generation, NURBS meshes, and many things in either 2D or 3D, or for snapping off 2D images from a 3D generated image.
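The indexing described above can be sketched in plain C# as follows. This is only an illustration under my own assumptions, not the test class mentioned in this thread: QuadGrid, VertexCount and BuildIndices are hypothetical names, the quads are split into two triangles each, and the winding order shown may need flipping depending on your projection and cull mode.

```csharp
using System;

public static class QuadGrid
{
    // Shared-vertex grid: one more vertex than quads along each axis.
    public static int VertexCount(int width, int height)
    {
        return (width + 1) * (height + 1);
    }

    // Builds a triangle-list index buffer, 6 indices (two triangles) per quad.
    // int indices are used because a 360x360 grid has more vertices
    // than a 16-bit index can address.
    public static int[] BuildIndices(int width, int height)
    {
        int stride = width + 1; // vertices per row of the grid
        int[] indices = new int[width * height * 6];
        int n = 0;
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                int topLeft = x + y * stride;
                int topRight = topLeft + 1;
                int bottomLeft = x + (y + 1) * stride;
                int bottomRight = bottomLeft + 1;

                // First triangle of the quad.
                indices[n++] = topLeft;
                indices[n++] = topRight;
                indices[n++] = bottomLeft;
                // Second triangle of the quad.
                indices[n++] = bottomLeft;
                indices[n++] = topRight;
                indices[n++] = bottomRight;
            }
        }
        return indices;
    }
}
```

One caveat with fully shared vertices: a vertex can carry only one texture coordinate, so if neighboring tiles must sample different regions of the spritesheet, the unshared layout with 4 vertices per tile is probably the one to use, with the same two-triangles-per-quad indexing.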

Obviously this isn’t optimized for you; it’s just an old test class. This is 100x100 quads, not really instanced, but it is basically what I’m talking about as a mesh of quads.
http://i936.photobucket.com/albums/ad207/xlightwavex/programing%20and%20concepts/100x100quads_zpscxojtfst.gif


Thx. Will try this in the morning…

Edit: Interesting. I tried using TexturePacker to make atlases and, to my surprise, if the effects are off and only the base tile textures are rendered, I get a 10 fps boost when I use individual textures instead of the atlas. Also, the GPU load is 5% less with an atlas.