HLSL shader optimization

So, write to rendertarget -> apply shader & overwrite rendertarget with new data -> draw rendertarget?

Is this efficient? I’m curious about the performance impact of all those spritebatch.Begin calls.

I’m not sure. You’ve got a really good setup to test though :smiley:

It should only be three total, so that shouldn’t be too bad.

First = Draw base assets to target.
Second = Post processing pass on target.
Third = Draw target to screen scaled up.


Thanks for the fast reply! I’ll test the performance when I can :stuck_out_tongue:.
Yeah, three for the text, but two more for UI (no effects, no transformation) and world (camera transformation) rendering.

Edit: I can draw the two-pass rendertarget together with the UI.

Hmm, weird. I still need to use outlineWidth = 4 to get the desired effect for the font if I downscale my rendertarget and text. I’m still new to rendertargets so I might be doing something wrong here.

Rendertarget draw if I clear the graphicsdevice with a white color: (all text is now drawn at scale 0.25f)

Rendertarget draw with the applied shader (outlineWidth = 4)

Rendertarget draw with the applied shader (outlineWidth = 1)

That’s odd… you did calculate the new value of texelSize for your smaller RenderTarget and pass that through to your shader, right?
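For context on why a stale texelSize shows up as "needs outlineWidth = 4": the texel size is (1 / width, 1 / height) of the render target being sampled. Assuming, purely for illustration, a 3440 x 1440 back buffer and a render target at 0.25 scale (860 x 360), one texel of the small target is 1/860 wide, which is four times the full-screen value of 1/3440. With the old value still set, an outline width of 4 therefore covers exactly one texel of the small target. Below is a minimal sketch of a shader that takes the value as a parameter. It follows the layout of the MonoGame sprite effect template; the names TexelSize and OutlineWidth and the black outline are assumptions, not the shader from the original post.

```hlsl
#if OPENGL
    #define SV_POSITION POSITION
    #define PS_SHADERMODEL ps_3_0
#else
    #define PS_SHADERMODEL ps_4_0_level_9_1
#endif

Texture2D SpriteTexture;
sampler2D SpriteTextureSampler = sampler_state
{
    Texture = <SpriteTexture>;
};

// Assumed parameter: set from game code to (1 / targetWidth, 1 / targetHeight)
// of the render target being sampled, e.g. (1/860, 1/360) in the example above.
float2 TexelSize;

// Assumed parameter: outline width in texels of that render target.
float OutlineWidth = 1.0;

struct VertexShaderOutput
{
    float4 Position : SV_POSITION;
    float4 Color : COLOR0;
    float2 TextureCoordinates : TEXCOORD0;
};

float4 MainPS(VertexShaderOutput input) : COLOR
{
    float2 uv = input.TextureCoordinates;
    float2 offset = TexelSize * OutlineWidth;

    // The glyph pixel itself, tinted by the SpriteBatch colour.
    float4 center = tex2D(SpriteTextureSampler, uv) * input.Color;

    // Largest alpha among the four direct neighbours.
    float alpha = tex2D(SpriteTextureSampler, uv + float2(offset.x, 0.0)).a;
    alpha = max(alpha, tex2D(SpriteTextureSampler, uv - float2(offset.x, 0.0)).a);
    alpha = max(alpha, tex2D(SpriteTextureSampler, uv + float2(0.0, offset.y)).a);
    alpha = max(alpha, tex2D(SpriteTextureSampler, uv - float2(0.0, offset.y)).a);

    // Black outline wherever a neighbour is opaque, with the glyph drawn on top.
    // This "over" blend assumes premultiplied alpha (the MonoGame default).
    float4 outline = float4(0.0, 0.0, 0.0, 1.0) * alpha;
    return center + outline * (1.0 - center.a);
}

technique SpriteDrawing
{
    pass P0
    {
        PixelShader = compile PS_SHADERMODEL MainPS();
    }
};
```

If the render target size changes (for example on a window resize), TexelSize has to be set again from the game code.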

(By the way, that sand looking tile… can that be rendered with the text in the down-scaled render target? It looks like it has the exact same effect on it.)

Thanks again :slight_smile: ! Forgot to change the texelsize…
The outlines on the sand texture are the same but the UI is temporary.
The shader performance seems about the same as before, probably because of the extra SpriteBatch.Begin calls?

Edit: I’ll probably see some performance improvement compared to a fullscreen render target at very high resolutions.

Hmm, I’m surprised an additional sprite batch call brought you back to your old performance where you said you had 30+ :frowning:

Not 30+ spritebatch begin calls, 30+ spritebatch.drawstring calls per string drawn on screen.
I mean the old shader method (fullscreen render target). The shader is much faster than the drawstring method because each new string needs 30+ additional drawstring calls. The shader performance stays the same with more strings.

Oh I see. Interesting.


So you don’t have the loops anymore now? I think you can use ddx and ddy instructions to replace some or maybe even all of the texture fetches.

Instead of alpha = max(alpha, sample) you can do alpha += sample. Not sure if that will have an impact, the compiler might have optimized it.
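To make the two variants concrete, here is a small sketch with hypothetical helper names. It assumes the alpha values of four neighbour samples have already been packed into a float4. Note that a plain sum can exceed 1, so the additive version is clamped with saturate.

```hlsl
// Hypothetical helpers: 'a' holds the alpha of four neighbour samples.

// Variant 1: keep the largest alpha found.
float CoverageMax(float4 a)
{
    return max(max(a.x, a.y), max(a.z, a.w));
}

// Variant 2: add the alphas up. The dot product with (1, 1, 1, 1) sums the
// four components; saturate clamps the result to the range [0, 1].
float CoverageSum(float4 a)
{
    return saturate(dot(a, float4(1.0, 1.0, 1.0, 1.0)));
}
```

For a hard-edged pixel font whose alpha is only ever 0 or 1, both functions return the same result. They differ on anti-aliased edges, where the sum produces a heavier outline than the max.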

What does the shader look like right now?

Nvm, replacing the texture fetches with ddx/ddy won’t work. You can, however, use ddx/ddy to get the texel size so you don’t have to pass it into the shader.

I’ve updated the original post with my current shader.
There are no loops anymore but there’s an additional boolean value for the font shadow. How would I use ddx and ddy to get the texelsize?
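One way this could look, as a hedged sketch rather than a tested answer: ddx and ddy return how much a value changes between horizontally and vertically adjacent output pixels, so applied to the texture coordinate they give the UV step per output pixel. The function name below is hypothetical.

```hlsl
// Hypothetical helper: estimates the texel size from screen-space derivatives.
// Only valid inside a pixel shader.
float2 EstimateTexelSize(float2 uv)
{
    // UV change per output pixel in x and y. abs() guards against
    // flipped or mirrored texture coordinates.
    return float2(abs(ddx(uv.x)), abs(ddy(uv.y)));
}
```

This only equals the real texel size when the texture is drawn 1:1, with one texel per output pixel and no rotation, as in a post-processing pass from one render target into another of the same size. If the small render target is drawn scaled up four times, the result is a quarter of a texel instead. As far as I know, the derivative instructions are also not available in the lowest pixel shader profiles, so the effect may need to be compiled for a higher profile than the MonoGame default; treat that as something to verify.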

As a recommendation, you might consider passing in each colour you want to set as a parameter. Then you don’t have to change the shader if you want to change colours.
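A minimal sketch of what that could look like on the shader side. The parameter and function names are made up for illustration, and the colours are assumed to be passed in premultiplied form.

```hlsl
// Hypothetical effect parameters, expected as premultiplied RGBA.
float4 FillColor = float4(1.0, 1.0, 1.0, 1.0);
float4 OutlineColor = float4(0.0, 0.0, 0.0, 1.0);

// glyphAlpha:   coverage of the glyph itself at this pixel.
// outlineAlpha: coverage found in the neighbouring texels.
float4 ComposeText(float glyphAlpha, float outlineAlpha)
{
    float4 fill = FillColor * glyphAlpha;
    float4 outline = OutlineColor * outlineAlpha;

    // Draw the fill over the outline (premultiplied "over" blend).
    return fill + outline * (1.0 - fill.a);
}
```

On the game side the values would then be set per draw with something like effect.Parameters["OutlineColor"].SetValue(Color.Black.ToVector4()), so changing a colour no longer requires editing or recompiling the shader.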


Very true. I’m currently still searching for an efficient way to draw multiple strings in different styles (ex.: some have a shadow and some don’t) without creating different render targets for each group of strings with a certain style. I don’t know if this is possible.

I’m not sure if that’s possible either, unless you bake the effect into the font itself. If they were always going to be the same colour this wouldn’t be a problem, but I remember your screenshots showing the inner colour being different.

I think the best you can do here is to just batch styles and render all text of each style to a render target, post process the effect, then layer them all together.

That said, with the graphics you’ve shown thus far (quite pixelated), and using down-scaled render targets, modern computers should be able to handle a lot without significant performance losses. If performance does suffer, you might just have to rethink how many text effects you want, or limit the colours for some of the effects and bake those visuals into sprites so you can avoid post processing.


Thank you for the advice.

That was the drawstring() operation that is executed once per string before applying the outline shader. The main two are just the default outlines and a version without the outer outline when the text is being used as a button.

Most of the expense in your shader is all the samples. Look up “distance field font rendering”: if every pixel in the texture stores, let’s say as its alpha value, its shortest distance in pixels from the border, then you only have to do one sample to know if it’s within the specified border range.
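As a rough sketch of the single-sample idea: the shader below assumes the font texture’s alpha channel holds a normalized distance (0.5 exactly on the glyph edge, larger values inside the glyph), which is a common encoding but not the raw pixel distance described above. All parameter names and default values are assumptions, and generating such a texture is not shown.

```hlsl
#if OPENGL
    #define SV_POSITION POSITION
    #define PS_SHADERMODEL ps_3_0
#else
    #define PS_SHADERMODEL ps_4_0_level_9_1
#endif

Texture2D DistanceTexture;
sampler2D DistanceSampler = sampler_state
{
    Texture = <DistanceTexture>;
};

// Assumed encoding: alpha = 0.5 on the glyph edge, > 0.5 inside the glyph.
float EdgeValue = 0.5;
// Outline thickness in the same normalized distance units (assumed value).
float OutlineBand = 0.15;
// Colours, expected as premultiplied RGBA.
float4 FillColor = float4(1.0, 1.0, 1.0, 1.0);
float4 OutlineColor = float4(0.0, 0.0, 0.0, 1.0);

struct VertexShaderOutput
{
    float4 Position : SV_POSITION;
    float4 Color : COLOR0;
    float2 TextureCoordinates : TEXCOORD0;
};

float4 DistanceFieldPS(VertexShaderOutput input) : COLOR
{
    // The only texture fetch: the stored distance at this pixel.
    float d = tex2D(DistanceSampler, input.TextureCoordinates).a;

    // 1 inside the glyph, 0 outside.
    float fill = step(EdgeValue, d);
    // 1 inside the glyph or within the outline band around it.
    float covered = step(EdgeValue - OutlineBand, d);
    // 1 only in the band between the outline limit and the glyph edge.
    float outline = covered - fill;

    return FillColor * fill + OutlineColor * outline;
}

technique DistanceFieldText
{
    pass P0
    {
        PixelShader = compile PS_SHADERMODEL DistanceFieldPS();
    }
};
```

The hard step edges suit a pixel font; for smooth fonts the two step calls would usually be replaced with smoothstep over a narrow range.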

However, to answer the question about how people tend to generally do this…

Let’s say your game has a TextControl which can be any one style. So if you want the sentence “Deals 300 damage!” with the “300” in a different style from the rest, it is built from three text controls, since they need to be different styles. If the whole sentence uses a single style, “Deals 300 damage!” could be just one.

Each text control is its own spritebatch.drawstring call … each style is a separate SpriteFont. You just have to emit controls in an efficient way, and it will be efficient.

You may or may not even need your own custom shader (colorized outlines might be a uniform). Anyhow there are also cool things you can do if you -do- have your own shader.

Actually though, unless you are having trouble hitting 60 fps or whatever your target is, and you know this is the culprit – move on to other things :smiley: I kind of doubt it is, considering how few pixels on the screen have text on them in a typical game.


I already discussed using “distance field font rendering” for this in another thread, but used this shader because the implementation seemed overkill for what I wanted to achieve. (see 2D outline pixel shader on a pixel font)

I actually tested the performance difference between the shader and not drawing text at all. There was almost no difference. I currently get 400 - 500 fps at a resolution of 3440 x 1440 (while also drawing 33k tiles), so framerate is not a problem (yet).
But thanks for the advice :slight_smile: .

Actually, this is really good advice. It’s easy to get caught up in performance tweaks. As someone who has gotten lost in this before, trust me… just make content. You can optimize these things later :smiley:

(Plus I’m looking forward to seeing what you’re building! :P)
