Is there a proper way to use SpriteBatch with multiple source textures and multiple UVs? Defining custom input buffers for the vertex shader and pushing the data myself would be ideal.
I know I can use ‘Immediate’ mode and uniforms, but that would be devastating for performance, and I wouldn’t be able to enjoy things like depth sorting. There’s really not much I can do with uniforms per draw call if I let MonoGame sort the renderings for me, unless there’s a way to tie additional input information to be sorted along with the rest of the draw data.
If you wonder about the use case - I want to draw to 3 targets: diffuse, normal maps, and emission. I want to try out some 2D with “3D” lighting. I can build it in 3 passes (i.e. draw diffuse, then draw normals, then draw emission and mix them with the lights), but that would be rendering everything 3 times and very wasteful.
Edit: I’m thinking about switching to ‘Deferred’ mode, which to my understanding preserves the order of the draw calls. I’d sort the sprites myself, build a buffer of normal and emission UVs, and feed it to the shader as a uniform whenever a draw call is about to be issued to the GPU (i.e. when the source texture changes or before calling End()).
Is this a good approach? Are there other things that might trigger an internal GPU draw call, like a limit to how many sprites can be batched in a single flush? If so how do I detect these?
As long as you write your own shader, sure. It will be slightly hacky (depending on the specific use; for yours you will be fine. I will mention the other possibility as well, and for that one I simply recommend writing your own sprite batch). Anyway, short version:
Bind the textures to individual slots (don’t forget that t0 will be assigned at flush time by the texture you pass as a parameter to .Draw, so either ignore that slot or simply account for it).
Now in your shader sample them as you see fit: simply always .Draw(DiffuseTex) while binding the normal map to t1 and emission to t2 beforehand.
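A rough sketch of what that pixel shader could look like, assuming unified UVs and three render targets bound (names and register layout are my own, not tested):

```hlsl
Texture2D DiffuseTex  : register(t0);  // bound by SpriteBatch itself at flush (the .Draw texture)
Texture2D NormalTex   : register(t1);  // bound manually: GraphicsDevice.Textures[1] = normalMap;
Texture2D EmissiveTex : register(t2);  // bound manually: GraphicsDevice.Textures[2] = emissionMap;
SamplerState Smp : register(s0);

struct PsOut
{
    float4 Diffuse  : SV_Target0;
    float4 Normal   : SV_Target1;
    float4 Emissive : SV_Target2;
};

PsOut MainPS(float4 color : COLOR0, float2 uv : TEXCOORD0)
{
    // Same UV taps all three maps, so the atlases must stay in sync.
    PsOut o;
    o.Diffuse  = DiffuseTex.Sample(Smp, uv) * color;
    o.Normal   = NormalTex.Sample(Smp, uv);
    o.Emissive = EmissiveTex.Sample(Smp, uv);
    return o;
}
```

On the C# side you would set the extra slots and the three render targets (GraphicsDevice.SetRenderTargets) before SpriteBatch.Begin, passing this effect to Begin.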
This is your case, straight and simple. The other possible use case would be batching across multiple textures. That would be done through a texture array, or again individual binds while deciding which one to tap using, for example, the red channel of the color parameter (it’s however a normalized byte4, so that has to be accounted for), in which case I would recommend your own SpriteBatch. In your case you will be fine, since yours is the first variant.
That won’t give me different UVs, right? Or am I missing something?
You want different UVs for different textures? So a different UV to tap into the normal map, a different one for diffuse, a different one for emissive? That sounds very atypical, please elaborate on the specific use case (there is a solution for everything, I just need to know what exactly is going on). Also, if it is different for each map, is there some mathematical rule that could be established?
Using the same UVs for all texture sources forces me to use textures that match, which is quite limiting when using runtime-generated texture atlases. Also, maybe I’m greedy, but I want to allow some sprites to not have an emission texture and just draw the original sprite in black to the emission render target, just to hide whatever values are below.
In the end I’ll use emission for only a few sprites, so forcing everything to have a black emission texture just to support a cool effect for the few that need it sounds wasteful.
EDIT: now that I think about it more maybe I’ll just go with unified UVs and build 3 atlases - for diffuse, emission and normals, and whenever there’s no emission texture I’ll do the black silhouette trick on the atlas itself, so the 3 atlases will always remain in sync while the sources can be whatever.
It’s not ideal in runtime memory as I’ll eat up a lot of texture space for no reason, but it’s not that bad.
Ofc the tricky part will be when I exceed the atlas size and need to start building multiple atlases. But it sounds like I can simplify the rendering pipeline significantly by making the atlas building more complicated and slightly less efficient.
Alright, different UVs for each sprite picked without any mathematical rule between them. Two solutions:
If there are fewer than 256 sprites per sheet (or it is a uniform grid), we can get it going using the default SpriteBatch vertex format with a UV look-up table (everything is data, nothing is true, everything is permitted… I will elaborate on that approach if the limitation is applicable). Edit: actually that limit will be way higher, now that I think of it; thousands of UVs will be fine, since we will hijack the original UVs instead of Color, as they are float4.
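As a sketch of what that look-up-table variant could mean shader-side (the names and the exact packing scheme are my own invention, untested; the idea is just that the CPU smuggles a sprite index through the hijacked UV channel and uploads one source rect per sprite per map once per batch):

```hlsl
// CPU side writes, instead of real atlas UVs:
//   uv.x = spriteIndex + 0.25 + 0.5 * cornerU   (cornerU, cornerV in {0,1})
//   uv.y = cornerV
// so floor(uv.x) stays constant across the quad while frac recovers the corner.

float4 DiffuseRects[1024];   // xy = offset, zw = size, in 0..1 atlas space
float4 NormalRects[1024];
float4 EmissiveRects[1024];

Texture2D DiffuseTex  : register(t0);
Texture2D NormalTex   : register(t1);
Texture2D EmissiveTex : register(t2);
SamplerState Smp : register(s0);

struct PsOut { float4 Diffuse : SV_Target0; float4 Normal : SV_Target1; float4 Emissive : SV_Target2; };

PsOut MainPS(float4 color : COLOR0, float2 uv : TEXCOORD0)
{
    int    idx   = (int)floor(uv.x);                          // which sprite
    float2 local = float2((frac(uv.x) - 0.25) * 2.0, uv.y);   // 0..1 inside the sprite

    PsOut o;
    o.Diffuse  = DiffuseTex.Sample(Smp,  DiffuseRects[idx].xy  + local * DiffuseRects[idx].zw) * color;
    o.Normal   = NormalTex.Sample(Smp,   NormalRects[idx].xy   + local * NormalRects[idx].zw);
    o.Emissive = EmissiveTex.Sample(Smp, EmissiveRects[idx].xy + local * EmissiveRects[idx].zw);
    return o;
}
```

The rect arrays would be filled once per batch via EffectParameter.SetValue with a Vector4 array; three 1024-entry float4 arrays still fit comfortably in a constant buffer.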
Otherwise a custom vertex declaration + a custom SpriteBatch so you can feed the required data; definitely cleaner.
Btw I am on a train, so connectivity might come and go.
Do you have an example of a custom SpriteBatch with a custom vertex declaration, in case I want to explore that option too?
Personally I would do it through instancing… (well, I am lying a bit here: I would do it through a custom fetch from a structured buffer, but if we are talking “vanilla” MG, then instancing)
You can make a copy of the original SpriteBatch source code and replace the default vertex declaration and the .Draw functions. A few hours of work tops, though a bit annoying.
I posted an instancing solution for MG (a few lines of code, basically) in the MG Discord a few years ago, but that might also be a bit annoying to find.
Ahh, you meant like that. I was hoping MG had some base class for sprite batches that allows the creation of custom batches.
For instancing I’ll need a custom SpriteBatch too, or to not use one at all, right? I didn’t see any mention of instancing with the default SpriteBatch.
For instancing you need a custom solution, yes. However, it can be boiled down to like 30 lines of very efficient code (which can be made even more efficient by cluster instancing or the mentioned custom fetch; still, for normal purposes, as inefficient as quad instancing is, it will easily be 20-100 times faster than MG’s SpriteBatch anyway).
Note: there is a bug in the OpenGL MG implementation that makes instancing annoying to work with. It works perfectly fine on DX, I think it is fixed in the OpenGL compute branch as well, and it also works perfectly fine in FNA.
I did some experiments with instancing and deferred rendering some years ago and remember there was some strange issue I solved by switching to DX, but I don’t remember what it was. So that’s still a thing, I guess.
Instancing is great, but I don’t want to go overkill; in the end it’s still just 2D. I just don’t want it to be rendering-everything-3-times 2D, because that’s just insulting to the gods of the GPU.
AAAAAAAAAAAAAAAAA the net crashed and deleted my message, so I will shorten it:
Instancing would be super straightforward, cleaner and simpler, since your textures don’t change for the whole batch.
You don’t even need vertices for your quads, just 6 indices per instance; use SV_VertexID to expand the corners, so you don’t even have to create whole vertex bindings, as you can just create a single instance-data vertex buffer (MG code will of course create bindings underneath, but it doesn’t matter, you will save two lines of code).
I haven’t used MG in a while, so this is just approximate:

DynamicVertexBuffer instanceVB = new .....

// data preparation
void DrawSprite(Vector2 position, Rectangle difUv, Rectangle normUv, Rectangle emissiveUv)
{
    ref var sprite = ref InstanceData[nextFreeSprite++];
    sprite.Position = position; // ...and so on for the UV rects
    // or a constructor, I don’t care
}

// when flushing the batch
GraphicsDevice.Indices = QuadIB; // depends on the shader, can be 0,1,3, 1,2,3
GraphicsDevice.SetRenderTargets(DiffuseRt, NormalRt, EmissiveRt);
nextFreeSprite = 0;
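For completeness, the vertex-shader half of the SV_VertexID trick might look something like this (a rough sketch with my own names; the per-instance layout has to match your vertex declaration, and remember the OpenGL instancing bug mentioned above):

```hlsl
float4x4 ViewProjection;

struct VsIn   // one entry per instance; the quad itself has no vertex buffer at all
{
    float2 Position  : POSITION0;   // top-left corner, in world/screen units
    float2 Size      : POSITION1;
    float4 DiffuseUv : TEXCOORD0;   // xy = offset, zw = size in atlas space
    uint   VertexId  : SV_VertexID; // 0..3, fed by the shared 6-entry quad index buffer
};

struct VsOut
{
    float4 Position : SV_Position;
    float2 Uv       : TEXCOORD0;
};

VsOut MainVS(VsIn i)
{
    // Expand the corner from the vertex id: 0=(0,0) 1=(1,0) 2=(0,1) 3=(1,1),
    // which matches an index buffer laid out as 0,1,2, 2,1,3.
    float2 corner = float2(i.VertexId & 1, i.VertexId >> 1);

    VsOut o;
    o.Position = mul(float4(i.Position + corner * i.Size, 0, 1), ViewProjection);
    o.Uv = i.DiffuseUv.xy + corner * i.DiffuseUv.zw;
    return o;
}
```

The C# side would then bind the instance buffer and issue the whole batch with a single GraphicsDevice.DrawInstancedPrimitives call (2 primitives per instance).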
Thanks man, but I think I’ll just go with the regular sprite batch + sophisticated atlas.
It will save me some matrix calculations, since all the basic 2D transformations are already in the sprite batch (although it’s a shame it doesn’t support skew out of the box; I might need to create my own SB after all).
At this point, I offer a differing approach: bake the other 2D maps at a lower resolution per image to fill the 4 channels of 8-bit data, then recombine them from the texture samplers in the pixel shader.
The benefit is that pre-calculated, interpolatable lower-resolution (1/4+) maps can share texture space.
There is a reason 3D mappings (color, UV, displacement, normal) are sometimes/usually passed to the vertex shader: to reduce the burden on GPU data transfer via pixel-shader interpolation.
If it works for 3D, then a modified 2D implementation should be at least as performant as its 2D counterparts.
The only negative is that the resolution of each map is reduced at the edges due to the interpolation region.