Reality check: static instancing vs view frustum culling?

Quasar · September 5, 2022, 1:49am

I have a game with a ton of instanced, low-poly trees. There are two performant approaches I can take to rendering them, and I’d like some new eyes on the problem to let me know if I’m missing something obvious. Please let me know if you see anything wrong with my assessment:

1. View Frustum Culling. Filter out all the trees that are out of the player’s sight using a quadtree, and only send the transforms of the ones in view to the GPU every frame.
- Pros: Less GPU load.
- Cons: A big chunk of instance transform data being transferred through the CPU-to-GPU bottleneck every frame.
2. Static Instancing. Throw the transforms of every tree in the game to the GPU at the start of the game, and just render them all every frame.
- Pros: No instance transform data needs to be transferred through the bottleneck.
- Cons: Extra GPU load, since I’m rendering all the trees all the time.
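
For reference, the CPU side of approach 1 can be sketched roughly like this. It is only a minimal sketch in Python, not the code from my game: it assumes the visible area can be approximated by an axis-aligned rectangle on the ground plane, and `Rect`, `QuadTree`, `capacity` and `max_depth` are made-up names for illustration. The list returned by `query` is what would decide which transforms get uploaded each frame.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float

    def contains(self, px, py):
        # Half-open on the max edges so a point belongs to exactly one quadrant.
        return self.x <= px < self.x + self.w and self.y <= py < self.y + self.h

    def intersects(self, other):
        return (self.x < other.x + other.w and other.x < self.x + self.w and
                self.y < other.y + other.h and other.y < self.y + self.h)


class QuadTree:
    def __init__(self, bounds, capacity=8, depth=0, max_depth=10):
        self.bounds = bounds
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.items = []        # (x, y, tree_id) stored at this node
        self.children = None   # four child nodes once this node has split

    def insert(self, x, y, tree_id):
        if not self.bounds.contains(x, y):
            return False
        if self.children is None:
            # Keep the point here while there is room, or when we may not split further.
            if len(self.items) < self.capacity or self.depth >= self.max_depth:
                self.items.append((x, y, tree_id))
                return True
            self._split()
        # Fall back to this node if rounding keeps every child from accepting the point.
        if not any(child.insert(x, y, tree_id) for child in self.children):
            self.items.append((x, y, tree_id))
        return True

    def _split(self):
        b = self.bounds
        hw, hh = b.w / 2, b.h / 2
        self.children = [
            QuadTree(Rect(b.x + dx * hw, b.y + dy * hh, hw, hh),
                     self.capacity, self.depth + 1, self.max_depth)
            for dy in (0, 1) for dx in (0, 1)
        ]
        old, self.items = self.items, []
        for x, y, tree_id in old:
            self.insert(x, y, tree_id)

    def query(self, view, out):
        # Skip whole subtrees whose bounds do not touch the view rectangle.
        if not self.bounds.intersects(view):
            return out
        out.extend(tid for x, y, tid in self.items if view.contains(x, y))
        if self.children is not None:
            for child in self.children:
                child.query(view, out)
        return out
```

Calling `tree.query(view, [])` once per frame gives the ids of the trees inside `view`. The cost I’m worried about is not this query but the transform upload that follows it.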

Both of those cons can be spun as pros: the transform data in view frustum culling is also a pipeline for other data that changes over time (e.g. local wind vectors), and rendering all the trees all the time means my lag won’t spike when I zoom out to view the entire game map.

Right now I’m using static instancing, which I arrived at after suffering performance problems with view frustum culling, but I’m having trouble doing everything I want to do with that method. My game includes trees that grow and die and creatures that eat the trees, so I would really benefit from the ability to send a few bits of extra data to the instances every frame.

Essentially: can view frustum culling be performed with less load on the CPU-to-GPU pipeline?

Ravendarke · September 5, 2022, 3:51am

Super simple solution: grab the compute fork and do the frustum culling on the GPU in a compute shader.
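
To make the idea concrete, here is the per-instance test such a shader would typically run, written as a plain Python reference rather than actual shader code. It is a sketch under assumptions: the frustum is given as six inward-facing planes `(nx, ny, nz, d)` with unit-length normals, each tree is wrapped in a bounding sphere, and `sphere_visible` and `cull_instances` are names invented for this example. On the GPU each instance would be handled by its own thread, and the surviving indices would be appended to a buffer that the draw call reads.

```python
def sphere_visible(planes, center, radius):
    """True unless the sphere lies entirely behind at least one frustum plane."""
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        # Signed distance from the sphere center to the plane (normal points inward).
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False
    return True


def cull_instances(planes, centers, radius):
    """Compact the indices of visible instances, as a GPU append buffer would."""
    return [i for i, c in enumerate(centers) if sphere_visible(planes, c, radius)]
```

The test is conservative: a sphere near a frustum corner can pass even though it is just outside, which only costs a few extra instances drawn.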

reiti.net · September 5, 2022, 8:11am

I also have (fully procedural) trees in Exipelago, and basically each of them is a single geo instance. I actually use both of your approaches. Due to the nature of the game, I already have coarse spatial grouping, and within those groups I combine all tree data into a single mesh.

So culling is done via the spatial grouping, and when data changes only a smaller bit of data has to be recreated and resent. As most trees are fully procedural, this only happens when trees are added or removed though; everything else is calculated using GPU-side settings (including growth). Trees are still expensive, especially when drawing twice because of shadows.
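
The grouping idea can be sketched roughly as below. This is an illustrative Python sketch, not the Exipelago code: `CHUNK_SIZE`, `ChunkedTrees` and its methods are hypothetical names, and the per-group mesh or buffer rebuild is reduced to a list of group keys that would need re-uploading.

```python
CHUNK_SIZE = 32.0  # assumed world units per group, purely illustrative


class ChunkedTrees:
    def __init__(self):
        self.chunks = {}    # (cx, cy) -> {tree_id: (x, y)}
        self.dirty = set()  # groups whose GPU data is out of date

    @staticmethod
    def key(x, y):
        return (int(x // CHUNK_SIZE), int(y // CHUNK_SIZE))

    def add(self, tree_id, x, y):
        k = self.key(x, y)
        self.chunks.setdefault(k, {})[tree_id] = (x, y)
        self.dirty.add(k)

    def remove(self, tree_id, x, y):
        k = self.key(x, y)
        if self.chunks.get(k, {}).pop(tree_id, None) is not None:
            self.dirty.add(k)

    def flush(self):
        """Return the groups to rebuild and re-upload, then clear the dirty set."""
        rebuilt = sorted(self.dirty)
        self.dirty.clear()
        return rebuilt

    def visible(self, min_x, min_y, max_x, max_y):
        """Coarse culling: keys of non-empty groups overlapping the view rectangle."""
        (cx0, cy0), (cx1, cy1) = self.key(min_x, min_y), self.key(max_x, max_y)
        return [k for k in sorted(self.chunks)
                if self.chunks[k] and cx0 <= k[0] <= cx1 and cy0 <= k[1] <= cy1]
```

A frame would then draw the groups returned by `visible(...)` and upload only what `flush()` reports, so a single tree dying touches one group instead of the whole forest.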

In the case of local wind, you may not need full vertex precision, so you could just use a texture with wind information in it and read from that to get wind per vertex. That way you don’t need to recreate the static vertices and only change a small texture (the data will naturally interpolate on the GPU, so it is smoothed automatically).
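
The interpolation a linear texture filter performs can be mimicked on the CPU like this. It is a simplified Python sketch: `sample_wind` is a made-up helper, the wind grid is a small list of `(wx, wy)` rows, and `u`, `v` are normalized map coordinates clamped to `[0, 1]`. Real texture sampling also applies a half-texel offset, which is ignored here.

```python
def sample_wind(grid, u, v):
    """Bilinearly interpolate a 2D grid of (wx, wy) wind vectors at (u, v)."""
    h, w = len(grid), len(grid[0])
    # Map normalized coordinates onto the grid, clamping at the edges.
    x = min(max(u, 0.0), 1.0) * (w - 1)
    y = min(max(v, 0.0), 1.0) * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0

    def lerp(a, b, t):
        return a + (b - a) * t

    # Blend along x on the two neighbouring rows, then blend those results along y.
    top = [lerp(a, b, fx) for a, b in zip(grid[y0][x0], grid[y0][x1])]
    bottom = [lerp(a, b, fx) for a, b in zip(grid[y1][x0], grid[y1][x1])]
    return tuple(lerp(a, b, fy) for a, b in zip(top, bottom))
```

Even a very coarse grid covering the whole map gives every tree a smoothly varying wind vector, and only that grid has to be updated when the wind changes.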