Hello guys,
I have a performance problem with my raycasting (line-line intersection) implementation. If I leave the game running, the FPS starts dropping dramatically in a few seconds, below 10 fps (that’s the point I usually quit the game), even without displaying anything on the screen. Using the profiler, I narrowed down to my Cast function calls, which takes up most of the CPU time. I don’t create new objects, so it’s also not GC. I also got rid if every displayed objects, the attached screenshot is just to demonstrate how my scene looks, but when testing, I also turn off displaying the rays themselves and I only see the 5 walls on which the rays are stopping. If I turn off my raycasting logic, the problem disappears.
My situation is the following:
This is my scene:
I have 1 moving object which emits 72 ray objects. I’m reusing the rays, so no GC here (although the start position of the rays are changed as the object moves). I have 5 lines to intersect the rays, which are also created beforehand and reused, they are not moving during tests.
Meaning that in each Update() call, I call the Cast() function 5x1x72 times, which doesn’t seem that much to me.
This is the ray class:
class Ray2D
{
public Vector2 position;
public Vector2 direction;
public float angleRad;
private float x4;
private float y4;
private float den;
float t;
float u;
public Ray2D(Vector2 position, Vector2 direction)
{
this.position = position;
this.direction = direction;
this.angleRad = (float)Math.Atan2(direction.Y - position.Y, direction.X - position.X);
}
public Ray2D(Vector2 position, float angleRad = 0f)
{
this.position = position;
this.angleRad = angleRad;
this.direction = MathUtil.RadToVector(angleRad);
}
public Vector2 Cast((Vector2 from, Vector2 to) target) //this method kills my game
{
x4 = position.X + direction.X;
y4 = position.Y + direction.Y;
den = (target.from.X - target.to.X) * (position.Y - y4) - (target.from.Y - target.to.Y) * (position.X - x4);
if (den == 0)
{
return Vector2.Zero;
}
t = ((target.from.X - position.X) * (position.Y - y4) - (target.from.Y - position.Y) * (position.X - x4)) / den;
u = -((target.from.X - target.to.X) * (target.from.Y - position.Y) - (target.from.Y - target.to.Y) * (target.from.X - position.X)) / den;
if (t > 0 && t < 1 && u > 0)
{
return new Vector2(target.from.X + t * (target.to.X - target.from.X), target.from.Y + t * (target.to.Y - target.from.Y));
}
return Vector2.Zero;
}
}
and this is how I call it from the Update method:
public void UpdateRays()
{
foreach (Ray2D ray in rays) // 72 rays
{
ray.position = owner.GetPosition();
foreach (Entity e in Scene.Instance.GetEntityLayer().GetAll()) // 5 objects in the list
{
foreach ((Vector2, Vector2) line in e.GetRayBlockerLines()) //1 entry for each object
{
Vector2 intersection = ray.Cast(line);
}
}
}
}
Do you guys see some obvious performance bottleneck with my code? I’m pretty much stuck at this point, I don’t see how I could make this code run faster.
Thank you very much
Just out of curiosity, does commenting out the following code improve your performance:
// Vector2 intersection = ray.Cast(line);
Yup, when this line is gone, the problem is gone.
First off, a triple nested for loop is a bad smell. Not that you can always avoid this, it’s just the first thing that came to mind. However, if it’s only 72x5x1, that shouldn’t be your problem.
Having said that, you say you’re intending to re-use objects so no GC, but I see a couple of new calls in there, specifically in your main return. Also, your parameter syntax is unfamiliar to me. It’s neat, but I’ve never seen it before. From context, I’d expect it packages up those two vectors as a new type?
Maybe try avoiding this. In fact, maybe try writing your method signature as follows…
public void Cast(Vector2 from, Vector2 to, ref Vector2 result)
{
...
result.X = ... // Calculate X and store here.
result.Y = ... // Calculate Y and store here.
}
In this way you won’t actually allocate any new memory, you’ll just store the result back in the ref parameter. You can use a pool of objects that have been pre-allocated if you like.
Still, do me a favour… for science! Try just not using that parameter syntax you’ve got there. Go with this…
public Vector2 Cast(Vector2 from, Vector2 to)
{
...
}
I’m curious if that makes a difference.
1 Like
Alright guys, I’m totally stupid… I made a mistake in some other place in the code by calling a constructor in an Update method instead of just once during initialization, and it kept duplicating the lines to calculate and quickly got 10s of thousands of intersections to calculate in every update call… Now that I fixed it, I have tons of rays without any dropping in the FPS
Thank you very much for both of you!
But now that we are here:
First off, a triple nested for loop is a bad smell.
I do understand in general why it’s not a good practise, but in my case, I could not find any other way. I have to check every ray (first loop) against every object in my scene (second loop) and with all the lines that build up that polygon (third loop) to know if my ray hits them somewhere.
But if you have a better idea of how to do this, please let me know because I’d gladly eliminate this I mean, I can restructure the code myself as well, but it will just hide the underlying 3 loops, it won’t eliminate it
Sounds like you figured it out, way to go!
Yea, I hear ya. That’s why I said it’s a smell… something to look at, but not necessarily bad
Honestly, if that’s what you’ve gotta check, that’s what you’ve gotta check. It sounds like everything is working within your performance expectations, and that’s really key! If you find they stop working as expected, then I might suggest a few things…
- Try some kind of virtual world partitioning. This is a 2D scene, yea? Check out Quad Trees Instead of having to search through all objects in your scene, you can look up only the objects that are currently visible, or ones that are near by, for example.
- It sounds like there’s only one object you’re testing intersection with currently. If that expands, that inner loop is going to drastically reduce your performance. You might try doing an initial hit test using a bounding circle. Check to see if your ray first intersects a circle that contains all of the object’s geometry. If it does, only then proceed to check the rest of the geometry.
I would definitely stress that you only bother with these things once your performance dips below some threshold you define. Otherwise you could end up spending a lot of time on code that might not even matter in the long run. About the only thing I would suggest is design your system in such a way that you can easily swap out slower performing components with better ones when the need arises. You incur a bit of performance cost doing this, but the tradeoffs are usually well worth it.
Definitely benchmark though!