Calculating orbits.

MuntyScruntFundle · February 25, 2017, 8:13pm

I know how to rotate an object around another, that’s not the problem.

I want to animate several suns, each with planets, each planet with moons. So I have a:
List(Of Sprite) - which stores all the sprite information, rotation, spin, position etc. for all of the moving bodies.

The problem being the position of each planet and moon is dependent on it’s parent. This isn’t really a problem, the calculations are easy, however, how best to get the data required for the calculation.

For each planet and moon I’m storing an incrementing index that relates to it’s parent body, so for each loop through the list I have to check if the current sprite has a parent, if it has I need to loop through the list of sprites to find it. So in effect for a moon I’m looping the entire list 3 times to get the numbers needed for the calculations, Moon > Planet > Sun. If I have 5 suns each with 3 planets, each planet with 2 moons that’s a hell of a lot of looping for such tiny amounts of data. This isn’t efficient.

I’ve thought of 3 different ways of looping the data so far, each slightly better, but still hardly efficient.

Can anyone think of a way to cut this algorithm right down to save processing time?

Many thanks.

MuntyScruntFundle · February 25, 2017, 10:01pm

Sussed it.

I used an array to store the position of each body, then using the parent index of each body I can quickly get the position from the array rather than looping the list. Then update the position, also updating the position in the array. Job done.

If anyone has any other ideas it would be good to try them out.

Cheers.

GeonBit · February 25, 2017, 10:57pm

What you’re describing sound a lot like 2d bone animations, eg a tree of nodes with transformations. Basically every star is a node with local transformations of its own (position, rotation, scale…), but its final transformations are its parent + self. Eg if you have a node at position 10,0 and add a child to it at position 0,5, the child final position (aka world position) would be at 10,5.

Why is it good for you? You can create a sun at position 0,0 (canter) and attach a child star to it at 200,0, eg in a distance of 200 pixels from it. When you rotate the sun, the star will rotate around it while maintaining the distance of 200 px (in other words, orbit around it). Then you can add a moon as a child of the star and add rotation the same way.

I hope it won’t come out as a blatant self promotion but I wrote a library that do something like that

Edit: made a quick example to demonstrate what I mean:

Code to create the stars:

        // create sun
        sun = new Sprite(Content.Load<Texture2D>("sprite/star"), position: new Vector2(350, 240));
        sun.Color = Color.White;
        sun.ScaleScalar = 1f;

        // create star
        star = new Sprite(Content.Load<Texture2D>("sprite/star"), position: new Vector2(200, 0), parent: sun);
        star.Color = Color.Purple;
        star.ScaleScalar = 1f;

        // create moon
        moon = new Sprite(Content.Load<Texture2D>("sprite/star"), position: new Vector2(100, 0), parent: star);
        moon.Color = Color.Blue;
        moon.ScaleScalar = 0.5f;

Code to draw / rotate them:

        // draw the sprite
        spriteBatch.Begin();
        sun.Draw(spriteBatch);
        spriteBatch.End();

        // rotate
        sun.Rotation += 0.01f;
        star.Rotation += 0.07f;

Dismiss · February 26, 2017, 5:47pm

The alternative to your technique of working with flat arrays, is to iterate the tree from the top down by following actual objects rather than indices. (ie, root->computetransform() -> foreach child->computetransform() -> …).

I’m guessing you are working with structs or just value types since youa re working with indices. That’s probably what you want to be doing if performance is a concern.

In a random persons game code though, you are much more likely to see them iterating through a heirarchy of classes to perform that sort of operation, just because it was easier to write (and to some extent debug) and they haven’t even thought about performance yet '-p

GeonBit · February 26, 2017, 7:45pm

How’s iterating a list better performance than walking a tree? (not counting peripheral stuff like function calls etc)

LithiumToast · February 27, 2017, 12:08am

See
http://gameprogrammingpatterns.com/dirty-flag.html

There is an implementation for what was discussed above here.

MuntyScruntFundle · February 27, 2017, 1:18pm

Thanks folks.

The beauty of lists is their simplicity and speed, adding an item is simple, removing an item doesn’t create a hole. The problem with VB lists is they are one directional, you can’t run them backwards without creating an array, loading the list then running the array backwards into another list. This is obviously unwanted processor usage.

RonanNess, I will take a look at that as I’m intrigued how you’re linking the objects in the movement calculations. Many thanks.

Jjagg · February 27, 2017, 3:12pm

The List implementation in .NET uses an array internally, so it’s only beautiful from an end-user perspective There’s a LinkedList in .NET too, but that’s often not preferable for performance reasons.

kosmonautgames · February 27, 2017, 3:51pm

You can use the internal list array e.g. MyList[3]

And you should do that for iterators anyways, since for(…) is more performant than foreach(…)

GeonBit · February 27, 2017, 4:10pm

If you want to implement it yourself or just curious about the transformations, the part that is interesting for you is here:

github.com

RonenNess/MonoGame-Sprites/blob/master/MonoSprites/Source/Transformations.cs

#region File Description
//-----------------------------------------------------------------------------
// Represent drawing transformations - position, scale, rotation, etc..
// Based on code from here:
// http://www.catalinzima.com/2011/06/2d-skeletal-animations/
//
// Author: Ronen Ness.
// Since: 2017.
//-----------------------------------------------------------------------------
#endregion
using Microsoft.Xna.Framework;

namespace MonoSprites
{
    /// <summary>
    /// Renderable transformations - position, scale, rotation, color, etc.
    /// </summary>
    public class Transformation
    {
        /// <summary>

This file has been truncated. show original

Especially the function public static Transformation Compose(Transformation a, Transformation b).

If you look at the top page comment, the transformations code is based of this tutorial

Which is a good reading material.

Regarding the transformation tree vs a list

You build your transformations as a list of entities and a list that tells you order, eg who’s parent of who, right? In a way its pretty much the same as with a tree, only sliced as lists. In my code every sprite / entity have a list of children, and drawing goes recursively (draw self, and then iterate on children who draw themselves and their children, etc etc.)

Every entity got local-transformations (eg position, rotation, scale etc. of itself), and world-transformations which is its local + parent world transformations. The world transformations represent the final world position, rotation, etc of the entity, the final result.

How an update goes? When a transformation property changes on a sprite local transformation (for example when you change sprite position), it becomes "dirty". On the next Draw() call of this entity, we will recalculate its world transformations, eg will take its new “dirty” transformations and combine them with its parent. We cache those world transformations and mark as no-longer dirty, and we use them to draw the entity.

But since now we changed our own world transformation and we might have children who relay on it, after updating self transformations we will immediately iterate over our children and make them dirty, so the next time they draw they will update themselves and mark their children as dirty. It goes recursively all the way down.

Drawing comes after transformations update, so its always up-to-date. And due to the caching and dirty flags, only transformations that require an update will recalculate, eg static entities will not cause any work (note: static in this sense mean their parents are static too).

I hope my explaination was clear enough

As for the question of “list vs tree”, its totally up to you (I think both options are reasonable) but imo it’s so much easier to work with trees (every entity knows its children and its own parent), in both terms of debug and code readability. I think its also easier to extend and maintain, and if you decide you want different, wacky kind of entities that transform in an unexpected way, its easier to do. Since I don’t think there’s any meaningful performance impact to any side, I’d go with trees. But like I said, its a matter of preference.

Dismiss · February 28, 2017, 12:11am

Memory cache coherency. Ideally you want to be iterating through contiguous memory, sequentially, which is only possible to do when working with an array (or list that is just an array under the hood) of value type elements.

If you are iterating through your items as a tree, then that is not sequential memory access. Or if your items are not value types, then they could be anywhere in memory.

So, that is the trade off I was trying to explain in my initial post, make sense?

willmotil · February 28, 2017, 4:05am

Ya ill chime in on this list discussion.
The nice thing about lists is you can use them just like arrays, but get all of the features of a generic.
They are fast too when used with the indexers.
Lists pre allocate contiguous chunks of memory and can be pre-set with a size just like a array that will most likely be allocated contigously this is a important point.

(Contiguous doesn’t always equal optimal memory location in many cases chunks are better then a single contiguous block and i wont goto into why. If you don’t know already then chances are you won’t understand when i say ‘positioning the allocation of large contiguous memory blocks in fragmented memory’)

With lists and class object instances they also take and return val references.
Making copying not always straight forward and i think this scares some people off.

Also

How’s iterating a list better performance than walking a tree?

It’s not iterating a linked list from top down is walking a tree.
There is no difference other then in the implementation of either choice.

The dirty flags is interesting i would have never known if not for reading this.
That forever when i do this and use the word HasThisThingChanged that it was actually a big deal and even a pattern.

Last but not least let me point out the reality here that this specific structure is described as static with 4 levels of transformations as such the following logic would be optimal.

Update all Systems set a flag showing it HasUpdated i mean it is dirty lol.

Then update the next level down the suns or whatever. if it is not Changing by user or algorithm, check if its parent HasUpdated if so update using its parent as well it and set its flag that it HasUpdat … i mean it is dirty and needs a scrubbing :).
The pattern repeats till you run out of objects to update…

If its in a table a list a linked list or a structure or even a self registering class the best choice is to sweep everything at most one time marking and sweeping with dirty flags is still traversing. Reading that article it could be misinterpreted as mark then sweep instead of mark and sweep when optimal.

If i were to do this Id use a list of lists and a seperate list to mark the dirty level for each system instead of on each actual instance because a system itself is the roots and the first list level has a count equal to the systems.

Sorry if this got long winded i was bored anyways.
The edge case is access order and need of control of that access order in edge cases were a access b to get values to set c so it can access a to find look ups for e f g and though you don’t update the major say matrices or values search and access are redundant even if you know e f g could have been directly accessed.
So.
I would want to choose how i used it. In either manner depending on the situation. From time to time i do, something similar to the below.

I have a generic class to do this somewhere that’s way more refined.

I wouldn’t actually recommend this. You would really need to fully understand it or when you got a bug it can be a real crazy bug especially with copying by value as it basically works on references and once instances are set they are sort of really set, you wouldn’t understand it and no one else would probably.
I just wanted to post it cause its cool and i have been using it here and there for a while its my own little pattern. Its rare to need it and this is one of those things. Where here i would probably dig it out and use it. Though this is just the basic idea it wont run as is. You could hide the id’s thru encapsulation completely or do a number of things using them.

pseudo code.

// self registering direct id linked list oh ya... with dirty flags
List<LinkedTransforms> solarSystems;
public class LinkedTransforms
{
public static List<LinkedTransforms> selfRegisteringConstructionReferences;
bool IsRoot = false;
bool HasChanged = false;
private static int currentInstantiationId =0;
int Id =0; // private set public get
LinkedTransforms Parent;
List<LinkedTransforms> Children;
Matrix me;
// ill point out you can traverse with this = parent or child in non static methods

public static LinkedTransforms Create(out int id, Matrix m)
{
 LinkedTransforms i = new LinkedTransforms();
 i.id = LinkedTransforms.currentInstantiationId;
 LinkedTransforms.currentInstantiationId ++;
 i.me = m;
 selfRegisteringConstructionReferences.Add(i);
 return selfRegisteringConstructionReferences[id];
}
public static LinkedTransforms CreateChild(out int ChildId, int parentId, Matrix m)
{
 LinkedTransforms i = new LinkedTransforms();
 i.id = LinkedTransforms.currentInstantiationId;
 LinkedTransforms.currentInstantiationId ++;
 i.me = m;
 selfRegisteringConstructionReferences[parentId].Children.Add(i);
 selfRegisteringConstructionReferences[i.Id].Parent = selfRegisteringConstructionReferences[parentId];
 ... blah blah probably forgetting things...
 return selfRegisteringConstructionReferences[i.Id];
 // now you have both a linked list by reference and a direct access array.
 // the drawback here is complexity as it is now complex 
 // it is using both ids and reference linking so it can operate in either way via traversal or direct access 
// it has flags, as well can move thru parent or child via this
// it can direct access via index or loop the entire list directly thru id's or in sequence by the self registered list.
// objects can be added in real time
// the only real drawback is the list needs to be rebuilt or a dead flag must be used on deletes 
// a dead alive flag would be required to prevent garbage and maintain the unique id's integrity.


}
public void Update(Vector3 move, Vector3 rotate)
{ ... 
// has changed as previously described ect 
}
}

.

LithiumToast · February 28, 2017, 6:56am

Using linked lists can be done implicitly by keeping a parent reference for each game object / entity.

Using the dirty flags pattern in combination with the observer pattern, each game object / entity can notify its children when its local transform changed resulting in each of the child game objects / entities to mark their world transforms as dirty. By having each game object / entity keep track of its parent through a reference, the dependency tree can be walked using a simple while loop to re-calclate and cache (only if necessary) the world transform at each node. The result is that no formal linked list is required since all that is required is an implicit linked list through parent references for each game object / entity.

This is exactly what the code I linked above does for 2D transforms. Matrix2D in the above code is nothing but a smaller memory footprint for a Matrix for 2D purposes.

Also note that in C#, an array of objects does not fulfill cache coherency since each element in the array is just a reference to an object. With C# 7 features, it will be much more feasible to have game objects / entities using structs allowing for the cache coherency optimization to be re-visited in more depth with a pragmatic API design.

GeonBit · February 28, 2017, 9:33am

I understand what you mean but I think it doesn’t make much sense in this context. Consider the following code:

using System;
using System.Diagnostics;

class Test
{
    // child
    public Test child;

    // random action
    public int DoSomething()
    {
        int a = 2 + 5;
        return a + 1;
    }

    // random action recursive
    public int DoSomethingRec(ref int total)
    {
        if (child != null) { child.DoSomethingRec(ref total); }
        total++;
        return 1;
    }
}


namespace ConsoleApplicationTest
{
    class Program
    {
        static void Main(string[] args)
        {

            // create the array / tree
            int arraySize = 10000000;
            Test[] test1 = new Test[arraySize];
            Test last = null;
            for (int i = 0; i < arraySize; ++i)
            {
                test1[i] = new Test();
                if (last != null) { last.child = test1[i]; }
                last = test1[i];
            }

            // method 1 - iterate array
            {
                int total = 0;
                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int x = 0; x < 100; ++x)
                {
                    for (int i = 0; i < arraySize; ++i)
                    {
                        total++;
                        test1[i].DoSomething();
                    }
                }
                sw.Stop();
                Console.WriteLine("Test array={0} [total calls={1}]", sw.Elapsed, total);
            }
            {
                // method 2 - iterate tree
                int total = 0;
                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int x = 0; x < 100; ++x)
                {
                    Test curr = test1[0];
                    while (curr != null)
                    {
                        total++;
                        curr.DoSomething();
                        curr = curr.child;
                    }
                }
                sw.Stop();
                Console.WriteLine("Test tree={0} [total calls={1}]", sw.Elapsed, total);
            }
            {
                // method 3 - iterate tree recursive
                // note: had to break depth so we won't hit recursive limit, but total calls is the same (see code below).
                test1[10000-1].child = null;
                int total = 0;
                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int x = 0; x < 100000; ++x)
                {
                    test1[0].DoSomethingRec(ref total);
                }
                sw.Stop();
                Console.WriteLine("Test recursion={0} [total calls={1}]", sw.Elapsed, total);
            }
        }
    }
}

Results:

Test array=00:00:06.1396584 [total calls=1000000000]
Test tree=00:00:06.0975110 [total calls=1000000000]
Test recursion=00:00:07.1265290 [total calls=1000000000]

First, you’d be surprise to know that method 1 and 2 yield the same result (run it multiple times, sometimes tree is better and sometimes array is better). So if we’re talking about plain Array vs Tree and iterating children, its the same.

But you might say its not a fair comparison, because you imagine iterating array as loops but walking a tree as recursive calls. Although its not mandatory, I still added method 3 that use recursive function call, which was slightly less than a second overhead over array / tree .
Now you might want to say that a second is a lot, but here comes the context bit - a second for recursively walking 1000000000 objects. that’s 1 billion objects to go through, without culling. If he’s planning to iterate 1 billion entities every frame of his game, the difference between 6 and 7 seconds is not really the issue.

If I shrink it down to a more realistic size (yet still pretty large and something that should be culled), lets say, 10,000 objects, array and tree are still the same, recursive function is few ms slower (at some runs).

If I made any mistakes with my experiment let me know, but I believe even if I did - the difference would still be neglectable and simply not worth the optimization. If he want to draw billions of stars he should be focusing on how to cull and group them so he can process less of them instead of how to iterate them faster.

tl;dr the performance between array and tree is pretty much the same, and the difference between walking a tree recursively and iterations is not even worth to look at in real life. If you are writing the next algo-trading algorithm where every milisecond is the difference between make or break and you need to run billions of calculations per second? sure, go for it. but for a 2d game in C#? Not worth uglifying the code.

LithiumToast · February 28, 2017, 3:02pm

@GeonBit i don’t think your tests really capture the scenario of cache misses.

See http://gameprogrammingpatterns.com/data-locality.html

Dismiss · February 28, 2017, 10:11pm

@GeonBit
A better test would be if Test were a struct, and the method on it had to access a field.
@LithiumToast
Nice link!

GeonBit · February 28, 2017, 11:30pm

@LithiumToast

interesting read, thanks! Indeed the test wasn’t complete.

@Dismiss

Yes using a class instead of a struct was a fault I made. Out of curiosity I updated the test. First I added a field which is used inside the functions (still a class though). I’m on a different (much stronger) PC now so all results are better, so lets compare ratio:

Test array=00:00:03.6223599 [total calls=1000000000]
Test tree=00:00:03.5463660 [total calls=1000000000]
Test recursion=00:00:05.2259963 [total calls=1000000000]

Looks like recursive calls is a little less efficient here for some reason.
Now lets replace with proper structs and test only the array iteration, comparing it to the results above.

Code:

using System;
using System.Diagnostics;

struct Test
{
    //public Test child;
    public int testfield;

    // random action
    public int DoSomething()
    {

        int a = 2 + 5;
        return a + testfield;
    }

    // random action recursive
    public int DoSomethingRec(ref int total)
    {
        //if (child != null) { child.DoSomethingRec(ref total); }
        total++;
        return 1;
    }
}


namespace ConsoleApplicationTest
{
    class Program
    {
        static void Main(string[] args)
        {

            // create the array / tree
            int arraySize = 10000000;
            Test[] test1 = new Test[arraySize];
            for (int i = 0; i < arraySize; ++i)
            {
                test1[i].testfield = i;
            }

            // method 1 - iterate array
            {
                int total = 0;
                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int x = 0; x < 100; ++x)
                {
                    for (int i = 0; i < arraySize; ++i)
                    {
                        total++;
                        test1[i].DoSomething();
                    }
                }
                sw.Stop();
                Console.WriteLine("Test array={0} [total calls={1}]", sw.Elapsed, total);
            }
        }
    }
}

Output:

Test array=00:00:03.4940627 [total calls=1000000000]

Summary of all runs:

Test array=00:00:03.6223599 [total calls=1000000000]
Test tree=00:00:03.5463660 [total calls=1000000000]
Test recursion=00:00:05.2259963 [total calls=1000000000]
Test array with structs=00:00:03.4940627 [total calls=1000000000]

Still pretty much the same. Different runs give different results, each time different method is winning. Maybe it still doesn’t emulate cache miss correctly? Not sure.

Anyway I still think this optimization (assuming it works) doesn’t worth uglifying the code for, but I’m genuinely curious about seeing the difference between cache miss or not.

If you have any ideas or manage to make it work let me know, I think the experiment is still somehow faulty, because I do expect to see some difference.

Edit:

I also tried replacing the field with a class instance and bloat the original class/struct with longs, something like:

public long a;
public long b;
public long c;
public long d;
public long e;
public long f;
public long g;
public long h;
public long i;
public long j;
public long k;

Still same ratio, but the variations between different runs became bigger. Maybe the struct just have to be really big for it to make a difference.

Dismiss · March 1, 2017, 2:23am

Your classes are actually going to be very contiguous in memory in this test application because there is no memory fragmentation, no other threads, and you allocate them all at the same time during startup. That is in fact an equivalent way of accomplishing the same goal without having to use structs.

To generate real cache misses, you would need to either be…

iterating the array of structs in a random order
or
iterating an array of classes in linear order, but they were allocated in a random order

GeonBit · March 1, 2017, 9:06am

The pointer to classes should be enough back and forth though since I first allocate the entire array of pointer and then creating the classes (its a chunk of pointers followed by a chunk of classes). Maybe C# got internal optimization I’m not aware of that solve this.

I tried this, but you wouldn’t like the results first the code:

using System;
using System.Diagnostics;
using System.Linq;
using System.Collections.Generic;
using System.Collections.Concurrent;

class Test
{
    // child
    public Test child;
    public long b = 1;
    public long c = 2;
    public long d = 3;
    public long e = 4;
    public long f = 5;
    public long g = 6;
    public long h = 7;

    // random action
    public int DoSomething()
    {
        int a = 2 + 5;
        return a + 1;
    }

    // random action recursive
    public int DoSomethingRec(ref int total)
    {
        //if (child != null) { child.DoSomethingRec(ref total); }
        total++;
        return 1;
    }
}


namespace ConsoleApplicationTest
{
    class Program
    {
        static void Main(string[] args)
        {

            // create the array / tree
            int arraySize = 10000000;
            Test[] test1 = new Test[arraySize];
            Test last = null;
            Test root = null;
            Random r = new Random();
            foreach (int i in Enumerable.Range(0, arraySize).OrderBy(x => r.Next()))
            {                
                test1[i] = new Test();
                if (last != null) { last.child = test1[i]; }
                last = test1[i];
                if (root == null)
                {
                    root = test1[i];
                }
            }

            // to make sure tests are "clean" after init code
            GC.Collect();
            GC.WaitForPendingFinalizers();

            // method 1 - iterate array
            {
                int total = 0;
                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int x = 0; x < 100; ++x)
                {
                    for (int i = 0; i < arraySize; ++i)
                    {
                        total++;
                        test1[i].DoSomething();
                    }
                }
                sw.Stop();
                Console.WriteLine("Test array={0} [total calls={1}]", sw.Elapsed, total);
            }
            {
                // method 2 - iterate tree
                int total = 0;
                Stopwatch sw = new Stopwatch();
                sw.Start();
                for (int x = 0; x < 100; ++x)
                {
                    Test curr = root;
                    while (curr != null)
                    {
                        total++;
                        curr.DoSomething();
                        curr = curr.child;
                    }
                }
                sw.Stop();
                Console.WriteLine("Test tree={0} [total calls={1}]", sw.Elapsed, total);
            }
        }
    }
}

Note that I removed the recursion since its complicated now with the random order. now results (I’m back on the slower PC):

Test array=00:00:46.1011670 [total calls=1000000000]
Test  tree=00:00:09.6252371 [total calls=1000000000]

Looks like in this case tree is a lot faster. Really not sure why tbh. I was expecting them to be pretty much the same. I wonder if C# do something like “load related” internally, eg preparing the child pointer upfront before fetching it. In this case I suspect having an array of children and not a single child (more realistic case) would be just as slow as with the array.

Ofc if using an array he can create a pool of objects up-front with everything allocated in a single batch, but in this case its like the original test where tree and array are the same.

Dismiss · March 1, 2017, 10:27am

Thats… unexpectedly extreme.
I’m suspicious o.0

The only other factor I can think of that we haven’t already discussed is the cost of doing ‘out of bounds’ testing on each indexer operation. A linked list doesn’t have that overhead.

(note: to compare an actual ‘tree’ not a linked list, there would need to be more than one child per item (min two or its not a tree)).