PBD Fluids, Compute Shaders and MonoGame

In order to continue with my project The SciTech Playroom it has been necessary that I include reasonably believable interactive fluids. Therefore, the last couple of months have seen me researching and experimenting with various means by which this can be accomplished.

I have chosen Position Based Dynamics (PBD) particles, combined with the Smoothed Particle Hydrodynamics (SPH) algorithm to provide the liquid properties. I was able to find two Unity implementations of PBD posted on GitHub. One was CPU based: Position-Based-Dynamics, and the other was GPU based, using Compute Shaders: PBD-Fluid-in-Unity.
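
For readers new to SPH, here is a minimal sketch of the density estimate that gives the particles their liquid behaviour. It assumes the poly6 smoothing kernel that is commonly used in SPH fluid solvers; the class and method names (`SphSketch`, `Poly6`, `ComputeDensities`) are made up for illustration and are not taken from either Unity project. It uses a brute-force neighbour search, which a real solver would replace with a spatial grid or hash.

```csharp
using System;
using System.Numerics;

// Hypothetical helper, for illustration only.
public static class SphSketch
{
    // Poly6 kernel: W(r, h) = 315 / (64 * pi * h^9) * (h^2 - r^2)^3 for r <= h, else 0.
    // Takes the squared distance so that no square root is needed.
    public static float Poly6(float rSquared, float h)
    {
        float h2 = h * h;
        if (rSquared > h2) return 0f;

        float diff = h2 - rSquared;
        float h9 = h2 * h2 * h2 * h2 * h;
        return 315f / (64f * (float)Math.PI * h9) * diff * diff * diff;
    }

    // Density at each particle: the kernel-weighted sum of the masses of all
    // particles within the smoothing radius h (including the particle itself).
    public static float[] ComputeDensities(Vector3[] positions, float mass, float h)
    {
        var densities = new float[positions.Length];
        for (int i = 0; i < positions.Length; i++)
        {
            float sum = 0f;
            for (int j = 0; j < positions.Length; j++)
            {
                float r2 = Vector3.DistanceSquared(positions[i], positions[j]);
                sum += mass * Poly6(r2, h);
            }
            densities[i] = sum;
        }
        return densities;
    }
}
```

The double loop is O(n²), which is one reason a single-threaded CPU version slows down so quickly as the particle count grows.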

The challenge for me was, firstly, to port the Unity C# code to MonoGame C#, and secondly, to implement Compute Shaders in the MonoGame code of my demo app. If successful with these two tasks, the next step was to optimise so as to get the speed boost required for interactive liquids.

I’m pleased to say that all the above has now been accomplished. They say that “the proof of the pudding is in the eating” so here are a couple of vids to show the results.

CPU based implementation (2000 particles @ 2fps)

GPU based implementation using Compute Shaders (4500 particles @ 59fps)

The CPU based implementation is certainly NOT the way to go.
I will thus be following through on the GPU implementation.
Further work will be to add some kind of surface rendering to the particles (Marching Cubes? Volume Rendering?) to give a more cohesive liquid look.

My thanks to ‘Scrawk’ for making the Unity code freely available.


Which CPU?

Could help as a benchmark

CPU: Intel® Core™ i5-4200H @ 2.8 GHz, 8.0 GB RAM

GPU: NVIDIA GeForce GTX 950


If you aren’t already doing some form of particle instancing, you may want to see if that can help improve your results: GL instanced shader problem.
See the bottom of that thread. Although the code I posted there creates a million particles, I only had about a third of that number on screen, which is still in the hundreds of thousands.

Thanks for your suggestion, willmotil.
I actually do have an instancing option available in the above demos, but did not have it set, as I wanted to render the particles as 3D spheres (using the MonoGame BasicEffect shader) so as to distinguish them more easily. With instancing, the particles were rendered with a plain texture only and no normals, so they appeared as flat color and could not be individually distinguished. Anyway, the instancing did not enable any dramatic increase in speed. I think that, in the case of the CPU example, the very intensive mathematical operations on a single thread accounted for the big slowdown in framerate. Possibly moving these operations onto a separate thread might help increase the framerate, but CPU multithreading is not something I want to venture into.

I suspected you were not on an i7 :slight_smile:

Consider upgrading sometime soon when possible. Modern i5s are six-core now, and any i7 would wreak havoc on any code you throw at it these days, but wait for the 10th gen lines to hit the market properly.

Keep coding!

Ah, sounds like you’re CPU bound.

I have a standalone frame rate class that you can drop into your project to check.
It displays the draws and updates, and the draw-to-update ratio as well if you like.
You can instance 3D objects the same as 2D, though, and with the BasicEffect shader.

Maybe you can move the math off the main thread by spawning a bunch of tasks with a monitored lock,
or by using a parallel for loop. Though I’m not really the one to ask, as I haven’t done any difficult threading in forever. But I believe these are the newer constructs geared toward just this type of thing: getting the CPU to burst out a bunch of functions across all its resources.

    // ______________________________________________________________
    // Example11_Parallel_Loops
    // https://docs.microsoft.com/en-us/dotnet/api/system.threading.tasks.parallel.for?view=netframework-4.8
    // ______________________________________________________________

    using System;
    using System.Threading.Tasks;

    public class Example11_Parallel_Loops
    {
        public static void RunTest()
        {
            Console.WriteLine("\n Example11_Parallel. \n This one uses the parallel class. \n This example demonstrates several approaches to implementing a parallel loop using multiple language constructs. \n");

            // Run 100 independent iterations; the runtime spreads them across the available cores.
            ParallelLoopResult result = Parallel.For(0, 100, ctr =>
            {
                // Each iteration creates its own Random, because Random is not thread-safe.
                Random rnd = new Random(ctr * 100000);
                Byte[] bytes = new Byte[100];
                rnd.NextBytes(bytes);

                // Sum the random bytes generated for this iteration.
                int sum = 0;
                foreach (var byt in bytes)
                    sum += byt;
                Console.WriteLine("Iteration {0,2}: {1:N0}", ctr, sum);
            });
            Console.WriteLine("Result: {0}", result.IsCompleted ? "Completed Normally" : String.Format("Completed to {0}", result.LowestBreakIteration));
        }
    }

This one, using tasks, seems pretty fast as well:

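        // Requires: using System; using System.Collections.Generic;
        //           using System.Threading; using System.Threading.Tasks;
        public void Example1_TasksAndMonitor()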
        {
            Console.WriteLine("\n Example1 \n");
            List<Task> tasks = new List<Task>();
            Random rnd = new Random();
            long total = 0;
            int n = 0;

            for (int taskCtr = 0; taskCtr < 10; taskCtr++)
                tasks.Add(Task.Run(() =>
                {
                    int[] values = new int[10000];
                    int taskTotal = 0;
                    int taskN = 0;
                    int ctr = 0;
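                    // Random is not thread-safe, so serialize access to the shared instance.
                    Monitor.Enter(rnd);
                    try
                    {
                        // Generate 10,000 random integers
                        for (ctr = 0; ctr < 10000; ctr++)
                            values[ctr] = rnd.Next(0, 1001);
                    }
                    finally
                    {
                        // Always release the lock, even if an exception is thrown.
                        Monitor.Exit(rnd);
                    }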
                    taskN = ctr;
                    foreach (var value in values)
                        taskTotal += value;

                    Console.WriteLine("Mean for task {0,2}: {1:N2} (N={2:N0})",
                                      Task.CurrentId, (taskTotal * 1.0) / taskN,
                                      taskN);
                    Interlocked.Add(ref n, taskN);
                    Interlocked.Add(ref total, taskTotal);
                }));
            try
            {
                Task.WaitAll(tasks.ToArray());
                Console.WriteLine("\nMean for all tasks: {0:N2} (N={1:N0})",
                                  (total * 1.0) / n, n);

            }
            catch (AggregateException e)
            {
                foreach (var ie in e.InnerExceptions)
                    Console.WriteLine("{0}: {1}", ie.GetType().Name, ie.Message);
            }
        }
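
Applied to a particle simulation, the parallel loop idea might look like the sketch below. This is only an illustration under my own assumptions, not code from the demo: the class name `ParallelParticleSketch`, the method `PredictPositions` and the array layout are all hypothetical. It shows the usual first step of a PBD update, where each particle's velocity and predicted position are advanced independently, which makes it safe to run with `Parallel.For` without any locks.

```csharp
using System;
using System.Numerics;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class ParallelParticleSketch
{
    // Applies gravity and predicts the next position of every particle.
    // Each iteration reads and writes only index i, so no locking is needed.
    public static void PredictPositions(
        Vector3[] positions,
        Vector3[] velocities,
        Vector3[] predicted,
        Vector3 gravity,
        float dt)
    {
        Parallel.For(0, positions.Length, i =>
        {
            velocities[i] += gravity * dt;
            predicted[i] = positions[i] + velocities[i] * dt;
        });
    }
}
```

Steps that read neighbouring particles (density, constraint solving) can be parallelised the same way as long as each iteration writes only to its own particle and reads from a buffer that is not being modified in the same pass.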

Thanks for your helpful comments, willmotil…but I’m putting CPU multi-threading on the backburner for now.
ComputeShaders are currently meeting my needs so I will be travelling that road until the next hurdle :slight_smile:

Compute is definitely the way to go. You can use unsafe blocks in C# to pin memory and get pointer access for these sorts of things, but it’s still a piss-poor slow excuse compared to C++ (where you’ll then butt up against, “well damn, memcpy is 20% of my CPU time drawing these 62,500 Stanford rabbits”). Compute and indirect-draw for the win.

Do you have any details on how you got compute shaders working under MonoGame that you would be willing to share, please?

Personally I use the same tweaks as my DX11 GS related personal fork. As long as MG exposes access to the raw objects (or is forced to expose them) it’s trivial to use it.

Though 30k draw-call indirect-draw scenarios are never going to happen in C# land. Hell, 1000 draws is mind blowing when C# slower-than-shit-out-an-asshole is involved.

.NET as a whole is incompetently slow. If you write genuinely fast code you just guarantee that VS will corrupt your projects in a few hours. Handling of unsafe blocks is still poor 20 years after being promised multiple inheritance in C#.

Never going to forgive them for not delivering inheritance in C#2.0. Looked me in the fucking eye and lied to me.

Just saying, try to control language please, there are minors in here sometimes.

The 4 months since my last post have been a bit of an uphill battle for me.
I made many trial and error attempts to overcome the issues I encountered in applying Compute Shaders to MonoGame in a satisfactory way.
These issues were mainly due to MonoGame lacking some of the functionality needed to integrate smoothly with SharpDX’s Compute Shaders (e.g. no access to a Compute Shader’s StructuredBuffers, RWTextures and ConstantBuffers, no easy access to Geometry Shaders, and no Vertex Texture Fetch (VTF) in MonoGame’s vertex buffers).

The other problem I had was that my fast Win10 laptop gave up the ghost, and I’ve had to
revert to my old slower Win7 desktop. I was, however, somewhat surprised that I was able to still squeeze out a fairly satisfactory performance.

All this resulted in my many attempts at workarounds to overcome these hurdles.
Hardware instancing with billboarding of the particles so that they could be viewed at
any camera position also consumed much of my time.
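
For anyone unfamiliar with billboarding, the idea can be sketched as follows. This is a minimal CPU-side illustration under my own assumptions, not the code from the demo: `BillboardSketch` and `QuadCorners` are hypothetical names, and `System.Numerics` is used here because its matrices follow the same row-vector convention as MonoGame's. In an instanced setup the same arithmetic would normally live in the vertex shader, with the particle centre supplied per instance.

```csharp
using System;
using System.Numerics;

// Hypothetical helper, for illustration only.
public static class BillboardSketch
{
    // Builds a camera-facing quad around a particle centre.
    // The camera's right and up axes are the first two columns of the view matrix.
    // Returns corners in the order: bottom-left, bottom-right, top-right, top-left.
    public static Vector3[] QuadCorners(Vector3 center, float halfSize, Matrix4x4 view)
    {
        Vector3 right = new Vector3(view.M11, view.M21, view.M31);
        Vector3 up = new Vector3(view.M12, view.M22, view.M32);

        Vector3 r = right * halfSize;
        Vector3 u = up * halfSize;

        return new[]
        {
            center - r - u,
            center + r - u,
            center + r + u,
            center - r + u
        };
    }
}
```

Because the quad is rebuilt from the current view matrix, it always faces the camera, whatever the camera position.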

Google and the MonoGame Community Forum helped me immensely in this regard.
My thanks to all contributors.

As a result of the above issues, I have not been able to produce the framerates and
particle numbers that one would expect, but 7000 particles at 30fps on an old Win7 machine
suits me just fine for my particular project The SciTech Playroom for which I especially wanted to use MonoGame.

In my opinion, Compute Shaders would add immense benefits to MonoGame, and perhaps the devs should look into their future inclusion.

As an aside, I’ve been mulling over the idea of putting together an e-book ‘cookbook’ of
my journey in the relatively uncharted region of applying BulletSharp and Compute Shaders to MonoGame. I’m a simple man and I like simple solutions, so what I provide will be based on that concept. However, I would have to charge a small monetary amount, as I’m an ‘old’ man (though still young at heart) with a rapidly dwindling pension. Any thoughts and comments on this would be greatly appreciated.
