Wasted clock cycles in DrawBatch()?

OK, first of all: I'm not pretending to understand these things well at all, so this is just me wanting to learn how to think better about them.

In the DrawBatch method in the DrawBatcher class I’ve noticed this at the top:

    // nothing to do
    if (_batchItemCount == 0)
        return;

Is it not extremely rare for DrawBatch to be called with no batch items?

Since the main bulk of that method is inside this while loop:

    // Iterate through the batches, doing short.MaxValue sets of vertices only. 
    while (batchCount > 0)
    {
         ...
    }

Wouldn't it be acceptable to let those presumably exotic empty-batch calls run through the method? That would mean one fewer if statement in this critical code.

Again, might be there for good reason, but it would be cool to understand why…

Cheers

That check takes such an insignificant amount of time that it’s not even worth worrying about optimizing it.

Why are you worried about the performance of that? Are you running into a performance issue?

I did a little bit of looking and these days it’s very difficult to tell how many cycles a given instruction takes, but in the example above, I would assume around 3 cycles total (Load Value, Compare to 0, Jump if Equal).

So on a 3 GHz processor, removing it would save roughly one nanosecond per call, a vanishingly small fraction of the time spent in that function.
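To put rough numbers on that (using the 3-cycle estimate above, and assuming purely for illustration that DrawBatch is called once per frame at 60 fps):

    3 cycles / 3,000,000,000 cycles per second ≈ 1 nanosecond per call
    60 calls per second × 1 ns ≈ 60 nanoseconds of overhead per second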


Haha ok wow! Those numbers definitely put things in perspective. I won’t lose any sleep over that percentage…

Just playing devil’s advocate at this point: Is it fair to say that it’s making an exotic case run an insignificant bit faster by making a very common one an insignificant bit slower?

For optimization, the number one rule is "Only optimize what needs to be optimized". If you just start randomly picking things to optimize, you'll waste a lot of time and resources for very little gain.

The most significant increases in performance are those that change the way you are doing something rather than optimizing the code itself.

You should always profile your code and establish a baseline to find out what is slow, and start there. Once you find the "slow" spots, take a look at how you are doing things and see if there is a more efficient way of doing them.


If you have any doubts about that, it's good to have a little test app where you can try things out. A simple WinForms application with a single form will do. Put a pair of method stubs on there along with a Stopwatch object for timing. I like to run method 1, then method 2, then method 2, and finally method 1, so each gets called twice. The ordering almost never makes any difference, but you don't want to call each one just once, because there's a surprising amount of variability in timing. I know another person who runs each method three or four times and takes an average; I prefer just eyeballing the results.

As to those results, I have labels to show the elapsed milliseconds for each of those runs, so four labels.

Inside the stubs, you can put whatever you want. In this case, you might put the If statement in one and make the other identical except for the If statement. What you'll find, though, is that you'll have to put them into a loop and run it thousands or tens of thousands of times before the difference becomes large enough to see. For example, here's what I happened to have in my test bed project currently:

 Private Sub test1()
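        ' Opens (and disposes) a brand-new connection for every query.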
        Dim x As Integer
        Dim n As Long = 0
        Dim lGD As TestGUID = Nothing
        Dim a As Integer = 1
        Dim b As Integer = 1
        Dim c As Integer = 1

        For x = 0 To 10
            Using cn As New SqlClient.SqlConnection(mConString)
                Using cmd As SqlClient.SqlCommand = cn.CreateCommand
                    cn.Open()
                    cmd.CommandText = "SELECT * FROM GEN_Configuration"
                    Dim obj = cmd.ExecuteScalar
                End Using
            End Using
        Next

    End Sub

    Private Sub test2()
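        ' Opens a single connection once and runs every query on it.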
        Dim x As Integer
        Dim n As Long = 0
        Dim lGD As TestGUID = Nothing
        Dim a As Integer = 1
        Dim b As Integer = 1
        Dim c As Integer = 1

        Using cn As New SqlClient.SqlConnection(mConString)
            Using cmd As SqlClient.SqlCommand = cn.CreateCommand
                cn.Open()
                For x = 0 To 10
                    cmd.CommandText = "SELECT * FROM GEN_Configuration"
                    Dim obj = cmd.ExecuteScalar
                Next
            End Using
        End Using
    End Sub

I remember what this was about. Somebody had asked whether opening the connection once and performing a bunch of queries was noticeably more efficient than opening a connection once for each query, then getting rid of it. In this case, since there was a SQL query involved, I only had to do 10 iterations to be able to see a difference, but if you looked at something like the cost of the If statement, you’d probably need to do tens of thousands of iterations for the difference to become large enough to notice.
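In case it helps to see the shape of such a harness in C#, here's a rough sketch of that same 1, 2, 2, 1 Stopwatch pattern applied to the OP's question. The class name, method names, iteration count, and loop bodies are all made up for illustration, and the JIT may well rearrange or eliminate some of this anyway:

    // A rough Stopwatch harness comparing "with the check" vs "without the check".
    // Run order is 1, 2, 2, 1 so each variant gets timed twice.
    using System;
    using System.Diagnostics;

    static class IfCheckTest
    {
        const int Iterations = 100_000_000;      // big enough to make the timing visible
        static int _batchItemCount = 1;          // non-zero, so the early-out never fires
        static long _sink;

        static void WithCheck()
        {
            for (int i = 0; i < Iterations; i++)
            {
                if (_batchItemCount == 0)        // the "wasted" check being asked about
                    continue;
                _sink++;
            }
        }

        static void WithoutCheck()
        {
            for (int i = 0; i < Iterations; i++)
                _sink++;
        }

        static void Time(string name, Action test)
        {
            var sw = Stopwatch.StartNew();
            test();
            sw.Stop();
            Console.WriteLine($"{name}: {sw.ElapsedMilliseconds} ms");
        }

        static void Main()
        {
            Time("WithCheck", WithCheck);
            Time("WithoutCheck", WithoutCheck);
            Time("WithoutCheck", WithoutCheck);
            Time("WithCheck", WithCheck);
        }
    }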

In general, you're right to be suspicious of If statements. Ever since the Pentium, conditionals can have an unpredictable impact on performance (a mispredicted branch stalls the pipeline), but in all cases it's pretty modest these days.


You really need a profiler to get more accurate results.

The problem with putting a timer around the stub in your example is that you're not really profiling correctly. You don't have enough information in that case to know what the real slowdown is. Is it the overhead of creating the connections? Or is it something on the SQL Server slowing you down? When you execute SQL multiple times, you could be running into caching issues that would skew your results.

To make your test accurate enough to answer the question of whether the connection creation is the bottleneck, you would need to remove the execution of the SQL statement, or time it independently of the method so you can subtract that time out.

If you are just comparing methods, you can use https://github.com/dotnet/BenchmarkDotNet
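A minimal sketch of what that might look like for the OP's early-out question (the class and method names are invented, and it needs the BenchmarkDotNet NuGet package):

    // A minimal BenchmarkDotNet sketch for a micro-comparison like this one.
    using BenchmarkDotNet.Attributes;
    using BenchmarkDotNet.Running;

    public class EarlyOutBenchmarks
    {
        private readonly int _batchItemCount = 1;   // non-zero, so the early-out never fires
        private int _work;

        [Benchmark(Baseline = true)]
        public int WithoutCheck() => ++_work;

        [Benchmark]
        public int WithCheck()
        {
            if (_batchItemCount == 0)               // the check in question
                return 0;
            return ++_work;
        }
    }

    public static class Program
    {
        public static void Main() => BenchmarkRunner.Run<EarlyOutBenchmarks>();
    }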

That is sort of true, sort of not true. For one thing, it requires you to care what the real slowdown is. All of the points you make are technically valid, but also irrelevant. For example:

Is it the overhead of creating the connections? That’s the point of the test. As it turns out, there is no difference between the two, so there was no overhead for creating the connections (pooling is efficient). Had there been a difference, then it would be worth looking further if you cared, but in this case, there was no average difference, so looking further wasn’t justified.

Is it something on the SQL Server slowing you down? Sure. That could be. Of course, it slowed both down equally. Still, that's a very valid question; it just wasn't the question asked or answered by the test.

You could run into caching issues? That's always a possibility with .NET. In fact, if you do things right, caching will be the biggest difference, hence the pattern of calls. By calling 1, 2, 2, 1, then running the test several times, you will see the effects of caching.

Profiling is important. Understanding profiling is even more so. This IS a form of profiling, and one that is super easy to use. It ONLY works for comparisons of very small things. For example, you can use it to look at the difference between addition and division, but you can’t use it to study where the time is being spent over an entire algorithm. Still profiling, though.

My point was, you were measuring something that could vary wildly (how long the query takes to execute) and it wasn't part of what you were trying to find out. On one run the method could have executed in 1 ms; on the next it could have taken 1,000 ms. Now your information is skewed because you're measuring something that is not relevant to the question. You might waste a bunch of time trying to optimize that code when really the problem might have been something you had no control over (an external resource, i.e. the SQL Server).

If you were running the code in a profiler rather than using a Stopwatch, it would be very apparent that the creation of the connection was insignificant in either scenario and you could ignore optimizing the code.

If you don't have a profiler (does VS Code have one? I don't know), then if you're going to use techniques like Stopwatch, you need to be aware of what's between the start and stop calls and make sure you manually exclude any time that isn't relevant to the question you're trying to answer.

I do use Stopwatch in my games for profiling in addition to using a profiler. I put a Stopwatch around all my Draw code and a Stopwatch around all my Update code so I can monitor for any issues during development. If either exceeds a certain threshold, I can then run the profiler and find out what the exact cause of the slowdown is.
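A rough sketch of what that kind of per-frame instrumentation might look like in a MonoGame Game subclass (the class name, the 16 ms threshold, and the field names are all just illustrative):

    // A sketch of per-frame Stopwatch instrumentation in a MonoGame Game subclass.
    using System;
    using System.Diagnostics;
    using Microsoft.Xna.Framework;

    public class InstrumentedGame : Game
    {
        private readonly GraphicsDeviceManager _graphics;
        private readonly Stopwatch _updateWatch = new Stopwatch();
        private readonly Stopwatch _drawWatch = new Stopwatch();

        // Illustrative threshold: roughly one 60 fps frame.
        private static readonly TimeSpan Threshold = TimeSpan.FromMilliseconds(16);

        public InstrumentedGame()
        {
            _graphics = new GraphicsDeviceManager(this);
        }

        protected override void Update(GameTime gameTime)
        {
            _updateWatch.Restart();
            base.Update(gameTime);
            // ... the rest of the game's update logic ...
            _updateWatch.Stop();

            if (_updateWatch.Elapsed > Threshold)
                Debug.WriteLine($"Slow Update: {_updateWatch.Elapsed.TotalMilliseconds:F2} ms");
        }

        protected override void Draw(GameTime gameTime)
        {
            _drawWatch.Restart();
            base.Draw(gameTime);
            // ... the rest of the game's draw logic ...
            _drawWatch.Stop();

            if (_drawWatch.Elapsed > Threshold)
                Debug.WriteLine($"Slow Draw: {_drawWatch.Elapsed.TotalMilliseconds:F2} ms");
        }
    }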

Another trap that people fall into when optimizing is that they see a method that takes 2 seconds to run and think, "I need to make that faster!" But they fail to ask, "How often does that get executed?" If it's called once, at the start of the game, the user may never even notice it. Don't waste the time. If it's called 1,000,000 times, then yes, that might need to be looked into.

Yes, that’s kind of true. For one thing, you have to do iterations. I generally run a dozen or so. That will tell you two things:

  1. Whether or not the test is meaningful.

  2. If it is meaningful, what the result is.

What you described would result in a wild variation between runs. If you were to calculate the standard error, it would be quite large relative to the mean, in which case you know that the test is meaningless. You generally don't have to measure the standard error directly, though; the Interocular Arm-Length Test is sufficient. If it doesn't pass, then the test is worthless. If it does pass, then the test is meaningful.
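If you did want to put a number on it, a small sketch like this (the timings are made up) shows the calculation: the standard error is the sample standard deviation divided by the square root of the number of runs.

    // Illustrative only: mean and standard error of a handful of made-up timings.
    using System;
    using System.Linq;

    static class TimingStats
    {
        static void Main()
        {
            double[] runsMs = { 101, 98, 103, 97, 102, 99 };   // made-up run times in ms

            double mean = runsMs.Average();
            double variance = runsMs.Sum(x => (x - mean) * (x - mean)) / (runsMs.Length - 1);
            double standardError = Math.Sqrt(variance / runsMs.Length);

            // If the standard error is anywhere near the size of the difference you
            // are trying to detect, the comparison is telling you nothing.
            Console.WriteLine($"mean = {mean:F1} ms, standard error = {standardError:F1} ms");
        }
    }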

It’s not an issue for the places where this approach would be used, and it wouldn’t be an issue for the question posed by the OP. It also wasn’t an issue for the example posted, but I already noted that. Sure, if there had been a huge difference between one run and the next, then the test wouldn’t be meaningful. There isn’t though. There is variability, which comes from the differences you postulated, and others. However, the test, as laid out and described, did answer the question posed. Had the variation between runs been quite large relative to the mean, then it would be theoretically possible to identify a difference with a sufficient number of repetitions, but that wasn’t necessary.

If that isn’t clear, then try it out using the two methods shown, and you’ll see what I mean.

I totally agree with most of what you said. You can save nanoseconds with optimization. To save milliseconds, or greater, you need to change the algorithm used. For this reason, optimization is only a last step. However, there isn’t any particular reason to write an inefficient line when you know of a line that is more efficient.

I will have to disagree with you on this to an extent, for cases where performance doesn't outweigh readability and maintainability. I write code to be easy to read and easy to maintain; if performance is required for that code, then those two might suffer.

As a simplistic example, I’m not going to worry about if a foreach is faster or slower than a normal for loop, or what have you if the performance of that loop doesn’t matter.


That’s a very valid way of doing things. I agree that maintainability is a goal that is certainly worth a few cycles at times. On the other hand, it’s also somewhat in the eye of the beholder. For example, you can use LINQ and lambdas to write a very compact line, but you can also use it to write a very confusing line. Where the line should be drawn is a topic of considerable debate. Still, efficiency is not the sole virtue in code and shouldn’t be.

As for the foreach vs. for, that's an excellent example to use for the point you wanted to make. Back in 1995, when VBA was first introduced into Excel, the official guidance was that For…Next was faster than For Each, period. That guidance then went away. I believe there is not necessarily a difference these days, but there are cases where there might be (C# may be worse than VB in that regard). So it's a good one to cite, because it's a case where there COULD be a difference, and it's one that should almost always be ignored.

Agreed. Goes back to rule #1 of optimization 🙂 Only optimize what needs to be optimized.

https://pics.onsizzle.com/2016-started-with-the-death-of-a-gorilla-and-ended-7485747.png


I would also like to mention that compilers these days are better than 99.99% of developers at optimizing code. So avoid spending your time on the little things, as the compiler could change the generated code anyway, and your hand optimizations might no longer apply or could actually give you worse performance.


Provided they are using Native compilation

Not necessarily. Compiling into any other form (IL, ByteCode, etc…) can either generate good or bad code. In addition to that, since we are using .NET, the IL is compiled at runtime by the JIT for the platform it’s running on.

.NET will also have AOT Compilation in .NET 5.


Here is a good video; it's for C++, but the principles apply to everything.


Here is one that applies directly to MonoGame, as it's about C#.