Methods or Patterns for handling Input Commands

Just use a stack of event handlers.

While the player is playing the game, the stack has the player character controller in it.

When the player presses the pause button, you push the pause menu controller onto the top of the stack, and it handles all the inputs.

When the player presses resume, you pop the pause menu controller off the stack and control passes back to the player character controller.
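A minimal sketch of that handler stack in C#. The names IInputHandler, InputHandlerStack and CountingHandler are made up for illustration and are not from any library; CountingHandler only exists to show which handler is being updated.

```csharp
using System.Collections.Generic;

// Hypothetical interface: anything that can consume input for one frame.
public interface IInputHandler
{
    void Update(float deltaSeconds);
}

// Illustration only: counts how many times it was updated.
public sealed class CountingHandler : IInputHandler
{
    public int Updates { get; private set; }

    public void Update(float deltaSeconds) => Updates++;
}

public sealed class InputHandlerStack
{
    private readonly Stack<IInputHandler> handlers = new Stack<IInputHandler>();

    public void Push(IInputHandler handler) => handlers.Push(handler);

    public void Pop()
    {
        if (handlers.Count > 0) handlers.Pop();
    }

    // Only the handler on top of the stack is updated.
    // Handlers below it keep their state but receive no input.
    public void Update(float deltaSeconds)
    {
        if (handlers.Count > 0) handlers.Peek().Update(deltaSeconds);
    }
}
```

Pushing a pause menu handler on top of the player handler means the player handler simply stops receiving Update calls until the menu is popped again.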

Hmm, so in essence, you’re suggesting I basically have another state manager, but for inputs. So only the specific ‘domain’(Menus, game screen, etc) is having its domain-specific command events fired?

That could work. The only change I’d really need to make is to pair each game state with a controller state. The object which pushes game states onto my state manager would also need to push the controller states at the same time, and it would need a way to know which controller states a given game state requires to function. That info could probably be on the game state itself, perhaps.

Am I following you correctly on that?

There are also Funcs and Action Delegates so you can make something more complex.

Ya, everything should go thru a UI state manager. Each key should only be linked to one UI mode at a time.

You can think of playing your game as a UI play mode, and say a textbox as part of a different UI mode.
Both can register to the same Command, which has an assigned key combo, but not at the same time, so the Command is only ever linked to one mode. (Or use a stack as stated above by tobi, or an array and a mode index if you like.)
That way what is fired depends on what mode you are in.
Clicking on the screen where a textbox icon is, or hitting enter, can switch modes.
Or
You can think of switching modes as first un-registering all the textbox’s key commands when hitting enter to leave the textbox, then registering everything for play mode as you re-enter it.
Doing the precise opposite when entering the textbox mode and leaving the game mode.

So you could structure things conceptually in a few different ways, but you want it all controlled thru a central class to keep it organized.

Either way, hitting enter, pressing tab, or clicking a switch button on the screen can just cause a switch of modes, with the appropriate unregistering and registering on the Command delegate, or a call to the handler at the proper index if it’s not null.

The keyboard combination a command is mapped to can also be assigned in that UI state manager.
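A rough sketch of the "array and a mode index" variant in C#. UiMode and Command are hypothetical names chosen for this example, and the two modes are just the ones mentioned above.

```csharp
using System;

// Hypothetical UI modes; a real project would have more.
public enum UiMode { Play = 0, TextBox = 1 }

public sealed class Command
{
    // One optional handler per mode, indexed by the mode's integer value.
    private readonly Action[] handlers = new Action[Enum.GetValues(typeof(UiMode)).Length];

    public void Register(UiMode mode, Action handler) => handlers[(int)mode] = handler;

    public void Unregister(UiMode mode) => handlers[(int)mode] = null;

    // Fires only the handler for the current mode, if one is registered.
    public void Fire(UiMode currentMode) => handlers[(int)currentMode]?.Invoke();
}
```

The central UI state manager would own the current UiMode and pass it to Fire whenever the key combo assigned to the command is pressed.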

I am a little confused about what you want to do.

Say you have a fighting game which uses key combinations for actions. So say XXXYYBB == Spinning Heel Kick

You would not want to record those button presses when a menu is visible. That would be just cheating.

Pause the game, press XXXYYBB, unpause the game, instant Spinning Heel Kick.

In the famous words “Do right. No can defend”

If one of your handlers records key combinations, it stops being updated when a new input handler is pushed onto the stack. So any existing state is preserved until it is the active handler again.

No need for anything clever; the very nature of the design helps you out.

The one thing to avoid is using timestamps. Use time span instead.

So say you want to have something happen when the user presses 0 for 5 seconds.

If you store the time the 0 key was first pressed and the user then pauses the game, things go wrong when the game is unpaused, because the time spent paused counts toward the 5 seconds.

Instead, set a counter to zero and on each update add delta time to it.
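Something like the following would do it. This is only a sketch; HoldTimer is a made-up name, and it assumes the owning input handler calls Update once per frame with that frame's delta time, and only while that handler is the active one.

```csharp
// Tracks how long a key has been held by accumulating frame time
// instead of comparing against a stored timestamp.
public sealed class HoldTimer
{
    private readonly float requiredSeconds;
    private float heldSeconds;

    public HoldTimer(float requiredSeconds)
    {
        this.requiredSeconds = requiredSeconds;
    }

    // Call once per update of the owning handler.
    // Returns true once the key has been held long enough.
    public bool Update(bool isKeyDown, float deltaSeconds)
    {
        // Releasing the key resets the counter.
        heldSeconds = isKeyDown ? heldSeconds + deltaSeconds : 0f;
        return heldSeconds >= requiredSeconds;
    }
}
```

While the game is paused the handler is not updated, so the counter does not advance and no paused time leaks into the hold duration.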

Sorry, I was a bit rambly last night. I’ll see if I can be more clear. Being more clear means a wall of text, so apologies. If you skip this I won’t feel bad. I’ll put a TL;DR toward the bottom.

If I press A on my keyboard, my current InputManager class reads that key press(The Keys enum in Monogame), and matches the Keys enum to a more generic string (“A”). I have another class I’m calling the SequenceParser… it basically looks at a list of all the keys which were pressed on a given frame and decides if a valid sequence can be constructed from it. A valid sequence would be some combination of keys which include a modifier key, like Shift, Control, etc… It’s not for mapping things like combos in a game or anything like that. I’m not there yet.

So LControl+C is a valid sequence, but X+X+X… is not. That’s not what the class is trying to build.

Anyway once any valid sequences are found a final list containing every valid input for the frame is made. If I pressed Left Control and C that frame, then the list would contain { “LControl+C”, “LControl”, “C” } . Basically every individual key press, plus a constructed sequence if valid modifier keys were included.
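As a rough illustration of that step (not the actual SequenceParser, which is not shown in this thread), here is a sketch in C#. The modifier names and the rule that a sequence needs at least one non-modifier key are assumptions made for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SequenceParser
{
    // Assumed set of modifier names; a real list would come from the key mapping.
    private static readonly HashSet<string> Modifiers = new HashSet<string>
    {
        "LControl", "RControl", "LShift", "RShift", "LAlt", "RAlt"
    };

    // Returns every individual key pressed this frame, preceded by a combined
    // sequence when the frame holds at least one modifier and one other key.
    public static List<string> Parse(IReadOnlyList<string> pressedKeys)
    {
        var result = new List<string>();
        var modifiers = pressedKeys.Where(k => Modifiers.Contains(k)).ToList();
        var others = pressedKeys.Where(k => !Modifiers.Contains(k)).ToList();

        if (modifiers.Count > 0 && others.Count > 0)
        {
            // Modifiers first, then the remaining keys, joined with '+'.
            result.Add(string.Join("+", modifiers.Concat(others)));
        }

        result.AddRange(pressedKeys);
        return result;
    }
}
```

With the keys LControl and C pressed, Parse returns the combined sequence followed by the two individual keys; with X pressed several times it returns only the individual presses.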

The next step is to match those individual presses and sequences to what I was calling ‘commands’, but ‘input actions’ might also be another common word. Or hotkey… etc. Using your fighting game example, I might have a command/action for:

Low Attack
High Attack
Up, Down, Left, Right
Etc

Low Attack could be arbitrarily bound to something simple like X, High Attack to C, and the directions to WASD, or Arrow Keys. But, crucially, I could also bind them to Left and Right mouse buttons and WASD. Or to GamePad A, GamePad X, and the D-Pad directions. So to maintain device independence, it’s important that DpadUp, ArrowKeyUp, and W are all possible assignments for the ‘Up’ command/action.
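One simple way to get that device independence is a lookup from raw input name to action name. The sketch below is an assumption about how it could look, not my actual code; ActionBindings is a hypothetical name and the raw input strings are the ones used in this post.

```csharp
using System.Collections.Generic;

public sealed class ActionBindings
{
    // Raw input name -> action name. Several raw inputs may share one action.
    private readonly Dictionary<string, string> rawToAction = new Dictionary<string, string>();

    // Binds any number of raw inputs, from any device, to a single action.
    public void Bind(string action, params string[] rawInputs)
    {
        foreach (var raw in rawInputs) rawToAction[raw] = action;
    }

    public bool TryGetAction(string rawInput, out string action)
    {
        return rawToAction.TryGetValue(rawInput, out action);
    }
}
```

For example, bindings.Bind("Up", "W", "ArrowKeyUp", "DpadUp") makes all three inputs produce the same ‘Up’ action, and rebinding is just another call to Bind.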

The fact that pressing W on my keyboard 3 times and then the C key twice maps to some kind of special move (Up, Up, Up, High Attack, High Attack) is something an entirely different part of my code will care about later, when I get there.

So where the entry point of my real question comes from is here.

I’ll just use my current worst-case scenario as an example. I have a (very shoddy, half-assed) editor in my setup right now for editing levels in my game. Ideally, I should be able to transition from the editor to the game, then back to the editor(Like in Mario Maker, for example). That’s all handled fairly easily with a state manager class. So far so good.

Where my uncertainty comes in is with the inputs. If I used events in the traditional sense, I might have a published event for every command/action. That way any code that cares about knowing when ‘High Attack’(“C”) has been pressed by the user will know about it and can react to it.

Of course the trouble with this simplistic publisher/subscriber model is that if I bring up my menu or editor, and my game character is still listening for ‘High Attack’ and other keys, those events will still fire and the state of my game character changes even while I’m in the menu, as you noted in your Spinning Heel Kick example.

TL;DR

So really what I’m asking is: what pattern will allow me to create delegates/events which my input manager can invoke when raw inputs come in, and which will inform only the currently active state? Or, put another way, how can I be sure the InputManager isn’t invoking commands/actions for input domains (menus, game screen, text entry, etc.) that shouldn’t currently be receiving inputs?

This question is further compounded by the added detail that some of my game states could reasonably be listening for events from more than one domain. My editor state for instance, would want to listen for menu/UI related inputs, but it probably also has its own editor-specific domain for actions like ‘Pan Camera Up, Left, Down, Right’ or ‘Place Terrain’ or whatever.

The comments you and willmotil left have, I feel, gotten me 85% of the way towards figuring out something robust to handle this, so I really appreciate that. I’m just missing some of the more specific class/implementation details to make it all click in my head.

I would have a singleton that reads all input devices, creates the list of events in your own form, and holds a stack of event handlers.

This singleton has static methods so it is visible to everywhere you need it.

The stack will operate as we discussed above.

That’s really all you need.

You can structure the methods of the singleton any way you like, say

public static bool IsKeyDown(KeyCode code);

or

public static bool IsKeyDown(string code);

Then in your EditorInputHandler Update method you can have code like

if (InputManager.IsKeyDown(KeyCode.AltF1))
{
    // do something
}

The combination of the stack and the singleton is very powerful.

Yep, thanks.

I spent the last few days trawling through forums and other sources and brushed up on various input handling methods. What I’m currently coding is similar to what you’ve suggested, minus the singleton. My InputManager class is handed around as a ServiceProvider.

Anyway, the basic flow is now:

1. The input manager checks all the virtual keys to see which ones have been pressed.
2. A queue of these raw inputs is handed to an input controller.
3. The input controller looks on a ‘context’ stack to see what the current input context is.
4. The raw inputs are mapped into actions (MoveUp, Attack). The map is generated from a simple text file which indicates which raw keys map to which actions.
5. The final actions are put into another queue.
6. That queue is then handled by the currently active gamestate, which reacts to whichever actions it sees fit.

The real power of this is the ‘input context’. Put simply, gamestates can push a context onto the context stack at will. So if I bring up a menu, the menu pushes a ‘Menu Context’, and that menu context has very specific mappings for keys which can be different from the ‘Character Moves Context’. If one context doesn’t map a raw input, that input falls down the stack until it gets mapped. Contexts can also be set ‘modal’ so they prevent the falling-down, in case a text input window or something else wants to consume all the focus from input.
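To make the fall-through and ‘modal’ behaviour concrete, here is a stripped-down sketch in C#. It is not the code from the article linked below or from my project; InputContext, InputContextStack and Resolve are names invented for this example, and the mappings are hard-coded rather than loaded from a text file.

```csharp
using System.Collections.Generic;

public sealed class InputContext
{
    private readonly Dictionary<string, string> map = new Dictionary<string, string>();

    public InputContext(string name, bool isModal = false)
    {
        Name = name;
        IsModal = isModal;
    }

    public string Name { get; }

    // A modal context stops unmapped inputs from falling further down the stack.
    public bool IsModal { get; }

    public InputContext Map(string rawInput, string action)
    {
        map[rawInput] = action;
        return this;
    }

    public bool TryMap(string rawInput, out string action)
    {
        return map.TryGetValue(rawInput, out action);
    }
}

public sealed class InputContextStack
{
    // The last element is the top of the stack.
    private readonly List<InputContext> contexts = new List<InputContext>();

    public void Push(InputContext context) => contexts.Add(context);

    public void Pop()
    {
        if (contexts.Count > 0) contexts.RemoveAt(contexts.Count - 1);
    }

    // Walks from the top context downward. Returns null when no context maps
    // the input, or when a modal context swallows it first.
    public string Resolve(string rawInput)
    {
        for (int i = contexts.Count - 1; i >= 0; i--)
        {
            if (contexts[i].TryMap(rawInput, out var action)) return action;
            if (contexts[i].IsModal) return null;
        }
        return null;
    }
}
```

A game state would call Resolve for each raw input in the frame's queue and put any non-null result into the action queue.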

It’s a little hard to explain succinctly, but it’s basically copied from this: https://www.gamedev.net/articles/programming/general-and-gameplay-programming/designing-a-robust-input-handling-system-for-games-r2975

Which is an interesting read. His Git is on https://github.com/grimtraveller/scribblings-by-apoch/tree/master/inputmapping

It’s in C++ so I had to spend a few hours studying C++ syntax to really parse it, but it’s all there and it does what it says on the tin. His method uses a callback function whereas I’m polling the final action queue, but either way… it’s pretty cool. I’m going to play with it and see if I like it.

You mention concern over doing O(n) on game inputs but n will probably be 0 or 1 and probably not more than 5 right? I don’t think it will be a performance issue.

You’re right. Inputs aren’t really a performance concern. Even if you roll your face across the keyboard while pumping your mouse in one hand and slapping your other hand over your controller you’d still only be generating a few dozen input events every frame, which just isn’t a big deal at all.

In the past few days of research I’ve changed my mind on polling. I don’t see it as a serious concern, provided that the polling is being done by only a few manager-type classes that is. Where polling and that O(n) operation would become a performance concern is if someone had every entity and class concerned with inputs, in their entire project, polling for inputs. For instance, if you build your own GUI, and you made every widget in the GUI poll to check for every type of input it can be affected by, and your GUI might be made of dozens and dozens of widgets. Then you might end up with a lot of O(n) operations going on every frame, which could potentially affect performance a little bit if there were a lot of inputs, a lot of entities, and a lot of conditional checks in each entity for each type of input it cares about(and even then it might not be too bad). The obvious solution to that is to… not do that. It isn’t needed anyway. xP

Better to have a few classes (like your game state class, or maybe the Game1 class) which poll the inputs, know which objects in your game need those inputs, and relay them as needed. Or something like that.

That’s basically the system I’m writing right now. My game state class deals with the inputs every frame and polls them to see if anything interesting is generated each frame, and if it is, it has the logic built in to know what to do with the actions being generated by the inputs. This has the advantage also that I don’t have to worry about events being fired on objects which aren’t currently active/being updated by the state manager. The state manager is the relay, and that relay is only active when the state is active. This also avoids issues with unwanted input buffering which StainlessTobii mentioned. No performing a bunch of actions after unpausing. No inputs are being buffered when the state isn’t active.

Yes input management is not usually a performance issue.

Be careful with HID devices though, you need a high frequency interrupt to read some of them and if that is not created correctly it can cause problems.

Having said that, I did once have a bug report on the Atari ST 520

“If you press these five keys with your right hand {list of keys}, and these five keys with your left hand {list of keys} , then press the space bar with your nose, the game crashes.”

Bug fix was

“Don’t do it”

I had a project where I wanted an in-game editor that was shippable, in a tile-based game, with touch/mouse input for both the in game editor and the gameplay itself.

So, game-scene object click detection for objects in gameplay versus those same objects but for editing…versus other objects in the scene that are only there while editing (ie, a gizmo).

Turns out designing the architecture for this sort of thing, if it’s going to be elegant and not lead to a ton of code duplication, is really quite complicated.

I’ve been discovering this, yes.

The biggest issue, really, is the editor and the GUI which supports the editor.

The editor needs a reasonably fully-featured GUI(and I decided to make my own GUI because I’m an insane person, but I actually have 80% of something working already, just waiting for my final input module) in order to work. This means I ideally need things like functional modifier keys(for stuff like CTRL+Click, or ALT+Click for an eyedrop tool, or CTRL+C for copying… etc…). Modifier keys add a whole new dimension of complication, especially if you want to allow more than a single one in a hotkey command, and also want to ensure order of the input keys is maintained between frames(so that Shift+Control is treated as Shift -> Control rather than being treated the same in any order). Add onto that all the needs of a GUI for routed mouse and keyboard events, focus switching intelligently…

And on top of that are the commands for the in-game character.

And of course it all needs to be rebindable, because I personally hate when I can’t rebind keys in games.

And of course it all has to be data-driven so that I can easily add new actions, action contexts and keybinds directly from text files/xml. Which also need some mechanism for restoring default settings. And obviously saving the user’s personal settings…

It’s kind of a mess, honestly. But I am slowly moving toward a decent solution which, once done, should also be portable to any other Monogame project I work on. I don’t know how ‘elegant’ it will be in the end, but at this point I’ll just be glad if it works, a little code duplication and inefficiency be damned. Actually a big part of my inefficiency at the moment comes from the API between Monogame’s built-in input tools(Mouse/GamePad/Keyboard state) and my own input manager. Having to convert 3 different input types with different methods of relaying pressed keys, into a single cohesive registry, which can then be used by the rest of my code, is involving a little bit of code spaghetti.

It’s actually kind of odd to me that the Keyboard state struct has an easy to use ‘GetPressedKeys’, but the GamePad and Mouse state do not. Have to check each button on each device individually. Not a big deal by any means, but it would be a nice quality of life feature.

Ya, to be honest I have, I don’t even know, like the 30th rendition of my GUI at the top of my projects list.
I just never am happy with the UI. I have looked at other people’s and don’t like theirs either.
I sort of have a ton of requirements, I guess.
Anyways, I’m working on my own again, which is purely MonoGame C#.

Out of curiosity, what aspects/features of the UI are you not happy with in other UI/your own UI?

I think input handling and focus handling are the parts that I’ve hated the most. That, and how to handle the UI with regards to the game’s state machine. Should each window be its own state? Should there be a single UI manager which swaps windows? How does that work with the state machine? If I put a new options screen on top of another options screen am I putting the same instance of the UI manager state on top of itself in the state manager? Seems pointless. But if not that, then what tells the UI manager which window to display at what time… etc…

It all just feels really messy. In fact, input handling and the UI in general have been the messiest and least pleasant aspects of the game development process I’ve dealt with so far. I guess because those elements directly interface with the user, and so they have to deal with a lot of ‘what ifs’ in their design since the user is unpredictable.

Compared to something like… making a camera which follows a character, which feels very deterministic and straightforward.

At this point, it’s the ‘bare minimum structure and code of the base class which is enough’: the base class that all button types and windows derive from, so it’s not over-complex or too simple either, with no garbage, and fast.
This includes basics like automatically placing, sizing, scrolling and determining visibility. Tons of other stuff too, some of it specific, but the structure has to intrinsically lend itself to it.

I have some guidelines I can basically recite off the top of my head.
In general I have come to the conclusion that all buttons, or types of buttons that contain other buttons, should derive from a single abstract base class (or interface), which should hold a reference to a parent button of the same base type as well as a list of children of the same base type. (This means a state manager can essentially have a single node, then a tree of screen states or sub-states where all button types are added.) It should have at least 3 abstract methods: Update, Draw and Recalculate. Derived buttons may have more variables exclusive to each type, but most of that should be set from the constructor and accessed thru one of the previous methods. The base class should probably at the very least have abstract or virtual properties for a textLabelRectangle, an ItemRectangle (encompassing the item’s entire area) and a Child Area rectangle. Mine’s actually got tons more atm (these seem to multiply for me), but the fewer the better if you can get away with less.

Don’t use strings for text; use StringBuilder at the least.
I have a StringBuilder wrapper class you can use that will prevent most garbage generated from dynamic numerical text.

Yeah, sounds like we had a similar thought process in GUI widget design. My GUI also has a similar structure using an abstract base class (which also implements an interface). The 5 main public functions are Init(), Draw(), Update(), Measure() and Arrange(). ‘Optimal’ sizing calculations are done in Measure; final size and position calculations are done in Arrange.

For properties, my base class also includes all the basic visual properties. Min/Max/Absolute height/width, Horizontal/Vertical alignment, padding, margins. My thinking there is that there won’t be any type of widget that is non-visual, so there’s no need to put off assigning the sizing properties to a subclass.

Parent, of type base interface/class.

Lastly the rendering properties. Position(Vector2), Bounding box(Whose x,y are automatically derived from position).

So in total, around 11 properties(most are the sizing properties) and the 5 methods. Subclasses can change sizing/child positioning behavior by overriding the virtual protected ‘ArrangeChildren’ method.

I also handled positioning from a purely ‘relative to its parent’ perspective. The position property includes the parent’s own positioning offset + an additional offset determined by the parent. No need to recalculate the positions this way when dragging the window around. You just recursively += the vectors of every object in the object tree.
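The recursive += is about as simple as it sounds. A minimal sketch, assuming a bare Widget class invented for this example; it uses System.Numerics.Vector2 as a stand-in for MonoGame's Vector2 so it can run without the framework.

```csharp
using System.Collections.Generic;
using System.Numerics; // stand-in for MonoGame's Vector2 in this sketch

public class Widget
{
    // Final on-screen position: the parent's position plus this widget's own offset.
    public Vector2 Position;

    public readonly List<Widget> Children = new List<Widget>();

    public void Add(Widget child) => Children.Add(child);

    // Shifts this widget and its whole subtree by the same delta,
    // so nothing has to be re-measured or re-arranged while dragging.
    public void Translate(Vector2 delta)
    {
        Position += delta;
        foreach (var child in Children) child.Translate(delta);
    }
}
```

Dragging a window by (5, 0) becomes a single window.Translate(new Vector2(5, 0)) call; every cached child position moves with it.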

So far it works fast. The only expensive operation is resizing the window, but doing some cache optimizations would help with that if I need it later. ATM, the only major work left to do on my GUI is input/focus handling… which is the far messier and less straightforward part. I’m thinking the gui manager class will be doing 95% of the work for handling that stuff… once I actually finish designing and writing the base input system anyway.

I didn’t realize there was such a large issue with SB garbage collection though. I thought it was supposed to be more efficient than using regular strings/string interpolation.

You’re doing a lot of the same things I’m doing atm.

Any numeric ToString, implicit or explicit, will basically generate collectable garbage pressure that will eventually result in a collection. That above class really functions to bypass ToString to get around that for dynamic numerical text display; most of the extra stuff is just gravy. You can test it with the framerate class next to it, which shows the garbage pressure in real time.

Ya, this sort of thing is like my kryptonite. One of the biggest pains to me is cutting off buttons hidden by other buttons or window edges, which can force recalculations of tons of rectangles when, say, you’re using a scroll bar.
Just optimizing it so it’s not slow takes effort; there are really a lot of rectangle calculations when you have windows in windows, and it seems to get overly complicated fast.

Edit on the garbage thing: insert and delete have problems also.
I made a little test you can run to show where regular StringBuilder breaks down on numeric text.

        bool initializedOnce = false;
        int numberIterator = 0;
        // works for test 1 but not test 2    comment uncomment this or the below
        //List<StringBuilder> posRulerText = new List<StringBuilder>();
        // works for test 1 and test 2
        List<MgStringBuilder> posRulerText = new List<MgStringBuilder>();

        int startx = 350;
        int starty = 140;

        protected override void Draw(GameTime gameTime)
        {
            // Begin, Clear, etc. go here; use your own font.

            //
            // Setup a series of string builders to hold text with numerical data. While this is a one time collection at start up this sort of collection is inconsequential.
            //
            if (initializedOnce == false)
            {
                int j = 0;
                for (int y = starty; y < (starty + 500); y += 20)
                {
                    posRulerText.Add(new StringBuilder()); // the wrapper can take stringbuilders.
                    posRulerText[j].Append(y);
                    j++;
                }
                initializedOnce = true;
            }

            // test 2 dynamic numerical output data. stringbuilder cannot handle this.

            bool everyturn = true;
            if (everyturn)
            {
                numberIterator++;
                if (numberIterator > 100)
                    numberIterator = 0;
                // constantly alter the data.
                int j = 0;
                for (int y = starty; y < (starty + 500); y += 20)
                {
                    posRulerText[j].Clear();
                    posRulerText[j].Append(y+numberIterator);
                    spriteBatch.DrawString(Gu.currentFont, posRulerText[j], new Vector2(startx - 50, y - 10), Color.White);
                    j++;
                }
            }

            // test 1 static numerical output data. stringbuilder can handle this.

            //int i = 0;
            //for (int y = starty; y < (starty + 500); y += 20)
            //{
            //    spriteBatch.DrawString(Gu.currentFont, posRulerText[i], new Vector2(startx - 50, y - 10), Color.White);
            //    i++;
            //}

            spriteBatch.End();
        }

If you have a framerate class that measures garbage or use mine it should become clear just how much garbage and collections even a small amount of dynamic numerical text will generate. This example generates about 45 collections in 1 minute using stringbuilder.

So about the clipping of widgets… I also muddled around with this for a while. The way I decided to handle it, and this has generally worked well so far…

Let widgets go where they want to go. No extra calculations to try and force widgets to resize intelligently if clipping is occurring. That’s the job of the user of the GUI framework.

Instead what I did is have each parent run a simple intersection calculation on each child each frame. At first I wondered if this would be inefficient, but after testing I discovered that intersection calculations just aren’t expensive enough to worry about. In my testing I was able to do tens of thousands of them and wasn’t even breaking 1ms on the timer for the whole batch of them. And that included the time spent creating and destroying randomly generated rectangles using Random(). These are cheap operations, so unless you think your GUI will have 50,000+ widgets in it I don’t think it’s an issue.

So every frame, run them on each child. If a child is clipping out of the bounds of the parent(common for scrollable regions), the parent pushes a ‘Set Scissor Region’ on my Rendering handler. The handler is just a wrapper for spritebatch, basically, including the usual Draw() methods. But it also keeps track of scissoring. If scissoring is pushed, it starts a new batch, and all further Draw() calls go onto that new batch automatically. This is done through basically a state manager, or FSM. The currently active batch is at the top of the stack. Information about the current clipping region is also present on the stack.

This also allows handling of ‘nested’ scissoring regions. If parent widget A, has a child B which is clipping, AND child B has a child C which is clipping… Parent A sets a clip which is its own internal, usable bounds, then Child B will call its own scissor region, and the bounds of that scissor are the intersection of Parent A’s scissor and Child B’s usable bounds. This solves any issues with children setting a new batch, and that batch ignoring previously clipped regions.

Anyway, the final piece of the puzzle is that the rendering manager’s current render clip(the scissor region, if set, which should always be set if children are out of bounds), performs a check on all Draw() calls to it to see if the intersection of the draw call and the current rendering clip results in an empty rectangle(which happens if it falls outside the bounds of the clip entirely). If that’s the case, it skips drawing that element entirely. In that way, you get render culling. Only things that are visible are drawn, and the calculations for whether it will be drawn are all done by the rendering manager. The widgets themselves don’t care about it. All they care about is determining if their children are in bounds or not and setting the scissor clip region if they aren’t. And once all children’s Draw()'s are finished, the parent pops the scissor region off the stack before returning control to its own parent.
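Boiled down, the clip stack and the culling check look roughly like the sketch below. This is not my actual rendering handler: Rect and ClipStack are names made up for the example, Rect stands in for MonoGame's Rectangle so the code runs on its own, and starting a new sprite batch per clip region is left out.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for a rectangle type (hypothetical, for this sketch only).
public readonly struct Rect
{
    public readonly int X, Y, Width, Height;

    public Rect(int x, int y, int width, int height)
    {
        X = x; Y = y; Width = width; Height = height;
    }

    public bool IsEmpty => Width <= 0 || Height <= 0;

    // Overlap of two rectangles, or an empty rectangle when they do not overlap.
    public static Rect Intersect(Rect a, Rect b)
    {
        int left = Math.Max(a.X, b.X);
        int top = Math.Max(a.Y, b.Y);
        int right = Math.Min(a.X + a.Width, b.X + b.Width);
        int bottom = Math.Min(a.Y + a.Height, b.Y + b.Height);
        return right > left && bottom > top
            ? new Rect(left, top, right - left, bottom - top)
            : new Rect(0, 0, 0, 0);
    }
}

public sealed class ClipStack
{
    private readonly Stack<Rect> clips = new Stack<Rect>();

    // A nested clip is narrowed to its intersection with the clip below it,
    // so a child can never draw outside a region its ancestors already clipped.
    public void Push(Rect bounds)
    {
        clips.Push(clips.Count > 0 ? Rect.Intersect(clips.Peek(), bounds) : bounds);
    }

    // Called by the parent once all of its children have drawn.
    public void Pop() => clips.Pop();

    // True when a draw call lies entirely outside the current clip and can be skipped.
    public bool IsCulled(Rect destination)
    {
        return clips.Count > 0 && Rect.Intersect(clips.Peek(), destination).IsEmpty;
    }
}
```

Each Draw() call on the rendering handler would check IsCulled first and return early when it is true.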

I think that’s reasonable separation of concerns.

As for performance… With a test GUI of dozens of widgets, many of them with intentionally misaligned and clipped regions, rendering usually took only a millisecond or so (and normally less). Keep in mind that all sizing calculations are done one time in my setup and remain valid for as long as the window doesn’t change size and none of the elements inside change size. So all rendering is done using cached rectangles. The only on-the-fly calculations are the intersection checks.

And there’s lots of room for performance improvements if they were desired. You could do all sorts of caching to prevent intersection checks and the like, though they are so cheap I don’t know if it’d be worth bothering.

Anyway I better shut up about that before I ramble forever.

As for the garbage collection… I think I see what you’re saying.

Edit: Actually after looking at it again, I realize the example code isn’t generating the debugger text. Though, is the debugger you’re using also using string builder and the drawstring method? Could also be contributing a lot to your GC, depending on how you wrote it.

Two possible solutions… One would be to not update every tick. Since this is in the Draw() method and your framestep is uncapped this process is getting run hundreds of times a second it looks like. Ticking text really only needs to change a few times every second. Anything more than that is extraneous. Second would be to not use a separate string builder for each line. Use a single builder and append an Environment.NewLine to it each new line instead. Using/clearing/appending to fewer builders would probably also help.
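For the first suggestion, a sketch of what I mean, under the assumption that the text only needs refreshing a few times a second. ThrottledText is a hypothetical helper, not something from MonoGame, and RefreshCount is only there to make the behaviour easy to check.

```csharp
using System.Text;

public sealed class ThrottledText
{
    private readonly StringBuilder text = new StringBuilder();
    private readonly float intervalSeconds;
    private float elapsedSeconds;

    public ThrottledText(float intervalSeconds)
    {
        this.intervalSeconds = intervalSeconds;
        elapsedSeconds = intervalSeconds; // forces a refresh on the first update
    }

    // The builder to hand to the draw call every frame.
    public StringBuilder Text => text;

    // How many times the text has actually been rebuilt.
    public int RefreshCount { get; private set; }

    // Rebuilds the text only when the interval has elapsed; otherwise the
    // previously built text is reused and nothing new is appended.
    public void Update(float deltaSeconds, int value)
    {
        elapsedSeconds += deltaSeconds;
        if (elapsedSeconds < intervalSeconds) return;

        elapsedSeconds = 0f;
        text.Clear();
        text.Append(value);
        RefreshCount++;
    }
}
```

This does not remove the allocation caused by the numeric append, it just makes it happen a few times a second instead of every frame.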

Of course, a certain amount of garbage collection is inevitable.

When it comes to strings you really shouldn’t be using string builder.

You should be using some sort of localiser so you can handle multiple languages.

This has the happy side effect of making all the strings in your game static so you don’t generate any garbage from these strings.

Dynamic strings are different, but if they become a problem there are lots of ways of dealing with this.

You can have arrays of static strings, dictionaries, all sorts of tricks you can use. Then just fall back on string builder when you have no choice.
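One version of the "array of static strings" trick, as a sketch: build the strings for a range of numbers once at start-up and hand out the cached instances afterwards. NumberStrings and the range of 0 to 999 are arbitrary choices for this example.

```csharp
public static class NumberStrings
{
    // Strings for 0..999 are built once; the range is an arbitrary choice here.
    private static readonly string[] Cache = BuildCache(1000);

    private static string[] BuildCache(int count)
    {
        var cache = new string[count];
        for (int i = 0; i < count; i++) cache[i] = i.ToString();
        return cache;
    }

    // Returns the pre-built string for values inside the cached range.
    // Values outside the range fall back to ToString, which allocates.
    public static string Get(int value)
    {
        return value >= 0 && value < Cache.Length ? Cache[value] : value.ToString();
    }
}
```

A score or ruler label drawn with NumberStrings.Get(value) reuses the same string instance every frame as long as the value stays in range.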

Widget clipping is all to do with focus. So widgets should really be held in some kind of efficient storage and when a widget gets focus it is moved to the end of the list.

That way you only move a pointer, you don’t have to sort anything and it will always display correctly.
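A sketch of one way to do that in C#, assuming a linked list as the "efficient storage" (the post does not say which structure is meant). WidgetOrder is a made-up name; widgets are drawn front of list first, so the focused one at the end is drawn last and ends up on top.

```csharp
using System.Collections.Generic;

public sealed class WidgetOrder<T>
{
    // Draw order: first node is drawn first, last node is drawn on top.
    private readonly LinkedList<T> order = new LinkedList<T>();

    // Lets Focus find a widget's node without scanning the list.
    private readonly Dictionary<T, LinkedListNode<T>> nodes = new Dictionary<T, LinkedListNode<T>>();

    public void Add(T widget)
    {
        nodes[widget] = order.AddLast(widget);
    }

    // Moves the focused widget to the end of the list by relinking its node.
    // No sorting and no copying of the other widgets is needed.
    public void Focus(T widget)
    {
        if (!nodes.TryGetValue(widget, out var node)) return;
        order.Remove(node);
        order.AddLast(node);
    }

    public IEnumerable<T> DrawOrder => order;
}
```

Hit testing for mouse clicks would walk the same list in reverse so the top-most widget gets the click first.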

The thing that always ends up with me writing a new GUI is the bloody message handling.

You click on a button and this executes some code in the game, but you want the GUI to be separated from the game code. So you need some mechanism for passing events from the GUI to the game.

So far I think I have used 20 different ways of doing this, and I don’t like any of them ☺

The de facto standard for dealing with that kind of separation would probably be something like MVVM.
But the logic has to go somewhere in the end. Me, I’d be satisfied just creating a new class inheriting from my root window class and putting all the logic in there, which is essentially how something like winforms or UWP would do it. Further separation could be handled through an intermediary layer. The VM in MVVM.