Jump to content

Multithreaded Architecture in Leadwerks Game Engine 5

Josh

2,515 views

Leadwerks Game Engine 5 is a restructuring of Leadwerks Game Engine 4 to adapt to the demands of virtual reality and leverage the full capabilities of modern and future hardware.

Basically, the main idea with VR is that if you don't maintain a steady 90 FPS in virtual reality, you will throw up.  Nausea may be the worst physiological feeling you can experience.  In fact, nausea has been rated by cancer patients as worse than pain.  Being sensitive to motion sickness myself, this is a problem I am very motivated to solve.

In a conventional renderer, running both your game logic and rendering at 60 hz (frames per second) seems reasonable.  However, when we up the framerate to the 90 hz required for fluid virtual reality experiences, it seems like an excessive demand on the game code.  Game logic normally handles AI, player input, and various other tasks, and those things don't have to be updated quite that often.

Distributing Tasks
The solution I have come up with is to decouple the game loop from the renderer.  In the diagram below, the game loop is running at only 30 hz, while the physics, culling, and rendering loops are running at independent frequencies.

multithread1.png.9a5b81c1fd8ac31eac590a913b23894c.png

Think of this as like gears on a bicycle.  Your pedals move slowly, but your wheels spin very fast.  The game logic loop is like your pedals, while the rendering loop is like the rear wheel it is connected to.

gears.jpg.2a420b3c735ed09ebaad61597144f6b8.jpg

Previously, your game logic needed to execute in about 8 milliseconds or it would start slowing down the framerate.  With my design here, your game code gets more than 32 milliseconds to execute, a lifetime in code execution time, while a steady framerate of 90 or 60 FPS is constantly maintained.

I actually came up with this idea on my own, but upon further research I found out this is exactly what Intel recommends.  The technique is called Free Step Mode.  The diagram below does not correspond to our engine design, but it does illustrate the idea that separate systems are operating at different speeds:

7951-2.jpg.a224d98361a9c2890ece33362afa653f.jpg

If all threads are set to execute at the same frequency, it is called Lock Step Mode.

7952.jpg.913116238ba12998a0b351549d732e16.jpg

Data Exchange
Data in the game loop is exchanged with the physics and navmesh threads, but is passed one-way on to the culling loop, where it is then passed in a single direction to the rendering loop.  This means there will be a slight delay between when an event occurs and when it makes its way to the rendering thread and the final screen output, but we are talking times on the level of perhaps 10 milliseconds, so it won't be noticeable.  The user will just see smooth motion.

multithread2.png.5f514bfdc821b4e6404455b91c10248a.png

Rather than risk a lot of mutex locks, data is going to be passed one-way and each thread will have a copy of the object.  The game loop will have the full entity class, but the rendering threads will only have a stripped-down class, something like this:

class RenderObject
{
public:
	Mat4 matrix;
	AABB aabb;
	std::vector<Surface*> surfaces;
};

The original entity this object corresponds to can be modified or deleted, without fear of affecting downstream threads.  Again, Intel confirmed what I already thought would be the best approach:

Quote

In order for a game engine to truly run parallel, with as little synchronization overhead as possible, it will need to have each system operate within its own execution state with as little interaction as possible to anything else that is going on in the engine. Data still needs to be shared however, but now instead of each system accessing a common data location to say, get position or orientation data, each system has its own copy. This removes the data dependency that exists between different parts of the engine. Notices of any changes made by a system to shared data are sent to a state manager which then queues up all the changes, called messaging. Once the different systems are done executing, they are notified of the state changes and update their internal data structures, which is also part of messaging. Using this mechanism greatly reduces synchronization overhead, allowing systems to act more independently.

-Designing the Framework of a Parallel Game Engine, Jeff Andrews, Intel
https://software.intel.com/en-us/articles/designing-the-framework-of-a-parallel-game-engine

But wait, isn't latency a huge problem in VR, and I just described a system that adds latency to the renderer?  Yes and no.  The rendering thread will constantly update the headset and controller orientations, every single frame, at 90 hz.  The rest of the world will be 1-2 frames behind, but it won't matter because it's not connected to your body.  You'll get smooth head motion with zero delays while at the same time relieving the renderer of all CPU-side bottlenecks.

Even for non-VR games, I believe this design will produce a massive performance boost unlike anything you've ever seen.



6 Comments


Recommended Comments

I don't understand, why render another frame if gameplay logic wasn't updated yet and nothing in the world changed? In non-VR games you will see two identical frames?

Will we be able to increase gameplay refresh rate? I think gameplay should be updated faster than rendering. My rhythm game feels much more responsive on 200 FPS than on 60 even though I have 60 hz monitor.

Share this comment


Link to comment
Just now, Genebris said:

I don't understand, why render another frame if gameplay logic wasn't updated yet and nothing in the world changed? In non-VR games you will see two identical frames?

Will we be able to increase gameplay refresh rate? I think gameplay should be updated faster than rendering. My rhythm game feels much more responsive on 200 FPS than on 60 even though I have 60 hz monitor.

The renderer can interpolate between two frames of data, creating a new frame in-between.

The frequencies of each system can probably be made adjustable.

Share this comment


Link to comment

Genebris,

While nothing in the game world may have changed, the camera may turn suddenly and a new rendered frame is necessary, one with out lag. I don't want game logic impacting movement of camera. 

 

 

Share this comment


Link to comment
1 minute ago, wayneg said:

Genebris,

While nothing in the game world may have changed, the camera may turn suddenly and a new rendered frame is necessary, one with out lag. I don't want game logic impacting movement of camera. 

 

 

That's the one point in the renderer where a mutex will lock, and a callback will be used to update the camera rotation and position.  The only thing that has to be instantaneous are the camera and VR controllers (if present).

Share this comment


Link to comment

I got goosebumps reading this. I am so excited to be able to leverage all of these great features. I cannot tell you how excited I am to have this the software in this architecture. It is really going to answer a lot of difficult questions for us in the engineering field as well. Well done leadwerks.

Share this comment


Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Add a comment...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

  • Blog Entries

    • By jen in jen's Blog 3
      I thought I would share my experience on this; if you're working on Multiplayer, you will need to protect your packets. The solution is simple, let's go through how we can achieve this by implementing what Valve calls "challenge codes". (Some reading on the topic from Valve here: https://developer.valvesoftware.com/wiki/Master_Server_Query_Protocol#Challenge_response).
      Disclaimer: this doesn't cover other security techniques like authoritative server or encryption.
      So, I've worked on Border Recon last year (I think) and I needed a way to protect my server/client packets. There was no need for me to re-invent the wheel, I just had to copy what Valve has had for a  long time - challenge  codes.
      The idea behind challenge codes is similar to Captcha, but not exactly. Think of it like this: for every packet submitted to the server, it must be verified - how? By requiring the client to solve challenges our server provides.
      To implement this we need to have the following:
      A randomised formula in the server i.e.: a = b * c / d + e or a = b / c + d - e, be creative - it can be any combination of basic arithmetic or some fancy logic you like and can be however long as you want - do consider that the longer the formula, the more work your server has to do to make the computation.  Copy the same formula to the client. A random number generator.  So the idea here is:
      (Server) Generate a random number (see 3 above) of which the result would become the challenge code, (Server) run it through our formula and record the result. (Client) And then, we hand over the challenge code to the client to solve (an authentic client would have the same formula implemented in its program as we have on the server). For every packet received from the player, a new challenge code is created (and the player is notified of this change by the server in response). For every other packet, a new challenge code is created. (Client) Every packet sent to the server by the client must have a challenge code and its answer embedded.  (Server receives the packet) Run the challenge code again to our formula and compare the result to the answer embedded on the client's packet. (Server) If the answers are different, reject the packet, no changes to the player's state. The advantage(s) of this strategy in terms of achieving the protection we need to secure our server:
      - For every packet sent, new challenge code is created. Typically, game clients (especially FPS) will update its state in a matter of ms so even if a cheater is successful at sniffing the answer to a challenge code it would be invalidated almost instantaneously. 
      - Lightweight solution. No encryption needed. 
      Disadvantage(s):
      - The formula to answering the challenge code is embedded to the client, a cheater can de-compile the client and uncover the formula. Luckily, we have other anti-cheat solutions for that; you can implement another anti-cheat solution i.e. checking file checksums to verify the integrity of your game files and more (there are third-party anti cheat solutions out there that you can use to protect your game files).
       
       
       
    • By Josh in Josh's Dev Blog 4
      New commands in Turbo Engine will add better support for multiple monitors. The new Display class lets you iterate through all your monitors:
      for (int n = 0; n < CountDisplays(); ++n) { auto display = GetDisplay(n); Print(display->GetPosition()); //monitor XY coordinates Print(display->GetSize()); //monitor size Print(display->GetScale()); //DPI scaling } The CreateWindow() function now takes a parameter for the monitor to create the window on / relative to.
      auto display = GetDisplay(0); Vec2 scale = display->GetScale(); auto window = CreateWindow(display, "My Game", 0, 0, 1280.0 * scale.x, 720.0 * scale.y, WINDOW_TITLEBAR | WINDOW_RESIZABLE); The WINDOW_CENTER style can be used to center the window on one display.
      You can use GetDisplay(DISPLAY_PRIMARY) to retrieve the main display. This will be the same as GetDisplay(0) on systems with only one monitor.
    • By Josh in Josh's Dev Blog 1
      A huge update is available for Turbo Engine Beta.
      Hardware tessellation adds geometric detail to your models and smooths out sharp corners. The new terrain system is live, with fast performance, displacement, and support for up to 255 material layers. Plugins are now working, with sample code for loading MD3 models and VTF textures. Shader families eliminate the need to specify a lot of different shaders in a material file. Support for multiple monitors and better control of DPI scaling. Notes:
      Terrain currently has cracks between LOD stages, as I have not yet decided how I want to handle this. Tessellation has some "shimmering" effects at some resolutions. Terrain may display a wire grid on parts. Directional lights are supported but cast no shadows. Tested in Nvidia and AMD, did not test on Intel. Subscribers can get the latest beta in the private forum here.

       
       
×
×
  • Create New...