
Multithreaded Architecture in Leadwerks Game Engine 5

Josh


Leadwerks Game Engine 5 is a restructuring of Leadwerks Game Engine 4 to adapt to the demands of virtual reality and leverage the full capabilities of modern and future hardware.

Basically, the main idea with VR is that if you don't maintain a steady 90 FPS, you will throw up.  Nausea may be the worst physiological feeling you can experience; in fact, cancer patients have rated nausea as worse than pain.  Being sensitive to motion sickness myself, this is a problem I am very motivated to solve.

In a conventional renderer, running both your game logic and rendering at 60 Hz (updates per second) seems reasonable.  However, when we raise the framerate to the 90 Hz required for a fluid virtual reality experience, that becomes an excessive demand on the game code.  Game logic normally handles AI, player input, and various other tasks, and those things don't need to be updated that often.

Distributing Tasks
The solution I have come up with is to decouple the game loop from the renderer.  In the diagram below, the game loop is running at only 30 Hz, while the physics, culling, and rendering loops are running at independent frequencies.

[Image: multithread1.png]

Think of it like gears on a bicycle.  Your pedals move slowly, but your wheels spin very fast.  The game logic loop is like the pedals, while the rendering loop is like the rear wheel they drive.

[Image: gears.jpg]

Previously, your game logic needed to execute in about 8 milliseconds or it would start slowing down the framerate.  With this design, your game code gets more than 32 milliseconds to execute, a lifetime in terms of code execution, while a steady framerate of 90 or 60 FPS is constantly maintained.
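As a minimal sketch of the idea (plain C++ threads with placeholder functions, not actual Leadwerks API, and the frequencies are only illustrative), each system can run its own fixed-frequency loop on its own thread:

#include <atomic>
#include <chrono>
#include <thread>

std::atomic<bool> running{true}; //a real engine would clear this on shutdown

//Repeatedly call 'tick' at the requested frequency on the calling thread
void RunLoop(double hz, void (*tick)())
{
	using clock = std::chrono::steady_clock;
	const auto interval = std::chrono::duration_cast<clock::duration>(
		std::chrono::duration<double>(1.0 / hz));
	auto next = clock::now();
	while (running)
	{
		tick();
		next += interval;
		std::this_thread::sleep_until(next); //hold the loop to its frequency
	}
}

void UpdateGame()    { /* AI, player input, game logic */ }
void UpdatePhysics() { /* physics step */ }
void RenderFrame()   { /* culling + rendering */ }

int main()
{
	std::thread game(RunLoop, 30.0, UpdateGame);       //game logic at 30 Hz
	std::thread physics(RunLoop, 60.0, UpdatePhysics); //physics at 60 Hz
	RunLoop(90.0, RenderFrame);                        //rendering at 90 Hz
	game.join();
	physics.join();
}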

I actually came up with this idea on my own, but upon further research I found out this is exactly what Intel recommends.  The technique is called Free Step Mode.  The diagram below does not correspond to our engine design, but it does illustrate the idea that separate systems are operating at different speeds:

[Image: free step mode diagram (7951-2.jpg)]

If all threads are set to execute at the same frequency, it is called Lock Step Mode.

[Image: lock step mode diagram (7952.jpg)]

Data Exchange
Data in the game loop is exchanged with the physics and navmesh threads, but it is passed one-way to the culling loop, and from there one-way to the rendering loop.  This means there will be a slight delay between when an event occurs and when it reaches the rendering thread and the final screen output, but we are talking about perhaps 10 milliseconds, so it won't be noticeable.  The user will just see smooth motion.

[Image: multithread2.png]

Rather than risk a lot of mutex locks, data is going to be passed one-way, and each thread will have its own copy of the object.  The game loop will have the full entity class, but the rendering threads will only have a stripped-down class, something like this (a sketch of the handoff itself follows below):

#include <vector>

//Stripped-down copy of an entity, containing only what the renderer needs
class RenderObject
{
public:
	Mat4 matrix;                    //world transform
	AABB aabb;                      //bounding box for culling
	std::vector<Surface*> surfaces; //geometry to draw
};
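And here is a sketch of how the one-way handoff itself might work (World, Entity, and Draw are placeholder names, and atomically swapping a shared_ptr is just one possible mechanism, not the engine's actual API).  The game thread publishes a complete copy of the render state each tick, and the render thread only ever reads whole snapshots:

#include <atomic>
#include <memory>
#include <vector>

//Everything the renderer needs for one frame, copied from the live entities
struct Snapshot
{
	std::vector<RenderObject> objects; //RenderObject as defined above
};

std::shared_ptr<Snapshot> g_latest; //written by the game thread, read by the renderer

//Game thread: build a fresh copy each tick and publish it atomically
void PublishSnapshot(const World& world) //'World' and 'Entity' are placeholders
{
	auto snap = std::make_shared<Snapshot>();
	for (const Entity* e : world.entities)
		snap->objects.push_back({e->matrix, e->aabb, e->surfaces});
	std::atomic_store(&g_latest, snap);
}

//Render thread: grab the newest complete snapshot; never touch live entities
void DrawLatest()
{
	std::shared_ptr<Snapshot> snap = std::atomic_load(&g_latest);
	if (!snap) return;
	for (const RenderObject& obj : snap->objects)
		Draw(obj); //'Draw' is a placeholder
}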

The original entity this object corresponds to can be modified or deleted without fear of affecting downstream threads.  Again, Intel confirmed what I already thought would be the best approach:

Quote

In order for a game engine to truly run parallel, with as little synchronization overhead as possible, it will need to have each system operate within its own execution state with as little interaction as possible to anything else that is going on in the engine. Data still needs to be shared however, but now instead of each system accessing a common data location to say, get position or orientation data, each system has its own copy. This removes the data dependency that exists between different parts of the engine. Notices of any changes made by a system to shared data are sent to a state manager which then queues up all the changes, called messaging. Once the different systems are done executing, they are notified of the state changes and update their internal data structures, which is also part of messaging. Using this mechanism greatly reduces synchronization overhead, allowing systems to act more independently.

-Designing the Framework of a Parallel Game Engine, Jeff Andrews, Intel
https://software.intel.com/en-us/articles/designing-the-framework-of-a-parallel-game-engine

But wait, isn't latency a huge problem in VR, and didn't I just describe a system that adds latency to the renderer?  Yes and no.  The rendering thread will constantly update the headset and controller orientations, every single frame, at 90 Hz.  The rest of the world will be 1-2 frames behind, but it won't matter, because it isn't connected to your body.  You'll get smooth head motion with zero delay while at the same time relieving the renderer of all CPU-side bottlenecks.
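A rough sketch of that late camera update (SetCameraTransform, DrawWorld, Vec3, and Quat are assumed placeholder names, not the engine's API; Snapshot is from the earlier sketch).  The world data may be a frame or two old, but the camera transform is re-read from the tracking system at the last possible moment, every frame:

#include <mutex>

struct Pose { Vec3 position; Quat rotation; };

std::mutex g_poseMutex;
Pose g_hmdPose; //continuously updated by the tracking/input code

//Render thread, once per frame at 90 Hz
void RenderFrame(const Snapshot& snap)
{
	Pose pose;
	{
		std::lock_guard<std::mutex> lock(g_poseMutex); //the one brief lock
		pose = g_hmdPose; //grab the freshest head pose
	}
	SetCameraTransform(pose.position, pose.rotation); //zero-lag head motion
	DrawWorld(snap); //world state may be 1-2 game ticks old
}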

Even for non-VR games, I believe this design will produce a massive performance boost unlike anything you've ever seen.



6 Comments


Recommended Comments

I don't understand: why render another frame if the gameplay logic wasn't updated yet and nothing in the world changed? In non-VR games, will you just see two identical frames?

Will we be able to increase the gameplay refresh rate? I think gameplay should be updated faster than rendering. My rhythm game feels much more responsive at 200 FPS than at 60, even though I have a 60 Hz monitor.

The renderer can interpolate between two frames of data, creating a new frame in-between (see the sketch below).

The frequencies of each system can probably be made adjustable.
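For illustration, interpolating between two snapshots might look something like this (the Snapshot fields, Slerp, and DrawAt are placeholder names, not engine API):

//Blend the two most recent game snapshots to synthesize an in-between frame
void RenderInterpolated(const Snapshot& prev, const Snapshot& next, double renderTime)
{
	//0.0 at the older snapshot's timestamp, 1.0 at the newer one's
	float alpha = float((renderTime - prev.time) / (next.time - prev.time));
	for (size_t i = 0; i < next.objects.size(); ++i)
	{
		Vec3 pos = prev.objects[i].position +
			(next.objects[i].position - prev.objects[i].position) * alpha;
		Quat rot = Slerp(prev.objects[i].rotation, next.objects[i].rotation, alpha);
		DrawAt(next.objects[i], pos, rot);
	}
}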


Genebris,

While nothing in the game world may have changed, the camera may turn suddenly, and a new rendered frame is necessary, one without lag. I don't want game logic impacting movement of the camera.

That's the one point in the renderer where a mutex will lock, and a callback will be used to update the camera rotation and position.  The only things that have to be instantaneous are the camera and the VR controllers (if present).


I got goosebumps reading this. I am so excited to be able to leverage all of these great features. I cannot tell you how excited I am to have the software in this architecture. It is really going to answer a lot of difficult questions for us in the engineering field as well. Well done, Leadwerks.


