Blog Entries posted by JMichael

  1. JMichael
    So far the new Voxel ray tracing system I am working out is producing amazing results. I expect the end result will look like Minecraft RTX, but without the enormous performance penalty of RTX ray tracing.
    I spent the last several days getting the voxel update speed fast enough to handle dynamic reflections, but the more I dig into this the more complicated it becomes. Things like a door sliding open are fine, but small objects moving quickly can be a problem. The worst case scenario is when the player is carrying an object in front of them. In the video below, the update speed is fast, but the limited resolution of the voxel grid makes the reflections flash quite a lot. This is due to the reflection of the barrel itself. The gun does not contribute to the voxel data, and it looks perfectly fine as it moves around the scene, aside from the choppy reflection of the barrel in motion.
    The voxel resolution in the above video is set to about 6 centimeters. I don't see increasing the resolution as an option that will go very far. I think what is needed is a separation of dynamic and static objects. A sparse voxel octree will hold all static objects. This needs to be precompiled and it cannot change, but it will handle a large amount of geometry with low memory usage. For dynamic objects, I think a per-object voxel grid should be used. The voxel grid will move with the object, so reflections of moving objects will update instantaneously, eliminating the problem we see above.
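    Here is a minimal sketch of how that split might be organized. The types and members below are hypothetical stand-ins for illustration, not the engine's actual classes:

    #include <memory>

    // Hypothetical stand-ins for illustration only
    struct SparseVoxelOctree { /* precompiled, immutable voxels for all static geometry */ };
    struct VoxelGrid { /* small grid built once from a single object's mesh */ };

    // Static world: built at load time, never updated at runtime
    struct StaticVoxelScene
    {
        std::shared_ptr<SparseVoxelOctree> octree;
    };

    // Dynamic object: the grid travels with the object's transform, so its
    // reflection moves instantly instead of waiting for a grid rebuild
    struct DynamicVoxelObject
    {
        std::shared_ptr<VoxelGrid> grid;
        float transform[16]; // updated from the entity's 4x4 matrix each frame
    };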
    We are close to having a very good 1.0 version of this system, and I may wrap this up soon, with the current limitations. You can disable GI reflections on a per-object basis, which is what I would recommend doing with dynamic objects like the barrels above. The GI and reflections are still dynamic and will adjust to changes in the environment, like doors opening and closing, elevators moving, and lights moving and turning on and off. (If those barrels above weren't moving, showing their reflections would be absolutely no problem, as I have demonstrated in previous videos.)
    In general, I think ray tracing is going to be a feature you can take advantage of to make your games look incredible, but it is something you have to tune. The whole "Hey Josh I created this one weird situation just to cause problems and now I expect you to account for this scenario AAA developers would purposefully avoid" approach will not work with ray tracing. At least not in the 1.0 release. You're going to want to avoid the bad situations that can arise, but they are pretty easy to prevent. Perhaps I can combine screen-space reflections with voxels for reflections of dynamic objects before the first release.
    If you are smart about it, I expect your games will look like this:
    I had some luck with real-time compression of the voxel data into BC3 (DXT5) format. It adds some delay to the updating, but if we are not trying to show moving reflections much then that might be a good tradeoff. Having only 25% of the data being sent to the GPU each frame is good for performance.
    Another change I am going to make is a system that triggers voxel refreshes, instead of constantly updating the voxel data no matter what. If you sit still and nothing is moving, then the voxel data won't get recalculated and processed, which will make performance even faster. This makes sense if we expect most of the data not to change each frame.
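    A minimal sketch of how that trigger could work, using a simple dirty flag (the names here are placeholders, not the engine's actual members):

    // When anything that contributes to the voxel data moves, mark the grid dirty.
    struct VoxelGrid
    {
        bool dirty = false;

        void Invalidate() { dirty = true; }

        void Update()
        {
            if (!dirty) return; // nothing changed, skip the rebuild entirely
            Rebuild();          // revoxelize and reprocess only when needed
            dirty = false;
        }

    private:
        void Rebuild() { /* revoxelize affected geometry */ }
    };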
    I haven't run any performance benchmarks yet, but from what I am seeing I think the performance penalty for using this system will be basically zero, even on integrated graphics. Considering what a dramatic upgrade in visuals this provides, that is very impressive.
    In the future, I think I will be able to account for motion in voxel ray tracing, as well as add high-definition polygon raytracing for sharp reflections, but it's not worth delaying the release of the engine. Hopefully this article showed there are many factors, and many approaches we can use to optimize for different aspects of the effect. For the 1.0 release of our new engine, I think we want to emphasize performance above all else.
  2. JMichael
    Crowdfunding campaigns are a great way to kick off marketing for a game or product, with several benefits.
    Free promotion to your target audience.
    Early validation of an idea before you create the product. A successful crowdfunding campaign demonstrates organic consumer interest, which makes bloggers and journalists much more willing to give your project coverage.
    Oh yeah, there's also the financial aspect, but that's actually the least important part. If you make $10,000 in crowdfunding, you can leverage that campaign to make far more than that amount in sales of your final product. I did over a million dollars in sales on Steam starting with a $40,000 Kickstarter project.
    There are two types of crowdfunding projects. The first is something you don't really want to do unless you get paid enough to make it worthwhile. For this type of project you should set a goal for the minimum amount of money you would be able to finish the project for. There is more uncertainty with this type of campaign, but if you don't meet your goal you don't have to deliver anything. Failing early can be a good thing, because there's nothing worse than building a product and having nobody buy it. With the successful Leadwerks for Linux Kickstarter campaign, people were asking for Linux support and I said "Okay, put your money where your mouth is" and they did.
    The second type of project is something you would probably do anyways, and a crowdfunding campaign just gives you a way to test demand and make some extra cash early. For this type of project you should set a relatively low goal, something you think you can earn quickly. If your campaign fails, that puts you in an awkward position because then you have to either cancel the project or admit you didn't actually need the money. A successful campaign does put you on the hook with a delivery date and a firm description of the product, so make sure your goals are realistic and attainable within your planned time frame.
    For a campaign to be successful you need to prepare. Don't just kick off a campaign without having an existing fanbase. You need to build an email list of people interested in your project before the campaign starts. But if you haven't done that yet, there is another way...
    With my crowdfunding campaign for the new engine coming up in October, there is an opportunity for others to latch on to the success of the upcoming campaign. I have an extensive email list I don't use very often, and my more formal blog articles regularly get 20,000+ views. Plus I now have some reach on Steam, and a lot more customers than back in 2013. I expect my campaign will hit its target goal within the first few days. Once my goal is reached, it would be easy for me to post an announcement saying "Oh hey, check out these other projects built with my technology" and add links on my project page. Your project could link back to mine and to others, and we can create a network of projects utilizing the new game engine technology. I think my new campaign will be very successful, and jumping onto that will probably give you a better result than you would get otherwise.
    Another thing to consider is that with the new ray-tracing technology, even simple scenes look incredible. I think there is a temporary window of opportunity where games that utilize this type of technology will stand out and automatically get more attention because the graphics look so dramatically better. My final results will make your game look like the shot from Minecraft RTX below, but the voxel method I am using will run fast on all hardware:

    So if you have a game project made with the new engine, or something that would look good in the new engine, there is an opportunity to piggyback your crowdfunding campaign off of mine. What makes a good game pitch? Demonstrating gameplay, having a playable demo, a track record of past published games, and gameplay videos all make a much better case than pages of bullet points. (I like animated GIFs because they show a lot more than a static screenshot but they are dead simple and fun.) You need to inspire the audience to believe in your concept, and for them to believe in your ability to deliver. So put your best foot forward!
  3. JMichael
    I've been working to make my previously demonstrated voxel ray tracing system fully dynamic. Getting the voxel data to update fast enough was a major challenge, and it forced me to rethink the design. In the video below you can see the voxel data being updated at a sufficient speed. Lighting has been removed, as I need to change the way this runs.
    I plan to keep two copies of the data in memory and let the GPU interpolate smoothly in between them, in order to smooth out the motion. Next I need to add the direct lighting and GI passes back in, which will add an additional small delay but hopefully be within a tolerable threshold.
  4. JMichael
    A new beta update is available. The raytracing implementation has been sped up significantly. The same limitations of the current implementation still apply, but the performance will be around 10x faster, as the most expensive part of the raytrace shader has been precomputed and cached.
    The Material::SetRefraction method has also been exposed to Lua. The Camera::SetRefraction method is now called "SetRefractionMode".
    The results are so good, I don't have any plans to use any kind of screen-space reflection effect.
     
  5. JMichael
    An update is available for beta testers.
    All Lua errors should now display the error message and open the script file and go to the correct line the error occurs on.
    The voxel raytracing system is now accessible. To enable it, just call Camera:SetGIMode(true).
    At this time, only a single voxel grid with dimensions of 32 meters, centered at the origin, is in use.
    The voxel grid will only be generated once, at the time the SetGIMode() method is called. Only the models that have already been loaded will be included when the voxel grid is built.
    Building takes several seconds in debug mode but less than one second in release.
    Raytraced GI and reflections do not take into account material properties yet, so there is no need to adjust PBR material settings at this time.
    Skyboxes and voxels are not currently combined. Only one or the other is shown.
    Performance is much faster than Nvidia RTX but still has a lot of room for improvement. If it is too slow for you right now, use a smaller window resolution. It will get faster as I work on it more.
    The raytracing stuff makes such a huge difference that I wanted to get a first draft out to the testers as quickly as possible. I am very curious to see what you are able to do with it.
  6. JMichael
    PBR materials look nice, but their reflections are only as good as the reflection data you have. Typically this is done with hand-placed environment probes that take a long time to lay out and display a lot of visual artifacts. Nvidia's RTX raytracing technology is interesting, but it struggles to run old games on a super-expensive GPU. My goal in Leadwerks 5 is to have automatic reflections and global illumination that don't require any manual setup, with fast performance.
    I'm on the final step of integrating our voxel raytracing data into the standard lighting shader and the results are fantastic. I found I could compress the 3D textures in BC3 format in real-time and save a ton of memory that way. However, I discovered that only about 1% of the 3D voxel texture actually has any data in it! That means there could be a lot of room for improvement with a sparse voxel octree texture of some sort, which could allow greater resolution. In any case, the remaining implementation of this feature will be very interesting. (I believe the green area on the back wall is an artifact caused by the BC3 compression.)
    I think I can probably render the raytracing component of the scene in a separate, smaller buffer and then denoise it like I did with SSAO, to make the performance hit of this negligible. Another interesting thing is that the raytracing automatically creates its own ambient occlusion effect.
    Here is the current state, showing the raytraced component only. It works great with our glass refraction effects.
    Next I will start blending it into the PBR material lighting calculation a little better.
    Here's an updated video that shows it worked into the lighting more:
     
  7. JMichael
    An update is available that adds the new refraction effect. It's very easy to create a refractive transparent material:
    auto mtl = CreateMaterial();
    mtl->SetTransparent(true);
    mtl->SetRefraction(0.02);

    The default FPS example shows some nice refraction, with two overlapping layers of glass, with lighting on all layers. It looks great with some of @TWahl's PBR materials.

    If you want to control the strength of the refraction effect on a per-pixel basis add an alpha channel to your normal map.
    I've configured the launch.json for Visual Studio Code so that the currently selected file is passed to the program on the command line. By default, the game executable will run the "Scripts/Main.lua" file. If, however, the currently selected Lua file in the VSCode IDE is located in "Scripts/Examples", the executable will launch that one instead. This design allows you to quickly run a different script without overwriting Main.lua, but won't accidentally run a different script if you are working on something else.
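    As an illustration, a launch configuration along these lines could pass the selected file to the executable. This is only a sketch to show the idea; the debugger type and program path are placeholders, not the shipped configuration:

    {
        "version": "0.2.0",
        "configurations": [
            {
                "name": "Run Game (Debug)",
                "type": "cppvsdbg",
                "request": "launch",
                "program": "${workspaceFolder}/Game_debug.exe",
                "args": [ "${file}" ],
                "cwd": "${workspaceFolder}"
            }
        ]
    }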

    The whole integration with Visual Studio Code has gotten really nice.

    A new option "frameBufferColorFormat" has been added to the Config/settings.json file to control the default color format for texture buffers. I have it set to 37 (VK_FORMAT_R8G8B8A8_UNORM), but you can set it to 91 (VK_FORMAT_R16G16B16A16_UNORM) for high-def color, although you probably won't see anything without an additional tone mapping post-processing effect.
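    For example, switching to the high-definition format just means changing that one value in Config/settings.json (shown here in isolation; your file will contain other settings as well):

    {
        "frameBufferColorFormat": 91
    }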
    Slow performance in the example game has been fixed. There are a few things going on here. Physics weren't actually the problem; it was the Lua debugger. The biggest problem was an empty Update() function that all the barrels had in their script. Now, this should not really be a problem, but I suspect the routine in vscode-debugger.lua that finds the matching chunk name is slow and can be optimized quite a lot. I did not want to make any additional changes to it right now, but in the future I think this can be further improved. But anyways, the FPS example will be nice and snappy now and runs normally.
    Application shutdown will be much faster now, as I did some work to streamline the way the engine cleans itself up upon termination.
  8. JMichael
    Heat haze is a difficult problem. A particle emitter is created with a transparent material, and each particle warps the background a bit. The combined effect of lots of particles gives the whole background a nice shimmering wavy appearance. The problem is that when two particles overlap one another they don't blend together, because the last particle drawn is using the background of the solid world for the refracted image. This can result in a "popping" effect when particles disappear, as well as apparent seams on the edges of polygons.

    In order to do transparency with refraction the right way, we are going to render all our transparent objects into a separate color texture and then draw that texture on top of the solid scene. We do this in order to accommodate multiple layers of transparency and refraction. Now, the most correct way to handle multiple layers would be to render the solid world, render the first transparent object, then switch to another framebuffer and use the previous framebuffer's color attachment as the source of your refraction image. This could be done per-object, flipping back and forth between two framebuffers, but it would get very expensive, and even that still wouldn't be enough.
    If we render all the transparent surfaces into a single image, we can blend their normals, refractive index, and other properties, and come up with a single refraction vector that combines the underlying surfaces in the best way possible.
    To do this, the transparent surface color is rendered into the first color attachment. Unlike deferred lighting, the pixels at this point are fully lit.

    The screen normals are stored in an additional color attachment. I am using world normals in this shot but later below I switched to screen normals:

    These images are drawn on top of the solid scene to render all transparent objects at once. Here we see the green box in the foreground is appearing in the refraction behind the glass dragon.

    To prevent this from happening, we need to add another color texture to the framebuffer and render the pixel Z position into it. I am using the R32_SFLOAT format. I use the separate blend mode feature in Vulkan and set the blend op to minimum, so that the smallest value always gets saved in the texture. The Z position is divided by the camera far range in the fragment shader, so that the saved values are always between 0 and 1. The clear color for this attachment is set to 1,1,1,1, so any value written into the buffer will replace the background. Note this is the depth of the transparent pixels, not the whole scene, so the area in the center where the dragon is occluded by the box is pure white, since those pixels were not drawn.
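    As a rough sketch of what that per-attachment blend state can look like in Vulkan (the engine's actual setup may differ; this assumes the independentBlend device feature so this attachment can blend differently from the color attachment):

    // Blend state for the Z-position attachment (R32_SFLOAT).
    // VK_BLEND_OP_MIN keeps whichever value is smaller, so the nearest
    // transparent surface always wins. Note that the blend factors are
    // ignored for MIN/MAX ops; only the op itself matters.
    VkPipelineColorBlendAttachmentState zBlend = {};
    zBlend.blendEnable         = VK_TRUE;
    zBlend.srcColorBlendFactor = VK_BLEND_FACTOR_ONE;
    zBlend.dstColorBlendFactor = VK_BLEND_FACTOR_ONE;
    zBlend.colorBlendOp        = VK_BLEND_OP_MIN;
    zBlend.srcAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
    zBlend.dstAlphaBlendFactor = VK_BLEND_FACTOR_ONE;
    zBlend.alphaBlendOp        = VK_BLEND_OP_MIN;
    zBlend.colorWriteMask      = VK_COLOR_COMPONENT_R_BIT;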

    In the transparency pass, the Z position of the transparent pixel is compared to the Z position at the refracted texcoords. If the refracted position is closer to the camera than the transparent surface, the refraction is disabled for that pixel and the background directly behind the pixel is shown instead. There is some very slight red visible in the refraction, but no green.

    Now let's see how well this handles heat haze / distortion. We want to prevent the problem when two particles overlap. Here is what a particle emitter looks like when rendered to the transparency framebuffer, this time using screen-space normals. The particles aren't rotating so there are visible repetitions in the pattern, but that's okay for now.

    And finally here is the result of the full render. As you can see, the seams and popping are gone, and we have a heavy but smooth distortion effect. Particles can safely overlap without causing any artifacts, as their normals are just blended together and combined to create a single refraction angle.

     
  9. JMichael
    One of the downsides of deferred rendering is that it isn't very good at handling transparent surfaces. Since we have moved to a new forward renderer, one of my goals in Leadwerks 5 is to have easy, hassle-free transparency with lighting and refraction that just works.
    Pre-multiplied alpha provides a better blending equation than traditional alpha blending. I'm not going to go into the details here, but it makes it so the transparent surface can be brighter than the underlying surface, as you can see on the vehicle's windshield here:

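    In short, with traditional blending the source color is multiplied by alpha at blend time, while pre-multiplied alpha bakes that multiply into the source color, so the blend becomes a simple add:

    traditional alpha:      result = src.rgb * src.a + dst.rgb * (1 - src.a)
    pre-multiplied alpha:   result = src.rgb + dst.rgb * (1 - src.a)

    Since src.rgb is no longer scaled down by alpha, a transparent surface can add light on top of whatever is behind it, which is how the windshield can appear brighter than the background.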
    I've been working for a while to build an automatic post-processing step into the engine that occurs when a transparent object is onscreen. If no transparent objects are onscreen, then the post-processing step can be skipped.
    You can also call Camera::SetRefraction(false) and just use regular GPU-blended transparency with no fancy refraction of the background, but I plan to enable it by default.
    To use this effect, there is absolutely nothing you have to do except to create a material, make it transparent, and apply it to a mesh somewhere.
    auto mtl = CreateMaterial();
    mtl->SetTransparent(true);
    mtl->SetColor(1,1,1,0.5);

    The lower the alpha value of the material color, the more see-through it is. You can use an alpha value of zero to make a refractive predator-like effect.
     
  10. JMichael
    A new update is available that improves Lua integration in Visual Studio Code and fixes Vulkan validation errors.
    The SSAO effect has been improved with a denoise filter. Similar to Nvidia's RTX raytracing technology, this technique smooths the results of the SSAO pass, resulting in a better appearance.

    It also requires far fewer samples, and the SSAO pass can be run at a lower resolution. I lowered the number of SSAO samples from 64 to 8 and decreased the area of the image to 25%, and it looks better than the SSAO in Leadwerks 4, which could appear somewhat grainy. With the default SSAO and bloom effects enabled, I see no difference in framerate compared to the performance when no post-processing effects are in use.
    I upgraded my install of the Vulkan SDK to 1.2 and a lot of validation errors were raised. They are all fixed now. The image layout transition stuff is ridiculously complicated, and I can see no reason why this is even a feature! This could easily be handled by the driver just storing the current state and switching whenever needed, which is exactly what I ended up doing with my own wrapper class. In theory, everything should work perfectly on all supported hardware now since the validation layers say it is correct.
    You can now explicitly state the validation layers you want loaded, in settings.json, although there isn't really any reason to do this:
    "vkValidationLayers": { "debug": [ "VK_LAYER_LUNARG_standard_validation", "VK_LAYER_KHRONOS_validation" ] } Debugging Lua in Visual Studio Code is improved. The object type will now be shown so you can more easily navigate debug information.

    That's all for now!
  11. JMichael
    A new update is available to beta testers. This makes some pretty big changes so I wanted to release this before doing any additional work on the post-processing effects system.
    Terrain Fixed
    Terrain system is working again, with an example for Lua and C++.
    New Configuration Options
    New settings have been added in the "Config/settings.json" file:
    "MultipassCubemap": false, "MaxTextures": 512, "MaxCubemaps": 16, "MaxShadowmaps": 64, "MaxIntegerTextures": 32, "MaxUIntegerTextures": 32, "MaxCubeShadowmaps": 64, "MaxVolumeTextures": 16, "LuaErrorCommand": "code", "LuaErrorCommandArguments": "-g \"$(CurrentFile)\":$(LineNumber) \"$(AppDir)\"" The max texture values will allow you to reduce the array size the engine requires for textures. If you have gotten an error message about "not enough texture units" this setting can be used to bring your application down under the limit your hardware has.
    The Lua settings define the command that is run when a Lua error occurs. By default this will open Visual Studio code and display the file and line number an error occurs on. 
    String Classes
    I've implemented two string classes for better string handling. The String and WString classes are derived from both std::string / std::wstring AND the Object class, which means they can be used in a variable that accepts an object (like the Event.source member). 8-bit character strings will automatically convert to wide strings, but not the other way around. All the Load commands used to have two overloads, one for narrow and one for wide strings. That has been replaced with a single command that accepts a WString, so you can call LoadTexture("brick.dds") without having to specify a wide string like this: L"brick.dds".
    The global string functions like Trim, Right, Mid, etc. have been added as methods on the two string classes. Eventually the global functions will be phased out.
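    A minimal sketch of the idea (not the engine's actual declarations), with an implicit constructor handling the narrow-to-wide conversion:

    #include <string>

    class Object {}; // stand-in for the engine's base class

    class WString : public std::wstring, public Object
    {
    public:
        WString() {}
        WString(const std::wstring& s) : std::wstring(s) {}
        WString(const std::string& s) : std::wstring(s.begin(), s.end()) {} // naive 8-bit to wide conversion
        WString Trim() const { return *this; } // placeholder stub: string helpers become methods instead of globals
    };

    // A single overload accepting a WString now serves both cases, so
    // LoadTexture("brick.dds") and LoadTexture(L"brick.dds") both work.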
    Lua Integration in Visual Studio Code
    Lua integration in Visual Studio Code is just about finished and it's amazing! Errors are displayed, debugging works great, and console output is displayed, just like any serious modern programming language. Developing with Lua in Leadwerks 5 is going to be a blast!

    Lua launch options are now available for Debug, Release, Debug 64f, and Release 64f.
    I feel the Lua support is good enough now that the .bat files are not needed. It's easier just to open VSCode and copy the example you want to run into Main.lua. These are currently located in "Scripts/Examples" but they will be moved into the documentation system in time.
    The black console window is going away and all executables are by default compiled as a windowed application, not a console app. The console output is still available in Visual Studio in the debug output, or it can be piped to a file with a .bat launcher.
    See the notes here on how to get started with VSCode and Lua.
  12. JMichael
    A new update is available for beta testers.
    The dCustomJoints and dContainers DLLs are now optional if your game is not using any joints (even if you are using physics).
    The following methods have been added to the collider class. These let you perform low-level collision tests yourself:
    Collider::ClosestPoint
    Collider::Collide
    Collider::GetBounds
    Collider::IntersectsPoint
    Collider::Pick

    The PluginSDK now supports model saving and an OBJ save plugin is provided. It's very easy to convert models this way using the new Model::Save() method:
    auto plugin = LoadPlugin("Plugins/OBJ.dll");
    auto model = LoadModel(world,"Models/Vehicles/car.mdl");
    model->Save("car.obj");

    Or create models from scratch and save them:
    auto box = CreateBox(world,10,2,10);
    box->Save("box.obj");

    I have used this to recover some of my old models from Leadwerks 2 and convert them into GLTF format:

    There is additional documentation now on the details of the plugin system and all the features and options.
    Thread handling is improved so you can run a simple application that handles 3D objects and exits out without ever initializing graphics.
    Increased strictness of headers for private and public members and methods.
    Fixed a bug where directional lights couldn't be hidden. (Check out the example for the CreateLight command in the new docs.)
    All the Lua scripts in the "Scripts\Start" folder are now executed when the engine initializes, instead of when the first script is run. These will be executed for all programs automatically, so it is useful for automatically loading plugins or workflows. If you don't want to use Lua at all, you can delete the "Scripts" folder and the Lua DLL, but you will need to load any required plugins yourself with the LoadPlugin command.
    Shadow settings are simplified. In Leadwerks 4, entities could be set to static or dynamic shadows, and lights could use a combination of static, dynamic, and buffered modes. You can read the full explanation of this feature in the documentation here. In Leadwerks 5, I have distilled that down to two commands. Entity::SetShadows accepts a boolean, true to cast shadows and false not to. Additionally, there is a new Entity::MakeStatic method. Once this is called on an entity it cannot be moved or changed in any way until it is deleted. If MakeStatic() is called on a light, the light will store an intermediate cached shadowmap of all static objects. When a dynamic object moves and triggers a shadow redraw, the light will copy the static shadow buffer to the shadow map and then draw any dynamic objects in its range. For example, if a character walks across a room with a single point light, the character model has to be drawn six times but the static scene geometry doesn't have to be redrawn at all. This can result in an enormous reduction of rendered polygons. (This is something id Software's Doom engine does, although I implemented it first.)
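    A hypothetical usage sketch of those two commands (the light itself is assumed to be created elsewhere; only SetShadows and MakeStatic are the commands described above):

    // Assume "light" is a point light entity created elsewhere in the scene.
    light->SetShadows(true); // true to cast shadows, false not to
    light->MakeStatic();     // freeze the light and cache a shadow map of all static geometry
    // From now on, when a dynamic object triggers a shadow redraw, the cached static
    // shadow map is copied in first and only the dynamic objects are re-rendered.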
    In the documentation example the shadow polygon count is 27,000 until I hit the space key to make the light static. The light then renders the static scene (everything except the fan blade) into an image, and thereafter that cached image is copied to the shadow map before the dynamic scene objects are drawn. This causes the number of shadow polygons rendered to drop dramatically, since the whole scene does not have to be redrawn each frame.

    I've started using animated GIFs in some of the documentation pages and I really like it. For some reason GIFs feel so much more "solid" and stable. I always think of web videos as some iframe thing that loads separately, lags and doesn't work half the time, and is embedded "behind" the page, but a GIF feels like it is a natural part of the page.

    My plan is to put 100% of my effort into the documentation and make that as good as possible. Well, if there is an increased emphasis on one thing, that necessarily means a decreased emphasis on something else. What am I reducing? I am not going to create a bunch of web pages explaining what great features we have, because the documentation already does that. I also am not going to attempt to make "how to make a game" tutorials. I will leave that to third parties, or defer it into the future. My job is to make attractive and informative primary reference material for people who need real usable information, not to teach non-developers to be developers. That is my goal with the new docs.
  13. JMichael
    A new update is available to beta testers.
    I updated the project to the latest Visual Studio 16.6.2 and adjusted some settings. Build speeds are massively improved. A full rebuild of your game in release mode will now take less than ten seconds. A normal debug build, where just your game code changes, will take about two seconds. (I found that "Whole program optimization" simply does not work in the latest VS, and when I disabled it everything was much faster. Plus there's the precompiled header I added a while back.)
    Delayed DLL loading is enabled. This makes it so the engine only loads DLLs when they are needed. If they aren't used by your application, they don't have to be included. If you are not making a VR game, you do not need to include the OpenVR DLL. You can create a small utility application that requires no DLLs in as little as 500 kilobytes. It was also found that the dContainers lib from Newton Dynamics is not actually needed, although the corresponding DLLs are (if your application uses physics).
    A bug in Visual Studio was found that requires all release builds to add the setting "/OPT:NOREF,NOICF,NOLBR" in the linker options:
    https://github.com/ThePhD/sol2/issues/900
    A new StringObject class derived from both the WString and Object classes is added. This allows the FileSystemWatcher to store the file path in the event source member when an event occurs. A file rename event will store the old file name in the event.extra member.
    The Entity::Pick syntax has changed slightly, removing the X and Y components for the vector projected in front of the entity. See the new documentation for details.
    The API is being finalized and the new docs system has a lot of finished C++ pages. There's a lot of new stuff documented in there like message dialogs, file and folder request dialogs, world statistics, etc. The Buffer class (which replaces the LE4 "Bank" class) is official and documented. The GUI class has been renamed to "Interface".
    Documentation has been organized by area of functionality instead of class hierarchy. It feels more intuitive to me this way.

    I've also made progress using SWIG to make a wrapper for the C# programming language, with the help of @klepto2 and @carlb. It's not ready to use yet, but the feature has gone from "unknown" to "okay, this can be done". (Although SWIG also supports Lua, I think Sol2 is better suited for this purpose.)
  14. JMichael
    The Leadwerks 5 beta has been updated.
    A new FileSystemWatcher class has been added. This can be used to monitor a directory and emit events when a file is created, deleted, renamed, or overwritten. See the documentation for details and an example. Texture reloading now works correctly. I have only tested reloading textures, but other assets might work as well.
    CopyFile() will now work with URLs as the source file path, turning it into a download command.
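    For example, something along these lines would download the file instead of copying it locally (reusing the sample URL from the documentation repository, and assuming the usual source-then-destination argument order):

    CopyFile("https://www.github.com/Leadwerks/Documentation/raw/master/Assets/brickwall01.dds", "brickwall01.dds");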
    Undocumented class methods and members not meant for end users are now made private. The goal is for 100% of public methods and members to be documented so there is nothing that appears in intellisense that you aren't allowed to use.
    Tags, key bindings, and some other experimental features are removed. I want to develop a more cohesive design for this type of stuff, not just add random ways to do things differently.
    Other miscellaneous small fixes.
  15. JMichael
    An often-requested feature, terrain building commands, is being implemented in Leadwerks 5. Here is my script to create a terrain. This creates a 256 x 256 terrain with one terrain point every meter, and a maximum height of +/- 50 meters:
    --Create terrain
    local terrain = CreateTerrain(world,256,256)
    terrain:SetScale(256,100,256)

    Here is what it looks like:

    A single material layer is then added to the terrain.
    --Add a material layer
    local mtl = LoadMaterial("Materials/Dirt/dirt01.mat")
    local layerID = terrain:AddLayer(mtl)

    We don't have to do anything else to make the material appear because by default the entire terrain is set to use the first layer, if a material is available there:

    Next we will raise a few terrain points.
    --Modify terrain height
    for x=-5,5 do
        for y=-5,5 do
            h = (1 - (math.sqrt(x*x + y*y)) / 5) * 20
            terrain:SetElevation(127 + x, 127 + y, h)
        end
    end

    And then we will update the normals for that whole section, all at once. Notice that we specify a larger grid for the normals update, because the terrain points next to the ones we modified will have their normals affected by the change in height of the neighboring pixel.
    --Update normals of modified and neighboring points
    terrain:UpdateNormals(127 - 6, 127 - 6, 13, 13)

    Now we have a small hill.

    Next let's add another layer and apply it to terrain points that are on the side of the hill we just created:
    --Add another layer
    mtl = LoadMaterial("Materials/Rough-rockface1.json")
    rockLayerID = terrain:AddLayer(mtl)

    --Apply layer to sides of hill
    for x=-5,5 do
        for y=-5,5 do
            slope = terrain:GetSlope(127 + x, 127 + y)
            alpha = math.min(slope / 15, 1.0)
            terrain:SetMaterial(rockLayerID, 127 + x, 127 + y, alpha)
        end
    end

    We could improve the appearance by giving it a more gradual change in the rock layer alpha, but it's okay for now.

    This gives you an idea of the basic terrain building API in Leadwerks 5, and it will serve as the foundation for more advanced terrain features. This will be included in the next beta.
  16. JMichael
    I've moved the GI calculation over to the GPU and our Vulkan renderer in Leadwerks Game Engine 5 beta now supports volume textures. After a lot of trial and error I believe I am closing in on our final techniques. Voxel GI always involves a degree of light leakage, but this can be mitigated by setting a range for the ambient GI. I also implemented a hard reflection which was pretty easy to do. It would not be much more difficult to store the triangles in a lookup table for each voxel in order to trace a finer polygon-based ray for results that would look the same as Nvidia RTX but perform much faster.
    The video below is only performing a single GI bounce at this time, and it is displaying the lighting on the scene voxels, not on the original polygons. I am pretty pleased with this progress and I think the final results will look great and run fast. In addition, the need for environment probes placed in the scene will soon forever be a thing of the past.

    There is still a lot of work to do on this, but I would say that this feature just went from something I was very overwhelmed and intimidated by to something that is definitely under control and feasible.
    Also, we might not use cascaded shadow maps (for directional lights) at all but instead rely on a voxel raytrace for directional light shadows. If it works, that would be my preference because CSMs waste so much space and drawing a big outdoors scene 3-4 times can be taxing.
  17. JMichael
    I am happy to show you a preview of the new documentation system I am working on:

    Let's take a look at what is going on here:
    It's dark, so you can stare lovingly at it for hours without going blind.
    You can switch between languages with the links in the header.
    Lots of internal cross-linking for easy access to relevant information.
    Extensive, all-inclusive documentation, including Enums, file formats, constructors, and public members.
    Data is fetched from a Github repository and allows user contributions.

    I am actually having a lot of fun creating this. It is very fulfilling to be able to go in and build something with total attention to detail.
  18. JMichael
    All this month I have been working on a sort of academic paper for a conference I will be speaking at towards the end of the year. This paper covers details of my work for the last three years, and includes benchmarks that demonstrate the performance gains I was able to get as a result of the new design, based on an analysis of modern graphics hardware.
    I feel like my time spent has not been very efficient. I have not written any code in a while, but it's not like I was working that whole time. I had to just let the ideas settle for a bit.
    Activity doesn't always mean progress.
    Anyways, I am wrapping up now, and am very pleased with the results. It all turned out much much better than I was expecting.
  19. JMichael
    TLDR: I made a long-term bet on VR and it's paying off. I haven't been able to talk much about the details until now.
    Here's what happened:
    Leadwerks 3.0 was released during GDC 2013. I gave a talk on graphics optimization and also had a booth at the expo. Something else significant happened that week.  After the expo closed I walked over to the Oculus booth and they let me try out the first Rift prototype.
    This was a pivotal time both for us and for the entire game industry. Mobile was on the downswing but there were new technologies emerging that I wanted to take advantage of. Our Kickstarter campaign for Linux support was very successful, reaching over 200% of its goal. This coincided with a successful Greenlight campaign to bring Leadwerks Game Engine to Steam in the newly-launched software section. The following month Valve announced the development of SteamOS, a Linux-based operating system for the Steam Machine game consoles. Because of our work in Linux and our placement in Steam, I was fortunate enough to be in close contact with much of the staff at Valve Software.
    The Early Days of VR
    It was during one of my visits to Valve HQ that I was able to try out a prototype of the technology that would go on to become the HTC Vive. In September of 2014 I bought an Oculus Rift DK2 and first started working with VR in Leadwerks. So VR has been something I have worked on in the background for a long time, but I was looking for the right opportunity to really put it to work. In 2016 I felt it was time for a technology refresh, so I wrote a blog about the general direction I wanted to take Leadwerks in. A lot of it centered around VR and performance. I didn't really know exactly how things would work out, but I knew I wanted to do a lot of work with VR.
    A month later I received a message on this forum that went something like this (as I recall):
    I thought "Okay, some stupid teenager, where is my ban button?", but when I started getting emails with nasa.gov return addresses I took notice.
    Now, Leadwerks Software has a long history of use in the defense and simulation industries, with orders for software from Northrop Grumman, Lockheed Martin, the British Royal Navy, and probably some others I don't know about. So NASA making an inquiry about software isn't too strange. What was strange was that they were very interested in meeting in person.
    Mr. Josh Goes to Washington
    I took my first trip to Goddard Space Center in January 2017 where I got a tour of the facility. I saw robots, giant satellites, rockets, and some crazy laser rooms that looked like a Half-Life level. It was my eleven-year-old self's dream come true. I was also shown some of the virtual reality work they are using Leadwerks Game Engine for. Basically, they were taking high-poly engineering models from CAD software and putting them into a real-time visualization in VR. There are some good reasons for this. VR gives you a stereoscopic view of objects that is far superior to a flat 2D screen. This makes a huge difference when you are viewing complex mechanical objects and planning robotic movements. You just can't see things on a flat screen the same way you can see them in VR. It's like the difference between looking at a photograph of an object versus holding it in your hands.

    What is even going on here???
    CAD models are procedural, meaning they have a precise mathematical formula that describes their shape. In order to render them in real-time, they have to be converted to polygonal models, but these objects can be tens of millions of polygons, with details down to the threading on individual screws, and they were trying to view them in VR at 90 frames per second! Now with virtual reality, if there is a discrepancy between what your visual system and your vestibular system perceive, you will get sick to your stomach. That's why it's critical to maintain a steady 90 Hz frame rate. The engineers at NASA told me they first tried to use Unity3D but it was too slow, which is why they came to me. Leadwerks was giving them better performance, but it still was not fast enough for what they wanted to do next. I thought "these guys are crazy, it cannot be done".
    Then I remembered something else people said could never be done.

    So I started to think "if it were possible, how would I do it?" They had also expressed interest in an inverse kinematics simulation, so I put together this robotic arm control demo in a few days, just to show what could easily be done with our physics system.
     
    A New Game Engine is Born
    With the extreme performance demands of VR and my experience writing optimized rendering systems, I saw an opportunity to focus our development on something people can't live without: speed. I started building a new renderer designed specifically around the way modern PC hardware works. At first I expected to see performance increases of 2-3x. Instead what we are seeing are 10-40x performance increases under heavy loads. During this time I stayed in contact with people at NASA and kept them up to date on the capabilities of the new technology.
    At this point there was still nothing concrete to show for my efforts. NASA purchased some licenses for the Enterprise edition of Leadwerks Game Engine, but the demos I made were free of charge and I was paying my own travel expenses. The cost of plane tickets and hotels adds up quickly, and there was no guarantee any of this would work out. I did not want to talk about what I was doing on this site because it would be embarrassing if I made a lot of big plans and nothing came of it. But I saw a need for the technology I created and I figured something would work out, so I kept working away at it.
    Call to Duty
    Today I am pleased to announce I have signed a contract to put our virtual reality expertise to work for NASA. As we speak, I am preparing to travel to Washington D.C. to begin the project. In the future I plan to provide support for aerospace, defense, manufacturing, and serious games, using our new technology to help users deliver VR simulations with performance and realism beyond anything that has been possible until now.
    My software company and relationship with my customers (you) is unaffected. Development of the new engine will continue, with a strong emphasis on hyper-realism and performance. I think this is a direction everyone here will be happy with. I am going to continue to invest in the development of groundbreaking new features that will help in the aerospace and defense industries (now you understand why I have been talking about 64-bit worlds) and I think a great many people will be happy to come along for the ride in this direction.
    Leadwerks is still a game company, but our core focus is on enabling and creating hyper-realistic VR simulations. Thank you for your support and all the suggestions and ideas you have provided over the years that have helped me create great software for you. Things are about to get very interesting. I can't wait to see what you all create with the new technology we are building.
     
  20. JMichael
    After working out a thread manager class that stores a stack of C++ command buffers, I've got a pretty nice proof of concept working. I can call functions in the game thread and the appropriate actions are pushed onto a command buffer that is then passed to the rendering thread when World::Render is called. The rendering thread is where all the (currently) OpenGL code is executed. When you create a context or load a shader, all it does is create the appropriate structure and send a request over to the rendering thread to finish the job:

    Consequently, there is currently no way of detecting if OpenGL initialization fails(!), and in fact the game will still run along just fine without any graphics rendering! We obviously need a mechanism to detect this, but it is interesting that you can now load a map and run your game without ever creating a window or graphics context. The following code is perfectly legitimate in Leadwerks 5:
    #include "Leadwerks.h" using namespace Leadwerks int main(int argc, const char *argv[]) { auto world = CreateWorld() auto map = LoadMap(world,"Maps/start.map"); while (true) { world->Update(); } return 0; } The rendering thread is able to run at its own frame rate independently from the game thread and I have tested under some pretty extreme circumstances to make sure the threads never lock up. By default, I think the game loop will probably self-limit its speed to a maximum of 30 updates per second, giving you a whopping 33 milliseconds for your game logic, but this frequency can be changed to any value, or completely removed by setting it to zero (not recommended, as this can easily lock up the rendering thread with an infinite command buffer stack!). No matter the game frequency, the rendering thread runs at its own speed which is either limited by the window refresh rate, an internal clock, or it can just be let free to run as fast as possible for those triple-digit frame rates.
    Shaders are now loaded from multiple files instead of being packed into a single .shader file. When you load a shader, the file extension will be stripped off (if it is present) and the engine will look for .vert, .frag, .geom, .eval, and .ctrl files for the different shader stages:
    auto shader = LoadShader("Shaders/Model/diffuse");

    The asynchronous shader compiling in the engine could make our shader editor a little bit more difficult to handle, except that I don't plan on making any built-in shader editor in the new editor! Instead I plan to rely on Visual Studio Code as the official IDE, and maybe add a plugin that tests to see if shaders compile and link on your current hardware. I found that a pragma statement can be used to indicate include files (not implemented yet) and it won't trigger any errors in the VSCode intellisense:

    Although restructuring the engine to work in this manner is a big task, I am making good progress. Smart pointers make this system really easy to work with. When the owning object in the game thread goes out of scope, its associated rendering object is also collected...unless it is still stored in a command buffer or otherwise in use! The relationships I have worked out work perfectly and I have not run into any problems deciding what the ownership hierarchy should be. For example, a context has a shared pointer to the window it belongs to, but the window only has a weak pointer to the context. If the context handle is lost it is deleted, but if the window handle is lost the context prevents it from being deleted. The capabilities of modern C++ and modern hardware are making this development process a dream come true.
    Of course with forward rendering I am getting about 2000 FPS with a blank screen and Intel graphics, but the real test will be to see what happens when we start adding lots of lights into the scene. The only reason it might be possible to write a good forward renderer now is because graphics hardware has gotten a lot more flexible. Using a variable-length for loop and using the results of a texture lookup for the coordinates of another lookup were big no-nos when we first switched to deferred rendering, but it looks like that situation has improved.
    The increased restrictions on the renderer and the total separation of internal and user-exposed classes are actually making it a lot easier to write efficient code. Here is my code for the indice array buffer object that lives in the rendering thread:
    #include "../../Leadwerks.h" namespace Leadwerks { OpenGLIndiceArray::OpenGLIndiceArray() : buffer(0), buffersize(0), lockmode(GL_STATIC_DRAW) {} OpenGLIndiceArray::~OpenGLIndiceArray() { if (buffer != 0) { #ifdef DEBUG Assert(glIsBuffer(buffer),"Invalid indice buffer."); #endif glDeleteBuffers(1, &buffer); buffer = 0; } } bool OpenGLIndiceArray::Modify(shared_ptr<Bank> data) { //Error checks if (data == nullptr) return false; if (data->GetSize() == 0) return false; //Generate buffer if (buffer == 0) glGenBuffers(1, &buffer); if (buffer == 0) return false; //shouldn't ever happen //Bind buffer glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, buffer); //Set data if (buffersize == data->GetSize() and lockmode == GL_DYNAMIC_DRAW) { glBufferSubData(GL_ELEMENT_ARRAY_BUFFER, 0, data->GetSize(), data->buf); } else { if (buffersize == data->GetSize()) lockmode = GL_DYNAMIC_DRAW; glBufferData(GL_ELEMENT_ARRAY_BUFFER, data->GetSize(), data->buf, lockmode); } buffersize = data->GetSize(); return true; } bool OpenGLIndiceArray::Enable() { if (buffer == 0) return false; if (buffersize == 0) return false; glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, buffer); return true; } void OpenGLIndiceArray::Disable() { glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0); } } From everything I have seen, my gut feeling tells me that the new engine is going to be ridiculously fast.
    If you would like to be notified when Leadwerks 5 becomes available, be sure to sign up for the mailing list here.
  21. JMichael
    This is something I typed up for some colleagues and I thought it might be useful info for C++ programmers.
    To create an object:
    shared_ptr<TypeID> type = make_shared<TypeID>(constructor args…)

    This is pretty verbose, so I always do this:
    auto type = make_shared<TypeID>(constructor args…)

    When all references to the shared pointer are gone, the object is instantly deleted. There are no garbage collection pauses, and deletion is always instant:
    auto thing = make_shared<Thing>();
    auto second_ref = thing;
    thing = NULL;
    second_ref = NULL; //poof!

    Shared pointers are fast and thread-safe. (Don’t ask me how.)
    To get a shared pointer within an object’s method, you need to derive the class from “enable_shared_from_this<SharedObject>”. (You can inherit a class from multiple types, remember):
    class SharedObject : public enable_shared_from_this<SharedObject>

    And you can implement a Self() method like so, if you want:
    shared_ptr<SharedObject> SharedObject::Self()
    {
        return shared_from_this();
    }

    Casting a type is done like this:
    auto bird = dynamic_pointer_cast<Bird>(animal);

    Dynamic pointer casts will return NULL if the animal is not a bird. Static pointer casts don’t have any checks and are a little faster I guess, but there’s no reason to ever use them.
    You cannot call shared_from_this() in the constructor, because the shared pointer does not exist yet, and you cannot call it in the destructor, because the shared pointer is already gone!
    Weak pointers can be used to store a value, but will not prevent the object from being deleted:
    auto thing = make_shared<Thing>();
    weak_ptr<Thing> thingwptr = thing;
    shared_ptr<Thing> another_ref_to_thing = thingwptr.lock(); //creates a new shared pointer to “thing”

    auto thing = make_shared<Thing>();
    weak_ptr<Thing> thingwptr = thing;
    thing = NULL;
    shared_ptr<Thing> another_ref_to_thing = thingwptr.lock(); //returns NULL!

    If you want to set a weak pointer’s value to NULL without the object going out of scope, just call reset():
    auto thing = make_shared<Thing>();
    weak_ptr<Thing> thingwptr = thing;
    thingwptr.reset();
    shared_ptr<Thing> another_ref_to_thing = thingwptr.lock(); //returns NULL!

    Because no garbage collection is used, circular references can occur, but they are rare:
    auto son = make_shared<Child>();
    auto daughter = make_shared<Child>();
    son->sister = daughter;
    daughter->brother = son;
    son = NULL;
    daughter = NULL; //nothing is deleted!

    The problem above can be solved by making the sister and brother members weak pointers instead of shared pointers, thus removing the circular references.
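    A minimal sketch of that fix (the Child class here is just an illustration):

    #include <memory>
    using namespace std;

    class Child
    {
    public:
        weak_ptr<Child> sister;  // weak links break the ownership cycle
        weak_ptr<Child> brother; // the siblings no longer keep each other alive
    };
    // Now setting son = NULL and daughter = NULL deletes both objects as expected.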
    That’s all you need to know!
  22. JMichael
    I have been spending most of my time on something else this month in preparation for the release of the Leadwerks 5 SDK. However, I did add one small feature today that has very big implications for the way the engine works. You can load a file from a web URL:
    local tex = LoadTexture("https://www.github.com/Leadwerks/Documentation/raw/master/Assets/brickwall01.dds")

    Why is this a big deal? Well, it means you can post code snippets that can be copied and pasted without requiring download of any extra files. That means the documentation can include examples that use files that aren't required to be in the user's project directory:
    https://github.com/Leadwerks/Documentation/blob/master/LUA_LoadTexture.md
    The documentation doesn't have to have any awkward zip files you are instructed to download like here, because any files that are needed in any examples can simply be linked directly to by the URL. So basically the default blank template can really be blank and doesn't need to include any "sample" files at all. If you have something like a model that has separate material and texture files, it should be possible to just link to the model file's URL and the rest of the associated files will be grabbed automatically.
  23. JMichael
    The old-timers on this site will remember a time before Steam, when leadwerks.com had a native image gallery as part of our site. The gallery is gone but all the images were saved on Google Drive here. This was replaced with the Steam-based image gallery which has accrued a massive amount of content (about 1800 screenshots).
    With Steam flooding the market with new products there is pressure for developers to release more different products to occupy digital shelf space. For this reason, our new engine will be released with separate app IDs for different editions, rather than using the DLC system to differentiate versions. That means I will be making much less use of Steam-specific features in the future like screenshots, videos, and Workshop, since all that content would be split among several app IDs. Additionally, I plan on working with multiple software resellers and Steam is just one of them, so there is less reason to push all our content through that platform. 
    At the same time, advances in web technology have given us a vast amount of storage at a low cost. All of our user-generated files are hosted on Amazon S3 which costs almost nothing, so there is no longer the same pressure we used to have to keep our website data from getting too large.
    With this in mind, I am bringing back the native website gallery here. In time, all our old screenshots from both Google Drive and Steam will be migrated in. I will see that the Steam import system attempts to match Steam author names with forum names, but if no match is found the image will be attributed to "guest".
    Steam has been great for us, but looking back at our old site I do feel like something was lost when we offloaded a lot of content onto their system. I'm excited about bringing the native gallery back and the possibilities for the new engine. You can upload your images here:
    https://www.leadwerks.com/community/gallery
    In the future, I am also interested in the idea of native video support, where we can upload a video or embed URL and have the website download the video, process it to a reliable format, generate thumbnails, and display it natively on our own site. In a world where everything on the web is simultaneously being monopolized and oversaturated, controlling your own data is critical. And it's not 2001, web space is cheap.
    I'm also really excited about the content that will be created with the new engine. It isn't going to be aimed at extreme beginners like LE4 was. It's easy to use, easier than LE4, in my opinion, but that's not the main selling point. Consequently I think we are going to see some very interesting work being done with the new engine and a lot of new energy on this site, more than we've ever seen before. So I am preparing for that.
    Get ready and dream big!
  24. JMichael
    I implemented light bounces and can now run the GI routine as many times as I want. When I use 25 rays per voxel and run the GI routine three times, here is the result. (The dark area in the middle of the floor is actually correct. That area should be lit by the sky color, but I have not yet implemented that, so it appears darker.)


    It's sort of working but obviously these results aren't usable yet. Making matters more difficult is the fact that people love to show their best screenshots and love to hide the problems their code has, so it is hard to find something reliable to compare my results to.
    I also found that the GI pass, unlike all previous passes, is very slow. Each pass takes about 30 seconds in release mode! I could try to optimize the C++ code, but something tells me that even optimized C++ code would not be fast enough. So it seems the GI passes will probably need to be performed in a shader. First, though, I am going to experiment a bit with some ideas I have to provide better-quality GI results.
     
  25. JMichael
    The polygon voxelization process for our voxel GI system now takes vertex, material, and base texture colors into account. The voxel algorithm does not yet support a second color channel for emission, but I am building the whole system with that in mind. When I visualize the results of the voxel building the images are pretty remarkable! Of course the goal is to use this data for fast global illumination calculations but maybe they could be used to make a whole new style of game graphics.

    Direct lighting calculations on the CPU are fast enough that I am going to stick with this approach until I have to use the GPU. If several cascading voxel grids were created around the camera, and each updated asynchronously on its own thread, that might give us the speed we need to relieve the GPU from doing any extra work. The final volume textures could be compressed to DXT1 (12.5% their original size) and sent to the GPU.
    After direct lighting has been calculated, the next step is to downsample the voxel grid. I found the fastest way to do this is to iterate through just the solid voxels. This is how my previous algorithm worked:
    for (x=0; x < size / 2; ++x)
    {
        for (y=0; y < size / 2; ++y)
        {
            for (z=0; z < size / 2; ++z)
            {
                //Downsample this 2x2 block
            }
        }
    }

    A new faster approach works by "downsampling" the set of solid voxels by dividing each value by two. There are some duplicated values but that's fine:
    for (const iVec3& i : solidvoxels)
    {
        downsampledgrid->solidvoxels.insert(iVec3(i.x/2, i.y/2, i.z/2));
    }
    for (const iVec3& i : downsampledgrid->solidvoxels)
    {
        //Downsample this 2x2 block
    }

    We can then iterate through just the solid voxels when performing the downsampling. A single call to memset will set all the voxel data to black / empty before the downsampling begins. This turns out to be much, much faster than iterating through every voxel on all three axes.
    Here are the results of the downsampling process. What you don't see here is the alpha value of each voxel. The goblin in the center ends up bleeding out to fill very large voxels, because the rest of the volume around him is empty space, but the alpha value of those voxels will be adjusted to give them less influence in the GI calculation.




    For a 128x128x128 voxel grid, with voxel size of 0.125 meters, my numbers are now:
    Voxelization: 607 milliseconds
    Direct lighting (all six directions): 109
    First downsample (to 64x64): 39
    Second downsample (to 32x32): 7
    Third downsample (to 16x16): 1
    Total: 763

    Note that voxelization, by far the slowest step here, does not have to be performed completely on all geometry each update. The direct lighting time elapsed is within a tolerable range, so we are in the running to make GI calculations entirely on the CPU, relieving the GPU of extra work and compressing our data before it is sent over the PCI bridge.
    Also note that smaller voxel grids could be used, with more voxel grids spread across more CPU cores. If that were the case, I would expect our processing time for each one to go down to 191 milliseconds total (39 milliseconds without the voxelization step), and the distance your GI covers would then be determined by your number of CPU cores.
    In fact there is a variety of ways this task could be divided between several CPU cores.
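    As one illustration of how that division might look, each cascade grid could simply get its own worker thread. This is a sketch only; the VoxelGrid type and its methods are placeholders, not the engine's API:

    #include <thread>
    #include <vector>

    struct VoxelGrid
    {
        void Voxelize()    {} // rebuild solid voxels from nearby geometry
        void LightDirect() {} // direct lighting in all six directions
        void Downsample()  {} // build the chain of smaller grids
    };

    int main()
    {
        std::vector<VoxelGrid> cascades(4); // e.g. four grids of increasing size around the camera
        std::vector<std::thread> workers;
        for (auto& grid : cascades)
        {
            workers.emplace_back([&grid]
            {
                grid.Voxelize();
                grid.LightDirect();
                grid.Downsample();
            });
        }
        for (auto& w : workers) w.join(); // in practice these would run asynchronously each frame
        return 0;
    }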