Analyzing performance


Josh

If you want more in-depth information on what your scene is rendering, you can pass true as an extra third argument to context:DrawStats():

 

post-1-0-45744800-1441290857_thumb.jpg

 

Things to look for:

 

Number of shadows rendered. This increases when you have light shadows being re-rendered, which takes more processing power.

 

Number of batches. This is correlated to the number of entities or surfaces.

 

The data displayed may change slightly between versions, as this command is not exactly official.

 

--Initialize Steamworks (optional)
Steamworks:Initialize()

--Set the application title
title="$PROJECT_TITLE"

--Create a window
local windowstyle = window.Titlebar
if System:GetProperty("fullscreen")=="1" then windowstyle=windowstyle+window.FullScreen end
window=Window:Create(title,0,0,System:GetProperty("screenwidth","1024"),System:GetProperty("screenheight","768"),windowstyle)
window:HideMouse()

--Create the graphics context
context=Context:Create(window,0)
if context==nil then return end

--Create a world
world=World:Create()
world:SetLightQuality((System:GetProperty("lightquality","1")))

--Load a map
local mapfile = System:GetProperty("map","Maps/start.map")
if Map:Load(mapfile)==false then return end

while window:KeyDown(Key.Escape)==false do

	--If window has been closed, end the program
	if window:Closed() then break end

	--Handle map change
	if changemapname~=nil then

		--Clear all entities
		world:Clear()

		--Load the next map
		Time:Pause()
		if Map:Load("Maps/"..changemapname..".map")==false then return end
		Time:Resume()

		changemapname = nil
	end

	--Update the app timing
	Time:Update()

	--Update the world
	world:Update()

	--Render the world
	if window:Minimized()==false then
		world:Render()
	end

	--Render statistics
	context:SetBlendMode(Blend.Alpha)
	if DEBUG then
		context:SetColor(1,0,0,1)
		context:DrawText("Debug Mode",2,2)
		context:SetColor(1,1,1,1)
		context:DrawStats(2,22,true)
		context:SetBlendMode(Blend.Solid)
	else
		--Toggle statistics on and off
		if (window:KeyHit(Key.F11)) then showstats = not showstats end
		if showstats then
			context:SetColor(1,1,1,1)
			--context:DrawText("FPS: "..Math:Round(Time:UPS()),2,2)
			context:DrawStats(2,2,true)
		end
	end

	--Refresh the screen
	context:Sync(true)

end

My job is to make tools you love, with the features you want, and performance you can't live without.


I'm currently experimenting with optimization in the Vectronic demo. After removing buffered spotlights and adding Shadmar's culling script, my framerate has increased, but this room is still hell when you look forward, and the Script Memory Usage count keeps climbing rapidly. Are these results "too high"? It'd be nice if the numbers changed color when any of them gets dangerously high, as a clear indicator that something's wrong.

post-12469-0-57111000-1441320214_thumb.png

Cyclone - Ultra Game System - Component PreprocessorTex2TGA - Darkness Awaits Template (Leadwerks)

If you like my work, consider supporting me on Patreon!


If you have something in motion, the shadows rendered and polygon counts look fine. There are a lot of lights being drawn, but I am assuming a lot of those are non-shadowed accent lights? In any case, 23 lights onscreen is fine.

 

The batch count is a red flag. You don't have that many different materials. I would start by looking at model limb counts. There is a tool in the model editor window that will collapse models into a single object.

 

Script memory count will continually climb, and then drop at once periodically. The Lua garbage collector doesn't kick in until a certain limit is reached, and then it performs a full cleanup. You can continuously call collectgarbage() if you want to verify it isn't leaking, but I don't recommend this.
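A minimal sketch of that leak check, written in plain Lua with no engine calls. It relies only on the standard collectgarbage("count") and collectgarbage("collect") options; the 64 KB tolerance and the helper name are assumptions made for illustration, not values from the engine.

```lua
-- Plain Lua sketch; no engine calls are used.
-- collectgarbage("count") returns the memory used by Lua in kilobytes.

-- Hypothetical helper: force a full collection, then report memory in use.
local function measure_after_full_collect()
    collectgarbage("collect")
    return collectgarbage("count")
end

local baseline = measure_after_full_collect()

-- Create a lot of short-lived garbage, as a busy script might every frame.
for i = 1, 10000 do
    local temp = { x = i, y = i * 2, name = "temp" .. i }
end

-- Before a collection the count may sit well above the baseline.
local before_collect = collectgarbage("count")
local after_collect = measure_after_full_collect()

-- If nothing is leaking, memory after a full collection returns close to
-- the baseline. The 64 KB tolerance is an arbitrary value for this sketch.
local leaked_kb = after_collect - baseline
print(string.format("baseline %.1f KB, before collect %.1f KB, after collect %.1f KB",
    baseline, before_collect, after_collect))
```

A count that climbs and then falls back like this is the normal sawtooth pattern. A leak would show up as the post-collection figure growing over time, which is worth checking occasionally rather than every frame.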

My job is to make tools you love, with the features you want, and performance you can't live without.


Nothing is moving in that screen. This is in that hallway right before you hit the trigger for the floor to raise up with the dispenser. The pillars/platforms that do move on the button press are set to a dynamic ShadowMode. I had an idea about the batch count, but it did not seem to be the case, so I have to look more into it later.

 

Also, quick question: will using Caulk textures improve the overall framerate since there are fewer polys, or does it not matter? I noticed the AI and Events map is fully textured, while my maps have Caulk textures on the outside and anywhere else that can't be seen.

Cyclone - Ultra Game System - Component PreprocessorTex2TGA - Darkness Awaits Template (Leadwerks)

If you like my work, consider supporting me on Patreon!


I don't think removing hidden faces will make a big difference. It's good practice, but polycount doesn't really matter, within a certain tolerance. Maybe you will save like 3 FPS if you do the whole map.

 

Even if you are just continually setting a position on a shadow-casting object, that will be enough to trigger the shadow renders. I suspect those buttons are setting it off. But my primary concern is the batch count.
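One way to avoid repositioning an object that is at rest is to skip the call when the target matches the current position. The sketch below is plain Lua and uses a stand-in entity table so it runs outside the engine; make_stub_entity, set_position_if_changed, and the epsilon value are hypothetical names and values for illustration, and only the SetPosition method name mirrors the engine's entity call.

```lua
-- Stand-in for an engine entity so the sketch runs in plain Lua.
-- It only stores a position and counts how often SetPosition is called.
local function make_stub_entity(x, y, z)
    local entity = { x = x, y = y, z = z, set_calls = 0 }
    function entity:SetPosition(nx, ny, nz)
        self.x, self.y, self.z = nx, ny, nz
        self.set_calls = self.set_calls + 1
    end
    return entity
end

-- Hypothetical helper: only touch the entity when the target differs from
-- the current position by more than a small tolerance.
local function set_position_if_changed(entity, nx, ny, nz, epsilon)
    epsilon = epsilon or 0.0001
    if math.abs(entity.x - nx) > epsilon
        or math.abs(entity.y - ny) > epsilon
        or math.abs(entity.z - nz) > epsilon then
        entity:SetPosition(nx, ny, nz)
        return true
    end
    return false
end

local door = make_stub_entity(0, 0, 0)

-- Simulate 100 frames in which the door sits at its closed position.
for frame = 1, 100 do
    set_position_if_changed(door, 0, 0, 0)
end
local calls_at_rest = door.set_calls

-- Simulate the door opening over 10 frames.
for frame = 1, 10 do
    set_position_if_changed(door, 0, frame * 0.1, 0)
end
local calls_while_moving = door.set_calls - calls_at_rest

print(calls_at_rest, calls_while_moving)
```

In a real script the current position would come from the entity itself rather than from fields on a table. The point is only that an object at rest issues no position updates, so it gives the lights no reason to redraw their shadows.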

My job is to make tools you love, with the features you want, and performance you can't live without.


I've been trying hard to optimise my game some more and here's what I've done since the last version.

 

1) Checking view ranges, I noticed Shadmar's script also affected CSG entities (so is CSG like a simple entity?). However, CSG doesn't obey the view range setting in the editor and always ends up at max, so I created a mini script to set the range on internal CSG walls. I'm not sure if that's advisable, and it didn't make any noticeable difference to fps, although I can see it working in the game with wireframe view.

 

2) I went through all the textures being used in the game and changed/removed some big ones. This didn't make any difference to fps but did save me about 100mb in video memory and the publish data.zip is now 50mb smaller. So that was worthwhile and maybe contributed to half a frame.

 

3) I went through all the models being used and collapsed some after understanding the surface/batches thing a bit better. I think this might have helped the most and I think it made some small improvement to fps.

 

4) I converted some more CSG in the map to models, e.g. banisters in the houses, fence/wall poles, and the sheds are now models. This didn't make as big a difference as I hoped, but with fewer entities I think it helped reduce editor memory use by 100mb or so. The map currently uses 1GB in the editor, so I guess that was worthwhile for that reason at least.

 

All in all I've gotten at most an extra 5 fps after doing all this, and the game does feel a little bit smoother. Without enemies the map now runs for me at about 15-30fps depending on location, and with enemies about 10 to 25fps. So for now it's still <30fps gameplay, on my system anyway. I'll spend some more time on it, but I'm not sure if I'll be able to squeeze much more performance out of it without sacrificing map content.

Check out my games: One More Day / Halloween Pumpkin Run


I disabled textures that use it and my batch count is still high. The framerate only increases if you look in a certain direction in the room. This map alone has been giving me issues since the beginning, and really gave me ideas on how to do things better when I reboot this project.

Cyclone - Ultra Game System - Component PreprocessorTex2TGA - Darkness Awaits Template (Leadwerks)

If you like my work, consider supporting me on Patreon!


I can look at your map if you send it to me. I am not sure how it is doing so many draw calls when the example FPS map only needs about 80 batches and has a much wider range of materials and objects.

My job is to make tools you love, with the features you want, and performance you can't live without.


Thanks Josh, since you have the raw project, just drop/replace these files. I've added Shadmar's script to the Main.lua script, while changing a few things in the map.

vectronicdemohalp.zip

Cyclone - Ultra Game System - Component PreprocessorTex2TGA - Darkness Awaits Template (Leadwerks)

If you like my work, consider supporting me on Patreon!


Whatever Josh comes up with, all these suggestions/solutions need to be collected and presented in a neat article in the tutorial section under Performance. I have lost track of all of these posts.

Intel Core i7 Quad 2.3 Ghz, 8GB RAM, GeForce GT 630M 2GB, Windows 10 (x64)



I agree, and we definitely need a good tutorial about threads...


If you don't understand threading already, you don't need it! The number of applications this would be useful for is extremely limited, and it's only included because it's a nice cross-platform thread class.

My job is to make tools you love, with the features you want, and performance you can't live without.


You have four major problems in this map.

 

Occlusion Culling

Many of your brushes have occlusion culling enabled on a per-entity basis. This is bad because the occlusion test itself requires an extra unbatched draw call. However, all these brushes are being collapsed, and the collapsed model does not use per-entity occlusion culling, so this isn't having any negative effect on performance. Just keep this in mind because if you do this with models or brushes that don't get collapsed, you will have issues. Generally the only entities that should use per-entity occlusion are lights and animated models.

 

post-1-0-73733700-1441383439_thumb.jpg

 

Shadow Modes

This map is a great test case for our cached shadow maps. This system stores a static shadow map and combines it with a second shadow map for dynamic objects. You have fairly detailed rooms with only a few objects moving in them. This allows lights to only render the dynamic shadowmap, with just a few objects in it, and add that to the static shadow map that contains the rest of the room. However, you have this setting disabled on all your shadow-casting lights, forcing the light to re-render the entire room. You previously were not seeing any gains from this system, and were in fact experiencing lower performance, because of the next issue. Note that cached shadow maps should never be used on a moving light, since it has to constantly re-render both shadowmaps.

 

post-1-0-72008900-1441383614_thumb.jpg

 

Animation

Animation causes motion, and motion triggers shadow rendering. Your doors use a script that constantly animates them, causing repositioning even when they are at rest. Even worse, some of your hatches used the static shadow mode, which triggered constant rendering of any intersecting light's static shadow, which is why the cached shadow mode wasn't improving your performance. I temporarily dealt with this by disabling shadows on all the button and hatch assemblies.

 

post-1-0-05203900-1441383661_thumb.jpg

 

Uncollapsed Brushes

A lot of brushes are used for collision and triggers. Trigger brushes use a nice transparent material that makes them easily visible in the editor, and then an attached script is used to apply the invisible material to them when the game runs. While this looks nice, it causes the renderer to iterate through lots of brush faces, because it doesn't know the entire brush is invisible. This by itself can cause some significant slowdown.

 

Brushes are also used in place of the physics shape feature. You can create a physics shape in the model editor for all those objects, and it will save you the time of manually placing collision brushes.

 

Once I optimized the map, the first large room still sees an increase of 100+ batches when the box is picked up, triggering the dynamic shadow render. This is mostly due to 15 uncollapsed brushes, with six faces each, which makes for 90 extra batches. This is something I can optimize on my end, since your usage here is very reasonable. I'll change it so the brush geometry system collapses surfaces within the same brush that have the same material, so those 90 batches will go down to about 15, the same as it would be if each brush was a unique model. This will also make the editor a bit faster.
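The batch arithmetic above can be sketched in plain Lua. This is only a counting model under the assumption that every face is one batch before merging and every distinct material within a brush is one batch afterwards; it is not the engine's brush code, and the material name and helper functions are made up for the example.

```lua
-- Each brush is modelled as a list of face materials, one entry per face.
local function make_brush(material, face_count)
    local faces = {}
    for i = 1, face_count do
        faces[i] = material
    end
    return faces
end

-- One batch per face: the behaviour described for uncollapsed brushes.
local function count_batches_per_face(brushes)
    local total = 0
    for _, faces in ipairs(brushes) do
        total = total + #faces
    end
    return total
end

-- One batch per distinct material within each brush: the proposed merge.
local function count_batches_merged(brushes)
    local total = 0
    for _, faces in ipairs(brushes) do
        local seen = {}
        for _, material in ipairs(faces) do
            if not seen[material] then
                seen[material] = true
                total = total + 1
            end
        end
    end
    return total
end

-- 15 six-sided brushes that each use a single (hypothetical) material.
local brushes = {}
for i = 1, 15 do
    brushes[i] = make_brush("platform.mat", 6)
end

local before = count_batches_per_face(brushes)
local after = count_batches_merged(brushes)
print(before, after)
```

With one material per brush this gives 90 batches before merging and 15 after, matching the figures in the post. A brush that mixes two materials on its faces would count as two batches in this model.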

 

post-1-0-62518300-1441383856_thumb.jpg

 

Bonus Tip: Particle Collision

As of Leadwerks 3.6, any particle emitter with a non-zero collision type will perform raycasts on the scene to make individual particles bounce off the map. The problem is that previous versions created particle emitters with the "Prop" collision type by default. I should have added a hidden parameter to detect this somehow and maintain backwards compatibility by disabling collision automatically, but I didn't think about it. All your particle emitters should be set to use no collision, unless that is what you want. I don't think this was causing any issues in your map, since your emitters are usually inactive, but it could affect some existing maps other people have.

 

I also noticed all your particle emitters have individual occlusion culling enabled, which probably isn't worth it.

 

post-1-0-39360400-1441385780_thumb.jpg

 

Conclusion

Your performance problem was caused primarily by constant re-rendering of static and dynamic shadow maps, combined with a high number of uncollapsed brushes. Your map does some things wrong, but it also demonstrates some usage that is reasonable, even if I did not envision it being used this way. The use of transparent materials for clipping and trigger brushes is a nice technique, but the renderer is not presently set up to accommodate this efficiently, since the culling occurs as the faces are being iterated through. Your map uses a lot of uncollapsed brushes for moving platforms, which causes a high number of batches in the shadow render, but the renderer should be able to handle this better.

 

So going forward, here is what I recommend:

  • Get rid of the clipping brushes wherever possible and use the physics shape system for model collision.
  • Replace your animated movement with physics joints or manual motion that doesn't constantly reposition objects.

 

And on my end, I can improve the following:

  • Merge faces by material on a per-brush basis. This reduces the batch count.
  • Implement some kind of setting that makes objects visible in the editor and invisible in the engine, on a per-entity basis. This prevents the renderer from performing the brush Draw() function.

 

The attached map still drops down to 30 FPS when the brush count gets high, but everything else is improved and you will see better performance. Once I improve the uncollapsed brush handling, it will run at a solid 60 FPS constantly.

 

demo.zip

My job is to make tools you love, with the features you want, and performance you can't live without.



What's the alternative? I would say the "logical" or "normal" way a user thinks of volume triggers is via brushes that are invisible. It's far easier than any other option we have. I would say if you don't want us doing this because of performance reasons, that you make a special trigger brush that allows the ease of use as normal brushes (creation, resizing) without the performance hit you are saying they have. Triggers provide gameplay which I know you are big on these days, so having a special trigger brush would help.


I'm not saying "don't use trigger brushes". In this case, all the trigger and clip brushes had shadows enabled, so on every shadow render they were all being iterated through, face by face, and discarded only when the invisible material was discovered. That, and there was a pretty high number of brushes just used for clipping.

 

I may also be able to optimize the brush Draw() function a bit more, with in-game rendering in mind.

My job is to make tools you love, with the features you want, and performance you can't live without.


This executable has some optimizations for in-game brush rendering. With vsync disabled in my version of the demo map, I never get less than 60 FPS with this:

VectronicDemo.zip

 

I think this EXE pretty much eliminates the overhead of clip and trigger brushes. Although I think you should be using the physics shape system instead of clip brushes, I don't think there will be any significant performance penalty anymore.

 

The nice thing about this style of level design is that once you get your optimization under control, you can make your maps as big as you want, and one part isn't going to affect another.

My job is to make tools you love, with the features you want, and performance you can't live without.
