My SSAO Implementation


LordHippo
Hi all,

 

Here is my SSAO implementation.

gallery_84_111_10125.png

 

I took some inspiration from HBAO, but it is a completely different method and is MUCH more optimized.

I've also implemented Crysis 1 SSAO to be able to compare.

gallery_84_110_476856.png

 

The time values you see in the image are the SSAO render time only. Screen resolution is 1280*720, and the SSAO is rendered at full resolution on a GeForce GTS450 card. ( FPS = 1000 / time, with time in milliseconds )
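As a quick sketch of that conversion (the function names are made up for illustration): a time in milliseconds maps to frames per second as 1000 / time, and the same formula converts back. Keep in mind the resulting figure only describes the SSAO pass on its own, not the whole frame.

```python
def fps_from_ms(time_ms):
    """Frames per second that a per-frame cost of `time_ms` milliseconds allows."""
    return 1000.0 / time_ms


def ms_from_fps(fps):
    """Per-frame time budget in milliseconds at `fps` frames per second."""
    return 1000.0 / fps


if __name__ == "__main__":
    print(fps_from_ms(2.0))   # rate a 2 ms pass would allow if it were the only work
    print(ms_from_fps(60.0))  # whole-frame budget at 60 FPS, roughly 16.67 ms
```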

 

Model from Ywa by Harry ( http://harrysite.net/work/index.php?x=browse )

 

I've also implemented a method for false occlusion removal:

gallery_84_111_1530502.png

 

It's a combination of a method implemented by Crytek in Crysis 1 and my own method. It still needs more optimization.

 

Another improvement I've made is using a half resolution depth buffer for the SSAO rendering. This method is used in Uncharted 2.

As you can see on the picture, this method almost doubles the speed of SSAO, with almost no visual artifacts.

gallery_84_111_402990.png

 

From a technical point of view, the main bottleneck of SSAO algorithms is their texture fetches. More specifically, their random sampling pattern causes heavy GPU cache thrashing, and cache thrashing decreases shader performance dramatically.

So any method that reduces cache thrashing would speed up the SSAO.

One of the main benefits of a half resolution depth buffer is that the texture sampling points get closer together, so the amount of cache thrashing is reduced.

Also as seen in the results, there is no noticeable difference or artifacts.
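To make the idea concrete, here is a rough CPU-side sketch, not the actual shader code. The post does not say how the half resolution depth is produced, so taking the farthest depth of each 2x2 block is an assumption, and both function names are hypothetical. The second helper shows why samples get closer: the same on-screen radius spans half as many texels per axis in the half-res buffer.

```python
def downsample_depth_half(depth):
    """Return a half-resolution copy of `depth` (a list of rows of floats).

    Assumption: each output texel keeps the farthest (max) depth of its 2x2
    block. A trailing odd row or column is dropped.
    """
    height, width = len(depth), len(depth[0])
    return [
        [
            max(depth[y][x], depth[y][x + 1],
                depth[y + 1][x], depth[y + 1][x + 1])
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]


def texel_span(radius_px, buffer_scale):
    """Width in texels covered by a sampling disc of `radius_px` screen pixels
    when the depth buffer is `buffer_scale` times the screen resolution."""
    return 2 * radius_px * buffer_scale


if __name__ == "__main__":
    full = [
        [0.1, 0.2, 0.3, 0.4],
        [0.5, 0.6, 0.7, 0.8],
        [0.9, 1.0, 0.1, 0.2],
        [0.3, 0.4, 0.5, 0.6],
    ]
    print(downsample_depth_half(full))
    print(texel_span(16, 1.0), texel_span(16, 0.5))
```

With a 16 pixel radius, the fetches spread over 32 texels per axis at full resolution but only 16 at half resolution, so a quarter of the texel area has to pass through the texture cache.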

 

So with this technique, you gain much more performance, with no visual cost! That's the magic of OPTIMIZATION :(

 

I will share it with the community when I'm done with the false occlusion removal optimizations.

 

Also check the album page for more shots.

 

BTW, I'll be happy if anyone has any visual or technical suggestions, because I want to "get rid of it, once and for all" :P

Ali Salehi | Programmer

 

Intel Core i3 2100 @ 3.0GHz | GeForce GTS 450 | 4GB DDR3 RAM | Windows 7 Ultimate x64

LE 2.50 | Visual Studio 2010 | RenderMonkey 1.82 | gDEBugger 5.8 | FX Composer 2.5 | UU3D 3 | xNormal 3.17

 

 


Fantastic work mate! Really cool stuff. I can't wait to get my own hands on this. You really should also consider packaging it up as an injector so people can use it in other games. It's a surprisingly popular mod type for Fallout, Skyrim, GTA, etc.

I would be interested to see what would happen if you halved the depth buffer resolution again, so it's one quarter of its original size. See if you can find the sweet spot between speed gains and image quality.

Once you're done with SSAO you might want to take a look at this brand new AA method I read about last night. It's making a real big splash in the graphics community because it's extremely fast and does almost as well as really high end AA settings: http://www.iryoku.com/smaa

Open source, so you can go at it.

Programmer, Modeller

Intel Core i7 930 @ 3.5GHz | GeForce 480 GTX | 6GB DDR3 RAM | Windows 7 Premium x64

Visual Studio 2008 | Photoshop CS3 | Maya 2009

Website: http://srichnet.info


Thanks Scott.

I've never heard of "injectors" before, and have no idea how to make them. But it would be interesting to make them, and test my shaders in AAA games!

I will test the quarter res depth buffer, but I think there will be some halos around the objects. Currently, running SSAO with half-res depth and at 75% resolution takes about 1.5ms at 720p on a GeForce GTS450, which is really good.

About the AA method (SMAA), I think the performance is not good. It runs in 1.8ms on a GTX295 (which is really powerful) with a 4x memory footprint.

So in my opinion, MLAA is the best post AA technique for the current generation of hardware. I'll try to implement all the AA methods such as MLAA and SMAA and compare them in terms of quality, performance and memory footprint.

Ali Salehi | Programmer

 


How are you measuring time? A GPU performance query, I hope.

 

Very nice stuff.

 

Thanks Josh.

 

No, I'm just measuring the frame time with and without the whole SSAO rendering (copying buffers, rendering SSAO, blurring it) and subtracting the two.

Can you tell me more about "GPU performance query"? I've googled it and found no results for OpenGL.

But this time was the same in any scene I've tested, so I think it's accurate enough.

Correct me if I'm wrong :)
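The subtraction approach described above can be sketched roughly like this. It is only an illustration: `render_with` and `render_without` are hypothetical callbacks that draw one frame with and without the effect, and the CPU clock only reflects GPU cost if each call blocks until the GPU has finished (for example by ending with a buffer swap or glFinish).

```python
import time


def average_frame_ms(render_frame, frames=100, clock=time.perf_counter):
    """Average wall-clock time of one render_frame() call, in milliseconds."""
    start = clock()
    for _ in range(frames):
        render_frame()
    return (clock() - start) * 1000.0 / frames


def effect_cost_ms(render_with, render_without, frames=100,
                   clock=time.perf_counter):
    """Estimate an effect's cost as the difference of two average frame times."""
    with_ms = average_frame_ms(render_with, frames, clock)
    without_ms = average_frame_ms(render_without, frames, clock)
    return with_ms - without_ms
```

Averaging over many frames smooths out per-frame noise, but the result still includes any CPU-side overhead of the extra passes, which a GPU-side query would not.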

Ali Salehi | Programmer

 


Regarding injectors, you should take a look at the work of ENB. He is one of the most popular shader modders, with shaders for almost all games. His website is here: http://enbdev.com/index_en.html

The way injectors work is that you take a slightly modified d3d9.dll and drop it into the game directory. It redirects the initial call for the game's shader file to your own one. From there you can inject your own shader and then pass back to the game's one, or discard it entirely.

Programmer, Modeller


NVIDIA cards have a high-precision GPU performance query. I've used it, but can't remember the commands off the top of my head. I think it's an OpenGL extension. It will tell you exactly how long an operation takes.
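For reference, OpenGL exposes this kind of measurement as a timer query: GL_EXT_timer_query on older drivers, and GL_ARB_timer_query, which is core since OpenGL 3.3. The sketch below shows the call sequence only. The `gl` argument is a hypothetical binding object exposing the entry points and constants under their C names, and the exact calling conventions (such as glGenQueries returning a list) are assumptions that differ between real bindings.

```python
def gpu_time_ms(gl, draw):
    """Return the GPU time, in milliseconds, spent executing draw().

    `gl` is a hypothetical binding object (an assumption, not a real library)
    that exposes OpenGL functions and constants under their C names.
    """
    query = gl.glGenQueries(1)[0]                # assumed to return a list of ids
    gl.glBeginQuery(gl.GL_TIME_ELAPSED, query)   # start the GPU-side timer
    draw()                                       # issue the GL commands to measure
    gl.glEndQuery(gl.GL_TIME_ELAPSED)            # stop the GPU-side timer
    # Reading GL_QUERY_RESULT waits until the GPU has finished those commands.
    elapsed_ns = gl.glGetQueryObjectui64v(query, gl.GL_QUERY_RESULT)
    gl.glDeleteQueries(1, [query])
    return elapsed_ns / 1e6                      # nanoseconds to milliseconds
```

Because fetching the result right away stalls the pipeline, a real renderer would more likely read the query a frame or two later, once the result is available.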

My job is to make tools you love, with the features you want, and performance you can't live without.


Great to see someone is using the model properly. :) While I don't understand anything of the technical stuff, it does look quite good.

 

Thank you so much for the model. It really helped me a lot in development :)

Ali Salehi | Programmer

 


Can you do a comparison shot of that model with LE2's default SSDO vs. your SSAO?

Looks great, by the way.

 

Which model do you mean?

I've already posted that comparison on the cave model.

gallery_84_110_476856.png

Ali Salehi | Programmer

 


I mean a comparison of SSAO on and off, with diffuse and everything else enabled. :)

 

So I've tried my best to enable everything.

gallery_84_111_1355867.png

 

Note that the effect is hard to see side by side. The best way is to put both images into Photoshop layers and switch them on and off.

Also, the effect of SSAO is hard to see because the models' diffuse textures have baked AO.

gallery_84_111_570898.png

Ali Salehi | Programmer

 


Wow, that last shot is amazing. This is by far the best SSAO I've ever seen. Better than mine, and much better than Crysis.

 

One thing you should test before you call it finished is how well it handles distant objects. I had problems with artifacts on distant terrain when I was implementing mine.

 

I'm also interested to see how it reacts when you have contours on terrain in the distance. Will it capture that detail like in this shot?:

Crysis_SSAO.jpg

My job is to make tools you love, with the features you want, and performance you can't live without.


Will somebody explain to a graphics noob what I'm looking at here? I'm having a hard time seeing the differences on the cave screenshot comparison.

 

As I said, the difference in the cave shot is hard to notice, because the model has AO baked into its textures.

Ali Salehi | Programmer

 


Surely we have all noticed the differences in our editors/games with SSDO turned on and off, even if it's hard to tell the difference between the various illustrations presented. The fact that LordHippo's is almost twice as fast as our current one is in itself a huge benefit, is it not?

Intel Core i5 2.66 GHz, Asus P7P55D, 8Gb DDR3 RAM, GTX460 1Gb DDR5, Windows 7 (x64), LE Editor, GMax, 3DWS, UU3D Pro, Texture Maker Pro, Shader Map Pro. Development language: C/C++


The main problem is that I'm not an artist, and can't make good use of the shader. :)

Also I'm not allowed to publish any shots from the game we're working on.

Ali Salehi | Programmer

 


Are you going to provide this for the LE community in some fashion? Paid or not? Maybe an artist on here would be willing to make some scenes that really show this off. I'm just curious as to the different look it provides. From the black/white pics it's clear to see the difference, but it would be nice to see it in a somewhat finished scene with all the colors, where it makes a big difference. I'm sure it excels in certain situations, and from Josh's reaction I'm sure it's a really cool feat, but I feel awkward and am not sure how to react because I literally can't see any difference in the finished scenes presented :/
