With the help of @martyj I was able to test out occlusion culling in the new engine. This was a great chance to revisit an existing feature and see how it can be improved. The first thing I found is that determining visibility based on whether a single pixel is visible isn't necessarily a good idea. If small cracks are present in the scene one single pixel peeking through can cause a lot of unnecessary drawing without improving the visual quality. I changed the occlusion culling more to record the number of pixels drawn, instead just using a yes/no boolean value:
In OpenGL 4.3, a less accurate but faster
GL_ANY_SAMPLES_PASSED_CONSERVATIVE (i.e. it might produce false positives) option was added, but this is a step in the wrong direction, in my opinion.
Because our new clustered forward renderer uses a depth pre-pass I was able to implement a wireframe rendering more that works with occlusion culling. Depth data is rendered in the prepass, and the a color wireframe is drawn on top. This allowed me to easily view the occlusion culling results and fine-tune the algorithm to make it perfect. Here are the results:
As you can see, we have pixel-perfect occlusion culling that is completely dynamic and basically zero-cost, because the entire process is performed on the GPU. Awesome!