Interactive 3-D graphics lecture: deferred shading

Let me tell you a story about two fragments. Fragment F1 grew up poor. His father, lacking a degree, worked three jobs to keep food on the table. His mother stayed home to raise the five little fragments. When he was 10, his father was injured in the mines and lost use of his feet. F1, being the oldest, shouldered the burden of keeping his family alive. He was bright and skilled with his hands. His employers noticed him and the day he turned 18, one of them suggested he apply for an opening for a high-level managerial position.

Fragment F2’s story was different. Though F2 was born only three days after F1, his father had money. F2 got into the best schools, had his own 5-channel framebuffer in the backyard, and was waited on by hired vertex and fragment processors. When he turned 18, his father talked to a friend of his who was hiring for a high-level managerial position.

F1 and F2 both applied for the job. F1’s resume was packed with relevant experience. He had worked hard. F2 didn’t even submit a resume, which would have been blank anyway. Guess who got the job? F2 did. Why? Because he had a lesser depth than F1.

What’s the lesson? Don’t waste time on fragments that aren’t going to make it. In graphics, this refusal to waste time is called deferred shading.

Forward Shading

The historical procedure of forward shading can be broken down into this:

for each primitive
  for each fragment
    // the early-z test
    if fragment depth > minimum depth so far
      skip fragment

    for each light
      accumulate color
    assign color to framebuffer

The inefficiency of this algorithm is that we may spend a lot of time shading fragments that won’t ever be visible. That’s wasted computation. Shading and determining a fragment’s visibility are tightly coupled. The deferred shading approach breaks these two apart. Visibility of a fragment is determined first, and then it is shaded.

Let’s give a name to this problem. Depth complexity is the term used to describe how many times a pixel is written into. The higher the depth complexity, the more overdraw we have. We’d like it to be 1.

Note that deferred shading is most useful when lighting is expensive. Lighting can be expensive for a variety of reasons: you are generating shadows, light falls off with distance, there are lots of lights, etc.

Additionally, if you can render the scene front-to-back, then deferred shading is not all that helpful. Your depth test will be quite effective at pruning out fragments that will not be visible. In general, however, rendering the scene front-to-back is not feasible.

As a computer scientist/engineer, you can reason about the computational complexity of forward shading. What is it here?


Roughly, the idea of deferred shading is:

  1. Render the scene geometry. Place vertices spatially, calculate normals, do texture lookups, etc., but do not calculate lighting.
  2. Oh, yeah. Instead of computing the fragment’s final color in the previous step, just store all the parameters in one or more buffers.
  3. Draw a full-screen quadrilateral. Texture it with the FBO. Compute the quadrilateral’s fragments’ lighting using the values from the textures.

To be useful, deferred shading requires a couple of things: render-to-texture (FBOs) and multiple rendering targets (MRT). With MRT, you can render once, but have the fragment shader write to multiple outputs. This is needed because framebuffers generally only have at most 4 color channels, and you generally need to store more than 4 parameters in step 2.

Each triangle is rendered once. Each visible pixel is shaded only once. What’s the complexity here?

The G-Buffer

The set of buffers you use to store parameters is collectively called the g-buffer. When folks talk to each other about their deferred shading implementations, they ask, “What’s in your g-buffer?” There are many possibilities:

  1. Normals. This is almost a definite. You could store them using spherical coordinates. The radius of a normal is always 1, so you’d only need two channels, though you’d need to convert them back into Cartesian space. Storing them as a Cartesian 3-tuple is pretty common. However, 8-bit normal components are generally considered too imprecise; viewers are more sensitive to weird shading than to weird spatial arrangement. OpenGL offers a texture type of GL_RGB10_A2 which can help. Normals will need to be range-mapped to [0, 1]. If you have a decent GPU, GL_RGBA32F will give you full float support.
  2. Position. This can actually be computed and not stored. Implicitly you know the screen space position of each fragment (via gl_FragCoord). You can unproject this back into eye space through a matrix transformation. To unproject, you will need to store the fragment’s depth. If you choose to store it, you’ll probably need a GL_RGBA32F texture, unless you want to range map it to [0, 1].
  3. Material parameters. Diffuse color, shininess, ambient occlusion factor, etc., are commonly stored. If these don’t change continuously and storage space is an issue, consider storing indices of these values instead. The indices can be used to do lookup later on. (Turn off linear filtering before retrieving the index!)

Texture lookup is done in the visibility determination stage. Why? Because real scenes usually have a lot of textures overlaid on them. If you had to store the texture coordinates for each texture, as well as the sampler, your g-buffer would be too bulky.

Here are a few examples of industrial g-buffer choices: Killzone 2, LeadWerks, StarCraft 2, and NVIDIA.

Global and local lighting

If your scene has only global lights in it, you can just blindly draw the full-screen quad, texture it, and light it. You’re done.

Suppose, though, that some lights have only a small scale effect. It wouldn’t make computational sense for each fragment to include distant local lights’ effects. For these, we can compute simple bounding geometry around the affected area, render it, and invoke a shader that tacks on the local light’s contribution.


Let’s implement deferred shading with a g-buffer that stores the albedo, the normal, and the eye space position. Each of these is a 3-vector, so that means we need 9 storage locations. Each rendering target can have 4 channels, so we’ll need 3 of them. We’ll have some empty channels. We create a texture-backed FBO with:

void DeferredRenderer::CreateFBO() {
  // Positions texture on unit 0. Each texture must be allocated
  // before it can back an FBO color attachment.
  GLuint positions_tid;
  glGenTextures(1, &positions_tid);
  glActiveTexture(GL_TEXTURE0);
  glBindTexture(GL_TEXTURE_2D, positions_tid);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, FBO_SIZE, FBO_SIZE, 0, GL_RGBA, GL_FLOAT, NULL);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

  // Normals texture on unit 1.
  GLuint normals_tid;
  glGenTextures(1, &normals_tid);
  glActiveTexture(GL_TEXTURE1);
  glBindTexture(GL_TEXTURE_2D, normals_tid);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA32F, FBO_SIZE, FBO_SIZE, 0, GL_RGBA, GL_FLOAT, NULL);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

  // Albedo texture on unit 2.
  GLuint albedo_tid;
  glGenTextures(1, &albedo_tid);
  glActiveTexture(GL_TEXTURE2);
  glBindTexture(GL_TEXTURE_2D, albedo_tid);
  glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA8, FBO_SIZE, FBO_SIZE, 0, GL_RGBA, GL_UNSIGNED_BYTE, NULL);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_NEAREST);
  glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_NEAREST);

  // Generate and allocate depth render buffer.
  GLuint depth_rid;
  glGenRenderbuffers(1, &depth_rid);
  glBindRenderbuffer(GL_RENDERBUFFER, depth_rid);
  glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT24, FBO_SIZE, FBO_SIZE);

  // Now generate FBO and bind attachments.
  glGenFramebuffers(1, &gbuffer_fid);
  glBindFramebuffer(GL_DRAW_FRAMEBUFFER, gbuffer_fid);
  glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, positions_tid, 0);
  glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, normals_tid, 0);
  glFramebufferTexture2D(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, albedo_tid, 0);
  glFramebufferRenderbuffer(GL_DRAW_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, depth_rid);

  glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0);
}

We’ll need at least two shader programs. One that writes to the g-buffer and one that reads from it. The vertex shader is pretty standard:

#version 150

uniform mat4 projection;
uniform mat4 modelview;

// In model space
in vec3 vposition;
in vec3 vnormal;
in vec3 vtexcoord;

// In eye space
out vec4 fposition;
out vec4 fnormal;
out vec3 ftexcoord;

void main() {
  fnormal = modelview * vec4(vnormal, 0.0);
  fposition = modelview * vec4(vposition, 1.0);
  ftexcoord = vtexcoord;
  gl_Position = projection * fposition;
}

The fragment shader is where things look different. We do not perform any lighting. We just calculate the parameters that we’d need to do so and store them in the g-buffer. Storage is done by declaring an out vec4 for each of the color attachments in our FBO.

#version 150

uniform sampler3D noise;

// In eye space
in vec4 fposition;
in vec4 fnormal;
in vec3 ftexcoord;

// Outputs
out vec4 position;
out vec4 normal;
out vec4 albedo;

void main() {
  position = fposition;
  normal = vec4(normalize(fnormal.xyz), 0.0);
  albedo.rgb = vec3(texture(noise, ftexcoord).r);
}

We also need to map the out names to the appropriate color attachment. This needs to be done before your shader is linked. The ShaderProgram class I provide can issue these calls for you, but in effect, we need:

glBindFragDataLocation(shader_pid, 0, "position");
glBindFragDataLocation(shader_pid, 1, "normal");
glBindFragDataLocation(shader_pid, 2, "albedo");

Now in DrawFrame, we can draw our scene to the g-buffer, with something like:

// Draw mesh to gbuffer.
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, gbuffer_fid);
GLenum buffers[] = {GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1, GL_COLOR_ATTACHMENT2};
glDrawBuffers(3, buffers);
glViewport(0, 0, FBO_SIZE, FBO_SIZE);
glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);
mesh_program->SetUniform("projection", projection);
mesh_program->SetUniform("modelview", modelview);
mesh_program->SetUniform("noise", 3);
// ... then draw the mesh as usual.

Note that I’m not really doing any rendering here that warrants deferred shading.

We’re halfway done. In fact, we should be able to inspect our textures with a debugger. Moving to the second half, we need to render a fullscreen quadrilateral. This is pretty easy if we just strip out all transformations and make our model space identical to NDC space.

void DeferredRenderer::CreateQuad() {
  float positions[] = {
    -1.0f, -1.0f,
    1.0f, -1.0f,
    -1.0f, 1.0f,
    1.0f, 1.0f
  };

  float texcoords[] = {
    0.0f, 0.0f,
    1.0f, 0.0f,
    0.0f, 1.0f,
    1.0f, 1.0f
  };

  quad_attributes = new VertexAttributes();
  quad_attributes->AddAttribute("vposition", 4, 2, positions);
  quad_attributes->AddAttribute("vtexcoord", 4, 2, texcoords);

  quad_program = new ShaderProgram("deferred_quad_v.glsl", "deferred_quad_f.glsl");

  quad_array = new VertexArray(*quad_program, *quad_attributes);
}

The shader which colors this quadrilateral is where the action happens. The vertex shader is quite boring. It just passes through our vertex attributes:

#version 150

in vec2 vposition;
in vec2 vtexcoord;

out vec2 ftexcoord;

void main() {
  ftexcoord = vtexcoord;
  gl_Position = vec4(vposition, 0.0, 1.0);
}

The fragment shader does the full lighting, but it must read the parameters from the g-buffer:

#version 150

uniform sampler2D position_tex;
uniform sampler2D normal_tex;
uniform sampler2D albedo_tex;

const vec3 light_position = vec3(0.0, 0.0, 0.0);
const vec3 light_color = vec3(1.0);
const float ambient_factor = 0.2;
const float shininess = 50.0;

in vec2 ftexcoord;

out vec4 frag_color;

void main() {
  // Read in the eye position, normal, and albedo from the
  // g-buffer.
  vec4 sample = texture(position_tex, ftexcoord);  
  vec3 position = sample.rgb;

  sample = texture(normal_tex, ftexcoord);  
  vec3 normal = sample.rgb;

  sample = texture(albedo_tex, ftexcoord);  
  vec3 albedo = sample.rgb;

  // And now we light like normal.
  vec3 l = normalize(light_position - position);
  float n_dot_l = max(dot(normal, l), 0.0);

  vec3 v = normalize(-position);
  vec3 h = normalize(v + l);
  float n_dot_h = max(dot(normal, h), 0.0);

  vec3 ambient = ambient_factor * light_color * albedo;

  vec3 diffuse = (1.0 - ambient_factor) * n_dot_l * light_color * albedo;
  vec3 specular = pow(n_dot_h, shininess) * light_color;

  vec3 fcolor = specular + diffuse + ambient;
  frag_color = vec4(fcolor, 1.0);
}

Our g-buffer comes in as a series of 2-D textures. We pull out the parameters and after that, our shading is calculated conventionally. The trick of deferred shading is that this code is only executed for visible fragments. Nice, huh?

All we need to do is draw the quadrilateral as the last step of DrawFrame:

// Draw gbuffer-textured quad to default framebuffer. 
glBindFramebuffer(GL_DRAW_FRAMEBUFFER, 0); 
glViewport(0, 0, GetWidth(), GetHeight());
quad_program->SetUniform("position_tex", 0); 
quad_program->SetUniform("normal_tex", 1); 
quad_program->SetUniform("albedo_tex", 2); 
// ... then draw the quad.

I don’t clear depth here. Why not?


Deferred shading does have some drawbacks:

  1. Transparent geometry is not really supported at all. We’re only keeping the parameters of the nearest opaque fragment. Adding support to hold onto an arbitrary number of nearer transparent fragments sounds formidable.
  2. Antialiasing. The g-buffer holds one set of parameters per pixel, so conventional multisampling doesn’t apply cleanly to the lighting pass.