Inertia's picture

GL 3.2 machine

By dropping fixed functionality, OpenGL lost quite some weight, namely:

Evaluators & Vertex Arrays
Feedback & Selection
Immediate mode Vertex conversion
Matrix Control
Texture Coordinate Generation, Fog

So what's left is

Vertex Array Control & Primitive Assembly
Clipping, Perspective & Viewport
Per-Fragment Ops
Framebuffer Control
Pixel Conversions

A diagram of what is left is quite compact now and I'd like to add this to the book/manual, but first I'd like to gather some feedback and suggestions. Free your mind.

Attachment: OpenGL machine diagram v2.png (154.61 KB)


the Fiddler's picture


Nice overview of the pipeline. The sequence looks logical - I don't recall the exact ordering of steps between pixel ownership and masking, but nothing looks out of the ordinary (I can cross-check against my copy of the redbook, if you wish).

I'm a little curious about the pixel ownership test - do drivers actually evaluate fragments and discard them afterwards? (I distinctly recall my old R300 video card rendering faster when the window was partially off-screen, suggesting that those pixels were completely skipped). The OpenGL FAQ states that the implementation details differ between drivers, and FBOs make things even more complicated (and I'm too lazy to crack open my copy of the specs).

Also, the "readback control" section is missing GL.BlitFramebuffer() (or I'm officially blind). :-)

Inertia's picture

The way I understand it, the pixel ownership test is the same concept as GL.Scissor but uses the GL Context as constraint. It would make sense to move pixel ownership and scissor before the fragment shader, but afaik this is not the case. Only viewport clipping and triangle culling are running before the fragment shader is.
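
To make the scissor half of that comparison concrete, here is a minimal Python sketch of the test as a per-fragment rectangle check. The names `ScissorBox` and `passes_scissor` are made up for illustration and are not part of any GL binding; the rule itself (pass when left <= x < left + width and bottom <= y < bottom + height) is what GL.Scissor configures. The pixel ownership test is left out on purpose, since its behaviour depends on the implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScissorBox:
    """Hypothetical stand-in for the state set by GL.Scissor(left, bottom, width, height)."""
    left: int
    bottom: int
    width: int
    height: int


def passes_scissor(box, x, y):
    """Return True if the fragment at window coordinates (x, y) survives the scissor test.

    The lower and left edges are inclusive, the upper and right edges exclusive.
    """
    inside_x = box.left <= x < box.left + box.width
    inside_y = box.bottom <= y < box.bottom + box.height
    return inside_x and inside_y


box = ScissorBox(left=10, bottom=20, width=100, height=50)
print(passes_scissor(box, 10, 20))   # lower-left corner of the box
print(passes_scissor(box, 110, 20))  # one pixel past the right edge
```

Conceptually this check sits after the fragment shader in the diagram, even if a driver is free to apply it earlier as long as the result is the same.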

This diagram is lacking the Transform Feedback functionality by design, that should be dealt with in a separate graphic in a separate chapter. Geometry shaders are there, but it may not be obvious enough that they're optional.

Edit: samplers, textures and shaders are quite abstract, the current program and uniforms should be merged.

Inertia's picture

How would you change the diagram so it shows properly that several VS/GS/FS may be executed at the same time? Atm it displays a rather sequential execution...

the Fiddler's picture

Do you need to show that in the card? Conceptually, the pipeline *is* sequential - it doesn't matter whether the driver executes parts in parallel or in series, as the result must be the same in either case.

It might be worth adding the potential feedback loops as simple arrows looping from the framebuffer to the top (transform feedback) and the fragment shader (texture feedback - this is allowed but undesirable).

Inertia's picture

True, maybe it should be ignored for the sake of simplicity.

[Transform Feedback]
The way I understand the specs, this extension would belong between the second primitive assembly and the clipping stage (i.e. it can only save the vertex or geometry shader output). I don't really want the extension in the diagram though: it's a solution used by very few applications, and its mere presence would be more confusing than helpful.

[texture feedback]
I'm not quite sure yet how far textures should become part of this. Maybe I need to formulate more clearly what I'm trying to do:

  • It is meant to be an overview, possibly a supplement for an article.
  • The diagram was just a sketch on paper when I did the Fragment Ops pages for the manual. I've extended this in both directions and now it shows the path from draw command to framebuffer.
  • It should somewhat "group" commands to a logical unit. E.g. "Viewport Application" is both: GL.Viewport and GL.DepthRange (this is non-obvious from the verbal descriptions in the manual)
  • There is a lot of (hidden) GL state that has no effect until you start drawing. The diagram should list it all.
  • Under no circumstances should it cover the GL.GenObj, GL.DeleteObj, GL.IsObj and GL.BindObj parts of the API, nor the GL.Get* queries.
  • It is kinda what I'd really have liked to see when I started with OpenGL. A simple top->bottom order of execution. Extra points if it can be printed on paper.
  • Maybe this will be a part of multiple diagrams. There is enough material to create some more: VBO&UBO, FBO&Textures, GLSL
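
As an illustration of why GL.Viewport and GL.DepthRange belong in one "Viewport Application" box, here is a small Python sketch of the mapping from normalized device coordinates to window coordinates. The function name `viewport_transform` and its tuple-based parameters are assumptions made for this example; only the arithmetic follows the viewport transformation described in the GL specification.

```python
def viewport_transform(ndc, viewport, depth_range=(0.0, 1.0)):
    """Map normalized device coordinates to window coordinates.

    ndc         -- (xd, yd, zd), each in [-1, 1] after the perspective divide
    viewport    -- (x, y, width, height), the values passed to GL.Viewport
    depth_range -- (near, far), the values passed to GL.DepthRange
    """
    xd, yd, zd = ndc
    x, y, width, height = viewport
    near, far = depth_range

    # x and y are controlled by GL.Viewport
    xw = (xd + 1.0) * (width / 2.0) + x
    yw = (yd + 1.0) * (height / 2.0) + y
    # z is controlled by GL.DepthRange
    zw = ((far - near) / 2.0) * zd + (near + far) / 2.0
    return (xw, yw, zw)


# The centre of clip space lands in the middle of an 800x600 viewport,
# halfway through the default depth range.
print(viewport_transform((0.0, 0.0, 0.0), (0, 0, 800, 600)))  # (400.0, 300.0, 0.5)
```

Both commands feed the same step: one sets the x/y part of the mapping, the other the z part.
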
Inertia's picture

I've uploaded the diagram again with minor changes: commands that initiate pipeline operations have a different border style than state-related commands, and the missing sRGB conversion was added. Uniforms and the active program were merged for clarity.

GL.BlitFramebuffer will not be added, since it only uses pixel ownership and scissor test from the fragment ops and skips all other steps (would also be confusing, since there is no reference to FBO besides the draw- and readbuffer commands).

I'm unsure how to proceed with deprecated functions, mark them with a different color? (this is overly complicated with the UML editor used, so I'd really like to do this only once).

Inertia's picture

Funny coincidence that nvidia released slides with a similar diagram a week later :)

A good read if you're curious about GL 3.2 extensions and porting DirectX/XNA applications to OpenGL. Does anyone know where to obtain an audio bootleg of the presentation?

Inertia's picture
Inertia wrote:

Only viewport clipping and triangle culling are running before the fragment shader is.

This is not entirely correct: Radeon 3xxx and Geforce 8xxx include "Early-Z", which may discard fragments before execution of the fragment shader under certain circumstances. However, this is not exposed in OpenGL or DirectX and works behind the scenes, so it does not belong in the diagram.

the Fiddler's picture

I think early-Z is supported as far back as the initial GLSL capable cards (R300 and FX 5x00 series).

Early-Z is why it's highly beneficial to do a z-only pass before rendering any complex shaders. Note that if your fragment shader discards any fragments, early-Z will be disabled for the current pass and all subsequent passes.
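
A toy Python model of that saving, counting how many fragments would reach an expensive shader. Everything here is a simplification chosen for illustration: the function names are invented, fragments are (pixel, depth) pairs in draw order, the depth buffer is cleared to 1.0, and early-Z is assumed to reject every fragment that fails the depth test. Real hardware works on tiles and has more conditions.

```python
CLEAR_DEPTH = 1.0


def shaded_without_prepass(fragments):
    """Single pass, depth test 'less': shade every fragment that is nearer
    than what the depth buffer holds at the moment it arrives."""
    depth = {}
    shaded = 0
    for pixel, z in fragments:
        if z < depth.get(pixel, CLEAR_DEPTH):
            depth[pixel] = z
            shaded += 1
    return shaded


def shaded_with_prepass(fragments):
    """Pass 1 writes depth only; pass 2 uses 'less or equal', so only the
    fragment that is finally visible at each pixel gets shaded."""
    depth = {}
    for pixel, z in fragments:          # pass 1: no shading at all
        if z < depth.get(pixel, CLEAR_DEPTH):
            depth[pixel] = z
    shaded = 0
    for pixel, z in fragments:          # pass 2: expensive shader
        if z <= depth.get(pixel, CLEAR_DEPTH):
            shaded += 1
    return shaded


# Three layers drawn back to front on pixel (0, 0), one layer on pixel (1, 0).
frags = [((0, 0), 0.9), ((0, 0), 0.5), ((0, 0), 0.1), ((1, 0), 0.3)]
print(shaded_without_prepass(frags), shaded_with_prepass(frags))
```

With back-to-front ordering the single pass shades every layer, while the pre-pass version shades one fragment per covered pixel; with front-to-back ordering the two counts would match, which is why sorting is the other common way to get the same benefit.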

Inertia's picture

Not sure about availability, but Hierarchical-Z, Early-Z and the like needed to be mentioned. Besides the shader itself, the state set by GL.DepthMask(false) appears to be the toggle that enables these optimizations. However, this is not part of the GL spec and will not become part of the diagram; I just wanted to clarify my not-entirely-correct statement.