JTalton's picture

Occlusion Query

ARB_occlusion_query defines a mechanism whereby an application can query the number of pixels (or, more precisely, samples) drawn by a primitive or group of primitives. The primary purpose of such a query (hereafter referred to as an "occlusion query") is to determine the visibility of an object. Typically, the application will render the major occluders in the scene, then perform an occlusion query for the bounding box of each detail object in the scene. Only if said bounding box is visible, i.e., if at least one sample is drawn, should the corresponding object be drawn.

I recently hooked up multiple-object selection using the selection buffer. Since the selection buffer is deprecated in OpenGL 3.0, I have been looking into how to do multiple-object selection without it. Several people have mentioned occlusion queries as a possible solution. Since OpenTK support for OpenGL 3.0 is not ready, I used the NVIDIA extension. Has anyone played with this, or does anyone have any insight?

uint query;
GL.NV.GenOcclusionQueries(1, out query);
GL.NV.BeginOcclusionQuery(query);
Render();
GL.NV.EndOcclusionQuery();
GL.Flush();
int pixelCount;
GL.NV.GetOcclusionQuery(query, All.QueryResult, out pixelCount);
bool visible = pixelCount > 0;
GL.DeleteQueries(1, ref query);

Geometry is being rendered to the screen, but the pixel count is always 0. This is on a Quadro FX 550 with the latest NVIDIA drivers on Windows XP.


Comments

Inertia's picture

According to the spec you linked, you cannot read the QueryResult until QueryResultAvailable returns true, i.e.:

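// begin & end query as before
int result;
do {
  Wait(); // placeholder: do other work or sleep briefly
  // QueryResultAvailable has the same value as the extension's PIXEL_COUNT_AVAILABLE_NV token
  GL.NV.GetOcclusionQuery( query, All.QueryResultAvailable, out result );
} while ( result == 0 );
// now get QueryResult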

You might want to read this for a real-world example of occlusion queries. The delay between issuing the query and the result becoming available is so large that they issue the query in frame 1 and evaluate the result in frame 2.
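A rough sketch of that two-frame pattern, using the core query entry points rather than the NV extension (occlusion queries have been core since OpenGL 1.5). This has not been tested against the setup in this thread. `DeferredOcclusionQuery` is a hypothetical helper, the enum and overload names assume OpenTK 1.x-style bindings and may differ in other versions, and a GL context is assumed to be current on the calling thread.

```csharp
using System;
using OpenTK.Graphics.OpenGL;

// Hypothetical helper (not part of OpenTK): wraps one core occlusion query
// so it can be issued in one frame and read back in a later one.
// Assumes a current GL 1.5+ context on the calling thread.
public sealed class DeferredOcclusionQuery : IDisposable
{
    private int id;
    private bool pending;

    public DeferredOcclusionQuery()
    {
        GL.GenQueries(1, out id);
    }

    // Frame N: count the samples that pass the depth test while draw() runs.
    public void Issue(Action draw)
    {
        GL.BeginQuery(QueryTarget.SamplesPassed, id);
        draw();
        GL.EndQuery(QueryTarget.SamplesPassed);
        pending = true;
    }

    // Frame N+1 or later: returns false, without blocking, while no query is
    // outstanding or the GPU has not finished the outstanding one yet.
    public bool TryGetResult(out int samplesPassed)
    {
        samplesPassed = 0;
        if (!pending)
            return false;

        int available;
        GL.GetQueryObject(id, GetQueryObjectParam.QueryResultAvailable, out available);
        if (available == 0)
            return false;

        GL.GetQueryObject(id, GetQueryObjectParam.QueryResult, out samplesPassed);
        pending = false;
        return true;
    }

    public void Dispose()
    {
        GL.DeleteQueries(1, ref id);
    }
}
```

The idea would be to call `Issue` while rendering frame N and `TryGetResult` at the start of frame N+1, keeping the previous answer for any object whose result is not ready yet.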

JTalton's picture

Thanks for the pointer. It's working now. It returns 0 in the debugger, but a nonzero value once the window gets focus again. It may be a GLContext thing; I'm not sure. I added the check you show above, but it works fine with or without it.

As for the delay, I have a thread manager that processes CPU and GPU tasks. I push a GPU task that issues the queries; when it executes, it pushes another task to handle the query results. I do this across multiple objects, so every object issues its query first, and the results are handled only after all queries have been started. The query results also affect frame buffers that then need to be updated, which spreads the process out even further. It should be quite efficient.
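For readers following along, the "issue everything first, read everything afterwards" ordering could be sketched as below. This is not the actual task-manager code: `BatchedOcclusionQueries` and `FindVisible` are hypothetical names, the calls assume OpenTK 1.x-style bindings for the core (GL 1.5) query API, and a current GL context is assumed. With the core API, reading `QueryResult` waits until the result is ready, so the second loop stalls only on queries the GPU has not finished.

```csharp
using System;
using System.Collections.Generic;
using OpenTK.Graphics.OpenGL;

// Hypothetical helper: runs one occlusion query per draw callback.
// Assumes a current GL 1.5+ context on the calling thread.
public static class BatchedOcclusionQueries
{
    // Returns the indices of the callbacks that produced at least one sample.
    public static List<int> FindVisible(IList<Action> drawCalls)
    {
        var visible = new List<int>();
        if (drawCalls.Count == 0)
            return visible;

        int[] ids = new int[drawCalls.Count];
        GL.GenQueries(ids.Length, ids);

        // Pass 1: issue every query before reading any result, so the GPU
        // can work through the whole batch without waiting on the CPU.
        for (int i = 0; i < ids.Length; i++)
        {
            GL.BeginQuery(QueryTarget.SamplesPassed, ids[i]);
            drawCalls[i]();
            GL.EndQuery(QueryTarget.SamplesPassed);
        }

        // Pass 2: collect the results. In a task-based design this loop
        // could run in a separate, later task.
        for (int i = 0; i < ids.Length; i++)
        {
            int samples;
            GL.GetQueryObject(ids[i], GetQueryObjectParam.QueryResult, out samples);
            if (samples > 0)
                visible.Add(i);
        }

        GL.DeleteQueries(ids.Length, ids);
        return visible;
    }
}
```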

JTalton's picture

As for the results... the selection buffer is a ton faster than doing occlusion queries on my 8800GT.

Inertia's picture

Performance should be pretty good. Did you disable texturing, lighting, etc., and set the color and depth masks to false for the query? I haven't used the extension, but my understanding of the spec is that you cannot rely on the output of QueryResult before QueryResultAvailable is true, and that you should disable pretty much everything besides depth testing and backface culling. (The query only tells you how many fragments passed the tests; it does not need to draw anything, so all drawing-related state can be turned off.)
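As an illustration of that state setup, here is a minimal sketch. `QueryState.Run` is a hypothetical helper, not an OpenTK API, and it assumes a current compatibility-profile context (the texturing and lighting switches are fixed-function state) and OpenTK 1.x-style bindings.

```csharp
using System;
using OpenTK.Graphics.OpenGL;

// Hypothetical helper: runs the query pass with all buffer writes disabled.
// Assumes a current compatibility-profile GL context.
public static class QueryState
{
    public static void Run(Action issueQueries)
    {
        GL.ColorMask(false, false, false, false); // nothing reaches the color buffer
        GL.DepthMask(false);                      // test against depth, but do not write it
        GL.Disable(EnableCap.Texture2D);
        GL.Disable(EnableCap.Lighting);
        GL.Enable(EnableCap.DepthTest);           // the sample count depends on the depth test
        try
        {
            issueQueries();
        }
        finally
        {
            // Assumption: the caller normally renders with both masks enabled.
            // Texturing and lighting are left off for the caller to re-enable.
            GL.ColorMask(true, true, true, true);
            GL.DepthMask(true);
        }
    }
}
```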

JTalton's picture

I'm only drawing about 40 points (40 queries), with no texturing or lighting and with the depth and color masks turned off. The selection buffer is almost instantaneous, whereas I can see the occlusion query lagging as I select points. I have the two code paths separated and will keep the occlusion path as a user setting. It would be nice if the OpenTK examples included one that uses the selection buffer, occlusion queries, and the single-object color selection method, with a performance comparison of each. Of course, since occlusion queries are asynchronous, the implementation will affect performance. Maybe I missed something that affects performance in my implementation. I'll have to double-check.
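For reference, the core of the color selection method is just a mapping between object IDs and flat colors. The sketch below shows only that mapping. `ColorId` is a hypothetical helper, and it assumes an 8-bit-per-channel color buffer with lighting, texturing, blending, and dithering disabled so that colors come back exactly as written.

```csharp
using System;

// Hypothetical helper for color-ID picking: each object is drawn in a flat,
// unique color, and the pixel under the cursor is decoded back to an ID.
public static class ColorId
{
    // 24 usable bits: 8 each for red, green, and blue.
    public const int MaxId = 0xFFFFFF;

    public static void Encode(int id, out byte r, out byte g, out byte b)
    {
        if (id < 0 || id > MaxId)
            throw new ArgumentOutOfRangeException("id");
        r = (byte)((id >> 16) & 0xFF);
        g = (byte)((id >> 8) & 0xFF);
        b = (byte)(id & 0xFF);
    }

    public static int Decode(byte r, byte g, byte b)
    {
        return (r << 16) | (g << 8) | b;
    }
}
```

One possible convention is to reserve ID 0 (black) for the background, so that a decoded 0 means nothing was hit. The pixel itself would be read back from the framebuffer, for example with glReadPixels.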