rikky's picture

The fastest way to draw cubes?

Hey guys,

I'm a C# ASP.Net programmer by trade and I've spent plenty of time doing 2D applications in XNA.
I was feeling adventurous yesterday and figured I should finally get started on a long-postponed voxel (3D pixel - basically a cube) project, so I decided to use OpenTK to take a stab at it.

My computer is a recent build, with an overclocked Intel i7 2600k and an NVidia 560Ti, in addition to 8GB of RAM.
It runs Crysis on 'very high' with full AA at a stable 30fps. Point being - it's pretty fast.

I wasn't really sure what to expect with OpenTK's performance, but given my card's OpenGL 4.1 support I suspected that it'd be pretty sweet.
A voxel is essentially a polygon (or maybe six - one for each face), and the only recent figure I've seen for OpenGL polygons-per-second throughput is the Nintendo 3DS at 15.3 million, so I figured that my machine could crush that number without breaking a sweat.

Currently, and it's probably an issue of how I've coded it, it's only just managing to draw a 100x100 grid of cubes at 60fps (taking around 14ms per frame).
This shouldn't be the case, as it's only drawing 3.6mil faces per second by my math (100 * 100 * 6 faces * 60 fps).
I'm now thinking that it's down to not using OpenCL-accelerated code and to overhead on the C# side, rather than sluggish hardware.
Either way, if anyone could take a look at it and give me some pointers, I'd be very grateful.
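
The arithmetic above can be checked with a few lines of C#. This is only a sketch: the class and method names are made up for illustration, and it assumes every cube submits all six faces every frame.

```csharp
using System;

static class FaceThroughput
{
    // Faces submitted per second for a gridX x gridZ field of cubes,
    // assuming all six faces of every cube are drawn each frame.
    public static long FacesPerSecond(int gridX, int gridZ, int fps)
    {
        const int facesPerCube = 6;
        return (long)gridX * gridZ * facesPerCube * fps;
    }
}
```

FaceThroughput.FacesPerSecond(100, 100, 60) gives 3,600,000, which is well under the 15.3 million figure quoted above.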

I've posted my code here.

Screenshot:
Shot of the running application


Comments

the Fiddler's picture

The reason why performance is so low is that you are drawing the cubes one by one with GL.Begin()-GL.End(). This method is very inefficient; it was deprecated in GL 3.0 and is not available in the core profile of later versions.

Check the documentation for vertex buffer objects. The idea is to upload the geometry to video RAM once and then draw everything with a single draw command. This will give you orders of magnitude better performance.
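
A rough sketch of the CPU-side half of that idea, assuming an indexed triangle mesh: every cube in the grid is written into one shared vertex array and one shared index array, so the whole grid can later be handed to the GPU in one go. The names here (CubeBatch, Build, spacing) are hypothetical and not taken from the original post's code.

```csharp
using System;

static class CubeBatch
{
    // The 8 corners of a unit cube centred on the origin.
    static readonly float[,] Corners =
    {
        {-0.5f, -0.5f, -0.5f}, { 0.5f, -0.5f, -0.5f},
        { 0.5f,  0.5f, -0.5f}, {-0.5f,  0.5f, -0.5f},
        {-0.5f, -0.5f,  0.5f}, { 0.5f, -0.5f,  0.5f},
        { 0.5f,  0.5f,  0.5f}, {-0.5f,  0.5f,  0.5f},
    };

    // 6 faces x 2 triangles x 3 indices into Corners,
    // wound counter-clockwise when viewed from outside the cube.
    static readonly uint[] FaceIndices =
    {
        0, 2, 1,  0, 3, 2,   // back   (z = -0.5)
        4, 5, 6,  4, 6, 7,   // front  (z = +0.5)
        0, 4, 7,  0, 7, 3,   // left   (x = -0.5)
        1, 2, 6,  1, 6, 5,   // right  (x = +0.5)
        3, 7, 6,  3, 6, 2,   // top    (y = +0.5)
        0, 1, 5,  0, 5, 4,   // bottom (y = -0.5)
    };

    // Fills one vertex array (x, y, z per vertex) and one index array
    // covering a gridX x gridZ field of cubes laid out on the XZ plane.
    public static void Build(int gridX, int gridZ, float spacing,
                             out float[] vertices, out uint[] indices)
    {
        int cubeCount = gridX * gridZ;
        vertices = new float[cubeCount * 8 * 3];
        indices = new uint[cubeCount * FaceIndices.Length];

        int v = 0, i = 0;
        for (int gx = 0; gx < gridX; gx++)
        {
            for (int gz = 0; gz < gridZ; gz++)
            {
                // Index of this cube's first corner in the shared vertex array.
                uint baseVertex = (uint)(v / 3);

                for (int c = 0; c < 8; c++)
                {
                    vertices[v++] = Corners[c, 0] + gx * spacing;
                    vertices[v++] = Corners[c, 1];
                    vertices[v++] = Corners[c, 2] + gz * spacing;
                }

                foreach (uint index in FaceIndices)
                    indices[i++] = baseVertex + index;
            }
        }
    }
}
```

For a 100x100 grid this yields 80,000 vertices and 360,000 indices. In OpenTK the two arrays would typically be uploaded once with GL.BufferData and then drawn each frame with a single GL.DrawElements call, instead of one Begin/End pair per cube. Sharing 8 corners per cube keeps the sketch short; per-face normals or texture coordinates would need 24 vertices per cube.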

rikky's picture

Thanks a bunch for the reply (so quick, too!).

So I went back and changed the 'begin' and 'end' calls to bookend the mass drawing (something that I'm used to from XNA) but I didn't see much of a performance boost (down to around 12ms, although it may just be a coincidence).

I've looked through the documentation, but frankly there's a lot in there and without knowing what I'm looking for I'm kinda lost.
I did read through the geometry section, though, and came across this:

  • Avoid Immediate Mode or Vertex Arrays in favor of Display Lists or Vertex Buffer Objects.

Could that really be making the difference of tenfold (or potentially much more) performance?

elaverick's picture

Yeah, ditch the immediate mode and render via VBOs. Also, I don't use GameWindows much in my stuff, but can't you request a framerate in there and the window tries to accommodate it? Could it be that you've requested 60fps and that's simply what it's locking it to?

rikky's picture

Could it really make that much of a difference, though? It's still struggling to draw 1/6 of what a much less powerful system manages.

And no, I've been working with fewer objects (50 or so) and it rarely reaches 1ms, so it's not just hitting 14ms for kicks.

ERP's picture

In immediate mode you're making thousands of calls through to the driver; this is a killer.
Changing the cubes to use Index and VertexBuffers will probably get you a marginal improvement, but you'll still hit a wall in the 10-20K draw call range.
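
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes each cube is drawn as six quads with its own GL.Begin/GL.End pair and one GL.Vertex3 call per corner; the original code isn't shown in this thread, so that layout and the names below are assumptions.

```csharp
using System;

static class DrawCallEstimate
{
    // Immediate mode, assumed layout: per cube one GL.Begin, one GL.End,
    // and one GL.Vertex3 per corner of each quad face (6 faces x 4 corners).
    public static int ImmediateModeCallsPerFrame(int cubeCount)
    {
        const int vertexCallsPerCube = 6 * 4;
        return cubeCount * (vertexCallsPerCube + 2);
    }

    // One buffer per cube: a single draw call per cube per frame
    // (buffer binds and state changes are not counted here).
    public static int PerCubeBufferDrawCallsPerFrame(int cubeCount)
    {
        return cubeCount;
    }
}
```

Under those assumptions, 10,000 cubes cost about 260,000 driver calls per frame in immediate mode, and still 10,000 draw calls per frame with one buffer per cube, which already sits at the low end of the 10-20K range mentioned above.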

To start pushing the graphics card when drawing lots of simple models like this, you have to find a way to draw them with fewer calls to the driver. The way to do this is Geometry Instancing.

Any app that makes calls for every Voxel will end up CPU limited.

the Fiddler's picture
ERP wrote:

In immediate mode you're making thousands of calls through to the driver; this is a killer.
Changing the cubes to use Index and VertexBuffers will probably get you a marginal improvement, but you'll still hit a wall in the 10-20K draw call range.

To start pushing the graphics card when drawing lots of simple models like this, you have to find a way to draw them with fewer calls to the driver. The way to do this is Geometry Instancing.

Any app that makes calls for every Voxel will end up CPU limited.

Exactly. While the hardware is able to draw millions of vertices per second, CPU overhead is holding it back. Changing from GL.Begin/End to vertex buffers can often improve performance by two orders of magnitude or more but you'll still hit the wall ERP mentioned.

Geometry instancing is the solution but it's also relatively complex to implement. I'd suggest going for the lowest-hanging fruit first and working from there until you achieve your performance target.
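
As a minimal sketch of what instancing needs on the data side: the cube mesh is stored once, and each cube in the grid contributes only a small per-instance record (here just an XYZ offset). The class, method and shader variable names are hypothetical, and the shader is one plausible way to consume the offsets, not the only one.

```csharp
using System;

static class CubeInstances
{
    // One XYZ offset per cube; the cube mesh itself is stored only once.
    public static float[] BuildOffsets(int gridX, int gridZ, float spacing)
    {
        var offsets = new float[gridX * gridZ * 3];
        int o = 0;
        for (int gx = 0; gx < gridX; gx++)
        {
            for (int gz = 0; gz < gridZ; gz++)
            {
                offsets[o++] = gx * spacing; // x
                offsets[o++] = 0f;           // y
                offsets[o++] = gz * spacing; // z
            }
        }
        return offsets;
    }

    // Illustrative GLSL vertex shader: attribute 0 is the shared cube mesh,
    // attribute 1 advances once per instance rather than once per vertex.
    public const string VertexShader = @"
#version 330 core
layout(location = 0) in vec3 position;
layout(location = 1) in vec3 instanceOffset;
uniform mat4 viewProjection;
void main()
{
    gl_Position = viewProjection * vec4(position + instanceOffset, 1.0);
}";
}
```

For a 100x100 grid that is 30,000 floats of per-instance data on top of a single cube mesh. With OpenGL 3.3 or later, the offsets buffer would be bound as attribute 1 with GL.VertexAttribDivisor(1, 1) and the whole grid drawn with one GL.DrawElementsInstanced call.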

rikky's picture

Sweet. Thanks for the help, guys.

So speaking of features like depth of field and shadowing, are there specifically recommended tutorials?
I've found plenty of references, but most of them seem to be for older revisions of OpenGL (written around the turn of the millennium).