PQ Torus Knots
I had the idea for this when Fiddler mentioned triangle strips with vertex caches. For procedurally generated geometry, a triangle strip has some beneficial properties that can be used to examine the Vertex Cache's behaviour, which is quite problematic with indexed triangle lists.
Because of the attributes used when building the Torusknot, it can be used in Profiling to help identify the size of the graphics card's Vertex Cache. To my knowledge there is no GL variable one can query to retrieve the cache size (however, a DX tool exists: http://www.clootie.ru/delphi/dxtools.html ). This functionality could be handy for a Setup/Config tool for a main application that examines the graphics card (and maybe converts all meshes to the ideal cache layout for the current client).
Possible Vertex Cache sizes between 8-50 are examined, and the results are stored in the Logs/ folder (which the app will create if it is missing). The app does not modify anything on the system outside this folder.
This program does nothing but generate a Torusknot from given parameters and offer 3 modes (Interactive, Turntable and Profiling) to put the mesh to some use. It was written from scratch in 2 days, in the process of going through the OpenTK examples and experimenting with some stuff.
After some alpha testing, the profiling now produces rather useful results. However, some graphics cards produce weird results, which could be connected to multiple parallel vertex processing units inserting vertices into the cache.
Measurements like this are also problematic. The examined graphics card has a cache of 24, and 12 would be perfect with the given tri-strip layout, yet 11 vertices per ring measured slightly faster:
11 Vertices per Ring. 3,796ms averaged per draw.
12 Vertices per Ring. 3,821ms averaged per draw.
13 Vertices per Ring. 6,515ms averaged per draw.
To get a clearer picture of this problem, I need your help. Please run the application and do a profiling run by pressing "P". This will take a few seconds, then a new file is created in the Logs/ folder. If you want, you can run multiple tests with different P/Q or with Texture2D disabled, but a single benchmark with the default settings is perfectly sufficient. Please attach that text file or copy and paste it into this thread. This will take you less than 2 minutes, if you don't start toying around in interactive mode ;)
Make sure you have OpenTK.dll available to the app.
Thanks!
Edit: As promised, the source code. There is only a little documentation, since most of it is trivial. Use at your own risk! :P
The Torusknot.cs class itself handles the mesh, from generating vertices and triangles up to the VBO. A Torus Knot is specified like this:
Create( uint pathsteps, uint shapevertices, float radius, int p, int q )
where pathsteps is the number of rings in the knot and shapevertices is the number of vertices per ring.
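To illustrate how these parameters relate, here is a minimal Python sketch (it is not the code from Torusknot.cs). It uses one common parametrization of a (p, q) torus knot centerline. The names `torus_knot_point`, `knot_path`, `major`, `minor` and `pathsteps_for_budget` are assumptions made for this example, and so is the idea that the number of rings is the "Max. allowed Verts" budget divided by the vertices per ring.

```python
import math


def torus_knot_point(t, p, q, major=1.0, minor=0.5):
    """Point on a (p, q) torus knot centerline for t in [0, 2*pi).

    One common parametrization; major/minor are hypothetical names for
    the torus radii and are not taken from Torusknot.cs.
    """
    r = major + minor * math.cos(q * t)
    return (r * math.cos(p * t), r * math.sin(p * t), minor * math.sin(q * t))


def knot_path(pathsteps, p, q):
    """Ring centers: pathsteps samples spread evenly along the knot."""
    step = 2.0 * math.pi / pathsteps
    return [torus_knot_point(i * step, p, q) for i in range(pathsteps)]


def pathsteps_for_budget(max_verts, shapevertices):
    """Assumed relation: as many whole rings as fit into the vertex budget."""
    return max_verts // shapevertices


rings = pathsteps_for_budget(100000, 12)
path = knot_path(rings, p=6, q=1)
print(rings, "rings,", rings * 12, "vertices in total")
```

Each ring would then place shapevertices vertices around its center, so the total vertex count is pathsteps * shapevertices.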
The other files aren't really interesting and only included so you can build the app.
---------------------------------------------------------------------------
The included Solution was created with VC# Express 2008, in case you cannot load it:
- Create a new project (console application).
- Add *.cs and *.jpg from the compressed archive. Set the properties of logo-dark.jpg so that it is copied when building.
- Optionally add OpenTK.dll.
- Add System, System.Drawing and OpenTK as references.
| Attachment | Size |
|---|---|
| PQTorusKnots Source Code (OpenTK 0.9.0) | 37.83 KB |




Comments
Dec 06
22:50:20
posted by the Fiddler.
Tested on Vista x64 with the following results:
Window Size 1400 x 1000
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
4 Verts per Ring. 1,727ms averaged per draw.
5 Verts per Ring. 1,626ms averaged per draw.
6 Verts per Ring. 1,598ms averaged per draw.
7 Verts per Ring. 1,812ms averaged per draw.
8 Verts per Ring. 1,745ms averaged per draw.
9 Verts per Ring. 1,736ms averaged per draw.
10 Verts per Ring. 1,719ms averaged per draw.
11 Verts per Ring. 1,728ms averaged per draw.
12 Verts per Ring. 1,713ms averaged per draw.
13 Verts per Ring. 1,767ms averaged per draw.
14 Verts per Ring. 1,715ms averaged per draw.
15 Verts per Ring. 1,704ms averaged per draw.
16 Verts per Ring. 1,712ms averaged per draw.
17 Verts per Ring. 1,711ms averaged per draw.
18 Verts per Ring. 1,700ms averaged per draw.
19 Verts per Ring. 1,766ms averaged per draw.
20 Verts per Ring. 1,690ms averaged per draw.
21 Verts per Ring. 1,695ms averaged per draw.
22 Verts per Ring. 1,695ms averaged per draw.
23 Verts per Ring. 1,687ms averaged per draw.
24 Verts per Ring. 1,704ms averaged per draw.
25 Verts per Ring. 1,752ms averaged per draw.
I'm not sure what to make of the results, they look rather random and further testing didn't show anything different. Any ideas?
I'd love to take a look at the source. Also, would you mind if I took some screenshots and used them as a favicon for this site?
Dec 06
23:34:33
posted by Inertia
You resized the window to fullscreen, which made the fillrate a limiting factor too. This is simply 1 light with fixed-function Gouraud shading and texture mapping. Unless I'm timing the GL.Finish() wrong, the measured time should reflect exactly the time it took GL.DrawElements() to draw the VBO. Overdraw and culling affect the result as well; that's why the model isn't moving during profiling.
According to these results your vertex cache would be estimated as 12, which is unlikely to be true ;)
Use them in any way you like, but be careful that you don't attract Sceners or there will be spikeballs all over the place ><
Dec 07
08:25:11
posted by Stevo14
My test results:
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0.15 P: 6 Q: 1
4 Verts per Ring. 3.604ms averaged per draw.
5 Verts per Ring. 3.564ms averaged per draw.
6 Verts per Ring. 3.550ms averaged per draw.
7 Verts per Ring. 5.961ms averaged per draw.
8 Verts per Ring. 5.893ms averaged per draw.
9 Verts per Ring. 5.867ms averaged per draw.
10 Verts per Ring. 5.841ms averaged per draw.
11 Verts per Ring. 5.817ms averaged per draw.
12 Verts per Ring. 5.911ms averaged per draw.
13 Verts per Ring. 5.790ms averaged per draw.
14 Verts per Ring. 5.778ms averaged per draw.
15 Verts per Ring. 5.765ms averaged per draw.
16 Verts per Ring. 5.756ms averaged per draw.
17 Verts per Ring. 5.749ms averaged per draw.
18 Verts per Ring. 5.740ms averaged per draw.
19 Verts per Ring. 5.741ms averaged per draw.
20 Verts per Ring. 5.729ms averaged per draw.
21 Verts per Ring. 5.723ms averaged per draw.
22 Verts per Ring. 5.718ms averaged per draw.
23 Verts per Ring. 5.721ms averaged per draw.
24 Verts per Ring. 5.711ms averaged per draw.
25 Verts per Ring. 5.708ms averaged per draw.
There seems to be a noticeable increase right at 7 vertices. Does this mean that my vertex cache is 6 vertices?
ps. first post in forums.
Dec 07
13:49:18
posted by Inertia
Welcome Stevo,
and thank you for posting the result. Like Fiddler's ATi card, your effective Vertex Cache would be 12 too.
This could mean that your true Vertex Cache size is 32, but 20 vertex units are inserting new vertices into the cache in parallel, which decreases the effective size. This is not a bad thing: especially when rendering objects that share few or no vertices (e.g. particle systems), your graphics card will probably outperform any card that relies on the Vertex Cache.
I've looked this up, and it seems like ATi cards use the same memory for the L1 Texture Cache and the Vertex Cache. Would you please make another profiling run with Texture2D disabled? (Hotkey Q). Also make sure the driver does not enforce anti-aliasing, anisotropic filtering or TruForm. Especially the last could be responsible for these low values (it inserts new vertices that the profiling does not account for).
Thanks!
Dec 07
14:02:26
posted by the Fiddler.
Welcome Stevo14 :)
I reran the tests with the default window and, sure enough, the results became a little clearer.
With textures:
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
4 Verts per Ring. 1,038ms averaged per draw.
5 Verts per Ring. 1,002ms averaged per draw.
6 Verts per Ring. 1,035ms averaged per draw.
7 Verts per Ring. 1,268ms averaged per draw.
8 Verts per Ring. 1,267ms averaged per draw.
9 Verts per Ring. 1,264ms averaged per draw.
10 Verts per Ring. 1,248ms averaged per draw.
[...]
Without:
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
4 Verts per Ring. 0,809ms averaged per draw.
5 Verts per Ring. 0,760ms averaged per draw.
6 Verts per Ring. 0,735ms averaged per draw.
7 Verts per Ring. 0,895ms averaged per draw.
8 Verts per Ring. 0,898ms averaged per draw.
9 Verts per Ring. 0,895ms averaged per draw.
10 Verts per Ring. 0,899ms averaged per draw.
[...]
Disabling textures doesn't seem to affect the effective size of the cache. I'll run the test on a couple of nv40 and g70 cards, to have something to compare against.
Dec 07
16:18:11
posted by Inertia
Thank you, this clarifies at least the connection between vertex and texture cache. It seems like your card's effective cache is really 12: drawing a ring with 7 vertices takes about 120% of the time needed for 6 verts.
If you take a look at the first benchmark, the result would be a vertex cache of 10 though, while the second results in 12. I'm rather sure these discrepancies are related to the OS performing actions in the background while the app is running; a Diagnostics.Stopwatch has a fine enough resolution to be affected by this.
What I had in mind as a backup solution was binding an "expensive" vertex shader to draw the knot. It would perform useless calculations that the compiler cannot optimize away, which should increase the cost of processing a vertex a lot. The ms/draw should then increase more strongly when there are no vertex cache hits.
Dec 07
16:30:20
posted by Stevo14
Looks like about the same result the second time with textures off:
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0.15 P: 6 Q: 1
4 Verts per Ring. 4.075ms averaged per draw.
5 Verts per Ring. 4.170ms averaged per draw.
6 Verts per Ring. 4.081ms averaged per draw.
7 Verts per Ring. 7.161ms averaged per draw.
8 Verts per Ring. 7.103ms averaged per draw.
9 Verts per Ring. 7.068ms averaged per draw.
10 Verts per Ring. 7.077ms averaged per draw.
11 Verts per Ring. 7.009ms averaged per draw.
12 Verts per Ring. 6.994ms averaged per draw.
13 Verts per Ring. 6.967ms averaged per draw.
14 Verts per Ring. 6.964ms averaged per draw.
15 Verts per Ring. 6.949ms averaged per draw.
16 Verts per Ring. 6.944ms averaged per draw.
17 Verts per Ring. 6.922ms averaged per draw.
18 Verts per Ring. 6.913ms averaged per draw.
19 Verts per Ring. 6.907ms averaged per draw.
20 Verts per Ring. 6.901ms averaged per draw.
21 Verts per Ring. 6.892ms averaged per draw.
22 Verts per Ring. 6.889ms averaged per draw.
23 Verts per Ring. 6.876ms averaged per draw.
24 Verts per Ring. 6.878ms averaged per draw.
25 Verts per Ring. 6.877ms averaged per draw.
I find it curious that it was slower this time with the textures off. Of course, it could just be the fact that something was running in the background slowing things down.
Dec 08
15:02:32
posted by Inertia
Of course other processes affect the results, although the thread priority is already set to highest. The texture is trilinear filtered and might have caused the Vertex Cache to make room for the L1 Texture Cache; I just wanted to verify that this is not the case.
Here's one benchmark clearly indicating a Vertex Cache size of 24, with the jump right after 12 verts per ring (up to that point, all vertices from the previous ring come for free).
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
3 Verts per Ring. 4,734ms averaged per draw.
4 Verts per Ring. 4,471ms averaged per draw.
5 Verts per Ring. 4,188ms averaged per draw.
6 Verts per Ring. 3,997ms averaged per draw.
7 Verts per Ring. 3,858ms averaged per draw.
8 Verts per Ring. 3,709ms averaged per draw.
9 Verts per Ring. 3,593ms averaged per draw.
10 Verts per Ring. 3,572ms averaged per draw.
11 Verts per Ring. 3,538ms averaged per draw.
12 Verts per Ring. 3,391ms averaged per draw.
13 Verts per Ring. 6,114ms averaged per draw.
14 Verts per Ring. 6,128ms averaged per draw.
15 Verts per Ring. 6,092ms averaged per draw.
16 Verts per Ring. 6,085ms averaged per draw.
17 Verts per Ring. 6,111ms averaged per draw.
18 Verts per Ring. 6,091ms averaged per draw.
19 Verts per Ring. 6,074ms averaged per draw.
20 Verts per Ring. 6,092ms averaged per draw.
21 Verts per Ring. 6,002ms averaged per draw.
22 Verts per Ring. 6,098ms averaged per draw.
23 Verts per Ring. 5,994ms averaged per draw.
24 Verts per Ring. 6,027ms averaged per draw.
25 Verts per Ring. 5,996ms averaged per draw.
Edit: Source Code added.
Dec 08
15:17:58
posted by the Fiddler
Thanks for the source. I will be running tests on a couple of other systems to see what comes up.
Ah, one small thing: you can register for KeyDown and KeyUp events in the GameWindow.Keyboard class, which can simplify the keyboard handling logic (if I understand how the KeyStrokeManager works). Documentation...
Dec 08
15:46:12
posted by Inertia
I had a couple of profiling runs on other people's laptops, but a Vertex Cache of 24 was the highest result so far. One Intel chipset only cached the last 8 vertices; I think that's the absolute minimum an OpenGL driver must provide? I'm also getting the suspicion that there may be no standard for whether the cache uses FIFO or LRU logic to decide which entry gets discarded.
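To explore the FIFO vs. LRU question, here is a toy simulation in Python. It is a sketch under assumptions, not a model of any real driver or GPU: the strip layout (one closed triangle strip per pair of neighbouring rings, built by the hypothetical `ring_strip_indices`), the two replacement policies and all function names are made up for this example. It counts how many vertices would have to be processed per unique vertex for a given cache size.

```python
from collections import OrderedDict, deque


def ring_strip_indices(rings, verts_per_ring):
    """Index stream of closed zig-zag strips between consecutive rings (assumed layout)."""
    indices = []
    for i in range(rings - 1):
        for j in range(verts_per_ring + 1):  # +1 repeats vertex 0 to close the ring
            k = j % verts_per_ring
            indices.append(i * verts_per_ring + k)        # vertex on ring i
            indices.append((i + 1) * verts_per_ring + k)  # vertex on ring i + 1
    return indices


def count_misses(indices, cache_size, policy):
    """Count cache misses for a 'fifo' or 'lru' post-transform cache model."""
    misses = 0
    if policy == "fifo":
        queue, resident = deque(), set()
        for v in indices:
            if v in resident:
                continue  # hit: FIFO order is not changed
            misses += 1
            queue.append(v)
            resident.add(v)
            if len(queue) > cache_size:
                resident.discard(queue.popleft())
    elif policy == "lru":
        cache = OrderedDict()
        for v in indices:
            if v in cache:
                cache.move_to_end(v)  # hit: mark as most recently used
                continue
            misses += 1
            cache[v] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    else:
        raise ValueError("policy must be 'fifo' or 'lru'")
    return misses


def misses_per_vertex(verts_per_ring, cache_size, policy, rings=50):
    """Processed vertices per unique vertex; 1.0 means every vertex is processed once."""
    indices = ring_strip_indices(rings, verts_per_ring)
    return count_misses(indices, cache_size, policy) / (rings * verts_per_ring)


for policy in ("fifo", "lru"):
    for verts in (11, 12, 13, 14):
        print(policy, verts, round(misses_per_vertex(verts, 24, policy), 2))
```

With an assumed cache of 24 entries and a cold start, this model processes each vertex roughly once at 12 verts per ring and roughly twice at 13, for both policies. That is at least consistent with the jump seen in some of the logs, but real hardware may batch or reorder vertices differently.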
Well, I just had trouble with the key repeat and wanted to get this done quickly. For a game this kind of behaviour from the input class is great; I didn't really look into the events, as this was just a Quickstart template. I just built this app to get my mind off porting the MS3D Loader to OpenTK.Math, and kinda to prove that the mathlib isn't the problem ;)
Jan 04
14:28:07
posted by Inertia
Please keep posting results. I couldn't get my hands on any state-of-the-art GeForce 8 or Radeon 3xxx card so far and would really like to see the trends there. According to a paper (link?), there is supposed to be a barrier at a Vertex Cache size of 32, where the cached area of the mesh is so large that cache misses become inevitable, since all neighbours inside the cache are already drawn. If the trend on high-end hardware is to neglect cache optimizations in favour of parallel processing, it might be best to optimize meshes for a very low cache size (8-12) to increase the chance of reaching a vertex while it's still in the cache at all.
Again, please do a profiling run. It will take less than 2 minutes. Thank you!
Edit:
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
4 Verts per Ring. 1,567ms averaged per draw.
5 Verts per Ring. 1,461ms averaged per draw.
6 Verts per Ring. 1,434ms averaged per draw.
7 Verts per Ring. 1,359ms averaged per draw.
8 Verts per Ring. 1,308ms averaged per draw.
9 Verts per Ring. 1,284ms averaged per draw.
10 Verts per Ring. 1,252ms averaged per draw.
11 Verts per Ring. 1,265ms averaged per draw.
12 Verts per Ring. 1,261ms averaged per draw.
13 Verts per Ring. 1,987ms averaged per draw.
14 Verts per Ring. 1,952ms averaged per draw.
15 Verts per Ring. 1,903ms averaged per draw.
16 Verts per Ring. 1,915ms averaged per draw.
17 Verts per Ring. 1,948ms averaged per draw.
18 Verts per Ring. 1,951ms averaged per draw.
19 Verts per Ring. 1,893ms averaged per draw.
20 Verts per Ring. 1,885ms averaged per draw.
21 Verts per Ring. 1,944ms averaged per draw.
22 Verts per Ring. 1,937ms averaged per draw.
23 Verts per Ring. 1,943ms averaged per draw.
24 Verts per Ring. 1,883ms averaged per draw.
25 Verts per Ring. 1,889ms averaged per draw.
Dec 22
15:07:29
posted by Inertia
:|
..and I thought it'd be quicker to get results over the net. Won't be my last err.
Do I have to wrap it into a Setup.exe and implement some function that e-mails me the results or what is the problem? It really takes less than 2 minutes if you have an IDE, .NET and OpenTK installed, so X-mas is not a valid excuse.
Jan 03
23:11:07 Profiling Log for Quadro NVS
posted by athiniar
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
4 Verts per Ring. 3,041ms averaged per draw.
5 Verts per Ring. 3,035ms averaged per draw.
6 Verts per Ring. 2,960ms averaged per draw.
7 Verts per Ring. 3,051ms averaged per draw.
8 Verts per Ring. 3,564ms averaged per draw.
9 Verts per Ring. 3,570ms averaged per draw.
10 Verts per Ring. 3,438ms averaged per draw.
11 Verts per Ring. 3,552ms averaged per draw.
12 Verts per Ring. 3,577ms averaged per draw.
13 Verts per Ring. 3,519ms averaged per draw.
14 Verts per Ring. 3,732ms averaged per draw.
15 Verts per Ring. 5,506ms averaged per draw.
16 Verts per Ring. 5,575ms averaged per draw.
17 Verts per Ring. 5,628ms averaged per draw.
18 Verts per Ring. 5,606ms averaged per draw.
19 Verts per Ring. 5,455ms averaged per draw.
20 Verts per Ring. 5,580ms averaged per draw.
21 Verts per Ring. 5,539ms averaged per draw.
22 Verts per Ring. 5,513ms averaged per draw.
23 Verts per Ring. 5,622ms averaged per draw.
24 Verts per Ring. 5,569ms averaged per draw.
25 Verts per Ring. 5,513ms averaged per draw.
Jan 04
14:29:50
posted by Inertia
Thanks hun! Very interesting results, thanks again for posting them :)
Jan 07
20:05:09
posted by athiniar
Here are the results with my second computer (if you still need them for your cooking!)
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
4 Verts per Ring. 2,208ms averaged per draw.
5 Verts per Ring. 2,177ms averaged per draw.
6 Verts per Ring. 2,153ms averaged per draw.
7 Verts per Ring. 3,975ms averaged per draw.
8 Verts per Ring. 3,949ms averaged per draw.
9 Verts per Ring. 3,942ms averaged per draw.
10 Verts per Ring. 3,910ms averaged per draw.
11 Verts per Ring. 3,903ms averaged per draw.
12 Verts per Ring. 3,914ms averaged per draw.
13 Verts per Ring. 3,908ms averaged per draw.
14 Verts per Ring. 3,859ms averaged per draw.
15 Verts per Ring. 3,857ms averaged per draw.
16 Verts per Ring. 3,843ms averaged per draw.
17 Verts per Ring. 3,833ms averaged per draw.
18 Verts per Ring. 3,868ms averaged per draw.
19 Verts per Ring. 3,831ms averaged per draw.
20 Verts per Ring. 3,827ms averaged per draw.
21 Verts per Ring. 3,814ms averaged per draw.
22 Verts per Ring. 3,815ms averaged per draw.
23 Verts per Ring. 3,849ms averaged per draw.
24 Verts per Ring. 3,813ms averaged per draw.
25 Verts per Ring. 3,846ms averaged per draw.
Jan 07
20:38:39
posted by Inertia
Thanks again, very much appreciated :)
Every single result helps to get a better picture of how graphics cards are designed under the hood, keep them coming :D
Jan 08
11:32:10
posted by objarni
Here's mine! I just ran the .exe and pressed P, did not zoom or rotate or anything. As you know already, I have a lousy card, so don't be surprised by the numbers :).
It seems I don't have any vertex cache, or how should I interpret the result?
Profiling Log for GeForce 7300 SE/7200 GS/PCI/SSE2/3DNOW! (2.1.1)
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
4 Verts per Ring. 6,038ms averaged per draw.
5 Verts per Ring. 6,084ms averaged per draw.
6 Verts per Ring. 6,379ms averaged per draw.
7 Verts per Ring. 6,241ms averaged per draw.
8 Verts per Ring. 6,226ms averaged per draw.
9 Verts per Ring. 6,243ms averaged per draw.
10 Verts per Ring. 6,252ms averaged per draw.
11 Verts per Ring. 6,226ms averaged per draw.
12 Verts per Ring. 6,217ms averaged per draw.
13 Verts per Ring. 6,205ms averaged per draw.
14 Verts per Ring. 6,185ms averaged per draw.
15 Verts per Ring. 6,185ms averaged per draw.
16 Verts per Ring. 6,189ms averaged per draw.
17 Verts per Ring. 6,190ms averaged per draw.
18 Verts per Ring. 6,179ms averaged per draw.
19 Verts per Ring. 6,185ms averaged per draw.
20 Verts per Ring. 6,172ms averaged per draw.
21 Verts per Ring. 6,194ms averaged per draw.
22 Verts per Ring. 6,183ms averaged per draw.
23 Verts per Ring. 6,157ms averaged per draw.
24 Verts per Ring. 6,150ms averaged per draw.
25 Verts per Ring. 6,149ms averaged per draw.
Jan 08
14:24:25
posted by Inertia
I'd say the cache size is 10: there's a low at 5 Verts and a high at 6 Verts. The slight difference between 4 and 5 Verts is related to timer accuracy, which is unfortunately a common problem.
Due to the TriangleStrip's zig-zag pattern, there must be 2 complete rings in the vertex cache to get the speed boost. The cache size is therefore estimated as 2 * the highest "Verts per Ring" value that still draws fast.
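The rule above can be sketched as a small log reader in Python. This is only an illustration, not part of the app: the function names and the `min_jump` threshold of 1.15 are assumptions, and the parser accepts both "," and "." as the decimal separator because the posted logs contain both.

```python
import re

# Matches e.g. "12 Verts per Ring. 3,391ms averaged per draw."
LOG_LINE = re.compile(r"(\d+)\s+Vert(?:s|ices) per Ring\.\s+(\d+)[.,](\d+)ms")


def parse_log(text):
    """Return a list of (verts_per_ring, milliseconds) from a profiling log."""
    results = []
    for match in LOG_LINE.finditer(text):
        verts, whole, frac = match.groups()
        results.append((int(verts), float(f"{whole}.{frac}")))
    return results


def estimate_cache_size(results, min_jump=1.15):
    """Estimate 2 * verts per ring just before the largest slowdown.

    Returns None when no consecutive pair slows down by at least min_jump,
    i.e. when the log shows no clear step.
    """
    best_ratio = min_jump
    last_fast = None
    for (verts, ms), (_, next_ms) in zip(results, results[1:]):
        ratio = next_ms / ms
        if ratio >= best_ratio:
            best_ratio = ratio
            last_fast = verts
    return None if last_fast is None else 2 * last_fast


SAMPLE = """
11 Verts per Ring. 3,538ms averaged per draw.
12 Verts per Ring. 3,391ms averaged per draw.
13 Verts per Ring. 6,114ms averaged per draw.
14 Verts per Ring. 6,128ms averaged per draw.
"""

print(estimate_cache_size(parse_log(SAMPLE)))
```

For the sample lines, taken from the benchmark posted on Dec 08, the largest step is between 12 and 13 verts per ring, so the estimate is 24. Logs without a clear step give None instead of a guess.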
Thank you for posting, all results help :)
Feb 07
19:19:50 Re: PQ Torus Knots
posted by Mincus
Noticed you were asking for newer cards. Got an 8600M GT, so hope it's new enough! (Not sure if it makes a difference, but this was run under Vista.)
Profiling Log for GeForce 8600M GT/PCI/SSE2 (2.1.2)
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0.15 P: 6 Q: 1
3 Verts per Ring. 1.593ms averaged per draw.
4 Verts per Ring. 1.465ms averaged per draw.
5 Verts per Ring. 1.420ms averaged per draw.
6 Verts per Ring. 1.403ms averaged per draw.
7 Verts per Ring. 1.387ms averaged per draw.
8 Verts per Ring. 1.375ms averaged per draw.
9 Verts per Ring. 1.377ms averaged per draw.
10 Verts per Ring. 1.359ms averaged per draw.
11 Verts per Ring. 1.342ms averaged per draw.
12 Verts per Ring. 1.370ms averaged per draw.
13 Verts per Ring. 1.352ms averaged per draw.
14 Verts per Ring. 1.354ms averaged per draw.
15 Verts per Ring. 1.308ms averaged per draw.
16 Verts per Ring. 1.346ms averaged per draw.
17 Verts per Ring. 1.336ms averaged per draw.
18 Verts per Ring. 1.296ms averaged per draw.
19 Verts per Ring. 1.342ms averaged per draw.
20 Verts per Ring. 1.346ms averaged per draw.
21 Verts per Ring. 1.345ms averaged per draw.
22 Verts per Ring. 1.349ms averaged per draw.
23 Verts per Ring. 1.339ms averaged per draw.
24 Verts per Ring. 1.332ms averaged per draw.
25 Verts per Ring. 1.344ms averaged per draw.
Feb 07
23:28:44 Re: PQ Torus Knots
posted by Inertia
Thank you very much. This confirms my suspicion that state-of-the-art cards need a more expensive vertex shader program to give useful test results. From the log you posted, the effective size could be either 30 or 36 verts.
Will investigate and post an update to the app.
Feb 23
12:09:00 Re: PQ Torus Knots
posted by Inertia
Radeon 3870: Cache Size 14
Edit: DXtool reports a cache size of 0 (wtf?)
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
3 Verts per Ring. 0,410ms averaged per draw.
4 Verts per Ring. 0,382ms averaged per draw.
5 Verts per Ring. 0,367ms averaged per draw.
6 Verts per Ring. 0,354ms averaged per draw.
7 Verts per Ring. 0,347ms averaged per draw.
8 Verts per Ring. 0,666ms averaged per draw.
9 Verts per Ring. 0,619ms averaged per draw.
10 Verts per Ring. 0,657ms averaged per draw.
11 Verts per Ring. 0,619ms averaged per draw.
12 Verts per Ring. 0,651ms averaged per draw.
13 Verts per Ring. 0,619ms averaged per draw.
14 Verts per Ring. 0,646ms averaged per draw.
15 Verts per Ring. 0,619ms averaged per draw.
16 Verts per Ring. 0,643ms averaged per draw.
17 Verts per Ring. 0,619ms averaged per draw.
18 Verts per Ring. 0,640ms averaged per draw.
19 Verts per Ring. 0,619ms averaged per draw.
20 Verts per Ring. 0,638ms averaged per draw.
21 Verts per Ring. 0,619ms averaged per draw.
22 Verts per Ring. 0,636ms averaged per draw.
23 Verts per Ring. 0,619ms averaged per draw.
24 Verts per Ring. 0,635ms averaged per draw.
25 Verts per Ring. 0,619ms averaged per draw.
-----------------------------------------------------------------------------
Profiling Log for GeForce 8600 GTS/PCI/SSE2 (2.1.2)
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
3 Verts per Ring. 0,880ms averaged per draw.
4 Verts per Ring. 0,787ms averaged per draw.
5 Verts per Ring. 0,766ms averaged per draw.
6 Verts per Ring. 0,755ms averaged per draw.
7 Verts per Ring. 0,730ms averaged per draw.
8 Verts per Ring. 0,705ms averaged per draw.
9 Verts per Ring. 0,739ms averaged per draw.
10 Verts per Ring. 0,722ms averaged per draw.
11 Verts per Ring. 0,800ms averaged per draw.
12 Verts per Ring. 0,704ms averaged per draw.
13 Verts per Ring. 0,713ms averaged per draw.
14 Verts per Ring. 0,759ms averaged per draw.
15 Verts per Ring. 0,696ms averaged per draw.
16 Verts per Ring. 0,689ms averaged per draw.
17 Verts per Ring. 0,719ms averaged per draw.
18 Verts per Ring. 0,692ms averaged per draw.
19 Verts per Ring. 0,697ms averaged per draw.
20 Verts per Ring. 0,696ms averaged per draw.
21 Verts per Ring. 0,690ms averaged per draw.
22 Verts per Ring. 0,675ms averaged per draw.
23 Verts per Ring. 0,669ms averaged per draw.
24 Verts per Ring. 0,666ms averaged per draw.
25 Verts per Ring. 0,688ms averaged per draw.
Feb 23
15:51:36 Re: PQ Torus Knots
posted by lubos
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
3 Verts per Ring. 16,292ms averaged per draw.
4 Verts per Ring. 15,455ms averaged per draw.
5 Verts per Ring. 13,625ms averaged per draw.
6 Verts per Ring. 15,465ms averaged per draw.
7 Verts per Ring. 16,302ms averaged per draw.
8 Verts per Ring. 15,392ms averaged per draw.
9 Verts per Ring. 16,109ms averaged per draw.
10 Verts per Ring. 15,453ms averaged per draw.
11 Verts per Ring. 15,468ms averaged per draw.
12 Verts per Ring. 15,418ms averaged per draw.
13 Verts per Ring. 15,420ms averaged per draw.
14 Verts per Ring. 16,005ms averaged per draw.
15 Verts per Ring. 15,114ms averaged per draw.
16 Verts per Ring. 14,231ms averaged per draw.
17 Verts per Ring. 16,006ms averaged per draw.
18 Verts per Ring. 15,147ms averaged per draw.
19 Verts per Ring. 15,312ms averaged per draw.
20 Verts per Ring. 15,171ms averaged per draw.
21 Verts per Ring. 16,028ms averaged per draw.
22 Verts per Ring. 15,171ms averaged per draw.
23 Verts per Ring. 16,043ms averaged per draw.
24 Verts per Ring. 15,159ms averaged per draw.
25 Verts per Ring. 15,927ms averaged per draw.
Feb 24
11:29:00 Re: PQ Torus Knots
posted by Inertia
Thanks for the log. Did the laptop have any power-saving settings on, or was vsync enforced? There's a low at 5 and 16 verts, and imho a GeForce 7 should be able to process 100k vertices faster than a GeForce 5 (~15ms vs. ~6ms). Kinda suspicious :P
Feb 24
12:00:10 Re: PQ Torus Knots
posted by lubos
Argh, I found the magic slider in nVidia settings :)
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0,15 P: 6 Q: 1
3 Verts per Ring. 1,594ms averaged per draw.
4 Verts per Ring. 1,574ms averaged per draw.
5 Verts per Ring. 1,569ms averaged per draw.
6 Verts per Ring. 1,544ms averaged per draw.
7 Verts per Ring. 1,541ms averaged per draw.
8 Verts per Ring. 1,533ms averaged per draw.
9 Verts per Ring. 1,528ms averaged per draw.
10 Verts per Ring. 1,539ms averaged per draw.
11 Verts per Ring. 1,544ms averaged per draw.
12 Verts per Ring. 1,526ms averaged per draw.
13 Verts per Ring. 1,519ms averaged per draw.
14 Verts per Ring. 1,626ms averaged per draw.
15 Verts per Ring. 1,617ms averaged per draw.
16 Verts per Ring. 1,668ms averaged per draw.
17 Verts per Ring. 1,736ms averaged per draw.
18 Verts per Ring. 1,698ms averaged per draw.
19 Verts per Ring. 1,624ms averaged per draw.
20 Verts per Ring. 1,621ms averaged per draw.
21 Verts per Ring. 1,628ms averaged per draw.
22 Verts per Ring. 1,612ms averaged per draw.
23 Verts per Ring. 1,577ms averaged per draw.
24 Verts per Ring. 1,558ms averaged per draw.
25 Verts per Ring. 1,624ms averaged per draw.
Feb 24
12:14:56 Re: PQ Torus Knots
posted by Inertia
Thank you :) Confusing results though: there's a low at 13 verts, but it's more likely that no vertex cache strategy is used at all (cache hits should roughly halve the measured draw time). If you've got a spare minute, would you please try the DXTool (linked in the initial post) and see if it can detect a cache size? It would be good to have a "2nd opinion" ;)
Feb 24
14:23:55 Re: PQ Torus Knots
posted by lubos
program detected
size: 37
May 13
08:15:49 Re: PQ Torus Knots Bench.
posted by Darian
After modifying the source for 1.9.1 (just renaming the namespace) and commenting out the OpenGL version detection (which yields false on the Intel GMA950, which is specified as 1.4 + ARB_vertex_buffer + EXT_shadow_funcs extensions + TexEnv shader caching), the application now runs. The object rotation seems smooth, yet the results seem horrible.
(I'm a bit bewildered that the timing results are separated by either a decimal point or a comma, which is usually used to separate thousands. Moreover, I wonder how precise my results are.)
Is there any way to repeat the benchmark after some modification so that it yields better results?
The two major things that might be affecting it are X and compiz (which might have been better left off).
It seems weird that the best timing is for the largest number of vertices.
- Darian.
==================
Profiling Log for Mesa DRI Intel(R) 945G 20061017 x86/MMX/SSE2 (1.3 Mesa 7.0.3-rc2)
Window Size 512 x 512
Max. allowed Verts: 100000 Radius: 0.15 P: 6 Q: 1
3 Verts per Ring. 122.782ms averaged per draw.
4 Verts per Ring. 115.521ms averaged per draw.
5 Verts per Ring. 111.450ms averaged per draw.
6 Verts per Ring. 109.288ms averaged per draw.
7 Verts per Ring. 106.695ms averaged per draw.
8 Verts per Ring. 105.226ms averaged per draw.
9 Verts per Ring. 108.604ms averaged per draw.
10 Verts per Ring. 106.859ms averaged per draw.
11 Verts per Ring. 111.982ms averaged per draw.
12 Verts per Ring. 110.323ms averaged per draw.
13 Verts per Ring. 109.019ms averaged per draw.
14 Verts per Ring. 108.363ms averaged per draw.
15 Verts per Ring. 107.661ms averaged per draw.
16 Verts per Ring. 109.275ms averaged per draw.
17 Verts per Ring. 123.947ms averaged per draw.
18 Verts per Ring. 120.290ms averaged per draw.
19 Verts per Ring. 121.464ms averaged per draw.
20 Verts per Ring. 120.476ms averaged per draw.
21 Verts per Ring. 119.660ms averaged per draw.
22 Verts per Ring. 120.158ms averaged per draw.
23 Verts per Ring. 120.615ms averaged per draw.
24 Verts per Ring. 113.487ms averaged per draw.
25 Verts per Ring. 85.760ms averaged per draw.
May 13
10:45:45 Re: PQ Torus Knots
posted by the Fiddler
If I remember correctly, the GMA950 does not have a T&L engine (vertices are processed on the CPU), which might explain the results.
It is likely that compiz plays a role too.
May 13
15:38:12 Re: PQ Torus Knots
posted by Inertia
Since the application only runs at ~10 fps, it's quite safe to assume that there's no hardware acceleration happening at all. The slightly decreasing time per frame can probably be explained by the backface culling mechanism being able to reject more and more faces.
The major problem with this application is that the timing of the draw call is quite precise, so any process running in the background (such as network traffic or input events) affects the result notably. That makes it quite hard to write a benchmark that reliably outputs a useful result.
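One possible way to reduce the influence of background activity is to keep the individual samples and compare the median or minimum instead of only the average. The Python sketch below is a suggestion under assumptions, not what the app does: `draw` is a placeholder for the real draw call (followed by something like GL.Finish()), and the function names are invented for this example.

```python
import statistics
import time


def sample_draw_times(draw, runs, timer=time.perf_counter):
    """Time draw() several times and return each duration in milliseconds."""
    samples = []
    for _ in range(runs):
        start = timer()
        draw()  # placeholder: the real app would issue the draw call and wait for it
        samples.append((timer() - start) * 1000.0)
    return samples


def summarize(samples):
    """Minimum and median are less sensitive to occasional spikes than the mean."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "mean": statistics.mean(samples),
    }


# Hypothetical samples with one spike caused by background activity.
stats = summarize([3.4, 3.5, 3.4, 9.0, 3.6])
print(stats["min"], stats["median"], stats["mean"])
```

In the hypothetical samples, a single spike pulls the mean well above the median, while the minimum and median stay close to the typical draw time.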
For my own use I'm just optimizing meshes for a vertex cache size of 12 right now, but I'm planning to write a function that implements a weightless algorithm at some point and compare the results.
I'm not working on this project anymore, and it is now pretty much only a demo of how to generate a mesh procedurally and stuff it into a VBO. It should probably be moved to the archives?
May 13
16:18:57 Re: PQ Torus Knots
posted by Darian
Thank you both, Fiddler and Inertia.
I tried this as an example to test my GMA 950's VBO capabilities and to see if it yields an error when running undefined extensions.
I still have no answer regarding this issue. I don't want to go off topic, so I'll keep it short.
I need to display between 2000 and 10000 spheres and am looking for the best way of doing it, with one important catch: I need the updated vertex data back in the program's logic when I rotate/translate objects.
I tried this VBO example to get hold of a proper method.
- Darian