sgsrules's picture

poor performance using vector math library

Hi everyone, this is my first post. I've been coding a 3D rendering engine for VJ use in C# and OpenGL. The project originally started in Java/JOGL, but I recently ported everything to C#. In Java I was using the Java vecmath library, which worked great, but now that I'm coding in C# I need a new library to handle vector and matrix math.

After some searching I came across the OpenTK.Math library, which does everything I need, but I've been having some serious performance issues. One test that ran at 60 fps in Java runs at 20 fps in C#. After some testing I came to the conclusion that the bottleneck is in my math routines. I've read good things about the OpenTK math library, and one of the benchmarks I saw stated that it is pretty fast, so I'm sure I'm not using it correctly. I think the problem is that I'm not passing things by reference, so it's making a lot of extra copies. If someone could give me some pointers I'd greatly appreciate it.

Here's some sample code I wrote a while back. It draws a 3D arrow that bends along a path. I'm only using it as a test since it uses a lot of vector and matrix math. I'm only interested in optimizing the math functions. Everything is written in immediate mode with no VBOs; I know how to implement VBOs and could write better code, so please don't point it out :P I just need help with the OpenTK.Math calls.

TIA
Stephen

public class Arrow
{
    int count;
    float Size;
    float scl = 1;
    Vector3[] pos;
    Vector3[] square = new Vector3[4];
    Vector3 acl = new Vector3();
    Vector3[] vel;
    Vector3 up = new Vector3(1, 1, 1);
    Vector3 vx = new Vector3();
    Vector3 vy = new Vector3();
    Vector3 vz = new Vector3();
    Vector3 anchor;

    public Arrow(float _Size)
    {
        Size = _Size;
        count = 70;
        pos = new Vector3[count];
        vel = new Vector3[count];
        up.Normalize();
        square[0] = new Vector3(-Size, -Size, 0);
        square[1] = new Vector3(Size, -Size, 0);
        square[2] = new Vector3(Size, Size, 0);
        square[3] = new Vector3(-Size, Size, 0);
        for (int i = 0; i < count; i++)
        {
            pos[i] = new Vector3();
            vel[i] = new Vector3();
        }
    }

    public void draw(Vector3 _anchor, float _r, float _g, float _b, float _scl, int _length, bool _tip)
    {
        anchor = new Vector3(_anchor);
        Gl.glColor4f(_r, _g, _b, 1);
        Gl.glColorMaterial(Gl.GL_FRONT_AND_BACK, Gl.GL_AMBIENT_AND_DIFFUSE);
        Gl.glColorMaterial(Gl.GL_FRONT_AND_BACK, Gl.GL_SPECULAR);
        for (int k = 0; k < square.Length; k++)
        { ////Begin loop for each side
            scl = _scl;
            float c = 0; //color counter rest
            ////////////// BEGIN SHAPE ////////////
            Gl.glBegin(Gl.GL_QUAD_STRIP);
            for (int i = 0; i < _length; i++)
            {
                if (i == 0)
                {
                    acl = anchor - pos[i];
                    // acl.sub(anchor, pos[i]);
                }
                else
                {
                    acl = pos[i - 1] - pos[i];
                    // acl.sub(pos[i - 1], (pos[i]));
                }
                float aclScale = .09f * (.1f * (i + 1));
                acl.Scale(aclScale, aclScale, aclScale);

                vel[i] += acl;
                pos[i] += vel[i];
                // vel[i].add(acl);
                // pos[i].add(vel[i]);

                ////////// MAKE ARROW HEAD /////////////
                if (_tip)
                {
                    if (i == 0)
                    {
                        scl = 0;
                    }
                    else if (i == 1)
                    {
                        scl = _scl * 3f;
                        Vector3 arrowhead = new Vector3(vel[0]);
                        //arrowhead.set(vel[0]);
                        arrowhead.Normalize();
                        arrowhead.Scale(Size / 3f, Size / 3f, Size / 3f);
                        pos[1] -= arrowhead;
                        //pos[1].sub(arrowhead);
                        //pos[2].add(arrow);
                    }
                    else
                    {
                        scl = _scl;
                    }
                    if (i == 2)
                    {
                        pos[i] = pos[i - 1];
                    }
                }
                /////////////////////////////////////////

                ///////// SETUP AXIS VECTORS /////////////
                vz = (acl);
                vz.Normalize();

                vx = Vector3.Cross(vz, up);
                //vx.cross(vz, up);
                vx.Normalize();
                vy = Vector3.Cross(vz, vx);
                //vy.cross(vz, vx);
                vy.Normalize();
                //////////////////////////////////////////

                ///////////// CREATE MATRIX //////////////
                // Vector4 mvx = new Vector4(vx);

                Matrix4 mat = new Matrix4(new Vector4(vx), new Vector4(vy), new Vector4(vz), new Vector4(pos[i]));

                Matrix4 sclm = Matrix4.Scale(scl);

                ////////////// ALIGN SQUARE //////////////
                Vector4[] temp = new Vector4[4];
                Vector4[] sqscl = new Vector4[4];
                for (int j = 0; j < square.Length; j++)
                {
                    sqscl[j] = Vector3.Transform(square[j], sclm);
                    temp[j] = Vector4.Transform(sqscl[j], mat);

                    // sclm.transform(square[j], sqscl[j]);

                    // mat.transform(sqscl[j], temp[j]);
                }
                //////////////////////////////////////////

                //////////// TOP /////////////
                if (k == 0)
                {
                    Gl.glNormal3f(-vy.X, -vy.Y, -vy.Z);
                    Gl.glVertex3f(temp[0].X, temp[0].Y, temp[0].Z);
                    Gl.glVertex3f(temp[1].X, temp[1].Y, temp[1].Z);
                }
                /////////// RIGHT ////////////
                else if (k == 1)
                {
                    Gl.glNormal3f(vx.X, vx.Y, vx.Z);
                    Gl.glVertex3f(temp[1].X, temp[1].Y, temp[1].Z);
                    Gl.glVertex3f(temp[2].X, temp[2].Y, temp[2].Z);
                }
                /////////// BOTTOM ///////////
                else if (k == 2)
                {
                    Gl.glNormal3f(vy.X, vy.Y, vy.Z);
                    Gl.glVertex3f(temp[2].X, temp[2].Y, temp[2].Z);
                    Gl.glVertex3f(temp[3].X, temp[3].Y, temp[3].Z);
                }
                /////////// LEFT /////////////
                else if (k == 3)
                {
                    Gl.glNormal3f(-vx.X, -vx.Y, -vx.Z);
                    Gl.glVertex3f(temp[3].X, temp[3].Y, temp[3].Z);
                    Gl.glVertex3f(temp[0].X, temp[0].Y, temp[0].Z);
                }
                float velscl = .8f - (.015f * i);
                vel[i].Scale(velscl, velscl, velscl);
                c += .018f;
                // scl-=.0125; /// Scale decrement

            }
            Gl.glEnd();
            //////////////// END SHAPE ///////////////////
        }
    }
}


Comments

sgsrules's picture

Perfect answer, I wish I had found this on my own. Thanks, Inertia.

sgsrules's picture

I rewrote the above example using functions and passing things by reference, and the performance is about the same. I'm not using OpenTK for graphics; I'm using the Tao framework, so I'm not sure whether that affects anything. Java/JOGL is also managed code, so why is the performance so much better with its math libraries?

Inertia's picture

1.a) How do you determine performance? (Your description is very vague.)
1.b) Where do you do the profiling? Inside the VS IDE? (Only use optimized release builds, with the VS IDE shut down.)
2.) You did not update your initial code with the changes, so I have no idea what your program looks like now. When the Mathlib was introduced I asked for overloads that allow you to write "A.Add( ref B );" but iirc this drowned in an elegance vs. performance discussion and has never been picked up again. Vector3.Add( ref A, ref B, out C); is pretty close speedwise though.
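
To illustrate the difference between the two calling styles mentioned above, here is a minimal sketch. Vec3 is a hypothetical stand-in struct written for this example, not OpenTK's actual Vector3, and whether the by-reference form is measurably faster depends on the JIT and the struct size.

```csharp
using System;

// Hypothetical stand-in for a vector struct; NOT OpenTK's actual Vector3.
public struct Vec3
{
    public float X, Y, Z;

    public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }

    // By-value operator: both operands are copied in, the result is copied out.
    public static Vec3 operator +(Vec3 a, Vec3 b)
    {
        return new Vec3(a.X + b.X, a.Y + b.Y, a.Z + b.Z);
    }

    // By-reference variant: operands are passed by address and the result
    // is written directly into the caller's variable.
    public static void Add(ref Vec3 a, ref Vec3 b, out Vec3 result)
    {
        result.X = a.X + b.X;
        result.Y = a.Y + b.Y;
        result.Z = a.Z + b.Z;
    }
}

public static class AddDemo
{
    public static void Main()
    {
        Vec3 a = new Vec3(1, 2, 3);
        Vec3 b = new Vec3(4, 5, 6);

        Vec3 sum1 = a + b;                 // by value
        Vec3 sum2;
        Vec3.Add(ref a, ref b, out sum2);  // by reference

        // Both lines print: 5 7 9
        Console.WriteLine(sum1.X + " " + sum1.Y + " " + sum1.Z);
        Console.WriteLine(sum2.X + " " + sum2.Y + " " + sum2.Z);
    }
}
```

The two forms compute the same result; the by-reference form only changes how the operands and result travel between caller and callee.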

JTalton's picture

I'm hoping to get A.Add( ref B ) and such added to the vector classes at some point.

sgsrules's picture

Unfortunately I'm using Visual Studio 2008 Professional, which doesn't have a profiler.
I've changed every line of code to pass things by reference, but some of the functions don't have pass-by-reference overloads.

If I comment out the following lines, my framerate goes from 7 fps to 22 fps:

sqscl[j] = Vector3.Transform(square[j],sclm);
temp[j] = Vector4.Transform(sqscl[j], mat);

the Fiddler's picture

That's a significant oversight: passing by value means pushing 80 bytes on the stack, which positively kills performance. I'll add the necessary overloads to svn asap.
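
As a rough sketch of what a by-reference transform overload could look like: the 80 bytes would correspond to a 4x4 float matrix (64 bytes) plus a 4-float vector (16 bytes) copied per call. Vec4, Mat4 and the Transform signature below are hypothetical types written for this example and assume a row-vector convention; they are not the overloads that ended up in OpenTK.

```csharp
using System;

// Hypothetical 4-float vector (16 bytes); NOT OpenTK's Vector4.
public struct Vec4
{
    public float X, Y, Z, W;
    public Vec4(float x, float y, float z, float w) { X = x; Y = y; Z = z; W = w; }
}

// Hypothetical 4x4 float matrix stored as four rows (64 bytes); NOT OpenTK's Matrix4.
public struct Mat4
{
    public Vec4 Row0, Row1, Row2, Row3;

    // Translation matrix for the row-vector convention (translation in the last row).
    public static Mat4 Translation(float x, float y, float z)
    {
        Mat4 m;
        m.Row0 = new Vec4(1, 0, 0, 0);
        m.Row1 = new Vec4(0, 1, 0, 0);
        m.Row2 = new Vec4(0, 0, 1, 0);
        m.Row3 = new Vec4(x, y, z, 1);
        return m;
    }

    // result = v * m. Only addresses are passed, so neither the matrix nor
    // the vector is copied onto the stack for the call.
    // Note: 'result' must not be the same variable as 'v', because the
    // components of 'v' are still being read while 'result' is written.
    public static void Transform(ref Vec4 v, ref Mat4 m, out Vec4 result)
    {
        result.X = v.X * m.Row0.X + v.Y * m.Row1.X + v.Z * m.Row2.X + v.W * m.Row3.X;
        result.Y = v.X * m.Row0.Y + v.Y * m.Row1.Y + v.Z * m.Row2.Y + v.W * m.Row3.Y;
        result.Z = v.X * m.Row0.Z + v.Y * m.Row1.Z + v.Z * m.Row2.Z + v.W * m.Row3.Z;
        result.W = v.X * m.Row0.W + v.Y * m.Row1.W + v.Z * m.Row2.W + v.W * m.Row3.W;
    }
}

public static class TransformDemo
{
    public static void Main()
    {
        Vec4 point = new Vec4(1, 2, 3, 1);
        Mat4 move = Mat4.Translation(10, 20, 30);

        Vec4 moved;
        Mat4.Transform(ref point, ref move, out moved);

        // Prints: 11 22 33 1
        Console.WriteLine(moved.X + " " + moved.Y + " " + moved.Z + " " + moved.W);
    }
}
```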

[Java vs .Net]
You said on the Tao / Gamedev forums that you are using a Java math library which (I suspect) is using native code for better performance. If that's the case, you are comparing oranges to apples (native code with SIMD optimizations vs. managed code). Were you to write a math library in pure Java, you'd likely observe a similar performance hit (possibly even worse, since you cannot allocate stack-based structs).

Inertia's picture

The Stopwatch class - from the first link - can be used for profiling.
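
A minimal sketch of timing a section of code with System.Diagnostics.Stopwatch follows. Work is a hypothetical placeholder for the math routines being measured, and the numbers are only meaningful in an optimized release build run outside the debugger.

```csharp
using System;
using System.Diagnostics;

public static class TimingDemo
{
    // Hypothetical workload standing in for the math section being measured.
    public static float Work(int iterations)
    {
        float acc = 0f;
        for (int i = 0; i < iterations; i++)
        {
            acc += (float)Math.Sqrt(i);
        }
        return acc;
    }

    public static void Main()
    {
        // Warm-up call so that JIT compilation is not included in the timing.
        Work(1000);

        Stopwatch sw = Stopwatch.StartNew();
        float result = Work(1000000);
        sw.Stop();

        // Print the result too, so the measured work cannot be optimized away.
        Console.WriteLine("result = " + result);
        Console.WriteLine("elapsed: " + sw.Elapsed.TotalMilliseconds + " ms");
    }
}
```

Timing many iterations and dividing by the count gives steadier numbers than timing a single call.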

What Fiddler said. You should try comparing pure C# vs. pure Java, or Mono.SIMD vs. an optimized Java mathlib.

sgsrules's picture

I downloaded the source code for the vecmath library in Java and it seems to be written in pure Java. Comparisons aside, are there any vector libraries for C# that are going to give me good performance? At this point I don't care if they're not platform independent or use native code. I looked into Mono.Simd, but it's still early in development and doesn't have any methods to handle matrices or quaternions.

the Fiddler's picture

Mono.Simd simply exposes CPU simd instructions to managed apps. It's not supposed to be a full math library, just provide the building blocks for one.

There are several other .Net math libraries available. If you don't care about platform independence, SlimDX uses C++/CLI to hook into the native DirectX math libraries and should be plenty fast (Windows only, however). Alternatively, there are .Net BLAS interfaces that can be used to call into highly optimized BLAS implementations (typically provided by the CPU manufacturers). Last, you may be able to use the math functions bundled in the Media namespace of .Net 3.0.

I've never used any of those, so I don't know how suitable or fast they are.