the Fiddler's picture

OpenTK.Math.Half

Project:The Open Toolkit library
Component:Code
Category:feature request
Priority:normal
Assigned:Inertia
Status:closed
Description

This type should provide an interface similar to IntPtr, with methods, conversion operators and constructors that can pack/unpack floats and doubles.

I'm opening this task so we can keep track of progress.


Comments

Inertia's picture

#11

1. True, but it can avoid an unnecessary copy.

struct Vertex
{
   public Vector3 Position;
   public Half2 TexCoord;
}
 
Vertex[] VA = new Vertex[10000];
 
// Later you could simply fill the TexCoord with:
 
VA[i].TexCoord.X.FromFloat( f );
 
// rather than
 
VA[i].TexCoord.X = new Half( f ); // create a new half just to copy it and trash it afterwards

2. Let's say we have this struct:

struct Half2
{
   public Half X;
   public Half Y;
}

How do we feed glTexCoord2hv with the pointer to X._internalbits if it's not visible? If the pointer is available to the user for pinning, they may modify _internalbits as well. Do you want to offer no void* overload where the user may pin manually?

3. Ok, you changed it to that already ;)

4. Interesting, you prefer a "bool throwonerror" over a public static parameter to control this.

5. Half <-> integer makes as much sense as float <-> int. Not very useful in the float -> int direction, but in the int -> float direction it can be. Maybe integer constructors would make sense, but I cannot see a good usage scenario for this. Try entering MaxValue+1, MaxValue, MaxValue-1 and MaxValue-5 into the program, and it will become obvious that the Half's accuracy is simply too poor to work with integers other than byte, sbyte and short.
Half <-> double would be np, but it would actually be implemented as Half <-> Float <-> Double. In practice it would not be very useful: you usually choose double because single precision is not accurate enough. Half is worse.
It's easy to add constructors though; np, support them all and let the user worry about accuracy problems.
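
The accuracy point in 5. can be checked with a little arithmetic. This is a minimal sketch, assuming the IEEE 754 binary16 layout (10 stored fraction bits, largest finite value 65504); the class and method names are hypothetical and not part of OpenTK. It computes the gap between neighbouring half values around a given magnitude and rounds to that grid.

```csharp
using System;

// Hypothetical helper for illustration only; not part of OpenTK.
static class HalfSpacing
{
    // Gap between adjacent binary16 values around v, for 1 <= v <= 65504.
    // Values in [2^e, 2^(e+1)) are 2^(e-10) apart because 10 fraction bits are stored.
    public static double Spacing(double v)
    {
        double p = 1.0; // largest power of two that is <= v
        while (p * 2.0 <= v) p *= 2.0;
        return p / 1024.0;
    }

    // Round v to the nearest value on that grid (ties to even, no overflow handling).
    public static double RoundToHalfGrid(double v)
    {
        double s = Spacing(v);
        return Math.Round(v / s) * s; // Math.Round defaults to round-half-to-even
    }
}
```

Under these assumptions neighbouring values near MaxValue are 32 apart, so MaxValue-1 and MaxValue-5 both round back to MaxValue, and whole numbers are only exact up to 2048.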

P.S. did you manually change the issue properties back to what they were, or is this a bug?

Edit: Regarding 2. we actually need the bitfield public, else you cannot directly save the vertex array to disk and reconstruct it. Serializable would probably not care about this, but BinaryReader/BinaryWriter would. It would be rather bad if you had to convert your Half to a float in order to save it.

Attachment: OpenTK Half v3.rar (5.17 KB)
the Fiddler's picture

#12

1. Moving 2 bytes is nothing compared to the actual float->half conversion - I doubt this would even show up in benchmarks. Sounds like premature optimization, but I wouldn't object if it actually proves significant.

2. I'd prefer to override glTexCoord2hv directly, instead of making the representation public. I'm also pretty sure we can provide for Serialization/Deserialization without actually making the field public (I know how to do that with WCF, but haven't tried with the .Net 2.0 APIs).

4. I guess a static property didn't cross my mind at the time. :) I'm pretty sure I've seen "throwOnError" parameters in the BCL, too - can't remember where though (something with file IO?)

5. These probably won't be used in practice and the user can always cast to float first. I guess we'll see in practice whether the extra constructors make sense.

[Issue properties]
Race condition: I started my post, you updated the status, I posted and reverted it. Please re-update!

Inertia's picture

#13

1. You said in your very first post that the FromSingle() function is not "necessary". I take that as "not a problem". Can we just make it public again and people use whatever they prefer? I don't really like that the type becomes partially immutable (you can only change the Half by copying from another Half) and making the function public again won't hurt anybody.
In the example there were 10000x Half2 created. That's 20000 unnecessary instances which are just used to convert, copy from and then collected. Without doubt, avoiding these 20000 times is faster than doing it. I'm more concerned about the unnecessary strain put on the GC than the 40000 copied bytes.

2. Not sure if forcing the user to use serialization is desirable. This will certainly be a problem when looking into file formats that support Half. I haven't bothered with serialization much, but it is my understanding that you cannot deserialize a file from disk that hasn't been created through serialization. If that is correct, trying to make the serialization match a file format is probably futile too.

5. I've added 9 overloads for the constructor, but of course they all cause overflow exceptions when you pass values greater than 70.000 ^^ (this is perfectly normal, it's >MaxValue)

6. I take it you had 0 problems running the test app, the binary patterns matched and entered numbers would return roughly the same?

7. Do you know any other C# libraries that support Half? Should I search the basement for the pioneer flag?

P.S. I'm working on Half234 atm, will post v4 later, maybe tomorrow.

the Fiddler's picture

#14

Version:0.9.1» 0.9.x-dev
Priority:minor» normal
Status:postponed» in progress

1. Half is a struct, so no heap allocation and no GC pressure. Your example is equivalent to executing (the moral equivalent of) 20000 mov instructions, which is nothing compared to the actual conversion code.

2. I was using "serialization" in the broad meaning of the word, as in read/write to a file. The fact that we need to be able to serialize halfs does not necessarily mean exposing the backing field to the user. The same can be done by a suitable interface (e.g. byte[] GetBytes()), which has the added bonus of being CLS-compliant.
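
A minimal sketch of the GetBytes() idea, assuming a 16-bit backing field and a low-byte-first layout. The type name HalfBits and the FromBytes method are hypothetical stand-ins, not the actual OpenTK Half.

```csharp
using System;

// Hypothetical stand-in for the Half type discussed here; not the OpenTK implementation.
public struct HalfBits
{
    private readonly ushort bits; // backing field stays private

    private HalfBits(ushort bits) { this.bits = bits; }

    // Read-only copy of the 16-bit pattern, low byte first.
    public byte[] GetBytes()
    {
        return new byte[] { (byte)(bits & 0xFF), (byte)(bits >> 8) };
    }

    // Rebuild a value from two bytes previously produced by GetBytes.
    public static HalfBits FromBytes(byte[] value, int startIndex)
    {
        return new HalfBits((ushort)(value[startIndex] | (value[startIndex + 1] << 8)));
    }
}
```

With something like this, a BinaryWriter could store the two bytes directly without going through float, while the field itself stays unmodifiable from outside.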

I guess I cannot see what a public field would get us, when we don't even provide any arithmetic operators.

6. No problems under Mono, but didn't test thoroughly.

7. None that I know of. First time I've heard of this flag. Bring it on, sounds like fun! :D

PS: Seems you can only assign issues to yourself, strange design.

Inertia's picture

#15

Assigned to:Anonymous» Inertia
Status:in progress» open

1. I fully agree that it's not expensive, but my urge for efficiency wants me not to execute any unnecessary operations. Can we agree that there is no harm making FromSingle() public?

2. This seems the best solution so far, I don't really want the _internalbits modifiable by anything besides the constructor and the FromSingle() method. But it has to be readable. Oh yes ... CLS Compliance ... I had forgotten about this ... thanks for the reminder. -.-

6. If the bit patterns match, it should work fine.

objarni's picture

#16

"1. I fully agree that it's not expensive, but my urge for efficiency wants me not to execute any unnecessary operations. Can we agree that there is no harm making FromSingle() public?"
Inertia - so why don't you use C/assembler instead of C#? I thought one of the goals of OpenTK was being more productive (more elegant API) at the expense of not being 100% optimized?

the Fiddler's picture

#17

Objarni, that argument was not really constructive. OpenTK may not be assembly, but that doesn't mean it shouldn't be as optimized as possible. :)

I've taken the time to cook up a small test based on v3 and here are the results for converting 135f 100M times (Mono 1.9.1 amd64, 2.6GHz Core 2):
Constructor: 1.32 - 1.37 seconds
FromSingle: 1.32 - 1.37 seconds
Conversion: 1.32 - 1.37 seconds

Absolutely the same! (The second decimal digit changes between consecutive runs, but it's too variable to be meaningful.)

That's about 13ns per conversion, which is awesome (tested with a few different numbers, it's about the same). The FromSingle method probably *is* slightly faster, but if you can't detect the difference after 100M I don't think it matters.
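
The per-conversion figure is just total time divided by iteration count. A minimal sketch of that arithmetic plus a simple timing loop, assuming System.Diagnostics.Stopwatch; the class and method names are hypothetical and this is not the attached test program.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper for illustration; not the attached test.
static class ConversionTiming
{
    // Average cost of one call in nanoseconds.
    public static double NanosecondsPerCall(double totalSeconds, long iterations)
    {
        return totalSeconds * 1e9 / iterations;
    }

    // Run body the given number of times and return the elapsed seconds.
    // The delegate call adds its own overhead, so only compare runs measured the same way.
    public static double Measure(Action body, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) body();
        sw.Stop();
        return sw.Elapsed.TotalSeconds;
    }
}
```

For example, NanosecondsPerCall(1.32, 100000000) works out to roughly 13.2, matching the figure quoted above.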

I've made a couple of changes for speed:

  • Using *(int*)&f to get the bit pattern proved 5ns faster (1.3sec vs 1.8sec).
  • The checks for NaN, Infinity etc add 20ns to each conversion (3.7sec). It seems a separate throwOnError constructor makes sense after all.
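
For readers following along, here is a sketch of what such a conversion can look like. It assumes the IEEE 754 binary16 layout and round-to-nearest-even, and it returns infinity on overflow rather than throwing, so it is not the code from the attachment. The names are hypothetical, and BitConverter is used as a safe (slower) stand-in for the *(int*)&f cast.

```csharp
using System;

// Hypothetical sketch; not the conversion code from the attached archive.
static class HalfConvert
{
    // Safe equivalent of the unsafe cast *(int*)&f: reinterpret the 32 bits of a float.
    public static int SingleToBits(float f)
    {
        return BitConverter.ToInt32(BitConverter.GetBytes(f), 0);
    }

    // Pack a float into a 16-bit half pattern (round to nearest, ties to even).
    public static ushort SingleToHalfBits(float f)
    {
        int x = SingleToBits(f);
        int sign = (x >> 16) & 0x8000;   // sign bit moved to bit 15
        int exp = (x >> 23) & 0xFF;      // biased float exponent
        int man = x & 0x7FFFFF;          // 23 fraction bits

        if (exp == 0xFF)                 // Infinity or NaN
            return (ushort)(sign | 0x7C00 | (man != 0 ? 0x200 : 0));

        int e = exp - 127 + 15;          // rebias for half
        if (e >= 31) return (ushort)(sign | 0x7C00); // too large: infinity

        if (e <= 0)                      // result is subnormal or zero
        {
            if (e < -10) return (ushort)sign;        // too small: signed zero
            man |= 0x800000;             // restore the implicit leading 1
            int shift = 14 - e;          // between 14 and 24
            int sub = man >> shift;
            int rem = man & ((1 << shift) - 1);
            int halfway = 1 << (shift - 1);
            if (rem > halfway || (rem == halfway && (sub & 1) != 0)) sub++;
            return (ushort)(sign | sub);
        }

        int h = (e << 10) | (man >> 13); // keep the top 10 fraction bits
        int r = man & 0x1FFF;            // the 13 bits being dropped
        // A carry out of the fraction bumps the exponent, which is the correct result.
        if (r > 0x1000 || (r == 0x1000 && (h & 1) != 0)) h++;
        return (ushort)(sign | h);
    }
}
```

Under these assumptions 135f packs to 0x5838 and 1f to 0x3C00. A throwOnError variant would replace the infinity and NaN returns with exceptions.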

I've attached the test - can someone please run it a few times and compare the results?

Attachment: OpenTK Half v3.5.7z (5.93 KB)
objarni's picture

#18

Environment: Vista 32-bit, .NET3.5, compiled project using VS Express with optimization on in Release mode (had to build new project because of some strange "icon resource not found")

Results:

Timing tests
Constructor: 3,6385536 seconds
FromSingle: 2,9378697 seconds
Conversion: 2,93964 seconds

[ I'm sorry for ranting above.. Let me try to explain:

I feel there should be a balance between speed and elegance in OpenTK, a balance that favors elegance when the loss of performance is so insignificant you need a microscope to see it. If you know what I mean.

Optimization is dear to me - and micro-optimization is good in inner loops (as id Software's optimization guru Michael Abrash says) - but I'd choose algorithmic (big-O) optimization over micro-optimization any day. Things change (compilers, both static and runtime), so micro-optimization tends to be very "today this is the best, in a year something else is better".

I think that a good measure is noticeability. If the optimization is not even noticeable it should not be followed through. And certainly not when it drops readability / elegance.]

the Fiddler's picture

#19

Yeah, we agree on elegance vs speed - but there are a few places (math) where it makes sense to trade elegance for speed.

Your results are interesting. There's no reason for such a discrepancy, especially since the conversion operator simply uses the constructor internally. Was this result consistent over several runs? Timing is quite sensitive to system load.

I just tested on Vista 32-bit (1.8GHz Core 2, .Net 2.0) and the results were much closer, 6.6 vs 6.5 seconds (or 1ns per iteration, fairly consistent with the "1 mov" idea).

objarni's picture

#20

Timing tests
Constructor: 3,8519614 seconds
FromSingle: 3,1216907 seconds
Conversion: 3,1050145 seconds

Timing tests
Constructor: 3,7527534 seconds
FromSingle: 3,0849869 seconds
Conversion: 3,0321445 seconds

Timing tests
Constructor: 3,7475937 seconds
FromSingle: 3,0872202 seconds
Conversion: 3,0812113 seconds

"Yeah, we agree on elegance vs speed - but there are a few places (math) where it makes sense to trade elegance for speed."
Well, maths - for example linear algebra, which is used a lot in computer graphics - is one of the places where I strive for elegance the most, if for no other reason than keeping the code closer to what we see in maths books/papers. Readability. So I don't quite agree with that statement. Maybe something like "in the most frequently used types (Vector4/Matrix4), we should trade elegance for speed" is more like it?