
OpenTK.Math.Half
Posted Sunday, 9 November, 2008 - 22:30 by the Fiddler| Project: | The Open Toolkit library |
| Component: | Code |
| Category: | feature request |
| Priority: | normal |
| Assigned: | Inertia |
| Status: | closed |
Description
This type should provide an interface similar to IntPtr, with a methods, conversion operators and constructors that can pack/unpack floats and doubles.
I'm opening this task so we can keep track of progress.


Comments
#1
#2
I don't think adding this type is high priority (hardware just starts to support it), but for future reference:
http://www.openexr.com/downloads.html
IlmBase 1.0.1 source code release
Extremely well documented float->half conversion, with conditionals and without lookup tables. half->float uses tables. License appears to be no problem.
#3
Excellent find, this makes our lives significantly easier.
#4
mhmm how do I change the settings in a way that expresses this could be a feature for 0.9.3, but definitley won't make it into 0.9.2?
#5
With the current setup, this is a request for a feature missing from 0.9.1. The postponed status means that it won't make it to the next version. When
it's time to implement the feature, we can just mark it as active, bump the version and start working on it.
Which means it's fine the way it is now.
(Let's move this discussion to the forums, though)
#6
I've been working on this, kinda need some input to know where this should be heading.
1. Does anyone have a better idea to obtain the bits of a float?
I've tried it with fixed-statement, but type safety kicked my a*s.
2. After some tests it proves correct that doing any arithmetic operations with half type is futile. The type can only represent 65536 unique numbers (minus special cases) and you run into precision problems very soon. Should a HalfToFloat method be provided at all? Using it for debugging now, but I think anyone should be advised to store 32-bit float copies for arithmetics and use 16-bit only for certain attributes.
3. Currently the internal representation of the half (Uint16) is public. This is very dangerous, because this field is not intuitive to edit, but I see no alternative since it must be addressable for upload to GL. Any suggestions how to deal with this? (I've added a summary to the field to point that problem out)
4. I'm manually throwing ArithmeticExceptions if weird numbers like NaN or +/- infinity are passed. Half can represent those numbers, but I find it better to throw when encountering them, because these numbers are no good and probably some error in other calculations. Should it stay like that, or not throw and let the user deal with the problem?
PS: I'll attach an example app later on, kinda cleaning my debug mess up and make it somewhat usable without reading full source code 1st.
#7
1. Either this, or pointer cast in an unsafe context:
Edit: No idea if this is faster, though.
2. I think yes, as an explicit conversion operator and a ToSingle() method. We should probably follow the API of IntPtr (which is used in a somewhat similar way).
3. Make it
internal. This way, we can use it (e.g. to add overloads to the GL class), while keeping it safe from the user.Edit:
4. I'd say let the user deal with it. There should be an API to query for these conditions (e.g. IsNan, IsInfinity) - just like the Single class.
#8
1. I'm not certain either which is faster, but my guess is the compiler has better chances with the struct.
2. Currently there are only 2 public methods:
There are some MinValue constants, and I'll take your suggestion to add IsNaN, IsPositiveInfinity properties.
3. I failed to explain it properly, you will understand when you see code.
4. Well the C API itself had a function in it that would create an arithmetic overflow (which I ripped out and throw an exception instead). I don't think this should be removed, but I may ofcourse stop intercepting NaN/Infinity.
The idea was: you have a 3D model built out of floats and you want to compact normals and texcoords into half type, you could simply put the whole conversion into a try block, rather than converting and afterwards looping over the result and test individually for errors. If such an error occurs you might want to try a different vertex format (without Half), that's why the try block would make this easier.
Adding IsNaN property etc. should be easy.
------------------
Here's a x86 console app that let's you enter floating point numbers, converts them to Half, prints binary representation and then converts back to float.
I've removed all need for any lookup tables, because I don't see any reason why OpenTK should bother with a 256kb table just for Half type, which probably 10% of the users will use at all.
It works great for me, but I'd like to hear whether this causes any troubles with different endianess (which it shouldn't) or x64 OS.
#9
Added IsZero, IsNaN, IsPositiveInfinity and IsNegativeInfinity Properties. Removed all exceptions besides the FPU overflow.
#10
Comments on the code:
The following attachment implements these changes and adds a few simple conversion operators. Does it make sense to have fast Half<->Int or any Half<->Double conversions?
Edit: It seems you forgot to attach your updates..?
#11
1. True, but it can avoid an unnecessary copy.
2. Let's say we have this struct:
How to feed glTexCoord2hv with the pointer to X._internalbits if it's not visible? If the pointer is available to the user to use for pinning, they may modify _internalbits aswell. You want to present no void* overload where the user may pin manually?
3. Ok, you changed it to that already ;)
4. Interesting, you prefer a "bool throwonerror" over a public static parameter to control this.
5. Half <-> integer makes as much sense as float <-> int. Not very useful in the float -> int direction, but in the int -> float direction it can be. Maybe integer constructors would make sense, but I cannot see a good usage scenario for this. Try entering MaxValue+1, MaxValue, MaxValue-1 and MaxValue-5 into the program, and it will become obvious that the Half's accuracy is simply too poor to work with integers other than byte, sbyte and short.
Half <-> double would be np, but actually it would implemented look like this Half <-> Float <-> Double. In practise it would not be very useful, you usually chose double because single precision is not accurate enough. Half is worse.
It's easy adding constructors tho, np support them all and let the user worry about accuracy problems.
P.S. did you manually change the issue properties back to what they were, or is this a bug?
Edit: Regarding 2. we actually need the bitfield public, else you cannot directly save the vertex array to disk and reconstruct it. Serializable would probably not care about this, but binary reader/writer. It would be rather bad if you must convert your Half to a float in order to save it.
#12
1. Moving 2 bytes is nothing compared to the actual float->half conversion - I doubt this would even show up in benchmarks. Sounds like premature optimization, but I wouldn't object if it actually proves significant.
2. I'd prefer to override glTexCoord2hv directly, instead of making the representation public. I'm also pretty sure we can provide for Serialization/Deserialization without actually making the field public (I know how to do that with WCF, but haven't tried with the .Net 2.0 APIs).
4. I guess a static property didn't cross my mind at the time. :) I'm pretty sure I've seen "throwOnError" parameters in the BCL, too - can't remember where though (something with file IO?)
5. These probably won't be used in practice and the user can always cast to float first. I guess we'll see in practice whether the extra constructors make sense.
[Issue properties]
Race condition: I started my post, you updated the status, I posted and reverted it. Please re-update!
#13
1. You said in your very first post that the FromSingle() function is not "necessary". I take that as "not a problem". Can we just make it public again and people use whatever they prefer? I don't really like that the type becomes partially immutable (you can only change the Half by copying from another Half) and making the function public again won't hurt anybody.
In the example there were 10000x Half2 created. That's 20000 unnecessary instances which are just used to convert, copy from and then collected. Without doubt, avoiding these 20000 times is faster than doing it. I'm more concerned about the unnecessary strain put on the GC than the 40000 copied bytes.
2. Not sure if enforcing the user to use serialization is desireable. This will certainly be a problem when looking into file formats that support Half. I haven't bothered with serialization much, but it is my understanding that you cannot deserialize a file from disk that hasn't been created through serialization. If that is correct, trying that the serialization matches a file format is probably futile too.
5. I've added 9 overloads for the constructor, but ofcourse they all cause overflow exceptions when you pass values greater than 70.000 ^^ (this is perfectly normal, it's >MaxValue)
6. I take it you had 0 problems running the test app, the binary patterns matched and entered numbers would return roughly the same?
7. Do you know any other C# libraries that support Half? Should I search the basement for the pioneer flag?
P.S. I'm working on Half234 atm, will post v4 later, maybe tomorrow.
#14
1. Half is a struct, so no heap allocation and no GC pressure. Your example is equivalent to executing (the moral equivalent of) 20000 mov instructions, which is nothing compared to the actual conversion code.
2. I was using "serialization" in the broad meaning of the word, as in read /write to a file. The fact that we need to be able to serialize halfs does not necessarily mean exposing the backing field to the user. The same can be done by a suitable interface (e.g. byte[] GetBytes()), which have the added bonus of being CLS-compliant.
I guess I cannot see what a public field would get as, when we don't even provide any arithmetic operators.
6. No problems under Mono, but didn't test thoroughly.
7. None that I know of. First time I've heard of this flag, bring it on sounds like fun! :D
PS: Seems you can only assign issues to yourself, strange design.
#15
1. I fully agree that it's not expensive, but my urge for efficiency wants me not to execute any unnecessary operations. Can we agree that there is no harm making FromSingle() public?
2. This seems the best solution so far, I don't really want the _internalbits modifyable by anything besides constructor and FromSingle() Method. But it has to be readable. Oh yes ... CLS Compliance ... I had forgotten about this ... thanks for the reminder. -.-
6. If the bit patterns match, it should work fine.
#16
1. I fully agree that it's not expensive, but my urge for efficiency wants me not to execute any unnecessary operations. Can we agree that there is no harm making FromSingle() public?
Inertia - so why don't you use C/assembler instead of C#? I thought one of the goals of OpenTK was being more productive (more elegant API) at the expense of not being 100% optimized?
#17
Objarni, that argument was not really constructive. OpenTK may not be assembly, but it doesn't mean it should be as optimized as possible. :)
I've taken the time to cook up a small test based on v3 and here are the results for converting 135f 100M times (Mono 1.9.1 amd64, 2.6GHz Core 2):
Constructor: 1.32 - 1.37 seconds
FromSingle: 1.32 - 1.37 seconds
Conversion: 1.32 - 1.37 seconds
Absolutely the same! (The second decimal digit changes between consecutive runs, but it's too variable).
That's about 13ns per conversion, which is awesome (tested with a few different numbers, it's about the same). The FromSingle method probably *is* slightly faster, but if you can't detect the difference after 100M I don't think it matters.
I've made a couple of changes for speed:
I've attached the test - can someone please run it a few times and compare the results?
#18
Environment: Vista 32-bit, .NET3.5, compiled project using VS Express with optimization on in Release mode (had to build new project because of some strange "icon resource not found")
Results:
Timing tests
Constructor: 3,6385536 seconds
FromSingle: 2,9378697 seconds
Conversion: 2,93964 seconds
[ I'm sorry for ranting above.. Let me try to explain:
I feel there should be a balance between speed and elegance in OpenTK, a balance that favors elegance when the loss of performance is so insignificant you need a microscope to see it. If you know what I mean.
Optimization is dear to me - and microoptimization is good in inner loops (id' softwares guru optimizer Michael Abrash statement) - but I'd choose ordo-optimization any day before microoptimization. Things change (compilers, both static and runtime) so microoptimization seem to be very "today this is the best, in a year something else is better".
I think that a good measure is noticeability. If the optimization is not even noticeable it should not be followed through. And certainly not when it drops readability / elegance.]
#19
Yeah, we agree on elegance vs speed - but there are a few places (math) that makes sense to trade elegance for speed.
Your results are interesting. There's no reason for such a discrepancy, especially since the conversion operator simply uses the constructor internally. Was this result consistent over several runs? Timing is quite sensitive to system load.
I just tested on Vista 32-bit (1.8GHz Core 2, .Net 2.0) and the results were much closer, 6.6 vs 6.5 seconds (or 1ns per iteration, fairly consisten with the "1 mov" idea).
#20
Timing tests
Constructor: 3,8519614 seconds
FromSingle: 3,1216907 seconds
Conversion: 3,1050145 seconds
Timing tests
Constructor: 3,7527534 seconds
FromSingle: 3,0849869 seconds
Conversion: 3,0321445 seconds
Timing tests
Constructor: 3,7475937 seconds
FromSingle: 3,0872202 seconds
Conversion: 3,0812113 seconds
Yeah, we agree on elegance vs speed - but there are a few places (math) that makes sense to trade elegance for speed.
Well, maths - for example linear algebra which is used alot in computer graphics - is one of the places where I strive for elegance the most, if not for any other reason than code being more close to what we see in maths books/papers. Readability. So I don't quite agree with that statement. Maybe something like "in the most frequently used types (Vector4/Matrix4), we should trade elegance for speed" is more like it?
#21
I was thinking more about the internal implementation, for example stuff like pointer casts and unsafe code (can be used to interop with Mono.SIMD).
Do these results still hold if you move the tests around (e.g. put the constructor test second). This is abnormal, the first and last tests are practically the same code:
#22
Release - no optimization
Timing tests
Constructor: 1,3047712 seconds
FromSingle: 1,3008029 seconds
Conversion: 1,3021977 seconds
Release - optimized
Timing tests
Constructor: 1,1260197 seconds
FromSingle: 0,9906694 seconds
Conversion: 0,9889157 seconds
both executed a couple of times outside the VS IDE, results don't vary much. The FromSingle() is in no case slower than the constructor, I'd really appreciate if we just have both and let the user make his pick, instead of enforcing the use of the constructor by leaving no alternative.
Currently merging the additions you made, will post it here asap.
@objarni: I will start censoring your posts if you keep forcing off-topic remarks into discussions. Besides the timing results you posted you didn't talk about anything related to Half, which is the topic of this discussion. Create a new topic that matches the problem.
#23
Changelog:
unsafe. Works fine.public Half(Half h). No point in that, it's a struct.Imho we should abandon all notion of using any integers with Half and just work with Half <-> Float <-> Double. This limits the options significantly and does not hide 'the' problem. People can still cast to float/double and convert that to half, if they must. The only reason I think of atm what sense integers would make here is something like this:
#24
Thanks! I just checked and updating GL to use the new type is as single as changing the typemap in the generator (patch attached). Our work here is almost done!
[FromSingle]
Inertia, can you think of any scenario where this method could conceivably make even a remote difference? (I honestly cannot think of an.) If this part is actually important, we should simply bring back the LUT.
Or, we could spend our energy integrating Mono.SIMD, which would actually make a (world of) difference.
[Cast to/from double instead of int]
Makes sense.
#25
"Our work here is almost done!"
Yours maybe, I still have Half2,3,4 to do. :P
[save/load]
Is there any interface you want me to derive from? I couldn't find any that implements byte[] GetBytes.
[FromSingle]
None really. I see this more like an overload to get the same job done. I don't really care which is faster (the conversion process will most likely only be used at initialization time in programs when you build VBOs or Textures), it's simply that you have the option to avoid the 'new'. Yes, I'm aware 'new' is extremely cheap in .Net, but when you try hard enough you will probably find scenarios in which one case works better than the other, so it's good to have 2 options rather than 1. It's like 10 lines extra code for the method, and there's no difference in the result.
[Cast to/from double instead of int]
I wasn't sure of it either yesterday, today it made sense ;) So it's ok when I remove all integer constructors? (they're commented out atm)
"Or, we could spend our energy integrating Mono.SIMD, which would actually make a (world of) difference."
Not sure if this works, but can't we simply cast the Simd.Vector4 to our Math.Vector4? The fields align, but it would be fugly.
#26
@objarni: I will start censoring your posts if you keep forcing off-topic remarks into discussions. Besides the timing results you posted you didn't talk about anything related to Half, which is the topic of this discussion. Create a new topic that matches the problem.
I'm sorry if I offended you - but I don't think my remarks are off-topic: they have to do with a specific design decision regarding the Half type, a decision that has to do with elegance vs performance.
So please don't censor me.
#27
No offense taken, but this discussion is again leading off-topic :P Please no more posts that do not include at least 1x "Half", 1x "cast", 1x "math" and 1x "overload". Thanks!
#28
You got it - and I'm taking your idea and making the "elegance vs performance" a new topic - to try and avoid hassles between the "elegance camp" and the "performance camp". Cya there! :)
#29
Yours maybe, I still have Half2,3,4 to do. :PLook at it from the bright side, we won't have to implement overrides for every single function with 'h' in its name!
[Serialization]
The usual implementation is add the [Serializable] attribute and (if necessary) implement the ISerializable interface. This is how Vector234, Matrix4 etc work.
I think the ISerializable interface is necessary in this case, as the internalBits field is not public.
[FromSingle]
Yeah, but to use FromSingle you have to call new first anyway.
When working with structs, "new" simply calls your constructor (or initializes all fields to default(T) if you don't have one). Assuming correct inlining, the above are exactly the same speed-wise.
[Cast to/from double instead of int]
Feel free to remove these.
[Mono.SIMD]
This should do the trick. According to the devs the cast should be plenty fast. Problem is, we *will* see a slight speed hit on .Net and Mono < 2.2, but I think it is worth it. We'll have to benchmark and see.
#30
[Serialization]
ISerializable was the only thing that made sense to me too. Will take some time though as I have to read about the whole concept to write tests.
[FromSingle]
Agreed. My point is: people like choices. The 'new' is not necessary. Let's simply keep both.
[Cast to/from double instead of int]
Done.
[Mono.SIMD]
Let's discuss this in another topic, it's too much for this one.
[Texture]
We haven't touched any scenario yet where the user wants to use Half Textures. RGBA16f has pretty wide hardware support, so this should certainly be considered.
Making the field internal limits this to adding an overload to all functions of this style. I'm really starting to see benefits changing it public again. Most of all, we will have a good laugh when someone uses Half and modifies the field and ponders why his code doesn't work. :D
I believe Half is for advanced users anyways, they should be wise enough to figure out that the warning that the field is not intuitive to edit is there for a reason.