My company develops enterprise software that heavily uses advanced statistics, stochastic algorithms, and the like. For the next major release (fall 2010) we'll focus on optimization. I came across OpenCL by chance while reading some Snow Leopard feature lists, and I think this technology could serve us well.
My idea is to build a small proof of concept: pick one of the algorithms we use (matrix multiplication, FFT, etc.) and implement it as a) a non-parallel version, b) a .NET 4.0 Parallel.For version (using the Parallel Extensions CTP for this POC), and c) an OpenCL version via OpenTK.
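To make the comparison concrete, here is a minimal sketch of what I have in mind for variants a) and b), using matrix multiplication (all names and the test sizes are mine, and the OpenCL/OpenTK variant is omitted for brevity since that's the part I'm asking about):

```csharp
using System;
using System.Threading.Tasks;

public static class MatMulBench
{
    // a) straightforward sequential triple loop
    public static double[,] MultiplySequential(double[,] a, double[,] b)
    {
        int n = a.GetLength(0), m = b.GetLength(1), k = a.GetLength(1);
        var c = new double[n, m];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
            {
                double sum = 0;
                for (int x = 0; x < k; x++) sum += a[i, x] * b[x, j];
                c[i, j] = sum;
            }
        return c;
    }

    // b) same loop, outer dimension parallelized with Parallel.For;
    //    each task writes a disjoint set of rows, so no locking is needed
    public static double[,] MultiplyParallel(double[,] a, double[,] b)
    {
        int n = a.GetLength(0), m = b.GetLength(1), k = a.GetLength(1);
        var c = new double[n, m];
        Parallel.For(0, n, i =>
        {
            for (int j = 0; j < m; j++)
            {
                double sum = 0;
                for (int x = 0; x < k; x++) sum += a[i, x] * b[x, j];
                c[i, j] = sum;
            }
        });
        return c;
    }
}
```

The idea would then be to time all three variants on the same inputs and compare wall-clock times.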
While I can see that the programming cost of recoding our libraries for OpenCL is much higher than for .NET 4.0's Parallel.For and friends, I still believe GPGPU has far more potential and scales much better. Besides, they are non-competing technologies, aren't they?
From the benchmark output I would also like to estimate how much gain we could get by going from, say, a Radeon 4830 to a 4870.
Do you think this is a reasonable approach? Is it even possible? If so, has anyone done this before?
What kind of graphics card is sufficient for a test like this? Right now I can't afford to spend time debugging drivers, so for this test I would like a cheap, very well-supported card. Has anyone compiled a list of suitable cards?
Our code is C# on Windows, .NET 3.5 (VS2008). We do not plan to move to VS2010 until a few months after its release.
Thanks in advance,
PS: Apologies if this is not the right place to ask. My experience with GPGPU amounts to reading a few magazine and blog articles, which is probably evident from my questions.