the Fiddler's picture

Nvidia OpenCL drivers released for Linux, Windows and MacOS!

The driver version is 190.29 and can be downloaded from

In addition to the core 1.0 API, this driver supports a number of extensions, "which enable significant acceleration across many image processing disciplines". These extensions will be added to OpenTK in the near future.

Personally, I hope to start using OpenCL for signal processing and pattern matching as soon as ATI releases GPU-accelerated drivers.

Are you planning to use OpenCL in the near future? If yes, for what kinds of projects?


Inertia's picture

Deferred Shading with OpenGL&CL is certainly worth some thought. (fill geometry buffer with GL, do lighting in CL)

I'm kinda waiting for some articles comparing CPU-restricted parallelism with OpenCL and discussing using both together. OpenCL scales better, but the parallel expressions in .Net 4.0 are certainly more convenient to debug and maintain.

kvark's picture

For rendering, I can see everything being computed in OpenCL, which then provides a complete framebuffer for output.

This multi-processor environment (separate CPU + SSE, GPU, SPU, whatever) disappoints me. As a programmer, I have to learn different languages, runtime environments and computation domains for each processor...

I'd like to have a single language for data processing (OpenCL) and choose among the available processors at run time.

nythrix's picture

Hope this works better than ATI Stream on WinXP. Been stuck for a month.

Irritating yet funny. I can see where programmers grow that weird humor of theirs:
Launching sample: "Vector Addition"

System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
at OpenTK.Compute.CL10.CL.Core.CreateContextFromType(ContextProperties* properties, DeviceTypeFlags device_type, IntPtr pfn_notify, IntPtr user_data, ErrorCode* errcode_ret)
at OpenTK.Compute.CL10.CL.CreateContextFromType(ContextProperties* properties, DeviceTypeFlags device_type, IntPtr pfn_notify, IntPtr user_data, ErrorCode* errcode_ret) in CL.cs:line 1629
at Examples.FFT.Main() in VectorAdd.cs:line 39

The same happens on CL.GetDeviceIDs in my own code. But in my case it does get through a previous CL.GetPlatformIDs call successfully.

the Fiddler's picture

Are your two GetDeviceIDs calls identical? Can you post those two lines?

The driver might be trying to dereference one of the null pointers passed to this method. Will have to read the specs again.

Edit: idea! Try changing the DeviceTypeFlags from DeviceTypeDefault to DeviceTypeGpu.

nythrix's picture

The first call:

    IntPtr[] ids;
    int idsLength = 0;
    int err = CL.GetDeviceIDs( handle, DeviceTypeFlags.DeviceTypeGpu, 0, null, &idsLength );
    ids = new IntPtr[ idsLength ];
    fixed( IntPtr* pDevices = ids )
    { CL.GetDeviceIDs( handle, DeviceTypeFlags.DeviceTypeGpu, idsLength, pDevices, null ); }

returns CL_INVALID_VALUE. Actually, that holds for every DeviceTypeFlags entry. Obviously, the second call throws an exception since idsLength stays zero.
From the CL specs:
clGetDeviceIDs ... returns CL_INVALID_VALUE if num_entries is equal to zero and devices is not NULL or if both num_devices and devices are NULL....
I have num_entries=0 and devices=null, and a pointer to idsLength is passed as num_devices. The handle was obtained a couple of calls earlier, so it is valid as well.
The other code is your "VectorAdd" example. Playing with the DeviceTypeFlags doesn't help. Same error.

Edit: code.

the Fiddler's picture

The wrapper matches the function definition so that's not it. Could be a driver issue then.

Viscum's picture

Hi there!

If you change the generated code for the DeviceTypeFlags enum to be 64 bits wide:

    public enum DeviceTypeFlags : long   // <-- long here!
    {
        DeviceTypeDefault = ((int)(1 << 0)),
        DeviceTypeCpu = ((int)(1 << 1)),
        DeviceTypeGpu = ((int)(1 << 2)),
        DeviceTypeAccelerator = ((int)(1 << 3)),
        DeviceTypeAll = unchecked((int)0Xffffffff),
    }

the following code no longer returns an error:

  uint ciDeviceCount = 99;
  result = CL.GetDeviceIDs(platform, DeviceTypeFlags.DeviceTypeAll, 0, null, &ciDeviceCount);

Instead, I get the correct result==0 and ciDeviceCount==1.

My configuration:
Windows Vista 32 bit
Geforce 8800GTS
Nvidia OpenCL beta driver 190.86

Hope this helps...

Viscum's picture

The same 64-bit issue applies to every enum that is a cl_bitfield in the original C header.

I changed CommandQueueFlags and MemFlags the same way, and the VectorAdd sample now runs without an error.

...but the result array contains no real data, only zeros.
Investigating this...
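
To illustrate why the underlying type matters, here is a minimal, self-contained sketch that does not call OpenTK or any OpenCL driver. The names DeviceTypeFlags32, DeviceTypeFlags64 and BitfieldSizeDemo are hypothetical stand-ins, not part of OpenTK. The sketch assumes only that cl_bitfield is a 64-bit integer in the C header and that an enum is passed to native code as its underlying integral type.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a generated enum with the default 'int' base type.
enum DeviceTypeFlags32 : int
{
    Gpu = 1 << 2
}

// Hypothetical stand-in for the corrected enum, matching a 64-bit cl_bitfield.
enum DeviceTypeFlags64 : long
{
    Gpu = 1 << 2
}

static class BitfieldSizeDemo
{
    // Size in bytes of the integral type that backs the given enum,
    // i.e. the size of the value handed to native code for that argument.
    public static int NativeSize(Type enumType)
    {
        return Marshal.SizeOf(Enum.GetUnderlyingType(enumType));
    }

    static void Main()
    {
        Console.WriteLine("int-backed:  {0} bytes", NativeSize(typeof(DeviceTypeFlags32)));
        Console.WriteLine("long-backed: {0} bytes", NativeSize(typeof(DeviceTypeFlags64)));
    }
}
```

The int-backed enum is 4 bytes wide and the long-backed one is 8. If the driver expects 8 bytes for the bitfield argument but receives 4, the arguments that follow it are presumably read from the wrong offsets, which would be consistent with the CL_INVALID_VALUE results and access violations reported above.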

Viscum's picture

Finally got the VectorAdd sample working:


  hDeviceMemB = CL.CreateBuffer(hContext,
    MemFlags.MemReadOnly | MemFlags.MemCopyHostPtr,
    new IntPtr(cnDimension.ToInt32() * sizeof(float)),
    new IntPtr(pB),  //nvidia sample had pA here for buffer B
    out error);


  CL.EnqueueReadBuffer(hCmdQueue, hDeviceMemC, true, IntPtr.Zero, //using hCmdQueue instead of hContext here
    new IntPtr(cnDimension.ToInt32() * sizeof(float)),
    new IntPtr(pC), 0, null, (IntPtr[])null);

OpenTK has done great work, but there are too many "untyped" IntPtr parameters in the method signatures at the moment.
Keep on porting/wrapping...

:-) Viscum

the Fiddler's picture

Good catch, thank you!

This issue was actually discovered and fixed a few months ago, but was reintroduced in 0.9.9-2 (#964: [OpenCL] Bitfields should be mapped to 'long' not 'int').