
Cloo OutOfResourcesComputeException when trying to process successive image buffers
Posted Wednesday, 2 November, 2011 - 21:40 by rajaron
I am trying to use Cloo to demosaic a stream of raw images. The first call to the Demosaic method below successfully converts the 8bpp Bayer mosaic to a 24bpp color image. The next call results in a black image (all zeros). After 6 frames, attempts to demosaic subsequent frames result in an OutOfResourcesComputeException. Input is byte[400, 400]. (My OpenCL kernel works on an NVIDIA GeForce GT 525M, an AMD ATI Radeon 5450, and an Intel i7 CPU, so I have not included it here. The problem is evidently in how I am using Cloo to set things up for the computing device.)
Here's my class that does the demosaic:
    class MHCdemosaic : IDemosaic
    {
        private ComputeProgram program;
        private ComputeKernel kernel;

        public MHCdemosaic()
        {
            // Init OpenCL
            // For now use the first platform and first device thereof for the context
            ComputePlatform platform = ComputePlatform.Platforms[0];
            List<ComputeDevice> devices = new List<ComputeDevice>();
            devices.Add(platform.Devices[0]);
            ComputeContextPropertyList properties = new ComputeContextPropertyList(platform);
            ComputeContext cc = new ComputeContext(devices, properties, null, IntPtr.Zero);

            // Setup the program and its kernel
            program = new ComputeProgram(cc, LoadSource("MHCdemosaic.cl"));
            program.Build(null, null, null, IntPtr.Zero);
            if (kernel != null) kernel.Dispose();
            kernel = program.CreateKernel("MHCdemosaic");
        }

        public Bitmap Demosaic(byte[,] bayer) // needs a 1D array, not 2D like others
        {
            // temporary "flatten" to 1D to send to GPU
            byte[] mosaic = new byte[bayer.Length];
            int height = bayer.GetLength(0);
            int width = bayer.GetLength(1);
            for (int y = 0; y < height; y++)
                for (int x = 0; x < width; x++)
                    mosaic[x + y * width] = bayer[y, x];

            byte[] bimage = new byte[bayer.Length * 3];

            // Setup input buffer
            using (ComputeBuffer<byte> imageIn = new ComputeBuffer<byte>(kernel.Context,
                ComputeMemoryFlags.ReadOnly | ComputeMemoryFlags.CopyHostPointer, mosaic))
            {
                // Setup output buffer
                using (ComputeBuffer<byte> imageOut = new ComputeBuffer<byte>(kernel.Context,
                    ComputeMemoryFlags.WriteOnly, bimage.Length))
                {
                    // Set arguments
                    kernel.SetMemoryArgument(0, imageIn);
                    kernel.SetMemoryArgument(1, imageOut);

                    using (ComputeCommandQueue cq = new ComputeCommandQueue(kernel.Context,
                        kernel.Context.Devices[0], ComputeCommandQueueFlags.None))
                    {
                        // Execute
                        cq.Execute(kernel, null, new long[] { width, height }, null, null);

                        // Get the color buffer
                        cq.ReadFromBuffer(imageOut, ref bimage, false, null);

                        // Convert to Bitmap and return
                        return CreateBitmap(bimage, width, height);
                    }
                }
            }
        }

        private Bitmap CreateBitmap(byte[] buffer, int width, int height)
        {
            Bitmap newBitmap = new Bitmap(width, height, PixelFormat.Format24bppRgb);
            BitmapData newData = newBitmap.LockBits(new Rectangle(0, 0, width, height),
                ImageLockMode.WriteOnly, PixelFormat.Format24bppRgb);
            int stride = newData.Stride;
            IntPtr ptr = newData.Scan0;
            System.Runtime.InteropServices.Marshal.Copy(buffer, 0, ptr, stride * height);
            newBitmap.UnlockBits(newData);
            return newBitmap;
        }
    }


Comments
Re: Cloo OutOfResourcesComputeException when trying to ...
The kernel declaration is as follows:
Re: Cloo OutOfResourcesComputeException when trying to ...
I can see nothing wrong with this piece of code. You could, however, try to reduce overhead by moving resource creation to the constructor. Also, kernel arguments need to be set only once. Buffers can be updated through the ComputeCommandQueue.ReadFrom/WriteTo methods, which also accept 2D and 3D arrays directly (this is also faster, since Cloo doesn't flatten the array but passes the pointer to the first element to OpenCL).
If these tips help, that might indicate a problem with resource handling in Cloo. I'll try to tackle that later on.
Re: Cloo OutOfResourcesComputeException when trying to ...
Ok, so I moved the resource creation to the constructor as follows:
and then, in the Demosaic method, per your recommendation, I used a WriteToBuffer taking the 2D array directly as follows:
When using the NVIDIA device, the attempt to demosaic the 2nd frame throws an AccessViolationException on line 628 of ComputeCommandQueue:
I figure that this happens because the NVIDIA driver only supports OpenCL 1.0 and this is apparently a 1.1 function call. (There isn't a 1.1 driver available yet for my Dell laptop card.)
When using the Intel i7 device, which supports OpenCL 1.1, it doesn't throw an exception, but it returns only a black image (all zeros) for every frame. It also doesn't run into the OutOfResourcesComputeException, even if I stream frames to it continuously. So, that's good. Now if only it would return the color image.
Am I using the right WriteToBuffer method and with the right parameters? Or, is there an alternative that would work for OpenCL 1.0 for my NVIDIA device?
Re: Cloo OutOfResourcesComputeException when trying to ...
Sorry about it. You'll have to flatten the array manually since the 2D/3D versions rely on OpenCL 1.1. I should've mentioned it previously.
You can use WriteToBuffer<T>(T[] source, ComputeBufferBase<T> destination, bool blocking, IList<ComputeEventBase> events) (or the version with more parameters if you wish to specify a subrange). This should be usable in OpenCL 1.0.
I have no idea about the black output. Have you tried simply transferring the data from input to output without processing it? Say, a red square or similar.
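For reference, the manual flattening mentioned above can be done without nested loops. The following is a minimal sketch, not code from this thread: the class and method names (`BayerFlatten`, `Flatten`) are hypothetical. It assumes the input is a `byte[height, width]` array and relies on .NET storing multidimensional arrays in row-major order, so element `[y, x]` ends up at index `x + y * width`.

```csharp
using System;

static class BayerFlatten
{
    // Hypothetical helper: copies a row-major byte[height, width] array
    // into a new 1D array with the same element order.
    public static byte[] Flatten(byte[,] bayer)
    {
        byte[] mosaic = new byte[bayer.Length];
        // For byte arrays the element count equals the byte count,
        // so bayer.Length is the number of bytes to copy.
        Buffer.BlockCopy(bayer, 0, mosaic, 0, bayer.Length);
        return mosaic;
    }
}
```

The resulting 1D array can then be passed to the 1D `WriteToBuffer` overload quoted above.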
Re: Cloo OutOfResourcesComputeException when trying to ...
Yes, adding back the flattening to a 1D array and changing the WriteToBuffer as follows:
worked for the NVIDIA device.
I'm not sure either why the Intel i7 as a ComputeDevice isn't working. I had it working in a one-shot implementation, but it's not working in this program. I'll look into that later.
Thank you for your help in resolving this.
Re: Cloo OutOfResourcesComputeException when trying to ...
I changed the ReadFromBuffer to block, and the Intel i7 now works as a computing device:
Evidently, the kernel was not finished when ReadFromBuffer was called so it was getting nothing but zeroes from the buffer. Now that it waits until the kernel finishes, it gets the color image back.
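A minimal sketch of the blocking read described above, not the poster's actual code. It uses only the Cloo calls that appear in the original post (`Execute` and `ReadFromBuffer`, with the same parameter lists); the helper class and method names are hypothetical, and the queue, kernel, and output buffer are assumed to have been created and configured elsewhere.

```csharp
using Cloo;

static class DemosaicReadback
{
    // Hypothetical helper: runs the kernel over a width x height grid and
    // reads the 24bpp result back into a host array.
    public static byte[] RunAndRead(ComputeCommandQueue cq, ComputeKernel kernel,
                                    ComputeBuffer<byte> imageOut, int width, int height)
    {
        byte[] bimage = new byte[width * height * 3];

        // Enqueue the kernel; this call does not wait for it to finish.
        cq.Execute(kernel, null, new long[] { width, height }, null, null);

        // Third argument is the blocking flag. With true, the call returns only
        // after the data has been copied into bimage; the original post passed
        // false and then used the array immediately.
        cq.ReadFromBuffer(imageOut, ref bimage, true, null);

        return bimage;
    }
}
```

With the flag set to `false`, the host array may still hold its initial zeros when it is converted to a Bitmap, which matches the black frames reported earlier in the thread.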