
[Solved] AccessViolationException in Cloo
Posted Sunday, 16 January, 2011 - 23:04 by kwaegel
[EDIT] This problem was fixed by updating to the latest version of nVidia's graphics driver. Works with version 263.06 and later.
I have been beating my head against some AccessViolationExceptions in Cloo for some months now, and I think I have finally found the cause. The new code nythrix put in, showing which threads were interacting with the Cloo objects, was the tip-off. Thanks nythrix!
This error appears to occur when ComputeResource objects constructed with an OpenCL/GL shared context are freed by a different thread than the one that created them. In the cases I have tested, the freeing thread is the system cleanup thread, not the OpenTK thread calling Game.Dispose(). The latter is the safe thread to use.
As near as I can tell with my general lack of knowledge of OpenGL, it looks like only the main thread has a valid OpenGL context. The other threads do not and thus are not allowed to access (i.e. free) the shared blocks of memory.
I have included some code showing a test case below. This is a very stripped down version of another testing program I have written, and thus does not really do anything. To cause this error, either the second or third line in the Game.Dispose() function should be commented out. To fix this error in larger programs, I think that every ComputeResource object needs to be disposed of by the same thread that created it.
// AccessViolationException test case.
namespace ClooTest
{
    using System;
    using System.Drawing;
    using System.Threading;
    using System.Diagnostics;
    using System.Collections.Generic;
    using System.Runtime.InteropServices;
    using OpenTK;
    using OpenTK.Graphics;
    using OpenTK.Graphics.OpenGL;
    using OpenTK.Input;
    using Cloo;

    class SimpGame : OpenTK.GameWindow
    {
        // required for OpenCL-OpenGL interop
        [DllImport("opengl32.dll")]
        extern static IntPtr wglGetCurrentDC();

        OpenTK.Graphics.IGraphicsContextInternal _glContext;
        ComputeContext _computeContext;
        ComputeCommandQueue _commandQueue;

        /// <summary>Creates a window with the specified title.</summary>
        public SimpGame()
            : base(400, 400, GraphicsMode.Default, "Cloo shared context tester")
        {
            VSync = VSyncMode.On;
        }

        protected override void OnLoad(EventArgs e)
        {
            base.OnLoad(e);
            GL.ClearColor(Color.Black);
            openCLSharedInit();
            // Do not need to build a program or kernel to reproduce error.
        }

        // Create a shared context between OpenGL and OpenCL.
        private void openCLSharedInit()
        {
            // select OpenCL device and platform. Need GPU for shared context.
            ComputePlatform platform = ComputePlatform.Platforms[0];
            ComputeDevice device = platform.Devices[0];
            if (device.Type != ComputeDeviceTypes.Gpu)
            {
                platform = ComputePlatform.Platforms[1];
                device = platform.Devices[0];
            }
            Trace.WriteLine("Creating shared context on " + device.ToString());

            IntPtr curDC = wglGetCurrentDC();
            _glContext = (OpenTK.Graphics.IGraphicsContextInternal)OpenTK.Graphics.GraphicsContext.CurrentContext;
            IntPtr raw_context_handle = _glContext.Context.Handle;
            ComputeContextProperty p1 = new ComputeContextProperty(ComputeContextPropertyName.CL_GL_CONTEXT_KHR, raw_context_handle);
            ComputeContextProperty p2 = new ComputeContextProperty(ComputeContextPropertyName.CL_WGL_HDC_KHR, curDC);
            ComputeContextProperty p3 = new ComputeContextProperty(ComputeContextPropertyName.Platform, platform.Handle);
            List<ComputeContextProperty> props = new List<ComputeContextProperty>() { p1, p2, p3 };
            ComputeContextPropertyList Properties = new ComputeContextPropertyList(props);
            _computeContext = new ComputeContext(device.Type, Properties, null, IntPtr.Zero);

            // Create the command queue from the context and device
            _commandQueue = new ComputeCommandQueue(_computeContext, device, ComputeCommandQueueFlags.None);
        }

        protected override void OnUpdateFrame(FrameEventArgs e)
        {
            base.OnUpdateFrame(e);
            if (Keyboard[Key.Escape])
            {
                Exit();
            }
        }

        protected override void OnRenderFrame(FrameEventArgs e)
        {
            base.OnRenderFrame(e);
        }

        public override void Dispose()
        {
            Trace.WriteLine("Dispose called in thread(" + Thread.CurrentThread.ManagedThreadId + ")");

            // Ensure all OpenCL objects are disposed of in the main thread.
            // WARNING: Removing any of these dispose lines will cause an AccessViolationException on program shutdown.
            //_commandQueue.Dispose();
            _computeContext.Dispose();

            base.Dispose();
        }

        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        static void Main()
        {
            System.Diagnostics.Trace.WriteLine("\n********** Run at " + System.DateTime.Now.ToString() + " **********");

            // The 'using' idiom guarantees proper resource cleanup.
            // We request 30 UpdateFrame events per second, and unlimited
            // RenderFrame events (as fast as the computer can handle).
            using (SimpGame game = new SimpGame())
            {
                game.Run(30.0);
                //game.Run();
            }
        }
    }
}
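
For larger programs, the rule "dispose on the thread that created the object" could be enforced with a small helper. The following is only a sketch under assumptions: ThreadBoundResources is a hypothetical class, not part of Cloo or OpenTK, and it assumes that every resource handed to it was created on the same thread that constructed the helper.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper, not part of Cloo or OpenTK: it remembers which thread
// created it and refuses to release its resources from any other thread.
public sealed class ThreadBoundResources : IDisposable
{
    private readonly int _ownerThreadId = Thread.CurrentThread.ManagedThreadId;
    private readonly Stack<IDisposable> _resources = new Stack<IDisposable>();

    public int OwnerThreadId
    {
        get { return _ownerThreadId; }
    }

    // Register a resource and hand it back so it can be assigned inline.
    public T Add<T>(T resource) where T : IDisposable
    {
        if (resource == null) throw new ArgumentNullException("resource");
        _resources.Push(resource);
        return resource;
    }

    // Dispose everything in reverse creation order, but only on the owner thread.
    public void Dispose()
    {
        if (Thread.CurrentThread.ManagedThreadId != _ownerThreadId)
        {
            throw new InvalidOperationException(
                "Resources must be disposed on the thread that created them.");
        }

        while (_resources.Count > 0)
        {
            _resources.Pop().Dispose();
        }
    }
}
```

In the test case above, the context and the command queue would each be wrapped in a call to Add() inside openCLSharedInit(), and Dispose() would call the helper's Dispose() before base.Dispose(). Because the stack is unwound in reverse order, the queue would be released before the context it was created from.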


Comments
Re: Solution to AccessViolationException in Cloo
- If Dispose must always be called, then why have finalizers at all?
Btw, finalizers may not be called at all, which may or may not cause problems.
Re: Solution to AccessViolationException in Cloo
This code is also supposed to fail but it doesn't:
It behaves exactly like the one with the CL objects. I MUST be missing something obvious here.
Suggestions?
Re: Solution to AccessViolationException in Cloo
Calling GL functions without a context results in undefined behavior. It may crash, or it may not.
On Windows, GL1.1 functions tend to fail silently, while GL1.2 and higher tend to crash. On Linux, all functions tend to crash. This used to be a major source of bugs and confusion in the older Tao framework, since most people failed to initialize a context properly.
Re: Solution to AccessViolationException in Cloo
Cloo has been updated with some patches.
Unfortunately, this problem hasn't been dealt with yet. What's more, I'm leaning towards the idea that proper disposal can't be done at the library level.
I'm going to give it a go with SafeHandles. If that works, expect a minor breaking change (if accessing handles directly that is). If not, I'm afraid you'll have to Dispose (at least CL/GL shared) resources manually.
Re: Solution to AccessViolationException in Cloo
I hit a problem with OpenCL functions returning an array of OpenCL handles (clGetPlatformIDs, clGetDeviceIDs and a lot of other ones). The marshaler cannot convert these to SafeHandle arrays (a MarshalDirectiveException is thrown). Google tells me no one has ever done something like this before. I guess we're stuck with simple IntPtrs for now.
Re: Solution to AccessViolationException in Cloo
One solution would be to hide the p/invoke behind a public function that handles the conversion. Something like:
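The idea could be sketched roughly as follows. This is a minimal illustration under assumptions, not Cloo code: DeviceHandle and Foo are hypothetical names, and FooPrivate is a managed stand-in for what would really be a private [DllImport] (for example clGetDeviceIDs), so that the example runs without an OpenCL driver.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical handle type for illustration; Cloo does not necessarily define it.
public sealed class DeviceHandle : SafeHandle
{
    public DeviceHandle(IntPtr raw) : base(IntPtr.Zero, true)
    {
        SetHandle(raw);
    }

    public override bool IsInvalid
    {
        get { return handle == IntPtr.Zero; }
    }

    protected override bool ReleaseHandle()
    {
        // A real binding would call the matching native release function here.
        return true;
    }
}

public static class DeviceQuery
{
    // Stand-in for the private p/invoke. It fills a plain IntPtr[] because
    // the marshaler cannot fill a SafeHandle[] directly.
    private static int FooPrivate(IntPtr[] native_handles, out int count)
    {
        count = 2;
        if (native_handles != null)
        {
            for (int i = 0; i < native_handles.Length && i < count; i++)
            {
                native_handles[i] = new IntPtr(0x1000 + i); // fake native handles
            }
        }
        return 0; // 0 means success, mirroring CL_SUCCESS
    }

    // Public wrapper: query the count, fetch the raw handles, then wrap each one.
    public static DeviceHandle[] Foo()
    {
        int count;
        int error = FooPrivate(null, out count);
        if (error != 0) throw new InvalidOperationException("Query failed: " + error);

        IntPtr[] native_handles = new IntPtr[count];
        error = FooPrivate(native_handles, out count);
        if (error != 0) throw new InvalidOperationException("Query failed: " + error);

        DeviceHandle[] handles = new DeviceHandle[count];
        for (int i = 0; i < count; i++)
        {
            handles[i] = new DeviceHandle(native_handles[i]);
        }
        return handles;
    }
}
```

Callers would only ever see DeviceHandle[] from Foo(); the IntPtr array stays private to the wrapper.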
Re: Solution to AccessViolationException in Cloo
Considered that. Unfortunately, it eliminates one of the (not many) "advantages" of SafeHandles. The IntPtr may leak anywhere between FooPrivate(native_handles) and handles[i] = new SafeHandle(...) in case a ThreadAbortException or similar kicks in.
All in all, I'm no closer to solving the original problem, which arises from the GC running in a thread without an OpenGL context. And I don't think SafeHandles (or any others) can fix that.
Here's a top cloo for all the users:
You should always free your shared CL/GL resources manually. Failing to do so may lead to unexpected behavior/exceptions/crashes when they're collected by the GC.
Pure OpenCL resources are probably fine. However, for best performance you should consider some sort of management (e.g. building the OpenCL program from scratch during every frame works, but is extremely slow).
Re: Solution to AccessViolationException in Cloo
Oh well. I had hoped that SafeHandles would be of some help. I suppose you could call GLContext.MakeCurrent() in the GC thread, but I can see all sorts of problems with that idea. :D
In the meantime, here is some code I put in my local copy of the ComputeResource finalizer to explain (to myself) what is going on in these cases.
Re: Solution to AccessViolationException in Cloo
Fixed (I think).
You aren't going to believe this, but I think it's a driver issue. I updated to nVidia's latest developer drivers (263.06) and the problem disappeared. Can someone else try this out and let me know if you get the same results? The latest developer drivers are available here.
As a check, I downgraded to my previous developer driver version (260.61) and the problem started occurring again. I have not tried the latest production drivers yet (266.58).
Re: Solution to AccessViolationException in Cloo
Indeed, it doesn't happen with 266.58 (c2woody reported). I tested it and it works. I didn't know 263.06 works too. But how? Why?
Stealing the OpenGL context is a very bad idea. One I was briefly tempted by, I must admit :)
Btw:
Throwing an exception from a finalizer causes the CLR to fail fast, which tears down the process. Therefore, throwing exceptions in a finalizer should always be avoided.
http://msdn.microsoft.com/en-us/library/bb386039.aspx
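A finalizer that follows this advice might look like the sketch below. It assumes a hypothetical TrackedResource class (not Cloo's ComputeResource) and makes one deliberate, debatable choice: when it runs on the finalizer thread it only logs and skips the native release, trading a leak for avoiding the crash discussed in this thread.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical stand-in for a resource wrapper; not Cloo's ComputeResource.
public class TrackedResource : IDisposable
{
    private readonly int _creatorThreadId = Thread.CurrentThread.ManagedThreadId;

    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // the finalizer has nothing left to do
    }

    ~TrackedResource()
    {
        try
        {
            Dispose(false);
        }
        catch (Exception ex)
        {
            // Report and swallow: an exception escaping a finalizer ends the process.
            Trace.WriteLine("Finalizer failed: " + ex);
        }
    }

    protected virtual void Dispose(bool disposing)
    {
        if (IsDisposed) return;
        IsDisposed = true;

        if (!disposing)
        {
            // Reached from the finalizer thread: log the leak instead of throwing.
            Trace.WriteLine("Leaked resource created on thread " + _creatorThreadId
                + ", finalized on thread " + Thread.CurrentThread.ManagedThreadId);
            return;
        }

        // Explicit Dispose(): a real wrapper would release the native handle here.
    }
}
```

With this shape, forgetting to call Dispose() shows up as a trace message naming both threads rather than as a process teardown.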