Pablitinho's picture

Problem measuring kernel time in Cloo

Hi there,

When I measure the execution time with events, the code runs much more slowly than when I don't measure the time with events, although the time itself seems to be measured correctly. The code is shown below:

//---------------------------------------------------------------------------------------------------------------------------------------------
// Code with event time measurement
//---------------------------------------------------------------------------------------------------------------------------------------------
ev.Clear();
GPUDevice.oclCQ.Execute(Kernels.myKernel, null, GlobalWorkSize, LocalWorkSize, ev);
GPUDevice.oclCQ.Finish();
GPUDevice.oclCQ.Wait(ev);
 
long  start = ev.Last.StartTime;
long   end = ev.Last.FinishTime;
 
float Time= (float)((end - start) / 1000000.0f);
//---------------------------------------------------------------------------------------------------------------------------------------------
//---------------------------------------------------------------------------------------------------------------------------------------------
// Code without event time measurement
//---------------------------------------------------------------------------------------------------------------------------------------------
sw.Reset();
sw.Start();
GPUDevice.oclCQ.Execute(Kernels.myKernel, null, GlobalWorkSize, LocalWorkSize, null);
GPUDevice.oclCQ.Finish();
sw.Stop();
float Time = (float)sw.Elapsed.TotalMilliseconds;
 
//---------------------------------------------------------------------------------------------------------------------------------------------

Does anyone know the reason for this?

Thanks in advance!


Comments

nythrix's picture

This is sort of "normal". I've noticed the same behavior, and it is due to the events being created in OpenCL. Cloo adds little overhead here.

Pablitinho's picture

I was checking the Cloo code and removed all the calls to "Trace.WriteLine", which reduced the times in my methods a lot. Take into account that I create memory objects, destroy them, and so on. I have examples that took 55 ms with the Trace.WriteLine calls and only 7 ms with them commented out.

Thanks for the answer.

nythrix's picture

If you want to do profiling, you should undefine the TRACE and DEBUG symbols and run in Release mode. The compiler will then throw the Trace and Debug calls away for you.
Deleting those lines is not recommended, since you might not notice that you're "leaking" resources. This mainly concerns resources shared with OpenGL.
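
To illustrate the point about the compiler throwing the calls away, here is a minimal standalone sketch. It is not Cloo code, and the names ConditionalDemo, BuildMessage and CountEvaluations are made up for the example. It assumes the symbols are undefined with #undef at the top of the file; in a real project you would more likely clear them in the build configuration. Because Trace.WriteLine and Debug.WriteLine are conditional methods, the calls are dropped together with the evaluation of their arguments when the symbols are not defined:

```csharp
// Undefine the symbols for this file only. These directives must come
// before any other code in the file.
#undef TRACE
#undef DEBUG

using System;
using System.Diagnostics;

// Hypothetical demo class, not part of Cloo.
static class ConditionalDemo
{
    static int evaluations;

    // Stands in for the work needed to build a trace message.
    static string BuildMessage()
    {
        evaluations++;
        return "resource created";
    }

    // Returns how many times BuildMessage actually ran.
    public static int CountEvaluations()
    {
        evaluations = 0;
        Trace.WriteLine(BuildMessage()); // removed when TRACE is undefined
        Debug.WriteLine(BuildMessage()); // removed when DEBUG is undefined
        return evaluations;
    }

    static void Main()
    {
        // With both symbols undefined, neither call is compiled in.
        Console.WriteLine(CountEvaluations());
    }
}
```

If you remove the two #undef lines and define TRACE and DEBUG in the build, the same source keeps its tracing, so nothing has to be deleted from the library by hand.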