Kamujin's picture

How to get more accurate FPS without spinning the CPU.

The following code is a replacement for the Run method in GameWindow that I use. There is some room for minor speed optimizations, but this method is orders of magnitude more efficient then spinning the CPU with Thread.Sleep(0);

In my tests, this algorithm is able to achieve FPS rates withing 1-3 frames of the target parameter when measured in quarter second intervals. Sampling over 5 seconds yields actual FPS within 1 frame of target. The algorithm also quickly adapts to new loads such as those experienced during window resizing.

The approach uses a simple "balancing broom" style fuzzy logic algorithm to deal with the problem of Thread.Sleep(...) timers with resolutions too low to support a deterministic calculation of the time parameter needed to obtain the desired FPS.

For Example:
On most windows machines, the sleep timer has a resolution of only 15ms. This means that calling Thread.Sleep(1) can actually put the thread to sleep for up to 15ms. Therefore, you can not simply achieve 100FPS but calling Sleep(10) between frames. In the case of a 100FPS target, the algorithm will determine a blend of Sleep(0) and Sleep(1) calls that will achieve the desired frame rate. Even though some frames result in a Sleep(0) call between frames, the occasional Sleep(1) will dramatically lower the CPU load.

For a target of 50 FPS, the algorithm will most likely blend Sleep(15) and Sleep(16) calls.

One very attractive part of this algorithm is that no assumptions need be made about timer resolution. The algorithm will automatically adapt to whatever timer resolution is available.

Fiddler,
I would really like write access to the SVN to further develop this algorithm. I would like to extend this code with the following improvements,

1) Support Update and Render calls with different target frequencies.

2) A more even distribution of Sleep calls that will remove the "clustering" effect of the current system.

3) Remove the arrays.

4) Remove the need to search.

5) Increased precision.

I already have a system in mind to accomplish these improvements.

        public void RunSimple(int targetFps)
        {
            if (disposed) throw new ObjectDisposedException("GameWindow");
            try
            {
                TargetUpdateFrequency = targetFps;
                TargetRenderFrequency = targetFps;
 
                UpdateFrameEventArgs updateArgs = new UpdateFrameEventArgs();
                RenderFrameEventArgs renderArgs = new RenderFrameEventArgs();
 
                try
                {
                    OnLoadInternal(EventArgs.Empty);
                }
                catch (Exception e)
                {
                    Trace.WriteLine(String.Format("OnLoad failed: {0}", e.ToString()));
                    return;
                }
 
                Debug.Print("Entering main loop.");
                hasMainLoop = true;
 
                Stopwatch stopWatch = Stopwatch.StartNew();
 
                int[] sleepTimes = new int[15];
                for (int i = 0; i < sleepTimes.Length; i++) sleepTimes[i] = 1000 / targetFps;
                int frameSleepTime = 0;
 
                int frames = 0;
                double previousElapsedSeconds = 0;
                while (!isExiting)
                {
                    frames++;
 
                    double totalElapsedSeconds = stopWatch.Elapsed.TotalSeconds;
                    double frameElapsedSeconds = totalElapsedSeconds - previousElapsedSeconds;
                    previousElapsedSeconds = totalElapsedSeconds;
 
                    if (totalElapsedSeconds >= 0.25)
                    {
                        double fps = frames / totalElapsedSeconds;
 
                        if (fps < targetFps)
                        {
                            int max = 0;
                            for (int i = 1; i < sleepTimes.Length; i++)
                            {
                                if (sleepTimes[i] > sleepTimes[max]) max = i;
                            }
                            sleepTimes[max] = System.Math.Max(0, sleepTimes[max] - 1);
                        }
                        else
                        {
                            int min = 0;
                            for (int i = 1; i < sleepTimes.Length; i++)
                            {
                                if (sleepTimes[i] < sleepTimes[min]) min = i;
                            }
                            sleepTimes[min] += 1;
                        }
 
                        //Console.Write(string.Format("FPS:{0} Target:{1} SleepTimes:", fps, targetFps));
                        //for (int i = 0; i < sleepTimes.Length; i++)
                        //{
                        //    Console.Write(string.Format(" {0}", sleepTimes[i]));
                        //}
                        //Console.Write("\n");
 
                        stopWatch.Reset(); 
                        stopWatch.Start();
                        frames = 0;
                        previousElapsedSeconds = 0;
                    }
 
                    ProcessEvents();
 
                    updateArgs.Time = frameElapsedSeconds;
                    OnUpdateFrameInternal(updateArgs);
 
                    renderArgs.Time = frameElapsedSeconds;
	                OnRenderFrameInternal(renderArgs);
 
                    System.Threading.Thread.Sleep(sleepTimes[frameSleepTime]);
                    frameSleepTime = (frameSleepTime + 1) % sleepTimes.Length;
                }
            }
            catch (GameWindowExitException)
            {
                Debug.WriteLine("GameWindowExitException caught - exiting main loop.");
            }
            finally
            {
                Debug.Print("Restoring priority.");
                Thread.CurrentThread.Priority = ThreadPriority.Normal;
 
                OnUnloadInternal(EventArgs.Empty);
 
                if (Exists)
                {
                    glContext.Dispose();
                    glContext = null;
                    glWindow.DestroyWindow();
                }
                while (this.Exists)
                    this.ProcessEvents();
            }
        }

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.
objarni's picture

Is this approach fixed-time-step-logic or non-fixed? Sorry for being so code-blind..

Kamujin's picture

Currently, non-fixed, but it can be extended to support simulated fixed.

Kamujin's picture

Well, I have done as much testing as I can think to do at various loads and with various target FPS. I am not seeing jitter or negative side effects. I really think you should make this available for additional testers. I am sure other will uncover things that I have missed.

I have also tested this with VSync on and off. The only noticeable, but obvious side effect, is that the algorithm can no longer achieve target FPS above the vsync frequency. It is still able to match target FPS below the vsync frequency.

I do think I have failed to communicate the adaptive nature of the algorithm properly. The important thing to understand is that this algorithm is continuously re-balancing. Varying CPU and GPU loads are not a difficult problem for this system to deal with. Or to put it another way, this system is specifically designed to deal with varying loads. The trade off that one is making with this algorithm is that you are sacrificing some precision and determinism. Logically, this would be unacceptable to balance ones checkbook, but is completely tolerable in a FPS scheduler.

I am not advertising this as a complete solution to someone interested in doing fixed time interval updates. I do think it is reasonably simple for someone to consume the update events in a way that simulates a fixed time interval scheduler. Considering the great amount of non-determinism in the hardware and operating system combinations likely to run OpenTK, I am dubious that a true fixed time interval update scheduler is practical.

I still need to complete the documentation that you requested. I expect to have it complete shortly.

Currently, I am consuming the InterleavedScheduler the following way. I would like your opinions on whether I should internalize the calculation of the updateElapsedSeconds and renderElapsedSeconds values into the InterleavedScheduler.

        public void RunSimple(int targetFps)
        {
            RunSimple(targetFps, targetFps);
        }
 
        public void RunSimple(int updatesPerSecond, int framesPerSecond)
        {
            if (disposed) throw new ObjectDisposedException("GameWindow");
            try
            {
                TargetUpdateFrequency = updatesPerSecond;
                TargetRenderFrequency = framesPerSecond;
 
                UpdateFrameEventArgs updateArgs = new UpdateFrameEventArgs();
                RenderFrameEventArgs renderArgs = new RenderFrameEventArgs();
 
                try
                {
                    OnLoadInternal(EventArgs.Empty);
                }
                catch (Exception e)
                {
                    Trace.WriteLine(String.Format("OnLoad failed: {0}", e.ToString()));
                    return;
                }
 
                Debug.Print("Entering main loop.");
                hasMainLoop = true;
 
                Stopwatch stopwatch = new Stopwatch();
                stopwatch.Reset();
                stopwatch.Start();
                int updates = 0;
                int frames = 0;
                double updateElapsedSeconds = 0;
                double renderElapsedSeconds = 0;
 
                InterleavedScheduler scheduler = new InterleavedScheduler(updatesPerSecond, framesPerSecond);
                while (!isExiting)
                {
                    bool shouldUpdate, shouldRender;
                    double elapsedSeconds = scheduler.Update(out shouldUpdate, out shouldRender);
                    updateElapsedSeconds += elapsedSeconds;
                    renderElapsedSeconds += elapsedSeconds;
 
                    ProcessEvents();
 
                    if (shouldUpdate)
                    {
                        updates++;
                        updateArgs.Time = updateElapsedSeconds;
                        OnUpdateFrameInternal(updateArgs);
                        updateElapsedSeconds = 0;
                    }
 
                    if (shouldRender)
                    {
                        frames++;
                        renderArgs.Time = renderElapsedSeconds;
                        OnRenderFrameInternal(renderArgs);
                        renderElapsedSeconds = 0;
                    }
 
                    double totalSeconds = stopwatch.Elapsed.TotalSeconds;
                    if (totalSeconds >= 5.0)
                    {
                        double ups = updates / totalSeconds;
                        double fps = frames / totalSeconds;
 
                        Console.WriteLine("Actual UPS={0:0.00}, FPS={1:0.00}", ups, fps);
 
                        updates = 0;
                        frames = 0;
                        stopwatch.Reset();
                        stopwatch.Start();
                    }
 
                    scheduler.Sleep();
                }
            }
            catch (GameWindowExitException)
            {
                Debug.WriteLine("GameWindowExitException caught - exiting main loop.");
            }
            finally
            {
                Debug.Print("Restoring priority.");
                Thread.CurrentThread.Priority = ThreadPriority.Normal;
 
                OnUnloadInternal(EventArgs.Empty);
 
                if (Exists)
                {
                    glContext.Dispose();
                    glContext = null;
                    glWindow.DestroyWindow();
                }
                while (this.Exists)
                    this.ProcessEvents();
            }
        }

Lastly, as VSync is itself a form of scheduler, should OpenTK offer the option of disabling the interleaved scheduler in favor of using a purely VSync driven update / render scheduler?

the Fiddler's picture

Thanks for the code, I'll make a modified build available for testing. The API looks ok, but I'll have to play with it first (e.g. would it be more natural if the scheduler raised the update/render events itself?)

Regarding vsync, it would be enough if there was an option to run the scheduler as fast as possible (no limiting). This way, one could request e.g. 30 updates per second and enable vsync for the actual frame limiting (useful for CRTs).

Kamujin's picture

I have tried to avoid over-reaching with the scheduler. I am very open to suggestions on what functionality to encapsulate.

To my thinking, if you really just want to vsync, the execution path should not include calls to the scheduler.

With respect to raising events on behalf of GameWindow. I am not opposed, but as I said before, I didn't want to over-reach.

Inertia's picture

This sure is an interesting concept, thanks for sharing!

I assume you're working on this in the scope of your Robotech game? You should probably take a look at http://gafferongames.wordpress.com/game-physics/fix-your-timestep/ too. It deals with the inevitable order of input, simulate and rendering. The website also has some good articles about network programming, they might not apply directly for mmog but the sync and packing approaches might be interesting for you too.

Mincus's picture

I nearly mentioned something like that.
I haven't looked at the code really closely, but from what I have looked at, I noticed is that it lacks the ability to tell whether things are being delayed without checking the fps value itself (which lacks accuracy for interpolating animation) or keeping a seperate timer.
Maybe that's beyond the scope of what is intended though?

Kamujin's picture

It does report accurate elapsed times for both the renderFrame and update events. Am I missing something else? I think the consumer can tell pretty easily by sampling the intervals how close to their target they are. UPS and FPS are averages after all. I do think its important to abandon any notion of true fixed scheduling. Fighting the load and non-determinism in a game and its hardware is a losing battle IMO.

@Intertia. Thanks for the link. I use a similar method already internally for updates. Although, I also decouple actual position / orientation with apparent position / orientation and smoothly interpolate my render operations also. This might seem redundant, but in my case it's not. My needs for bounded update intervals comes from my use of numerical physics. IE, I don't integrate. As I am choosing my update bounds to maintain a desired precision in my physics calculations, I am necessarily not optimizing for smoothed rendering. The use of apparent position / orientation gives me very smooth visible movement in the presence of all sorts of load variation. It also gives me the freedom to concurrently compute my physics on another thread.

I didn't want to force an update ideology on anyone with this scheduler. It's quite trivial too add different smoothing techniques inside the event handlers, so why try to pick a one size fits all solution?

For my use case, CPU is precious. I loathe to spend it spinning.

objarni's picture

Kamujin: Do you have any example of an app' (something extremely simple like a spinning triangle or something) using the InterleavedScheduler?

Kamujin's picture

It's designed to replace the Run loop of OpenTK, so no.

I do use it in http://sourceforge.net/projects/oogl, but that's not a simple example.

Additionally, I use it in my personal projects.