Req RetroPlayer support for RetroArch's “Run-Ahead” input latency reduction feature?
#1
Earlier this year RetroArch implemented a feature they call “Run-Ahead” to reduce (or hide) input latency/lag and I'm wondering if this 'runahead latency reduction' concept could be adopted as a RetroPlayer feature for Kodi too?

RetroArch developers have found a method to achieve extremely low input latency/lag and have so far implemented it in RetroArch and some libretro cores. They posted a summary here:

https://www.libretro.com/index.php/retro...ad-method/

"Most systems more complex than the Atari 2600 will display a reaction to input after one or two frames have already been shown. For example, Super Mario Bros on the NES always has one frame of input latency on real hardware. Super Mario World on SNES has two frames of latency. The Libretro forum post linked above provided patches to RetroArch that can effectively “remove” this baked-in latency by running multiple instances of a core simultaneously, and quickly emulating multiple frames at a time if input changes." "Be aware that this method is highly resource intensive. The higher the number of frames you are going to run ahead of emulation, the higher demands it places on your CPU."

Note the part explaining that this type of 'runahead' compensation can actually achieve lower input latency than the real console/arcade hardware. Keep in mind that you would not want to use it for a truly accurate experience, since it is no longer an exact emulation of the real hardware, which always had some built-in input lag. Still, I would think many would argue that less input latency/lag is better, and that it could feel more true to how you remember playing on those consoles/arcades back in the 'good old days'.

The concept was first conceived by hunterk and has now been implemented in RetroArch by Dwedit. Here are links that explain this new “Run-Ahead” feature in more detail:

https://docs.libretro.com/guides/runahead/

https://www.reddit.com/r/emulation/comme...d/dwj3chb/

https://forums.libretro.com/t/input-lag-...-lag/15075

Update: Here are a couple of videos explaining how Run-Ahead works from an end user's point of view in RetroArch



#2
Can you explain how runahead works in plain english?

This is a cool idea, but I want to focus more on accessibility than performance initially. I have some updates to RetroPlayer planned that may bring this kind of feature into the realm of what's possible.
#3
(2018-10-31, 02:24)garbear Wrote: Can you explain how runahead works in plain english?

As I understand it, RetroArch's “Run-Ahead” input latency reduction feature builds on the same principles as RetroPlayer's existing rewind and save-state features: the emulator runs a frame without drawing it and saves state, runs ahead to a future frame and shows that one, then loads the saved state (a rollback) before the next frame. That way, "every frame drawn is being shown from the future".

As they write here: https://docs.libretro.com/guides/runahead/

Every game has a certain built-in amount of lag: some react on the next displayed frame, some can take 2, 3 or even more frames before an action on the gamepad finally gets rendered on screen.

The Run Ahead feature calculates the frames as fast as possible in the background to "rollback" the action as close as possible to the input command requested.

That feature deals with "internal" game logic lag. This means you can still take advantage of other RetroArch lag reduction methods that happen later, such as Hard GPU Sync or Frame Delay.


Dwedit, the RetroArch/Libretro developer who coded the feature (and who also happens to be the developer of the PocketNES and Goomba Color emulators for GBA), posted a short summary in plain English here: https://www.reddit.com/r/emulation/comme...d/dwj3chb/

Novel method to reduce emulator input lag beyond the limits of real hardware via constant savestates and rollback

How the Run-Ahead feature currently works:

There are two modes of operation.
  • Single-Instance Mode
  • Two-Instance Mode

In Single-Instance mode, when it wants to run a frame, instead it does this:
  • Disable audio and video, run a frame, Save State
  • Run additional frames with audio and video disabled if we want to run ahead more than one frame
  • Enable audio and video and run the frame we want to see
  • Load State
All save states and load states are done to ram and never reach the disk.
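
The Single-Instance steps above can be illustrated with a minimal sketch. This is not RetroArch's code: `ToyCore` and `run_ahead_frame` are hypothetical names, the "core" is a toy with exactly one frame of internal lag, audio is not modelled, and "save state" is just a copy of a small struct in RAM.

```cpp
// Hypothetical toy "core" with one frame of internal lag: the frame it
// draws reflects the input latched on the PREVIOUS call to run_frame().
struct ToyCore {
    int position = 0;      // game state that is visible on screen
    int pending_move = 0;  // input latched last frame, not yet visible

    // Runs one frame and returns the "video" (here just the position drawn).
    int run_frame(int input) {
        position += pending_move;  // last frame's input becomes visible now
        pending_move = input;      // this frame's input is only latched
        return position;
    }
};

// Single-instance run-ahead (assumed helper, not a RetroArch function).
// Returns the frame that would be shown to the player.
int run_ahead_frame(ToyCore& core, int input, int frames_ahead) {
    if (frames_ahead <= 0)
        return core.run_frame(input);         // normal update

    core.run_frame(input);                    // hidden frame: output dropped
    const ToyCore saved = core;               // "save state" to RAM
    for (int i = 0; i < frames_ahead - 1; ++i)
        core.run_frame(input);                // extra hidden frames
    const int shown = core.run_frame(input);  // the frame we want to see
    core = saved;                             // "load state"
    return shown;
}
```

With inputs 0, 1, 0 the plain toy core draws 0, 0, 1, while `run_ahead_frame` with one frame of run-ahead draws 0, 1, 1: the press becomes visible on the frame it was made, and the core itself still only advances one real frame per call.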

In Two-Instance mode, it does this:
  • Primary core does Audio only, then saves state
  • Secondary core loads state, runs frames ahead discarding audio and video, then runs a frame with video only.
  • For performance reasons, it only resyncs the secondary core when input is dirty, otherwise it keeps running additional frames on the secondary core while the input is clean.
Why bother with Two-Instance mode at all? Many of the cores do not leave audio emulation in a clean state after loading state, so you would get buzzing. Using Two-Instance mode makes the primary core not do any load states and avoids that.
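
A rough sketch of the Two-Instance idea, again with a hypothetical toy core rather than real libretro calls. `TwoInstanceRunAhead` and its members are made-up names, audio is not modelled, and copying the struct stands in for "primary saves state, secondary loads state". The `resyncs` counter is only there to show how often the expensive path is taken.

```cpp
// Hypothetical toy core with one frame of internal lag (see comments).
struct ToyCore {
    int position = 0;      // state visible on screen
    int pending_move = 0;  // input latched last frame, not yet visible

    int run_frame(int input) {
        position += pending_move;
        pending_move = input;
        return position;
    }
};

// Sketch of two-instance run-ahead under the assumptions stated above.
struct TwoInstanceRunAhead {
    ToyCore primary;    // real timeline; would produce the audio
    ToyCore secondary;  // lives in the future; produces the video
    int frames_ahead = 1;
    int last_input = 0;
    bool secondary_synced = false;
    int resyncs = 0;    // how many times the secondary had to reload state

    int run_frame(int input) {
        primary.run_frame(input);  // primary never loads a state

        // Input is "dirty" when it differs from the previous frame's input.
        const bool dirty = !secondary_synced || input != last_input;
        last_input = input;

        if (dirty) {
            secondary = primary;   // save on primary, load on secondary
            for (int i = 0; i < frames_ahead - 1; ++i)
                secondary.run_frame(input);  // hidden frames
            secondary_synced = true;
            ++resyncs;
        }
        // While input is clean the secondary just keeps running ahead.
        return secondary.run_frame(input);   // video-only frame
    }
};
```

For the input sequence 0, 1, 1, 0 this draws 0, 1, 2, 2 and resyncs three times (first frame, press, release). The held frame in the middle costs one extra emulated frame instead of a state load plus replay.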

In Single-Instance mode, it is possible to improve performance further by running ahead without loading state while input is clean, but I am not currently doing that. I'd imagine there'd be issues if calling the "run a frame" function left you in a state further along than a single frame.

I'm also not doing any speculative inputs at all.



Dwedit has also posted a slightly more technical summary of how it works, so I hope it's OK if I quote him again, this time from here: https://forums.libretro.com/t/input-lag-...-lag/15075

Input Lag Compensation to compensate for game’s internal lag

For a game, there are several critical timings: The time of the joystick input, the time the game has decided what to display on the screen, and the time the image actually appears on the screen.

For the Atari 2600, there is zero input lag. Input is sampled and the game logic runs during vblank time, then the screen image is generated as the screen renders using the new information about the game’s state.

But for the NES and later 2D consoles, the game logic runs during render time. So as the game is reacting to your input, it is displaying the previous frame on your TV, and you won’t see the effects of your joystick input until the next frame. This is one frame of internal lag.

Sonic the Hedgehog games on the Genesis happen to have two frames of internal lag.


 Anyway, the “gens-rerecording” emulator has an example of how to deal with internal lag, as seen in the basic update_frame code:
 
Code:
int Update_Frame_Adjusted()
{
    if(disableVideoLatencyCompensationCount)
        disableVideoLatencyCompensationCount--;

    if(!IsVideoLatencyCompensationOn())
    {
        // normal update
        return Update_Frame();
    }
    else
    {
        // update, and render the result that's some number of frames in the (emulated) future
        // typically the video takes 2 frames to catch up with where the game really is,
        // so setting VideoLatencyCompensation to 2 can make the input more responsive
        //
        // in a way this should actually make the emulation more accurate, because
        // the delay from your computer hardware stacks with the delay from the emulated hardware,
        // so eliminating some of that delay should make it feel closer to the real system

        disableSound2 = true;
        int retval = Update_Frame_Fast();
        Update_RAM_Search();
        disableRamSearchUpdate = true;
        Save_State_To_Buffer(State_Buffer);
        for(int i = 0; i < VideoLatencyCompensation-1; i++)
            Update_Frame_Fast();
        disableSound2 = false;
        Update_Frame();
        disableRamSearchUpdate = false;
        Load_State_From_Buffer(State_Buffer);
        return retval;
    }
}
So what it does is this: (to compensate for N frames of latency)
  • Run one frame quickly without outputting video or sound
  • Save State
  • Run N - 1 frames quickly without outputting video or sound
  • Run one frame, outputting video and sound
  • Load State
#4
Here’s another article from a few years ago illustrating the basic concept:

http://filthypants.blogspot.com/2016/03/...cy-in.html

Using Rollback to Hide Latency in Emulators

Posted by Hunter K. at 8:05 AM Wednesday, March 9, 2016

This is something I thought about a pretty long time ago but recently decided to flowchart out in case anyone wanted to implement it.

The way rollback-based netplay (e.g., GGPO and RetroArch's current netplay implementation) works with emulators is that the two players exchange input states (which are small enough to pass back and forth within the time of a single frame of emulation) and then when they diverge, you roll back to a previous known-good time where the input still agreed and then emulate the intervening frames with the corrected input data to catch up (in RetroArch, these are called 'lag frames').

I had an idea for hiding a frame of latency that would work similar to that concept but in a single-player situation:

Image

The way it works is whenever the player's input changes, you roll back one frame and apply the new inputs retroactively and then emulate two frames to catch back up. This makes your inputs go into effect one frame before you actually pressed the button(s). This wouldn't result in a rollback loop because, even though we feel like we press a lot of buttons all the time when we play a game, most of the time (particularly from the emulator's point of view) we're really just holding a button or two.
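
As a rough illustration of that flowchart, here is a sketch using a hypothetical toy core with one frame of internal lag. `RollbackHider` and its members are made-up names, audio handling is left out, and the snapshot is a plain struct copy. `frames_emulated` is only there to show the CPU cost.

```cpp
// Hypothetical toy core with one frame of internal lag (see comments).
struct ToyCore {
    int position = 0;      // state visible on screen
    int pending_move = 0;  // input latched last frame, not yet visible

    int run_frame(int input) {
        position += pending_move;
        pending_move = input;
        return position;
    }
};

// Roll back one frame whenever the input changes (assumed sketch).
struct RollbackHider {
    ToyCore core;
    ToyCore before_last_frame;  // snapshot taken before the previous frame
    int last_input = 0;
    int frames_emulated = 0;

    int run_frame(int input) {
        if (input != last_input) {
            core = before_last_frame;  // roll back one frame
            core.run_frame(input);     // replay it with the new input
            ++frames_emulated;
        }
        before_last_frame = core;      // snapshot for a later rollback
        last_input = input;
        ++frames_emulated;
        return core.run_frame(input);  // catch up and draw this frame
    }
};
```

With inputs 0, 1, 1, 0 the sketch draws 0, 1, 2, 2 where the plain core would draw 0, 0, 1, 2. It emulates six frames for four displayed ones, since only the two frames where the input changed pay for a replay.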

You would want the audio to run one frame behind the video, so your audio buffer wouldn't be constantly emptying and skipping around every time you roll back. Instead, it would fill back up with the next frame's audio during catch-up. Despite running a frame behind the video, our brains would unconsciously sync the audio with the video, as they are very forgiving about that sort of thing (this is a known effect, read http://blogs.scientificamerican.com/obse...erception/ ).

One drawback is that you would frequently lose a single frame of animation but thankfully our brains are quite good at papering over that sort of thing, as well. The other drawback to this method is that it would require emulating two full frames in the space of a single frame, so the CPU requirements for any emulator using it would be doubled.

Again, this method would hide a single frame of latency. While most PC setups have significantly more than one frame of latency to contend with, every little bit helps.

UPDATE (4/12/2016): I was talking to letoram, author of ArcanFE, the other day and he mentioned that he had implemented something just like this recently and that the lagging audio made everything seem *more* latent, presumably due to the everything-syncs-to-the-slowest effect mentioned in the above link. I didn't ask him whether muting/disabling audio helped with this or not.

UPDATE (4/30/2018): A few months ago, Dwedit--the author of PocketNES and Goomba Color emulators for GBA--mentioned having a similar idea and after some brief planning and discussion, he whipped up a working model and then refined it into a functional feature for RetroArch. It's included in the 1.7.2 release under the name "runahead". Some differences in his approach to the one I described include keeping audio in sync (rather than lagging behind) and having the option of running a second instance of the core to hop over to in the case of audio issues. To avoid weird rollback effects (see: the 'lose a single frame of animation' thing in my original post), Dwedit astutely recognized that as long as you keep the number of rollback frames below the number of internal lag frames in the game (that is, the number of frames it takes for your input to cause a reaction on the screen), the effect is completely invisible to the user.
#5
FYI, Run-Ahead has now successfully been implemented in bsnes standalone as well.

Here is a video of the bsnes developer explaining how Run-Ahead works in bsnes:

https://www.youtube.com/watch?v=1AvOa8yt6Vc



You can download the latest version of bsnes with run-ahead support at https://bsnes.byuu.org

You can access the source code for bsnes at https://github.com/byuu/bsnes
#6
This is definitely a feature I'm interested in implementing in RetroPlayer.

I think I could even do better.

Run-ahead works by predicting the next controller state, and re-rolling the frame on a misprediction. Re-rolling is an expensive operation, so you can see how the better your prediction, the better your performance.

You can reframe run-ahead as a signal processing algorithm; treat input as a signal, and predict the next value. There are several signal processing algorithms available to perform prediction:

Zero-order Hold

A Zero-order hold (ZOH) is what RetroArch uses; simply assume the current value will hold until the next frame unchanged. While trivial, every button press or axis motion will cause a re-roll.

Fourier Transform

A Fourier transform (FT) can be used to predict a button being tapped at a constant rate. When about half the period elapses, you predict the *opposite* button value.

First-order Hold

A first-order hold (FOH) is like a ZOH, but uses the derivative (speed) of an axis. If the axis is moving up, a ZOH would predict the same value, while a FOH would predict a higher value.
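
To make the difference between the two holds concrete, here is a small sketch. The function names are hypothetical, samples are plain integers, and axis clamping is ignored. It counts how many frames would need a re-roll for a given input history.

```cpp
#include <cstddef>
#include <vector>

// Zero-order hold: assume the value stays the same next frame.
int predict_zoh(int current) { return current; }

// First-order hold: extrapolate linearly from the last two samples.
int predict_foh(int previous, int current) {
    return current + (current - previous);
}

// Counts mispredictions (each one would cost a re-roll) over a history.
// With first_order set, FOH is used once two samples are available and
// ZOH is the fallback for the very first prediction.
int count_misses(const std::vector<int>& samples, bool first_order) {
    int misses = 0;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        const int predicted = (first_order && i >= 2)
            ? predict_foh(samples[i - 2], samples[i - 1])
            : predict_zoh(samples[i - 1]);
        if (predicted != samples[i])
            ++misses;
    }
    return misses;
}
```

For an axis ramping 0, 10, 20, 30 and then holding at 30, the ZOH mispredicts three times while the FOH mispredicts twice (once at the start of the ramp, once when it stops). For a digital button that is simply held, both behave the same.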

Recurrent Neural Network

A recurrent neural network (RNN) is used to predict arbitrary sequences. This is how Google Translate works; given a sequence of foreign words and the current native words, predict the next native word.

The magic is that you don't need a meta-heuristic to choose between ZOH, FT, FOH or what-have-you. You simply train the RNN on all controller input, and it learns to identify hidden patterns to predict future controller input.

An RNN trained on all controller input would probably perform poorly, with lots of mispredictions. To increase accuracy, you could condition input on the current game (strategy games will look a lot different than racing games). You could condition input on the current player (my button presses will look different from yours). As you add more and more data, the RNN gets better and better.

Convolutional Neural Network

A convolutional neural network (CNN) learns to predict values based on raw pixel data. You could use a CNN to reduce the pixels in the frame (high-dimensional data) to a small number of hidden classes (low-dimensional data). What do you do with these classes? Feed them to the RNN! You basically create an AI that *watches* you play, and based on your play history, guesses what you'll do next.

You might have heard of something called a Tesla. This is how self-driving works. A CNN watches raw pixels and radar, and trains an RNN based on user input (the steering wheel angle and throttle/brake position). They call this "shadow mode". Run-ahead prediction could work exactly the same way; a video-game playing AI operates in shadow mode, watching pixels and past input, and predicting future input for run-ahead.

Of course, an interesting by-product of Tesla's shadow mode is full self driving. You could switch the video game AI from prediction mode to play mode, and have a general gaming intelligence (GGI) for entertaining offline multiplayer.
#7
It would also be nice if Run-Ahead 'metadata profiles' for each game could be shared in the community via a shared repository, using an addon or something (that is, the possibility to share profile data for all games and the best runahead setting for each, instead of everyone having to test every game personally with frame-advancing).

By the way, for credit reference; Dwedit replied with this to byuu's video on YouTube about Run-Ahead being added in bsnes standalone: 

"The first time I saw RunAhead was in Gens-Rerecording, a Sega Genesis emulator used by the TASvideos community for creating Tool-Assisted speedruns. According to Github logs, Nitsuja committed the feature on July 30, 2008. The motivation for adding the feature was to have Sonic respond within one frame advance while creating TASes of Sonic the Hedgehog, and not to try to reduce input lag for real time play. I brought this feature to the attention of the Libretro message board in March 2018, then later implemented RunAhead into RetroArch."

Link: https://www.youtube.com/watch?v=1AvOa8yt6Vc
#8
FYI, on top of its existing "runahead" feature, RetroArch 1.9.13 will add a brand-new option called "Automatic Frame Delay", another method aimed at further latency reduction:

https://www.libretro.com/index.php/retro...ame-delay/

The aim of this option is to lessen the burden of creating core-specific and also game-specific overrides for Frame Delay, and instead go for “set it and forget it”. With 0 value, the starting point will be half frame time. It is fairly simple and it is based on frame time averaging with a few different thresholds which filter out false positives and provide quick stabilization with bigger steps depending on the need. It only goes down, it is temporary, and it will reset on core unload and on SET_SYSTEM_AV_INFO. The necessary values are shown in video statistics and all actions are logged in info level.
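
The description above could be sketched roughly as follows. This is not RetroArch's implementation: `AutoFrameDelay`, the tolerance, the step sizes and the millisecond units are all assumptions chosen for illustration. It only shows the general shape: start at half the frame time when the setting is 0, average recent frame times, ignore small overshoots, and only ever step the delay down.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical thresholds, not taken from RetroArch.
constexpr double kToleranceMs = 0.5;     // overshoot ignored as noise
constexpr double kBigOvershootMs = 2.0;  // overshoot that earns a bigger step

struct AutoFrameDelay {
    double frame_time_ms;  // target frame time, e.g. about 16.67 at 60 Hz
    int delay_ms;          // current effective frame delay

    // A configured value of 0 means "start at half the frame time".
    AutoFrameDelay(double frame_time, int configured_delay)
        : frame_time_ms(frame_time),
          delay_ms(configured_delay > 0
                       ? configured_delay
                       : static_cast<int>(frame_time / 2.0)) {}

    // Feed a window of recently measured frame times (in milliseconds).
    void update(const std::vector<double>& recent_ms) {
        if (recent_ms.empty() || delay_ms == 0)
            return;
        const double avg =
            std::accumulate(recent_ms.begin(), recent_ms.end(), 0.0) /
            static_cast<double>(recent_ms.size());
        const double overshoot = avg - frame_time_ms;
        if (overshoot <= kToleranceMs)
            return;                                // filter false positives
        const int step = overshoot > kBigOvershootMs ? 2 : 1;
        delay_ms = std::max(0, delay_ms - step);   // the delay only goes down
    }
};
```

In this sketch, resetting on core unload or on SET_SYSTEM_AV_INFO would simply mean constructing a fresh `AutoFrameDelay`.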

https://twitter.com/libretro/status/1457...tro.com%2F

PS: Missed it back in 2020 but noticed now there was apparently some discussion about "Runahead for RetroPlayer" as a possible GSoC 2020 project, so perhaps this could be something for GSoC 2022 and/or 2023?
 
https://forum.kodi.tv/showthread.php?tid=352135

https://kodi.wiki/view/Google_Summer_of_...yer_engine
#9
Maybe a cool GSoC 2020 project!
#10
@garbear Maybe you could suggest this again as a GSoC 2023 proposal idea up for discussion?

https://kodi.wiki/view/Google_Summer_of_...yer_engine