> That was a cold-sweat moment for me: after all of my harping about latency and responsiveness, I almost shipped a title with a completely unnecessary frame of latency.
In this era of 3-5 frame latency being the norm (at least on e.g. the Nintendo Switch), I really appreciate a game developer having anxiety over a single frame.
You're over-crediting Carmack and under-crediting current game devs. 3-5 frames might be current end-to-end latency, but that's not what Carmack is talking about. He's just talking about the game loop latency. Even at ~4 frames of end-to-end latency, he'd be talking about an easily avoided 20% regression. That's still huge.
I believe both the PS5 and whatever nonsense string of Xs, numbers, and descriptors MS named this gen's console can do 144Hz output. I don't know how many games take advantage of that or whether that refresh rate is common on TVs.
60 FPS isn't even guaranteed on the PS5 Pro. Most graphically demanding titles still target 30 FPS on consoles, and any game that holds 60 FPS consistently is worth noting.
What they said is true. There are some games with 120 FPS modes on PS5 and Series X, maybe even Series S. That doesn't mean every game (or even most) is like that, just that the hardware supports it. At the end of the day you can't stop developers from targeting whatever framerate they want.
You've seen games running at 120Hz and at 60Hz. The difference is obvious, isn't it? The difference between 24Hz and 60Hz is certainly obvious: that's the visual difference between movies and TV sitcoms.
I can type about 90 words per minute on QWERTY, which is about 8 keystrokes per second. That means that the average interval between keystrokes is about 120 milliseconds, already significantly less than my 200-millisecond reaction time, and many keystrokes are closer together than that—but I rarely make typographical errors. Fast typists can hit 150 words per minute. Performing musicians consistently nail note timing to within about 40 milliseconds. So it turns out that people do routinely time their physical movements a lot more precisely than their reaction time. Their jitter is much lower than their latency, a phenomenon you are surely familiar with in other contexts, such as netcode for games.
If someone's latency is 200 milliseconds but their jitter (measured as standard deviation) is 10 milliseconds, then reducing the frame latency from a worst-case 16.7 milliseconds (or 33.3 milliseconds in your 30Hz example) to a worst-case 8.3 milliseconds, and the average case from 8.3 milliseconds to 4.2 milliseconds, knocks a whole 0.42 standard deviations off their latency. If they're playing against someone else with the same latency, that 0.42σ advantage is very significant! I think they'll win almost 61% of the time, but I'm not sure of my statistics†.
> Latency matters! For very simple tasks, people can perceive latencies down to 2 ms or less. Moreover, increasing latency is not only noticeable to users, it causes users to execute simple tasks less accurately. If you want a visual demonstration of what latency looks like and you don’t have a super-fast old computer lying around, check out this MSR demo on touchscreen latency.
> The most commonly cited document on response time is the nielsen group[sic] article on response times, which claims that latncies[sic] below 100ms feel equivalent and perceived[sic] as instantaneous. One easy way to see that this is false is to go into your terminal and try sleep 0; echo "pong" vs. sleep 0.1; echo "test" (or for that matter, try playing an old game that doesn't have latency compensation, like quake 1, with 100 ms ping, or even 30 ms ping, or try typing in a terminal with 30 ms ping). For more info on this and other latency fallacies, see this document on common misconceptions about latency.
(The original contains several links substantiating those claims.)
† First I tried sum(rnorm(100000) < rnorm(100000) + 0.42)/1000, which comes to about 61.7 (%). But it's not a consistent 0.42σ of latency being added; it's a random latency of up to 0.83σ, so I tried sum(rnorm(100000) < rnorm(100000) + runif(100000, max=0.83))/1000, which gave the same result. But that's not taking into account that actually both players have latency, so if we model random latency of up to a frame for the 60Hz player with sum(rnorm(100000) + runif(100000, max=1.67) > rnorm(100000) + runif(100000, max=0.83))/1000, we get more like a 60.8% chance that the 120fps player will out-twitch them. I'm sure someone who actually knows statistics can tell me the correct way to model this to get the right answer in closed form, but I'm not sure I could tell the correct closed-form formula from an incorrect one, so I resorted to brute force.
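For what it's worth, the brute force can be cross-checked against a normal approximation: the difference of the two simulated response times is approximately normal, so the win probability follows from the normal CDF. A Python sketch (porting the footnote's final R simulation; the 0.83σ and 1.67σ per-frame latencies are taken from the footnote, everything is in units of the player's jitter σ):

```python
import math
import random

random.seed(0)
N = 200_000

# Monte Carlo, as in the footnote: each player's response time is
# N(0, 1) jitter plus a uniform frame-latency term (up to 1.67 sigma
# for the 60 Hz player, up to 0.83 sigma for the 120 Hz player).
wins = sum(
    random.gauss(0, 1) + random.uniform(0, 1.67)
    > random.gauss(0, 1) + random.uniform(0, 0.83)
    for _ in range(N)
)
monte_carlo = wins / N

# Normal approximation: the difference of the two response times has
# mean (1.67 - 0.83) / 2 = 0.42 and variance
# 1 + 1 + 1.67^2/12 + 0.83^2/12 ~= 2.29.
mu = (1.67 - 0.83) / 2
var = 2 + 1.67**2 / 12 + 0.83**2 / 12
closed_form = 0.5 * (1 + math.erf(mu / math.sqrt(2 * var)))

print(f"Monte Carlo:   {monte_carlo:.3f}")   # both land near 0.61,
print(f"Normal approx: {closed_form:.3f}")   # matching the ~60.8% above
```

The uniform terms make the true distribution only approximately normal, but at these sizes the approximation agrees with the simulation to three decimal places.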
> You've seen games running at 120Hz and at 60Hz. The difference is obvious, isn't it?
Honestly, I have not. I'm not much of a gamer, even though I used to be a game developer.
Certainly the difference between 30Hz and 60Hz is noticeable.
Maybe this is just because I'm old school but if it were me, I would absolutely prioritize low latency over high frame rate. When you played an early console game, the controls felt like they were concretely wired to the character on screen in a way that most games I play today lack. There's a really annoying spongey-ness to how games feel that I attribute largely to latency.
I don't really give a shit about fancy graphics and animation (I prefer 2D games). But I want the controls to feel solid and snappy.
I also make electronic music and it's the same thing there. Making music on a computer is wonderful and powerful in many ways, but it doesn't have the same immediacy as pushing a button on a hardware synth (well, on most hardware synths).
Oh! I assumed that because you were a famous game developer you would hang out with gamers who would proudly show off their 120Hz monitor setups.
I agree that low latency is more important than high frame rate, and I agree about the snappiness. But low jitter is even more important for that than low latency, and a sufficiently low frame rate imposes a minimum of jitter.
Music is even less tolerant of latency, and PCM measures its jitter tolerance in single-digit microseconds.
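For a rough sense of where software-synth latency comes from, here is a back-of-envelope sketch of the delay an audio buffer adds (the buffer sizes and 44.1 kHz rate are illustrative assumptions, not figures from any particular DAW):

```python
# Latency represented by one audio buffer of a given size.
def buffer_latency_ms(buffer_samples: int, sample_rate_hz: int = 44_100) -> float:
    """Milliseconds of audio held in one buffer."""
    return 1000 * buffer_samples / sample_rate_hz

print(f"64 samples:   {buffer_latency_ms(64):.1f} ms")
print(f"256 samples:  {buffer_latency_ms(256):.1f} ms")
print(f"1024 samples: {buffer_latency_ms(1024):.1f} ms")
```

A 1024-sample buffer alone is over 23 ms, before the OS, driver, and synth add their own shares; a hardware synth skips nearly all of that.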
I've heard that a good reaction time is around 200 ms, some experiments seem to confirm this figure [1]. At 60Hz, a frame is displayed every 17 ms.
So it would take a 12-frame animation and a trained gamer for a couple of frames to make a difference (e.g. pushing the right button before the animation ends and the opponent's action takes effect).
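To make that arithmetic explicit (just restating the figures above):

```python
# How many 60 Hz frames fit inside a ~200 ms human reaction window?
REACTION_MS = 200
FRAME_MS = 1000 / 60  # ~16.7 ms per frame at 60 Hz

frames_in_window = REACTION_MS / FRAME_MS
print(f"{frames_in_window:.0f} frames")  # 12 frames
```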
Reaction time is completely different from the input latency Carmack is worrying about in his scenario. Imagine if you thought "I'm going to move my arm", and 200ms later your arm actually moved. Apply the same to a first-person shooter --- imagine you nudge your mouse slightly, and 200ms later you get some movement on screen. That is ___hugely___ noticeable.
this is for a stylus, but people can detect input latency as low as 1ms (possibly lower)
with VR, they use the term "motion to photon latency", and if it's over ~20ms, people start getting dizzy. at 200ms, nobody is going to be keeping their lunch down
google noticed people making fewer searches if they delayed the result by 100ms
edit: if you want an easy demo, open up vim/nano over ssh, and type something. then try it locally
I'm not sure this is the right way to look at it. I can't find stats right now, but I recall reading about top players making frame-perfect moves in games like Smash Bros. Melee and Rocket League.
The mistake with focusing on reaction time is that humans can anticipate actions and can perform complex sequences of actions pretty quickly (we have two hands and 10 fingers). So someone playing one of those "test your reaction time" games might only score around 200ms. But someone playing a musical instrument can still play a 64th note at 120BPM, about 31ms apart.
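The note-interval claim checks out; a quick sketch of the arithmetic:

```python
# Interval between consecutive 64th notes at 120 BPM.
BPM = 120
quarter_note_ms = 60_000 / BPM          # 500 ms per quarter note at 120 BPM
sixty_fourth_ms = quarter_note_ms / 16  # a 64th note is 1/16 of a quarter
print(f"{sixty_fourth_ms:.2f} ms")      # 31.25 ms
```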
Imagine playing a drum that added anywhere from 0 to 5 extra frames at 60FPS between striking the head and producing a sound. Most people would notice that kind of delay, even if they can't "react" that quickly.
In games, frame delay translates to having to hold down a key (or wait before pressing the next one) for longer than is strictly necessary in order to produce an effect. Since fighting games are all about key sequences, the difference between 1 frame and 5 frames of delay per press is massive when you consider key combinations might be sequences of up to 5 key presses. At 60 FPS (about 16.7ms a frame), 5 frames of delay x five sequential key presses ≈ 417ms of added delay, versus 1 frame x 5 seq. key presses ≈ 83ms.
There's a massive difference between a complex move carrying roughly 0.4s of input delay and roughly 0.08s.
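A minimal sketch of that combo arithmetic, assuming 60 FPS (so about 16.7 ms per frame) and a fixed input delay on every press:

```python
FRAME_MS = 1000 / 60  # ~16.7 ms per frame at 60 FPS

def combo_delay_ms(frames_of_delay: int, presses: int = 5) -> float:
    """Total input delay accumulated over a sequential combo, in ms."""
    return frames_of_delay * presses * FRAME_MS

print(f"5 frames/press: {combo_delay_ms(5):.0f} ms added")  # 417 ms
print(f"1 frame/press:  {combo_delay_ms(1):.0f} ms added")  # 83 ms
```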
Another example is music (and, relatedly, rhythm games). With memorized music you have maximal anticipation of actions, and a regular rhythm only amplifies that anticipation. Musicians can be very consistent at timing (especially the rhythm section), and very little latency or jitter can throw that off.
It's something you can get used to. A concert pianist can have 2-3 notes chasing each other down his/her arm. Myelinated nerve fibres are fast (the physiology is really interesting), but they still have limits. Latency is more important for organists. Firstly, some instruments can have a delay of up to half a second for some ranks (a rank is a set of pipes - there can be one or more ranks per stop). Secondly, in any church of appreciable size, there will be a significant delay between when you press a key and when you hear the congregation singing the note. In fact, the standard advice is to ignore the congregation, otherwise you can end up slowing down in reaction to the latency.
So for the second problem, you just ignore the auditory feedback and play "open loop". For the first problem, you may have to play notes on the slow rank slightly early, although this is only practical if you separate them off on a different keyboard. Otherwise, you can only use that rank for slow music, and make use of the note increasing in volume and changing in tone as that rank comes in.
Frame-perfect moves are exceedingly common in top-level play. Just watch any video about the latest speedruns.
The thing with latency is that it needs to be consistent. If your latency varies between 3 and 5 frames, you've blown it, because you can't guarantee the same experience on every button press. If you always have exactly 3 frames of latency, with modern screens, analog controls, and game design aware of those limitations, that's much better. Look at modern games like Celeste, which uses "coyote time" to account for all the latency of our modern hardware.
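For readers unfamiliar with the term, here is a toy sketch of what coyote time does (this is illustrative only, not Celeste's actual code; the 6-frame grace window is an assumption): the player may still jump for a few frames after walking off a ledge, which absorbs both human timing error and input latency.

```python
COYOTE_FRAMES = 6  # grace window in frames (tunable; purely an assumption)

class Player:
    def __init__(self):
        self.frames_since_grounded = 0

    def update(self, on_ground: bool):
        # Called once per frame by the (hypothetical) game loop.
        if on_ground:
            self.frames_since_grounded = 0
        else:
            self.frames_since_grounded += 1

    def can_jump(self) -> bool:
        # Jump is allowed while grounded OR within the grace window
        # after leaving the ground.
        return self.frames_since_grounded <= COYOTE_FRAMES

p = Player()
p.update(on_ground=True)
for _ in range(3):           # three frames after stepping off the ledge
    p.update(on_ground=False)
print(p.can_jump())          # True: still inside the grace window
```

The effect is that a jump pressed a few frames "late" (from the engine's point of view) still registers, which is exactly the kind of design-level compensation for latency the comment describes.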