The debate surrounding the hot hand in the NBA has been ongoing for many years. However, much of the previous work on this theme has focused on only the very next sequential shot attempt, often for a select few players. This work looks in more detail at the effect of a made or missed shot on the next series of shots over a two-year span, with time between shots shown to be a critical factor in the analysis. Multi-year streakiness is also analyzed, and all indications are that players cannot sustain their good (or bad) fortune from year to year.
Deep Dive into Time-based analysis of the NBA hot hand fallacy.
The debate surrounding the hot hand effect (or fallacy) [1][2][3] has been ongoing for decades, and has lately received renewed attention. A recent article in the New York Times implied once again that the hot hand is real, reigniting the debate [4]. More recently, it was shown that repetition can influence percentages in the case of NBA free throws [5], and a hot hand effect was also reported in Major League Baseball [6].
Over the years, many articles have been written with the intent of proving the hot hand in the NBA. Typically, when trying to prove a theory or hypothesis, the default assumption is the null hypothesis, that is, that the effect does not exist. In the case of field goals, there has been no clear evidence for the hot hand effect, at least none that is widely accepted. It has even been shown that there exists a counter-effect to the hot hand [7]. As will be confirmed in this work, the next shot a shooter takes is less likely to go in if he made the previous shot than if he missed it.
Most of the aforementioned work has focused on only the very next sequential shot (as opposed to a series of several shots), regardless of the time elapsed between shots. This work looks in more detail at the effect of a made or missed shot on the next sequence of shots, with time between shots also shown to be a critical factor in the analysis.
By definition, an autocorrelation is a cross-correlation of a signal with itself. To autocorrelate a time series, start with two identical copies of the series and shift one of them by some number of positions (the shot lag). Then multiply each element of one copy by the corresponding element of the shifted copy and sum the products [8]. Finally, divide the result by a normalization term. In this case, the signal is just a sequence of ones (for made shots) and zeros (for missed shots), with appropriate filters applied.
Figure 1 shows the autocorrelogram of shot sequences from 2014-2016, averaged over all players and both seasons. The dataset was extracted from play-by-play data on basketballreference.com. Essentially, each point on the blue curve in Figure 1 represents the correlation between any given shot (either made or missed) and the nth subsequent attempt. The vertical axis is closely related to the correlation (often denoted R), as if we had two separate variables x and y. For reference, with a shot lag of 0, each point is 100% correlated (R = 1) because the comparison is made between two identical data sets (x is equal to y). With a shot lag of one, x is the original sequence of made or missed shots, and y is the sequence of “next” shots. Would these two variables be correlated? Figure 1 shows that with a shot lag of one, they are actually negatively correlated. That is, after a make, the player is slightly more likely to miss the next shot.
One can easily notice this in the curve, which dips sharply from one to below zero for lags of one and two. For a truly random signal that is sufficiently long, the correlation should drop from unity to zero at the first lag and stay there for all subsequent lags. But even at lags of 4-5, and again at 7-9, there is some slight negative correlation.
In no subsequent shot in Figure 1 is the result strongly positive.
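The shift, multiply, sum, and normalize procedure described earlier can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `shot_autocorrelation` is hypothetical, and the choices to subtract the mean before multiplying and to normalize by the lag-0 sum are assumptions, since the text does not specify the normalization term or the filters. It also handles a single sequence only, with no averaging across players or seasons.

```python
import numpy as np

def shot_autocorrelation(shots, max_lag):
    """Normalized autocorrelation of a 0/1 make/miss sequence for lags 0..max_lag.

    Assumptions (not stated in the source): the sequence is mean-centered,
    and the normalization term is the lag-0 sum of products.
    """
    x = np.asarray(shots, dtype=float)
    x = x - x.mean()                 # center so an uncorrelated sequence gives ~0
    denom = np.dot(x, x)             # normalization term (lag-0 value)
    if denom == 0:
        raise ValueError("sequence must contain both makes and misses")
    n = len(x)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the sequence length")
    # For each lag: shift one copy, multiply element-wise, sum, normalize.
    return np.array([np.dot(x[:n - lag], x[lag:]) / denom
                     for lag in range(max_lag + 1)])

# A perfectly alternating shooter is strongly anti-correlated at lag 1.
alternating = shot_autocorrelation([1, 0] * 50, max_lag=3)

# A simulated "coin-flip" shooter (45% make rate, chosen arbitrarily)
# should sit near zero at every lag after the first point.
rng = np.random.default_rng(0)
random_shots = (rng.random(100_000) < 0.45).astype(int)
baseline = shot_autocorrelation(random_shots, max_lag=10)
```

Comparing a real shot sequence against the simulated baseline gives a rough sense of whether a dip at lag one or two is larger than what chance alone would produce.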
Next, let’s look at a plot very similar to Figure 1, except with a minutes filter applied. For example, the blue curve in Figure 2 keeps only sequences in which all shots were taken at most 1 minute from one another (in terms of game time). One can see that the autocorrelogram depends strongly on the time between shots. The more time that passes, the more the hot hand “counter-effect” is diminished. Next time you see someone make a shot, keep a look out for a quick subsequent shot and take note of how many times that shot is a make (but don’t forget to count the misses). Figure 2 essentially shows that if a player has the opportunity to shoot quickly after a make, he will be more likely to miss (relative to the same shot attempt with more time elapsed). But another natural question arises: how much quicker is the player going to take his next shot after a make, versus a miss?
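To make the time filter concrete, here is a simplified sketch that computes the lag-one correlation using only consecutive shot pairs whose game-time gap is at most a chosen number of minutes. It is an assumption-laden stand-in for the filtering behind Figure 2: the source filters whole sequences rather than pairs, and the function name `lag1_correlation_within`, the input layout, and the toy data are all hypothetical.

```python
import numpy as np

def lag1_correlation_within(made, t_minutes, max_gap):
    """Pearson correlation between consecutive shot outcomes, keeping only
    pairs taken at most `max_gap` minutes of game time apart.

    `made` is a 0/1 sequence and `t_minutes` the elapsed game time of each
    shot, both in chronological order (hypothetical input format).
    """
    made = np.asarray(made, dtype=float)
    t = np.asarray(t_minutes, dtype=float)
    gap = np.diff(t)                          # time between adjacent shots
    keep = (gap >= 0) & (gap <= max_gap)      # the time filter
    first, nxt = made[:-1][keep], made[1:][keep]
    # Correlation is undefined with too few pairs or no variation.
    if first.size < 2 or first.std() == 0 or nxt.std() == 0:
        return float("nan")
    return float(np.corrcoef(first, nxt)[0, 1])

# Toy data: quick follow-up shots always flip the previous outcome.
made = [1, 0, 0, 1, 1, 0, 0, 1]
times = [0.0, 0.5, 5.0, 5.5, 10.0, 10.5, 15.0, 15.5]

r_quick = lag1_correlation_within(made, times, max_gap=1.0)   # quick pairs only
r_all = lag1_correlation_within(made, times, max_gap=10.0)    # every pair
```

In this toy example the quick-pair correlation is far more negative than the all-pairs value, which mirrors the qualitative pattern described for Figure 2 but says nothing about the real magnitudes.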
Next, let’s look at how likely a made shot is to result in a quicker next shot attempt. Figure 3 plots the average FG% (y-axis) of a given shot against the time (x-axis) between two adjacent shots, rounded to the nearest minute. This plot was generated by putting the entire 2-year sample into buckets of average time between shots. For example, for sequences with an average gap of 2 minutes between shots, the FG% of the original shot was 41%. Conversely, if a player took 8 minutes to take his next shot, the average FG% of the original shot (the one taken before the 8 minutes elapsed) was 33%. Only shots from beyond 15 feet were included, since that is where the hot hand argument is presumably most relevant. In essence, Figure 3 shows that FG% is anti-correlated with the time that passes until the next shot. This is an interesting finding and in a general sense agrees with the previous plot show
…(Full text truncated)…