A Bayesian Variable Selection Approach to Major League Baseball Hitting Metrics


Numerous statistics have been proposed for the measure of offensive ability in major league baseball. While some of these measures may offer moderate predictive power in certain situations, it is unclear which simple offensive metrics are the most reliable or consistent. We address this issue with a Bayesian hierarchical model for variable selection to capture which offensive metrics are most predictive within players across time. Our sophisticated methodology allows for full estimation of the posterior distributions for our parameters and automatically adjusts for multiple testing, providing a distinct advantage over alternative approaches. We implement our model on a set of 50 different offensive metrics and discuss our results in the context of comparison to other variable selection techniques. We find that 33/50 metrics demonstrate signal. However, these metrics are highly correlated with one another and related to traditional notions of performance (e.g., plate discipline, power, and ability to make contact).


💡 Research Summary

The paper tackles the long‑standing question of which offensive statistics in Major League Baseball (MLB) are reliable and consistent predictors of player performance. Dozens of metrics are in use, ranging from classic rate statistics such as batting average (AVG), on‑base percentage (OBP), and slugging percentage (SLG) to modern Statcast measurements like exit velocity, launch angle, and sprint speed, yet there has been no systematic, statistically rigorous assessment of their relative predictive power across time and players.

To address this gap, the authors assembled a comprehensive dataset covering the 2000‑2019 seasons for all regular‑season players. For each player‑season they extracted 50 offensive metrics, encompassing traditional counts, rate statistics, plate‑discipline measures, power indices, contact quality indicators, and advanced sensor‑derived variables. The dependent variable in the analysis is a season‑level performance outcome (e.g., Wins Above Replacement, WAR), which serves as the target to be explained by the metric set.

Methodologically, the study adopts a Bayesian hierarchical regression framework with an embedded variable‑selection mechanism. At the first level, player‑specific random effects and year‑specific temporal effects are modeled as normal distributions, capturing heterogeneity among players and secular trends in the league. At the second level, each metric’s regression coefficient is assigned a sparsity‑inducing prior, either a spike‑and‑slab prior or a Laplace (Bayesian Lasso) prior. Under the spike‑and‑slab prior, the posterior probability of inclusion (denoted pγ) can be read directly as the evidence that a metric carries genuine signal. The authors set a high inclusion threshold (typically pγ > 0.9) to declare a metric “signal‑bearing.”
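To make the posterior inclusion probability concrete, here is a minimal sketch for a single coefficient. It is a toy setup rather than the authors' model: it assumes an estimate `b_hat` with known standard error `se`, a point‑mass spike at zero, and a normal slab with standard deviation `slab_sd`. The function names and all numeric values are illustrative assumptions.

```python
import math


def normal_pdf(x, variance):
    """Density of a zero-mean normal with the given variance, evaluated at x."""
    return math.exp(-0.5 * x * x / variance) / math.sqrt(2.0 * math.pi * variance)


def inclusion_probability(b_hat, se, slab_sd, prior_inclusion=0.5):
    """Posterior probability that a coefficient comes from the slab, not the spike.

    Assumed toy setup: b_hat ~ N(beta, se^2), with beta = 0 under the spike
    and beta ~ N(0, slab_sd^2) under the slab.
    """
    marginal_spike = normal_pdf(b_hat, se ** 2)
    marginal_slab = normal_pdf(b_hat, se ** 2 + slab_sd ** 2)
    numerator = prior_inclusion * marginal_slab
    return numerator / (numerator + (1.0 - prior_inclusion) * marginal_spike)


# Hypothetical estimates: one at zero, one ten standard errors away from it.
weak = inclusion_probability(b_hat=0.0, se=0.1, slab_sd=1.0)
strong = inclusion_probability(b_hat=1.0, se=0.1, slab_sd=1.0)
print(f"estimate at zero:       p = {weak:.3f}")
print(f"estimate far from zero: p = {strong:.3f}")
```

With these made‑up inputs, the estimate at zero falls well below a 0.9 threshold and the distant estimate clears it. In a full hierarchical model the same quantity would typically be estimated from MCMC output rather than in closed form.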

The Bayesian approach offers two decisive advantages over conventional frequentist variable‑selection techniques. First, it automatically adjusts for the multiple‑testing problem: the posterior inclusion probabilities already incorporate the uncertainty associated with testing many correlated predictors, eliminating the need for post‑hoc corrections such as Bonferroni or false‑discovery‑rate adjustments. Second, the full posterior distribution provides not only a binary decision but also credible intervals for each coefficient, allowing researchers to assess both the magnitude and the uncertainty of each metric’s effect.
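One well‑known way a Bayesian selection model can adjust for multiplicity is to give all metrics a shared, unknown inclusion probability with its own prior. The summary does not say which prior the authors use, so the uniform Beta(1, 1) prior below is an assumption chosen purely for illustration. Under that assumption, the prior probability of one specific model with k of p metrics is 1 / ((p + 1) · C(p, k)), and the sketch shows how the prior odds of adding a metric shrink as the number of candidates grows.

```python
from math import comb


def prior_model_probability(k, p):
    """Prior probability of one specific model with k of p metrics included,
    assuming a shared inclusion probability with a uniform Beta(1, 1) prior."""
    return 1.0 / ((p + 1) * comb(p, k))


def prior_odds_of_adding_one(k, p):
    """Prior odds of a specific k-metric model versus a nested (k - 1)-metric model."""
    return prior_model_probability(k, p) / prior_model_probability(k - 1, p)


# The penalty for the first included metric grows with the candidate count p.
for p in (5, 50, 500):
    print(p, prior_odds_of_adding_one(1, p))
```

The more metrics are screened, the more evidence each one needs before the model includes it, which is the sense in which no separate Bonferroni‑style correction is applied afterwards.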

Results show that 33 out of the 50 examined metrics have 95% credible intervals that exclude zero, indicating robust predictive signal. These 33 metrics cluster naturally into three conceptual groups that mirror traditional baseball thinking: (1) plate‑discipline and on‑base skills (OBP, walk rate (BB%), and weighted OBP), (2) power and extra‑base capabilities (SLG, isolated power, home‑run rate, hard‑hit percentage), and (3) contact quality (BABIP, contact rate, line‑drive percentage). Notably, many of the newer Statcast variables, especially exit velocity and launch angle, correlate substantially with classic power metrics and contribute little independent information when considered jointly. Sprint speed and other baserunning measures show minimal association with the chosen performance outcome and are largely excluded by the model.
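The credible‑interval check described above can be sketched from posterior draws of a single coefficient. The draws here are simulated from normal distributions as a stand‑in for real MCMC output, and the function names are illustrative assumptions.

```python
import random
import statistics


def credible_interval_95(draws):
    """Equal-tailed 95% interval: the 2.5% and 97.5% quantiles of the draws."""
    cuts = statistics.quantiles(draws, n=40)  # cut points at 2.5%, 5%, ..., 97.5%
    return cuts[0], cuts[-1]


def excludes_zero(draws):
    """True when the whole 95% interval lies on one side of zero."""
    lower, upper = credible_interval_95(draws)
    return lower > 0.0 or upper < 0.0


# Simulated stand-ins for posterior draws (not data from the paper).
rng = random.Random(0)
signal_draws = [rng.gauss(0.5, 0.1) for _ in range(4000)]
noise_draws = [rng.gauss(0.0, 0.1) for _ in range(4000)]

print("signal-like coefficient excludes zero:", excludes_zero(signal_draws))
print("noise-like coefficient excludes zero:", excludes_zero(noise_draws))
```

An equal‑tailed interval is used for simplicity; a highest‑density interval is a common alternative and can give a different answer for skewed posteriors.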

Model fit is evaluated using the Widely Applicable Information Criterion (WAIC) and leave‑one‑out cross‑validation (LOO‑CV). The Bayesian hierarchical model with variable selection consistently outperforms forward selection, stepwise regression, and LASSO regression, achieving lower predictive error and better calibration. The sparsity‑inducing priors effectively mitigate multicollinearity, allowing the model to retain correlated yet informative variables without overfitting.
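For readers unfamiliar with WAIC, the following sketch computes it from a table of pointwise log‑likelihoods, using the standard definition: the log pointwise predictive density minus a penalty equal to the summed posterior variance of each observation's log‑likelihood. The input values are invented for illustration, and in practice a library such as ArviZ would normally be used instead of hand‑rolled code.

```python
import math
import statistics


def log_mean_exp(values):
    """Numerically stable log of the average of exp(values)."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values) / len(values))


def waic(log_lik):
    """WAIC on the deviance scale from log_lik[s][i] (posterior draw s, observation i)."""
    n_obs = len(log_lik[0])
    columns = [[row[i] for row in log_lik] for i in range(n_obs)]
    lppd = sum(log_mean_exp(col) for col in columns)          # fit term
    p_waic = sum(statistics.variance(col) for col in columns)  # complexity penalty
    return -2.0 * (lppd - p_waic)


# Invented log-likelihoods: three posterior draws, two observations.
example = [
    [-1.0, -2.0],
    [-1.2, -1.8],
    [-0.9, -2.1],
]
print(waic(example))
```

Lower values indicate better expected out‑of‑sample fit, so competing models would be compared by computing this quantity for each on the same observations.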

The discussion emphasizes that the identified 33 metrics, while numerous, are highly inter‑correlated (average pairwise r ≈ 0.68), reinforcing the notion that a player’s offensive value can be distilled into three core dimensions: getting on base, hitting for power, and making contact. The Bayesian posterior inclusion probabilities provide a nuanced “signal strength” measure that could be directly useful in scouting, contract negotiations, and strategic decision‑making.
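An average pairwise correlation of this kind can be computed as sketched below. The three metric columns are made‑up numbers, not values from the paper, and averaging signed rather than absolute correlations is an assumption, since the summary does not specify which was used.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_pairwise_r(columns):
    """Mean Pearson r over every unordered pair of metric columns."""
    pairs = list(combinations(columns.values(), 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Hypothetical player-season values for three metrics.
metrics = {
    "obp": [0.310, 0.345, 0.360, 0.330, 0.390],
    "slg": [0.400, 0.450, 0.520, 0.410, 0.560],
    "iso": [0.140, 0.160, 0.210, 0.130, 0.230],
}
print(round(average_pairwise_r(metrics), 2))
```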

Limitations are acknowledged. The analysis relies on season‑averaged data, which smooths over intra‑season variability and game‑by‑game dynamics. The choice of prior hyperparameters, while guided by sensitivity analyses, can still influence inclusion probabilities, especially for metrics with modest effect sizes. Future work is proposed to incorporate game‑level time series models, explore non‑linear relationships via Bayesian additive regression trees (BART) or Gaussian processes, and extend the methodology to other professional leagues (e.g., NPB, KBO) for cross‑league validation.

In conclusion, the study demonstrates that a Bayesian hierarchical variable‑selection framework is a powerful tool for evaluating the reliability of MLB offensive metrics. It confirms that a substantial majority (33/50) of the metrics contain genuine predictive signal, but these signals are largely redundant and map onto the traditional concepts of plate discipline, power, and contact. The approach not only clarifies which statistics are most informative but also offers a principled, probabilistic basis for incorporating metric uncertainty into baseball analytics and decision‑making.

