There is a long-standing debate over how to objectively compare the career achievements of professional athletes from different historical eras. Developing an objective approach will be of particular importance over the next decade as Major League Baseball (MLB) players from the "steroids era" become eligible for Hall of Fame induction. Here we address this issue, as well as the general problem of comparing statistics from distinct eras, by detrending the seasonal statistics of professional baseball players. We detrend player statistics by normalizing achievements to seasonal averages, which accounts for changes in relative player ability resulting from both exogenous and endogenous factors, such as talent dilution from expansion, equipment and training improvements, and performance-enhancing drugs (PED). In this paper we compare the probability density function (pdf) of detrended career statistics to the pdf of raw career statistics for five statistical categories -- hits (H), home runs (HR), runs batted in (RBI), wins (W) and strikeouts (K) -- over the 90-year period 1920-2009. We find that the functional form of these pdfs is stationary under detrending. This stationarity implies that the statistical regularity observed in the right-skewed distributions for longevity and success in professional baseball arises from both the wide range of intrinsic talent among athletes and the underlying nature of competition. Using this simple detrending technique, we examine the top 50 all-time careers for H, HR, RBI, W and K. We fit the pdfs for career success by the Gamma distribution in order to calculate objective benchmarks based on extreme statistics, which can be used to identify extraordinary careers.
Quantitative measures for success are important for comparing both individual and group accomplishments [1], often achieved in different time periods. However, the evolutionary nature of competition results in a nonstationary rate of success, which makes comparing accomplishments across time statistically biased. The analysis of sports records reveals that the interplay between technology and ecophysiological limits results in a complex rate of record progression [2][3][4]. Since record events correspond to extreme achievements, a natural follow-up question is: How does the success rate of more common achievements evolve in competitive arenas? To answer this question, we analyze the evolution of success, and the resulting implications for metrics of career success, for all Major League Baseball (MLB) players over the entire history of the game. We use concepts from statistical physics to identify statistical regularity in success, ranging from common to extraordinary careers.
The game of baseball has a rich history, full of scandal, drama and controversy [5]. Indeed, the importance of baseball in American culture is evident in the game’s longevity, having survived the Great Depression, two World Wars, racial integration, free agency, and multiple player strikes. When comparing players from different time periods it is often necessary to rely purely on statistics, due to the simple fact that Major League Baseball’s 130+ year history spans so many human generations, extending back to a time period before television and even before public radio.
Luckily, due to the invention of the box score very early in the evolution of the game, baseball has an extremely rich statistical history. When comparing two players, objectively determining who is better should be as straightforward as comparing their statistics. However, the results of such a naive approach can be unsatisfying. This is due to the fact that the history of professional baseball is typically thought of as a collection of ill-defined, often overlapping eras, such as the “deadball” era, the “liveball” era, and recently, the “steroids” era of the 1990’s and 2000’s. As a result, many careers span at least two such eras.
The use of statistics, while invaluable to any discussion or argument, requires proper contextual interpretation. This is especially relevant when dealing with the comparison of baseball careers from significantly different periods. Among common fans, there will always be arguments and intergenerational debates. Closely related to these debates, but on a grander stage, is the election process for elite baseball players into the great bastion of baseball history, the National Baseball Hall of Fame (HOF). In particular, an unbiased method for quantifying career achievement would be extremely useful in addressing two issues which are on the horizon for the HOF:
(i) How should the HOF reform the election procedures of the veterans committee, a special committee responsible for the retroactive induction of players who were initially overlooked during their tenure on the HOF ballot? Retroactive induction is the only way a player can be inducted into the HOF once their voting tally drops below a 5% threshold, after which they are not considered on future ballots. Closely related to induction through the veterans committee is the induction of deserving African American players who were not allowed to compete in MLB prior to 1947, but who excelled in the Negro Leagues, a separate baseball league established for “players of color.” In 2006, the HOF welcomed seventeen Negro Leaguers in a special induction to the HOF.
(ii) How should the HOF deal with players from the “steroids” era (1990’s-2000’s) when they become eligible for HOF induction? The Mitchell Report [6] revealed that more than 5% of players in 2003 were using PED. Hence, is it right to celebrate the accomplishments of players guilty of using PED more than the accomplishments of players who were almost as good and were not guilty of using PED? Similarly, how can we fairly assess player accomplishments from the steroids era without discounting the accomplishments of innocent players?
Here we address the era dependence of player statistics in a straightforward way. We develop a quantitative method to “detrend” seasonal statistics by the corresponding league-wide average. As a result, we normalize accomplishments across all possible performance factors inherent to a given time period. Our results provide an unbiased and statistically robust appraisal of career achievement, which can be extended to other sports and other professions where metrics for success are available.
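The detrending idea described above can be illustrated with a minimal sketch. The text here specifies only that seasonal statistics are normalized by the corresponding league-wide average; the details below are assumptions made for illustration: the hypothetical function name `detrend_careers`, the record layout, the use of a per-opportunity league rate (for example, home runs per at-bat), and the choice of the mean seasonal rate as the common baseline.

```python
from collections import defaultdict


def detrend_careers(records):
    """Return detrended career totals per player.

    records: iterable of (player, season, stat, opportunities) tuples,
    e.g. ("Ruth", 1927, 60, 540). This layout is an assumption.
    """
    records = list(records)

    # League-wide totals of the statistic and of opportunities per season.
    stat_sum = defaultdict(float)
    opp_sum = defaultdict(float)
    for _, season, stat, opp in records:
        stat_sum[season] += stat
        opp_sum[season] += opp

    # League-wide average success per opportunity in each season.
    rate = {s: stat_sum[s] / opp_sum[s] for s in stat_sum if opp_sum[s] > 0}

    # Assumed baseline: the mean of the seasonal rates over all seasons.
    baseline = sum(rate.values()) / len(rate)

    # Rescale each seasonal total by baseline / seasonal rate, then sum.
    career = defaultdict(float)
    for player, season, stat, _ in records:
        if rate.get(season, 0) > 0:
            career[player] += stat * baseline / rate[season]
    return dict(career)


if __name__ == "__main__":
    # Made-up numbers for illustration only, not real player data.
    toy = [
        ("A", 1930, 10, 100),
        ("B", 1930, 30, 100),
        ("C", 2000, 40, 100),
        ("D", 2000, 40, 100),
    ]
    print(detrend_careers(toy))
```

In this toy input, the 2000 season has twice the league-wide rate of the 1930 season, so raw totals from 2000 are scaled down and raw totals from 1930 are scaled up before careers are compared.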
This paper is organized as follows: in Section I B we first analyze the distribution of career longevity and success for all players in Sean Lahman’s Baseball Archive [7], which has player data for the 139-year period 1871-2009. We plot in Fig. 1 and Fig. 2 the probability density function (pdf) of career lon