Probabilistic modeling is an effective tool for evaluating team performance and predicting outcomes in sports. However, an important question that has not been fully explored is whether these models can reliably reflect actual performance while assigning meaningful probabilities to rare results that differ greatly from expectations. In this study, we develop an inference-based probabilistic framework built on expected goals (xG). This framework converts shot-level event data into season-level simulations of points, rankings, and outcome probabilities. Using data from the English Premier League 2015/16 season, we demonstrate that the framework captures the overall structure of the league table. It correctly identifies the top-four contenders and relegation candidates while explaining a significant portion of the variance in final points and ranks. In a full-season evaluation, the model assigns a low probability to extreme outcomes, particularly Leicester City's historic title win, which stands out as a statistical anomaly. We then examine the ex ante inferential and early-diagnostic role of xG using only mid-season information. With first-half data, we simulate the rest of the season and show that teams with stronger mid-season xG profiles tend to earn more points in the second half, even after accounting for their current league position. In this mid-season assessment, Leicester City ranks among the top teams by xG and is given a small but noteworthy chance of winning the league. This suggests that their ultimate success was unlikely but not entirely detached from their actual performance. Our analysis indicates that expected goals models work best as probabilistic baselines for analysis and early-warning diagnostics, rather than as certain predictors of rare season outcomes.
Uncertainty is a key part of all sports and plays a big role in attracting fans and keeping them engaged. To better understand and reduce this uncertainty, football has turned to data-driven methods. Football analytics marks a major change in how we interpret, evaluate, and improve the game. This drives the need for new measurements, like expected goals (xG) (Vilela 2024, Nipoti, and Schiavon 2025, Bandara et al. 2024). Beyond performance metrics, football analytics covers a wide range of tasks that can be tackled using statistical and predictive modeling. For example, it can predict match outcomes, assess player performance, examine team strategies, and guide decisions in player recruitment and injury prevention (see e.g., Souza et al. 2021, Skripnikov et al. 2025, Elsharkawi et al. 2025, title & Suguna 2023). Together, these applications show the broad reach and growing importance of analytics in today's game.
Expected goals (xG) models evaluate shot quality by estimating the chance of scoring based on past attempts (Spearman, 2018). In simple terms, expected goals assign a probability between 0 and 1 to every shot a team takes during a match. A value of 0 means that the shot has no chance of becoming a goal, while 1 means that a goal is certain.
Formally, the expected goals (xG) of a team in a given match can be represented as the sum of the scoring probabilities of all its shots:
$$\mathrm{xG} = \sum_{i=1}^{N} p_i$$
where $N$ is the number of shots taken by the team in that match, and $p_i$ is the probability that shot $i$ results in a goal. This method is more effective than a conventional goal-based metric in addressing randomness in football, as a shot is a much more frequent occurrence than a goal (Anzer & Bauer, 2021). Historically, researchers have modeled the number of goals a team scores in a football match using statistical distributions to forecast match outcomes (Wheatcroft, 2021). For instance, goal-based approaches (Egidi & Torelli 2021, Mead, O’Hare, and McMenemy) model the number of goals scored directly, typically with Poisson-based distributions. The focus is on how many goals a team or a player is likely to score, whereas result-based approaches directly model match outcomes (for example, win, draw, or loss) rather than the sequence of events that leads to them (see, e.g., Macrì Demartino et al. 2024). Since its development, the xG metric has become ubiquitous in the world of football.
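As a minimal illustration of the match-level definition above, the following Python sketch sums per-shot scoring probabilities into a single xG value. The function name `match_xg` and the shot probabilities are hypothetical and chosen only for illustration; they are not taken from the data or code used in this study.

```python
from typing import Sequence


def match_xg(shot_probs: Sequence[float]) -> float:
    """Sum per-shot scoring probabilities into a match-level xG value."""
    for p in shot_probs:
        # Each p_i is a probability, so it must lie in [0, 1].
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"shot probability out of range: {p}")
    return float(sum(shot_probs))


# Hypothetical shot probabilities for one team in one match (N = 5 shots).
example_shots = [0.05, 0.12, 0.76, 0.03, 0.30]
example_xg = match_xg(example_shots)
print(f"xG = {example_xg:.2f}")
```

A team with no shots receives an xG of zero under this definition, and a single high-quality chance can outweigh several speculative attempts.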
The majority of top-tier football teams and betting corporations employ these statistics, including related concepts of expected assists and post-shot expected goals. Additionally, these metrics play a critical role in player development and acquisition for organizations and in enhancing predictive models used in sports betting (Mead et al., 2023). The primary purpose of these metrics is to provide a more comprehensive assessment of a player and team’s performance beyond just the total number of goals scored. By quantifying shot probabilities, a team can gain an improved understanding of whether they are generating high-quality opportunities, experiencing poor finishing luck, or benefiting from favorable variance. This analytical tool has recently gained significant popularity, as the final outcome of a match does not always accurately reflect the opportunities that a team had.
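To make the gap between chance quality and final result concrete, the sketch below simulates a single match many times from two lists of shot probabilities. It assumes that each shot is an independent Bernoulli trial with success probability $p_i$, which is a simplifying assumption for illustration and not necessarily the simulation scheme used in our framework. The function names, shot lists, number of simulations, and seed are all hypothetical.

```python
import random
from typing import Dict, Sequence


def simulate_goals(shot_probs: Sequence[float], rng: random.Random) -> int:
    """Draw one goal count, treating each shot as an independent Bernoulli trial."""
    return sum(1 for p in shot_probs if rng.random() < p)


def outcome_probabilities(
    home_shots: Sequence[float],
    away_shots: Sequence[float],
    n_sims: int = 20000,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate win/draw/loss probabilities by Monte Carlo simulation."""
    rng = random.Random(seed)
    home_wins = draws = away_wins = 0
    for _ in range(n_sims):
        home_goals = simulate_goals(home_shots, rng)
        away_goals = simulate_goals(away_shots, rng)
        if home_goals > away_goals:
            home_wins += 1
        elif home_goals == away_goals:
            draws += 1
        else:
            away_wins += 1
    return {
        "home_win": home_wins / n_sims,
        "draw": draws / n_sims,
        "away_win": away_wins / n_sims,
    }


# Hypothetical match: the home side takes many low-quality shots (xG = 1.2),
# the away side takes two good chances (xG = 0.75).
home = [0.10] * 12
away = [0.40, 0.35]
probs = outcome_probabilities(home, away)
print(probs)
```

Under these assumptions the side with the higher xG is the more likely winner, yet the other side still wins a non-trivial share of the simulated matches, which is the sense in which a single result can misrepresent the chances created.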
Expected goals (xG) provide a more detailed picture of a team’s performance in individual matches, but their broader relevance becomes clear when examined over an entire season. Discrepancies between cumulative xG and final league rankings highlight the concept of ranking uncertainty, whereby a team’s position in the table may not accurately reflect its underlying performance. Even teams that consistently generate high-quality chances may underperform due to defensive errors, adverse variance, or unfavorable match dynamics. Conversely, other teams may outperform their xG by converting low-probability chances at unusually high rates. Such discrepancies expose structural limitations in point-based league standings as representations of team quality.
However, xG has limitations despite its widespread use and analytical utility. Methodological differences in data collection approaches can yield substantially different results for identical shots when different xG models are employed. This variability requires a thorough examination of data sources and a clear understanding of each model’s underlying assumptions.
Another important limitation is that the predictive power of xG values in individual matches is limited and subject to high variance. Match outcomes can deviate substantially from xG expectations due to randomness and the small sample sizes inherent in single games. Consequently, distinguishing real performance trends from statistical noise typically requires aggregating data across multiple matches to obtain meaningful xG-based insights. These limitations highlight the importance of interpreting xG metrics within a broader framework o