Coherent Multi-Agent Trajectory Forecasting in Team Sports with CausalTraj
Jointly forecasting trajectories of multiple interacting agents is a core challenge in sports analytics and other domains involving complex group dynamics. Accurate prediction enables realistic simulation and strategic understanding of gameplay evolution. Most existing models are evaluated solely on per-agent accuracy metrics (minADE, minFDE), which assess each agent independently on its best-of-k prediction. However, these metrics overlook whether the model learns which predicted trajectories can jointly form a plausible multi-agent future. Many state-of-the-art models are designed and optimized primarily based on these metrics. As a result, they may underperform on joint predictions and also fail to generate coherent, interpretable multi-agent scenarios in team sports. We propose CausalTraj, a temporally causal, likelihood-based model that is built to generate jointly probable multi-agent trajectory forecasts. To better assess collective modeling capability, we emphasize joint metrics (minJADE, minJFDE) that measure joint accuracy across agents within the best generated scenario sample. Evaluated on the NBA SportVU, Basketball-U, and Football-U datasets, CausalTraj achieves competitive per-agent accuracy and the best recorded results on joint metrics, while yielding qualitatively coherent and realistic gameplay evolutions.
💡 Research Summary
This paper, “Coherent Multi-Agent Trajectory Forecasting in Team Sports with CausalTraj,” addresses a critical challenge in sports analytics and related fields: predicting the future trajectories of multiple interacting agents (e.g., players and the ball). The authors identify a fundamental limitation in the prevailing research paradigm: state-of-the-art models are primarily designed and evaluated using per-agent accuracy metrics like minADE and minFDE. These metrics assess each agent’s best prediction independently, failing to evaluate whether the set of predicted trajectories across all agents forms a physically plausible and tactically coherent joint future scenario. Consequently, models optimized for these metrics may generate individually accurate but collectively inconsistent predictions.
To overcome this, the paper makes two key contributions. First, it advocates for a shift in evaluation focus towards joint metrics, specifically minJADE and minJFDE. These metrics evaluate a model’s output by considering each of its k generated future samples as a complete “scenario.” They compute the cumulative error across all agents within a single scenario and then select the best overall scenario. This provides a direct measure of a model’s ability to learn the true joint distribution of future states, which is essential for applications like realistic gameplay simulation.
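The difference between the per-agent and joint metrics comes down to where the minimum over the k samples is taken. The sketch below illustrates this in plain Python. The data layout (`samples[k][agent][t] = (x, y)`), the helper names, and the choice to average rather than sum over agents are assumptions made for illustration, not details taken from the paper; the normalization does not change which scenario is selected.

```python
import math

def _dist(p, q):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def ade(pred, truth):
    """Average displacement error of one agent over all timesteps."""
    return sum(_dist(p, q) for p, q in zip(pred, truth)) / len(truth)

def fde(pred, truth):
    """Final displacement error of one agent (last timestep only)."""
    return _dist(pred[-1], truth[-1])

def min_per_agent(samples, truth, err):
    """minADE / minFDE: every agent picks its own best sample, then average."""
    n = len(truth)
    return sum(min(err(s[a], truth[a]) for s in samples) for a in range(n)) / n

def min_joint(samples, truth, err):
    """minJADE / minJFDE: average over agents inside each sample, then pick
    the single best sample (scenario) as a whole."""
    n = len(truth)
    return min(sum(err(s[a], truth[a]) for a in range(n)) / n for s in samples)

# Toy case: two agents, two sampled scenarios. Each scenario gets exactly
# one agent right and places the other agent 2 units off.
truth = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]
samples = [
    [[(0, 0), (1, 0)], [(0, 3), (1, 3)]],  # agent 0 correct, agent 1 wrong
    [[(0, 2), (1, 2)], [(0, 1), (1, 1)]],  # agent 0 wrong, agent 1 correct
]

print(min_per_agent(samples, truth, ade))  # per-agent minADE
print(min_joint(samples, truth, ade))      # joint minJADE
```

In this toy case the per-agent metric is zero because each agent can borrow its best trajectory from a different sample, while the joint metric stays at 1.0 because no single scenario is correct for both agents. This is the gap the joint metrics are meant to expose.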
Second, the authors propose CausalTraj, a novel model architecture explicitly built to generate jointly probable multi-agent forecasts. Its core design principles are temporal causality and likelihood-based training. Unlike many recent models that predict the entire future trajectory in parallel from a global latent state, CausalTraj operates autoregressively. At each timestep, it models the conditional distribution of the next displacement for all agents given the history up to that point. This causal step-by-step generation alleviates the burden of compressing long-term, complex multi-agent interactions into a single fixed representation. The model outputs a Mixture-of-Gaussians distribution at each step to capture multimodal uncertainty in the collective motion.
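As a rough illustration of this step-by-step generation, the following plain-Python sketch rolls out one future scenario by repeatedly sampling a joint displacement from a Gaussian mixture and appending it to the trajectory. It is not the authors' implementation: the function names, the isotropic per-agent variance, the assumption that one mixture component is drawn for the whole scene at each step, and the `toy_step_model` stand-in for the network are all hypothetical.

```python
import random

def sample_mixture(weights, means, stds, rng):
    """Draw one joint displacement for all agents.

    weights: K mixture weights summing to 1.
    means[k][a]: (dx, dy) mean displacement of agent a under component k.
    stds[k][a]: isotropic standard deviation of agent a under component k.
    A single component k is drawn for the whole scene (an assumption), so
    all agents move according to the same mode.
    """
    k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return [
        (rng.gauss(mx, s), rng.gauss(my, s))
        for (mx, my), s in zip(means[k], stds[k])
    ]

def rollout(history, step_model, horizon, rng):
    """Generate one future scenario autoregressively.

    history[t][a] = (x, y); step_model(trajectory) -> (weights, means, stds).
    Each sampled step is appended to the trajectory before the next call,
    so later steps are conditioned on earlier sampled ones.
    """
    traj = [list(frame) for frame in history]
    for _ in range(horizon):
        weights, means, stds = step_model(traj)
        disp = sample_mixture(weights, means, stds, rng)
        traj.append([(x + dx, y + dy) for (x, y), (dx, dy) in zip(traj[-1], disp)])
    return traj[len(history):]

def toy_step_model(traj):
    """Hypothetical stand-in for the network with two scene-level modes."""
    prev, last = traj[-2], traj[-1]
    vel = [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(prev, last)]
    n = len(last)
    means = [vel, [(0.0, 0.0)] * n]  # mode 0: keep moving; mode 1: all stop
    stds = [[0.05] * n, [0.05] * n]
    return [0.7, 0.3], means, stds

history = [[(0.0, 0.0), (5.0, 5.0)], [(1.0, 0.0), (5.0, 4.0)]]
future = rollout(history, toy_step_model, horizon=5, rng=random.Random(0))
print(len(future), len(future[0]))
```

Calling `rollout` k times with different random states would yield k complete scenarios, which is the unit that the joint metrics score.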
The CausalTraj architecture integrates several components:
- An Agent History Encoder that processes each agent’s past trajectory independently in a causal manner, using either an adapted Causal PointNet or a Mamba2 sequence model.
- An Inter-Agent Relation Encoder based on transformer blocks that models interactions between agents at each timestep. A key innovation here is the Spatial Relation Transformer block, which explicitly injects pairwise Euclidean displacement information into the attention mechanism, enhancing spatial reasoning.
- A Scene Aggregation and Prediction Head that combines all agents’ features into a scene-level representation and outputs the parameters for the displacement mixture model.
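The summary does not say how the Spatial Relation Transformer block injects displacement information into attention. One common way to do this is an additive bias on the attention logits, and the plain-Python sketch below shows that idea for a single head. The additive form, the `rel_weights` parameters, and the use of the raw features as queries, keys, and values are assumptions for illustration and should not be read as the paper's actual design.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def spatial_attention(features, positions, rel_weights):
    """Single-head attention over agents with an additive spatial bias.

    features[i]: feature vector of agent i (query, key, and value here).
    positions[i]: (x, y) of agent i at the current timestep.
    rel_weights: (w_dx, w_dy, w_dist), hypothetical learned weights applied
        to the pairwise displacement and its Euclidean norm.
    Returns (outputs, attention_weights).
    """
    d = len(features[0])
    w_dx, w_dy, w_dist = rel_weights
    outputs, weights = [], []
    for qi, pi in zip(features, positions):
        logits = []
        for kj, pj in zip(features, positions):
            dx, dy = pj[0] - pi[0], pj[1] - pi[1]
            content = sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d)
            bias = w_dx * dx + w_dy * dy + w_dist * math.hypot(dx, dy)
            logits.append(content + bias)
        attn = softmax(logits)
        weights.append(attn)
        outputs.append(
            [sum(w * v[c] for w, v in zip(attn, features)) for c in range(d)]
        )
    return outputs, weights

# Three agents with identical features; a negative distance weight makes
# each agent attend more strongly to nearby agents.
features = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
positions = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
outputs, weights = spatial_attention(features, positions, (0.0, 0.0, -1.0))
print(weights[0])
```

With identical content features, the only thing separating the attention weights is the spatial bias, so the first agent attends most to itself, then to the agent 1 unit away, and least to the agent 10 units away.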
The model is trained by maximizing the likelihood of the ground-truth next-step displacements across all agents and timesteps.
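For one timestep, maximizing likelihood is equivalent to minimizing the negative log-likelihood of the observed displacements under the predicted mixture. The sketch below computes that quantity in plain Python using a log-sum-exp over components. The scene-level mixture structure, the isotropic 2D Gaussians, and all function names are assumptions chosen to keep the example small; the paper's exact parameterization may differ.

```python
import math

def log_gauss2d(d, mu, std):
    """Log-density of an isotropic 2D Gaussian at displacement d."""
    sq = (d[0] - mu[0]) ** 2 + (d[1] - mu[1]) ** 2
    return -math.log(2 * math.pi * std * std) - sq / (2 * std * std)

def mixture_nll(target, weights, means, stds):
    """Negative log-likelihood of one step's displacements for all agents.

    target[a]: observed (dx, dy) of agent a.
    weights[k], means[k][a], stds[k][a]: mixture parameters, where each
        component k describes the displacements of every agent jointly.
    """
    comp = []
    for w, mu_k, std_k in zip(weights, means, stds):
        log_p = math.log(w) + sum(
            log_gauss2d(d, mu, s) for d, mu, s in zip(target, mu_k, std_k)
        )
        comp.append(log_p)
    m = max(comp)  # log-sum-exp for numerical stability
    return -(m + math.log(sum(math.exp(c - m) for c in comp)))

# Two agents, two scene-level modes: both move right, or both move up.
weights = [0.5, 0.5]
means = [[(1.0, 0.0), (1.0, 0.0)], [(0.0, 1.0), (0.0, 1.0)]]
stds = [[0.2, 0.2], [0.2, 0.2]]

coherent = mixture_nll([(1.0, 0.0), (1.0, 0.0)], weights, means, stds)
mixed = mixture_nll([(1.0, 0.0), (0.0, 1.0)], weights, means, stds)
print(coherent < mixed)
```

Under these assumptions, a target in which both agents follow the same mode receives a lower loss than one that mixes modes across agents. Summing this quantity over all timesteps of a training sequence gives the full objective described above.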
Extensive experiments are conducted on three standard sports trajectory datasets: NBA SportVU, Basketball-U, and Football-U. CausalTraj is compared against recent state-of-the-art models like GroupNet, LED, and MoFlow. The results demonstrate that CausalTraj achieves:
- Competitive per-agent accuracy: It performs on par with or slightly better than top models on standard minADE/minFDE metrics across various prediction horizons.
- State-of-the-art joint accuracy: Crucially, CausalTraj sets new best records on the joint metrics minJADE and minJFDE across all datasets and horizons. This quantitatively confirms its superior capability in modeling coherent multi-agent futures.
- Qualitatively coherent generations: Visualizations show that CausalTraj produces more realistic and tactically plausible gameplay evolutions, with players maintaining formations and the ball moving in a logical manner relative to the players.
In summary, the paper argues for the importance of joint evaluation in multi-agent trajectory forecasting and introduces CausalTraj, a causally structured, likelihood-based model that achieves the best reported results on these joint metrics. By prioritizing the learning of the joint distribution, CausalTraj moves beyond individually accurate but disjointed predictions towards generating holistic, interpretable, and simulation-ready future scenarios in team sports.