Social Comparison without Explicit Inference of Others' Reward Values: A Constructive Approach Using a Probabilistic Generative Model

Social comparison, the process of evaluating one’s rewards relative to others, plays a fundamental role in primate social cognition. However, it remains unknown from a computational perspective how information about others’ rewards affects the evaluation of one’s own reward. With a constructive approach, this study examines whether monkeys merely recognize objective reward differences or, instead, infer others’ subjective reward valuations. We developed three computational models with varying degrees of social information processing: an Internal Prediction Model (IPM), which infers the partner’s subjective values; a No Comparison Model (NCM), which disregards partner information; and an External Comparison Model (ECM), which directly incorporates the partner’s objective rewards. To test model performance, we used a multi-layered, multimodal latent Dirichlet allocation. We trained the models on a dataset containing the behavior of a pair of monkeys, their rewards, and the conditioned stimuli. Then, we evaluated the models’ ability to classify subjective values across pre-defined experimental conditions. The ECM achieved the highest classification score in the Rand Index (0.88 vs. 0.79 for the IPM) under our settings, suggesting that social comparison relies on objective reward differences rather than inferences about subjective states.

💡 Research Summary

The paper tackles a fundamental question in primate social cognition: how does information about another individual’s reward influence the evaluation of one’s own reward? While behavioral studies have shown that primates compare their outcomes with those of conspecifics, the computational mechanisms underlying this social comparison remain unclear. To address this, the authors construct three distinct probabilistic models that embody different hypotheses about the processing of social information.

Internal Prediction Model (IPM) – This model assumes that an agent actively infers the partner’s subjective value function. Using a Bayesian framework, the partner’s value is treated as a latent variable with a Beta prior that is updated after each observed reward outcome. The inferred partner value is then compared with the self‑value to generate a social‑comparison signal.
No Comparison Model (NCM) – This baseline model follows a standard reinforcement‑learning paradigm: the agent updates its own value function solely based on personal reward feedback, completely ignoring any information about the partner. It serves to capture behavior when social comparison is absent.
External Comparison Model (ECM) – Here the agent directly incorporates the partner’s objective reward magnitude. The model computes a simple reward difference ΔR = R_self − R_partner and adds a weighted term α·ΔR to the self‑value estimate (V_self′ = V_self + α·ΔR). No inference about the partner’s internal valuation is performed.
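
The three hypotheses above can be contrasted in a few lines of code. The following is a minimal sketch, not the authors' implementation: the class and function names are hypothetical, the Beta update treats each outcome as a fractional Bernoulli observation scaled to [0, 1], and the IPM comparison is assumed to be a simple difference, since the summary does not specify its form. Only the ECM rule V_self′ = V_self + α·ΔR is taken directly from the text.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(a, b) belief over the partner's subjective value (assumed to lie in [0, 1])."""
    a: float = 1.0  # uniform prior by default (assumption)
    b: float = 1.0

    def update(self, outcome: float) -> None:
        # Assumption: the observed outcome is scaled to [0, 1] and counted
        # as a fractional success/failure.
        self.a += outcome
        self.b += 1.0 - outcome

    @property
    def mean(self) -> float:
        return self.a / (self.a + self.b)


def ncm_value(v_self: float) -> float:
    """No Comparison Model: partner information is ignored entirely."""
    return v_self


def ecm_value(v_self: float, r_self: float, r_partner: float, alpha: float) -> float:
    """External Comparison Model: V_self' = V_self + alpha * (R_self - R_partner)."""
    delta_r = r_self - r_partner
    return v_self + alpha * delta_r


def ipm_signal(v_self: float, belief: BetaBelief) -> float:
    """Internal Prediction Model: compare self-value with the inferred partner value.

    The difference used here is an illustrative assumption.
    """
    return v_self - belief.mean


belief = BetaBelief()
belief.update(1.0)  # the partner is observed receiving a maximal outcome
print(ncm_value(0.5))
print(ecm_value(0.5, r_self=1.0, r_partner=2.0, alpha=0.1))
print(ipm_signal(0.5, belief))
```

The ECM needs only the two observed reward magnitudes, whereas the IPM has to maintain and update a latent belief about the partner.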

To evaluate these models, the authors collected a rich dataset from a dyadic experiment with two rhesus macaques. In each trial a conditioned stimulus (visual or auditory) signaled the upcoming reward, after which each monkey received a specific amount of fruit juice. The experimental design included four conditions: (i) equal rewards, (ii) partner receives a larger reward, (iii) partner receives a smaller reward, and (iv) a control condition with no reward. For every trial the dataset contains (a) behavioral logs (choice, reaction time), (b) the exact reward amounts, and (c) multimodal stimulus features, yielding a total of ~10,000 trials.

Because the data span multiple modalities, the authors adopted a multilayer, multimodal Latent Dirichlet Allocation (LDA) approach. In this adaptation, each trial is treated as a “document” and each modality (behavior, reward, visual cue, auditory cue) contributes a set of “words”. The LDA infers latent “topics” that correspond to internal value states (e.g., high‑expectation, low‑expectation). Variational Bayesian inference is used to estimate the topic‑word and document‑topic distributions. After training, each model generates a posterior probability over latent value states for every trial, which is then mapped to a categorical label representing the inferred subjective value.
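
To make the "trial as document" idea concrete, the sketch below encodes each trial as one bag-of-words count vector per modality, which is the kind of input a multimodal LDA consumes. It is illustrative only: the token names (`lick`, `cue_A`, `tone_low`, and so on) and the helper functions are hypothetical, and the paper's actual discretization of behavior and stimuli is not described in this summary.

```python
from collections import Counter

# Modalities named in the summary; the tokens inside each are made up.
MODALITIES = ("behavior", "reward", "visual", "auditory")


def build_vocab(trials, modality):
    """Collect the sorted set of tokens seen in one modality across all trials."""
    return sorted({word for trial in trials for word in trial.get(modality, [])})


def encode_trial(trial, vocabs):
    """Turn one trial into a count vector per modality (one 'document')."""
    encoded = {}
    for modality, vocab in vocabs.items():
        counts = Counter(trial.get(modality, []))
        encoded[modality] = [counts[word] for word in vocab]
    return encoded


trials = [
    {"behavior": ["lick", "lick", "gaze_partner"],
     "reward": ["self_large", "partner_small"],
     "visual": ["cue_A"]},
    {"behavior": ["lick"],
     "reward": ["self_small", "partner_large"],
     "auditory": ["tone_low"]},
]

vocabs = {m: build_vocab(trials, m) for m in MODALITIES}
documents = [encode_trial(t, vocabs) for t in trials]
print(vocabs["behavior"])        # ['gaze_partner', 'lick']
print(documents[0]["behavior"])  # [1, 2]
```

Keeping a separate vocabulary per modality matches the multimodal formulation, in which modalities share a trial's topic proportions but each has its own topic-word distribution.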

Model performance was assessed with the Rand Index (RI), a measure of agreement between the model‑predicted labeling and the ground‑truth labeling defined by the experimental conditions. The results were:

ECM: RI = 0.88
IPM: RI = 0.79
NCM: RI = 0.62
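
For reference, the Rand Index is the fraction of trial pairs on which two labelings agree, meaning both place the pair in the same group or both place it in different groups. A minimal pure-Python sketch follows; the function name and the toy labels are illustrative and not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which the two labelings agree."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# The index ignores label names: a relabeled but identical partition scores 1.0.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
```

Because only pairwise co-membership matters, the latent topics do not need to be matched to condition labels before scoring.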

Bootstrap resampling (10,000 iterations) confirmed that the superiority of ECM over the other models was statistically significant (p < 0.001). These findings indicate that the monkeys’ social comparison relies more heavily on the objective difference between self and partner rewards than on any inferred subjective valuation of the partner. In other words, the brain appears to implement a relatively simple, computationally cheap comparison rule rather than a sophisticated theory‑of‑mind inference.
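
The summary does not state the resampling unit or the test statistic, so the following is only one plausible reading of the bootstrap comparison: resample trials with replacement, recompute the Rand Index of two models on each resample, and report how often the first model fails to beat the second. All names and the toy data are hypothetical.

```python
import random
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which the two labelings agree."""
    pairs = list(combinations(range(len(labels_true)), 2))
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


def bootstrap_ri_difference(truth, pred_a, pred_b, n_boot=1000, seed=0):
    """Bootstrap over trials; returns (mean RI difference, one-sided p-value).

    The p-value is the share of resamples in which model A does not
    outperform model B (difference <= 0).
    """
    rng = random.Random(seed)
    n = len(truth)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample trials with replacement
        t = [truth[i] for i in idx]
        a = [pred_a[i] for i in idx]
        b = [pred_b[i] for i in idx]
        diffs.append(rand_index(t, a) - rand_index(t, b))
    mean_diff = sum(diffs) / n_boot
    p_value = sum(d <= 0 for d in diffs) / n_boot
    return mean_diff, p_value


# Toy data: four conditions, model A recovers them, model B does not.
truth = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
pred_a = list(truth)
pred_b = [0, 1, 2, 3] * 3
mean_diff, p_value = bootstrap_ri_difference(truth, pred_a, pred_b, n_boot=200)
print(mean_diff, p_value)
```

The pairwise Rand Index is quadratic in the number of trials, so at the ~10,000-trial scale mentioned above a contingency-table formulation would be the more practical choice.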

The study also demonstrates the utility of a generative, multimodal LDA framework for integrating heterogeneous behavioral and stimulus data in social cognition research. By treating each trial as a mixture of latent topics, the approach captures hidden value states without requiring explicit labeling of internal mental variables.

Limitations and Future Directions

The sample size is limited to a single dyad; replication with larger groups is needed to generalize the findings.
The IPM’s inferred partner values were not validated against neural recordings (e.g., fMRI or electrophysiology) that could reveal whether such inference occurs in the brain.
The ECM assumes a linear integration of reward differences; future work could explore non‑linear or reinforcement‑learning‑based comparison mechanisms.
Extending the framework to incorporate neurophysiological signals would allow direct testing of how the identified latent topics map onto specific brain circuits (e.g., ventral striatum, orbitofrontal cortex).

In summary, the paper provides computational evidence, from a single dyad, that primate social comparison is driven primarily by objective reward disparities rather than by mentalizing about a partner’s subjective preferences. The methodological contribution, a multilayer multimodal generative model, offers a tool for dissecting complex social decision‑making processes in both animal and human studies.

