Auditing the Auditors: Does Community-Based Moderation Get It Right?

Yeganeh Alimohammadi^1† (University of Southern California), Karissa Huang^1‡ (UC Berkeley), Christian Borgs^§ (UC Berkeley), Jennifer Chayes^¶ (UC Berkeley)

March 20, 2026

ABSTRACT

Online social platforms increasingly rely on crowd-sourced systems to label misleading content at scale, but these systems must both aggregate users' evaluations and decide whose evaluations to trust. To address the latter, many platforms audit users by rewarding agreement with the final aggregate outcome, a design we term consensus-based auditing. We analyze the consequences of this design in X's Community Notes, which in September 2022 adopted consensus-based auditing that ties users' eligibility for participation to agreement with the eventual platform outcome. We find evidence of strategic conformity: minority contributors' evaluations drift toward the majority, and their participation share falls on controversial topics, where independent signals matter most. We formalize this mechanism in a behavioral model in which contributors trade off private beliefs against anticipated penalties for disagreement. Motivated by these findings, we propose a two-stage auditing and aggregation algorithm that weights contributors by the stability of their past residuals rather than by agreement with the majority. The method first accounts for differences across content and contributors, and then measures how predictable each contributor's evaluations are relative to the latent-factor model. Contributors whose evaluations are consistently informative receive greater influence in aggregation, even when they disagree with the prevailing consensus. In the Community Notes data, this approach improves out-of-sample predictive performance while avoiding penalization of disagreement.
Keywords: Content moderation | Misinformation | Community Notes | Matrix factorization | Crowdsourcing

In the face of rapidly increasing online misinformation and harmful content [60], online platforms face a fundamental design question: how can unreliable information be identified and flagged at scale? Many platforms have turned this question back to their users, creating crowdsourced systems of content moderation that seek to leverage a diverse user base. Notably, X (formerly Twitter) and, at a more experimental stage, Meta, TikTok, and Bluesky invite users to evaluate content and add context to potentially unreliable posts [66, 34, 42, 54]. A central challenge in these systems is that users vary widely in the reliability of their evaluations [7, 10]. As a result, platforms confront a second problem: how to audit the auditors themselves? In practice, platforms aggregate user content evaluations into an inferred platform outcome, which we refer to as the consensus, that determines whether content is downranked or given additional context [65]. The same consensus is then used to evaluate users' reliability as well: users whose historical input to the system aligns with the consensus gain influence, whereas those who diverge are downweighted or lose eligibility to participate in future evaluations [14, 15]. We refer to this design choice as consensus-based auditing.

† Marshall School of Business, University of Southern California. yalimoha@usc.edu
‡ Department of Statistics, UC Berkeley. krhuang@berkeley.edu
§ Bakar Institute of Digital Materials for the Planet and Department of Electrical Engineering and Computer Sciences, UC Berkeley. borgs@berkeley.edu
¶ Department of Statistics, Department of Mathematics, School of Information, Department of Electrical Engineering and Computer Sciences, and Bakar Institute of Digital Materials for the Planet, UC Berkeley. jchayes@berkeley.edu
1 Equal contribution.
Although consensus-based auditing appears operationally efficient, it implicitly assumes that consensus is a reliable proxy for truth and that disagreement signals low credibility. This coupling of aggregation and auditing can potentially create a self-reinforcing dynamic [26, 39]. When agreement with the consensus is rewarded, users have incentives to anticipate the outcome rather than provide independent input [37, 33]. As a result, disagreement becomes less visible, and minority perspectives may be withdrawn before they are ever aggregated [69, 38]. In this paper, we quantify these effects and show, both theoretically and empirically, that consensus-based auditing systematically distorts contributor behavior and reduces the representation of informative minority viewpoints.

X's Community Notes provides a concrete setting for studying these design choices. Launched in January 2021 (initially as Birdwatch), Community Notes is a crowdsourced content evaluation system in which participating platform users (contributors) add short "notes" that provide context to help readers assess a post's claims [66]. Other participating users rate the helpfulness of these notes, and an aggregation algorithm selects which notes are displayed publicly as annotations on the original post, based on their predicted helpfulness across diverse viewpoints. In September 2022, Community Notes introduced Rating Impact and Writing Impact, which enforce a consensus-based auditing rule: a user's ability to continue rating and writing content depends on their historical alignment with the platform's aggregated outcome (the consensus) [14, 15, 40]. This policy change offers a natural setting for examining how consensus-based auditing shapes activity in a crowd-sourced moderation system. We identify several systematic shifts indicative of reduced minority visibility.
First, over time minority contributors increasingly align their ratings with the majority, suggesting strategic anticipation of the platform outcome. Second, we examine topic-level participation and find that posts on controversial topics (e.g., politics, international conflict) receive fewer notes than noncontroversial posts following the adoption of consensus-based auditing, even though these are exactly the topics where misinformation risk is highest and where an effective policy should encourage more evaluator engagement [63, 61, 57].

To understand these empirical observations, we develop a simple behavioral model in which users choose their ratings to balance their private belief about content quality against a penalty for deviating from the anticipated consensus. By analyzing the model's equilibrium, we formally prove that consensus-based penalties amplify conformity and disproportionately suppress minority contributors.

Finally, we propose an alternative auditing algorithm to address some of the key shortcomings of the current system. The algorithm proceeds in two stages. In the first stage, we estimate content-level effects, user-level effects, and systematic user-content alignment from the observed ratings, and then compute residuals as the difference between observed and predicted ratings. The resulting residuals isolate the idiosyncratic component of each evaluation. In the second stage, we estimate contributor reliability from the variance of these first-stage residuals and aggregate current evaluations using inverse-variance weights. Crucially, "consistency" here refers to the stability of residuals conditional on content and the user's baseline, not agreement with the majority. As a result, a contributor may consistently disagree with the prevailing consensus and still retain influence, provided their evaluations are stable and informative relative to the modeled structure.
This two-stage method is motivated by classical results on weighted least squares under heteroskedasticity. After removing intrinsic content-level effects and user-specific average effects, residual evaluations can be modeled as conditionally unbiased signals with user-specific variance; in this setting, inverse-variance weighting yields the minimum-variance unbiased aggregation of signals [4, 12, 27, 46]. Empirically, we show that our algorithm improves out-of-sample predictive performance relative to the deployed algorithm.

Better predictive performance yields harm reduction at scale. Previous work shows that the effectiveness of community annotations is highly time-sensitive: notes attached earlier yield substantially larger reductions in engagement and diffusion than notes attached later. Posts receive about 50% fewer reposts when notes are attached within 12 hours, compared to reductions of less than 10% when notes are attached after about 48 hours [48]. A more predictive auditing rule increases the signal-to-noise ratio of early aggregates, reducing the number of ratings required to reach a confident helpfulness decision and shortening time-to-attachment.

In sum, our work makes three key contributions: (1) we provide empirical evidence that consensus-based auditing systematically alters minority behavior and reduces engagement with controversial topics; (2) we introduce a behavioral model that explains these shifts; (3) we design and evaluate an alternative auditing and aggregation algorithm that improves predictive performance while preserving the participation of minority contributors.

Related Work

The promise of crowd-sourced evaluation is often motivated by the "wisdom of crowds": when individual judgments are diverse and independent, their aggregation can outperform individual experts [23, 52].
However, when individuals are exposed to others' opinions, social influence can undermine independence and induce herding [9, 3, 20, 33]. Studies show that once early public feedback is observed, later contributors may follow the emerging trend, causing correlated errors and conformity [2, 16]. Consensus-based auditing creates an additional channel for such effects: when contributors are rewarded for matching the eventual consensus rather than for providing informative signals, they are encouraged to conform [26, 35, 31, 62, 43]. This is particularly problematic in crowd-sourced moderation systems like Community Notes, where viewpoint diversity can mitigate biases in content labeling [53]. Yet empirical evidence on how consensus-based auditing rules shape contributor behavior in deployed crowd-sourced moderation systems remains limited.

To date, empirical work on Community Notes has primarily focused on downstream effects of note attachment on user behavior and information diffusion. Studies show that posts with attached notes have lower engagement (measured in terms of likes, replies, views, and reposts), with especially strong effects when notes are attached earlier in a post's lifecycle [48, 24, 18, 11, 6]. Other studies find that notes receiving broad contributor endorsement are perceived as more trustworthy by users than platform-issued misinformation labels [19]. Partisan asymmetries in content moderation participation have also been documented: contributors are more likely to challenge counter-partisan content and to rate co-partisans' notes as more helpful [5, 45, 28]. Together, this literature establishes that Community Notes can affect engagement, trust, and partisan dynamics, while leaving open how platform incentive and eligibility rules shape contributor participation and coverage across topics.
One difficulty that crowd-sourced moderation systems face is the lack of a ground truth for estimated quantities like note helpfulness and user latent factors. A large literature addresses the problem of inferring contributor reliability in such settings. For example, in the crowd-sourcing and labeling literature, worker-task models estimate worker accuracy and task difficulty from patterns of agreement and disagreement [17, 44, 29, 30]. Relatedly, a body of literature studies the design of mechanisms to elicit truthful information without verified labels, including proper scoring rules and peer-prediction methods [25, 35, 43, 64, 47]. A common lesson from these works is that participants should be evaluated against targets not determined by themselves, reducing incentives to coordinate on a focal consensus [35, 43, 25, 32, 21]. These approaches formalize the idea that reliability should be learned from error structure, not simply from matching a majority vote.

Methodologically, the Community Notes algorithm is closely related to latent-factor models and collaborative filtering in recommender systems, where one seeks to infer user and item attributes from sparse, noisy ratings [58, 7, 29]. A parallel line of work studies the design of aggregation rules that limit influence from noisy or adversarial users [46]. In statistics, classical results on weighted least squares show that inverse-variance weighting yields efficiency gains under heteroskedastic noise [4, 12]. Our two-stage algorithm combines these principles: it draws on the logic of scoring contributors against targets not defined purely by raw agreement with consensus, and uses these scores to construct weights in a second-stage aggregation model. This combination links practical auditing in crowd-sourced moderation to established ideas in statistical efficiency and incentive-compatible information elicitation.
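The weighted-least-squares efficiency result invoked above can be checked numerically. The sketch below is illustrative only: the noise levels and the latent quality are made up, not estimated from Community Notes data. It shows that when unbiased raters differ in noise, the inverse-variance-weighted combination of their signals has lower variance than the unweighted mean.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three unbiased raters observe the same latent quality, with
# different noise levels (heteroskedastic signals).
true_quality = 0.7
sigmas = np.array([0.05, 0.20, 0.40])
n_trials = 20_000
signals = true_quality + sigmas * rng.standard_normal((n_trials, 3))

# Inverse-variance weights, normalized to sum to one.
w = 1.0 / sigmas**2
w = w / w.sum()

uniform_est = signals.mean(axis=1)   # unweighted average
ivw_est = signals @ w                # inverse-variance-weighted average

# Both estimators are unbiased, but the inverse-variance combination
# has strictly smaller variance (the classical WLS result).
print(uniform_est.var(), ivw_est.var())
```

Under these noise levels the weighted estimator's variance equals 1 / Σ(1/σ²ᵢ) ≈ 0.0023, an order of magnitude below the unweighted mean's ≈ 0.0225.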
X Community Notes

In the Community Notes program, users can write short annotations (called notes) that provide context for potentially misleading or disputed content on the platform, and rate notes written by other users as Helpful, Somewhat Helpful, or Not Helpful. These ratings are aggregated using a matrix factorization algorithm that determines which notes are displayed publicly under the corresponding posts, while all remaining notes are kept hidden [65]. As a result, aggregation outcomes directly shape which information is surfaced and which contributors shape public discourse.

Aggregation via Matrix Factorization

For each user-note pair $(u, n)$, let $r_{un} \in \{0, 0.5, 1\}$ denote the observed rating, where Helpful, Somewhat Helpful, and Not Helpful responses are mapped to 1, 0.5, and 0, respectively [65]. The platform assumes that ratings are modeled as

$$r_{un} = \mu + h_u + i_n + f_u \cdot g_n, \qquad (1)$$

where $\mu$ is a global intercept, $h_u$ is a rater intercept capturing user $u$'s baseline agreeability (the tendency to mark notes as helpful rather than unhelpful, regardless of content), $i_n$ is a note intercept capturing the perceived overall helpfulness of note $n$, and $f_u, g_n \in \mathbb{R}$ are latent rater and note factors whose product represents the ideological alignment between user and note.

The platform estimates these parameters by solving the regularized least-squares problem

$$\hat\mu, \hat h_u, \hat i_n, \hat f_u, \hat g_n = \arg\min_{\mu, h_u, i_n, f_u, g_n} \sum_{(u,n)\ \mathrm{observed}} (r_{un} - \hat r_{un})^2 + \lambda_u \sum_u \left(\|h_u\|^2 + \|f_u\|^2\right) + \lambda_n \sum_n \left(\|i_n\|^2 + \|g_n\|^2\right), \qquad (2)$$

where $\hat r_{un} = \mu + h_u + i_n + f_u \cdot g_n$ is the model prediction and $\lambda_u, \lambda_n$ are regularization parameters. In this formulation, the note intercept $i_n$ is the primary quantity of interest, as it captures how broadly helpful a note is across the user base and directly determines its eligibility for public display. Notes with estimated intercept $\hat i_n \geq 0.4$ are classified as Helpful and shown publicly beneath the corresponding posts, while notes with $\hat i_n < 0.4$ are withheld from public display. Among these, notes with $\hat i_n < -0.05$ are classified as Not Helpful and may generate negative feedback for both the note's author and raters who marked the note as helpful [67].^6

Rating Impact

In September 2022, Community Notes introduced the Rating Impact feature, which links rating aggregation to contributors' continued participation. Rating Impact is a user-level score computed based on whether a contributor's ratings align with a note's eventual Helpful / Not Helpful classification; agreement increases the score, while disagreement decreases it [41, 40]. A user's Rating Impact score determines their ability to write Community Notes; new contributors must reach a minimum Rating Impact threshold before they can write notes. Writers have a separate Writing Impact score, which governs their ability to write notes on the platform; writers lose the ability to submit new notes if at least 3 of their 5 most recently written notes have been labeled Not Helpful after aggregation [14].

While the stated goal of this system is to surface notes from contributors with a track record of accuracy, it may create conformity incentives. To gain or retain writing privileges, contributors may feel pressure to align both their ratings and their note content with anticipated majority views, as repeated disagreement risks reduced influence and loss of access. The September 2022 policy introduction, together with our use of October 1, 2022 as a conservative operational cutoff, therefore provides a natural policy discontinuity for our study.

Data

Our analysis uses the publicly released Community Notes dataset, which contains the complete history of notes and ratings since Community Notes' launch in 2021 [68].^7
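Before turning to the data, the aggregation objective in (2) can be illustrated with a small self-contained fit. This is a sketch on synthetic ratings using plain stochastic gradient descent; all sizes, hyperparameters, and the optimizer are made up for illustration and are not the platform's open-source scorer.

```python
import numpy as np

# Toy fit of the rating model r_un = mu + h_u + i_n + f_u * g_n via SGD
# on the regularized squared loss. Illustrative only.
rng = np.random.default_rng(1)
n_users, n_notes, n_obs = 40, 30, 2000

# Simulate ratings from a ground-truth instance of the model.
mu = 0.5
h = 0.1 * rng.standard_normal(n_users)   # rater intercepts (agreeability)
i = 0.2 * rng.standard_normal(n_notes)   # note intercepts (helpfulness)
f = rng.standard_normal(n_users)         # latent rater factors
g = rng.standard_normal(n_notes)         # latent note factors
users = rng.integers(0, n_users, n_obs)
notes = rng.integers(0, n_notes, n_obs)
r = mu + h[users] + i[notes] + f[users] * g[notes] \
    + 0.05 * rng.standard_normal(n_obs)

# SGD on the regularized least-squares objective.
lam, lr = 0.02, 0.05
mu_e = r.mean()
h_e, i_e = np.zeros(n_users), np.zeros(n_notes)
f_e = 0.1 * rng.standard_normal(n_users)
g_e = 0.1 * rng.standard_normal(n_notes)
for _ in range(100):
    for u, n, obs in zip(users, notes, r):
        err = obs - (mu_e + h_e[u] + i_e[n] + f_e[u] * g_e[n])
        mu_e += lr * err
        h_e[u] += lr * (err - lam * h_e[u])
        i_e[n] += lr * (err - lam * i_e[n])
        f_old = f_e[u]
        f_e[u] += lr * (err * g_e[n] - lam * f_e[u])
        g_e[n] += lr * (err * f_old - lam * g_e[n])

pred = mu_e + h_e[users] + i_e[notes] + f_e[users] * g_e[notes]
mse = float(np.mean((pred - r) ** 2))
print(mse)
```

The fit recovers the intercepts and factors up to the usual sign/scale ambiguity in $f \cdot g$, driving the training MSE close to the simulated noise floor.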
Our primary analysis window spans June 1, 2022 through May 31, 2023, covering several months on both sides of the rollout of Rating Impact. For each rating event, the dataset records the note identifier n, rater identifier u, timestamp, selected rating, a summary of the note content, and metadata about the associated post. The dataset does not include estimates of the latent parameters $(h_u, i_n, f_u, g_n)$, as these quantities are re-estimated by the platform as new data arrive.

To study how the Rating Impact policy affects contributor behavior, we reconstruct the latent factors by running the platform's matrix factorization algorithm on the open-source ratings data. Specifically, we recover weekly estimates of $(h_u, i_n, f_u, g_n)$ by applying the Community Notes aggregation algorithm to cumulative ratings data up to each week. We implement the publicly released December 31, 2022 version of X's open-source matrix factorization code [65, 55].^8 As a robustness check, we additionally incorporate the platform's improved algorithm from the May 2025 release (see Supplementary Information, Section 2, for details). While both implementations share the same underlying matrix factorization framework described earlier, they differ modestly in operational details such as regularization choices, handling of ties, and per-topic factorizations. We use both versions as a sensitivity check to verify that our results are robust to reasonable variation in the estimation procedure.

6: The production scoring system has evolved over time; the formulation here reflects the core structure and thresholds relevant for our analysis. Our empirical replication uses the corresponding open-source implementation released by platform X, though operational details such as confidence adjustments or per-topic factorizations may differ.
7: The initial name for Community Notes was Birdwatch.
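The weekly re-estimation procedure described above (refit on all cumulative ratings up to each week's end) can be sketched as a windowing loop. The data and the `fit_matrix_factorization` helper below are stand-ins: the real pipeline calls the open-source scorer, while here the helper just records the size of each cumulative window so the loop's logic can be checked.

```python
import pandas as pd

# Stand-in for the matrix factorization fit: records window size only.
def fit_matrix_factorization(ratings: pd.DataFrame) -> dict:
    return {"n_ratings": len(ratings)}

# Tiny synthetic ratings log (rater, note, rating value, timestamp).
ratings = pd.DataFrame({
    "rater_id": [1, 2, 1, 3, 2, 1],
    "note_id": [10, 10, 11, 11, 12, 12],
    "rating": [1.0, 0.5, 0.0, 1.0, 1.0, 0.5],
    "ts": pd.to_datetime(["2022-09-05", "2022-09-06", "2022-09-14",
                          "2022-09-20", "2022-09-21", "2022-09-28"]),
})

# Refit each week on ALL ratings observed up to that week's end.
weekly_estimates = {}
for week_end in pd.date_range("2022-09-11", "2022-10-02", freq="W-SUN"):
    cumulative = ratings[ratings["ts"] <= week_end]
    weekly_estimates[week_end] = fit_matrix_factorization(cumulative)

counts = [v["n_ratings"] for v in weekly_estimates.values()]
print(counts)
```

Because the windows are cumulative rather than disjoint, the per-week sample sizes are monotonically nondecreasing, which is what stabilizes later weekly estimates.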
8: The code is publicly available on the platform's GitHub repository.

Empirical Results

We document three empirical patterns in platform activity around the September 2022 introduction of the Rating Impact system, using October 1, 2022 as the analysis cutoff date^9; we refer to dates before October 1 as pre-rollout and dates after as post-rollout. First, we observe that minority-aligned raters shift their evaluations toward the majority. Second, controversial content receives relatively lower engagement post-rollout, potentially contributing to fewer visible annotations. Finally, the platform's latent-factor model exhibits reduced out-of-sample predictive performance post-rollout. These observations align with the hypothesis that raters behaved strategically in response to the Rating Impact policy. The following subsections present each pattern in turn.

Minority Behavior Shift

First, we examine whether the introduction of the Rating Impact system altered how minority raters evaluate notes. Recall that each user and note is assigned a latent factor, $f_u$ and $g_n$, respectively, representing their ideology.^10 For a user-note pair $(u, n)$, the product $f_u \cdot g_n$ represents the rater-note alignment, and its magnitude measures the strength of that alignment. As proxies for behavioral change, we study (i) shifts in the distributions of rater and note factors, and (ii) changes in the predictive role of user-note alignment for Helpful ratings.

(i) Latent Factor Distribution Shift

Figure 1a plots the distribution of rater factors in the pre-rollout and post-rollout periods. Relative to the pre-rollout period, the share of raters in the majority-aligned mode increases from 58.4% to 63.1%, while the share in the minority-aligned mode decreases from 41.6% to 36.2%.
Within the minority group, the distribution also shifts toward zero: the mean absolute factor $|f_u|$ for minority-aligned raters declines from 0.522 (CI: ±0.177) pre-rollout to 0.413 (CI: ±0.215) post-rollout. This change reflects both a subset of minority-aligned raters moving closer to the center and a subset crossing over to the opposite alignment. A parallel pattern is visible in the note factor distribution: the mean absolute note factor $|g_n|$ changes from 0.451 (CI: ±0.25) pre-rollout to 0.408 (CI: ±0.233) post-rollout. This suggests that the notes produced by minority-aligned writers are, on average, positioned closer to the center of the latent-factor spectrum in the later period.^11 Note that user and note factors are computed relative to all user-note interactions on the platform.

To distinguish behavioral adaptation from compositional change due to the entry of new raters, we compare factor shifts for two groups. "Early Users" are users who have been active raters on the platform since before October 1, 2022; "New Users" are users who joined as new raters on X Community Notes between October 1, 2022 and January 1, 2023. For users in each group for whom both factors exist, we compute the factor shift as the difference between their first latent factor after January 1, 2023 and their first latent factor after October 1, 2022. We then run a permutation test on the mean factor shift (10,000 permutations) and find that early users' shifts were more negative than new users' ($\Delta = -0.053$, $p < 0.001$). This is another indicator that the Rating Impact policy influences user behavior, with users who experienced the policy change becoming more strategic in their rating.
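The permutation test above can be sketched as follows. The per-user shift arrays here are synthetic stand-ins (drawn so that the early-user cohort shifts more negatively, mirroring the reported pattern); in the paper they are changes in estimated rater factors between the two dates.

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical per-user factor shifts for the two cohorts.
early_shifts = rng.normal(-0.05, 0.1, 300)   # active before the rollout
new_shifts = rng.normal(0.0, 0.1, 200)       # joined after the rollout

# Test statistic: difference in mean shift between cohorts.
observed = early_shifts.mean() - new_shifts.mean()

# Null: cohort labels are exchangeable. Re-split the pooled shifts at
# random many times and recompute the statistic.
pooled = np.concatenate([early_shifts, new_shifts])
n_early = len(early_shifts)
perm_diffs = np.empty(10_000)
for b in range(10_000):
    perm = rng.permutation(pooled)
    perm_diffs[b] = perm[:n_early].mean() - perm[n_early:].mean()

# Two-sided p-value: how often a random split yields a gap this large.
p_value = float(np.mean(np.abs(perm_diffs) >= abs(observed)))
print(observed, p_value)
```

The permutation null requires no distributional assumptions on the shifts, only exchangeability of cohort labels under the null, which is why it is a natural fit for these reconstructed factor estimates.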
(ii) User-Note Alignment

To assess whether the predictive role of user-note alignment changed at the Rating Impact rollout, we use Spearman's correlation coefficient as a proxy. We use ratings data from August 1, 2022 to January 1, 2023, focusing on a fixed cohort of 1,202 users who were active on the platform both before and after the October 1, 2022 cutoff. For each period, we compute the Spearman correlation between the rater-note factor dot product and helpfulness ratings, taking the difference (post minus pre) as our test statistic. The correlation declined from 0.792 before the rollout to 0.525 after, an observed difference of -0.267. To assess whether this decline is statistically significant, we conduct a permutation test with 1,000 iterations, randomly reassigning ratings to the pre/post groups while preserving the original group sizes, which gave a p-value of 0.004.

9: October 1 provides a conservative operational cutoff following the period when the relevant eligibility rules began to take effect in our dataset.
10: These factors can theoretically be vectors, but in the context of X's algorithm the factors are scalars.
11: Note factors partly reflect writer behavior but also depend on endogenous note entry and on which notes receive sufficient evaluation to be estimated.

Figure 1: Visualization of the rater factor (a) and note factor (b) distribution shifts over time.

In the Appendix, we conduct sensitivity tests of these behavioral shifts with various additional metrics. Taken together, these patterns are consistent with the hypothesis that after the Rating Impact rollout, minority-aligned raters adjusted their evaluations toward the anticipated consensus, reducing observable disagreement.

Annotations on Controversial Topics

Another indication of changing rater behavior is the extent to which raters engage with controversial vs.
non-controversial notes; agreement with the majority is far more likely on non-controversial notes, thereby increasing the potential to boost a rater's Rating Impact score. To study this, we compare annotation patterns for controversial and non-controversial notes before and after the rollout of the Rating Impact system. We define controversy at the note level using two complementary approaches: a topic-based classification and a factor-based classification.

Topic-based classification. For topic-based classification, we assign each note to a content topic based on its summary text, using a retrained version of X's public topic-assignment pipeline (details in Appendix C). We then flag as controversial those topics that, in platform discourse and prior literature, are known to be highly polarized (e.g., national politics, public health, geopolitical conflicts). This classification captures domain-level controversy, independent of individual note characteristics. We additionally use Large Language Model (LLM)-based topic assignments as an alternative classification procedure (details in Appendix C).

Factor-based classification. For factor-based classification, we use the absolute value of the estimated note factor, $|g_n|$, as a continuous measure of ideological alignment. In the Community Notes latent-factor model, notes with factors far from zero are those that receive systematically different evaluations from different groups of raters. We therefore classify a note as controversial if $|g_n|$ lies in the top quantile of the distribution. This approach captures instance-level controversy even within otherwise non-polarized topics.

Figure 2: Rolling Spearman correlation between rater-note factor dot-product alignment and helpfulness ratings for the cohort of 1,202 early users, computed over a sliding window of 50 ratings sorted by date.
The red dashed line marks the algorithmic change on October 1, 2022. The bold line shows a LOWESS smooth of the rolling correlation. Prior to the intervention, the correlation is relatively stable, whereas after the intervention it declines steadily. This is an observational indicator that the rollout of Rating Impact weakened the relationship between rater-note alignment and helpfulness ratings among the group of users who were active before the change.

We use both definitions in parallel to ensure robustness. The topic-based approach provides interpretability, while the factor-based approach is model-derived and sensitive to within-topic variation. Our empirical results are qualitatively consistent across both definitions. In this section, we report results using the topic-based classification, leaving factor-based results to Appendix C.

Figure 3 presents the weekly share of tweets receiving a first note that ultimately attains Helpful status before and after the rollout date, separately for controversial and non-controversial topics. For controversial topics, the helpful share increases^12 from 0.061 (Wilson CI [0.044, 0.084]) pre-rollout to 0.126 (Wilson CI [0.109, 0.146]) post-rollout, a change of 6.5 percentage points. In contrast, non-controversial topics see an increase from 0.062 (Wilson CI [0.027, 0.138]) to 0.199 (Wilson CI [0.151, 0.258]), a change of 13.7 percentage points.

We use a difference-in-differences (DiD) design to estimate the effect of the rollout of the Rating Impact system on note helpfulness for controversial vs. non-controversial notes. Using a symmetric 12-week band around October 1, 2022, we find that the probability that a non-controversial note is rated Helpful increased by around 9 percentage points (95% CI [0.015, 0.161]) more than the probability that a controversial note is rated Helpful.
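The DiD comparison just described can be sketched on synthetic weekly series. The weekly helpful-share numbers below are made up to mimic the reported pattern (both groups rise post-rollout, the non-controversial group rises more); the estimator itself is the standard two-group, two-period difference of differences.

```python
import numpy as np

rng = np.random.default_rng(7)
weeks = 24                          # symmetric band: 12 pre, 12 post
post = np.arange(weeks) >= 12       # treatment-period indicator

# Synthetic weekly helpful shares for the two topic groups.
noncontroversial = 0.06 + 0.14 * post + 0.01 * rng.standard_normal(weeks)
controversial = 0.06 + 0.06 * post + 0.01 * rng.standard_normal(weeks)

# DiD: (post - pre) change for non-controversial topics minus the
# (post - pre) change for controversial topics.
did = ((noncontroversial[post].mean() - noncontroversial[~post].mean())
       - (controversial[post].mean() - controversial[~post].mean()))
print(round(float(did), 3))
```

Differencing twice removes both the group-level baseline gap and any shock common to all topics at the rollout date, which is why the estimate isolates the differential movement (here, roughly 8 pp by construction).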
These shifts are also reflected in the counts during the 12-week band before and after the cutoff date: the proportion of tweets receiving at least one new note in controversial topics decreases (-3.4 pp), while the proportion of tweets receiving at least one new note in non-controversial topics rises (+3.4 pp). This trend of reduced engagement on controversial topics is especially present in the minority group. In Appendix C, we run an additional DiD study on the proportion of ratings assigned to controversial notes for individuals in the minority group vs. the non-minority group. We find that, in a 12-week band around the rollout date, the proportion of controversial notes rated by minority users decreases 7.53 percentage points relative to the change observed for non-minority users ($p = 0.0284$).

We emphasize that these comparisons are descriptive; they show a post-rollout divergence in annotation outcomes between controversial and non-controversial topics, but they do not by themselves identify the mechanism. However, the consistent pattern across both classification schemes is in line with the hypothesis that, under the Rating Impact system, content on controversial topics receives fewer helpful ratings and correspondingly fewer surfaced annotations, while non-controversial topics receive more.

12: The definition of "helpfulness" was relaxed around the time of the Rating Impact rollout, so helpfulness for both controversial and non-controversial notes increases [65].

Figure 3: Pre-post change in the share of notes with final status Helpful by controversy category around the Rating Impact rollout (cutoff: 2022-10-01). Bars show the mean proportion in the approximately 20 weeks before (Pre, blue) and after (Post, orange) the cutoff; error bars are 95% CIs. Text above bars reports the Post-Pre difference in percentage points.
The increase is larger for non-controversial notes (+13.7 pp) than for controversial notes (+6.5 pp).

Weeks (Pre, Post) | Estimate | 95% CI           | p-value
(12, 0-12)        | 0.0882   | [0.0152, 0.1612] | 0.0179
(12, 13-26)       | 0.1185   | [0.0311, 0.2059] | 0.0079

Table 1: DiD estimates. Windowed difference-in-differences estimates of the change in the weekly helpfulness-rate difference between non-controversial and controversial content after the October 1, 2022 rollout. The outcome is the difference in the weekly proportion of tweets receiving a note that is ultimately rated Helpful between tweets with controversial vs. non-controversial notes. The gap increases by 8.8 pp in the first 12 weeks post-rollout and by 11.9 pp in weeks 13-26. Both effects are positive and statistically significant.

Predictive Accuracy

We study how the predictive performance of the platform's latent-factor model changes around the rollout of Rating Impact. For each calendar week $t$, we fit the X Community Notes matrix factorization algorithm on cumulative data up to week $t$, and then evaluate two types of prediction error. The in-sample error measures the error of model predictions on ratings in week $t$. The one-week-ahead error measures the error of model predictions on ratings in week $t+1$, restricted to rating pairs $(u, n)$ where both the rater and the note are observed in week $t$ (see Appendix D for implementation details). Figure 4 plots the one-week-ahead mean squared error (MSE) over time, with the rollout week marked, and Table 2 aggregates by period. The post-rollout period shows higher and more volatile error than the pre-rollout period, with in-sample MSE increasing over 116% and one-week-ahead MSE increasing over 76% between the pre- and post-rollout periods.

               | Pre-Rollout            | Post-Rollout           | % Change
In-sample      | 0.0488 (0.0327, 0.0648) | 0.1057 (0.1005, 0.1109) | +116.60%
One-week-ahead | 0.0927 (0.0600, 0.1254) | 0.1634 (0.1429, 0.1839) | +76.27%

Table 2: Prediction accuracy of matrix factorization.
This table shows the average increase in in-sample MSE and one-week-ahead MSE of the matrix factorization estimates, computed as averages over three months pre- and post-rollout. Numbers in parentheses indicate 95% confidence intervals.

Auditing by Predictive Stability: Two-Stage Weighted Matrix Factorization

The empirical patterns suggest that the platform's current auditing may create incentives for conformity that reduce diversity in ratings and coverage of controversial topics.

Figure 4: This figure shows the weekly mean squared error (MSE) for in-sample vs. out-of-sample predictions from the matrix factorization model. The MSE is computed as the squared difference between the observed rating outcomes and the model's predicted rating outcomes (pre-discretization). In-sample errors reflect fit to the same week's ratings, while out-of-sample errors use factors estimated from week t to predict ratings in week t + 1. The vertical dashed line marks the Rating Impact analysis date we use.

We therefore propose an alternative rule that separates auditing from agreement. Following well-established precedent in the literature, we propose an alternative weighting method, which we refer to as weighted matrix factorization, that targets rater reliability directly rather than agreement with the final note status. The method consists of the following steps:

First stage: Compute matrix factorization estimates \hat\mu, \hat h_u, \hat i_n, \hat f_u, \hat g_n as in (2).

Second stage:

1. Compute residuals: For each observed user-note pair (u, n), compute the first-stage residual

   e^{(1)}_{un} = r_{un} - \hat\mu - \hat h_u - \hat i_n - \hat f_u \cdot \hat g_n.

2. Estimate variance: For each user u, estimate the empirical variance of their ratings as

   \hat\sigma^2_u = \frac{1}{|N(u)|} \sum_{n \in N(u)} \bigl(e^{(1)}_{un}\bigr)^2,

   where N(u) is the set of notes that user u has rated.

3.
Refit intercepts & factors: Run a final weighted regression:

   \arg\min_{\tilde\mu, \tilde h_u, \tilde i_n, \tilde f_u, \tilde g_n} \sum_{(u,n)\ \text{observed}} \frac{1}{\hat\sigma^2_u} \bigl( r_{un} - \tilde\mu - \tilde h_u - \tilde i_n - \tilde f_u \cdot \tilde g_n \bigr)^2.

This step adjusts the intercepts and factors after incorporating the user-variance terms. In principle, one could also recalculate the \hat\sigma^2_u and iterate.

The inverse-variance weight 1/\hat\sigma^2_u measures how predictable a rater's behavior is, given their latent position. Because weights are based on internal consistency rather than agreement with other raters, consistent minority raters keep their influence even when they diverge from the majority. We expect this approach to:

• Mitigate conformity incentives: Weights depend on consistency, not consensus alignment.
• Preserve minority contributions: Consistent raters from minority viewpoints are not penalized for disagreement.
• Improve predictive performance: As in WLS, accounting for heteroskedasticity should reduce mean squared error in prediction.

Empirical Predictive Performance

To test the performance of our weighted matrix factorization algorithm, we run it on the Community Notes dataset, modifying the Community Notes matrix factorization algorithm. We use ratings data from Jan. 1, 2023 to June 1, 2024. In particular, the ratings data from Jan. 1, 2023–July 1, 2023 serves as a warm start for the matrix factorization algorithm, and we evaluate our method on the ratings data from July 1, 2023–June 1, 2024. A detailed implementation description is given in the Appendix. In Figures 5a and 5b we compare the mean absolute residual and the median absolute residual on the one-week-ahead predictions from the matrix factorization approach vs. the two-stage approach. Using our proposed two-stage approach, the mean absolute residual is 5.73% lower on average, and the median absolute residual is 27.99% lower on average.
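The two-stage procedure above can be sketched end to end on synthetic data. The sketch below is illustrative only: it replaces the production Community Notes fitter with a simple alternating ridge solver, and all data, parameter values, and the `fit_mf` helper are our own assumptions, not the platform's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_mf(R, mask, w_u, n_iters=50, lam=0.1):
    """Rank-1 MF with intercepts via weighted alternating ridge updates.

    Minimizes sum_{(u,n) observed} w_u (r_un - mu - h_u - i_n - f_u g_n)^2
    plus small ridge terms (an illustrative stand-in for (2)).
    """
    U, N = R.shape
    mu = R[mask].mean()
    h, i = np.zeros(U), np.zeros(N)
    f, g = rng.normal(0, 0.1, U), rng.normal(0, 0.1, N)
    W = mask * w_u[:, None]
    for _ in range(n_iters):
        h = (W * (R - mu - i[None, :] - np.outer(f, g))).sum(1) / (W.sum(1) + lam)
        i = (W * (R - mu - h[:, None] - np.outer(f, g))).sum(0) / (W.sum(0) + lam)
        E = R - mu - h[:, None] - i[None, :]
        f = (W * E * g[None, :]).sum(1) / ((W * g[None, :]**2).sum(1) + lam)
        g = (W * E * f[:, None]).sum(0) / ((W * f[:, None]**2).sum(0) + lam)
    return mu, h, i, f, g

# Synthetic latent-signal data with heteroskedastic rater noise.
U, N = 200, 100
f_true, g_true = rng.normal(0.5, 1, U), rng.normal(0, 1, N)
h_true, i_true = rng.normal(0, 0.2, U), rng.normal(0, 0.5, N)
sigma_u = rng.uniform(0.05, 1.0, U)                # per-rater noise level
R = (0.5 + h_true[:, None] + i_true[None, :] + np.outer(f_true, g_true)
     + sigma_u[:, None] * rng.normal(size=(U, N)))
mask = rng.random((U, N)) < 0.5                    # missing completely at random

# First stage: unweighted fit, then per-rater residual-variance estimates.
mu1, h1, i1, f1, g1 = fit_mf(R, mask, np.ones(U))
resid = R - mu1 - h1[:, None] - i1[None, :] - np.outer(f1, g1)
sigma2_hat = (mask * resid**2).sum(1) / mask.sum(1)

# Second stage: refit with inverse-variance weights.
mu2, h2, i2, f2, g2 = fit_mf(R, mask, 1.0 / sigma2_hat)

err1 = np.abs(i1 - i_true).mean()
err2 = np.abs(i2 - i_true).mean()
print(f"mean |i_hat - i_true|: unweighted {err1:.4f}, two-stage {err2:.4f}")
```

In this synthetic setting the second-stage note-helpfulness estimates track the truth more closely because noisy raters are downweighted, mirroring the WLS intuition above.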
Figure 5: Weekly out-of-sample (one-week-ahead) prediction residuals estimated using matrix factorization vs. the two-stage approach. Figure 5a shows the mean absolute residuals, with error bars, for each week's predictions. Figure 5b shows the median absolute residuals for each week's predictions.

A Behavioral Model of Strategic Conformity

What counts as "truth" in content moderation is not straightforward. Whether a note is helpful can depend on context, values, and viewpoint, and in many settings there is no exogenous ground truth that the platform can audit against. Yet deployed systems necessarily act as if there is a stable truth to infer. In particular, both X's Community Notes model and related designs (including Meta's) are built around an implicit premise: contributors receive noisy private signals about a latent underlying evaluation (as in Equation 1), and aggregation can recover the latent helpfulness of a note when signals are diverse and independent. In this section, we take that premise at face value and assume that contributors truly do observe latent signals that fit the platform's modeling assumptions. We then ask: even if this latent-signal model were correct, would a consensus-based auditing system recover the underlying evaluation once contributors anticipate the platform's eventual consensus? Our analysis shows that it does not. When contributors are rewarded or penalized based on agreement with anticipated consensus, they have incentives to report strategically rather than to report their private evaluations (signals) about the content. This systematically biases the platform's inferred quantities, particularly on controversial content, where conformity pressure is strongest.
Finally, we show how shifting auditing from agreement with the majority to out-of-sample stability of residual behavior can remove the direct incentive to match the anticipated platform outcome and gives an unbiased estimator for the helpfulness of a note.

Model

We consider a system with U users and N notes, where users u rate notes n that appear on their feeds. Following the latent-signal interpretation implicit in matrix factorization [55], suppose that each user u observes a latent signal s_un on note n with additive noise \epsilon_{un}:

   r^\star_{un} = s_{un} + \epsilon_{un}, \qquad s_{un} = \mu + h_u + i_n + f_u g_n,

where E[\epsilon_{un}] = 0 and E[\epsilon^2_{un}] := \sigma^2_u \in (0, \infty). The errors \epsilon_{un} are independent across users u and i.i.d. across notes for the same u. As in (1), the main quantity of interest is i_n, which captures the perceived overall helpfulness of note n. Here, \mu is a global intercept, and h_u is a rater intercept capturing user u's baseline tendency to rate notes as helpful rather than unhelpful, independent of the note's content. The latent variables f_u, g_n \in R are user and note factors, respectively, and their product f_u g_n captures the extent to which a note is viewed as more or less helpful by users with different latent positions.

The platform does not observe the latent quantity r^\star_{un} directly. Instead, it observes reported ratings, which we denote by a_un, and fits the matrix factorization model in (2) to these reports using ridge-regularized least squares. Specifically, the platform computes

   \arg\min_{\mu, h, i, f, g} \sum_{(u,n) \in \Omega} \bigl( a_{un} - \mu - h_u - i_n - f_u g_n \bigr)^2 + \lambda_h \|h\|_2^2 + \lambda_i \|i\|_2^2 + \lambda_f \|f\|_2^2 + \lambda_g \|g\|_2^2,   (3)

where \Omega denotes the set of observed ratings. Let \hat\mu, \hat h_u, \hat i_n, \hat f_u, \hat g_n denote the resulting estimates.

Modeling users' behavior

Consensus-based auditing ties a contributor's future standing (eligibility or influence) to whether their ratings agree with the platform's eventual aggregate outcome.
At the time of rating, however, that outcome has not yet been realized. Contributors therefore form expectations about it using information that is broadly observable on the platform, including past notes and ratings, visible patterns in prior outcomes, and the aggregation rule itself. These expectations may differ across contributors, reflecting differences in attention or inference. Still, because they are formed from largely shared information, it is natural to model them as noisy forecasts of a common note-level consensus target.

Definition 1. Let c_n \in [0, 1] be a scalar metric that denotes the controversy of note n. For a note n, let m_n denote the anticipated platform consensus, that is, the outcome contributors expect the platform eventually to assign to that note.

We leave m_n unrestricted. It may reflect a simple heuristic, such as averaging visible prior ratings, or a more sophisticated forecast of the score implied by the platform's aggregation rule. Contributor-specific differences in attention or inference are captured as additional noise around this target. Specifically, contributor u observes

   \tilde m_{un} = m_n(c_n) + \epsilon^m_{un},

where \epsilon^m_{un} captures heterogeneity in perceptions across users, with E[\epsilon^m_{un} \mid f_u, g_n, c_n, \rho_n] = 0 and Var(\epsilon^m_{un} \mid f_u, g_n, c_n, \rho_n) = \sigma^2_{m,u}(c_n). Allowing \sigma^2_{m,u}(\cdot) to depend on c_n captures the idea that forecasts of the platform outcome may be noisier on more controversial notes.

We next model how contributors choose ratings. A contributor u chooses a latent report a_un \in R on note n by balancing two objectives: (i) reporting their private signal r^\star_{un} against (ii) aligning with what they expect the platform to treat as the consensus, which we denote by \tilde m_{un}. The first term captures truthful reporting.
The second captures the incentive created by consensus-based auditing: contributors anticipate that future standing on the platform (such as eligibility or influence) depends on whether their ratings agree with the platform's eventual aggregate outcome. Deviating from the anticipated outcome therefore carries an expected penalty. We allow this tradeoff to depend on the controversy level c_n. On more controversial notes, the incentive to anticipate the platform's outcome is plausibly stronger, so the weight placed on the contributor's own signal is weaker. We capture this expected downstream consequence of disagreement using a smooth quadratic loss, which serves as a reduced-form approximation to the platform's discrete eligibility and impact rules. Similar tensions between private information and social conformity arise in models of social learning and information cascades [9, 49, 1, 2].

Definition 2 (User's Utility). User u's utility for a report a \in R on note n is defined to be

   U_{un}(a \mid r^\star_{un}, c_n) = -\frac{\rho(c_n)}{2}\,(a - r^\star_{un})^2 - \frac{1-\rho(c_n)}{2}\,(a - \tilde m_{un}(c_n))^2 + \zeta_{un}.   (4)

Here, \rho(\cdot) \in [0, 1] is a conformity weight that is weakly decreasing in c_n; it reflects the intuition that as the controversy of a note decreases, users are more inclined to report their true signal, and as the controversy of a note increases, users are more inclined to conform to the majority. \zeta_{un} is an idiosyncratic payoff shock (mean zero and finite variance, i.i.d. across users u and notes n) that generates residual noise in choices independent of the action a.^{13}
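For completeness, the first-order condition behind the report rule is a one-line calculation:

```latex
% First-order condition of the utility (4) in the report a:
\partial_a U_{un}(a \mid r^\star_{un}, c_n)
  = -\rho(c_n)\,\bigl(a - r^\star_{un}\bigr)
    - \bigl(1-\rho(c_n)\bigr)\,\bigl(a - \tilde m_{un}\bigr) = 0
\;\Longrightarrow\;
a^\star_{un} = \rho(c_n)\, r^\star_{un} + \bigl(1-\rho(c_n)\bigr)\,\tilde m_{un}.
```

Because the utility is strictly concave in a and the shock \zeta_{un} does not depend on a, the first-order condition characterizes the unique maximizer.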
Maximizing a user's utility (4) immediately implies that the optimal latent report is the convex combination

   a^\star_{un} = \rho(c_n)\, r^\star_{un} + (1-\rho(c_n))\, \tilde m_{un}(c_n).   (5)

Thus, reports place weight \rho(c_n) on the contributor's private signal and weight 1 - \rho(c_n) on the anticipated platform consensus. When \rho(c_n) = 1, the contributor reports their private signal exactly; we refer to this case as truthful reporting. Because \rho(\cdot) is weakly decreasing in controversy, more controversial notes lead contributors to place relatively less weight on their own signals and more weight on the anticipated platform outcome.

Results

In this section, we present our main theoretical results. Proofs and technical regularity conditions are given in Appendix E. Throughout, we work in the setting where the numbers of users and notes grow at the same asymptotic rate. In reality, the platform observes only a subset of user–note interactions, which we model as missing-completely-at-random sampling: each rating a_un is observed independently with probability p \in (0, 1]. We also assume that the true {(h_u, f_u)}_u and {(i_n, g_n)}_n are bounded and i.i.d., with finite variance, and that all intercept and factor terms are mutually independent. In addition, we assume that h_u, i_n, g_n are mean zero, but that E[f_u] = \mu_f for some known positive constant \mu_f; this allows us to identify a majority and a minority group of users. The boundedness assumption is typically satisfied in real-life data, since user and item attributes are generally finite or are normalized by construction. Finally, we assume mean-zero sub-Gaussian noise terms \epsilon_{un} and \epsilon^m_{un} independent of the latent variables. Formal regularity conditions are given in Appendix E, Assumption 1.
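Before turning to the results, the reporting rule (5) can be illustrated with a small simulation. The specific functional form of \rho and the consensus forecast below are our own illustrative choices (the model only requires \rho to be weakly decreasing in controversy):

```python
import numpy as np

rng = np.random.default_rng(1)

def report(r_star, m_tilde, rho):
    """Optimal latent report from (5): a convex combination of the
    private signal and the anticipated platform consensus."""
    return rho * r_star + (1 - rho) * m_tilde

# Illustrative conformity weight: rho(c) = 1 on uncontested notes,
# lower on controversial ones.
rho = lambda c: 1.0 - 0.6 * c

n_users = 10_000
r_star = rng.normal(0.0, 1.0, n_users)      # dispersed private signals
m_tilde = rng.normal(0.8, 0.1, n_users)     # noisy forecasts of consensus 0.8

for c in (0.0, 0.5, 1.0):
    a = report(r_star, m_tilde, rho(c))
    print(f"controversy {c}: report std {a.std():.3f}, "
          f"mean gap to consensus {np.abs(a - 0.8).mean():.3f}")
```

As controversy rises, the dispersion of reports collapses toward the anticipated consensus, which is exactly the loss of independent information that distorts the downstream factorization.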
Private-Signal Reporting (\rho = 1)

We begin with the benchmark case \rho(\cdot) \equiv 1, so that contributors report their private signals and a^\star_{un} = r^\star_{un}. In this setting the platform observes noisy realizations of the latent-signal model itself, and matrix factorization recovers note helpfulness. This result extends the analogous result in the interactive fixed-effects framework of [8] to our setting with missing data, using techniques from the matrix completion and interactive fixed-effects literature [13, 36, 51].

Theorem 0.1. Assume U, N \to \infty and that E[f_u] = \mu_f, where \mu_f is known. Then the estimate for note helpfulness is consistent; in particular, \hat i_n \to_p i^0_n. That is, in the truthful regime, rank-1 MF recovers the true note helpfulness.

13 In X's model, reports are discretized to {0, 0.5, 1}, so we interpret user actions a as a latent index; the observed rating maps negative values of a to 0, positive values to 1, and 0 to 0.5. Our MF is fit to the observed ratings, while the analysis proceeds on the latent index.

Strategic Conformity (\rho < 1)

Next, we turn to the case \rho(\cdot) \not\equiv 1, in which contributors place positive weight on the anticipated platform consensus. In this regime, the following theorem tells us that the estimate of note helpfulness \hat i_n will be biased; in particular, it will not converge to the model-implied helpfulness i_n. For the next theorem, we assume that the anticipated platform consensus m_n varies across notes and has finite variance; formal regularity conditions are given in Appendix E, Assumption 2.

Theorem 0.2. Suppose \rho(\cdot) \not\equiv 1. As U, N \to \infty, there exists a random variable i^\infty_n such that \hat i_n \to_p i^\infty_n, and for at least one n, i^\infty_n \neq i^0_n.

The source of this bias is the conformity incentive. When \rho(c_n) < 1, the platform no longer observes contributors' latent signals directly.
Instead, it observes reports that combine the private signal r^\star_{un} with the anticipated consensus target m_n. Thus, matrix factorization is applied to conformity-distorted signals rather than to the latent signal matrix itself. To see where the distortion enters, write

   \delta_n := m_n - (\mu + i_n).

The quantity \delta_n measures how far the anticipated platform consensus deviates from the note-side latent component of the signal. Under users' strategic reporting, the observed report matrix differs from the latent signal matrix by a note-side perturbation proportional to (1 - \rho(c_n)) \delta_n. The platform nevertheless fits the same matrix factorization model. Consequently, the recovered parameters correspond to the best rank-1 approximation of this distorted matrix. In particular, the bias is governed by the projection of the conformity term (1 - \rho(c_n)) \delta_n onto the note-factor direction g_n. Whenever this projection is nonzero on a nontrivial set of notes, the resulting factorization converges to a parameter i^*_n that differs from the true note helpfulness i^0_n.

Furthermore, we can characterize the extent to which user factors shift in this regime due to strategic behavior. Note that the latent factors f_u and g_n are unique only up to a global scaling factor: multiplying all note factors by c and all user factors by c^{-1} still yields a valid solution of the matrix factorization optimization problem (2). Thus, we use the following normalization for identification. Define the (estimated) residualized ratings to be

   y_{un} := a^\star_{un} - \hat\mu - \hat h_u - \hat i_n,

where \hat\mu, \hat h_u, \hat i_n are given by solving the least squares problem (2). The latent-factor normal equations are

   \hat f_u = \frac{\sum_n \omega_{un}\, y_{un}\, g_n}{\sum_n \omega_{un}\, g_n^2}, \qquad \hat g_n = \frac{\sum_u \omega_{un}\, y_{un}\, f_u}{\sum_u \omega_{un}\, f_u^2}.   (6)

The identifiability conditions on the matrix factorization estimates allow us to determine the sign and scale of the factor estimates.
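The normal equations (6) are just weighted projections onto the opposing factor and can be evaluated directly. A minimal sketch, with hypothetical residualized ratings y and a binary observation pattern standing in for the weights \omega (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical residualized ratings y_un = f_u * g_n + noise, observed on a
# missing-completely-at-random pattern omega.
U, N = 30, 40
f_true = rng.normal(0.5, 1.0, U)
g_true = rng.normal(0.0, 1.0, N)
omega = (rng.random((U, N)) < 0.7).astype(float)   # MCAR observation weights
y = omega * (np.outer(f_true, g_true) + 0.1 * rng.normal(size=(U, N)))

# Normal equations (6), treating the opposing factor as known:
# f_hat_u = sum_n omega*y*g / sum_n omega*g^2, and symmetrically for g.
f_hat = (omega * y * g_true[None, :]).sum(1) / (omega * g_true[None, :]**2).sum(1)
g_hat = (omega * y * f_true[:, None]).sum(0) / (omega * f_true[:, None]**2).sum(0)

print("max |f_hat - f_true| =", np.abs(f_hat - f_true).max())
```

With truthful reports the projections recover the factors up to noise; the theorems that follow quantify how strategic reports bias these same projections.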
Our next result states how the estimate of the user factor behaves, in expectation, as a function of \rho(c_n), m(c_n), and g_n.

Theorem 0.3. Suppose \rho(\cdot) \not\equiv 1. Consider the setting of Theorem E.8, and suppose that the true note factors {g_n}_n are known. Let \hat\mu, \hat h_u, \hat i_n, \hat f_u denote the solution to (2). Then

   E[\hat f_u \mid f_u, g_n, c_n] = w_1 f_u + c\,(1 - w_1) + o_p(1), \qquad \text{where } w_1 = \frac{\sum_n \rho_n g_n^2}{\sum_n g_n^2}.

Theorem 0.3 tells us that E[\hat f_u \mid f_u] is an affine transformation of the user's truth f_u. As the probability of encountering controversial notes increases (more notes with c_n = 1), w_1 decreases. Thus, the estimated user factor places less weight on the individual's true position and more weight on the aggregate conformity distortion. In particular, the bias is driven by the dependence of ratings on anticipated consensus, rather than by estimation error in the note factors. In practice, the g_n are themselves estimated, so additional distortion may arise from estimation error, potentially amplifying the effect described above.

Theorem 0.3 also implies the following proposition, telling us which members of the minority are most susceptible to measurement errors in their latent-factor estimates. Let the true minority be {u : f_u < 0} with share \pi^-_{true} := Pr(f_u < 0) \in (0, 1/2), since we assume that f_u has positive expected value. Because the latent-factor sign is globally arbitrary, the theoretical section uses a normalization opposite to the empirical sections. Empirically, we flip signs so that the majority is negative and the minority positive; all sign-based claims are invariant under the global transformation (f, g) \to (-f, -g).

Proposition 0.4. Consider the setting of Theorem 0.3. Then E[\hat f_u \mid f_u, g_n, c_n] > 0 if and only if

   f_u > -\frac{c\,(1 - w_1)}{w_1}.
In particular, the minority slice (-c(1 - w_1)/w_1, 0) is mapped to positive estimates.

Proposition 0.5. Consider the setting of Theorem 0.3. Let F be the CDF of f_u, and assume that F is continuous with no atom at 0. Then the estimated minority share \pi^-_{est} := Pr(\hat f_u < 0 \mid f_u, g_n, c_n) satisfies

   \pi^-_{est} = F\!\left(-\frac{c\,(1 - w_1)}{w_1}\right) < F(0) + o(1) = \pi^-_{true} + o(1),

and \pi^-_{est} is (weakly) increasing in w_1 and (weakly) decreasing in c.

Remark 0.6. One can carry out a similar computation for the note factors E[\hat g_n \mid f_u, g_n, c_n] to get

   E[\hat g_n \mid f_u, g_n, c_n] \approx g_n\, \rho_n \left(1 - c\, \frac{\sum_u f_u}{\sum_u f_u^2}\right).

If user opinions are balanced (E[f_u] = 0), then the expected estimate E[\hat g_n \mid f_u, g_n, c_n] is proportional to g_n with a coefficient that depends on \rho_n(c_n). In particular, higher values of c_n (controversial notes) reduce this coefficient, shrinking the note factor toward zero.

Statistical guarantee for the two-stage estimator

We now turn to the alternative auditing rule studied in the empirical section. Recall the two-stage algorithm: the platform first fits the standard unweighted regularized matrix factorization model, and then uses the resulting residuals to estimate contributor-specific noise levels. It then refits the same model using inverse residual-variance weights. This is the matrix-factorization analogue of feasible generalized least squares and is closely related to weighted low-rank approximation [4, 50, 56].

In the first stage, the platform fits the unweighted regularized model in (3) and obtains estimates (\hat\mu, \hat h, \hat i, \hat f, \hat g). For each contributor u, let S_u := {n : (u, n) \in \Omega} and N_u := |S_u|, and define the first-stage residual variance estimate

   \hat\sigma^2_u := \frac{1}{N_u} \sum_{n \in S_u} \bigl( a_{un} - \hat\mu - \hat h_u - \hat i_n - \hat f_u \hat g_n \bigr)^2.   (7)

In the second stage, the platform sets \hat w_u := 1/\hat\sigma^2_u and refits the same regularized rank-1 model using contributor-specific weights.
More generally, for any bounded positive weights w = {w_u}, define the weighted regularized matrix factorization problem

   \arg\min_{\tilde\mu, \tilde h_u, \tilde i_n, \tilde f_u, \tilde g_n} \sum_{(u,n)\ \text{observed}} w_u \bigl( r_{un} - \tilde\mu - \tilde h_u - \tilde i_n - \tilde f_u \cdot \tilde g_n \bigr)^2.   (8)

We write \tilde i^{\,ts}_n for the note-helpfulness estimate produced by (8) with weights \hat w_u = 1/\hat\sigma^2_u. Our next theorem shows that, among estimators obtained in this way, the estimator with weights w_u = 1/\hat\sigma^2_u is consistent and has the lowest asymptotic variance.

Theorem 0.7. Assume that \rho \equiv 1 and that \mu_f is known. Then the solution to (8) with weights 1/\hat\sigma^2_u, denoted \tilde i^{\,ts}_n, recovers consistent estimates of i_n; i.e., as U, N \to \infty,

   \tilde i^{\,ts}_n \to_p i^0_n.

Moreover, among all solutions \tilde i_n of (8) with positive, finite weights w_u \in (0, \infty),

   aVar(\tilde i^{\,ts}_n) \le aVar(\tilde i_n).

Here, for a scalar estimator X_m, aVar(X_m) = \lim_{m \to \infty} m \cdot Var(X_m).

This theorem gives a statistical interpretation of contributor impact under the redesigned rule. In the two-stage estimator, contributors are weighted by the inverse of their residual variance, so contributors whose evaluations are more stable relative to the fitted latent structure receive greater influence in the second stage. In this sense, the redesign audits contributors by residual stability rather than by agreement with the platform's eventual consensus. This is the key contrast with consensus-based auditing. Under the current rule implemented in Community Notes, contributor influence is tied to whether ratings align with the platform's final aggregate outcome. Under the two-stage rule, influence is instead tied to the statistical precision of a contributor's evaluations within the latent-factor model.
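The efficiency claim in Theorem 0.7 is the familiar GLS phenomenon, and it can be sanity-checked in a stylized analogue of the full factor model: several raters with heterogeneous noise levels estimating a common quantity. All values below are illustrative assumptions, and the weights use the true variances rather than first-stage estimates:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative setting: U raters report on a common quantity i0; rater u's
# noise has variance sigma_u^2. Compare the unweighted mean with the
# inverse-variance (GLS-style) weighted mean across repeated draws.
U, reps, i0 = 50, 200, 0.3
sigma = rng.uniform(0.1, 2.0, U)

err_unw, err_wls = [], []
for _ in range(reps):
    x_u = i0 + sigma * rng.normal(size=U)   # one report per rater
    w = 1.0 / sigma**2                      # oracle inverse-variance weights
    err_unw.append(x_u.mean() - i0)
    err_wls.append(np.average(x_u, weights=w) - i0)

var_unw = np.var(err_unw)
var_wls = np.var(err_wls)
print(f"variance of unweighted mean: {var_unw:.5f}")
print(f"variance of weighted mean:   {var_wls:.5f}")
```

The weighted estimator's error variance is markedly smaller because low-noise raters carry more weight, which is the same mechanism that drives the asymptotic-variance ordering in the theorem.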
The theorem shows that, under private-signal reporting, this weighting rule yields a consistent estimator and attains the lowest asymptotic variance among weighted matrix factorization estimators in this class. Whether the same rule also changes contributors' strategic incentives is a separate behavioral question. The result here suggests an alternative notion of contributor rating impact based on residual stability rather than agreement with the platform's eventual consensus.

Conclusion

Crowdsourced moderation systems are often motivated by the idea that diverse, independent evaluations can be aggregated into reliable judgments. Our findings show that this promise depends not only on how ratings are aggregated, but also on how contributors are audited. In X's Community Notes, auditing contributors by whether they agree with the platform's eventual consensus creates incentives to anticipate that consensus rather than to provide independent evaluations. Empirically, this is associated with strategic conformity by minority contributors, reduced engagement on controversial content, and lower predictive performance of the platform's latent-factor model.

Our theoretical analysis clarifies why this occurs. Even if one grants the platform's latent-signal model, consensus-based auditing alters the object being measured: once contributors partially conform to the anticipated platform outcome, matrix factorization no longer aggregates independent evaluations alone. Instead, it recovers a conformity-distorted projection of those evaluations.

These observations also suggest a different design principle. Rather than rewarding contributors for matching the eventual majority outcome, platforms can evaluate them using targets that do not mechanically favor conformity. Motivated by this idea, we study a two-stage procedure that weights contributors by the stability of their residual behavior rather than by agreement with the final consensus.
In the Community Notes data, this approach improves out-of-sample predictive performance while allowing informative disagreement to retain influence. More broadly, our results suggest that crowdsourced moderation should be designed to preserve independence, especially on controversial content, where finding misinformation is most valuable. Systems that reward agreement with the final aggregate may appear to improve reliability, but can instead suppress the disagreement needed for accurate aggregation. In environments without externally verifiable ground truth, the design of the auditing rule is therefore not a peripheral implementation detail; it is part of the core of the system.

Acknowledgments

KH was partially supported by the National Science Foundation under grant DGE 2146752.

References

[1] Daron Acemoglu, Munther A. Dahleh, Ilan Lobel, and Asuman Ozdaglar. Bayesian learning in social networks. The Review of Economic Studies, 78(4):1201–1236, 2011.
[2] Daron Acemoglu, Ali Makhdoumi, Azarakhsh Malekian, and Asu Ozdaglar. Fast and slow learning from reviews. Econometrica, 90(2):775–810, 2022.
[3] Daron Acemoglu, Asuman Ozdaglar, and James Siderius. A model of online misinformation. Review of Economic Studies, 91(6):3117–3150, 2024.
[4] A. C. Aitken. IV.—On least squares and linear combination of observations. Proceedings of the Royal Society of Edinburgh, 55:42–48, 1936.
[5] Jennifer Allen, Cameron Martel, and David G Rand. Birds of a feather don't fact-check each other: Partisanship and the evaluation of news in Twitter's Birdwatch crowdsourced fact-checking program. In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, CHI '22, New York, NY, USA, 2022. Association for Computing Machinery.
[6] Jennifer Allen, Duncan J Watts, and David G Rand. Quantifying the impact of misinformation and vaccine-skeptical content on Facebook.
Science, 384(6699):eadk3451, 2024.
[7] Xavier Amatriain, Josep Pujol, and Nuria Oliver. I like it... I like it not: Evaluating user ratings noise in recommender systems, 06 2009.
[8] Jushan Bai. Panel data models with interactive fixed effects. Econometrica, 77(4):1229–1279, 2009.
[9] Abhijit V. Banerjee. A simple model of herd behavior. The Quarterly Journal of Economics, 107(3):797–817, August 1992.
[10] Md Momen Bhuiyan, Amy X. Zhang, Connie Moon Sehat, and Tanushree Mitra. Investigating differences in crowdsourced news credibility assessment: Raters, tasks, and expert criteria. Proc. ACM Hum.-Comput. Interact., 4(CSCW2), October 2020.
[11] Nadia M Brashier, Gordon Pennycook, Adam J Berinsky, and David G Rand. Timing matters when correcting fake news. Proceedings of the National Academy of Sciences, 118(5):e2020043118, 2021.
[12] Raymond J. Carroll and David Ruppert. Robust estimation in heteroscedastic linear models. The Annals of Statistics, 10(2):429–441, 1982.
[13] Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma, and Yuling Yan. Noisy matrix completion: Understanding statistical guarantees for convex relaxation via nonconvex optimization. SIAM Journal on Optimization, 30(4):3098–3121, 2020.
[14] Community Notes Guide – X. Locking and unlocking the ability to write notes. https://communitynotes.x.com/guide/en/contributing/writing-ability, n.d. Accessed: 2025-08-30.
[15] Community Notes Guide – X. Rating and writing impact. https://communitynotes.x.com/guide/en/contributing/writing-and-rating-impact, n.d. Accessed: 2025-08-30.
[16] Weijia Dai, Ginger Jin, Jungmin Lee, and Michael Luca. Aggregation of consumer ratings: an application to yelp.com. Quantitative Marketing and Economics, 16(3):289–339, 2018.
[17] Alexander Philip Dawid and Allan M Skene. Maximum likelihood estimation of observer error-rates using the EM algorithm.
Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1):20–28, 1979.
[18] Chiara Patricia Drolsbach and Nicolas Pröllochs. Diffusion of community fact-checked misinformation on Twitter. Proceedings of the ACM on Human-Computer Interaction, 7(CSCW2):1–22, 2023.
[19] Chiara Patricia Drolsbach, Kirill Solovev, and Nicolas Pröllochs. Community notes increase trust in fact-checking on social media. PNAS Nexus, 3(7):pgae217, 2024.
[20] Erik Eyster and Matthew Rabin. Naive herding in rich-information settings. American Economic Journal: Microeconomics, 2(4):221–243, 2010.
[21] Boi Faltings and Goran Radanovic. Game Theory for Data Science: Eliciting Truthful Information. Springer Nature, 2022.
[22] Vivek Farias, Andrew A Li, and Tianyi Peng. Uncertainty quantification for low-rank matrix completion with heterogeneous and sub-exponential noise. In International Conference on Artificial Intelligence and Statistics, pages 1179–1189. PMLR, 2022.
[23] Francis Galton. Vox populi, 1907.
[24] Yang Gao, Maggie Mengqing Zhang, and Huaxia Rui. Can crowdchecking curb misinformation? Evidence from community notes. Information Systems Research, 2025.
[25] Tilmann Gneiting and Adrian E Raftery. Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association, 102(477):359–378, 2007.
[26] Eric Horvitz. Incentives and truthful reporting in consensus-centric crowdsourcing. Technical report, Microsoft Research, 2012.
[27] Matthew Jackson and Stephen Nei. Finding the wise and the wisdom in a crowd: Estimating underlying qualities of reviewers and items. American Economic Review, 111(3):1001–1024, 2021.
[28] Uku Kangur, Roshni Chakraborty, and Rajesh Sharma. Who checks the checkers? Exploring source credibility in Twitter's community notes. arXiv preprint arXiv:2406.12444, 2024.
[29] David Karger, Sewoong Oh, and Devavrat Shah.
Iterative learning for reliable crowdsourcing systems. Advances in Neural Information Processing Systems, 24, 2011.
[30] Hisashi Kashima, Satoshi Oyama, Hiromi Arai, and Junichiro Mori. Trustworthy human computation: a survey. Artificial Intelligence Review, 57(12):322, 2024.
[31] Yuqing Kong, Katrina Ligett, and Grant Schoenebeck. Putting peer prediction under the micro(economic)scope and making truth-telling focal. In International Conference on Web and Internet Economics, pages 251–264. Springer, 2016.
[32] Yang Liu and Yiling Chen. Machine-learning aided peer prediction. In Proceedings of the 2017 ACM Conference on Economics and Computation, EC '17, pages 63–80, New York, NY, USA, 2017. Association for Computing Machinery.
[33] Jan Lorenz, Heiko Rauhut, Frank Schweitzer, and Dirk Helbing. How social influence can undermine the wisdom of crowd effect. Proceedings of the National Academy of Sciences, 108(22):9020–9025, 2011.
[34] Meta Platforms, Inc. Introducing community notes — adding context to posts. https://www.meta.com/technologies/community-notes/?srsltid=AfmBOoqGYuB01StOhwvVzji0toKNwMWsuS3OurkU7X3c5L2AvsifdBYC, 2025. Accessed: 2025-11-17.
[35] Nolan Miller, Paul Resnick, and Richard Zeckhauser. Eliciting informative feedback: The peer-prediction method. Management Science, 51(9):1359–1373, 2005.
[36] Hyungsik Roger Moon and Martin Weidner. Linear regression for panel with unknown number of factors as interactive fixed effects. Econometrica, 83(4):1543–1579, 2015.
[37] Lev Muchnik, Sinan Aral, and Sean J Taylor. Social influence bias: A randomized experiment. Science, 341(6146):647–651, 2013.
[38] Elisabeth Noelle-Neumann. The Spiral of Silence: A Theory of Public Opinion. University of Chicago Press, 1974.
[39] Juan Perdomo, Tijana Zrnic, Celestine Mendler-Dünner, and Moritz Hardt. Performative prediction. In International Conference on Machine Learning, pages 7599–7609. PMLR, 2020.
[40] Sarah Perez.
Twitter expands its crowdsourced fact-checking program Birdwatch ahead of US midterms, September 2022. Accessed: 2025-09-16.
[41] Sarah Perez. Twitter is making its crowdsourced fact-checks visible to all U.S. users with Birdwatch expansion, October 2022. Accessed: 2025-09-16.
[42] Sarah Perez. Bluesky adds ‘anti-toxicity’ tools and aims to integrate ‘a community notes-like’ feature in the future. TechCrunch, 2024. Accessed: 2025-11-17.
[43] Drazen Prelec. A Bayesian truth serum for subjective data. Science, 306(5695):462–466, 2004.
[44] Vikas C. Raykar, Shipeng Yu, Linda H. Zhao, Gerardo Hermosillo Valadez, Charles Florin, Luca Bogoni, and Linda Moy. Learning from crowds. Journal of Machine Learning Research, 11(4), 2010.
[45] Thomas Renault, Mohsen Mosleh, and David G. Rand. Republicans are flagged more often than Democrats for sharing misinformation on X's community notes. Proceedings of the National Academy of Sciences, 122(25):e2502053122, 2025.
[46] Paul Resnick and Rahul Sami. The influence limiter: Provably manipulation-resistant recommender systems. In Proceedings of the 2007 ACM Conference on Recommender Systems, RecSys '07, pages 25–32, New York, NY, USA, 2007. Association for Computing Machinery.
[47] Victor Shnayder, Arpit Agarwal, Rafael Frongillo, and David C. Parkes. Informed truthfulness in multi-task peer prediction. In Proceedings of the 2016 ACM Conference on Economics and Computation, pages 179–196, 2016.
[48] Isaac Slaughter, Axel Peytavin, Johan Ugander, and Martin Saveski. Community notes reduce engagement with and diffusion of false information online. Proceedings of the National Academy of Sciences, 122(38):e2503413122, 2025.
[49] Lones Smith and Peter Sørensen. Pathological outcomes of observational learning. Econometrica, 68(2):371–398, 2000.
[50] Nathan Srebro and Tommi Jaakkola. Weighted low-rank approximations.
In Proceedings of the 20th International Conference on Machine Learning (ICML-03), pages 720–727, 2003.
[51] Liangjun Su, Fa Wang, and Yiren Wang. Estimation and inference for unbalanced panel data models with interactive fixed effects. Journal of Econometrics, 255:106222, 2026.
[52] James Surowiecki. The Wisdom of Crowds. Vintage, 2005.
[53] Jacob Thebault-Spieker, Sukrit Venkatagiri, Naomi Mine, and Kurt Luther. Diverse perspectives can mitigate political bias in crowdsourced content moderation. In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, pages 1280–1291, 2023.
[54] TikTok Pte. Ltd. Rolling out TikTok footnotes in the U.S. https://newsroom.tiktok.com/rolling-out-tiktok-footnotes-in-the-us?lang=en, 2025. Accessed: 2025-11-17.
[55] Twitter, Inc. Community notes: Documentation and source code powering community notes. https://github.com/twitter/communitynotes, 2022.
[56] Madeleine Udell, Corinne Horn, Reza Zadeh, and Stephen Boyd. Generalized low rank models. Foundations and Trends in Machine Learning, 9(1):1–118, 2016.
[57] Sander Van Der Linden. Misinformation: Susceptibility, spread, and interventions to immunize the public. Nature Medicine, 28(3):460–467, 2022.
[58] Benjamin Van Roy and Xiang Yan. Manipulation robustness of collaborative filtering. Management Science, 56(11):1911–1929, 2010.
[59] Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices, 2012.
[60] Michela Del Vicario, Alessandro Bessi, Fabiana Zollo, Fabio Petroni, Antonio Scala, Guido Caldarelli, H. Eugene Stanley, and Walter Quattrociocchi. The spreading of misinformation online. Proceedings of the National Academy of Sciences, 113(3):554–559, 2016.
[61] Soroush Vosoughi, Deb Roy, and Sinan Aral. The spread of true and false news online. Science, 359(6380):1146–1151, 2018.
[62] Bo Waggoner and Yiling Chen. Output agreement mechanisms and common knowledge.
In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing, volume 2, pages 220–226, 2014.
[63] Jevin D. West and Carl T. Bergstrom. Misinformation in and about science. Proceedings of the National Academy of Sciences, 118(15):e1912444117, 2021.
[64] Jens Witkowski and David C. Parkes. Peer prediction without a common prior. In Proceedings of the 13th ACM Conference on Electronic Commerce, pages 964–981, 2012.
[65] X Community Notes Guide. Ranking notes. https://communitynotes.x.com/guide/en/under-the-hood/ranking-notes, n.d. Accessed: 2025-08-30.
[66] X Corp. About community notes on X. https://help.x.com/en/using-x/community-notes, 2025. Accessed: 2025-11-17.
[67] X Corp. / Community Notes Guide. Note ranking algorithm. https://communitynotes.x.com/guide/en/under-the-hood/ranking-notes, n.d. Accessed: 2025-08-05.
[68] X (formerly Twitter) Community Notes Guide. Downloading data. https://communitynotes.x.com/guide/en/under-the-hood/download-data, n.d. Accessed: 2025-08-30.
[69] Dora Zhao, Diyi Yang, and Michael S. Bernstein. Mapping the spiral of silence: Surveying unspoken opinions in online communities. arXiv preprint, 2025.

A Guide to the Appendix

This Appendix has two goals. First, it provides the empirical implementation details and robustness analyses underlying the main-text results. Second, it contains the full technical appendix for the theoretical results. The organization of the Appendix mirrors the logic of the paper: we first document the empirical reconstruction and additional analyses, and then turn to the stylized model and proofs.

The Appendix is organized as follows. Appendix B describes the data sources, preprocessing steps, and reconstruction of weekly latent factors from the public Community Notes data and open-source code.
Appendix C presents additional empirical analyses in the order of the main text, including the shift in minority-aligned contributors after the introduction of Rating Impact, the change in participation on controversial content, and additional predictive-performance analyses. Appendix D describes the implementation of the two-stage weighted matrix factorization procedure and additional empirical details for that estimator. Appendix E contains the full theory appendix: it states the stylized estimation model, lists the regularity conditions used in the proofs, and provides proofs of all theorems and propositions in the main text.

B Data, reconstruction, and empirical methodology

B.1 Data sources

In this paper, we use the open-source code and data from X Community Notes [68]. To study the effect of the Rating Impact rollout, we focus on the time frame between June 1, 2022 and May 31, 2023. To evaluate the predictive performance of our two-stage matrix factorization (MF) method, we use ratings data from Jan. 1, 2023 to June 1, 2024. The primary dataframes used in our analysis and their relevant columns are listed in Table 3. More detailed information about all available data can be found in the Community Notes documentation [68].

Dataframe | Relevant Columns | Description
ratings_df | noteId, raterParticipantId, createdAtMillis, helpfulnessLevel | Record of each (user, note) rating pair and its timestamp
history_df | noteId, createdAtMillis, currentStatus | Record of each note and its most recent status (i.e., whether it has reached Helpful or Not Helpful status)
note_df | noteId, raterParticipantId, createdAtMillis, tweetId, summary | Metadata about each note
note_factor_df | noteId, week_dt, noteIntercept, noteFactor1 | Weekly computations of the note intercept and note factor (one for the 2022 version, one for the 2025 version of the code)
rater_factor_df | raterParticipantId, week_dt, raterIntercept, raterFactor1 | Weekly computations of the rater intercept and rater factor (one for the 2022 version, one for the 2025 version of the code)

Table 3: Dataframes used in our analysis.

B.2 Reconstructing Weekly Latent Factors

The public release does not include the latent parameters used internally by the platform's aggregation system. As a result, all note and rater intercepts and factors used in our analysis are reconstructed from the ratings history rather than observed directly. To recover weekly estimates of the rater and note intercepts and factors, we run X's matrix factorization algorithm. For each week w from June 1, 2022 to May 31, 2023, we run the matrix factorization algorithm on all ratings up to and including ratings from week w. Since matrix factorization can only recover the rater and note factors up to a global scaling and sign, the algorithm checks the sign distribution of the factors and ensures that the majority always has a negative sign, so that the meaning of the sign stays consistent across weeks. We run both the version from Dec. 2022 and the version from May 2025 [55]. Both implementations largely solve the biased matrix factorization problem presented in the main text using stochastic gradient descent with L2 regularization and factor normalization to ensure consistent interpretation of factors across weeks. The 2022 version is a straightforward implementation of the least-squares MF optimization problem with single-round optimization and basic convergence criteria.
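As a rough illustration, the weekly reconstruction loop and the sign-normalization convention can be sketched as follows. This is a minimal sketch, not the platform's code: `run_mf` is a hypothetical stand-in for X's factorization routine, and the `createdAt` column name is assumed for exposition.

```python
import pandas as pd

def normalize_factor_signs(rater_factors: pd.Series, note_factors: pd.Series):
    """Flip the global sign so that the majority of rater factors is negative.

    Matrix factorization recovers factors only up to a global sign, so we fix
    the convention described in the text: the majority mode gets a negative
    sign, keeping the meaning of the sign consistent across weekly runs.
    """
    if (rater_factors > 0).mean() > 0.5:  # majority currently positive -> flip
        return -rater_factors, -note_factors
    return rater_factors, note_factors

def weekly_snapshots(ratings_df: pd.DataFrame, week_starts, run_mf):
    """Re-run MF each week on all ratings up to and including that week.

    `run_mf` stands in for the matrix factorization routine; it is assumed to
    return (rater_factor_series, note_factor_series) for the ratings it sees.
    """
    snapshots = {}
    for w in week_starts:
        cutoff = w + pd.Timedelta(days=7)
        visible = ratings_df[ratings_df["createdAt"] < cutoff]
        snapshots[w] = normalize_factor_signs(*run_mf(visible))
    return snapshots
```

The key design point is that each weekly snapshot is normalized independently, so cross-week comparisons of factor distributions are not confounded by arbitrary sign flips between runs.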
The 2025 version of the code includes many enhancements (multi-round reputation filtering, harassment detection, uncertainty quantification, and abuse mitigation); however, our implementation uses only the stable-initialization improvement from the modern codebase. Specifically, we run the run_single_round_mf function, which implements stable initialization using a designated modeling group to prevent factor sign drift across time.

B.3 Policy Timing and User Cohorts

We use Oct. 1, 2022 as the analysis cutoff date. This is a conservative operational cutoff following the period when the Rating Impact eligibility rules (announced in September 2022) begin to take effect in the observed data. Dates before Oct. 1 are referred to as pre-rollout, and dates on or after Oct. 1 as post-rollout. Several analyses distinguish behavioral adaptation from compositional change due to the entry of new raters. We define early users to be raters who were active on Community Notes before Oct. 1, 2022, and new users to be raters who first became active between Oct. 1, 2022 and Jan. 1, 2023.

Finally, note and rater factors should be interpreted as relative positions within a weekly latent scale estimated from all observed user-note interactions on the platform. In particular, note factors are equilibrium objects: they reflect not only how notes are evaluated, but also which notes are written and which notes receive enough ratings to be assigned a factor. For this reason, our empirical comparisons focus on within-pipeline temporal changes, cohort differences, and discontinuities at the rollout boundary, rather than on absolute comparisons across different estimation procedures.

C Additional Empirical Results

C.1 Robustness Checks for Minority Behavior Shift

In this section, we provide several sensitivity tests for the evidence on changes in minority behavior.
Recall that the platform normalizes users with a negative latent factor to be the majority. The main empirical finding was that, following the introduction of Rating Impact, minority-aligned contributors moved closer to the majority in the platform's latent-factor space, and the predictive role of user-note alignment for Helpful ratings weakened. Here we show that this pattern is robust across alternative visualizations of the factor distributions, a permutation-based comparison of factor shifts for early and new users, a regression discontinuity design for distributional shape, and additional predictive specifications based on the user-note dot product.

C.1.1 Latent Factor Distribution Shift

Recall from the main text that we define early users to be the cohort of users who were active on Community Notes before the rollout date of Oct. 1, 2022, and new users to be the cohort of users who became active between Oct. 1, 2022 and Jan. 1, 2023. In Figures 6, 8, and 9, we give additional visualizations of the distribution shift for early users compared with new users between Oct. 2022 and Jan. 2023. All figures give observational evidence that users who were affected by the Rating Impact policy change their behavior, with their factors aligning more with the majority over time.

C.1.2 RDD Tests for Bimodality

As an additional robustness test of the latent factor distribution shift among note and rater factors, we compute the bimodality coefficient of the distributions over time and run a regression discontinuity design. For each week t, we compute the bimodality coefficient (BC) of the empirical distribution of latent factors, defined as

BC_t = (skewness_t^2 + 1) / kurtosis_t,    (9)

where the skewness and kurtosis are computed from the estimated factors in week t.
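The bimodality coefficient in Eq. (9) is straightforward to compute from a weekly factor snapshot; a minimal sketch using SciPy is below. One subtlety the sketch makes explicit: Eq. (9) uses Pearson (non-excess) kurtosis, so a Gaussian has kurtosis 3 and BC near 1/3, whereas SciPy's default is excess kurtosis and must be overridden with `fisher=False`.

```python
import numpy as np
from scipy.stats import skew, kurtosis

def bimodality_coefficient(x) -> float:
    """BC = (skewness^2 + 1) / kurtosis, per Eq. (9).

    fisher=False requests Pearson kurtosis (Gaussian -> 3), so a unimodal
    symmetric sample yields BC near 1/3 and bimodal samples yield larger
    values, matching the reference values quoted in the text.
    """
    x = np.asarray(x, dtype=float)
    return (skew(x) ** 2 + 1.0) / kurtosis(x, fisher=False)
```

Applied to the weekly rater-factor and note-factor snapshots, this yields the two BC_t series that feed the regression discontinuity analysis.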
The bimodality coefficient is scale-invariant and increases with the prominence of multiple modes; for reference, unimodal symmetric distributions (e.g., Gaussian) have BC ≈ 1/3, while bimodal distributions yield larger values. We compute BC_t separately for (i) rater factors {f_u} and (ii) note factors {f_n} using weekly snapshots of the matrix factorization estimates reconstructed from the public data. The estimation pipeline, normalization, and regularization are held fixed across time, so temporal changes in BC_t reflect changes in the empirical distribution rather than rescaling artifacts.

Regression discontinuity design. We estimate a sharp RDD to test whether the October 1, 2022 intervention produced a discontinuous shift in the weekly bimodality coefficients. The running variable R_t is defined as the signed number of days between the Monday of week t and the cutoff date, so R_t = 0 corresponds to the first post-intervention week. We use the full available time series as the estimation bandwidth (9 weeks pre- and 9 weeks post-intervention). We estimate a local linear model on each side of the cutoff:

BC_t = β_0 + β_1 R_t + β_2 · 1[R_t ≥ 0] + β_3 · (R_t × 1[R_t ≥ 0]) + ε_t,    (10)

where β_2 identifies the discontinuous jump at the threshold and β_3 allows the post-intervention slope to differ from the pre-intervention slope. The model is estimated separately for the rater-level and note-level bimodality series via OLS with HC3 heteroskedasticity-robust standard errors.

Results and interpretation. For rater factors, we observe a statistically significant negative discontinuity in the bimodality coefficient at the cutoff (Figure 10), indicating an abrupt shift toward a more unimodal distribution following the introduction of Rating Impact.
This pattern is consistent with minority-aligned raters moving closer to the center of the latent spectrum or crossing alignment toward the majority group. For note factors, we also observe a significant decline in the bimodality coefficient at the cutoff (Figure 11), though the subsequent time trend differs from that of raters.

The regression discontinuity design is used to identify the local effect of the Rating Impact rollout on the distributional shape of latent factors, not to characterize longer-run dynamics. While the post-cutoff time path of note factors differs from that of rater factors, this divergence is expected and does not affect the interpretation of the discontinuity. Rater factors represent latent traits of a largely fixed population of users and therefore evolve primarily through behavioral adaptation. In contrast, note factors are equilibrium objects shaped by endogenous entry: which notes are written, and which notes receive sufficient evaluations to be assigned a factor, both depend on post-rollout incentives. These selection forces can alter higher-order moments of the note-factor distribution over time, even when the immediate response to the policy change is a reduction in bimodality. Importantly, our inference relies on the direction and significance of the discontinuity at the rollout boundary, which is common to both rater and note factors, and is consistent with strategic conformity reducing the salience of minority-aligned positions.

C.1.3 Additional Alignment Tests

The main text used Spearman's correlation between the rater-note dot product f_u g_n and Helpful ratings as a nonparametric measure of the predictive role of user-note alignment. Here we report two additional robustness checks on the same question.

Logistic Regression. For each rating between Aug. 1, 2022 and Jan. 1, 2023, we compute the dot product between the user factor and note factor at the time of rating.
For each period before and after the Oct. 1, 2022 rollout, we regress helpfulness ratings on the rater-note dot product using logistic regression, and take the difference in coefficients (post minus pre) as our test statistic measuring the change in predictiveness. The logistic regression coefficient declined from 15.696 to 4.427, a change of −11.269. We conduct a permutation test with 1,000 iterations, randomly reassigning ratings to the pre/post groups while preserving the original group sizes. The p-value of 0.001 indicates that the observed decline is statistically significant at the 0.05 level, consistent with the Spearman correlation results reported in the main text. The test statistic distribution is shown in Figure 13. Note that the logistic regression coefficient is sensitive to the scale of the dot product and is therefore less robust than the Spearman correlation as a test statistic; we include it here for completeness.

DiD for Note Helpfulness. Second, we estimate a difference-in-differences (DiD) model on the same dataset to test whether the rollout of Rating Impact affects the predictiveness of the rater-note factor for note helpfulness ratings. We regress note helpfulness as follows:

r_un = α + β (f_u g_n) + γ Post + δ (Post × f_u g_n) + ϵ_un,    (11)

where f_u, g_n are the rater and note factors, and Post is an indicator that is 1 after Oct. 1, 2022. The coefficient β captures the baseline predictiveness of the rater-note factor prior to the intervention, γ captures any level shift in helpfulness ratings post-rollout, and δ is the DiD estimator of interest. Standard errors are heteroskedasticity-robust (HC3). Results are shown below.
Taken together with the Spearman and rolling-correlation analyses in the main text, these additional specifications reinforce the same conclusion: after the introduction of Rating Impact, user-note alignment becomes less predictive of helpfulness ratings among contributors who were active through the policy change.

Parameter | Estimate | Std. Err. | z | p-value | Lower CI | Upper CI
Intercept | 0.4937 | 0.039 | 12.649 | <0.001 | 0.417 | 0.570
f_u g_n | 1.3821 | 0.068 | 20.323 | <0.001 | 1.249 | 1.515
Post | −0.0767 | 0.047 | −1.636 | 0.102 | −0.169 | 0.015
Post × f_u g_n | −0.5344 | 0.100 | −5.345 | <0.001 | −0.730 | −0.338

Table 4: DiD estimates for the predictiveness of the rater-note factor on note helpfulness ratings, corresponding to the specification in (11). The dependent variable r_un is the helpfulness rating, f_u g_n is the rater-note dot product, and Post is an indicator equal to 1 after the Rating Impact rollout (October 2022). The baseline coefficient on f_u g_n (β̂ = 1.382, p < 0.001) indicates a strong pre-intervention relationship between the rater-note factor and helpfulness. The DiD estimator Post × f_u g_n (δ̂ = −0.534, p < 0.001) indicates that this predictive relationship weakened significantly following the rollout. Standard errors are heteroskedasticity-robust (HC3).

C.2 Controversial Content and Participation

This section provides additional detail and sensitivity analyses for the controversial-content result in the main text. The main pattern is that, following the rollout of Rating Impact, notes on controversial content are less likely to attain Helpful status than notes on non-controversial content. We document this pattern using two complementary definitions of controversy. The first is topic-based and uses note summaries to classify notes into broad content areas before labeling those areas as controversial or non-controversial.
The second is factor-based and uses the magnitude of the estimated note factor as a model-based measure of polarization. We also examine whether the decline in controversial-note visibility is accompanied by changes in contributor-level engagement with controversial content.

C.2.1 Topic Assignment and Controversy Definitions

Next we describe how we assign topic labels to notes and how those labels are used to classify content as controversial or non-controversial. We first define the primary topic assignment procedure used throughout the main text, which closely follows X's public implementation with expanded coverage. We then present alternative large-language-model (LLM)-based topic assignments used as alternative classification procedures to assess sensitivity to the topic-classification mechanism.

Primary Topic Assignment (Bag-of-Words Classifier). Our primary topic assignment builds on the topic-modeling code used by the platform implementation, which combines seed-term matching with a supervised bag-of-words classifier. In the version of X's code used for this paper, each topic is initially defined by a small set of seed terms. Preliminary topic assignment is based on exact and fuzzy matches to these seed terms, after which a multi-class logistic regression classifier is trained to expand the set of in-topic notes. The native topic inventory in this version of the code is limited to Ukraine Conflict, Gaza Conflict, Messi-Ronaldo, and Scams. To support analyses requiring broader topical coverage, we expand this inventory by introducing additional candidate topics and associated seed terms. These additional topics and seed terms are used only to augment the training data for the bag-of-words classifier. The classifier architecture, feature representation, and regularization follow X's implementation, with one modification: we lower the minimum balanced-accuracy threshold for topic inclusion to 0.01 in order to retain topics with sparse coverage. Table 10 lists the resulting topic set and seed terms. All topic labels used in the main-text analyses are produced by this retrained bag-of-words classifier. Independently of the topic-assignment procedure, we label topics a priori as controversial or non-controversial based on domain knowledge and prior literature. Table 11 reports the full list of topics and their controversy classification.

LLM-Based Topic Assignment (Alternative Classification Procedures). To assess whether our results depend on the specific topic-assignment mechanism, we conduct robustness checks using two alternative LLM-based classification procedures. These procedures are not used in the main analyses and serve only to evaluate sensitivity to the choice of topic classifier. Both LLM-based procedures use an identical, fixed inventory of topic labels: Ukraine Conflict, Gaza Conflict, Messi Ronaldo, Sports NFL, Sports NBA, Movies TV, Education, Food Nutrition, Space Astronomy, Health, COVID-19, Climate Environment, Weather Disasters, Artificial Intelligence, Tech Companies, US Politics, Crime Legal, Economy Finance, Scams, Other. In both cases, the input text is the note-level summary field, and each note is assigned exactly one topic label. Notes with missing summaries are excluded.

Figure 6: The top panel shows the rater factor distribution shift for the early user cohort and the bottom panel shows the rater factor distribution shift for the new user cohort, using the 2022 version of the matrix factorization code. The minority mode decreases significantly for early users, while remaining more stable for new users. For early users, the proportion of positive factors decreases from 31.4% in the Transition period to 24.8% in the Stabilized period, a decrease of 6.6 percentage points.
For new users, the proportion of positive factors between the same two periods decreases from 50.3% to 49.5%, a decrease of only 0.8 percentage points.

Approach 1 (Prompted LLM Classification): Our first procedure uses a prompted instruction-tuned LLM accessed through Snowflake Cortex's text completion interface. For each note summary, we supplied an explicit natural-language prompt that enumerates the full label set and enforces a single-label classification objective with a structured output format. Decoding was deterministic (temperature set to zero) to ensure reproducibility across runs. The exact prompt used was:

You are a classifier. Choose exactly ONE topic label from: [Ukraine Conflict, Gaza Conflict, Messi Ronaldo, Sports NFL, Sports NBA, Movies TV, Education, Food Nutrition, Space Astronomy, Health, COVID-19, Climate Environment, Weather Disasters, Artificial Intelligence, Tech Companies, US Politics, Crime Legal, Economy Finance, Scams, Other]. Return ONLY valid JSON like {"topic":"
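The response-handling side of this procedure can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the assumption that the model returns JSON with a "topic" key follows the prompt above, while the fallback to "Other" for malformed or out-of-inventory responses is our own assumption, labeled as such in the comments.

```python
import json

# Fixed topic inventory used by both LLM-based procedures (from the text).
TOPIC_LABELS = {
    "Ukraine Conflict", "Gaza Conflict", "Messi Ronaldo", "Sports NFL",
    "Sports NBA", "Movies TV", "Education", "Food Nutrition",
    "Space Astronomy", "Health", "COVID-19", "Climate Environment",
    "Weather Disasters", "Artificial Intelligence", "Tech Companies",
    "US Politics", "Crime Legal", "Economy Finance", "Scams", "Other",
}

def parse_topic(raw_response: str) -> str:
    """Validate a model response against the fixed label inventory.

    Assumed response format: JSON with a "topic" key, per the prompt.
    Fallback behavior (treating invalid JSON or unknown labels as "Other")
    is an illustrative assumption, not a documented part of the procedure;
    it guarantees that every note receives exactly one in-inventory label.
    """
    try:
        label = json.loads(raw_response).get("topic", "Other")
    except (json.JSONDecodeError, AttributeError):
        return "Other"
    return label if label in TOPIC_LABELS else "Other"
```

Enforcing the label whitelist after decoding is what makes the single-label constraint robust even when the model occasionally strays from the requested output format.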