Identifying Stable Influencers: Distinguishing Stable and Temporal Influencers Using Long-Term Twitter Data


For effective social media marketing, identifying stable influencers (those who sustain their influence over an extended period) is more valuable than focusing on users who are influential only temporarily. This study addresses the challenge of distinguishing stable influencers from transient ones among users who are influential at a given point in time. We focus on two distinct types of influencers: source spreaders, who widely disseminate their own content, and brokers, who play a key role in propagating information originating from others. Using six months of retweet data from approximately 19,000 Twitter users, we analyze the characteristics of stable influencers. Our findings reveal that users who have maintained influence in the past are more likely to continue doing so in the future. Furthermore, we develop classification models to predict stable influencers among temporarily influential users, achieving an AUC of approximately 0.89 for source spreaders and 0.81 for brokers. Our experimental results show that current influence is a critical factor in classifying influencers, while past influence also contributes significantly, particularly for source spreaders.


💡 Research Summary

The paper tackles the practical problem of distinguishing “stable influencers” – users who maintain high influence over an extended period – from “temporal influencers” who are only briefly popular on Twitter. While many prior works identify influential users using short observation windows or single diffusion events, this study leverages six months of retweet data from roughly 19,000 English‑language Twitter accounts to examine long‑term dynamics. The authors focus on two canonical influencer roles: (1) source spreaders, who generate large numbers of retweets for their own original tweets, and (2) brokers, who act as bridges by retweeting others’ content and thereby triggering further cascades.

Data collection and preprocessing

  • Monthly snapshots of the follower graph (who follows whom) and the retweet graph (who retweeted whom) were collected from October 2021 to December 2022.
  • Users were retained only if they received at least one retweet in every month of 2022 and maintained at least one follower throughout the period, yielding 18,950 active accounts.
  • For each month $m$, a directed follower network $G_{FO}^m$ and a directed retweet network $G_{RT}^m$ were built; nodes represent users, and edges represent follow relationships or at least one retweet during that month.
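
The monthly network construction and the "retweeted in every month" filter described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the `(month, retweeter, author)` event format, the edge direction (retweeter to author), and both function names are assumptions made for the example.

```python
from collections import defaultdict

def build_monthly_retweet_graphs(events):
    """Group retweet events into one directed edge set per month.

    events: iterable of (month, retweeter, author) tuples (hypothetical format).
    An edge (retweeter, author) exists if the retweeter retweeted the author
    at least once in that month; repeated retweets collapse into one edge.
    """
    graphs = defaultdict(set)
    for month, retweeter, author in events:
        graphs[month].add((retweeter, author))
    return dict(graphs)

def users_retweeted_every_month(graphs, months):
    """Return users with at least one incoming retweet edge in every month."""
    retained = None
    for month in months:
        authors = {author for _retweeter, author in graphs.get(month, set())}
        retained = authors if retained is None else retained & authors
    return retained or set()

events = [
    ("2022-01", "bob", "alice"),
    ("2022-01", "bob", "alice"),    # same pair again: still a single edge
    ("2022-01", "alice", "carol"),
    ("2022-02", "carol", "alice"),
]
graphs = build_monthly_retweet_graphs(events)
active = users_retweeted_every_month(graphs, ["2022-01", "2022-02"])
print(active)  # only alice is retweeted in both months
```

The follower-count condition from the filtering step is omitted here; it would be an analogous check against the monthly follower edge sets.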

Influence metrics

  • Source spreader score $S_u^\tau = \sum_{p \in P_u} |R_p^\tau|$ – total retweets received by all tweets posted by user $u$ in month $\tau$.
  • Broker score $B_u^\tau = \sum_{p} |D_{p,u}^\tau|$ – total number of subsequent retweets that occur after $u$ retweets a tweet $p$ in month $\tau$.
  • Users ranking in the top 10 % of $S$ (or $B$) for a given month are labeled as source spreaders (or brokers) for that month.
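
As a rough illustration of the two scores and the top-10 % cut, the sketch below works on one month of retweet records. Several things here are assumptions rather than details from the paper: the `(tweet_id, author, retweeter, timestamp)` record format, the reading of "subsequent retweets" as all retweets of the same tweet with a later timestamp, and the choice to rank only users that have a score entry.

```python
import math
from collections import defaultdict

def source_spreader_scores(retweets):
    """Total retweets received by each author's tweets in the month."""
    scores = defaultdict(int)
    for _tweet_id, author, _retweeter, _ts in retweets:
        scores[author] += 1
    return dict(scores)

def broker_scores(retweets):
    """For each retweeter, count retweets of the same tweet that come later.

    Assumption: "subsequent" means later in time within the same cascade;
    timestamp ties are broken arbitrarily by the sort.
    """
    cascades = defaultdict(list)
    for tweet_id, _author, retweeter, ts in retweets:
        cascades[tweet_id].append((ts, retweeter))
    scores = defaultdict(int)
    for cascade in cascades.values():
        cascade.sort()
        for pos, (_ts, retweeter) in enumerate(cascade):
            scores[retweeter] += len(cascade) - pos - 1
    return dict(scores)

def top_fraction(scores, fraction=0.10):
    """Return the users in the top `fraction` by score (at least one user)."""
    k = max(1, math.ceil(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

retweets = [
    ("t1", "alice", "bob", 1),
    ("t1", "alice", "carol", 2),
    ("t1", "alice", "dave", 3),
    ("t2", "bob", "carol", 1),
]
S = source_spreader_scores(retweets)
B = broker_scores(retweets)
spreaders = top_fraction(S)
```

In this toy month, alice receives three retweets, and bob's early retweet of `t1` is followed by two more, so bob gets the highest broker score.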

Stable vs. temporal labeling
A user is deemed a stable influencer if they appear in the top‑10 % for $m$ consecutive months starting from a reference month (default $m = 6$). All others who are influential in the reference month but fail to maintain the status are labeled temporal.
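
The labeling rule can be expressed compactly once the monthly top-10 % sets are available. The following is a minimal sketch; the `monthly_top` dictionary, the month keys, and the function name are hypothetical, and the window is assumed to include the reference month itself.

```python
def label_influencers(monthly_top, months, reference, m=6):
    """Label users influential in `reference` as "stable" or "temporal".

    monthly_top: {month: set of users in the top 10 % that month}
    months: chronologically ordered list of month keys
    A user is stable if present in all m months starting at `reference`.
    """
    start = months.index(reference)
    window = months[start:start + m]
    if len(window) < m:
        raise ValueError("not enough months after the reference month")
    labels = {}
    for user in monthly_top[reference]:
        stable = all(user in monthly_top[month] for month in window)
        labels[user] = "stable" if stable else "temporal"
    return labels

months = ["2022-01", "2022-02", "2022-03"]
monthly_top = {
    "2022-01": {"alice", "bob"},
    "2022-02": {"alice"},
    "2022-03": {"alice", "bob"},
}
# m=3 only to keep the toy example short; the summary's default is m=6.
labels = label_influencers(monthly_top, months, "2022-01", m=3)
```

Here bob drops out in February and is therefore temporal even though he returns in March, because the months must be consecutive.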

Feature engineering
From the follower and retweet graphs, the authors compute the following for each user in each of the four months up to and including the reference month (October 2021–January 2022):

  • In‑degree (follower count, retweeter count)
  • PageRank (both networks)
  • Community size (Leiden algorithm)
  • Unique‑user rate $Q_u^\tau$ (ratio of distinct retweeters to total retweets)
  • Influence‑score change rate $C_u^{\tau \rightarrow \tau'} = \log \frac{I_u^{\tau'}}{I_u^{\tau}}$ (single value per month)
  • The raw broker score series (four dimensions)

Overall, 13 feature groups are used, most represented as 4‑dimensional vectors, yielding a rich temporal‑structural profile for each candidate.
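
Two of the listed features follow directly from their definitions and can be sketched in a few lines. The function names are illustrative, and returning `None` when a value is undefined (no retweets, or a zero score inside the logarithm) is an assumption; the summary does not say how such cases are handled.

```python
import math

def unique_user_rate(retweeters):
    """Ratio of distinct retweeters to total retweets for one user-month.

    retweeters: list with one entry per retweet received.
    Returns None when the user received no retweets (assumed handling).
    """
    if not retweeters:
        return None
    return len(set(retweeters)) / len(retweeters)

def change_rate(score_before, score_after):
    """Log ratio of influence scores between two months.

    Returns None if either score is non-positive, since the log ratio is
    undefined there (assumed handling).
    """
    if score_before <= 0 or score_after <= 0:
        return None
    return math.log(score_after / score_before)

def monthly_change_rates(scores):
    """Change rates between each pair of consecutive months."""
    return [change_rate(a, b) for a, b in zip(scores, scores[1:])]

rate = unique_user_rate(["bob", "bob", "carol", "dave"])   # 3 distinct / 4 total
changes = monthly_change_rates([10, 20, 20, 5])
```

A doubling of the score gives a change rate of log 2, an unchanged score gives 0, and a drop to a quarter gives a negative value of the same magnitude as two doublings.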

Prediction model
A LightGBM gradient‑boosted decision tree classifier is trained on the 70 % training split, with 5‑fold cross‑validation for hyper‑parameter tuning. The test set (30 %) consists of users identified as influencers in July 2022; features from April–July are used to predict whether they will stay influential for the next six months (July–December). Performance is reported via accuracy and Area Under the ROC Curve (AUC).
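
For readers unfamiliar with the two reported metrics, the sketch below computes them from scratch on toy data. It is not the authors' evaluation code: the labels and scores are invented for illustration, the 0.5 accuracy threshold is an assumption, and AUC is computed by the pairwise definition (the share of stable/temporal pairs in which the stable user gets the higher score, with ties counted as half).

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.5):
    """Share of users whose thresholded score matches the true label."""
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)

# Toy data: 1 = stable, 0 = temporal; scores are hypothetical model outputs.
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.7]
auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
```

Because AUC depends only on the ranking of scores, it is unaffected by the choice of threshold, whereas accuracy changes with it. The quadratic pairwise loop is fine for illustration but a rank-based computation would be preferable for large test sets.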

Key empirical findings

  1. Retention rates – Approximately 52 % of source spreaders and 45 % of brokers identified in January 2022 remained in the top‑10 % for six consecutive months. Users with longer prior histories show higher retention (e.g., 74 % of source spreaders active since October stay stable).
  2. Predictive performance – For source spreaders, the model achieves AUC ≈ 0.89 and accuracy ≈ 0.81; for brokers, AUC ≈ 0.81 and accuracy ≈ 0.73.
  3. Feature importance – The current influence score (January total retweets for spreaders, January broker score for brokers) is the dominant predictor. For brokers, traditional centrality measures (degree, PageRank) contribute far less, confirming prior observations that broker influence cannot be captured by a single structural metric. Past consistency (e.g., number of months previously identified as influencer) also adds predictive power, especially for source spreaders.
  4. Behavioral insights – Stable source spreaders tend to have higher unique‑user rates, indicating a broader audience reach, and belong to larger, more cohesive follower communities. Stable brokers, however, display heterogeneous community memberships and rely more on the timing of their retweets than on raw network prominence.

Implications and future work

  • Marketing practice: Brands seeking long‑term influencer contracts should prioritize users with both high current influence and a demonstrated multi‑month history of influence, rather than relying solely on follower counts or short‑term virality spikes.
  • Methodological contribution: The study demonstrates that a modest set of temporal network features, combined with a gradient‑boosted classifier, suffices to predict stability with high accuracy, offering a scalable pipeline for industry applications.
  • Limitations: The dataset is limited to English tweets and a six‑month window; cultural or language‑specific dynamics may differ. Moreover, the binary stable/temporal label abstracts away gradations of influence decay, which could be modeled with survival analysis in future work.
  • Research directions: Extending the observation horizon to a year or more, incorporating content‑based signals (sentiment, topics), and exploring causal mechanisms (e.g., external events triggering bursts) would deepen understanding of influencer longevity.

In summary, the paper provides a rigorous, data‑driven framework for distinguishing stable from temporal influencers on Twitter, highlights the primacy of current influence scores while acknowledging the added value of past consistency, and delivers actionable insights for both academic researchers and practitioners in social media marketing.

