Towards Detecting Compromised Accounts on Social Networks
Compromising social network accounts has become a profitable course of action for cybercriminals. By hijacking control of a popular media or business account, attackers can distribute their malicious messages or disseminate fake information to a large user base. The impacts of these incidents range from a tarnished reputation to multi-billion dollar monetary losses on financial markets. In our previous work, we demonstrated how we can detect large-scale compromises (i.e., so-called campaigns) of regular online social network users. In this work, we show how we can use similar techniques to identify compromises of individual high-profile accounts. High-profile accounts frequently have one characteristic that makes this detection reliable – they show consistent behavior over time. We show that our system, were it deployed, would have been able to detect and prevent three real-world attacks against popular companies and news agencies. Furthermore, our system, in contrast to popular media, would not have fallen for a staged compromise instigated by a US restaurant chain for publicity reasons.
💡 Research Summary
The paper introduces COMPA, a novel system for detecting compromised social‑network accounts, with a particular focus on high‑profile accounts such as news organizations, corporations, and other verified entities. The authors observe that these accounts exhibit remarkably stable behavioral patterns over time—regular posting hours, consistent use of specific client applications, preferred language, typical message length, and habitual inclusion (or exclusion) of hashtags, mentions, and URLs. By learning these patterns from historical posts, COMPA can flag any new message that deviates significantly from the learned profile as a potential compromise.
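The behavioral features the summary lists can be sketched as a simple extraction step. This is an illustrative reconstruction, not the paper's code: the message dict keys (`text`, `created_at`, `source`, `lang`) and the regex-based heuristics are assumptions for demonstration.

```python
from datetime import datetime
import re

def extract_features(message):
    """Extract the seven behavioral features described in the summary
    from a message dict. The dict layout ('text', 'created_at',
    'source', 'lang') is illustrative, not the paper's actual format."""
    text = message["text"]
    return {
        "hour": datetime.fromisoformat(message["created_at"]).hour,
        "source": message["source"],              # client application
        "language": message["lang"],
        "length": len(text),
        "has_hashtag": "#" in text,
        "mention_count": len(re.findall(r"@\w+", text)),
        "has_url": bool(re.search(r"https?://", text)),
    }

msg = {"text": "Breaking: markets rally #finance https://example.com",
       "created_at": "2013-04-23T13:07:00",
       "source": "TweetDeck", "lang": "en"}
features = extract_features(msg)
```

Each feature is deliberately cheap to compute, which matters when a detector must keep up with a live message stream.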
COMPA’s pipeline consists of two main stages. First, it gathers a user’s message stream from the platform (Twitter timelines, Facebook walls, etc.) and requires a minimum of ten past messages to construct a reliable profile. From each message it extracts seven features: (1) hour‑of‑day, (2) source application, (3) language, (4) message length, (5) presence of hashtags, (6) number of user mentions, and (7) presence of URLs. For each feature, a statistical model (typically a Gaussian or multinomial distribution) captures the normal range of values for that user. In the second stage, when a new message arrives, the same features are extracted and compared against the corresponding models, producing an anomaly score for each feature; these per‑feature scores are combined into an overall score, and a message whose score exceeds a threshold is flagged as a likely compromise.
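The two stages above can be sketched as a minimal profile class: categorical features get frequency tables (a multinomial-style model) and message length gets a Gaussian, with per-feature anomalies averaged into one score. This is a simplified sketch under stated assumptions; the paper's actual models, feature weighting, and threshold tuning differ.

```python
from collections import Counter
from statistics import mean, pstdev

class Profile:
    """Sketch of COMPA-style per-feature models. Categorical features
    use frequency tables; numeric message length uses a Gaussian.
    Model choices and the equal weighting are illustrative, not the
    paper's exact design."""
    MIN_MESSAGES = 10  # the summary: at least ten past messages needed
    CATEGORICAL = ("hour", "source", "language",
                   "has_hashtag", "has_url", "mention_count")

    def __init__(self, feature_dicts):
        if len(feature_dicts) < self.MIN_MESSAGES:
            raise ValueError("not enough history to build a profile")
        self.tables = {}
        for key in self.CATEGORICAL:
            counts = Counter(d[key] for d in feature_dicts)
            total = sum(counts.values())
            self.tables[key] = {v: c / total for v, c in counts.items()}
        lengths = [d["length"] for d in feature_dicts]
        self.mu, self.sigma = mean(lengths), pstdev(lengths) or 1.0

    def anomaly_score(self, feats):
        """Average per-feature anomaly: 0 = typical, 1 = never seen."""
        scores = [1.0 - self.tables[k].get(feats[k], 0.0)
                  for k in self.CATEGORICAL]
        z = abs(feats["length"] - self.mu) / self.sigma
        scores.append(min(z / 3.0, 1.0))  # map 3-sigma deviation to [0, 1]
        return sum(scores) / len(scores)

# Usage: a stable history yields a low score for in-character messages
# and a high score for out-of-character ones.
history = [{"hour": 9, "source": "TweetDeck", "language": "en",
            "has_hashtag": True, "has_url": True,
            "mention_count": 0, "length": 80 + i} for i in range(10)]
profile = Profile(history)
normal = profile.anomaly_score(history[0])
odd = profile.anomaly_score({"hour": 3, "source": "web", "language": "ru",
                             "has_hashtag": False, "has_url": True,
                             "mention_count": 5, "length": 20})
```

A hijacked high-profile account typically changes several features at once (new client, unusual hour, different language), so averaging across features makes the combined score robust to a single benign deviation.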