Exploratory Analysis of Pairwise Interactions in Online Social Networks
Over the last few decades, sociologists have tried to explain human behaviour by analysing social networks, which requires access to data about interpersonal relationships. This was a major obstacle for the field until the emergence of online social networks (OSNs), which made collecting such data far easier. Nowadays, by crawling public profiles on OSNs, it is possible to build a social graph in which OSN “friends” are represented as connected nodes. An OSN connection does not necessarily indicate a close real-life relationship, but OSN interaction records may reveal real-life relationship intensities, a topic that has inspired a number of recent studies. Still, published research currently lacks an extensive exploratory analysis of OSN interaction records, i.e. a comprehensive overview of how users interact through the different interaction channels an OSN offers. In this paper we provide such an overview by leveraging the results of an extensive social experiment that collected records for over 3,200 Facebook users interacting with over 1,400,000 of their friends. Our exploratory analysis focuses on extracting population distributions and correlation parameters for 13 interaction parameters, providing valuable insight into online social network interaction for future research in this field.
💡 Research Summary
The paper presents a comprehensive exploratory analysis of user‑to‑user interaction records on Facebook, derived from a large‑scale social experiment called “NajFrend”. Conducted in April–May 2015, the experiment recruited 3,277 participants (predominantly 18‑30‑year‑old students from Croatia and neighboring countries) who granted explicit permission to access their Facebook data via the now‑deprecated Facebook API 1.0. The researchers collected interaction data for more than 1.4 million ego‑user–friend pairs, focusing on thirteen interaction parameters: number of mutual friends, counts of likes, comments, and posts on the ego‑user’s feed, joint tagging in posts, mutual photos published by the ego‑user, by the friend, or by third parties, likes and comments on photos, private chat messages (limited to the most recent 50 exchanges), and likes on the friend’s photos and links.
The primary analytical goals were (1) to characterize the empirical distribution of each parameter and (2) to quantify pairwise relationships among parameters using Pearson’s correlation coefficient. Because the dataset is heavily zero‑inflated—most users interact very little with the majority of their Facebook friends—the authors first isolated non‑zero observations for distribution fitting. They evaluated a suite of candidate theoretical distributions (beta, gamma, inverse‑gamma, normal, log‑normal, skew‑normal, geometric, uniform) using maximum‑likelihood estimation and chi‑square goodness‑of‑fit tests (50 bins). The gamma distribution emerged as the best fit for twelve of the thirteen parameters, while the inbox_chat variable was best modeled by a log‑normal distribution. Parameter‑specific zero‑value ratios ranged from 3.33 % (mutual friends) to 97.58 % (joint posts), underscoring the sparsity of online interaction.
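The zero-separation, maximum-likelihood fit, and chi-square steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline. It covers only the log-normal case, because its maximum-likelihood estimate has a closed form; a gamma fit would need a numerical solver. The equal-probability binning scheme, the function names, and the synthetic sample are all assumptions made for this example, since the summary does not say how the 50 bins were defined.

```python
import bisect
import math
import random
from statistics import NormalDist


def zero_ratio(values):
    """Share of observations that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def fit_lognormal_mle(positive_values):
    """Closed-form log-normal MLE: mean and population std of log(x)."""
    logs = [math.log(v) for v in positive_values]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma


def chi_square_statistic(positive_values, mu, sigma, bins=50):
    """Chi-square statistic over equal-probability bins of the fitted model."""
    model = NormalDist(mu, sigma)
    # Interior bin edges on the log scale; each bin has probability 1/bins.
    edges = [model.inv_cdf(i / bins) for i in range(1, bins)]
    observed = [0] * bins
    for v in positive_values:
        observed[bisect.bisect_right(edges, math.log(v))] += 1
    expected = len(positive_values) / bins
    return sum((o - expected) ** 2 / expected for o in observed)


# Synthetic stand-in for one interaction parameter (not the NajFrend data):
# 3,000 friend pairs with no interaction and 2,000 with a positive value.
rng = random.Random(0)
sample = [0.0] * 3000 + [rng.lognormvariate(1.0, 0.5) for _ in range(2000)]

nonzero = [v for v in sample if v > 0]  # isolate non-zero observations
mu_hat, sigma_hat = fit_lognormal_mle(nonzero)
stat = chi_square_statistic(nonzero, mu_hat, sigma_hat)
print(f"zero ratio: {zero_ratio(sample):.2%}")
print(f"fitted mu={mu_hat:.3f}, sigma={sigma_hat:.3f}, chi-square={stat:.1f}")
```

A smaller statistic indicates a closer fit. With 50 bins and two fitted parameters, it would typically be compared against a chi-square distribution with roughly 47 degrees of freedom. Real interaction counts are integers, so a continuous fit like this one is an approximation.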
Correlation analysis revealed several notable patterns. The strongest positive correlation (≈0.78) was observed between feed_comment and feed_addressed, indicating that users who frequently comment on a friend’s posts also tend to publish many standalone posts on their own timeline. High correlations also existed between photo_like and feed_like (≈0.71) and between photo_comment and feed_comment, reflecting a general consistency in how users react to textual versus visual content. In contrast, the three mutual‑photo variables (published by user, by friend, by others) displayed low inter‑correlations, suggesting that photo‑sharing habits are highly individualised. Crucially, the number of mutual friends showed negligible correlation (<0.1) with any interaction metric, challenging the intuitive assumption that a larger shared friend pool translates into more intensive online communication.
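For readers who want to reproduce this kind of pairwise comparison, the sketch below computes Pearson's coefficient for every pair of columns. The toy values are invented for illustration and are not taken from the study. `feed_like` and `photo_like` follow the parameter names used above, while `mutual_friends` is a hypothetical column name. The summary does not say whether zero-valued pairs were kept in the correlation step; this example keeps them.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined for a constant column
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each (name_a, name_b) pair to its Pearson coefficient."""
    return {(a, b): pearson(columns[a], columns[b])
            for a in columns for b in columns}


# Hypothetical per-pair counts for six friend pairs (not the NajFrend data).
toy = {
    "feed_like": [0, 2, 5, 1, 0, 9],
    "photo_like": [0, 3, 4, 1, 0, 10],
    "mutual_friends": [40, 12, 55, 30, 48, 25],
}
matrix = correlation_matrix(toy)
for (a, b), r in matrix.items():
    if a < b:  # print each unordered pair once
        print(f"{a} vs {b}: r = {r:+.2f}")
```

Pearson's coefficient measures linear association only, and on skewed, zero-heavy count data a few very active pairs can dominate it. A rank-based coefficient would be a natural robustness check, although the summary does not mention one.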
The discussion situates these findings within broader social‑network theory. The prevalence of zero interactions is consistent with Dunbar’s number, which posits a cognitive limit of roughly 150 stable relationships, far below the average Facebook friend count (429) observed in the sample. The authors argue that the identified distributional patterns and the weak association between structural (friendship) and behavioral (interaction) data highlight the need for sophisticated probabilistic models, particularly gamma and log‑normal, to capture the heavy‑tailed, skewed nature of OSN activity.
Limitations include the temporal specificity of the data (collected in 2015 through the since‑deprecated Facebook API 1.0), geographic concentration, and the inherent sparsity that may affect statistical power for low‑frequency interactions. Planned future work would expand the dataset across platforms (e.g., Instagram, TikTok), incorporate more diverse demographics, and use the exploratory insights to develop machine‑learning models capable of inferring real‑life relationship strength and visualising social graphs. The authors conclude that their extensive empirical foundation will support subsequent research aimed at distinguishing genuine offline relationships from mere online connections.