Four Degrees of Separation, Really
We recently measured the average distance of users in the Facebook graph, spurring comments in the scientific community as well as in the general press (“Four Degrees of Separation”). A number of interesting criticisms have been made about the meaningfulness, methods and consequences of the experiment we performed. In this paper we want to discuss some methodological aspects that we deem important to underline in the form of answers to the questions we have read in newspapers, magazines, blogs, or heard from colleagues. We indulge in some reflections on the actual meaning of “average distance” and make a number of side observations showing that, yes, 3.74 “degrees of separation” are really few.
💡 Research Summary
The paper “Four Degrees of Separation, Really” revisits the authors’ earlier measurement of the average shortest‑path length in the Facebook social graph, which famously yielded a value of 3.74 degrees of separation. The authors structure the manuscript as a point‑by‑point response to the most common criticisms that have appeared in the scientific literature, the popular press, and informal discussions among colleagues.
First, the authors describe the data set: roughly 721 million active Facebook accounts and about 69 billion undirected friendship edges collected between late 2011 and early 2012. They show that the graph is dominated by a giant connected component that contains 99.91% of all nodes, which justifies focusing on finite distances and ignoring the negligible fraction of isolated or peripheral nodes.
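To make the giant-component figure concrete, the following is a minimal sketch of how the fraction of nodes in the largest connected component can be computed on a small in-memory graph. The adjacency-dict representation, the function name `largest_component_fraction`, and the toy graph are assumptions made for illustration; they are not the authors' code and would not scale to a graph of Facebook's size.

```python
from collections import deque


def largest_component_fraction(adj):
    """Return the share of nodes that lie in the largest connected component.

    `adj` is a hypothetical adjacency dict: node -> list of neighbours
    (undirected, so every edge appears in both directions).
    """
    seen = set()
    largest = 0
    for start in adj:
        if start in seen:
            continue
        # Explore one component with a breadth-first traversal.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        largest = max(largest, size)
    return largest / len(adj) if adj else 0.0


# Toy graph (made up): a three-node path plus one isolated node.
toy_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
fraction = largest_component_fraction(toy_graph)
```

On the toy graph, three of the four nodes are connected, so `fraction` is 0.75. A value close to 1, as reported in the summary, means that almost every pair of nodes has a finite distance.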
Because a full breadth‑first search (BFS) from every vertex would be computationally prohibitive, the authors adopt a sampling strategy. They randomly select 10 000 seed vertices, run a BFS from each seed, and record the distance to every reachable node. To mitigate sampling bias, the seeds are drawn uniformly, and the authors apply degree‑based weighting to correct for the over‑representation of low‑degree nodes in a uniform sample. Multiple independent runs are performed to verify convergence, and the resulting distance distribution is aggregated to produce a global average.
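The sampling procedure described above can be sketched roughly as follows. This is an illustrative toy version under stated assumptions: the graph fits in memory as an adjacency dict, seeds are drawn uniformly without replacement, and the degree-based weighting mentioned in the summary is omitted. The names `bfs_distances` and `estimate_average_distance` are hypothetical and do not come from the paper.

```python
import random
from collections import deque


def bfs_distances(adj, source):
    """Map every node reachable from `source` to its hop distance."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def estimate_average_distance(adj, num_seeds, rng=None):
    """Average hop distance over (seed, reachable node) pairs.

    Seeds are sampled uniformly at random; no degree weighting is applied.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    nodes = list(adj)
    seeds = rng.sample(nodes, min(num_seeds, len(nodes)))
    total = 0
    pairs = 0
    for seed in seeds:
        for node, d in bfs_distances(adj, seed).items():
            if node != seed:  # skip the zero-length path to itself
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")


# Toy graph (made up): a path 0 - 1 - 2 - 3 - 4.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
average = estimate_average_distance(path, num_seeds=5)
```

With all five nodes used as seeds the estimate equals the exact average distance of the path, 2.0. With fewer seeds the result is an estimate whose accuracy depends on the number of seeds and on the shape of the graph.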
The empirical results are striking: the mean shortest‑path length is 3.74, the median is 3, and 90 % of all reachable pairs are within five hops of each other. The authors emphasize that the median and the distribution’s bulk (most pairs lie between two and four hops) support the popular “four degrees of separation” narrative, while the mean is modestly inflated by a small tail of unusually long paths.
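The relationship between the mean, the median, and the 90% figure is easier to see on a distance histogram. The sketch below computes the three statistics from a mapping of hop distance to pair count. The function name `summarize_distances` and the counts in `toy_hist` are invented for illustration and are not the Facebook measurements.

```python
def summarize_distances(hist):
    """Return (mean, median, 90th percentile) of a distance histogram.

    `hist` maps hop distance -> number of node pairs at that distance.
    Percentiles are the smallest distance covering at least that share of pairs.
    """
    total = sum(hist.values())
    mean = sum(d * count for d, count in hist.items()) / total

    def percentile(pct):
        cumulative = 0
        for d in sorted(hist):
            cumulative += hist[d]
            # Integer comparison avoids floating-point rounding issues.
            if cumulative * 100 >= pct * total:
                return d

    return mean, percentile(50), percentile(90)


# Made-up counts: most pairs sit at 2 or 3 hops, with a short tail.
toy_hist = {1: 1, 2: 3, 3: 4, 4: 1, 5: 1}
mean, median, p90 = summarize_distances(toy_hist)
```

For these toy counts the mean is 2.8, the median is 3, and 90% of pairs are within 4 hops. Shifting a few pairs into a longer tail raises the mean while leaving the median unchanged, which is the effect the summary describes.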
The paper then addresses three major lines of criticism.
- Social Meaning of “Friend” Links: Critics argue that Facebook “friendships” are a loose, possibly meaningless proxy for real social ties. The authors counter that, despite the broader definition, the structural properties of the Facebook graph (high clustering coefficient, heavy‑tailed degree distribution, assortativity) closely resemble those of offline social networks studied in sociology. Consequently, the graph’s topology remains a valid substrate for studying connectivity, even if individual edges do not always correspond to strong ties.
- Data Completeness and Sampling Errors: Some have claimed that the data set is incomplete because Facebook’s API imposes limits and because private accounts are excluded. The authors explain that they employed a distributed crawling infrastructure that accessed the full public friend list for each user, achieving an estimated missing‑node rate below 0.01%. They performed sensitivity analyses showing that the impact of this tiny omission on the average distance is statistically insignificant.
- Interpretation of a Small Average Distance: A philosophical objection is that a short average path length does not necessarily imply a “small world” in a socially meaningful sense. The authors clarify that the metric they report quantifies structural efficiency: how many hops are needed, on average, for information, memes, or contagions to travel across the network. They argue that this has concrete implications for viral marketing, epidemic modeling, and the design of recommendation algorithms, where the speed of diffusion is directly tied to the underlying path length distribution.
In the discussion, the authors reaffirm that the “four degrees of separation” claim is empirically grounded, not a sensationalist exaggeration. They also outline future research directions, including temporal dynamics (how the average distance evolves as the network grows), geographic or cultural sub‑network analyses, and extensions to directed or weighted relationships (e.g., follows, likes). Finally, they highlight the methodological contribution of scalable BFS‑based sampling for massive graphs, suggesting that similar techniques can be applied to other large‑scale networks such as Twitter, LinkedIn, or communication infrastructures.
Overall, the paper provides a transparent methodological account, robust statistical validation, and a thoughtful rebuttal to the most common critiques, thereby strengthening the scientific credibility of the claim that modern online social networks are truly “small worlds” with an average separation of just under four hops.