Social Bootstrapping: How Pinterest and Last.fm Social Communities Benefit by Borrowing Links from Facebook
How does one develop a new online community that is highly engaging to each user and promotes social interaction? A number of websites offer friend-finding features that help users bootstrap social networks on the website by copying links from an established network like Facebook or Twitter. This paper quantifies the extent to which such social bootstrapping is effective in enhancing a social experience of the website. First, we develop a stylised analytical model that suggests that copying tends to produce a giant connected component (i.e., a connected community) quickly and preserves properties such as reciprocity and clustering, up to a linear multiplicative factor. Second, we use data from two websites, Pinterest and Last.fm, to empirically compare the subgraph of links copied from Facebook to links created natively. We find that the copied subgraph has a giant component, higher reciprocity and clustering, and confirm that the copied connections see higher social interactions. However, the need for copying diminishes as users become more active and influential. Such users tend to create links natively on the website, to users who are more similar to them than their Facebook friends. Our findings give new insights into understanding how bootstrapping from established social networks can help engage new users by enhancing social interactivity.
💡 Research Summary
The paper investigates “social bootstrapping,” the practice of importing existing friendship ties from an established social network (e.g., Facebook) into a newly launched, content‑driven service, and asks how this practice influences the structural and behavioral health of the target community. The authors combine a stylized analytical model with large‑scale empirical analyses on two very different platforms—Pinterest (a visual pin‑sharing site) and Last.fm (a music‑listening and recommendation service).
Analytical model (Link Bootstrapping Sampling, LBS).
The model formalizes bootstrapping as a two‑step random sampling process. First, a user decides to link their account on the target site with a Facebook account (node‑sampling probability p₁). Second, the user selects which of their Facebook friends to import (link‑sampling probability p₂). The probability that a particular Facebook edge appears in the target network is therefore pₑ = p₁·p₂. Using this framework the authors derive closed‑form expressions for the degree distribution of the copied subgraph, its first and second moments, and, most importantly, the condition under which a giant connected component (GCC) emerges. For an undirected source network the GCC appears when pₑ ≥ ⟨k⟩ / ⟨k²⟩, where ⟨k⟩ and ⟨k²⟩ are the first and second moments of the source network's degree distribution. Because real social graphs are heavy‑tailed, ⟨k²⟩ can be very large (or even diverge), so the threshold is close to zero and even a tiny sampling fraction creates a GCC. The model also shows that reciprocity and clustering in the copied subgraph scale linearly with p₂ and pₑ respectively (R_c = p₂, C_c ≈ pₑ·C_s, where the subscripts c and s denote the copied subgraph and the source network). Thus, copying preserves the structural “richness” of the source network up to these multiplicative factors.
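The effect of the sampling probability on connectivity can be explored with a small simulation. The sketch below is not the authors' LBS model: it collapses the two sampling steps into a single per-edge retention probability p_e = p1 * p2, uses a toy preferential-attachment graph as a stand-in for the Facebook source network, and evaluates the ratio ⟨k⟩ / ⟨k²⟩ quoted in the summary. All function names and parameter values are illustrative assumptions.

```python
import random
from collections import defaultdict


def preferential_attachment_edges(n, m, rng):
    """Toy heavy-tailed source graph (assumption, not Facebook data):
    each new node attaches to up to m existing nodes chosen
    proportionally to their current degree."""
    edges = set()
    repeated = []                 # node ids, one entry per incident edge end
    targets = set(range(m))       # the first new node links to nodes 0..m-1
    for new in range(m, n):
        for t in sorted(targets):
            edges.add((t, new))
            repeated.append(t)
            repeated.append(new)
        targets = {rng.choice(repeated) for _ in range(m)}
    return edges


def moment_ratio(edges, n):
    """Return <k> / <k^2> for an undirected edge list over nodes 0..n-1."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    k1 = sum(degree) / n
    k2 = sum(d * d for d in degree) / n
    return k1 / k2


def sample_edges(edges, p_e, rng):
    """Keep each source edge independently with probability p_e
    (simplification: p_e stands for the product p1 * p2)."""
    return [e for e in sorted(edges) if rng.random() < p_e]


def giant_component_fraction(edges, n):
    """Fraction of the n nodes in the largest connected component."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    sizes = defaultdict(int)
    for x in range(n):
        sizes[find(x)] += 1
    return max(sizes.values()) / n


rng = random.Random(0)
n = 2000
source = preferential_attachment_edges(n, 3, rng)
threshold = moment_ratio(source, n)
print(f"<k>/<k^2> = {threshold:.4f}")
for p_e in (0.5 * threshold, threshold, 4 * threshold, 0.5):
    kept = sample_edges(source, p_e, rng)
    share = giant_component_fraction(kept, n)
    print(f"p_e = {p_e:.4f}: largest component holds {share:.1%} of nodes")
```

Sweeping p_e around the computed ratio shows how the largest component grows as more source edges are retained. A finite toy graph will not show a sharp transition, so the output is an illustration rather than a validation of the closed-form condition.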
Data collection.
The authors obtained massive graphs from Pinterest and Last.fm, together with the corresponding Facebook friendship data for the same users (thanks to the platforms’ Friend‑Finder APIs). For each platform they distinguished:
- Fb‑copied – the subgraph consisting only of edges imported from Facebook;
- Native – the subgraph consisting only of edges created directly on the platform.
They also gathered activity logs (pins, repins, comments on Pinterest; track plays, scrobbles, tags on Last.fm) to measure actual social interaction.
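The split into Fb‑copied and native subgraphs can be sketched as a set operation. The snippet assumes a target-site edge counts as Fb‑copied when the same pair of users is also a Facebook friendship; the summary does not state the authors' exact rule, and the function name and toy data are hypothetical.

```python
def split_edges(target_edges, facebook_friendships):
    """Partition directed target-site edges into (copied, native).

    Assumption: an edge (u, v) is 'copied' if u and v are Facebook
    friends, regardless of direction, since Facebook ties are undirected.
    """
    fb_pairs = {frozenset(pair) for pair in facebook_friendships}
    target = set(target_edges)
    copied = {(u, v) for (u, v) in target if frozenset((u, v)) in fb_pairs}
    native = target - copied
    return copied, native


# Toy data: a and b are Facebook friends and follow each other on the
# target site; a also follows c, whom a does not know on Facebook.
target = {("a", "b"), ("b", "a"), ("a", "c")}
facebook = [("a", "b")]
copied, native = split_edges(target, facebook)
print(sorted(copied), sorted(native))
```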
Empirical findings.
- Connectivity. Although Fb‑copied edges involve only about 10–15 % of all users, they form a single giant component that contains roughly 70–80 % of the copied users. The native subgraph is fragmented into many small components, indicating that bootstrapping is highly effective at seeding a connected community.
- Reciprocity. The copied subgraph exhibits a reciprocity of 0.45–0.52, roughly double that of the native subgraph (≈0.18–0.22). This reflects the fact that Facebook friendships are inherently bidirectional, and the model predicts R_c = p₂. Higher reciprocity is known to foster trust and more frequent exchanges.
- Clustering. Clustering coefficients in the copied subgraph are 0.08–0.11, again about two to three times larger than in the native subgraph. Because Facebook’s graph is highly clustered, the linear relationship C_c ≈ pₑ·C_s ensures that this property is transferred to the target network. High clustering supports triadic closure and local information diffusion.
- Social interaction intensity. Interaction metrics (pins, repins, comments on Pinterest; track plays, “loved” tracks on Last.fm) are 1.5–2× higher on copied edges than on native edges. Thus, the structural advantages of bootstrapped links translate into concrete behavioral benefits.
- Evolution with user activity. Users who become highly active or influential (top 10 % in pins, scrobbles, follower count) gradually shift away from relying on copied edges. For these users, the proportion of native edges rises above 70 %, while copied edges drop below 30 %. This “weaning” effect suggests that as users learn the platform’s content niche, they preferentially form ties with people who share similar interests.
- Interest similarity. Using content‑based similarity (board categories on Pinterest, genre tags on Last.fm), native edges show on average 20 % higher similarity than copied edges. This indicates that native links are better at capturing the “interest‑based” social fabric that content‑driven services aim to nurture.
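The structural and similarity measures compared above can be computed from plain edge and tag sets. The following is a minimal sketch using common textbook definitions (share of reciprocated directed edges, average local clustering on the undirected projection, Jaccard overlap of interest tags); the summary does not specify which exact variants the authors used, and the toy data is invented for illustration.

```python
from collections import defaultdict
from itertools import combinations


def reciprocity(directed_edges):
    """Share of directed edges (u, v) whose reverse (v, u) also exists."""
    edges = set(directed_edges)
    if not edges:
        return 0.0
    mutual = sum(1 for (u, v) in edges if (v, u) in edges)
    return mutual / len(edges)


def average_clustering(directed_edges):
    """Mean local clustering coefficient on the undirected projection.
    Nodes with fewer than two neighbours contribute 0 to the mean."""
    neighbours = defaultdict(set)
    for u, v in directed_edges:
        if u != v:
            neighbours[u].add(v)
            neighbours[v].add(u)
    if not neighbours:
        return 0.0
    total = 0.0
    for node, ns in neighbours.items():
        k = len(ns)
        if k < 2:
            continue
        closed = sum(1 for a, b in combinations(ns, 2) if b in neighbours[a])
        total += 2 * closed / (k * (k - 1))
    return total / len(neighbours)


def jaccard(tags_a, tags_b):
    """Jaccard similarity of two interest sets (e.g. categories or tags)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy directed graph: a and b follow each other, b follows c, c follows a.
toy = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "a")}
print("reciprocity:", reciprocity(toy))
print("clustering:", average_clustering(toy))
print("similarity:", jaccard({"rock", "jazz"}, {"jazz", "pop"}))
```

Running the same functions separately on the copied and native edge sets gives the kind of side-by-side comparison reported in the list above.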
Interpretation and design implications.
The authors propose a three‑stage lifecycle:
Stage 1 – Bootstrapping. A small set of copied edges quickly creates a giant component, high reciprocity, and high clustering, lowering the barrier for new users to find friends and interact.
Stage 2 – Hybrid growth. As the community expands, both copied and native edges coexist. Copied edges continue to provide a backbone of connectivity, while native edges begin to appear around clusters of similar interests.
Stage 3 – Weaning. Highly active users rely mainly on native edges, which are more homophilic and thus more effective at sustaining long‑term engagement.
From a product perspective, the findings suggest that new platforms should (i) offer seamless social‑login and friend‑import tools to accelerate early growth, and (ii) invest in recommendation, group‑formation, and interest‑matching mechanisms that encourage users to create native connections once they are sufficiently engaged. Ignoring the weaning phase could lead to a plateau in activity, as the platform would remain dependent on a static set of imported ties that may not reflect users’ evolving preferences.
Conclusion.
Social bootstrapping is a powerful catalyst for the rapid emergence of a connected, interactive community on new content‑centric services. The analytical LBS model explains why even modest sampling rates generate a giant component, and why reciprocity and clustering are preserved. Empirical evidence from Pinterest and Last.fm confirms that copied edges are structurally superior and generate more interactions in the early life of a platform. However, sustained growth and deep engagement ultimately depend on users forming native, interest‑aligned ties. Designers of future online communities should therefore treat friend‑import features as an onboarding accelerator, not a permanent solution, and should complement them with tools that foster organic, homophilic link formation.