The Directed Closure Process in Hybrid Social-Information Networks,
with an Analysis of Link Formation on Twitter
Daniel M. Romero
Cornell University
dmr239@cornell.edu
Jon Kleinberg
Cornell University
kleinber@cs.cornell.edu
Abstract
It has often been taken as a working assumption that directed
links in information networks are frequently formed by
“short-cutting” a two-step path between the source and the
destination, a kind of implicit “link copying” analogous to the
process of triadic closure in social networks. Despite the role
of this assumption in theoretical models such as preferential
attachment, it has received very little direct empirical
investigation. Here we develop a formalization and methodology
for studying this type of directed closure process, and we
provide evidence for its important role in the formation of
links on Twitter. We then analyze a sequence of models designed
to capture the structural phenomena related to directed closure
that we observe in the Twitter data.
Introduction
Information networks, which connect Web pages or other units of
information, and social networks, which connect people, are
related notions, but they exhibit fundamental differences. Two
of the principal differences are based on directionality and
heterogeneity. First, information networks are generally
directed structures, with links created by one author to point
to another; social networks, on the other hand, tend to be
represented in most basic settings as undirected structures,
expressing relationships that are approximately mutual. Second,
information networks tend to contain a few nodes with extremely
large numbers of incoming edges (documents or pages that are
“famous” and hence widely referenced), while social networks
exhibit disparities in connectivity only to a smaller extent,
since even the most gregarious people have some practical limit
on the number of genuine social ties they can form.
Copyright © 2018, Association for the Advancement of Artificial
Intelligence (www.aaai.org). All rights reserved.
This work has been supported in part by NSF grants BCS-0537606,
IIS-0705774, IIS-0910664, CCF-0910940, a Google Research Grant,
a Yahoo! Research Alliance Grant, and the John D. and
Catherine T. MacArthur Foundation.
[Figure 1 panels, each on nodes A, B, and C: (a) Undirected
triangle; (b) Directed “feed-forward” triangle.]
Figure 1: (a) Triadic closure in an undirected graph produces
a triangle when an edge connects two nodes that already have a
common neighbor. (b) Analogously, in a directed information
network, directed closure occurs when a node A links to a node
C to which it already has a two-step path (through a node B).
This creates a directed triangle (a “feed-forward” structure on
three nodes).
The link structures of the Web, and of well-defined subsets of
the Web such as the blogosphere and Wikipedia, are clear
examples of information networks; social-networking sites such
as Facebook have provided us with very large representations of
social networks that are derived from social structure in the
off-line world. An interesting recent development has been the
growth of social media sites that increasingly interpolate
between the properties of information networks and social
networks. The micro-blogging site Twitter is a compelling
example of such an interpolation.
A user on Twitter is able to create links to other users whose
content he or she is interested in; this is referred to as
following these users, and the set of all such follower
relations defines a network. The structure of this network
reflects properties both of a social network, since it exposes
underlying friendship relations among people, and also of an
information network, since it is directed and also contains
huge concentrations of links to specific “celebrities” and
automated generators of news content that reflect fundamentally
informational relations.
Link Formation in Information Networks. In a social network,
triadic closure is one of the fundamental processes of link
formation: there is an increased chance that a friendship will
form between two people if they already have a friend in common
(Rapoport 1953; Granovetter 1973). (For example, we could
imagine the A-C friendship in Figure 1(a) as forming after the
existence of the A-B and B-C edges, and accelerated by the
existence of these two edges.) Recent empirical analysis has
quantified this effect on large social network datasets
(Kossinets and Watts 2006). Is there an analogous process in
information networks?
arXiv:1003.2469v1 [stat.ML] 12 Mar 2010
[Figure 2: diagram on nodes A, C, B1, B2, B3, and B4.]
Figure 2: In this example, the edge from A to C exhibits
closure if there is already a two-step path from A to C (i.e.,
through B1, B2, B3) when the A-C edge arrives.
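The closure condition illustrated in Figure 2 can be made concrete with a short sketch. The Python below is an illustration only, not the authors' methodology: it assumes the network is available as a list of directed (source, target) edges in arrival order, a representation the text does not specify, and the function name `closure_intermediaries` is hypothetical. For each arriving edge from A to C, it collects the nodes B for which both the A-to-B and B-to-C edges were already present; the edge exhibits closure exactly when that set is non-empty.

```python
from collections import defaultdict


def closure_intermediaries(edge_sequence):
    """Return (a, c, via) for every edge, in arrival order.

    `via` is the set of nodes b such that a->b and b->c both arrived
    before a->c. The edge a->c exhibits closure iff `via` is non-empty.
    The edge-list input format is an assumption made for this sketch.
    """
    successors = defaultdict(set)    # successors[x]: nodes x already links to
    predecessors = defaultdict(set)  # predecessors[x]: nodes already linking to x
    results = []
    for a, c in edge_sequence:
        # Two-step paths a->b->c that exist at the moment a->c arrives.
        via = (successors[a] & predecessors[c]) - {a, c}
        results.append((a, c, via))
        # Only now record the new edge, so it cannot close itself.
        successors[a].add(c)
        predecessors[c].add(a)
    return results


# Hypothetical arrival sequence: A->C arrives after two-step paths
# through B1 and B2 have already formed.
example_edges = [
    ("A", "B1"), ("B1", "C"),
    ("A", "B2"), ("B2", "C"),
    ("A", "C"),
    ("C", "D"),
]
for a, c, via in closure_intermediaries(example_edges):
    label = "closure via " + ", ".join(sorted(via)) if via else "no closure"
    print(f"{a} -> {c}: {label}")
```

Arrival order matters in this sketch: if the A-to-C edge arrives before the B-to-C edge, no edge in that triangle is counted as exhibiting closure, even though the final graph contains the same feed-forward triangle.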
A natural hypothesis for such a process is the following:
if a node A in an information network links to B, and B
links to C, then one should arguably expect an increased
likelihood that A will link to C — since the author of A
…(Full text truncated)…