In the last few years we have witnessed the emergence, primarily in on-line communities, of new types of social networks that require for their representation more complex graph structures than have been employed in the past. One example is the folksonomy, a tripartite structure of users, resources, and tags -- labels collaboratively applied by the users to the resources in order to impart meaningful structure on an otherwise undifferentiated database. Here we propose a mathematical model of such tripartite structures which represents them as random hypergraphs. We show that it is possible to calculate many properties of this model exactly in the limit of large network size and we compare the results against observations of a real folksonomy, that of the on-line photography web site Flickr. We show that in some cases the model matches the properties of the observed network well, while in others there are significant differences, which we find to be attributable to the practice of multiple tagging, i.e., the application by a single user of many tags to one resource, or one tag to many resources.
arXiv:0903.0419v1 [physics.soc-ph] 3 Mar 2009
Random hypergraphs and their applications
Gourab Ghoshal,1 Vinko Zlatić,2,3 Guido Caldarelli,3,4 and M. E. J. Newman5,6
1Department of Physics and Michigan Center for Theoretical Physics,
University of Michigan, Ann Arbor, Michigan 48109, USA
2Theoretical Physics Division, Rudjer Bošković Institute, P.O. Box 180, HR-10002 Zagreb, Croatia
3CNR-INFM Centro SMC Dipartimento di Fisica,
Università di Roma “Sapienza” P.le Moro 5, 00185 Roma, Italy
4Linkalab, Complex Systems Computational Lab.
09100 Cagliari Italy
5Department of Physics and Center for the Study of Complex Systems,
University of Michigan, Ann Arbor, Michigan 48109, USA
6Santa Fe Institute, Santa Fe, New Mexico 87501, USA
I. INTRODUCTION
Networks are a versatile mathematical tool for representing
the structure of complex systems and have been the subject of
a large volume of work in the last few years [1, 2, 3, 4, 5].
In its simplest form a network consists of a set of nodes or
vertices, connected by lines or edges, but many extensions and
generalizations have also been studied, including networks with
directed edges, networks with labeled or weighted edges or
vertices, and bipartite networks, which have two types of
vertices and edges running only between unlike types.
Recently, however, new and more complex types of network
data have become available, especially associated with on-line
social and professional communities, that cannot adequately be
described by existing network formats. One example is the
folksonomy. “Folksonomy” is the name given to the common on-line
(and sometimes off-line) process by which a group of individuals
collaboratively annotate a data set to create semantic structure.
Typically mark-up is performed by labeling pieces of data with
tags. A good example is provided by the on-line photography
resource Flickr, a web site to which users upload photographs
that can then be viewed by other users. Flickr allows any user
to give a short description of any photo they see, usually just
a single word or a few words. These are the tags. In principle,
tags can allow users to do many things, such as searching for
photos with particular subjects or clustering photos into topical
groups. There are also many other web sites and on-line resources
with similar tagging capabilities, but dealing with different
resources. On the web site CiteULike, for example, users upload
academic papers as opposed to photographs and label them with
descriptive tags.
Researchers have taken a variety of approaches to the
representation of folksonomy data using network methods,
including modeling them as simple unipartite graphs and
bipartite graphs as well as limited forms of tripartite
graphs [6, 7, 8, 9]. Each of these approaches, however,
fails to capture some elements of the structure of the data
and hence limits the conclusions that can be drawn from
subsequent network analysis.
The fundamental building block in a folksonomy is a
triple consisting of a resource, such as a photograph, a
tag, usually a short text phrase, and a user, who applies
the tag to the resource. Any full network representation
of folksonomy data needs to capture this three-way
relationship between resource, tag, and user, and this leads
us to the consideration of hypergraphs.
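As an illustration only, the following Python sketch shows one way such triples might be stored and counted. The names `Annotation` and `participation_counts` and the toy data are assumptions made for this example; they do not come from the paper or from any Flickr data set.

```python
from collections import Counter
from typing import NamedTuple


class Annotation(NamedTuple):
    """One tagging act: a user applies a tag to a resource (hypothetical type)."""
    user: str
    resource: str
    tag: str


def participation_counts(annotations):
    """Count the distinct triples each user, resource, and tag takes part in."""
    unique = set(annotations)  # a repeated identical triple is counted once
    return {
        "user": Counter(a.user for a in unique),
        "resource": Counter(a.resource for a in unique),
        "tag": Counter(a.tag for a in unique),
    }


# Invented toy data for illustration; not drawn from any real folksonomy.
annotations = [
    Annotation("alice", "photo1", "sunset"),
    Annotation("alice", "photo1", "beach"),
    Annotation("bob", "photo1", "sunset"),
    Annotation("bob", "photo2", "cat"),
]
counts = participation_counts(annotations)
```

Storing the full triple, rather than separate user-tag or tag-resource pairs, keeps the information about who applied which tag to which resource, which pairwise projections would discard.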
A hypergraph is a generalization of an ordinary graph
in which an edge (or hyperedge) can connect more than
two vertices together. To represent our folksonomy we
make use of a tripartite hypergraph, a generalization of the
more familiar bipartite graph, in which there are three
types of vertices representing resources, tags, and users,
and three-way hyperedges joining them in such a way
that each hyperedge links together exactly one resource,
one tag, and one user. Each hyperedge corresponds to
the act of a user applying a tag to a resource and hence
the tripartite
…(Full text truncated)…