📝 Original Info
- Title: Treelicious: a System for Semantically Navigating Tagged Web Pages
- ArXiv ID: 1102.1111
- Date: 2015-03-18
- Authors: Matt Mullins (Western Washington University), Perry Fizzano (Western Washington University)
📝 Abstract
Collaborative tagging has emerged as a popular and effective method for organizing and describing pages on the Web. We present Treelicious, a system that allows hierarchical navigation of tagged web pages. Our system enriches the navigational capabilities of standard tagging systems, which typically exploit only popularity and co-occurrence data. We describe a prototype that leverages the Wikipedia category structure to allow a user to semantically navigate pages from the Delicious social bookmarking service. In our system a user can perform an ordinary keyword search and browse relevant pages but is also given the ability to broaden the search to more general topics and narrow it to more specific topics. We show that Treelicious indeed provides an intuitive framework that allows for improved and effective discovery of knowledge.
📄 Full Content
Treelicious: a System for Semantically Navigating
Tagged Web Pages
Matt Mullins
Department of Computer Science
Western Washington University
Bellingham, WA USA
mtt.mllns@gmail.com
Perry Fizzano
Department of Computer Science
Western Washington University
Bellingham, WA USA
perry.fizzano@wwu.edu
Abstract—Collaborative tagging has emerged as a popular
and effective method for organizing and describing pages on the
Web. We present Treelicious, a system that allows hierarchical
navigation of tagged web pages. Our system enriches the
navigational capabilities of standard tagging systems, which
typically exploit only popularity and co-occurrence data. We
describe a prototype that leverages the Wikipedia category
structure to allow a user to semantically navigate pages from
the Delicious social bookmarking service. In our system a user
can perform an ordinary keyword search and browse relevant
pages but is also given the ability to broaden the search to
more general topics and narrow it to more specific topics. We
show that Treelicious indeed provides an intuitive framework
that allows for improved and effective discovery of knowledge.
Keywords-collaborative tagging; folksonomy; semantic web;
social bookmarking; Wikipedia; Delicious
I. INTRODUCTION
Collaborative tagging has emerged as a popular and
effective method for organizing and describing pages on the
Web. There exist many different sites in different domains
that use the application of free-form keywords as a method
for organizing and searching their content. To name just a
few: CiteULike for managing and discovering scholarly references,
LibraryThing for cataloging and sharing literature,
Etsy for buying and selling handmade items, and Delicious1
for organizing and sharing bookmarks. Tagging becomes
especially useful to describe non-text media like photos on
Flickr and videos on YouTube. These sites have embraced
tagging as an effective and low-cost way of describing and
organizing their content. On Delicious, one of the most
popular social bookmarking sites, users annotate pages with
tags, usually for the selfish reason of personal organization.
Yet when this is done by many individuals, rich and
accurate descriptions of what these resources mean
to humans collectively materialize. Even though users tag
primarily to help themselves retrieve the page later, 62%
of the tags in Delicious end up identifying descriptive
facts about the web resource, that is, tags useful beyond personal
organization [1]. This user-generated classification structure
has come to be known as a “folksonomy”2.
1 http://delicious.com/
Yet these folksonomies are lacking in several ways.
First, they’re flat. There is no explicit hierarchy, synonymy,
or relation information present—only simple co-occurrence
data. Second, they’re ambiguous. This is the classic problem
of using words with multiple meanings and no explicit
disambiguation information. Given this lack of semantics
there are only a handful of ways we can present sets of
tags to the user. A common method is to use a tag “cloud”
with more popular tags in the cloud indicated by a larger
font size. Another method is to start with a search tag and
present related tags based on which tags the search tag co-
occurs with in tagged content. This co-occurrence data can
also be used to group related tags using clustering techniques
as is done in Flickr. Though all of these methods are helpful
in some way, ultimately, they fail to show the semantic
relationships among tags [2]. As a result, it is hard for a
user to put their search into perspective. Figure 1 shows
an example of the related tags produced from a search for
“acm” on Delicious.
Figure 1. A Delicious search for “acm” yields these “related” tags. Their
relation is based solely on co-occurrence information.
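To make the co-occurrence approach concrete, the following is a minimal sketch of how "related" tags could be ranked for a search tag. It is an illustration only, not the method used by Delicious or Flickr: the sample bookmarks and the function names `cooccurrence_counts` and `related_tags` are assumptions introduced here.

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: each set holds the tags one user applied to one page.
bookmarks = [
    {"acm", "computing", "research"},
    {"acm", "research", "papers"},
    {"acm", "computing"},
    {"python", "programming"},
]


def cooccurrence_counts(bookmarks):
    """Count how often each unordered pair of tags shares a bookmark."""
    counts = Counter()
    for tags in bookmarks:
        # sorted() gives every pair a canonical (a, b) order with a < b
        for a, b in combinations(sorted(tags), 2):
            counts[(a, b)] += 1
    return counts


def related_tags(query, bookmarks, top_n=5):
    """Rank tags by the number of bookmarks they share with `query`."""
    related = Counter()
    for (a, b), n in cooccurrence_counts(bookmarks).items():
        if a == query:
            related[b] += n
        elif b == query:
            related[a] += n
    # Highest count first; ties broken alphabetically for stable output
    return sorted(related.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


print(related_tags("acm", bookmarks))
```

The ranking records only that two tags appear together. It carries no indication of whether "computing" is broader or narrower than "acm", which is the limitation the surrounding text describes.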
2 http://vanderwal.net/folksonomy.html
arXiv:1102.1111v1 [cs.IR] 5 Feb 2011
This lack of structure resulting from the use of free-
form tags is not encountered with more classic systems
of classification like hierarchical taxonomies and library
classifications. The categories in these systems are well-
defined and placed in a strict hierarchy. Each subcategory
can have only one parent category of which it is a member.
Such a structure results in clear semantic “broader than” and
“narrower than” relationships among concepts. But the strictness
inherent in these classic systems presents disadvantages.
They require expert catalogers, authoritative sources of judgment,
and users educated about the categories [3]. It also
takes work to keep them from becoming outdated as new
categories are formed and old ones are restructured (e.g. “the
Soviet Union” being reclassified as a “Former country”).
Commenting on the restriction that each class have only
one parent, Voss [4] observes that “Hierarchy seems to have
a strict semantic that does not fit to the vagueness of the
world. In practice there are always several ways to classify
an object ... If one uses polyhierarchy like in a thesa