This paper presents research on applying semantic technology to ease the use and reuse of digital content exposed as Linked Data on the web. It focuses on the specific issue of exploratory search for resource selection: a context-dependent semantic similarity assessment is proposed to compare datasets annotated with terminologies exposed as Linked Data (e.g., habitats, species). Semantic similarity is presented as a building-block technology for sifting Linked Data resources. From the application of semantic similarity, we derive a set of recommendations that highlight open issues in scaling the similarity assessment up to the Web of Data.
The paper presents research on semantic technology to ease the use and reuse of digital content exposed as Linked Data resources on the web. In particular, it focuses on supporting exploratory search for resource selection.
Effective sharing and reuse of resources, and in particular of digital content (e.g., plain text, documents, images, audio/video, source code), are still unmet needs in many scientific and industrial domains, e.g., environmental monitoring and analysis, medicine and bioinformatics, CAD/CAE virtual product modelling, and professional multimedia, where the selection of tailored, high-quality content is a necessary condition for providing successful and competitive services. For example, in the domain of environmental data, many data resources are obtained through complex acquisition and processing pipelines, which typically involve distinct specialized fields of competency. Oceanographers, biologists, and geologists may provide heterogeneous data resources, encoded differently as text, tables, images, and 2D and 3D digital terrain models. Knowledge management research on browsing this content has to address issues related to (i) different user- and domain-dependent pipelines, and (ii) sharing and collaboration between users. For these purposes, the use and management of metadata describing digital content becomes essential.
The Semantic Web, and in particular the emerging Linked Data [1], provides a promising framework to encode, publish, and share complex metadata of resources in these scientific and industrial domains. The increasing interest in Linked Data is affecting the way information is published, managed, and reused. However, most of the discussion still focuses on how to publish and share data rather than on how to take advantage of the Linked Data already published.
We address the issue of how to exploit Linked Data resources once they have been published, providing a step forward in their browsing and selection. In particular, a context-dependent semantic similarity assessment is proposed and applied to compare geographic resources, with specific reference to a target dataset about habitats in the geographic domain.
Recommendations learnt from this experience on scaling the similarity assessment up to the Web of Data are provided as the final contribution of the paper.
The paper is organized as follows: Section 2 describes the objectives of the paper; Section 3 shows the methodology used; Section 4 provides information about the technology employed and Section 5 shows the application outcomes. Conclusion and recommendations end the paper.
We focus on the selection task within the browsing activity, where users with different skills have to carefully select a set of resources whose metadata is exposed as Linked Data. Semantic similarity is discussed as an example of a set of methods devoted to consuming metadata published as Linked Data. Semantic similarity aims to compare resources by identifying those that are conceptually close but not identical; it is proposed as a method supporting the in-depth comparison of candidates during resource selection. The methods originally conceived for ontology-driven repositories [2] have been extended to resources published as Linked Data.
We propose a method to analyse digital content exposed as Linked Data by evaluating its semantic similarity.
The term “semantic similarity” has been used in the literature with different meanings. It sometimes refers to ontology alignment, where it enables the matching of distinct ontologies by comparing the names of classes, attributes, relations, and instances [3]. Semantic similarity can also refer to concept similarity, where it assesses the similarity among terms by considering their distinguishing features [4,5,6], their encoding in lexicographic databases [7,8,9], or their encoding in conceptual spaces [10].
In this paper, however, instance semantic similarity is exploited to support the comparison of Linked Data resources, providing different rankings to browse and select them during the search for geographical information.
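To make the idea of context-dependent ranking concrete, the following is a minimal sketch and not the method developed in this paper. It assumes that each resource is reduced to a set of (property, value) pairs, that a context is simply the set of properties considered relevant, and that similarity is scored with Tversky's feature-based ratio model. All dataset names, property names, and `ex:` values are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A feature is a (property, value) pair taken from a resource's metadata.
Feature = Tuple[str, str]


def features_in_context(resource: Set[Feature], context: Set[str]) -> Set[Feature]:
    """Keep only the features whose property belongs to the chosen context."""
    return {(prop, value) for (prop, value) in resource if prop in context}


def tversky(a: Set[Feature], b: Set[Feature],
            alpha: float = 1.0, beta: float = 1.0) -> float:
    """Tversky ratio model: common / (common + alpha*|a-b| + beta*|b-a|)."""
    common = len(a & b)
    denom = common + alpha * len(a - b) + beta * len(b - a)
    return common / denom if denom else 0.0


def rank(target: Set[Feature], candidates: Dict[str, Set[Feature]],
         context: Set[str]) -> List[Tuple[str, float]]:
    """Rank candidates by similarity to the target under the given context."""
    t = features_in_context(target, context)
    scored = [(name, tversky(t, features_in_context(feats, context)))
              for name, feats in candidates.items()]
    # Highest score first; ties broken by name for a stable order.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical metadata for a target dataset and two candidates.
target = {("habitat", "ex:Wetland"), ("species", "ex:Heron"), ("format", "ex:Raster")}
candidates = {
    "dataset_a": {("habitat", "ex:Wetland"), ("species", "ex:Heron"), ("format", "ex:Vector")},
    "dataset_b": {("habitat", "ex:Wetland"), ("species", "ex:Otter"), ("format", "ex:Raster")},
}

# The same candidates are ordered differently under different contexts.
print(rank(target, candidates, {"habitat", "species"}))
print(rank(target, candidates, {"habitat", "format"}))
```

In this toy setting, a user interested in habitats and species would see `dataset_a` ranked first, whereas a user interested in habitats and data format would see `dataset_b` first. Choosing unequal values for `alpha` and `beta` makes the score asymmetric, which is one simple way to distinguish what is gained from what is lost when one resource replaces another.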
Different methods to assess instance similarity have been proposed in the literature. Some rely on description logics [11]; some have been applied in the context of web services [12]; and others have been applied to cluster ontology-driven metadata [13,14].
Surprisingly, none of these methods recognizes the case in which instances, albeit different, have effectively the same informative content: they lack an explicit formalization of the role of context in entity comparison, and they fail to identify and measure whether the informative content of one instance overlaps with or is contained in that of the other. Thus, the similarity results are not easily interpretable in terms of the gain and loss users incur by adopting one resource in place of another. In this paper, we exploit extension to linked data of th