Querying Incomplete Data over Extended ER Schemata


📝 Original Info

  • Title: Querying Incomplete Data over Extended ER Schemata
  • ArXiv ID: 1003.3139
  • Date: 2015-03-13
  • Authors: Andrea Calì, Davide Martinenghi

📝 Abstract

Since Chen's Entity-Relationship (ER) model, conceptual modeling has been playing a fundamental role in relational data design. In this paper we consider an extended ER (EER) model enriched with cardinality constraints, disjointness assertions, and is-a relations among both entities and relationships. In this setting, we consider the case of incomplete data, which is likely to occur, for instance, when data from different sources are integrated. In such a context, we address the problem of providing correct answers to conjunctive queries by reasoning on the schema. Based on previous results about decidability of the problem, we provide a query answering algorithm that performs rewriting of the initial query into a recursive Datalog query encoding the information about the schema. We finally show extensions to more general settings. This paper will appear in the special issue of Theory and Practice of Logic Programming (TPLP) titled Logic Programming in Databases: From Datalog to Semantic-Web Rules.


📄 Full Content

arXiv:1003.3139v2 [cs.DB] 13 Apr 2010

To appear in Theory and Practice of Logic Programming

Querying Incomplete Data over Extended ER Schemata

ANDREA CALÌ
Computing Laboratory, University of Oxford
Eagle House, Walton Well Road – Oxford OX2 6ED, United Kingdom
(e-mail: andrea.cali@comlab.ox.ac.uk)

DAVIDE MARTINENGHI
Dipartimento di Elettronica e Informazione, Politecnico di Milano
Piazza Leonardo 32 – 20133 Milano, Italy
(e-mail: davide.martinenghi@polimi.it)

Abstract

Since Chen’s Entity-Relationship (ER) model, conceptual modeling has been playing a fundamental role in relational data design. In this paper we consider an extended ER (EER) model enriched with cardinality constraints, disjointness assertions, and is-a relations among both entities and relationships. In this setting, we consider the case of incomplete data, which is likely to occur, for instance, when data from different sources are integrated. In such a context, we address the problem of providing correct answers to conjunctive queries by reasoning on the schema. Based on previous results about decidability of the problem, we provide a query answering algorithm that performs rewriting of the initial query into a recursive Datalog query encoding the information about the schema. We finally show extensions to more general settings. This paper will appear in the special issue of Theory and Practice of Logic Programming (TPLP) titled Logic Programming in Databases: From Datalog to Semantic-Web Rules.

KEYWORDS: Extended ER model, Dependencies, Chase, Incomplete Data

1 Introduction

Conceptual data models, and in particular the Entity-Relationship (ER) model (Chen 1976), have long been playing a fundamental role in database design. With the emerging trends in data exchange, information integration, semantic web, and web information systems, the need for dealing with inconsistent and incomplete data has arisen.
In this context, it is important to provide correct answers to queries posed over inconsistent and incomplete data (Arenas et al. 1999). It is worth noticing here that inconsistency and incompleteness of data are considered with respect to a set of constraints (a.k.a. data dependencies). Such constraints, rather than expressing properties that hold on the data, are used to represent properties of the domain of interest.

We address the problem of answering queries over incomplete data, where queries are conjunctive queries expressed over particular relational schemata, called conceptual schemata, that are derived from conceptual models. As for the conceptual models, we follow (Chen 1976), and we adopt an extension of the well-known Entity-Relationship model, that we call Extended Entity-Relationship (EER) Model, along with (Thalheim 2000) and the many variants of the classical ER Model. Such an extension is widely adopted in practice and is able to represent classes of objects with their attributes, relationships among classes, cardinality constraints in the participation of entities in relationships, and is-a relations among both classes and relationships.

We provide a formal semantics to our conceptual model in terms of the relational database model, similarly to what is done in (Markowitz and Makowsky 1990). This allows us to formulate conjunctive queries over EER schemata. We do this by providing a translation from EER into relational, whose purpose is to obtain a precise characterization of the relational dependencies that are derived from an EER schema in a design process.

In the presence of data that are incomplete w.r.t. a set of constraints, we need to reason about the dependencies in order to provide certain answers; we do this in a model-theoretic fashion, following the approach of (Arenas et al. 1999; Calì et al. 2001).
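As an informal illustration of reasoning on dependencies, the sketch below reads an is-a relation between entities as an inclusion dependency and applies it to an incomplete database until no new facts appear. The entity names (manager, employee, person), the constants, and the helper functions saturate and answer are hypothetical and not taken from the paper. The sketch covers only is-a assertions among entities, not cardinality or disjointness constraints, and it materializes derived facts rather than rewriting the query into Datalog as the paper's algorithm does.

```python
# Hypothetical EER fragment: Manager is-a Employee, Employee is-a Person.
# Each is-a assertion is read as an inclusion dependency sub(X) -> sup(X).
ISA = [("manager", "employee"), ("employee", "person")]


def saturate(facts, isa):
    """Apply the is-a inclusion dependencies until no new fact is produced."""
    closed = set(facts)
    changed = True
    while changed:
        changed = False
        for sub, sup in isa:
            # Iterate over a snapshot because the set grows inside the loop.
            for pred, const in list(closed):
                if pred == sub and (sup, const) not in closed:
                    closed.add((sup, const))
                    changed = True
    return closed


def answer(pred, facts):
    """Evaluate the atomic query q(X) :- pred(X) over a set of facts."""
    return sorted(const for p, const in facts if p == pred)


# Incomplete data: no person fact is stored explicitly.
incomplete_db = {("manager", "ann"), ("employee", "bob")}
saturated_db = saturate(incomplete_db, ISA)

print(answer("person", incomplete_db))  # []
print(answer("person", saturated_db))   # ['ann', 'bob']
```

Every database that contains the initial facts and satisfies the two inclusion dependencies must also contain the derived facts, so ann and bob are returned over all such databases even though no person fact was stored.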
Intuitively, we start from a given, incomplete database for the relational schema associated with the EER schema; such data, together with the constraints, are interpreted as a logical theory, with a (possibly infinite) set of models, also called solutions in the literature. We adopt the so-called sound semantics (see, e.g., (Calì et al. 2003a)): a database is a model if it is a superset of the initial data, and satisfies the constraints. Given a query, the certain answers are those that are true in all models.

In this paper we address the problem of answering conjunctive queries over schemata derived from EER schemata in the presence of incomplete data with respect to the schema under the sound semantics. We present an algorithm, based on encoding the information about the conceptual schema and the instance into a rewriting of the conjunctive query in Datalog, which computes the certain answers to queries posed in such a context. The algorithm reasons on the integrity constraints and the query.

The problem at hand can be sketchily stated as follows.

  • We have a conceptual EER schema. From it, a relational schema S is obtained through a translation mechanism t

…(Full text truncated)…
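The sound semantics and the notion of certain answers described in the introduction can be stated compactly. The notation below is an assumption made for illustration and is not taken from the visible text: D denotes the initial incomplete database, Σ the set of dependencies derived from the EER schema, q a conjunctive query, sol(D, Σ) the set of models, and cert(q, D, Σ) the certain answers.

```latex
\[
\mathit{sol}(D,\Sigma) = \{\, B \mid B \supseteq D \ \text{and}\ B \models \Sigma \,\},
\qquad
\mathit{cert}(q, D, \Sigma) = \bigcap_{B \in \mathit{sol}(D,\Sigma)} q(B)
\]
```

Under this reading, a tuple is a certain answer exactly when it belongs to q(B) for every database B that extends the initial data and satisfies the constraints.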


