Answering Non-Monotonic Queries in Relational Data Exchange
Relational data exchange is the problem of translating relational data from a source schema into a target schema, according to a specification of the relationship between the source data and the target data. One of the basic issues is how to answer queries that are posed against target data. While consensus has been reached on the definitive semantics for monotonic queries, this issue turned out to be considerably more difficult for non-monotonic queries. Several semantics for non-monotonic queries have been proposed in the past few years. This article proposes a new semantics for non-monotonic queries, called the GCWA*-semantics. It is inspired by semantics from the area of deductive databases. We show that the GCWA*-semantics coincides with the standard open world semantics on monotonic queries, and we further explore the (data) complexity of evaluating non-monotonic queries under the GCWA*-semantics. In particular, we introduce a class of schema mappings for which universal queries can be evaluated under the GCWA*-semantics in polynomial time (data complexity) on the core of the universal solutions.
Research Summary
The paper addresses a long‑standing problem in relational data exchange: how to answer non‑monotonic queries on the target schema. In the standard setting a schema mapping M = (σ, τ, Σ) consists of a source schema σ, a target schema τ and a finite set Σ of logical constraints, typically source‑to‑target tuple‑generating dependencies (st‑tgds) and equality‑generating dependencies (egds). Given a source instance S, a solution is a target instance T such that S ∪ T satisfies Σ.
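The definition of a solution can be made concrete with a small checker. The following is a minimal sketch, not code from the paper: it handles st‑tgds only (egds are omitted), and the atom encoding (relation name plus argument tuple, variables written with a leading `?`) and the function names `matches` and `is_solution` are assumptions made for illustration.

```python
def matches(atoms, instance, binding=None):
    """Yield every variable binding that maps all atoms into the instance."""
    binding = dict(binding or {})
    if not atoms:
        yield binding
        return
    (rel, args), rest = atoms[0], atoms[1:]
    for fact_rel, fact_args in instance:
        if fact_rel != rel or len(fact_args) != len(args):
            continue
        new = dict(binding)
        ok = True
        for a, v in zip(args, fact_args):
            if a.startswith("?"):            # variable: bind or compare
                if new.setdefault(a, v) != v:
                    ok = False
                    break
            elif a != v:                     # constant: must match exactly
                ok = False
                break
        if ok:
            yield from matches(rest, instance, new)


def is_solution(source, target, tgds):
    """True if every body match in the source has a head match in the target.

    Head variables not bound by the body are left free, so the search
    over the target treats them as existentially quantified.
    """
    for body, head in tgds:
        for b in matches(body, source):
            if next(matches(head, target, b), None) is None:
                return False
    return True


# Copy dependency R(x, y) -> R'(x, y), with R' written as "Rp".
copy_tgd = ([("R", ("?x", "?y"))], [("Rp", ("?x", "?y"))])
S = {("R", ("a", "b"))}
print(is_solution(S, {("Rp", ("a", "b"))}, [copy_tgd]))
print(is_solution(S, set(), [copy_tgd]))
```

Any superset of a solution is again a solution for st‑tgds, which is why the set of solutions is typically infinite.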
For monotonic queries (e.g., unions of conjunctive queries, UCQs) the “certain answers” semantics—based on the open‑world assumption (OWA)—is well‑understood: a tuple belongs to the answer iff it appears in the result of the query on every solution. This semantics is both robust and efficiently computable using universal solutions or their cores. However, for non‑monotonic queries (those involving negation, universal quantification, or set difference) the certain‑answers approach often yields unintuitive results. A classic illustration is a simple st‑tgd θ: ∀x ∀y (R(x,y) → R′(x,y)) together with a source instance containing a single tuple (a,b). The query q(x,y) = R′(x,y) ∧ ∀z (R′(x,z) → z = y) intuitively should return (a,b), but under OWA the certain answer set is empty because there exist solutions that add extra tuples to R′.
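The example can be replayed by hand on two solutions. The sketch below is illustrative only: it hard‑codes the query q from the text and two target instances (named `t_min` and `t_big` here, which are hypothetical names), and intersects the query results. Intersecting over only two solutions gives an upper bound on the certain answers, which is enough because the bound is already empty.

```python
def q(r_prime):
    """q(x, y) = R'(x, y) and for all z (R'(x, z) -> z = y)."""
    return {
        (x, y)
        for (x, y) in r_prime
        if all(z == y for (x2, z) in r_prime if x2 == x)
    }


# Two solutions for the source instance {R(a, b)} under R(x, y) -> R'(x, y).
t_min = {("a", "b")}               # copies the source and nothing else
t_big = {("a", "b"), ("a", "c")}   # adds an extra, unjustified tuple

print(q(t_min))             # (a, b) is an answer on the minimal solution
print(q(t_big))             # no answers: a now has two partners
print(q(t_min) & q(t_big))  # upper bound on the certain answers under OWA
```

The extra tuple in `t_big` is not justified by the source data, yet under OWA it is enough to remove (a, b) from the certain answers.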
Closed‑world based semantics (CWA) have been proposed to remedy this, but existing CWA‑style approaches (e.g., those by Libkin, Sirangelo, Afrati & Kolaitis) suffer from two major drawbacks: (1) they are not invariant under logically equivalent rewritings of the schema mapping, and (2) they do not respect the standard first‑order interpretation of quantifiers. For instance, a mapping that can be expressed either with an existential quantifier or with a disjunction of ground atoms may lead to different answers for a query asking whether “exactly one y exists”.
The authors therefore introduce a new semantics called GCWA* (Generalized Closed World Assumption star). The idea is borrowed from deductive databases, where GCWA‑based semantics have been studied extensively. A GCWA*-solution for a source instance S under a mapping M is a solution that is a union of inclusion‑minimal solutions (i.e., solutions that are minimal under set inclusion). In particular, every minimal solution is itself a GCWA*-solution. In many common settings (st‑tgds + egds), the GCWA*-solutions can thus be described as the “closed‑world” solutions that contain only facts occurring in some minimal solution.
Under GCWA*, the answer to a query q is the set of tuples that satisfy q in every GCWA*-solution. The paper proves several fundamental properties:
- Equivalence with OWA on monotonic queries (Proposition 6.1). Hence all known tractability results for UCQs carry over.
- Invariance under logically equivalent schema mappings. If two mappings Σ and Σ′ are logically equivalent, they yield the same GCWA*-answers for any query.
- Faithful treatment of quantifiers. Queries that mix ∃ and ∀ are evaluated according to the usual first‑order semantics; for example, a query asking whether there is exactly one witness for a predicate evaluates to false when multiple witnesses are possible in some GCWA*-solution.
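These properties can be observed on a toy mapping by brute force. The sketch below is not the paper's algorithm. It assumes that GCWA*-solutions are the solutions obtained as unions of inclusion‑minimal solutions, it uses the single st‑tgd R(x) → ∃y R′(x,y) with source R = {a}, and it restricts values to a small finite pool (`DOMAIN`, an assumption made purely to keep the enumeration finite). All function names are hypothetical.

```python
from itertools import combinations

DOMAIN = ["a", "b", "c"]   # assumed finite value pool, for illustration only
SOURCE = {"a"}             # source relation R = {a}


def is_solution(t):
    """st-tgd R(x) -> exists y R'(x, y), checked on target relation t."""
    return all(any(x2 == x for (x2, _) in t) for x in SOURCE)


def all_instances():
    """Every target instance over DOMAIN (2**9 = 512 candidates)."""
    facts = [(x, y) for x in DOMAIN for y in DOMAIN]
    for k in range(len(facts) + 1):
        for combo in combinations(facts, k):
            yield frozenset(combo)


solutions = [t for t in all_instances() if is_solution(t)]
minimal = [t for t in solutions if not any(s < t for s in solutions)]

# Unions of one or more minimal solutions that are themselves solutions.
gcwa_star = set()
for k in range(1, len(minimal) + 1):
    for group in combinations(minimal, k):
        union = frozenset().union(*group)
        if is_solution(union):
            gcwa_star.add(union)


def has_witness(t):        # monotonic: exists y R'(a, y)
    return any(x == "a" for (x, _) in t)


def exactly_one(t):        # non-monotonic: exactly one y with R'(a, y)
    return len({y for (x, y) in t if x == "a"}) == 1


def no_b_facts(t):         # non-monotonic: for all y, not R'(b, y)
    return not any(x == "b" for (x, _) in t)


for name, query in [("has_witness", has_witness),
                    ("exactly_one", exactly_one),
                    ("no_b_facts", no_b_facts)]:
    owa = all(query(t) for t in solutions)
    gcwa = all(query(t) for t in gcwa_star)
    print(f"{name}: OWA={owa} GCWA*={gcwa}")
```

In this toy setting the monotonic query gets the same answer under both semantics, the "exactly one witness" query is false under GCWA* because a union of two minimal solutions offers two witnesses, and the query about unjustified facts for b is true under GCWA* but false under OWA.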
The authors then turn to data‑complexity analysis (the mapping and query are fixed, the source instance is the input). They show that for unrestricted st‑tgds and egds, evaluating even simple non‑monotonic Boolean FO queries can be co‑NP‑hard (Proposition 6.2) or undecidable (Proposition 6.3). To obtain tractability, they define a restricted class of mappings called packed st‑tgds. A packed st‑tgd has the form ∀x ∀y (ϕ(x,y) → ∃z ψ(x,z)) where every pair of atoms in ψ shares at least one existential variable. This restriction, while strong, still permits non‑trivial use of existential quantifiers.
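The packedness condition as stated above is a purely syntactic check on the head ψ of a dependency. A minimal sketch follows, assuming head atoms are encoded as (relation, argument‑tuple) pairs and the existential variables are passed in explicitly; the name `is_packed` and this encoding are illustrative, not taken from the paper.

```python
from itertools import combinations


def is_packed(head_atoms, existential_vars):
    """True if every pair of distinct head atoms shares an existential variable."""
    ex = set(existential_vars)
    return all(
        set(a_args) & set(b_args) & ex
        for (_, a_args), (_, b_args) in combinations(head_atoms, 2)
    )


# exists z1 z2 (E(x, z1) and F(z1, z2) and G(z2, z1)): every pair shares z1.
print(is_packed([("E", ("x", "z1")), ("F", ("z1", "z2")), ("G", ("z2", "z1"))],
                ["z1", "z2"]))

# exists z1 z2 (E(x, z1) and F(x, z2)): the atoms share only the universal x.
print(is_packed([("E", ("x", "z1")), ("F", ("x", "z2"))], ["z1", "z2"]))
```

A head consisting of a single atom is trivially packed, since there are no pairs to check.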
The main technical contribution (Theorem 6.6) is that universal queries (first‑order formulas of the form ∀x ϕ where ϕ is quantifier‑free) can be evaluated in polynomial time (data complexity) under GCWA* when the mapping consists of packed st‑tgds. Moreover, the evaluation can be performed solely on the core of the universal solutions (the smallest universal solution, unique up to isomorphism), without needing the original source instance. Since the core is already known to be computable in polynomial time for many classes of mappings, this yields an efficient, unified method for answering both UCQs (via OWA) and universal queries (via GCWA*) using a single materialized target instance.
The paper’s structure is as follows:
- Section 2 establishes notation (schemas, instances, homomorphisms, cores) and recalls basic results.
- Section 3 critiques existing non‑monotonic semantics, demonstrating lack of invariance and mis‑alignment with quantifier semantics.
- Section 4 surveys deductive‑database semantics (Reiter’s CWA, GCWA, EGCWA, PWS) and evaluates their suitability for data exchange.
- Section 5 formally defines GCWA* and GCWA*-solutions, and provides illustrative examples.
- Section 6 conducts the data‑complexity study, presenting hardness results, the packed‑st‑tgd class, and the polynomial‑time algorithm for universal queries.
- Section 7 concludes with discussion of open problems such as extending tractability to broader query classes and implementing GCWA* in practical data‑exchange systems.
In summary, the paper delivers a conceptually clean and mathematically robust semantics for non‑monotonic query answering in relational data exchange, resolves key deficiencies of prior approaches, and identifies a practically relevant fragment (packed st‑tgds + universal queries) where evaluation is tractable. This advances both the theory of data exchange and its applicability to real‑world scenarios where users routinely pose non‑monotonic queries (e.g., “are there any orphan records?” or “does every customer have exactly one order?”).