Conjunctive queries play an important role as an expressive query language for Description Logics (DLs). Although modern DLs usually provide for transitive roles, conjunctive query answering over DL knowledge bases is only poorly understood if transitive roles are admitted in the query. In this paper, we consider unions of conjunctive queries over knowledge bases formulated in the prominent DL SHIQ and allow transitive roles in both the query and the knowledge base. We show decidability of query answering in this setting and establish two tight complexity bounds: regarding combined complexity, we prove that there is a deterministic algorithm for query answering that runs in time single exponential in the size of the KB and double exponential in the size of the query, which is optimal. Regarding data complexity, we prove containment in co-NP.
Description Logics (DLs) are a family of logic-based knowledge representation formalisms (Baader, Calvanese, McGuinness, Nardi, & Patel-Schneider, 2003). Most DLs are fragments of First-Order Logic restricted to unary and binary predicates, which are called concepts and roles in DLs. The constructors for building complex expressions are usually chosen such that the key inference problems, such as concept satisfiability, are decidable and preferably of low computational complexity. A DL knowledge base (KB) consists of a TBox, which contains intensional knowledge such as concept definitions and general background knowledge, and an ABox, which contains extensional knowledge and is used to describe individuals. Using a database metaphor, the TBox corresponds to the schema, and the ABox corresponds to the data. In contrast to databases, however, DL knowledge bases adopt an open world semantics, i.e., they represent information about the domain in an incomplete way.
Standard DL reasoning services include testing concepts for satisfiability and retrieving certain instances of a given concept. The latter retrieves, for a knowledge base consisting of an ABox A and a TBox T, all (ABox) individuals that are instances of the given (possibly complex) concept expression C, i.e., all those individuals a such that T and A entail that a is an instance of C. The underlying reasoning problems are well understood, and it is known that the combined complexity of these reasoning problems, i.e., the complexity measured in the size of the TBox, the ABox, and the query, is ExpTime-complete for SHIQ (Tobies, 2001). The data complexity of a reasoning problem is measured in the size of the ABox only. Whenever the TBox and the query are small compared to the ABox, as is often the case in practice, the data complexity gives a more useful performance estimate. For SHIQ, instance retrieval is known to be co-NP-complete with respect to data complexity (Hustadt, Motik, & Sattler, 2005).
Despite the high worst-case complexity of the standard reasoning problems for very expressive DLs such as SHIQ, there are highly optimized implementations available, e.g., FaCT++ (Tsarkov & Horrocks, 2006), KAON2, Pellet (Sirin, Parsia, Cuenca Grau, Kalyanpur, & Katz, 2006), and RacerPro. These systems are used in a wide range of applications, e.g., configuration (McGuinness & Wright, 1998), bioinformatics (Wolstencroft, Brass, Horrocks, Lord, Sattler, Turi, & Stevens, 2005), and information integration (Calvanese, De Giacomo, Lenzerini, Nardi, & Rosati, 1998b). Most prominently, DLs are known for their use as a logical underpinning of ontology languages, e.g., OIL, DAML+OIL, and OWL (Horrocks, Patel-Schneider, & van Harmelen, 2003), which is a W3C recommendation (Bechhofer, van Harmelen, Hendler, Horrocks, McGuinness, Patel-Schneider, & Stein, 2004).
In data-intensive applications, querying KBs plays a central role. Instance retrieval is, in some aspects, a rather weak form of querying: although possibly complex concept expressions are used as queries, we can only query for tree-like relational structures, i.e., a DL concept cannot express arbitrary cyclic structures. This property is known as the tree model property and is considered an important reason for the decidability of most Modal and Description Logics (Grädel, 2001; Vardi, 1997). Conjunctive queries (CQs) are well known in the database community and constitute an expressive query language with capabilities that go well beyond standard instance retrieval. As an example, consider a knowledge base that contains an ABox assertion (∃hasSon.(∃hasDaughter.⊤))(Mary), which informally states that the individual (or constant in FOL terms) Mary has a son who has a daughter; hence, that Mary is a grandmother. Additionally, we assume that both roles hasSon and hasDaughter have a transitive super-role hasDescendant. This implies that Mary is related via the role hasDescendant to her (anonymous) grandchild. For this knowledge base, Mary is clearly an answer to the conjunctive query hasSon(x, y) ∧ hasDaughter(y, z) ∧ hasDescendant(x, z), when we assume that x is a distinguished variable (also called answer or free variable) and y, z are non-distinguished (existentially quantified) variables.
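The Mary example can be made concrete with a small sketch. It evaluates the query in one hand-built finite interpretation that satisfies the knowledge base. It is not a decision procedure for query entailment, which has to consider all models of the KB. The element names `son` and `granddaughter` for the anonymous individuals, and the helper names `transitive_closure` and `matches`, are assumptions introduced only for this illustration.

```python
from itertools import product

def transitive_closure(pairs):
    """Return the smallest transitive relation containing `pairs`."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, d) in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

# One interpretation satisfying (∃hasSon.(∃hasDaughter.⊤))(Mary).
# "son" and "granddaughter" are hypothetical names for anonymous elements.
domain = ["Mary", "son", "granddaughter"]
has_son = {("Mary", "son")}
has_daughter = {("son", "granddaughter")}
# hasDescendant is a transitive super-role of hasSon and hasDaughter.
has_descendant = transitive_closure(has_son | has_daughter)

roles = {
    "hasSon": has_son,
    "hasDaughter": has_daughter,
    "hasDescendant": has_descendant,
}

# hasSon(x, y) ∧ hasDaughter(y, z) ∧ hasDescendant(x, z)
query = [("hasSon", "x", "y"), ("hasDaughter", "y", "z"), ("hasDescendant", "x", "z")]

def matches(query, roles, domain, fixed):
    """Brute-force search for an assignment of the remaining variables
    that satisfies every atom; `fixed` binds the distinguished variables."""
    variables = sorted({v for _, s, t in query for v in (s, t)} - set(fixed))
    for values in product(domain, repeat=len(variables)):
        assignment = dict(fixed, **dict(zip(variables, values)))
        if all((assignment[s], assignment[t]) in roles[r] for r, s, t in query):
            return True
    return False

# Only named ABox individuals are candidate answers for x.
named_individuals = ["Mary"]
answers = [a for a in named_individuals if matches(query, roles, domain, {"x": a})]
print(answers)  # ['Mary']
```

The pair (Mary, granddaughter) is in `has_descendant` only because of the transitive closure, which is what lets the cyclic query pattern match even though the ABox assertion itself describes a tree-shaped structure.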
If all variables in the query are non-distinguished, the query answer is just true or false and the query is called a Boolean query. Given a knowledge base K and a Boolean CQ q, the query entailment problem is deciding whether q is true or false w.r.t. K. If a CQ contains distinguished variables, the answers to the query are those tuples of individual names for which the knowledge base entails the query that is obtained by replacing the free variables with the individual names in the answer tuple. The problem of finding all answer tuples is known as query answering. Since query entailment is a decision problem and thus better suited for complexity analysis than query answering, we concentrate on query entailment. This is no restriction