Algebraic operators for querying pattern bases
📝 Abstract
The objectives of this research work, which is closely related to pattern discovery and management, are threefold: (i) handle the problem of pattern manipulation by defining operations on patterns, (ii) study the problem of enriching and updating a pattern set (e.g., concepts, rules) when changes occur in the user’s needs and the input data (e.g., object/attribute insertion or elimination, taxonomy utilization), and (iii) approximate a “presumed” concept using a related pattern space so that patterns can augment data with knowledge. To conduct our work, we use formal concept analysis (FCA) as a framework for pattern discovery and management, and we take a joint database-FCA perspective by defining operators similar in spirit to relational algebra operators, investigating approximation in concept lattices, and exploiting existing work on operations on contexts and lattices to formalize such operators.
📄 Content
Rokia Missaoui1, Léonard Kwuida1, Mohamed Quafafou2, Jean Vaillancourt1
1 Université du Québec en Outaouais, Gatineau (Québec), Canada, J8X 3X7
2 LSIS - CNRS UMR 6168, Université Aix-Marseille, France
{rokia.missaoui,leonard.kwuida,jean.vaillancourt}@uqo.ca⋆, mohamed.quafafou@univmed.fr

1 Introduction

The recent research topic of pattern discovery and management refers to a set of activities related to the extraction, description, manipulation and storage of patterns in a similar (but more elaborate) way as data are managed by database applications. In pattern management and inductive databases [4,5,16,24], patterns are knowledge artifacts (e.g., association rules, clusters) extracted from data using data mining procedures (generally run in advance), and retrieved upon the user’s request. A pattern is then a concise and semantically rich representation of raw data.
An example of a pattern could be a cluster that represents a set of Star Alliance members with their common features (e.g., fleet size, set of destinations).

In many database and data warehouse applications, users tend to be drowning in data and even in patterns while they are actually interested in a very limited set of knowledge pieces. Moreover, the scope of patterns to explore differs from one user to another and changes over time. Finally, one is frequently interested in an exploratory and iterative process of data mining (DM) to discover patterns under different scenarios and different hypotheses. In order to reduce the memory overload of the user and his working space induced by the large set of mined patterns, we propose to define a set of algebraic operators similar in spirit to operators of relational algebra. Such operators will allow “data mining on demand” (i.e., data mining according to user’s needs and perspectives) and rely on key operations on concept lattices such as selection, projection and join. Additional operations will be defined either to enrich the pattern basis or to identify the patterns that best approximate a “presumed” concept.

The following example is an elementary way to display information. It presents the Star Alliance members in year 2000 with their destinations [8].

Destinations (columns): Latin America | Europe | Canada | Asia Pacific | Middle East | Africa | Mexico | Caribbean | US
Air Canada: × × × × × × × ×
Air New Zealand: × × ×
All Nippon Airways: × × ×
Ansett Australia: ×
The Austrian Airlines Group: × × × × × ×
British Midland: ×
Lufthansa: × × × × × × × ×
Mexicana: × × × × ×
Scandinavian Airlines: × × × × ×
Singapore Airlines: × × × × × ×
Thai Airways International: × × × × ×
United Airlines: × × × × × × ×
VARIG: × × × × × ×

Fig. 1. Star Alliance members and their flying destinations in year 2000.

⋆ Partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC).
arXiv:0902.4042v1 [cs.DB] 24 Feb 2009
2 Contexts, Concept Lattices and their Ideals

Formal Concept Analysis is a branch of applied mathematics, which is based on a formalization of concept and concept hierarchy [9]. It has been successfully used for conceptual clustering and rule generation. Let K = (G, M, I) be a formal context, where G, M and I stand for a set of objects, a set of attributes, and a binary relation between G and M respectively. Two functions, f1 and f2, summarize the links between subsets of objects and subsets of attributes induced by I. Function f1 maps a set of objects into a set of common attributes, whereas f2 is the dual for attribute sets:

f1 : P(G) → P(M), A ↦ A′ := {a ∈ M | ∀o ∈ A, oIa},
f2 : P(M) → P(G), B ↦ B′ := {o ∈ G | ∀a ∈ B, oIa}.

Furthermore, the compound operators f2 ∘ f1 and f1 ∘ f2 (denoted by ′′) are closure operators on G and M respectively. In particular, Z ⊆ Z′′ and (Z′′)′′ = Z′′ for Z ∈ P(M) ∪ P(G). The set Z is closed if Z′′ = Z. A formal concept c is a pair of sets (A, B) with A ⊆ G, B ⊆ M, A = B′ and B = A′. A is called the extent of c (denoted by ext(c)) and B its intent (denoted by int(c)). In the closed itemset mining framework [18,29], A and
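
The derivation operators f1 and f2 and the definition of a formal concept given in Section 2 can be illustrated with a minimal Python sketch. The names f1 and f2 follow the text; the three-object, three-attribute context (G, M, I) below is a made-up toy example and not the Star Alliance data, and the helper concepts() is a hypothetical brute-force enumeration chosen for readability rather than an algorithm taken from the paper.

```python
from itertools import combinations

# Hypothetical toy context K = (G, M, I); values are illustrative assumptions.
G = ["g1", "g2", "g3"]                       # objects
M = ["a", "b", "c"]                          # attributes
I = {("g1", "a"), ("g1", "b"),
     ("g2", "b"), ("g2", "c"),
     ("g3", "b")}                            # incidence relation: (o, a) means oIa


def f1(A):
    """A -> A': attributes shared by every object in A."""
    return frozenset(m for m in M if all((g, m) in I for g in A))


def f2(B):
    """B -> B': objects that have every attribute in B."""
    return frozenset(g for g in G if all((g, m) in I for m in B))


def concepts():
    """Enumerate all formal concepts (extent, intent) by closing every object subset.

    Brute force over P(G); exponential in |G|, so only suitable for tiny contexts.
    """
    found = set()
    for r in range(len(G) + 1):
        for A in combinations(G, r):
            intent = f1(A)          # B = A'
            extent = f2(intent)     # A'' = B'
            found.add((extent, intent))
    return found


if __name__ == "__main__":
    for extent, intent in sorted(concepts(), key=lambda c: len(c[0])):
        print(sorted(extent), sorted(intent))
```

Each pair produced this way satisfies A = B′ and B = A′ by construction, because the extent is obtained as the closure A′′ of an object set and the intent as its derivation. For realistic contexts a dedicated lattice construction algorithm would replace the exhaustive loop.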