This paper describes a new approach to data representation and manipulation called the concept-oriented data model (CODM). Items represent data units and are stored in concepts. A concept is a combination of superconcepts, which determine the concept's dimensionality or properties. An item is a combination of superitems, one taken from each superconcept, and it stores a combination of references to those superitems. The references implement an inclusion relation or an attribute-value relation among items. A concept-oriented database is defined by its concept structure, called syntax or schema, and its item structure, called semantics. The model defines formal transformations of syntax and semantics, including the canonical semantics, where all concepts are merged and the data semantics is represented by one set of items. The concept-oriented data model treats relations as subconcepts whose items are instances of the relations. Multi-valued attributes are defined via subconcepts as a view on the database semantics rather than as a built-in mechanism. The model includes a concept-oriented query language based on collection manipulations. It also provides mechanisms such as aggregation and inference, based on the propagation of semantics through the database schema.
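The structure summarised above can be illustrated with a minimal Python sketch. It is not the paper's formalism or query language. The class names `Concept` and `Item`, the helper `subitems`, and the example concepts `Products`, `Customers` and `Orders` are all hypothetical and chosen only for illustration. The sketch assumes that a concept names its superconcepts by dimension, and that each item stores exactly one reference per dimension.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Concept:
    """Schema level (syntax): a named combination of superconcepts."""
    name: str
    superconcepts: dict = field(default_factory=dict)  # dimension name -> Concept
    items: list = field(default_factory=list)          # items stored in this concept

    def new_item(self, label, **superitems):
        """Create an item that references one superitem per superconcept."""
        if set(superitems) != set(self.superconcepts):
            raise ValueError("an item needs exactly one superitem per superconcept")
        for dim, parent in superitems.items():
            if parent.concept is not self.superconcepts[dim]:
                raise ValueError(f"{dim!r} must reference an item of "
                                 f"{self.superconcepts[dim].name}")
        item = Item(label, self, superitems)
        self.items.append(item)
        return item


@dataclass(eq=False)
class Item:
    """Data level (semantics): a combination of references to superitems."""
    label: str
    concept: Concept
    superitems: dict  # dimension name -> Item


def subitems(item, subconcept, dim):
    """Items of `subconcept` that reference `item` along dimension `dim`.

    This plays the role of a multi-valued attribute computed as a view,
    under the assumptions stated in the lead-in.
    """
    return [i for i in subconcept.items if i.superitems[dim] is item]


# Syntax (hypothetical schema): Orders is a subconcept of Products and
# Customers, so it acts as a relation between them.
products = Concept("Products")
customers = Concept("Customers")
orders = Concept("Orders", {"product": products, "customer": customers})

# Semantics (hypothetical data): each order item is an instance of the relation.
pen = products.new_item("pen")
ann = customers.new_item("Ann")
bob = customers.new_item("Bob")
orders.new_item("o1", product=pen, customer=ann)
orders.new_item("o2", product=pen, customer=bob)

# Multi-valued attribute "buyers of pen", derived through the subconcept.
buyers_of_pen = [o.superitems["customer"].label
                 for o in subitems(pen, orders, "product")]
print(buyers_of_pen)
```

In this sketch the list of buyers is not stored on the product item. It is derived by following references backwards from the `Orders` subconcept, which mirrors the statement that multi-valued attributes are a view on the semantics rather than a built-in mechanism.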
Deep Dive into Principles of the Concept-Oriented Data Model.
A wide spectrum of models is currently used to represent data, and these models rest on different basic principles. The principles lie in different planes, so there are several dimensions along which data models can be distinguished. For example, models can be distinguished by the formalism used to describe the data, by the query language used to retrieve data, by the data manipulation operators the model provides, by its storage principles, and so on. Most existing models are not isolated: they overlap with other models on some principles while differing on others. Thus there exist different classification schemes that can be used to distinguish them.
Although there exist alternative sets of criteria for classifying data models, we can always select the most important principles, those with the highest level of abstraction. Frequently these principles are so general that they cover other disciplines rather than only data models and databases. In many cases, however, these principles are not declared explicitly, so it is quite difficult to reconstruct them and to classify a given data model unambiguously. This abstract set of principles determines the whole paradigm a data model is based on, i.e., it establishes a high-level view of the world. The data model described in this paper is based on the concept-oriented paradigm, which has its own fundamental principles. The paradigm is much wider than the data model and underlies other areas such as programming, modelling, or analysis and design. In this sense it is like the object-oriented paradigm, which is based on one set of general principles used in different disciplines such as programming or modelling. Some concept-oriented principles are used directly in the described data model, while others have only weak consequences for this area. Some principles are rather concrete and constructive, while others are to a great extent philosophical in character. Many of these principles are not independent and may have different levels and priorities. Nevertheless, in this introduction we try to formulate all of them explicitly as independent, highest-priority statements or axioms, because without them it is quite difficult to understand the essence of this approach and its difference from other paradigms in general and from other data representation methods in particular. Indeed, at the molecular or atomic level databases are simply elementary interactions, so they are all equal. At the level of machine language they are still equivalent. And even at the level of procedural programming most data models cannot be distinguished.
It is only a set of high level organisational principles that can help us to distinguish one model from another.
The first principle specifies the main areas of interest of the concept-oriented paradigm:
CO1 The concept-oriented paradigm is aimed at studying representation and access issues in any system.
Thus representation and access are the key words for this direction (interaction could be added to this list). In other words, for any system we are first of all interested in how things are represented, how they are accessed and how they interact.
The next principle introduces types of things used in our analysis:
CO2 There exist two sorts of things: objects and spaces.
These things have numerous synonyms used in different contexts and formal settings. For example, an object is also called an entity, item, value, element, record etc. A space can be called a set, class, concept, domain etc. In the concept-oriented data model, space is associated with the model syntax (section 2.1) while objects are associated with the model semantics (section 2.2). The separation between these two sorts of things is not unique to the concept-oriented paradigm and essentially underlies all of contemporary mathematics (after Descartes), where the world is described by means of variables and values. However, in the concept-oriented paradigm it is explicitly formulated as one of the highest-priority principles. In this principle we simply recognize that there exists a deep, fundamental difference between these two sorts of things, without which we are not able to describe the world in general and represent data in particular. Of course, it would be much better to get rid of this separation and reduce all operations to only one type of thing. However, this is currently impossible because the whole world outlook is based on this assumption. In other words, in order to change that we would need a new vision of how the world is organised, a kind of theory of everything in which any entity can be described by means of itself. We do not exclude such a theory at all, but for the needs of data representation we assume that this principle is true. By postulating the existence of two sorts of things we essentially make our life easier. In particular, we can formulate the next principle:
CO3 Objects are living in space and its structure determines most of the whole system functionality, w
…(Full text truncated)…