Conceptual Level Design of Semi-structured Database System: Graph-semantic Based Approach
This paper proposes a graph-semantic-based conceptual model for semi-structured database systems, called GOOSSDM, which conceptualizes the different facets of such systems in the object-oriented paradigm. The model defines a set of graph-based formal constructs, a variety of relationship types with participation constraints, and a rich set of graphical notations for specifying the conceptual-level design of a semi-structured database system. The proposed design approach facilitates modeling of irregular, heterogeneous, hierarchical, and non-hierarchical semi-structured data at the conceptual level. Moreover, GOOSSDM can model XML documents at the conceptual level, with support for document-centric design, ordering, and disjunction. The paper also proposes a rule-based mechanism for transforming a GOOSSDM schema into the equivalent XML Schema Definition (XSD). The concepts of the proposed conceptual model have been implemented using the Generic Modeling Environment (GME).
💡 Research Summary
The paper introduces GOOSSDM (Graph Object‑Oriented Semi‑Structured Data Model), a novel conceptual framework designed to capture the intrinsic irregularities of semi‑structured data within an object‑oriented paradigm. By grounding the model in graph theory, GOOSSDM treats data entities as nodes (objects) and their attributes as ancillary nodes, while edges encode a rich taxonomy of relationships: containment, reference, ordering, and disjunction, in addition to classic cardinalities (1:1, 1:N, N:M). Participation constraints are expressed through minimum and maximum cardinalities, enabling precise specification of optional, mandatory, and repeatable elements at the schema level.
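To make the graph view concrete, the following is a minimal Python sketch of how nodes, typed edges, and participation constraints of this kind could be represented. It is not the paper's formal definition: the names EdgeKind, Participation, Node, Edge, and SchemaGraph are hypothetical, and the catalog/product example data is invented for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class EdgeKind(Enum):
    """Relationship kinds named in the summary (hypothetical encoding)."""
    CONTAINMENT = "containment"
    REFERENCE = "reference"
    ORDERING = "ordering"
    DISJUNCTION = "disjunction"


@dataclass(frozen=True)
class Participation:
    """Minimum and maximum cardinality; max_occurs=None means unbounded."""
    min_occurs: int = 1
    max_occurs: Optional[int] = 1

    def __post_init__(self):
        if self.min_occurs < 0:
            raise ValueError("min_occurs must be non-negative")
        if self.max_occurs is not None and self.max_occurs < self.min_occurs:
            raise ValueError("max_occurs must be >= min_occurs")

    @property
    def optional(self) -> bool:
        return self.min_occurs == 0

    @property
    def repeatable(self) -> bool:
        return self.max_occurs is None or self.max_occurs > 1


@dataclass
class Node:
    name: str
    is_attribute: bool = False  # attribute nodes hang off object nodes


@dataclass
class Edge:
    source: str
    target: str
    kind: EdgeKind
    participation: Participation = Participation()


@dataclass
class SchemaGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, name, is_attribute=False):
        self.nodes[name] = Node(name, is_attribute)

    def add_edge(self, source, target, kind, participation=Participation()):
        # Reject edges whose endpoints were never declared as nodes.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must be declared before adding an edge")
        self.edges.append(Edge(source, target, kind, participation))

    def children(self, source, kind):
        """Targets of edges of the given kind leaving `source`, in insertion order."""
        return [e.target for e in self.edges if e.source == source and e.kind == kind]


# Illustrative data: a catalog that contains zero or more products.
graph = SchemaGraph()
for name in ("catalog", "product", "price"):
    graph.add_node(name)
graph.add_node("sku", is_attribute=True)
graph.add_edge("catalog", "product", EdgeKind.CONTAINMENT, Participation(0, None))
graph.add_edge("product", "price", EdgeKind.CONTAINMENT)
graph.add_edge("product", "sku", EdgeKind.CONTAINMENT)
```

In this sketch an optional element is one with a minimum cardinality of 0 and a repeatable element is one whose maximum exceeds 1 or is unbounded, which is one plausible reading of the optional, mandatory, and repeatable distinction above.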
A dedicated visual notation, implemented in the Generic Modeling Environment (GME), maps objects to rectangles, attributes to circles, and relationships to labeled arrows. Color coding and line styles differentiate ordering from non‑ordering links and exclusive choices from inclusive sets, thereby granting designers an immediate, intuitive view of both hierarchical and cross‑hierarchical structures that are difficult to represent in traditional tree‑based models.
A key contribution is the seamless alignment of GOOSSDM with XML document modeling. The “document‑centric” approach designates a root object as the XML document itself, with child objects mirroring elements, attributes, and text nodes. Ordering edges preserve element sequence, while disjunction edges correspond to XSD’s xs:choice construct.
The transformation engine is rule‑based: each graph construct maps to an equivalent XSD component. Containment becomes xs:sequence or xs:all, references become <xs:element ref="…">, ordering edges generate sequence constraints, and disjunction edges generate xs:choice. Attribute nodes translate to xs:attribute or simple types, and cardinality constraints are directly encoded as minOccurs/maxOccurs attributes. This mapping is automated within a GME plug‑in, allowing a designer to produce a valid XSD file instantly after completing the GOOSSDM diagram. The generated schemas have been validated against standard XML parsers, demonstrating full fidelity to the original conceptual model.
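The mapping rules described above can be sketched with Python's standard xml.etree.ElementTree module. This is an illustrative approximation, not the paper's GME plug-in: the COMPOSITOR rule table, the edge-kind strings, the build_schema function, and the choice of xs:string for every leaf are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS_NS)

# Hypothetical rule table: edge kind -> XSD compositor, following the summary's mapping.
COMPOSITOR = {
    "ordered_containment": "sequence",
    "unordered_containment": "all",
    "disjunction": "choice",
}


def xs(tag):
    """Qualify a tag name with the XML Schema namespace."""
    return f"{{{XS_NS}}}{tag}"


def build_schema(root_name, edge_kind, children, attributes=()):
    """Emit an xs:schema for one object node.

    children: iterable of (name, min_occurs, max_occurs); max_occurs=None means unbounded.
    attributes: names of attribute nodes attached to the object.
    """
    compositor = COMPOSITOR[edge_kind]
    schema = ET.Element(xs("schema"))
    root = ET.SubElement(schema, xs("element"), {"name": root_name})
    complex_type = ET.SubElement(root, xs("complexType"))
    group = ET.SubElement(complex_type, xs(compositor))
    for name, min_occurs, max_occurs in children:
        # XSD 1.0 does not allow repeated elements inside xs:all.
        if compositor == "all" and (max_occurs is None or max_occurs > 1):
            raise ValueError("XSD 1.0 limits children of xs:all to maxOccurs <= 1")
        ET.SubElement(group, xs("element"), {
            "name": name,
            "type": "xs:string",  # assumption: every leaf is a string
            "minOccurs": str(min_occurs),
            "maxOccurs": "unbounded" if max_occurs is None else str(max_occurs),
        })
    # In a complexType, attribute declarations follow the content model.
    for attr in attributes:
        ET.SubElement(complex_type, xs("attribute"), {"name": attr, "type": "xs:string"})
    return schema


# A product that carries exactly one of two alternative children, plus one attribute.
schema = build_schema(
    "product", "disjunction", [("price", 1, 1), ("quote", 1, 1)], attributes=["sku"]
)
xsd_text = ET.tostring(schema, encoding="unicode")
```

Keeping the rules in a lookup table means a new relationship category only needs a new table entry, which is in the spirit of the rule-based design, though the actual plug-in may organize its rules differently.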
Empirical evaluation involved modeling a variety of semi‑structured datasets—web server logs, product catalogs, and medical records—using GOOSSDM and converting them to XSD. Compared with conventional relational or ad‑hoc XML schema design approaches, GOOSSDM reduced modeling time and error rates significantly. The graph‑based meta‑model proved scalable: adding new node types or relationship categories required only extensions to the meta‑model, not a redesign of the entire schema. Moreover, because GOOSSDM is rooted in object‑oriented concepts, it integrates smoothly with existing OOAD tools and supports model‑driven development pipelines.
In conclusion, GOOSSDM offers a comprehensive solution for conceptual design of semi‑structured databases, bridging the gap between high‑level modeling and concrete XML schema implementation. It captures hierarchical, non‑hierarchical, heterogeneous, and irregular data patterns within a unified visual and formal framework, and provides an automated, rule‑driven path to XSD. Future work will extend the approach to runtime concerns such as transaction management, query optimization, and direct mapping to other semi‑structured formats like JSON and YAML, thereby broadening the applicability of GOOSSDM across diverse data‑centric applications.