MontiCore: a Framework for Compositional Development of Domain Specific Languages
Domain specific languages (DSLs) are increasingly used today. Coping with complex language definitions, evolving them in a structured way, and ensuring their error freeness are the main challenges of DSL design and implementation. The use of modular language definitions and composition operators is therefore inevitable in the independent development of language components. In this article, we discuss these arising issues by describing a framework for the compositional development of textual DSLs and their supporting tools. We use a redundancy-free definition of a readable concrete syntax and a comprehensible abstract syntax as both representations significantly overlap in their structure. For enhancing the usability of the abstract syntax, we added concepts like associations and inheritance to a grammar-based definition in order to build up arbitrary graphs (as known from metamodeling). Two modularity concepts, grammar inheritance and embedding, are discussed. They permit compositional language definition and thus simplify the extension of languages based on already existing ones. We demonstrate that compositional engineering of new languages is a useful concept when project-individual DSLs with appropriate tool support are defined.
💡 Research Summary
The paper introduces MontiCore, a framework designed to address the three main challenges of domain‑specific language (DSL) development: handling complex language specifications, evolving them in a structured manner, and guaranteeing their correctness. The authors propose a single, redundancy‑free grammar notation that simultaneously serves as a readable concrete syntax and a well‑structured abstract syntax. By enriching the grammar with association and inheritance constructs, the framework allows the definition of arbitrary graph‑like abstract models directly within the textual specification, bridging the gap between traditional parser grammars and metamodeling techniques.
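The idea of deriving both syntaxes from one definition can be sketched in a few lines. The following is a minimal illustration in Python, not MontiCore's actual grammar format or generator: the rule encoding, the helper names `derive_ast_class` and `derive_parser`, and the state-machine rules are all assumptions made for this example. Each rule is written once; the literals drive parsing (concrete syntax) and the named items become fields of the AST class (abstract syntax).

```python
from dataclasses import make_dataclass

# Hypothetical rule format (not MontiCore syntax): a production is a list of
# items. A plain string is a literal token; a (field, kind) tuple is a named
# component that becomes an attribute of the AST class.
STATE_RULE = ("State", ["state", ("name", "IDENT"), ";"])
TRANSITION_RULE = (
    "Transition",
    [("source", "IDENT"), "->", ("target", "IDENT"), ";"],
)


def derive_ast_class(rule):
    """Abstract syntax: one class per production, one field per named item."""
    name, items = rule
    fields = [item[0] for item in items if isinstance(item, tuple)]
    return make_dataclass(name, fields)


def derive_parser(rule, ast_class):
    """Concrete syntax: match literals and capture named items into a node."""
    _, items = rule

    def parse(tokens):
        if len(tokens) != len(items):
            raise SyntaxError(f"expected {len(items)} tokens, got {len(tokens)}")
        values = {}
        for item, token in zip(items, tokens):
            if isinstance(item, tuple):
                field_name, kind = item
                if kind == "IDENT" and not token.isidentifier():
                    raise SyntaxError(f"expected identifier, got {token!r}")
                values[field_name] = token
            elif item != token:
                raise SyntaxError(f"expected {item!r}, got {token!r}")
        return ast_class(**values)

    return parse


# Both artifacts come from the same rule, so they cannot drift apart.
State = derive_ast_class(STATE_RULE)
Transition = derive_ast_class(TRANSITION_RULE)
parse_state = derive_parser(STATE_RULE, State)
parse_transition = derive_parser(TRANSITION_RULE, Transition)

node = parse_state(["state", "Idle", ";"])
edge = parse_transition(["Idle", "->", "Running", ";"])
```

Here `node` is a `State` object whose `name` field holds `"Idle"`, and `edge` is a `Transition` linking `"Idle"` to `"Running"`. The sketch leaves out associations, which in the framework turn such name references into real links between AST nodes.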
Two compositional mechanisms are central to MontiCore’s modularity strategy. Grammar inheritance lets a new language reuse all production rules of an existing one, while permitting selective overriding or extension. This mechanism simplifies language evolution and the creation of language families, because changes in a base grammar automatically propagate to derived grammars. Embedding, on the other hand, enables the insertion of a complete sub‑grammar into another grammar at designated points. The framework resolves token‑level conflicts through a dedicated parser‑combination strategy, ensuring that each embedded language retains its own validation and tooling capabilities.
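The two composition operators can be illustrated with a toy model. This is a sketch under stated assumptions rather than MontiCore's implementation: the `Grammar` class, its `embed` and `resolve` methods, and the example grammars are hypothetical, and the token-level conflict handling mentioned above is not modeled. Inheritance is shown as a lookup that merges a parent's productions with local overrides; embedding is shown as binding an otherwise undefined nonterminal to the start rule of another grammar.

```python
class Grammar:
    """A toy grammar: a name, a start rule, and a dict of productions.

    Each production maps a nonterminal to a list of alternatives; each
    alternative is a list of symbols (nonterminals or quoted terminals).
    """

    def __init__(self, name, start, productions, parent=None):
        self.name = name
        self.start = start
        self.parent = parent
        self.own = dict(productions)
        self.embedded = {}  # external nonterminal -> embedded Grammar

    def productions(self):
        """Inherited productions first; local ones override them by name."""
        merged = self.parent.productions() if self.parent else {}
        merged.update(self.own)
        return merged

    def embed(self, external, grammar):
        """Bind an external nonterminal to another grammar's start rule."""
        self.embedded[external] = grammar

    def resolve(self, nonterminal):
        """Return (name of the defining grammar, alternatives)."""
        if nonterminal in self.embedded:
            inner = self.embedded[nonterminal]
            return inner.name, inner.productions()[inner.start]
        return self.name, self.productions()[nonterminal]


# Base language.
automaton = Grammar("Automaton", "Automaton", {
    "Automaton": [["'automaton'", "Name", "'{'", "Element*", "'}'"]],
    "Element": [["State"], ["Transition"]],
    "State": [["'state'", "Name", "';'"]],
    "Transition": [["Name", "'->'", "Name", "';'"]],
})

# Inheritance: reuse everything, override only the Transition rule.
timed = Grammar("TimedAutomaton", "Automaton", {
    "Transition": [["Name", "'->'", "Name", "'when'", "Guard", "';'"]],
}, parent=automaton)

# Embedding: Guard is left open above and filled by a separate language.
expression = Grammar("Expression", "Expr", {
    "Expr": [["Name", "'<'", "Number"], ["Name", "'>'", "Number"]],
})
timed.embed("Guard", expression)
```

Because `productions()` consults the parent on every call, a later change to a rule in `automaton` is visible in `timed` without further work, which mirrors the propagation property described above. The `Expression` grammar stays a separate object throughout, so it can be developed and checked independently of its host.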
Beyond the definition phase, MontiCore automatically generates the full tool chain required for a textual DSL: parsers, abstract syntax tree (AST) classes, visitor‑based transformers, and Eclipse plug‑ins for editing, syntax highlighting, and real‑time error checking. Because the generated artifacts are derived from the same modular grammar, consistency between language definition and tooling is guaranteed, and developers avoid the repetitive effort of hand‑crafting parsers and editors for each new DSL.
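To give a sense of what generated AST classes and a visitor-based check might look like, here is a small hand-written sketch in Python. The class names, the `Visitor` dispatch scheme, and the `UndefinedStateCheck` rule are assumptions for illustration only; they do not reproduce the code MontiCore emits.

```python
from dataclasses import dataclass, field, is_dataclass


# Stand-ins for generated AST classes (one per grammar production).
@dataclass
class State:
    name: str


@dataclass
class Transition:
    source: str
    target: str


@dataclass
class Automaton:
    name: str
    states: list = field(default_factory=list)
    transitions: list = field(default_factory=list)


class Visitor:
    """Dispatches on the node's class name; falls back to a generic walk."""

    def visit(self, node):
        method = getattr(self, "visit_" + type(node).__name__, self.generic_visit)
        return method(node)

    def generic_visit(self, node):
        # Walk every field and descend into child nodes, including lists.
        for value in vars(node).values():
            children = value if isinstance(value, list) else [value]
            for child in children:
                if is_dataclass(child):
                    self.visit(child)


class UndefinedStateCheck(Visitor):
    """Example check: every transition must refer to declared states."""

    def __init__(self):
        self.known = set()
        self.errors = []

    def visit_Automaton(self, node):
        self.known = {state.name for state in node.states}
        self.generic_visit(node)

    def visit_Transition(self, node):
        for end in (node.source, node.target):
            if end not in self.known:
                self.errors.append(f"undefined state: {end}")


model = Automaton(
    "Door",
    states=[State("Open"), State("Closed")],
    transitions=[Transition("Open", "Closed"), Transition("Closed", "Locked")],
)
check = UndefinedStateCheck()
check.visit(model)
```

After the visit, `check.errors` contains a single entry for the undeclared state `Locked`. An editor could surface such a list as error markers. Keeping the traversal in the `Visitor` base class means a new check only needs to implement the node types it cares about.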
The authors validate their approach with two case studies. In the first, a state‑machine DSL is extended by embedding a database‑access DSL, resulting in a single script language that can describe both control flow and data manipulation. In the second, a basic UML sequence‑diagram DSL is extended through grammar inheritance to create a domain‑specific variant with additional semantics. In both scenarios, the amount of hand‑written code dropped substantially, maintenance effort was reduced, and the automatically generated editors correctly performed syntax highlighting and error detection for the composed languages.
Overall, MontiCore demonstrates that a grammar‑centric, compositional development model can substantially improve the productivity and reliability of DSL engineering. By unifying concrete and abstract syntax, supporting object‑oriented extensions within the grammar, and providing inheritance and embedding as first‑class composition operators, the framework enables developers to build complex, project‑specific languages from reusable components while preserving a coherent, automatically generated tool ecosystem.