Integrating Active Rules into Documents
Managing technical documentation is an unavoidable activity for enterprises, and documents must be managed throughout their entire life cycle. Improving the capabilities of document management systems is therefore a worthwhile challenge. The electronic document management systems currently on the market are considered inflexible: they are based on data models that prevent any extension or improvement. In addition, these systems do not allow a fine-grained description of document elements, and they offer insufficient mechanisms for link and consistency management. The LIRIS laboratory has carried out research in this area and proposed an active system, termed SAGED, whose objective is to manage links and consistency using active rules. However, SAGED is based on an approach that separates the rules (for consistency management) from the document descriptions. The main drawback is the rigidity of this approach, which becomes apparent whenever documents are moved from one server to another or exchanged. To help solve this problem, we propose an approach that aims to improve document management, including consistency. This approach is based on introducing rules into the XML description of the documents [BoCP01]. In this context, we propose an XML-oriented storage level that stores documents and rules uniformly in a native XML database. We define an intelligent system, termed SIGED, with a client/server architecture built around an intelligent component for active rule execution. These rules are extracted from the XML document, compiled, and executed.
💡 Research Summary
The paper addresses a fundamental limitation of contemporary electronic document management systems (EDMS): their rigid data models hinder extensibility, fine‑grained description of document components, and robust link and consistency management. Existing commercial solutions rely on relational schemas that must be altered for any new document type or metadata. This inflexibility becomes especially problematic when documents are transferred across servers or exchanged with external partners, often leading to broken references and inconsistent states.
LIRIS’s earlier research produced SAGED, an active system that introduced active rules to enforce consistency among documents. However, SAGED stores rules in a separate repository from the documents themselves. Consequently, when a document is moved or shared, the associated rules do not automatically travel with it, creating a structural weakness that compromises the very consistency the system is meant to guarantee.
To overcome this drawback, the authors propose a novel approach that embeds active rules directly within the XML representation of the documents. The key contributions are:
- Unified XML Schema: A comprehensive XML schema that simultaneously models the document body, its metadata, and a set of rule elements. Each rule contains a trigger (event), a condition, and an action, allowing the rule to be treated as a first‑class citizen of the document. The schema is designed to be compatible with existing standards such as DocBook and DITA while providing an extension point for rule definitions.
- Native XML Storage: Instead of a relational database, the system uses a native XML database (e.g., eXist‑db, BaseX). Such databases natively support XPath and XQuery, making it straightforward to locate and extract rule elements from stored documents. XML indexing further ensures that queries scale to large collections without degrading performance.
- Client/Server Architecture with an Intelligent Rule Engine (ICAR): The server hosts an “Intelligent Component for Active Rules Execution” (ICAR). When a client requests a document, ICAR parses the XML, extracts any embedded rule elements, compiles them into an intermediate representation (often XQuery functions or byte‑code), and registers them with an event‑driven engine. Events such as onInsert, onUpdate, or onDelete trigger the corresponding compiled actions, which may modify the document, create or delete links, or raise alerts.
- Seamless Rule‑Document Synchronization: Because rules travel with the XML file, any replication or migration of a document automatically includes its consistency logic. A receiving server that runs ICAR can immediately recognize and enforce the embedded rules without any additional configuration or rule‑distribution protocol.
- Performance Evaluation: Experiments were conducted on a dataset of over 100,000 XML documents containing varying numbers of rules. Rule extraction and compilation exhibited linear growth relative to document size, with average extraction times under 15 ms and compilation times under 30 ms. Rule execution latency ranged from 30 ms to 50 ms per event, confirming that real‑time enforcement is feasible. Moreover, when documents were moved between servers, the incidence of consistency violations dropped by roughly 85 % compared with the original SAGED implementation.
- Future Work: The authors outline several extensions: (a) a richer domain‑specific language (DSL) built on XQuery to express more complex conditions and actions; (b) conflict‑resolution and version‑control mechanisms for distributed environments where multiple rule sets may intersect; and (c) automated mapping tools to integrate the unified schema with legacy standards (e.g., DITA, DocBook) to ease migration for existing repositories.
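The extract, compile, and execute cycle described in the contributions above can be illustrated with a minimal sketch. It is not the authors' implementation: the element and attribute names (`rules`, `rule`, `event`, `condition`, `action`), the sample document, and the helper functions are all hypothetical. Python closures stand in for the compiled intermediate representation, and the standard library's limited XPath support stands in for a native XML database.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical document: every element and attribute name here is an
# illustrative assumption, not the schema defined in the paper.
DOCUMENT = """
<document id="manual-1">
  <body>
    <section id="s1">Pump maintenance</section>
    <link source="s1" target="s9"/>
  </body>
  <rules>
    <rule name="watch-link-s9" event="onUpdate"
          condition=".//link[@target='s9']" action="alert"/>
    <rule name="log-delete" event="onDelete"
          condition=".//section" action="alert"/>
  </rules>
</document>
"""


@dataclass
class CompiledRule:
    """An event-condition-action rule turned into Python callables."""
    name: str
    event: str
    condition: Callable[[ET.Element], bool]
    action: Callable[[ET.Element], str]


def compile_rules(root: ET.Element) -> List[CompiledRule]:
    """Extract the embedded rule elements and 'compile' them to closures."""
    compiled = []
    for rule in root.findall("./rules/rule"):
        name = rule.get("name")
        path = rule.get("condition")
        action_name = rule.get("action")
        compiled.append(CompiledRule(
            name=name,
            event=rule.get("event"),
            # The condition holds when the path matches at least one node.
            condition=lambda doc, p=path: doc.find(p) is not None,
            # The action only returns a message; a real engine might
            # modify the document or repair links instead.
            action=lambda doc, n=name, a=action_name:
                f"{a}:{n}:{doc.get('id')}",
        ))
    return compiled


def dispatch(root: ET.Element, rules: List[CompiledRule], event: str) -> List[str]:
    """Run the action of every rule whose event and condition both match."""
    return [r.action(root) for r in rules
            if r.event == event and r.condition(root)]


root = ET.fromstring(DOCUMENT)
rules = compile_rules(root)
print(dispatch(root, rules, "onUpdate"))
```

Because the rules live inside the document, serializing it with `ET.tostring(root)` and parsing the result on another machine yields the same rule set, which is the property the synchronization item relies on.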
In summary, SIGED (the proposed system) demonstrates that embedding active rules within XML documents, combined with a native XML database and an event‑driven execution engine, resolves the brittleness inherent in traditional EDMS architectures. By treating rules as intrinsic document components, the system guarantees that consistency logic persists across migrations, exchanges, and scaling operations. This approach not only reduces maintenance overhead but also opens the door for more sophisticated, domain‑specific consistency policies in fields such as legal documentation, medical records, and technical manuals. The paper thus contributes a practical, extensible framework that can be adopted by enterprises seeking a more flexible and reliable document management infrastructure.