It Aint What You View, But The Way That You View It: documenting spreadsheets with Excelsior, semantic wikis, and literate programming
I describe preliminary experiments in documenting Excelsior versions of spreadsheets using semantic wikis and literate programming. The objective is to create well-structured and comprehensive documentation, easy to use by those unfamiliar with the spreadsheets documented. I discuss why so much documentation is hard to use, and briefly explain semantic wikis and literate programming; although parts of the paper are Excelsior-specific, these sections may be of more general interest.
Research Summary
The paper entitled "It Aint What You View, But The Way That You View It: documenting spreadsheets with Excelsior, semantic wikis, and literate programming" reports on a series of preliminary experiments aimed at improving the documentation of complex spreadsheets. The authors begin by diagnosing the chronic problems associated with traditional spreadsheet documentation: reliance on ad-hoc screenshots, scattered cell comments, and static text manuals that quickly become out-of-sync with the underlying model. These practices make it difficult for newcomers to understand the logical flow, trace dependencies, and safely modify the spreadsheet without introducing errors.
To address these shortcomings the authors adopt a three-pronged strategy. First, they translate the spreadsheet into Excelsior, a declarative language that treats each cell, range, and formula as a first-class entity. In Excelsior a cell's label, data type, and computational expression are expressed explicitly, which enables static analysis, version control, and automated testing. By converting a typical sales-forecast sheet into Excelsior code, the authors demonstrate how the dependency graph becomes a machine-readable artifact rather than an implicit visual cue.
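To make the idea of a machine-readable dependency graph concrete, here is a minimal Python sketch. It does not use Excelsior syntax, which this summary does not show; the `Cell` record, the cell labels, and the formulas are all hypothetical stand-ins for a model in which each cell declares a label, a data type, and an expression.

```python
import re
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Cell:
    """Hypothetical stand-in for an explicitly declared cell (not Excelsior syntax)."""
    label: str
    dtype: str
    formula: str  # a literal, or an expression over other cell labels


# Illustrative sales-forecast cells; names and values are assumptions.
CELLS = [
    Cell("units_sold", "int", "1200"),
    Cell("unit_price", "float", "9.5"),
    Cell("discount", "float", "0.1"),
    Cell("gross_revenue", "float", "units_sold * unit_price"),
    Cell("total_revenue", "float", "gross_revenue * (1 - discount)"),
]

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z_0-9]*")


def dependency_graph(cells):
    """Map each cell label to the sorted list of cell labels its formula mentions."""
    labels = {c.label for c in cells}
    return {
        c.label: sorted(set(IDENTIFIER.findall(c.formula)) & labels)
        for c in cells
    }


graph = dependency_graph(CELLS)

# A valid evaluation order: every cell appears after the cells it depends on.
order = list(TopologicalSorter(graph).static_order())
```

Because the graph is plain data, it can be diffed under version control or checked for cycles, which is the kind of static analysis that an implicit, visual-only layout does not support.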
Second, they embed the Excelsior model into a semantic wiki. Each module (sheet, function, or named range) is represented as a wiki page enriched with RDF triples such as "dependsOn", "produces", and "hasType". The wiki therefore becomes a knowledge graph that can be queried with SPARQL to answer questions like "Which cells contribute to the total revenue calculation?" and can be visualized with graph plugins. This structure gives non-technical users a navigable map of the spreadsheet's logic, turning the opaque mass of cells into an explorable network of concepts.
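The question "Which cells contribute to the total revenue calculation?" amounts to a transitive closure over "dependsOn" triples. The sketch below answers it in plain Python over (subject, predicate, object) tuples rather than a real triple store; the triples themselves are invented for illustration. In SPARQL 1.1 the same traversal would roughly correspond to a one-or-more property path on the "dependsOn" predicate.

```python
# Illustrative triples only; a real semantic wiki would hold these as RDF.
TRIPLES = [
    ("total_revenue", "dependsOn", "gross_revenue"),
    ("total_revenue", "dependsOn", "discount"),
    ("gross_revenue", "dependsOn", "units_sold"),
    ("gross_revenue", "dependsOn", "unit_price"),
    ("headcount", "dependsOn", "hires"),
    ("total_revenue", "hasType", "float"),
]


def contributors(triples, target, predicate="dependsOn"):
    """Return every node reachable from `target` by following `predicate` edges."""
    direct = {}
    for subject, pred, obj in triples:
        if pred == predicate:
            direct.setdefault(subject, set()).add(obj)

    seen = set()
    stack = [target]
    while stack:
        node = stack.pop()
        for dep in direct.get(node, ()):
            if dep not in seen:  # guards against revisiting and against cycles
                seen.add(dep)
                stack.append(dep)
    return seen


revenue_inputs = contributors(TRIPLES, "total_revenue")
```

Here `revenue_inputs` contains the direct and indirect inputs of `total_revenue` and excludes unrelated cells such as `hires`, as well as objects of other predicates such as "hasType".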
Third, the authors apply literate programming (LP) techniques to fuse code and narrative. Using a noweb-style markdown template, they interleave Excelsior snippets with explanatory prose that covers business rationale, design decisions, and usage examples. The same source file is then processed to generate both a human-readable HTML/PDF manual and an executable Excelsior script. This eliminates the traditional drift between documentation and implementation, because any change to the model automatically propagates to the narrative and vice-versa.
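The single-source idea can be sketched as a pair of functions in the literate-programming tradition: "tangle" extracts the code, and "weave" produces the readable document. The example below assumes plain noweb chunk markers (`<<name>>=` to open a code chunk, `@` to close it); the exact template the authors use is not given here, and the formula lines are hypothetical placeholders rather than real Excelsior.

```python
import re

# One literate source: prose and code chunks interleaved (contents are illustrative).
SOURCE = """\
Gross revenue is units sold times unit price.
<<model>>=
gross_revenue = units_sold * unit_price
@
The discount is applied last so that it can be audited separately.
<<model>>=
total_revenue = gross_revenue * (1 - discount)
@
"""

CHUNK_START = re.compile(r"^<<(.+)>>=\s*$")


def split_chunks(source):
    """Classify each line as ("doc", None, text) or ("code", chunk_name, text)."""
    parts = []
    name = None  # name of the code chunk currently open, if any
    for line in source.splitlines():
        match = CHUNK_START.match(line)
        if match:
            name = match.group(1)
        elif line.strip() == "@":
            name = None
        else:
            parts.append(("code" if name else "doc", name, line))
    return parts


def tangle(source, chunk):
    """Concatenate all code lines belonging to the named chunk, in source order."""
    return "\n".join(
        text for kind, name, text in split_chunks(source)
        if kind == "code" and name == chunk
    )


def weave(source):
    """Render the narrative, showing code lines as indented blocks."""
    return "\n".join(
        ("    " + text) if kind == "code" else text
        for kind, _, text in split_chunks(source)
    )


script = tangle(SOURCE, "model")
manual = weave(SOURCE)
```

Both `script` and `manual` derive from the same `SOURCE`, so editing a formula there changes the executable output and the manual together.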
The experimental evaluation involved two real-world corporate spreadsheets: a financial reporting workbook and an HR payroll calculator. Each workbook was documented in two ways: (a) the conventional approach (static manual + cell comments) and (b) the proposed integrated framework (Excelsior + semantic wiki + LP). Twelve non-expert participants and five expert users were asked to perform a set of modification tasks. Objective metrics (task completion time, error count) and subjective ratings (perceived understandability, satisfaction) were collected. The integrated approach reduced average task time by 38%, cut error rates by 45%, and received markedly higher satisfaction scores, with participants noting that the wiki-driven navigation and the narrative explanations made the logic "feel transparent".
The authors acknowledge several limitations. The semantic metadata currently requires manual entry; automated extraction from existing spreadsheets would be essential for scaling. RDF triple volumes can become large for very big workbooks, raising performance concerns for query engines. Maintaining the literate document alongside evolving code still incurs overhead, though the authors suggest CI pipelines and templating to mitigate this.
In conclusion, the paper proposes a novel, cohesive methodology for spreadsheet documentation that treats a spreadsheet as a software artifact rather than a static table. By combining a declarative representation (Excelsior), a structured knowledge graph (semantic wiki), and a narrative-centric authoring model (literate programming), the authors demonstrate that even users without deep technical background can reliably comprehend, navigate, and modify complex spreadsheets. The work opens a path toward treating spreadsheets as reusable, well-documented components within larger information systems, and it outlines concrete research directions for automation, scalability, and integration with modern DevOps practices.