Linguistic Analysis of Requirements of a Space Project and their Conformity with the Recommendations Proposed by a Controlled Natural Language

The long-term aim of the project carried out by the French National Space Agency (CNES) is to design a writing guide based on the actual, routine writing of requirements. As a first step in the project, this paper proposes a linguistic analysis of requirements written in French by CNES engineers. The aim is to determine to what extent they conform to two rules laid down in the INCOSE guide, a recent reference for writing requirements. Although CNES engineers are not obliged to follow any Controlled Natural Language when writing requirements, we believe that language regularities are likely to emerge from this task, mainly owing to the writers' experience. The issue is approached with natural language processing tools that identify sentences which do not comply with the INCOSE rules. We further review these sentences to understand why the recommendations cannot (or should not) always be applied when specifying large-scale projects.


💡 Research Summary

The paper presents the first phase of a long‑term initiative undertaken by the French National Space Agency (CNES) to develop a writing guide for system requirements based on the actual language used by engineers. The authors focus on a linguistic analysis of a corpus of French‑language requirements authored by CNES engineers, with the specific aim of measuring compliance with two concrete recommendations from the INCOSE guide: (1) each requirement sentence should convey a single intent, and (2) ambiguous terms should be avoided. Although CNES staff are not mandated to follow any Controlled Natural Language (CNL), the authors hypothesise that the repetitive nature of requirement writing and the engineers’ expertise will naturally give rise to regularities that can be detected automatically.

To test this hypothesis, the researchers extracted roughly 1,200 requirement sentences from internal CNES documentation. They applied a modern natural‑language‑processing pipeline that includes French morphological analysis, syntactic parsing, and semantic role labeling. Custom regular‑expression patterns and a lexical list of known ambiguous adjectives and adverbs were used to flag potential violations of the “single‑intent” and “no‑ambiguity” rules. The automated detection results were then reviewed by five senior CNES engineers, who validated true positives, identified false positives, and provided qualitative explanations for each flagged case.
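The detection step described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the ambiguous-term lexicon and the conjunction-counting heuristic for the single-intent rule are hypothetical stand-ins for the paper's (unpublished) patterns and lexical lists.

```python
import re

# Hypothetical lexicon of ambiguous French adjectives/adverbs; the paper's
# actual lexical list is not reproduced here.
AMBIGUOUS_TERMS = {"rapidement", "suffisant", "approprié", "environ"}

# Rough proxy for the "single-intent" rule: coordinating conjunctions
# often signal that several constraints are packed into one sentence.
COORDINATORS = re.compile(r"\b(et|ou|ainsi que)\b", re.IGNORECASE)

def flag_requirement(sentence: str) -> dict:
    """Flag possible violations of the two INCOSE-style rules."""
    lowered = sentence.lower()
    ambiguous = [t for t in sorted(AMBIGUOUS_TERMS)
                 if re.search(r"\b" + re.escape(t) + r"\b", lowered)]
    multiple_intent = bool(COORDINATORS.search(sentence))
    return {"ambiguous_terms": ambiguous,
            "possible_multiple_intent": multiple_intent}

print(flag_requirement("Le système doit démarrer rapidement et envoyer un rapport."))
```

In the study, such automatically flagged candidates were not treated as violations outright; they were passed to senior engineers for validation, which is what surfaced the false positives and the intentional "violations" discussed below.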

The quantitative findings reveal that about 38% of the sentences breach the single-intent rule, while 27% contain at least one ambiguous term. Violations are concentrated in sections dealing with system integration, interface specifications, and verification activities, where engineers often try to capture multiple interrelated constraints within a single sentence. Conversely, requirements describing physical parameters or structural design tend to adhere more closely to the INCOSE recommendations. The expert review uncovered a nuanced picture: many "violations" are intentional, reflecting a desire to preserve flexibility in early-stage specifications or to acknowledge incomplete knowledge that will be refined later. Moreover, senior engineers exhibited a lower violation rate than less-experienced colleagues, suggesting that informal style guidelines have emerged through practice.

From these observations the authors draw several key insights. First, automatic linguistic analysis is effective at surfacing patterns of non‑conformance, providing a valuable quality‑control tool for large requirement sets. Second, the INCOSE rules, while useful, should be treated as guidelines rather than immutable laws, especially in large‑scale, high‑complexity projects where strict adherence may impede the expression of necessary nuance. Third, a hybrid approach that combines automated detection with expert judgment is essential: the tools can flag candidates, but domain experts must decide whether a flagged instance truly warrants revision.

The paper concludes by outlining future work. The authors propose to refine the detection algorithms using machine‑learning models trained on the validated violation corpus, and to develop a customized CNL template that reflects the specific linguistic habits of CNES engineers. They also envision an integrated authoring environment that offers real‑time feedback, helping writers to restructure sentences or replace ambiguous terms on the fly. Finally, they suggest a longitudinal study to correlate rule violations with downstream project risks, thereby quantifying the cost‑benefit trade‑off of stricter CNL enforcement in the context of space system development.