Functional Automata - Formal Languages for Computer Science Students
An introductory formal languages course exposes advanced undergraduate and early graduate students to automata theory, grammars, constructive proofs, computability, and decidability. Programming students find these topics challenging or, in many cases, overwhelming and on the fringe of Computer Science. This perception is not completely absurd, since students are asked to design machines and grammars and prove them correct without being able to experiment or get immediate feedback, which is essential in a learning context. This article puts forth the thesis that the theory of computation ought to be taught using tools for actually building computations. It describes the implementation and the classroom use of a library, FSM, designed to provide students with the opportunity to experiment and test their designs using state machines, grammars, and regular expressions. Students are able to perform random testing before proceeding with a formal proof of correctness. That is, students can test their designs much like they do in a programming course. In addition, the library allows students to easily implement the algorithms they develop as part of the constructive proofs they write. Providing students with this ability ought to be a new trend in the formal languages classroom.
💡 Research Summary
The paper addresses a persistent pedagogical challenge in introductory formal languages and automata theory courses: students are required to design and prove the correctness of machines (DFA, NFA, PDA, Turing machines) and grammars (regular, context‑free, context‑sensitive) without ever seeing those artifacts run. This lack of immediate feedback makes the material feel abstract, overwhelming, and disconnected from the programming experiences that most computer‑science students are accustomed to.
To bridge this gap, the authors present FSM, a library written in the functional language Racket that supplies a complete, language‑agnostic framework for constructing, transforming, observing, and testing finite‑state machines, push‑down automata, Turing machines, and various grammars. The library is organized around three families of operations:
- Primitive constructors (make‑dfa, make‑ndfa, make‑pda, make‑tm, make‑rg, make‑cfg, make‑csg) that let students declare the components of a machine or grammar (states, alphabet, start/final states, transition or production rules) in a concise, data‑driven form.
- Transformers that implement the constructive algorithms normally found in textbook proofs. Examples include regex→fsa (regular expression to finite‑state automaton), ndfa→dfa (subset construction), union‑sm, concat‑sm, kleenestar‑sm, complement‑sm, intersection‑sm, and grammar→sm (building an automaton from a grammar). These functions enable students to “run” the proof steps they write, turning a theoretical argument into executable code.
- Observers and testers that provide the immediate feedback loop. Observers such as apply‑sm (run a machine on a word), show‑transitions‑sm (display the path taken), and deriv (return a derivation tree for a grammar) let students inspect the behavior of their constructions. Testers like test‑equiv‑sm, test‑sm, both‑deriv?, and test‑equiv‑grammar automatically generate random inputs (or accept user‑supplied strings) and compare the results of two machines or grammars, reporting mismatches. This separates “validation” (does the machine accept the intended language?) from “verification” (are there hidden bugs?).
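To make the construct, observe, and test loop concrete, here is a minimal Python sketch of the same idea for a single DFA. It is not FSM's Racket API: the names make_dfa, apply_dfa, show_transitions, and random_test are hypothetical stand-ins chosen for illustration, and the dictionary representation is an assumption.

```python
import random

def make_dfa(states, alphabet, start, finals, delta):
    """Bundle a DFA's components; delta maps (state, symbol) -> state."""
    return {"states": set(states), "alphabet": list(alphabet),
            "start": start, "finals": set(finals), "delta": dict(delta)}

def apply_dfa(m, word):
    """Run m on word and report 'accept' or 'reject'."""
    state = m["start"]
    for symbol in word:
        state = m["delta"][(state, symbol)]
    return "accept" if state in m["finals"] else "reject"

def show_transitions(m, word):
    """Return the sequence of states visited while reading word."""
    path = [m["start"]]
    for symbol in word:
        path.append(m["delta"][(path[-1], symbol)])
    return path

def random_test(m, predicate, trials=200, max_len=8, seed=0):
    """Compare m with a reference predicate on random words; return mismatches."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(trials):
        word = [rng.choice(m["alphabet"]) for _ in range(rng.randint(0, max_len))]
        if (apply_dfa(m, word) == "accept") != predicate(word):
            mismatches.append("".join(word))
    return mismatches

# Intended language: words over {a, b} with an even number of a's.
EVEN_A = make_dfa(
    states=["even", "odd"], alphabet=["a", "b"], start="even", finals=["even"],
    delta={("even", "a"): "odd", ("even", "b"): "even",
           ("odd", "a"): "even", ("odd", "b"): "odd"})

def even_as(word):
    """Reference predicate written independently of the machine."""
    return word.count("a") % 2 == 0

print(apply_dfa(EVEN_A, "abab"))        # accept
print(show_transitions(EVEN_A, "aba"))  # ['even', 'odd', 'odd', 'even']
print(random_test(EVEN_A, even_as))     # [] when machine and predicate agree
```

An empty mismatch list is evidence, not proof: random testing catches design errors cheaply, and the formal correctness proof still follows.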
The authors argue that this toolset mirrors the iterative develop‑test‑debug cycle familiar from programming courses, thereby reducing the cognitive distance between formal proofs and concrete implementations. They report two classroom case studies. In the first, students used the NFA‑to‑DFA transformer and the random‑testing harness to quickly discover and correct errors in their subset‑construction implementations, achieving higher correctness rates than with pen‑and‑paper proofs alone. In the second, students built CFGs and PDAs, employed grammar→sm and deriv to generate derivation trees, and used the testing utilities to confirm language equivalence, leading to deeper conceptual understanding of closure properties and parsing. Both studies noted increased student engagement, faster turnaround on assignments, and a reduction in grading effort because many correctness checks were automated.
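The first case study pairs a subset-construction implementation with random testing. The Python sketch below illustrates that pairing under stated assumptions: it handles NFAs without epsilon moves, the dictionary layout and all function names are hypothetical, and it does not reproduce FSM's own transformer or testers.

```python
import random

def nfa_accepts(nfa, word):
    """Simulate the NFA directly by tracking the set of reachable states."""
    current = {nfa["start"]}
    for symbol in word:
        current = {t for s in current for t in nfa["delta"].get((s, symbol), ())}
    return bool(current & nfa["finals"])

def subset_construction(nfa):
    """Build an equivalent DFA whose states are frozensets of NFA states."""
    start = frozenset([nfa["start"]])
    delta, seen, todo = {}, {start}, [start]
    while todo:
        subset = todo.pop()
        for symbol in nfa["alphabet"]:
            target = frozenset(t for s in subset
                               for t in nfa["delta"].get((s, symbol), ()))
            delta[(subset, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    # A subset is final when it contains at least one final NFA state.
    finals = {s for s in seen if s & nfa["finals"]}
    return {"alphabet": nfa["alphabet"], "start": start,
            "finals": finals, "delta": delta}

def dfa_accepts(dfa, word):
    """Run the constructed DFA on word."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"][(state, symbol)]
    return state in dfa["finals"]

def find_disagreements(nfa, dfa, trials=500, max_len=10, seed=1):
    """Return random words on which the two machines give different answers."""
    rng = random.Random(seed)
    bad = []
    for _ in range(trials):
        length = rng.randint(0, max_len)
        word = "".join(rng.choice(nfa["alphabet"]) for _ in range(length))
        if nfa_accepts(nfa, word) != dfa_accepts(dfa, word):
            bad.append(word)
    return bad

# NFA for words over {a, b} whose second-to-last symbol is a.
NFA = {"alphabet": ["a", "b"], "start": "q0", "finals": {"q2"},
       "delta": {("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"},
                 ("q1", "a"): {"q2"}, ("q1", "b"): {"q2"}}}

DFA = subset_construction(NFA)
print(find_disagreements(NFA, DFA))  # an empty list means no counterexample was found
```

A faulty construction, such as one that forgets to mark subsets containing a final state as final, would show up here as a list of concrete counterexample words that a student can trace by hand.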
Compared with existing software supplements (often language‑specific, limited in scope, or merely referenced in textbooks), FSM’s design is deliberately language‑agnostic and extensible. Its functional, list‑based representation keeps the code short and readable, and new transformers or observers can be added without altering the core API. This makes the library suitable not only for undergraduate courses but also for research prototypes, compiler construction labs, or model‑checking exercises.
Future work outlined includes developing a web‑based interactive front‑end to lower the entry barrier for students unfamiliar with Racket, integrating the library with automatic grading systems for massive open online courses (MOOCs), and extending the framework to cover more advanced topics such as intermediate representations, static analysis, or probabilistic automata.
In summary, the FSM library introduces a construct‑implement‑experiment paradigm to formal languages education. By allowing students to immediately test the machines and grammars they design, it transforms abstract constructive proofs into tangible, debuggable artifacts, thereby enhancing learning outcomes, motivating students, and streamlining instructor workload.