Efficient Minimization of DFAs with Partial Transition Functions

Reading time: 6 minutes

📝 Original Info

  • Title: Efficient Minimization of DFAs with Partial Transition Functions
  • ArXiv ID: 0802.2826
  • Date: 2008-02-21
  • Authors: Antti Valmari, Petri Lehtinen

📝 Abstract

Let PT-DFA mean a deterministic finite automaton whose transition relation is a partial function. We present an algorithm for minimizing a PT-DFA in $O(m \lg n)$ time and $O(m+n+\alpha)$ memory, where $n$ is the number of states, $m$ is the number of defined transitions, and $\alpha$ is the size of the alphabet. Time consumption does not depend on $\alpha$, because the $\alpha$ term arises from an array that is accessed at random and never initialized. It is not needed if transitions are in a suitable order in the input. The algorithm uses two instances of an array-based data structure for maintaining a refinable partition. Its operations are all amortized constant time. One instance represents the classical blocks and the other a partition of transitions. Our measurements demonstrate the speed advantage of our algorithm on PT-DFAs over an $O(\alpha n \lg n)$ time, $O(\alpha n)$ memory algorithm.
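To make the "partial transition function" setting concrete, here is a minimal sketch (our illustration, not code from the paper): the transitions are stored sparsely, so memory is proportional to the number of defined transitions $m$ rather than to $\alpha n$ as with a full transition table.

```python
class PTDFA:
    """A DFA whose transition function may be undefined on some (state, symbol) pairs."""

    def __init__(self, n_states, initial, finals, transitions):
        self.n = n_states            # states are 0 .. n_states - 1
        self.initial = initial
        self.finals = set(finals)    # accepting states
        # Partial transition function: (state, symbol) -> state.
        # A missing key means the transition is undefined.
        self.delta = dict(transitions)

    def step(self, state, symbol):
        """Return the successor state, or None if the transition is undefined."""
        return self.delta.get((state, symbol))


# Example: two states over alphabet {'a'}; the 'a'-move from state 1 is undefined.
a = PTDFA(2, initial=0, finals=[1], transitions={(0, 'a'): 1})
```

Here `step(0, 'a')` yields `1`, while `step(1, 'a')` yields `None` because that transition is simply absent from the sparse table.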

📄 Full Content

arXiv:0802.2826v1 [cs.IT] 20 Feb 2008

Symposium on Theoretical Aspects of Computer Science 2008 (Bordeaux), pp. 645-656, www.stacs-conf.org

EFFICIENT MINIMIZATION OF DFAS WITH PARTIAL TRANSITION FUNCTIONS

Antti Valmari and Petri Lehtinen
Tampere University of Technology, Institute of Software Systems, PO Box 553, FI-33101 Tampere, Finland
E-mail address: {Antti.Valmari,Petri.Lehtinen}@tut.fi

Abstract. Let PT-DFA mean a deterministic finite automaton whose transition relation is a partial function. We present an algorithm for minimizing a PT-DFA in $O(m \lg n)$ time and $O(m+n+\alpha)$ memory, where $n$ is the number of states, $m$ is the number of defined transitions, and $\alpha$ is the size of the alphabet. Time consumption does not depend on $\alpha$, because the $\alpha$ term arises from an array that is accessed at random and never initialized. It is not needed if transitions are in a suitable order in the input. The algorithm uses two instances of an array-based data structure for maintaining a refinable partition. Its operations are all amortized constant time. One instance represents the classical blocks and the other a partition of transitions. Our measurements demonstrate the speed advantage of our algorithm on PT-DFAs over an $O(\alpha n \lg n)$ time, $O(\alpha n)$ memory algorithm.

1. Introduction

Minimization of a deterministic finite automaton (DFA) is a classic problem in computer science. Let $n$ be the number of states, $m$ the number of transitions and $\alpha$ the size of the alphabet of the DFA. Hopcroft made a breakthrough in 1970 by presenting an algorithm that runs in $O(n \lg n)$ time, treating $\alpha$ as a constant [5]. Gries made the dependence of the running time of the algorithm on $\alpha$ explicit, obtaining $O(\alpha n \lg n)$ [3]. (Complexity is reported using the RAM machine model under the uniform cost criterion [1, p. 12].)
Our starting point was the paper by Knuutila in 2001, where he presented yet another $O(\alpha n \lg n)$ algorithm, and remarked that some versions which have been believed to run within this time bound actually fail to do so [6]. Hopcroft's algorithm is based on using only the "smaller" half of some set (known as a block) that has been split. Knuutila demonstrated with an example that although the most well-known notion of "smaller" automatically leads to $O(\alpha n \lg n)$, two other notions that have been used may yield $\Omega(n^3)$ when $\alpha = \frac{1}{2}n$. He also showed that this can be avoided by maintaining, for each symbol, the set of those states in the block that have input transitions labelled by that symbol. According to [3], Hopcroft's original algorithm did so. Some later authors have dropped this complication as unnecessary, although it is necessary when the alternative notions of "smaller" are used.

Key words and phrases: deterministic finite automaton, sparse adjacency matrix, partition refinement. Petri Lehtinen was funded by the Academy of Finland, project ALEA (210795). © A. Valmari and P. Lehtinen, Creative Commons Attribution-NoDerivs License.

Knuutila mentioned as future work whether his approach can be used to develop an $O(m \lg n)$ algorithm for DFAs whose transition functions are not necessarily total. For brevity, we call them PT-DFAs. With an ordinary DFA, $O(m \lg n)$ is the same as $O(\alpha n \lg n)$ as $m = \alpha n$, but with a PT-DFA it may be much better. We present such an algorithm in this paper. We refined Knuutila's method of maintaining sets of states with relevant input transitions into a full-fledged data structure for maintaining refinable partitions. Instead of maintaining those sets of states, our algorithm maintains the corresponding sets of transitions. Another instance of the structure maintains the blocks.
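The refinable-partition idea can be sketched as follows. This is our simplified illustration with our own names (`mark`, `split`), not the authors' implementation: elements of each set occupy a contiguous segment of one array, marking swaps an element into the front of its set's segment in $O(1)$, and splitting cuts the marked prefix off as a new set. The relabelling loop in `split` touches only elements that were marked, so its cost amortizes to constant time per `mark`.

```python
class RefinablePartition:
    """Array-based refinable partition of the elements 0 .. n-1 (a sketch)."""

    def __init__(self, n):
        self.elems = list(range(n))   # elements, grouped by set
        self.loc = list(range(n))     # loc[e] = position of e in elems
        self.sidx = [0] * n           # sidx[e] = index of the set containing e
        self.first = [0]              # first[s] = start of set s in elems
        self.end = [n]                # end[s] = one past the end of set s
        self.mid = [0]                # mid[s] = boundary of the marked prefix

    def mark(self, e):
        """Move e into the marked (front) region of its set: O(1)."""
        s = self.sidx[e]
        i, m = self.loc[e], self.mid[s]
        if i >= m:                    # e not yet marked
            other = self.elems[m]
            self.elems[i], self.elems[m] = other, e
            self.loc[e], self.loc[other] = m, i
            self.mid[s] = m + 1

    def split(self, s):
        """Split off the marked prefix of set s as a new set.

        Returns the new set's index, or None if no proper split happens
        (nothing marked, or everything marked)."""
        if self.mid[s] == self.first[s]:
            return None               # nothing was marked
        if self.mid[s] == self.end[s]:
            self.mid[s] = self.first[s]
            return None               # everything was marked: just reset marks
        new = len(self.first)
        self.first.append(self.first[s])
        self.end.append(self.mid[s])
        self.mid.append(self.first[s])
        for i in range(self.first[s], self.mid[s]):
            self.sidx[self.elems[i]] = new
        self.first[s] = self.mid[s]   # marked prefix of s is now empty
        return new
```

The paper uses two instances of a structure of this kind, one partitioning the states into blocks and one partitioning the transitions; the details there (e.g. how sets with marked elements are tracked) differ from this sketch.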
Knuutila seems to claim that such a PT-DFA algorithm arises from the results in [7], where an $O(m \lg n)$ algorithm was described for refining a partition against a relation. However, there $\alpha = 1$, so the solved problem is not an immediate generalisation of ours. Extending the algorithm to $\alpha > 1$ is not trivial, as can be appreciated from the extension in [2]. It discusses $O(m \lg n)$ without openly promising it. Indeed, its analysis treats $\alpha$ as a constant. It seems to us that its running time does have an $\alpha n$ term.

In Section 2 we present an abstract minimization algorithm that, unlike [3, 6], has been adapted to PT-DFAs and avoids scanning the blocks and the alphabet in nested loops. The latter is crucial for converting $\alpha n$ into $m$ in the complexity. The question of what blocks are needed in further splitting has led to lengthy and sometimes unconvincing discussions in earlier literature. Our correctness proof deals with this issue using the "loop invariant" paradigm advocated in [4]. Our loop invariant "knows" what blocks are needed. Section 3 presents an implementation of the refinable partition data structure. Its performance relies on a carefully chosen combination of simple low-level programming details. The implementation of the main part of the abstract algorithm is the topic of Section 4. The analysis of its

…(Full text truncated)…
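To illustrate what minimization computes on a PT-DFA, here is a naive Moore-style refinement sketch. This is our illustration, not the paper's algorithm: it runs in roughly $O(\alpha n^2)$ time in the worst case, nowhere near the paper's $O(m \lg n)$ bound, but it shows the splitting criterion, with an undefined transition acting as a distinguishing observation.

```python
def minimize_ptdfa(n, finals, delta):
    """Return block[q] = index of the equivalence class of state q.

    delta is a partial transition function: a dict (state, symbol) -> state.
    Repeatedly refines the partition by state "signatures" until stable.
    """
    finals = set(finals)
    # Initial partition: accepting vs. non-accepting states.
    block = [1 if q in finals else 0 for q in range(n)]
    symbols = sorted({a for (_, a) in delta})
    while True:
        sigs = {}
        new_block = [0] * n
        for q in range(n):
            # Signature: own block plus the block of each successor,
            # with None standing for "transition undefined".
            sig = (block[q],) + tuple(
                block[delta[(q, a)]] if (q, a) in delta else None
                for a in symbols)
            new_block[q] = sigs.setdefault(sig, len(sigs))
        if new_block == block:
            return block
        block = new_block
```

For example, in a three-state automaton with accepting states 1 and 2 and transitions $0 \xrightarrow{a} 1$, $1 \xrightarrow{a} 2$, $2 \xrightarrow{a} 2$, states 1 and 2 end up in the same class; and a state with a defined $a$-transition is separated from one where the $a$-transition is undefined.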

Reference

This content is AI-processed based on ArXiv data.
