📝 Original Info
- Title: Efficient Minimization of DFAs with Partial Transition Functions
- ArXiv ID: 0802.2826
- Date: 2008-02-21
- Authors: Antti Valmari, Petri Lehtinen (Tampere University of Technology)
📝 Abstract
Let PT-DFA mean a deterministic finite automaton whose transition relation is a partial function. We present an algorithm for minimizing a PT-DFA in $O(m \lg n)$ time and $O(m+n+\alpha)$ memory, where $n$ is the number of states, $m$ is the number of defined transitions, and $\alpha$ is the size of the alphabet. Time consumption does not depend on $\alpha$, because the $\alpha$ term arises from an array that is accessed at random and never initialized. It is not needed, if transitions are in a suitable order in the input. The algorithm uses two instances of an array-based data structure for maintaining a refinable partition. Its operations are all amortized constant time. One instance represents the classical blocks and the other a partition of transitions. Our measurements demonstrate the speed advantage of our algorithm on PT-DFAs over an $O(\alpha n \lg n)$ time, $O(\alpha n)$ memory algorithm.
📄 Full Content
arXiv:0802.2826v1 [cs.IT] 20 Feb 2008
Symposium on Theoretical Aspects of Computer Science 2008 (Bordeaux), pp. 645-656
www.stacs-conf.org
EFFICIENT MINIMIZATION OF DFAS WITH PARTIAL TRANSITION FUNCTIONS
ANTTI VALMARI 1 AND PETRI LEHTINEN 1
1 Tampere University of Technology, Institute of Software Systems, PO Box 553, FI-33101 Tampere,
Finland
E-mail address: {Antti.Valmari,Petri.Lehtinen}@tut.fi
Abstract. Let PT-DFA mean a deterministic finite automaton whose transition relation
is a partial function. We present an algorithm for minimizing a PT-DFA in O(m lg n) time
and O(m + n + α) memory, where n is the number of states, m is the number of defined
transitions, and α is the size of the alphabet. Time consumption does not depend on α,
because the α term arises from an array that is accessed at random and never initialized.
It is not needed, if transitions are in a suitable order in the input. The algorithm uses
two instances of an array-based data structure for maintaining a refinable partition. Its
operations are all amortized constant time. One instance represents the classical blocks and
the other a partition of transitions. Our measurements demonstrate the speed advantage
of our algorithm on PT-DFAs over an O(αn lg n) time, O(αn) memory algorithm.
1. Introduction
Minimization of a deterministic finite automaton (DFA) is a classic problem in computer
science. Let n be the number of states, m the number of transitions and α the size of the
alphabet of the DFA. Hopcroft made a breakthrough in 1970 by presenting an algorithm
that runs in O(n lg n) time, treating α as a constant [5]. Gries made the dependence of
the running time of the algorithm on α explicit, obtaining O(αn lg n) [3]. (Complexity is
reported using the RAM machine model under the uniform cost criterion [1, p. 12].)
Our starting point was the paper by Knuutila in 2001, where he presented yet another
O(αn lg n) algorithm, and remarked that some versions which have been believed to run
within this time bound actually fail to do so [6]. Hopcroft’s algorithm is based on using only
the “smaller” half of some set (known as block) that has been split. Knuutila demonstrated
with an example that although the most well-known notion of “smaller” automatically leads
to O(αn lg n), two other notions that have been used may yield Ω(n³) when α = n/2. He
also showed that this can be avoided by maintaining, for each symbol, the set of those
states in the block that have input transitions labelled by that symbol. According to [3],
Hopcroft’s original algorithm did so. Some later authors have dropped this complication as
unnecessary, although it is necessary when the alternative notions of “smaller” are used.
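The safe, well-known notion of "smaller" compares the sizes of the two halves of a split block directly. A minimal sketch of that rule (our own illustration, not code from the paper; the names are ours):

```python
def choose_splitter(part_a, part_b, waiting):
    """After a block splits into part_a and part_b, add only the
    smaller half to the waiting collection of future splitters.
    Each state can then enter the waiting collection only O(lg n)
    times, which is what drives the O(m lg n) / O(alpha n lg n)
    bounds discussed in the text."""
    smaller = part_a if len(part_a) <= len(part_b) else part_b
    waiting.append(smaller)
    return smaller
```

With other notions of "smaller" (for example, ones measured against the original block rather than the sibling half), this doubling argument breaks down, which is the source of the Ω(n³) behaviour Knuutila demonstrated.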
Key words and phrases: deterministic finite automaton, sparse adjacency matrix, partition refinement.
Petri Lehtinen was funded by Academy of Finland, project ALEA (210795).
© A. Valmari and P. Lehtinen; licensed under the Creative Commons Attribution-NoDerivs License
Knuutila mentioned as future work whether his approach can be used to develop an
O(m lg n) algorithm for DFAs whose transition functions are not necessarily total. For
brevity, we call them PT-DFAs. With an ordinary DFA, O(m lg n) is the same as O(αn lg n)
as m = αn, but with a PT-DFA it may be much better. We present such an algorithm
in this paper. We refined Knuutila’s method of maintaining sets of states with relevant
input transitions into a full-fledged data structure for maintaining refinable partitions. In-
stead of maintaining those sets of states, our algorithm maintains the corresponding sets of
transitions. Another instance of the structure maintains the blocks.
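A refinable partition of this kind can be realized with a few parallel arrays: the elements grouped set by set, each element's location, each element's set index, and per-set boundaries, with a "marked prefix" used to stage a split. The following is a simplified sketch of such a structure under our own naming (not the paper's actual implementation; in particular, the paper keeps the *smaller* side as the new set to preserve the O(m lg n) bound, whereas this sketch always promotes the marked prefix):

```python
class RefinablePartition:
    """Array-based refinable partition: all operations touch O(1)
    array cells, except split, whose cost is charged to the marked
    elements (amortized constant per mark in the full algorithm)."""

    def __init__(self, n):
        self.elems = list(range(n))   # elements, grouped by set
        self.loc = list(range(n))     # loc[e] = index of e in elems
        self.sidx = [0] * n           # sidx[e] = set containing e
        self.first = [0]              # first[s] = start of set s in elems
        self.end = [n]                # end[s] = one past the last element
        self.mid = [0]                # mid[s] = end of the marked prefix

    def mark(self, e):
        """Swap e into the marked prefix of its set."""
        s = self.sidx[e]
        i, m = self.loc[e], self.mid[s]
        if i >= m:                    # not yet marked
            other = self.elems[m]
            self.elems[i], self.elems[m] = other, e
            self.loc[e], self.loc[other] = m, i
            self.mid[s] = m + 1

    def split(self, s):
        """Detach the marked prefix of s as a new set; return its
        index, or -1 if every element or no element was marked."""
        if self.mid[s] == self.end[s]:
            self.mid[s] = self.first[s]   # all marked: just unmark
        if self.mid[s] == self.first[s]:
            return -1
        t = len(self.first)               # index of the new set
        self.first.append(self.first[s])
        self.end.append(self.mid[s])
        self.mid.append(self.first[s])
        for i in range(self.first[s], self.mid[s]):
            self.sidx[self.elems[i]] = t
        self.first[s] = self.mid[s]
        return t
```

One instance of such a structure holds the blocks of states, and a second instance holds the partition of transitions, which is what lets the main loop scan transitions rather than the whole alphabet.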
Knuutila seems to claim that such a PT-DFA algorithm arises from the results in [7],
where an O(m lg n) algorithm was described for refining a partition against a relation.
However, there α = 1, so the solved problem is not an immediate generalisation of ours.
Extending the algorithm to α > 1 is not trivial, as can be appreciated from the extension
in [2]. It discusses O(m lg n) without openly promising it. Indeed, its analysis treats α as a
constant. It seems to us that its running time does have an αn term.
In Section 2 we present an abstract minimization algorithm that, unlike [3, 6], has been
adapted to PT-DFAs and avoids scanning the blocks and the alphabet in nested loops. The
latter is crucial for converting αn into m in the complexity. The question of what blocks
are needed in further splitting, has led to lengthy and sometimes unconvincing discussions
in earlier literature. Our correctness proof deals with this issue using the “loop invariant”
paradigm advocated in [4]. Our loop invariant “knows” what blocks are needed.
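The shape of such a partition-refinement loop, with splitters processed by scanning only the transitions that enter them, can be sketched as follows. This is our own illustration of the general idea using plain Python sets and dictionaries; it does not reproduce the paper's refinable-partition data structures or its O(m lg n) accounting:

```python
from collections import defaultdict

def minimize_ptdfa(n_states, transitions, accepting):
    """Partition refinement for a PT-DFA given as (source, label,
    target) triples.  Work per splitter is proportional to the
    number of transitions entering it, not to alpha * n."""
    preds = defaultdict(list)             # (target, label) -> sources
    for s, a, t in transitions:
        preds[(t, a)].append(s)
    labels = {a for _, a, _ in transitions}

    # Initial partition: accepting vs. non-accepting states.
    blocks = [set(q for q in range(n_states) if q in accepting),
              set(q for q in range(n_states) if q not in accepting)]
    blocks = [b for b in blocks if b]
    block_of = [0] * n_states
    for i, b in enumerate(blocks):
        for q in b:
            block_of[q] = i

    waiting = list(range(len(blocks)))    # blocks still to be used as splitters
    while waiting:
        w = waiting.pop()
        for a in labels:
            # States with an a-transition into the splitter block.
            x = {s for t in blocks[w] for s in preds[(t, a)]}
            for i in range(len(blocks)):  # newly added blocks need no re-split
                inter = blocks[i] & x
                if inter and inter != blocks[i]:
                    blocks[i] -= inter
                    blocks.append(inter)
                    j = len(blocks) - 1
                    for q in inter:
                        block_of[q] = j
                    # Smaller-half rule bounds the total splitter work.
                    waiting.append(j if len(inter) <= len(blocks[i]) else i)
    return blocks, block_of
```

The blocks returned are the classes of equivalent states; a minimal PT-DFA is obtained by merging each block into one state. The crucial point mirrored from the text is that the inner work is driven by `preds`, i.e. by defined transitions, so undefined transitions cost nothing.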
Section 3 presents an implementation of the refinable partition data structure. Its per-
formance relies on a carefully chosen combination of simple low-level programming details.
The implementation of the main part of the abstract algorithm is the topic of Section 4.
The analysis of its
…(Full text truncated)…
Reference
This content is AI-processed based on ArXiv data.