Constructing a Knowledge Base for Gene
Regulatory Dynamics by Formal Concept
Analysis Methods
Johannes Wollbold1,2, Reinhard Guthke2, and Bernhard Ganter1
1 University of Technology, Institute of Algebra, Dresden, Germany
http://www.math.tu-dresden.de/alg/algebra.html
jwollbold@gmx.de
2 Leibniz Institute for Natural Product Research and Infection Biology -
Hans-Knöll-Institute (HKI) Jena, Germany
Abstract. Our aim is to build a set of rules that makes it possible to reason
about temporal dependencies within gene regulatory networks. The underlying
transitions may be obtained by discretizing observed time series, or they may
be generated from existing knowledge, e.g. by Boolean networks or their
nondeterministic generalization. We use the mathematical discipline of formal
concept analysis (FCA), which has been applied successfully in domains such as
knowledge representation, data mining and software engineering. The attribute
exploration algorithm enables an expert or a supporting computer program to
decide on the validity of a minimal set of implications and thus to construct
a sound and complete knowledge base. From this knowledge base, all valid
implications relating to the selected properties of a set of genes can be
derived. We present results of our method for the initiation of sporulation in
Bacillus subtilis. However, the formal structures are presented in a very
general manner. Therefore, the approach may be adapted to signal transduction
or metabolic networks, as well as to discrete temporal transitions in many
biological and nonbiological areas.
Keywords: complete lattices, reasoning, temporal logic, gene expression
arXiv:0807.3287v1 [q-bio.MN] 21 Jul 2008
1 Introduction
As the mathematical methodology of formal concept analysis (FCA) is little
known within systems biology, we give a short overview of its history and
purposes. FCA emerged in the early 1980s within the community of set and order
theorists, algebraists and discrete mathematicians. Its first aim was to find
a new, concrete and meaningful approach to the understanding of complete
lattices (ordered sets in which every subset has a supremum and an infimum).
The following discovery proved to be very fruitful: every complete lattice can
be represented as a hierarchy of concepts, which are conceived as sets of
objects sharing a maximal set of attributes. This paved the way for using the
well-developed field of lattice theory for a transparent and complete
representation of very different types of knowledge. FCA was inspired by the
pedagogue Hartmut von Hentig [7] and his program of restructuring the
sciences, with a view to interdisciplinary collaboration and democratic
control. The philosophical background goes back to Charles S. Peirce
(1839-1914), who condensed some of his main ideas into the pragmatic maxim:
Consider what effects, that might conceivably have practical bearings,
we conceive the objects of our conception to have. Then, our
conception of these effects is the whole of our conception of the
object. [14, 5.402]
In that tradition, FCA aims at unfolding the observable, elementary
properties defining the objects subsumed by scientific concepts. If applied to
temporal transitions, effects of homogeneous classes of states can be modeled
and predicted in a clear and concise manner. Thus, FCA seems appropriate for
describing causality, and the limits of its understanding.
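To make the notion of a concept as a set of objects sharing a maximal set of attributes more tangible, the following minimal sketch enumerates all formal concepts of a tiny context. The context itself (states s1 to s4 and attributes such as "A.on") is a hypothetical toy example and is not taken from the paper; the brute-force enumeration over attribute subsets is chosen for readability, not efficiency.

```python
from itertools import combinations

# Hypothetical toy context: objects are network states, attributes are
# observable properties such as "gene A is on". All names are illustrative.
context = {
    "s1": {"A.on", "B.on"},
    "s2": {"A.on", "B.on", "C.on"},
    "s3": {"A.on", "C.on"},
    "s4": {"C.on"},
}
attributes = sorted(set().union(*context.values()))


def extent(attrs):
    """Return the objects that have every attribute in attrs."""
    return frozenset(o for o, has in context.items() if set(attrs) <= has)


def intent(objs):
    """Return the attributes shared by all objects in objs.

    For an empty set of objects this is the full attribute set.
    """
    shared = set(attributes)
    for o in objs:
        shared &= context[o]
    return frozenset(shared)


def concepts():
    """Enumerate all formal concepts by closing every attribute subset.

    This is exponential in the number of attributes and only meant for
    very small contexts.
    """
    found = set()
    for r in range(len(attributes) + 1):
        for attrs in combinations(attributes, r):
            objs = extent(attrs)
            found.add((objs, intent(objs)))
    return found


all_concepts = concepts()
# Print concepts from the most general (largest extent) to the most specific.
for objs, attrs in sorted(all_concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(objs), sorted(attrs))
```

Each printed pair consists of an extent (the states) and an intent (the maximal set of attributes those states share). Ordering the pairs by inclusion of their extents gives the concept hierarchy, which is a complete lattice.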
At present, FCA is a richly developed mathematical theory, and there are
practical applications in various fields such as data and text mining,
knowledge management, the semantic web, software engineering and economics
[3]. FCA has been used for the analysis of gene expression data in [2] and
[13], but this is the first approach that applies it to modeling (gene)
regulatory networks. The mathematical framework of FCA is very general and
open, so that manifold refinements are possible, in line with current
approaches to modeling dynamics within systems biology. On the other hand, we
developed a formal structure for general discrete temporal transitions. These
occur in a variety of domains: control of engineering processes, the evolution
of the values of variables or objects in a computer program, changes of
interactions in social networks, a piece of music, etc.
In this paper, however, the examples are exclusively biological. The purpose
is to construct a knowledge base for reasoning about temporal dependencies
within gene regulatory or signal transduction networks by means of the
attribute exploration algorithm: for a given set of interesting properties, it
builds a sound, complete and nonredundant knowledge base. This minimal set of
rules has to be checked by an expert or a computer program, e.g. by comparing
knowledge-based predictions with data.
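The checking step described above can be illustrated with a small sketch. It does not implement attribute exploration itself; it only shows how a program might decide whether a proposed implication is valid for a set of transitions, and return counterexamples if it is not. The three-gene Boolean network, its update rules and the attribute names of the form "A.in.on" / "A.out.on" are assumptions made for this illustration and are not the B. subtilis model of the paper.

```python
from itertools import product

GENES = ("A", "B", "C")


def step(state):
    """Hypothetical synchronous Boolean update rules (illustrative only)."""
    a, b, c = state["A"], state["B"], state["C"]
    return {"A": not c, "B": a, "C": a and b}


def transition_attributes(state):
    """Describe one transition by attributes of its input and output state."""
    nxt = step(state)
    attrs = set()
    for g in GENES:
        attrs.add(f"{g}.in.{'on' if state[g] else 'off'}")
        attrs.add(f"{g}.out.{'on' if nxt[g] else 'off'}")
    return frozenset(attrs)


# One transition per possible input state (2**3 = 8 transitions).
transitions = [
    transition_attributes(dict(zip(GENES, bits)))
    for bits in product((False, True), repeat=len(GENES))
]


def counterexamples(premise, conclusion):
    """Transitions that satisfy the premise but violate the conclusion."""
    return [
        t for t in transitions
        if set(premise) <= t and not set(conclusion) <= t
    ]


def holds(premise, conclusion):
    """An implication is valid if no transition is a counterexample."""
    return not counterexamples(premise, conclusion)


print(holds({"A.in.on"}, {"B.out.on"}))             # rule B' = A
print(holds({"B.in.on"}, {"C.out.on"}))             # fails when A is off
print(holds({"A.in.on", "B.in.on"}, {"C.out.on"}))  # rule C' = A and B
```

In an exploration setting, a rejected implication such as the second one would be answered with one of the returned counterexamples, which is then added to the context before the next implication is proposed.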
Since there exist relatively fixed thresholds of activation for many genes, it
is a common abstraction to consider only two expression levels, off and on. The
classical approach of Boolean networks [8] is able to capture ess
…(Full text truncated)…