Prediction and verification of indirect interactions in densely interconnected regulatory networks


📝 Original Info

  • Title: Prediction and verification of indirect interactions in densely interconnected regulatory networks
  • ArXiv ID: 0710.0892
  • Date: 2007-11-27
  • Authors: The author information stated in the paper was not provided.

📝 Abstract

We develop a matrix-based approach to predict and verify indirect interactions in gene and protein regulatory networks. It is based on the approximate transitivity of indirect regulations (e.g. A regulates B and B regulates C often implies that A regulates C) and optimally takes into account the length of a cascade and signs of intermediate interactions. Our method is at its most powerful when applied to large and densely interconnected networks. It successfully predicts both the yet unknown indirect regulations, as well as the sign (activation or repression) of already known ones. The reliability of sign predictions was calibrated using the gold-standard sets of positive and negative interactions. We fine-tuned the parameters of our algorithm by maximizing the area under the Receiver Operating Characteristic (ROC) curve. We then applied the optimized algorithm to large literature-derived networks of all direct and indirect regulatory interactions in several model organisms (Homo sapiens, Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster).

📄 Full Content

The development of high-throughput experimental techniques has led to the accumulation of unprecedented amounts of data describing regulatory interactions in model organisms. Effective computational algorithms are needed to convert this treasure trove of information into a system-wide understanding of the underlying biological processes.

Regulatory interactions between proteins can be either direct or indirect. We refer to a link from a regulatory protein to a target protein as direct if it is mediated by a direct molecular mechanism, such as transcriptional regulation of the target protein’s level by a transcription factor or phosphorylation of a substrate protein by a kinase.

Conversely, regulations involving any number of intermediate proteins will be referred to as indirect. In fact, indirect regulations are vastly more common than direct ones and thus are more likely to be detected experimentally. Large sets of regulatory interactions (both direct and indirect) are often represented as a directed network in which edges carry signs indicating whether the regulation is an activation (positive sign) or an inhibition (negative sign). By ignoring the strength of interactions and the combinatorial effects of several inputs, such a network provides a very simplified description of real-life regulatory processes.

In this work, we develop a novel algorithm which allows one to verify already known indirect regulations, to infer their signs (if they are not known), and to predict new ones which have not yet been experimentally detected. As input it uses a network consisting of all presently known regulatory interactions (both direct and indirect). Our algorithm also allows one to make an educated guess about which of the interactions in the original network are direct and which are indirect in cases when this information is not readily available (as, for example, in microarray experiments following a perturbation localized on one or several genes). Thus it contributes to the popular topic of reconstructing a direct regulatory network from microarray data [1,2]. Our algorithm works best when applied to large and heavily interconnected networks. For this reason we chose to apply it to networks in well-studied model organisms obtained using automatic text-mining technologies [3].

A large-scale network analysis of indirect regulatory interactions in yeast was recently carried out in [4,5,6]. These works focused on classifying regulations as either direct or indirect and subsequently pruning the indirect ones. Pruning indirect regulations is a useful procedure from the standpoint of network simplification. However, having been developed for relatively sparse networks, these algorithms assume that all links are equally reliable, and none of them performs well on the heavily interconnected networks considered in this study.

The emergent behavior of the rapidly growing body of knowledge contained in regulatory and other biomolecular networks was recently explored in a series of publications by Rzhetsky and collaborators [7,8,9]. The matrix-based approach advocated below nicely complements the Bayesian methods [8] for validating large maps of biomolecular pathways or, more generally, any set of published biological statements [9].

The main idea behind our algorithm is as follows: if a protein i regulates (either directly or indirectly) a protein k, which in turn is known to regulate (again directly or indirectly) a protein j, then there is likely to also be an indirect regulatory interaction between i and j. This simple observation can be extended in two ways. Firstly, indirect regulations can propagate along longer protein cascades, so a series of regulations i → k_1 → k_2 → j increases the likelihood of an indirect regulation i → j. Secondly, having multiple parallel pathways reinforces the prediction. Therefore, if a protein i regulates proteins k_1 and k_2, and each of them regulates a protein j, it is even more likely that there is an indirect regulation from i to j.

A simple-minded way to predict or verify an indirect regulation between a protein i and a protein j is to simply count the number of directed paths connecting i and j.
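The counting scheme just described can be sketched with powers of an adjacency matrix. This is a minimal illustration and not the paper's implementation: the function names `mat_mul` and `count_paths`, the cutoff `max_steps`, and the toy four-protein network are all assumptions made for this example. Note that matrix powers count walks, which may revisit a node; in an acyclic network such as the toy one below, walks and paths coincide.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def count_paths(adj, max_steps):
    """Return a matrix whose (i, j) entry is the number of directed walks
    from i to j that use between 1 and max_steps edges (hypothetical helper)."""
    n = len(adj)
    total = [[0] * n for _ in range(n)]
    power = [row[:] for row in adj]  # adj^1: walks made of a single edge
    for _ in range(max_steps):
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
        power = mat_mul(power, adj)  # move on to walks with one more edge
    return total


# Toy network (assumed for illustration): 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
adj = [
    [0, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
counts = count_paths(adj, 3)
print(counts[0][3])  # two parallel routes from protein 0 to protein 3
```

Here proteins 1 and 2 play the role of the two parallel intermediates, so the pair (0, 3) gets a count of 2 while every directly linked pair gets a count of 1.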

However, this counting scheme does not take into account two important observations. First of all, paths should be weighted differently according to their lengths. Inferences based on longer cascades are less reliable, and thus such cascades should contribute less to the likelihood. We choose to exponentially discount longer paths by weighting a path involving n intermediate proteins by a factor λ^n, where λ < 1 is a parameter of our algorithm.
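The exponential discount can be illustrated by weighting each matrix power before summing. This is a sketch under stated assumptions, not the formula from the paper (which is not shown in this excerpt): the name `weighted_path_score`, the truncation at `max_intermediates`, and the toy network are hypothetical. A walk with n intermediate proteins has n + 1 edges, so the term for the (n + 1)-th power of the adjacency matrix receives the weight λ^n. Signs are ignored here.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def weighted_path_score(adj, lam, max_intermediates):
    """Sum over walks from i to j, each weighted by lam ** n, where n is
    the number of intermediate proteins on the walk (hypothetical helper)."""
    size = len(adj)
    score = [[0.0] * size for _ in range(size)]
    power = [row[:] for row in adj]  # adj^1: zero intermediates, weight 1
    weight = 1.0
    for _ in range(max_intermediates + 1):
        for i in range(size):
            for j in range(size):
                score[i][j] += weight * power[i][j]
        power = mat_mul(power, adj)  # one more intermediate protein
        weight *= lam                # discounted by one more factor of lam
    return score


# Toy network (assumed): 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3, 3 -> 4.
adj = [
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
score = weighted_path_score(adj, lam=0.5, max_intermediates=2)
print(score[0][3], score[0][4])  # 1.0 0.5
```

With λ = 0.5, the two one-intermediate routes from 0 to 3 contribute 2 × 0.5 = 1.0, while the two two-intermediate routes from 0 to 4 contribute 2 × 0.25 = 0.5, so the longer cascade counts for less even though the number of routes is the same.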

Secondly, the signs of the indirect regulation inferred from different paths should agree with each other. In general, if a protein i and a protein j are connected by a multi-step path, the sign of the resultant indirect regulation between i and j is given by the product of signs of all intermediate
