RNA-RNA interaction prediction based on multiple sequence alignments

📝 Original Info

  • Title: RNA-RNA interaction prediction based on multiple sequence alignments
  • ArXiv ID: 1003.3987
  • Date: 2010-07-15
  • Authors: Andrew X. Li, Manja Marz, Jing Qin, Christian M. Reidys

📝 Abstract

Many computational methods for RNA-RNA interaction structure prediction have been developed. Recently, $O(N^6)$ time and $O(N^4)$ space dynamic programming algorithms have become available that compute the partition function of RNA-RNA interaction complexes. However, few of these methods incorporate knowledge about related sequences, so relevant evolutionary information is often neglected in structure determination. It is therefore of considerable practical interest to introduce a method that takes into account both thermodynamic stability and sequence covariation. We present the a priori folding algorithm ripalign, whose input consists of two (given) multiple sequence alignments (MSA). ripalign outputs (1) the partition function, (2) base-pairing probabilities, (3) hybrid probabilities and (4) a set of Boltzmann-sampled suboptimal structures consisting of canonical joint structures that are compatible with the alignments. Compared to the single sequence-pair folding algorithm rip, ripalign requires negligible additional memory. Furthermore, we incorporate possible structure constraints as input parameters into our algorithm. The algorithm described here is implemented in C as part of the rip package. The supplemental material, source code and input/output files can be freely downloaded from http://www.combinatorics.cn/cbpc/ripalign.html. Contact: Christian Reidys, duck@santafe.edu
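To make the idea of a pairing that is "compatible" with two alignments more concrete, here is a minimal sketch in Python. It is not ripalign's scoring scheme, which is not described in the text above. The function names, the toy alignments and the simple rule "fraction of rows forming a canonical pair" are all assumptions made for illustration, and the rows of the two alignments are assumed to correspond to the same organisms in the same order.

```python
# Hypothetical illustration only: a naive compatibility measure for one
# candidate base pair between column i of alignment A and column j of
# alignment B. This is NOT the scoring used by ripalign.

# Canonical pairs: Watson-Crick plus G-U wobble.
CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def column(msa, k):
    """Return column k of an alignment given as equal-length strings."""
    return [row[k] for row in msa]


def pair_compatibility(col_i, col_j):
    """Fraction of rows whose two residues form a canonical pair.

    Gaps ('-') never pair, so they count against compatibility.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must come from alignments with equal row counts")
    if not col_i:
        return 0.0
    hits = sum((a.upper(), b.upper()) in CANONICAL for a, b in zip(col_i, col_j))
    return hits / len(col_i)


if __name__ == "__main__":
    # Toy alignments (made up); row r of msa_a and row r of msa_b are
    # assumed to come from the same organism.
    msa_a = ["GGAC", "GAAC", "G-AC"]
    msa_b = ["GUCC", "GUUC", "GUCC"]
    print(pair_compatibility(column(msa_a, 0), column(msa_b, 2)))
    print(pair_compatibility(column(msa_a, 1), column(msa_b, 2)))
```

In the second call, the first two rows change both residues together (G-C becomes A-U), which is the kind of covariation signal that a single sequence pair cannot show; the third row has a gap and therefore lowers the score.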

📄 Full Content

arXiv:1003.3987v3 [math-ph] 14 Jul 2010

BIOINFORMATICS Vol. 00 no. 00 Pages 1–8

RNA-RNA interaction prediction based on multiple sequence alignments

Andrew X. Li 1, Manja Marz 2, Jing Qin 3, Christian M. Reidys 1,4,∗

1 Center for Combinatorics, LPMC-TJKLC, Nankai University, Tianjin 300071, P.R. China
2 RNA Bioinformatics Group, Philipps-University Marburg, Marbacher Weg 6, 34037 Marburg, Germany
3 Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, D-04103 Leipzig, Germany
4 College of Life Science, Nankai University, Tianjin 300071, P.R. China
∗ To whom correspondence should be addressed. Phone: *86-22-2350-6800; Fax: *86-22-2350-9272; duck@santafe.edu

Received on *****; revised on *****; accepted on *****
Associate Editor: *****

ABSTRACT

Motivation: Many computational methods for RNA-RNA interaction structure prediction have been developed. Recently, $O(N^6)$ time and $O(N^4)$ space dynamic programming algorithms have become available that compute the partition function of RNA-RNA interaction complexes. However, few of these methods incorporate knowledge about related sequences, so relevant evolutionary information is often neglected in structure determination. It is therefore of considerable practical interest to introduce a method that takes into account both thermodynamic stability and sequence covariation.

Results: We present the a priori folding algorithm ripalign, whose input consists of two (given) multiple sequence alignments (MSA). ripalign outputs (1) the partition function, (2) base-pairing probabilities, (3) hybrid probabilities and (4) a set of Boltzmann-sampled suboptimal structures consisting of canonical joint structures that are compatible with the alignments. Compared to the single sequence-pair folding algorithm rip, ripalign requires negligible additional memory. Furthermore, we incorporate possible structure constraints as input parameters into our algorithm.

Availability: The algorithm described here is implemented in C as part of the rip package. The supplemental material, source code and input/output files can be freely downloaded from http://www.combinatorics.cn/cbpc/ripalign.html.

Contact: Christian Reidys, duck@santafe.edu

Keywords: multiple sequence alignment, RNA-RNA interaction, joint structure, dynamic programming, partition function, base pairing probability, hybrid, loop, RNA secondary structure.

1 INTRODUCTION

RNA-RNA interactions play a major role at many different levels of cellular metabolism, such as plasmid replication control, viral encapsidation, or transcriptional and translational regulation. With the discovery that a large number of transcripts in higher eukaryotes are noncoding RNAs, RNA-RNA interactions in cellular metabolism are gaining in prominence. Typical examples of interactions involving two RNA molecules are snRNAs (Forne et al., 1996); snoRNAs with their targets (Bachellerie et al., 2002); micro-RNAs from the RNAi pathway with their mRNA target (Ambros, 2004; Murchison and Hannon, 2004); sRNAs from Escherichia coli (Hershberg et al., 2003; Repoila et al., 2003); and sRNA loop-loop interactions (Brunel et al., 2003). The common feature in many ncRNA classes, especially prokaryotic small RNAs, is the formation of RNA-RNA interaction structures that are much more complex than simple sense-antisense interactions.

As is the case for the general RNA folding problem with unrestricted pseudoknots (Akutsu, 2000), the RNA-RNA interaction problem (RIP) is NP-complete in its most general form (Alkan et al., 2006; Mneimneh, 2009). However, polynomial-time algorithms can be derived by restricting the space of allowed configurations in ways that are similar to pseudoknot folding algorithms (Rivas and Eddy, 1999).

The simplest approach concatenates the two interacting sequences and subsequently employs a slightly modified standard secondary structure folding algorithm. The algorithms RNAcofold (Hofacker et al., 1994; Bernhart et al., 2006), pairfold (Andronescu et al., 2005), and NUPACK (Ren et al., 2005) subscribe to this strategy. A major shortcoming of this approach is that it cannot predict important motifs such as kissing-hairpin loops. The paradigm of concatenation has also been generalized to the pseudoknot folding algorithm of Rivas and Eddy (1999). The resulting model, however, still does not generate all relevant interaction structures (Chitsaz et al., 2009b).

An alternative line of thought is to neglect all internal base-pairings in either strand and to compute the minimum free energy (MFE) secondary structure for their hybridization under this constraint. For instance, RNAduplex and RNAhybrid (Rehmsmeier et al., 2004) follow this line of thought. RNAup (Mückstein et al., 2006, 2008) and intaRNA (Busch et al., 2008) restrict interactions to a single interval that remains unpaired in the secondary structure for each partner. These models have proved particularly useful for bacterial sRNA/mRNA interactions (Geissmann

…(Full text truncated)…
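The hybridization-only strategy mentioned in the introduction (ignore all base pairs within each strand and optimize only the pairs between the two strands) can be sketched with a small dynamic program. The version below is a toy under stated assumptions: it maximizes the number of canonical intermolecular pairs instead of minimizing a free energy, it puts no penalty on bulges or interior loops, and the function name `max_duplex_pairs` is made up for this example. It does not reproduce RNAduplex, RNAhybrid or any other tool named in the text.

```python
# Toy duplex model (assumption: base-pair count stands in for free energy).
# Both strands are written 5'->3'. Because a duplex is antiparallel, the
# second strand is reversed so that nested intermolecular pairs become
# order-preserving, which turns the problem into an LCS-style recursion.

CANONICAL = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def max_duplex_pairs(x, y):
    """Maximum number of non-crossing canonical pairs between x and y.

    No intramolecular pairs are considered, mirroring the
    'neglect all internal base-pairings' restriction.
    """
    y_rev = y[::-1]
    n, m = len(x), len(y_rev)
    # dp[i][j]: best pair count using x[:i] and y_rev[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Leave x[i-1] or y_rev[j-1] unpaired.
            best = max(dp[i - 1][j], dp[i][j - 1])
            # Or pair them, if the two bases are complementary.
            if (x[i - 1], y_rev[j - 1]) in CANONICAL:
                best = max(best, dp[i - 1][j - 1] + 1)
            dp[i][j] = best
    return dp[n][m]


if __name__ == "__main__":
    print(max_duplex_pairs("GGGAAA", "UUUCCC"))
```

The table has $O(NM)$ entries for strands of length $N$ and $M$, which is why this restricted model is so much cheaper than the $O(N^6)$ time joint-structure algorithms mentioned in the abstract; the price is that structure inside each strand, and motifs that depend on it, cannot be represented.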

Reference

This content is AI-processed based on ArXiv data.
