Reconstruction of metabolic networks from high-throughput metabolite profiling data: in silico analysis of red blood cell metabolism


📝 Original Info

  • Title: Reconstruction of metabolic networks from high-throughput metabolite profiling data: in silico analysis of red blood cell metabolism
  • ArXiv ID: 0706.2007
  • Date: 2007-11-19
  • Authors: Researchers from original ArXiv paper

📝 Abstract

We investigate the ability of algorithms developed for reverse engineering of transcriptional regulatory networks to reconstruct metabolic networks from high-throughput metabolite profiling data. For this, we generate synthetic metabolic profiles for benchmarking purposes based on a well-established model for red blood cell metabolism. A variety of data sets is generated, accounting for different properties of real metabolic networks, such as experimental noise, metabolite correlations, and temporal dynamics. These data sets are made available online. We apply ARACNE, a mainstream transcriptional networks reverse engineering algorithm, to these data sets and observe performance comparable to that obtained in the transcriptional domain, for which the algorithm was originally designed.

📄 Full Content

In recent years, high-throughput (HTP) microarray profiling has generated large data sets that characterize the simultaneous activities of essentially all genes in a cell. These data sets have been used successfully to reverse engineer (RE) cellular transcriptional regulatory networks (see, for example, [1][2][3] for a collection of references). Similar experimental progress is expected in the emerging field of metabolomics, where sensitive HTP measurements of (relative or absolute) concentrations of many metabolites in a sample of cells are now possible in different preparations, under various experimental interventions and/or steady-state growth conditions [4][5][6]. Anticipating the resulting data sets, there is strong interest in developing computational tools that, unlike more traditional approaches based on sequence information [7] or chemical reactivity and conservation laws [8,9], would use the relevant HTP data to expand our knowledge of metabolic networks, which, extensive as it is, remains incomplete. Because metabolic networks share features with transcriptional regulatory networks, it is tempting to transfer successful methods developed in the context of transcriptional networks, such as those in [1][2][3][10], to the inference of metabolic networks.

An obvious advantage of transferring these methods is that the very extensive RE code base requires only minimal modifications. On the other hand, it is not obvious that the existing methods will perform well on metabolic networks. Indeed, despite the superficial similarity, metabolic and transcriptional networks are quite different. In the transcriptional case, a transcription factor (TF), the parent, changes the expression of its target gene, the child, without any direct effect on its own activity. This leads to correlations between the expression of TFs and that of their targets, and these can be readily discovered by various statistical techniques. In metabolism, by contrast, a substrate (the parent) is transformed into a product (the child). Thus, an increase in the child's abundance comes at the cost of a decrease in the abundance of the parent. We therefore expect that the statistical associations in metabolic data will differ from those in gene expression data sets in unknown ways. Furthermore, experimental noise tends to mask interactions of low-mean or low-variance species. This has been a problem even in transcriptional analysis (e.g., spurious interactions in the ribosomal complex in [1]), where the expression levels and the characteristic time scales for reaching steady state are largely uniform across all genes. In a metabolic network, by contrast, kinetic rates can vary over many orders of magnitude for different species, so the time required for an organism to reach a metabolic steady state can range from milliseconds to hundreds of hours [11]. Moreover, many metabolites are short-lived and low-abundance, and a "fully expressed" metabolite can mean anywhere from a few molecules to a few million molecules per cell, which makes consideration of the measurement noise very important.
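The contrast between the two kinds of parent-child coupling can be illustrated with a toy simulation. The sketch below is not the RBC kinetic model and is not taken from the paper; the linear TF-target relation, the fixed or varying substrate pool, the conversion fraction, and all function names and numerical ranges are assumptions chosen only for illustration. It shows that, under mass conservation, the sign of the substrate-product correlation depends on which source of variability dominates across samples.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def transcriptional_samples(n, rng):
    """Toy TF -> target pair: the child follows the parent, the parent is unaffected."""
    parents = [rng.uniform(1.0, 10.0) for _ in range(n)]
    children = [2.0 * p + rng.gauss(0.0, 1.0) for p in parents]
    return parents, children


def metabolic_samples(n, rng, vary_pool, vary_conversion):
    """Toy substrate -> product pair: the product is made at the substrate's expense."""
    parents, children = [], []
    for _ in range(n):
        pool = rng.uniform(1.0, 10.0) if vary_pool else 5.0        # total material (assumed)
        frac = rng.uniform(0.1, 0.9) if vary_conversion else 0.5   # converted fraction (assumed)
        parents.append(pool * (1.0 - frac))
        children.append(pool * frac)
    return parents, children


rng = random.Random(0)
r_tf = pearson(*transcriptional_samples(500, rng))
r_flux = pearson(*metabolic_samples(500, rng, vary_pool=False, vary_conversion=True))
r_pool = pearson(*metabolic_samples(500, rng, vary_pool=True, vary_conversion=False))

print(f"TF -> target:                         r = {r_tf:+.2f}")
print(f"substrate -> product, flux varies:    r = {r_flux:+.2f}")
print(f"substrate -> product, pool varies:    r = {r_pool:+.2f}")
```

In this toy setting the transcriptional pair is positively correlated, while the metabolic pair is anticorrelated when only the conversion fraction varies and positively correlated when only the pool size varies. Real metabolic data mix both effects, together with noise and dynamics, which is one way to read the statement that the associations differ "in unknown ways".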

Because of these differences between transcription and metabolism, the fidelity of standard transcriptional RE algorithms on metabolic networks cannot be assumed. It is therefore useful to test these methods on benchmark data that resemble real metabolic measurements and for which the ground-truth structure of the network is known. We are unaware of any experimental data sets of this kind, and therefore we turn to numerical simulations. However, existing synthetic data sets have focused on realistic modeling of transcriptional regulation [12], and they may not represent metabolism well. Therefore, in this work, we generate synthetic benchmark metabolic data using a well-established kinetic model of red blood cell (RBC) metabolism [11], which involves 39 metabolites connected by 44 individual reactions. These data have been made publicly available at http://www.menem.com/~ilya/wiki/index.php/RBC_Metabolic_Network. We then use ARACNE, a modern transcriptional network RE algorithm that was developed and validated for gene expression analysis [10], to infer metabolic interactions from these synthetic metabolic data, and we argue that its performance is comparable to that in the transcriptional case. This outcome suggests that other transcriptional HTP-based RE algorithms might also be transferred to the domain of metabolism with minimal changes.
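For readers unfamiliar with ARACNE, its core idea is to score every pair of variables by mutual information and then prune the weakest edge of each fully connected triplet using the data processing inequality. The sketch below illustrates that idea only; it is not the ARACNE implementation and not the authors' pipeline. It swaps in a simple equal-frequency binning estimator for mutual information (ARACNE itself uses a kernel-based estimator), and the threshold, tolerance, bin count, function names, and the three-species chain A → B → C are all hypothetical choices made for the example.

```python
import math
import random
from itertools import combinations


def rank_bins(values, n_bins):
    """Assign each value to one of n_bins equal-frequency bins by rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * n_bins // len(values)
    return bins


def mutual_information(xs, ys, n_bins=8):
    """Plug-in mutual information estimate (in nats) from binned data."""
    n = len(xs)
    bx, by = rank_bins(xs, n_bins), rank_bins(ys, n_bins)
    joint = {}
    for a, b in zip(bx, by):
        joint[(a, b)] = joint.get((a, b), 0) + 1
    px = [bx.count(a) / n for a in range(n_bins)]
    py = [by.count(b) / n for b in range(n_bins)]
    return sum(
        (c / n) * math.log((c / n) / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def aracne_like(data, mi_threshold=0.05, tolerance=0.0):
    """Keep edges above mi_threshold, then drop the weakest edge of every triangle."""
    names = list(data)
    mi = {
        frozenset(pair): mutual_information(data[pair[0]], data[pair[1]])
        for pair in combinations(names, 2)
    }
    edges = {edge for edge, value in mi.items() if value > mi_threshold}
    removed = set()
    for a, b, c in combinations(names, 3):
        triangle = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if all(edge in edges for edge in triangle):
            weakest = min(triangle, key=mi.__getitem__)
            others = [mi[edge] for edge in triangle if edge != weakest]
            # Data processing inequality: an indirect link carries the least information.
            if mi[weakest] < min(others) * (1.0 - tolerance):
                removed.add(weakest)
    return edges - removed


# Hypothetical chain A -> B -> C with additive Gaussian noise at each step.
rng = random.Random(1)
a_levels = [rng.gauss(0.0, 1.0) for _ in range(2000)]
b_levels = [v + rng.gauss(0.0, 0.5) for v in a_levels]
c_levels = [v + rng.gauss(0.0, 0.5) for v in b_levels]

network = aracne_like({"A": a_levels, "B": b_levels, "C": c_levels})
print(sorted(tuple(sorted(edge)) for edge in network))
```

On this chain the indirect A-C association is the weakest of the three and is the one pruned, leaving the two direct links. Whether the same pruning logic holds up on metabolite profiles, where substrates are consumed to make products, is precisely what the benchmark described here is meant to test.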

In generating synthetic benchmark data, our goal is not to accurately simulate a real system. Rather, our goal is to exercise transcriptional RE algorithms by generating data that are complex enough to incorporate different features of metabolism (dynamic ranges, temporal properties, correlations among chemical species, noise, etc.), yet simple enough to analyze in detail. Specifically, we generated four data sets to account fo

…(Full text truncated)…


