The Stanford RNA Mapping Database for sharing and visualizing RNA structure mapping experiments
We have established an RNA Mapping Database (RMDB) to enable a new generation of structural, thermodynamic, and kinetic studies from quantitative single-nucleotide-resolution RNA structure mapping (freely available at http://rmdb.stanford.edu). Chemical and enzymatic mapping is a rapid, robust, and widespread approach to RNA characterization. Since its recent coupling with high-throughput sequencing techniques, accelerated software pipelines, and large-scale mutagenesis, the volume of mapping data has greatly increased, and there is a critical need for a database to enable sharing, visualization, and meta-analyses of these data. Through its on-line front-end, the RMDB allows users to explore single-nucleotide-resolution chemical accessibility data in heat-map, bar-graph, and colored secondary structure graphics; to leverage these data to generate secondary structure hypotheses; and to download the data in standardized and computer-friendly files, including the RDAT and community-consensus SNRNASM formats. At the time of writing, the database houses 38 entries, describing 2659 RNA sequences and comprising 355,084 data points, and is growing rapidly.
💡 Research Summary
The paper presents the Stanford RNA Mapping Database (RMDB), a centralized web‑based repository designed to store, share, visualize, and analyze high‑throughput RNA structure‑mapping data. RNA chemical and enzymatic probing has become a routine technique for interrogating RNA secondary and tertiary structures, especially after its coupling with capillary electrophoresis and next‑generation sequencing (NGS). The rapid increase in data volume created a need for a standardized, searchable, and visual platform. The authors argue that the decentralized SNRNASM/ISA‑TAB approach does not adequately meet this need because it lacks integrated visualization tools and rigorous validation.
RMDB addresses these gaps by providing a dynamic front‑end where each experiment is represented as an M × N matrix (M experiments, N nucleotides). Users can explore data through heat‑maps (dark gray indicating higher reactivity), bar plots, and interactive secondary‑structure drawings generated with VARNA. Clicking a row displays the corresponding structure with experimental bonuses overlaid, and the native structure is shown when available. All visualizations are rendered in real‑time as SVG, enabling smooth interaction in modern browsers.
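To make the heat‑map idea concrete, here is a minimal Python sketch of how an M × N reactivity matrix could be mapped to gray levels so that higher reactivity appears darker. It is an illustration only, not RMDB's rendering code: the function name `reactivity_to_gray`, the linear scaling to the matrix maximum, and the example values are all assumptions.

```python
from typing import List


def reactivity_to_gray(matrix: List[List[float]]) -> List[List[int]]:
    """Map an M x N reactivity matrix to 8-bit gray levels (0 = black, 255 = white).

    Higher reactivity gives a darker pixel. The linear scaling against the
    matrix maximum and the clipping of negative values to zero are
    assumptions made for this sketch.
    """
    peak = max((v for row in matrix for v in row), default=0.0)
    if peak <= 0:
        # No positive signal anywhere: render everything white.
        return [[255 for _ in row] for row in matrix]
    gray = []
    for row in matrix:
        gray.append(
            [round(255 * (1 - min(max(v, 0.0), peak) / peak)) for v in row]
        )
    return gray


# Two experiments (rows) over four nucleotides (columns); values are made up.
example = [[0.0, 0.5, 1.0, 0.25],
           [0.1, 0.0, 2.0, 1.0]]
gray_levels = reactivity_to_gray(example)
```

Each row of `gray_levels` corresponds to one experiment and each column to one nucleotide, matching the row‑per‑experiment layout described above.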
Data are stored in the RDAT (RNA Data) format, a three‑section text file: a General section (version, global annotations), a Construct section (RNA name, sequence, putative secondary structure, solution conditions, free‑text comments), and a Data section (per‑experiment ANNOTATION_DATA and REACTIVITY lines). The annotation system is hierarchical: global annotations are inherited by constructs and can be overridden locally. Optional fields allow offset indexing, explicit sequence positions, and mutation positions. RDAT can also embed raw electropherogram traces (TRACE) and peak selections (XSEL), preserving information that ISA‑TAB typically discards. The database schema mirrors the RDAT structure, simplifying ingestion, validation, and export. Each entry receives a unique, human‑readable ID (e.g., TRP4P6_SHP_0003) and version number, supporting updates and community contributions.
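The hierarchical annotation scheme can be sketched in a few lines of Python. This is a hedged illustration of the inheritance rule only, not the RDAT parser: the function name `resolve_annotations`, the third (per‑experiment) level, and the annotation keys and values are hypothetical.

```python
from typing import Dict


def resolve_annotations(global_ann: Dict[str, str],
                        construct_ann: Dict[str, str],
                        data_ann: Dict[str, str]) -> Dict[str, str]:
    """Merge annotations so that more local levels override more global ones.

    Global values are inherited unless a construct-level value replaces them.
    The per-experiment level is an assumption added for illustration.
    """
    resolved = dict(global_ann)      # start from the inherited global values
    resolved.update(construct_ann)   # construct-level values override global
    resolved.update(data_ann)        # per-experiment values override both
    return resolved


# Hypothetical keys and values, chosen only to show the override behavior.
resolved = resolve_annotations(
    {"modifier": "SHAPE", "temperature": "24C"},
    {"temperature": "37C"},
    {"mutation": "G10C"},
)
```

Here `temperature` is overridden at the construct level, `modifier` is inherited unchanged from the global level, and `mutation` exists only for the individual experiment.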
At the time of writing, RMDB contains 38 entries covering 2,659 distinct RNA molecules and 355,084 individual data points, drawn from diverse sources such as riboswitches, ribozymes, ribosomal domains, tRNAs, and synthetic sequences from the EteRNA project. Most data were generated using 96‑well capillary electrophoresis coupled with the HiTRACE pipeline; two entries derive from Illumina paired‑end NGS processed with the SHAPE‑Seq protocol. Users may upload new data in RDAT format; submissions undergo registration, curation, and validation before becoming publicly visible.
Beyond data storage, RMDB offers analysis tools. A secondary‑structure prediction server integrates experimental reactivity as pseudo‑energy bonuses using the RNAstructure (v5.3) package. Bonuses can be one‑dimensional (single‑value per nucleotide) or two‑dimensional (mutate‑and‑map matrices). The server also supports data normalization and bootstrapping. All server‑side and client‑side code is open‑source under the GNU GPL. The authors bundled the functionality into the RDATkit, a Python/Matlab toolkit that parses RDAT and ISA‑TAB files, applies bonuses, runs structure prediction algorithms, and generates visualizations via VARNA and matplotlib.
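As an illustration of one‑dimensional pseudo‑energy bonuses, the sketch below converts per‑nucleotide reactivities into energy terms using the widely used log‑linear form ΔG(i) = m·ln(reactivity(i) + 1) + b. The summary does not state which formula or parameters the RMDB server applies, so this form, the default values m = 2.6 and b = −0.8 kcal/mol, the function name `pseudo_energy`, and the treatment of missing or negative values are all assumptions.

```python
import math
from typing import List, Optional


def pseudo_energy(reactivities: List[Optional[float]],
                  slope: float = 2.6,
                  intercept: float = -0.8) -> List[float]:
    """Convert normalized reactivities to per-nucleotide pseudo-energies (kcal/mol).

    Uses slope * ln(reactivity + 1) + intercept. Missing values (None) get a
    bonus of 0.0 and negative reactivities are clipped to zero; both choices
    are assumptions made for this sketch.
    """
    bonuses = []
    for r in reactivities:
        if r is None:
            bonuses.append(0.0)
        else:
            bonuses.append(slope * math.log(max(r, 0.0) + 1.0) + intercept)
    return bonuses


# Made-up reactivities: unreactive, missing, and highly reactive positions.
bonuses = pseudo_energy([0.0, None, math.e - 1.0])
```

Under these assumed parameters, unreactive nucleotides receive a negative (stabilizing) term when paired, while highly reactive nucleotides receive a positive penalty, which biases a prediction toward leaving them unpaired.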
Implementation relies on the Django web framework (Python) for server logic, Apache 2.2 for web serving, MySQL 14.1 for data storage, and client‑side JavaScript libraries (jQuery, D3, protovis) for interactive graphics. Pre‑generated thumbnails and images accelerate browsing. The system requires a modern browser with SVG support; it has been tested on Firefox 4+, Chrome 6+, Safari 5+, and Internet Explorer 8 with appropriate plugins.
In discussion, the authors emphasize the database’s value for structural biologists, bioinformaticians, and the broader RNA community. They note that while the current content focuses on chemical modification probing, the architecture can readily accommodate enzymatic cleavage or hydroxyl radical footprinting data. The rapid influx of data from projects like EteRNA (8–16 new sequences per week) and the scaling of multiplexed capillary electrophoresis and NGS promise exponential growth. By providing a curated publishing endpoint, visualization suite, and analysis toolkit, RMDB complements rather than competes with decentralized repositories, completing the data lifecycle from acquisition to hypothesis testing. Future directions include expanding assay types, benchmarking modern versus conventional modifiers, and exploring automated extraction of tertiary‑structure information from high‑throughput mapping data. Overall, RMDB represents a robust, extensible platform that centralizes RNA mapping experiments, fostering reproducibility, data reuse, and novel insights into RNA structure and function.