Integrated Data Acquisition, Storage, Retrieval and Processing Using the COMPASS DataBase (CDB)
We present a complex data handling system for the COMPASS tokamak, operated by IPP ASCR Prague, Czech Republic [1]. The system, called CDB (Compass DataBase), integrates diverse data sources, since an assortment of data acquisition hardware and software from different vendors is in use. Based on widely available open-source technologies wherever possible, CDB is vendor- and platform-independent and can be easily scaled and distributed. Data are stored on and retrieved directly from a standard NAS (Network Attached Storage), so the system does not depend on a particular storage technology; the description of the data (the metadata) is recorded in a relational database. The database structure is general and supports multi-dimensional data signals in multiple revisions (no data are overwritten). This design is inherently distributed, as the work is off-loaded to the clients. Both the NAS and the database can be implemented and optimized for fast local access as well as secure remote access. CDB is implemented in Python; bindings for Java, C/C++, IDL and Matlab are provided. Independent data acquisition systems as well as nodes managed by FireSignal [2] are all integrated using CDB. An automated data post-processing server is part of CDB: based on dependency rules, it executes the prescribed post-processing tasks, in parallel where possible.
💡 Research Summary
The paper presents CDB (Compass DataBase), a comprehensive data acquisition, storage, retrieval, and processing framework developed for the COMPASS tokamak at IPP ASCR Prague. The work was motivated by the growing volume of diagnostic data (≈2 GB per discharge) and by three limitations of the earlier FireSignal‑based architecture: central server bottlenecks, hardware‑centric signal identification, and lack of version control. In response, the authors designed a system that separates metadata from bulk numerical data and relies on open‑source components wherever possible.
Metadata (signal definitions, units, axes, record numbers, revisions, etc.) are stored in a relational database (MySQL by default), while the actual multi‑dimensional datasets reside on a standard Network Attached Storage (NAS) as HDF5 files. The data model distinguishes “generic signals” (abstract descriptions of physical quantities) from “data signals” (concrete instances tied to a specific discharge record and revision). Axes are themselves generic signals, enabling consistent handling of multi‑dimensional data. New data are never overwritten; instead a new revision is created, preserving full history and ensuring data integrity.
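The split between generic signals and revisioned data signals can be illustrated with a minimal sketch. It is not CDB's actual schema: the table names, column names, and helper functions are hypothetical, and an in-memory SQLite database stands in for MySQL so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative, not CDB's own.
SCHEMA = """
CREATE TABLE generic_signals (
    id    INTEGER PRIMARY KEY,
    name  TEXT UNIQUE NOT NULL,
    units TEXT
);
CREATE TABLE data_signals (
    record_number     INTEGER NOT NULL,
    generic_signal_id INTEGER NOT NULL REFERENCES generic_signals(id),
    revision          INTEGER NOT NULL,
    file_path         TEXT NOT NULL,
    PRIMARY KEY (record_number, generic_signal_id, revision)
);
"""


def open_db():
    """Create an in-memory database (SQLite stands in for MySQL here)."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_generic_signal(db, name, units):
    """Register the abstract description of a physical quantity."""
    cur = db.execute(
        "INSERT INTO generic_signals (name, units) VALUES (?, ?)", (name, units)
    )
    return cur.lastrowid


def add_revision(db, record_number, generic_signal_id, file_path):
    """Insert a new data signal row; existing revisions are never modified."""
    (last,) = db.execute(
        "SELECT COALESCE(MAX(revision), 0) FROM data_signals "
        "WHERE record_number = ? AND generic_signal_id = ?",
        (record_number, generic_signal_id),
    ).fetchone()
    db.execute(
        "INSERT INTO data_signals VALUES (?, ?, ?, ?)",
        (record_number, generic_signal_id, last + 1, file_path),
    )
    return last + 1


def latest(db, record_number, generic_signal_id):
    """Return (revision, file_path) of the newest revision, or None."""
    return db.execute(
        "SELECT revision, file_path FROM data_signals "
        "WHERE record_number = ? AND generic_signal_id = ? "
        "ORDER BY revision DESC LIMIT 1",
        (record_number, generic_signal_id),
    ).fetchone()
```

Because the revision number is part of the primary key, a correction to a stored signal becomes an additional row pointing at a new file, and earlier revisions stay retrievable.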
CDB’s core is implemented in Python for rapid development and extensive library support. Language bindings are provided via Jython, Cython, and Java, allowing seamless access from C/C++, Matlab, IDL, and other environments. Numerical data are read/written directly via native HDF5 APIs, bypassing the server and thus reducing load. Remote access is achieved through standard network mechanisms (SSH, VPN, SSL) without dedicated code; a JSON‑based web service further enables language‑agnostic interaction.
Integration with existing acquisition hardware is achieved by allowing each acquisition node to write its data directly to CDB, eliminating the central FireSignal data relay. The workflow for storing a signal involves obtaining a record number, referencing the generic signal, creating an HDF5 file, writing the data (including optional axes), closing the file, and finally registering the data signal and its axes in the metadata database. Convenience functions such as store_signal and update_signal encapsulate these steps.
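The order of the storing steps can be sketched as follows. This is an illustration under loud assumptions, not the real `store_signal`: the class `MetadataStub`, the function `store_signal_sketch`, and all parameter names are hypothetical; a JSON file stands in for the HDF5 file and a Python list stands in for the metadata database, so that only the standard library is needed.

```python
import json
import tempfile
from pathlib import Path


class MetadataStub:
    """In-memory stand-in for the metadata database (hypothetical)."""

    def __init__(self):
        self.last_record = 0
        self.data_signals = []  # one dict per registered data signal

    def new_record_number(self):
        self.last_record += 1
        return self.last_record

    def next_revision(self, record, generic_name):
        existing = [d for d in self.data_signals
                    if d["record"] == record and d["generic"] == generic_name]
        return len(existing) + 1

    def register(self, record, generic_name, revision, path, axes):
        entry = {"record": record, "generic": generic_name,
                 "revision": revision, "path": path, "axes": axes}
        self.data_signals.append(entry)
        return entry


def store_signal_sketch(meta, storage_dir, record, generic_name,
                        values, time_axis=None):
    """Write the data file first, then register it in the metadata."""
    revision = meta.next_revision(record, generic_name)
    path = Path(storage_dir) / f"{generic_name}_{record}_r{revision}.json"

    payload = {"data": list(values)}
    if time_axis is not None:
        payload["time_axis"] = list(time_axis)  # optional axis stored alongside

    # Mode "x" fails if the file already exists, so nothing is overwritten.
    with open(path, "x") as fh:
        json.dump(payload, fh)
    # The file is closed here, before the metadata entry is created.

    axes = ["time_axis"] if time_axis is not None else []
    return meta.register(record, generic_name, revision, str(path), axes)
```

Registering the metadata last means that a reader who finds a data signal in the database can expect its file to be complete. A typical call would obtain `meta.new_record_number()` once per discharge and then invoke `store_signal_sketch` for each signal, with a `tempfile.TemporaryDirectory()` serving as the storage directory in a test.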
An automated post‑processing server, driven by user‑defined dependency rules, executes prescribed analysis tasks in parallel when possible, streamlining the data‑to‑insight pipeline.
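One way such dependency-driven scheduling could work is sketched below. The summary does not describe how CDB's server is implemented, so this is only an assumption-laden illustration: the function `run_tasks` and the shapes of `rules` and `actions` are hypothetical, and the scheduling relies on Python's standard `graphlib` and `concurrent.futures` modules (Python 3.9 or later).

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from graphlib import TopologicalSorter


def run_tasks(rules, actions, max_workers=4):
    """Run tasks so that each starts only after its dependencies finish.

    rules:   dict mapping a task name to the set of task names it depends on
    actions: dict mapping every task name to a zero-argument callable
    Returns the task names in the order in which they completed.
    """
    sorter = TopologicalSorter(rules)
    sorter.prepare()  # raises graphlib.CycleError if the rules are cyclic

    finished = []
    running = {}  # future -> task name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # Every task whose dependencies are done can start now,
            # so independent tasks run in parallel.
            for task in sorter.get_ready():
                running[pool.submit(actions[task])] = task

            done, _ = wait(running, return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise any exception from the task
                task = running.pop(future)
                finished.append(task)
                sorter.done(task)  # may unlock dependent tasks
    return finished
```

For example, with rules `{"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}`, task `a` runs first, `b` and `c` may then run concurrently, and `d` starts only after both have completed.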
Overall, CDB delivers a scalable, vendor‑independent, and extensible solution that addresses the demanding data management needs of modern tokamak experiments, while remaining portable to other pulsed‑device facilities.