Astrophysics in S.Co.P.E.
S.Co.P.E. is one of the four projects funded by the Italian Government to provide Southern Italy with a distributed computing infrastructure for fundamental science. Besides building the infrastructure, S.Co.P.E. is also actively pursuing research in several areas, including astrophysics and observational cosmology. We briefly summarize the most significant results obtained in the first two years of the project, related to the development of middleware and Data Mining tools for the Virtual Observatory.
💡 Research Summary
The paper presents the first two years of results from the S.Co.P.E. project, an Italian government‑funded initiative that builds a metropolitan GRID infrastructure in Southern Italy and couples it with the International Virtual Observatory (VO). The authors focus on two main technical achievements: the development of a middleware component called GRID‑Launcher v1.0 that bridges the UK‑ASTROGRID Virtual Organization with the S.Co.P.E. GRID, and the creation of the VO‑Neural data‑mining framework, which implements supervised machine‑learning algorithms (Multi‑Layer Perceptron and Support Vector Machine) suitable for massive astronomical catalogs (>1 TB).
GRID‑Launcher v1.0 solves the authentication mismatch that prevents most VO users from accessing GRID resources. It automatically converts user inputs from the ASTROGRID desktop into three scripts (task.sh, task.jdl, wn_runner.sh), packages them, and submits them to the S.Co.P.E. UI using an INFN‑GRID certificate. The UI unpacks the payload, stores input data on a Storage Element, dispatches the execution scripts to Worker Nodes, and finally retrieves output files via the Output Sandbox, depositing results into the VO‑Space. This workflow has been tested with several applications, including the VO‑Neural MLP and SVM tools, SExtractor, and SWarp.
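The job-description step of this workflow can be illustrated with a minimal sketch. It is not the GRID‑Launcher code: the summary does not describe the contents of task.jdl, so the attribute set below is only an assumption based on the commonly used gLite JDL attributes (Executable, StdOutput, StdError, InputSandbox, OutputSandbox). The helper name build_jdl and the file names std.out, std.err and result.tar.gz are hypothetical.

```python
def build_jdl(executable, input_files, output_files):
    """Render a minimal gLite-style JDL job description as text.

    Hypothetical helper: the real task.jdl produced by GRID-Launcher
    may carry different or additional attributes.
    """
    def quoted_list(names):
        # JDL lists are written as {"a", "b", ...}
        return "{" + ", ".join('"%s"' % n for n in names) + "}"

    # The executable itself must travel in the input sandbox.
    in_sandbox = [executable] + list(input_files)
    # Standard streams are returned together with the declared outputs.
    out_sandbox = ["std.out", "std.err"] + list(output_files)

    lines = [
        'Executable = "%s";' % executable,
        'StdOutput = "std.out";',
        'StdError = "std.err";',
        "InputSandbox = %s;" % quoted_list(in_sandbox),
        "OutputSandbox = %s;" % quoted_list(out_sandbox),
    ]
    return "\n".join(lines) + "\n"


# Example: task.sh is the entry point and wn_runner.sh runs on the Worker Node.
JDL_TEXT = build_jdl("task.sh", ["wn_runner.sh"], ["result.tar.gz"])
print(JDL_TEXT)
```

In this sketch the Output Sandbox lists the files that the UI would later retrieve and deposit into the VO‑Space. Large input data would go to a Storage Element instead of the sandbox, as the workflow above describes.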
VO‑Neural is an evolution of the earlier AstroNeural project. It provides a VO‑compliant, distributed environment for training and applying machine‑learning models on large sky surveys. The MLP implementation relies on the Fast Artificial Neural Networks (FANN) library written in C++, using random weight initialization and gradient‑descent learning with a validation set to avoid over‑fitting. The SVM implementation uses the LIBSVM library with a radial‑basis‑function kernel; the hyper‑parameters C and γ are explored on a logarithmic grid, a process that is computationally intensive but well suited to parallel execution on a GRID. Both algorithms have been integrated into the GRID‑Launcher so that parameter sweeps can be run on 110 worker nodes spread across the three PON sites (Napoli, Catania, Cagliari).
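The logarithmic grid over C and γ is easy to parallelize because every (C, γ) pair can be trained and validated independently. The sketch below shows one way to enumerate such a grid and deal it out to worker nodes. The exponent ranges are an assumption taken from the defaults suggested in the LIBSVM practical guide, not values reported in the paper, and log_grid and split_round_robin are hypothetical helper names.

```python
from itertools import product


def log_grid(lo_exp, hi_exp, step, base=2.0):
    """Return base**e for e = lo_exp, lo_exp + step, ... up to hi_exp."""
    return [base ** e for e in range(lo_exp, hi_exp + 1, step)]


def split_round_robin(items, n_workers):
    """Deal items to n_workers buckets in round-robin order.

    Buckets that would stay empty (more workers than items) are dropped.
    """
    buckets = [items[i::n_workers] for i in range(n_workers)]
    return [b for b in buckets if b]


# Assumed ranges (LIBSVM guide defaults), not the ones used by the authors.
C_VALUES = log_grid(-5, 15, 2)       # 2^-5, 2^-3, ..., 2^15
GAMMA_VALUES = log_grid(-15, 3, 2)   # 2^-15, 2^-13, ..., 2^3

# Every (C, gamma) pair is one independent training/validation job.
GRID = list(product(C_VALUES, GAMMA_VALUES))

# One bucket of pairs per worker node (110 nodes in the text above).
JOBS = split_round_robin(GRID, 110)

print(len(C_VALUES), len(GAMMA_VALUES), len(GRID), len(JOBS))
```

Each bucket in JOBS would become one GRID job that trains an RBF-kernel SVM for its pairs and reports the validation accuracy. The best pair is then selected once all jobs have returned.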
To demonstrate scientific utility, the authors applied VO‑Neural to the classification of Active Galactic Nuclei (AGN) using photometric data from the Sloan Digital Sky Survey (SDSS). A “Base of Knowledge” (BoK) was constructed by merging two existing spectroscopic catalogs (Sorrentino et al. 2006; Kauffmann et al. 2003) that provide labels such as Seyfert 1, Seyfert 2, non‑AGN, and LINER. Input features included Petrosian radii in five bands, concentration index, fiber magnitude, dereddened colors, and photometric redshift. Three classification experiments were performed: (1) AGN vs. mixed objects, (2) Type 1 vs. Type 2 AGN, and (3) Seyfert vs. LINER. Results (Table 1) show that the MLP achieved 76 % accuracy for AGN vs. mixed and over 95 % for Type 1 vs. Type 2, while the SVM reached 74 % and 82 % respectively. The SVM’s grid search, although demanding, benefited from the GRID parallelism, delivering competitive performance within reasonable wall‑clock time. These outcomes illustrate that photometric‑only classification can approach, and in some cases surpass, traditional spectroscopic diagnostics (e.g., BPT diagrams).
The paper concludes with a roadmap for future development. Planned upgrades include GRID‑Launcher 2.0, which will employ robot certificates to allow any command‑line program to be launched on the GRID without user‑specific credentials. VO‑Neural will be exposed as a web application, and selected tools will be re‑engineered for GPU or multi‑core parallelism to further reduce execution time. Integration with VO‑STAT, a web‑based statistical analysis service, is also envisaged to provide robust exploratory data analysis and result validation. Project status and software releases are made available at http://voneural.na.infn.it/.
Overall, the work demonstrates that a regionally distributed GRID, when coupled with VO standards and modern machine‑learning middleware, can effectively support large‑scale astronomical data mining, thereby enhancing the scientific capabilities of Southern Italy and contributing valuable tools to the global Virtual Observatory community.