Designing a Linked Data Migrational Framework for Singapore Government Datasets


The subject area of this report is Linked Data and its application to the Government domain. Linked Data is an alternative method of data representation that aims to interlink data from varied sources through relationships. Governments around the world have started publishing their data in this format to help citizens make better use of public services. This report provides an eight-step migrational framework for converting Singapore Government data from legacy systems to Linked Data format. The framework is based on a study of the Singapore data ecosystem, carried out with help from the Infocomm Development Authority (iDA) of Singapore. Each step in the framework has been constructed with objectives, recommendations, best practices and issues, along with entry and exit points. This work builds on the existing Linked Data literature, implementations in other countries and cookbooks provided by Linked Data researchers. iDA can use this report to understand the effort and work involved in implementing a Linked Data system on top of its legacy systems. The framework can be evaluated by building a Proof of Concept (POC) application.


💡 Research Summary

The paper presents an eight‑step migration framework for converting Singapore government datasets from legacy formats into Linked Data (LD) representations. Grounded in a study of Singapore’s data ecosystem carried out with help from the Infocomm Development Authority (iDA), the framework addresses both technical and governance challenges.
The first step is a thorough assessment of existing data sources, schemas, quality issues, and policy constraints, which establishes clear migration objectives. Step two defines a uniform URI strategy to assign globally unique identifiers and map them to existing keys. In step three, the authors select and extend ontologies by combining international standards such as DCAT, FOAF, and SKOS with Singapore‑specific vocabularies (e.g., government service codes) to ensure semantic consistency. Step four focuses on data cleansing, de‑duplication, and semantic normalization, employing SHACL constraints for quality validation.
The transformation pipeline (step five) uses declarative mapping languages such as RML and SPARQL‑Generate, together with ETL tools, to convert CSV, XML, and relational database records into RDF triples. Step six establishes the storage and delivery infrastructure using a triple store and a Linked Data Platform (LDP) API gateway, enabling real‑time SPARQL queries and subscription mechanisms. Step seven covers governance and operations: metadata management, access‑control policies, maintenance organization, and key performance indicators such as data reuse rates and query latency. The final step implements a proof‑of‑concept (PoC) across three domains (transport, health, and housing), integrating the converted data into a citizen portal with visualization and search features. User feedback and performance metrics from the PoC are used to refine the framework.
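
The URI strategy in step two can be illustrated with a minimal sketch. It is not taken from the paper: the base namespace `https://data.example.gov.sg/id`, the URI pattern, and the helper names `mint_uri` and `build_key_map` are all assumptions made for illustration. The idea is only that each legacy primary key maps deterministically to one globally unique identifier.

```python
from typing import Dict, Iterable
from urllib.parse import quote

# Hypothetical base namespace; the source does not specify one.
BASE_URI = "https://data.example.gov.sg/id"


def mint_uri(entity_type: str, legacy_key: str) -> str:
    """Build a stable URI from an entity type and a legacy primary key."""
    # Normalise the type name into a lowercase, hyphen-separated slug.
    type_slug = "-".join(entity_type.lower().split())
    key = legacy_key.strip()
    if not type_slug or not key:
        raise ValueError("entity type and legacy key must be non-empty")
    # Percent-encode both parts so characters such as "/" or spaces
    # cannot change the path structure of the resulting URI.
    return f"{BASE_URI}/{quote(type_slug, safe='')}/{quote(key, safe='')}"


def build_key_map(entity_type: str, legacy_keys: Iterable[str]) -> Dict[str, str]:
    """Map each legacy key to its minted URI, for use as a lookup table."""
    return {key: mint_uri(entity_type, key) for key in legacy_keys}
```

Under these assumptions, `mint_uri("Bus Stop", "01012")` returns `https://data.example.gov.sg/id/bus-stop/01012`. Because the function is deterministic, re-running a migration yields the same identifiers, which is what allows existing keys to remain traceable after conversion.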
Throughout, each phase is detailed with objectives, recommendations, best practices, and entry/exit criteria, facilitating project management and risk mitigation. By building on prior LD literature and international government implementations, the framework offers iDA a practical roadmap for phased, sustainable adoption of Linked Data, ultimately enhancing Singapore’s open‑data ecosystem and enabling richer public‑service integration.
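
To make the transformation in step five concrete, the sketch below converts CSV rows into RDF triples serialized as N‑Triples. It deliberately swaps the mapping languages named in the summary (RML, SPARQL‑Generate) for plain Python with the standard library, so it shows the shape of the conversion rather than the tooling the framework recommends. The column names `stop_code` and `name`, the `example.gov.sg` namespaces, the `BusStop` class, and the sample rows are all made up for illustration; the `rdf:type`, `rdfs:label`, and `dcterms:identifier` predicates are standard vocabulary terms.

```python
import csv
import io
from typing import List
from urllib.parse import quote

# Standard vocabulary terms.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
DCT_IDENTIFIER = "http://purl.org/dc/terms/identifier"

# Hypothetical namespaces and class; not specified in the source.
ID_BASE = "https://data.example.gov.sg/id/bus-stop/"
BUS_STOP_CLASS = "https://data.example.gov.sg/def/BusStop"


def nt_literal(value: str) -> str:
    """Serialize a string as an N-Triples plain literal with escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def rows_to_ntriples(csv_text: str) -> List[str]:
    """Convert CSV rows with 'stop_code' and 'name' columns to N-Triples lines."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = row["stop_code"].strip()
        if not code:
            continue  # a row without a key cannot be given a URI
        subject = f"<{ID_BASE}{quote(code, safe='')}>"
        triples.append(f"{subject} <{RDF_TYPE}> <{BUS_STOP_CLASS}> .")
        triples.append(f"{subject} <{DCT_IDENTIFIER}> {nt_literal(code)} .")
        name = row["name"].strip()
        if name:  # emit a label only when the legacy record has one
            triples.append(f"{subject} <{RDFS_LABEL}> {nt_literal(name)} .")
    return triples


# Made-up sample data: the second row has no name.
SAMPLE = "stop_code,name\nS001,Example Stop A\nS002,\n"

if __name__ == "__main__":
    print("\n".join(rows_to_ntriples(SAMPLE)))
```

Keeping the legacy code as a `dcterms:identifier` literal alongside the minted URI is one way to preserve the link back to the source record. In a real pipeline the column-to-predicate mapping would more likely live in a declarative mapping file than in code.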

