From RESTful Services to RDF: Connecting the Web and the Semantic Web
RESTful services on the Web expose information through retrievable, self-describing resource representations, and through the way these resources are interlinked via the hyperlinks found in those representations. Because of this basic design, extracting the most useful information from a RESTful service requires understanding the service’s representations: both their semantics as descriptions of a resource, and their semantics as descriptions of that resource’s linkage with other resources. Based on the Resource Linking Language (ReLL), this paper describes a framework for how RESTful services can be described, and how these descriptions can then be used to harvest information from these services. Building on this framework, a layered model of RESTful service semantics makes it possible to represent a service’s information in RDF/OWL. Because REST is based on the linkage between resources, the same model can be used for aggregating and interlinking multiple services, extracting RDF data from sets of RESTful services.
💡 Research Summary
The paper addresses a fundamental gap between the World Wide Web’s RESTful services and the Semantic Web’s RDF‑based knowledge representation. While RESTful APIs expose resources through self‑describing representations (HTML, JSON, XML, etc.) and interlink them via hyperlinks, the semantics of those representations and links are typically implicit, making automated extraction of meaningful data difficult. To bridge this gap, the authors propose a comprehensive framework built on the Resource Linking Language (ReLL), a declarative XML schema that captures a service’s URI patterns, supported HTTP methods, response structures, and the relationships expressed by embedded hyperlinks.
The framework consists of three main stages. First, service designers author a ReLL description that formally specifies each resource class, its possible representations, and the link types that connect resources. This description can be created manually or partially generated from existing API specifications such as Swagger or RAML. Second, a generic crawler consumes the ReLL document, issues HTTP requests according to the defined URI templates, and collects the raw representations. Because the crawler follows the link definitions in ReLL, it can systematically explore the entire service graph without prior knowledge of the API’s internal navigation logic. Third, a transformation engine parses the collected representations and maps them into RDF triples. The mapping is organized into a three‑layer model:
- Representation Layer – raw data fields become RDF literals or blank nodes; for example, a JSON key/value pair is turned into a predicate‑object triple.
- Link Layer – each hyperlink identified by ReLL is converted into an RDF subject‑predicate‑object triple, where the predicate is drawn from existing vocabularies (e.g., schema.org, Dublin Core) or from a custom ontology.
- Ontology Layer – domain‑specific OWL classes are introduced, linking the previous layers into a coherent semantic model. Existing ontologies are reused via `owl:equivalentClass` or `rdfs:subClassOf`, and cross‑service equivalence is expressed with `owl:sameAs`.
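The representation and link layers can be sketched as a small transformation routine. This is a minimal illustration, not the paper's actual engine: the field names, the example resource, the `ex:`/`schema.org` predicate URIs, and the hard-coded link rule (which a real ReLL description would declare) are all hypothetical.

```python
# Hypothetical vocabulary prefixes for illustration; the paper draws
# predicates from vocabularies such as schema.org or Dublin Core.
EX = "http://example.org/vocab/"
SCHEMA = "http://schema.org/"

def to_triples(subject, doc, link_rules):
    """Map one JSON representation to RDF-style (s, p, o) tuples.

    link_rules plays the role of a (greatly simplified) ReLL link
    definition: it names which fields are hyperlinks and which
    predicate each link expresses.
    """
    triples = []
    for key, value in doc.items():
        if key in link_rules:
            # Link layer: the hyperlink becomes a triple whose object
            # is the URI of the linked resource.
            triples.append((subject, link_rules[key], value))
        elif not isinstance(value, (dict, list)):
            # Representation layer: a plain key/value pair becomes a
            # predicate-object triple with a literal object.
            triples.append((subject, EX + key, value))
    return triples

# Toy GitHub-like user resource (all fields invented).
subject = "https://api.example.org/users/alice"
user_doc = {
    "login": "alice",
    "name": "Alice Example",
    "repos_url": "https://api.example.org/users/alice/repos",
}
rules = {"repos_url": SCHEMA + "owns"}

triples = to_triples(subject, user_doc, rules)
for t in triples:
    print(t)
```

In the paper's pipeline the link rules come from the ReLL document rather than a literal dict, but the separation is the same: literals stay in the representation layer, hyperlinks are promoted to resource-to-resource triples.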
To support integration of multiple services, the ReLL schema includes namespace and version attributes. During RDF generation these attributes are used to disambiguate URIs, resolve conflicts, and automatically generate equivalence statements when the same real‑world entity appears under different service‑specific identifiers.
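The equivalence-generation step can be illustrated with a toy matcher. This is a deliberate simplification of the paper's namespace/version-based disambiguation: here entities are matched on a shared natural key, and all service names and URIs are invented.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def same_as_links(entities_by_service):
    """Given {service: {natural_key: uri}}, emit owl:sameAs triples
    between URIs that identify the same real-world entity.

    Matching on a shared natural key is a hypothetical stand-in for
    the ReLL namespace/version attributes used in the paper.
    """
    by_key = {}
    for service in sorted(entities_by_service):  # deterministic order
        for key, uri in entities_by_service[service].items():
            by_key.setdefault(key, []).append(uri)
    triples = []
    for uris in by_key.values():
        first, *rest = uris
        for other in rest:
            triples.append((first, OWL_SAME_AS, other))
    return triples

# The same person exposed under two service-specific identifiers.
entities = {
    "github": {"alice": "https://api.github.example/users/alice"},
    "dbpedia": {"alice": "http://dbpedia.example/resource/Alice"},
}
links = same_as_links(entities)
print(links)
```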
The authors validate the approach with two real‑world APIs: the GitHub REST API and DBpedia’s RESTful interface. For GitHub, resources such as users, repositories, and commits are modeled; for DBpedia, HTML pages describing Wikipedia entities are harvested. After authoring ReLL files for both services, the crawler collected roughly 1,200 resources, and the transformation pipeline produced a unified RDF graph containing about 450,000 triples. A SPARQL endpoint built on this graph enables complex queries that span both services, e.g., “retrieve all open‑source projects a given user contributes to and the topical categories of those projects.” This demonstrates that the framework can replace ad‑hoc, service‑specific parsers with a single, semantically rich data layer.
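The cross-service query above can be mimicked with a hand-rolled triple-pattern join. The paper uses a SPARQL endpoint for this; the toy in-memory store below, and every URI and predicate in it, are invented to show the join shape only.

```python
# Toy unified graph merging data from two services (all identifiers
# hypothetical). qnames stand in for full URIs.
triples = {
    ("gh:alice", "ex:contributesTo", "gh:projectX"),
    ("gh:alice", "ex:contributesTo", "gh:projectY"),
    ("gh:projectX", "owl:sameAs", "db:ProjectX"),
    ("db:ProjectX", "ex:category", "db:Category_Web_frameworks"),
}

def projects_with_categories(user):
    """Find projects the user contributes to, follow owl:sameAs to the
    other service's resource, then fetch its topical category -- the
    same join a cross-service SPARQL query would perform."""
    results = []
    for s, p, o in triples:
        if s == user and p == "ex:contributesTo":
            for s2, p2, o2 in triples:
                if s2 == o and p2 == "owl:sameAs":
                    for s3, p3, o3 in triples:
                        if s3 == o2 and p3 == "ex:category":
                            results.append((o, o3))
    return results

print(projects_with_categories("gh:alice"))
```

Note that `gh:projectY` drops out of the result: without an `owl:sameAs` bridge, the category pattern cannot be satisfied, which is exactly why the equivalence statements of the ontology layer matter for cross-service queries.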
Performance measurements show that the end‑to‑end process (description → crawling → RDF conversion) completes in under three minutes for the test set, indicating that the method scales to moderate‑size APIs. The paper also discusses handling of versioning, link priority, and conflict‑resolution rules, which are essential for maintaining a stable knowledge graph as services evolve.
In conclusion, the work provides a practical methodology for turning the inherently linked nature of RESTful services into a machine‑interpretable semantic graph. By formalizing service descriptions with ReLL and applying a layered RDF mapping, the framework enables automatic harvesting, integration, and interlinking of heterogeneous web services within the Semantic Web ecosystem. Future directions include automated generation of ReLL from existing API documentation, distributed crawling for large‑scale service ecosystems, and systematic quality assessment of the generated RDF graphs.