An Ontology based System for Cloud Infrastructure Services Discovery

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

The Cloud infrastructure services landscape advances steadily, leaving users in the agony of choice. As a result, Cloud service identification and discovery remains a hard problem due to different service descriptions, non-standardised naming conventions and heterogeneous types and features of Cloud services. In this paper, we present an OWL-based ontology, the Cloud Computing Ontology (CoCoOn), that defines functional and non-functional concepts, attributes and relations of infrastructure services. We also present a system…


💡 Research Summary

The paper addresses the growing difficulty of discovering appropriate cloud infrastructure services when providers publish large numbers of offerings with heterogeneous naming conventions, description formats, and feature sets. Traditional discovery mechanisms rely on keyword searches or provider‑specific catalogs, which cannot handle functional and non‑functional requirements at the same time. To overcome these limitations, the authors propose an ontology‑driven approach built around the Cloud Computing Ontology (CoCoOn), together with a prototype discovery system built on top of it.

CoCoOn is modeled in OWL and captures the essential concepts of the cloud domain in a hierarchical fashion. At the top level, the ontology distinguishes between “Service” and “Resource”. The Service class is further specialized into IaaS, PaaS, and SaaS, while the Resource class includes concrete infrastructure components such as VirtualMachine, Storage, and Network. Each component class defines quantitative attributes (e.g., cpuCores, memorySize, diskType, bandwidth) and qualitative attributes (e.g., securityLevel, compliance). Non‑functional properties—cost, performance, availability, latency, and regulatory compliance—are represented as separate object and data properties, allowing them to be linked to any service instance. Relationships such as dependsOn, compatibleWith, and locatedIn model inter‑service dependencies and geographic constraints.
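The hierarchy described above can be illustrated with a minimal sketch. It uses a plain Python dictionary rather than OWL, so it is only a stand-in for the real ontology: the class names come from the summary, while `SUBCLASS_OF`, `ancestors`, and `is_a` are hypothetical helpers introduced here for illustration.

```python
# Simplified stand-in for the class hierarchy described in the summary.
# The real CoCoOn ontology is an OWL document; this dict is an assumption
# made only to show how subclass reasoning over the hierarchy behaves.
SUBCLASS_OF = {
    "IaaS": "Service",
    "PaaS": "Service",
    "SaaS": "Service",
    "VirtualMachine": "Resource",
    "Storage": "Resource",
    "Network": "Resource",
}


def ancestors(cls):
    """Return the superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls, parent):
    """True if cls equals parent or is a (transitive) subclass of it."""
    return cls == parent or parent in ancestors(cls)


print(is_a("VirtualMachine", "Resource"))  # True
print(is_a("VirtualMachine", "Service"))   # False
```

An OWL reasoner performs the same kind of subsumption check, so a query for any `Resource` would also match instances typed as `VirtualMachine`.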

To populate the ontology, the authors harvested service catalogs from the three major public cloud providers (Amazon Web Services, Microsoft Azure, and Google Cloud Platform). They developed parsers that extract service specifications from provider APIs, JSON/YAML descriptors, and HTML documentation, then transform the extracted data into RDF triples aligned with CoCoOn’s schema. Manual validation was performed to resolve ambiguous mappings (e.g., AWS “t2.micro” versus Azure “B1s” representing similar CPU‑memory configurations). All triples are stored in an Apache Jena TDB store, exposing a SPARQL endpoint for query execution.
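As a rough sketch of the transformation step, the snippet below maps two provider records with different field names onto the same ontology properties. The field names in `FIELD_MAPS`, the `provider:name` subject scheme, and the function `record_to_triples` are assumptions for illustration; the actual parsers and catalog formats are not described in the summary.

```python
# Hypothetical per-provider field names; real catalog formats differ.
FIELD_MAPS = {
    "aws": {"id": "instanceType", "cpuCores": "vcpu", "memorySize": "memoryGiB"},
    "azure": {"id": "name", "cpuCores": "numberOfCores", "memorySize": "memoryInGB"},
}


def record_to_triples(provider, record):
    """Convert one provider record into (subject, predicate, object) triples."""
    fields = FIELD_MAPS[provider]
    subject = f"{provider}:{record[fields['id']]}"
    triples = [(subject, "rdf:type", "VirtualMachine")]
    for prop in ("cpuCores", "memorySize"):
        source_field = fields[prop]
        if source_field in record:
            triples.append((subject, prop, record[source_field]))
    return triples


aws = record_to_triples(
    "aws", {"instanceType": "t2.micro", "vcpu": 1, "memoryGiB": 1}
)
azure = record_to_triples(
    "azure", {"name": "B1s", "numberOfCores": 1, "memoryInGB": 1}
)
for triple in aws + azure:
    print(triple)
```

Once both records share the same predicates, the similarity between "t2.micro" and "B1s" becomes visible to a query, which is the point of aligning the data with a common schema.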

The discovery system consists of a web‑based user interface, a query‑parsing module, an ontology‑mapping engine, a SPARQL execution engine, and a result‑presentation layer. Users can express their requirements in natural‑language‑like sentences or through form fields that specify attribute ranges (e.g., “CPU ≥ 2 cores, memory ≤ 8 GB, cost ≤ $30 per month, located in Asia‑Pacific”). The parsing module performs Korean‑language tokenization (although the prototype itself is described as language‑agnostic) and maps tokens to ontology concepts using a dictionary derived from CoCoOn. The system then generates a SPARQL query automatically, incorporating both functional constraints (e.g., required VM type) and non‑functional constraints (e.g., budget, latency). The query is executed against the triple store, and matching service instances are returned with their full attribute sets. Users can further sort or filter results by cost, performance, or provider preference.
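The query‑generation step might look roughly like the sketch below, which turns a list of attribute constraints into SPARQL text. The namespace `http://example.org/cocoon#` is a placeholder, not the ontology's real IRI, and `build_query`, the `coo:` prefix, and the region name `AsiaPacific` are hypothetical; units (GB, dollars per month) are assumed to be fixed by the schema.

```python
# Placeholder namespace; the real CoCoOn IRI is not given in the summary.
PREFIX = "PREFIX coo: <http://example.org/cocoon#>"
OPERATORS = {">=", "<=", ">", "<", "="}


def build_query(constraints, region=None):
    """Build a SPARQL SELECT from (property, operator, number) constraints."""
    patterns = ["?svc a coo:VirtualMachine ."]
    filters = []
    for i, (prop, op, value) in enumerate(constraints):
        # Reject anything that could alter the structure of the query.
        if op not in OPERATORS or not prop.isidentifier():
            raise ValueError(f"unsupported constraint: {prop} {op}")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"value must be numeric: {value!r}")
        var = f"?v{i}"
        patterns.append(f"?svc coo:{prop} {var} .")
        filters.append(f"FILTER({var} {op} {value})")
    if region is not None:
        if not region.isidentifier():
            raise ValueError(f"invalid region name: {region!r}")
        patterns.append(f"?svc coo:locatedIn coo:{region} .")
    body = "\n  ".join(patterns + filters)
    return f"{PREFIX}\nSELECT ?svc WHERE {{\n  {body}\n}}"


query = build_query(
    [("cpuCores", ">=", 2), ("memorySize", "<=", 8), ("cost", "<=", 30)],
    region="AsiaPacific",
)
print(query)
```

Functional and non‑functional constraints end up as the same kind of triple pattern plus `FILTER`, which is why a single query can combine them.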

The authors evaluated the system using a benchmark set of 30 representative IaaS offerings across the three providers. Mapping accuracy was measured against a gold‑standard dataset created by domain experts; the system achieved a 92 % correct mapping rate, significantly higher than the 70 % achieved by a baseline keyword‑search approach. Query latency averaged 180 ms, demonstrating suitability for interactive use. In complex scenarios combining functional and non‑functional constraints—such as “find a VM with at least 4 vCPU, SSD storage, cost under $50/month, and compliance with GDPR in the EU region”—the system returned correct results with a recall of 0.95 and precision of 0.94.
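For readers unfamiliar with the metrics, the sketch below shows how precision and recall are computed from a returned result set and a gold‑standard set. The service identifiers are made up for illustration and are unrelated to the paper's benchmark; `precision_recall` is a hypothetical helper.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a result set against a gold standard."""
    returned, relevant = set(returned), set(relevant)
    true_positives = len(returned & relevant)
    precision = true_positives / len(returned) if returned else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Made-up identifiers: three of the four returned services are relevant,
# and one relevant service ("vm-e") was missed.
p, r = precision_recall(
    returned={"vm-a", "vm-b", "vm-c", "vm-d"},
    relevant={"vm-a", "vm-b", "vm-c", "vm-e"},
)
print(p, r)  # 0.75 0.75
```

Precision penalizes irrelevant services in the result list, while recall penalizes relevant services that the query failed to return.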

The study concludes that an ontology‑centric representation effectively abstracts the heterogeneity of cloud service descriptions, enabling precise, semantically rich discovery that accommodates both technical specifications and business‑level QoS criteria. The authors outline future work that includes (1) automated ontology evolution using machine‑learning techniques to ingest new provider offerings, (2) support for multi‑cloud orchestration scenarios where composite services span multiple providers, and (3) integration of real‑time SLA monitoring data to allow dynamic re‑ranking of services based on current performance. This research thus provides a solid foundation for building intelligent, scalable cloud service marketplaces and management platforms.

