SARA: A Microservice-Based Architecture for Cross-Platform Collaborative Augmented Reality

Augmented Reality (AR) functionalities may be effectively leveraged in collaborative service scenarios (e.g., remote maintenance, on-site building, street gaming, etc.). Standard development cycles for collaborative AR require coding for each specific visualization platform and implementing the necessary control mechanisms over the shared assets. To address this challenge, this paper describes SARA, an architecture to support cross-platform collaborative Augmented Reality applications based on microservices. The architecture is designed around the concept of collaboration models, which regulate the interaction and permissions of each user over the AR assets. Five of these collaboration models were initially integrated in SARA (turn, layer, ownership, hierarchy-based and unconstrained examples) and the platform enables the definition of new ones. Thanks to the reusability of its components, during the development of an application, SARA enables focusing on the application logic while avoiding the implementation of the communication protocol, data model handling and orchestration between the different, possibly heterogeneous, devices involved in the collaboration (i.e., mobile or wearable AR devices using different operating systems). To describe how to build an application based on SARA, a prototype for HoloLens and iOS devices has been implemented. The prototype is a collaborative voxel-based game in which several players work together in real time on a piece of land, adding or eliminating cubes in a collaborative manner to create buildings and landscapes. Turn-based and unconstrained collaboration models are applied to regulate the interaction. The development workflow for this case study shows how the architecture serves as a framework to support the deployment of collaborative AR services, enabling the reuse of collaboration model components while handling client technologies agnostically.


💡 Research Summary

The paper presents SARA (Scalable AR Architecture), a microservice‑based framework designed to simplify the development of cross‑platform collaborative augmented reality (AR) applications. Traditional collaborative AR development requires separate code bases for each visualization platform (e.g., HoloLens, iOS, Android) and bespoke implementations of networking, data synchronization, and permission handling. SARA addresses these challenges by introducing the notion of Collaboration Models, which encapsulate interaction rules and user permissions over shared AR assets. Five reference models are provided out‑of‑the‑box: turn‑based, layer‑based, ownership‑based, hierarchy‑based, and unconstrained. Each model is expressed as a JSON schema, allowing developers to validate user actions against the model without writing custom logic. New models can be added simply by defining a new schema and optional validation hooks, making the system extensible.
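
The summary above describes collaboration models expressed as JSON documents with optional validation hooks. As a minimal illustrative sketch (the field names and function signatures are assumptions, not SARA's actual schema), a turn-based model might be declared and checked like this:

```python
import json

# Hypothetical turn-based collaboration model as a JSON document.
# Field names are illustrative, not SARA's actual schema.
TURN_MODEL = json.loads("""
{
  "model": "turn",
  "participants": ["alice", "bob", "carol"],
  "current_turn": 0,
  "allowed_actions": ["add_voxel", "remove_voxel"]
}
""")

def validate_action(model: dict, user: str, action: str) -> bool:
    """Return True if `user` may perform `action` under `model`."""
    if action not in model["allowed_actions"]:
        return False
    # Turn-based rule: only the participant whose turn it is may act.
    return model["participants"][model["current_turn"]] == user

def advance_turn(model: dict) -> None:
    """Rotate the turn to the next participant after a valid action."""
    model["current_turn"] = (model["current_turn"] + 1) % len(model["participants"])
```

Adding a new model would then amount to defining a new document of this kind plus its own validation function, which matches the extensibility claim above.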

The architecture is decomposed into three core microservices:

  1. Collaboration Service – manages model instances, user sessions, and enforces permission checks. It acts as the authoritative source on “who may do what” at any moment.
  2. Asset Service – stores the current state of AR objects (e.g., voxels, 3D meshes) in a central repository and publishes state changes to subscribed clients. It abstracts away the underlying persistence technology, supporting NoSQL, relational, or graph databases.
  3. Device Adapter – provides a thin abstraction layer for each client platform, translating native SDK calls into the unified API exposed by the Collaboration and Asset services. Communication uses a hybrid of gRPC for command‑oriented messages and WebSocket for low‑latency streaming updates.
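
The Asset Service's role — holding the shared state and publishing changes to subscribed clients — can be sketched with a toy in-memory backend (all class and method names here are assumptions for illustration; the real service sits behind gRPC/WebSocket transports and a pluggable database):

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Tuple

class AssetService(ABC):
    """Contract for the service that holds shared AR state."""
    @abstractmethod
    def apply(self, change: dict) -> None: ...
    @abstractmethod
    def subscribe(self, callback: Callable[[dict], None]) -> None: ...

class InMemoryAssetService(AssetService):
    """Toy backend: stores voxels and notifies every subscriber of each change."""
    def __init__(self) -> None:
        self.voxels: Dict[Tuple[int, ...], str] = {}
        self._subscribers = []

    def apply(self, change: dict) -> None:
        pos = tuple(change["position"])
        if change["op"] == "add":
            self.voxels[pos] = change["color"]
        elif change["op"] == "remove":
            self.voxels.pop(pos, None)
        for cb in self._subscribers:  # publish the delta to subscribed clients
            cb(change)

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        self._subscribers.append(callback)
```

In a deployed system the subscriber callbacks would be replaced by WebSocket pushes to Device Adapters, but the publish-on-change shape is the same.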

By separating concerns in this way, SARA achieves platform agnosticism and component reuse. Application developers focus on domain logic (e.g., game rules, UI) while the framework automatically handles networking, serialization, and orchestration across heterogeneous devices. Because services interact through well‑defined contracts, swapping or scaling individual components (e.g., moving the Asset Service from a MongoDB backend to a Neo4j graph store) does not require changes in the client code or the Collaboration Service.
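
The claim that backends can be swapped without touching client code follows from programming against contracts rather than implementations. A minimal sketch, with stand-in classes in place of real MongoDB or Neo4j drivers:

```python
from typing import Optional, Protocol

class VoxelStore(Protocol):
    """Persistence contract the Asset Service depends on; any backend
    (document, relational, graph) can stand behind it."""
    def put(self, key: str, value: dict) -> None: ...
    def get(self, key: str) -> Optional[dict]: ...

class DictStore:
    """Stand-in for a document store such as MongoDB."""
    def __init__(self) -> None:
        self._data = {}
    def put(self, key: str, value: dict) -> None:
        self._data[key] = value
    def get(self, key: str) -> Optional[dict]:
        return self._data.get(key)

class ListStore:
    """Stand-in for a different backend: same contract, different layout."""
    def __init__(self) -> None:
        self._rows = []
    def put(self, key: str, value: dict) -> None:
        self._rows = [(k, v) for k, v in self._rows if k != key] + [(key, value)]
    def get(self, key: str) -> Optional[dict]:
        for k, v in self._rows:
            if k == key:
                return v
        return None

def save_and_load(store: VoxelStore) -> Optional[dict]:
    """Caller code is identical regardless of which backend is plugged in."""
    store.put("voxel:0,0,0", {"color": "red"})
    return store.get("voxel:0,0,0")
```

Either store can be handed to `save_and_load` and the caller never changes, which is the swap-without-client-changes property described above.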

To demonstrate feasibility, the authors built a prototype collaborative voxel‑building game that runs on Microsoft HoloLens (Unity) and iOS (Swift). Players share a virtual plot of land and can add or remove cubes to construct buildings. Two collaboration models are exercised: a turn‑based model that enforces a strict sequence of actions, and an unconstrained model that allows any user to modify the scene at any time. The HoloLens and iOS clients each use a Device Adapter that connects to the same Collaboration Service endpoint. The server arbitrates turn order, validates actions against the active model, updates the voxel state in the Asset Service, and pushes delta updates to all participants. Performance measurements show average end‑to‑end latency below 120 ms, which is acceptable for real‑time collaborative interaction.
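
The server-side loop described for the prototype — validate the action against the active turn-based model, update voxel state, push delta updates to all participants — can be condensed into a sketch (class and method names are illustrative, not SARA's actual API):

```python
from typing import Callable

class TurnArbiter:
    """Toy server: enforces turn order, mutates voxel state, broadcasts deltas."""
    def __init__(self, players: list) -> None:
        self.players = players
        self.turn = 0
        self.voxels = set()
        self.clients = []

    def connect(self, push: Callable[[dict], None]) -> None:
        # In a real deployment `push` would be a WebSocket send
        # inside a Device Adapter session.
        self.clients.append(push)

    def submit(self, user: str, op: str, pos: tuple) -> bool:
        """Validate against the turn model, apply, and broadcast."""
        if self.players[self.turn] != user:
            return False  # rejected: not this user's turn
        if op == "add":
            self.voxels.add(pos)
        elif op == "remove":
            self.voxels.discard(pos)
        delta = {"user": user, "op": op, "position": pos}
        for push in self.clients:  # push the delta to every participant
            push(delta)
        self.turn = (self.turn + 1) % len(self.players)
        return True
```

Swapping in the unconstrained model would amount to dropping the turn check in `submit`, which is the difference between the two models exercised in the prototype.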

The paper also discusses limitations. Because SARA relies on distributed microservices, consistency is only eventual; strong transactional guarantees are not provided out of the box, which may be problematic for scenarios requiring immediate conflict resolution. Security features such as authentication, authorization, and encrypted channels are not baked into the core framework and must be integrated via external solutions. Additionally, the current prototype focuses on a relatively simple voxel use case; scaling to more complex 3D assets or large numbers of concurrent users may expose bottlenecks in the Asset Service or network layer.

Future work outlined includes:

  • Dynamic Collaboration Model Generation – allowing runtime definition of custom interaction rules without redeploying services.
  • Multi‑modal Input Integration – extending Device Adapters to support voice commands, hand gestures, eye‑tracking, and other emerging AR interaction modalities.
  • Edge Computing Deployment – pushing parts of the Asset Service or Collaboration logic to edge nodes to further reduce latency for geographically dispersed users.

In summary, SARA offers a reusable, extensible microservice stack that abstracts away the low‑level plumbing of cross‑platform AR collaboration. By codifying interaction policies as first‑class Collaboration Models, it enables developers to concentrate on application‑specific features while guaranteeing consistent behavior across heterogeneous devices. The prototype validates the approach and highlights both performance viability and areas for further research, positioning SARA as a promising foundation for industrial maintenance, remote education, multiplayer gaming, and other collaborative AR domains.

