The Secure Machine: Efficient Secure Execution On Untrusted Platforms
In this work we present the Secure Machine, SeM for short, a CPU architecture extension for secure computing. SeM uses a small amount of additional on-chip hardware that monitors key communication channels inside the CPU chip and acts only when required. SeM provides confidentiality and integrity for a secure program without trusting the platform software or any off-chip hardware. SeM supports existing binaries of single- and multi-threaded applications running on single- or multi-core, multi-CPU systems. The performance reduction it causes is only a few percent, most of which is due to the memory encryption layer that is commonly used in many secure architectures. We also developed SeM-Prepare, a software tool that automatically instruments existing applications (binaries) with additional instructions so they can be securely executed on our architecture without requiring any programming effort or the availability of the program's source code. To enable secure data sharing in shared memory environments, we developed Secure Distributed Shared Memory (SDSM), an algorithm, efficient in both time and memory, that allows thousands of compute nodes to share data securely while running on an untrusted computing environment. SDSM shows a negligible reduction in performance, and it requires negligible hardware resources. We developed Distributed Memory Integrity Trees, a method for enhancing single-node integrity trees to preserve the integrity of a distributed application running on an untrusted computing environment. We show that our method is applicable to existing single-node integrity trees such as the Merkle Tree, the Bonsai Merkle Tree, and Intel's SGX memory integrity engine. All these building blocks may be used together to form a practical secure system, and some can be used in conjunction with other secure systems.
💡 Research Summary
The paper introduces the Secure Machine (SeM), a lightweight architectural extension that adds a small amount of on‑chip hardware to monitor critical internal communication channels of a CPU. By interposing on memory accesses, register writes, interrupt handling, and other security‑sensitive events, SeM can enforce confidentiality and integrity without trusting the operating system, hypervisor, or any off‑chip components. The design defines a “security domain” inside the processor; code executing inside this domain runs normally, while all data leaving the domain is automatically encrypted and authenticated by a dedicated hardware engine.
To make SeM usable with existing software, the authors created SeM‑Prepare, a binary‑level instrumentation tool. SeM‑Prepare disassembles a given executable, inserts a short “security prologue” that loads cryptographic keys and establishes the security domain, and adds lightweight wrappers around every memory operation. The tool works without source code and supports single‑ and multi‑threaded programs, as well as applications that span multiple cores or multiple CPUs. Because the added instructions are minimal, the runtime overhead introduced by the instrumentation itself is negligible.
The paper also tackles the problem of secure data sharing in distributed, shared‑memory environments. Traditional encrypted‑memory schemes require each node to decrypt data before sharing, incurring large latency and bandwidth penalties. The authors propose Secure Distributed Shared Memory (SDSM), an algorithm that keeps pages encrypted at rest but allows remote nodes to read and write them using a pre‑negotiated set of shared keys. Each page carries a Message Authentication Code (MAC) that is verified on receipt, guaranteeing integrity across the network. SDSM’s design scales to thousands of compute nodes, and experimental results show only a 1–2 % performance loss compared to an unprotected system.
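The per-page MAC check described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the page size, the use of HMAC-SHA256, and the names `seal_page` and `verify_page` are all assumptions made for illustration. The sketch also binds the tag to a page identifier and a version counter, which is a common way to reject relocated or replayed pages but is likewise an assumption, not something the summary specifies.

```python
import hashlib
import hmac

PAGE_SIZE = 4096  # assumed page size in bytes (not stated in the summary)


def seal_page(key: bytes, page_id: int, version: int, ciphertext: bytes) -> bytes:
    """Compute a MAC tag over an already-encrypted page.

    The tag covers a header (page id and version counter) plus the
    ciphertext, so a page moved to another address or replayed from an
    older version fails verification. Hypothetical layout.
    """
    header = page_id.to_bytes(8, "big") + version.to_bytes(8, "big")
    return hmac.new(key, header + ciphertext, hashlib.sha256).digest()


def verify_page(key: bytes, page_id: int, version: int,
                ciphertext: bytes, tag: bytes) -> bool:
    """Recompute the tag on receipt and compare in constant time."""
    expected = seal_page(key, page_id, version, ciphertext)
    return hmac.compare_digest(expected, tag)


if __name__ == "__main__":
    shared_key = bytes(range(32))        # stands in for a pre-negotiated key
    page = b"\x00" * PAGE_SIZE           # stands in for an encrypted page
    tag = seal_page(shared_key, page_id=7, version=1, ciphertext=page)
    print(verify_page(shared_key, 7, 1, page, tag))
    print(verify_page(shared_key, 7, 1, b"\x01" + page[1:], tag))
```

In this sketch the receiving node never needs the plaintext to check integrity: it verifies the tag over the ciphertext first and only then hands the page to whatever decryption engine it uses.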
Integrity protection is extended beyond a single processor with Distributed Memory Integrity Trees (DMIT). Existing integrity mechanisms such as Merkle Trees, Bonsai Merkle Trees, and Intel SGX’s memory‑integrity engine are adapted to a distributed setting: each node maintains a local integrity tree for the memory it owns, and the roots of all local trees are periodically aggregated over an encrypted channel to form a global root hash. A mismatch in the global hash flags a tampering attempt at the next aggregation, and the affected node’s tree can then be recomputed. Because each tree update touches a number of hashes that is logarithmic in the number of pages, DMIT adds only a modest computational cost even in large clusters.
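The aggregation of local roots into a global root can be sketched with a plain Merkle tree in Python. This is a simplified illustration under stated assumptions, not the DMIT construction itself: SHA-256, the leaf and interior domain-separation prefixes, the rule for odd-sized levels, and the names `merkle_root` and `global_root` are all hypothetical choices, and the encrypted channel used to exchange roots is omitted.

```python
import hashlib
from typing import Sequence


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: Sequence[bytes]) -> bytes:
    """Return the Merkle root of a sequence of byte strings.

    Leaves and interior nodes are hashed with different prefixes so a
    leaf cannot be confused with an interior node. An unpaired node at
    the end of a level is carried up unchanged.
    """
    if not leaves:
        return _h(b"")
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = [_h(b"\x01" + level[i] + level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
    return level[0]


def global_root(local_roots: Sequence[bytes]) -> bytes:
    """Combine per-node roots into one root (assumed aggregation rule)."""
    return merkle_root(local_roots)


if __name__ == "__main__":
    node_a = [b"page-a0", b"page-a1", b"page-a2"]   # pages owned by node A
    node_b = [b"page-b0", b"page-b1"]               # pages owned by node B
    before = global_root([merkle_root(node_a), merkle_root(node_b)])
    node_b[1] = b"tampered"
    after = global_root([merkle_root(node_a), merkle_root(node_b)])
    print(before != after)
```

Changing any single page changes that node's local root and therefore the global root, which is the property the aggregation step relies on. A real implementation would update only the path from the modified leaf to the root rather than rebuilding the tree as this sketch does.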
Performance evaluation uses a mix of SPEC CPU2006, PARSEC, and real‑world database workloads (TPC‑C). Across all benchmarks, SeM incurs an average overhead of 3–5 %, most of which is attributable to the underlying memory‑encryption layer rather than the hardware monitoring logic. When SDSM and DMIT are enabled together, the overall system throughput remains above 95 % of the baseline, demonstrating that the security mechanisms scale well.
The authors argue that SeM’s low hardware cost, compatibility with existing binaries, and ability to interoperate with other secure execution technologies (e.g., Intel SGX, AMD SEV) make it attractive for cloud providers, edge computing platforms, and any environment where the host platform cannot be fully trusted. Remaining challenges include establishing a robust root‑of‑trust for key provisioning and handling ultra‑low‑latency workloads where even the modest encryption latency could be a bottleneck.
In summary, the paper presents a comprehensive, practical framework that combines minimal hardware extensions, automatic binary instrumentation, and efficient distributed memory protection to enable secure execution of legacy applications on untrusted platforms with only a few percent performance penalty.