YUHENG-OS: A Cloud-Native Space Cluster Operating System


Authors: Jin Zhang (Tsinghua University), Jiachen Sun (Hong Kong University of Science and Technology), Kai Liu (Tsinghua University), Linling Kuang (Tsinghua University), Jianhua Lu (Tsinghua University)

As industry and academia continue to advance spaceborne computing and communication capabilities, the formation of cloud-native space clusters (CNSCs) has become an increasingly evident trend. This evolution progressively exposes the resource management challenges associated with coordinating fragmented and heterogeneous onboard resources while supporting large-scale and diverse space applications. However, directly transplanting mature terrestrial cloud-native cluster operating system paradigms into space is ineffective due to the fragmentation of spaceborne computing resources and satellite mobility, which collectively impose substantial challenges on resource awareness and orchestration. This article presents YUHENG-OS, a cloud-native space cluster operating system tailored for CNSCs. YUHENG-OS provides unified abstraction, awareness, and orchestration of heterogeneous spaceborne infrastructure, enabling cluster-wide task deployment and scheduling across distributed satellites. We introduce a four-layer system architecture and three key enabling technologies: modeling of heterogeneous resource demands for space tasks, fragmented heterogeneous resource awareness under network constraints, and matching of differentiated tasks with multidimensional heterogeneous resources under temporal dependency constraints. Evaluation results show that, compared with representative terrestrial cloud-native cluster operating systems exemplified by Kubernetes, YUHENG-OS achieves a substantially higher task completion ratio, with improvements of up to 98%. This advantage is primarily attributed to its ability to reduce resource awareness delay by 71%.

Introduction

In ancient Chinese cosmology, YUHENG evokes an instrument of steadiness amid motion—an anchor for ordering the sky when the stars do not stand still.
In The Canon of Shun, this idea is expressed in the phrase “observing the heavens with XUANJI and YUHENG to align the governance of all celestial affairs.” It reflects a simple yet powerful principle: when the environment is seemingly chaotic and inherently dynamic, effective governance depends on accurate awareness and coordinated management. Ongoing advances in spaceborne computing and communication technologies are driving satellite systems to evolve from single-satellite, mission-specific platforms toward constellation-scale, networked infrastructures, enabling emerging applications such as spaceborne data inference and training, as well as Direct Handset to Satellite (DHTS) [1], [2]. With the continuous expansion of constellation size and the increasing diversity of onboard payload capabilities, multiple satellites in orbit are progressively forming a shared heterogeneous resource pool encompassing computation (CPU/GPU/AI accelerators), storage, networking (inter-satellite and satellite–ground laser/microwave links), etc. [3]. These resources can be sensed, abstracted, and orchestrated in a unified manner. In parallel, onboard applications are transitioning from tightly coupled, mission-specific programs to portable, containerized application units, enabling workloads such as in-orbit data training and spaceborne image processing to be deployed and migrated onboard in a lightweight fashion [1]. This evolution has given rise to the early form of cloud-native space clusters (CNSCs). Recent momentum in both industry and academia further indicates that the construction of CNSCs has become practically feasible. From an industrial perspective, in 2025, Planet partnered with NVIDIA to integrate GPUs into its Owl remote-sensing mega-constellation, enabling real-time, in-orbit processing of high-resolution Earth observation data [4].
Starcloud successfully launched its test satellite Starcloud-1, equipped with an NVIDIA H100 GPU, into Low Earth Orbit (LEO), marking a milestone as the first attempt to conduct language model training in space using data center–class GPUs [5]. In parallel, Google announced its Project Suncatcher initiative, proposing an in-orbit computing constellation composed of 81 satellites equipped with TPUs [6], while Starlink has explicitly stated plans to integrate onboard computing payloads into its V3 satellites [7]. In China, the Chenguang-1 satellite was launched in 2025, with an onboard computing capability comparable to that of a ground-based server. Subsequent plans aim to build a space data center with an overall computing capacity exceeding 400,000 Peta Floating Point Operations Per Second (PFLOPS) [8]. In academia, Tsinghua University has proposed the deployment of the Tsinghua Space Network (TSN) in Medium Earth Orbit (MEO), leveraging its wide-area coverage and stable communication capacity to provide sustained and reliable management and artificial intelligence services for massive volumes of spaceborne data and heterogeneous resources. The first satellite of TSN, TN-1A, was successfully launched in 2024 and is currently undergoing in-orbit validation of key enabling technologies [9]. Against this backdrop, there is a growing need for a stable resource management platform capable of bridging large-scale, diverse space applications with fragmented, heterogeneous on-orbit resources.
A useful point of reference can be found in terrestrial cloud platforms, where cloud-native cluster operating systems already play a comparable role by matching massive, multi-tenant application demands with large pools of computing and storage infrastructure. The cloud-native software stack represented by Kubernetes—proposed and open-sourced by Google in 2014—has gradually become an industry standard, enabling mainstream cloud service providers to support large-scale automated deployment, elastic scaling, and operational management of containerized applications in data center environments [10]. Building on Kubernetes, Amazon offers Elastic Kubernetes Service (EKS), which is deeply integrated with its cloud infrastructure for resource orchestration and operational management [11]. Similarly, major cloud vendors, including Tencent Cloud [12] and Baidu AI Cloud [13], have adopted Kubernetes as the core foundation for building their own cluster operating systems targeting large-scale data center deployments. Huawei Cloud further extends Kubernetes through the KubeEdge framework, enabling cluster operating system capabilities to expand beyond centralized data centers to distributed nodes and heterogeneous device environments [14]. These successful precedents naturally motivate efforts to extend the cloud-native cluster operating system paradigm to CNSCs. However, owing to fundamental differences in environmental conditions and system constraints, directly porting terrestrial cloud-native cluster operating system paradigms to space is often ineffective. One fundamental challenge arises from the fragmented nature of spaceborne computing resources. Unlike terrestrial clusters, where computing capacity is typically aggregated into a small number of large, homogeneous server clusters interconnected by high-bandwidth links, satellite computing resources are inherently distributed across a large population of spatially separated nodes.
For example, in China, over 8.3 million standard server racks are deployed nationwide, collectively providing computing capacity on the order of 246 EFLOPS, yet such massive capacity is concentrated in only 10 national data center clusters [15]. In contrast, computing resources in CNSCs are spread across multiple orbital shells and thousands of satellites, each offering limited onboard capability—currently on the order of hundreds of TOPS—under strict power and thermal constraints. As the network scale increases, resource fragmentation becomes more pronounced, substantially increasing the difficulty of resource management, particularly for the Network Operation and Control Centers (NOCCs) to maintain timely and accurate visibility into the resource states of CNSCs. Another critical challenge stems from satellite mobility, which results in constrained visibility windows and intermittent connectivity. For low Earth orbit systems, satellite–ground communications typically last only several minutes to on the order of ten minutes per pass, making it impractical for satellite–ground state synchronization to rely on high-frequency, stable heartbeat mechanisms in the same manner as terrestrial cloud-native clusters. Inter-satellite links exhibit similarly transient availability, as visibility windows may range from minutes to hours depending on orbital configurations, while link quality continuously varies with relative geometry, thereby preventing the establishment of stable, fixed high-throughput connections comparable to data-center optical interconnects. These mobility-driven network dynamics fundamentally challenge terrestrial cloud-native cluster operating systems, which are largely centered on computation and storage with relatively limited modeling of communication constraints.
Consequently, in CNSCs, communication resources must be explicitly modeled and orchestrated as constraints that are as critical as computation and storage, and, in certain scenarios, may even become the dominant limiting factor. In summary, the fragmentation and mobility of spaceborne resources constitute a central challenge for CNSCs, which involves not only obtaining timely and accurate awareness of fragmented resource states, but also efficiently orchestrating heterogeneous resources and diverse tasks under highly dynamic conditions. Remarkably, these two aspects align closely with the principles embodied by XUANJI and YUHENG. Accordingly, we propose an efficient resource management platform for CNSCs, named YUHENG-OS, which serves as a key enabling layer between space applications and space cluster infrastructures under network constraints by providing unified abstraction, awareness, and orchestration of heterogeneous resources, as well as supporting cluster-wide deployment and scheduling of space tasks. The main contributions of this article are threefold: (1) We propose a four-layer architecture for YUHENG-OS tailored to CNSCs, which effectively bridges the space application layer and the underlying space cluster infrastructure by providing unified support for cluster scalability, awareness of fragmented resources, heterogeneous resource orchestration, and task demand analysis. (2) We develop three key technologies, including modeling of heterogeneous resource demands for space tasks, fragmented heterogeneous resource awareness under network constraints, and matching of differentiated tasks with multidimensional heterogeneous resources under temporal dependency constraints. (3) We conduct comprehensive evaluations to assess the effectiveness of YUHENG-OS.
The results demonstrate that, compared with terrestrial cloud-native cluster operating systems exemplified by Kubernetes, YUHENG-OS achieves significant improvements in task completion ratio. The remainder of this article is organized as follows. Section II introduces the proposed YUHENG-OS architecture. Section III presents the three key technologies. Section IV reports and analyzes the simulation results. Section V concludes the article.

Figure 1 Architecture overview of YUHENG-OS and its deployment in a CNSC.

YUHENG-OS Architecture Overview

A CNSC system comprises satellites distributed across multiple orbital regimes, including LEO, MEO, and Geostationary Earth Orbit (GEO), and spanning diverse functional roles such as remote sensing–computing and communication–computing. Its ground segment includes an NOCC for each constellation, a unified Task Scheduling Center, and multiple User Access Portals. Within the CNSC, YUHENG-OS serves as the core resource management layer, bridging the underlying infrastructure layer (L0) and the application layer (L5), as shown in Figure 1. It is designed to provide unified abstraction, awareness, and orchestration of fragmented, heterogeneous spaceborne resources, thereby enabling on-demand service provisioning for large-scale space applications. From a system perspective, L0 resides beneath YUHENG-OS and constitutes the foundational infrastructure for resource provisioning and task execution environments. It is jointly formed by per-satellite heterogeneous spaceborne resources, the satellite operating system, and the container runtime. Specifically, heterogeneous onboard resources comprise spaceborne computing units (e.g., CPUs and GPUs), storage resources, inter-satellite and satellite–ground laser/microwave communication links, as well as mission-specific payloads such as remote sensing sensors.
The satellite OS manages local resources and reports resource information to YUHENG-OS, while the container runtime provides a unified execution environment for cloud-native applications deployed onboard. Together, these components form the resource and runtime basis for YUHENG-OS. Above YUHENG-OS, L5 represents the application layer that supplies task inputs and service requests, encompassing a broad spectrum of cloud-native space applications, including but not limited to routine environmental monitoring, disaster monitoring, space edge computing, and onboard data training. In addition, an L5 Onboard Agent is deployed on each satellite to receive task execution directives and manage the task execution lifecycle, while an L5 Ground Agent is deployed at each NOCC and the Task Scheduling Center to accept user task requests and distribute task execution instructions to the corresponding satellites. Functionally, YUHENG-OS is organized into four logical layers: CNSC Extension Management (L1), CNSC Resource View Construction (L2), CNSC Resource Orchestration (L3), and Task Analysis and Modeling (L4), which are deployed in the form of onboard agents and ground agents. The following subsections describe the design rationale and core functionalities of these four layers in detail, followed by an overview of the deployment architecture and operational workflow of YUHENG-OS within the overall system.

L1. CNSC Extension Management

The CNSC Extension Management layer provides controlled cluster expansion and capability registration for the CNSC. It consists of a CNSC Membership Manager and a Satellite Capability Registry, which together enable heterogeneous satellites to elastically join and leave the cluster.
The CNSC Membership Manager maintains node lifecycle states and cluster membership consistency under intermittent connectivity, while the Satellite Capability Registry maintains abstract descriptions of node capabilities across computing, storage, communication, etc. These capability descriptions reflect configured capacity bounds rather than instantaneous resource availability, providing a stable reference for subsequent resource abstraction and orchestration. This layer exposes a unified, software-defined interface for scalable cluster extension in YUHENG-OS. It is implemented via both onboard and ground agents, where the onboard agent executes local join/leave handshakes, maintains per-satellite lifecycle state, and reports capability descriptors, while the ground agent performs cluster-wide membership coordination, admission control, and global capability registration for management and control-plane consistency.

L2. CNSC Resource View Construction

The CNSC Resource View Construction layer elevates fragmented, heterogeneous resources into a cluster-level resource view suitable for scheduling and orchestration. The Heterogeneous Resource Abstraction module first provides a unified abstraction and virtualization of all heterogeneous resources across cluster nodes, producing per-satellite resource profiles that standardize computing, storage, communication, and sensing capabilities. Building upon these per-satellite profiles, the Awareness-Oriented Transmission Resource Allocation module assigns appropriate transmission channels for each satellite’s resource information and delivers the resource reports to the NOCC. Finally, the Spatiotemporal Resource View Construction module consolidates the received multi-satellite resource reports into an integrated spatiotemporal resource view, which is passed to the L3 layer as real-time resource-state input to support resource-aware orchestration.
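The article does not specify a concrete schema for these per-satellite resource profiles. As a hypothetical illustration only (all class and field names below are our own, not from YUHENG-OS), a unified abstraction covering computing, storage, communication, and sensing might be sketched as:

```python
from dataclasses import dataclass, field

@dataclass
class LinkProfile:
    """Abstracted view of one inter-satellite or satellite-ground link."""
    peer: str             # identifier of the neighbouring node or ground station
    kind: str             # "laser" or "microwave"
    capacity_mbps: float  # configured link capacity
    next_window_s: float  # seconds until the next visibility window opens

@dataclass
class SatelliteProfile:
    """Per-satellite resource profile, as might be produced by a
    heterogeneous-resource abstraction module."""
    sat_id: str
    compute_gbps: float           # equivalent data-processing throughput (GB/s)
    storage_free_gb: float
    links: list = field(default_factory=list)    # list of LinkProfile
    sensors: list = field(default_factory=list)  # e.g. ["optical", "SAR"]

# Example: profile for one (fictional) LEO remote-sensing satellite.
profile = SatelliteProfile(
    sat_id="LEO-0042",
    compute_gbps=0.5,
    storage_free_gb=128.0,
    links=[LinkProfile("MEO-03", "laser", 10_000.0, 45.0)],
    sensors=["optical"],
)
print(profile.sat_id, profile.compute_gbps)
```

Standardizing all four resource dimensions in one record is what lets the NOCC-side aggregation treat heterogeneous satellites uniformly when building the cluster-level view.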
It is implemented via both onboard and ground agents, where the onboard agent performs local heterogeneous resource abstraction, packages and periodically publishes per-satellite resource profiles, and selects uplink/downlink telemetry channels, while the ground agent receives multi-satellite reports at the NOCC, normalizes and aggregates them, and constructs the cluster-level spatiotemporal resource view for upstream orchestration.

L3. CNSC Resource Orchestration

The CNSC Resource Orchestration layer matches task requirements with suitable resources and generates executable scheduling plans under network-constrained conditions and task temporal dependency constraints. The Task Execution Dependency Analysis module first analyzes the temporal dependencies across different execution stages of a task and derives the resource demands of each stage across heterogeneous resources. Based on this analysis, the Regular Periodic Scheduler performs periodic scheduling for routine tasks, while the Emergency Asynchronous Scheduler enables event-triggered asynchronous scheduling for urgent tasks to accelerate response time. To address scheduling failures and execution interruptions caused by emergency task preemption or resource faults, the Terminated Task Re-Scheduler re-plans terminated tasks to restore execution continuity. Meanwhile, the Resource Conflict Arbiter resolves resource contention arising during task scheduling, ensuring consistent and feasible orchestration decisions across competing tasks.
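The article names the two schedulers but does not describe their dispatch logic. A minimal sketch, assuming (as in the evaluation section) that priority level 4 marks emergency tasks and that regular tasks are batched into the next periodic planning round while emergencies are scheduled immediately; the class and method names are invented:

```python
import heapq

class OrchestrationFrontend:
    """Routes incoming tasks either to periodic batching (regular tasks)
    or to immediate event-triggered scheduling (emergency tasks)."""

    def __init__(self):
        self.regular_queue = []   # max-heap by priority: (-priority, task_id)
        self.emergency_log = []   # tasks dispatched out of cycle

    def submit(self, task_id, priority):
        if priority == 4:                        # emergency path
            self.emergency_log.append(task_id)   # scheduled right away
            return "scheduled-now"
        heapq.heappush(self.regular_queue, (-priority, task_id))
        return "queued-for-next-period"

    def run_periodic_round(self):
        """Drain the regular queue in priority order, emulating one
        planning round of a periodic scheduler."""
        plan = []
        while self.regular_queue:
            neg_prio, task_id = heapq.heappop(self.regular_queue)
            plan.append((task_id, -neg_prio))
        return plan

fe = OrchestrationFrontend()
fe.submit("t1", 2)
fe.submit("t2", 4)   # emergency: bypasses the periodic cycle
fe.submit("t3", 3)
print(fe.run_periodic_round())  # higher-priority regular tasks first
```

The split mirrors the design rationale above: predictable work amortizes planning cost over a cycle, while urgent work never waits for the next round.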
It is implemented via both onboard and ground agents, where the ground agent conducts cluster-level planning and decision-making (including periodic and event-triggered scheduling, conflict resolution, and re-scheduling) using the global resource view, while the onboard agent enforces allocated plans locally by translating schedules into executable actions, monitoring execution status, and feeding back preemption, failures, and realized execution traces for closed-loop orchestration.

L4. Task Analysis and Modeling

In contrast to terrestrial cloud-native systems, space-based clusters operate under tightly constrained resource budgets, making over-provisioning neither feasible nor efficient. Consequently, resource management must move beyond upper-bound allocation and instead leverage an explicit understanding of how resource provisioning levels affect task completion quality (e.g., completion latency and model accuracy), which motivates the Task Analysis and Modeling layer. This layer provides task-level intelligence by modeling resource consumption profiles under different task completion qualities, thereby enabling fine-grained resource orchestration from the demand side. The Task Resource Demand Analysis module receives task requests from the L5 layer and, based on the Task Demand Knowledge Base, characterizes each task by identifying required resource types and estimating the corresponding resource consumption needed to satisfy specified task completion quality requirements across computing, storage, and communication dimensions. The Task Demand Knowledge Base stores empirical mappings between target task quality levels and the associated heterogeneous resource consumption, serving as a calibrated reference for orchestration.
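The internal structure of the Task Demand Knowledge Base is not given in the article. As a toy sketch under our own assumptions (every number, key, and the fallback rule below are invented for illustration), a quality-level lookup returning per-dimension demand estimates could look like:

```python
# Hypothetical knowledge base: (task_type, quality_level) -> per-dimension demand.
# All values are invented for illustration only.
KNOWLEDGE_BASE = {
    ("image_fusion", "standard"): {"compute_gb": 5.0, "storage_gb": 6.0, "comm_mb": 20.0},
    ("image_fusion", "high"):     {"compute_gb": 9.0, "storage_gb": 10.0, "comm_mb": 35.0},
}

def estimate_demand(task_type, quality, margin=0.1):
    """Return estimated resource demand for a target quality level,
    padded by a safety margin; fall back to the 'high' profile when
    the requested quality level is not yet profiled."""
    key = (task_type, quality)
    if key not in KNOWLEDGE_BASE:
        key = (task_type, "high")  # conservative fallback for unknown levels
    base = KNOWLEDGE_BASE[key]
    return {dim: round(v * (1 + margin), 3) for dim, v in base.items()}

print(estimate_demand("image_fusion", "standard"))
```

In the architecture described above, the ground agent would refine these entries from execution feedback, so the estimates tighten over time instead of staying worst-case bounds.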
It is implemented via both onboard and ground agents, where the ground agent maintains and updates the Task Demand Knowledge Base using system-wide execution feedback and provides quality-constrained resource demand estimates to the L3 layer, while the onboard agent records task-level runtime feedback—such as realized resource consumption and achieved completion quality—and reports them to continuously refine demand estimates under given quality targets.

Key Technologies of YUHENG-OS

In this section, we discuss a set of key enabling technologies of YUHENG-OS that address the fundamental challenges of operating CNSCs.

Modeling of Heterogeneous Resource Demands for Space Tasks

In space environments, computing, storage, and communication resources are persistently constrained. Directly adopting the coarse-grained reservation strategy commonly used in terrestrial cluster operating systems—where resources are provisioned according to worst-case demand bounds—often leads to prolonged resource idleness, thereby reducing overall system throughput and task completion efficiency. This motivates the need for fine-grained modeling of multi-dimensional resource demands over the full task lifecycle, explicitly characterizing how different task types consume resources during sensing, processing, and data transmission or downlink stages. Such modeling provides an actionable foundation for efficient scheduling and orchestration under tight resource constraints. To this end, we consider constructing a model- and data-driven task resource demand knowledge base for space workloads, as shown in Figure 2. For representative space processing tasks, the knowledge base employs systematic task profiling and parameter exploration to cover, as comprehensively as possible, feasible resource provisioning configurations.
Through repeated evaluations, it assesses key performance indicators such as timeliness and accuracy, capturing both expected performance and variability, and derives stage-aware resource demand curves. Meanwhile, observations from in-orbit execution and task logs are continuously incorporated to refine and calibrate these models, allowing the knowledge base to progressively converge toward real-world resource consumption characteristics. With this capability, newly submitted tasks can query their lifecycle-wide, multi-dimensional resource requirements prior to execution, enabling demand-driven provisioning and fine-grained orchestration that substantially reduces resource idle periods and improves overall resource utilization.

Figure 2 Knowledge-base–driven task demand modeling and its impact on resource allocation efficiency in space clusters.

Fragmented Heterogeneous Resource Awareness under Network Constraints

To support task planning and resource orchestration, resource states of satellites in the cluster must be aggregated at the NOCC with controllable latency and sufficient accuracy. However, cloud-native space clusters are characterized by wide-area node distribution, low-rate and intermittently available links, and highly dynamic and fragmented resources. Under these conditions, the uniform periodic heartbeat mechanisms commonly adopted by terrestrial cloud-native cluster operating systems—which rely on persistent and high-bandwidth control–data plane connectivity—become ineffective and poorly scalable. As a result, resource awareness and aggregation mechanisms tailored to network-constrained, time-varying environments emerge as a key enabling technology. To address these constraints, we adopt a multi-domain awareness architecture anchored by MEO/GEO satellites, leveraging their wide coverage and relatively stable topology to form reliable transmission channels, as shown in Figure 3.
Domain partitioning and inter-domain coordination explicitly account for link conditions and control-plane load. On this basis, differentiated resource state reporting strategies are employed, in which reporting granularity and frequency are adaptively determined according to resource type and volatility. Specifically, resource profiles and change models are constructed for each node–resource pair: rapidly varying resources are monitored with shorter sensing intervals, while slowly changing or long-term stable resources are reported at longer intervals or via event-driven updates. This approach significantly reduces communication overhead while preserving high-fidelity awareness of critical resource states, providing reliable inputs for subsequent scheduling decisions.

Figure 3 MEO/GEO-Based Multi-Domain Awareness Architecture and Knowledge-Base-Driven Adaptive Resource Awareness Strategy.

Matching of Differentiated Tasks with Multidimensional Heterogeneous Resources under Temporal Dependency Constraints

Matching space tasks to resources is inherently a multi-dimensional knapsack problem, where heterogeneous resources, spanning computing, storage, communication, and sensing, need to be allocated to multi-stage task pipelines under temporal dependency constraints. Terrestrial cloud-native cluster operating systems typically operate over stable, high-bandwidth interconnects and relatively simple execution stages, and therefore adopt task-triggered, compute–storage-oriented orchestration. These conditions do not hold for space clusters, where link availability is windowed and time-varying, topology evolves continuously, and task stages exhibit distinct resource preferences. This motivates a differentiated orchestration strategy that explicitly accounts for temporal constraints and multi-resource coupling.
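Returning to the volatility-adaptive reporting strategy described for resource awareness, one simple realization maps a per-resource volatility score to a sensing interval and adds an event-driven path for stable resources. All thresholds and interval values below are illustrative assumptions, not parameters from YUHENG-OS:

```python
def reporting_interval_s(volatility, base_interval=30.0, fast=5.0, slow=300.0):
    """Map a resource's volatility score in [0, 1] to a reporting interval.

    Rapidly varying resources (high volatility) are sensed at short
    intervals; near-static ones fall back to long intervals. The
    thresholds and interval values are illustrative only.
    """
    if volatility > 0.7:
        return fast            # e.g. compute load on a busy satellite
    if volatility > 0.2:
        return base_interval
    return slow                # e.g. installed storage capacity

def should_report(prev_value, new_value, rel_threshold=0.05):
    """Event-driven path for stable resources: report only when the
    relative change exceeds a threshold."""
    if prev_value == 0:
        return new_value != 0
    return abs(new_value - prev_value) / abs(prev_value) >= rel_threshold

print(reporting_interval_s(0.9), reporting_interval_s(0.05))
print(should_report(100.0, 102.0))  # small drift on a stable metric
```

Suppressing reports for low-volatility resources is what buys the communication-overhead reduction the text claims, at the cost of a bounded staleness window per resource.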
Figure 4 Differentiated Task Orchestration Scheme and Multidimensional Knapsack Problem under Temporal Dependency Constraints, with a Multi-Source Remote Sensing Fusion Task as an Example.

We adopt a hierarchical orchestration framework that combines synchronous pre-planning for regular tasks with asynchronous fast reaction for urgent tasks, as shown in Figure 4. Predictable, recurring workloads are handled through periodic planning and rolling updates, whereas time-critical tasks trigger rapid reactive scheduling to minimize response latency. To balance response speed and orchestration complexity, regular tasks further employ tiered planning cycles: high-priority tasks are replanned more frequently, while lower-priority tasks use longer cycles to reduce control and computation costs. Internally, each task is modeled as a directed acyclic graph (DAG), where nodes represent processing stages and edges capture dependency constraints. The processing stages are subject to strict temporal ordering constraints, such that later stages cannot commence until prerequisite stages are completed. For example, a multi-source remote-sensing fusion task can be decomposed into a sequence of ordered execution stages, including multi-source sensing, image preprocessing, image transmission, multi-source fusion, and result distribution, each exhibiting different demands and affinities for heterogeneous resources. Orchestration decisions must therefore respect temporal dependency constraints, which substantially increases decision complexity. In practice, scalable online operation can be supported using low-complexity approximation approaches, such as distributed heuristics or learning-driven policies. Moreover, due to the heterogeneity of onboard computing resources, different tasks exhibit distinct preferences for specific computing devices and achieve varying execution efficiencies.
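The DAG model above can be made concrete with a small sketch: stage names follow the fusion example in the text, while the per-stage durations are invented. A Kahn-style topological pass computes each stage's earliest start time under the temporal ordering constraints, which is the backbone any scheduler for such pipelines must respect:

```python
from collections import deque

# Stages of the multi-source remote-sensing fusion example; each entry lists
# the predecessor stages that must finish before the stage may start.
stages = {
    "sensing": [],
    "preprocessing": ["sensing"],
    "transmission": ["preprocessing"],
    "fusion": ["transmission"],
    "distribution": ["fusion"],
}
duration_s = {  # invented per-stage durations, in seconds
    "sensing": 10, "preprocessing": 20, "transmission": 15,
    "fusion": 30, "distribution": 5,
}

def earliest_start_times(deps, dur):
    """Topological (Kahn) pass: a stage may start only after all of its
    prerequisite stages have completed."""
    indeg = {s: len(p) for s, p in deps.items()}
    succ = {s: [] for s in deps}
    for s, preds in deps.items():
        for p in preds:
            succ[p].append(s)
    start = {s: 0 for s in deps}
    ready = deque(s for s, d in indeg.items() if d == 0)
    while ready:
        s = ready.popleft()
        for nxt in succ[s]:
            start[nxt] = max(start[nxt], start[s] + dur[s])
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                ready.append(nxt)
    return start

print(earliest_start_times(stages, duration_s))
```

In a full orchestrator, each stage's placement would additionally be constrained by link visibility windows and per-device resource capacities, which is what turns this ordering problem into the multidimensional knapsack described above.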
To fully exploit the capabilities of heterogeneous devices, heterogeneity needs to be explicitly accounted for during orchestration.

Results and Discussions

To quantitatively evaluate the performance of YUHENG-OS in spaceborne cloud-native environments, we compare it with Kubernetes, a representative terrestrial cluster operating system. A dedicated simulation framework is constructed to model a CNSC. As summarized in Table 1, network sizes from 600 to 6,000 satellites are considered across LEO, MEO, and GEO orbits. LEO and MEO adopt the Starlink and TSN orbital configurations, respectively, while GEO satellites are uniformly distributed along the equatorial plane. The total computing capacity of the CNSC is set to 3,000 GB/s and distributed across all satellites, resulting in increasing computing dispersion as the network size grows. Rather than characterizing computational capability using conventional metrics such as Operations Per Second (OPS) or FLOPS, we express it in terms of the equivalent data processing throughput per second. Inter-satellite links are configured with microwave (100–500 kbps) and laser (5–20 Gbps) capacities, and satellite-to-ground links are set to 1 Gbps. Task arrivals range from 500 to 4,000, with four priority levels, where level 4 denotes emergency tasks. Each task represents a typical multi-source remote-sensing image fusion workload, generating 5 GB of raw data that is reduced to 20 MB after onboard processing. Performance is evaluated using the weighted task completion ratio, which captures the system’s ability to guarantee prioritized tasks while efficiently utilizing heterogeneous computing and communication resources.
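The article does not give a closed-form definition of the weighted task completion ratio. A common formulation, and our assumption here, weights each task by its priority level so that completing high-priority tasks contributes more to the score:

```python
def weighted_completion_ratio(tasks):
    """tasks: list of (priority, completed) pairs, priority in {1, 2, 3, 4}.

    Ratio of priority-weighted completed tasks to total priority weight.
    This particular weighting is our assumption, not taken from the article.
    """
    total = sum(p for p, _ in tasks)
    done = sum(p for p, ok in tasks if ok)
    return done / total if total else 0.0

# One emergency task (priority 4) and two of three regular tasks complete.
sample = [(4, True), (3, True), (2, False), (1, True)]
print(weighted_completion_ratio(sample))
```

Under this weighting, dropping a single emergency task costs as much as dropping several low-priority ones, which matches the stated goal of guaranteeing prioritized tasks.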
Table 1 Parameter Settings.

    Parameter                           Value
    Total Number of Satellites          600 to 6,000 (including LEO, MEO, and GEO)
    Total Computing Capacity            3,000 GB/s (distributed across all satellites)
    Inter-Satellite Link Capacity       {100, 200, 500} kbps (microwave); {5, 10, 20} Gbps (laser)
    Satellite-to-Ground Link Capacity   1 Gbps
    Number of Tasks                     500 to 4,000
    Task Priority Levels                Regular tasks: 1, 2, 3; Emergency tasks: 4
    Origin Data Volume                  5 GB
    Processed Data Volume               20 MB

Figure 5 shows that, as the network size increases, YUHENG-OS and Kubernetes exhibit markedly different trends in weighted task completion ratio. With increasing network size, the weighted task completion ratio of YUHENG-OS consistently improves, achieving a maximum gain of 67%, whereas that of Kubernetes degrades significantly, with a reduction of approximately 50%. With a network size of 6,000 and 4,000 arrival tasks, the performance gap between YUHENG-OS and Kubernetes widens to 98%.

Figure 5 Weighted Task Completion Ratio with Varying Network Size and Task Number.

To further identify the root causes of this performance disparity, we conduct a comparative analysis of the average resource awareness delay and the awareness-induced scheduling failure ratio for YUHENG-OS and Kubernetes. The average resource awareness delay characterizes the latency incurred by the NOCCs in acquiring a global resource view across the constellations. The awareness-induced scheduling failure ratio is defined as the proportion of scheduling failures attributable to awareness delays relative to the total number of failed tasks, thereby directly quantifying the impact of awareness latency on scheduling outcomes. As illustrated in Figure 6, the average resource awareness delay of Kubernetes increases sharply with network size.
When the network size reaches 6,000, the resource view obtained by the NOCCs lags the actual onboard resource states by an average of 48 s, resulting in an awareness-induced relative scheduling failure ratio as high as 82%. In contrast, YUHENG-OS demonstrates strong robustness to network scaling. Under the same 6,000-satellite configuration, its average resource awareness delay is only 5 s, substantially lower than that of Kubernetes, and the awareness-induced relative scheduling failure ratio is reduced by approximately 71%. These results indicate that the improvement in task completion ratio achieved by YUHENG-OS over Kubernetes is primarily attributed to its significantly reduced awareness delay, underscoring that timely and accurate resource awareness is critical to the effective management of CNSCs. This advantage stems from the multi-domain adaptive periodic awareness mechanism enabled by MEO/GEO constellations. Furthermore, by incorporating fine-grained task modeling and explicitly accounting for inter-stage temporal dependencies and the dynamic satellite network topology during resource orchestration, YUHENG-OS enables precise resource allocation at each execution stage, thereby sustaining high task completion ratios under large-scale and high-load conditions.

Figure 6 Average Resource Awareness Latency and Awareness-Induced Scheduling Failure Ratio with Varying Network Size (Task Number = 4,000).

Conclusion

This article investigated the emerging paradigm of cloud-native space clusters and examined the limitations of directly applying terrestrial cluster operating systems to space environments, which stem from fragmented spaceborne computing resources and satellite mobility.
To address these challenges, we proposed YUHENG-OS, a cloud-native space cluster operating system that establishes a unified resource management pathway spanning from space applications down to the underlying space cluster infrastructures through a four-layer architecture comprising CNSC Extension Management, Resource View Construction, Resource Orchestration, and Task Analysis and Modeling. Building on this architecture, we further introduced three key enabling technologies: modeling of heterogeneous resource demands for space tasks, fragmented heterogeneous resource awareness under network constraints, and matching of differentiated tasks with multidimensional heterogeneous resources under temporal dependency constraints. Simulation results demonstrate that YUHENG-OS significantly outperforms representative terrestrial solutions exemplified by Kubernetes in terms of task completion ratio, primarily due to its substantially lower resource awareness latency. These findings highlight the need for CNSC-oriented operating systems to explicitly incorporate network constraints, temporal dependencies, and resource heterogeneity into their core design. In this article, YUHENG-OS provides an OS-level foundation that delivers stable and unified resource management for inherently fragmented and dynamic space cluster infrastructures, thereby enabling scalable and efficient cloud-native space clusters.

Acknowledgements

This work was supported in part by the National Natural Science Foundation of China (Grant No. 62341130), in part by the Tsinghua University Initiative Scientific Research Program, and in part by the Shanghai Municipal Science and Technology Major Project.

Author Information

Jin Zhang (jin-zhan22@mails.tsinghua.edu.cn) is currently working toward the Ph.D. degree with the Department of Electronic Engineering, Tsinghua University, Beijing, China. He received the B.S.
degree from Beijing University of Posts and Telecommunications, Beijing, China, in 2022. His current research interest focuses on space-based computing networks.

Jiachen Sun (sjc20@tsinghua.org.cn) is currently a postdoctoral researcher with The Hong Kong University of Science and Technology, Hong Kong, China. He received the Ph.D. degree from the Department of Electronic Engineering, Tsinghua University, Beijing, China, in 2026, and the B.S. degree from Xidian University, Xi’an, China. His current research interest is space-based computing networks.

Kai Liu (liukaiv@mail.tsinghua.edu.cn) is currently an Associate Research Fellow with the Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China. He received the B.S. and M.S. degrees from Xidian University, Xi’an, China, in 2009 and 2012, respectively, and the Ph.D. degree from Tsinghua University, Beijing, China, in 2016. His current research interests include space information networks and on-board switching.

Linling Kuang (kll@mail.tsinghua.edu.cn) is currently a Research Fellow with the Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China. She received the B.S. and M.S. degrees from the National University of Defense Technology, Changsha, China, in 1995 and 1998, respectively, and the Ph.D. degree from Tsinghua University, Beijing, China, in 2004. Her research interest is satellite communications.

Jianhua Lu (lhh-dee@mail.tsinghua.edu.cn) is currently a Professor with the Department of Electronic Engineering, Tsinghua University, Beijing, China. He received the B.S. and M.S. degrees from Tsinghua University in 1986 and 1989, respectively, and the Ph.D. degree from The Hong Kong University of Science and Technology in 1998. His research interests include wireless communications and satellite communications. He is a Fellow of IEEE.
