We present DASH (Deception-Augmented Shared mental model for Human-machine teaming), a novel framework that enhances mission resilience by embedding proactive deception into Shared Mental Models (SMM). Designed for mission-critical applications such as surveillance and rescue, DASH introduces "bait tasks" to detect insider threats, e.g., compromised Unmanned Ground Vehicles (UGVs), AI agents, or human analysts, before they degrade team performance. Upon detection, tailored recovery mechanisms are activated, including UGV system reinstallation, AI model retraining, or human analyst replacement. In contrast to existing SMM approaches that neglect insider risks, DASH improves both coordination and security. Empirical evaluations across four schemes (DASH, SMM-only, no-SMM, and baseline) show that DASH sustains approximately 80% mission success under high attack rates, eight times higher than the baseline. This work contributes a practical human-AI teaming framework grounded in shared mental models, a deception-based strategy for insider threat detection, and empirical evidence of enhanced robustness under adversarial conditions. DASH establishes a foundation for secure, adaptive human-machine teaming in contested environments.
Why human-machine teaming (HMT) systems? As HMT systems are increasingly deployed in high-stakes operations, ensuring secure and efficient collaboration among human analysts, AI agents, and UGVs is critical [1]. A well-established Shared Mental Model (SMM) enhances coordination by synchronizing tasks, fosters verification and validation, promotes adaptation to dynamic conditions, and mitigates failures. Without an effective SMM, miscommunication, inefficient task allocation, and increased risks of mission failure arise [2]. Moreover, adversaries can exploit inconsistencies in trust and coordination, underscoring the need for a security-aware HMT framework that strengthens both collaboration and resilience.
Why are SMMs critical? According to Cannon-Bowers’ team mental model framework [3,4], an SMM establishes a shared understanding that enhances coherent task execution and adaptability. However, SMM applications in HMT systems remain underdeveloped [5]. Existing models [6,7] fail to capture the dynamic interactions between human and AI teammates, and security remains largely unaddressed. Traditional static defenses [8] are insufficient against adversaries manipulating trust dynamics or selectively targeting system components. To ensure mission integrity, a security-aware SMM is essential for proactively detecting and mitigating these evolving threats.
Cyber deception in HMT systems. Cyber deception has emerged as a promising proactive defense strategy [9], offering unique advantages in securing HMT systems. Unlike conventional security mechanisms focusing solely on perimeter defense or reactive threat detection [10,11], cyber deception actively manipulates an adversary’s perception, inducing suboptimal decisions [9]. This is particularly valuable in HMT environments, where the diverse attack surface, spanning human, AI, and physical components, creates complex trust relationships. By strategically deploying deceptive elements, such as bait tasks [12], defenders can identify compromised team members before mission degradation occurs [13]. The collaborative nature of HMT systems also enables cross-validation of actions across team members, further enhancing deception effectiveness [14,15].
Moreover, deception techniques can seamlessly integrate with normal mission operations without disrupting workflow, preserving operational continuity [16]. Despite the critical need to secure HMT systems against sophisticated threats, particularly insider attacks that exploit the human-machine trust boundary, there is no existing comprehensive framework integrating cyber deception with HMT [3,5]. This gap represents a significant vulnerability in mission-critical applications, where both performance and security must be jointly optimized [17].
Gaps filled by our approach, DASH. We propose Deception-Augmented Shared Mental Model for Human-Machine Teaming (DASH), integrating proactive cyber deception into SMMs to improve HMT performance and ensure security. Implemented in a simulated surveillance mission with UGVs, an AI agent, and a human analyst, DASH employs bait tasks for real-time insider threat detection and dynamic trust adjustment. It optimizes coordination through adaptive task allocation, balancing efficiency and security in adversarial environments. By formalizing an operational SMM framework, DASH integrates seamlessly into HMT systems, detecting compromised members without disrupting operations. It also evaluates trade-offs among mission success, operational costs, and detection effectiveness under varying advanced persistent threat (APT) conditions, ensuring resilient and secure human-machine collaboration.
This study makes the following key contributions:
• A practical, operational SMM framework for human-machine teaming that coordinates UGVs, an AI agent, and a human analyst;
• A deception-based strategy that uses bait tasks for insider threat detection and triggers tailored recovery mechanisms; and
• Empirical evidence that DASH sustains high mission success under adversarial conditions, substantially outperforming non-deceptive counterparts.
II. RELATED WORK

A. Human-Machine Teaming (HMT) Systems

1) Definition: HMT refers to the collaboration between humans and machines to achieve shared objectives through interaction and communication [18]. Human-AI Teaming Systems (HATS) integrate humans and AI models to combine their respective strengths [3], while Human-Machine Collaboration (HMC) represents a collaborative form of Human-Machine Interface (HMI) common in health management, surveillance, and manufacturing [19]. Our work implements an HMT system where humans and machines collaborate through coordinated information sharing and trust-based interactions.
2) Trust: Trust in HMT systems encompasses: (1) trust in automation, an agent’s reliability in uncertain conditions [20], and (2) mutual trust, which enhances communication, consistency, and adaptability [21]. Machines must estimate human trust levels to ensure safety and interaction quality [22]. Studies show how fidelity and trust influence human responses to autonomous systems [23], with human oversight remaining critical in AI-driven operations [24]. Our work implements trust as a dynamic component through the ADTM protocol, which adjusts information sharing based on trust levels to enhance mission integrity against insider threats.
3) Shared Mental Models (SMMs): SMMs enhance coordination in HMT systems, improving team effectiveness [3]. Computational models enable artificial agents to predict human behavior [5,25], while frameworks like CAST anticipate teammates’ information needs [26,27]. Robotic systems demonstrate that shared cognition enhances coordination in simulated environments [28]. Well-established SMMs improve communication and performance under stress [29-31], enhancing adaptability in dynamic environments [32,33] across domains like surveillance, rescue, and logistics [17,34,35].
Despite these advantages, existing SMM implementations have three critical limitations: they focus on coordination while neglecting security considerations; validation remains confined to simplified simulations (e.g., Minecraft-based USAR tests [5]); and they employ static trust assumptions [3]. Our DASH framework addresses these through: (1) cyber deception mechanisms with bait tasks for detecting compromised agents, (2) dynamic trust management using multi-source verification, and (3) adaptive task allocation algorithms that maintain mission effectiveness during recovery cycles. DASH’s validation demonstrates higher resilience to advanced persistent threat (APT) attacks compared to conventional approaches [30,31].
4) Mission-Driven HMT Systems: Mission-driven HMT systems improve coordination in high-stakes settings [36], with AI-as-Partner frameworks supporting complex task collaboration [37]. Key advances include human systems engineering [38], model-based systems engineering for unmanned missions [39], resilient UGV platoon frameworks [14], and hierarchical systems for disaster response [40]. Augmented reality enhances interaction between infantry and unmanned systems [41], while psychological manipulation can cause missions to fail [42,43].
However, existing systems lack real-time mechanisms for detecting compromised team members [14,37], sufficient integration between trust assessment and decision-making [15,41], and adequate balance between security and mission performance [36,44]. Studies on security threats [45][46][47] highlight insider threats but lack integration with HMT frameworks. Our work addresses these limitations through proactive defense mechanisms that detect threats in real-time while maintaining operational continuity.
Defensive deception misleads attackers by shaping their perceptions to trigger suboptimal decisions [48,49]. Bait-based methods include trap files for unauthorized access detection [50], game-theoretic honeypots [51], and systems combining camouflage and decoys [12,52]. Most existing techniques focus on conventional IT infrastructures with static trust models and are tested in isolated settings, not complex multi-agent systems.
DASH embeds bait tasks into task allocation and trust management to detect compromised team members while maintaining team performance. It is the first to integrate bait-based deception into an operational SMM for HMT systems, bridging traditional deception methods with the dynamic needs of modern human-machine teams.
Table I compares our work with existing HMT and deception frameworks based on their use of HMT, deception, SMM, and trust management. In Section VII, we further compare the performance of DASH against these counterparts, focusing on the impact of deception and SMM integration.
III. PROBLEM STATEMENT

This work considers a mission-oriented HMT system comprising multiple UGVs with peer-to-peer communication, an AI agent, a human analyst, and a command center for surveillance operations. Such systems are vulnerable to APTs that may implant backdoors in the AI agent, manipulate the human analyst via social engineering, or inject false data into UGV communications to disrupt mission goals. To counter these threats, the Command Center operates the DASH framework, managing trust, task allocation, and information integration. A Trusted Execution Environment (TEE) [53] ensures the integrity of critical computations and decisions.
Traditional security measures often fail to proactively detect or mitigate sophisticated threats [54]. DASH addresses this gap by integrating individual mental models (IMMs) into a unified SMM and deploying deceptive “bait tasks” to identify compromised components. These tasks include planted objects for UGVs, validation prompts for the AI agent, and known-image checks for human analysts. A trust management mechanism continuously evaluates each node, triggering recovery protocols (UGV reinstallation, AI retraining, or analyst replacement) when trust scores fall below a threshold (see Section V-B).
Each surveillance operation comprises $M$ missions, each representing a detection cycle (DC) initiated by the Command Center. A binary variable $s_m$ denotes success ($s_m = 1$ if classification is correct, 0 otherwise). The total cost $C_{\text{total}}$ includes mission execution, bait deployment, and recovery.
The DASH framework aims to maximize:

$$w_1 \cdot \text{MSR} - w_2 \cdot C_{\text{total}}, \qquad (1)$$

where $\text{MSR} = \frac{1}{M} \sum_{m=1}^{M} s_m$ and $\sum_{m=1}^{M} t_m \le T_{\max}$, with $t_m$ the time for mission $m$, $T_{\max}$ the total time limit, and weights $w_1$ and $w_2$ ($w_1 + w_2 = 1$; $w_1, w_2 > 0$) balancing mission success and cost.
As an illustrative example, consider a three-cycle mission where AI analysis achieves MSR = 0.67 at a total cost of 0.3, while human review achieves MSR = 1.00 at a cost of 3.0. With weights $w_1 = 0.7$ and $w_2 = 0.3$, Eq. (1) yields a score of $0.7 \times 0.67 - 0.3 \times 0.3 \approx 0.38$ for AI analysis and $0.7 \times 1 - 0.3 \times 3 = -0.2$ for human review. Since the AI score is higher, AI analysis is preferred.
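To make the arithmetic concrete, the following sketch evaluates Eq. (1) for both options; the helper name is illustrative, not from the paper.

```python
def mission_score(msr: float, total_cost: float, w1: float = 0.7, w2: float = 0.3) -> float:
    """Weighted objective of Eq. (1): reward mission success, penalize cost."""
    assert abs(w1 + w2 - 1.0) < 1e-9 and w1 > 0 and w2 > 0
    return w1 * msr - w2 * total_cost

# Worked example from the text: three detection cycles.
ai_score = mission_score(msr=2 / 3, total_cost=0.3)     # ~0.38
human_score = mission_score(msr=1.0, total_cost=3.0)    # -0.2
print(f"AI: {ai_score:.2f}, human: {human_score:.2f}")  # AI analysis is preferred
```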
DASH integrates IMMs, an SMM, and cyber deception to proactively detect compromised members. It assigns regular and bait tasks to trigger recovery actions such as UGV reinstallation, AI retraining, or analyst replacement, ensuring mission resilience under adversarial threats (see Section V).
The HMT system consists of multiple UGVs with autonomous navigation and image capture, communicating in a fully connected topology with the command center, the AI agent, and each other. UGVs share location, detection, and status data, enabling multi-angle verification to improve classification accuracy and mission reliability.
The AI agent processes detections and consults the human analyst under high uncertainty. The command center manages the mission, assigns regular and bait tasks, and monitors trust to maintain security. Wired links are secured via endto-end encryption and physical controls [55], while wireless channels remain vulnerable to injection and eavesdropping. This architecture enables efficient HMT with adaptive security, as illustrated in Fig. 1.
1) Command Center: The command center initiates surveillance by specifying target object types and operates within a TEE to ensure secure task management. It implements the DASH framework by integrating IMMs into a unified SMM, allocating tasks, and deploying bait tasks to assess system integrity. Mission success is confirmed through cross-validation of AI and human analyst results.
2) UGVs: The system deploys autonomous UGVs with cameras and peer-to-peer communication, sharing location and detection data for multi-angle validation. Upon detecting a potential target, neighboring UGVs converge to confirm from different perspectives, relaying images to the AI agent and command center.
This validation process helps identify compromised UGVs injecting false data. If consensus is not achieved, the command center uses bait tasks to detect anomalies and triggers system reinstallation as needed.

3) AI Agent: The AI agent processes UGV images using a trust-based prioritized queue and applies Evidential Deep Learning (EDL) [56] to compute uncertain opinions via Subjective Logic (SL) [57]. It generates a belief vector $\mathbf{b}$ and epistemic uncertainty $u$, ensuring $\sum_k b_k + u = 1$. The per-detection cost is $c^{\text{AI}}_{\text{detect}} = 0.1$.
When $u > \theta_u$, the AI requests human input. The analyst provides a base rate vector $\mathbf{a}$ of prior class probabilities, incurring a higher cost $c^{\text{human}}_{\text{detect}} = 1$. The final prediction is guided by $\mathbf{P} = \mathbf{b} + \mathbf{a} \cdot u$.
4) Human Analyst: The human analyst supports the system when AI uncertainty exceeds a threshold, contributing prior belief as a base rate vector $\mathbf{a} = [a_1, a_2, \ldots, a_K]$, with $\sum_{k=1}^{K} a_k = 1$ [57]. This belief is integrated into $\mathbf{P} = \mathbf{b} + \mathbf{a} \cdot u$ to enhance classification reliability.
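A minimal sketch of this fusion step, assuming $K = 5$ classes and the constraint $\sum_k b_k + u = 1$ from above; the values and function name are illustrative.

```python
import numpy as np

def fuse_prediction(belief: np.ndarray, uncertainty: float, base_rate: np.ndarray) -> np.ndarray:
    """Projected probability P = b + a * u (Subjective Logic)."""
    assert np.isclose(belief.sum() + uncertainty, 1.0), "opinion must satisfy sum(b) + u = 1"
    assert np.isclose(base_rate.sum(), 1.0), "base rates are prior class probabilities"
    return belief + base_rate * uncertainty  # sums to 1 by construction

b = np.array([0.30, 0.25, 0.05, 0.05, 0.05])  # beliefs over 5 classes (sum = 0.70)
u = 0.30                                      # epistemic uncertainty
a = np.array([0.40, 0.30, 0.10, 0.10, 0.10])  # analyst-supplied base rate vector
P = fuse_prediction(b, u, a)                  # e.g., P[0] = 0.30 + 0.40 * 0.30 = 0.42
```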
The considered HMT system faces sophisticated adversarial threats targeting its key components. We categorize these threats as follows:
(1) Compromising UGVs: UGVs are particularly vulnerable due to their reliance on wireless communication interfaces. They are susceptible to reconnaissance, false data injection, and command-and-control attacks that disrupt mission performance [14,58]. Adversaries may analyze network traffic, inject false sensor data to mislead classification, or establish covert control channels to introduce transmission noise and impair coordination.

(2) Compromising AI agents: AI agents face backdoor attacks during training, where adversaries tamper with loss computation mechanisms to generate backdoored inputs [59]. These attacks are designed to preserve standard performance while embedding malicious behavior [60], and often remain undetected by activating near convergence using defense evasion strategies [61].

(3) Compromising human analysts: Human analysts are vulnerable to social engineering and psychological manipulation [42,43]. Adversaries may gain trust through seemingly credible interactions and exploit both psychological factors and material incentives [46,47], potentially leading to biased or incorrect decision-making during high-stakes operations.

Table II summarizes the attack behaviors and corresponding countermeasures proposed in this work. The DASH framework addresses these threats through a comprehensive, multi-layered defense as described below.
To ensure mission resilience, DASH implements targeted defenses against compromised components in the HMT system. These countermeasures are designed to detect, isolate, and recover from adversarial disruptions while preserving operational continuity. They are described as follows.
• Defense Against Compromised UGVs: DASH employs a hardware-based module to initiate secure remote reinstallation. Upon compromise detection, an encrypted command triggers system sanitization, and integrity is verified using cryptographic signatures.
• Defense Against Compromised AI Agents: DASH mitigates backdoor risks by verifying scripts, dependencies, and loss functions, followed by retraining AI models with trusted datasets and verified software.
• Defense Against Compromised Human Analysts: DASH cross-validates AI and analyst outputs and deploys analyst-specific bait tasks; confirmed compromises trigger personnel replacement, security training, and psychological support.
This section presents DASH, a framework that optimizes task allocation, information sharing, and request handling in human-AI surveillance teams, while ensuring integrity through proactive threat detection. Building on traditional SMM theories [25,30,31], DASH integrates deception-based security to defend against adversarial threats.
DASH comprises three key components: individual mental models (IMMs), a shared mental model (SMM), and Adaptive Deceptive Task Management (ADTM) for secure task distribution. IMMs capture each participant’s roles, tasks, and situational awareness, and are periodically merged into the SMM to support coordinated decision-making.

TABLE II. ATTACK BEHAVIORS AND DASH COUNTERMEASURES

Attacks on UGVs:
- Reconnaissance [58]: Attackers analyze wireless traffic to identify vulnerabilities, monitor protocol patterns, and plan exploits. Countermeasure: DASH detects anomalies, flags unusual scans, and dynamically adjusts security policies.
- False Data Injection [14]: Attackers manipulate sensor data to mislead the command center about UGV locations, detections, or status. Countermeasure: DASH verifies data integrity via trust metrics and bait tasks, flagging anomalies for investigation.
- C2-based Noise Injection [14]: Attackers use a C2 channel to inject noise, degrading image quality or altering classifications. Countermeasure: DASH applies multi-angle validation and human oversight to detect inconsistencies.

Attacks on the AI agent:
- Code-based Backdoor Injection [59]: Attackers modify loss computation to create models that misclassify specific inputs while appearing normal. Countermeasure: DASH audits training pipelines and libraries to detect and remove malicious code.
- Defense Evasion [61]: Attackers embed evasion techniques that activate near convergence, improving resistance to detection. Countermeasure: DASH issues bait tasks with known outputs to expose hidden model compromises.
- Targeted Misclassification [60]: The compromised model misclassifies specific inputs while maintaining high normal accuracy. Countermeasure: DASH retrains the AI model from scratch using verified datasets and trusted software.

Attacks on the human analyst:
- Target Profiling [42]: Attackers gather intelligence on analysts via social media and networks to exploit access. Countermeasure: DASH cross-validates AI and human analyst outputs to detect anomalies.
- Social Engineering [43,45]: Attackers craft pretexting scenarios to build credibility and escalate security-compromising requests. Countermeasure: DASH deploys bait tasks to detect unusual response patterns indicative of external influence.
- Psychological Manipulation [46,47]: Attackers exploit incentives and psychological vulnerabilities to manipulate validation results. Countermeasure: DASH triggers immediate personnel replacement, security training, and psychological support.
ADTM integrates deception into tasking by dynamically assigning regular and bait tasks to sustain mission continuity and detect compromised agents.
• Regular Task Creation and Distribution: ADTM generates mission tasks (e.g., data collection, target identification, evidence analysis) using SMM data and assigns them dynamically based on roles, skills, and availability.
The system implements mechanisms to maintain member trustworthiness and operational integrity as follows.
- Chain-Based Trust Updates: The command center updates trust scores by verifying task outcomes. When a UGV transmits an image, the AI agent’s classification is compared against the human analyst’s assessment. Agreement increases trust for all involved, while discrepancies reduce trust across the chain, signaling potential compromise. Pseudocode for the trust-score update process is provided as Algorithm S1 in the supplement.
To evaluate UGV trust accurately, the system uses multi-UGV validation. Each UGV starts with moderate trust (e.g., 0.5). When multiple UGVs detect the same object, the command center compares their outputs. Trust increases if they agree and are validated by the AI agent or human analyst. A deviating UGV is penalized; if all disagree, all lose trust.
- Trust Calculation Across Agents: Trust for member $i$ at time $t$ is calculated as:

$$T_i(t) = \frac{N^{c}_i(t) + W/2}{N^{\text{total}}_i(t) + W} - \lambda \cdot t,$$

where $N^{c}_i(t)$ and $N^{\text{total}}_i(t)$ are the number of consensus-matching and total tasks completed by $i$, respectively. The decay factor $\lambda = 0.001$ reduces trust over time, and $W = 10$ ensures adaptive initialization, yielding an initial trust of 0.5 before any tasks are completed.
If $T_i(t) < \theta_T$, a bait task is issued. Success resets trust to 1; failure triggers defense protocols (Sec. IV-D). If trust remains below $\theta_T$, the system initiates component-specific recovery: UGV reinstallation with standby units ($DT_{\text{UGV}} = 2$), AI retraining ($DT_{\text{AI}} = 3$), or analyst replacement ($DT_{\text{human}} = 5$).
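A minimal sketch of this maintenance loop, using the trust form reconstructed above; `issue_bait_task` is a hypothetical stand-in for dispatching a known-outcome task, and the thresholds and downtimes follow the values quoted in the text.

```python
import random

LAMBDA, W, TRUST_THRESHOLD = 0.001, 10, 0.3
DOWNTIME = {"UGV": 2, "AI": 3, "human": 5}  # recovery downtimes from the text

def trust_score(n_consensus: int, n_total: int, t: float) -> float:
    """Consensus ratio with prior weight W (initial value 0.5) and decay LAMBDA * t."""
    return (n_consensus + 0.5 * W) / (n_total + W) - LAMBDA * t

def issue_bait_task(member: str) -> bool:
    """Hypothetical hook: dispatch a bait task with a known outcome and verify it."""
    return random.random() < 0.5  # stand-in for real verification

def maintain_trust(member: str, n_consensus: int, n_total: int, t: float) -> float:
    score = trust_score(n_consensus, n_total, t)
    if score < TRUST_THRESHOLD:
        if issue_bait_task(member):
            return 1.0  # bait passed: trust resets to 1
        print(f"recover {member}: downtime {DOWNTIME[member]} cycles")  # reinstall/retrain/replace
    return score

print(maintain_trust("UGV", n_consensus=1, n_total=8, t=50.0))
```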
To enhance resilience, information-sharing frequency adapts to trust levels. For any interaction between members $i$ and $j$, the probability of full data transmission is $T_j(t)$, the trust of the recipient $j$. High-trust members receive full data; low-trust ones get restricted access. This mechanism minimizes exposure of mission-critical information while incentivizing reliable behavior. Operating within the SMM, data sharing decreases as trust in $j$ declines. Algorithm S2 in the supplement formalizes this protocol.
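A sketch of this trust-proportional rule (formalized as Algorithm S2); the split between full and restricted payload fields is an illustrative assumption.

```python
import random

def share(payload: dict, recipient_trust: float) -> dict:
    """Transmit the full payload with probability equal to the recipient's trust;
    otherwise fall back to a restricted, non-critical subset."""
    if random.random() < recipient_trust:
        return payload  # full transmission
    return {k: v for k, v in payload.items() if k in ("status", "location")}  # restricted

msg = {"status": "active", "location": (3, 7), "detection": "truck", "image_id": 42}
print(share(msg, recipient_trust=0.9))  # high trust: usually full data
print(share(msg, recipient_trust=0.2))  # low trust: usually restricted
```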
- SMM Structure: The SMM framework captures a team’s collective understanding of system behavior, tasks, and roles, enhancing coordination and reducing miscommunication. It merges individual mental models (IMMs) into a unified reference that aligns members on objectives, execution, and responsibilities. Each team member maintains an IMM reflecting their unique knowledge, goals, state, and awareness, including mission plans, environmental knowledge, and expectations of others. While incomplete individually, IMMs form the foundation of shared understanding.
- Deception-Augmented Team Mental Model in the SMM: The SMM serves as the central hub of team cognition, continuously updated through information exchange. It supports role clarity, task coordination, mutual trust assessment, and real-time adaptation to situational changes. Deception-based verification mechanisms embedded within the SMM enable early detection of inconsistencies, which is critical for mission success (see Section V-A).
Each bait task outcome updates the Team Mental Model (TEMM) component of the SMM by adjusting the trust score $T_i(t)$. These updated trust values inform team member actions; for example, the AI agent prioritizes its processing queue based on descending trust scores, favoring high-trust sources. Compact versions of the SMM and IMM frameworks are shown in Figs. 3 and 4, respectively, with full diagrams provided in Figs. S1-S2 of the supplement. A summary of key DASH components appears in Table SI of the supplement document.
- Example SMM: To illustrate, Fig. 5 shows an SMM for a surveillance mission involving a command center, UGVs, an AI agent, and a human analyst. Each team member maintains an IMM composed of a Task Mental Model (TAMM) and a Team Mental Model (TEMM). The TAMM includes static information (e.g., mission type, area maps) and dynamic elements (e.g., detected objects, sensor data, classification results), updated throughout the mission. The TEMM captures static factors (e.g., roles, capabilities) and dynamic updates (e.g., member locations, energy levels, operational status, trust).
The collective SMM integrates these IMMs. The shared TAMM aggregates static mission info and real-time updates on detections, classifications, and verifications. The shared TEMM consolidates team capability data with status updates on positions, queues, availability, and trust metrics derived from performance history. This SMM framework ensures synchronized team understanding, improving coordination and mission effectiveness. We summarize the mathematical notation used in our modeling and analysis in Table SII of the supplementary document.
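To make the TAMM/TEMM split concrete, here is a minimal sketch of the IMM and SMM data layout; all field names are illustrative, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class TaskMentalModel:  # TAMM: static mission info plus dynamic task state
    mission_type: str
    area_map: str
    detections: list = field(default_factory=list)       # dynamic: detected objects
    classifications: dict = field(default_factory=dict)  # dynamic: classification results

@dataclass
class TeamMentalModel:  # TEMM: static roles plus dynamic member status
    roles: dict                                 # e.g., {"ugv_1": "sensing", "ai": "identification"}
    status: dict = field(default_factory=dict)  # dynamic: location, energy, trust

@dataclass
class IMM:  # one individual mental model per team member
    owner: str
    tamm: TaskMentalModel
    temm: TeamMentalModel

def merge_into_smm(imms: list) -> dict:
    """Fold IMMs into a shared model: union of detections, latest status per member
    (conflict resolution omitted for brevity)."""
    shared = {"detections": [], "status": {}}
    for imm in imms:
        shared["detections"].extend(imm.tamm.detections)
        shared["status"].update(imm.temm.status)
    return shared
```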
To evaluate DASH’s effectiveness, we simulate adversarial conditions where attack attempts occur equally often but exploit distinct vulnerabilities. UGVs are most vulnerable (30% success rate) due to exposed wireless channels. AI agents face a 10% risk from backdoor injections, mitigated by code audits. Human analysts, protected through security training, are least vulnerable (5% success). These rates reflect real-world disparities: UGVs have broad attack surfaces, AI benefits from verifiability, and humans gain resilience through training.
- Detection Cycles: A detection cycle is DASH’s core operational unit, initiated by the Command Center. Each cycle includes task dispatch, UGV sensing, AI/human analysis, and ends with object classification confirmation.
Fig. 5. SMM example for the surveillance scenario.
(1) The Command Center assigns tasks to UGVs, each covering a designated area. (2) When a UGV captures a potential target, it uploads the image for AI analysis. If uncertainty is high, a human analyst is consulted (see Section IV). (3) Upon confirmation, other UGVs may be redirected for multi-angle validation or prioritized instance collection. Lower-priority images may be dropped to conserve processing. (4) The AI agent classifies the image, invoking human validation if uncertainty exceeds a threshold. (5) This process repeats until the required detections are reached. A mission succeeds if targets are identified within the time limit; otherwise, it fails.
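A compact sketch of one detection cycle tying these steps together; the collaborator objects and their methods are hypothetical placeholders, not an API from the paper.

```python
def detection_cycle(ugvs, ai_agent, analyst, uncertainty_threshold=0.25):
    """One DC: dispatch and sensing (1)-(2), classification (4), escalation, confirmation (5).
    Each collaborator is assumed to expose the methods used below."""
    for ugv in ugvs:
        image = ugv.patrol_and_capture()            # (1)-(2) task dispatch and sensing
        if image is None:
            continue                                # nothing detected in this area
        label, u = ai_agent.classify(image)         # (4) EDL-based classification
        if u > uncertainty_threshold:
            label = analyst.validate(image, label)  # escalate to the human analyst
        if label is not None:
            return label                            # (5) classification confirmed
    return None                                     # no confirmed detection this cycle
```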
The environment includes five object types (i.e., car, truck, human, animal, and tree) for detection. We fine-tuned a ViT-B/16 vision transformer (pre-trained on ImageNet) using 6,000 CIFAR-10 surveillance images with balanced category representation. CIFAR-10 was selected for its publicly available, balanced object classes that align with our surveillance targets and support reproducible evaluation. The ViT-B/16 model was fine-tuned directly on the original CIFAR-10 images without additional preprocessing, focusing on demonstrating the capabilities of our EDL-based AI agent.
For Evidential Deep Learning (EDL) [56], we modified the ViT output to produce evidence values for uncertainty quantification using Subjective Logic, enabling the system to determine when escalation to human input was warranted. Training was performed using the Adam optimizer with a learning rate of 0.0001, a batch size of 32, and 10,000 epochs.
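A minimal sketch of such an evidential output head, assuming the standard EDL construction (evidence via softplus, Dirichlet strength $S = \sum_k e_k + K$, belief $b_k = e_k / S$, uncertainty $u = K / S$, so $\sum_k b_k + u = 1$); this is one common formulation, not necessarily the exact head used in the paper.

```python
import torch
import torch.nn.functional as F

def edl_opinion(logits: torch.Tensor):
    """Map K class logits to a Subjective Logic opinion (belief vector b, uncertainty u)."""
    evidence = F.softplus(logits)               # non-negative evidence per class
    alpha = evidence + 1.0                      # Dirichlet parameters
    strength = alpha.sum(dim=-1, keepdim=True)  # S = sum(e) + K
    belief = evidence / strength                # b_k = e_k / S
    uncertainty = logits.shape[-1] / strength   # u = K / S, so sum(b) + u = 1
    return belief, uncertainty

logits = torch.tensor([[2.0, 0.1, -1.0, 0.0, 0.5]])  # 5 surveillance classes
b, u = edl_opinion(logits)
assert torch.allclose(b.sum(dim=-1, keepdim=True) + u, torch.ones(1, 1))
```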
- Simulation Setting: Table SIII in the supplement summarizes key simulation parameters chosen to balance realism and feasibility. The trust threshold ($\theta_T = 0.3$) triggered bait tasks in response to suspected compromises, while the uncertainty threshold ($\theta_u = 0.25$) determined when the AI agent requested human validation. To ensure statistical significance, 100 simulations were conducted per scheme. Attack frequencies ranged from 0.0 to 1.0 in increments of 0.1, enabling evaluation across scenarios from benign to fully adversarial conditions.
• Mission Success Rate (MSR) is the proportion of successfully completed missions:

$$\text{MSR} = \frac{1}{M} \sum_{m=1}^{M} s_m.$$
• Ratio of Compromised Members is the probability that a team member $i$ (UGV, AI, or human) remains compromised during a mission:

$$R^{i}_{\text{comp}} = \frac{1}{M} \sum_{m=1}^{M} c^{(m)}_i,$$

where $c^{(m)}_i = 1$ if $i$ was compromised in mission $m$, and 0 otherwise.
• Operational Cost (OC) is the average cost per mission:

$$\text{OC} = \frac{1}{M} \sum_{m=1}^{M} \left( C^{(m)}_{\text{UGV}} + C^{(m)}_{\text{AI}} + C^{(m)}_{\text{human}} + C^{(m)}_{\text{recovery}} \right),$$

where each term denotes mission-specific costs for UGVs, AI, human analysts, and recovery (e.g., reinstallation or retraining).

• SMM Quality Index (SQI) measures the accuracy of information exchanged via the SMM:

$$\text{SQI} = \frac{|I_{\text{correct}}|}{|I_{\text{shared}}|},$$
where $|I_{\text{correct}}|$ is the number of correctly shared pieces, and $|I_{\text{shared}}|$ is the total exchanged.

• SMM Coverage Index (SCI) quantifies the efficiency of team information flow:

$$\text{SCI} = \frac{|I_{\text{shared}}|}{|I_{\max}|},$$

where $|I_{\max}|$ is the total possible information that could be shared.
These sets satisfy $I_{\text{correct}} \subseteq I_{\text{shared}} \subseteq I_{\max}$.
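A sketch computing these metrics from per-mission logs; the record fields are illustrative.

```python
def compute_metrics(missions: list, n_members: int):
    """MSR, per-member compromise ratio, OC, SQI, and SCI from mission records.
    Each record: {"success": bool, "compromised": set of member ids, "cost": float,
                  "shared": int, "correct": int, "max_info": int}."""
    M = len(missions)
    msr = sum(m["success"] for m in missions) / M
    comp = {i: sum(i in m["compromised"] for m in missions) / M for i in range(n_members)}
    oc = sum(m["cost"] for m in missions) / M
    shared = sum(m["shared"] for m in missions)
    sqi = sum(m["correct"] for m in missions) / shared if shared else 0.0
    sci = shared / sum(m["max_info"] for m in missions)
    return msr, comp, oc, sqi, sci
```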
To evaluate the DASH framework, we designed four experimental schemes with varying integration levels:
• DASH-DF (Deception-Augmented SMM with Defense):
Full implementation with SMM-based coordination, trust-based task allocation, and deception. UGVs share data for multi-angle verification, the AI agent prioritizes tasks by trust, and bait tasks confirm suspected compromises before defenses are triggered (see Section IV-D).
• SMM-DF (SMM with Defense): Same as DASH-DF but without deception; defenses trigger immediately when trust falls below threshold $\theta_T$, without bait task verification.
• DF-only (Defense-only): No SMM coordination or UGV information sharing. The AI agent does not seek human input for uncertain cases. The command center updates trust, and defenses activate when trust falls below $\theta_T$.
• BASE (Baseline): Conventional setup with standard task dispatch. UGVs, an AI agent, and a human analyst operate independently, with no trust updates or defense mechanisms, even when outputs conflict.

This design systematically evaluates the impact of SMM and deception-based defenses on system performance and resilience.
We evaluate the four schemes over 200 object detection (OD) missions. In each figure, the x-axis denotes the OD mission index, and the y-axis shows the corresponding metric. The results demonstrate the advantages of integrating SMMs and deception-based defenses.
- Mission Success Rate (MSR): Fig. 6a shows MSR trends over 200 missions. All schemes initially declined under adversarial pressure, stabilizing as defenses were activated. DASH-DF consistently achieved the highest MSR due to SMM coordination and deception-based detection, which reduced compromises and improved information flow. BASE performed the worst, lacking both mechanisms.
- SMM Quality Index (SQI): Fig. 6b compares the SQI of all considered schemes. DASH-DF outperformed SMM-DF, as deception preserved data integrity and ensured accurate exchange. DF-only and BASE had zero SQI due to the absence of SMM, trust-sharing, UGV positioning, or AI-human collaboration, highlighting the importance of SMM in reliable communication.
- SMM Coverage Index (SCI): Fig. 6c presents the SCI of all considered schemes. DASH-DF slightly outperformed SMM-DF. Unlike SQI, SCI reflects the completeness of information sharing rather than correctness. DF-only and BASE again scored zero, lacking inter-agent communication.
- Operational Cost: Fig. 7 compares operational costs. DASH-DF incurred higher costs due to the strategic deployment of deception tasks based on dynamic trust. Human analyst verification was most expensive, as analyst-specific bait tasks required greater resources. Despite the cost, DASH-DF’s proactive defenses significantly reduced compromises (Fig. 8), justifying the investment in security.
- Ratio of Compromised Members: Fig. 8 shows compromise rates by role. DASH-DF had the lowest compromise rates across UGVs, AI agents, and analysts, aligning with the MSR trend in Fig. 6a. Rates rose early as attacks escalated, then stabilized as defenses responded. UGVs were most at risk due to exposed wireless links, AI agents faced a moderate threat, and analysts were least vulnerable due to resistance to social engineering.
We conducted a sensitivity analysis by varying the attack rate from 0 to 1 in 0.2 increments, while fixing vulnerability parameters at UGV : AI : Human = 0.3 : 0.1 : 0.05. Results at mission 200 confirm DASH’s robustness under escalating attack pressure and highlight the interaction between system components, defense mechanisms, and threat intensity.
- Mission Success Rate (MSR): Fig. 9a shows MSR trends across attack rates. DASH-DF consistently outperforms other schemes, maintaining ~70% success at the highest attack rate, while BASE drops below 10%. The growing performance gap between DASH-DF and SMM-DF underscores the critical role of deception-based defenses. DF-only and BASE degrade sharply, especially at low attack rates (0.0-0.2), exposing weaknesses in non-adaptive defenses.
- SMM Quality Index (SQI): Fig. 9b shows DASH-DF retains ~70% SQI at attack rate 1.0, while SMM-DF drops to ~50%. DASH-DF’s stable SQI reflects its resilience. DF-only and BASE remain at zero across all rates due to lacking effective information-sharing mechanisms.
- SMM Coverage Index (SCI): Fig. 9c shows that DASH-DF sustains SCI near 0.9 across all attack rates. SMM-DF declines slightly to ~0.85, highlighting ADTM’s role in maintaining communication. DF-only and BASE remain at zero, lacking a collaborative sharing infrastructure.
- Operational Costs: Fig. 10 presents cost trends. DASH-DF incurs higher UGV costs (1.4-1.7) due to bait tasks and reinstallations. Although other schemes show lower AI and human costs, they suffer higher compromise rates (Fig. 11). This tradeoff underscores DASH-DF’s investment in resilience.
- Ratio of Compromised Members: Fig. 11 shows that SMM-DF, DF-only, and BASE experience sharp compromise increases at low attack rates (0-0.2). In contrast, DASH-DF maintains low compromise levels even at an attack rate of 1.0: AI compromise remains below 0.15 and human analyst compromise under 0.1, demonstrating the effectiveness of deception-based protections.
Beyond attack frequency, we analyzed how varying team member vulnerabilities affect system performance. UGV vulnerability increased from 0.2 to 0.6 (in 0.1 increments), with AI vulnerability set to half the UGV value and human analyst vulnerability to half the AI value, yielding triplets {UGV:AI:Human} = {0.2:0.05:0.02, 0.3:0.1:0.05, 0.4:0.2:0.1, 0.5:0.3:0.15, 0.6:0.4:0.2}. Results under a fixed attack rate of 0.4 are shown in Figs. 12-14.
- Mission Success Rate (MSR): Fig. 12a shows that as vulnerabilities increase, MSR declines across all schemes. DASH-DF remains most resilient, sustaining over 80% success even at peak vulnerability, highlighting the benefit of combining SMM and deception. SMM-DF drops to ~60%, while DF-only and BASE fall sharply to ~20% and ~10%, respectively, failing to address growing compromise risks.
- SMM Quality Index (SQI): Fig. 12b shows DASH-DF maintains ~0.75 SQI at peak vulnerability, preserving integrity. SMM-DF drops to ~0.6, while DF-only and BASE stay at zero due to lacking information-sharing.
- SMM Coverage Index (SCI): As shown in Fig. 12c, DASH-DF maintains ~0.9 SCI across all settings. SMM-DF declines from ~0.95 to ~0.70, reflecting reduced information flow with higher compromise. The widening gap underscores deception’s role in preserving communication completeness.
- Operational Costs: Fig. 13 shows DASH-DF incurs the highest operational costs due to frequent bait tasks and recovery, increasing gradually with vulnerability. Although DF-only and BASE incur lower costs, their effectiveness is poor. At peak vulnerability, DASH-DF’s ~50% higher cost yields a ~400% MSR improvement over BASE, justifying the security-performance tradeoff.
- Ratio of Compromised Members: Fig. 14 shows compromise trends. DASH-DF limits UGV compromise to ~0.65, well below other schemes (near 1.0). AI compromise in DASH-DF stays near 0.1, versus SMM-DF (~0.55), DF-only (~0.80), and BASE (~0.95). Human analysts are highly protected in DASH-DF (under 0.02), unlike BASE (~0.9), DF-only (~0.65), and SMM-DF (~0.3). These results confirm deception’s role in countering social engineering and protecting assets.
Our research advances Human-Machine Teaming Systems with the Deception-Augmented Shared Mental Model for Human-Machine Teaming (DASH) framework, which integrates security into coordination, an aspect overlooked in traditional approaches. DASH introduces a practical implementation of Shared Mental Models for multi-UGV coordination, moving beyond theoretical formulations. It also integrates component-specific deception via strategically deployed “bait tasks” and demonstrates how this approach enhances both mission performance and security. By bridging collaborative efficiency with system integrity, DASH establishes a new paradigm for resilient human-machine teaming in adversarial settings, with applications extending beyond surveillance to any secure multi-agent coordination task.
This study yields the following key findings. The DASH framework sustained ≈60% mission success at the highest attack frequency (1.0), compared to the baseline system’s collapse below 10%, demonstrating a sixfold improvement in resilience. Its SMM Quality Index consistently outperformed other schemes, maintaining ≈60% information accuracy even under maximum attack rates, confirming deception’s critical role in preserving information integrity. The SMM Coverage Index remained stable at ≈0.8 across all attack rates, ensuring effective communication despite adversarial pressure. DASH-DF significantly reduced compromise rates, keeping human analyst compromises below 0.1 across all attack frequencies, whereas other schemes approached 0.8, showcasing its superior protection. Although DASH-DF incurred 20-30% higher operational costs than other schemes, this investment translated directly into improved mission success and dramatically reduced compromise rates, demonstrating a favorable security-performance tradeoff.
Future work will focus on several key directions. First, we will develop adaptive adversarial models that evolve alongside our defense mechanisms to enable more rigorous testing. Second, we will deploy the framework on physical testbeds with real UGVs, AI systems, and human operators to validate the simulation results. Third, we will enhance deception mechanisms by optimizing a library of component-specific bait tasks using reinforcement learning. For example, planned experiments include urban patrol trials to study robot-human collaboration in city environments and forest fire reconnaissance missions to assess SMM integrity under conditions of smoke and occlusion. Additionally, we will explore the ethical implications of deception-based security in human-machine teams, particularly focusing on its impact on trust dynamics and transparency requirements.
Algorithm S1 details the procedure for updating trust parameters based on task outcomes. When AI and human analyst results match, the system increments the credible task counter, indicating alignment between team members. This algorithm directly supports the trust calculation formula in the main paper, where trust is computed based on the ratio of consistent results to total tasks, with time-based decay.
Algorithm S2 implements the probabilistic information sharing mechanism that proportionally restricts information flow based on current trust levels. This approach ensures that information is shared with team members in proportion to their trustworthiness, reducing the risk of critical information being compromised. The algorithm uses the trust value calculated by Eq. (S1) to determine whether information should be shared in any given instance.

TABLE SI. SUMMARY OF KEY DASH COMPONENTS

- Shared Mental Model (SMM): A collective knowledge structure derived from Individual Mental Models (IMMs), providing a unified understanding of tasks, roles, and situational awareness to enhance coordination and decision-making.
- Individual Mental Models (IMMs): Role-specific knowledge encapsulating mission goals, team roles, and resource requirements, forming the foundation for the SMM.
- Task Mental Model (TAMM): Part of the IMM, encompassing information about mission goals (e.g., target types for object detection), expected outcomes (e.g., images from UGVs, predictions from the AI agent), and task requirements. Facilitates effective execution and collaboration by clarifying each member’s responsibilities.
- Team Mental Model (TEMM): Part of the IMM, including perceptions of team members’ roles (e.g., sensing, identification), capabilities, and equipment. Promotes effective collaboration by ensuring tasks are allocated based on each member’s strengths.
- Adaptive Deceptive Task Management (ADTM): A mechanism for task creation and allocation that integrates deception techniques to proactively detect and mitigate compromised agents. ADTM dynamically assigns both regular mission tasks and bait tasks to team members to maintain mission progress while ensuring system integrity.
- Regular Tasks: Tasks designed to achieve primary mission objectives, such as data collection and analysis. Managed by ADTM to ensure mission progress.
- Bait Tasks: Tasks introduced to identify potentially compromised agents by testing their integrity through predetermined tasks with known outcomes. Includes targeted verification tasks tailored to UGVs, AI agents, and human analysts.
- Assistance Requests: Requests generated by team members for assistance with high-uncertainty or challenging tasks, facilitating collaboration between AI agents and human analysts.
- Uncertainty Assessment: Assessment of prediction uncertainty ($u$) by the AI agent, triggering validation or assistance requests when thresholds ($u > \theta_u$) are exceeded.
- Detection Threshold: Predefined thresholds (e.g., $\theta_T$ for trust) used to initiate additional validation steps or detect anomalies, such as triggering bait tasks when trust falls below acceptable levels.
- Integrity Testing: Verification of team members’ trustworthiness through tasks designed to detect anomalies or compromises, utilizing bait tasks to assess performance integrity.
- Trust Management: Mechanisms to monitor, adjust, and evaluate trust levels of team members based on their behavior and task performance, including dynamic trust calculations using the number of correctly handled tasks.
- Compromise Indicators: Signs of potential compromise, such as discrepancies in task outputs or elevated uncertainty levels, used within integrity testing and trust management processes.
- Defense Actions: Recovery mechanisms, such as system reinstallation for UGVs, retraining for AI agents, and personnel replacement for human analysts, ensuring system integrity after detecting compromised agents.
- Team Member’s Skill: Awareness of specific capabilities and limitations of team members, informing task allocation to maximize effectiveness.
- Team Member’s Needs: Anticipation of informational and support requirements, ensuring timely resource allocation and assistance.
- Team Member’s State: Real-time operational status (e.g., active, idle, compromised), allowing dynamic task reallocation and intervention when necessary.