AI Simulation by Digital Twins: Systematic Survey, Reference Framework, and Mapping to a Standardized Architecture

Insufficient data volume and quality are particularly pressing challenges in the adoption of modern subsymbolic AI. To alleviate these challenges, AI simulation uses virtual training environments in which AI agents can be safely and efficiently developed with simulated, synthetic data. Digital twins

February 23, 2026

Framework System

An Integrated Framework for Requirements Engineering and Visual Perception Verification for AI-Based Safety-Critical Systems

The integration of AI components, particularly Deep Neural Networks (DNNs), into safety-critical systems such as aerospace and autonomous vehicles presents fundamental challenges for assurance. The opacity of AI systems, combined with the semantic gap between high-level requirements and low-level ne

February 23, 2026

Auditing Algorithmic Bias in Transformer-Based Trading

Abstract and deep analysis are available in the full post.

February 23, 2026

AutoBackdoor: Automating Backdoor Attacks via LLM Agents

Backdoor attacks pose a serious threat to the secure deployment of large language models (LLMs), enabling adversaries to implant hidden behaviors triggered by specific inputs. However, existing methods often rely on manually crafted triggers and static data pipelines, which are rigid, labor-intensiv

February 23, 2026

BreakFun: Jailbreaking LLMs via Schema Exploitation

Abstract and deep analysis are available in the full post.

February 23, 2026

Bridging Research and Standardization: Innovations and Methodology for 6G Standard Contributions

Abstract and deep analysis are available in the full post.

February 23, 2026

Calibrating and Rotating: A Unified Framework for Weight Conditioning in PEFT

Parameter-Efficient Fine-Tuning (PEFT) methods are crucial for adapting large pre-trained models. Among these, LoRA is considered a foundational approach. Building on this, the influential DoRA method enhances performance by decomposing weight updates into magnitude and direction. However, its under

February 23, 2026

Framework

CAMNet: Leveraging Cooperative Awareness Messages for Vehicle Trajectory Prediction

Abstract and deep analysis are available in the full post.

February 23, 2026

EnergyTwin: A Multi-Agent System for Simulating and Coordinating Energy Microgrids

Microgrids are deployed to reduce purchased grid energy, limit exposure to volatile tariffs, and ensure service continuity during disturbances. This requires coordinating heterogeneous distributed energy resources across multiple time scales and under variable conditions. Among existing tools, typic

February 23, 2026

System

An Ontology-Based Framework for Automated Construction of ESG Metric Knowledge Graphs

Environmental, Social, and Governance (ESG) metric knowledge is inherently structured, connecting industries, reporting frameworks, metric categories, metrics, and calculation models through compositional dependencies, yet in practice this structure remains embedded implicitly in regulatory document

February 23, 2026

EvoXplain: When Machine Learning Models Agree on Predictions but Disagree on Why -- Measuring Mechanistic Multiplicity Across Training Runs

Machine learning models are primarily judged by predictive performance, especially in applied settings. Once a model reaches high accuracy, its explanation is often assumed to be correct and trustworthy. This assumption raises an overlooked question: when two models achieve high accuracy, do they re

February 23, 2026

Model Learning

From FAIR to CURE: Guidelines for Computational Models of Biological Systems

Abstract and deep analysis are available in the full post.

February 23, 2026

Model System

Language Models Can Understand Spectra: A Multimodal Model for Molecular Structure Elucidation

Abstract and deep analysis are available in the full post.

February 23, 2026

Model

An Open-Source Platform for Copyright Verification of LLM Training Data

The widespread use of Large Language Models (LLMs) raises critical concerns regarding the unauthorized inclusion of copyrighted content in training data. Existing detection frameworks, such as DE-COP, are computationally intensive, and largely inaccessible to independent creators. As legal scrutiny

February 23, 2026

LLMs for Automated Unit Test Generation and Assessment in Java: The AgoneTest Framework

Unit testing is an essential but resource-intensive step in software development, ensuring individual code units function correctly. This paper introduces AgoneTest, an automated evaluation framework for Large Language Model-generated (LLM) unit tests in Java. AgoneTest does not aim to propose a nov

February 23, 2026

Framework

MaskClip: Detachable Clip-on Piezoelectric Sensing of Mask Surface Vibrations for Real-time Noise-Robust Speech Input

Masks are essential in medical settings and during infectious outbreaks but significantly impair speech communication, especially in environments with background noise. Existing solutions often require substantial computational resources or compromise hygiene and comfort. We propose a novel sensing

February 23, 2026

MTQ-Eval: Multilingual Text Quality Evaluation for Language Models

Abstract and deep analysis are available in the full post.

February 23, 2026

Model

Multi-Agent Code Verification via Information Theory

LLMs generate buggy code: 29.6% of SWE-bench solved patches fail, 62% of BaxBench solutions have vulnerabilities, and existing tools only catch 65% of bugs with 35% false positives. We built CodeX-Verify, a multi-agent system that uses four specialized agents to detect different types of bugs. We pr

February 23, 2026

NOTAM-Evolve: A Knowledge-Guided Self-Evolving Optimization Framework with LLMs for NOTAM Interpretation

Accurate interpretation of Notices to Airmen (NOTAMs) is critical for aviation safety, yet their condensed and cryptic language poses significant challenges to both manual and automated processing. Existing automated systems are typically limited to shallow parsing, failing to extract the actionable

February 23, 2026

Framework

OS-R1: Agentic Operating System Kernel Tuning with Reinforcement Learning

Abstract and deep analysis are available in the full post.

February 23, 2026

Learning System

PRiSM: A Dynamic Multimodal Benchmark for Scientific Reasoning

Evaluating vision-language models (VLMs) in scientific domains like mathematics and physics poses unique challenges that go far beyond predicting final answers. These domains demand conceptual understanding, symbolic reasoning, and adherence to formal laws, requirements that most existing benchmarks

February 23, 2026

Quantifying Bounded Rationality: Formal Verification of Simon's Satisficing Through Flexible Stochastic Dominance

This paper introduces Flexible First-Order Stochastic Dominance (FFSD), a mathematically rigorous framework that formalizes Herbert Simon's concept of bounded rationality using the Lean 4 theorem prover. We develop machine-verified proofs demonstrating that FFSD bridges classical expected utility th

February 23, 2026

RAG-Driven Data Quality Governance for Enterprise ERP Systems

Enterprise ERP systems managing hundreds of thousands of employee records face critical data quality challenges when human resources departments perform decentralized manual entry across multiple languages. We present an end-to-end pipeline combining automated data cleaning with LLM-driven SQL query

February 23, 2026

System Data

Robust and High-Fidelity 3D Gaussian Splatting: Fusing Pose Priors and Geometry Constraints for Texture-Deficient Outdoor Scenes

Abstract and deep analysis are available in the full post.

February 23, 2026

Snakes in the Plane: Controllable Gliders in a Nanomagnetic Metamaterial

Abstract and deep analysis are available in the full post.

February 23, 2026

Speeding Up MACE: Low-Precision Tricks for Equivarient Force Fields

Machine-learning force fields can deliver accurate molecular dynamics (MD) at high computational cost. For SO(3)-equivariant models such as MACE, there is little systematic evidence on whether reduced-precision arithmetic and GPU-optimized kernels can cut this cost without harming physical fidelity.

February 23, 2026

The complexity of reachability problems in strongly connected finite automata

Several reachability problems in finite automata, such as completeness of NFAs and synchronisation of total DFAs, correspond to fundamental properties of sets of nonnegative matrices. In particular, the two mentioned properties correspond to matrix mortality and ergodicity, which ask whether there e

February 23, 2026

Understanding and Mitigating Errors of LLM-Generated RTL Code

Abstract and deep analysis are available in the full post.

February 23, 2026

When Models Outthink Their Safety: Unveiling and Mitigating Self-Jailbreak in Large Reasoning Models

Large Reasoning Models (LRMs) achieve strong performance on complex multi-step reasoning, yet they still exhibit severe safety failures such as harmful content generation. Existing methods often apply coarse-grained constraints over the entire reasoning trajectories, which can undermine reasoning ca

February 23, 2026

Model

WorldScore: A Unified Evaluation Benchmark for World Generation

Abstract and deep analysis are available in the full post.

February 23, 2026

A Study of Document Structure Rearrangement and Role-Based Prompting for Improving LLM Performance in the Legal Domain

Large Language Models (LLMs), trained on extensive datasets from the web, exhibit remarkable general reasoning skills. Despite this, they often struggle in specialized areas like law, mainly because they lack domain-specific pretraining. The legal field presents unique challenges, as legal documents

February 23, 2026

SODA ADVANCE for Password Strength Evaluation and Generation: The Role of Large Language Models

Although passwords remain the primary defense against unauthorized access, users often tend to use passwords that are easy to remember. This behavior significantly increases security risks, also due to the fact that traditional password strength evaluation methods are often inadequate. In this discu

February 23, 2026

A Large Language Model Alignment Benchmark Leveraging Generation-Evaluation Consistency

Alignment with human preferences is an important evaluation aspect of LLMs, requiring them to be helpful, honest, safe, and to precisely follow human instructions. Evaluating large language models' (LLMs) alignment typically involves directly assessing their open-ended responses, requiring human ann

February 23, 2026

Singing Voice Separation with Diffusion Models: An Efficient Generative Approach Based on Lattice Diffusion

Extracting individual elements from music mixtures is a valuable tool for music production and practice. While neural networks optimized to mask or transform mixture spectrograms into the individual source(s) have been the leading approach, the source overlap and correlation in music signals poses a

February 23, 2026

Irresponsible AI: big tech's influence on AI research and associated impacts

The accelerated development, deployment and adoption of artificial intelligence systems has been fuelled by the increasing involvement of big tech. This has been accompanied by increasing ethical concerns and intensified societal and environmental impacts. In this article, we review and discuss how

February 23, 2026

Placenta Accreta Spectrum Detection Using an MRI-based Hybrid CNN-Transformer Model

Placenta Accreta Spectrum (PAS) is a serious obstetric condition that can be challenging to diagnose with Magnetic Resonance Imaging (MRI) due to variability in radiologists' interpretations. To overcome this challenge, a hybrid 3D deep learning model for automated PAS detection from volumetric MRI

February 23, 2026

Model Detection

Rectifying LLM Thought from Lens of Optimization

Recent advancements in large language models (LLMs) have been driven by their emergent reasoning capabilities, particularly through long chain-of-thought (CoT) prompting, which enables thorough exploration and deliberation. Despite these advances, long-CoT LLMs often exhibit suboptimal reasoning beh

February 23, 2026

Story2MIDI: Emotionally Aligned Music Generation from Text

In this paper, we introduce Story2MIDI, a sequence-to-sequence Transformer-based model for generating emotion-aligned music from a given piece of text. To develop this model, we construct the Story2MIDI dataset by merging existing datasets for sentiment analysis from text and emotion classification

February 23, 2026

Vision Foundry: A System for Training Foundational Vision AI Models

Self-supervised learning (SSL) leverages vast unannotated medical datasets, yet steep technical barriers limit adoption by clinical researchers. We introduce Vision Foundry, a code-free, HIPAA-compliant platform that democratizes pre-training, adaptation, and deployment of foundational vision models

February 23, 2026

Model System

'The Dentist is an involved parent, the bartender is not': Revealing Implicit Biases in QA with Implicit BBQ

Existing benchmarks evaluating biases in large language models (LLMs) primarily rely on explicit cues, declaring protected attributes like religion, race, gender by name. However, real-world interactions often contain implicit biases, inferred subtly through names, cultural cues, or traits. This cri

February 23, 2026

A Comprehensive Framework for Automated Quality Control in the Automotive Industry

This paper presents a cutting-edge robotic inspection solution designed to automate quality control in automotive manufacturing. The system integrates a pair of collaborative robots, each equipped with a high-resolution camera-based vision system to accurately detect and localize surface and thread

February 23, 2026

Framework

A Linear Expectation Constraint for Selective Prediction and Routing with False-Discovery Control

Foundation models often generate unreliable answers, while heuristic uncertainty estimators fail to fully distinguish correct from incorrect outputs, causing users to accept erroneous answers without statistical guarantees. We address this through the lens of false discovery rate (FDR) control, ensu

February 23, 2026

A Survey of Bugs in AI-Generated Code

Developers are widely using AI code-generation models, aiming to increase productivity and efficiency. However, there are also quality concerns regarding the AI-generated code. The generated code is produced by models trained on publicly available code, which are known to contain bugs and quality is

February 23, 2026

Advancing Multimodal Teacher Sentiment Analysis: The Large-Scale T-MED Dataset & The Effective AAM-TSA Model

Teachers' emotional states are critical in educational scenarios, profoundly impacting teaching efficacy, student engagement, and learning achievements. However, existing studies often fail to accurately capture teachers' emotions due to the performative nature and overlook the critical impact of in

February 23, 2026

Analysis Model Data

AI/ML in 3GPP 5G Advanced -- Services and Architecture

The 3rd Generation Partnership Project (3GPP), the standards body for mobile networks, is in the final phase of Release 19 standardization and is beginning Release 20. Artificial Intelligence/Machine Learning (AI/ML) has brought about a paradigm shift in technology and it is being adopted across in

February 23, 2026

AncientBench: Towards Comprehensive Evaluation on Excavated and Transmitted Chinese Corpora

Comprehension of ancient texts plays an important role in archaeology and understanding of Chinese history and civilization. The rapid development of large language models needs benchmarks that can evaluate their comprehension of ancient characters. Existing Chinese benchmarks are mostly targeted at

February 23, 2026

Arc Spline Approximation of Envelopes of Evolving Planar Domains

Computing the envelope of deforming planar domains is a significant and challenging problem with a wide range of potential applications. We approximate the envelope using circular arc splines, curves that balance geometric flexibility and computational simplicity. Our approach combines two concepts

February 23, 2026
