General-Science
DeliveryBench: Can Agents Earn Profit in Real World?
Demystifying LLM-as-a-Judge: Analytically Tractable Model for Inference-Time Scaling
HARMON-E: Hierarchical Agentic Reasoning for Multimodal Oncology Notes to Extract Structured Data
Identifying Features Associated with Bias Against 93 Stigmatized Groups in Language Models and Guardrail Model Safety Mitigation
Is Visual Realism Enough? Evaluating Gait Biometric Fidelity in Generative AI Human Animation
Machine Unlearning in the Era of Quantum Machine Learning: An Empirical Study
Mitigating LLM Hallucination via Behaviorally Calibrated Reinforcement Learning
Modeling Non-Ergodic Path Effects Using Conditional Generative Model for Fourier Amplitude Spectra
On the Existence and Behaviour of Secondary Attention Sinks
PhysMaster: Building an Autonomous AI Physicist for Theoretical and Computational Physics Research
Reflection-Driven Control for Trustworthy Code Agents
Scalable Stewardship of an LLM-Assisted Clinical Benchmark with Physician Oversight
Signal-SGN++: Topology-Enhanced Time-Frequency Spiking Graph Network for Skeleton-Based Action Recognition
Super-Resolution Enhancement of Medical Images Based on Diffusion Model: An Optimization Scheme for Low-Resolution Gastric Images
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion
$M^3-Verse$: A 'Spot the Difference' Challenge for Large Multimodal Models
A Multi-agent Text2SQL Framework using Small Language Models and Execution Feedback
Adaptive Accountability in Networked MAS: Tracing and Mitigating Emergent Norms at Scale
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction
ChronoDreamer: Action-Conditioned World Model as an Online Simulator for Robotic Planning
CosineGate: Semantic Dynamic Routing via Cosine Incompatibility in Residual Networks
Geometric-Photometric Event-based 3D Gaussian Ray Tracing