cs.CV
ToolTok: Tool Tokenization for Efficient and Generalizable GUI Agents
Enhancing Post-Training Quantization via Future Activation Awareness
Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars
Shuffle Mamba: State Space Models with Random Shuffle for Multi-Modal Image Fusion
Helios 2.0: A Robust, Ultra-Low Power Gesture Recognition System Optimised for Event-Sensor based Wearables
HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction
Transferring Visual Explainability of Self-Explaining Models to Prediction-Only Models without Additional Training
CoT-RVS: Zero-Shot Chain-of-Thought Reasoning Segmentation for Videos
Entropy-Lens: Uncovering Decision Strategies in LLMs
Future frame prediction in chest and liver cine MRI using the PCA respiratory motion model: comparing transformers and dynamically trained recurrent neural networks
MineInsight: A Multi-sensor Dataset for Humanitarian Demining Robotics in Off-Road Environments
Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing
Improved Single Camera BEV Perception Using Multi-Camera Training
Uncertainty-Aware Image Classification In Biomedical Imaging Using Spectral-normalized Neural Gaussian Processes
SelvaMask: Segmenting Trees in Tropical Forests and Beyond
Implicit neural representation of textures
MAIN-VLA: Modeling Abstraction of Intention and eNvironment for Vision-Language-Action Models
Catalyst: Out-of-Distribution Detection via Elastic Scaling
Superman: Unifying Skeleton and Vision for Human Motion Perception and Generation
SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation
Enhancing Diffusion-Based Quantitatively Controllable Image Generation via Matrix-Form EDM and Adaptive Vicinal Training
Toxicity Assessment in Preclinical Histopathology via Class-Aware Mahalanobis Distance for Known and Novel Anomalies