Cs-Cv

WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models

Machine Learning 28 JAN, 2026

By Runjie Zhou

ToolTok: Tool Tokenization for Efficient and Generalizable GUI Agents

Artificial Intelligence 30 JAN, 2026

By Xiaoce Wang

Enhancing Post-Training Quantization via Future Activation Awareness

Machine Learning 28 JAN, 2026

By Zheqi Lv

Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars

Artificial Intelligence 2 JAN, 2026

By Youliang Zhang

Shuffle Mamba: State Space Models with Random Shuffle for Multi-Modal Image Fusion

Computer Vision 27 JAN, 2026

By Ke Cao

Helios 2.0: A Robust, Ultra-Low Power Gesture Recognition System Optimised for Event-Sensor based Wearables

Machine Learning 2 JAN, 2026

By Prarthana Bhattacharyya

HI-SLAM2: Geometry-Aware Gaussian SLAM for Fast Monocular Scene Reconstruction

Robotics 2 JAN, 2026

By Wei Zhang

Transferring Visual Explainability of Self-Explaining Models to Prediction-Only Models without Additional Training

Artificial Intelligence 2 JAN, 2026

By Yuya Yoshikawa

CoT-RVS: Zero-Shot Chain-of-Thought Reasoning Segmentation for Videos

Computer Vision 2 JAN, 2026

By Shiu-hong Kao

Entropy-Lens: Uncovering Decision Strategies in LLMs

Artificial Intelligence 2 JAN, 2026

By Riccardo Ali

Future frame prediction in chest and liver cine MRI using the PCA respiratory motion model: comparing transformers and dynamically trained recurrent neural networks

Neural and Evolutionary Computing 2 JAN, 2026

By Michel Pohl

MineInsight: A Multi-sensor Dataset for Humanitarian Demining Robotics in Off-Road Environments

Robotics 2 JAN, 2026

By Mario Malizia

Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing

Machine Learning 14 JAN, 2026

By Siyi Chen

Improved Single Camera BEV Perception Using Multi-Camera Training

Computer Vision 4 JAN, 2024

By Daniel Busch

Uncertainty-Aware Image Classification In Biomedical Imaging Using Spectral-normalized Neural Gaussian Processes

Computer Vision 2 JAN, 2026

By Uma Meleti

SelvaMask: Segmenting Trees in Tropical Forests and Beyond

Computer Vision 2 JAN, 2026

By Simon-Olivier Duguay

Implicit neural representation of textures

Artificial Intelligence 2 JAN, 2026

By Albert Kwok

MAIN-VLA: Modeling Abstraction of Intention and eNvironment for Vision-Language-Action Models

Computer Vision 2 JAN, 2026

By Zheyuan Zhou

Catalyst: Out-of-Distribution Detection via Elastic Scaling

Computer Vision 2 JAN, 2026

By Abid Hassan

Superman: Unifying Skeleton and Vision for Human Motion Perception and Generation

Computer Vision 2 JAN, 2026

By Xinshun Wang

SoMA: A Real-to-Sim Neural Simulator for Robotic Soft-body Manipulation

Artificial Intelligence 2 JAN, 2026

By Mu Huang

Enhancing Diffusion-Based Quantitatively Controllable Image Generation via Matrix-Form EDM and Adaptive Vicinal Training

Machine Learning 2 JAN, 2026

By Xin Ding

Toxicity Assessment in Preclinical Histopathology via Class-Aware Mahalanobis Distance for Known and Novel Anomalies

Artificial Intelligence 2 JAN, 2026

By Olga Graf

An Empirical Study of World Model Quantization

Machine Learning 2 JAN, 2026

By Zhongqian Fu