Learning Affordances at Inference-Time for Vision-Language-Action Models

Reading time: 2 minutes
...

📝 Original Info

  • Title: Learning Affordances at Inference-Time for Vision-Language-Action Models
  • ArXiv ID: 2510.19752
  • Date: 2025-10-22
  • Authors: Not provided in the paper metadata. (Please consult the original paper for author names, affiliations, and contributions.)

📝 Abstract

Solving complex real-world control tasks often takes multiple tries: if we fail at first, we reflect on what went wrong, and change our strategy accordingly to avoid making the same mistake. In robotics, Vision-Language-Action models (VLAs) offer a promising path towards solving complex control tasks, but lack the ability to contextually and dynamically readjust behavior when they fail to accomplish a task. In this work, we introduce Learning from Inference-Time Execution (LITEN), which connects a VLA low-level policy to a high-level VLM that conditions on past experiences by including them in-context, allowing it to learn the affordances and capabilities of the low-level VLA. Our approach iterates between a reasoning phase that generates and executes plans for the low-level VLA, and an assessment phase that reflects on the resulting execution and draws useful conclusions to be included in future reasoning contexts. Unlike similar approaches to self-refinement in non-robotics domains, LITEN must reflect on unstructured real-world robot trajectories (e.g., raw videos), which requires structured guiderails during assessment. Our experimental results demonstrate LITEN is able to effectively learn from past experience to generate plans that use high-affordance instructions to accomplish long-horizon tasks.
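
The abstract describes an iterative loop between a reasoning phase (the high-level VLM plans and the low-level VLA executes) and an assessment phase (the VLM reflects on the execution and stores a conclusion for future in-context reasoning). The sketch below illustrates that loop only at a structural level; the class and method names (`HighLevelVLM.plan`, `HighLevelVLM.assess`, `LowLevelVLA.execute`) are hypothetical placeholders, and the actual LITEN prompts, assessment guiderails, and trajectory representations are not specified in this summary.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical interfaces -- the abstract does not specify LITEN's actual APIs,
# prompts, or trajectory format; this only illustrates the reasoning/assessment
# loop it describes.

@dataclass
class Experience:
    plan: List[str]       # sequence of low-level instructions
    trajectory: object    # raw execution result (e.g., video of the rollout)
    conclusion: str       # reflection distilled during assessment


class LowLevelVLA:
    """Stand-in for the VLA policy that executes language instructions."""
    def execute(self, instruction: str) -> str:
        return f"trajectory for: {instruction}"  # placeholder trajectory


class HighLevelVLM:
    """Stand-in for the VLM that plans and reflects, conditioned in-context."""
    def plan(self, task: str, memory: List[Experience]) -> List[str]:
        # Per the abstract, past conclusions are included in-context so the VLM
        # learns which instructions the low-level VLA can actually afford.
        return [f"step toward: {task}"]

    def assess(self, plan: List[str], trajectory: object) -> str:
        # Structured guiderails would constrain this reflection over the
        # unstructured real-world trajectory (e.g., raw video).
        return "conclusion about which instructions succeeded or failed"


def liten_loop(task: str, vlm: HighLevelVLM, vla: LowLevelVLA,
               num_iterations: int = 3) -> List[Experience]:
    memory: List[Experience] = []
    for _ in range(num_iterations):
        # Reasoning phase: generate and execute a plan with the low-level VLA.
        plan = vlm.plan(task, memory)
        trajectory = [vla.execute(step) for step in plan]
        # Assessment phase: reflect on the execution and store the conclusion
        # so it conditions future reasoning contexts.
        conclusion = vlm.assess(plan, trajectory)
        memory.append(Experience(plan, trajectory, conclusion))
    return memory


if __name__ == "__main__":
    history = liten_loop("stack the blocks", HighLevelVLM(), LowLevelVLA())
    for i, exp in enumerate(history):
        print(f"iteration {i}: {exp.conclusion}")
```

In this reading, the accumulated `Experience` records stand in for the affordance knowledge the paper says the high-level VLM acquires at inference time; how those records are serialized into the VLM's context is a detail of the original work not covered here.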

💡 Deep Analysis

📄 Full Content

Reference

This content is AI-processed based on open access ArXiv data.
