Pelican-VL 1.0: A Foundation Brain Model for Embodied Intelligence

Reading time: 2 minutes

📝 Original Info

  • Title: Pelican-VL 1.0: A Foundation Brain Model for Embodied Intelligence
  • ArXiv ID: 2511.00108
  • Date: 2025-10-30
  • Authors: Not provided (the paper does not list author information.)

📝 Abstract

This report presents Pelican-VL 1.0, a new family of open-source embodied brain models with parameter scales ranging from 7 billion to 72 billion. Our stated mission is to embed powerful intelligence into diverse embodiments. Pelican-VL 1.0 is currently the largest-scale open-source embodied multimodal brain model. Its core advantage lies in the deep integration of data power with an intelligent, adaptive learning mechanism. Specifically, a metaloop distilled a high-quality dataset from a raw corpus of over 4 billion tokens. Pelican-VL 1.0 is trained on a large-scale cluster of more than 1,000 A800 GPUs, consuming over 50,000 A800 GPU-hours per checkpoint. This yields a 20.3% performance uplift over its base model and outperforms 100B-scale open-source counterparts by 10.6%, placing it on par with leading proprietary systems on well-known embodied benchmarks. To train Pelican-VL 1.0, we establish a novel framework, DPPO (Deliberate Practice Policy Optimization), inspired by human metacognition. We operationalize this as a metaloop that teaches the model to practice deliberately: an RL-Refine-Diagnose-SFT loop.
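The RL-Refine-Diagnose-SFT cycle can be pictured as a loop that alternates exploration with targeted remediation. The toy sketch below is purely illustrative: all function names and mechanics are hypothetical stand-ins (the paper's actual algorithm is not reproduced here), and the "model" is reduced to a dict of per-skill scores in [0, 1].

```python
# Toy sketch of a DPPO-style metaloop: RL -> Diagnose -> Refine -> SFT.
# Every name here is an illustrative assumption, not the paper's API.

def rl_explore(model):
    # RL phase: broad practice gives a small uniform improvement.
    return {skill: min(1.0, score + 0.05) for skill, score in model.items()}

def diagnose(model, threshold=0.6):
    # Diagnose phase: find skills still below a competence threshold.
    return [skill for skill, score in model.items() if score < threshold]

def refine(weaknesses):
    # Refine phase: distill targeted training signal for weak skills
    # (here just a fixed boost per weak skill).
    return {skill: 0.15 for skill in weaknesses}

def sft(model, refined_data):
    # SFT phase: supervised fine-tuning concentrated on the weak skills.
    return {skill: min(1.0, score + refined_data.get(skill, 0.0))
            for skill, score in model.items()}

def dppo_metaloop(model, cycles=3):
    for _ in range(cycles):
        model = rl_explore(model)
        weak = diagnose(model)
        model = sft(model, refine(weak))
    return model

model = dppo_metaloop({"navigation": 0.4, "grasping": 0.5, "planning": 0.8})
```

The design point the loop illustrates is that diagnosis gates refinement: extra supervised training is spent only where the diagnose step finds weakness, which is the "deliberate practice" analogy the abstract draws.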

💡 Deep Analysis

Figure 1



Reference

This content is AI-processed based on open access ArXiv data.
