Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
📝 Original Info
- Title: Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
- ArXiv ID: 2511.07384
- Date: 2025-11-10
- Authors: Not listed in the provided metadata. (See the original paper for author names, affiliations, and contact information.)
📝 Abstract
Recent advances in depth-recurrent language models show that recurrence can decouple train-time compute and parameter count from test-time compute. In this work, we study how to convert existing pretrained non-recurrent language models into depth-recurrent models. We find that using a curriculum of recurrences to increase the effective depth of the model over the course of training preserves performance while reducing total computational cost. In our experiments on mathematics, we observe that converting pretrained models to recurrent ones results in better performance at a given compute budget than simply post-training the original non-recurrent language model.
💡 Deep Analysis
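The abstract describes the core idea: a shared block of layers is applied repeatedly, so effective depth (and test-time compute) can be varied without changing the parameter count, and a curriculum ramps the recurrence count up during training. The paper's actual conversion recipe and architecture are not included in this summary, so the sketch below is a minimal, illustrative PyTorch toy only; the names `RecurrentBlock`, `DepthRecurrentLM`, and `recurrence_for_step`, the layer choices, and the linear curriculum schedule are all assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class RecurrentBlock(nn.Module):
    """Hypothetical weight-shared transformer block (causal masking omitted for brevity)."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        return x + self.mlp(self.norm2(x))


class DepthRecurrentLM(nn.Module):
    """Applies the same block `num_recurrences` times: test-time depth and compute
    can be scaled without adding parameters."""

    def __init__(self, vocab_size: int, d_model: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.block = RecurrentBlock(d_model)  # one set of weights, reused every iteration
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens: torch.Tensor, num_recurrences: int) -> torch.Tensor:
        x = self.embed(tokens)
        for _ in range(num_recurrences):  # effective depth = num_recurrences
            x = self.block(x)
        return self.head(x)


def recurrence_for_step(step: int, total_steps: int, max_recurrences: int) -> int:
    """Toy linear curriculum (assumed, not the paper's schedule): start shallow and
    increase the recurrence count over training, so early steps are cheaper."""
    frac = step / max(total_steps - 1, 1)
    return 1 + round(frac * (max_recurrences - 1))


if __name__ == "__main__":
    model = DepthRecurrentLM(vocab_size=1000)
    tokens = torch.randint(0, 1000, (2, 16))  # (batch, sequence)
    for step in (0, 500, 999):
        r = recurrence_for_step(step, total_steps=1000, max_recurrences=8)
        logits = model(tokens, num_recurrences=r)
        print(f"step={step:4d} recurrences={r} logits shape={tuple(logits.shape)}")
```

In this toy setup, retrofitting a pretrained non-recurrent model would amount to initializing the shared block from existing pretrained layers and then continuing training under the recurrence curriculum; how that initialization and schedule are actually done is exactly what the paper studies.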
Reference
- arXiv:2511.07384 — https://arxiv.org/abs/2511.07384
- This content is AI-processed based on open-access ArXiv data.