ScaleDiff: Higher-Resolution Image Synthesis via Efficient and Model-Agnostic Diffusion

Reading time: 1 minute

📝 Original Info

  • Title: ScaleDiff: Higher-Resolution Image Synthesis via Efficient and Model-Agnostic Diffusion
  • ArXiv ID: 2510.25818
  • Date: 2025-10-29
  • Authors: Not provided in the source metadata.

📝 Abstract

Text-to-image diffusion models often exhibit degraded performance when generating images beyond their training resolution. Recent training-free methods can mitigate this limitation, but they often require substantial computation or are incompatible with recent Diffusion Transformer models. In this paper, we propose ScaleDiff, a model-agnostic and highly efficient framework for extending the resolution of pretrained diffusion models without any additional training. A core component of our framework is Neighborhood Patch Attention (NPA), an efficient mechanism that reduces computational redundancy in the self-attention layer with non-overlapping patches. We integrate NPA into an SDEdit pipeline and introduce Latent Frequency Mixing (LFM) to better generate fine details. Furthermore, we apply Structure Guidance to enhance global structure during the denoising process. Experimental results demonstrate that ScaleDiff achieves state-of-the-art performance among training-free methods in terms of both image quality and inference speed on both U-Net and Diffusion Transformer architectures.
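The abstract only names the mechanisms, so the following is a minimal PyTorch sketch of what non-overlapping patch self-attention could look like. The function name, tensor layout, and `patch_size` are illustrative assumptions, not the paper's actual NPA implementation; the key idea is that attention is restricted to tokens within each patch, replacing the quadratic cost of global self-attention with a cost linear in the number of patches.

```python
import torch
import torch.nn.functional as F

def neighborhood_patch_attention(q, k, v, patch_size=8):
    """Hypothetical sketch of patch-local self-attention (not the paper's code).

    Tokens are grouped into non-overlapping spatial patches and attention is
    computed independently within each patch, so cost scales with
    N * patch_size**2 instead of N**2 for N = H * W tokens.

    q, k, v: (batch, heads, H, W, dim) latent feature maps (assumed layout).
    """
    b, h, H, W, d = q.shape
    p = patch_size
    assert H % p == 0 and W % p == 0, "pad inputs so H and W divide evenly"

    def to_patches(x):
        # (b, h, H, W, d) -> (b * num_patches, h, p*p, d)
        x = x.view(b, h, H // p, p, W // p, p, d)
        x = x.permute(0, 2, 4, 1, 3, 5, 6)  # move the patch grid to the front
        return x.reshape(b * (H // p) * (W // p), h, p * p, d)

    qp, kp, vp = map(to_patches, (q, k, v))
    out = F.scaled_dot_product_attention(qp, kp, vp)  # attention per patch

    # Undo the patch grouping back to (b, h, H, W, d).
    out = out.view(b, H // p, W // p, h, p, p, d)
    out = out.permute(0, 3, 1, 4, 2, 5, 6).reshape(b, h, H, W, d)
    return out
```

Latent Frequency Mixing is likewise only named in the abstract. A plausible FFT-based reading is that it keeps the low-frequency band (global structure) of one latent and the high-frequency band (fine detail) of another; the `cutoff` value and masking scheme below are assumptions for illustration.

```python
def latent_frequency_mixing(structure_latent, detail_latent, cutoff=0.25):
    """Hypothetical frequency-domain latent mixing (not the paper's code).

    Combines the low frequencies of `structure_latent` with the high
    frequencies of `detail_latent` via a hard radial low-pass mask.
    Both inputs: (..., H, W) tensors of equal shape.
    """
    H, W = structure_latent.shape[-2:]
    fy = torch.fft.fftfreq(H, device=structure_latent.device)
    fx = torch.fft.fftfreq(W, device=structure_latent.device)
    radius = torch.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    low_pass = (radius <= cutoff * 0.5).to(structure_latent.dtype)

    fs = torch.fft.fft2(structure_latent)
    fd = torch.fft.fft2(detail_latent)
    mixed = fs * low_pass + fd * (1.0 - low_pass)
    return torch.fft.ifft2(mixed).real
```

In an SDEdit-style pipeline, the low-frequency band would presumably come from the upsampled base-resolution latent (preserving global layout) while the high-frequency band comes from the high-resolution denoising trajectory, though the abstract does not specify these details.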


Reference

This content is AI-processed based on open access ArXiv data.
