Choreographer: A Full-System Framework for Fine-Grained Tasks in Cache Hierarchies
📝 Original Info
- Title: Choreographer: A Full-System Framework for Fine-Grained Tasks in Cache Hierarchies
- ArXiv ID: 2510.26944
- Date: 2025-10-30
- Authors: Not provided (the paper does not list author information).
📝 Abstract
In this paper, we introduce Choreographer, a simulation framework that enables a holistic system-level evaluation of fine-grained accelerators designed for latency-sensitive tasks. Unlike existing frameworks, Choreographer captures all hardware and software overheads in core-accelerator and cache-accelerator interactions, integrating a detailed gem5-based hardware stack featuring an AMBA Coherent Hub Interface (CHI) mesh network and a complete Linux-based software stack. To facilitate rapid prototyping, it offers a C++ application programming interface and modular configuration options. Our detailed cache model provides accurate insights into performance variations caused by cache configurations, which are not captured by other frameworks. The framework is demonstrated through two case studies: a data-aware prefetcher for graph analytics workloads, and a quicksort accelerator. Our evaluation shows that the prefetcher achieves speedups between 1.08x and 1.88x by reducing memory access latency, while the quicksort accelerator delivers more than 2x speedup with minimal address translation overhead. These findings underscore the ability of Choreographer to model complex hardware-software interactions and optimize performance in small-task offloading scenarios.
Reference
This content is AI-processed based on open access ArXiv data.