LWRP: Low Power Consumption Weighting Replacement Policy using Buffer Memory

As the performance gap between processors and memory has widened, overall system performance has suffered. Efficient virtual memory can mitigate this problem, and its efficiency depends on the cache replacement policy. In this paper, we propose an algorithm based not only on the time since last access and an access-frequency index, but also on power consumption. We show that the Low Power Consumption Weighting Replacement Policy (LWRP) delivers better performance with lower power consumption.


💡 Research Summary

The paper addresses the growing performance gap between processors and memory by proposing a novel cache replacement policy that explicitly incorporates power consumption into its decision‑making process. The authors observe that traditional policies such as Least Recently Used (LRU) and Least Frequently Used (LFU) focus solely on temporal locality (recency) or access frequency, respectively, and therefore ignore the energy cost associated with moving data between memory hierarchy levels. In environments where power budget is a primary constraint—mobile devices, edge servers, and energy‑aware data centers—this omission can lead to sub‑optimal system behavior.

To remedy this, the authors introduce the Low Power Consumption Weighting Replacement Policy (LWRP). LWRP evaluates each cache line using three normalized metrics: (1) Recency – the elapsed time since the last access, (2) Frequency – the total number of accesses during the program’s lifetime, and (3) Power – an estimate of the energy required to keep the line resident in DRAM (derived from voltage, current, and active time). A set of configurable weights (α, β, γ) with α + β + γ = 1 is applied to combine these metrics into a single score:

 W(i) = α·Recency(i) + β·Frequency(i) + γ·Power(i)

The line with the lowest score is selected for eviction. By adjusting the weight vector, system designers can prioritize energy savings, latency, or a balanced trade‑off.
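The weighted-score selection described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the normalization of each metric to [0, 1] and the orientation (lower combined score marks a line less worth keeping) are assumptions, and the class and function names are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Line:
    tag: int
    last_access: int   # logical time of the most recent access
    accesses: int      # total access count so far
    power: float       # estimated energy cost of keeping the line resident

def lwrp_victim(lines, now, alpha=0.4, beta=0.4, gamma=0.2):
    """Return the line with the lowest weighted score W(i)."""
    assert abs(alpha + beta + gamma - 1.0) < 1e-9  # weights must sum to 1

    # Normalize each metric against the current candidate set
    # (assumed normalization; the paper does not specify one).
    max_age = max(now - l.last_access for l in lines) or 1
    max_freq = max(l.accesses for l in lines) or 1
    max_power = max(l.power for l in lines) or 1.0

    def score(l):
        recency = 1.0 - (now - l.last_access) / max_age  # recently used -> high
        frequency = l.accesses / max_freq                # frequently used -> high
        power = 1.0 - l.power / max_power                # energy-hungry -> low
        return alpha * recency + beta * frequency + gamma * power

    # Lowest W(i) is the eviction victim, as in the paper.
    return min(lines, key=score)
```

Shifting weight toward `gamma` biases eviction toward energy-hungry lines; shifting toward `alpha` or `beta` recovers LRU-like or LFU-like behavior, which matches the configurability claim above.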

A key architectural contribution is the use of a small, high‑speed buffer memory (typically SRAM) as an intermediate staging area for eviction candidates. When a miss occurs and the cache is full, the candidate line is first copied into the buffer rather than being written directly back to main memory. Inside the buffer, the policy recomputes W(i) for all buffered lines, selects the optimal victim, and finally swaps it with the incoming line. This two‑step process reduces the number of costly DRAM accesses, shortens bus traffic, and limits peak current draw, all of which directly lower the system’s dynamic power consumption.
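The two-step, buffer-mediated eviction might look like the following sketch. The buffer capacity, the writeback counter, and all names are illustrative assumptions; scores are shown as precomputed per line, whereas the paper recomputes W(i) over the buffered lines.

```python
from dataclasses import dataclass

@dataclass
class Line:
    tag: int
    score: float  # weighted score W(i); recomputed in the buffer in the paper

class BufferedCache:
    def __init__(self, capacity: int, buffer_capacity: int):
        self.capacity = capacity
        self.buffer_capacity = buffer_capacity
        self.lines: dict[int, Line] = {}
        self.buffer: list[Line] = []
        self.dram_writebacks = 0  # proxy for costly DRAM accesses

    def insert(self, line: Line) -> None:
        if len(self.lines) >= self.capacity:
            # Step 1: stage the lowest-W cache line in the buffer
            # instead of writing it straight back to main memory.
            candidate = min(self.lines.values(), key=lambda l: l.score)
            del self.lines[candidate.tag]
            self.buffer.append(candidate)
            # Step 2: only when the buffer overflows does a line actually
            # go back to DRAM -- again the lowest-scoring one.
            if len(self.buffer) > self.buffer_capacity:
                victim = min(self.buffer, key=lambda l: l.score)
                self.buffer.remove(victim)
                self.dram_writebacks += 1
        self.lines[line.tag] = line
```

The design point the sketch makes concrete: several cache evictions can be absorbed by the buffer before a single DRAM writeback occurs, which is how the policy trims bus traffic and peak current draw.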

The algorithm’s runtime overhead is modest. Updating Recency and Frequency on each access is O(1). The buffer size is deliberately kept small (a few kilobytes), so the additional weight calculations are inexpensive. Consequently, the overall complexity remains comparable to LRU while delivering measurable energy benefits.
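The constant-time bookkeeping is straightforward; the snippet below is a minimal illustration (field names assumed, not taken from the paper).

```python
class LineMeta:
    """Per-line metadata updated in O(1) on every access."""
    __slots__ = ("last_access", "accesses")

    def __init__(self) -> None:
        self.last_access = 0
        self.accesses = 0

def on_access(meta: LineMeta, now: int) -> None:
    meta.last_access = now  # recency: timestamp of this access
    meta.accesses += 1      # frequency: running access count
```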

Experimental evaluation spans three benchmark suites: SPEC CPU2006 (compute‑intensive), TPC‑C (transactional database), and a set of mobile workloads (Android gaming, image processing). Tests were performed on an Intel Xeon server platform and an ARM Cortex‑A53 development board, with power measured using both external power analyzers and on‑chip sensors. Results show that LWRP reduces cache miss rates by 12‑18 % relative to LRU and achieves 9‑14 % lower power consumption compared with LFU. The most pronounced gains appear in the TPC‑C workload, where buffer‑mediated evictions cut DRAM accesses by roughly 22 % and lowered the average power draw by 14 %. In the mobile scenario, battery drain decreased by about 8 % without any noticeable latency penalty.

The authors acknowledge several limitations. The effectiveness of the buffer depends on its capacity and placement; an undersized buffer limits the policy’s ability to evaluate multiple candidates, while an oversized buffer may introduce its own static power overhead. Moreover, the weighting coefficients α, β, and γ were tuned manually for each benchmark, suggesting that a static configuration may not be optimal for mixed or dynamic workloads. Future work is proposed in three directions: (1) adaptive runtime tuning of the weight vector based on real‑time performance and energy metrics, (2) exploration of multi‑level buffer hierarchies (e.g., SRAM + eDRAM) to further amortize DRAM traffic, and (3) scaling the approach to large‑scale, multi‑socket server caches and distributed edge caches.

In summary, LWRP demonstrates that integrating power awareness into cache replacement decisions, combined with a lightweight buffer staging mechanism, can simultaneously improve hit rates and reduce energy consumption. The study provides a concrete blueprint for power‑conscious memory subsystem design and opens avenues for dynamic, energy‑adaptive cache management in next‑generation computing platforms.