Joint Cache Partition and Job Assignment on Multi-Core Processors

Multicore shared-cache processors pose a challenge for designers of embedded systems who try to achieve minimal and predictable execution time of workloads consisting of several jobs. To address this challenge, the cache is statically partitioned among the cores and the jobs are assigned to the cores so as to minimize the makespan. Several heuristic algorithms have been proposed that jointly decide how to partition the cache among the cores and assign the jobs. We initiate a theoretical study of this problem, which we call the joint cache partition and job assignment problem. By a careful analysis of the possible cache partitions, we obtain a constant-factor approximation algorithm for this problem. For some practical special cases we obtain a 2-approximation algorithm, and show how to improve the approximation factor even further by allowing the algorithm to use additional cache. We also study possible improvements that can be obtained by allowing dynamic cache partitions and dynamic job assignments. We define a natural special case of the well-known scheduling problem on unrelated machines in which machines are ordered by “strength”. Our joint cache partition and job assignment problem generalizes this scheduling problem, which we think is of independent interest. We give a polynomial-time algorithm for this scheduling problem for instances obtained by fixing the cache partition in a practical case of the joint cache partition and job assignment problem where job loads are step functions.

Research Summary

The paper tackles a fundamental problem in the design of embedded multicore systems: how to partition a shared last‑level cache among cores and assign a set of jobs to those cores so that the overall completion time (makespan) is minimized. Prior work addresses this joint decision with heuristic algorithms; the authors formalize it as the Joint Cache Partition and Job Assignment (JCP‑JA) problem and initiate its theoretical study, obtaining a constant-factor approximation algorithm.

Problem formulation.
The system consists of P cores and a total cache capacity of C units. Each job j has a runtime function f_j(s) that depends on the amount of cache s allocated to the core that runs it. Each f_j is a non‑increasing step function: as more cache is given, the job’s execution time either stays the same or drops to a lower plateau. The goal is to choose a cache partition (s_1,…,s_P) with ∑ s_i = C and a set A_i of jobs assigned to each core i that minimizes
\[ \max_{1 \le i \le P} \sum_{j \in A_i} f_j(s_i). \]
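
The objective in the problem formulation can be illustrated with a minimal Python sketch. It evaluates the makespan of a given cache partition and job assignment, and finds the optimum of a tiny instance by exhaustive search. The names `make_step_load`, `makespan` and `brute_force`, the breakpoint/value encoding of the step functions, and the example instance are all assumptions made for illustration; the exhaustive search is exponential and is not the approximation algorithm of the paper.

```python
from bisect import bisect_right
from itertools import product


def make_step_load(breakpoints, values):
    """Build a non-increasing step function f(s) (hypothetical encoding).

    breakpoints: sorted cache sizes at which the runtime drops.
    values: runtimes, one more entry than breakpoints, non-increasing.
    f(s) = values[k], where k is the number of breakpoints <= s.
    """
    def f(s):
        return values[bisect_right(breakpoints, s)]
    return f


def makespan(partition, assignment, loads):
    """Largest total runtime over all cores.

    partition[i] is the cache s_i given to core i;
    assignment[j] is the core that runs job j.
    """
    core_time = [0] * len(partition)
    for j, core in enumerate(assignment):
        core_time[core] += loads[j](partition[core])
    return max(core_time)


def brute_force(loads, num_cores, total_cache):
    """Try every integer partition summing to total_cache and every
    assignment; return (best makespan, partition, assignment)."""
    best = None
    for partition in product(range(total_cache + 1), repeat=num_cores):
        if sum(partition) != total_cache:
            continue
        for assignment in product(range(num_cores), repeat=len(loads)):
            value = makespan(partition, assignment, loads)
            if best is None or value < best[0]:
                best = (value, partition, assignment)
    return best


# Illustrative instance (made up): P = 2 cores, C = 4 cache units, 3 jobs.
loads = [
    make_step_load([2], [10, 4]),  # job 0: 10 below 2 units, 4 from 2 units on
    make_step_load([1], [6, 5]),   # job 1: 6 below 1 unit, 5 from 1 unit on
    make_step_load([3], [8, 2]),   # job 2: 8 below 3 units, 2 from 3 units on
]

print(makespan((2, 2), (0, 1, 1), loads))  # 13: core 1 runs jobs 1 and 2 (5 + 8)
print(brute_force(loads, 2, 4)[0])         # 6
```

In this instance the even split (2, 2) with jobs 1 and 2 sharing a core gives makespan 13, while an uneven partition such as (1, 3), with job 1 alone on the first core, reaches 6. That shows why the partition and the assignment have to be chosen together.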