A General Theory of Computational Scalability Based on Rational Functions
Neil J. Gunther∗
November 26, 2024
Abstract
The universal scalability law of computational capacity is a rational function Cp = P(p)/Q(p), with P(p) a linear polynomial and Q(p) a second-degree polynomial in the number of physical processors p, that has long been used for statistical modeling and prediction of computer system performance. We prove that Cp is equivalent to the synchronous throughput bound for a machine-repairman with state-dependent service rate. Simpler rational functions, such as Amdahl’s law and Gustafson speedup, are corollaries of this queue-theoretic bound. Cp is further shown to be both necessary and sufficient for modeling all practical characteristics of computational scalability.
1 Introduction
For several decades, a class of real functions called rational functions [1] has been used to represent
throughput scalability as a function of physical processor configuration. In particular, Amdahl’s
law [2], its modification due to Gustafson [3], and the Universal Scalability Law (USL) [4] have
found ubiquitous application. In this context, the relative computing capacity, Cp, is a rational
function of the number of physical processors p. It is defined as the quotient of a polynomial P(p)
in the numerator and a polynomial Q(p) in the denominator, i.e., Cp = P(p)/Q(p). Each of the
above-mentioned scalability models is distinguished by the number of coefficients, or fitting
parameters, associated with the polynomials P(p) and Q(p). For example, Amdahl’s law and Gustafson’s
modification are single-parameter models, whereas the USL model contains two parameters.
Despite their historical utility, these models have stood in isolation without any deeper physical
interpretation. It has even been suggested that Amdahl’s law is not fundamental [5]. More
importantly, the lack of a unified physical interpretation has led to the use of certain flawed
scalability models [6]. In this note, we demonstrate that the aforementioned class of rational
functions corresponds to certain performance bounds belonging to a queue-theoretic model.
The idea that Amdahl’s law, which has most frequently been associated with the scalability
of massively parallel systems, can be considered from a queue-theoretic standpoint is not entirely
new [see, e.g., 7, 8]. However, quite apart from having motivations entirely different from our own,
those previous works employed open queueing models with an unbounded number of requests (see
Appendix C), whereas we shall use a closed queueing model with a finite number of requests
p corresponding to the number of physical processors. The USL function is associated with a
state-dependent generalization of the machine repairman [9].
The organization of this paper is as follows. We briefly review the scalability models of interest
in Sect. 2. The appropriate queueing metrics associated with the standard machine repairman and
its state-dependent extension are discussed in Sect. 3. The performance characteristics associated
with synchronous queueing are also presented there. The main theorem (Theorem 2) is established
in Sect. 4. Amdahl’s law and Gustafson’s linear speedup are shown to be corollaries of this theorem.
Finally, in Sect. 5 we prove an earlier conjecture that a rational function with Q(p) a second-degree
polynomial is both necessary and sufficient to model all practical cases of computational scalability.
∗Performance Dynamics Company, 4061 East Castro Valley Blvd., Suite 110, Castro Valley, CA 94552, USA. Email: njgunther @ perfdynamics . com
arXiv:0808.1431v2 [cs.PF] 25 Aug 2008
2 Parametric Models
Although, technically, we are discussing rational functions, we shall hereafter refer to them as
parametric models, and to their coefficients as parameters, since the primary application of these
models is nonlinear statistical regression of performance data [see, e.g., 4, 10, 11, 12, and
references therein].
[Figure 1 plot: relative capacity Cp (vertical axis, ticks 5 to 20) versus number of processors p (horizontal axis, 0 to 100).]
Figure 1: Parametric models: USL (red), Amdahl (green), Gustafson (blue), with parameter values
exaggerated to distinguish their typical characteristics relative to ideal linear scaling (dashed). The
horizontal line is the Amdahl asymptote at σ^{-1}.
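As a rough companion to Figure 1, the three parametric models can be tabulated numerically. The sketch below is illustrative only: the Gustafson and USL expressions are the commonly quoted textbook forms and are assumed here rather than taken from this section, the function names are hypothetical, and the parameter values are arbitrary choices, not those used to draw the figure.

```python
def amdahl(p, sigma):
    """Amdahl speedup with serial fraction sigma (one parameter)."""
    return p / (1.0 + sigma * (p - 1))


def gustafson(p, sigma):
    """Gustafson scaled speedup (assumed standard form): linear in p."""
    return p - sigma * (p - 1)


def usl(p, sigma, kappa):
    """USL capacity (assumed standard form): linear numerator,
    second-degree denominator in p, two parameters sigma and kappa."""
    return p / (1.0 + sigma * (p - 1) + kappa * p * (p - 1))


if __name__ == "__main__":
    # Illustrative parameter values only; not the values behind Figure 1.
    sigma, kappa = 0.05, 0.001
    print("p  amdahl  gustafson  usl")
    for p in (1, 20, 40, 60, 80, 100):
        print(p,
              round(amdahl(p, sigma), 3),
              round(gustafson(p, sigma), 3),
              round(usl(p, sigma, kappa), 3))
```

With these assumed forms, setting kappa to zero reduces the USL curve to the Amdahl curve, which saturates below 1/sigma, while a positive kappa makes the capacity pass through a maximum and then decline.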
Definition 1 (Speedup). If an amount of work N is completed in time T1 on a uniprocessor, the
same amount of work can be completed in time Tp < T1 on a p-way multiprocessor. The speedup
Sp = T1/Tp is one measure of scalability.
2.1 Amdahl’s law
For a single task that takes time T1 to execute on a uniprocessor (p = 1), Amdahl’s law [2]
states that if the task can be equipartitioned onto p processors, but contains an irreducible fraction
of sequential work σ ∈ [0, 1], then only the remaining portion of the execution time (1 − σ)T1
can be executed as p parallel subtasks on p physical processors. The bound on the achievable
equipartitioned speedup [13] is given by the ratio
\[
S_p(\sigma) = \frac{T_1}{\sigma T_1 + \left( \frac{1 - \sigma}{p} \right) T_1} \tag{1}
\]
which simplifies to
\[
S_p(\sigma) = \frac{p}{1 + \sigma (p - 1)} ; \tag{2}
\]
a rational functio
…(Full text truncated)…
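The step from Eq. (1) to Eq. (2) amounts to cancelling T1 and multiplying numerator and denominator by p. A minimal numerical sketch of that equivalence, and of the Amdahl asymptote at σ^{-1} mentioned in the Figure 1 caption, is given below. The function names and sample values are hypothetical and chosen purely for illustration.

```python
def speedup_from_times(t1, sigma, p):
    """Eq. (1): uniprocessor time divided by the p-way execution time."""
    tp = sigma * t1 + ((1.0 - sigma) / p) * t1  # serial part + parallel part
    return t1 / tp


def amdahl_speedup(sigma, p):
    """Eq. (2): the same bound after cancelling T1."""
    return p / (1.0 + sigma * (p - 1))


if __name__ == "__main__":
    t1, sigma = 120.0, 0.1  # hypothetical run time and serial fraction
    for p in (1, 8, 64, 1024):
        # The two columns should agree to floating-point precision.
        print(p, speedup_from_times(t1, sigma, p), amdahl_speedup(sigma, p))
    # For large p the speedup approaches, but never reaches, 1/sigma.
    print(amdahl_speedup(sigma, 10**6), 1.0 / sigma)
```

Note that the result is independent of T1, which is why Eq. (2) contains only the single parameter σ.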