arXiv:1003.3272v1 [stat.CO] 16 Mar 2010
Graphics Processing Units and
High-Dimensional Optimization
Hua Zhou, Kenneth Lange and Marc A. Suchard
Department of Human Genetics, University of California, Los Angeles, e-mail:
huazhou@ucla.edu.
Departments of Biomathematics, Human Genetics, and Statistics, University of California,
Los Angeles, e-mail: klange@ucla.edu.
Departments of Biomathematics, Biostatistics, and Human Genetics, University of
California, Los Angeles, e-mail: msuchard@ucla.edu.
Abstract: This paper discusses the potential of graphics processing units
(GPUs) in high-dimensional optimization problems. A single GPU card
with hundreds of arithmetic cores can be inserted in a personal computer
and dramatically accelerates many statistical algorithms. To exploit these
devices fully, optimization algorithms should reduce to multiple parallel
tasks, each accessing a limited amount of data. These criteria favor EM
and MM algorithms that separate parameters and data. To a lesser extent,
block relaxation and coordinate descent and ascent also qualify. We demonstrate
the utility of GPUs in nonnegative matrix factorization, PET image
reconstruction, and multidimensional scaling. Speedups of 100-fold can easily
be attained. Over the next decade, GPUs will fundamentally alter the
landscape of computational statistics. It is time for more statisticians to
get on board.
Keywords and phrases: Block relaxation, EM and MM algorithms, mul-
tidimensional scaling, nonnegative matrix factorization, parallel computing,
PET scanning.
1. Introduction
Statisticians, like all scientists, are acutely aware that the clock speeds on their
desktops and laptops have stalled. Does this mean that statistical computing
has hit a wall? The answer fortunately is no, but the hardware advances that
we routinely expect have taken an interesting detour. Most computers now sold
have two to eight processing cores. Think of these as separate CPUs on the
same chip. Naive programmers rely on sequential algorithms and often fail to
take advantage of more than a single core. Sophisticated programmers, the kind
who work for commercial firms such as MathWorks, eagerly exploit parallel
programming. However, multicore CPUs do not represent the only road to the success
of statistical computing.
Graphics processing units (GPUs) have caught the scientific community by
surprise. These devices are designed for graphics rendering in computer anima-
tion and games. Propelled by these nonscientific markets, the old technology of
numerical (array) coprocessors has advanced rapidly. Highly parallel GPUs are
now making computational inroads against traditional CPUs in image process-
ing, protein folding, stock options pricing, robotics, oil exploration, data mining,
and many other areas [27]. We are starting to see orders of magnitude improve-
ment on some hard computational problems. Three companies, Intel, NVIDIA,
and AMD/ATI, dominate the market. Intel is struggling to keep up with its
more nimble competitors.
Modern GPUs support more vector and matrix operations, stream data
faster, and possess more local memory per core than their predecessors. They
are also readily available as commodity items that can be inserted as video
cards on modern PCs. GPUs have been criticized for their hostile program-
ming environment and lack of double precision arithmetic and error correction,
but these faults are being rectified. The CUDA programming environment [26]
for NVIDIA chips is now easing some of the programming chores. We could
say more about near-term improvements, but most pronouncements would be
obsolete within months.
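The data-parallel style that CUDA exposes can be sketched, in spirit, on a CPU. The toy Python below is illustrative only (the function names are invented here, a thread pool stands in for the card's arithmetic cores, and this is not CUDA code): every worker applies the identical scalar operation to its own element, touching no other worker's data.

```python
from multiprocessing.dummy import Pool  # thread pool; stands in for GPU cores


def kernel(x):
    """The per-element 'kernel': every worker runs the same operation."""
    return x * x + 1.0


def parallel_map(data, n_workers=4):
    # Each worker applies the identical kernel to its own elements,
    # mirroring the lockstep execution that gives GPUs their speed.
    with Pool(n_workers) as pool:
        return pool.map(kernel, data)


if __name__ == "__main__":
    print(parallel_map([0.0, 1.0, 2.0, 3.0]))  # -> [1.0, 2.0, 5.0, 10.0]
```

The essential point is that the kernel body is uniform across elements; branching or shared-data access by individual workers erodes exactly the parallelism the hardware rewards.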
Oddly, statisticians have been slow to embrace the new technology. Silberstein
et al. [30] first demonstrated the potential of GPUs for fitting simple Bayesian
networks. Recently Suchard and Rambaut [32] have seen greater than 100-fold
speedups in MCMC simulations in molecular phylogeny. Lee et al. [17] and
Tibbits et al. [33] are following suit with Bayesian model fitting via particle
filtering and slice sampling. Finally, work is underway to port common data
mining techniques such as hierarchical clustering and multifactor dimensionality
reduction onto GPUs [31]. These efforts constitute the first wave of an eventual
flood of statistical and data mining applications. The porting of GPU tools into
the R environment will undoubtedly accelerate the trend [3].
Not all problems in computational statistics can benefit from GPUs. Sequen-
tial algorithms are resistant unless they can be broken into parallel pieces. Even
parallel algorithms can be problematic if the entire range of data must be ac-
cessed by each GPU. Because they have limited memory, GPUs are designed to
operate on short streams of data. The greatest speedups occur when all of the
cores on a card perform the same arithmetic operation simultaneously. Effective
applications of GPUs in optimization involve both separati
…(Full text truncated)…
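Although the text is truncated before the applications, the abstract's nonnegative matrix factorization example makes the separation criterion concrete. The sketch below implements the standard Lee-Seung multiplicative update for the Frobenius objective in plain Python (illustrative names, not the paper's GPU code). Every entry of H obeys the same formula and reads only a limited slab of data, which is the structure that maps naturally onto GPU cores.

```python
def matmul(A, B):
    """Naive dense matrix product, adequate for a small illustration."""
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def nmf_update(V, W, H, eps=1e-9):
    """One Lee-Seung multiplicative update of H for V ~ W H.

    Each entry H[i][j] is rescaled by (W^T V)[i][j] / (W^T W H)[i][j].
    The updates are mutually independent, so on a GPU each core could
    own one entry -- the parameter separation the abstract describes.
    """
    Wt = transpose(W)
    num = matmul(Wt, V)              # W^T V
    den = matmul(matmul(Wt, W), H)   # W^T W H
    return [[H[i][j] * num[i][j] / (den[i][j] + eps)
             for j in range(len(H[0]))]
            for i in range(len(H))]
```

As a sanity check, if V already equals W H exactly, the numerator and denominator agree entry by entry and the update leaves H essentially unchanged, consistent with the fixed-point property of multiplicative updates.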