An HPC Benchmark Survey and Taxonomy for Characterization

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

The field of High-Performance Computing (HPC) is defined by providing computing systems with the highest performance to a variety of demanding scientific users. The tight co-design relationship between HPC providers and users, paired with technological improvements, propels the field forward, achieving continuously higher performance and resource utilization. Key tools for system architects, architecture researchers, and scientific users are benchmarks, which allow for a well-defined assessment of hardware, software, and algorithms. Many benchmarks exist in the community, from individual niche benchmarks testing specific features to large-scale benchmark suites for whole procurements. We survey the available HPC benchmarks, summarizing them in table form with key details and concise categorization, also presented through an interactive website. For the categorization, we present a benchmark taxonomy that enables well-defined characterization of benchmarks.


💡 Research Summary

The paper presents a comprehensive survey of high‑performance computing (HPC) benchmarks, addressing the growing difficulty of locating, comparing, and reusing the myriad of existing tests. The authors collected more than 180 individual benchmarks and 13 benchmark suites, spanning classic synthetic tests such as HPL and STREAM, mini‑applications derived from real scientific codes, and large‑scale, GPU‑accelerated workloads used in recent supercomputer procurements. For each entry they recorded a uniform set of metadata: name, download URL, license, reference publication, and a free‑form notes field. All information is stored in a machine‑readable YAML schema and made publicly available via a GitHub repository.
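The metadata fields above can be pictured as a small record type. The following is a minimal Python sketch, not the survey's actual schema: the field names mirror those listed in the text (name, URL, license, publication, notes), but the real YAML layout in the repository may differ, and the example entry's details are illustrative.

```python
from dataclasses import dataclass

# Hypothetical sketch of one survey entry. Field names follow the
# metadata listed in the text; the repository's actual YAML schema
# may use different keys or additional fields.
@dataclass
class BenchmarkEntry:
    name: str
    url: str
    license: str
    publication: str = ""
    notes: str = ""

# Illustrative entry for a classic synthetic benchmark.
stream = BenchmarkEntry(
    name="STREAM",
    url="https://www.cs.virginia.edu/stream/",
    license="custom",
    publication="McCalpin, 1995",
    notes="Synthetic memory-bandwidth benchmark.",
)

print(stream.name)  # STREAM
```

A record like this maps one-to-one onto a YAML mapping, which is what makes the survey's data model easy to parse from external tools.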

A central contribution is the design of a “Benchmark Taxonomy” that provides a structured, multi‑dimensional classification of benchmarks. The taxonomy consists of twelve top‑level categories—application‑domain, benchmark‑scale, communication, compute‑performance‑characteristics, memory‑access‑characteristics, method‑type, programming‑language, programming‑model, and others—each populated with a curated list of possible tag values (e.g., MPI, NCCL for communication; dense‑linear‑algebra, FFT for method‑type; CUDA, OpenMP for programming‑model). By combining multiple tags, a benchmark can be described with fine granularity, enabling precise filtering and comparison. The taxonomy is deliberately extensible: new tags can be added to the YAML files without breaking existing tools.
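The multi-tag classification described above can be sketched as a simple conjunctive filter. This is a hypothetical illustration, not the survey's tooling: the category and tag names mirror examples from the text, but the real tag vocabulary lives in the survey's YAML files, and STREAM's tags here are assumed for the example.

```python
# Each benchmark maps taxonomy categories to sets of tags.
# Names are illustrative; the actual vocabulary is defined in the
# survey's YAML files.
benchmarks = {
    "HPL": {
        "method-type": {"dense-linear-algebra"},
        "programming-model": {"MPI"},
    },
    "STREAM": {
        "memory-access-characteristics": {"streaming"},  # assumed tag
        "programming-model": {"OpenMP"},
    },
}

def matches(tags, query):
    """True if every (category, tag) pair in the query is present."""
    return all(tag in tags.get(cat, set()) for cat, tag in query)

# Combining tags from several categories narrows the selection.
query = [("method-type", "dense-linear-algebra"),
         ("programming-model", "MPI")]
hits = [name for name, tags in benchmarks.items() if matches(tags, query)]
print(hits)  # ['HPL']
```

Because new tags only extend the per-category sets, adding them to the data files leaves a filter like this working unchanged, which is the extensibility property the taxonomy aims for.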

The authors also built an interactive web portal (fzj‑jsc.github.io/benchmark‑survey) that visualizes the full table, supports dynamic filtering by any combination of tags, and automatically incorporates community contributions submitted through pull requests. This contrasts with traditional static PDF tables and greatly improves discoverability.

In the evaluation, the authors categorize the 13 suites into three groups: procurement‑oriented suites (e.g., OLCF‑6, TS‑5, NERSC‑10, JUPITER, CORAL‑2) that are used to evaluate large, multi‑node systems and typically contain a mix of GPU‑accelerated application benchmarks and synthetic tests; research‑oriented suites that target specific domains or algorithmic studies; and smaller, educational or benchmark‑development suites. They note that most modern suites emphasize GPU acceleration and multi‑node scaling, reflecting current hardware trends.

The paper discusses limitations of existing benchmarks, such as poor portability across hardware generations, over‑specialization, and the difficulty of creating robust, repeatable, and stable tests. By providing a unified taxonomy and a centralized, openly licensed repository, the authors aim to mitigate these issues, promote standardization, and encourage community‑driven curation.

In conclusion, the work delivers a valuable infrastructure for the HPC community: a searchable, extensible catalog of benchmarks, a well‑defined taxonomy for precise characterization, and an open‑source data model that can serve as a foundation for future benchmarking efforts, tool development, and comparative performance studies.

