- Title: A Study of Network Congestion in Two Supercomputing High-Speed Interconnects
- ArXiv ID: 1907.05312
- Date: 2019-07-12
- Authors: Saurabh Jha, Archit Patke, Jim Brandt, Ann Gentile, Mike Showerman, Eric Roman, Zbigniew T. Kalbarczyk, William T. Kramer, Ravishankar K. Iyer
📝 Abstract
Network congestion in high-speed interconnects is a major source of application run time performance variation. Recent years have witnessed a surge of interest from both academia and industry in the development of novel approaches for congestion control at the network level and in application placement, mapping, and scheduling at the system-level. However, these studies are based on proxy applications and benchmarks that are not representative of field-congestion characteristics of high-speed interconnects. To address this gap, we present (a) an end-to-end framework for monitoring and analysis to support long-term field-congestion characterization studies, and (b) an empirical study of network congestion in petascale systems across two different interconnect technologies: (i) Cray Gemini, which uses a 3-D torus topology, and (ii) Cray Aries, which uses the DragonFly topology.
💡 Summary & Analysis
This paper focuses on analyzing network congestion in high-speed interconnects used in supercomputing environments. Specifically, it compares and contrasts the performance of two different interconnect technologies: Cray Gemini (which uses a 3-D torus topology) and Cray Aries (which employs the DragonFly topology). The primary challenge addressed is that existing studies often rely on proxy applications and benchmarks which do not accurately represent real-world congestion scenarios in high-speed interconnects. To tackle this issue, the authors present an end-to-end framework for monitoring and analyzing long-term field-congestion characteristics using a tool called Monet.
The study examines how routing algorithms and link bandwidth heterogeneity affect network congestion. Key findings include evidence that Aries’ DragonFly topology is superior to Gemini’s 3-D torus in managing network congestion due to its lower global diameter of one hop, which helps contain back pressure from congested links. Furthermore, adaptive routing allows packets to take longer but less congested paths, thereby alleviating congestion on the minimal path.
The paper also highlights that link bandwidth heterogeneity can exacerbate congestion issues. By understanding these dynamics and their effects on network performance, researchers and developers can design more efficient network architectures and algorithms that mitigate congestion and improve overall system reliability and efficiency.
📄 Full Paper Content (ArXiv Source)
# Tool Demonstration
Figure 3: Congested link durations vs. PTS threshold for Blue Waters
(Gemini) and Edison (Aries). Panels: Gemini, Aries.
In this section, we describe two results obtained by applying the
Monet tool to field-congestion data:
- the impact of routing algorithms on congestion (see Subsection 4.1)
- the impact of heterogeneity in link bandwidth on congestion (see
  Subsection 4.2)
## Impact of Routing Algorithms
Figure 3 shows the quantile values for
different congested link durations, i.e., durations for which the PTS
value on the link is above a fixed threshold ($`PTS_{th}`$). The figure
leads to the following insights:
Use of the Dragonfly topology and adaptive routing has improved
congestion control between the two generations of Cray
interconnects. The Dragonfly topology used in Aries has a low global
diameter of one hop, which helps to contain the back pressure of
congested links. Furthermore, adaptive routing allows packets to take
a longer but less congested path, which helps to alleviate congestion
on the minimal path.
Figure 3 provides empirical evidence for
that observation. For every $`PTS_{th}`$ threshold, the congested link
duration in Aries is an order of magnitude less than in Gemini. For
example, with the congestion threshold fixed at 15% PTS, the median
duration is close to zero in both systems, but the 99.9th percentile
duration is approximately 1 minute for Edison and 400 minutes for
Blue Waters. However, while Aries manages long bouts of
congestion better than Gemini does, application runtime variability
due to network performance remains a concern.
*Detection of long-duration congestion using traffic measurements can
facilitate interventions such as rank remapping or rescheduling of
bully jobs.* The 99.9th percentile congested link duration observed
in both systems for $`PTS_{th} \le 20\%`$ is greater than a minute.
Because such congestion is long-lived, real-time detection and
diagnosis can tolerate greater latency. Moreover, a diagnosis can be
converted to actionable feedback for tools such as TopoMesh, which can
remap MPI ranks, or for the scheduler, which can reschedule bully jobs.
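The notion of a congested link duration used above can be made concrete with a short sketch. The code below is illustrative only and is not taken from Monet: the function names, the strict "above threshold" comparison, the nearest-rank percentile, and the 60-second sampling interval are all assumptions chosen for the example. It extracts contiguous runs of samples whose PTS exceeds $`PTS_{th}`$, converts them to durations, and flags runs long enough to justify an intervention.

```python
import math
from itertools import groupby


def congested_durations(pts_samples, pts_th, interval_s=60.0):
    """Durations (seconds) of contiguous runs with PTS strictly above pts_th.

    pts_samples: PTS values (percent) for one link, sampled at a fixed
    interval. interval_s is an assumed sampling period, not a Monet setting.
    """
    durations = []
    for above, run in groupby(pts_samples, key=lambda v: v > pts_th):
        if above:
            durations.append(sum(1 for _ in run) * interval_s)
    return durations


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100.0 * len(ordered)))
    return ordered[rank - 1]


def long_congestion(durations, min_duration_s=60.0):
    """Keep only runs long enough to be worth acting on (hypothetical cutoff)."""
    return [d for d in durations if d >= min_duration_s]


# Toy PTS trace for a single link, with a 15% threshold.
trace = [5, 20, 30, 10, 16, 16, 16, 2]
durations = congested_durations(trace, pts_th=15)
print(durations, percentile(durations, 99.9), long_congestion(durations, 150.0))
```

In practice the durations from all links would be pooled before taking quantiles, which is how a tail statistic such as the 99.9th percentile becomes meaningful.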
Figure 7: Congested link durations for different link types in
Gemini. Panels: (a) X+ and X-, (b) Y+ and Y-, (c) Z+ and Z-.
Figure 11: Congested link durations for different link types in
Aries. Panels: (a) Green, (b) Black, (c) Blue.
## Impact of Heterogeneity in Link-bandwidth
Heterogeneity in link bandwidth across different link types (electrical
and optical links) increases the susceptibility to congestion.
Figure 7 (a), Figure 7 (b), and Figure 7 (c) show congested link
durations at different quantile values for the X, Y, and Z
directional links, respectively, of the Cray Gemini interconnect in
Blue Waters. Figure 11 (a), Figure 11 (b), and Figure 11 (c) show the
congested link durations at different quantile values for the Green,
Black, and Blue links, respectively, of the Cray Aries interconnect
in Edison. In Gemini, for higher $`PTS_{th}`$ thresholds
($`\ge20\%`$), links along the X direction have longer-lasting
congestion than links along the Y and Z directions. Similarly, in
Aries, optical links (Blue) have shorter and less severe bursts of
congestion than the electrical links (Green and Black). Thus,
mismatch and heterogeneity in link bandwidth lead to varying levels
of congestion along a network path.
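A per-link-type comparison like the one above can be sketched by grouping congested durations by link label before summarizing them. This is a minimal illustration under stated assumptions, not the authors' analysis code: the record layout `(link_type, pts_samples)`, the function names, and the 60-second sampling interval are hypothetical.

```python
from collections import defaultdict


def durations_by_link_type(records, pts_th, interval_s=60.0):
    """Group congested-run durations (seconds) by link type.

    records: iterable of (link_type, pts_samples) pairs, one per link;
    the layout is an assumption made for this example.
    """
    grouped = defaultdict(list)
    for link_type, samples in records:
        run = 0  # length of the current run of samples above pts_th
        for value in samples:
            if value > pts_th:
                run += 1
            elif run:
                grouped[link_type].append(run * interval_s)
                run = 0
        if run:  # close a run that reaches the end of the trace
            grouped[link_type].append(run * interval_s)
    return dict(grouped)


def longest_by_type(grouped):
    """Longest congested duration observed for each link type."""
    return {t: max(d) for t, d in grouped.items() if d}


# Toy traces labeled with Gemini-style directions, with a 20% threshold.
records = [("X", [25, 25, 25, 0]), ("Y", [25, 0, 25, 0]), ("X", [0, 30])]
grouped = durations_by_link_type(records, pts_th=20)
print(grouped, longest_by_type(grouped))
```

Link types with no congested run are simply absent from the result, so a caller comparing types should treat a missing key as zero congestion.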