Quality Up in Polynomial Homotopy Continuation by Multithreaded Path Tracking
Speedup measures how much faster we can solve the same problem using many cores. If we can afford to keep the execution time fixed, then quality up measures how much better the solution will be computed using many cores. In this paper we describe our multithreaded implementation to track one solution path defined by a polynomial homotopy. Limiting quality to accuracy and confusing accuracy with precision, we strive to offset the cost of multiprecision arithmetic running multithreaded code on many cores.
💡 Research Summary
The paper addresses the problem of tracking a single solution path in polynomial homotopy continuation on modern multicore workstations. The traditional measure of parallel performance is speedup: how much faster a fixed problem can be solved using p cores than using one. The authors introduce the complementary notion of “quality up.” Quality is defined as the number of correct decimal places in the computed solution, and quality‑up is the ratio Qp/Q1, where Qp is the quality obtained with p cores and Q1 the quality obtained with one core, when the execution time is held constant. In an ideal scenario, both speedup and quality‑up would scale linearly with the number of cores.
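The two measures can be written side by side. The symbols T_1 and T_p (execution time on one core and on p cores) are notation assumed here for illustration; the summary itself only names the ratio Qp/Q1.

```latex
\[
  \text{speedup}(p) = \frac{T_1}{T_p} \quad (\text{problem fixed}),
  \qquad
  \text{quality up}(p) = \frac{Q_p}{Q_1} \quad (\text{execution time fixed}).
\]
```

In the ideal case described above, both ratios would equal p.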
To realize quality‑up, the authors implement a multithreaded path tracker that uses extended‑precision arithmetic provided by the QD‑2.3.9 library. They work with double‑double (≈31 decimal digits) and quad‑double (≈62 decimal digits) floating‑point formats, whose cost overhead is comparable to that of complex arithmetic. Because a double‑double number has a fixed size (a pair of hardware doubles), its arithmetic does not require frequent memory allocation, which makes it well suited for shared‑memory multithreading.
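To illustrate why a double‑double value needs no dynamic memory, the following minimal sketch represents it as a plain pair of Python floats and adds two such pairs with an error‑free transformation. The function names `two_sum` and `dd_add` are hypothetical, and this simplified addition is not the QD library's implementation.

```python
def two_sum(a, b):
    """Return (s, e) where s = fl(a + b) and a + b = s + e exactly."""
    s = a + b
    bb = s - a
    e = (a - (s - bb)) + (b - bb)
    return s, e


def dd_add(x, y):
    """Add two double-doubles, each stored as a (hi, lo) pair of floats.

    Simplified variant: the low parts are added in plain double arithmetic.
    """
    s, e = two_sum(x[0], y[0])   # exact sum of the high parts
    e += x[1] + y[1]             # fold in the low parts
    return two_sum(s, e)         # renormalize so |lo| is tiny relative to hi


one = (1.0, 0.0)
tiny = (1e-20, 0.0)
plain = 1.0 + 1e-20              # the small term is lost in double precision
hi, lo = dd_add(one, tiny)       # the small term survives in the low part
```

Here `plain` equals 1.0 exactly, while the pair `(hi, lo)` still carries the 1e-20 contribution.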
The path tracker follows the classic predictor–corrector scheme. The predictor (either a secant or a newly proposed quadratic predictor) generates an initial guess for the next point on the path; Newton’s method acts as the corrector; a step‑size control algorithm adjusts the step length. Algorithm 3.1 describes the overall multithreaded workflow, where the first thread performs prediction and step‑size control, while all threads cooperate in the Newton correction.
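The predictor–corrector loop can be sketched for a single real variable as follows. This is a sequential illustration under stated assumptions, not the paper's Algorithm 3.1: the homotopy `h`, the function name `track`, the step‑size factors 1.25 and 0.5, and the tolerances are all hypothetical choices.

```python
def h(x, t):
    """Example homotopy: deforms x^2 - 1 = 0 (t = 0) into x^2 - 4 = 0 (t = 1)."""
    return (1.0 - t) * (x * x - 1.0) + t * (x * x - 4.0)


def dh_dx(x, t):
    """Derivative of h with respect to x."""
    return 2.0 * x


def track(h, dh_dx, x0, step=0.1, tol=1e-12, max_newton=5, min_step=1e-8):
    """Track one path from t = 0 to t = 1 with a secant predictor,
    a Newton corrector, and a simple step-size control."""
    t_prev, x_prev = 0.0, x0
    t, x = 0.0, x0
    while t < 1.0:
        t_next = min(t + step, 1.0)
        # Predictor: extrapolate through the two most recent points.
        if t > t_prev:
            guess = x + (x - x_prev) * (t_next - t) / (t - t_prev)
        else:
            guess = x  # no history yet on the first step
        # Corrector: a few Newton iterations at the new value of t.
        z, converged = guess, False
        for _ in range(max_newton):
            dz = -h(z, t_next) / dh_dx(z, t_next)
            z += dz
            if abs(dz) < tol:
                converged = True
                break
        # Step-size control: grow after success, shrink after failure.
        if converged:
            t_prev, x_prev = t, x
            t, x = t_next, z
            step *= 1.25
        else:
            step *= 0.5
            if step < min_step:
                raise RuntimeError("step size too small; path tracking failed")
    return x


end_point = track(h, dh_dx, 1.0)  # follows the path x(t) = sqrt(1 + 3t)
```

Starting from the root x = 1 of the start system, the tracker arrives at the root x = 2 of the target system.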
Algorithm 4.1 details the multithreaded Newton method. The evaluation of monomials and the multiplication by coefficients are embarrassingly parallel and require no synchronization. The Jacobian matrix is assembled, and a Gaussian elimination with partial pivoting is performed on the augmented matrix. Pivot selection is done by the first thread; after a pivot is chosen, all threads update their assigned rows. A barrier synchronizes the threads before proceeding to back substitution, which is also parallelized. After solving for the correction Δz, the first thread updates the solution vector.
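The synchronization pattern of the elimination stage (first thread selects the pivot, all threads update their rows, a barrier separates the stages) can be sketched with Python's `threading` module. This is an illustration of the pattern only: the function name `solve_threaded` and the round‑robin row assignment are assumptions, the matrix is assumed nonsingular, back substitution is kept sequential for brevity, and Python's global interpreter lock means this sketch will not show the speedups reported in the paper.

```python
import threading


def solve_threaded(A, b, n_threads=4):
    """Solve A x = b by Gaussian elimination with partial pivoting
    on the augmented matrix [A | b], sharing row updates among threads."""
    n = len(A)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    barrier = threading.Barrier(n_threads)

    def worker(tid):
        for k in range(n):
            if tid == 0:
                # First thread selects the pivot row and swaps it into place.
                p = max(range(k, n), key=lambda i: abs(M[i][k]))
                M[k], M[p] = M[p], M[k]
            barrier.wait()  # all threads now see the chosen pivot row
            # Each thread updates its own rows below the pivot (round robin).
            for i in range(k + 1 + tid, n, n_threads):
                f = M[i][k] / M[k][k]
                for j in range(k, n + 1):
                    M[i][j] -= f * M[k][j]
            barrier.wait()  # all updates done before the next pivot search

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()

    # Back substitution on the upper triangular system (sequential here).
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


solution = solve_threaded([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
```

For this small system the exact solution is (2, 3, −1). In a Newton step, A would be the Jacobian matrix and b the negated function value, so that x is the correction Δz.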
Performance experiments use a synthetic 40‑variable system with 200 monomials, each of average degree 40. Table 1 reports wall‑clock times for polynomial evaluation, Gaussian elimination, and back substitution as the number of threads increases from 1 to 8. With eight cores, polynomial evaluation drops from 35.7 s to 4.78 s (≈7.5× speedup), and total time falls from 40.8 s to 6.18 s (≈6.6× speedup). The authors note that polynomial evaluation dominates the cost, so multithreading this stage yields the greatest benefit.
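The quoted speedups follow directly from the reported times. The short check below also derives the parallel efficiency (speedup divided by the number of threads), a quantity not stated in the summary and computed here only for illustration; the function names are hypothetical.

```python
def speedup(t_one, t_many):
    """Ratio of the one-thread time to the many-thread time."""
    return t_one / t_many


def efficiency(t_one, t_many, threads):
    """Speedup divided by the number of threads (1.0 is ideal)."""
    return speedup(t_one, t_many) / threads


# Times in seconds as quoted above for 1 and 8 threads.
eval_speedup = speedup(35.7, 4.78)          # polynomial evaluation
total_speedup = speedup(40.8, 6.18)         # total time
eval_efficiency = efficiency(35.7, 4.78, 8)
total_efficiency = efficiency(40.8, 6.18, 8)
```

The evaluation stage runs at roughly 93% efficiency and the total at roughly 83%, consistent with the remark that the stages outside polynomial evaluation parallelize less well.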
A major contribution is the quadratic predictor. Unlike the secant predictor, which uses only the two most recent points, the quadratic predictor fits a parabola through three points (t₁, x₁), (t₂, x₂), (t₃, x₃) for each coordinate independently and evaluates it at the next t. This operation scales linearly with dimension and adds negligible overhead, yet it dramatically reduces the number of Newton corrections required. In a 20‑dimensional test, the secant predictor needed 11 362 successful corrections and 26 min 53 s of runtime, whereas the quadratic predictor required only 572 corrections and 8.86 s. For a 40‑dimensional problem, the quadratic predictor reduced runtime from several days (secant) to a few minutes. Tables 2 and 3 illustrate these gains.
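A coordinate‑wise parabola through three points can be evaluated with Lagrange weights, as in the sketch below. Whether the paper uses this form or divided differences is not stated in the summary, so the formulation and the function names `quadratic_predict` and `secant_predict` are assumptions made for illustration.

```python
def secant_predict(ts, xs, t_next):
    """Extrapolate linearly through the two most recent points."""
    (t2, t3), (x2, x3) = ts[-2:], xs[-2:]
    w = (t_next - t3) / (t3 - t2)
    return [b + w * (b - a) for a, b in zip(x2, x3)]


def quadratic_predict(ts, xs, t_next):
    """Evaluate at t_next the parabola through (t1, x1), (t2, x2), (t3, x3),
    fitted independently for each coordinate."""
    t1, t2, t3 = ts
    x1, x2, x3 = xs
    # Lagrange weights depend only on the t-values, so they are computed once
    # and reused for every coordinate: the cost is linear in the dimension.
    w1 = (t_next - t2) * (t_next - t3) / ((t1 - t2) * (t1 - t3))
    w2 = (t_next - t1) * (t_next - t3) / ((t2 - t1) * (t2 - t3))
    w3 = (t_next - t1) * (t_next - t2) / ((t3 - t1) * (t3 - t2))
    return [w1 * a + w2 * b + w3 * c for a, b, c in zip(x1, x2, x3)]


def path(t):
    """Toy two-dimensional path that is exactly quadratic in t."""
    return [t * t, 1.0 - 3.0 * t + 2.0 * t * t]


ts = (0.0, 0.1, 0.2)
xs = [path(t) for t in ts]
quad_guess = quadratic_predict(ts, xs, 0.3)
sec_guess = secant_predict(ts, xs, 0.3)
```

On this toy path the quadratic prediction reproduces the true point up to rounding, while the secant prediction is off in both coordinates; a better prediction leaves less work for the Newton corrector.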
The paper also incorporates algorithmic differentiation to compute all partial derivatives efficiently, further lowering the cost of Jacobian assembly.
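One well‑known instance of this idea is computing all partial derivatives of a product of variables with forward and backward partial products, which needs a number of multiplications linear in the number of variables. The sketch below shows that technique; the summary does not say which scheme the paper uses, so this is an assumed illustration and the function name is hypothetical.

```python
def product_and_gradient(x):
    """Return x[0]*x[1]*...*x[n-1] and the list of all n partial derivatives,
    using about 3n multiplications and no divisions."""
    n = len(x)
    # prefix[i] holds the product of x[0..i-1]; prefix[0] is the empty product.
    prefix = [1.0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] * x[i]
    grad = [0.0] * n
    suffix = 1.0  # product of x[i+1..n-1], built up from the right
    for i in range(n - 1, -1, -1):
        grad[i] = prefix[i] * suffix  # product of all variables except x[i]
        suffix *= x[i]
    return prefix[n], grad


value, grad = product_and_gradient([2.0, 3.0, 5.0])
```

For the point (2, 3, 5) this gives the value 30 and the gradient (15, 10, 6). Avoiding divisions keeps the scheme valid when a variable is zero.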
In summary, the authors demonstrate that by combining extended‑precision arithmetic, fine‑grained multithreading of Newton’s method, and an inexpensive quadratic predictor, one can achieve both substantial speedup and significant quality‑up on multicore machines. The work provides a practical framework for high‑precision numerical algebraic geometry and suggests that “quality up” is a useful metric for evaluating parallel numerical algorithms when fixed execution time is a constraint.