A novel Krylov subspace method for approximating Fréchet derivatives of large-scale matrix functions


We present a novel Krylov subspace method for approximating $L_f(A, E)\,\mathbf{b}$, the matrix-vector product of the Fréchet derivative $L_f(A, E)$ of a large-scale matrix function $f(A)$ in direction $E$, a task that arises naturally in the sensitivity analysis of quantities involving matrix functions, such as centrality measures for networks. It also arises in the context of gradient-based methods for optimization problems that feature matrix functions, e.g., when fitting an evolution equation to an observed solution trajectory. In principle, the well-known identity
$$
f\left( \begin{bmatrix} A & E \\ 0 & A \end{bmatrix} \right) \begin{bmatrix} 0 \\ \mathbf{b} \end{bmatrix} = \begin{bmatrix} L_f(A, E)\, \mathbf{b} \\ f(A)\, \mathbf{b} \end{bmatrix},
$$
allows one to directly apply any standard Krylov subspace method, such as the Arnoldi algorithm, to address this task. However, this comes with the major disadvantage that the involved block triangular matrix has unfavorable spectral properties, which impede the convergence analysis and, to a certain extent, also the observed convergence. To avoid these difficulties, we propose a novel modification of the Arnoldi algorithm that aims at better preserving the block triangular structure. In turn, this allows one to bound the convergence of the modified method by the best polynomial approximation of the derivative $f'$ on the numerical range of $A$. Several numerical experiments illustrate our findings.
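To make the "direct" approach from the abstract concrete, the sketch below runs a plain Arnoldi approximation of $f(M)v$ for the $2n \times 2n$ block matrix $M$ with $v = [0;\, b]$ and $f = \exp$, then reads off $L_{\exp}(A,E)\,b$ from the top block. This is the standard method the paper improves upon, not the proposed structure-preserving modification; the function name and test matrices are illustrative.

```python
import numpy as np
from scipy.linalg import expm

def arnoldi_fun(M, v, m):
    """Approximate expm(M) @ v by m Arnoldi steps: beta * V_m expm(H_m) e_1."""
    n = M.shape[0]
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    beta = np.linalg.norm(v)
    V[:, 0] = v / beta
    for j in range(m):
        w = M @ V[:, j]
        for i in range(j + 1):           # modified Gram-Schmidt orthogonalization
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] < 1e-12:          # happy breakdown: Krylov space invariant
            m = j + 1
            break
        V[:, j + 1] = w / H[j + 1, j]
    return beta * V[:, :m] @ expm(H[:m, :m])[:, 0]

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n)) / np.sqrt(n)   # illustrative test matrices
E = rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)

M = np.block([[A, E], [np.zeros((n, n)), A]])  # the 2n x 2n block matrix
y = arnoldi_fun(M, np.concatenate([np.zeros(n), b]), m=60)
Lb, fb = y[:n], y[n:]   # top block ~ L_exp(A, E) b, bottom block ~ exp(A) b
```

Note that the Arnoldi process here works on the full $2n \times 2n$ matrix and ignores its block triangular structure entirely, which is exactly the source of the unfavorable convergence behavior the paper sets out to fix.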


💡 Research Summary

The paper addresses the computational challenge of evaluating the Fréchet derivative of a large-scale matrix function, specifically the product $L_f(A,E)\,b$, where $A \in \mathbb{C}^{n\times n}$, $E \in \mathbb{C}^{n\times n}$, and $b \in \mathbb{C}^n$. Such quantities appear in the sensitivity analysis of network centrality measures, in gradient-based optimization involving matrix functions, and in other applications where only the directional derivative applied to a vector is required.

A classical approach exploits the block-triangular identity
$$
f\left( \begin{bmatrix} A & E \\ 0 & A \end{bmatrix} \right) \begin{bmatrix} 0 \\ b \end{bmatrix} = \begin{bmatrix} L_f(A, E)\, b \\ f(A)\, b \end{bmatrix},
$$
so that any standard Krylov subspace method applied to the $2n \times 2n$ block matrix and the vector $[0;\, b]$ delivers $L_f(A,E)\,b$ in the top block of the result.
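For $f = \exp$, this block-triangular identity can be checked numerically on a small dense example, since SciPy ships both `expm` and a dedicated `expm_frechet` routine; this is a verification sketch with illustrative data, not the paper's Krylov method.

```python
import numpy as np
from scipy.linalg import expm, expm_frechet

rng = np.random.default_rng(1)
n = 6
A = rng.standard_normal((n, n))
E = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Left-hand side: exp of the 2n x 2n block matrix, applied to [0; b].
M = np.block([[A, E], [np.zeros((n, n)), A]])
lhs_top = (expm(M) @ np.concatenate([np.zeros(n), b]))[:n]

# Reference: SciPy's dedicated routine for the Frechet derivative of expm.
_, L = expm_frechet(A, E)
print(np.allclose(lhs_top, L @ b))  # identity holds up to rounding
```

For large sparse $A$ this dense check is infeasible, which is precisely why one turns to Krylov subspace methods that only require matrix-vector products with the block matrix.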

