
Sparse linear solvers: iterative methods and preconditioning

L. Grigori

ALPINES INRIA and LJLL, UPMC

On sabbatical at UC Berkeley

February 2016

Plan

Sparse linear solvers
- Sparse matrices and graphs
- Classes of linear solvers

Krylov subspace methods
- Conjugate gradient method

Iterative solvers that reduce communication
- CA solvers based on s-step methods
- Enlarged Krylov methods


Plan

Sparse linear solvers
- Sparse matrices and graphs
- Classes of linear solvers

Krylov subspace methods

Iterative solvers that reduce communication


Sparse matrices and graphs

- Most matrices arising from real applications are sparse.
- Example: a 1M-by-1M submatrix of the web connectivity graph, constructed from an archive at the Stanford WebBase.

Figure: Nonzero structure of the matrix


Sparse matrices and graphs

- Most matrices arising from real applications are sparse.
- GHS class: car surface mesh, n = 100196, nnz(A) = 544688.

Figure: Nonzero structure of the matrix. Figure: Its undirected graph.

Examples from Tim Davis’s Sparse Matrix Collection,

http://www.cise.ufl.edu/research/sparse/matrices/



Sparse matrices and graphs

- Semiconductor simulation matrix from Steve Hamm, Motorola, Inc.: circuit with no parasitics, n = 105676, nnz(A) = 513072.

Figure: Nonzero structure of the matrix. Figure: Its undirected graph.

Examples from Tim Davis’s Sparse Matrix Collection,

http://www.cise.ufl.edu/research/sparse/matrices/



Symmetric sparse matrices and graphs

- The structure of a square symmetric matrix A with nonzero diagonal can be represented by an undirected graph G(A) = (V, E) with
  - n vertices, one for each row/column of A,
  - an edge (i, j) for each nonzero aij, i > j.

Example: a 9 x 9 symmetric matrix A, rows/columns numbered 1 to 9 with nonzeros marked x, whose graph G(A) is the 3 x 3 grid with vertices

1 2 3
4 5 6
7 8 9

Notation: upper case (A) - matrices; lower case (aij) - elements
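The adjacency rule above maps directly to code. Below is a minimal NumPy sketch (function name illustrative) that extracts the edge set of a symmetric matrix; the matrix built here is the standard 5-point operator on a 3 x 3 grid, which reproduces the 9-vertex grid graph of the example.

```python
import numpy as np

def graph_of(A):
    """Edge set {(i, j) : aij != 0, i > j} of a symmetric matrix A."""
    n = A.shape[0]
    return {(i, j) for i in range(n) for j in range(i) if A[i, j] != 0}

# 5-point stencil on a 3 x 3 grid: the 9 x 9 matrix of the example above.
n = 3
A = np.zeros((n * n, n * n))
for i in range(n):
    for j in range(n):
        k = i * n + j
        A[k, k] = 4.0
        if i > 0:
            A[k, k - n] = A[k - n, k] = -1.0  # vertical grid edge
        if j > 0:
            A[k, k - 1] = A[k - 1, k] = -1.0  # horizontal grid edge

edges = graph_of(A)  # the 12 edges of the 3 x 3 grid graph
```

Only the nonzero pattern matters here, not the values; in practice one would read the pattern from a CSR/CSC structure rather than a dense array.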


Sparse linear solvers

Direct methods of factorization
- For solving Ax = b and least squares problems
- Cholesky, LU, QR, LDL^T factorizations
- Limited by fill-in/memory consumption and scalability

Iterative solvers
- For solving Ax = b, least squares, Ax = λx, SVD
- Usable when only multiplying A by a vector is possible
- Limited by accuracy/convergence

Hybrid methods
- Such as domain decomposition methods


Plan

Sparse linear solvers

Krylov subspace methods
- Conjugate gradient method

Iterative solvers that reduce communication


Krylov subspace methods

Solve Ax = b by finding a sequence x1, x2, ..., xk that minimizes some measure of error over the corresponding spaces

x0 + Ki(A, r0), i = 1, ..., k.

They are defined by two conditions:

1. Subspace condition: xk ∈ x0 + Kk(A, r0)
2. Petrov-Galerkin condition: rk ⊥ Lk, i.e. rk^T y = 0 for all y ∈ Lk

where
- x0 is the initial iterate and r0 = b − Ax0 is the initial residual,
- Kk(A, r0) = span{r0, Ar0, A^2 r0, ..., A^(k−1) r0} is the Krylov subspace of dimension k,
- Lk is a well-defined subspace of dimension k.
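The monomial Krylov basis above is generated by k − 1 repeated matrix-vector products. A minimal NumPy sketch (function name illustrative):

```python
import numpy as np

def krylov_basis(A, r0, k):
    """Columns span Kk(A, r0) = span{r0, A r0, ..., A^(k-1) r0}."""
    V = np.zeros((len(r0), k))
    V[:, 0] = r0
    for j in range(1, k):
        V[:, j] = A @ V[:, j - 1]  # one SpMV per new basis vector
    return V

# For a matrix with distinct eigenvalues and a generic r0, the basis has
# full rank k (a Vandermonde-type argument).
A = np.diag([1.0, 2.0, 3.0, 4.0])
V = krylov_basis(A, np.ones(4), 4)
```

In floating point this monomial basis becomes ill-conditioned quickly as k grows, which is why practical methods orthogonalize as they go.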


One of Top Ten Algorithms of the 20th Century

From SIAM News, Volume 33, Number 4: Magnus Hestenes, Eduard Stiefel, and Cornelius Lanczos, all from the Institute for Numerical Analysis at the National Bureau of Standards, initiate the development of Krylov subspace iteration methods.

� Russian mathematician Alexei Krylov writes first paper, 1931.

� Lanczos - introduced an algorithm to generate an orthogonal basis for such a subspace when the matrix is symmetric.

� Hestenes and Stiefel - introduced CG for SPD matrices.

Other Top Ten Algorithms: Monte Carlo method, decompositional approach to matrix computations (Householder), Quicksort, Fast multipole, FFT.


Choosing a Krylov method

Source slide: J. Demmel

Conjugate gradient (Hestenes, Stiefel, 52)

- A Krylov projection method for SPD matrices where Lk = Kk(A, r0).
- Finds x* = A^(−1)b by minimizing the quadratic function

  φ(x) = (1/2) x^T A x − b^T x,   ∇φ(x) = Ax − b = 0.

- After j iterations of CG,

  ||x* − xj||A ≤ 2 ||x* − x0||A ((√κ(A) − 1)/(√κ(A) + 1))^j,

  where x0 is the starting vector, ||x||A = √(x^T A x), and κ(A) = |λmax(A)|/|λmin(A)|.


Conjugate gradient

- Computes A-orthogonal search directions by conjugation of the residuals:

  p1 = r0 = −∇φ(x0),   pk = rk−1 + βk pk−1.   (1)

- At the k-th iteration,

  xk = xk−1 + αk pk = argmin over x ∈ x0 + Kk(A, r0) of φ(x),

  where αk is the step along pk.

- The CG algorithm is obtained by imposing the orthogonality and conjugacy conditions

  rk^T ri = 0 for all i ≠ k,   pk^T A pi = 0 for all i ≠ k.


CG algorithm

Algorithm 1 The CG Algorithm

1: r0 = b − Ax0, ρ0 = ||r0||2^2, p1 = r0, k = 1
2: while √ρk−1 > ε||b||2 and k < kmax do
3:   if k ≠ 1 then
4:     βk = (rk−1, rk−1)/(rk−2, rk−2)
5:     pk = rk−1 + βk pk−1
6:   end if
7:   αk = (rk−1, rk−1)/(Apk, pk)
8:   xk = xk−1 + αk pk
9:   rk = rk−1 − αk Apk
10:  ρk = ||rk||2^2
11:  k = k + 1
12: end while
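A direct NumPy transcription of Algorithm 1 might look as follows (a minimal sketch; variable names and defaults are illustrative):

```python
import numpy as np

def cg(A, b, x0, eps=1e-10, kmax=1000):
    """Conjugate gradient for SPD A, following Algorithm 1."""
    x = x0.copy()
    r = b - A @ x
    rho = r @ r                 # rho_{k-1} = (r_{k-1}, r_{k-1})
    p = r.copy()                # p_1 = r_0
    k = 1
    bnorm = np.linalg.norm(b)
    while np.sqrt(rho) > eps * bnorm and k < kmax:
        if k != 1:
            p = r + (rho / rho_prev) * p   # beta_k = rho_{k-1} / rho_{k-2}
        Ap = A @ p
        alpha = rho / (Ap @ p)             # alpha_k
        x = x + alpha * p
        r = r - alpha * Ap
        rho_prev, rho = rho, r @ r
        k += 1
    return x, k

# Small SPD test problem.
rng = np.random.default_rng(1)
M = rng.standard_normal((30, 30))
A = M @ M.T + 30 * np.eye(30)
b = rng.standard_normal(30)
x, its = cg(A, b, np.zeros(30))
```

Note that each pass through the loop performs exactly one SpMV (A @ p) and a handful of dot products, matching the communication pattern discussed later.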


Challenge in getting efficient and scalable solvers

- A Krylov solver finds xk+1 from x0 + Kk+1(A, r0), where

  Kk+1(A, r0) = span{r0, Ar0, A^2 r0, ..., A^k r0},

  such that the Petrov-Galerkin condition b − Axk+1 ⊥ Lk+1 is satisfied.
- Does a sequence of k SpMVs to get the vectors [x1, ..., xk].
- Finds the best solution xk+1 as a linear combination of [x1, ..., xk].

Typically, each iteration requires:

- a sparse matrix-vector product → point-to-point communication,
- dot products for orthogonalization → global communication.



Ways to improve performance

- Improve the performance of the sparse matrix-vector product.
- Improve the performance of collective communication.
- Change numerics: reformulate or introduce Krylov subspace algorithms to
  - reduce communication,
  - increase arithmetic intensity, e.g. compute a sparse matrix times a set of vectors.
- Use preconditioners to decrease the number of iterations until convergence.
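The arithmetic-intensity point can be seen in a pure-Python CSR sketch (helper name and toy data illustrative): multiplying by a block of vectors sweeps the matrix entries once for all columns, whereas repeating one SpMV per vector streams the matrix from memory once per column.

```python
import numpy as np

def csr_spmm(indptr, indices, data, V):
    """CSR matrix times a block of vectors V: one sweep over the matrix
    entries serves all columns of V at once."""
    n = len(indptr) - 1
    Y = np.zeros((n, V.shape[1]))
    for i in range(n):
        for p in range(indptr[i], indptr[i + 1]):
            Y[i] += data[p] * V[indices[p]]  # reuse a_ip across all columns
    return Y

# CSR storage of the 2 x 2 matrix [[2, 0], [1, 3]].
indptr = np.array([0, 1, 3])
indices = np.array([0, 0, 1])
data = np.array([2.0, 1.0, 3.0])
V = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = csr_spmm(indptr, indices, data, V)
```

Each nonzero a_ip is loaded once and used against every column of V, so the flop-to-byte ratio grows with the block width.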


Plan

Sparse linear solvers

Krylov subspace methods

Iterative solvers that reduce communication
- CA solvers based on s-step methods
- Enlarged Krylov methods


Iterative solvers that reduce communication

Communication avoiding solvers based on s-step methods

- Unroll k iterations, orthogonalize every k steps.
- A factor of O(k) fewer messages and less bandwidth in sequential.
- A factor of O(k) fewer messages in parallel (same bandwidth).

Enlarged Krylov methods

- Decrease the number of iterations, and hence the number of global communications.
- Increase arithmetic intensity.

Other approaches are available in the literature, but are not presented here.


CA solvers based on s-step methods: main idea

To avoid communication: unroll k steps, ghost the necessary data, then

- generate a set of vectors W for the Krylov subspace Kk(A, r0),
- (A-)orthogonalize the vectors using a communication avoiding orthogonalization algorithm (e.g. TSQR(W)).

References

- Van Rosendale '83, Walker '85, Chronopoulos and Gear '89, Erhel '93, Toledo '95, Bai, Hu, Reichel '91 (Newton basis), Joubert and Carey '92 (Chebyshev basis), etc.
- Recent references: G. Atenekeng, B. Philippe, E. Kamgnia (to enable a multiplicative Schwarz preconditioner); J. Demmel, M. Hoemmen, M. Mohiyuddin, K. Yelick (to minimize communication, next slides); Carson, Demmel, Knight (CA and other Krylov solvers, preconditioners).
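The unroll-then-orthogonalize idea can be sketched in NumPy (names illustrative). A real CA solver would use a better-conditioned Newton or Chebyshev basis and TSQR; here a monomial basis and plain Householder QR stand in for them.

```python
import numpy as np

def s_step_basis(A, r0, s):
    """W = [r0, A r0, ..., A^s r0]: s SpMVs back to back, with no
    global reductions (dot products) in between."""
    W = np.zeros((len(r0), s + 1))
    W[:, 0] = r0
    for j in range(s):
        W[:, j + 1] = A @ W[:, j]
    return W

A = np.diag(np.arange(1.0, 11.0))
W = s_step_basis(A, np.ones(10), 4)

# Orthogonalize the whole block in one shot (TSQR in a CA code;
# plain QR stands in for it in this sketch).
Q, R = np.linalg.qr(W)
```

The communication saving comes from the structure: the s SpMVs need only neighbor data (ghosted up front), and all the inner products are fused into the single block factorization.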


CA-GMRES

GMRES: find x in span{b, Ab, ..., A^k b} minimizing ||Ax − b||2.

Cost of k steps of standard GMRES vs new GMRES:

Source of following 11 slides: J. Demmel


CA-GMRES

GMRES: find x in span{b,Ab,