Compact schemes are often preferred in performing scientific computing for their superior spectral resolution. Error-free parallelization of a compact scheme is a challenging task due to the requirement of additional closures at the inter-processor ...
Krylov methods are a key way of solving large sparse linear systems of equations but suffer from poor strong scalability on distributed memory machines. This is due to high synchronization costs from large numbers of collective communication calls ...
This work studies three multigrid variants for matrix-free finite-element computations on locally refined meshes: geometric local smoothing, geometric global coarsening (both h-multigrid), and polynomial global coarsening (a variant of p-multigrid). We ...
In this article, we present a CUDA library with a C API for solving block cyclic tridiagonal and banded systems on one GPU. The library can process block tridiagonal systems with block sizes from 1 × 1 (scalar) to 4 × 4 and banded systems with up to four ...