Direct Tall-and-Skinny QR algorithm

As presented in:

A. Benson, D. Gleich, and J. Demmel. Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. IEEE International Conference on Big Data, 2013. https://arxiv.org/abs/1301.1071

This algorithm is used to compute both the QR decomposition and the Singular Value Decomposition. It requires that the input array have a single column of blocks, each of which fits in memory.
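A minimal sketch of the QR path, assuming `dask` and `numpy` are installed and using the public `dask.array.linalg.tsqr` entry point:

```python
import numpy as np
import dask.array as da

# A tall-and-skinny array: a single column of blocks, each fitting in memory.
rng = np.random.default_rng(0)
x = da.from_array(rng.standard_normal((1000, 10)), chunks=(250, 10))

# QR decomposition via the tall-and-skinny algorithm.
q, r = da.linalg.tsqr(x)

q, r = q.compute(), r.compute()
assert np.allclose(q @ r, x.compute())   # Q @ R reconstructs the input
assert np.allclose(q.T @ q, np.eye(10))  # Q has orthonormal columns
```

Note that the chunking here is `(250, 10)`: four blocks stacked in a single column, as the algorithm requires.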

Parameters
data: Array

The input array; must consist of a single column of blocks, each of which fits in memory.

compute_svd: bool

Whether to compute the SVD rather than the QR decomposition

_max_vchunk_size: Integer

Used internally in recursion to set the maximum row dimension of chunks in subsequent recursive calls.
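As an illustration of the `compute_svd` parameter, a sketch assuming `dask` and `numpy` are installed; with `compute_svd=True` the same reduction returns the factors of a thin SVD instead of Q and R:

```python
import numpy as np
import dask.array as da

rng = np.random.default_rng(1)
x = da.from_array(rng.standard_normal((1000, 10)), chunks=(250, 10))

# Thin SVD via the tall-and-skinny algorithm.
u, s, v = da.linalg.tsqr(x, compute_svd=True)

u, s, v = u.compute(), s.compute(), v.compute()
assert u.shape == (1000, 10) and s.shape == (10,) and v.shape == (10, 10)
assert np.allclose(u @ np.diag(s) @ v, x.compute())  # reconstructs the input
```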

See Also

`dask.array.linalg.qr`

Powered by this algorithm

`dask.array.linalg.svd`

Powered by this algorithm

`dask.array.linalg.sfqr`

Variant for short-and-fat arrays

Notes

With `k` blocks of size `(m, n)`, this algorithm has memory use that scales as `k * n * n`.
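A quick back-of-the-envelope instance of that scaling, assuming `float64` entries (8 bytes each); the `k` and `n` values here are hypothetical:

```python
# Each of the k blocks contributes an (n, n) R factor, so the stacked
# intermediate matrix holds k * n * n entries.
k, n = 1000, 50                     # hypothetical block count and column count
stacked_bytes = k * n * n * 8       # float64 entries
print(f"{stacked_bytes / 1e6:.0f} MB")
```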

The implementation here is the recursive variant, motivated by the final “single core” QR decomposition that the algorithm ultimately requires. In the non-recursive version, given `k` blocks, after `k` QR decompositions of `(m, n)` blocks there is a “single core” QR decomposition that must operate on a `(k * n, n)` matrix.

Here, recursion is applied as necessary to ensure that `k * n` is not larger than `m` (if `m / n >= 2`). In particular, this is done to ensure that single core computations do not have to work on blocks larger than `(m, n)`.
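The recursion condition can be sketched as a small helper; this is illustrative only (the function name is hypothetical, not dask's internals):

```python
def needs_recursion(k: int, m: int, n: int) -> bool:
    # If stacking the k (n, n) R factors yields a matrix taller than a
    # single (m, n) input block, treat the stacked matrix as a new
    # tall-and-skinny problem rather than factoring it on a single core.
    return m / n >= 2 and k * n > m

# 16 blocks of shape (100, 10): the stacked R factors form a (160, 10)
# matrix, taller than one block, so another TSQR level is applied.
assert needs_recursion(16, 100, 10)
# 8 blocks stack to (80, 10), no taller than one block: no recursion.
assert not needs_recursion(8, 100, 10)
```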

Where blocks are irregular, the above logic is applied with the “height” of the “tallest” block used in place of `m`.

Consider using the `rechunk` method to control this behavior. Taller blocks will reduce overall memory use, provided that many of them still fit in memory at once.
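One way to apply `rechunk` for this, sketched assuming `dask` and `numpy` are installed:

```python
import numpy as np
import dask.array as da

# 100 short blocks of shape (100, 10) in a single column.
x = da.from_array(np.ones((10000, 10)), chunks=(100, 10))

# Taller blocks mean fewer chunks in the column: k drops from 100 to 4,
# shrinking the k * n * n intermediate (each block must still fit in RAM).
x_tall = x.rechunk((2500, 10))
print(x_tall.numblocks)  # (4, 1)
```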