dask.array.linalg.tsqr

dask.array.linalg.tsqr(data, compute_svd=False, _max_vchunk_size=None)

Direct Tall-and-Skinny QR algorithm

As presented in:

A. Benson, D. Gleich, and J. Demmel. Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures. IEEE International Conference on Big Data, 2013. https://arxiv.org/abs/1301.1071

This algorithm is used to compute both the QR decomposition and the Singular Value Decomposition. It requires that the input array have a single column of blocks, each of which fits in memory.

Parameters
data: Array
compute_svd: bool

Whether to compute the SVD rather than the QR decomposition

_max_vchunk_size: Integer

Used internally in recursion to set the maximum row dimension of chunks in subsequent recursive calls.
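
For illustration, a minimal usage sketch (the array shape and chunk sizes below are invented for the example, not taken from this docstring):

>>> import dask.array as da
>>> x = da.random.random((10000, 10), chunks=(2000, 10))  # one column of blocks
>>> q, r = da.linalg.tsqr(x)                               # QR decomposition
>>> u, s, v = da.linalg.tsqr(x, compute_svd=True)          # SVD via the same algorithm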

See also

dask.array.linalg.qr

Powered by this algorithm

dask.array.linalg.svd

Powered by this algorithm

dask.array.linalg.sfqr

Variant for short-and-fat arrays

Notes

With k blocks of size (m, n), this algorithm has memory use that scales as k * n * n.
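
As a rough back-of-envelope check (values invented for illustration, assuming float64 data):

>>> k, n, itemsize = 50, 100, 8   # 50 blocks, 100 columns, 8 bytes per element
>>> k * n * n * itemsize          # bytes held by the stacked (k * n, n) R factors
4000000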

The implementation here is the recursive variant, motivated by the ultimate need for one “single core” QR decomposition. In the non-recursive version of the algorithm, given k blocks, after k QR decompositions of the (m, n) blocks, there is one “single core” QR decomposition that must work on the stacked (k * n, n) matrix of R factors.

Here, recursion is applied as necessary to ensure that k * n is not larger than m (if m / n >= 2). In particular, this is done to ensure that single core computations do not have to work on blocks larger than (m, n).
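
For example (numbers invented to illustrate the condition described above, not the internal logic):

>>> m, n, k = 2000, 100, 50
>>> m / n >= 2 and k * n > m   # stacked R factors (k * n rows) would exceed one block
True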

Where blocks are irregular, the above logic is applied with the “height” of the “tallest” block used in place of m.

Consider using the rechunk method to control this behavior. Taller blocks will reduce overall memory use (assuming that many of them still fit in memory at once).
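
A sketch of that suggestion (shapes and chunk sizes invented for illustration):

>>> import dask.array as da
>>> x = da.random.random((100000, 20), chunks=(1000, 20))  # 100 short blocks
>>> x_tall = x.rechunk((10000, 20))                        # 10 taller blocks
>>> q, r = da.linalg.tsqr(x_tall)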