dask.array.core.blockwise
dask.array.core.blockwise¶
- dask.array.core.blockwise(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)[source]¶
Tensor operation: Generalized inner and outer products
A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The
blockwise
function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.- Parameters
- funccallable
Function to apply to individual tuples of blocks
- out_inditerable
Block pattern of the output, something like ‘ijk’ or (1, 2, 3)
- *argssequence of Array, index pairs
Sequence like (x, ‘ij’, y, ‘jk’, z, ‘i’)
- **kwargsdict
Extra keyword arguments to pass to function
- dtypenp.dtype
Datatype of resulting array.
- concatenatebool, keyword only
If true concatenate arrays along dummy indices, else provide lists
- adjust_chunksdict
Dictionary mapping index to function to be applied to chunk sizes
- new_axesdict, keyword only
New indexes and their dimension lengths
- align_arrays: bool
Whether or not to align chunks along equally sized dimensions when multiple arrays are provided. This allows for larger chunks in some arrays to be broken into smaller ones that match chunk sizes in other arrays such that they are compatible for block function mapping. If this is false, then an error will be thrown if arrays do not already have the same number of blocks in each dimension.
Examples
2D embarrassingly parallel operation from two arrays, x, and y.
>>> import operator, numpy as np, dask.array as da >>> x = da.from_array([[1, 2], ... [3, 4]], chunks=(1, 2)) >>> y = da.from_array([[10, 20], ... [0, 0]]) >>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8') >>> z.compute() array([[11, 22], [ 3, 4]])
Outer product multiplying a by b, two 1-d vectors
>>> a = da.from_array([0, 1, 2], chunks=1) >>> b = da.from_array([10, 50, 100], chunks=1) >>> z = blockwise(np.outer, 'ij', a, 'i', b, 'j', dtype='f8') >>> z.compute() array([[ 0, 0, 0], [ 10, 50, 100], [ 20, 100, 200]])
z = x.T
>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype) >>> z.compute() array([[1, 3], [2, 4]])
The transpose case above is illustrative because it does transposition both on each in-memory block by calling
np.transpose
and on the order of the blocks themselves, by switching the order of the indexij -> ji
.We can compose these same patterns with more variables and more complex in-memory functions
z = X + Y.T
>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8') >>> z.compute() array([[11, 2], [23, 4]])
Any index, like
i
missing from the output index is interpreted as a contraction (note that this differs from Einstein convention; repeated indices do not imply contraction.) In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead passconcatenate=True
.Inner product multiplying a by b, two 1-d vectors
>>> def sequence_dot(a_blocks, b_blocks): ... result = 0 ... for a, b in zip(a_blocks, b_blocks): ... result += a.dot(b) ... return result
>>> z = blockwise(sequence_dot, '', a, 'i', b, 'i', dtype='f8') >>> z.compute() 250
Add new single-chunk dimensions with the
new_axes=
keyword, including the length of the new dimension. New dimensions will always be in a single chunk.>>> def f(a): ... return a[:, None] * np.ones((1, 5))
>>> z = blockwise(f, 'az', a, 'a', new_axes={'z': 5}, dtype=a.dtype)
New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see
da.map_blocks
).>>> z = blockwise(f, 'az', a, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype) >>> z.chunks ((1, 1, 1), (5, 5))
If the applied function changes the size of each chunk you can specify this with a
adjust_chunks={...}
dictionary holding a function for each index that modifies the dimension size in that index.>>> def double(x): ... return np.concatenate([x, x])
>>> y = blockwise(double, 'ij', x, 'ij', ... adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype) >>> y.chunks ((2, 2), (2,))
Include literals by indexing with None
>>> z = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype) >>> z.compute() array([[1235, 1236], [1237, 1238]])