dask.array.blockwise

dask.array.blockwise#

dask.array.blockwise(func, out_ind, *args, name=None, token=None, dtype=None, adjust_chunks=None, new_axes=None, align_arrays=True, concatenate=None, meta=None, **kwargs)[source]#

Tensor operation: Generalized inner and outer products

A broad class of blocked algorithms and patterns can be specified with a concise multi-index notation. The blockwise function applies an in-memory function across multiple blocks of multiple inputs in a variety of ways. Many dask.array operations are special cases of blockwise including elementwise, broadcasting, reductions, tensordot, and transpose.

Parameters:

funccallable: Function to apply to individual tuples of blocks
out_inditerable: Block pattern of the output, something like ‘ijk’ or (1, 2, 3)
*argssequence of Array, index pairs: You may also pass literal arguments, accompanied by None index e.g. (x, ‘ij’, y, ‘jk’, z, ‘i’, some_literal, None)
**kwargsdict: Extra keyword arguments to pass to function
dtypenp.dtype: Datatype of resulting array.
concatenatebool, keyword only: If true concatenate arrays along dummy indices, else provide lists
adjust_chunksdict: Dictionary mapping index to function to be applied to chunk sizes
new_axesdict, keyword only: New indexes and their dimension lengths
align_arrays: bool: Whether or not to align chunks along equally sized dimensions when multiple arrays are provided. This allows for larger chunks in some arrays to be broken into smaller ones that match chunk sizes in other arrays such that they are compatible for block function mapping. If this is false, then an error will be thrown if arrays do not already have the same number of blocks in each dimension.

Examples

2D embarrassingly parallel operation from two arrays, x, and y.

>>> import operator, numpy as np, dask.array as da
>>> x = da.from_array([[1, 2],
...                    [3, 4]], chunks=(1, 2))
>>> y = da.from_array([[10, 20],
...                    [0, 0]])
>>> z = blockwise(operator.add, 'ij', x, 'ij', y, 'ij', dtype='f8')
>>> z.compute()
array([[11, 22],
       [ 3,  4]])

Outer product multiplying a by b, two 1-d vectors

>>> a = da.from_array([0, 1, 2], chunks=1)
>>> b = da.from_array([10, 50, 100], chunks=1)
>>> z = blockwise(np.outer, 'ij', a, 'i', b, 'j', dtype='f8')
>>> z.compute()
array([[  0,   0,   0],
       [ 10,  50, 100],
       [ 20, 100, 200]])

z = x.T

>>> z = blockwise(np.transpose, 'ji', x, 'ij', dtype=x.dtype)
>>> z.compute()
array([[1, 3],
       [2, 4]])

The transpose case above is illustrative because it does transposition both on each in-memory block by calling np.transpose and on the order of the blocks themselves, by switching the order of the index ij -> ji.

We can compose these same patterns with more variables and more complex in-memory functions

z = X + Y.T

>>> z = blockwise(lambda x, y: x + y.T, 'ij', x, 'ij', y, 'ji', dtype='f8')
>>> z.compute()
array([[11,  2],
       [23,  4]])

Any index, like i missing from the output index is interpreted as a contraction (note that this differs from Einstein convention; repeated indices do not imply contraction.) In the case of a contraction the passed function should expect an iterable of blocks on any array that holds that index. To receive arrays concatenated along contracted dimensions instead pass concatenate=True.

Inner product multiplying a by b, two 1-d vectors

>>> def sequence_dot(a_blocks, b_blocks):
...     result = 0
...     for a, b in zip(a_blocks, b_blocks):
...         result += a.dot(b)
...     return result

>>> z = blockwise(sequence_dot, '', a, 'i', b, 'i', dtype='f8')
>>> z.compute()
np.int64(250)

Add new single-chunk dimensions with the new_axes= keyword, including the length of the new dimension. New dimensions will always be in a single chunk.

>>> def f(a):
...     return a[:, None] * np.ones((1, 5))

>>> z = blockwise(f, 'az', a, 'a', new_axes={'z': 5}, dtype=a.dtype)

New dimensions can also be multi-chunk by specifying a tuple of chunk sizes. This has limited utility as is (because the chunks are all the same), but the resulting graph can be modified to achieve more useful results (see da.map_blocks).

>>> z = blockwise(f, 'az', a, 'a', new_axes={'z': (5, 5)}, dtype=x.dtype)
>>> z.chunks
((1, 1, 1), (5, 5))

If the applied function changes the size of each chunk you can specify this with a adjust_chunks={...} dictionary holding a function for each index that modifies the dimension size in that index.

>>> def double(x):
...     return np.concatenate([x, x])

>>> y = blockwise(double, 'ij', x, 'ij',
...               adjust_chunks={'i': lambda n: 2 * n}, dtype=x.dtype)
>>> y.chunks
((2, 2), (2,))

Include literals by indexing with None

>>> z = blockwise(operator.add, 'ij', x, 'ij', 1234, None, dtype=x.dtype)
>>> z.compute()
array([[1235, 1236],
       [1237, 1238]])

dask.array.blockwise

Contents

dask.array.blockwise#