dask.dataframe.compute

dask.dataframe.compute(*args, traverse=True, optimize_graph=True, scheduler=None, get=None, **kwargs)

Compute several dask collections at once.

Parameters
args : object

Any number of objects. If an argument is a dask object, it is computed and its result is returned. By default, Python built-in collections are also traversed to look for dask objects (for more information see the traverse keyword). Non-dask arguments are passed through unchanged.

traverse : bool, optional

By default dask traverses builtin python collections looking for dask objects passed to compute. For large collections this can be expensive. If none of the arguments contain any dask objects, set traverse=False to avoid doing this traversal.

scheduler : string, optional

Which scheduler to use, such as “threads”, “synchronous” or “processes”. If not provided, the global settings are checked first, and then the collection defaults are used as a fallback.

optimize_graph : bool, optional

If True (the default), the optimizations for each collection are applied before computation. Otherwise the graph is run as is, which can be useful for debugging.

get : None

Should be left as None. The get= keyword has been removed.

kwargs

Extra keywords to forward to the scheduler function.
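
The scheduler and kwargs parameters can be combined, as in the minimal sketch below. It assumes that num_workers is a keyword accepted by the threaded scheduler; it is used here only to illustrate forwarding and is not listed in the parameters above.

```python
import dask
import dask.array as da

a = da.arange(10, chunks=2).sum()
b = da.arange(10, chunks=2).mean()

# Single-threaded scheduler: everything runs in the calling thread,
# which makes stepping through with a debugger easier.
sync_result = dask.compute(a, b, scheduler="synchronous")

# Threaded scheduler. num_workers is not a parameter of compute itself;
# it is forwarded to the scheduler through **kwargs (assumed keyword).
threaded_result = dask.compute(a, b, scheduler="threads", num_workers=2)

# Both schedulers should produce the same values.
print(sync_result == threaded_result)
```

The choice of scheduler affects how the work is executed, not what is returned, so switching to “synchronous” is a low-risk way to investigate a failing computation.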

Examples

>>> import dask as d
>>> import dask.array as da
>>> a = da.arange(10, chunks=2).sum()
>>> b = da.arange(10, chunks=2).mean()
>>> d.compute(a, b)
(45, 4.5)
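
As a hedged illustration of the optimize_graph keyword, the sketch below computes the same sum with and without graph optimization. The variable names are chosen for this example only; the values are expected to match, since optimization should change how the graph runs rather than its result.

```python
import dask
import dask.array as da

a = da.arange(10, chunks=2).sum()

# Default behaviour: collection-specific optimizations are applied first.
(optimized,) = dask.compute(a)

# Run the graph exactly as built, which can help when debugging
# a problem that might be introduced by an optimization pass.
(unoptimized,) = dask.compute(a, optimize_graph=False)

print(optimized == unoptimized)
```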

By default, dask objects inside python collections will also be computed:

>>> d.compute({'a': a, 'b': b, 'c': 1})
({'a': 45, 'b': 4.5, 'c': 1},)
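
For contrast, the following sketch shows what traverse=False does to the same kind of input. The dict itself is not a dask object, so with traversal disabled it is passed through unchanged and the dask array inside it stays uncomputed. The names traversed and untouched are illustrative only.

```python
import dask
import dask.array as da

a = da.arange(10, chunks=2).sum()

# Default: the dict is traversed and the dask array inside is computed.
(traversed,) = dask.compute({"a": a, "c": 1})

# traverse=False: the dict is treated as a plain non-dask argument,
# so it is returned as is and "a" is still a lazy dask object.
(untouched,) = dask.compute({"a": a, "c": 1}, traverse=False)

print(dask.is_dask_collection(traversed["a"]))
print(dask.is_dask_collection(untouched["a"]))
```

This is why traverse=False is only appropriate when dask objects are passed as top-level arguments, or when no argument contains dask objects at all.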