dask_expr.from_delayed

dask_expr.from_delayed(dfs: Delayed | distributed.Future | Iterable[Delayed | distributed.Future], meta=None, divisions: tuple | None = None, prefix: str | None = None, verify_meta: bool = True)

Create a Dask DataFrame from many Dask Delayed objects.

Warning

from_delayed should only be used if the objects that create the data are complex and cannot be easily represented as a single function in an embarrassingly parallel fashion.

from_map is recommended if the query can be expressed as a single function like:

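import pandas as pd
import dask.dataframe as dd

def read_xml(path):
    return pd.read_xml(path)

# paths is an iterable of XML file paths, one per partition
ddf = dd.from_map(read_xml, paths)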

from_delayed might be deprecated in the future.

Parameters
dfs

A dask.delayed.Delayed, a distributed.Future, or an iterable of either of these objects, e.g. returned by client.submit. These comprise the individual partitions of the resulting dataframe. If a single object is provided (not an iterable), then the resulting dataframe will have only one partition.

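meta

An empty pd.DataFrame or pd.Series that matches the dtypes and column names of the output partitions. Defaults to None.
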
divisions

Partition boundaries along the index. For the tuple format, see https://docs.dask.org/en/latest/dataframe-design.html#partitions. If None, index information is not used.

prefix

Prefix to prepend to the keys.

verify_meta

If True, check that the partitions have consistent metadata. Defaults to True.