dask.dataframe.from_delayed

Create a Dask DataFrame from many Dask Delayed objects.

Warning

from_delayed should only be used if the objects that create the data are complex and cannot be easily represented as a single function in an embarrassingly parallel fashion.

from_map is recommended if the query can be expressed as a single function like:

import pandas as pd
import dask.dataframe as dd

def read_xml(path):
    return pd.read_xml(path)

ddf = dd.from_map(read_xml, paths)

from_delayed might be deprecated in the future.

Parameters:

dfs: A dask.delayed.Delayed, a distributed.Future, or an iterable of either of these objects, e.g. returned by client.submit. These comprise the individual partitions of the resulting dataframe. If a single object is provided (not an iterable), then the resulting dataframe will have only one partition.
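meta: An empty pd.DataFrame or pd.Series that matches the dtypes and column names of the output. If not provided, Dask will try to infer the metadata, which may lead to unexpected results, so providing meta is recommended.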
divisions: Partition boundaries along the index. If None, index information is not used. For the tuple form, see https://docs.dask.org/en/latest/dataframe-design.html#partitions
prefix: Prefix to prepend to the keys.
verify_meta: If True, check that the partitions have consistent metadata. Defaults to True.
