dask_expr.from_delayed
- dask_expr.from_delayed(dfs: Delayed | distributed.Future | Iterable[Delayed | distributed.Future], meta=None, divisions: tuple | None = None, prefix: str | None = None, verify_meta: bool = True)
Create Dask DataFrame from many Dask Delayed objects
Warning
from_delayed should only be used if the objects that create the data are complex and cannot be easily represented as a single function in an embarrassingly parallel fashion. from_map is recommended if the query can be expressed as a single function like:

    import pandas as pd
    import dask.dataframe as dd

    def read_xml(path):
        return pd.read_xml(path)

    ddf = dd.from_map(read_xml, paths)

from_delayed might be deprecated in the future.
- Parameters
- dfs
A dask.delayed.Delayed, a distributed.Future, or an iterable of either of these objects, e.g. returned by client.submit. These comprise the individual partitions of the resulting dataframe. If a single object is provided (not an iterable), then the resulting dataframe will have only one partition.
- $META
- divisions
Partition boundaries along the index. If a tuple, see https://docs.dask.org/en/latest/dataframe-design.html#partitions for the expected format. If None, index information is not used.
- prefix
Prefix to prepend to the keys.
- verify_meta
If True, check that the partitions have consistent metadata. Defaults to True.