dask.dataframe.from_delayed
- dask.dataframe.from_delayed(dfs: Delayed | distributed.Future | Iterable[Delayed | distributed.Future], meta=None, divisions: tuple | Literal['sorted'] | None = None, prefix: str = 'from-delayed', verify_meta: bool = True) -> DataFrame | Series
Create Dask DataFrame from many Dask Delayed objects.
Warning
from_delayed should only be used if the objects that create the data are complex and cannot be easily represented as a single function in an embarrassingly parallel fashion.
from_map is recommended if the query can be expressed as a single function like:
    def read_xml(path):
        return pd.read_xml(path)

    ddf = dd.from_map(read_xml, paths)
from_delayed might be deprecated in the future.
- Parameters
- dfs
A dask.delayed.Delayed, a distributed.Future, or an iterable of either of these objects, e.g. returned by client.submit. These comprise the individual partitions of the resulting dataframe. If a single object is provided (not an iterable), then the resulting dataframe will have only one partition.
- meta : pd.DataFrame, pd.Series, dict, iterable, tuple, optional
An empty pd.DataFrame or pd.Series that matches the dtypes and column names of the output. This metadata is necessary for many algorithms in dask dataframe to work. For ease of use, some alternative inputs are also available. Instead of a DataFrame, a dict of {name: dtype} or an iterable of (name, dtype) can be provided (note that the order of the names should match the order of the columns). Instead of a series, a tuple of (name, dtype) can be used. If not provided, dask will try to infer the metadata. This may lead to unexpected results, so providing meta is recommended. For more information, see dask.dataframe.utils.make_meta.
- divisions
Partition boundaries along the index. For a tuple, see https://docs.dask.org/en/latest/dataframe-design.html#partitions. If the string 'sorted' is given, the delayed values are computed to find the index values; this assumes that the indexes are mutually sorted. If None, index information is not used.
- prefix
Prefix to prepend to the keys.
- verify_meta
If True, check that the partitions have consistent metadata. Defaults to True.