dask_expr._groupby.SeriesGroupBy.apply
- SeriesGroupBy.apply(func, *args, meta=_NoDefault.no_default, shuffle_method=None, **kwargs)
Parallel version of pandas GroupBy.apply
This mimics the pandas version except for the following:
- If the grouper does not align with the index then this causes a full shuffle. The order of rows within each group may not be preserved.
- Dask’s GroupBy.apply is not appropriate for aggregations. For custom aggregations, use dask.dataframe.groupby.Aggregation.
Warning
Pandas’ groupby-apply can be used to apply arbitrary functions, including aggregations that result in one row per group. Dask’s groupby-apply will apply func once on each group, doing a shuffle if needed, such that each group is contained in one partition. When func is a reduction, you’ll end up with one row per group. To apply a custom aggregation with Dask, use dask.dataframe.groupby.Aggregation.
- Parameters
- func: function
Function to apply
- args, kwargs: Scalar, Delayed or object
Arguments and keywords to pass to the function.
- meta: pd.DataFrame, pd.Series, dict, iterable, tuple, optional
An empty pd.DataFrame or pd.Series that matches the dtypes and column names of the output. This metadata is necessary for many algorithms in dask dataframe to work. For ease of use, some alternative inputs are also available. Instead of a DataFrame, a dict of {name: dtype} or iterable of (name, dtype) can be provided (note that the order of the names should match the order of the columns). Instead of a series, a tuple of (name, dtype) can be used. If not provided, dask will try to infer the metadata. This may lead to unexpected results, so providing meta is recommended. For more information, see dask.dataframe.utils.make_meta.
- Returns
- applied: Series or DataFrame depending on columns keyword