dask.dataframe.groupby.DataFrameGroupBy.mean
- DataFrameGroupBy.mean(split_every=None, split_out=1, shuffle_method=None, numeric_only=_NoDefault.no_default)
Compute mean of groups, excluding missing values.
This docstring was copied from pandas.core.groupby.groupby.GroupBy.mean.
Some inconsistencies with the Dask version may exist.
- Parameters
- numeric_only : bool, default False
Include only float, int, boolean columns.
Changed in version 2.0.0: numeric_only no longer accepts None and defaults to False.
- engine : str, default None (Not supported in Dask)
'cython': Runs the operation through C-extensions from cython.
'numba': Runs the operation through JIT compiled code from numba.
None: Defaults to 'cython' or globally setting compute.use_numba
New in version 1.4.0.
- engine_kwargs : dict, default None (Not supported in Dask)
For 'cython' engine, there are no accepted engine_kwargs.
For 'numba' engine, the engine can accept nopython, nogil and parallel dictionary keys. The values must either be True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False}.
New in version 1.4.0.
- Returns
- pandas.Series or pandas.DataFrame
See also
Series.groupby
Apply a function groupby to a Series.
DataFrame.groupby
Apply a function groupby to each row or column of a DataFrame.
Examples
>>> df = pd.DataFrame({'A': [1, 1, 2, 1, 2],
...                    'B': [np.nan, 2, 3, 4, 5],
...                    'C': [1, 2, 1, 1, 2]}, columns=['A', 'B', 'C'])
Groupby one column and return the mean of the remaining columns in each group.
>>> df.groupby('A').mean()
     B         C
A
1  3.0  1.333333
2  4.0  1.500000
Groupby two columns and return the mean of the remaining column.
>>> df.groupby(['A', 'B']).mean()
         C
A B
1 2.0  2.0
  4.0  1.0
2 3.0  1.0
  5.0  2.0
Groupby one column and return the mean of only a particular column in each group.
>>> df.groupby('A')['B'].mean()
A
1    3.0
2    4.0
Name: B, dtype: float64