dask.dataframe.groupby.DataFrameGroupBy.mean

dask.dataframe.groupby.DataFrameGroupBy.mean¶

DataFrameGroupBy.mean(split_every=None, split_out=1, shuffle_method=None, numeric_only=_NoDefault.no_default)¶

Compute mean of groups, excluding missing values.

This docstring was copied from pandas.core.groupby.groupby.GroupBy.mean.

Some inconsistencies with the Dask version may exist.

Parameters

numeric_onlybool, default False

Include only float, int, boolean columns.

Changed in version 2.0.0: numeric_only no longer accepts None and defaults to False.

enginestr, default None (Not supported in Dask)

'cython' : Runs the operation through C-extensions from cython.
'numba' : Runs the operation through JIT compiled code from numba.
None : Defaults to 'cython' or globally setting compute.use_numba

New in version 1.4.0.

engine_kwargsdict, default None (Not supported in Dask)

For 'cython' engine, there are no accepted engine_kwargs
For 'numba' engine, the engine can accept nopython, nogil and parallel dictionary keys. The values must either be True or False. The default engine_kwargs for the 'numba' engine is {{'nopython': True, 'nogil': False, 'parallel': False}}

New in version 1.4.0.

Returns

pandas.Series or pandas.DataFrame

See also

Series.groupby: Apply a function groupby to a Series.
DataFrame.groupby: Apply a function groupby to each row or column of a DataFrame.

Examples

>>> df = pd.DataFrame({'A': [1, 1, 2, 1, 2],  
...                    'B': [np.nan, 2, 3, 4, 5],
...                    'C': [1, 2, 1, 1, 2]}, columns=['A', 'B', 'C'])

Groupby one column and return the mean of the remaining columns in each group.

>>> df.groupby('A').mean()  
     B         C
A
1  3.0  1.333333
2  4.0  1.500000

Groupby two columns and return the mean of the remaining column.

>>> df.groupby(['A', 'B']).mean()  
         C
A B
1 2.0  2.0
  4.0  1.0
2 3.0  1.0
  5.0  2.0

Groupby one column and return the mean of only particular column in the group.

>>> df.groupby('A')['B'].mean()  
A
1    3.0
2    4.0
Name: B, dtype: float64

dask.dataframe.groupby.DataFrameGroupBy.max

dask.dataframe.groupby.DataFrameGroupBy.min