dask_expr._groupby.GroupBy.sum

GroupBy.sum(numeric_only=False, min_count=None, **kwargs)

Compute sum of group values.

This docstring was copied from pandas.core.groupby.groupby.GroupBy.sum.

Some inconsistencies with the Dask version may exist.

Parameters
numeric_only : bool, default False

Include only float, int, boolean columns.

Changed in version 2.0.0: numeric_only no longer accepts None.

min_count : int, default 0

The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA (see the sketch after this parameter list).

engine : str, default None (Not supported in Dask)
  • 'cython' : Runs rolling apply through C-extensions from cython.

  • 'numba' : Runs rolling apply through JIT compiled code from numba.

    Only available when raw is set to True.

  • None : Defaults to 'cython' or globally setting compute.use_numba

engine_kwargs : dict, default None (Not supported in Dask)
  • For 'cython' engine, there are no accepted engine_kwargs

  • For 'numba' engine, the engine can accept nopython, nogil and parallel dictionary keys. The values must either be True or False. The default engine_kwargs for the 'numba' engine is {'nopython': True, 'nogil': False, 'parallel': False} and will be applied to both the func and the apply groupby aggregation.
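
The interaction of numeric_only and min_count can be seen in the following minimal sketch (not part of the copied pandas docstring); it assumes dask.dataframe is installed and that the Dask defaults mirror the pandas behavior described above:

import pandas as pd
import dask.dataframe as dd

pdf = pd.DataFrame({
    "key": ["a", "a", "b"],
    "val": [1.0, 2.0, None],
    "label": ["x", "y", "z"],
})
ddf = dd.from_pandas(pdf, npartitions=2)

# numeric_only=True drops the non-numeric "label" column before summing.
numeric = ddf.groupby("key").sum(numeric_only=True).compute()

# With the default min_count, the all-NA group "b" is expected to sum to 0.0;
# requiring at least one valid value leaves it as NA instead.
strict = ddf.groupby("key").sum(numeric_only=True, min_count=1).compute()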

Returns
Series or DataFrame

Computed sum of values within each group.

Examples

For SeriesGroupBy:

>>> lst = ['a', 'a', 'b', 'b']  
>>> ser = pd.Series([1, 2, 3, 4], index=lst)  
>>> ser  
a    1
a    2
b    3
b    4
dtype: int64
>>> ser.groupby(level=0).sum()  
a    3
b    7
dtype: int64

For DataFrameGroupBy:

>>> data = [[1, 8, 2], [1, 2, 5], [2, 5, 8], [2, 6, 9]]  
>>> df = pd.DataFrame(data, columns=["a", "b", "c"],  
...                   index=["tiger", "leopard", "cheetah", "lion"])
>>> df  
         a  b  c
tiger    1  8  2
leopard  1  2  5
cheetah  2  5  8
lion     2  6  9
>>> df.groupby("a").sum()  
     b   c
a
1   10   7
2   11  17
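
The same DataFrameGroupBy example run through Dask, as a minimal sketch that is not part of the copied pandas docstring; it assumes dask.dataframe is importable and reuses df from above. The aggregation stays lazy until .compute() is called, and the result is expected to match the pandas output shown above:

>>> import dask.dataframe as dd
>>> ddf = dd.from_pandas(df, npartitions=2)
>>> ddf.groupby("a").sum().compute()
     b   c
a
1   10   7
2   11  17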