dask.dataframe.DataFrame.sum
dask.dataframe.DataFrame.sum¶
- DataFrame.sum(axis=None, skipna=True, split_every=False, dtype=None, out=None, min_count=None, numeric_only=None)¶
Return the sum of the values over the requested axis.
This docstring was copied from pandas.core.frame.DataFrame.sum.
Some inconsistencies with the Dask version may exist.
This is equivalent to the method
numpy.sum
.- Parameters
- axis{index (0), columns (1)}
Axis for the function to be applied on. For Series this parameter is unused and defaults to 0.
Warning
The behavior of DataFrame.sum with
axis=None
is deprecated, in a future version this will reduce over both axes and return a scalar To retain the old behavior, pass axis=0 (or do not pass axis).New in version 2.0.0.
- skipnabool, default True
Exclude NA/null values when computing the result.
- numeric_onlybool, default False
Include only float, int, boolean columns. Not implemented for Series.
- min_countint, default 0
The required number of valid values to perform the operation. If fewer than
min_count
non-NA values are present the result will be NA.- **kwargs
Additional keyword arguments to be passed to the function.
- Returns
- Series or scalar
See also
Series.sum
Return the sum.
Series.min
Return the minimum.
Series.max
Return the maximum.
Series.idxmin
Return the index of the minimum.
Series.idxmax
Return the index of the maximum.
DataFrame.sum
Return the sum over the requested axis.
DataFrame.min
Return the minimum over the requested axis.
DataFrame.max
Return the maximum over the requested axis.
DataFrame.idxmin
Return the index of the minimum over the requested axis.
DataFrame.idxmax
Return the index of the maximum over the requested axis.
Examples
>>> idx = pd.MultiIndex.from_arrays([ ... ['warm', 'warm', 'cold', 'cold'], ... ['dog', 'falcon', 'fish', 'spider']], ... names=['blooded', 'animal']) >>> s = pd.Series([4, 2, 0, 8], name='legs', index=idx) >>> s blooded animal warm dog 4 falcon 2 cold fish 0 spider 8 Name: legs, dtype: int64
>>> s.sum() 14
By default, the sum of an empty or all-NA Series is
0
.>>> pd.Series([], dtype="float64").sum() # min_count=0 is the default 0.0
This can be controlled with the
min_count
parameter. For example, if you’d like the sum of an empty series to be NaN, passmin_count=1
.>>> pd.Series([], dtype="float64").sum(min_count=1) nan
Thanks to the
skipna
parameter,min_count
handles all-NA and empty series identically.>>> pd.Series([np.nan]).sum() 0.0
>>> pd.Series([np.nan]).sum(min_count=1) nan