dask.dataframe.DataFrame.cumsum

dask.dataframe.DataFrame.cumsum

DataFrame.cumsum(axis=None, skipna=True, dtype=None, out=None)

Return cumulative sum over a DataFrame or Series axis.

This docstring was copied from pandas.core.frame.DataFrame.cumsum.

Some inconsistencies with the Dask version may exist.

Returns a DataFrame or Series of the same size containing the cumulative sum.

Parameters
axis{0 or ‘index’, 1 or ‘columns’}, default 0

The index or the name of the axis. 0 is equivalent to None or ‘index’. For Series this parameter is unused and defaults to 0.

skipnabool, default True

Exclude NA/null values. If an entire row/column is NA, the result will be NA.

*args, **kwargs

Additional keywords have no effect but might be accepted for compatibility with NumPy.

Returns
Series or DataFrame

Return cumulative sum of Series or DataFrame.

See also

core.window.expanding.Expanding.sum

Similar functionality but ignores NaN values.

DataFrame.sum

Return the sum over DataFrame axis.

DataFrame.cummax

Return cumulative maximum over DataFrame axis.

DataFrame.cummin

Return cumulative minimum over DataFrame axis.

DataFrame.cumsum

Return cumulative sum over DataFrame axis.

DataFrame.cumprod

Return cumulative product over DataFrame axis.

Examples

Series

>>> s = pd.Series([2, np.nan, 5, -1, 0])  
>>> s  
0    2.0
1    NaN
2    5.0
3   -1.0
4    0.0
dtype: float64

By default, NA values are ignored.

>>> s.cumsum()  
0    2.0
1    NaN
2    7.0
3    6.0
4    6.0
dtype: float64

To include NA values in the operation, use skipna=False

>>> s.cumsum(skipna=False)  
0    2.0
1    NaN
2    NaN
3    NaN
4    NaN
dtype: float64

DataFrame

>>> df = pd.DataFrame([[2.0, 1.0],  
...                    [3.0, np.nan],
...                    [1.0, 0.0]],
...                   columns=list('AB'))
>>> df  
     A    B
0  2.0  1.0
1  3.0  NaN
2  1.0  0.0

By default, iterates over rows and finds the sum in each column. This is equivalent to axis=None or axis='index'.

>>> df.cumsum()  
     A    B
0  2.0  1.0
1  5.0  NaN
2  6.0  1.0

To iterate over columns and find the sum in each row, use axis=1

>>> df.cumsum(axis=1)  
     A    B
0  2.0  3.0
1  3.0  NaN
2  1.0  1.0