dask.dataframe.DataFrame.var


DataFrame.var(axis=0, skipna=True, ddof=1, numeric_only=False, split_every=False, **kwargs)

Return unbiased variance over requested axis.

This docstring was copied from pandas.DataFrame.var.

Some inconsistencies with the Dask version may exist.

Normalized by N-1 by default. This can be changed using the ddof argument.

Parameters:

axis {index (0), columns (1)}: For Series this parameter is unused and defaults to 0.

Warning

The behavior of DataFrame.var with axis=None is deprecated. In a future version this will reduce over both axes and return a scalar. To retain the old behavior, pass axis=0 (or do not pass axis).
skipna bool, default True: Exclude NA/null values. If an entire row/column is NA, the result will be NA.
ddof int, default 1: Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of elements.
numeric_only bool, default False: Include only float, int, boolean columns. Not implemented for Series.
**kwargs: Additional keywords passed.
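
The effect of ddof and skipna can be sketched in plain Python. The helper below, `variance`, is a hypothetical illustration of the N - ddof divisor described above, not the code Dask or pandas actually runs; it assumes a flat list of numbers in which missing values are represented as NaN.

```python
import math


def variance(values, ddof=1, skipna=True):
    """Hypothetical helper: variance with divisor N - ddof (illustration only)."""
    if skipna:
        # Drop missing values before counting N.
        values = [v for v in values if not math.isnan(v)]
    elif any(math.isnan(v) for v in values):
        # Without skipna, a single NaN makes the result NaN.
        return math.nan
    n = len(values)
    if n - ddof <= 0:
        # Not enough observations (for example, an all-NA column).
        return math.nan
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


ages = [21, 25, 62, 43]
sample_var = variance(ages)               # divisor N - 1 (the default)
population_var = variance(ages, ddof=0)   # divisor N
with_gap = variance([21, 25, float("nan"), 43])   # NaN is skipped, so N is 3
all_missing = variance([float("nan")] * 3)        # nothing left, result is NaN
```

With these inputs, `sample_var` and `population_var` should agree with the `age` values shown in the Examples section for `df.var()` and `df.var(ddof=0)`.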

Returns:

Series or scalar: Unbiased variance over requested axis.

See also

numpy.var: Equivalent function in NumPy.
Series.var: Return unbiased variance over Series values.
Series.std: Return standard deviation over Series values.
DataFrame.std: Return standard deviation of the values over the requested axis.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame(
...     {
...         "person_id": [0, 1, 2, 3],
...         "age": [21, 25, 62, 43],
...         "height": [1.61, 1.87, 1.49, 2.01],
...     }
... ).set_index("person_id")
>>> df
           age  height
person_id
0           21    1.61
1           25    1.87
2           62    1.49
3           43    2.01

>>> df.var()
age       352.916667
height      0.056367
dtype: float64

Alternatively, ddof=0 can be set to normalize by N instead of N-1:

>>> df.var(ddof=0)
age       264.687500
height      0.042275
dtype: float64
