dask_expr._collection.Index.value_counts
dask_expr._collection.Index.value_counts¶
- Index.value_counts(sort=None, ascending=False, dropna=True, normalize=False, split_every=None, split_out=_NoDefault.no_default)¶
Return a Series containing counts of unique values.
This docstring was copied from pandas.core.series.Series.value_counts.
Some inconsistencies with the Dask version may exist.
The resulting object will be in descending order so that the first element is the most frequently-occurring element. Excludes NA values by default.
- Parameters
- normalizebool, default False
If True then the object returned will contain the relative frequencies of the unique values.
- sortbool, default True
Sort by frequencies when True. Preserve the order of the data when False.
- ascendingbool, default False
Sort in ascending order.
- binsint, optional (Not supported in Dask)
Rather than count values, group them into half-open bins, a convenience for
pd.cut
, only works with numeric data.- dropnabool, default True
Don’t include counts of NaN.
- Returns
- Series
See also
Series.count
Number of non-NA elements in a Series.
DataFrame.count
Number of non-NA elements in a DataFrame.
DataFrame.value_counts
Equivalent method on DataFrames.
Examples
>>> index = pd.Index([3, 1, 2, 3, 4, np.nan]) >>> index.value_counts() 3.0 2 1.0 1 2.0 1 4.0 1 Name: count, dtype: int64
With normalize set to True, returns the relative frequency by dividing all values by the sum of values.
>>> s = pd.Series([3, 1, 2, 3, 4, np.nan]) >>> s.value_counts(normalize=True) 3.0 0.4 1.0 0.2 2.0 0.2 4.0 0.2 Name: proportion, dtype: float64
bins
Bins can be useful for going from a continuous variable to a categorical variable; instead of counting unique apparitions of values, divide the index in the specified number of half-open bins.
>>> s.value_counts(bins=3) (0.996, 2.0] 2 (2.0, 3.0] 2 (3.0, 4.0] 1 Name: count, dtype: int64
dropna
With dropna set to False we can also see NaN index values.
>>> s.value_counts(dropna=False) 3.0 2 1.0 1 2.0 1 4.0 1 NaN 1 Name: count, dtype: int64