dask.dataframe.Series.nunique_approx

Series.nunique_approx(split_every=None)

Approximate number of unique rows.

This method uses the HyperLogLog algorithm for cardinality estimation to compute the approximate number of unique rows. The approximate error is 0.406%.
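To illustrate the idea behind HyperLogLog, the following is a minimal pure-Python sketch. It is not Dask's implementation: the function name `hll_estimate`, the choice of `blake2b` as the hash, and the parameter `b` (number of index bits) are assumptions made for this example. The standard error of HyperLogLog is roughly 1.04 / sqrt(m) for m registers, so the 0.406% figure quoted above is consistent with m = 2**16 registers (1.04 / 256 ≈ 0.00406).

```python
import hashlib
import math


def hll_estimate(items, b=12):
    """Estimate the number of distinct items using 2**b registers.

    Illustrative sketch only; `b` and the hash function are assumptions.
    """
    m = 1 << b
    registers = [0] * m
    for item in items:
        # Derive a 64-bit hash from the item's repr.
        digest = hashlib.blake2b(repr(item).encode(), digest_size=8).digest()
        h = int.from_bytes(digest, "big")
        idx = h >> (64 - b)                 # top b bits select a register
        rest = h & ((1 << (64 - b)) - 1)    # remaining 64 - b bits
        # Rank: 1-based position of the leftmost 1-bit in `rest`.
        rank = (64 - b) - rest.bit_length() + 1
        registers[idx] = max(registers[idx], rank)

    # Bias-correction constant, valid for m >= 128.
    alpha = 0.7213 / (1 + 1.079 / m)
    raw = alpha * m * m / sum(2.0 ** -r for r in registers)

    # Small-range correction: fall back to linear counting.
    zeros = registers.count(0)
    if raw <= 2.5 * m and zeros:
        return m * math.log(m / zeros)
    return raw
```

Each register keeps only the largest rank it has seen, so memory use depends on `b` rather than on the number of distinct values, and registers from separate partitions can be merged with an elementwise maximum.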

Parameters
split_every : int, optional

Group partitions into groups of this size while performing a tree-reduction. If set to False, no tree-reduction will be used. Default is 8.
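The grouping behaviour described for split_every can be sketched in plain Python. This is a simplified model, not Dask's scheduler code: `tree_reduce` and `merge_registers` are hypothetical helpers, and each partition is assumed to have already been summarised as a list of HyperLogLog registers.

```python
def merge_registers(group):
    """Merge register lists by taking the elementwise maximum."""
    return [max(column) for column in zip(*group)]


def tree_reduce(parts, combine, split_every=8):
    """Combine `parts` in groups of `split_every` until one result remains.

    Passing split_every=False combines all parts in a single step.
    """
    if not parts:
        raise ValueError("need at least one partition")
    if split_every is False:
        return combine(parts)
    if split_every < 2:
        raise ValueError("split_every must be at least 2")
    while len(parts) > 1:
        # One level of the tree: merge neighbouring groups of partitions.
        parts = [
            combine(parts[i:i + split_every])
            for i in range(0, len(parts), split_every)
        ]
    return parts[0]
```

With 20 partitions and split_every=4, the first level produces 5 intermediate results, the second level 2, and the third level the final result. A smaller group size gives a deeper tree with less work per combine step.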

Returns
A float representing the approximate number of unique elements.