dask.dataframe.Index.nlargest

Index.nlargest(n=5, split_every=None)

Return the largest n elements.

This docstring was copied from pandas.core.series.Series.nlargest.

Some inconsistencies with the Dask version may exist.

Parameters
n : int, default 5

Number of values to return, sorted in descending order.

keep : {‘first’, ‘last’, ‘all’}, default ‘first’ (Not supported in Dask)

When there are duplicate values that cannot all fit in a Series of n elements:

  • first : return the first n occurrences in order of appearance.

  • last : return the last n occurrences in reverse order of appearance.

  • all : keep all occurrences. This can result in a Series of size larger than n.

Returns
Series

The n largest values in the Series, sorted in decreasing order.

See also

Series.nsmallest

Get the n smallest elements.

Series.sort_values

Sort Series by values.

Series.head

Return the first n rows.

Notes

Faster than .sort_values(ascending=False).head(n) for small n relative to the size of the Series object.
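The equivalence behind this note can be checked with a small sketch. It uses plain pandas and a made-up Series (the data and variable names are illustrative, not from the original docstring), and it compares values only, because with tied values the two approaches are not guaranteed to pick the same index labels.

```python
import pandas as pd

# Illustrative data; any numeric Series would do.
s = pd.Series([3, 1, 4, 1, 5, 9, 2, 6], name="values")
n = 3

# Direct selection of the n largest values.
top_direct = s.nlargest(n)

# Full sort followed by taking the first n rows.
top_sorted = s.sort_values(ascending=False).head(n)

# Compare values only: with ties, the index labels chosen may differ.
same_values = top_direct.tolist() == top_sorted.tolist()
print(top_direct.tolist(), same_values)
```

Both approaches return the same values; the difference is that nlargest does not need to order the whole Series, which is where the speed advantage for small n comes from.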

Examples

>>> countries_population = {"Italy": 59000000, "France": 65000000,
...                         "Malta": 434000, "Maldives": 434000,
...                         "Brunei": 434000, "Iceland": 337000,
...                         "Nauru": 11300, "Tuvalu": 11300,
...                         "Anguilla": 11300, "Montserrat": 5200}
>>> s = pd.Series(countries_population)
>>> s
Italy       59000000
France      65000000
Malta         434000
Maldives      434000
Brunei        434000
Iceland       337000
Nauru          11300
Tuvalu         11300
Anguilla       11300
Montserrat      5200
dtype: int64

The n largest elements where n=5 by default.

>>> s.nlargest()  
France      65000000
Italy       59000000
Malta         434000
Maldives      434000
Brunei        434000
dtype: int64

The n largest elements where n=3. The default keep value is ‘first’, so Malta is kept because it appears first among the entries tied at 434000.

>>> s.nlargest(3)  
France    65000000
Italy     59000000
Malta       434000
dtype: int64

The n largest elements where n=3, keeping the last duplicates. Brunei is kept because it is the last entry with value 434000 in index order.

>>> s.nlargest(3, keep='last')  
France      65000000
Italy       59000000
Brunei        434000
dtype: int64

The n largest elements where n=3 with all duplicates kept. Note that the returned Series has five elements because three entries are tied at 434000.

>>> s.nlargest(3, keep='all')  
France      65000000
Italy       59000000
Malta         434000
Maldives      434000
Brunei        434000
dtype: int64
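
The examples above run on an in-memory pandas Series. The following is a minimal sketch of why the same result can be computed over partitioned data: every one of the overall n largest values must also be among the n largest of its own partition, so it is enough to reduce each partition and then reduce the combined candidates. The helper nlargest_partitioned and the partition boundaries are hypothetical and written in plain pandas; this is not Dask's actual implementation. The split_every argument in the signature presumably controls how many partial results are combined at a time in such a reduction, but that is an assumption here.

```python
import pandas as pd


def nlargest_partitioned(parts, n=5):
    """Hypothetical helper: top-n over a list of pandas Series chunks."""
    # Step 1: reduce every partition to its own n largest values.
    candidates = [part.nlargest(n) for part in parts]
    # Step 2: combine the candidates and reduce once more.
    return pd.concat(candidates).nlargest(n)


s = pd.Series({"Italy": 59000000, "France": 65000000,
               "Malta": 434000, "Maldives": 434000,
               "Brunei": 434000, "Iceland": 337000,
               "Nauru": 11300, "Tuvalu": 11300,
               "Anguilla": 11300, "Montserrat": 5200})

# Arbitrary split into three contiguous "partitions" for illustration.
parts = [s.iloc[:4], s.iloc[4:7], s.iloc[7:]]

result = nlargest_partitioned(parts, n=3)
print(result)
```

With these contiguous partitions the result matches s.nlargest(3), including the choice of Malta for the tied value, because concatenating the candidates preserves the original order of appearance.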