dask.dataframe.groupby.DataFrameGroupBy.idxmax

DataFrameGroupBy.idxmax(split_every=None, split_out=1, axis=None, skipna=True)

Return index of first occurrence of maximum over requested axis.

This docstring was copied from pandas.core.frame.DataFrame.idxmax.

Some inconsistencies with the Dask version may exist.

NA/null values are excluded.

Parameters
axis{0 or ‘index’, 1 or ‘columns’}, default 0

The axis to use. 0 or ‘index’ for row-wise, 1 or ‘columns’ for column-wise.

skipnabool, default True

Exclude NA/null values. If an entire row/column is NA, the result will be NA.

Returns
Series

Indexes of maxima along the specified axis.

Raises
ValueError
  • If the row/column is empty

See also

Series.idxmax

Return index of the maximum element.

Notes

This method is the DataFrame version of ndarray.argmax.

Examples

Consider a dataset containing food consumption in Argentina.

>>> df = pd.DataFrame({'consumption': [10.51, 103.11, 55.48],  
...                    'co2_emissions': [37.2, 19.66, 1712]},
...                    index=['Pork', 'Wheat Products', 'Beef'])
>>> df  
                consumption  co2_emissions
Pork                  10.51         37.20
Wheat Products       103.11         19.66
Beef                  55.48       1712.00

By default, it returns the index for the maximum value in each column.

>>> df.idxmax()  
consumption     Wheat Products
co2_emissions             Beef
dtype: object

To return the index for the maximum value in each row, use axis="columns".

>>> df.idxmax(axis="columns")  
Pork              co2_emissions
Wheat Products     consumption
Beef              co2_emissions
dtype: object