dask.dataframe.groupby.SeriesGroupBy.first

SeriesGroupBy.first(split_every=None, split_out=1, shuffle_method=None, numeric_only=_NoDefault.no_default)

Compute the first entry of each column within each group.

This docstring was copied from pandas.core.groupby.groupby.GroupBy.first.

Some inconsistencies with the Dask version may exist.

Defaults to skipping NA elements.

Parameters
numeric_only : bool, default False

Include only float, int, boolean columns.

min_count : int, default -1 (Not supported in Dask)

The required number of valid values to perform the operation. If fewer than min_count valid values are present, the result will be NA.

skipna : bool, default True (Not supported in Dask)

Exclude NA/null values. If an entire row/column is NA, the result will be NA.

New in version 2.2.1.

Returns
Series or DataFrame

First values within each group.

See also

DataFrame.groupby

Apply a groupby function to each row or column of a DataFrame.

pandas.core.groupby.DataFrameGroupBy.last

Compute the last non-null entry of each column.

pandas.core.groupby.DataFrameGroupBy.nth

Take the nth row from each group.

Examples

>>> df = pd.DataFrame(dict(A=[1, 1, 3], B=[None, 5, 6], C=[1, 2, 3],  
...                        D=['3/11/2000', '3/12/2000', '3/13/2000']))
>>> df['D'] = pd.to_datetime(df['D'])  
>>> df.groupby("A").first()  
     B  C          D
A
1  5.0  1 2000-03-11
3  6.0  3 2000-03-13
>>> df.groupby("A").first(min_count=2)  
    B    C          D
A
1 NaN  1.0 2000-03-11
3 NaN  NaN        NaT
>>> df.groupby("A").first(numeric_only=True)  
     B  C
A
1  5.0  1
3  6.0  3