dask.dataframe.api.GroupBy.first

GroupBy.first(numeric_only=False, sort=None, **kwargs)

Compute the first entry of each column within each group.

This docstring was copied from pandas.core.groupby.groupby.GroupBy.first.

Some inconsistencies with the Dask version may exist.

Defaults to skipping NA elements.

Parameters:

numeric_only (bool, default False): Include only float, int, and boolean columns.
min_count (int, default -1; not supported in Dask): The required number of valid values to perform the operation. If fewer than min_count valid values are present, the result will be NA.
skipna (bool, default True; not supported in Dask): Exclude NA/null values. If an entire group is NA, the result will be NA. Added in pandas version 2.2.1.

Returns:

Series or DataFrame: First values within each group.

See also

DataFrame.groupby: Apply a function groupby to each row or column of a DataFrame.
core.groupby.DataFrameGroupBy.last: Compute the last non-null entry of each column.
core.groupby.DataFrameGroupBy.nth: Take the nth row from each group.

Examples

>>> import pandas as pd
>>> df = pd.DataFrame(
...     dict(
...         A=[1, 1, 3],
...         B=[None, 5, 6],
...         C=[1, 2, 3],
...         D=["3/11/2000", "3/12/2000", "3/13/2000"],
...     )
... )
>>> df["D"] = pd.to_datetime(df["D"])
>>> df.groupby("A").first()
     B  C          D
A
1  5.0  1 2000-03-11
3  6.0  3 2000-03-13
>>> df.groupby("A").first(min_count=2)
    B    C          D
A
1 NaN  1.0 2000-03-11
3 NaN  NaN        NaT
>>> df.groupby("A").first(numeric_only=True)
     B  C
A
1  5.0  1
3  6.0  3
