dask.dataframe.DataFrame.iterrows

DataFrame.iterrows()[source]

Iterate over DataFrame rows as (index, Series) pairs.

This docstring was copied from pandas.core.frame.DataFrame.iterrows.

Some inconsistencies with the Dask version may exist.

Yields
indexlabel or tuple of label

The index of the row. A tuple for a MultiIndex.

dataSeries

The data of the row as a Series.

See also

DataFrame.itertuples

Iterate over DataFrame rows as namedtuples of the values.

DataFrame.items

Iterate over (column name, Series) pairs.

Notes

  1. Because iterrows returns a Series for each row, it does not preserve dtypes across the rows (dtypes are preserved across columns for DataFrames). For example,

    >>> df = pd.DataFrame([[1, 1.5]], columns=['int', 'float'])  
    >>> row = next(df.iterrows())[1]  
    >>> row  
    int      1.0
    float    1.5
    Name: 0, dtype: float64
    >>> print(row['int'].dtype)  
    float64
    >>> print(df['int'].dtype)  
    int64
    

    To preserve dtypes while iterating over the rows, it is better to use itertuples() which returns namedtuples of the values and which is generally faster than iterrows.

  2. You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.