dask.dataframe.DataFrame.explode

DataFrame.explode(column)[source]

Transform each element of a list-like to a row, replicating index values.

This docstring was copied from pandas.core.frame.DataFrame.explode.

Some inconsistencies with the Dask version may exist.

New in version 0.25.0.

Parameters
columnIndexLabel

Column(s) to explode. For multiple columns, specify a non-empty list with each element be str or tuple, and all specified columns their list-like data on same row of the frame must have matching length.

New in version 1.3.0: Multi-column explode

ignore_indexbool, default False (Not supported in Dask)

If True, the resulting index will be labeled 0, 1, …, n - 1.

New in version 1.1.0.

Returns
DataFrame

Exploded lists to rows of the subset columns; index will be duplicated for these rows.

Raises
ValueError :
  • If columns of the frame are not unique.

  • If specified columns to explode is empty list.

  • If specified columns to explode have not matching count of elements rowwise in the frame.

See also

DataFrame.unstack

Pivot a level of the (necessarily hierarchical) index labels.

DataFrame.melt

Unpivot a DataFrame from wide format to long format.

Series.explode

Explode a DataFrame from list-like columns to long format.

Notes

This routine will explode list-likes including lists, tuples, sets, Series, and np.ndarray. The result dtype of the subset rows will be object. Scalars will be returned unchanged, and empty list-likes will result in a np.nan for that row. In addition, the ordering of rows in the output will be non-deterministic when exploding sets.

Examples

>>> df = pd.DataFrame({'A': [[0, 1, 2], 'foo', [], [3, 4]],  
...                    'B': 1,
...                    'C': [['a', 'b', 'c'], np.nan, [], ['d', 'e']]})
>>> df  
           A  B          C
0  [0, 1, 2]  1  [a, b, c]
1        foo  1        NaN
2         []  1         []
3     [3, 4]  1     [d, e]

Single-column explode.

>>> df.explode('A')  
     A  B          C
0    0  1  [a, b, c]
0    1  1  [a, b, c]
0    2  1  [a, b, c]
1  foo  1        NaN
2  NaN  1         []
3    3  1     [d, e]
3    4  1     [d, e]

Multi-column explode.

>>> df.explode(list('AC'))  
     A  B    C
0    0  1    a
0    1  1    b
0    2  1    c
1  foo  1  NaN
2  NaN  1  NaN
3    3  1    d
3    4  1    e