dask.dataframe.DataFrame.random_split
- DataFrame.random_split(frac, random_state=None, shuffle=False)
Pseudorandomly split the dataframe row-wise into several pieces.
- Parameters
- frac : list
List of floats that should sum to one.
- random_state : int or np.random.RandomState
If int, create a new RandomState with this as the seed. Otherwise, draw from the passed RandomState.
- shuffle : bool, default False
If set to True, the dataframe is shuffled (within partition) before the split.
See also
dask.dataframe.DataFrame.sample
Examples
50/50 split
>>> a, b = df.random_split([0.5, 0.5])
80/10/10 split, consistent random_state
>>> a, b, c = df.random_split([0.8, 0.1, 0.1], random_state=123)