dask.dataframe.DataFrame.random_split

DataFrame.random_split(frac, random_state=None, shuffle=False)

Pseudorandomly split dataframe into different pieces row-wise

Parameters
frac : list

List of floats that should sum to one.

random_state : int or np.random.RandomState

If int, create a new RandomState with this as the seed. Otherwise, draw from the passed RandomState.

shuffle : bool, default False

If set to True, the dataframe is shuffled (within partition) before the split.

See also

dask.dataframe.DataFrame.sample

Examples

50/50 split

>>> a, b = df.random_split([0.5, 0.5])

80/10/10 split, consistent random_state

>>> a, b, c = df.random_split([0.8, 0.1, 0.1], random_state=123)