dask_expr._collection.DataFrame.sample

DataFrame.sample(n=None, frac=None, replace=False, random_state=None)

Random sample of items

Parameters
n : int, optional

Number of items to return. Not supported by dask; use frac instead.

frac : float, optional

Approximate fraction of items to return. The same sampling fraction is applied to every partition, so the result is not guaranteed to contain exactly len(df) * frac items. The exact number of elements selected depends on how your data is partitioned, though it is usually close in practice.
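The dependence on partitioning can be illustrated with a small pure-Python simulation. This is not dask's code: it assumes each partition is sampled independently and that the per-partition count is frac * len(partition) rounded to an integer. The helper sample_partitions and both partition layouts are hypothetical.

```python
import random


def sample_partitions(partitions, frac, seed=0):
    """Sample each partition independently, rounding the per-partition count."""
    rng = random.Random(seed)
    sampled = []
    for part in partitions:
        k = round(frac * len(part))  # assumed rule: round to a whole number of rows
        sampled.extend(rng.sample(part, k))
    return sampled


rows = list(range(100))
frac = 0.1

# The same 100 rows under two hypothetical partition layouts.
even = [rows[i:i + 20] for i in range(0, 100, 20)]      # five partitions of 20 rows
skewed = [rows[:4], rows[4:8], rows[8:12], rows[12:]]   # partitions of 4, 4, 4, 88 rows

print(len(sample_partitions(even, frac)))    # 10: each partition contributes 2 rows
print(len(sample_partitions(skewed, frac)))  # 9: the three small partitions contribute 0
```

Under this assumption the total drifts from len(df) * frac only through per-partition rounding, which is why large partitions usually give a count close to the nominal one.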

replace : boolean, optional

Whether to sample with replacement. Default is False.

random_state : int or np.random.RandomState

If an int, a new RandomState is created with this value as the seed; otherwise, samples are drawn from the passed RandomState.