dask_expr._collection.DataFrame.divisions
property DataFrame.divisions
Tuple of npartitions + 1 values, in ascending order, marking the lower/upper bounds of each partition’s index. Divisions let Dask know which partition contains a given index value, which significantly speeds up operations like loc, merge, and groupby because the full dataset does not have to be searched.

Example: for divisions = (0, 10, 50, 100), there are three partitions, whose indexes contain values in [0, 10), [10, 50), and [50, 100], respectively. Dask therefore knows that df.loc[45] will be in the second partition.

When every item in divisions is None, the divisions are unknown. Most operations can still be performed, but some will be much slower, and a few may fail.

Setting divisions directly is not supported. Instead, use set_index, which sorts and splits the data as needed. See https://docs.dask.org/en/latest/dataframe-design.html#partitions.
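
The lookup described above can be illustrated with a small pure-Python sketch. This is not Dask's implementation: the helper name partition_for is hypothetical, and the code only assumes the semantics stated here (ascending boundaries, half-open partitions except for the last, and all-None meaning unknown).

```python
from bisect import bisect_right


def partition_for(divisions, value):
    """Return the 0-based number of the partition whose index range holds value.

    Hypothetical helper for illustration only; not part of the Dask API.
    """
    if any(d is None for d in divisions):
        # Unknown divisions: there is no way to narrow the search.
        raise ValueError("divisions are unknown; every partition must be searched")
    if not divisions[0] <= value <= divisions[-1]:
        raise KeyError(value)
    npartitions = len(divisions) - 1
    # bisect_right counts the boundaries that are <= value; subtracting 1
    # turns that count into a partition number. min() keeps the final,
    # inclusive upper bound inside the last partition.
    return min(bisect_right(divisions, value) - 1, npartitions - 1)


divisions = (0, 10, 50, 100)
print(partition_for(divisions, 45))   # 1, i.e. the second partition
print(partition_for(divisions, 10))   # 1, since [0, 10) excludes 10
print(partition_for(divisions, 100))  # 2, since the last partition includes 100
```

Because the boundaries are sorted, a binary search over npartitions + 1 values is enough to pick the partition, which is why known divisions avoid scanning the whole dataset.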