dask_expr._collection.DataFrame.analyze

DataFrame.analyze(filename: str | None = None, format: str | None = None) → None

Outputs statistics about every node in the expression.

analyze optimizes the expression and triggers a computation. It records statistics, such as memory usage per partition, to show how data flows through the graph.

Warning

analyze adds plugins to the scheduler and the workers that have a non-trivial cost. This method should not be used in production workflows.

Parameters
filename: str, None

File to store the graph representation.

format: str, default is png

File format for the graph representation.

Returns
None, but writes a graph representation of the expression enriched with
statistics to disk.
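
Examples

The following is a minimal sketch of one way to call analyze, not an authoritative recipe. It assumes that a dask.distributed Client is required (the warning above mentions scheduler and worker plugins) and that the optional dependencies analyze relies on, such as graphviz, are installed. The helper name run_analyze, the column names, and the output name "analyze-report" are hypothetical and chosen only for illustration.

```python
def run_analyze(filename="analyze-report", fmt="png"):
    """Run DataFrame.analyze on a small filtered frame (illustrative only)."""
    # Imports are deferred so that defining this helper does not require the
    # optional dependencies; calling it does.
    import pandas as pd
    import dask.dataframe as dd
    from dask.distributed import Client

    # Assumption: analyze needs a distributed scheduler. An in-process
    # local cluster keeps the example small.
    with Client(processes=False):
        pdf = pd.DataFrame({"key": list("abcd") * 25, "value": range(100)})
        df = dd.from_pandas(pdf, npartitions=4)

        # Build a small expression with more than one node.
        filtered = df[df["value"] > 10]

        # Triggers a computation and writes the annotated graph to disk.
        # The method itself returns None.
        return filtered.analyze(filename=filename, format=fmt)


if __name__ == "__main__":
    run_analyze()
```

Because analyze triggers a full computation and installs plugins, a small sample of the data is usually a better target than a production-sized dataset.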