DataFrame.analyze(filename: str | None = None, format: str | None = None) -> None

Outputs statistics about every node in the expression.

analyze optimizes the expression and triggers a computation. It records statistics, such as memory usage per partition, to show how data flows through the graph.


Warning: analyze adds plugins to the scheduler and the workers, and these plugins have a non-trivial cost. Do not use this method in production workflows.

filename: str | None, default is None

File to store the graph representation.

format: str | None, png is used when None

File format for the graph representation.

Returns None, but writes a graph representation of the expression, enriched with
statistics, to disk.
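
A minimal usage sketch, under stated assumptions. The helper names write_analyze_report and demo are hypothetical and not part of dask. Because analyze installs plugins on the scheduler and the workers, the sketch assumes a distributed cluster is running, and it assumes that graphviz and any other rendering dependencies are installed. Only the filename and format parameters documented above are used.

```python
def write_analyze_report(df, filename="analyze-report", format="png"):
    """Hypothetical helper: run ``df.analyze`` and pass through its result.

    ``analyze`` is documented to return None; the useful output is the
    statistics-enriched graph written to disk as a side effect.
    """
    return df.analyze(filename=filename, format=format)


def demo():
    """Hypothetical demo; needs dask and distributed to be installed."""
    # Third-party imports are kept local so that defining the helper above
    # does not require them.
    import dask.datasets
    from distributed import Client

    # Assumption: an in-process local cluster is enough for exploration.
    with Client(processes=False):
        # Small synthetic frame with columns name, id, x and y.
        df = dask.datasets.timeseries(start="2000-01-01", end="2000-01-03")
        filtered = df[df.x > 0]
        # Optimizes the expression, computes it, and writes the graph.
        write_analyze_report(filtered, filename="analyze-report", format="png")
```

Calling demo() from an interactive session runs the computation once. Given the plugin overhead noted above, this kind of call fits exploratory profiling rather than a scheduled pipeline.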