Dashboard Diagnostics
Profiling parallel code can be challenging, but the Dask distributed scheduler
provides live feedback via its interactive dashboard. A link to the dashboard is printed
in the terminal where the scheduler is created, and it is also shown when you create a Client
and connect it to the scheduler.
from dask.distributed import Client
client = Client() # start distributed scheduler locally.
client
In a Jupyter Notebook or JupyterLab session, displaying the client object shows the dashboard address.

In a terminal or IPython session, the address can be queried from client.dashboard_link. By default, when you start a scheduler on your local machine, the dashboard is served at http://localhost:8787/status. You can type this address into your browser to access the dashboard, but you may be directed elsewhere if port 8787 is already taken. You can also configure the address by passing options to the scheduler; see dashboard_address in LocalCluster.
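As a minimal sketch of the configuration described above, the following starts a small local cluster with an explicit dashboard address and prints the resulting link. The choice of `":0"` (which asks for any free port) and the single in-process worker (`processes=False`) are assumptions made to keep the example light; substitute the address and cluster size you actually want.

```python
from dask.distributed import Client, LocalCluster

# Assumption: ":0" requests any free port; use e.g. ":8790" to pin a specific one.
cluster = LocalCluster(
    n_workers=1,
    threads_per_worker=1,
    processes=False,         # keep the scheduler and worker in this process
    dashboard_address=":0",
)
client = Client(cluster)

# The same address that the notebook representation of the client displays.
link = client.dashboard_link
print(link)

client.close()
cluster.close()
```

Printing the link is mostly useful in a terminal or IPython session, where the client object is not rendered as a widget.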
The dashboard link redirects you to the entry point of the dashboard with information on:
Bytes Stored and Bytes per Worker: Cluster memory and memory per worker.
Task Processing/CPU Utilization/Occupancy: Tasks being processed by each worker, CPU utilization per worker, and expected runtime for all tasks currently on a worker.
Task Stream: Individual tasks across threads.
Progress: Progress of a set of tasks.

Bytes Stored and Bytes per Worker¶
These two plots show a summary of the overall memory usage on the cluster (Bytes Stored), as well as the individual usage on each worker (Bytes per Worker). The colors on these plots indicate the following.
■ | Memory under target (default 60% of memory available)
■ | Memory is close to the spilling to disk target (default 70% of memory available)
■ | Memory spilled to disk

The different levels of transparency on these plots are related to the type of memory (Managed, Unmanaged and Unmanaged recent). You can find a detailed explanation of them in the Worker Memory management documentation.
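The percentages quoted in the legend can be inspected from the Dask configuration. The sketch below assumes that the 60% and 70% defaults correspond to the `distributed.worker.memory.target` and `distributed.worker.memory.spill` configuration keys; if your deployment overrides them, the printed values will differ.

```python
import dask
import distributed  # noqa: F401  (importing registers the distributed defaults)

# Assumption: these two keys hold the thresholds that the legend refers to.
target = dask.config.get("distributed.worker.memory.target")
spill = dask.config.get("distributed.worker.memory.spill")

print(f"target fraction: {target}")
print(f"spill fraction:  {spill}")
```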
Task Processing/CPU Utilization/Occupancy¶
Task Processing
The Processing tab shows the number of tasks being processed by each worker as a blue bar. The scheduler tries to ensure that all workers are processing the same number of tasks. If one of the bars is completely white, that worker has no tasks and is waiting for them. This usually happens when the computation is close to finished (nothing to worry about), but it can also mean that the distribution of tasks across workers is not optimal.
There are three different colors that can appear in this plot:
■ | Processing tasks.
■ | Saturated: It has enough work to stay busy.
■ | Idle: Does not have enough work to stay busy.

In this plot on the dashboard we have two extra tabs with the following information:
CPU Utilization
The CPU tab shows the CPU usage per worker as reported by psutil metrics.
Occupancy
The Occupancy tab shows the occupancy, in time, per worker. The total occupancy for a worker is the total expected runtime for all tasks currently on that worker. For example, an occupancy of 10s means that the worker estimates it will take 10s to execute all the tasks it has currently been assigned.
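The occupancy arithmetic can be illustrated with a small worked example. The task keys and per-task runtime estimates below are hypothetical values chosen for illustration; they are not read from a real scheduler.

```python
# Hypothetical runtime estimates (in seconds) for the tasks assigned to one worker.
estimated_runtimes = {
    "inc-1": 2.0,
    "inc-2": 2.0,
    "add-1": 6.0,
}

# Occupancy is the total expected runtime of everything currently assigned.
occupancy = sum(estimated_runtimes.values())
print(f"occupancy: {occupancy:.0f}s")
```

With these estimates the worker would report an occupancy of 10s, matching the example in the text.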
Task Stream¶
The task stream is a view of all the tasks across worker-threads. Each row represents a thread and each rectangle represents an individual task. The color of each rectangle corresponds to the task-prefix of the task being performed, and it matches the color used in the Progress plot (see the Progress section). This means that all the individual tasks that are part of the inc task-prefix, for example, have the same randomly assigned color from the viridis color map.
Certain colors are reserved for specific kinds of tasks:
■ | Transferring data between workers.
■ | Reading from or writing to disk.
■ | Serializing/deserializing data.
■ | Erred tasks.
In some scenarios the task stream shows white space between rectangles, which means that the worker-thread was idle during that time. Too much white and red indicates that resources are not being used optimally.
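To make the notion of a task-prefix concrete, the sketch below groups a few hypothetical task keys by prefix using `dask.utils.key_split`. It assumes the dashboard derives prefixes in the same way as this helper; the keys themselves are made up for illustration.

```python
from collections import defaultdict

from dask.utils import key_split

# Hypothetical task keys, shaped like the ones Dask generates.
keys = ["inc-1", "inc-2", "add-3"]

# Tasks that share a prefix would share a color in the task stream.
by_prefix = defaultdict(list)
for key in keys:
    by_prefix[key_split(key)].append(key)

for prefix, members in by_prefix.items():
    print(prefix, members)
```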


Progress¶
The progress bars plot shows the progress of each individual task-prefix. The color of each bar matches the color of the individual tasks on the task stream that correspond to the same task-prefix. Each horizontal bar has three different components:
■ | Tasks that are ready to run.
■ | Tasks that have been completed and are in memory.
■ | Tasks that have been completed, were held in memory, and have since been released.
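To see these plots populate, you need a workload with more than one task-prefix. The following is a small sketch, assuming an in-process local cluster and two illustrative functions, `inc` and `add`; open the printed dashboard link while it runs to watch one bar per prefix. Real workloads would be larger and longer-running than this.

```python
from dask.distributed import Client


def inc(x):
    return x + 1


def add(x, y):
    return x + y


# Assumption: a tiny in-process cluster on a free port, just for the demo.
client = Client(processes=False, n_workers=1, threads_per_worker=2,
                dashboard_address=":0")
print(client.dashboard_link)

# Two task-prefixes ("inc" and "add") appear as separate progress bars.
incs = client.map(inc, range(10))
adds = client.map(add, incs[:-1], incs[1:])
results = client.gather(adds)
print(results)

client.close()
```

Once the futures are no longer referenced, their results can be released from memory, which corresponds to the third component of each bar.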

Dask JupyterLab Extension¶
The JupyterLab Dask extension allows you to embed Dask’s dashboard plots directly into JupyterLab panes.
Once the JupyterLab Dask extension is installed, you can choose any of the individual plots available and integrate it as a pane in your JupyterLab session. For example, you might select the Task Stream, Progress, Workers Memory, and Graph plots.
