Python API (advanced)¶
In some rare cases, experts may want to create Scheduler, Worker, and Nanny objects explicitly in Python. This is often necessary when making tools to automatically deploy Dask in custom settings.
It is more common to create a local cluster with Client() on a single machine or to use the command line interface (CLI). New readers are recommended to start there.
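For comparison, here is a minimal sketch of that more common single-machine pattern, in which a blocking Client() starts a local cluster for you:

from dask.distributed import Client

client = Client()                            # starts a LocalCluster and connects to it
future = client.submit(lambda x: x + 1, 10)  # run a trivial task on the cluster
print(future.result())                       # prints 11
client.close()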
If you do want to start Scheduler and Worker objects yourself, you should be a little familiar with async/await style Python syntax. These objects are awaitable and are commonly used within async with context managers.
Here are a few examples showing different ways to start and shut these objects down.
Full Example¶
- Scheduler: Dynamic distributed task scheduler
- Worker: Worker node in a Dask distributed cluster
- Client: Connect to and submit computation to a Dask cluster
We first start with a comprehensive example of setting up a Scheduler, two Workers, and one Client in the same event loop, running a simple computation, and then cleaning everything up.
import asyncio
from dask.distributed import Scheduler, Worker, Client

async def f():
    async with Scheduler() as s:
        async with Worker(s.address) as w1, Worker(s.address) as w2:
            async with Client(s.address, asynchronous=True) as client:
                future = client.submit(lambda x: x + 1, 10)
                result = await future
                print(result)

asyncio.get_event_loop().run_until_complete(f())
Now we look at simpler examples that build up to this case.
Scheduler¶
- Scheduler: Dynamic distributed task scheduler
We create a scheduler by creating a Scheduler() object, and then await that object to wait for it to start up. We can then await the .finished() method to wait until it closes. In the meantime the scheduler will be active, managing the cluster.
import asyncio
from dask.distributed import Scheduler, Worker

async def f():
    s = Scheduler()         # scheduler created, but not yet running
    s = await s             # the scheduler is running
    await s.finished()      # wait until the scheduler closes

asyncio.get_event_loop().run_until_complete(f())
This program will run forever, or until some external process connects to the scheduler and tells it to stop. If you want to close things yourself, you can close any Scheduler, Worker, Nanny, or Client object by awaiting its .close method:
await s.close()
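For example, here is a small illustrative sketch (not from the original text) in which a second coroutine in the same event loop closes the scheduler after a delay; the ten-second timeout is arbitrary:

import asyncio
from dask.distributed import Scheduler

async def f():
    s = await Scheduler()

    async def stop_later():
        await asyncio.sleep(10)   # arbitrary delay for illustration
        await s.close()           # ask the scheduler to shut down

    asyncio.create_task(stop_later())
    await s.finished()            # returns once the scheduler has closed

asyncio.get_event_loop().run_until_complete(f())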
Worker¶
- Worker: Worker node in a Dask distributed cluster
The worker follows the same API. The only difference is that the worker needs to know the address of the scheduler.
import asyncio
from dask.distributed import Scheduler, Worker

async def f(scheduler_address):
    w = await Worker(scheduler_address)
    await w.finished()

asyncio.get_event_loop().run_until_complete(f("tcp://127.0.0.1:8786"))
Start many in one event loop¶
- Scheduler: Dynamic distributed task scheduler
- Worker: Worker node in a Dask distributed cluster
We can run as many of these objects as we like in the same event loop.
import asyncio
from dask.distributed import Scheduler, Worker

async def f():
    s = await Scheduler()
    w = await Worker(s.address)
    await w.finished()
    await s.finished()

asyncio.get_event_loop().run_until_complete(f())
Use Context Managers¶
We can also use async with context managers to make sure that we clean up properly. Here is the same example as above:
import asyncio
from dask.distributed import Scheduler, Worker

async def f():
    async with Scheduler() as s:
        async with Worker(s.address) as w:
            await w.finished()
            await s.finished()

asyncio.get_event_loop().run_until_complete(f())
Alternatively, in the example below we also include a Client, run a small computation, and then allow things to clean up after that computation.
import asyncio
from dask.distributed import Scheduler, Worker, Client

async def f():
    async with Scheduler() as s:
        async with Worker(s.address) as w1, Worker(s.address) as w2:
            async with Client(s.address, asynchronous=True) as client:
                future = client.submit(lambda x: x + 1, 10)
                result = await future
                print(result)

asyncio.get_event_loop().run_until_complete(f())
This is equivalent to creating and awaiting each server, and then calling .close on each as we leave the context.
In this example we don’t wait on s.finished(), so this will terminate relatively quickly. You could have called await s.finished() if you wanted this to run forever.
Nanny¶
- Nanny: A process to manage worker processes
Alternatively, we can replace Worker with Nanny if we want our workers to be managed in a separate process. The Nanny constructor follows the same API. This allows workers to restart themselves in case of failure. Also, it provides some additional monitoring, and is useful when coordinating many workers that should live in different processes in order to avoid the GIL.
# w = await Worker(s.address)
w = await Nanny(s.address)
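For instance, a minimal sketch of the context-manager example above with the workers replaced by nannies might look like this (the computation is illustrative):

import asyncio
from dask.distributed import Scheduler, Nanny, Client

async def f():
    async with Scheduler() as s:
        # each Nanny starts and supervises a Worker in a separate process
        async with Nanny(s.address) as n1, Nanny(s.address) as n2:
            async with Client(s.address, asynchronous=True) as client:
                future = client.submit(lambda x: x + 1, 10)
                print(await future)

asyncio.get_event_loop().run_until_complete(f())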
API¶
These classes have a variety of keyword arguments that you can use to control their behavior. See the API documentation below for more information.
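For example, one might pass a few of the keyword arguments documented below when constructing these objects; the particular port, dashboard address, and memory values here are only illustrative:

import asyncio
from dask.distributed import Scheduler, Worker

async def f():
    # bind the scheduler to a fixed port and serve the dashboard on another
    async with Scheduler(port=8786, dashboard_address=":8787") as s:
        # a worker with two threads and an explicit memory limit
        async with Worker(s.address, nthreads=2, memory_limit="4GB") as w:
            await w.finished()

asyncio.get_event_loop().run_until_complete(f())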
Scheduler¶
- class distributed.Scheduler(loop=None, delete_interval='500ms', synchronize_worker_interval='60s', services=None, service_kwargs=None, allowed_failures=None, extensions=None, validate=None, scheduler_file=None, security=None, worker_ttl=None, idle_timeout=None, interface=None, host=None, port=0, protocol=None, dashboard_address=None, dashboard=None, http_prefix='/', preload=None, preload_argv=(), plugins=(), contact_address=None, **kwargs)[source]¶
Dynamic distributed task scheduler
The scheduler tracks the current state of workers, data, and computations. The scheduler listens for events and responds by controlling workers appropriately. It continuously tries to use the workers to execute an ever growing dask graph.
All events are handled quickly, in linear time with respect to their input (which is often of constant size) and generally within a millisecond. To accomplish this the scheduler tracks a lot of state. Every operation maintains the consistency of this state.
The scheduler communicates with the outside world through Comm objects. It maintains a consistent and valid view of the world even when listening to several clients at once.
A Scheduler is typically started either with the dask-scheduler executable:

$ dask-scheduler
Scheduler started at 127.0.0.1:8786

Or within a LocalCluster, which a Client starts up when given no connection information:

>>> c = Client()
>>> c.cluster.scheduler
Scheduler(...)
Users typically do not interact with the scheduler directly but rather with the client object Client.

The contact_address parameter allows advertising a specific address to the workers for communication with the scheduler, different from the address to which the scheduler binds. This is useful when the scheduler listens on a private address, which therefore cannot be used by the workers to contact it.

State
The scheduler contains the following state variables. Each variable is listed along with what it stores and a brief description.
- tasks: {task key: TaskState}
Tasks currently known to the scheduler
- unrunnable: {TaskState}
Tasks in the “no-worker” state
- workers: {worker key: WorkerState}
Workers currently connected to the scheduler
- idle: {WorkerState}
Set of workers that are not fully utilized
- saturated: {WorkerState}
Set of workers that are not over-utilized
- host_info: {hostname: dict}
Information about each worker host
- clients: {client key: ClientState}
Clients currently connected to the scheduler
- services: {str: port}
Other services running on this scheduler, like Bokeh
- loop: IOLoop
The running Tornado IOLoop
- client_comms: {client key: Comm}
For each client, a Comm object used to receive task requests and report task status updates.
- stream_comms: {worker key: Comm}
For each worker, a Comm object from which we both accept stimuli and report results
- task_duration: {key-prefix: time}
Time we expect certain functions to take, e.g. {'sum': 0.25}
- adaptive_target(target_duration=None)[source]¶
Desired number of workers based on the current workload
This looks at the current running tasks and memory use, and returns a number of desired workers. This is often used by adaptive scheduling.
- Parameters
- target_duration: str
A desired duration of time for computations to take. This affects how rapidly the scheduler will ask to scale.
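As a small illustrative sketch (not from the original documentation), a deployment tool running in the scheduler's event loop might poll this method to decide how many workers to request:

import asyncio
from dask.distributed import Scheduler, Worker

async def f():
    async with Scheduler() as s:
        async with Worker(s.address):
            # a real adaptive deployment tool would scale the cluster
            # toward this number; here we just print it a few times
            for _ in range(3):
                print("desired workers:", s.adaptive_target())
                await asyncio.sleep(1)

asyncio.get_event_loop().run_until_complete(f())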
- async add_client(comm: distributed.comm.core.Comm, client: str, versions: dict[str, Any]) None [source]¶
Add client to network
We listen to all future messages from this Comm.
- add_keys(worker=None, keys=(), stimulus_id=None)[source]¶
Learn that a worker has certain keys
This should not be used in practice and is mostly here for legacy reasons. However, it is sent by workers from time to time.
- add_plugin(plugin: SchedulerPlugin, *, idempotent: bool = False, name: str | None = None, **kwargs)[source]¶
Add external plugin to scheduler.
See https://distributed.readthedocs.io/en/latest/plugins.html
- Parameters
- plugin: SchedulerPlugin
SchedulerPlugin instance to add
- idempotent: bool
If true, the plugin is assumed to already exist and no action is taken.
- name: str
A name for the plugin, if None, the name attribute is checked on the Plugin instance and generated if not discovered.
- async add_worker(comm=None, *, address: str, status: str, keys=(), nthreads=None, name=None, resolve_address=True, nbytes=None, types=None, now=None, resources=None, host_info=None, memory_limit=None, metrics=None, pid=0, services=None, local_directory=None, versions: dict[str, Any] | None = None, nanny=None, extra=None, stimulus_id=None)[source]¶
Add a new worker to the cluster
- async benchmark_hardware() dict[str, dict[str, float]] [source]¶
Run a benchmark on the workers for memory, disk, and network bandwidths
- Returns
- result: dict
A dictionary mapping the names “disk”, “memory”, and “network” to dictionaries mapping sizes to bandwidths. These bandwidths are averaged over many workers running computations across the cluster.
- async broadcast(comm=None, *, msg: dict, workers: list[str] | None = None, hosts: list[str] | None = None, nanny: bool = False, serializers=None, on_error: "Literal['raise', 'return', 'return_pickle', 'ignore']" = 'raise') dict [source]¶
Broadcast message to workers, return all results
- client_releases_keys(keys=None, client=None, stimulus_id=None)[source]¶
Remove keys from client desired list
- async close(fast=False, close_workers=False)[source]¶
Send cleanup signal to all coroutines then wait until finished
See also
Scheduler.cleanup
- async close_worker(worker: str, stimulus_id: str, safe: bool = False)[source]¶
Remove a worker from the cluster
This both removes the worker from our local state and also sends a signal to the worker to shut down. This works regardless of whether or not the worker has a nanny process restarting it
- coerce_address(addr, resolve=True)[source]¶
Coerce possible input addresses to canonical form. resolve can be disabled for testing with fake hostnames.
Handles strings, tuples, or aliases.
- async delete_worker_data(worker_address: str, keys: collections.abc.Collection[str], stimulus_id: str) None [source]¶
Delete data from a worker and update the corresponding worker/task states
- Parameters
- worker_address: str
Worker address to delete keys from
- keys: list[str]
List of keys to delete on the specified worker
- async dump_cluster_state_to_url(url: str, exclude: collections.abc.Collection[str], format: Literal['msgpack', 'yaml'], **storage_options: dict[str, Any]) None [source]¶
Write a cluster state dump to an fsspec-compatible URL.
- async feed(comm, function=None, setup=None, teardown=None, interval='1s', **kwargs)[source]¶
Provides a data Comm to external requester
Caution: this runs arbitrary Python code on the scheduler. This should eventually be phased out. It is mostly used by diagnostics.
- async gather_on_worker(worker_address: str, who_has: dict[str, list[str]]) set [source]¶
Peer-to-peer copy of keys from multiple workers to a single worker
- Parameters
- worker_address: str
Recipient worker address to copy keys to
- who_has: dict[Hashable, list[str]]
{key: [sender address, sender address, …], key: …}
- Returns
- returns:
set of keys that failed to be copied
- async get_cluster_state(exclude: collections.abc.Collection[str]) dict [source]¶
Produce the state dict used in a cluster state dump
- get_worker_service_addr(worker: str, service_name: str, protocol: bool = False) tuple[str, int] | str | None [source]¶
Get the (host, port) address of the named service on the worker. Returns None if the service doesn’t exist.
- Parameters
- worker: address
- service_name: str
Common services include ‘bokeh’ and ‘nanny’
- protocol: boolean
Whether or not to include a full address with protocol (True) or just a (host, port) pair
- handle_long_running(key=None, worker=None, compute_duration=None)[source]¶
A task has seceded from the thread pool
We stop the task from being stolen in the future, and change task duration accounting as if the task has stopped.
- handle_missing_data(key: str, worker: str, errant_worker: str, stimulus_id: str) None [source]¶
Signal that errant_worker does not hold key.
This may either indicate that errant_worker is dead or that we may be working with stale data and need to remove key from the workers has_what. If no replica of a task is available anymore, the task is transitioned back to released and rescheduled, if possible.
- Parameters
- key: str
Task key that could not be found
- worker: str
Address of the worker informing the scheduler
- errant_worker: str
Address of the worker supposed to hold a replica
- async handle_worker(comm=None, worker=None, stimulus_id=None)[source]¶
Listen to responses from a single worker
This is the main loop for scheduler-worker interaction
See also
Scheduler.handle_client
Equivalent coroutine for clients
- async proxy(comm=None, msg=None, worker=None, serializers=None)[source]¶
Proxy a communication through the scheduler to some other worker
- async rebalance(comm=None, keys: collections.abc.Iterable[collections.abc.Hashable] = None, workers: collections.abc.Iterable[str] = None, stimulus_id: str = None) dict [source]¶
Rebalance keys so that each worker ends up with roughly the same process memory (managed+unmanaged).
Warning
This operation is generally not well tested against normal operation of the scheduler. It is not recommended to use it while waiting on computations.
Algorithm

1. Find the mean occupancy of the cluster, defined as data managed by dask + unmanaged process memory that has been there for at least 30 seconds (distributed.worker.memory.recent-to-old-time). This lets us ignore temporary spikes caused by task heap usage. Alternatively, you may change how memory is measured, both for the individual workers as well as to calculate the mean, through distributed.worker.memory.rebalance.measure. Namely, this can be useful to disregard inaccurate OS memory measurements.
2. Discard workers whose occupancy is within 5% of the mean cluster occupancy (distributed.worker.memory.rebalance.sender-recipient-gap / 2). This helps avoid data bouncing around the cluster repeatedly.
3. Workers above the mean are senders; those below are recipients.
4. Discard senders whose absolute occupancy is below 30% (distributed.worker.memory.rebalance.sender-min). In other words, no data is moved regardless of imbalancing as long as all workers are below 30%.
5. Discard recipients whose absolute occupancy is above 60% (distributed.worker.memory.rebalance.recipient-max). Note that this threshold by default is the same as distributed.worker.memory.target to prevent workers from accepting data and immediately spilling it out to disk.
6. Iteratively pick the sender and recipient that are farthest from the mean and move the least recently inserted key between the two, until either all senders or all recipients fall within 5% of the mean.
7. A recipient will be skipped if it already has a copy of the data. In other words, this method does not degrade replication. A key will be skipped if there are no recipients available with enough memory to accept the key and that don’t already hold a copy.

The least recently inserted (LRI) policy is a greedy choice with the advantage of being O(1), trivial to implement (it relies on python dict insertion-sorting) and hopefully good enough in most cases. Discarded alternative policies were:

- Largest first. O(n*log(n)) save for non-trivial additional data structures and risks causing the largest chunks of data to repeatedly move around the cluster like pinballs.
- Least recently used (LRU). This information is currently available on the workers only and not trivial to replicate on the scheduler; transmitting it over the network would be very expensive. Also, note that dask will go out of its way to minimise the amount of time intermediate keys are held in memory, so in such a case LRI is a close approximation of LRU.
- Parameters
- keys: optional
allowlist of dask keys that should be considered for moving. All other keys will be ignored. Note that this offers no guarantee that a key will actually be moved (e.g. because it is unnecessary or because there are no viable recipient workers for it).
- workers: optional
allowlist of workers addresses to be considered as senders or recipients. All other workers will be ignored. The mean cluster occupancy will be calculated only using the allowed workers.
- reevaluate_occupancy(worker_index: int = 0)[source]¶
Periodically reassess task duration time
The expected duration of a task can change over time. Unfortunately we don’t have a good constant-time way to propagate the effects of these changes out to the summaries that they affect, like the total expected runtime of each of the workers, or what tasks are stealable.
In this coroutine we walk through all of the workers and re-align their estimates with the current state of tasks. We do this periodically rather than at every transition, and we only do it if the scheduler process isn’t under load (using psutil.Process.cpu_percent()). This lets us avoid this fringe optimization when we have better things to think about.
- async register_nanny_plugin(comm, plugin, name=None)[source]¶
Registers a setup function, and calls it on every worker
- async register_scheduler_plugin(plugin, name=None, idempotent=None)[source]¶
Register a plugin on the scheduler.
- async register_worker_plugin(comm, plugin, name=None)[source]¶
Registers a worker plugin on all running and future workers
- remove_client(client: str, stimulus_id: Optional[str] = None) None [source]¶
Remove client from network
- remove_plugin(name: str | None = None, plugin: SchedulerPlugin | None = None) None [source]¶
Remove external plugin from scheduler
- Parameters
- name: str
Name of the plugin to remove
- async remove_worker(address, stimulus_id, safe=False, close=True)[source]¶
Remove worker from cluster
We do this when a worker reports that it plans to leave or when it appears to be unresponsive. This may send its tasks back to a released state.
- async replicate(comm=None, keys=None, n=None, workers=None, branching_factor=2, delete=True, lock=True, stimulus_id=None)[source]¶
Replicate data throughout cluster
This performs a tree copy of the data throughout the network individually on each piece of data.
- Parameters
- keys: Iterable
list of keys to replicate
- n: int
Number of replications we expect to see within the cluster
- branching_factor: int, optional
The number of workers that can copy data in each generation. The larger the branching factor, the more data we copy in a single step, but the more a given worker risks being swamped by data requests.
- report(msg: dict, ts: Optional[distributed.scheduler.TaskState] = None, client: Optional[str] = None)[source]¶
Publish updates to all listening Queues and Comms
If the message contains a key then we only send the message to those comms that care about the key.
- request_acquire_replicas(addr: str, keys: list, *, stimulus_id: str)[source]¶
Asynchronously ask a worker to acquire a replica of the listed keys from other workers. This is a fire-and-forget operation which offers no feedback for success or failure, and is intended for housekeeping and not for computation.
- request_remove_replicas(addr: str, keys: list, *, stimulus_id: str)[source]¶
Asynchronously ask a worker to discard its replica of the listed keys. This must never be used to destroy the last replica of a key. This is a fire-and-forget operation, intended for housekeeping and not for computation.
The replica disappears immediately from TaskState.who_has on the Scheduler side; if the worker refuses to delete, e.g. because the task is a dependency of another task running on it, it will (also asynchronously) inform the scheduler to re-add itself to who_has. If the worker agrees to discard the task, there is no feedback.
- reschedule(key=None, worker=None)[source]¶
Reschedule a task
Things may have shifted and this task may now be better suited to run elsewhere
- async retire_workers(comm=None, *, workers: list[str] | None = None, names: list | None = None, close_workers: bool = False, remove: bool = True, stimulus_id: str = None, **kwargs) dict [source]¶
Gracefully retire workers from cluster
- Parameters
- workers: list[str] (optional)
List of worker addresses to retire.
- names: list (optional)
List of worker names to retire. Mutually exclusive with workers. If neither workers nor names are provided, we call workers_to_close, which finds a good set.
- close_workers: bool (defaults to False)
Whether or not to actually close the worker explicitly from here. Otherwise we expect some external job scheduler to finish off the worker.
- remove: bool (defaults to True)
Whether or not to remove the worker metadata immediately or else wait for the worker to contact us
- **kwargs: dict
Extra options to pass to workers_to_close to determine which workers we should drop
- Returns
- Dictionary mapping worker ID/address to a dictionary of information about that worker, for each retired worker.
- run_function(comm, function, args=(), kwargs=None, wait=True)[source]¶
Run a function within this process
- async scatter(comm=None, data=None, workers=None, client=None, broadcast=False, timeout=2)[source]¶
Send data out to workers
- send_task_to_worker(worker, ts: distributed.scheduler.TaskState, duration: float = -1)[source]¶
Send a single computational task to a worker
- stimulus_cancel(comm, keys=None, client=None, force=False)[source]¶
Stop execution on a list of keys
- stimulus_task_erred(key=None, worker=None, exception=None, stimulus_id=None, traceback=None, **kwargs)[source]¶
Mark that a task has erred on a particular worker
- stimulus_task_finished(key=None, worker=None, stimulus_id=None, **kwargs)[source]¶
Mark that a task has finished execution on a particular worker
- transition(key, finish: str, *args, stimulus_id: str, **kwargs)[source]¶
Transition a key from its current state to the finish state
- Returns
- Dictionary of recommendations for future transitions
See also
Scheduler.transitions
transitive version of this function
Examples
>>> self.transition('x', 'waiting')
{'x': 'processing'}
- transition_story(*keys)¶
Get all transitions that touch one of the input keys
- transitions(recommendations: dict, stimulus_id: str)[source]¶
Process transitions until none are left
This includes feedback from previous transitions and continues until we reach a steady state
- update_data(*, who_has: dict, nbytes: dict, client=None)[source]¶
Learn that new data has entered the network from an external source
See also
Scheduler.mark_key_in_memory
- update_graph(client=None, tasks=None, keys=None, dependencies=None, restrictions=None, priority=None, loose_restrictions=None, resources=None, submitting_task=None, retries=None, user_priority=0, actors=None, fifo_timeout=0, annotations=None, code=None, stimulus_id=None)[source]¶
Add new computations to the internal dask graph
This happens whenever the Client calls submit, map, get, or compute.
- worker_send(worker: str, msg: dict[str, Any]) None [source]¶
Send message to worker
This also handles connection failures by adding a callback to remove the worker on the next cycle.
- workers_list(workers)[source]¶
List of qualifying workers
Takes a list of worker addresses or hostnames. Returns a list of all worker addresses that match
- workers_to_close(comm=None, memory_ratio: int | float | None = None, n: int | None = None, key: Callable[[WorkerState], Hashable] | None = None, minimum: int | None = None, target: int | None = None, attribute: str = 'address') list[str] [source]¶
Find workers that we can close with low cost
This returns a list of workers that are good candidates to retire. These workers are not running anything and are storing relatively little data relative to their peers. If all workers are idle then we still maintain enough workers to have enough RAM to store our data, with a comfortable buffer.
This is for use with systems like distributed.deploy.adaptive.
- Parameters
- memory_ratio: Number
Amount of extra space we want to have for our stored data. Defaults to 2, or that we want to have twice as much memory as we currently have data.
- n: int
Number of workers to close
- minimum: int
Minimum number of workers to keep around
- key: Callable(WorkerState)
An optional callable mapping a WorkerState object to a group affiliation. Groups will be closed together. This is useful when closing workers must be done collectively, such as by hostname.
- target: int
Target number of workers to have after we close
- attribute: str
The attribute of the WorkerState object to return, like “address” or “name”. Defaults to “address”.
- Returns
- to_close: list of worker addresses that are OK to close
Examples
>>> scheduler.workers_to_close()
['tcp://192.168.0.1:1234', 'tcp://192.168.0.2:1234']
Group workers by hostname prior to closing
>>> scheduler.workers_to_close(key=lambda ws: ws.host)
['tcp://192.168.0.1:1234', 'tcp://192.168.0.1:4567']
Remove two workers
>>> scheduler.workers_to_close(n=2)
Keep enough workers to have twice as much memory as we need.
>>> scheduler.workers_to_close(memory_ratio=2)
Worker¶
- class distributed.Worker(scheduler_ip: str | None = None, scheduler_port: int | None = None, *, scheduler_file: str | None = None, nthreads: int | None = None, loop: IOLoop | None = None, local_dir: None = None, local_directory: str | None = None, services: dict | None = None, name: Any | None = None, reconnect: bool = True, executor: Executor | dict[str, Executor] | Literal['offload'] | None = None, resources: dict[str, float] | None = None, silence_logs: int | None = None, death_timeout: Any | None = None, preload: list[str] | None = None, preload_argv: list[str] | list[list[str]] | None = None, security: Security | dict[str, Any] | None = None, contact_address: str | None = None, heartbeat_interval: Any = '1s', extensions: dict[str, type] | None = None, metrics: Mapping[str, Callable[[Worker], Any]] = {}, startup_information: Mapping[str, Callable[[Worker], Any]] = {}, interface: str | None = None, host: str | None = None, port: int | str | Collection[int] | None = None, protocol: str | None = None, dashboard_address: str | None = None, dashboard: bool = False, http_prefix: str = '/', nanny: Nanny | None = None, plugins: tuple[WorkerPlugin, ...] = (), low_level_profiler: bool | None = None, validate: bool | None = None, profile_cycle_interval=None, lifetime: Any | None = None, lifetime_stagger: Any | None = None, lifetime_restart: bool | None = None, memory_limit: str | float = 'auto', data=None, memory_target_fraction: float | Literal[False] | None = None, memory_spill_fraction: float | Literal[False] | None = None, memory_pause_fraction: float | Literal[False] | None = None, **kwargs)[source]¶
Worker node in a Dask distributed cluster
Workers perform two functions:
- Serve data from a local dictionary
- Perform computation on that data and on data from peers
Workers keep the scheduler informed of their data and use that scheduler to gather data from other workers when necessary to perform a computation.
You can start a worker with the dask-worker command line application:

$ dask-worker scheduler-ip:port

Use the --help flag to see more options:

$ dask-worker --help
The rest of this docstring is about the internal state that the worker uses to manage and track internal computations.
State
Informational State
These attributes don’t change significantly during execution.
- nthreads: int
Number of threads used by this worker process
- executors: dict[str, concurrent.futures.Executor]
Executors used to perform computation. Always contains the default executor.
- local_directory: path
Path on local machine to store temporary files
- scheduler: rpc
Location of scheduler. See .ip/.port attributes.
- name: string
Alias
- services: {str: Server}
Auxiliary web servers running on this worker
- service_ports: {str: port}
- total_out_connections: int
The maximum number of concurrent outgoing requests for data
- total_in_connections: int
The maximum number of concurrent incoming requests for data
- comm_threshold_bytes: int
As long as the total number of bytes in flight is below this threshold we will not limit the number of outgoing connections for a single task’s dependency fetch.
- batched_stream: BatchedSend
A batched stream along which we communicate to the scheduler
- log: [(message)]
A structured and queryable log. See Worker.story
Volatile State

These attributes track the progress of tasks that this worker is trying to complete. In the descriptions below a key is the name of a task that we want to compute and dep is the name of a piece of dependent data that we want to collect from others.

- tasks: {key: TaskState}
The tasks currently executing on this worker (and any dependencies of those tasks)
- data_needed: UniqueTaskHeap
The tasks which still require data in order to execute and are in memory on at least one other worker, prioritized as a heap
- data_needed_per_worker: {worker: UniqueTaskHeap}
Same as data_needed, split by worker
- ready: [keys]
Keys that are ready to run. Stored in a LIFO stack
- constrained: [keys]
Keys for which we have the data to run, but are waiting on abstract resources like GPUs. Stored in a FIFO deque
- executing_count: int
A count of tasks currently executing on this worker
- executed_count: int
The number of tasks that this worker has run in its lifetime
- long_running: {keys}
A set of keys of tasks that are running and have started their own long-running clients.
- has_what: {worker: {deps}}
The data that we care about that we think a worker has
- in_flight_tasks: int
A count of the number of tasks that are coming to us in current peer-to-peer connections
- in_flight_workers: {worker: {task}}
The workers from which we are currently gathering data and the dependencies we expect from those connections. Workers in this dict won’t be asked for additional dependencies until the current query returns.
- busy_workers: {worker}
Workers that recently returned a busy status. Workers in this set won’t be asked for additional dependencies for some time.
- comm_bytes: int
The total number of bytes in flight
- threads: {key: int}
The ID of the thread on which the task ran
- active_threads: {int: key}
The keys currently running on active threads
- waiting_for_data_count: int
A count of how many tasks are currently waiting for data
- generation: int
Counter that decreases every time the compute-task handler is invoked by the Scheduler. It is appended to TaskState.priority and acts as a tie-breaker between tasks that have the same priority on the Scheduler, determining a last-in-first-out order between them.
- Parameters
- scheduler_ip: str, optional
- scheduler_port: int, optional
- scheduler_file: str, optional
- ip: str, optional
- data: MutableMapping, type, None
The object to use for storage, builds a disk-backed LRU dict by default
- nthreads: int, optional
- loop: tornado.ioloop.IOLoop
- local_directory: str, optional
Directory where we place local resources
- name: str, optional
- memory_limit: int, float, string
Number of bytes of memory that this worker should use. Set to zero for no limit. Set to ‘auto’ to calculate as system.MEMORY_LIMIT * min(1, nthreads / total_cores). Use strings or numbers like 5GB or 5e9.
- memory_target_fraction: float or False
Fraction of memory to try to stay beneath (default: read from config key distributed.worker.memory.target)
- memory_spill_fraction: float or False
Fraction of memory at which we start spilling to disk (default: read from config key distributed.worker.memory.spill)
- memory_pause_fraction: float or False
Fraction of memory at which we stop running new tasks (default: read from config key distributed.worker.memory.pause)
- max_spill: int, string or False
Limit of number of bytes to be spilled on disk. (default: read from config key distributed.worker.memory.max-spill)
- executor: concurrent.futures.Executor, dict[str, concurrent.futures.Executor], “offload”
The executor(s) to use. Depending on the type, it has the following meanings:
- Executor instance: The default executor.
- Dict[str, Executor]: mapping names to Executor instances. If the “default” key isn’t in the dict, a “default” executor will be created using ThreadPoolExecutor(nthreads).
- Str: The string “offload”, which refers to the same thread pool used for offloading communications. This results in the same thread being used for deserialization and computation.
- resources: dict
Resources that this worker has like
{'GPU': 2}
- nanny: str
Address on which to contact nanny, if it exists
- lifetime: str
Amount of time like “1 hour” after which we gracefully shut down the worker. This defaults to None, meaning no explicit shutdown time.
- lifetime_stagger: str
Amount of time like “5 minutes” to stagger the lifetime value. The actual lifetime will be selected uniformly at random between lifetime +/- lifetime_stagger
- lifetime_restart: bool
Whether or not to restart a worker after it has reached its lifetime. Default False
- kwargs: optional
Additional parameters to ServerNode constructor
Examples
Use the command line to start a worker:
$ dask-scheduler
Start scheduler at 127.0.0.1:8786

$ dask-worker 127.0.0.1:8786
Start worker at: 127.0.0.1:1234
Registered with scheduler at: 127.0.0.1:8786
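Since this page covers the Python API, a corresponding sketch constructing a worker directly in Python might look like the following; the scheduler address and resource values are illustrative, and a scheduler must already be running at that address:

import asyncio
from dask.distributed import Worker

async def start_worker():
    w = await Worker(
        "tcp://127.0.0.1:8786",     # address of an already-running scheduler
        nthreads=2,                 # two threads in the default executor
        memory_limit="4GB",         # spill/pause fractions are relative to this
        resources={"GPU": 1},       # abstract resources, as described above
    )
    await w.finished()

asyncio.get_event_loop().run_until_complete(start_worker())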
- async close_gracefully(restart=None)[source]¶
Gracefully shut down a worker
This first informs the scheduler that we’re shutting down, and asks it to move our data elsewhere. Afterwards, we close as normal
- property data: collections.abc.MutableMapping[str, Any]¶
{task key: task payload} of all completed tasks, whether they were computed on this Worker or computed somewhere else and then transferred here over the network.
When using the default configuration, this is a zict buffer that automatically spills to disk whenever the target threshold is exceeded. If spilling is disabled, it is a plain dict instead. It could also be a user-defined arbitrary dict-like passed when initialising the Worker or the Nanny. Worker logic should treat this opaquely and stick to the MutableMapping API.
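A brief sketch (not from the original docs) of supplying a plain dict as this mapping when constructing a worker, so that nothing is ever spilled to disk; any MutableMapping with the same interface would work:

import asyncio
from dask.distributed import Scheduler, Worker

async def f():
    async with Scheduler() as s:
        # use an ordinary in-memory dict instead of the default spill-to-disk buffer
        async with Worker(s.address, data=dict()) as w:
            assert isinstance(w.data, dict)

asyncio.get_event_loop().run_until_complete(f())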
- async gather_dep(worker: str, to_gather: Iterable[str], total_nbytes: int, *, stimulus_id: str) StateMachineEvent | None [source]¶
Gather dependencies for a task from a worker who has them
- Parameters
- worker: str
Address of worker to gather dependencies from
- to_gather: list
Keys of dependencies to gather from worker – this is not necessarily equivalent to the full list of dependencies of dep as some dependencies may already be present on this worker.
- total_nbytes: int
Total number of bytes for all the dependencies in to_gather combined
- get_current_task() str [source]¶
Get the key of the task we are currently running
This only makes sense to run within a task
See also
get_worker
Examples
>>> from dask.distributed import get_worker
>>> def f():
...     return get_worker().get_current_task()

>>> future = client.submit(f)
>>> future.result()
'f-1234'
- handle_cancel_compute(key: str, stimulus_id: str) None [source]¶
Cancel a task on a best effort basis. This is only possible while a task is in state waiting or ready. Nothing will happen otherwise.
- handle_free_keys(keys: list[str], stimulus_id: str) None [source]¶
Handler to be called by the scheduler.
The given keys are no longer referred to and required by the scheduler. The worker is now allowed to release the key, if applicable.
This does not guarantee that the memory is released since the worker may still decide to hold on to the data and task since it is required by an upstream dependency.
- handle_remove_replicas(keys: list[str], stimulus_id: str) str [source]¶
Stream handler notifying the worker that it might be holding unreferenced, superfluous data.
This should not actually happen during ordinary operations and is only intended to correct any erroneous state. An example where this is necessary is if a worker fetches data for a downstream task but that task is released before the data arrives. In this case, the scheduler will notify the worker that it may be holding this unnecessary data, if the worker hasn’t already released the data itself.
This handler does not guarantee the task nor the data to be actually released but only asks the worker to release the data on a best effort guarantee. This protects from race conditions where the given keys may already have been rescheduled for compute in which case the compute would win and this handler is ignored.
For stronger guarantees, see handler free_keys
- async start_unsafe()[source]¶
Attempt to start the server. This is not idempotent and not protected against concurrent startup attempts.
This is intended to be overwritten or called by subclasses. For a safe startup, please use Server.start instead.

If death_timeout is configured, we will require this coroutine to finish before this timeout is reached. If the timeout is reached we will close the instance and raise an asyncio.TimeoutError.
- stimulus_story(*keys_or_tasks: str | TaskState) list[StateMachineEvent] [source]¶
Return all state machine events involving one or more tasks
- story(*keys_or_tasks: str | TaskState) list[tuple] [source]¶
Return all transitions involving one or more tasks
- transition(ts: distributed.worker_state_machine.TaskState, finish: str, *, stimulus_id: str, **kwargs) None [source]¶
Transition a key from its current state to the finish state
- Returns
- Dictionary of recommendations for future transitions
See also
Scheduler.transitions
transitive version of this function
Examples
>>> self.transition('x', 'waiting', stimulus_id=f"test-{time()}")
{'x': 'processing'}
- transition_resumed_fetch(ts: distributed.worker_state_machine.TaskState, *, stimulus_id: str) tuple [source]¶
See Worker._transition_from_resumed
- transition_resumed_missing(ts: distributed.worker_state_machine.TaskState, *, stimulus_id: str) tuple [source]¶
See Worker._transition_from_resumed
- transition_resumed_waiting(ts: distributed.worker_state_machine.TaskState, *, stimulus_id: str) tuple [source]¶
See Worker._transition_from_resumed
- transitions(recommendations: dict, *, stimulus_id: str) None [source]¶
Process transitions until none are left
This includes feedback from previous transitions and continues until we reach a steady state
- trigger_profile() None [source]¶
Get a frame from all actively computing threads
Merge these frames into existing profile counts
- property worker_address¶
For API compatibility with Nanny
Nanny¶
- class distributed.Nanny(scheduler_ip=None, scheduler_port=None, scheduler_file=None, worker_port: int | str | Collection[int] | None = 0, nthreads=None, loop=None, local_dir=None, local_directory=None, services=None, name=None, memory_limit='auto', reconnect=True, validate=False, quiet=False, resources=None, silence_logs=None, death_timeout=None, preload=None, preload_argv=None, preload_nanny=None, preload_nanny_argv=None, security=None, contact_address=None, listen_address=None, worker_class=None, env=None, interface=None, host=None, port: int | str | Collection[int] | None = None, protocol=None, config=None, **worker_kwargs)[source]¶
A process to manage worker processes
The nanny spins up Worker processes, watches them, and kills or restarts them as necessary. It is necessary if you want to use the Client.restart method, or to restart the worker automatically if it gets to the terminate fraction of its memory limit.

The parameters for the Nanny are mostly the same as those for the Worker, with exceptions listed below.
- Parameters
- env: dict, optional
Environment variables set at the time of Nanny initialization are ensured to be set in the Worker process as well. This argument allows overwriting or otherwise setting environment variables for the Worker. It is also possible to set environment variables using the option distributed.nanny.environ. Precedence is as follows (see the sketch after this list):
Nanny arguments
Existing environment variables
Dask configuration
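A minimal sketch of passing environment variables through the Nanny; the variable name and value shown here are only illustrative:

import asyncio
from dask.distributed import Scheduler, Nanny

async def f():
    async with Scheduler() as s:
        # the nanny propagates these variables into the worker process it spawns
        async with Nanny(s.address, env={"MALLOC_TRIM_THRESHOLD_": "65536"}) as n:
            await asyncio.sleep(1)  # the worker process runs with the variable set

asyncio.get_event_loop().run_until_complete(f())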
- close_gracefully()[source]¶
A signal that we shouldn’t try to restart workers if they go away
This is used as part of the cluster shutdown process.
- async instantiate() distributed.core.Status [source]¶
Start a local worker process
Blocks until the process is up and the scheduler is properly informed
- async kill(timeout=2)[source]¶
Kill the local worker process
Blocks until both the process is down and the scheduler is properly informed
- property local_dir¶
For API compatibility with Nanny