dask.bag.map_partitions
dask.bag.map_partitions¶
- dask.bag.map_partitions(func, *args, **kwargs)[source]¶
Apply a function to every partition across one or more bags.
Note that all
Bag
arguments must be partitioned identically.- Parameters
- funccallable
- *args, **kwargsBag, Item, Delayed, or object
Arguments and keyword arguments to pass to
func
.
Examples
>>> import dask.bag as db >>> b = db.from_sequence(range(1, 101), npartitions=10) >>> def div(nums, den=1): ... return [num / den for num in nums]
Using a python object:
>>> hi = b.max().compute() >>> hi 100 >>> b.map_partitions(div, den=hi).take(5) (0.01, 0.02, 0.03, 0.04, 0.05)
Using an
Item
:>>> b.map_partitions(div, den=b.max()).take(5) (0.01, 0.02, 0.03, 0.04, 0.05)
Note that while both versions give the same output, the second forms a single graph, and then computes everything at once, and in some cases may be more efficient.