dask.bag.Bag.map
Bag.map(func, *args, **kwargs)
Apply a function elementwise across one or more bags.
Note that all Bag arguments must be partitioned identically.
Parameters
func : callable
*args, **kwargs : Bag, Item, or object
    Extra arguments and keyword arguments to pass to func after the calling bag instance. Non-Bag args/kwargs are broadcast across all calls to func.
Notes
For calls with multiple Bag arguments, corresponding partitions should have the same length; if they do not, the call will error at compute time.
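To make the partition-length requirement concrete, here is a toy model in plain Python. It is not dask's implementation: each "bag" is simply a list of partitions (lists), and `map_elementwise` is a hypothetical helper that pairs up corresponding partitions and refuses to proceed when their lengths differ.

```python
from operator import add


def map_elementwise(func, *bags):
    """Toy model of an elementwise map; each 'bag' is a list of partitions (lists)."""
    # All bags must have the same number of partitions.
    if len({len(bag) for bag in bags}) > 1:
        raise ValueError("bags have different numbers of partitions")
    result = []
    for parts in zip(*bags):
        # Corresponding partitions must hold the same number of elements.
        if len({len(part) for part in parts}) > 1:
            raise ValueError("corresponding partitions differ in length")
        result.append([func(*items) for items in zip(*parts)])
    return result


left = [[0, 1, 2], [3, 4]]
right = [[5, 6, 7], [8, 9]]
print(map_elementwise(add, left, right))  # [[5, 7, 9], [11, 13]]

# Same elements overall, but split differently across partitions.
skewed = [[5, 6], [7, 8, 9]]
try:
    map_elementwise(add, left, skewed)
except ValueError as exc:
    print("error:", exc)
```

The second call fails even though both inputs hold five elements, because elements are paired partition by partition rather than across the bag as a whole.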
Examples
>>> import dask.bag as db
>>> b = db.from_sequence(range(5), npartitions=2)
>>> b2 = db.from_sequence(range(5, 10), npartitions=2)
Apply a function to all elements in a bag:
>>> b.map(lambda x: x + 1).compute()
[1, 2, 3, 4, 5]
Apply a function with arguments from multiple bags:
>>> from operator import add
>>> b.map(add, b2).compute()
[5, 7, 9, 11, 13]
Non-bag arguments are broadcast across all calls to the mapped function:
>>> b.map(add, 1).compute()
[1, 2, 3, 4, 5]
Keyword arguments are also supported, and have the same semantics as regular arguments:
>>> def myadd(x, y=0):
...     return x + y
>>> b.map(myadd, y=b2).compute()
[5, 7, 9, 11, 13]
>>> b.map(myadd, y=1).compute()
[1, 2, 3, 4, 5]
Both arguments and keyword arguments can also be instances of dask.bag.Item. Here we’ll add the max value in the bag to each element:
>>> b.map(myadd, b.max()).compute()
[4, 5, 6, 7, 8]
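The three kinds of argument shown above (Bag, Item, and plain object) can also be mixed in a single call. The sketch below assumes dask is installed; `scaled_offset` is a hypothetical helper written for this illustration, and the synchronous scheduler is chosen only to keep the example single-process.

```python
import dask.bag as db


def scaled_offset(x, y, offset=0, scale=1):
    # Hypothetical helper: combine two elements, shift, then scale.
    return (x + y + offset) * scale


b = db.from_sequence(range(5), npartitions=2)
b2 = db.from_sequence(range(5, 10), npartitions=2)

# b2 is paired elementwise with b; b.max() is an Item and 2 is a plain
# object, so both are broadcast to every call of scaled_offset.
combined = b.map(scaled_offset, b2, offset=b.max(), scale=2)
result = combined.compute(scheduler="synchronous")
print(result)  # [18, 22, 26, 30, 34]
```

Here `b.max()` evaluates to 4, so each element is computed as `(x + y + 4) * 2`.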