dask.bag.map

dask.bag.map(func, *args, **kwargs)

Apply a function elementwise across one or more bags.

Note that all Bag arguments must be partitioned identically.

Parameters
func : callable
*args, **kwargs : Bag, Item, Delayed, or object

Arguments and keyword arguments to pass to func. Non-Bag args/kwargs are broadcast across all calls to func.
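
As a rough mental model of these semantics, the plain-Python sketch below pairs up elements of list-like collections and repeats every other value. It is an analogy only, not dask's implementation: `ToyBag` and `toy_map` are hypothetical names, and the sketch ignores partitioning and laziness entirely.

```python
from itertools import repeat
from operator import add


class ToyBag(list):
    """Hypothetical stand-in for a Bag: a list that toy_map treats as a collection."""


def toy_map(func, *args, **kwargs):
    """Call func once per element, zipping ToyBag arguments and repeating the rest."""
    bags = [v for v in (*args, *kwargs.values()) if isinstance(v, ToyBag)]
    if not bags:
        raise ValueError("toy_map needs at least one ToyBag argument")
    n = len(bags[0])
    if any(len(bag) != n for bag in bags):
        raise ValueError("all ToyBag arguments must have the same length")

    def stream(value):
        # Collections are consumed element by element; anything else is broadcast.
        return iter(value) if isinstance(value, ToyBag) else repeat(value)

    arg_streams = [stream(a) for a in args]
    kwarg_streams = {k: stream(v) for k, v in kwargs.items()}

    results = []
    for _ in range(n):
        call_args = [next(s) for s in arg_streams]
        call_kwargs = {k: next(s) for k, s in kwarg_streams.items()}
        results.append(func(*call_args, **call_kwargs))
    return results


xs = ToyBag(range(5))
ys = ToyBag(range(5, 10))

paired = toy_map(add, xs, ys)   # elements are combined position by position
shifted = toy_map(add, xs, 1)   # the scalar 1 is reused for every call
```

The same rule is applied to keyword arguments, which is why a collection or a scalar can be passed by name as well as by position.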

Notes

For calls with multiple Bag arguments, corresponding partitions must have the same length; if they do not, the call raises an error at compute time.
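
One way to check this condition before mapping over several bags is to count the elements in each partition. The sketch below is illustrative: `partition_lengths` is a hypothetical helper, not part of dask, and `scheduler="synchronous"` is passed only to keep the example in a single process.

```python
import dask.bag as db


def partition_lengths(bag):
    """Return the number of elements in each partition of ``bag`` (hypothetical helper)."""
    # map_partitions hands each partition to the function as an iterable;
    # emitting a one-element list per partition yields one count per partition.
    counts = bag.map_partitions(lambda part: [sum(1 for _ in part)])
    return counts.compute(scheduler="synchronous")


b = db.from_sequence(range(5), npartitions=2)
b2 = db.from_sequence(range(5, 10), npartitions=2)

lengths_b = partition_lengths(b)
lengths_b2 = partition_lengths(b2)

# The bags line up only if they have the same number of partitions
# and each pair of corresponding partitions has the same length.
compatible = b.npartitions == b2.npartitions and lengths_b == lengths_b2
```

Note that this check triggers a computation of its own, so it is probably best suited to debugging rather than routine use.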

Examples

>>> import dask.bag as db
>>> b = db.from_sequence(range(5), npartitions=2)
>>> b2 = db.from_sequence(range(5, 10), npartitions=2)

Apply a function to all elements in a bag:

>>> db.map(lambda x: x + 1, b).compute()
[1, 2, 3, 4, 5]

Apply a function with arguments from multiple bags:

>>> from operator import add
>>> db.map(add, b, b2).compute()
[5, 7, 9, 11, 13]

Non-bag arguments are broadcast across all calls to the mapped function:

>>> db.map(add, b, 1).compute()
[1, 2, 3, 4, 5]

Keyword arguments are also supported and have the same semantics as positional arguments:

>>> def myadd(x, y=0):
...     return x + y
>>> db.map(myadd, b, y=b2).compute()
[5, 7, 9, 11, 13]
>>> db.map(myadd, b, y=1).compute()
[1, 2, 3, 4, 5]

Both arguments and keyword arguments can also be instances of dask.bag.Item or dask.delayed.Delayed. Here we’ll add the max value in the bag to each element:

>>> db.map(myadd, b, b.max()).compute()
[4, 5, 6, 7, 8]
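
The example above passes an Item. A Delayed value can be used in the same position; the sketch below assumes a small delayed computation bound to the illustrative name `offset`, and passes `scheduler="synchronous"` only to keep the example in a single process.

```python
from operator import add

import dask.bag as db
from dask import delayed

b = db.from_sequence(range(5), npartitions=2)

# A lazily evaluated scalar; `offset` is an illustrative name, not a dask API.
offset = delayed(sum)([4, 6])

# The Delayed value is broadcast to every call of `add`, like any non-Bag argument.
result = db.map(add, b, offset).compute(scheduler="synchronous")
```

Since `offset` evaluates to 10, each element of the bag should come back increased by 10.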