- dask.bag.from_sequence(seq, partition_size=None, npartitions=None)
Create a dask Bag from a Python sequence.
This sequence should be relatively small in memory. Dask Bag works best when it handles loading your data itself. A common pattern is to load a sequence of filenames into a Bag and then use ``.map`` to open them.
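The filename pattern described above can be sketched as follows. This is a minimal illustration, not part of the original docstring: the helper ``read_text_file``, the temporary files, and the choice of the synchronous scheduler are all assumptions made to keep the example self-contained.

```python
import os
import tempfile

import dask.bag as db


def read_text_file(path):
    # Hypothetical helper: load one file's contents as a string.
    with open(path) as f:
        return f.read()


# Write three small files so the sketch can run on its own.
tmpdir = tempfile.mkdtemp()
filenames = []
for i in range(3):
    path = os.path.join(tmpdir, f"part-{i}.txt")
    with open(path, "w") as f:
        f.write(f"contents of file {i}")
    filenames.append(path)

# The Bag itself holds only the short list of paths.
bag = db.from_sequence(filenames, npartitions=3)

# Files are opened only when the graph is computed, inside the mapped tasks.
# The synchronous scheduler is an assumption chosen to keep the demo simple.
contents = bag.map(read_text_file).compute(scheduler="synchronous")
```

Only the paths are stored in the graph, so the in-memory sequence stays small even when the files themselves are large.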
- seq: Iterable
A sequence of elements to put into the dask Bag
- partition_size: int (optional)
The length of each partition
- npartitions: int (optional)
The number of desired partitions
- It is best to provide either ``partition_size`` or ``npartitions`` (though not both).
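As a rough illustration of the two options, the sketch below builds the same six-element bag once with ``partition_size`` and once with ``npartitions``. The list of names is made up for the example. For this input both calls should yield three partitions, though the resulting partition count may not always match the requested ``npartitions`` exactly for other sequence lengths.

```python
import dask.bag as db

# Illustrative data, not from the original docstring.
names = ["Alice", "Bob", "Chuck", "Dan", "Edith", "Frank"]

# Fix the length of each partition: 6 elements / 2 per partition.
by_size = db.from_sequence(names, partition_size=2)

# Or request a number of partitions and let dask pick the partition length.
by_count = db.from_sequence(names, npartitions=3)

print(by_size.npartitions, by_count.npartitions)

# Either way, the elements come back in their original order.
result = by_count.compute(scheduler="synchronous")
```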
See also ``read_text``: create a bag from text files
>>> import dask.bag as db
>>> b = db.from_sequence(['Alice', 'Bob', 'Chuck'], partition_size=2)