API
Create Bags
from_sequence | Create a dask Bag from a Python sequence. |
from_delayed | Create a bag from many dask Delayed objects. |
from_url | Create a dask Bag from a URL. |
range | Integers from zero up to, but not including, n. |
read_text | Read lines from text files. |
read_avro | Read a set of avro files. |
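
As a minimal sketch of the in-memory constructors summarized above, assuming they are exposed as `dask.bag.from_sequence` and `dask.bag.range`: the data and the partition counts are illustrative, and `scheduler="synchronous"` is passed only to keep the example single-process.

```python
import dask.bag as db

# Build a bag from an in-memory sequence, split across two partitions.
b = db.from_sequence([1, 2, 3, 4, 5, 6], npartitions=2)

# db.range(n, npartitions) holds the integers 0 .. n-1.
r = db.range(5, npartitions=2)

# Bags are lazy; compute() runs the task graph and returns a list.
squares = b.map(lambda x: x * x).compute(scheduler="synchronous")
numbers = r.compute(scheduler="synchronous")

print(b.npartitions, squares)
print(numbers)
```

The file-based readers (text, avro, URL) return the same kind of lazy bag, so the `map` and `compute` steps would look the same after them.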
From dataframe
DataFrame.to_bag | Create a Dask Bag from a Dask DataFrame |
Series.to_bag | Create a Dask Bag from a Series |
Top-level functions
concat | Concatenate many bags together, unioning all elements. |
map | Apply a function elementwise across one or more bags. |
map_partitions | Apply a function to every partition across one or more bags. |
to_textfiles | Write dask Bag to disk, one filename per partition, one line per element. |
zip | Partition-wise bag zip |
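
The sketch below exercises the module-level functions summarized above, assuming they are available as `dask.bag.concat`, `dask.bag.map`, `dask.bag.zip`, and `dask.bag.map_partitions`. The two input bags are illustrative. They share the same partition layout because the elementwise and zip operations pair partitions up.

```python
import dask
import dask.bag as db

a = db.from_sequence([1, 2, 3], npartitions=1)
b = db.from_sequence([10, 20, 30], npartitions=1)

combined = db.concat([a, b])                 # partitions of a, then of b
summed = db.map(lambda x, y: x + y, a, b)    # elementwise across both bags
paired = db.zip(a, b)                        # partition-wise zip into tuples
# One output element per partition: the sum of that partition.
partition_sums = db.map_partitions(lambda part: [sum(part)], combined)

with dask.config.set(scheduler="synchronous"):
    combined_list = combined.compute()
    summed_list = summed.compute()
    paired_list = paired.compute()
    partition_sums_list = partition_sums.compute()

print(combined_list, summed_list, paired_list, partition_sums_list)
```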
Random Sampling
random.choices | Return a k sized list of elements chosen with replacement. |
random.sample | Choose k unique random elements from a bag. |
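
A hedged sketch of the two sampling helpers, assuming they live in the `dask.bag.random` module as `choices` and `sample` and take the bag as the first argument. The results are random, so the example only checks sizes and membership rather than specific values.

```python
import dask
import dask.bag as db
from dask.bag import random as bag_random

b = db.range(20, npartitions=4)

with dask.config.set(scheduler="synchronous"):
    # Three distinct elements, drawn without replacement.
    sampled = bag_random.sample(b, k=3).compute()
    # Three elements drawn with replacement, so repeats are possible.
    chosen = bag_random.choices(b, k=3).compute()

print(sampled, chosen)
```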
Turn Bags into other things
Bag.to_textfiles | Write dask Bag to disk, one filename per partition, one line per element. |
Bag.to_dataframe | Create Dask Dataframe from a Dask Bag. |
Bag.to_delayed | Convert into a list of dask.delayed objects, one per partition. |
Bag.to_avro | Write bag to set of avro files |
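
To illustrate two of the conversions above, here is a minimal sketch assuming the methods are `Bag.to_delayed` and `Bag.to_textfiles`. The temporary directory and the `part-*.txt` pattern are illustrative choices. The `*` is replaced by the partition number, so each partition lands in its own file.

```python
import os
import tempfile

import dask
import dask.bag as db

b = db.from_sequence(["alpha", "beta", "gamma", "delta"], npartitions=2)

with dask.config.set(scheduler="synchronous"):
    # One Delayed object per partition; each computes to that partition's list.
    parts = b.to_delayed()
    first_partition = parts[0].compute()

    with tempfile.TemporaryDirectory() as tmpdir:
        # Writes one file per partition, one line per (string) element.
        paths = b.to_textfiles(os.path.join(tmpdir, "part-*.txt"))
        lines = []
        for path in sorted(paths):
            with open(path) as f:
                lines.extend(f.read().splitlines())

print(len(parts), first_partition, lines)
```

Elements must already be strings when written as text, so a bag of other objects would typically be mapped through `str` or `json.dumps` first.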
Bag Methods
Bag | Parallel collection of Python objects |
Bag.accumulate | Repeatedly apply binary function to a sequence, accumulating results. |
Bag.all | Are all elements truthy? |
Bag.any | Are any of the elements truthy? |
Bag.compute | Compute this dask collection |
Bag.count | Count the number of elements. |
Bag.distinct | Distinct elements of collection |
Bag.filter | Filter elements in collection by a predicate function. |
Bag.flatten | Concatenate nested lists into one long list. |
Bag.fold | Parallelizable reduction |
Bag.foldby | Combined reduction and groupby. |
Bag.frequencies | Count number of occurrences of each distinct element. |
Bag.groupby | Group collection by key function |
Bag.join | Joins collection with another collection. |
Bag.map | Apply a function elementwise across one or more bags. |
Bag.map_partitions | Apply a function to every partition across one or more bags. |
Bag.max | Maximum element |
Bag.mean | Arithmetic mean |
Bag.min | Minimum element |
Bag.persist | Persist this dask collection into memory |
Bag.pluck | Select item from all tuples/dicts in collection. |
Bag.product | Cartesian product between two bags. |
Bag.reduction | Reduce collection with reduction operators. |
Bag.random_sample | Return elements from bag with probability of prob. |
Bag.remove | Remove elements in collection that match predicate. |
Bag.repartition | Repartition Bag across new divisions. |
Bag.starmap | Apply a function using argument tuples from the given bag. |
Bag.std | Standard deviation |
Bag.sum | Sum all elements |
Bag.take | Take the first k elements. |
Bag.to_avro | Write bag to set of avro files |
Bag.to_dataframe | Create Dask Dataframe from a Dask Bag. |
Bag.to_delayed | Convert into a list of dask.delayed objects, one per partition. |
Bag.to_textfiles | Write dask Bag to disk, one filename per partition, one line per element. |
Bag.topk | K largest elements in collection |
Bag.var | Variance |
Bag.visualize | Render the computation of this object's task graph using graphviz. |
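
The following sketch chains several of the methods summarized above (`pluck`, `filter`, `sum`, `topk`, `take`, `frequencies`, `foldby`) on a small bag of dictionaries. The records and field names are invented for illustration, and the method names are assumed to match the summaries. `foldby` is shown rather than `groupby` because it reduces each group as it goes instead of collecting the groups first.

```python
import dask
import dask.bag as db

records = [
    {"name": "Alice", "team": "red", "score": 3},
    {"name": "Bob", "team": "blue", "score": 5},
    {"name": "Carol", "team": "red", "score": 7},
    {"name": "Dan", "team": "blue", "score": 1},
]
b = db.from_sequence(records, npartitions=2)
scores = b.pluck("score")  # select one field from every dict

with dask.config.set(scheduler="synchronous"):
    total = scores.sum().compute()
    top2 = scores.topk(2).compute()
    high_scorers = b.filter(lambda r: r["score"] > 2).pluck("name").compute()
    first_two = b.pluck("name").take(2)  # take() computes eagerly, returns a tuple
    team_counts = dict(b.pluck("team").frequencies().compute())
    # Per-team score totals: binop folds records within a partition,
    # combine merges the partial totals across partitions.
    team_totals = dict(
        b.foldby(
            key=lambda r: r["team"],
            binop=lambda acc, r: acc + r["score"],
            initial=0,
            combine=lambda x, y: x + y,
            combine_initial=0,
        ).compute()
    )

print(total, top2, high_scorers, first_two, team_counts, team_totals)
```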
Item Methods
Item | |
Item.apply | |
Item.compute | Compute this dask collection |
Item.from_delayed | Create bag item from a dask.delayed value. |
Item.persist | Persist this dask collection into memory |
Item.to_delayed | Convert into a dask.delayed object. |
Item.visualize | Render the computation of this object's task graph using graphviz. |
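
A minimal sketch of where an Item comes from and how the methods above might be used, assuming the class is importable as `dask.bag.Item` and that reductions such as `Bag.sum` return one. The values are illustrative.

```python
import dask
import dask.bag as db
from dask.bag import Item

b = db.from_sequence([1, 2, 3, 4], npartitions=2)

# Reductions on a bag return a lazy Item rather than a concrete value.
total = b.sum()
doubled = total.apply(lambda x: x * 2)   # still lazy
as_delayed = total.to_delayed()          # a single dask.delayed object

# An Item can also be built from an existing delayed value.
from_delayed = Item.from_delayed(dask.delayed(sum)([1, 2, 3]))

with dask.config.set(scheduler="synchronous"):
    total_value = total.compute()
    doubled_value = doubled.compute()
    delayed_value = as_delayed.compute()
    from_delayed_value = from_delayed.compute()

print(total_value, doubled_value, delayed_value, from_delayed_value)
```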