Contents

Changelog

2022.12.0

Released on December 2, 2022

Enhancements

Bug Fixes

Maintenance

2022.11.1

Released on November 18, 2022

Enhancements

Maintenance

2022.11.0

Released on November 15, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.10.2

Released on October 31, 2022

This was a hotfix and has no changes in this repository. The necessary fix was in dask/distributed, but we decided to bump this version number for consistency.

2022.10.1

Released on October 28, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.10.0

Released on October 14, 2022

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2022.9.2

Released on September 30, 2022

Enhancements

Documentation

Maintenance

2022.9.1

Released on September 16, 2022

New Features

Enhancements

Bug Fixes

Deprecations

  • Allow split_out to be None, which then defaults to 1 in groupby().aggregate() (GH#9491) Ian Rose

Documentation

Maintenance

2022.9.0

Released on September 2, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.8.1

Released on August 19, 2022

New Features

Enhancements

Bug Fixes

Documentation

Maintenance

2022.8.0

Released on August 5, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.7.1

Released on July 22, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.7.0

Released on July 8, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.6.1

Released on June 24, 2022

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.6.0

Released on June 10, 2022

Enhancements

Bug Fixes

Documentation

Maintenance

2022.05.2

Released on May 26, 2022

Enhancements

Documentation

Maintenance

2022.05.1

Released on May 24, 2022

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.05.0

Released on May 2, 2022

Highlights

This is a bugfix release for this issue.

Documentation

2022.04.2

Released on April 29, 2022

Highlights

This release includes several deprecations/breaking API changes to dask.dataframe.read_parquet and dask.dataframe.to_parquet:

  • to_parquet no longer writes _metadata files by default. If you want to write a _metadata file, you can pass in write_metadata_file=True.

  • read_parquet now defaults to split_row_groups=False, which results in one Dask dataframe partition per parquet file when reading in a parquet dataset. If you’re working with large parquet files you may need to set split_row_groups=True to reduce your partition size.

  • read_parquet no longer calculates divisions by default. If you require read_parquet to return dataframes with known divisions, please set calculate_divisions=True.

  • read_parquet has deprecated the gather_statistics keyword argument. Please use the calculate_divisions keyword argument instead.

  • read_parquet has deprecated the require_extensions keyword argument. Please use the parquet_file_extension keyword argument instead.
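
For code that still passes the deprecated keywords, the two renames listed above can be sketched as a small translation step. The helper name migrate_read_parquet_kwargs is hypothetical and is not part of Dask. The sketch assumes the old values carry over unchanged under the new names, which should be checked against the read_parquet documentation for the installed version.

```python
# Hypothetical helper (not part of Dask): rename deprecated read_parquet
# keyword arguments to the replacements named in the list above.
# Assumption: values carry over unchanged under the new names.
RENAMED = {
    "gather_statistics": "calculate_divisions",
    "require_extensions": "parquet_file_extension",
}


def migrate_read_parquet_kwargs(kwargs):
    """Return a copy of kwargs with deprecated names replaced."""
    migrated = {}
    for name, value in kwargs.items():
        new_name = RENAMED.get(name, name)
        if new_name in migrated:
            # The same option was supplied under both its old and new name.
            raise ValueError(f"{new_name!r} given under both old and new names")
        migrated[new_name] = value
    return migrated


old_style = {"gather_statistics": True, "split_row_groups": True}
new_style = migrate_read_parquet_kwargs(old_style)
```

The migrated dictionary could then be passed along as dask.dataframe.read_parquet(path, **new_style).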

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.04.1

Released on April 15, 2022

New Features

  • Add missing NumPy ufuncs: abs, left_shift, right_shift, positive. (GH#8920) Tom White

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.04.0

Released on April 1, 2022

Note

This is the first release with support for Python 3.10.

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.03.0

Released on March 18, 2022

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.02.1

Released on February 25, 2022

New Features

  • Add aggregate functions first and last to dask.dataframe.pivot_table (GH#8649) Knut Nordanger

  • Add std() support for datetime64 dtype for pandas-like objects (GH#8523) Ben Glossner

  • Add materialized task counts to HighLevelGraph and Layer html reprs (GH#8589) kori73

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.02.0

Released on February 11, 2022

Note

This is the last release with support for Python 3.7.

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.01.1

Released on January 28, 2022

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2022.01.0

Released on January 14, 2022

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2021.12.0

Released on December 10, 2021

New Features

Enhancements

Bug Fixes

Deprecations

Documentation

Maintenance

2021.11.2

Released on November 19, 2021

2021.11.1

Released on November 8, 2021

Patch release to update distributed dependency to version 2021.11.1.

2021.11.0

Released on November 5, 2021

2021.10.0

Released on October 22, 2021

2021.09.1

Released on September 21, 2021

2021.09.0

Released on September 3, 2021

2021.08.1

Released on August 20, 2021

2021.08.0

Released on August 13, 2021

2021.07.2

Released on July 30, 2021

Note

This is the last release with support for NumPy 1.17 and pandas 0.25. Beginning with the next release, NumPy 1.18 and pandas 1.0 will be the minimum supported versions.

2021.07.1

Released on July 23, 2021

2021.07.0

Released on July 9, 2021

2021.06.2

Released on June 22, 2021

2021.06.1

Released on June 18, 2021

2021.06.0

Released on June 4, 2021

2021.05.1

Released on May 28, 2021

2021.05.0

Released on May 14, 2021

2021.04.1

Released on April 23, 2021

2021.04.0

Released on April 2, 2021

2021.03.1

Released on March 26, 2021

2021.03.0

Released on March 5, 2021

Note

This is the first release with support for Python 3.9 and the last release with support for Python 3.6.

2021.02.0

Released on February 5, 2021

2021.01.1

Released on January 22, 2021

2021.01.0

Released on January 15, 2021

2020.12.0

Released on December 10, 2020

Highlights

  • Switched to CalVer for versioning scheme.

  • Introduced new APIs for HighLevelGraph to enable sending high-level representations of task graphs to the distributed scheduler.

  • Introduced new HighLevelGraph layer objects including BasicLayer, Blockwise, BlockwiseIO, ShuffleLayer, and more.

  • Added support for applying custom Layer-level annotations like priority, retries, etc. with the dask.annotations context manager.

  • Updated minimum supported version of pandas to 0.25.0 and NumPy to 1.15.1.

  • Added support for the pyarrow.dataset API in read_parquet.

  • Several fixes to Dask Array’s SVD.

All changes

2.30.0 / 2020-10-06

Array

2.29.0 / 2020-10-02

Array

Bag

Core

DataFrame

Documentation

2.28.0 / 2020-09-25

Array

Core

DataFrame

2.27.0 / 2020-09-18

Array

Core

DataFrame

Documentation

2.26.0 / 2020-09-11

Array

Core

DataFrame

Documentation

2.25.0 / 2020-08-28

Core

DataFrame

Documentation

2.24.0 / 2020-08-22

Array

Dataframe

Core

2.23.0 / 2020-08-14

Array

Bag

Core

DataFrame

Documentation

2.22.0 / 2020-07-31

Array

Core

DataFrame

Documentation

2.21.0 / 2020-07-17

Array

Bag

Core

DataFrame

Documentation

2.20.0 / 2020-07-02

Array

DataFrame

Documentation

2.19.0 / 2020-06-19

Array

Core

DataFrame

Documentation

2.18.1 / 2020-06-09

Array

Core

Documentation

2.18.0 / 2020-06-05

Array

Bag

DataFrame

Documentation

2.17.2 / 2020-05-28

Core

DataFrame

2.17.1 / 2020-05-28

Array

Core

DataFrame

2.17.0 / 2020-05-26

Array

Bag

Core

DataFrame

Documentation

2.16.0 / 2020-05-08

Array

Core

DataFrame

Documentation

2.15.0 / 2020-04-24

Array

Core

DataFrame

Documentation

2.14.0 / 2020-04-03

Array

Core

DataFrame

Documentation

2.13.0 / 2020-03-25

Array

Bag

Core

DataFrame

Documentation

2.12.0 / 2020-03-06

Array

Core

DataFrame

Documentation

2.11.0 / 2020-02-19

Array

Bag

Core

DataFrame

Documentation

2.10.1 / 2020-01-30

2.10.0 / 2020-01-28

2.9.2 / 2020-01-16

Array

Core

DataFrame

Documentation

2.9.1 / 2019-12-27

Array

Core

DataFrame

Documentation

2.9.0 / 2019-12-06

Array

Core

DataFrame

Documentation

2.8.1 / 2019-11-22

Array

Core

DataFrame

Documentation

2.8.0 / 2019-11-14

Array

Bag

Core

DataFrame

Documentation

2.7.0 / 2019-11-08

This release drops support for Python 3.5.

Array

Core

DataFrame

Documentation

2.6.0 / 2019-10-15

Core

DataFrame

Documentation

2.5.2 / 2019-10-04

Array

DataFrame

Documentation

2.5.0 / 2019-09-27

Core

DataFrame

Documentation

2.4.0 / 2019-09-13

Array

Core

DataFrame

Documentation

2.3.0 / 2019-08-16

Array

Bag

Core

DataFrame

Documentation

2.2.0 / 2019-08-01

Array

Bag

Core

DataFrame

Documentation

2.1.0 / 2019-07-08

Array

Core

DataFrame

Documentation

2.0.0 / 2019-06-25

Array

Core

DataFrame

Documentation

1.2.2 / 2019-05-08

Array

Bag

Core

DataFrame

Documentation

1.2.1 / 2019-04-29

Array

Core

DataFrame

Documentation

1.2.0 / 2019-04-12

Array

Core

DataFrame

Documentation

1.1.5 / 2019-03-29

Array

Core

DataFrame

Documentation

1.1.4 / 2019-03-08

Array

Core

DataFrame

Documentation

1.1.3 / 2019-03-01

Array

DataFrame

Documentation

1.1.2 / 2019-02-25

Array

Bag

DataFrame

Documentation

Core

1.1.1 / 2019-01-31

Array

DataFrame

Delayed

Documentation

Core

  • Work around psutil 5.5.0 not allowing pickling Process objects Janne Vuorela

1.1.0 / 2019-01-18

Array

DataFrame

Documentation

Core

1.0.0 / 2018-11-28

Array

DataFrame

Documentation

Core

0.20.2 / 2018-11-15

Array

Dataframe

Documentation

0.20.1 / 2018-11-09

Array

Core

Dataframe

Documentation

0.20.0 / 2018-10-26

Array

Bag

Core

Dataframe

Documentation

0.19.4 / 2018-10-09

Array

Bag

Dataframe

Core

Documentation

0.19.3 / 2018-10-05

Array

Bag

Dataframe

Core

Documentation

0.19.2 / 2018-09-17

Array

Core

Documentation

0.19.1 / 2018-09-06

Array

Dataframe

Documentation

0.19.0 / 2018-08-29

Array

DataFrame

Core

Docs

0.18.2 / 2018-07-23

Array

Bag

Dataframe

Delayed

Core

0.18.1 / 2018-06-22

Array

DataFrame

Core

0.18.0 / 2018-06-14

Array

Dataframe

Bag

Core

0.17.5 / 2018-05-16

Array

DataFrame

0.17.4 / 2018-05-03

Dataframe

0.17.3 / 2018-05-02

Array

DataFrame

Core

  • Support traversing collections in persist, visualize, and optimize (GH#3410) Jim Crist

  • Add scheduler= keyword to compute and persist. This replaces common use of the get= keyword (GH#3448) Matthew Rocklin

0.17.2 / 2018-03-21

Array

DataFrame

Bag

Core

0.17.1 / 2018-02-22

Array

DataFrame

Core

0.17.0 / 2018-02-09

Array

DataFrame

Bag

  • Document that the function passed to bag.map_partitions may receive either a list or a generator. (GH#3150) Nir

Core

0.16.1 / 2018-01-09

Array

DataFrame

Core

0.16.0 / 2017-11-17

This is a major release. It includes breaking changes, new protocols, and a large number of bug fixes.

Array

DataFrame

Core

0.15.4 / 2017-10-06

Array

  • da.random.choice now works with array arguments (GH#2781)

  • Support indexing in arrays with np.int (fixes regression) (GH#2719)

  • Handle zero dimension with rechunking (GH#2747)

  • Support -1 as an alias for “size of the dimension” in chunks (GH#2749)

  • Call mkdir in array.to_npy_stack (GH#2709)
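
To illustrate the -1 alias in chunks noted in the list above, here is a minimal sketch in plain Python. It is not Dask's own implementation: the function name expand_chunk_sizes is hypothetical, and it assumes one integer chunk size per dimension.

```python
def expand_chunk_sizes(shape, chunks):
    """Replace -1 in a per-dimension chunk size with that dimension's full size.

    Illustrative only: assumes `chunks` holds one integer per dimension.
    """
    if len(shape) != len(chunks):
        raise ValueError("shape and chunks must have the same length")
    return tuple(dim if size == -1 else size for dim, size in zip(shape, chunks))


# The second dimension is left unsplit: its chunk size becomes 200.
resolved = expand_chunk_sizes((1000, 200), (100, -1))
```

With Dask installed, the corresponding request would be written as da.ones((1000, 200), chunks=(100, -1)).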

DataFrame

  • Added the .str accessor to Categoricals with string categories (GH#2743)

  • Support int96 (spark) datetimes in parquet writer (GH#2711)

  • Pass on file scheme to fastparquet (GH#2714)

  • Support Pandas 0.21 (GH#2737)

Bag

  • Add tree reduction support for foldby (GH#2710)

Core

  • Drop s3fs from pip install dask[complete] (GH#2750)

0.15.3 / 2017-09-24

Array

  • Add masked arrays (GH#2301)

  • Add *_like array creation functions (GH#2640)

  • Indexing with unsigned integer array (GH#2647)

  • Improved slicing with boolean arrays of different dimensions (GH#2658)

  • Support literals in top and atop (GH#2661)

  • Optional axis argument in cumulative functions (GH#2664)

  • Improve tests on scalars with assert_eq (GH#2681)

  • Fix norm keepdims (GH#2683)

  • Add ptp (GH#2691)

  • Add apply_along_axis (GH#2690) and apply_over_axes (GH#2702)

DataFrame

  • Added Series.str[index] (GH#2634)

  • Allow the groupby by param to handle columns and index levels (GH#2636)

  • DataFrame.to_csv and Bag.to_textfiles now return the filenames to which they have written (GH#2655)

  • Fix combination of partition_on and append in to_parquet (GH#2645)

  • Fix for parquet file schemes (GH#2667)

  • Repartition works with mixed categoricals (GH#2676)

Core

  • python setup.py test now runs tests (GH#2641)

  • Added new cheatsheet (GH#2649)

  • Remove resize tool in Bokeh plots (GH#2688)

0.15.2 / 2017-08-25

Array

  • Remove spurious keys from map_overlap graph (GH#2520)

  • where works with non-bool condition and scalar values (GH#2543) (GH#2549)

  • Improve compress (GH#2541) (GH#2545) (GH#2555)

  • Add argwhere, _nonzero, and where(cond) (GH#2539)

  • Generalize vindex in dask.array to handle multi-dimensional indices (GH#2573)

  • Add choose method (GH#2584)

  • Split code into reorganized files (GH#2595)

  • Add linalg.norm (GH#2597)

  • Add diff, ediff1d (GH#2607), (GH#2609)

  • Improve dtype inference and reflection (GH#2571)

Bag

  • Remove deprecated Bag behaviors (GH#2525)

DataFrame

Core

  • Remove bare except: blocks everywhere (GH#2590)

0.15.1 / 2017-07-08

  • Add storage_options to to_textfiles and to_csv (GH#2466)

  • Rechunk and simplify rfftfreq (GH#2473), (GH#2475)

  • Better support ndarray subclasses (GH#2486)

  • Import star in dask.distributed (GH#2503)

  • Threadsafe cache handling with tokenization (GH#2511)

0.15.0 / 2017-06-09

Array

  • Add dask.array.stats submodule (GH#2269)

  • Support ufunc.outer (GH#2345)

  • Optimize fancy indexing by reducing graph overhead (GH#2333) (GH#2394)

  • Faster array tokenization using