Dask DataFrame API (legacy)

DataFrame

DataFrame(dsk, name, meta, divisions)

Parallel Pandas DataFrame

DataFrame.abs()

Return a Series/DataFrame with absolute numeric value of each element.

DataFrame.add(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator add).

DataFrame.align(other[, join, axis, fill_value])

Align two objects on their axes with the specified join method.

DataFrame.all([axis, skipna, split_every, out])

Return whether all elements are True, potentially over an axis.

DataFrame.any([axis, skipna, split_every, out])

Return whether any element is True, potentially over an axis.

DataFrame.apply(func[, axis, broadcast, ...])

Parallel version of pandas.DataFrame.apply
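
A minimal sketch of a row-wise apply; the column names and the meta hint are illustrative, not required by the API:

    import pandas as pd
    import dask.dataframe as dd

    ddf = dd.from_pandas(pd.DataFrame({"x": [1, 2], "y": [10, 20]}), npartitions=1)
    # meta tells Dask the output name/dtype so it can skip inference
    out = ddf.apply(lambda row: row.x + row.y, axis=1, meta=("total", "int64"))
    print(out.compute())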

DataFrame.applymap(func[, meta])

Apply a function to a Dataframe elementwise.

DataFrame.assign(**kwargs)

Assign new columns to a DataFrame.

DataFrame.astype(dtype)

Cast a pandas object to a specified dtype.

DataFrame.bfill([axis, limit])

Fill NA/NaN values by using the next valid observation to fill the gap.

DataFrame.categorize([columns, index, ...])

Convert columns of the DataFrame to category dtype.

DataFrame.columns

DataFrame.compute(**kwargs)

Compute this dask collection

DataFrame.copy([deep])

Make a copy of the dataframe

DataFrame.corr([method, min_periods, ...])

Compute pairwise correlation of columns, excluding NA/null values.

DataFrame.count([axis, split_every, ...])

Count non-NA cells for each column or row.

DataFrame.cov([min_periods, numeric_only, ...])

Compute pairwise covariance of columns, excluding NA/null values.

DataFrame.cummax([axis, skipna, out])

Return cumulative maximum over a DataFrame or Series axis.

DataFrame.cummin([axis, skipna, out])

Return cumulative minimum over a DataFrame or Series axis.

DataFrame.cumprod([axis, skipna, dtype, out])

Return cumulative product over a DataFrame or Series axis.

DataFrame.cumsum([axis, skipna, dtype, out])

Return cumulative sum over a DataFrame or Series axis.

DataFrame.describe([split_every, ...])

Generate descriptive statistics.

DataFrame.diff([periods, axis])

First discrete difference of element.

DataFrame.div(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

DataFrame.divide(other[, axis, level, ...])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

DataFrame.drop([labels, axis, columns, errors])

Drop specified labels from rows or columns.

DataFrame.drop_duplicates([subset, ...])

Return DataFrame with duplicate rows removed.

DataFrame.dropna([how, subset, thresh])

Remove missing values.

DataFrame.dtypes

Return data types

DataFrame.eq(other[, axis, level])

Get Equal to of dataframe and other, element-wise (binary operator eq).

DataFrame.eval(expr[, inplace])

Evaluate a string describing operations on DataFrame columns.

DataFrame.explode(column)

Transform each element of a list-like to a row, replicating index values.

DataFrame.ffill([axis, limit])

Fill NA/NaN values by propagating the last valid observation to next valid.

DataFrame.fillna([value, method, limit, axis])

Fill NA/NaN values using the specified method.

DataFrame.first(offset)

Select initial periods of time series data based on a date offset.

DataFrame.floordiv(other[, axis, level, ...])

Get Integer division of dataframe and other, element-wise (binary operator floordiv).

DataFrame.ge(other[, axis, level])

Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).

DataFrame.get_partition(n)

Get a dask DataFrame/Series representing the nth partition.

DataFrame.groupby([by, group_keys, sort, ...])

Group DataFrame using a mapper or by a Series of columns.
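
A minimal split-apply-combine sketch; the column names and values are made up:

    import pandas as pd
    import dask.dataframe as dd

    ddf = dd.from_pandas(
        pd.DataFrame({"k": ["a", "b", "a", "b"], "v": [1, 2, 3, 4]}),
        npartitions=2,
    )
    # aggregations run per partition and are then combined
    print(ddf.groupby("k").v.sum().compute())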

DataFrame.gt(other[, axis, level])

Get Greater than of dataframe and other, element-wise (binary operator gt).

DataFrame.head([n, npartitions, compute])

First n rows of the dataset

DataFrame.idxmax([axis, skipna, ...])

Return index of first occurrence of maximum over requested axis.

DataFrame.idxmin([axis, skipna, ...])

Return index of first occurrence of minimum over requested axis.

DataFrame.iloc

Purely integer-location based indexing for selection by position.

DataFrame.index

Return dask Index instance

DataFrame.info([buf, verbose, memory_usage])

Concise summary of a Dask DataFrame.

DataFrame.isin(values)

Whether each element in the DataFrame is contained in values.

DataFrame.isna()

Detect missing values.

DataFrame.isnull()

DataFrame.isnull is an alias for DataFrame.isna.

DataFrame.items()

Iterate over (column name, Series) pairs.

DataFrame.iterrows()

Iterate over DataFrame rows as (index, Series) pairs.

DataFrame.itertuples([index, name])

Iterate over DataFrame rows as namedtuples.

DataFrame.join(other[, on, how, lsuffix, ...])

Join columns of another DataFrame.

DataFrame.known_divisions

Whether divisions are already known

DataFrame.last(offset)

Select final periods of time series data based on a date offset.

DataFrame.le(other[, axis, level])

Get Less than or equal to of dataframe and other, element-wise (binary operator le).

DataFrame.loc

Purely label-location based indexer for selection by label.

DataFrame.lt(other[, axis, level])

Get Less than of dataframe and other, element-wise (binary operator lt).

DataFrame.map_partitions(func, *args, **kwargs)

Apply Python function on each DataFrame partition.
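
A minimal sketch; func receives one ordinary pandas partition at a time (data illustrative):

    import pandas as pd
    import dask.dataframe as dd

    ddf = dd.from_pandas(pd.DataFrame({"x": range(6)}), npartitions=3)
    # inside func, each partition is a plain pandas DataFrame
    doubled = ddf.map_partitions(lambda part: part.assign(x2=part.x * 2))
    print(doubled.compute())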

DataFrame.mask(cond[, other])

Replace values where the condition is True.

DataFrame.max([axis, skipna, split_every, ...])

Return the maximum of the values over the requested axis.

DataFrame.mean([axis, skipna, split_every, ...])

Return the mean of the values over the requested axis.

DataFrame.median([axis, method])

Return the median of the values over the requested axis.

DataFrame.median_approximate([axis, method])

Return the approximate median of the values over the requested axis.

DataFrame.melt([id_vars, value_vars, ...])

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

DataFrame.memory_usage([index, deep])

Return the memory usage of each column in bytes.

DataFrame.memory_usage_per_partition([...])

Return the memory usage of each partition

DataFrame.merge(right[, how, on, left_on, ...])

Merge the DataFrame with another DataFrame

DataFrame.min([axis, skipna, split_every, ...])

Return the minimum of the values over the requested axis.

DataFrame.mod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator mod).

DataFrame.mode([dropna, split_every, ...])

Get the mode(s) of each element along the selected axis.

DataFrame.mul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator mul).

DataFrame.ndim

Return dimensionality

DataFrame.ne(other[, axis, level])

Get Not equal to of dataframe and other, element-wise (binary operator ne).

DataFrame.nlargest([n, columns, split_every])

Return the first n rows ordered by columns in descending order.

DataFrame.npartitions

Return number of partitions

DataFrame.nsmallest([n, columns, split_every])

Return the first n rows ordered by columns in ascending order.

DataFrame.partitions

Slice dataframe by partitions

DataFrame.persist(**kwargs)

Persist this dask collection into memory

DataFrame.pivot_table([index, columns, ...])

Create a spreadsheet-style pivot table as a DataFrame.

DataFrame.pop(item)

Return item and drop from frame.

DataFrame.pow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator pow).

DataFrame.prod([axis, skipna, split_every, ...])

Return the product of the values over the requested axis.

DataFrame.quantile([q, axis, numeric_only, ...])

Approximate row-wise and precise column-wise quantiles of DataFrame

DataFrame.query(expr, **kwargs)

Filter dataframe with complex expression

DataFrame.radd(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator radd).

DataFrame.random_split(frac[, random_state, ...])

Pseudorandomly split dataframe into different pieces row-wise

DataFrame.rdiv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

DataFrame.reduction(chunk[, aggregate, ...])

Generic row-wise reductions.

DataFrame.rename([index, columns])

Rename columns or index labels.

DataFrame.repartition([divisions, ...])

Repartition dataframe along new divisions
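
A short sketch: consolidating many small partitions after a selective filter (numbers illustrative):

    import pandas as pd
    import dask.dataframe as dd

    ddf = dd.from_pandas(pd.DataFrame({"x": range(100)}), npartitions=10)
    # the filter leaves little data per partition; consolidate it
    small = ddf[ddf.x % 10 == 0].repartition(npartitions=2)
    print(small.npartitions)  # 2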

DataFrame.replace([to_replace, value, regex])

Replace values given in to_replace with value.

DataFrame.resample(rule[, closed, label])

Resample time-series data.

DataFrame.reset_index([drop])

Reset the index to the default index.

DataFrame.rfloordiv(other[, axis, level, ...])

Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).

DataFrame.rmod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator rmod).

DataFrame.rmul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator rmul).

DataFrame.round([decimals])

Round a DataFrame to a variable number of decimal places.

DataFrame.rpow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator rpow).

DataFrame.rsub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator rsub).

DataFrame.rtruediv(other[, axis, level, ...])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

DataFrame.sample([n, frac, replace, ...])

Random sample of items

DataFrame.select_dtypes([include, exclude])

Return a subset of the DataFrame's columns based on the column dtypes.

DataFrame.sem([axis, skipna, ddof, ...])

Return unbiased standard error of the mean over requested axis.

DataFrame.set_index(other[, drop, sorted, ...])

Set the DataFrame index (row labels) using an existing column.
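
A short sketch, assuming the column is already sorted so sorted=True can skip the shuffle (data illustrative):

    import pandas as pd
    import dask.dataframe as dd

    ddf = dd.from_pandas(
        pd.DataFrame({"t": pd.date_range("2024-01-01", periods=8, freq="D"),
                      "v": range(8)}),
        npartitions=2,
    )
    # sorted=True asserts the column is presorted, avoiding a shuffle
    indexed = ddf.set_index("t", sorted=True)
    print(indexed.known_divisions)  # True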

DataFrame.shape

Return a tuple representing the dimensionality of the DataFrame.

DataFrame.shuffle(on[, npartitions, ...])

Rearrange DataFrame into new partitions

DataFrame.size

Size of the Series or DataFrame as a Delayed object.

DataFrame.sort_values(by[, npartitions, ...])

Sort the dataset by a single column.

DataFrame.squeeze([axis])

Squeeze 1 dimensional axis objects into scalars.

DataFrame.std([axis, skipna, ddof, ...])

Return sample standard deviation over requested axis.

DataFrame.sub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator sub).

DataFrame.sum([axis, skipna, split_every, ...])

Return the sum of the values over the requested axis.

DataFrame.tail([n, compute])

Last n rows of the dataset

DataFrame.to_backend([backend])

Move to a new DataFrame backend

DataFrame.to_bag([index, format])

Create Dask Bag from a Dask DataFrame

DataFrame.to_csv(filename, **kwargs)

Store Dask DataFrame to CSV files

DataFrame.to_dask_array([lengths, meta])

Convert a dask DataFrame to a dask array.

DataFrame.to_delayed([optimize_graph])

Convert into a list of dask.delayed objects, one per partition.

DataFrame.to_hdf(path_or_buf, key[, mode, ...])

Store Dask Dataframe to Hierarchical Data Format (HDF) files

DataFrame.to_html([max_rows])

Render a DataFrame as an HTML table.

DataFrame.to_json(filename, *args, **kwargs)

See dd.to_json docstring for more information

DataFrame.to_parquet(path, *args, **kwargs)

Store Dask.dataframe to Parquet files

DataFrame.to_records([index, lengths])

Create Dask Array from a Dask Dataframe

DataFrame.to_string([max_rows])

Render a DataFrame to a console-friendly tabular output.

DataFrame.to_sql(name, uri[, schema, ...])

See dd.to_sql docstring for more information

DataFrame.to_timestamp([freq, how, axis])

Cast to DatetimeIndex of timestamps, at beginning of period.

DataFrame.truediv(other[, axis, level, ...])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

DataFrame.values

Return a dask.array of the values of this dataframe

DataFrame.var([axis, skipna, ddof, ...])

Return unbiased variance over requested axis.

DataFrame.visualize([filename, format, ...])

Render the computation of this object's task graph using graphviz.

DataFrame.where(cond[, other])

Replace values where the condition is False.

Series

Series(dsk, name, meta, divisions)

Parallel Pandas Series

Series.add(other[, level, fill_value, axis])

Return Addition of series and other, element-wise (binary operator add).

Series.align(other[, join, axis, fill_value])

Align two objects on their axes with the specified join method.

Series.all([axis, skipna, split_every, out])

Return whether all elements are True, potentially over an axis.

Series.any([axis, skipna, split_every, out])

Return whether any element is True, potentially over an axis.

Series.apply(func[, convert_dtype, meta, args])

Parallel version of pandas.Series.apply

Series.astype(dtype)

Cast a pandas object to a specified dtype.

Series.autocorr([lag, split_every])

Compute the lag-N autocorrelation.

Series.between(left, right[, inclusive])

Return boolean Series equivalent to left <= series <= right.

Series.bfill([axis, limit])

Fill NA/NaN values by using the next valid observation to fill the gap.

Series.clear_divisions()

Forget division information

Series.clip([lower, upper, axis])

Trim values at input threshold(s).

Series.compute(**kwargs)

Compute this dask collection

Series.copy([deep])

Make a copy of the dataframe

Series.corr(other[, method, min_periods, ...])

Compute correlation with other Series, excluding missing values.

Series.count([split_every])

Return number of non-NA/null observations in the Series.

Series.cov(other[, min_periods, split_every])

Compute covariance with Series, excluding missing values.

Series.cummax([axis, skipna, out])

Return cumulative maximum over a DataFrame or Series axis.

Series.cummin([axis, skipna, out])

Return cumulative minimum over a DataFrame or Series axis.

Series.cumprod([axis, skipna, dtype, out])

Return cumulative product over a DataFrame or Series axis.

Series.cumsum([axis, skipna, dtype, out])

Return cumulative sum over a DataFrame or Series axis.

Series.describe([split_every, percentiles, ...])

Generate descriptive statistics.

Series.diff([periods, axis])

First discrete difference of element.

Series.div(other[, level, fill_value, axis])

Return Floating division of series and other, element-wise (binary operator truediv).

Series.drop_duplicates([subset, ...])

Return DataFrame with duplicate rows removed.

Series.dropna()

Return a new Series with missing values removed.

Series.dtype

Return data type

Series.eq(other[, level, fill_value, axis])

Return Equal to of series and other, element-wise (binary operator eq).

Series.explode()

Transform each element of a list-like to a row.

Series.ffill([axis, limit])

Fill NA/NaN values by propagating the last valid observation to next valid.

Series.fillna([value, method, limit, axis])

Fill NA/NaN values using the specified method.

Series.first(offset)

Select initial periods of time series data based on a date offset.

Series.floordiv(other[, level, fill_value, axis])

Return Integer division of series and other, element-wise (binary operator floordiv).

Series.ge(other[, level, fill_value, axis])

Return Greater than or equal to of series and other, element-wise (binary operator ge).

Series.get_partition(n)

Get a dask DataFrame/Series representing the nth partition.

Series.groupby([by, group_keys, sort, ...])

Group Series using a mapper or by a Series of columns.

Series.gt(other[, level, fill_value, axis])

Return Greater than of series and other, element-wise (binary operator gt).

Series.head([n, npartitions, compute])

First n rows of the dataset

Series.idxmax([axis, skipna, split_every, ...])

Return index of first occurrence of maximum over requested axis.

Series.idxmin([axis, skipna, split_every, ...])

Return index of first occurrence of minimum over requested axis.

Series.isin(values)

Whether elements in Series are contained in values.

Series.isna()

Detect missing values.

Series.isnull()

Series.isnull is an alias for Series.isna.

Series.known_divisions

Whether divisions are already known

Series.last(offset)

Select final periods of time series data based on a date offset.

Series.le(other[, level, fill_value, axis])

Return Less than or equal to of series and other, element-wise (binary operator le).

Series.loc

Purely label-location based indexer for selection by label.

Series.lt(other[, level, fill_value, axis])

Return Less than of series and other, element-wise (binary operator lt).

Series.map(arg[, na_action, meta])

Map values of Series according to an input mapping or function.

Series.map_overlap(func, before, after, ...)

Apply a function to each partition, sharing rows with adjacent partitions.
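
A minimal sketch: overlapping one row so a diff is correct across partition boundaries (data illustrative):

    import pandas as pd
    import dask.dataframe as dd

    s = dd.from_pandas(pd.Series(range(10)), npartitions=3)
    # before=1 prepends the last row of the previous partition, so
    # diff() is correct at the boundaries; the extra rows are trimmed
    out = s.map_overlap(lambda part: part.diff(), before=1, after=0)
    print(out.compute())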

Series.map_partitions(func, *args, **kwargs)

Apply Python function on each DataFrame partition.

Series.mask(cond[, other])

Replace values where the condition is True.

Series.max([axis, skipna, split_every, out, ...])

Return the maximum of the values over the requested axis.

Series.mean([axis, skipna, split_every, ...])

Return the mean of the values over the requested axis.

Series.median([method])

Return the median of the values over the requested axis.

Series.median_approximate([method])

Return the approximate median of the values over the requested axis.

Series.memory_usage([index, deep])

Return the memory usage of the Series.

Series.memory_usage_per_partition([index, deep])

Return the memory usage of each partition

Series.min([axis, skipna, split_every, out, ...])

Return the minimum of the values over the requested axis.

Series.mod(other[, level, fill_value, axis])

Return Modulo of series and other, element-wise (binary operator mod).

Series.mul(other[, level, fill_value, axis])

Return Multiplication of series and other, element-wise (binary operator mul).

Series.nbytes

Number of bytes

Series.ndim

Return dimensionality

Series.ne(other[, level, fill_value, axis])

Return Not equal to of series and other, element-wise (binary operator ne).

Series.nlargest([n, split_every])

Return the largest n elements.

Series.notnull()

Series.notnull is an alias for Series.notna.

Series.nsmallest([n, split_every])

Return the smallest n elements.

Series.nunique([split_every, dropna])

Return number of unique elements in the object.

Series.nunique_approx([split_every])

Approximate number of unique rows.

Series.persist(**kwargs)

Persist this dask collection into memory

Series.pipe(func, *args, **kwargs)

Apply chainable functions that expect Series or DataFrames.

Series.pow(other[, level, fill_value, axis])

Return Exponential power of series and other, element-wise (binary operator pow).

Series.prod([axis, skipna, split_every, ...])

Return the product of the values over the requested axis.

Series.quantile([q, method])

Approximate quantiles of Series

Series.radd(other[, level, fill_value, axis])

Return Addition of series and other, element-wise (binary operator radd).

Series.random_split(frac[, random_state, ...])

Pseudorandomly split dataframe into different pieces row-wise

Series.rdiv(other[, level, fill_value, axis])

Return Floating division of series and other, element-wise (binary operator rtruediv).

Series.reduction(chunk[, aggregate, ...])

Generic row-wise reductions.

Series.repartition([divisions, npartitions, ...])

Repartition dataframe along new divisions

Series.replace([to_replace, value, regex])

Replace values given in to_replace with value.

Series.rename([index, inplace, sorted_index])

Alter Series index labels or name

Series.resample(rule[, closed, label])

Resample time-series data.

Series.reset_index([drop])

Reset the index to the default index.

Series.rolling(window[, min_periods, ...])

Provides rolling transformations.

Series.round([decimals])

Round each value in a Series to the given number of decimals.

Series.sample([n, frac, replace, random_state])

Random sample of items

Series.sem([axis, skipna, ddof, ...])

Return unbiased standard error of the mean over requested axis.

Series.shape

Return a tuple representing the dimensionality of a Series.

Series.shift([periods, freq, axis])

Shift index by desired number of periods with an optional time freq.

Series.size

Size of the Series or DataFrame as a Delayed object.

Series.std([axis, skipna, ddof, ...])

Return sample standard deviation over requested axis.

Series.sub(other[, level, fill_value, axis])

Return Subtraction of series and other, element-wise (binary operator sub).

Series.sum([axis, skipna, split_every, ...])

Return the sum of the values over the requested axis.

Series.to_backend([backend])

Move to a new DataFrame backend

Series.to_bag([index, format])

Create a Dask Bag from a Series

Series.to_csv(filename, **kwargs)

Store Dask DataFrame to CSV files

Series.to_dask_array([lengths, meta])

Convert a dask DataFrame to a dask array.

Series.to_delayed([optimize_graph])

Convert into a list of dask.delayed objects, one per partition.

Series.to_frame([name])

Convert Series to DataFrame.

Series.to_hdf(path_or_buf, key[, mode, append])

Store Dask Dataframe to Hierarchical Data Format (HDF) files

Series.to_string([max_rows])

Render a string representation of the Series.

Series.to_timestamp([freq, how, axis])

Cast to DatetimeIndex of Timestamps, at beginning of period.

Series.truediv(other[, level, fill_value, axis])

Return Floating division of series and other, element-wise (binary operator truediv).

Series.unique([split_every, split_out])

Return Series of unique values in the object.

Series.value_counts([sort, ascending, ...])

Return a Series containing counts of unique values.

Series.values

Return a dask.array of the values of this dataframe

Series.var([axis, skipna, ddof, ...])

Return unbiased variance over requested axis.

Series.visualize([filename, format, ...])

Render the computation of this object's task graph using graphviz.

Series.where(cond[, other])

Replace values where the condition is False.

Index

Index(dsk, name, meta, divisions)

Index.add(other[, level, fill_value, axis])

Return Addition of series and other, element-wise (binary operator add).

Index.align(other[, join, axis, fill_value])

Align two objects on their axes with the specified join method.

Index.all([axis, skipna, split_every, out])

Return whether all elements are True, potentially over an axis.

Index.any([axis, skipna, split_every, out])

Return whether any element is True, potentially over an axis.

Index.apply(func[, convert_dtype, meta, args])

Parallel version of pandas.Series.apply

Index.astype(dtype)

Cast a pandas object to a specified dtype.

Index.autocorr([lag, split_every])

Compute the lag-N autocorrelation.

Index.between(left, right[, inclusive])

Return boolean Series equivalent to left <= series <= right.

Index.bfill([axis, limit])

Fill NA/NaN values by using the next valid observation to fill the gap.

Index.clear_divisions()

Forget division information

Index.clip([lower, upper, axis])

Trim values at input threshold(s).

Index.compute(**kwargs)

Compute this dask collection

Index.copy([deep])

Make a copy of the dataframe

Index.corr(other[, method, min_periods, ...])

Compute correlation with other Series, excluding missing values.

Index.count([split_every])

Return number of non-NA/null observations in the Series.

Index.cov(other[, min_periods, split_every])

Compute covariance with Series, excluding missing values.

Index.cummax([axis, skipna, out])

Return cumulative maximum over a DataFrame or Series axis.

Index.cummin([axis, skipna, out])

Return cumulative minimum over a DataFrame or Series axis.

Index.cumprod([axis, skipna, dtype, out])

Return cumulative product over a DataFrame or Series axis.

Index.cumsum([axis, skipna, dtype, out])

Return cumulative sum over a DataFrame or Series axis.

Index.describe([split_every, percentiles, ...])

Generate descriptive statistics.

Index.diff([periods, axis])

First discrete difference of element.

Index.div(other[, level, fill_value, axis])

Return Floating division of series and other, element-wise (binary operator truediv).

Index.drop_duplicates([split_every, ...])

Return Index with duplicate values removed.

Index.dropna()

Return a new Series with missing values removed.

Index.dtype

Return data type

Index.eq(other[, level, fill_value, axis])

Return Equal to of series and other, element-wise (binary operator eq).

Index.explode()

Transform each element of a list-like to a row.

Index.ffill([axis, limit])

Fill NA/NaN values by propagating the last valid observation to next valid.

Index.fillna([value, method, limit, axis])

Fill NA/NaN values using the specified method.

Index.first(offset)

Select initial periods of time series data based on a date offset.

Index.floordiv(other[, level, fill_value, axis])

Return Integer division of series and other, element-wise (binary operator floordiv).

Index.ge(other[, level, fill_value, axis])

Return Greater than or equal to of series and other, element-wise (binary operator ge).

Index.get_partition(n)

Get a dask DataFrame/Series representing the nth partition.

Index.groupby([by, group_keys, sort, ...])

Group Series using a mapper or by a Series of columns.

Index.gt(other[, level, fill_value, axis])

Return Greater than of series and other, element-wise (binary operator gt).

Index.head([n, compute])

First n items of the Index.

Index.idxmax([axis, skipna, split_every, ...])

Return index of first occurrence of maximum over requested axis.

Index.idxmin([axis, skipna, split_every, ...])

Return index of first occurrence of minimum over requested axis.

Index.is_monotonic_decreasing

Return whether the values are equal or decreasing (monotonically non-increasing).

Index.is_monotonic_increasing

Return whether the values are equal or increasing (monotonically non-decreasing).

Index.isin(values)

Whether elements in Series are contained in values.

Index.isna()

Detect missing values.

Index.isnull()

Index.isnull is an alias for Index.isna.

Index.known_divisions

Whether divisions are already known

Index.last(offset)

Select final periods of time series data based on a date offset.

Index.le(other[, level, fill_value, axis])

Return Less than or equal to of series and other, element-wise (binary operator le).

Index.loc

Purely label-location based indexer for selection by label.

Index.lt(other[, level, fill_value, axis])

Return Less than of series and other, element-wise (binary operator lt).

Index.map(arg[, na_action, meta, is_monotonic])

Map values using an input mapping or function.

Index.map_overlap(func, before, after, ...)

Apply a function to each partition, sharing rows with adjacent partitions.

Index.map_partitions(func, *args, **kwargs)

Apply Python function on each DataFrame partition.

Index.mask(cond[, other])

Replace values where the condition is True.

Index.max([split_every])

Return the maximum value of the Index.

Index.mean([axis, skipna, split_every, ...])

Return the mean of the values over the requested axis.

Index.median([method])

Return the median of the values over the requested axis.

Index.median_approximate([method])

Return the approximate median of the values over the requested axis.

Index.memory_usage([deep])

Memory usage of the values.

Index.memory_usage_per_partition([index, deep])

Return the memory usage of each partition

Index.min([split_every])

Return the minimum value of the Index.

Index.mod(other[, level, fill_value, axis])

Return Modulo of series and other, element-wise (binary operator mod).

Index.mul(other[, level, fill_value, axis])

Return Multiplication of series and other, element-wise (binary operator mul).

Index.nbytes

Number of bytes

Index.ndim

Return dimensionality

Index.ne(other[, level, fill_value, axis])

Return Not equal to of series and other, element-wise (binary operator ne).

Index.nlargest([n, split_every])

Return the largest n elements.

Index.notnull()

Index.notnull is an alias for Index.notna.

Index.nsmallest([n, split_every])

Return the smallest n elements.

Index.nunique([split_every, dropna])

Return number of unique elements in the object.

Index.nunique_approx([split_every])

Approximate number of unique rows.

Index.persist(**kwargs)

Persist this dask collection into memory

Index.pipe(func, *args, **kwargs)

Apply chainable functions that expect Series or DataFrames.

Index.pow(other[, level, fill_value, axis])

Return Exponential power of series and other, element-wise (binary operator pow).

Index.prod([axis, skipna, split_every, ...])

Return the product of the values over the requested axis.

Index.quantile([q, method])

Approximate quantiles of Series

Index.radd(other[, level, fill_value, axis])

Return Addition of series and other, element-wise (binary operator radd).

Index.random_split(frac[, random_state, shuffle])

Pseudorandomly split dataframe into different pieces row-wise

Index.rdiv(other[, level, fill_value, axis])

Return Floating division of series and other, element-wise (binary operator rtruediv).

Index.reduction(chunk[, aggregate, combine, ...])

Generic row-wise reductions.

Index.rename([index, inplace, sorted_index])

Alter Series index labels or name

Index.repartition([divisions, npartitions, ...])

Repartition dataframe along new divisions

Index.replace([to_replace, value, regex])

Replace values given in to_replace with value.

Index.resample(rule[, closed, label])

Resample time-series data.

Index.reset_index([drop])

Reset the index to the default index.

Index.rolling(window[, min_periods, center, ...])

Provides rolling transformations.

Index.round([decimals])

Round each value in a Series to the given number of decimals.

Index.sample([n, frac, replace, random_state])

Random sample of items

Index.sem([axis, skipna, ddof, split_every, ...])

Return unbiased standard error of the mean over requested axis.

Index.shape

Return a tuple representing the dimensionality of a Series.

Index.shift([periods, freq])

Shift index by desired number of time frequency increments.

Index.size

Size of the Series or DataFrame as a Delayed object.

Index.std([axis, skipna, ddof, split_every, ...])

Return sample standard deviation over requested axis.

Index.sub(other[, level, fill_value, axis])

Return Subtraction of series and other, element-wise (binary operator sub).

Index.sum([axis, skipna, split_every, ...])

Return the sum of the values over the requested axis.

Index.to_backend([backend])

Move to a new DataFrame backend

Index.to_bag([index, format])

Create a Dask Bag from a Series

Index.to_csv(filename, **kwargs)

Store Dask DataFrame to CSV files

Index.to_dask_array([lengths, meta])

Convert a dask DataFrame to a dask array.

Index.to_delayed([optimize_graph])

Convert into a list of dask.delayed objects, one per partition.

Index.to_frame([index, name])

Create a DataFrame with a column containing the Index.

Index.to_hdf(path_or_buf, key[, mode, append])

Store Dask Dataframe to Hierarchical Data Format (HDF) files

Index.to_series()

Create a Series with both index and values equal to the index keys.

Index.to_string([max_rows])

Render a string representation of the Series.

Index.to_timestamp([freq, how, axis])

Cast to DatetimeIndex of Timestamps, at beginning of period.

Index.truediv(other[, level, fill_value, axis])

Return Floating division of series and other, element-wise (binary operator truediv).

Index.unique([split_every, split_out])

Return Series of unique values in the object.

Index.value_counts([sort, ascending, ...])

Return a Series containing counts of unique values.

Index.values

Return a dask.array of the values of this dataframe

Index.var([axis, skipna, ddof, split_every, ...])

Return unbiased variance over requested axis.

Index.visualize([filename, format, ...])

Render the computation of this object's task graph using graphviz.

Index.where(cond[, other])

Replace values where the condition is False.

Accessors

Similar to pandas, Dask provides dtype-specific methods under various accessors. These are separate namespaces within Series that only apply to specific data types.
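
For example, a minimal sketch with made-up values:

    import pandas as pd
    import dask.dataframe as dd

    times = dd.from_pandas(
        pd.Series(pd.date_range("2024-01-01", periods=4, freq="h")),
        npartitions=2,
    )
    print(times.dt.hour.compute())      # datetime accessor

    names = dd.from_pandas(pd.Series(["ada", "grace"]), npartitions=1)
    print(names.str.upper().compute())  # string accessor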

Datetime Accessor

Methods

Series.dt.ceil(*args, **kwargs)

Perform ceil operation on the data to the specified freq.

Series.dt.floor(*args, **kwargs)

Perform floor operation on the data to the specified freq.

Series.dt.isocalendar()

Calculate year, week, and day according to the ISO 8601 standard.

Series.dt.normalize(*args, **kwargs)

Convert times to midnight.

Series.dt.round(*args, **kwargs)

Perform round operation on the data to the specified freq.

Series.dt.strftime(*args, **kwargs)

Convert to Index using specified date_format.

Attributes

Series.dt.date

Returns numpy array of python datetime.date objects.

Series.dt.day

The day of the datetime.

Series.dt.dayofweek

The day of the week with Monday=0, Sunday=6.

Series.dt.dayofyear

The ordinal day of the year.

Series.dt.daysinmonth

The number of days in the month.

Series.dt.freq

Series.dt.hour

The hours of the datetime.

Series.dt.microsecond

The microseconds of the datetime.

Series.dt.minute

The minutes of the datetime.

Series.dt.month

The month as January=1, December=12.

Series.dt.nanosecond

The nanoseconds of the datetime.

Series.dt.quarter

The quarter of the date.

Series.dt.second

The seconds of the datetime.

Series.dt.time

Returns numpy array of datetime.time objects.

Series.dt.timetz

Returns numpy array of datetime.time objects with timezones.

Series.dt.tz

Return the timezone.

Series.dt.week

The week ordinal of the year.

Series.dt.weekday

The day of the week with Monday=0, Sunday=6.

Series.dt.weekofyear

The week ordinal of the year.

Series.dt.year

The year of the datetime.

String Accessor

Methods

Series.str.capitalize()

Convert strings in the Series/Index to be capitalized.

Series.str.casefold()

Convert strings in the Series/Index to be casefolded.

Series.str.cat([others, sep, na_rep])

Concatenate strings in the Series/Index with given separator.

Series.str.center(width[, fillchar])

Pad left and right side of strings in the Series/Index.

Series.str.contains(pat[, case, flags, na, ...])

Test if pattern or regex is contained within a string of a Series or Index.

Series.str.count(pat[, flags])

Count occurrences of pattern in each string of the Series/Index.

Series.str.decode(encoding[, errors])

Decode character string in the Series/Index using indicated encoding.

Series.str.encode(encoding[, errors])

Encode character string in the Series/Index using indicated encoding.

Series.str.endswith(*args, **kwargs)

Test if the end of each string element matches a pattern.

Series.str.extract(*args, **kwargs)

Extract capture groups in the regex pat as columns in a DataFrame.

Series.str.extractall(pat[, flags])

Extract capture groups in the regex pat as columns in DataFrame.

Series.str.find(sub[, start, end])

Return lowest indexes in each string in the Series/Index.

Series.str.findall(pat[, flags])

Find all occurrences of pattern or regular expression in the Series/Index.

Series.str.fullmatch(pat[, case, flags, na])

Determine if each string entirely matches a regular expression.

Series.str.get(i)

Extract element from each component at specified position or with specified key.

Series.str.index(sub[, start, end])

Return lowest indexes in each string in Series/Index.

Series.str.isalnum()

Check whether all characters in each string are alphanumeric.

Series.str.isalpha()

Check whether all characters in each string are alphabetic.

Series.str.isdecimal()

Check whether all characters in each string are decimal.

Series.str.isdigit()

Check whether all characters in each string are digits.

Series.str.islower()

Check whether all characters in each string are lowercase.

Series.str.isnumeric()

Check whether all characters in each string are numeric.

Series.str.isspace()

Check whether all characters in each string are whitespace.

Series.str.istitle()

Check whether all characters in each string are titlecase.

Series.str.isupper()

Check whether all characters in each string are uppercase.

Series.str.join(sep)

Join lists contained as elements in the Series/Index with passed delimiter.

Series.str.len()

Compute the length of each element in the Series/Index.

Series.str.ljust(width[, fillchar])

Pad right side of strings in the Series/Index.

Series.str.lower()

Convert strings in the Series/Index to lowercase.

Series.str.lstrip([to_strip])

Remove leading characters.

Series.str.match(pat[, case, flags, na])

Determine if each string starts with a match of a regular expression.

Series.str.normalize(form)

Return the Unicode normal form for the strings in the Series/Index.

Series.str.pad(width[, side, fillchar])

Pad strings in the Series/Index up to width.

Series.str.partition([sep, expand])

Split the string at the first occurrence of sep.

Series.str.repeat(repeats)

Duplicate each string in the Series or Index.

Series.str.replace(pat, repl[, n, case, ...])

Replace each occurrence of pattern/regex in the Series/Index.

Series.str.rfind(sub[, start, end])

Return highest indexes in each string in the Series/Index.

Series.str.rindex(sub[, start, end])

Return highest indexes in each string in Series/Index.

Series.str.rjust(width[, fillchar])

Pad left side of strings in the Series/Index.

Series.str.rpartition([sep, expand])

Split the string at the last occurrence of sep.

Series.str.rsplit([pat, n, expand])

Split strings around given separator/delimiter.

Series.str.rstrip([to_strip])

Remove trailing characters.

Series.str.slice([start, stop, step])

Slice substrings from each element in the Series or Index.

Series.str.split([pat, n, expand])

Split strings around given separator/delimiter.

Series.str.startswith(*args, **kwargs)

Test if the start of each string element matches a pattern.

Series.str.strip([to_strip])

Remove leading and trailing characters.

Series.str.swapcase()

Convert strings in the Series/Index to be swapcased.

Series.str.title()

Convert strings in the Series/Index to titlecase.

Series.str.translate(table)

Map all characters in the string through the given mapping table.

Series.str.upper()

Convert strings in the Series/Index to uppercase.

Series.str.wrap(width, **kwargs)

Wrap strings in Series/Index at specified line width.

Series.str.zfill(width)

Pad strings in the Series/Index by prepending '0' characters.

Categorical Accessor

Methods

Series.cat.add_categories(*args, **kwargs)

Add new categories.

Series.cat.as_known(**kwargs)

Ensure the categories in this series are known.
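
A short sketch; astype("category") yields unknown categories, and as_known() triggers a scan of the data to learn them (values illustrative):

    import pandas as pd
    import dask.dataframe as dd

    s = dd.from_pandas(pd.Series(["a", "b", "a"]), npartitions=2).astype("category")
    print(s.cat.known)        # False: categories not yet computed
    known = s.cat.as_known()  # computes the full category set
    print(known.cat.known)    # True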

Series.cat.as_ordered(*args, **kwargs)

Set the Categorical to be ordered.

Series.cat.as_unknown()

Ensure the categories in this series are unknown

Series.cat.as_unordered(*args, **kwargs)

Set the Categorical to be unordered.

Series.cat.remove_categories(*args, **kwargs)

Remove the specified categories.

Series.cat.remove_unused_categories()

Remove categories which are not used.

Series.cat.rename_categories(*args, **kwargs)

Rename categories.

Series.cat.reorder_categories(*args, **kwargs)

Reorder categories as specified in new_categories.

Series.cat.set_categories(*args, **kwargs)

Set the categories to the specified new categories.

Attributes

Series.cat.categories

The categories of this categorical.

Series.cat.codes

The codes of this categorical.

Series.cat.known

Whether the categories are fully known

Series.cat.ordered

Whether the categories have an ordered relationship

Groupby Operations

DataFrame Groupby

DataFrameGroupBy.aggregate([arg, ...])

Aggregate using one or more specified operations

DataFrameGroupBy.apply(func, *args, **kwargs)

Parallel version of pandas GroupBy.apply

DataFrameGroupBy.bfill([limit])

Backward fill the values.

DataFrameGroupBy.count([split_every, ...])

Compute count of group, excluding missing values.

DataFrameGroupBy.cumcount([axis])

Number each item in each group from 0 to the length of that group - 1.

DataFrameGroupBy.cumprod([axis, numeric_only])

Cumulative product for each group.

DataFrameGroupBy.cumsum([axis, numeric_only])

Cumulative sum for each group.

DataFrameGroupBy.fillna([value, method, ...])

Fill NA/NaN values using the specified method.

DataFrameGroupBy.ffill([limit])

Forward fill the values.

DataFrameGroupBy.get_group(key)

Construct DataFrame from group with provided name.

DataFrameGroupBy.max([split_every, ...])

Compute max of group values.

DataFrameGroupBy.mean([split_every, ...])

Compute mean of groups, excluding missing values.

DataFrameGroupBy.min([split_every, ...])

Compute min of group values.

DataFrameGroupBy.size([split_every, ...])

Compute group sizes.

DataFrameGroupBy.std([ddof, split_every, ...])

Compute standard deviation of groups, excluding missing values.

DataFrameGroupBy.sum([split_every, ...])

Compute sum of group values.

DataFrameGroupBy.var([ddof, split_every, ...])

Compute variance of groups, excluding missing values.

DataFrameGroupBy.cov([ddof, split_every, ...])

Compute pairwise covariance of columns, excluding NA/null values.

DataFrameGroupBy.corr([ddof, split_every, ...])

Compute pairwise correlation of columns, excluding NA/null values.

DataFrameGroupBy.first([split_every, ...])

Compute the first entry of each column within each group.

DataFrameGroupBy.last([split_every, ...])

Compute the last entry of each column within each group.

DataFrameGroupBy.idxmin([split_every, ...])

Return index of first occurrence of minimum over requested axis.

DataFrameGroupBy.idxmax([split_every, ...])

Return index of first occurrence of maximum over requested axis.

DataFrameGroupBy.rolling(window[, ...])

Provides rolling transformations.

DataFrameGroupBy.transform(func, *args, **kwargs)

Parallel version of pandas GroupBy.transform

Series Groupby

SeriesGroupBy.aggregate([arg, split_every, ...])

Aggregate using one or more specified operations

SeriesGroupBy.apply(func, *args, **kwargs)

Parallel version of pandas GroupBy.apply

SeriesGroupBy.bfill([limit])

Backward fill the values.

SeriesGroupBy.count([split_every, ...])

Compute count of group, excluding missing values.

SeriesGroupBy.cumcount([axis])

Number each item in each group from 0 to the length of that group - 1.

SeriesGroupBy.cumprod([axis, numeric_only])

Cumulative product for each group.

SeriesGroupBy.cumsum([axis, numeric_only])

Cumulative sum for each group.

SeriesGroupBy.fillna([value, method, limit, ...])

Fill NA/NaN values using the specified method.

SeriesGroupBy.ffill([limit])

Forward fill the values.

SeriesGroupBy.get_group(key)

Construct DataFrame from group with provided name.

SeriesGroupBy.max([split_every, split_out, ...])

Compute max of group values.

SeriesGroupBy.mean([split_every, split_out, ...])

Compute mean of groups, excluding missing values.

SeriesGroupBy.min([split_every, split_out, ...])

Compute min of group values.

SeriesGroupBy.nunique([split_every, split_out])

Return number of unique elements in the group.

SeriesGroupBy.size([split_every, split_out, ...])

Compute group sizes.

SeriesGroupBy.std([ddof, split_every, ...])

Compute standard deviation of groups, excluding missing values.

SeriesGroupBy.sum([split_every, split_out, ...])

Compute sum of group values.

SeriesGroupBy.var([ddof, split_every, ...])

Compute variance of groups, excluding missing values.

SeriesGroupBy.first([split_every, ...])

Compute the first entry of each column within each group.

SeriesGroupBy.last([split_every, split_out, ...])

Compute the last entry of each column within each group.

SeriesGroupBy.idxmin([split_every, ...])

Return index of first occurrence of minimum over requested axis.

SeriesGroupBy.idxmax([split_every, ...])

Return index of first occurrence of maximum over requested axis.

SeriesGroupBy.rolling(window[, min_periods, ...])

Provides rolling transformations.

SeriesGroupBy.transform(func, *args, **kwargs)

Parallel version of pandas GroupBy.transform

Custom Aggregation

Aggregation(name, chunk, agg[, finalize])

User defined groupby-aggregation.
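
A sketch of a user-defined aggregation: a mean rebuilt from per-partition counts and sums (the name custom_mean and the data are illustrative):

    import pandas as pd
    import dask.dataframe as dd

    custom_mean = dd.Aggregation(
        name="custom_mean",
        chunk=lambda s: (s.count(), s.sum()),                 # per partition
        agg=lambda counts, sums: (counts.sum(), sums.sum()),  # combine chunks
        finalize=lambda count, total: total / count,          # final value
    )

    ddf = dd.from_pandas(
        pd.DataFrame({"g": ["a", "a", "b"], "x": [1.0, 3.0, 5.0]}),
        npartitions=2,
    )
    print(ddf.groupby("g").x.agg(custom_mean).compute())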

Rolling Operations

map_overlap(func, df, before, after, *args)

Apply a function to each partition, sharing rows with adjacent partitions.

Series.rolling(window[, min_periods, ...])

Provides rolling transformations.

DataFrame.rolling(window[, min_periods, ...])

Provides rolling transformations.
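
A minimal rolling sketch (values illustrative):

    import pandas as pd
    import dask.dataframe as dd

    s = dd.from_pandas(pd.Series([1.0, 2.0, 3.0, 4.0, 5.0]), npartitions=2)
    # windows spanning partition boundaries are handled by sharing rows
    # with adjacent partitions
    print(s.rolling(window=3).mean().compute())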

Rolling.apply(func[, raw, engine, ...])

Calculate the rolling custom aggregation function.

Rolling.count()

Calculate the rolling count of non NaN observations.

Rolling.kurt()

Calculate the rolling Fisher's definition of kurtosis without bias.

Rolling.max()

Calculate the rolling maximum.

Rolling.mean()

Calculate the rolling mean.

Rolling.median()

Calculate the rolling median.

Rolling.min()

Calculate the rolling minimum.

Rolling.quantile(quantile)

Calculate the rolling quantile.

Rolling.skew()

Calculate the rolling unbiased skewness.

Rolling.std([ddof])

Calculate the rolling standard deviation.

Rolling.sum()

Calculate the rolling sum.

Rolling.var([ddof])

Calculate the rolling variance.

Create DataFrames

read_csv(urlpath[, blocksize, ...])

Read CSV files into a Dask.DataFrame
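
A hedged sketch; "data-*.csv" is a hypothetical glob pattern:

    import dask.dataframe as dd

    # each ~64 MB block of each matching file becomes one partition
    ddf = dd.read_csv("data-*.csv", blocksize="64MB")
    print(ddf.npartitions)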

read_table(urlpath[, blocksize, ...])

Read delimited files into a Dask.DataFrame

read_fwf(urlpath[, blocksize, ...])

Read fixed-width files into a Dask.DataFrame

read_parquet(path[, columns, filters, ...])

Read a Parquet file into a Dask DataFrame

read_hdf(pattern, key[, start, stop, ...])

Read HDF files into a Dask DataFrame

read_json(url_path[, orient, lines, ...])

Create a dataframe from a set of JSON files

read_orc(path[, engine, columns, index, ...])

Read dataframe from ORC file(s)

read_sql_table(table_name, con, index_col[, ...])

Read SQL database table into a DataFrame.

read_sql_query(sql, con, index_col[, ...])

Read SQL query into a DataFrame.

read_sql(sql, con, index_col, **kwargs)

Read SQL query or database table into a DataFrame.

from_array(x[, chunksize, columns, meta])

Read any sliceable array into a Dask Dataframe

from_dask_array(x[, columns, index, meta])

Create a Dask DataFrame from a Dask Array.

from_delayed(dfs[, meta, divisions, prefix, ...])

Create Dask DataFrame from many Dask Delayed objects
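
A minimal sketch; load is a hypothetical per-chunk loader, and the meta dict describes the expected schema so Dask need not compute a piece eagerly:

    import pandas as pd
    import dask
    import dask.dataframe as dd

    @dask.delayed
    def load(i):
        # stand-in for an expensive per-chunk loader
        return pd.DataFrame({"part": [i] * 3})

    ddf = dd.from_delayed([load(i) for i in range(3)], meta={"part": "int64"})
    print(ddf.compute())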

from_map(func, *iterables[, args, meta, ...])

Create a DataFrame collection from a custom function map

from_pandas()

Construct a Dask DataFrame from a Pandas DataFrame

DataFrame.from_dict(data, *, npartitions[, ...])

Construct a Dask DataFrame from a Python Dictionary

Bag.to_dataframe([meta, columns, optimize_graph])

Create Dask Dataframe from a Dask Bag.

Store DataFrames

to_csv(df, filename[, single_file, ...])

Store Dask DataFrame to CSV files

to_parquet(df, path[, engine, compression, ...])

Store Dask.dataframe to Parquet files

to_hdf(df, path, key[, mode, append, ...])

Store Dask Dataframe to Hierarchical Data Format (HDF) files

to_records(df)

Create Dask Array from a Dask Dataframe

to_sql(df, name, uri[, schema, if_exists, ...])

Store Dask Dataframe to a SQL table

to_json(df, url_path[, orient, lines, ...])

Write dataframe into JSON text files

Convert DataFrames

DataFrame.to_bag([index, format])

Create Dask Bag from a Dask DataFrame

DataFrame.to_dask_array([lengths, meta])

Convert a dask DataFrame to a dask array.

DataFrame.to_delayed([optimize_graph])

Convert into a list of dask.delayed objects, one per partition.

Reshape DataFrames

get_dummies(data[, prefix, prefix_sep, ...])

Convert categorical variable into dummy/indicator variables.

pivot_table(df[, index, columns, values, ...])

Create a spreadsheet-style pivot table as a DataFrame.

melt(frame[, id_vars, value_vars, var_name, ...])

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

Concatenate DataFrames

DataFrame.merge(right[, how, on, left_on, ...])

Merge the DataFrame with another DataFrame

concat(dfs[, axis, join, ...])

Concatenate DataFrames along rows.
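
A minimal sketch (values illustrative):

    import pandas as pd
    import dask.dataframe as dd

    a = dd.from_pandas(pd.DataFrame({"x": [1, 2]}), npartitions=1)
    b = dd.from_pandas(pd.DataFrame({"x": [3, 4]}), npartitions=1)
    # row-wise concatenation: partitions are stacked, not recomputed
    print(dd.concat([a, b]).compute())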

merge(left, right[, how, on, left_on, ...])

Merge DataFrame or named Series objects with a database-style join.

merge_asof(left, right[, on, left_on, ...])

Perform a merge by key distance.

Resampling

Resampler(obj, rule, **kwargs)

Class for resampling timeseries data.
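
A minimal sketch; resampling requires a DatetimeIndex with known divisions (values illustrative):

    import pandas as pd
    import dask.dataframe as dd

    idx = pd.date_range("2024-01-01", periods=6, freq="h")
    s = dd.from_pandas(pd.Series(range(6), index=idx), npartitions=2)
    print(s.resample("2h").sum().compute())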

Resampler.agg(agg_funcs, *args, **kwargs)

Aggregate using one or more operations over the specified axis.

Resampler.count()

Compute count of group, excluding missing values.

Resampler.first()

Compute the first entry of each column within each group.

Resampler.last()

Compute the last entry of each column within each group.

Resampler.max()

Compute max value of group.

Resampler.mean()

Compute mean of groups, excluding missing values.

Resampler.median()

Compute median of groups, excluding missing values.

Resampler.min()

Compute min value of group.

Resampler.nunique()

Return number of unique elements in the group.

Resampler.ohlc()

Compute open, high, low and close values of a group, excluding missing values.

Resampler.prod()

Compute prod of group values.

Resampler.quantile()

Return value at the given quantile.

Resampler.sem()

Compute standard error of the mean of groups, excluding missing values.

Resampler.size()

Compute group sizes.

Resampler.std()

Compute standard deviation of groups, excluding missing values.

Resampler.sum()

Compute sum of group values.

Resampler.var()

Compute variance of groups, excluding missing values.

Dask Metadata

make_meta(x[, index, parent_meta])

Create metadata based on the type of x, and parent_meta if supplied.
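
A short sketch, assuming the dask.dataframe.utils import path; a dict input maps column names to dtypes:

    from dask.dataframe.utils import make_meta

    # an empty pandas DataFrame carrying only the schema (no rows)
    meta = make_meta({"x": "i8", "y": "f8"})
    print(meta.dtypes)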

Other functions

compute(*args[, traverse, optimize_graph, ...])

Compute several dask collections at once.

map_partitions(func, *args[, meta, ...])

Apply Python function on each DataFrame partition.

to_datetime()

Convert argument to datetime.

to_numeric(arg[, errors, meta])

Convert argument to a numeric type.

to_timedelta()

Convert argument to timedelta.