dask.dataframe.from_array

dask.dataframe.from_array(x, chunksize=50000, columns=None, meta=None)

Read any sliceable array into a Dask DataFrame.

Uses getitem syntax to pull slices out of the array. The array need not be a NumPy array, but it must support slicing syntax:

x[50000:100000]

and have 2 dimensions:

x.ndim == 2

or have a record dtype:

x.dtype == [('name', 'O'), ('balance', 'i8')]

Parameters
x : array_like
chunksize : int, optional

The number of rows per partition to use.

columns : list or string, optional

List of column names for a DataFrame, or a single string for a Series.

meta : object, optional

An optional object that tells Dask which concrete dataframe type to use for the partitions of the Dask DataFrame. By default, a pandas DataFrame is used.

Returns
dask.DataFrame or dask.Series

A Dask DataFrame or Series.