dask.dataframe.Series.to_csv

Series.to_csv(filename, **kwargs)

Store Dask DataFrame to CSV files

One filename per partition will be created. You can specify the filenames in a variety of ways.

Use a globstring:

>>> df.to_csv('/path/to/data/export-*.csv')  

The * will be replaced by the increasing sequence 0, 1, 2, …

/path/to/data/export-0.csv
/path/to/data/export-1.csv

Use a globstring and a name_function= keyword argument. The name_function callable should accept an integer (the partition index) and return a string. The strings it produces must sort in the same order as their partition indices.

>>> from datetime import date, timedelta
>>> def name(i):
...     return str(date(2015, 1, 1) + i * timedelta(days=1))
>>> name(0)
'2015-01-01'
>>> name(15)
'2015-01-16'
>>> df.to_csv('/path/to/data/export-*.csv', name_function=name)  
/path/to/data/export-2015-01-01.csv
/path/to/data/export-2015-01-02.csv
...
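
The ordering requirement is easy to violate with plain integers once there are ten or more partitions, because '10' sorts before '2' as a string. Below is a minimal sketch of a zero-padded name function. The name `padded_name` and the width of 4 are illustrative choices, not part of the Dask API.

```python
def padded_name(i, width=4):
    """Return the partition index as a fixed-width, zero-padded string."""
    return str(i).zfill(width)


# Plain str() loses the ordering at ten partitions; zero-padding keeps it.
plain = [str(i) for i in range(12)]
padded = [padded_name(i) for i in range(12)]

print(sorted(plain) == plain)    # lexicographic order differs from index order
print(sorted(padded) == padded)  # lexicographic order matches index order
```

Such a function could then be passed as name_function=padded_name alongside a globstring like the one above.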

You can also provide an explicit list of paths:

>>> paths = ['/path/to/data/alice.csv', '/path/to/data/bob.csv', ...]  
>>> df.to_csv(paths)

Parameters
df : dask.DataFrame

Data to save

filename : string

Path glob indicating the naming scheme for the output files

single_file : bool, default False

Whether to save everything into a single CSV file. Under the single file mode, each partition is appended at the end of the specified CSV file. Note that not all filesystems support the append mode and thus the single file mode, especially on cloud storage systems such as S3 or GCS. A warning will be issued when writing to a file that is not backed by a local filesystem.

encoding : string, optional

A string representing the encoding to use in the output file, defaults to ‘ascii’ on Python 2 and ‘utf-8’ on Python 3.

mode : str

Python write mode, default ‘w’

name_function : callable, default None

Function accepting an integer (partition index) and producing a string to replace the asterisk in the given filename globstring. Should preserve the lexicographic order of partitions. Not supported when single_file is True.

compression : string, optional

A string representing the compression to use in the output file. Allowed values are ‘gzip’, ‘bz2’, and ‘xz’. Only used when the first argument is a filename.

compute : bool

If True, executes immediately. If False, returns a set of delayed objects, which can be computed at a later time.

storage_options : dict

Parameters passed on to the backend filesystem class.

header_first_partition_only : boolean, default None

If set to True, only write the header row in the first output file. By default, headers are written to all partitions under the multiple file mode (single_file is False) and written only once under the single file mode (single_file is True). It must not be False under the single file mode.

compute_kwargs : dict, optional

Options to be passed in to the compute method

kwargs : dict, optional

Additional parameters to pass to pd.DataFrame.to_csv()

Returns
The names of the files written, if they were computed right away.
Otherwise, the delayed tasks associated with writing the files.
Raises
ValueError

If header_first_partition_only is set to False or name_function is specified when single_file is True.