dask.dataframe.DataFrame.melt

DataFrame.melt(id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)

Unpivot DataFrame from wide to long format, optionally leaving identifiers set.

This docstring was copied from pandas.DataFrame.melt.

Some inconsistencies with the Dask version may exist.

This function is useful to massage a DataFrame into a format where one or more columns are identifier variables (id_vars), while all other columns, considered measured variables (value_vars), are “unpivoted” to the row axis, leaving just two non-identifier columns, ‘variable’ and ‘value’.

Parameters:

id_vars (scalar, tuple, list, or ndarray, optional): Column(s) to use as identifier variables.
value_vars (scalar, tuple, list, or ndarray, optional): Column(s) to unpivot. If not specified, uses all columns that are not set as id_vars.
var_name (scalar, default None): Name to use for the ‘variable’ column. If None it uses frame.columns.name or ‘variable’.
value_name (scalar, default ‘value’): Name to use for the ‘value’ column, can’t be an existing column label.
col_level (scalar, optional): If columns are a MultiIndex then use this level to melt.
ignore_index (bool, default True) (Not supported in Dask): If True, original index is ignored. If False, original index is retained. Index labels will be repeated as necessary.

Returns:

DataFrame: Unpivoted DataFrame.

See also

melt: Identical method.
pivot_table: Create a spreadsheet-style pivot table as a DataFrame.
DataFrame.pivot: Return reshaped DataFrame organized by given index / column values.
DataFrame.explode: Explode a DataFrame from list-like columns to long format.

Notes

Reference the user guide for more examples.

Examples

>>> df = pd.DataFrame(
...     {
...         "A": {0: "a", 1: "b", 2: "c"},
...         "B": {0: 1, 1: 3, 2: 5},
...         "C": {0: 2, 1: 4, 2: 6},
...     }
... )
>>> df
   A  B  C
0  a  1  2
1  b  3  4
2  c  5  6

>>> df.melt(id_vars=["A"], value_vars=["B"])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

>>> df.melt(id_vars=["A"], value_vars=["B", "C"])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
3  a        C      2
4  b        C      4
5  c        C      6

The names of ‘variable’ and ‘value’ columns can be customized:

>>> df.melt(
...     id_vars=["A"],
...     value_vars=["B"],
...     var_name="myVarname",
...     value_name="myValname",
... )
   A myVarname  myValname
0  a         B          1
1  b         B          3
2  c         B          5

Original index values can be kept around:

>>> df.melt(id_vars=["A"], value_vars=["B", "C"], ignore_index=False)
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5
0  a        C      2
1  b        C      4
2  c        C      6

If you have multi-index columns:

>>> df.columns = [list("ABC"), list("DEF")]
>>> df
   A  B  C
   D  E  F
0  a  1  2
1  b  3  4
2  c  5  6

>>> df.melt(col_level=0, id_vars=["A"], value_vars=["B"])
   A variable  value
0  a        B      1
1  b        B      3
2  c        B      5

>>> df.melt(id_vars=[("A", "D")], value_vars=[("B", "E")])
  (A, D) variable_0 variable_1  value
0      a          B          E      1
1      b          B          E      3
2      c          B          E      5
