xarray.Dataset.groupby

Dataset.groupby(group=None, *, squeeze=False, restore_coord_dims=False, eagerly_compute_group=None, **groupers)

Returns a DatasetGroupBy object for performing grouped operations.

Parameters:

group (str or DataArray or IndexVariable or sequence of hashable or mapping of hashable to Grouper) – Array whose unique values should be used to group this dataset. If a Hashable, must be the name of a coordinate contained in this dataset. If a dictionary, must map an existing variable name to a Grouper instance.
squeeze (False) – This argument is deprecated.
restore_coord_dims (bool, default: False) – If True, also restore the dimension order of multi-dimensional coordinates.
eagerly_compute_group (False, optional) – This argument is deprecated.
**groupers (Mapping of str to Grouper or Resampler) – Mapping from the name of each variable to group by to a Grouper or Resampler object. One of group or groupers must be provided.

Returns:

grouped (DatasetGroupBy) – A DatasetGroupBy object patterned after pandas.GroupBy that can be iterated over in the form of (unique_value, grouped_array) pairs.

Examples

>>> ds = xr.Dataset(
...     {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
...     coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
... )

Grouping by a single variable is easy

>>> ds.groupby("letters")
<DatasetGroupBy, grouped over 1 grouper(s), 2 groups in total:
    'letters': UniqueGrouper('letters'), 2/2 groups with labels 'a', 'b'>

Execute a reduction

>>> ds.groupby("letters").sum()
<xarray.Dataset> Size: 64B
Dimensions:  (letters: 2, y: 3)
Coordinates:
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y
Data variables:
    foo      (letters, y) int64 48B 9 11 13 9 11 13
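The returned object can also be iterated over as (label, group) pairs, as described under Returns. The following is a minimal sketch that rebuilds the same dataset, sums each group by hand, and compares the result with the grouped reduction; the names `manual` and `reduced` are illustrative only.

```python
import numpy as np
import xarray as xr

# Same dataset as in the examples above.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

grouped = ds.groupby("letters")

# Each iteration yields a group label and the sub-dataset for that label.
# Summing over "x" inside each group mirrors what grouped.sum() does.
manual = {str(label): group["foo"].sum("x").values for label, group in grouped}

# The built-in grouped reduction, for comparison.
reduced = grouped.sum()

for label, values in manual.items():
    expected = reduced["foo"].sel(letters=label).values
    print(label, values, np.array_equal(values, expected))
```

Iterating is mostly useful for inspection or for per-group logic that has no built-in reduction; for standard reductions the grouped methods are the more direct route.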

Grouping by multiple variables

>>> ds.groupby(["letters", "x"])
<DatasetGroupBy, grouped over 2 grouper(s), 8 groups in total:
    'letters': UniqueGrouper('letters'), 2/2 groups with labels 'a', 'b'
    'x': UniqueGrouper('x'), 4/4 groups with labels 10, 20, 30, 40>

Use Grouper objects to express more complicated GroupBy operations

>>> from xarray.groupers import BinGrouper, UniqueGrouper
>>>
>>> ds.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper()).sum()
<xarray.Dataset> Size: 144B
Dimensions:  (y: 3, x_bins: 2, letters: 2)
Coordinates:
  * x_bins   (x_bins) interval[int64, right] 32B (5, 15] (15, 25]
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y
Data variables:
    foo      (y, x_bins, letters) float64 96B 0.0 nan nan 3.0 ... nan nan 5.0
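The group parameter is documented as also accepting a dictionary that maps an existing variable name to a Grouper instance. The sketch below assumes a recent xarray release that provides xarray.groupers and checks that the dictionary form and the keyword form give the same result for a single BinGrouper; `by_keyword`, `by_mapping`, and `same` are illustrative names.

```python
import numpy as np
import xarray as xr
from xarray.groupers import BinGrouper

# Same dataset as in the examples above.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

# Keyword form: the variable name is the keyword, the Grouper is the value.
by_keyword = ds.groupby(x=BinGrouper(bins=[5, 15, 25])).sum()

# Dictionary form: the same mapping passed as the `group` argument.
by_mapping = ds.groupby({"x": BinGrouper(bins=[5, 15, 25])}).sum()

# Both spellings are expected to describe the same grouping.
same = by_keyword.identical(by_mapping)
print(same)
```

The dictionary form can be convenient when the variable name is not a valid Python identifier or when the grouping is built programmatically.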