xarray.Dataset.groupby
- Dataset.groupby(group=None, *, squeeze=False, restore_coord_dims=False, eagerly_compute_group=True, **groupers)
Returns a DatasetGroupBy object for performing grouped operations.
- Parameters:
  group (str or DataArray or IndexVariable or sequence of hashable or mapping of hashable to Grouper) – Array whose unique values should be used to group this dataset. If a Hashable, must be the name of a coordinate contained in this dataset. If a dictionary, must map an existing variable name to a Grouper instance.
  squeeze (False) – This argument is deprecated.
  restore_coord_dims (bool, default: False) – If True, also restore the dimension order of multi-dimensional coordinates.
  eagerly_compute_group (bool) – Whether to eagerly compute group when it is a chunked array. This option is to maintain backwards compatibility. Set to False to opt in to the future behaviour, where group is not automatically loaded into memory.
  **groupers (Mapping of str to Grouper or Resampler) – Mapping of variable name to group by to Grouper or Resampler object. One of group or groupers must be provided. Only a single grouper is allowed at present.
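As a minimal sketch of the `**groupers` keyword form, assuming an xarray release that ships the `xarray.groupers` module: passing `letters=UniqueGrouper()` is expected to give the same result as grouping by the name `"letters"`. The dataset mirrors the one used in the Examples section.

```python
import numpy as np
import xarray as xr
from xarray.groupers import UniqueGrouper

# Same toy dataset as in the Examples section.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

# Group by the coordinate name (positional `group` argument).
by_name = ds.groupby("letters").sum()

# Group by an explicit Grouper object (keyword `**groupers` form).
by_grouper = ds.groupby(letters=UniqueGrouper()).sum()

# Compare the two results value by value.
same = by_name.equals(by_grouper)
print(same)
```

The keyword form becomes useful once a grouper needs configuration, for example bin edges for a `BinGrouper`.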
- Returns:
  grouped (DatasetGroupBy) – A DatasetGroupBy object patterned after pandas.GroupBy that can be iterated over in the form of (unique_value, grouped_array) pairs.
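The iteration described above can be sketched as follows. This is an illustrative example rather than part of the reference text; the dataset mirrors the one in the Examples section, and `group_sizes` is a name introduced here only for the demonstration.

```python
import numpy as np
import xarray as xr

# Same toy dataset as in the Examples section.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

group_sizes = {}
for label, group_ds in ds.groupby("letters"):
    # `label` is one unique value of "letters"; `group_ds` is the
    # subset of `ds` whose "letters" coordinate equals that value.
    group_sizes[str(label)] = group_ds.sizes["x"]

print(group_sizes)
```

Each group keeps the original `x` dimension, restricted to the positions that carry the given label.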
Examples
>>> ds = xr.Dataset(
...     {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
...     coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
... )
Grouping by a single variable
>>> ds.groupby("letters")
<DatasetGroupBy, grouped over 1 grouper(s), 2 groups in total:
    'letters': 2/2 groups present with labels 'a', 'b'>
Execute a reduction
>>> ds.groupby("letters").sum()
<xarray.Dataset> Size: 64B
Dimensions:  (letters: 2, y: 3)
Coordinates:
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y
Data variables:
    foo      (letters, y) int64 48B 9 11 13 9 11 13
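To see where the reduced values come from, the grouped sum can be cross-checked by hand with boolean masks in NumPy. This is only a sketch for illustration and says nothing about how xarray computes the result internally; `manual` and `matches` are names introduced here.

```python
import numpy as np
import xarray as xr

# Same toy dataset as in the Examples section.
ds = xr.Dataset(
    {"foo": (("x", "y"), np.arange(12).reshape((4, 3)))},
    coords={"x": [10, 20, 30, 40], "letters": ("x", list("abba"))},
)

grouped_sum = ds.groupby("letters").sum()

foo = ds["foo"].values          # shape (4, 3), rows follow "x"
letters = ds["letters"].values  # one label per row

# For each label, select the matching rows and sum them along "x".
manual = np.stack([foo[letters == label].sum(axis=0) for label in ["a", "b"]])

# Compare against the xarray result, with dimensions in a fixed order.
matches = np.array_equal(
    grouped_sum["foo"].transpose("letters", "y").values, manual
)
print(matches)
```

Rows `x=10` and `x=40` carry the label `'a'`, and rows `x=20` and `x=30` carry `'b'`, so each group sums two rows.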
Grouping by multiple variables
>>> ds.groupby(["letters", "x"])
<DatasetGroupBy, grouped over 2 grouper(s), 8 groups in total:
    'letters': 2/2 groups present with labels 'a', 'b'
    'x': 4/4 groups present with labels 10, 20, 30, 40>
Use Grouper objects to express more complicated GroupBy operations
>>> from xarray.groupers import BinGrouper, UniqueGrouper
>>>
>>> ds.groupby(x=BinGrouper(bins=[5, 15, 25]), letters=UniqueGrouper()).sum()
<xarray.Dataset> Size: 128B
Dimensions:  (y: 3, x_bins: 2, letters: 2)
Coordinates:
  * x_bins   (x_bins) object 16B (5, 15] (15, 25]
  * letters  (letters) object 16B 'a' 'b'
Dimensions without coordinates: y
Data variables:
    foo      (y, x_bins, letters) float64 96B 0.0 nan nan 3.0 ... nan nan 5.0
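The bin labels `(5, 15]` and `(15, 25]` in the output above match the right-closed intervals produced by `pandas.cut`. Assuming `BinGrouper` follows the same binning convention, the sketch below shows which `x` values land in which bin, using pandas only.

```python
import pandas as pd

x = [10, 20, 30, 40]

# Right-closed bins (5, 15] and (15, 25], as in the BinGrouper example.
binned = pd.cut(x, bins=[5, 15, 25])

# Integer bin index per value; -1 marks values outside every bin.
codes = binned.codes.tolist()
categories = [str(interval) for interval in binned.categories]

print(codes)
print(categories)
```

Under that assumption, only `x=10` (label `'a'`) and `x=20` (label `'b'`) fall inside a bin, which is consistent with the `nan` entries for the remaining bin and letter combinations in the result.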
See also
- GroupBy: Group and Bin Data
  Users guide explanation of how to group and bin data.
- Computational Patterns
  Tutorial on Groupby() for windowed computation.
- Grouped Computations
  Tutorial on Groupby() demonstrating reductions, transformation and comparison with resample().
pandas.DataFrame.groupby
Dataset.groupby_bins
DataArray.groupby
core.groupby.DatasetGroupBy
Dataset.coarsen
Dataset.resample
DataArray.resample