xarray.core.groupby.DatasetGroupBy.sum

DatasetGroupBy.sum(dim=None, skipna=True, min_count=None, keep_attrs=None, **kwargs)

Reduce this Dataset’s data by applying sum along some dimension(s).

Parameters
  • dim (hashable or iterable of hashable, optional) – Name of dimension(s) along which to apply sum. For example, dim="x" or dim=["x", "y"]. If None, will reduce over all dimensions present in the grouped variable.

  • skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).

  • min_count (int, default: None) – The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA. Only used if skipna is set to True or defaults to True for the array’s dtype. Changed in version 0.17.0: If specified on an integer array with skipna=True, the result will be a float array.

  • keep_attrs (bool, optional) – If True, attrs will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.

  • **kwargs (dict) – Additional keyword arguments passed on to the appropriate array function for calculating sum on this object’s data.
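
As a rough illustration of how dim selects which dimensions are reduced, the sketch below groups a small two-dimensional Dataset. The variable name v, the dimension names, the labels, and the source attribute are all made up for this example, and keep_attrs=True is passed only to show the call.

```python
import numpy as np
import xarray as xr

# Hypothetical 2-D data: 6 time steps by 2 columns, holding the values 0..11.
ds2d = xr.Dataset(
    {"v": (("time", "x"), np.arange(12.0).reshape(6, 2))},
    coords={"labels": ("time", ["a", "a", "b", "b", "c", "c"])},
    attrs={"source": "demo"},
)
grouped = ds2d.groupby("labels")

# Reduce only over "time": one row of sums per label, "x" is kept.
per_x = grouped.sum(dim="time")

# Reduce over both dimensions: a single total per label.
total = grouped.sum(dim=["time", "x"])

# Ask for the Dataset attributes to be carried over to the result.
with_attrs = grouped.sum(dim="time", keep_attrs=True)

print(per_x["v"].sel(labels="a").values)
print(total["v"].values)
print(with_attrs.attrs)
```

Under these assumptions, per_x keeps the x dimension alongside labels, while total has labels as its only dimension.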

Returns

reduced (Dataset) – New Dataset with sum applied to its data and the indicated dimension(s) removed.

Examples

>>> da = xr.DataArray(
...     np.array([1, 2, 3, 1, 2, np.nan]),
...     dims="time",
...     coords=dict(
...         time=("time", pd.date_range("01-01-2001", freq="M", periods=6)),
...         labels=("time", np.array(["a", "b", "c", "c", "b", "a"])),
...     ),
... )
>>> ds = xr.Dataset(dict(da=da))
>>> ds
<xarray.Dataset>
Dimensions:  (time: 6)
Coordinates:
  * time     (time) datetime64[ns] 2001-01-31 2001-02-28 ... 2001-06-30
    labels   (time) <U1 'a' 'b' 'c' 'c' 'b' 'a'
Data variables:
    da       (time) float64 1.0 2.0 3.0 1.0 2.0 nan
>>> ds.groupby("labels").sum()
<xarray.Dataset>
Dimensions:  (labels: 3)
Coordinates:
  * labels   (labels) object 'a' 'b' 'c'
Data variables:
    da       (labels) float64 1.0 4.0 4.0

Use skipna to control whether NaNs are ignored.

>>> ds.groupby("labels").sum(skipna=False)
<xarray.Dataset>
Dimensions:  (labels: 3)
Coordinates:
  * labels   (labels) object 'a' 'b' 'c'
Data variables:
    da       (labels) float64 nan 4.0 4.0

Specify min_count for finer control over when NaNs are ignored.

>>> ds.groupby("labels").sum(skipna=True, min_count=2)
<xarray.Dataset>
Dimensions:  (labels: 3)
Coordinates:
  * labels   (labels) object 'a' 'b' 'c'
Data variables:
    da       (labels) float64 nan 4.0 4.0
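
The three results above can be reproduced with a plain NumPy sketch of the skipna and min_count rules. The helper grouped_sum is hypothetical and written only for illustration; it is not how xarray implements the reduction.

```python
import numpy as np


def grouped_sum(values, labels, skipna=True, min_count=None):
    """Hypothetical helper: sum `values` per unique label.

    Mirrors the documented rules: NaNs are dropped when skipna is True,
    and min_count is only consulted in that case.
    """
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    result = {}
    for label in np.unique(labels):
        group = values[labels == label]
        if skipna:
            valid = group[~np.isnan(group)]
            if min_count is not None and valid.size < min_count:
                total = np.nan  # too few valid values in this group
            else:
                total = valid.sum()
        else:
            total = group.sum()  # any NaN propagates into the sum
        result[str(label)] = float(total)
    return result


values = [1, 2, 3, 1, 2, np.nan]
labels = ["a", "b", "c", "c", "b", "a"]

default_sum = grouped_sum(values, labels)
no_skip_sum = grouped_sum(values, labels, skipna=False)
min_count_sum = grouped_sum(values, labels, skipna=True, min_count=2)
print(default_sum, no_skip_sum, min_count_sum)
```

Group "a" holds one valid value and one NaN, so it is the only group whose result changes: it sums to 1.0 by default, and becomes NaN either when NaNs are not skipped or when at least two valid values are required.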

See also

numpy.sum, Dataset.sum

GroupBy: split-apply-combine

User guide on groupby operations.