DataArrayGroupBy.sum(dim=None, *, skipna=None, min_count=None, keep_attrs=None, **kwargs)

Reduce this DataArray’s data by applying sum along some dimension(s).

Parameters

  • dim (str, Iterable of Hashable, "..." or None, default: None) – Name of dimension[s] along which to apply sum. For example, dim="x" or dim=["x", "y"]. If None, will reduce over the GroupBy dimensions. If "...", will reduce over all dimensions.

  • skipna (bool or None, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).

  • min_count (int or None, optional) – The required number of valid values to perform the operation. If fewer than min_count non-NA values are present, the result will be NA. Only used if skipna is set to True or defaults to True for the array’s dtype. Changed in version 0.17.0: if specified on an integer array and skipna=True, the result will be a float array.

  • keep_attrs (bool or None, optional) – If True, attrs will be copied from the original object to the new one. If False, the new object will be returned without attributes.

  • **kwargs (Any) – Additional keyword arguments passed on to the appropriate array function for calculating sum on this object’s data. These could include dask-specific kwargs like split_every.
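
As a rough illustration of the dim and keep_attrs parameters described above, the sketch below groups a small two-dimensional array. The data, the "x" dimension, and the "units" attribute are made up for this example; only the keep_attrs=True and keep_attrs=False cases are shown, since the default behaviour may differ between xarray versions.

```python
import numpy as np
import xarray as xr

# Hypothetical 2-D example data: 4 time steps by 2 points along "x".
da = xr.DataArray(
    np.arange(8.0).reshape(4, 2),
    dims=("time", "x"),
    coords={"labels": ("time", ["a", "b", "a", "b"])},
    attrs={"units": "mm"},
)

grouped = da.groupby("labels")

# dim=None (the default) reduces only over the grouped dimension, "time",
# so "x" survives in the result.
per_group = grouped.sum()

# dim=... reduces over every dimension within each group.
per_group_total = grouped.sum(dim=...)

# keep_attrs=True copies attrs from the original object onto the result;
# keep_attrs=False returns the result without attributes.
with_attrs = grouped.sum(keep_attrs=True)
without_attrs = grouped.sum(keep_attrs=False)

print(per_group)
print(per_group_total)
print(with_attrs.attrs, without_attrs.attrs)
```

Here group "a" collects rows 0 and 2 and group "b" collects rows 1 and 3, so the default call keeps one sum per "x" position while dim=... collapses each group to a single number.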


Returns

reduced (DataArray) – New DataArray with sum applied to its data and the indicated dimension(s) removed.

See also

numpy.sum, dask.array.sum, DataArray.sum

GroupBy: Group and Bin Data

User guide on groupby operations.

Notes


Use the flox package to significantly speed up groupby computations, especially with dask arrays. Xarray will use flox by default if installed. Pass flox-specific keyword arguments in **kwargs. The default choice is method="cohorts", which generalizes best; other methods might work better for your problem. See the flox documentation for more.

Non-numeric variables will be removed prior to reducing.

Examples


>>> da = xr.DataArray(
...     np.array([1, 2, 3, 0, 2, np.nan]),
...     dims="time",
...     coords=dict(
...         time=("time", pd.date_range("2001-01-01", freq="M", periods=6)),
...         labels=("time", np.array(["a", "b", "c", "c", "b", "a"])),
...     ),
... )
>>> da
<xarray.DataArray (time: 6)>
array([ 1.,  2.,  3.,  0.,  2., nan])
  * time     (time) datetime64[ns] 2001-01-31 2001-02-28 ... 2001-06-30
    labels   (time) <U1 'a' 'b' 'c' 'c' 'b' 'a'
>>> da.groupby("labels").sum()
<xarray.DataArray (labels: 3)>
array([1., 4., 3.])
  * labels   (labels) object 'a' 'b' 'c'

Use skipna to control whether NaNs are ignored.

>>> da.groupby("labels").sum(skipna=False)
<xarray.DataArray (labels: 3)>
array([nan,  4.,  3.])
  * labels   (labels) object 'a' 'b' 'c'

Specify min_count for finer control over when NaNs are ignored.

>>> da.groupby("labels").sum(skipna=True, min_count=2)
<xarray.DataArray (labels: 3)>
array([nan,  4.,  3.])
  * labels   (labels) object 'a' 'b' 'c'
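
The three results above follow from how skipna and min_count interact. The plain-Python sketch below mimics those semantics on the same values and labels. The helper grouped_sum is hypothetical and written only for illustration; it is not xarray's implementation and ignores dtypes, dimensions, and dask.

```python
import math
from collections import defaultdict


def grouped_sum(values, labels, skipna=True, min_count=None):
    """Illustrative grouped sum over floats (hypothetical helper)."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[label].append(value)

    result = {}
    for label in sorted(groups):
        members = groups[label]
        if not skipna:
            # NaN propagates through ordinary addition.
            result[label] = sum(members)
            continue
        valid = [v for v in members if not math.isnan(v)]
        if min_count is not None and len(valid) < min_count:
            # Too few valid values: the group's result is NaN.
            result[label] = math.nan
        else:
            result[label] = sum(valid)
    return result


values = [1.0, 2.0, 3.0, 0.0, 2.0, math.nan]
labels = ["a", "b", "c", "c", "b", "a"]

default_sum = grouped_sum(values, labels)
strict_sum = grouped_sum(values, labels, skipna=False)
min_two_sum = grouped_sum(values, labels, skipna=True, min_count=2)
```

Group "a" holds 1.0 and NaN, so it has only one valid value. That is why it sums to 1.0 by default, becomes NaN with skipna=False, and also becomes NaN once min_count=2 demands two valid values.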