xarray.Dataset.quantile

Dataset.quantile(q, dim=None, method='linear', numeric_only=False, keep_attrs=None, skipna=None, interpolation=None)

Compute the qth quantile of the data along the specified dimension.

Returns the qth quantile(s) of the array elements for each variable in the Dataset.

Parameters
  • q (float or array-like of float) – Quantile to compute, which must be between 0 and 1 inclusive.

  • dim (str or Iterable of Hashable, optional) – Dimension(s) over which to apply quantile.

  • method (str, default: "linear") – This optional parameter specifies the interpolation method to use when the desired quantile lies between two data points. The options, sorted by their R type as summarized in the Hyndman and Fan paper [1], are:

    1. “inverted_cdf” (*)

    2. “averaged_inverted_cdf” (*)

    3. “closest_observation” (*)

    4. “interpolated_inverted_cdf” (*)

    5. “hazen” (*)

    6. “weibull” (*)

    7. “linear” (default)

    8. “median_unbiased” (*)

    9. “normal_unbiased” (*)

    The first three methods are discontinuous. The following discontinuous variations of the default “linear” (7.) option are also available:

    • “lower”

    • “higher”

    • “midpoint”

    • “nearest”

    See numpy.quantile() or [1] for details. The “method” argument was previously called “interpolation”; it was renamed to match numpy version 1.22.0.

    (*) These methods require numpy version 1.22 or newer.

  • keep_attrs (bool, optional) – If True, the dataset’s attributes (attrs) will be copied from the original object to the new one. If False (default), the new object will be returned without attributes.

  • numeric_only (bool, optional) – If True, only compute quantiles for variables with a numeric dtype.

  • skipna (bool, optional) – If True, skip missing values (as marked by NaN). By default, only skips missing values for float dtypes; other dtypes either do not have a sentinel missing value (int) or skipna=True has not been implemented (object, datetime64 or timedelta64).
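As a rough illustration of how the choice of method changes the result, the sketch below compares “linear” with its discontinuous variations on a hypothetical four-element dataset (the data and the variable name a are made up for this example). It assumes an xarray release that accepts the method argument and numpy 1.22 or newer.

```python
import xarray as xr

# Hypothetical toy data: the 0.4 quantile falls between 2.0 and 3.0
# (virtual index 0.4 * (4 - 1) = 1.2 in the sorted values).
ds = xr.Dataset({"a": ("x", [1.0, 2.0, 3.0, 4.0])})

results = {
    method: float(ds.quantile(0.4, dim="x", method=method)["a"])
    for method in ("linear", "lower", "higher", "midpoint", "nearest")
}

for method, value in results.items():
    print(f"{method:>8}: {value:.2f}")
```

Here “linear” interpolates between the two neighbouring points, “lower” and “higher” pick one of them, “midpoint” averages them, and “nearest” picks whichever is closer to the virtual index.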

Returns

quantiles (Dataset) – If q is a single quantile, then the result is a scalar for each variable in data_vars. If multiple quantiles are given, the first axis of the result corresponds to the quantile and a quantile dimension is added to the returned Dataset. The other dimensions are the dimensions that remain after the reduction of the array.
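The difference between a single quantile and a list of quantiles can be checked by inspecting the dimensions of the result. The following is a minimal sketch that reuses the dataset from the Examples section; the names single and multiple are chosen for illustration only.

```python
import xarray as xr

ds = xr.Dataset(
    {"a": (("x", "y"), [[0.7, 4.2, 9.4, 1.5], [6.5, 7.3, 2.6, 1.9]])},
    coords={"x": [7, 9], "y": [1, 1.5, 2, 2.5]},
)

# Scalar q: "x" is reduced away and "quantile" is kept as a scalar coordinate.
single = ds.quantile(0.5, dim="x")

# Sequence of q: a "quantile" dimension is added as the first axis.
multiple = ds.quantile([0.25, 0.5, 0.75], dim="x")

single_dims = single["a"].dims
multiple_dims = multiple["a"].dims
print(single_dims, multiple_dims)
```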

Examples

>>> ds = xr.Dataset(
...     {"a": (("x", "y"), [[0.7, 4.2, 9.4, 1.5], [6.5, 7.3, 2.6, 1.9]])},
...     coords={"x": [7, 9], "y": [1, 1.5, 2, 2.5]},
... )
>>> ds.quantile(0)  # or ds.quantile(0, dim=...)
<xarray.Dataset>
Dimensions:   ()
Coordinates:
    quantile  float64 0.0
Data variables:
    a         float64 0.7
>>> ds.quantile(0, dim="x")
<xarray.Dataset>
Dimensions:   (y: 4)
Coordinates:
  * y         (y) float64 1.0 1.5 2.0 2.5
    quantile  float64 0.0
Data variables:
    a         (y) float64 0.7 4.2 2.6 1.5
>>> ds.quantile([0, 0.5, 1])
<xarray.Dataset>
Dimensions:   (quantile: 3)
Coordinates:
  * quantile  (quantile) float64 0.0 0.5 1.0
Data variables:
    a         (quantile) float64 0.7 3.4 9.4
>>> ds.quantile([0, 0.5, 1], dim="x")
<xarray.Dataset>
Dimensions:   (quantile: 3, y: 4)
Coordinates:
  * y         (y) float64 1.0 1.5 2.0 2.5
  * quantile  (quantile) float64 0.0 0.5 1.0
Data variables:
    a         (quantile, y) float64 0.7 4.2 2.6 1.5 3.6 ... 1.7 6.5 7.3 9.4 1.9
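The effect of skipna on data containing NaN can be sketched as follows. The three-element dataset is hypothetical and is not part of the original examples.

```python
import numpy as np
import xarray as xr

# Hypothetical float data with one missing value.
ds = xr.Dataset({"a": ("x", [1.0, np.nan, 3.0])})

# skipna=True ignores the NaN, so this is the median of [1.0, 3.0].
skipped = float(ds.quantile(0.5, skipna=True)["a"])

# skipna=False lets the NaN propagate into the result.
kept = float(ds.quantile(0.5, skipna=False)["a"])

print(skipped, kept)
```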

References

[1] R. J. Hyndman and Y. Fan, “Sample quantiles in statistical packages,” The American Statistician, 50(4), pp. 361-365, 1996.