.. currentmodule:: xarray .. _weather-climate: Weather and climate data ======================== .. ipython:: python :suppress: import xarray as xr Xarray can leverage metadata that follows the `Climate and Forecast (CF) conventions`_ if present. Examples include automatic labelling of plots with descriptive names and units if proper metadata is present (see :ref:`plotting`) and support for non-standard calendars used in climate science through the ``cftime`` module (see :ref:`CFTimeIndex`). There are also a number of geosciences-focused projects that build on xarray (see :ref:`ecosystem`). .. _Climate and Forecast (CF) conventions: http://cfconventions.org .. _cf_variables: Related Variables ----------------- Several CF variable attributes contain lists of other variables associated with the variable with the attribute. A few of these are now parsed by xarray, with the attribute value popped to encoding on read and the variables in that value interpreted as non-dimension coordinates: - ``coordinates`` - ``bounds`` - ``grid_mapping`` - ``climatology`` - ``geometry`` - ``node_coordinates`` - ``node_count`` - ``part_node_count`` - ``interior_ring`` - ``cell_measures`` - ``formula_terms`` This decoding is controlled by the ``decode_coords`` kwarg to :py:func:`open_dataset` and :py:func:`open_mfdataset`. The CF attribute ``ancillary_variables`` was not included in the list due to the variables listed there being associated primarily with the variable with the attribute, rather than with the dimensions. .. _metpy_accessor: CF-compliant coordinate variables --------------------------------- `MetPy`_ adds a ``metpy`` accessor that allows accessing coordinates with appropriate CF metadata using generic names ``x``, ``y``, ``vertical`` and ``time``. There is also a `cartopy_crs` attribute that provides projection information, parsed from the appropriate CF metadata, as a `Cartopy`_ projection object. See `their documentation`_ for more information. .. _`MetPy`: https://unidata.github.io/MetPy/dev/index.html .. _`their documentation`: https://unidata.github.io/MetPy/dev/tutorials/xarray_tutorial.html#coordinates .. _`Cartopy`: https://scitools.org.uk/cartopy/docs/latest/crs/projections.html .. _CFTimeIndex: Non-standard calendars and dates outside the Timestamp-valid range ------------------------------------------------------------------ Through the standalone ``cftime`` library and a custom subclass of :py:class:`pandas.Index`, xarray supports a subset of the indexing functionality enabled through the standard :py:class:`pandas.DatetimeIndex` for dates from non-standard calendars commonly used in climate science or dates using a standard calendar, but outside the `Timestamp-valid range`_ (approximately between years 1678 and 2262). .. note:: As of xarray version 0.11, by default, :py:class:`cftime.datetime` objects will be used to represent times (either in indexes, as a :py:class:`~xarray.CFTimeIndex`, or in data arrays with dtype object) if any of the following are true: - The dates are from a non-standard calendar - Any dates are outside the Timestamp-valid range. Otherwise pandas-compatible dates from a standard calendar will be represented with the ``np.datetime64[ns]`` data type, enabling the use of a :py:class:`pandas.DatetimeIndex` or arrays with dtype ``np.datetime64[ns]`` and their full set of associated features. For example, you can create a DataArray indexed by a time coordinate with dates from a no-leap calendar and a :py:class:`~xarray.CFTimeIndex` will automatically be used: .. ipython:: python from itertools import product from cftime import DatetimeNoLeap dates = [ DatetimeNoLeap(year, month, 1) for year, month in product(range(1, 3), range(1, 13)) ] da = xr.DataArray(np.arange(24), coords=[dates], dims=["time"], name="foo") Xarray also includes a :py:func:`~xarray.cftime_range` function, which enables creating a :py:class:`~xarray.CFTimeIndex` with regularly-spaced dates. For instance, we can create the same dates and DataArray we created above using: .. ipython:: python dates = xr.cftime_range(start="0001", periods=24, freq="MS", calendar="noleap") da = xr.DataArray(np.arange(24), coords=[dates], dims=["time"], name="foo") Mirroring pandas' method with the same name, :py:meth:`~xarray.infer_freq` allows one to infer the sampling frequency of a :py:class:`~xarray.CFTimeIndex` or a 1-D :py:class:`~xarray.DataArray` containing cftime objects. It also works transparently with ``np.datetime64[ns]`` and ``np.timedelta64[ns]`` data. .. ipython:: python xr.infer_freq(dates) With :py:meth:`~xarray.CFTimeIndex.strftime` we can also easily generate formatted strings from the datetime values of a :py:class:`~xarray.CFTimeIndex` directly or through the ``dt`` accessor for a :py:class:`~xarray.DataArray` using the same formatting as the standard `datetime.strftime`_ convention . .. _datetime.strftime: https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior .. ipython:: python dates.strftime("%c") da["time"].dt.strftime("%Y%m%d") For data indexed by a :py:class:`~xarray.CFTimeIndex` xarray currently supports: - `Partial datetime string indexing`_: .. ipython:: python da.sel(time="0001") da.sel(time=slice("0001-05", "0002-02")) .. note:: For specifying full or partial datetime strings in cftime indexing, xarray supports two versions of the `ISO 8601 standard`_, the basic pattern (YYYYMMDDhhmmss) or the extended pattern (YYYY-MM-DDThh:mm:ss), as well as the default cftime string format (YYYY-MM-DD hh:mm:ss). This is somewhat more restrictive than pandas; in other words, some datetime strings that would be valid for a :py:class:`pandas.DatetimeIndex` are not valid for an :py:class:`~xarray.CFTimeIndex`. - Access of basic datetime components via the ``dt`` accessor (in this case just "year", "month", "day", "hour", "minute", "second", "microsecond", "season", "dayofyear", "dayofweek", and "days_in_month"): .. ipython:: python da.time.dt.year da.time.dt.month da.time.dt.season da.time.dt.dayofyear da.time.dt.dayofweek da.time.dt.days_in_month - Rounding of datetimes to fixed frequencies via the ``dt`` accessor: .. ipython:: python da.time.dt.ceil("3D") da.time.dt.floor("5D") da.time.dt.round("2D") - Group-by operations based on datetime accessor attributes (e.g. by month of the year): .. ipython:: python da.groupby("time.month").sum() - Interpolation using :py:class:`cftime.datetime` objects: .. ipython:: python da.interp(time=[DatetimeNoLeap(1, 1, 15), DatetimeNoLeap(1, 2, 15)]) - Interpolation using datetime strings: .. ipython:: python da.interp(time=["0001-01-15", "0001-02-15"]) - Differentiation: .. ipython:: python da.differentiate("time") - Serialization: .. ipython:: python da.to_netcdf("example-no-leap.nc") xr.open_dataset("example-no-leap.nc") .. ipython:: python :suppress: import os os.remove("example-no-leap.nc") - And resampling along the time dimension for data indexed by a :py:class:`~xarray.CFTimeIndex`: .. ipython:: python da.resample(time="81T", closed="right", label="right", base=3).mean() .. note:: For some use-cases it may still be useful to convert from a :py:class:`~xarray.CFTimeIndex` to a :py:class:`pandas.DatetimeIndex`, despite the difference in calendar types. The recommended way of doing this is to use the built-in :py:meth:`~xarray.CFTimeIndex.to_datetimeindex` method: .. ipython:: python :okwarning: modern_times = xr.cftime_range("2000", periods=24, freq="MS", calendar="noleap") da = xr.DataArray(range(24), [("time", modern_times)]) da datetimeindex = da.indexes["time"].to_datetimeindex() da["time"] = datetimeindex However in this case one should use caution to only perform operations which do not depend on differences between dates (e.g. differentiation, interpolation, or upsampling with resample), as these could introduce subtle and silent errors due to the difference in calendar types between the dates encoded in your data and the dates stored in memory. .. _Timestamp-valid range: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timestamp-limitations .. _ISO 8601 standard: https://en.wikipedia.org/wiki/ISO_8601 .. _partial datetime string indexing: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#partial-string-indexing