xarray.Dataset.interpolate_na
- Dataset.interpolate_na(dim=None, method='linear', limit=None, use_coordinate=True, max_gap=None, **kwargs)
Fill in NaNs by interpolating according to different methods.
- Parameters
dim (Hashable or None, optional) – Specifies the dimension along which to interpolate.
method ({"linear", "nearest", "zero", "slinear", "quadratic", "cubic", "polynomial", "barycentric", "krogh", "pchip", "spline", "akima"}, default: "linear") – String indicating which method to use for interpolation:
  - "linear": linear interpolation. Additional keyword arguments are passed to numpy.interp().
  - "nearest", "zero", "slinear", "quadratic", "cubic", "polynomial": passed to scipy.interpolate.interp1d(). If method="polynomial", the order keyword argument must also be provided.
  - "barycentric", "krogh", "pchip", "spline", "akima": use their respective scipy.interpolate classes.
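As a rough illustration of the "linear" method, the following NumPy-only sketch fills NaNs by evaluating numpy.interp() at every position, using only the valid points as the known data. It is a conceptual sketch and not xarray's implementation; the choice of left=np.nan and right=np.nan, which leaves leading and trailing NaNs untouched, is an assumption made here so that the result matches variable A in the Examples section.

```python
import numpy as np

# Coordinate values and data with missing entries (variable "A" from the Examples).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
a = np.array([np.nan, 2.0, 3.0, np.nan, 0.0])

# Use only the valid points as the known (x, y) pairs.
valid = ~np.isnan(a)

# Evaluate the piecewise-linear interpolant at every x.
# Positions outside the range of valid points are set to NaN (assumption:
# this mirrors the default behaviour without fill_value).
filled = np.interp(x, x[valid], a[valid], left=np.nan, right=np.nan)

print(filled)  # the interior NaN at x=3 becomes 1.5; the leading NaN stays NaN
```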
use_coordinate (bool or Hashable, default: True) – Specifies which index to use as the x values in the interpolation formulated as y = f(x). If False, values are treated as if equally spaced along dim. If True, the IndexVariable dim is used. If use_coordinate is a string, it specifies the name of a coordinate variable to use as the index.
limit (int, default: None) – Maximum number of consecutive NaNs to fill. Must be greater than 0, or None for no limit. This filling is done regardless of the size of the gap in the data. To only interpolate over gaps less than a given length, see max_gap.
max_gap (int, float, str, pandas.Timedelta, numpy.timedelta64, datetime.timedelta or None, default: None) – Maximum size of gap, a continuous sequence of NaNs, that will be filled. Use None for no limit. When interpolating along a datetime64 dimension and use_coordinate=True, max_gap can be one of the following:
  - a string that is valid input for pandas.to_timedelta
  - a numpy.timedelta64 object
  - a pandas.Timedelta object
  - a datetime.timedelta object
Otherwise, max_gap must be an int or a float. Use of max_gap with unlabeled dimensions has not been implemented yet. Gap length is defined as the difference between coordinate values at the first data point after a gap and the last value before a gap. For gaps at the beginning (end), gap length is defined as the difference between coordinate values at the first (last) valid data point and the first (last) NaN. For example, consider:
<xarray.DataArray (x: 9)>
array([nan, nan, nan,  1., nan, nan,  4., nan, nan])
Coordinates:
  * x        (x) int64 0 1 2 3 4 5 6 7 8
The gap lengths are 3-0 = 3; 6-3 = 3; and 8-6 = 2, respectively.
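The gap-length definition above can be checked with a small NumPy-only sketch. The helper gap_lengths is hypothetical (it is not part of xarray and does not reproduce xarray's internal algorithm); it simply walks over each run of NaNs and applies the stated rule. Treating an all-NaN array as one gap spanning the whole coordinate is an assumption that the text above does not cover.

```python
import numpy as np

def gap_lengths(values, coord):
    """Return the length of each run of NaNs, measured in coordinate units.

    Hypothetical helper for illustration only; not an xarray function.
    """
    values = np.asarray(values, dtype=float)
    coord = np.asarray(coord, dtype=float)
    is_nan = np.isnan(values)
    n = len(values)
    lengths = []
    i = 0
    while i < n:
        if not is_nan[i]:
            i += 1
            continue
        # Find the end of this run of NaNs: positions i .. j-1 are NaN.
        j = i
        while j < n and is_nan[j]:
            j += 1
        # Interior gap: last valid point before the gap to first valid point after.
        # Leading gap: first NaN to first valid point.
        # Trailing gap: last valid point to last NaN.
        left = coord[i - 1] if i > 0 else coord[0]
        right = coord[j] if j < n else coord[n - 1]
        lengths.append(float(right - left))
        i = j
    return lengths

values = [np.nan, np.nan, np.nan, 1.0, np.nan, np.nan, 4.0, np.nan, np.nan]
coord = np.arange(9)
print(gap_lengths(values, coord))  # [3.0, 3.0, 2.0]
```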
**kwargs (dict, optional) – Parameters passed verbatim to the underlying interpolation function.
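As an example of a keyword argument forwarded through **kwargs, the sketch below passes order together with method="polynomial", as the method description requires. It assumes SciPy is installed, since this method is routed to scipy.interpolate.interp1d(). The dataset and the variable name y are made up for illustration; the data lie on y = x**2, so a second-order fit should recover a value close to 4 at x = 2.

```python
import numpy as np
import xarray as xr

# Illustrative data on the parabola y = x**2, with the value at x=2 missing.
ds = xr.Dataset(
    {"y": ("x", [0.0, 1.0, np.nan, 9.0, 16.0])},
    coords={"x": np.arange(5)},
)

# `order` is not a named parameter of interpolate_na; it travels through
# **kwargs and is required when method="polynomial".
filled = ds.interpolate_na(dim="x", method="polynomial", order=2)

print(filled["y"].values)
```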
- Returns
interpolated (Dataset) – Filled-in Dataset.
Warning
When fill_value is passed as a keyword argument with method="linear", interpolation does not use numpy.interp; it uses scipy.interpolate.interp1d instead, which provides the fill_value parameter.
Examples
>>> ds = xr.Dataset(
...     {
...         "A": ("x", [np.nan, 2, 3, np.nan, 0]),
...         "B": ("x", [3, 4, np.nan, 1, 7]),
...         "C": ("x", [np.nan, np.nan, np.nan, 5, 0]),
...         "D": ("x", [np.nan, 3, np.nan, -1, 4]),
...     },
...     coords={"x": [0, 1, 2, 3, 4]},
... )
>>> ds
<xarray.Dataset> Size: 200B
Dimensions:  (x: 5)
Coordinates:
  * x        (x) int64 40B 0 1 2 3 4
Data variables:
    A        (x) float64 40B nan 2.0 3.0 nan 0.0
    B        (x) float64 40B 3.0 4.0 nan 1.0 7.0
    C        (x) float64 40B nan nan nan 5.0 0.0
    D        (x) float64 40B nan 3.0 nan -1.0 4.0

>>> ds.interpolate_na(dim="x", method="linear")
<xarray.Dataset> Size: 200B
Dimensions:  (x: 5)
Coordinates:
  * x        (x) int64 40B 0 1 2 3 4
Data variables:
    A        (x) float64 40B nan 2.0 3.0 1.5 0.0
    B        (x) float64 40B 3.0 4.0 2.5 1.0 7.0
    C        (x) float64 40B nan nan nan 5.0 0.0
    D        (x) float64 40B nan 3.0 1.0 -1.0 4.0

>>> ds.interpolate_na(dim="x", method="linear", fill_value="extrapolate")
<xarray.Dataset> Size: 200B
Dimensions:  (x: 5)
Coordinates:
  * x        (x) int64 40B 0 1 2 3 4
Data variables:
    A        (x) float64 40B 1.0 2.0 3.0 1.5 0.0
    B        (x) float64 40B 3.0 4.0 2.5 1.0 7.0
    C        (x) float64 40B 20.0 15.0 10.0 5.0 0.0
    D        (x) float64 40B 5.0 3.0 1.0 -1.0 4.0
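The examples above use an evenly spaced coordinate, so they do not show the effect of use_coordinate or limit. The following sketch, on a made-up dataset with an unevenly spaced x coordinate, may help. The variable names by_coord, by_position and limited are illustrative only, and the values in the comments follow from the parameter descriptions under the default "linear" method.

```python
import numpy as np
import xarray as xr

# Illustrative dataset: one gap of three NaNs, with a large jump in x at the end.
ds = xr.Dataset(
    {"A": ("x", [0.0, np.nan, np.nan, np.nan, 10.0])},
    coords={"x": [0, 1, 2, 3, 10]},
)

# Default: the x coordinate supplies the x values of y = f(x).
by_coord = ds.interpolate_na(dim="x", use_coordinate=True)
# A -> 0.0, 1.0, 2.0, 3.0, 10.0

# use_coordinate=False: points are treated as equally spaced along x.
by_position = ds.interpolate_na(dim="x", use_coordinate=False)
# A -> 0.0, 2.5, 5.0, 7.5, 10.0

# limit=1: at most one consecutive NaN is filled in each gap.
limited = ds.interpolate_na(dim="x", limit=1)
# A -> 0.0, 1.0, nan, nan, 10.0

print(by_coord["A"].values)
print(by_position["A"].values)
print(limited["A"].values)
```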