xarray.Dataset.dropna
- Dataset.dropna(dim, *, how='any', thresh=None, subset=None)
Returns a new dataset with dropped labels for missing values along the provided dimension.
- Parameters
dim (hashable) – Dimension along which to drop missing values. Dropping along multiple dimensions simultaneously is not yet supported.
how ({"any", "all"}, default: "any") –
  any : if any NA values are present, drop that label
  all : if all values are NA, drop that label
thresh (int or None, optional) – If supplied, require this many non-NA values (summed over all the subset variables).
subset (iterable of hashable or None, optional) – Which variables to check for missing values. By default, all variables in the dataset are checked.
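Because dim accepts a single dimension, one way to drop along several dimensions is to chain calls. The following is a minimal sketch under that assumption; the dataset and the name `cleaned` are illustrative only, and the order of the calls matters because each call sees the result of the previous one.

```python
import numpy as np
import xarray as xr

# Illustrative data: location "C" is entirely NaN, time 2 has one extra NaN.
ds = xr.Dataset(
    {
        "temperature": (
            ["time", "location"],
            [[23.4, 24.1, np.nan], [np.nan, 22.1, np.nan], [21.8, 24.2, np.nan]],
        )
    },
    coords={"time": [1, 2, 3], "location": ["A", "B", "C"]},
)

# First drop locations that are NaN everywhere, then drop any time
# label that still contains a NaN.
cleaned = ds.dropna(dim="location", how="all").dropna(dim="time", how="any")

print(cleaned["location"].values.tolist())
print(cleaned["time"].values.tolist())
```

Reversing the two calls would give a different result here: with location "C" still present, every time label contains a NaN, so how="any" along time would remove all of them.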
Examples
>>> dataset = xr.Dataset(
...     {
...         "temperature": (
...             ["time", "location"],
...             [[23.4, 24.1], [np.nan, 22.1], [21.8, 24.2], [20.5, 25.3]],
...         )
...     },
...     coords={"time": [1, 2, 3, 4], "location": ["A", "B"]},
... )
>>> dataset
<xarray.Dataset> Size: 104B
Dimensions:      (time: 4, location: 2)
Coordinates:
  * time         (time) int64 32B 1 2 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 64B 23.4 24.1 nan ... 24.2 20.5 25.3
Drop NaN values from the dataset
>>> dataset.dropna(dim="time")
<xarray.Dataset> Size: 80B
Dimensions:      (time: 3, location: 2)
Coordinates:
  * time         (time) int64 24B 1 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 48B 23.4 24.1 21.8 24.2 20.5 25.3
Drop labels with any NaN values
>>> dataset.dropna(dim="time", how="any")
<xarray.Dataset> Size: 80B
Dimensions:      (time: 3, location: 2)
Coordinates:
  * time         (time) int64 24B 1 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 48B 23.4 24.1 21.8 24.2 20.5 25.3
Drop labels with all NaN values
>>> dataset.dropna(dim="time", how="all")
<xarray.Dataset> Size: 104B
Dimensions:      (time: 4, location: 2)
Coordinates:
  * time         (time) int64 32B 1 2 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 64B 23.4 24.1 nan ... 24.2 20.5 25.3
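In the dataset above no time label is entirely NaN, so how="all" leaves it unchanged. As a rough sketch of the difference between the two modes, the following uses a separate, made-up dataset (the names `ds`, `dropped_any` and `dropped_all` are illustrative) in which one label is completely missing:

```python
import numpy as np
import xarray as xr

# Illustrative data: time 1 is partly NaN, time 2 is entirely NaN.
ds = xr.Dataset(
    {
        "temperature": (
            ["time", "location"],
            [[23.4, np.nan], [np.nan, np.nan], [21.8, 24.2]],
        )
    },
    coords={"time": [1, 2, 3], "location": ["A", "B"]},
)

# how="any" drops every time label that has at least one NaN.
dropped_any = ds.dropna(dim="time", how="any")

# how="all" drops only the time label where every value is NaN.
dropped_all = ds.dropna(dim="time", how="all")

print(dropped_any["time"].values.tolist())
print(dropped_all["time"].values.tolist())
```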
Drop labels with fewer than 2 non-NA values
>>> dataset.dropna(dim="time", thresh=2)
<xarray.Dataset> Size: 80B
Dimensions:      (time: 3, location: 2)
Coordinates:
  * time         (time) int64 24B 1 3 4
  * location     (location) <U1 8B 'A' 'B'
Data variables:
    temperature  (time, location) float64 48B 23.4 24.1 21.8 24.2 20.5 25.3
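The examples above use a single data variable, so they do not show how subset and thresh behave across variables. The sketch below assumes a made-up dataset with a second variable, "pressure", defined only along time; all names and values are illustrative. It restricts the check to one variable with subset, then shows that thresh counts non-NA values summed over every checked variable.

```python
import numpy as np
import xarray as xr

# Illustrative data: two variables that share the "time" dimension.
ds = xr.Dataset(
    {
        "temperature": (
            ["time", "location"],
            [[23.4, 24.1], [np.nan, 22.1], [21.8, 24.2], [np.nan, np.nan]],
        ),
        "pressure": (["time"], [1012.0, 1009.5, np.nan, 1011.2]),
    },
    coords={"time": [1, 2, 3, 4], "location": ["A", "B"]},
)

# Only "temperature" is checked, so the NaN in "pressure" at time 3
# does not cause that label to be dropped.
only_temperature = ds.dropna(dim="time", subset=["temperature"])

# Non-NA counts per time label, summed over both variables:
# time 1 -> 3, time 2 -> 2, time 3 -> 2, time 4 -> 1.
at_least_two = ds.dropna(dim="time", thresh=2)

print(only_temperature["time"].values.tolist())
print(at_least_two["time"].values.tolist())
```

Note that the other variables are still subset along time in the result; subset only controls which variables are inspected for missing values.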
- Returns
Dataset
See also