Reshaping and reorganizing data#

Reshaping and reorganizing data refers to the process of changing the structure or organization of data by modifying dimensions, array shapes, order of values, or indexes. Xarray provides several methods to accomplish these tasks.

These methods are particularly useful for reshaping xarray objects for use in machine learning packages, such as scikit-learn, that usually require two-dimensional NumPy arrays as inputs. Reshaping can also be required before passing data to external visualization tools; for example, geospatial tools might expect input organized into a particular format corresponding to stacks of satellite images.

Importing the library#
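
The examples on this page refer to xarray through the alias xr. A minimal setup, assuming xarray and NumPy are installed in the current environment, might look like this:

```python
# Conventional aliases; the examples on this page refer to xarray as `xr`.
import numpy as np
import xarray as xr
```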

Shift and roll#

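To offset data relative to its coordinate labels, you can use the shift() and roll() methods. shift() fills the vacated positions with NaN by default, while roll() wraps values around to the other end: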

array = xr.DataArray([1, 2, 3, 4], dims="x")
array.shift(x=2)

<xarray.DataArray (x: 4)> Size: 32B
array([nan, nan,  1.,  2.])
Dimensions without coordinates: x

array.roll(x=2, roll_coords=True)

<xarray.DataArray (x: 4)> Size: 32B
array([3, 4, 1, 2])
Dimensions without coordinates: x
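
The difference between the two methods is easier to see when the dimension carries coordinate labels. The sketch below is illustrative only: the array `arr` and its labels 10 to 40 are made up for this example, and it uses the `fill_value` argument of shift() and the `roll_coords` argument of roll().

```python
import xarray as xr

# Hypothetical array with explicit coordinate labels on "x".
arr = xr.DataArray([1, 2, 3, 4], dims="x", coords={"x": [10, 20, 30, 40]})

# shift() moves the data and leaves the labels alone; vacated slots take
# fill_value (NaN when no fill_value is given, which promotes ints to float).
shifted = arr.shift(x=1, fill_value=0)
print(shifted.values)  # [0 1 2 3]
print(shifted["x"].values)  # [10 20 30 40]

# roll() wraps data around; with roll_coords=False the labels stay in place.
rolled_data = arr.roll(x=1, roll_coords=False)
print(rolled_data.values)  # [4 1 2 3]
print(rolled_data["x"].values)  # [10 20 30 40]

# With roll_coords=True the labels rotate together with the data.
rolled_both = arr.roll(x=1, roll_coords=True)
print(rolled_both.values)  # [4 1 2 3]
print(rolled_both["x"].values)  # [40 10 20 30]
```

Passing `roll_coords` explicitly, as above, makes the intent clear to readers regardless of the default in the installed version.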

Reshaping via coarsen#

While coarsen is normally used for reducing your data’s resolution by applying a reduction function (see the page on computation), it can also be used to reorganize your data without applying a computation via construct().

Taking our example tutorial air temperature dataset over the Northern US

air = xr.tutorial.open_dataset("air_temperature")["air"]

air.isel(time=0).plot(x="lon", y="lat");

we can split this up into sub-regions of size (9, 18) points using construct():

regions = air.coarsen(lat=9, lon=18, boundary="pad").construct(
    lon=("x_coarse", "x_fine"), lat=("y_coarse", "y_fine")
)
with xr.set_options(display_expand_data=False):
    regions

Nine new regions have been created, arranged in a 3 by 3 grid, each of size 9 by 18 points. The boundary="pad" keyword argument ensures that all regions are the same size even though the data does not divide evenly into windows of this size; the missing points are filled with NaN.
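
The region count and the effect of padding can be checked on a small stand-in array. This is a sketch under an assumption: the tutorial grid is taken to have 25 latitude and 53 longitude points, and the random `field` below is synthetic data with that shape, not the real dataset.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for one time step (assumed 25 x 53 grid, random values).
field = xr.DataArray(
    np.random.default_rng(0).random((25, 53)), dims=("lat", "lon")
)

regions = field.coarsen(lat=9, lon=18, boundary="pad").construct(
    lon=("x_coarse", "x_fine"), lat=("y_coarse", "y_fine")
)

# 25 points pad up to 27 = 3 windows of 9; 53 points pad up to 54 = 3 windows of 18.
print(dict(regions.sizes))

# Padded cells are NaN: 27 * 54 - 25 * 53 = 133 of them.
n_padded = int(regions.isnull().sum())
print(n_padded)  # 133
```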

By plotting these 9 regions together via faceting we can see how they relate to the original data.

regions.isel(time=0).plot(
    x="x_fine", y="y_fine", col="x_coarse", row="y_coarse", yincrease=False
);

We are now free to apply any custom computation to each coarsened region of our new DataArray. This involves specifying that the applied functions should act over the "x_fine" and "y_fine" dimensions, but broadcast over the "x_coarse" and "y_coarse" dimensions.
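
As one possible illustration of such a per-region computation, the sketch below takes the mean of each region and subtracts it from the region's own points. The small 6 by 8 array `field` and the 3 by 4 window are made up for the example so that the numbers are easy to verify by hand.

```python
import numpy as np
import xarray as xr

# Hypothetical 6 x 8 field holding the values 0..47, split into 2 x 2 regions.
field = xr.DataArray(np.arange(48.0).reshape(6, 8), dims=("lat", "lon"))
regions = field.coarsen(lat=3, lon=4).construct(
    lon=("x_coarse", "x_fine"), lat=("y_coarse", "y_fine")
)

fine_dims = ("y_fine", "x_fine")

# Reduce over the fine dimensions: one value is left per coarse region.
region_mean = regions.mean(dim=fine_dims)
print(region_mean.values)  # [[ 9.5 13.5] [33.5 37.5]]

# The per-region means broadcast back over the fine dimensions, so each
# point is compared against the mean of its own region.
anomaly = regions - region_mean
```

Any reduction that accepts a `dim` argument can be substituted for `mean` here; the coarse dimensions are simply carried along.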
