earth and related environmental sciences

Introduction to the xarray EOPF backend

Brockmann Consult GmbH

Introduction

xarray-eopf is a Python package that extends xarray with a custom backend called "eopf-zarr". This backend enables seamless access to ESA EOPF data products stored in the Zarr format, presenting them as analysis-ready data structures.

This notebook demonstrates how to use the xarray-eopf backend to access EOPF Zarr datasets. It highlights the key features currently supported by the backend.


Install the xarray-eopf Backend

The backend is implemented as an xarray plugin and can be installed using either pip or conda/mamba from the conda-forge channel.

  • PyPI (xarray-eopf on PyPI): pip install xarray-eopf

  • Conda (conda-forge, xarray-eopf on Anaconda): conda install -c conda-forge xarray-eopf

    You can also use Mamba as a faster alternative to Conda: mamba install -c conda-forge xarray-eopf


Import Modules

The xarray-eopf backend is implemented as a plugin for xarray. Once installed, it registers automatically and requires no additional import. You can simply import xarray as usual:

import datetime

import pystac_client
import xarray as xr

Main Features of the xarray-eopf Backend

The xarray-eopf backend for EOPF data products is selected by setting engine="eopf-zarr" in xarray.open_dataset(..) or xarray.open_datatree(..). All data access is lazy, meaning that data is loaded only when required, for example during plotting or when writing to storage. The backend supports two modes of operation:

  • Analysis Mode (default)

  • Native Mode

Native Mode

Represents EOPF products without modification using xarray's DataTree and Dataset.

  • open_dataset(path, engine="eopf-zarr", op_mode="native", chunks={}): Returns a flattened version of the data tree

  • open_datatree(path, engine="eopf-zarr", op_mode="native", chunks={}): Returns the full DataTree, same as xr.open_datatree(.., engine="zarr")

Analysis Mode

Provides an analysis-ready, resampled view of the data (currently for Sentinel-2 only).

  • open_dataset(path, engine="eopf-zarr", op_mode="analysis", chunks={}): Loads Sentinel-2 products in a harmonized, analysis-ready format

  • open_datatree(path, engine="eopf-zarr", op_mode="analysis", chunks={}): Not implemented in this mode (raises NotImplementedError)

In the following sections, we demonstrate these functions using a Sentinel-2 L2A product.

For additional examples, refer to the notebooks provided for each Sentinel-1, Sentinel-2, and Sentinel-3 product, available in both native and analysis modes.

The native mode operates identically across all Sentinel missions. In contrast, the analysis mode is customized for each individual mission, as illustrated in the respective notebook examples.

Further information about the analysis mode for each Sentinel mission can be found in the xarray-eopf Guide.
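
The call signatures listed above share the same three keyword arguments, so it can be convenient to build them in one place. The helper below is a hypothetical convenience function, not part of xarray-eopf; it only assembles a dictionary from the argument names used in this notebook:

```python
VALID_OP_MODES = ("analysis", "native")


def eopf_open_kwargs(op_mode="analysis", **extra):
    """Build keyword arguments for xr.open_dataset / xr.open_datatree.

    Hypothetical helper: it mirrors the calls shown in this notebook and
    rejects unknown modes early instead of failing inside the backend.
    """
    if op_mode not in VALID_OP_MODES:
        raise ValueError(f"op_mode must be one of {VALID_OP_MODES}, got {op_mode!r}")
    kwargs = {"engine": "eopf-zarr", "op_mode": op_mode, "chunks": {}}
    kwargs.update(extra)  # e.g. group_sep="/" or resolution=60
    return kwargs


native_kwargs = eopf_open_kwargs("native", group_sep="/")
print(native_kwargs)
```

It would be used as xr.open_dataset(path, **eopf_open_kwargs("native")), where path is the product URL.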


Find a Sentinel Zarr Sample via STAC

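To obtain a product URL, you can use the STAC Browser to search for a Sentinel-2 tile, or query the STAC API directly with pystac_client as done here. The search covers a small bounding box over the last 30 days, and the query parameter restricts it to tiles with less than 40% cloud cover, improving the chances of a clear plot.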

catalog = pystac_client.Client.open("https://stac.core.eopf.eodc.eu")
items = list(
    catalog.search(
        collections=["sentinel-2-l2a"],
        bbox=[7.2, 44.5, 7.4, 44.7],
        datetime=[str(datetime.date.today() - datetime.timedelta(days=30)), None],
        query={"eo:cloud_cover": {"lt": 40}},
    ).items()
)
items
[<Item id=S2A_MSIL2A_20260326T102701_N0512_R108_T32TLQ_20260326T172711>, <Item id=S2B_MSIL2A_20260326T102019_N0512_R065_T32TLQ_20260326T155529>, <Item id=S2B_MSIL2A_20260319T103019_N0512_R108_T32TLQ_20260319T151320>, <Item id=S2A_MSIL2A_20260316T103041_N0512_R108_T32TLQ_20260316T184508>, <Item id=S2B_MSIL2A_20260316T101649_N0512_R065_T32TLQ_20260316T155405>, <Item id=S2A_MSIL2A_20260313T101741_N0512_R065_T32TLQ_20260313T171916>, <Item id=S2C_MSIL2A_20260304T102921_N0512_R108_T32TLQ_20260304T160811>]
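
The search returns every matching item, and taking the first one is an arbitrary choice. If you prefer the clearest scene, you can rank the results by the same eo:cloud_cover property used in the query. The sketch below assumes only that each item exposes a properties dictionary, as pystac items do; the stand-in items and their ids are made up so the example runs without network access:

```python
from types import SimpleNamespace


def least_cloudy(items):
    """Return the item with the lowest eo:cloud_cover value."""
    # Items without the property are ranked last rather than raising.
    return min(items, key=lambda it: it.properties.get("eo:cloud_cover", float("inf")))


# Stand-ins for pystac items (hypothetical ids and values).
demo_items = [
    SimpleNamespace(id="scene-a", properties={"eo:cloud_cover": 31.5}),
    SimpleNamespace(id="scene-b", properties={"eo:cloud_cover": 2.4}),
    SimpleNamespace(id="scene-c", properties={}),
]
best = least_cloudy(demo_items)
print(best.id)  # scene-b
```

With the real search results, least_cloudy(items) could replace items[0].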

Next, we can inspect the item's contents, including the additional field xarray:open_datatree_kwargs, which provides the arguments needed to open the product using xarray's eopf-zarr engine.

item = items[0]
item

Open Sentinel-2 Level-2A in Native Mode as DataTree

We can use the "product" asset to obtain the href and xarray:open_datatree_kwargs from the STAC item, and open the product as an xarray.DataTree as shown below:

dt = xr.open_datatree(
    item.assets["product"].href,
    **item.assets["product"].extra_fields["xarray:open_datatree_kwargs"]
)
dt
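
The same href-plus-kwargs lookup recurs for every asset, so it can be wrapped in a small function. This is a sketch with a hypothetical helper name; it relies only on the assets, href and extra_fields attributes used above, and the stand-in item (including its URL and kwargs values) is invented so the example runs offline:

```python
from types import SimpleNamespace


def asset_open_args(item, asset_key, kwargs_field):
    """Return (href, kwargs) for one asset of a STAC item.

    Falls back to empty kwargs when the asset does not carry the field.
    """
    asset = item.assets[asset_key]
    return asset.href, dict(asset.extra_fields.get(kwargs_field, {}))


# Stand-in for a STAC item (placeholder URL and assumed kwargs values).
demo_item = SimpleNamespace(
    assets={
        "product": SimpleNamespace(
            href="https://example.com/product.zarr",
            extra_fields={
                "xarray:open_datatree_kwargs": {"engine": "eopf-zarr", "op_mode": "native"}
            },
        )
    }
)
href, open_kwargs = asset_open_args(demo_item, "product", "xarray:open_datatree_kwargs")
print(href, open_kwargs)
```

With a real item, the result feeds straight into xr.open_datatree(href, **open_kwargs).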

As an example, we plot the red band (b04) at 60 meters resolution, which will trigger loading and visualization of the data.

dt.measurements.reflectance.r60m.b04.plot.imshow(vmin=0, vmax=1)
<Figure size 640x480 with 2 Axes>

Open Sentinel-2 Level-2A Reflectance Groups in Native Mode as Dataset

Similarly, we can open the individual reflectance groups at 10 m (asset "SR_10m"), 20 m (asset "SR_20m"), and 60 m (asset "SR_60m") to access each group as an xarray.Dataset, as shown below:

ds = xr.open_dataset(
    item.assets["SR_60m"].href,
    **item.assets["SR_60m"].extra_fields["xarray:open_dataset_kwargs"]
)
ds

We can plot an RGB image as an example, using the red (b04), green (b03), and blue (b02) spectral bands.

ax = (
    (ds[["b04", "b03", "b02"]].to_dataarray(dim="band") / 0.3)
    .clip(0, 1)
    .plot.imshow(rgb="band")
)
ax.axes.set_aspect("equal")
<Figure size 640x480 with 1 Axes>
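
The division by 0.3 followed by clip(0, 1) in the RGB plot is a simple linear contrast stretch: reflectances at or above 0.3 map to full brightness, which brightens typical land surfaces. The scalar sketch below reproduces that arithmetic in plain Python; the function name is made up, and 0.3 is the value used in the plotting code rather than a fixed rule:

```python
def stretch(reflectance, upper=0.3):
    """Scale a reflectance to [0, 1], saturating at `upper`."""
    return min(max(reflectance / upper, 0.0), 1.0)


for value in (-0.05, 0.0, 0.15, 0.3, 0.6):
    print(value, "->", stretch(value))
```

A smaller upper value brightens the image further at the cost of saturating more bright pixels such as clouds or snow.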

Open Sentinel-2 Level-2A in Native Mode as Dataset

The xarray.DataTree data model is relatively new, introduced in xarray v2024.10.0 (October 2024). To support compatibility with existing workflows that rely on the traditional xr.Dataset model, we provide the function xarray.open_dataset(path, engine="eopf-zarr", op_mode="native", **kwargs). This function flattens the DataTree structure and returns a single xr.Dataset.

In this process, hierarchical groups within the Zarr product are removed by converting their contents into standalone datasets and merging them into one. To ensure uniqueness, variable and dimension names are prefixed with their original group paths, using an underscore (_) as the default separator. For example, a variable named b02 located in the group measurements/reflectance/r10m will be renamed to measurements_reflectance_r10m_b02 in the returned dataset.

ds = xr.open_dataset(
    item.assets["product"].href,
    engine="eopf-zarr",
    op_mode="native",
    chunks={},
)
ds
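
The renaming rule described above can be illustrated in isolation. This sketch is not the backend's implementation; it only reproduces the documented naming scheme of joining the group path and the variable name with the separator:

```python
def flatten_name(group_path, name, group_sep="_"):
    """Prefix a variable name with its group path, joined by group_sep."""
    parts = [part for part in group_path.split("/") if part]
    return group_sep.join(parts + [name])


print(flatten_name("measurements/reflectance/r10m", "b02"))
# measurements_reflectance_r10m_b02
print(flatten_name("measurements/reflectance/r10m", "b02", group_sep="/"))
# measurements/reflectance/r10m/b02
```

The second call shows the effect of choosing "/" as separator, which keeps the flattened names readable as paths.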

The separator character used in flattened variable names can be customized via the group_sep parameter. Additionally, you can filter the returned variables using the variables keyword argument, which accepts a string, an iterable of names, or a regular expression (regex) pattern.

ds = xr.open_dataset(
    item.assets["product"].href,
    engine="eopf-zarr",
    op_mode="native",
    chunks={},
    group_sep="/",
    variables="measurements/r60m/b0[234]",
)
ds
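
To see what a pattern such as b0[234] selects, it helps to try it against a list of names first. The sketch below is an assumption about the matching rule: it requires the pattern to match the whole name, and the backend may behave differently. The function and the list of names are illustrative only:

```python
import re


def filter_names(names, variables):
    """Select names by a regex string or by an iterable of exact names.

    Assumption: a string is treated as a regex that must match the full name.
    """
    if isinstance(variables, str):
        pattern = re.compile(variables)
        return [name for name in names if pattern.fullmatch(name)]
    wanted = set(variables)
    return [name for name in names if name in wanted]


names = [f"measurements/r60m/b0{i}" for i in range(1, 6)]
selected = filter_names(names, "measurements/r60m/b0[234]")
print(selected)
```

Here the character class [234] keeps b02, b03 and b04 and drops b01 and b05.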

As before, we can now select the three filtered bands and display them as an RGB image.

array = ds[
    ["measurements/r60m/b04", "measurements/r60m/b03", "measurements/r60m/b02"]
].to_dataarray(dim="band")
ax = (array / 0.3).clip(0, 1).plot.imshow(rgb="band")
ax.axes.set_aspect("equal")
<Figure size 640x480 with 1 Axes>

Open Sentinel-2 Level-2A in Analysis Mode as Dataset

Next, we use the "product" asset to open the Sentinel-2 product in analysis mode, which presents EOPF products as an analysis-ready xarray Dataset with a single, unified grid mapping. For Sentinel-2, all variables are automatically upscaled or downscaled so that the dataset shares one consistent pair of x and y coordinates. Note that analysis mode is the default in the xarray EOPF plugin.

We use 60 m resolution to keep plotting fast, as full-resolution data can be slow to render in Matplotlib. For full details on xarray.open_dataset configuration options, see the xarray-eopf documentation.

ds = xr.open_dataset(
    items[0].assets["product"].href, engine="eopf-zarr", resolution=60, chunks={}
)
ds

We can plot one spectral band as an example:

ds.b04.plot(vmin=0, vmax=1)
<Figure size 640x480 with 2 Axes>
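
The choice of 60 m matters because the pixel count grows with the square of the resolution ratio. A Sentinel-2 tile spans 109.8 km in each direction, so the size of the shared grid follows from simple division. The sketch below only does this arithmetic and says nothing about how the backend resamples; the function name is made up:

```python
TILE_EXTENT_M = 109_800  # edge length of a Sentinel-2 tile in metres


def grid_shape(resolution_m):
    """Return the (rows, columns) of a full tile at the given pixel size."""
    size, remainder = divmod(TILE_EXTENT_M, resolution_m)
    if remainder:
        raise ValueError(f"{resolution_m} m does not divide the tile extent evenly")
    return size, size


for resolution in (10, 20, 60):
    rows, cols = grid_shape(resolution)
    print(f"{resolution} m: {rows} x {cols} = {rows * cols:,} pixels")
```

At 60 m a band holds 36 times fewer pixels than at 10 m, which is why the plots in this notebook render quickly.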

Conclusion

This notebook highlighted the main features of the xarray-eopf backend, which enables seamless access to EOPF Zarr products using familiar Xarray methods (xr.open_dataset and xr.open_datatree) by specifying engine="eopf-zarr". Key takeaways:

  • Two operation modes:

    • op_mode="native": Represents EOPF products without modification using xarray's DataTree and Dataset structures.

    • op_mode="analysis": Provides an analysis-ready, resampled view of the data and is available for xarray.open_dataset only.

  • Opening parameters are integrated into the STAC items, allowing streamlined data access and management.

  • Subgroups of the data tree are resolved via individual STAC assets and can be accessed with xarray.open_dataset using the provided parameters.

Further Examples

For additional examples, see the notebooks for Sentinel-1, Sentinel-2, and Sentinel-3 products, in both native and analysis modes.