EOPF xarray backend
xarray-eopf
is a Python package that enhances xarray by a new backend
named "eopf-zarr"
. This backend allows for reading the ESA EOPF data products
in Zarr format and representing them using analysis ready data models.
Overview
After installation of the package, you can open EOPF data products using the
usual xarray top-level functions open_dataset()
that provides
a Dataset
object:
dataset = xr.open_dataset(url_or_path, engine="eopf-zarr")
and open_datatree()
that provides a DataTree
object including groups:
datatree = xr.open_datatree(url_or_path, engine="eopf-zarr")
Features
IMPORTANT
Thexarray-eopf
package is in a preliminary development state. The following features are only partly or not at all implemented yet.
The backend supports two distinct modes of operation, "native"
and "analysis"
:
-
op_mode="analysis"
- using the analysis mode, the returnedDataset
orDataTree
objects attempt to be user-friendly and analysis-ready to a maximum extend. Certain preprocessing steps will be applied depending on the specific Sentinel product towards an analysis-ready data model and content. For example, for Sentinel-2, we will support the selection of specific bands and resampling to a desired spatial resolution. For Sentinel-3 we will allow for rectifying the images in satellite coordinates to a user-provided, common CRS. It can be used as shown in the examples below:dataset = xr.open_dataset(url_or_path, engine="eopf-zarr", op_mode="analysis") datatree = xr.open_datatree(url_or_path, engine="eopf-zarr", op_mode="analysis")
-
op_mode="native"
- using the native mode, the returnedDataset
orDataTree
objects try to serve as a 1:1 representation of the actual Zarr product structure and content with either none or minimal preprocessing applied. It can be used as shown in the examples below:dataset = xr.open_dataset(url_or_path, engine="eopf-zarr", op_mode="native") datatree = xr.open_datatree(url_or_path, engine="eopf-zarr", op_mode="native")
More information on the two modes are given in the User Guide
Data variables will always be represented as chunked Dask arrays for efficient out-of core computations and visualisations.
The package has minimal core dependencies: xarray
, zarr
, and dask
.
Packages for accessing remote filesystems are optional, e.g., you will need s3fs
if you need to access EOPF data products in S3-compatible remote object storages.
License
The package is open source and released under the
Apache 2.0 license license.