Cubes & Clouds logo

Aggregate Operators with Pangeo#

Aggregate Operators#

resample: temporal aggregation with predefined intervals#

Start importing the necessary libraries.

# STAC Catalogue Libraries
import pystac_client
import stackstac
spatial_extent = [11.4, 45.5, 11.42, 45.52]
temporal_extent = ["2022-01-01", "2022-12-31"]
bands = ["red","green","blue"]
URL = "https://earth-search.aws.element84.com/v1"
catalog = pystac_client.Client.open(URL)
items = catalog.search(
    bbox=spatial_extent,
    datetime=temporal_extent,
    collections=["sentinel-2-l2a"]
).item_collection()

Create the starting Sentinel-2 datacube:

s2_cube = stackstac.stack(items,
                     bounds_latlon=spatial_extent,
                     assets=bands
)
s2_cube
/srv/conda/envs/notebook/lib/python3.11/site-packages/stackstac/prepare.py:408: UserWarning: The argument 'infer_datetime_format' is deprecated and will be removed in a future version. A strict version of it is now the default, see https://pandas.pydata.org/pdeps/0004-consistent-to-datetime-parsing.html. You can safely remove this argument.
  times = pd.to_datetime(
<xarray.DataArray 'stackstac-ff1f9c70d01b039d57a5e43d35c43d18' (time: 73,
                                                                band: 3,
                                                                y: 228, x: 164)>
dask.array<fetch_raster_window, shape=(73, 3, 228, 164), dtype=float64, chunksize=(1, 1, 228, 164), chunktype=numpy.ndarray>
Coordinates: (12/54)
  * time                                     (time) datetime64[ns] 2022-01-03...
    id                                       (time) <U24 'S2B_32TPR_20220103_...
  * band                                     (band) <U5 'red' 'green' 'blue'
  * x                                        (x) float64 6.874e+05 ... 6.891e+05
  * y                                        (y) float64 5.044e+06 ... 5.041e+06
    grid:code                                <U10 'MGRS-32TPR'
    ...                                       ...
    title                                    (band) <U20 'Red (band 4) - 10m'...
    proj:transform                           object {0, 5100000, 600000, 10, ...
    common_name                              (band) <U5 'red' 'green' 'blue'
    center_wavelength                        (band) float64 0.665 0.56 0.49
    full_width_half_max                      (band) float64 0.038 0.045 0.098
    epsg                                     int64 32632
Attributes:
    spec:        RasterSpec(epsg=32632, bounds=(687430.0, 5041290.0, 689070.0...
    crs:         epsg:32632
    transform:   | 10.00, 0.00, 687430.00|\n| 0.00,-10.00, 5043570.00|\n| 0.0...
    resolution:  10.0

We might be interested in aggregating our data over periods like week, month, year etc., defining what operation to use to combine the data available in the chosen period.

Using resample with a sampling frequency (e.g. ‘1MS’ ) to specify how to resample the data, we can achieve this easily:

s2_monthly_min = s2_cube.resample(time="1MS"). min(dim="time")

Check what happens to the datacube inspecting the resulting xArray object. Now the time dimension has 12 labels, one for each month.

s2_monthly_min
<xarray.DataArray 'stackstac-ff1f9c70d01b039d57a5e43d35c43d18' (time: 12,
                                                                band: 3,
                                                                y: 228, x: 164)>
dask.array<transpose, shape=(12, 3, 228, 164), dtype=float64, chunksize=(1, 1, 228, 164), chunktype=numpy.ndarray>
Coordinates: (12/24)
  * band                                     (band) <U5 'red' 'green' 'blue'
  * x                                        (x) float64 6.874e+05 ... 6.891e+05
  * y                                        (y) float64 5.044e+06 ... 5.041e+06
    grid:code                                <U10 'MGRS-32TPR'
    mgrs:grid_square                         <U2 'PR'
    constellation                            <U10 'sentinel-2'
    ...                                       ...
    proj:transform                           object {0, 5100000, 600000, 10, ...
    common_name                              (band) <U5 'red' 'green' 'blue'
    center_wavelength                        (band) float64 0.665 0.56 0.49
    full_width_half_max                      (band) float64 0.038 0.045 0.098
    epsg                                     int64 32632
  * time                                     (time) datetime64[ns] 2022-01-01...
Attributes:
    spec:        RasterSpec(epsg=32632, bounds=(687430.0, 5041290.0, 689070.0...
    crs:         epsg:32632
    transform:   | 10.00, 0.00, 687430.00|\n| 0.00,-10.00, 5043570.00|\n| 0.0...
    resolution:  10.0