Forest Clear-Cut Detection using Sentinel-2 Time Series
1. Installation of SITS package and its depedencies
First, install sits package with pip. We also need some other packages for displaying data.
[1]:
#!pip install -q --upgrade sits
Now we can import sits modules and some other libraries.
[2]:
import os
from sits import sits, analysis
import geopandas as gpd
from shapely.geometry import box
from datetime import datetime
import matplotlib.pyplot as plt
We also create a directory to store the output files.
[3]:
out_dir = 'test_data'
if not os.path.exists(out_dir):
os.makedirs(out_dir)
2. Study area
We have selected a study site in the Morvan Regional Natural Park, France. The workflow requires two spatial definitions: a bounding box in WGS84 (lat/lon) for querying the STAC catalog, and a projected version in EPSG:3035 to define the output CRS for analysis.
[4]:
# area of study
bbox_4326 = [4.000000, 47.000000, 4.100000, 47.050000]
print(f"bbox in EPSG:4326: {bbox_4326}")
geom = box(*bbox_4326)
# Load into a GeoDataFrame and set the original CRS (WGS84)
gdf = gpd.GeoDataFrame({'geometry': [geom]}, crs="EPSG:4326")
# Transform to EPSG:3035
gdf_3035 = gdf.to_crs("EPSG:3035")
# Extract the bounds as a list: [minx, miny, maxx, maxy]
bbox_3035 = list(gdf_3035.total_bounds)
print(f"bbox in EPSG:3035: {bbox_3035}")
bbox in EPSG:4326: [4.0, 47.0, 4.1, 47.05]
bbox in EPSG:3035: [3864842.734478761, 2671911.2763597923, 3872853.778674381, 2678061.0457251826]
[5]:
# open an interactive map directly in the notebook
gdf.explore()
[5]:
<div style="text-align: center;">
<img src="img/cc_area.png" alt="clear-cut method" width="500">
</div>
3. Pre-processing: STAC Queries and Forest Masking
3.1. Sentinel-2 time series
The retrieval process involves collecting Sentinel-2 (Level-2A) imagery acquired between January 1, 2019, and January 1, 2025, filtered for a cloud cover of less than 20%. We then construct a four-band geospatial data cube, consisting of B04 (Red), B08 (NIR), B11 (SWIR), and SCL (Scene Classification Layer), reprojected to EPSG:3035 at a 10m spatial resolution. During this process, radiometric offsets are corrected. Finally, the SCL layer is utilized to mask invalid pixels, such as clouds and cloud shadows (refer to tutorial 01 for additional details).
[6]:
bands = ['B04', 'B08', 'B11', 'SCL']
cloud_cover = 20
ts_S2 = sits.StacAttack(provider='mpc',
collection='sentinel-2-l2a',
bands=bands)
# search of items based on bbox coordinates and time interval criteria
ts_S2.searchItems(bbox_4326,
date_start=datetime(2019, 1, 1),
date_end=datetime(2025, 1, 1),
query={"eo:cloud_cover": {"lt": cloud_cover}})
# load of the time series in a lazy way
ts_S2.loadCube(bbox_3035, resolution=10, crs_out=3035)
ts_S2.fixS2shift()
ts_S2.mask_conf(mask_values=[0, 1, 3, 6, 8, 9, 10, 11])
ts_S2.mask_apply()
ts_S2.cube
[6]:
<xarray.Dataset> Size: 2GB
Dimensions: (time: 201, y: 616, x: 802)
Coordinates:
* y (y) float64 5kB 2.678e+06 2.678e+06 ... 2.672e+06 2.672e+06
* x (x) float64 6kB 3.865e+06 3.865e+06 ... 3.873e+06 3.873e+06
* time (time) datetime64[ns] 2kB 2019-02-05T10:52:31.024000 ... 202...
spatial_ref int64 8B 0
Data variables:
B04 (time, y, x) float32 397MB dask.array<chunksize=(1, 612, 612), meta=np.ndarray>
B08 (time, y, x) float32 397MB dask.array<chunksize=(1, 612, 612), meta=np.ndarray>
B11 (time, y, x) float32 397MB dask.array<chunksize=(1, 612, 612), meta=np.ndarray>
SCL (time, y, x) float32 397MB dask.array<chunksize=(1, 612, 612), meta=np.ndarray>3.2. Forest mask
The algorithm detects abrupt signal drops regardless of land cover type. Therefore, a forest mask is required to focus the detection exclusively on forested regions and avoid false positives in other areas. For this example, we use the ESRI 10 meter land cover product. This dataset provides a global estimates of 10-class land use/land cover (LULC) for 2020, derived from ESA Sentinel-2 imagery at 10m resolution.
[7]:
lulc = sits.StacAttack(provider='mpc',
collection='io-lulc',
bands=['data'])
lulc.searchItems(bbox_4326,
date_start=datetime(2019, 1, 1),
date_end=datetime(2020, 12, 31))
lulc.loadCube(bbox_3035, resolution=10, crs_out=3035)
lulc.cube.data.plot()
plt.axis('scaled')
plt.show()
Datetime parsing failed: time data '2020-06-01T00:00:00Z' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'
/home/osekenj/.local/lib/python3.11/site-packages/rasterio/warp.py:387: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
dest = _reproject(
Like most land-cover classifications, this dataset uses integers to represent different classes. We need to determine the specific code used for forest pixels to isolate them for our analysis.
[8]:
for index, value in enumerate(lulc.items_prop["label:classes"][0][0]["classes"]):
print(f"Class index: {index}, Value: {value}")
Class index: 0, Value: nodata
Class index: 1, Value: water
Class index: 2, Value: trees
Class index: 3, Value: grass
Class index: 4, Value: flooded veg
Class index: 5, Value: crops
Class index: 6, Value: scrub
Class index: 7, Value: built area
Class index: 8, Value: bare
Class index: 9, Value: snow/ice
Class index: 10, Value: clouds
Following the ESRI 10m LULC scheme, the value 2 corresponds to the Forest class.
[9]:
forest_mask = (lulc.cube.data != 2).squeeze(drop=True)
forest_mask.plot()
plt.axis('scaled')
plt.show()
To constrain the satellite data using the forest mask, we utilize again the StacAttack.mask_conf() method by passing our boolean array to the mask_array parameter.
[10]:
ts_S2.mask_conf(mask_array=forest_mask)
ts_S2.mask_apply()
4. Clear-cut mapping
4.1. A few words about the method
This detection method exploits vegetation index time series to map forest clear-cuts by identifying abrupt temporal declines. The algorithm uses a convolution-based approach to calculate a value \(d\), which quantifies the transition of a pixel’s index value from a high, stable forest signal to a low post-harvest signal. A clear-cut is detected whenever \(d\) exceeds a configurable threshold \(d_{THRES}\) at a specific date \(t_i\). The intensity of the drop is categorized into
“strong” or “very strong” magnitudes, and metadata, such as the number of samples used for detection, are recorded to validate the result. The algorithm processes each unmasked pixel independently as a temporal stack. Originally implemented via the Orfeo ToolBox (OTB), this approach is now available in sits package.

Clear-cut detection in one pixel of Sentinel-2 time series. Only valid pixel values are shown. The top chart presents the NDVI time series, the bottom chart presents the “NDVI drop” time series. Maximum drop greater than the specified threshold \(d_{THRES}\) is labeled as clear-cut.
source:
Ose and R. Cresson, “Clear-Cuts Detection Services for The Monitoring Needs of the French Ministry of Agriculture,” IGARSS 2019 - 2019 IEEE International Geoscience and Remote Sensing Symposium, Yokohama, Japan, 2019, pp. 4284-4287, doi: 10.1109/IGARSS.2019.8897825.
4.2. Time series of Vegetation Index
As explained in tutorial 03, sits utilizes the spyndex package to calculate spectral indices. In this step, we generate a time series of NDMI.
[11]:
band_mapping = {'B': 'B02', 'G': 'B03', 'R': 'B04', 'N': 'B08', 'S1': 'B11'}
index = 'NDMI'
ts_S2.spectral_index(index, band_mapping)
4.3. Clear-cuts detection
The clear-cut detection logic is implemented within the ClearCut class of the sits.analysis module. This class is designed to detect abrupt changes in univariate time series (e.g., spectral indices like NDVI) by comparing observations within two sliding temporal windows:
Backward Window: summarizes past observations to establish a baseline.
Forward Window: summarizes upcoming observations to detect potential drops.
A change is flagged when the forward window shows a significant decline relative to the backward window, exceeding a user-defined threshold. The detection process follows two mandatory stages:
ClearCut.season_of_interest(): this method filters the data cube to retain only observations within a specific yearly range. By default, we use start=”06-01” and end=”08-31” to focus on the summer season. This is crucial for studying deciduous broadleaf forests; during winter, the natural loss of leaves significantly impacts vegetation index values, which frequently leads to false detections.ClearCut.detect_anomalies(): this core method identifies anomalies by calculating the magnitude of change (the absolute difference between the means of the backward and forward windows). An anomaly is recorded if this magnitude exceeds the primary threshold.
Key Parameters: > - thresholds (list, default [0.2, 0.3, 0.4]): Values used to classify the severity of the anomaly. > - window_backward / window_forward (int): The temporal extent (in days) for the past (720d) and future (60d) baselines. > - min_obs_backward / min_obs_forward (int): The minimum required valid pixels within those windows. If these are not met, the algorithm searches beyond the window limits, and the inrange flag is set to False. > - out_crs: The target CRS for the output (default “EPSG:3035”).
Output Dataset: the resulting Clearcut.detection is an xarray.Dataset containing: > - magnitude_first / max / last: The intensity of the first, strongest, and most recent anomalies. > - date_first / max / last: The corresponding dates (Unix epoch) of these events. > - mask: A binary layer indicating where an anomaly was detected. > - inrange: A flag indicating if the detection stayed within the defined temporal window constraints. > - classif: A categorical layer classifying the event
based on the provided thresholds (low, medium and high confidence).
Optionally, you can export the result to a NetCDF file using ClearCut.detection.to_netcdf() method.
[12]:
cc_detect = analysis.ClearCut(ts_S2.indices.NDMI)
cc_detect.season_of_interest()
cc_detect.detect_anomalies(thresholds=[0.25, 0.35, 0.45],
min_obs_backward=5,
min_obs_forward=5)
cc_detect.detection.to_netcdf(os.path.join(out_dir, f"clearcut_example.nc"))
The output xarray.Dataarray includes 9 variables:
[13]:
cc_detect.detection
[13]:
<xarray.Dataset> Size: 17MB
Dimensions: (y: 616, x: 802)
Coordinates:
* y (y) float64 5kB 2.678e+06 2.678e+06 ... 2.672e+06 2.672e+06
* x (x) float64 6kB 3.865e+06 3.865e+06 ... 3.873e+06 3.873e+06
time datetime64[ns] 8B 2019-06-02T10:40:31.024000
spatial_ref int64 8B 0
Data variables:
magnitude_first (y, x) float32 2MB -0.3028 -0.3303 -0.5003 ... nan nan nan
date_first (y, x) float32 2MB 1.878e+04 1.878e+04 ... nan nan
magnitude_max (y, x) float32 2MB -0.3028 -0.3303 -0.5003 ... nan nan nan
date_max (y, x) float32 2MB 1.878e+04 1.878e+04 ... nan nan
magnitude_last (y, x) float32 2MB -0.2884 -0.2538 -0.304 ... nan nan nan
date_last (y, x) float32 2MB 1.879e+04 1.879e+04 ... nan nan
mask (y, x) bool 494kB True True True True ... False False False
inrange (y, x) bool 494kB True True True True ... True True True
classif (y, x) int64 4MB 1 1 3 2 3 3 3 3 0 0 ... 0 0 0 0 0 0 0 0 0Now we can visualize the results using an interactive Folium map. The following variables are displayed:
date_max: The image acquisition date on which the maximum magnitude drop was detected within the time series.classif: A classification result where 1 represents low magnitude, 2 medium magnitude, and 3 high magnitude. These classes can be interpreted as the level of detection confidence.magnitude_max: The maximum recorded value of the magnitude drop.
[14]:
import folium
from shapely.geometry import Point
import numpy as np
from rasterio.enums import Resampling
def add_xarray_to_map(m, da, name, cmap_name='magma', vmin=None, vmax=None, opacity=0.9):
"""
Reprojects an xarray DataArray, colorizes it, and adds it to a folium map.
"""
# 1. Reproject to WGS84
# Note: Using astype('float32') avoids the 'bool' KeyError
da_4326 = (da.astype('float32')
.rio.write_crs("EPSG:3035", inplace=True)
.rio.reproject("EPSG:4326", resampling=Resampling.nearest))
# 2. Get Bounds for Folium: [[min_lat, min_lon], [max_lat, max_lon]]
b = da_4326.rio.bounds()
bounds = [[b[1], b[0]], [b[3], b[2]]]
# 3. Colorize the data
data = da_4326.values
v_min = vmin if vmin is not None else np.nanmin(data)
v_max = vmax if vmax is not None else np.nanmax(data)
norm = plt.Normalize(vmin=v_min, vmax=v_max)
cmap = plt.get_cmap(cmap_name)
rgba_data = cmap(norm(data))
# 4. Add to map
folium.raster_layers.ImageOverlay(
image=rgba_data,
bounds=bounds,
opacity=opacity,
name=name
).add_to(m)
return bounds
# --- Main Logic ---
# Initialize Map
m = folium.Map(tiles=None)
# Add Satellite Basemap
folium.TileLayer(
tiles='https://mt1.google.com/vt/lyrs=s&x={x}&y={y}&z={z}',
attr='Google',
name='Google Satellite',
overlay=False
).add_to(m)
# Process layers using the function
classif_filtered = cc_detect.detection.classif.where(
cc_detect.detection.classif.isin([1, 2, 3])
)
layers = [
(cc_detect.detection.date_max, "Change Date", 'viridis', None, None),
(classif_filtered, "Change Classif", 'YlOrRd', None, None),
(cc_detect.detection.magnitude_max, "Change Magnitude", 'magma', None, None),
]
last_bounds = None
for da, name, cmap, vmin, vmax in layers:
last_bounds = add_xarray_to_map(m, da, name, cmap_name=cmap, vmin=vmin, vmax=vmax)
# Add 2 reference points
point_A = {"coord_3035":[3869300, 2674000], "coord_4326":[47.01649, 4.05690]}
point_B = {"coord_3035":[3869100, 2675500], "coord_4326":[47.0298, 4.05275]}
folium.Marker(
location=point_A["coord_4326"],
popup="A",
icon=folium.Icon(color="green", icon="leaf")
).add_to(m)
folium.Marker(
location=point_B["coord_4326"],
popup="B",
icon=folium.Icon(color="red", icon="leaf")
).add_to(m)
# Center the map automatically based on the last layer
if last_bounds:
m.fit_bounds(last_bounds)
folium.LayerControl().add_to(m)
m
[14]:
<div style="text-align: center;">
<img src="img/cc_map.png" alt="clear-cut method" width="500">
</div>
4.4. Plotting the reference points
In the figure above, we have represented two points that illustrate different forest dynamics:
Point A (Green): Represents a stable forest area unaffected by changes.
Point B (Red): Corresponds to a clear-cut area.
We will now perform a detailed time-series analysis of the NDMI for both points to better understand the behavior of the sits.analysis.ClearCut method. We utilize the sits.analysis.SitsPlotter class to visualize the index time series and execute the change detection algorithm. This class handles extraction, outlier filtering, and plotting of temporal data for a specific spatial coordinate. It segments the time series into ‘fitting’ and ‘monitoring’ windows relative to a break date and
supports statistical outlier removal (Sigma or IQR).
Reference site A
[15]:
d1 = datetime(2019, 1, 1)
d2 = datetime(2025, 1, 1)
daterange = [d1, d2]
plot_A = analysis.SitsPlotter(ts_S2.indices, "NDMI",
daterange,
break_date=datetime(2021, 1, 1),
coords=point_A["coord_3035"],
ylim=[-0.25, 0.80],
filter_method="sigma",
filter_value=1.5)
plot_A.plot_model(degree=3, include_trend=True, envelope_quantiles=(0.05, 0.95))
[15]:
array([ 6.31387562e-01, -2.60446376e-05, 5.13758415e-02, -2.37047788e-02,
1.38931659e-02, 6.14276945e-03, -8.39480969e-03, 1.23986521e-02])
Reference point B
Since Point B represents a clear-cut site, we execute the run_detection() method of analysis.SitsPlotter to visualize the change detection results. This allows us to examine the temporal anomalies and identify the break date corresponding to the maximum magnitude detected by the analysis.ClearCut algorithm.
[16]:
plot_B = analysis.SitsPlotter(ts_S2.indices, "NDMI",
daterange,
break_date=datetime(2021, 1, 1),
coords=point_B["coord_3035"],
ylim=[-0.25, 0.80],
filter_method="sigma",
filter_value=1.5)
plot_B.plot_model(degree=3, include_trend=True, envelope_quantiles=(0.05, 0.95))
plot_B.run_detection(thresholds=[0.25, 0.35, 0.45])
The class also supports exporting plots directly to PNG files if needed.
[17]:
plot_B.save_plot(out_dir)
[17]:
'test_data/sits_NDMI.png'