Module sits
Here you will find all the core sits methods. All classes and functions listed here are available after running: from sits import sits
sits.def_geobox
- sits.sits.def_geobox(bbox, crs_out=3035, resolution=10, shape=None)
This function creates an odc geobox.
- Parameters:
bbox (list) – coordinates of a bounding box in CRS units.
crs_out (int, optional) – CRS (EPSG code) of output coordinates. Defaults to 3035.
resolution (float, optional) – output spatial resolution in CRS units. Defaults to 10 (meters).
shape (tuple, optional) – output image size in pixels (x, y). Defaults to None.
- Returns:
geobox object
- Return type:
odc.geo.geobox.GeoBox
Example
>>> bbox = [100, 100, 200, 220]
>>> crs_out = 3035
>>> # output geobox closest to the input bbox
>>> geobox = def_geobox(bbox, crs_out)
>>> # output geobox with the same dimensions (number of rows and columns)
>>> # as the input shape.
>>> geobox = def_geobox(bbox, crs_out, shape=(10, 10))
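The relationship between bbox, resolution and shape can be sketched in plain Python. This only illustrates the grid arithmetic, assuming the bbox is ordered [xmin, ymin, xmax, ymax] and no snapping to a pixel origin. The helpers grid_shape and resolution_from_shape are hypothetical and belong neither to sits nor to odc-geo, so the geobox actually returned by def_geobox may be aligned differently.

```python
import math


def grid_shape(bbox, resolution=10):
    """Number of pixels (x, y) needed to cover bbox at the given resolution."""
    xmin, ymin, xmax, ymax = bbox
    nx = math.ceil((xmax - xmin) / resolution)
    ny = math.ceil((ymax - ymin) / resolution)
    return nx, ny


def resolution_from_shape(bbox, shape):
    """Pixel size (x, y) obtained when bbox is divided into a fixed shape."""
    xmin, ymin, xmax, ymax = bbox
    nx, ny = shape
    return (xmax - xmin) / nx, (ymax - ymin) / ny


bbox = [100, 100, 200, 220]
print(grid_shape(bbox, resolution=10))              # (10, 12)
print(resolution_from_shape(bbox, shape=(10, 10)))  # (10.0, 12.0)
```

Fixing the resolution lets the bbox decide the number of pixels, whereas fixing the shape lets the bbox decide the pixel size.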
sits.StacAttack
- class sits.sits.StacAttack(provider='mpc', collection='sentinel-2-l2a', key_sat='s2', bands=['B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B11', 'B12', 'SCL'])
Bases:
object
This class aims to request time-series datasets from a STAC catalog and store them as image or csv files.
- stac_conf
parameters for building datacube (xArray) from STAC items.
- Type:
dict
- Parameters:
provider (str, optional) – stac provider. Defaults to ‘mpc’. Can be one of the following: ‘mpc’ (Microsoft Planetary Computer), ‘aws’ (Amazon Web Services).
collection (str, optional) – stac collection. Defaults to ‘sentinel-2-l2a’.
bands (list, optional) – names of the bands to load. Defaults to [‘B02’, ‘B03’, ‘B04’, ‘B05’, ‘B06’, ‘B07’, ‘B08’, ‘B8A’, ‘B11’, ‘B12’, ‘SCL’]
Example
>>> stacObj = StacAttack()
- filter_by_mask(mask_cover: float = 0.5, cube: str = 'sat', mask_update: bool = True)
Filters time steps in the specified data cube based on the ratio of masked pixels.
- Parameters:
mask_cover (float, optional) – maximum allowed ratio of masked pixels (min:0, max:1). Defaults to 0.5.
cube (str, optional) – datacube type. Defaults to ‘sat’. Can be one of the following: ‘sat’, ‘indices’.
mask_update (bool, optional) – update the related mask array. Defaults to True.
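As a rough illustration of the filtering rule, the sketch below keeps the time steps whose share of masked pixels does not exceed mask_cover. It works on nested lists rather than on xarray objects, the helper names are hypothetical, and treating the threshold as inclusive is an assumption.

```python
def masked_ratio(mask):
    """Share of masked (truthy) pixels in a 2-D mask given as nested lists."""
    pixels = [bool(value) for row in mask for value in row]
    return sum(pixels) / len(pixels)


def keep_time_steps(masks, mask_cover=0.5):
    """Indices of the time steps whose masked ratio does not exceed mask_cover."""
    # Assumption: a ratio exactly equal to mask_cover is kept.
    return [t for t, mask in enumerate(masks) if masked_ratio(mask) <= mask_cover]


masks = [
    [[0, 0], [0, 1]],  # 25 % masked
    [[1, 1], [1, 0]],  # 75 % masked
    [[1, 0], [0, 1]],  # 50 % masked
]
print(keep_time_steps(masks, mask_cover=0.5))  # [0, 2]
```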
- fixS2shift(shiftval=-1000, minval=1, proc_keyword='s2:processing_baseline', version=4.0, mask='SCL')
Fix Sentinel-2 radiometric offset applied since the ESA Processing Baseline 04.00. For more information: https://sentinels.copernicus.eu/web/sentinel/-/copernicus-sentinel-2-major-products-upgrade-upcoming
- Parameters:
shiftval (int) – radiometric offset value. Defaults to -1000.
minval (int) – minimum radiometric value. Defaults to 1.
proc_keyword (str) – item metadata related to the version of Sentinel-2 processing baseline. Defaults to ‘s2:processing_baseline’.
version (float) – version of the processing baseline. Defaults to 4.0.
mask (str) – name of mask variable. Defaults to ‘SCL’.
- Returns:
StacAttack.image with corrected radiometric values.
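The correction can be pictured with a minimal sketch on plain lists. It assumes that the offset is added only to items whose processing baseline is at least version, and that minval acts as a lower clip; both points are assumptions about the behaviour, and correct_offset is a hypothetical helper rather than the method's implementation. The mask band (e.g. ‘SCL’) holds class codes and is assumed to be left untouched.

```python
def correct_offset(values, baseline, shiftval=-1000, minval=1, version=4.0):
    """Shift digital numbers of recent baselines and clip them to minval."""
    if baseline < version:
        return list(values)  # older baselines carry no offset
    # Assumption: shifted values below minval are raised to minval.
    return [max(value + shiftval, minval) for value in values]


print(correct_offset([1000, 1500, 3000], baseline=5.09))  # [1, 500, 2000]
print(correct_offset([1000, 1500, 3000], baseline=3.0))   # [1000, 1500, 3000]
```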
- gapfill(method='linear', first_last=True, **kwargs)
Gap-fill NaN pixel values through the satellite time-series.
- Parameters:
method (string, optional) – method to use for interpolation (see xarray.DataArray.interpolate_na). Defaults to ‘linear’.
first_last (bool, optional) – interpolation of the first and last images of the satellite time-series with xarray.DataArray.bfill and xarray.DataArray.ffill. Defaults to True.
**kwargs – other arguments of xarray.DataArray.interpolate_na.
Example
>>> stacObj.gapfill()
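To show what linear gap-filling with first_last=True amounts to, here is a minimal sketch for a single pixel series. It assumes regularly spaced time steps and uses plain lists; gapfill_linear is a hypothetical helper, whereas the method itself relies on xarray (interpolate_na, bfill, ffill), which can also take the actual time coordinates into account.

```python
import math


def gapfill_linear(series, first_last=True):
    """Linearly interpolate NaN gaps in a 1-D series sampled at regular steps."""
    filled = list(series)
    valid = [i for i, value in enumerate(filled) if not math.isnan(value)]
    if not valid:
        return filled  # nothing to interpolate from
    # Interior gaps: interpolate between the nearest valid neighbours.
    for left, right in zip(valid, valid[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    if first_last:
        # Leading gap is back-filled, trailing gap is forward-filled.
        for i in range(valid[0]):
            filled[i] = filled[valid[0]]
        for i in range(valid[-1] + 1, len(filled)):
            filled[i] = filled[valid[-1]]
    return filled


nan = float("nan")
print(gapfill_linear([nan, 1.0, nan, 3.0, nan]))  # [1.0, 1.0, 2.0, 3.0, 3.0]
```

With first_last=False the leading and trailing NaN values are left as they are, since linear interpolation has no second neighbour to work from.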
- loadCube(bbox, arrtype='image', dimx=5, dimy=5, resolution=10, crs_out=3035)
Load images according to a bounding box, optionally with predefined pixel dimensions (x, y).
- Parameters:
bbox (list) – coordinates of bounding box [xmin, ymin, xmax, ymax] in the output crs unit.
arrtype (string, optional) – xarray dataset name. Defaults to ‘image’. Can be one of the following: ‘patch’, ‘image’, ‘masked’.
dimx (int, optional) – number of pixels in columns. Defaults to 5.
dimy (int, optional) – number of pixels in rows. Defaults to 5.
resolution (float, optional) – spatial resolution (in crs unit). Defaults to 10.
crs_out (int, optional) – CRS of output coordinates. Defaults to 3035.
- Returns:
geobox object StacAttack.geobox; time-series image StacAttack.cube.
- Return type:
odc.geo.geobox.GeoBox; xarray.Dataset
Example
>>> aoi_bounds = [0, 0, 1, 1]
>>> stacObj.loadCube(aoi_bounds, arrtype='patch', dimx=10, dimy=10)
- mask_apply()
Apply the mask pre-loaded as StacAttack.mask on the satellite time-series StacAttack.cube.
Example
>>> stacObj.mask_conf()
>>> stacObj.mask_apply()
- mask_conf(mask_array=None, mask_band='SCL', mask_values=[3, 8, 9, 10])
Load binary mask.
- Parameters:
mask_array (xarray.DataArray, optional) – xarray.DataArray binary mask (with same dimensions as StacAttack.cube). Defaults to None.
mask_band (string, optional) – band name used as a mask (e.g. ‘SCL’ for Sentinel-2). Defaults to ‘SCL’.
mask_values (list, optional) – band values related to masked pixels. Defaults to [3, 8, 9, 10].
- Returns:
time-series of binary masks StacAttack.mask.
- Return type:
xarray.DataArray
Example
>>> stacObj.mask_conf()
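The default mask_values correspond to the Sentinel-2 scene classification classes 3 (cloud shadows), 8 and 9 (cloud medium and high probability) and 10 (thin cirrus). The sketch below shows, on nested lists, how such a binary mask can be derived and then applied. The helpers binary_mask and apply_mask are hypothetical, and replacing masked pixels with NaN is an assumption about how the mask is applied.

```python
def binary_mask(band, mask_values=(3, 8, 9, 10)):
    """True where the classification band holds one of the masked values."""
    masked = set(mask_values)
    return [[value in masked for value in row] for row in band]


def apply_mask(image, mask):
    """Replace masked pixels with NaN (assumption), leaving the others untouched."""
    return [
        [float("nan") if is_masked else value
         for value, is_masked in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


scl = [[4, 9], [3, 5]]
mask = binary_mask(scl)
print(mask)  # [[False, True], [True, False]]
print(apply_mask([[0.2, 0.9], [0.1, 0.3]], mask))
```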
- searchItems(bbox_latlon, date_start=datetime.datetime(2023, 1, 1, 0, 0), date_end=datetime.datetime(2023, 12, 31, 0, 0), **kwargs)
Get list of stac collection’s items.
- Parameters:
bbox_latlon (list) – coordinates of bounding box.
date_start (datetime.datetime, optional) – start date. Defaults to datetime.datetime(2023, 1, 1).
date_end (datetime.datetime, optional) – end date. Defaults to datetime.datetime(2023, 12, 31).
**kwargs – other STAC-compliant arguments.
- Returns:
list of stac collection items StacAttack.items.
- Return type:
pystac.ItemCollection
Example
>>> stacObj.searchItems(aoi_bounds_4326)
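For orientation, the sketch below assembles the kind of parameters a STAC API search takes: a collection, a bounding box in longitude/latitude, a date interval and optional extra filters. It is not the method's implementation; build_search_params is a hypothetical helper, the bounding box values are illustrative, and naive datetimes are assumed to be in UTC.

```python
from datetime import datetime


def build_search_params(bbox_latlon, date_start, date_end,
                        collection="sentinel-2-l2a", **kwargs):
    """Assemble STAC API search parameters as a plain dictionary."""
    params = {
        "collections": [collection],
        # STAC bounding boxes are ordered [west, south, east, north].
        "bbox": list(bbox_latlon),
        # A closed interval is written 'start/end' with RFC 3339 timestamps.
        "datetime": f"{date_start.isoformat()}Z/{date_end.isoformat()}Z",
    }
    params.update(kwargs)  # e.g. query={"eo:cloud_cover": {"lt": 20}}
    return params


params = build_search_params(
    [2.0, 48.0, 2.1, 48.1],
    datetime(2023, 1, 1),
    datetime(2023, 12, 31),
    query={"eo:cloud_cover": {"lt": 20}},
)
print(params["datetime"])  # 2023-01-01T00:00:00Z/2023-12-31T00:00:00Z
```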
- spectral_index(indices_to_compute: str | list[str], band_mapping: dict = None, **kwargs)
Calculate various spectral indices for remote sensing data using the spyndex and awesome-spectral-indices libraries.
- Parameters:
indices_to_compute (string or list) – The short names (see Spyndex) of spectral indices.
band_mapping (dict, optional) – A dictionary to map your dataset’s band names to spyndex’s standard band names (e.g., {‘R’: ‘B04’, ‘N’: ‘B08’}). If None, it assumes your dataset’s variable names are directly usable by spyndex.
**kwargs – other arguments
- Returns:
time-series image StacAttack.indices.
- Return type:
xarray.Dataset
Example
>>> stacObj.spectral_index('NDVI', {'R': 'B04', 'N': 'B08'})
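The role of band_mapping can be illustrated with NDVI, defined as (N - R) / (N + R). The sketch below computes it for one pixel stored as a dictionary of reflectances; ndvi is a hypothetical helper and the reflectance values are made up for the example, while the method itself delegates the computation to spyndex.

```python
def ndvi(pixel, band_mapping):
    """NDVI = (N - R) / (N + R) for one pixel given as {band_name: reflectance}."""
    nir = pixel[band_mapping["N"]]  # standard name 'N' -> dataset band, e.g. 'B08'
    red = pixel[band_mapping["R"]]  # standard name 'R' -> dataset band, e.g. 'B04'
    return (nir - red) / (nir + red)


pixel = {"B04": 0.1, "B08": 0.5}
value = ndvi(pixel, {"R": "B04", "N": "B08"})
print(round(value, 3))  # 0.667
```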
- to_csv(outdir, gid=None, id_point='station_id')
Convert xarray dataset into csv file.
- Parameters:
outdir (str) – output directory.
gid (str, optional) – column name of ID. Defaults to None.
Example
>>> outdir = 'output'
>>> stacObj.to_csv(outdir)
- to_nc(outdir, gid=None, cube='sat', filename=None)
Convert xarray dataset into netcdf file.
- Parameters:
outdir (str) – output directory.
gid (str, optional) – column name of ID. Defaults to None.
cube (str, optional) – datacube type. Defaults to ‘sat’. Can be one of the following: ‘sat’, ‘indices’.
filename (str, optional) – output filename with .nc extension. Defaults to None.
Example
>>> outdir = 'output'
>>> stacObj.to_nc(outdir)
sits.Gdfgeom
- class sits.sits.Gdfgeom
Bases:
object
This class aims to calculate vector buffers and bounding boxes.
- buffer
vector layer with buffer.
- Type:
GeoDataFrame
- bbox
vector layer’s bounding box.
- Type:
GeoDataFrame
- set_bbox(df_attr)
Calculate the bounding box for each feature of the Csv2gdf GeoDataFrame.
- Parameters:
df_attr (str) – GeoDataFrame attribute of class Csv2gdf. Can be one of the following: ‘gdf’, ‘buffer’, ‘bbox’.
outfile (str, optional) – output filepath. Defaults to None.
- Returns:
GeoDataFrame object Csv2gdf.bbox.
- Return type:
GeoDataFrame
Example
>>> geotable.set_bbox('buffer')
- set_buffer(df_attr, radius)
Calculate buffer geometries for each feature of the Csv2gdf GeoDataFrame.
- Parameters:
df_attr (str) – GeoDataFrame attribute of class Csv2gdf. Can be one of the following: ‘gdf’, ‘buffer’, ‘bbox’.
radius (float) – buffer distance in CRS unit.
outfile (str, optional) – output filepath. Defaults to None.
- Returns:
GeoDataFrame object Csv2gdf.buffer.
- Return type:
GeoDataFrame
Example
>>> geotable.set_buffer('gdf', 100)
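For point features, buffering and then taking the bounding box reduces to simple arithmetic, as sketched below. This assumes point geometries in a projected CRS whose unit matches radius (meters for EPSG:3035); point_buffer_bbox is a hypothetical helper and the coordinates are illustrative.

```python
def point_buffer_bbox(x, y, radius):
    """Bounding box [xmin, ymin, xmax, ymax] of a circular buffer around (x, y)."""
    return [x - radius, y - radius, x + radius, y + radius]


print(point_buffer_bbox(4000000, 3000000, 100))
# [3999900, 2999900, 4000100, 3000100]
```

A 100 m radius therefore yields a 200 m wide square, i.e. 20 by 20 pixels at a 10 m resolution.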
- to_vector(df_attr, outfile=None, driver='GeoJSON')
Write a Csv2gdf GeoDataFrame layer as a vector file.
- Parameters:
df_attr (str) – GeoDataFrame attribute of class Csv2gdf. Can be one of the following: ‘gdf’, ‘buffer’, ‘bbox’.
outfile (str, optional) – output path. Defaults to None.
driver (str, optional) – Output vector file format (see GDAL/OGR Vector drivers: https://gdal.org/drivers/vector/index.html). Defaults to “GeoJSON”.
Example
>>> filename = 'mygeom'
>>> geotable.to_vector('gdf', f'output/{filename}_gdf.geojson')
>>> geotable.to_vector('buffer', f'output/{filename}_buffer.geojson')
>>> geotable.to_vector('bbox', f'output/{filename}_bbox.geojson')
sits.Vec2gdf
sits.Csv2gdf
- class sits.sits.Csv2gdf(csv_file, x_name, y_name, crs_in, id_name='no_id')
Bases:
Gdfgeom
This class aims to load csv tables with geographic coordinates into a GeoDataFrame object. It inherits methods and attributes from the Gdfgeom class.
- crs_in
CRS of coordinates described in the csv table.
- Type:
int
- table
DataFrame object.
- Type:
DataFrame
- Parameters:
csv_file (str) – csv filepath.
x_name (str) – name of the field describing X coordinates.
y_name (str) – name of the field describing Y coordinates.
crs_in (int) – CRS of coordinates described in the csv table.
id_name (str, optional) – name of the ID field. Defaults to “no_id”.
Example
>>> csv_file = 'example.csv'
>>> crs_in = 4326
>>> geotable = Csv2gdf(csv_file, 'longitude', 'latitude', crs_in)
- del_rows(col_name, rows_values)
Drop rows from Csv2gdf.table according to a column’s values.
- Parameters:
col_name (str) – column name.
rows_values (list) – list of values.
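The row filtering can be sketched with the standard csv module. The table content, the column names and the drop_rows helper are all hypothetical; the class itself works on a DataFrame (Csv2gdf.table).

```python
import csv
import io


def drop_rows(rows, col_name, rows_values):
    """Keep only the rows whose value in col_name is not listed in rows_values."""
    unwanted = set(rows_values)
    return [row for row in rows if row[col_name] not in unwanted]


# Illustrative table; csv.DictReader yields every value as a string.
csv_text = (
    "station_id,longitude,latitude\n"
    "A,2.35,48.85\n"
    "B,4.83,45.76\n"
    "C,1.44,43.60\n"
)
rows = list(csv.DictReader(io.StringIO(csv_text)))
kept = drop_rows(rows, "station_id", ["B"])
print([row["station_id"] for row in kept])  # ['A', 'C']
```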
- set_gdf(crs_out)
Convert the class attribute Csv2gdf.table (DataFrame) into a GeoDataFrame object, in the specified output CRS projection.
- Parameters:
crs_out (int) – output CRS of GeoDataFrame.
outfile (str, optional) – output filepath. Defaults to None.
- Returns:
GeoDataFrame object Csv2gdf.gdf.
- Return type:
GeoDataFrame
Example
>>> geotable.set_gdf(3035)
sits.Labels
- class sits.sits.Labels(geolayer)
Bases:
object
This class aims to produce an image of labels from a vector file.
- Parameters:
geolayer (str or geodataframe) – vector layer to rasterize.
- Returns:
geodataframe Labels.gdf.
- Return type:
GeoDataFrame
Example
>>> geodataframe = <gdf object>
>>> vlayer = Labels(geodataframe)
>>> vector_file = 'myVector.shp'
>>> vlayer = Labels(vector_file)
- to_raster(id_field, geobox, filename, outdir, ext='tif', driver='GTiff')
Convert geodataframe into raster file while keeping a column attribute as pixel values.
- Parameters:
id_field (str) – column name to keep as pixels values.
geobox (odc.geo.geobox.GeoBox) – geobox object.
filename (str) – output raster filename.
outdir (str) – output directory.
ext (str, optional) – raster file extension. Defaults to “tif”.
driver (str, optional) – output raster format (gdal standard). Defaults to “GTiff”.
Example
>>> bbox = [0, 0, 1, 1]
>>> crs_out = 3035
>>> resolution = 10
>>> geobox = def_geobox(bbox, crs_out, resolution)
>>> vlayer.to_raster('id', geobox, 'output_img', 'output_dir')
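Rasterizing amounts to writing an attribute value into the grid cells covered by each feature. The sketch below does this for points only, on a north-up grid defined by a bounding box and a resolution. It is a simplified stand-in, not the method's implementation: burn_points is a hypothetical helper, and polygon features would need a proper rasterization routine.

```python
def burn_points(points, bbox, resolution, fill=0):
    """Write each point's value into the grid cell that contains it."""
    xmin, ymin, xmax, ymax = bbox
    ncols = round((xmax - xmin) / resolution)
    nrows = round((ymax - ymin) / resolution)
    grid = [[fill] * ncols for _ in range(nrows)]
    for x, y, value in points:
        col = int((x - xmin) // resolution)
        row = int((ymax - y) // resolution)  # row 0 is the northern edge
        if 0 <= row < nrows and 0 <= col < ncols:  # skip points outside the grid
            grid[row][col] = value
    return grid


grid = burn_points([(5, 25, 7), (15, 5, 9)], bbox=[0, 0, 20, 30], resolution=10)
print(grid)  # [[7, 0], [0, 0], [0, 9]]
```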
sits.Multiproc
- class sits.sits.Multiproc(array_type, fext, outdir)
Bases:
object
This class aims to parallelize the production of images or patches.
- Parameters:
array_type (str) – xarray dataset name. Can be one of the following: ‘patch’, ‘image’.
fext (str) – output file format. Can be one of the following: ‘nc’, ‘csv’.
outdir (str) – output directory.
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
- addParams_gapfill(method='linear', first_last=True, **kwargs)
Add optional parameters for StacAttack.gapfill() called through Multiproc.fetch_func().
- Parameters:
method (string, optional) – method to use for interpolation (see xarray.DataArray.interpolate_na). Defaults to ‘linear’.
first_last (bool, optional) – interpolation of the first and last images of the satellite time-series with xarray.DataArray.bfill and xarray.DataArray.ffill. Defaults to True.
**kwargs – other arguments of xarray.DataArray.interpolate_na.
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.addParams_gapfill(method='nearest', first_last=False)
- addParams_loadCube(dimx=5, dimy=5, resolution=10, crs_out=3035)
Add optional parameters for StacAttack.loadCube() called through Multiproc.fetch_func().
- Parameters:
dimx (int, optional) – number of pixels in columns. Defaults to 5.
dimy (int, optional) – number of pixels in rows. Defaults to 5.
resolution (float, optional) – spatial resolution (in crs unit). Defaults to 10.
crs_out (int, optional) – CRS of output coordinates. Defaults to 3035.
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.addParams_loadCube(dimx=20, dimy=20)
- addParams_mask(mask_array=None, mask_band='SCL', mask_values=[3, 8, 9, 10])
Add optional parameters for StacAttack.mask_conf() called through Multiproc.fetch_func().
- Parameters:
mask_array (xarray.DataArray, optional) – xarray.DataArray binary mask (with same dimensions as StacAttack.cube). Defaults to None.
mask_band (string, optional) – band name used as a mask (e.g. ‘SCL’ for Sentinel-2). Defaults to ‘SCL’.
mask_values (list, optional) – band values related to masked pixels. Defaults to [3, 8, 9, 10].
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.addParams_mask(mask_values=[0])
- addParams_searchItems(date_start=datetime.datetime(2023, 1, 1, 0, 0), date_end=datetime.datetime(2023, 12, 31, 0, 0), **kwargs)
Add optional parameters for StacAttack.searchItems() called through Multiproc.fetch_func().
- Parameters:
date_start (datetime.datetime, optional) – start date. Defaults to datetime.datetime(2023, 1, 1).
date_end (datetime.datetime, optional) – end date. Defaults to datetime.datetime(2023, 12, 31).
**kwargs (optional) – other STAC-compliant arguments, e.g. query parameters to filter according to cloud cover percentage.
Example
>>> from datetime import datetime
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.addParams_searchItems(date_start=datetime(2016, 1, 1), query={"eo:cloud_cover": {"lt": 20}})
- addParams_spectral_index(indices_to_compute: str | list[str], band_mapping: dict = None, **kwargs)
Add optional parameters for StacAttack.spectral_index() called through Multiproc.fetch_func().
- Parameters:
indices_to_compute (string or list) – The short names (see Spyndex) of spectral indices.
band_mapping (dict, optional) – A dictionary to map your dataset’s band names to spyndex’s standard band names (e.g., {‘R’: ‘B04’, ‘N’: ‘B08’}). If None, it assumes your dataset’s variable names are directly usable by spyndex.
**kwargs – other arguments
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.addParams_spectral_index('NDVI', {'R': 'B04', 'N': 'B08'})
- addParams_stacAttack(provider='mpc', collection='sentinel-2-l2a', key_sat='s2', bands=['B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B8A', 'B11', 'B12', 'SCL'])
Add optional parameters for the StacAttack class instance called through Multiproc.fetch_func().
- Parameters:
provider (str, optional) – stac provider. Defaults to ‘mpc’. Can be one of the following: ‘mpc’ (Microsoft Planetary Computer), ‘aws’ (Amazon Web Services).
collection (str, optional) – stac collection. Defaults to ‘sentinel-2-l2a’.
bands (list, optional) – names of the bands to load. Defaults to [‘B02’, ‘B03’, ‘B04’, ‘B05’, ‘B06’, ‘B07’, ‘B08’, ‘B8A’, ‘B11’, ‘B12’, ‘SCL’]
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.addParams_stacAttack(bands=['B02', 'B03', 'B04'])
- addParams_to_raster(ext='tif', driver='GTiff')
Add optional parameters for Labels.to_raster() called through Multiproc.fetch_func().
- Parameters:
ext (str, optional) – raster file extension. Defaults to “tif”.
driver (str, optional) – output raster format (gdal standard). Defaults to “GTiff”.
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.addParams_to_raster(driver="COG")
- add_label(geolayer, id_field)
Export an image of labels with the same dimensions as the datacube, by calling the method Labels.to_raster().
- Parameters:
geolayer (GeoDataFrame) – vector file.
id_field (str) – attribute field name.
Example
>>> mproc = Multiproc('patch', 'nc', 'output')
>>> mproc.add_label(vlayer, 'myfield')
- dask_compute(scheduler_type='processes')
Call dask.compute to trigger the actual execution of delayed tasks (i.e. Multiproc.fetch_dask), gathering their results into a final output.
- Parameters:
scheduler_type (str) – type of scheduler. Defaults to ‘processes’. Can be one of the following:
- Single-threaded scheduler (‘single-threaded’ or ‘sync’): runs computations in a single thread without parallelism. Suitable for debugging or when parallelism isn’t required.
- Threaded scheduler (‘threads’): utilizes a pool of threads to execute tasks concurrently. Good for I/O-bound tasks and when tasks release the Global Interpreter Lock (GIL).
- Multiprocessing scheduler (‘processes’): uses a pool of separate processes to execute tasks in parallel. Suitable for CPU-bound tasks and when tasks are limited by the GIL.
- Distributed scheduler (‘distributed’): uses a distributed cluster to execute tasks. Best for large-scale computations across multiple machines.
Example
>>> mproc.dask_compute()
- del_func()
Clear Multiproc.fetch_dask, the list of dask.delayed function instances.
- fetch_func(aoi_latlong, aoi_proj, gid, mask=False, gapfill=False, **kwargs)
Call dask.delayed to convert the Multiproc.__fdask() function into a delayed object, allowing for lazy evaluation and parallel execution.
- Parameters:
aoi_latlong (list) – coordinates of bounding box.
aoi_proj (list) – coordinates of bounding box [xmin, ymin, xmax, ymax] in the output crs.
gid (int) – image/patch index.
**kwargs (dict) – additional arguments (i.e. StacAttack.searchItems(), StacAttack.loadImgs(), StacAttack.loadPatches()).
- Returns:
list of dask.delayed function instances.
- Return type:
Multiproc.fetch_dask
Example
>>> for gid, bboxes in enumerate(my_df['bboxes']):
...     mproc.fetch_func(bboxes[0], bboxes[1], gid)
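The queue-then-compute pattern behind fetch_func, dask_compute and del_func can be sketched with the standard library alone. Here a thread pool stands in for the dask scheduler and functools.partial stands in for dask.delayed; TaskQueue and describe_patch are hypothetical names, the bounding boxes are illustrative, and none of this reflects how Multiproc is implemented internally.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


class TaskQueue:
    """Collect deferred calls, then run them together (stand-in for dask.delayed)."""

    def __init__(self):
        self.tasks = []

    def fetch(self, func, *args, **kwargs):
        # Nothing runs here: the call is only recorded for later.
        self.tasks.append(partial(func, *args, **kwargs))

    def compute(self, max_workers=4):
        # Run every recorded call in a thread pool; results keep the queue order.
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = [pool.submit(task) for task in self.tasks]
            return [future.result() for future in futures]

    def clear(self):
        self.tasks = []


def describe_patch(aoi_latlong, aoi_proj, gid):
    """Placeholder for the per-patch work (search, load, export)."""
    return f"patch {gid}: {aoi_proj}"


queue = TaskQueue()
bboxes = [
    ([2.0, 48.0, 2.1, 48.1], [0, 0, 50, 50]),
    ([4.0, 45.0, 4.1, 45.1], [100, 100, 150, 150]),
]
for gid, (aoi_latlong, aoi_proj) in enumerate(bboxes):
    queue.fetch(describe_patch, aoi_latlong, aoi_proj, gid)

results = queue.compute()
queue.clear()
print(results)
```

Separating registration from execution means every area of interest can be queued cheaply in a loop, and the choice of scheduler is made only once, at compute time.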